Anand Mishra, PhD
CSE-210, Department of Computer Science and Engineering
Indian Institute of Technology Jodhpur
Jodhpur - 342030 (RJ), India

Currently, I serve as an Assistant Professor at the Department of Computer Science and Engineering at the Indian Institute of Technology Jodhpur. Prior to this role, I had the opportunity to work as a Postdoctoral Researcher under the mentorship of Dr. Partha Pratim Talukdar at the Indian Institute of Science, focusing on Knowledge-aware Computer Vision for nearly two years. For my doctoral studies, I conducted research on the interpretation of text within scene images at IIIT Hyderabad, where I had the privilege of being supervised by Prof. C. V. Jawahar and Dr. Karteek Alahari.

My current research lies at the intersection of vision and language. Specifically, I work on developing AI agents that can understand human language and perceive and comprehend the visual world. The overarching goal of my research group at IIT Jodhpur, the Vision, Language, and Learning Group (VL2G), is to advance these intelligent agents towards bridging the gap between human and machine interaction. To know more about the recent research focus and activities of VL2G, please visit the group's website.

Email | CV | Google Scholar | DBLP | Selected Publications | Teaching | VL2G

Open Positions:

Please visit HERE for updates on available positions within our group.

Recent/upcoming professional activities:

  • Reviewer and/or PC member for: CVPR'20/22/23/24, ECCV'20/22/24, ICCV'19/21/23, ACL Rolling Review, ICLR'23, WACV'23, AAAI'20/21/22, IJCAI'19/20, ICDAR'19, IEEE TPAMI, IJCV, IEEE TKDD, CVIU, IJDAR, Pattern Recognition.
  • Co-organizer: ScalDoc 2023, NCVPRIPG'23, WDAR 2021/WDAR 2023, KBMM 2019/KBMM 2020.
  • Workshop Co-Chair: ICFHR'22.

    News/Activities (A complete list is here)

    Selected Publications


    • Multimodal Query-guided Object Localization,
      Aditay Tripathi, Rajath R. Dani, Anand Mishra, Anirban Chakraborty
      Volume: 83 (5), Pages: 14857-14881, Multimedia Tools and Applications, 2024.

    • DHFML: deep heterogeneous feature metric learning for matching photograph and cartoon pairs
      Anand Mishra
      pages: 1-8, International Journal of Multimedia Information Retrieval 2018.

    • Unsupervised refinement of color and stroke features for text binarization
      Anand Mishra, Karteek Alahari and C. V. Jawahar
      Volume 20:105–121, International Journal on Document Analysis and Recognition 2017

    • Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
      Anand Mishra, Karteek Alahari and C. V. Jawahar
      Volume 145: 30-42, Computer Vision and Image Understanding 2016

    Conference Papers

  • QDETRv: Query-Guided DETR for One-Shot Object Localization in Videos,
    Yogesh Kumar, Saswat Mallick, Anand Mishra, Sowmya Rasipuram, Anutosh Maitra, Roshni Ramnani
    AAAI 2024. (NEW)
    [Paper]

  • Composite Sketch+Text Queries for Retrieving Objects with Elusive Names and Complex Interactions,
    Prajwal Gatti, Kshitij Parikh, Dhriti Paul, Manish Gupta, Anand Mishra.
    AAAI 2024. (NEW)
    [Paper] [Project Page] [CSTBIR Dataset] [Code]

  • Query-guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch,
    Aditay Tripathi, Anand Mishra, Anirban Chakraborty
    WACV 2024. (NEW)
    [Paper][Project Page][Code]

  • Semantic Labels-Aware Transformer Model for Searching over a Large Collection of Lecture-Slides,
    K.V. Jobin, Anand Mishra, C. V. Jawahar
    WACV 2024 (Oral). (NEW: Best Paper Award Finalist)
    [Paper][Project Page] [LecSD Dataset][Short Talk]

  • Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering, (NEW)
    Abhirama Subramanyam Penamakuri, Manish Gupta, Mithun Das Gupta, Anand Mishra
    IJCAI 2023.
    [Paper][Project Page][Code]

  • Towards Making Flowchart Images Machine Interpretable (NEW)
    Shreya Shukla, Prajwal Gatti, Yogesh Kumar, Vikash Yadav, Anand Mishra
    ICDAR 2023.
    [Paper][Project Page][Code]

  • Few-Shot Referring Relationships in Videos, (NEW)
    Yogesh Kumar, Anand Mishra
    CVPR 2023.
    [Paper][Project Page][Code]

  • Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing
    Aditay Tripathi, Anand Mishra, Anirban Chakraborty,
    WACV 2023.
    [Paper][Project Page][Code]

  • VISTOT: Vision-Augmented Table-to-Text Generation,
    Prajwal Gatti, Anand Mishra, Manish Gupta, Mithun Das Gupta,
    EMNLP 2022.
    [Paper][Project Page][Code]

  • COFAR: Commonsense and Factual Reasoning in Image Search
    Prajwal Gatti, Abhirama Subramanyam Penamakuri, Revant Teotia, Anand Mishra, Shubhashis Sengupta, Roshni Ramnani
    AACL-IJCNLP 2022.
    [Paper][Project Page][Code]

  • Few-shot Visual Relationship Co-localization
    Revant Teotia*, Vaibhav Mishra*, Mayank Maheshwari*, Anand Mishra,
    ICCV 2021.
    [Paper][Project Page][Code] (*: equal contribution)

  • Look, Read and Ask: Learning to Ask Questions by Reading Text in Images,
    Soumya Jahagirdar, Shankar Gangisetty, Anand Mishra,
    ICDAR 2021 (Oral).

  • Sketch-Guided Object Localization in Natural Images,
    Aditay Tripathi, Rajath R. Dani, Anand Mishra, Anirban Chakraborty
    ECCV 2020 (Spotlight Presentation).
    [Paper] [bibtex] [Project page][Code] [Know the paper in 90 seconds] [Know the paper in ten minutes]

  • From Strings to Things: Knowledge-enabled VQA model that can read and reason,
    Ajeet Kumar Singh, Anand Mishra, Shashank Shekhar, and Anirban Chakraborty
    ICCV 2019 (Oral).
    [Paper] [bibtex] [Project page]

  • OCR-VQA: Visual Question Answering by Reading Text in Images
    Anand Mishra, Shashank Shekhar, Ajeet Kumar Singh, and Anirban Chakraborty
    ICDAR 2019.
    [Paper] [bibtex] [Project page]

  • KVQA: Knowledge-aware Visual Question Answering
    Sanket Shah*, Anand Mishra*, Naganand Yadati and Partha Pratim Talukdar
    (*: equal contribution) AAAI 2019. (acceptance rate: 16.1%)
    [Paper] [bibtex] [Project page]

  • Deep Embedding using Bayesian Risk Minimization with Application to Sketch Recognition
    Anand Mishra and Ajeet Kumar Singh
    ACCV, 2018. (acceptance rate: 28%)
    [Paper (arXiv)] [bibtex]

  • IIIT-CFW: A Benchmark database of Cartoon Faces in the Wild
    Ashutosh Mishra, Shyam N. Roy, Anand Mishra and C. V. Jawahar
    ECCVW, 2016. (Oral)
    [PDF] [bibtex] [IIIT-CFW dataset]

  • A Simple and Effective method for Script Identification in the Wild
    Ajeet Kumar Singh, Anand Mishra, Pranav Dabaral and C. V. Jawahar
    DAS, 2016.
    [Paper] [bibtex]

  • Scene Text Recognition and Retrieval for Large Lexicons
    Udit Roy, Anand Mishra, Karteek Alahari and C. V. Jawahar
    ACCV 2014.
    [Paper] [bibtex]

  • Image Retrieval using Textual Cues
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    ICCV, 2013.
    [Paper] [bibtex]

  • Whole is Greater than Sum of Parts: Recognizing Scene Text Words
    Vibhor Goel, Anand Mishra, Karteek Alahari and C. V. Jawahar
    ICDAR, 2013.
    [Paper] [bibtex]

  • Scene Text Recognition using Higher Order Language Priors
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    BMVC 2012. (Oral)
    [Paper] [bibtex] [IIIT-5K Word dataset]

  • Top-down and Bottom-up cues for Scene Text Recognition
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    CVPR 2012.
    [Paper] [bibtex]

  • An MRF model for Binarization of Natural Scene Text
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    ICDAR 2011. (Oral)
    [Paper] [bibtex]

    Teaching

    At IIT Jodhpur

  • CSL2050: Pattern Recognition and Machine Learning (Spring’24)
  • CSL7670: Fundamentals of Machine Learning (Monsoon’23)
  • CSL7360: Computer Vision (Spring’23)
  • CSL2040: Maths for Computing (Monsoon’22/21/AY 20-21–Tri-3)
  • CSL7410: Graph Theory and Application (AY 20-21–Tri-1, Spring’22)
  • CS222: Theory of Computation (Spring’20)
  • CS212: Object-oriented Design and Analysis (Monsoon’19)

    At IISc Bangalore

  • UE101: Algorithms and Programming (Monsoon'18)
    -- Co-taught at the Indian Institute of Science with Dr. Sathish Govindrajan and Dr. Viraj Kumar.

    At IIIT Hyderabad (during PhD)

  • Computer Problem Solving (Monsoon'16)
    -- Introductory course for M.Tech. Bioinformatics

    At IIIT Sri City (during PhD)

  • Computer Architecture (Spring'15)
    -- Co-taught at IIIT Sri City as a visiting instructor with Dr. Suresh Purini and Prof. Govindrajulu
  • Operating Systems (Monsoon'14)
    -- Co-taught at IIIT Sri City as a visiting instructor with Dr. Suresh Purini

    Open Positions

    We do not offer any short-term (summer/winter) internship positions, except occasional special calls for specific project needs. I apologize for not being able to respond to individual emails regarding these positions.

    For Non-IITJ students: below are the current open positions:

    • PhD/MTech-PhD Position: We currently have open positions for full-time PhD and MTech-PhD students. If you are interested in pursuing a PhD or MTech-PhD, please consider applying through the institute's official route: Link.
    • Research Assistant/Research Engineer Positions: We hire highly motivated engineering graduates (BTech/BE, preferably in CS/EE/AI) who have already graduated or are graduating this semester for "Full-Time (in-person)" Research Engineer or Research Assistant positions. This is a rolling call. Exceptional academic credentials, sound machine learning and deep learning knowledge, good programming skills, and a passion for doing world-class research (and development) are essential for these positions. If you are eligible and interested, please consider applying HERE. Next cut-off date: May 15, 2024. (NEW)