Anand Mishra, PhD
CSE-210, Department of Computer Science and Engineering
Indian Institute of Technology Jodhpur
Jodhpur - 342030 (RJ), India

Currently, I serve as an Assistant Professor at the Department of Computer Science and Engineering at the Indian Institute of Technology Jodhpur. Prior to this role, I had the opportunity to work as a Postdoctoral Researcher under the mentorship of Dr. Partha Pratim Talukdar at the Indian Institute of Science, focusing on Knowledge-aware Computer Vision for nearly two years. For my doctoral studies, I conducted research on the interpretation of text within scene images at IIIT Hyderabad, where I had the privilege of being supervised by Prof. C. V. Jawahar and Dr. Karteek Alahari.

My current research interests lie at the intersection of vision and language. Specifically, I work on developing AI agents that can understand human language and perceive and comprehend the visual world. The overarching goal of my research group at IIT Jodhpur, the Vision, Language, and Learning Group (VL2G for short), is to advance these intelligent agents towards bridging the gap between human and machine interaction. To learn more about the recent research focus and activities of VL2G, please visit the group's website.


Recent/upcoming professional activities:

  • Reviewer and/or PC member for: ICLR'23, ECCV'20/22, CVPR'20/22/23, AAAI'20/21/22, ICCV'19/21/23, IJCAI'19/20, ICDAR'19, IEEE TPAMI, IJCV, IEEE TKDD, CVIU, IJDAR, Pattern Recognition.
  • Co-organizer: Workshop on Scaling-up Document Image Understanding (in conjunction with ICDAR'23), ICFHR 2022 (Workshop Co-Chair), 5th Workshop on Document Analysis and Recognition in conjunction with ICVGIP 2021, Workshop on Knowledge Bases and Multiple Modalities (KBMM) under AKBC 2019/2020.

  • News/Activities (A complete list is here)

    • [September 2023] Speaking virtually at NISER, Bhubaneswar, on our sketch-guided visual understanding work.
    • [July 2023] Received the Microsoft Academic Partnership Grant (MAPG) 2023 (see the announcement).
    • [May 2023] Recognized as one of the Outstanding Reviewers at CVPR 2023. The complete list is here.
    • [April 2023] Our works on retVQA and Floco-T5 have been accepted at IJCAI 2023 (Main Track) and ICDAR 2023, respectively.
    • [March 2023] Our work on Few-shot Referring Relationships in Videos was accepted at CVPR 2023.
    • [January 2023] I am on the organizing team of NCVPRIPG'23. Please consider participating.
    • [January 2023] Speaking at ACM-India ARCS'23.
    • [October 2022] Thanks to Accenture Labs for a Gift Grant.
    • [October 2022] Our works COFAR, VisTOT and Scene Graph Grounding have been accepted at AACL-IJCNLP 2022, EMNLP 2022 and WACV 2023, respectively.
    • [March 2022] Speaking at Search Technology Centre India (STCI), Microsoft.
    Selected Publications


    • Multimodal Query-guided Object Localization,
      Aditay Tripathi, Rajath R. Dani, Anand Mishra, Anirban Chakraborty
      Multimedia Tools and Applications, 2023 (Accepted)

    • DHFML: deep heterogeneous feature metric learning for matching photograph and cartoon pairs
      Anand Mishra
      pages: 1-8, International Journal of Multimedia Information Retrieval 2018

    • Unsupervised refinement of color and stroke features for text binarization
      Anand Mishra, Karteek Alahari and C. V. Jawahar
      Volume 20:105–121, International Journal on Document Analysis and Recognition 2017

    • Enhancing Energy Minimization Framework for Scene Text Recognition with Top-Down Cues
      Anand Mishra, Karteek Alahari and C. V. Jawahar
      Volume 145: 30-42, Computer Vision and Image Understanding 2016

    Conference Papers

  • Answer Mining from a Pool of Images: Towards Retrieval-Based Visual Question Answering, (NEW)
    Abhirama Subramanyam Penamakuri, Manish Gupta, Mithun Das Gupta, Anand Mishra
    IJCAI 2023.

  • Towards Making Flowchart Images Machine Interpretable (NEW)
    Shreya Shukla, Prajwal Gatti, Yogesh Kumar, Vikash Yadav, Anand Mishra
    ICDAR 2023.
    [Paper][Project Page][Code]

  • Few-Shot Referring Relationships in Videos, (NEW)
    Yogesh Kumar, Anand Mishra
    CVPR 2023.

  • Grounding Scene Graphs on Natural Images via Visio-Lingual Message Passing (NEW)
    Aditay Tripathi, Anand Mishra, Anirban Chakraborty,
    WACV 2023.
    [Paper][Project Page][Code]

  • VISTOT: Vision-Augmented Table-to-Text Generation, (NEW)
    Prajwal Gatti, Anand Mishra, Manish Gupta, Mithun Das Gupta,
    EMNLP 2022.
    [Paper][Project Page][Code]

  • COFAR: Commonsense and Factual Reasoning in Image Search (NEW)
    Prajwal Gatti, Abhirama Subramanyam Penamakuri, Revant Teotia, Anand Mishra, Shubhashis Sengupta, Roshni Ramnani
    AACL-IJCNLP 2022.
    [Paper][Project Page][Code]

  • Few-shot Visual Relationship Co-localization
    Revant Teotia*, Vaibhav Mishra*, Mayank Maheshwari*, Anand Mishra,
    ICCV 2021.
    [Paper][Project Page][Code] (*: equal contribution)

  • Look, Read and Ask: Learning to Ask Questions by Reading Text in Images,
    Soumya Jahagirdar, Shankar Gangisetty, Anand Mishra
    ICDAR 2021 (Oral).

  • Sketch-Guided Object Localization in Natural Images,
    Aditay Tripathi, Rajath R. Dani, Anand Mishra, Anirban Chakraborty
    ECCV 2020 (Spotlight Presentation).
    [Paper] [bibtex] [Project page][Code] [Know the paper in 90 seconds] [Know the paper in ten minutes]

  • From Strings to Things: Knowledge-enabled VQA model that can read and reason,
    Ajeet Kumar Singh, Anand Mishra, Shashank Shekhar, and Anirban Chakraborty
    ICCV 2019 (oral).
    [Paper] [bibtex] [Project page]

  • OCR-VQA: Visual Question Answering by Reading Text in Images
    Anand Mishra, Shashank Shekhar, Ajeet Kumar Singh, and Anirban Chakraborty
    ICDAR 2019.
    [Paper] [bibtex] [Project page]

  • KVQA: Knowledge-aware Visual Question Answering
    Sanket Shah*, Anand Mishra*, Naganand Yadati and Partha Pratim Talukdar
    (*: equal contribution) AAAI 2019. (acceptance rate: 16.1%)
    [Paper] [bibtex] [Project page]

  • Deep Embedding using Bayesian Risk Minimization with Application to Sketch Recognition
    Anand Mishra and Ajeet Kumar Singh
    ACCV, 2018. (acceptance rate: 28%)
    [Paper (arXiv)] [bibtex]

  • IIIT-CFW: A Benchmark database of Cartoon Faces in the Wild
    Ashutosh Mishra, Shyam N. Roy, Anand Mishra, and C. V. Jawahar
    ECCVW, 2016. (Oral)
    [PDF] [bibtex] [IIIT-CFW dataset]

  • A Simple and Effective method for Script Identification in the Wild
    Ajeet Kumar Singh, Anand Mishra, Pranav Dabaral and C. V. Jawahar
    DAS, 2016.
    [Paper] [bibtex]

  • Scene Text Recognition and Retrieval for Large Lexicons
    Udit Roy, Anand Mishra, Karteek Alahari and C. V. Jawahar
    ACCV 2014.
    [Paper] [bibtex]

  • Image Retrieval using Textual Cues
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    ICCV, 2013.
    [Paper] [bibtex]

  • Whole is Greater than Sum of Parts: Recognizing Scene Text Words
    Vibhor Goel, Anand Mishra, Karteek Alahari and C. V. Jawahar
    ICDAR, 2013.
    [Paper] [bibtex]

  • Scene Text Recognition using Higher Order Language Priors
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    BMVC 2012. (Oral)
    [Paper] [bibtex] [IIIT-5K Word dataset]

  • Top-down and Bottom-up cues for Scene Text Recognition
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    CVPR 2012.
    [Paper] [bibtex]

  • An MRF model for Binarization of Natural Scene Text
    Anand Mishra, Karteek Alahari and C. V. Jawahar
    ICDAR 2011. (Oral)
    [Paper] [bibtex]

    Teaching

    At IIT Jodhpur

  • CSL7360: Computer Vision (Spring’23)
  • CSL2040: Maths for Computing (Monsoon’22/21/AY 20-21–Tri-3)
  • CSL7410: Graph Theory and Application (AY 20-21–Tri-1, Spring’22)
  • CS222: Theory of Computation (Spring’20)
  • CS212: Object-oriented Design and Analysis (Monsoon’19)

    At IISc Bangalore

  • UE101: Algorithms and Programming (Monsoon'18)
    -- Co-taught at the Indian Institute of Science with Dr. Sathish Govindrajan and Dr. Viraj Kumar.

    At IIIT Hyderabad (during PhD)

  • Computer Problem Solving (Monsoon'16)
    -- Introductory course for M.Tech. Bioinformatics

    At IIIT Sri City (during PhD)

  • Computer Architecture (Spring'15)
    -- Co-taught at IIIT Sri City as a visiting instructor with Dr. Suresh Purini and Prof. Govindrajulu
  • Operating Systems (Monsoon'14)
    -- Co-taught at IIIT Sri City as a visiting instructor with Dr. Suresh Purini