Object detection in documents (word spotting)

Jose A. Rodriguez-Serrano

Abstract

Object detection and matching are important applications in computer vision and pattern recognition. In the domain of document images, the detection and matching of objects is known as word spotting. In this talk I will explain the main contributions to handwritten word spotting made during my PhD at the Xerox Research Centre Europe. Starting from a baseline in which objects are described as sequences and matched with hidden Markov models (HMMs), I introduce several improvements over the state of the art: (i) a feature extraction scheme inspired by the SIFT approach; (ii) modelling a priori knowledge using visual vocabularies; and (iii) a method for transferring knowledge between scenarios, applied to writer style adaptation. This talk may be of interest to those concerned with object detection, statistical pattern recognition, and sequence modelling.
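
To make the baseline idea concrete, here is a minimal sketch of scoring a feature sequence against a left-to-right HMM with the forward algorithm. It is not the model or feature set from the talk: the one-dimensional toy features, the fixed transition probability `p_stay`, and the function names are all assumptions chosen for illustration.

```python
import math


def log_gauss(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi = max(a, b)
    return hi + math.log(math.exp(a - hi) + math.exp(b - hi))


def hmm_log_likelihood(frames, means, variances, p_stay=0.6):
    """Forward algorithm for a left-to-right HMM (hypothetical toy model).

    The path starts in the first state and must end in the last one.
    At each frame a state either repeats (p_stay) or advances by one.
    """
    n_states = len(means)
    log_stay = math.log(p_stay)
    log_move = math.log(1.0 - p_stay)

    # Initialisation: only the first state is reachable at the first frame.
    alpha = [-math.inf] * n_states
    alpha[0] = log_gauss(frames[0], means[0], variances[0])

    # Recursion over the remaining frames.
    for x in frames[1:]:
        new_alpha = []
        for j in range(n_states):
            stay = alpha[j] + log_stay
            move = alpha[j - 1] + log_move if j > 0 else -math.inf
            emit = log_gauss(x, means[j], variances[j])
            new_alpha.append(logsumexp(stay, move) + emit)
        alpha = new_alpha

    # Termination: the sequence must finish in the last state.
    return alpha[-1]


# Toy three-state "word model" over 1-D features.
means = [[0.0], [1.0], [2.0]]
variances = [[0.1], [0.1], [0.1]]

matching = [[0.0], [0.1], [1.0], [1.1], [2.0]]
mismatched = [[2.0], [2.0], [1.0], [0.0], [0.0]]

score_match = hmm_log_likelihood(matching, means, variances)
score_mismatch = hmm_log_likelihood(mismatched, means, variances)
```

In a spotting setting, a candidate would be accepted when its score exceeds a threshold. Raw log-likelihoods depend on sequence length, so in practice some form of score normalization is needed before thresholding.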

Slides

Object detection in documents (word spotting) seminar 19/11/2008 (PDF)

References

  • Rodriguez-Serrano and Perronnin, Local gradient histogram features for word spotting in unconstrained handwritten documents, ICFHR 2008
  • Rodriguez-Serrano and Perronnin, Score normalization for HMM-based handwritten word spotting based on a universal background model, ICFHR 2008
  • Rodriguez-Serrano et al., Unsupervised writer style adaptation for handwritten word spotting, ICPR 2008
  • Lampert et al., Beyond sliding windows: object localization by efficient subwindow search, CVPR 2008
  • Heitz & Koller, Learning Spatial Context: Using Stuff to Find Things, ECCV 2008
  • Blaschko & Lampert, Learning to Localize Objects with Structured Output Regression, ECCV 2008
  • Sakoe & Chiba, Dynamic programming algorithm optimization for spoken word recognition, IEEE Transactions on Acoustics, Speech and Signal Processing, 1978
  • Rath & Manmatha, Word image matching using dynamic time warping, CVPR 2003
  • Marti & Bunke, Using a statistical language model to improve the performance of an HMM-based cursive handwriting recognition system, Int. J. of Pattern Recognition and Artificial Intelligence, 2001
  • Vinciarelli et al., Offline Recognition of Large Vocabulary Cursive Handwritten Text, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2004
  • Rath & Manmatha, Features for word spotting in historical manuscripts, ICDAR 2003
  • Baum et al., A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains, Ann. Math. Statist., 1970
  • Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, 1989
  • Lowe, Distinctive Image Features from Scale-Invariant Keypoints, Int. J. of Computer Vision, 2004
  • Csurka et al., Visual categorization with bags of keypoints, ECCV 2003
  • Perronnin, Universal and Adapted Vocabularies for Generic Visual Categorization, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2008
  • Gauvain & Lee, Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains, IEEE Trans. on Speech and Audio Processing, 1994
  • Leggetter & Woodland, Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models, Computer Speech and Language, 1995