for Image Processing and Computer Vision, J. R. Parker.
John Wiley & Sons, 1997.
If it were possible to learn to do OCR from one book, this might be the
right one. The algorithms surveyed (with a wealth of "C" source code
included on CD-ROM) cover a wide range of techniques but particularly emphasize
those that arise in an OCR context, such as segmentation, edge detection,
skeletonization, and morphological filtering. Chapter 8 is a tutorial
in which an OCR system is developed, component by component, until a usable
system is obtained for reading machine-print characters from fax
images. Handprint recognition is treated as a separate problem in
Chapter 9, which also contains a well-organized survey of techniques for
combining multiple classifiers. The book covers both classical
image processing heuristics and newer general techniques such as wavelets,
neural networks, and genetic algorithms. The overall approach, however,
is bottom-up: the OCR system of Chapter 8 is designed as a succession
of solutions to particular problems. This is a practical approach
(made yet more practical by the inclusion of source code) and reflects
the manner in which much of the OCR field was established, but
it does little to place OCR in the context of the larger field of
pattern analysis and recognition. So, if you were to choose two
books, a good complement to Parker would be one that takes a
top-down approach, such as Schürmann.
ISBN 0-471-14056-2 (softcover with CD-ROM).
Digital Document Processing, H. S.
Wiley & Sons, 1983.
Gives a complete view of computerized document
processing from scanner to output, with compact but informative descriptions
of the important algorithmic techniques at each stage. Covers text recognition
from the image quality, character feature, and language-dependent perspectives,
as well as other techniques used in a complete document processing system
including image halftoning, compression, transmission, and document retrieval.
Because the book concentrates on the major basic algorithms in each area
rather than specific implementations, the material holds up well despite
the age of the book. Age does, however, make the book difficult to
obtain.
ISBN 0-471-86247-9.