Book scanning or book digitization (also: magazine scanning or magazine digitization) is the process of converting physical books and magazines into digital media such as images, electronic text, or electronic books (e-books) by using an image scanner.
Digital books can be easily distributed, reproduced, and read on-screen. Common file formats are JPEG, Portable Document Format (PDF), and Tagged Image File Format (TIFF). Optical character recognition (OCR) is used to convert the raw page images into a digital text format such as ASCII or a similar format, which reduces the file size and allows the text to be reformatted, searched, or processed by other applications.
Image scanners may be manual or automated. In an ordinary commercial image scanner, the book is placed on a flat glass plate (or platen), and a light source and optical array move across the book underneath the glass. In manual book scanners, the glass plate extends to the edge of the scanner, making it easier to line up the book's spine. Other book scanners place the book face up in a V-shaped frame and photograph the pages from above. Pages may be turned by hand or by automated paper transport devices. Glass or plastic sheets are usually pressed against the page to flatten it.
After scanning, software adjusts the document images by aligning, cropping, and editing them, and then converts them to text and to the final e-book form. Human proofreaders usually check the output for errors.
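The cropping step described above can be illustrated with a toy example. The following is a minimal sketch, not the method of any particular scanning product: it assumes the page is a grayscale image stored as a list of rows of integers from 0 (black) to 255 (white), and the function name `crop_to_content` and the `threshold` value are hypothetical choices made for illustration. Real post-processing software typically works with an image library and also handles skew and noise.

```python
from typing import List

Image = List[List[int]]  # rows of grayscale values, 0 = black, 255 = white


def crop_to_content(image: Image, threshold: int = 200) -> Image:
    """Return the smallest rectangle containing every pixel darker than threshold.

    The threshold of 200 is an assumed value; a blank page is returned unchanged.
    """
    # Rows that contain at least one "ink" pixel.
    rows = [y for y, row in enumerate(image) if any(p < threshold for p in row)]
    if not rows:
        return [row[:] for row in image]

    # Columns that contain at least one "ink" pixel.
    width = len(image[0])
    cols = [x for x in range(width) if any(row[x] < threshold for row in image)]

    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A 4x5 "scan" with a white margin around a small dark region.
page = [
    [255, 255, 255, 255, 255],
    [255,   0,  30, 255, 255],
    [255,  10, 255, 255, 255],
    [255, 255, 255, 255, 255],
]
cropped = crop_to_content(page)
print(cropped)  # [[0, 30], [10, 255]]
```

The bounding-box approach is deliberately simple: a single dark speck in the margin would enlarge the crop, which is one reason production pipelines usually clean up noise before cropping.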
Scanning at 118 dots/centimeter (300 dpi) is adequate for conversion to digital text output, but for archival reproduction of rare, elaborate or illustrated books, much higher resolution is used.
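The relationship between the two resolution units above, and what a given resolution means in pixels, can be worked out directly. The sketch below is illustrative only: the function names and the 6 × 9 inch page size are assumptions chosen for the example, not figures from any scanning standard.

```python
CM_PER_INCH = 2.54  # exact by definition of the inch


def dpi_to_dots_per_cm(dpi: float) -> float:
    """Convert dots per inch to dots per centimeter."""
    return dpi / CM_PER_INCH


def page_pixels(width_in: float, height_in: float, dpi: float) -> tuple:
    """Pixel dimensions of a page of the given size scanned at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


# 300 dpi expressed in dots per centimeter.
print(round(dpi_to_dots_per_cm(300)))  # 118

# An assumed 6 x 9 inch page at 300 dpi and at a higher 600 dpi.
print(page_pixels(6, 9, 300))  # (1800, 2700)
print(page_pixels(6, 9, 600))  # (3600, 5400)
```

Doubling the resolution doubles each pixel dimension and therefore quadruples the number of pixels per page, which is why higher-resolution archival scans require considerably more storage.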