OCR Time Machine
Travel through time to see how OCR technology has evolved!
For decades, galleries, libraries, archives, and museums (GLAMs) have used Optical Character Recognition (OCR) to turn digitized books, newspapers, and manuscripts into machine-readable text. Traditional OCR pipelines typically output XML formats such as ALTO, which are rich in layout detail but cumbersome to work with. Newer Vision-Language Models (VLMs) take a different approach and produce simpler, cleaner Markdown output.
This Space lets you compare the two approaches side by side on your own historical documents. Upload a document image and, optionally, its ALTO or PAGE XML file. The Space extracts the reading-order text from the XML so that the comparison is apples-to-apples on the actual text content.
Available models: RolmOCR | Nanonets-OCR-s
How it works
- Upload Image: Select a historical document image (JPG, PNG, or JP2)
- Upload XML (Optional): Add the corresponding ALTO or PAGE XML file for comparison
- Choose Model: Select RolmOCR (new) or Nanonets-OCR-s (even newer!)
- Compare: Click 'Compare OCR Methods' to process the document
- Download: Save the results for further analysis
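For the further analysis mentioned in the last step, one simple option is to score how closely two downloaded text outputs agree. The following is a minimal sketch using Python's standard `difflib`; the helper names `normalize` and `similarity` are hypothetical and are not part of this Space. It assumes both outputs have already been reduced to plain text (Markdown markup in the VLM output would lower the score).

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Collapse all runs of whitespace (including line breaks) to single spaces."""
    return " ".join(text.split())


def similarity(ocr_a: str, ocr_b: str) -> float:
    """Return a character-level similarity ratio between 0.0 and 1.0.

    Hypothetical helper: it compares whitespace-normalized text only,
    so differences in line breaks or indentation are ignored.
    """
    matcher = SequenceMatcher(None, normalize(ocr_a), normalize(ocr_b), autojunk=False)
    return matcher.ratio()
```

A ratio close to 1.0 suggests the two outputs contain nearly the same characters in the same order; it is a rough indicator and not a substitute for a proper character or word error rate against ground truth.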
Upload Files
Step 1: Upload your document
Step 2: Select OCR Model
Results
Document Image
Modern VLM OCR Output
Traditional OCR Output
Try an Example
| Historical Document Image | XML File (Optional, ALTO or PAGE format) | Choose Model |
|---|---|---|
Example from 'A Medical History of British India' collection, National Library of Scotland
About ALTO/PAGE XML
- ALTO (Analyzed Layout and Text Object) and PAGE (Page Analysis and Ground-truth Elements) are XML formats that store OCR results with detailed layout information
- These files are typically generated by traditional OCR software and include position data for each text element
- This tool extracts just the reading order text for easier comparison
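The extraction described above can be sketched for ALTO as follows. This is an illustrative sketch, not the Space's actual implementation: the function names `local_name` and `alto_to_text` and the `SAMPLE_ALTO` snippet are made up for this example. It assumes that the order of `TextBlock` elements in the file matches the intended reading order, which is common but not guaranteed, and it ignores hyphenation (`HYP`) elements.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written ALTO fragment, for illustration only.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v3#">
  <Layout><Page><PrintSpace>
    <TextBlock ID="b1">
      <TextLine><String CONTENT="A"/><SP/><String CONTENT="Medical"/></TextLine>
      <TextLine><String CONTENT="History"/></TextLine>
    </TextBlock>
    <TextBlock ID="b2">
      <TextLine><String CONTENT="Chapter"/><SP/><String CONTENT="I"/></TextLine>
    </TextBlock>
  </PrintSpace></Page></Layout>
</alto>"""


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix so the code works across ALTO versions."""
    return tag.rsplit("}", 1)[-1]


def alto_to_text(xml_source: str) -> str:
    """Join String/@CONTENT values into lines and blocks, in document order."""
    root = ET.fromstring(xml_source)
    blocks = []
    for block in root.iter():
        if local_name(block.tag) != "TextBlock":
            continue
        lines = []
        for line in block.iter():
            if local_name(line.tag) != "TextLine":
                continue
            # Only direct String children carry text; SP elements are spaces.
            words = [
                el.attrib.get("CONTENT", "")
                for el in line
                if local_name(el.tag) == "String"
            ]
            lines.append(" ".join(w for w in words if w))
        blocks.append("\n".join(lines))
    # Blank line between blocks, roughly mirroring paragraph breaks.
    return "\n\n".join(blocks)
```

Calling `alto_to_text(SAMPLE_ALTO)` returns the words of each line joined by spaces, with a blank line between text blocks. PAGE XML uses different element names (for example `TextRegion` and `TextLine` with nested text content) and would need its own, similar traversal.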
Best Practices
- Use high-resolution scans (300+ DPI) for best results
- Historical documents with clear text work best
- The VLMs can handle complex layouts, tables, and mathematical notation
Built with ❤️ for the GLAM community | Learn more about OCR formats | Questions? Open an issue