OCR-IR Challenges
Find the text
- tables, figures
- identify key structures (abstract, title, headline, etc.)
Impact of multilingual documents on recognition quality
How degraded is too degraded?
- important for legacy data
- specialized collections (e.g. personal papers)
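
One way to make "how degraded" concrete is to measure it. The sketch below is illustrative only and not part of the original outline. It assumes a hand-corrected reference transcription is available for a sample of pages, and scores OCR output by character error rate (edit distance divided by reference length). The function names are hypothetical, and any cutoff for "too degraded" would have to be set empirically per collection.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def char_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalized by the length of the reference text."""
    if not reference:
        raise ValueError("reference transcription must be non-empty")
    return edit_distance(reference, ocr_output) / len(reference)


if __name__ == "__main__":
    # Hypothetical sample: ground truth vs. noisy OCR of the same line.
    truth = "personal papers"
    noisy = "persona1 papcrs"
    print(f"CER = {char_error_rate(truth, noisy):.3f}")
```

Scoring a small sample this way per collection (legacy scans, personal papers) gives a rough basis for comparing retrieval quality against degradation level.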