Retrieving OCR Output
High-quality (95% accurate) OCR output can be retrieved [UNLV, TREC]
assumes the text was captured
redundancy in the text is sufficient for the fuzzy matching done in IR systems
term weighting is affected: put less confidence in rare terms
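
The last point can be illustrated with a small sketch. It is not the weighting used in the UNLV or TREC experiments; it only assumes a standard smoothed IDF and adds a hypothetical discount (`min_df`, `penalty`) for terms seen in very few documents, since in OCR output such terms are often misrecognized words rather than informative ones. All function and parameter names here are illustrative.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def damped_idf(term, df, n_docs, min_df=2, penalty=0.5):
    """Smoothed IDF, discounted for terms rarer than min_df.

    min_df and penalty are assumed tuning knobs, not values from the slide.
    """
    d = df.get(term, 0)
    if d == 0:
        return 0.0  # term never seen in the collection
    idf = math.log((n_docs + 1) / (d + 1)) + 1.0
    if d < min_df:
        idf *= penalty  # less confidence in rare (possibly garbled) terms
    return idf


def score(query, doc, df, n_docs):
    """Sum of term frequency times damped IDF over the query terms."""
    tf = Counter(doc)
    return sum(tf[t] * damped_idf(t, df, n_docs) for t in query)
```

With plain IDF, a one-off OCR error such as "recogniti0n" would receive the highest weight in the collection precisely because it is rare; the discount reverses that, so matches on well-attested terms dominate the ranking.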