SCIENTIFIC REPORT NO. ISR-10
Information Storage and Retrieval
Table of Contents
                                                              page
Preface                                                       v
List of Figures                                               xi
List of Tables                                                xiii
Synopsis                                                      xv
CHAPTER 1 INTRODUCTION                                        1-1
  1. The Document Retrieval Problem                           1-1
  2. A Functional Model                                       1-2
  3. A Specific Model - The SMART System                      1-6
      A. Property Vector Indexing                             1-11
      B. Request Processing                                   1-13
      C. Angular Distance Matching                            1-17
      D. Terminology                                          1-17
CHAPTER 2 THE INDEXING FUNCTION                               2-1
  1. Introduction                                             2-1
  2. Manual Indexing                                          2-2
  3. Automatic Indexing                                       2-2
      A. The Statistical Approach                             2-3
      B. Semantic Techniques                                  2-4
      C. Syntactic Techniques                                 2-6
  4. The Structure of Index Representations                   2-8
  5. Optimizing the Index Transformation                      2-12
CHAPTER 3 SEARCH REQUEST FORMULATION                          3-1
  1. Introduction                                             3-1
  2. Request Formulation                                      3-3
  3. Request Optimization                                     3-5
  4. Relevance Feedback                                       3-11
  5. The Case of No Relevant Documents                        3-20
  6. Experimental Results                                     3-21
      A. Some Sample Search Requests                          3-21
      B. Average Results and Successive Iterations            3-32
      C. Convergence                                          3-42
CHAPTER 4 THE QUERY-DOCUMENT MATCHING FUNCTION                4-1
  1. The Comparison of Structured Operands                    4-1
  2. Storage Organization                                     4-7
  3. Automatic Document Classification                        4-12
  4. Classification and Metric Searching                      4-16
  5. A Heuristic Classification Algorithm                     4-19
      A. Basic Concepts                                       4-19
      B. Description of the Classification Algorithm          4-20
  6. Experimental Results                                     4-36
CHAPTER 5 EVALUATION OF DOCUMENT RETRIEVAL SYSTEMS            5-1
  1. The General Problem                                      5-1
  2. Evaluation Measures and the Collection of Statistics     5-2
      A. The Idealized Experiment                             5-2
      B. Evaluation Statistics                                5-6
      C. Output Characterization                              5-12
      D. The Precision-Recall Tradeoff                        5-15
  3. The Use of Optimal Queries in Test Design                5-17
  4. Cutoff-Independent Performance Indices                   5-19
      A. Derivation                                           5-19
      B. Experimental Use                                     5-29
Appendix A - The SMART System                                 A-1
Date updated: Tuesday, 10-Jul-2001 11:48:10 MDT
Date created: Monday, 18-Sept-00