
SCIENTIFIC REPORT NO. ISR-10

Information Storage and Retrieval

Table of Contents







page
Preface v
List of Figures xi
List of Tables xiii
Synopsis xv
CHAPTER 1        INTRODUCTION 1-1
1. The Document Retrieval Problem 1-1
2. A Functional Model 1-2
3. A Specific Model - The SMART System 1-6
A. Property Vector Indexing 1-11
B. Request Processing 1-13
C. Angular Distance Matching 1-17
D. Terminology 1-17
CHAPTER 2       THE INDEXING FUNCTION 2-1
1. Introduction 2-1
2. Manual Indexing 2-2
3. Automatic Indexing 2-2
A. The Statistical Approach 2-3
B. Semantic Techniques 2-4
C. Syntactic Techniques 2-6
4. The Structure of Index Representations 2-8
5. Optimizing the Index Transformation 2-12
CHAPTER 3       SEARCH REQUEST FORMULATION 3-1
1. Introduction 3-1
2. Request Formulation 3-3
3. Request Optimization 3-5
4. Relevance Feedback 3-11
5. The Case of No Relevant Documents 3-20
6. Experimental Results 3-21
A. Some Sample Search Requests 3-21
B. Average Results and Successive Iterations 3-32
C. Convergence 3-42
CHAPTER 4       THE QUERY-DOCUMENT MATCHING FUNCTION 4-1
1. The Comparison of Structured Operands 4-1
2. Storage Organization 4-7
3. Automatic Document Classification 4-12
4. Classification and Metric Searching 4-16
5. A Heuristic Classification Algorithm 4-19
A. Basic Concepts 4-19
B. Description of the Classification Algorithm 4-20
6. Experimental Results 4-36
CHAPTER 5       EVALUATION OF DOCUMENT RETRIEVAL SYSTEMS 5-1
1. The General Problem 5-1
2. Evaluation Measures and the Collection of Statistics 5-2
A. The Idealized Experiment 5-2
B. Evaluation Statistics 5-6
C. Output Characterization 5-12
D. The Precision-Recall Tradeoff 5-15
3. The Use of Optimal Queries in Test Design 5-17
4. Cutoff-Independent Performance Indices 5-19
A. Derivation 5-19
B. Experimental Use 5-29
Appendix A - The SMART System A-1

Date created: Monday, 18-Sept-00