TIPSTER Text Program
A multi-agency, multi-contractor program
Date created: Monday, 31-Jul-00
Phase III Overview

TIPSTER Phase III - A Four Part Program

DARPA and other TIPSTER members sponsored 17 research and architecture development contracts with academic institutions and commercial companies in order to continue a balanced overall program consisting of four basic parts: Advanced Research, Metrics-based Evaluations, a Structured Software Architecture, and Demonstration and Implementation Projects. The Phase III kick-off workshop was held in October 1996.

Advanced Research

TIPSTER's Phase III aimed to continue innovative research in basic areas while adding several new topics of investigation: detection (improving search algorithms; merging results from different engines); extraction (tuning systems to new domains; raising accuracy); summarization (producing a single summary of multiple documents); multilingual capabilities (porting tools and techniques proven in one language to other languages); and cross-technology work (sharing information between detection and extraction tools).

Metrics-based Evaluations

The Message Understanding Conferences (MUC) were a forum for assessing and discussing progress in natural language processing; in 1995 they showcased high-performing systems with scores of about 90% on the named entity tagging task. The Text Retrieval Conference (TREC), initiated by TIPSTER to evaluate document detection performance, will continue under NIST leadership. Acquisitions boosted the TREC database to five gigabytes, an invaluable resource for the Information Retrieval (IR) community to test querying and ranking methods. The Summarization Evaluation Conference (SEC) recently completed an initial test of methods to evaluate summarization performance. Some results of that test are available.

Software Architecture

Phase III proposed a feature called the TIPSTER Architecture Capabilities Platform (ACP). The ACP was intended to provide a framework for research and development in both Document Detection and Information Extraction.
Development of this resource began in 1997. It focused on giving the community the opportunity to test components in a TIPSTER Architecture compliant environment and to perform experiments using TIPSTER components and modules. Inclusion of CORBA capabilities and the Z39.50 Information Retrieval Protocol supported reusable components and a more standards-based approach to the Architecture.

Demonstration and Implementation Projects

Initial prototyping efforts in Phase II led to several demonstrations of the capabilities of TIPSTER components against operational tasks in the Intelligence Community and elsewhere. The most successful of these early systems was prepared for migration to operational use. Phase III proposed a new round of prototyping and applications development, intended to expand the Software Architecture and to bring the results to the user as quickly as possible.

Phase III Work

New Research Areas

Several new research areas were added as Phase III tasks:

Continuing Research Areas

A number of research areas from Phase I and II required further effort in Phase III:

Architecture and Capabilities Platform Development

The major focus of the Architecture component for Phase III was the development of a Common Object Request Broker Architecture (CORBA) compliant Architecture Capabilities Platform (ACP) to host TIPSTER-compliant software components and modules. The ACP was planned as a software platform for testing individual TIPSTER tools and capabilities. The plan called for developers to demonstrate to the Government the modularity of their text handling systems by plugging components and modules into the ACP and interacting with the other TIPSTER components on the platform. In addition, the ACP was intended to demonstrate the capability to interact with systems based on Z39.50 standards. Supporting components for the ACP would have included document collections, standard detection needs, lexicons, a document manager and a default Graphical User Interface (GUI). As a continuation from Phase II, the TIPSTER program was aiming to foster a cooperative effort among the research entities and the ACP developers to provide enhancements to the TIPSTER Architecture design.
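
The plug-in idea described above can be illustrated with a minimal sketch. It is not the ACP itself and does not use CORBA or Z39.50: an in-process registry stands in for the platform, and every name in it (Document, Platform, capitalized_word_tagger, annotation_counter) is a hypothetical chosen for illustration, not part of the TIPSTER Architecture. It only shows how independently written components might be registered on a shared platform and chained over a common document representation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical document record (assumption, not a TIPSTER class)."""
    doc_id: str
    text: str
    annotations: List[dict] = field(default_factory=list)


# A component is any callable that accepts a Document and returns it,
# usually after adding annotations.
Component = Callable[[Document], Document]


class Platform:
    """Toy stand-in for a platform that hosts pluggable components."""

    def __init__(self) -> None:
        self._components: Dict[str, Component] = {}

    def register(self, name: str, component: Component) -> None:
        # Refuse duplicate names so one component cannot silently replace another.
        if name in self._components:
            raise ValueError(f"component already registered: {name}")
        self._components[name] = component

    def run(self, names: List[str], doc: Document) -> Document:
        # Apply the named components in order, passing the document along.
        for name in names:
            doc = self._components[name](doc)
        return doc


def capitalized_word_tagger(doc: Document) -> Document:
    """Illustrative extraction-style component: tags capitalized tokens."""
    pos = 0
    for token in doc.text.split():
        start = doc.text.index(token, pos)
        end = start + len(token)
        if token[0].isupper():
            doc.annotations.append(
                {"type": "capitalized", "start": start, "end": end}
            )
        pos = end
    return doc


def annotation_counter(doc: Document) -> Document:
    """Illustrative downstream component: records how many annotations exist."""
    doc.annotations.append({"type": "summary", "count": len(doc.annotations)})
    return doc


if __name__ == "__main__":
    acp = Platform()
    acp.register("tagger", capitalized_word_tagger)
    acp.register("counter", annotation_counter)
    result = acp.run(["tagger", "counter"], Document("d1", "TIPSTER hosts Phase III tools"))
    for annotation in result.annotations:
        print(annotation)
```

The second component depends only on the shared Document shape, not on the tagger's internals. That is the kind of modularity the plug-in demonstrations were meant to show, here reduced to a single process.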