Title: Thresholds for Detection

Date Prepared: October 17, 1995
Date Needed:   April 3, 1996
               May 28, 1996

Change Required:
---------------

Modify RetrieveDocuments() for DocumentCollectionIndex [DCI] and AddQuery() for QueryCollectionIndex [QCI] to accept additional float arguments indicating the user's "relevance thresholds" for retrieved or matched Documents. A lower and an upper threshold will be accepted as input to the specified operations.

For the RetrieveDocuments operation, all Documents ranking between the lower and upper thresholds will be returned in the new Collection. For the AddQuery operation, a RoutingQuery will match all Documents which rank between the thresholds specified.

Each threshold will be represented by a real number in the range [0.0, 1.0]. A lower threshold of 0.0 and an upper threshold of 1.0 indicate that all matching Documents will be returned.

Specific Recommendation:
-----------------------

(add text) Section 1.2

    Each component has a value, which may be....

    * a float (a real number)

(modify operation) Section 7.1.4 Document and Query Indexes

    DocumentCollectionIndex
    ....

    RetrieveDocuments(sequence of DocumentCollectionIndex, RetrievalQuery, NumberToRetrieve : integer, MinThreshold : float, MaxThreshold : float): Collection or nil

    returns a Collection of Documents (of maximal length NumberToRetrieve) which are most closely related to the DetectionNeed from which the RetrievalQuery is derived. All returned Documents will have a relevance rating above the user-specified MinThreshold and below the user-specified MaxThreshold. Each threshold will be a value in the range [0.0, 1.0]. A MinThreshold of 0.0 and a MaxThreshold of 1.0 indicate that all matching Documents, up to a maximum of NumberToRetrieve, will be returned. If no Documents match the RetrievalQuery argument or satisfy the MinThreshold/MaxThreshold criterion, nil will be returned.
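The RetrieveDocuments() behavior described above can be illustrated with a minimal sketch. It is not part of the proposal or of any TIPSTER binding: the function name retrieve_documents, the (document id, rating) pair representation, and the use of None for nil are all hypothetical. The sketch also assumes that the thresholds are inclusive, since the proposal states that a range of 0.0 to 1.0 returns all matching Documents, and that relevance ratings have already been computed and normalized to [0.0, 1.0].

```python
from typing import List, Optional, Tuple

# Hypothetical representation: (document id, relevance rating in [0.0, 1.0]).
ScoredDoc = Tuple[str, float]


def retrieve_documents(scored: List[ScoredDoc],
                       number_to_retrieve: int,
                       min_threshold: float = 0.0,
                       max_threshold: float = 1.0) -> Optional[List[ScoredDoc]]:
    """Return up to number_to_retrieve documents whose rating lies in the
    threshold range, best first, or None (standing in for nil) if none do.

    Assumption: both bounds are inclusive, so that the range [0.0, 1.0]
    admits every rated document.
    """
    for name, value in (("min_threshold", min_threshold),
                        ("max_threshold", max_threshold)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0.0, 1.0], got {value}")
    if min_threshold > max_threshold:
        raise ValueError("min_threshold must not exceed max_threshold")

    # Keep only documents whose rating falls inside the requested range.
    in_range = [doc for doc in scored
                if min_threshold <= doc[1] <= max_threshold]
    # Rank by rating, highest first, then cut to the requested length.
    in_range.sort(key=lambda doc: doc[1], reverse=True)
    top = in_range[:number_to_retrieve]
    return top or None
```

For example, with ratings 0.9, 0.4, 0.75 and 0.1, a call with min_threshold=0.5 keeps only the two Documents rated 0.9 and 0.75, while a call with min_threshold=0.95 yields None because no Document satisfies the criterion.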
(modify operation) Section 7.1.4 Document and Query Indexes

    QueryCollectionIndex
    ....

    AddQuery(QueryCollectionIndex, RoutingQuery, MinThreshold : float, MaxThreshold : float)

    Adds RoutingQuery to the QueryCollectionIndex. If a RoutingQuery in the QueryCollectionIndex has a DetectionNeed component matching the DetectionNeed component of the RoutingQuery argument, the existing RoutingQuery is replaced by the RoutingQuery argument. The MinThreshold and MaxThreshold arguments control which Documents the RoutingQuery matches during RetrieveQueries() operations. The RoutingQuery will match a Document if the Document's relevance rating is above the specified MinThreshold and below the specified MaxThreshold. A MinThreshold of 0.0 and a MaxThreshold of 1.0 indicate that the DetectionNeed from which the RoutingQuery was derived will be returned from the RetrieveQueries() operation for all matching Documents.

Reason for Proposed Change:
--------------------------

In the current TIPSTER Architecture, a user can query a DocumentCollectionIndex (DCI) to receive a Collection of Documents relevant to the user's information need (DetectionNeed). The user controls the retrieval of Documents only by specifying the maximum number of Documents to be returned. However, a user may wish to receive only Documents above a certain "relevance threshold" in the returned Collection.

The same limitation applies to the QueryCollectionIndex. A user can use the AddQuery() operation to add a RoutingQuery to a QueryCollectionIndex (QCI). Once added to the QCI, the RoutingQuery is used to determine whether an incoming Document matches the user's information need. Again, the user has no way of limiting the range of Documents which may match that information need. A user may wish to have a RoutingQuery match a Document only if the Document's relevance to the query is above a certain "relevance threshold".
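The routing case can be sketched in the same spirit. The class below is a hypothetical stand-in, not the TIPSTER QueryCollectionIndex: the names add_query and retrieve_queries, the use of a string as the DetectionNeed, and the scoring callback are assumptions made for illustration. The scorer returns None when a Document does not match the query at all, and the thresholds are assumed to be inclusive so that the range 0.0 to 1.0 lets every matching Document through.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class RoutingEntry:
    """Hypothetical stored form of a RoutingQuery with its thresholds."""
    detection_need: str
    score: Callable[[str], Optional[float]]  # rating in [0.0, 1.0], or None if no match
    min_threshold: float
    max_threshold: float


class QueryCollectionIndex:
    """Illustrative in-memory index; not the architecture's actual class."""

    def __init__(self) -> None:
        self._entries: Dict[str, RoutingEntry] = {}

    def add_query(self, detection_need: str,
                  score: Callable[[str], Optional[float]],
                  min_threshold: float = 0.0,
                  max_threshold: float = 1.0) -> None:
        if not 0.0 <= min_threshold <= max_threshold <= 1.0:
            raise ValueError("thresholds must satisfy 0.0 <= min <= max <= 1.0")
        # An entry with the same DetectionNeed replaces the existing one.
        self._entries[detection_need] = RoutingEntry(
            detection_need, score, min_threshold, max_threshold)

    def retrieve_queries(self, document: str) -> List[str]:
        """Return the DetectionNeeds whose RoutingQuery matches the document
        with a rating inside that query's threshold range (inclusive)."""
        matched = []
        for entry in self._entries.values():
            rating = entry.score(document)
            if rating is not None and entry.min_threshold <= rating <= entry.max_threshold:
                matched.append(entry.detection_need)
        return matched
```

Storing the thresholds with each RoutingQuery, rather than passing them at retrieval time, mirrors the proposal: the thresholds are supplied once to AddQuery() and then applied during every RetrieveQueries() operation.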
Adding a relevance threshold to the operations RetrieveDocuments() for DocumentCollectionIndex [DCI] and AddQuery() for QueryCollectionIndex [QCI] offers users more flexibility. This type of filtering can currently be done outside the scope of the architecture, but doing so prevents plug and play of Detection systems. Adding the threshold to the TIPSTER API enables plug and play of Detection modules.

The "relevance threshold" shall be general, so that every TIPSTER Detection system can interpret the threshold in a manner appropriate to that system. The threshold will be represented by a float on a scale of [0.0, 1.0].

Change Requested By:

    Organization: University of Massachusetts
    Name: Kathleen S. DiBella
    Phone Number: (413)545-9781
    Date: