TF-IDF on Terms or Phrases
Compute TF x IDF for each term in the folder:
- IDF used = total # docs / # docs with this term
Choose the 5 terms with the highest TF x IDF value
Use Minimum Description Length (MDL) algorithm to find phrases
Use BBN IdentiFinder™ to find names of people, places, organizations.
Always finds terms that are in the folder
Sometimes the chosen terms are not that important
Phrases are more descriptive
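
The TF x IDF ranking described above can be sketched roughly as follows. This is a minimal illustration, not the original implementation. It assumes that TF is the total number of occurrences of a term across the folder's documents, that "total # docs" refers to the whole collection the folder is drawn from, and that a simple lowercase word tokenizer is good enough. The names `tokenize`, `top_terms`, `collection`, and `folder_ids` are hypothetical. The MDL phrase finder and BBN IdentiFinder steps are not reproduced here.

```python
from collections import Counter
import re


def tokenize(text):
    # Hypothetical tokenizer (assumption): lowercase alphabetic words only.
    return re.findall(r"[a-z]+", text.lower())


def top_terms(collection, folder_ids, k=5):
    """Return the k (term, score) pairs with the highest TF x IDF in a folder.

    collection: list of document strings (the whole corpus, an assumption).
    folder_ids: indices into collection for the documents in the folder.
    """
    tokenized = [tokenize(doc) for doc in collection]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    # Term frequency (assumption): total occurrences across the folder.
    tf = Counter()
    for i in folder_ids:
        tf.update(tokenized[i])

    # IDF as stated on the slide: total # docs / # docs with this term.
    scores = {term: count * (n_docs / df[term]) for term, count in tf.items()}

    # Highest score first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:k]


if __name__ == "__main__":
    collection = [
        "neural network neural training",
        "neural network report",
        "market report",
        "market news report",
    ]
    for term, score in top_terms(collection, folder_ids=[0, 1]):
        print(term, score)
```

Because every folder document is also part of the collection, each folder term has a document frequency of at least one, so the division is always defined.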