Automatic Text Retrieval Systems: Design Questions
What will the “content unit” be for document descriptors (if not weighted single terms)?
- Related terms: “Words that co-occur with sufficient frequency in the documents of a collection are…related to each other.”
- Term phrases: Terms grouped together based on frequency counts, other statistical methods, and syntactic procedures (parsing of some sort).
- Words grouped by thesaurus: Groups of related terms are aggregated under limited headings, which may themselves be used as descriptors.
- Knowledge base entries: Artificial intelligence structures are constructed to represent subject content.
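As an illustration of the first option (related terms), the sketch below counts how many documents each pair of words appears in together and keeps the pairs that meet a threshold. It is only a rough sketch under stated assumptions: the function names, the whitespace tokenization, the `min_docs` threshold, and the sample documents are all hypothetical and do not come from any particular retrieval system.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """For each unordered term pair, count the documents containing both terms."""
    counts = Counter()
    for text in documents:
        # Assumption: lowercase whitespace tokens stand in for real indexing terms.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


def related_terms(documents, min_docs=2):
    """Keep pairs that co-occur in at least min_docs documents (hypothetical cutoff)."""
    counts = cooccurrence_counts(documents)
    return {pair: n for pair, n in counts.items() if n >= min_docs}


# Hypothetical three-document collection, for illustration only.
docs = [
    "information retrieval systems",
    "text retrieval systems design",
    "thesaurus construction",
]
related = related_terms(docs, min_docs=2)
print(related)
```

Counting at the document level (rather than counting every occurrence) keeps long documents from dominating the result; a real system would likely also normalize by how common each term is on its own.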