894 resultados para Query expansion, Text mining, Information retrieval, Chinese IR
Resumo:
Thesis (M. S.)--University of Illinois at Urbana-Champaign.
Resumo:
Thesis (M. S.)--University of Illinois at Urbana-Champaign.
Resumo:
Thesis--University of Illinois at Urbana-Champaign.
Resumo:
Thesis (M. S.)--University of Illinois at Urbana-Champaign.
Resumo:
Bibliography: p. 114.
Resumo:
Originally presented as the author's thesis, University of Illinois at Urbana-Champaign, 1974.
Resumo:
Thesis (M. S.)--University of Illinois at Urbana-Champaign.
Resumo:
Originally presented as the author's thesis, University of Illinois at Urbana-Champaign.
Resumo:
"Originally issued in 1973 as open-file report SW-104of."
Resumo:
"AD 273 115."
Resumo:
"August 1997."
Resumo:
Mode of access: Internet.
Resumo:
Domain specific information retrieval has become in demand. Not only domain experts, but also average non-expert users are interested in searching domain specific (e.g., medical and health) information from online resources. However, a typical problem to average users is that the search results are always a mixture of documents with different levels of readability. Non-expert users may want to see documents with higher readability on the top of the list. Consequently the search results need to be re-ranked in a descending order of readability. It is often not practical for domain experts to manually label the readability of documents for large databases. Computational models of readability needs to be investigated. However, traditional readability formulas are designed for general purpose text and insufficient to deal with technical materials for domain specific information retrieval. More advanced algorithms such as textual coherence model are computationally expensive for re-ranking a large number of retrieved documents. In this paper, we propose an effective and computationally tractable concept-based model of text readability. In addition to textual genres of a document, our model also takes into account domain specific knowledge, i.e., how the domain-specific concepts contained in the document affect the document’s readability. Three major readability formulas are proposed and applied to health and medical information retrieval. Experimental results show that our proposed readability formulas lead to remarkable improvements in terms of correlation with users’ readability ratings over four traditional readability measures.
Resumo:
The expansion of the Internet has made the task of searching a crucial one. Internet users, however, have to make a great effort in order to formulate a search query that returns the required results. Many methods have been devised to assist in this task by helping the users modify their query to give better results. In this paper we propose an interactive method for query expansion. It is based on the observation that documents are often found to contain terms with high information content, which can summarise their subject matter. We present experimental results, which demonstrate that our approach significantly shortens the time required in order to accomplish a certain task by performing web searches.
Resumo:
With the explosive growth of the volume and complexity of document data (e.g., news, blogs, web pages), it has become a necessity to semantically understand documents and deliver meaningful information to users. Areas dealing with these problems are crossing data mining, information retrieval, and machine learning. For example, document clustering and summarization are two fundamental techniques for understanding document data and have attracted much attention in recent years. Given a collection of documents, document clustering aims to partition them into different groups to provide efficient document browsing and navigation mechanisms. One unrevealed area in document clustering is that how to generate meaningful interpretation for the each document cluster resulted from the clustering process. Document summarization is another effective technique for document understanding, which generates a summary by selecting sentences that deliver the major or topic-relevant information in the original documents. How to improve the automatic summarization performance and apply it to newly emerging problems are two valuable research directions. To assist people to capture the semantics of documents effectively and efficiently, the dissertation focuses on developing effective data mining and machine learning algorithms and systems for (1) integrating document clustering and summarization to obtain meaningful document clusters with summarized interpretation, (2) improving document summarization performance and building document understanding systems to solve real-world applications, and (3) summarizing the differences and evolution of multiple document sources.