873 results for 080704 Information Retrieval and Web Search
Abstract:
Over the last decade, the rapid growth and adoption of the World Wide Web has further exacerbated user needs for efficient mechanisms for information and knowledge location, selection, and retrieval. How to gather useful and meaningful information from the Web becomes challenging to users. The capture of user information needs is key to delivering users' desired information, and user profiles can help to capture information needs. However, effectively acquiring user profiles is difficult. It is argued that if user background knowledge can be specified by ontologies, more accurate user profiles can be acquired and thus information needs can be captured effectively. Web users implicitly possess concept models that are obtained from their experience and education, and use the concept models in information gathering. Prior to this work, much research has attempted to use ontologies to specify user background knowledge and user concept models. However, these works have a drawback in that they cannot move beyond the subsumption of super- and sub-class structure to emphasising the specific semantic relations in a single computational model. This has also been a challenge for years in the knowledge engineering community. Thus, using ontologies to represent user concept models and to acquire user profiles remains an unsolved problem in personalised Web information gathering and knowledge engineering. In this thesis, an ontology learning and mining model is proposed to acquire user profiles for personalised Web information gathering. The proposed computational model emphasises the specific is-a and part-of semantic relations in one computational model. The world knowledge and users' Local Instance Repositories are used to attempt to discover and specify user background knowledge. From a world knowledge base, personalised ontologies are constructed by adopting automatic or semi-automatic techniques to extract user interest concepts, focusing on user information needs.
A multidimensional ontology mining method, Specificity and Exhaustivity, is also introduced in this thesis for analysing the user background knowledge discovered and specified in user personalised ontologies. The ontology learning and mining model is evaluated by comparing with human-based and state-of-the-art computational models in experiments, using a large, standard data set. The experimental results are promising for evaluation. The proposed ontology learning and mining model in this thesis helps to develop a better understanding of user profile acquisition, thus providing better design of personalised Web information gathering systems. The contributions are increasingly significant, given both the rapid explosion of Web information in recent years and today's accessibility to the Internet and the full text world.
Abstract:
The Web has become a worldwide repository of information which individuals, companies, and organizations utilize to solve or address various information problems. Many of these Web users utilize automated agents to gather this information for them. Some assume that this approach represents a more sophisticated method of searching. However, there is little research investigating how Web agents search for online information. In this research, we first provide a classification for information agents using stages of information gathering, gathering approaches, and agent architecture. We then examine an implementation of one of the resulting classifications in detail, investigating how agents search for information on Web search engines, including the session, query, term, duration and frequency of interactions. For this temporal study, we analyzed three data sets of queries and page views from agents interacting with the Excite and AltaVista search engines from 1997 to 2002, examining approximately 900,000 queries submitted by over 3,000 agents. Findings include: (1) agent sessions are extremely interactive, with sometimes hundreds of interactions per second, (2) agent queries are comparable to those of human searchers, with little use of query operators, (3) Web agents are searching for a relatively limited variety of information, wherein only 18% of the terms used are unique, and (4) the duration of agent-Web search engine interaction typically spans several hours. We discuss the implications for Web information agents and search engines.
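As an illustration of the term-level analysis mentioned in the abstract above, the following is a minimal sketch of how a share of unique terms could be computed from a query log. The abstract does not say how terms were tokenized or how "unique" was defined, so the lowercase whitespace tokenization, the definition (distinct terms divided by total term occurrences), the function name `term_statistics`, and the toy `sample_log` are all assumptions made for this example and are not taken from the study.

```python
from collections import Counter


def term_statistics(queries):
    """Return (total_terms, distinct_terms, distinct_share) for query strings.

    Assumption: terms are lowercase, whitespace-separated tokens, and the
    share of unique terms is distinct terms divided by total occurrences.
    """
    counts = Counter(
        term
        for query in queries
        for term in query.lower().split()
    )
    total = sum(counts.values())
    distinct = len(counts)
    share = distinct / total if total else 0.0
    return total, distinct, share


# Toy, invented query log used only to exercise the function.
sample_log = [
    "cheap flights",
    "cheap hotels",
    "flights to paris",
    "cheap flights paris",
]

total, distinct, share = term_statistics(sample_log)
print(f"{distinct} distinct terms out of {total} ({share:.0%})")
```

A low share under this definition means the same terms recur across many queries, which is one way to read the reported finding that agents search for a limited variety of information.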
Abstract:
An increasing number of people seek health advice on the web using search engines; this poses challenging problems for current search technologies. In this paper we report an initial study of the effectiveness of current search engines in retrieving relevant information for diagnostic medical circumlocutory queries, i.e., queries that are issued by people seeking information about their health condition using a description of the symptoms they observe (e.g. hives all over body) rather than the medical term (e.g. urticaria). This type of query frequently arises when people are unfamiliar with a domain or language, and such queries are common among health information seekers attempting to self-diagnose or self-treat. Our analysis reveals that current search engines are not equipped to effectively satisfy such information needs; this can have potentially harmful effects on people’s health. Our results advocate for more research in developing information retrieval methods to support such complex information needs.