871 resultados para Interactive Information Retrieval
Resumo:
In this paper, we propose a text mining method called LRD (latent relation discovery), which extends the traditional vector space model of document representation in order to improve information retrieval (IR) on documents and document clustering. Our LRD method extracts terms and entities, such as person, organization, or project names, and discovers relationships between them by taking into account their co-occurrence in textual corpora. Given a target entity, LRD discovers other entities closely related to the target effectively and efficiently. With respect to such relatedness, a measure of relation strength between entities is defined. LRD uses relation strength to enhance the vector space model, and uses the enhanced vector space model for query based IR on documents and clustering documents in order to discover complex relationships among terms and entities. Our experiments on a standard dataset for query based IR shows that our LRD method performed significantly better than traditional vector space model and other five standard statistical methods for vector expansion.
Resumo:
DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT
Resumo:
Similar to Genetic algorithm, Evolution strategy is a process of continuous reproduction, trial and selection. Each new generation is an improvement on the one that went before. This paper presents two different proposals based on the vector space model (VSM) as a traditional model in information Retrieval (TIR). The first uses evolution strategy (ES). The second uses the document centroid (DC) in query expansion technique. Then the results are compared; it was noticed that ES technique is more efficient than the other methods.
Resumo:
An ontological representation of buyer interests’ knowledge in process of e-commerce is proposed to use. It makes it more efficient to make a search of the most appropriate sellers via multiagent systems. An algorithm of a comparison of buyer ontology with one of e-shops (the taxonomies) and an e-commerce multiagent system are realised using ontology of information retrieval in distributed environment.
Resumo:
This is an extended version of an article presented at the Second International Conference on Software, Services and Semantic Technologies, Sofia, Bulgaria, 11–12 September 2010.
Resumo:
The rapid growth of the Internet and the advancements of the Web technologies have made it possible for users to have access to large amounts of on-line music data, including music acoustic signals, lyrics, style/mood labels, and user-assigned tags. The progress has made music listening more fun, but has raised an issue of how to organize this data, and more generally, how computer programs can assist users in their music experience. An important subject in computer-aided music listening is music retrieval, i.e., the issue of efficiently helping users in locating the music they are looking for. Traditionally, songs were organized in a hierarchical structure such as genre->artist->album->track, to facilitate the users’ navigation. However, the intentions of the users are often hard to be captured in such a simply organized structure. The users may want to listen to music of a particular mood, style or topic; and/or any songs similar to some given music samples. This motivated us to work on user-centric music retrieval system to improve users’ satisfaction with the system. The traditional music information retrieval research was mainly concerned with classification, clustering, identification, and similarity search of acoustic data of music by way of feature extraction algorithms and machine learning techniques. More recently the music information retrieval research has focused on utilizing other types of data, such as lyrics, user-access patterns, and user-defined tags, and on targeting non-genre categories for classification, such as mood labels and styles. This dissertation focused on investigating and developing effective data mining techniques for (1) organizing and annotating music data with styles, moods and user-assigned tags; (2) performing effective analysis of music data with features from diverse information sources; and (3) recommending music songs to the users utilizing both content features and user access patterns.
Resumo:
Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.
Resumo:
Users seeking information may not find relevant information pertaining to their information need in a specific language. But information may be available in a language different from their own, but users may not know that language. Thus users may experience difficulty in accessing the information present in different languages. Since the retrieval process depends on the translation of the user query, there are many issues in getting the right translation of the user query. For a pair of languages chosen by a user, resources, like incomplete dictionary, inaccurate machine translation system may exist. These resources may be insufficient to map the query terms in one language to its equivalent terms in another language. Also for a given query, there might exist multiple correct translations. The underlying corpus evidence may suggest a clue to select a probable set of translations that could eventually perform a better information retrieval. In this paper, we present a cross language information retrieval approach to effectively retrieve information present in a language other than the language of the user query using the corpus driven query suggestion approach. The idea is to utilize the corpus based evidence of one language to improve the retrieval and re-ranking of news documents in the other language. We use FIRE corpora - Tamil and English news collections in our experiments and illustrate the effectiveness of the proposed cross language information retrieval approach.
Resumo:
Mémoire numérisé par la Direction des bibliothèques de l'Université de Montréal.
Resumo:
International audience
Resumo:
Conventional web search engines are centralised in that a single entity crawls and indexes the documents selected for future retrieval, and the relevance models used to determine which documents are relevant to a given user query. As a result, these search engines suffer from several technical drawbacks such as handling scale, timeliness and reliability, in addition to ethical concerns such as commercial manipulation and information censorship. Alleviating the need to rely entirely on a single entity, Peer-to-Peer (P2P) Information Retrieval (IR) has been proposed as a solution, as it distributes the functional components of a web search engine – from crawling and indexing documents, to query processing – across the network of users (or, peers) who use the search engine. This strategy for constructing an IR system poses several efficiency and effectiveness challenges which have been identified in past work. Accordingly, this thesis makes several contributions towards advancing the state of the art in P2P-IR effectiveness by improving the query processing and relevance scoring aspects of a P2P web search. Federated search systems are a form of distributed information retrieval model that route the user’s information need, formulated as a query, to distributed resources and merge the retrieved result lists into a final list. P2P-IR networks are one form of federated search in routing queries and merging result among participating peers. The query is propagated through disseminated nodes to hit the peers that are most likely to contain relevant documents, then the retrieved result lists are merged at different points along the path from the relevant peers to the query initializer (or namely, customer). However, query routing in P2P-IR networks is considered as one of the major challenges and critical part in P2P-IR networks; as the relevant peers might be lost in low-quality peer selection while executing the query routing, and inevitably lead to less effective retrieval results. This motivates this thesis to study and propose query routing techniques to improve retrieval quality in such networks. Cluster-based semi-structured P2P-IR networks exploit the cluster hypothesis to organise the peers into similar semantic clusters where each such semantic cluster is managed by super-peers. In this thesis, I construct three semi-structured P2P-IR models and examine their retrieval effectiveness. I also leverage the cluster centroids at the super-peer level as content representations gathered from cooperative peers to propose a query routing approach called Inverted PeerCluster Index (IPI) that simulates the conventional inverted index of the centralised corpus to organise the statistics of peers’ terms. The results show a competitive retrieval quality in comparison to baseline approaches. Furthermore, I study the applicability of using the conventional Information Retrieval models as peer selection approaches where each peer can be considered as a big document of documents. The experimental evaluation shows comparative and significant results and explains that document retrieval methods are very effective for peer selection that brings back the analogy between documents and peers. Additionally, Learning to Rank (LtR) algorithms are exploited to build a learned classifier for peer ranking at the super-peer level. The experiments show significant results with state-of-the-art resource selection methods and competitive results to corresponding classification-based approaches. Finally, I propose reputation-based query routing approaches that exploit the idea of providing feedback on a specific item in the social community networks and manage it for future decision-making. The system monitors users’ behaviours when they click or download documents from the final ranked list as implicit feedback and mines the given information to build a reputation-based data structure. The data structure is used to score peers and then rank them for query routing. I conduct a set of experiments to cover various scenarios including noisy feedback information (i.e, providing positive feedback on non-relevant documents) to examine the robustness of reputation-based approaches. The empirical evaluation shows significant results in almost all measurement metrics with approximate improvement more than 56% compared to baseline approaches. Thus, based on the results, if one were to choose one technique, reputation-based approaches are clearly the natural choices which also can be deployed on any P2P network.
Resumo:
This paper reports preliminary results from a study modeling the interplay between multitasking, cognitive coordination, and cognitive shifts during Web search. Study participants conducted three Web searches on personal information problems. Data collection techniques included pre- and post-search questionnaires; think-aloud protocols, Web search logs, observation, and post-search interviews. Key findings include: (1) users Web searches included multitasking, cognitive shifting and cognitive coordination processes, (2) cognitive coordination is the hinge linking multitasking and cognitive shifting that enables Web search construction, (3) cognitive shift levels determine the process of cognitive coordination, and (4) cognitive coordination is interplay of task, mechanism and strategy levels that underpin multitasking and task switching. An initial model depicts the interplay between multitasking, cognitive coordination, and cognitive shifts during Web search. Implications of the findings and further research are also discussed.
Resumo:
As Web searching becomes more prolific for information access worldwide, we need to better understand users’ Web searching behaviour and develop better models of their interaction with Web search systems. Web search modelling is a significant and important area of Web research. Searching on the Web is an integral element of information behaviour and human–computer interaction. Web searching includes multitasking processes, the allocation of cognitive resources among several tasks, and shifts in cognitive, problem and knowledge states. In addition to multitasking, cognitive coordination and cognitive shifts are also important, but are under-explored aspects of Web searching. During the Web searching process, beyond physical actions, users experience various cognitive activities. Interactive Web searching involves many users’ cognitive shifts at different information behaviour levels. Cognitive coordination allows users to trade off the dependences among multiple information tasks and the resources available. Much research has been conducted into Web searching. However, few studies have modelled the nature of and relationship between multitasking, cognitive coordination and cognitive shifts in the Web search context. Modelling how Web users interact with Web search systems is vital for the development of more effective Web IR systems. This study aims to model the relationship between multitasking, cognitive coordination and cognitive shifts during Web searching. A preliminary theoretical model is presented based on previous studies. The research is designed to validate the preliminary model. Forty-two study participants were involved in the empirical study. A combination of data collection instruments, including pre- and post-questionnaires, think-aloud protocols, search logs, observations and interviews were employed to obtain users’ comprehensive data during Web search interactions. Based on the grounded theory approach, qualitative analysis methods including content analysis and verbal protocol analysis were used to analyse the data. The findings were inferred through an analysis of questionnaires, a transcription of think-aloud protocols, the Web search logs, and notes on observations and interviews. Five key findings emerged. (1) Multitasking during Web searching was demonstrated as a two-dimensional behaviour. The first dimension was represented as multiple information problems searching by task switching. Users’ Web searching behaviour was a process of multiple tasks switching, that is, from searching on one information problem to searching another. The second dimension of multitasking behaviour was represented as an information problem searching within multiple Web search sessions. Users usually conducted Web searching on a complex information problem by submitting multiple queries, using several Web search systems and opening multiple windows/tabs. (2) Cognitive shifts were the brain’s internal response to external stimuli. Cognitive shifts were found as an essential element of searching interactions and users’ Web searching behaviour. The study revealed two kinds of cognitive shifts. The first kind, the holistic shift, included users’ perception on the information problem and overall information evaluation before and after Web searching. The second kind, the state shift, reflected users’ changes in focus between the different cognitive states during the course of Web searching. Cognitive states included users’ focus on the states of topic, strategy, evaluation, view and overview. (3) Three levels of cognitive coordination behaviour were identified: the information task coordination level, the coordination mechanism level, and the strategy coordination level. The three levels of cognitive coordination behaviour interplayed to support multiple information tasks switching. (4) An important relationship existed between multitasking, cognitive coordination and cognitive shifts during Web searching. Cognitive coordination as a management mechanism bound together other cognitive processes, including multitasking and cognitive shifts, in order to move through users’ Web searching process. (5) Web search interaction was shown to be a multitasking process which included information problems ordering, task switching and task and mental coordinating; also, at a deeper level, cognitive shifts took place. Cognitive coordination was the hinge behaviour linking multitasking and cognitive shifts. Without cognitive coordination, neither multitasking Web searching behaviour nor the complicated mental process of cognitive shifting could occur. The preliminary model was revisited with these empirical findings. A revised theoretical model (MCC Model) was built to illustrate the relationship between multitasking, cognitive coordination and cognitive shifts during Web searching. Implications and limitations of the study are also discussed, along with future research work.