874 resultados para Search intent
Resumo:
Discovering proper search intents is a vi- tal process to return desired results. It is constantly a hot research topic regarding information retrieval in recent years. Existing methods are mainly limited by utilizing context-based mining, query expansion, and user profiling techniques, which are still suffering from the issue of ambiguity in search queries. In this pa- per, we introduce a novel ontology-based approach in terms of a world knowledge base in order to construct personalized ontologies for identifying adequate con- cept levels for matching user search intents. An iter- ative mining algorithm is designed for evaluating po- tential intents level by level until meeting the best re- sult. The propose-to-attempt approach is evaluated in a large volume RCV1 data set, and experimental results indicate a distinct improvement on top precision after compared with baseline models.
Resumo:
The keyword based search technique suffers from the problem of synonymic and polysemic queries. Current approaches address only theproblem of synonymic queries in which different queries might have the same information requirement. But the problem of polysemic queries,i.e., same query having different intentions, still remains unaddressed. In this paper, we propose the notion of intent clusters, the members of which will have the same intention. We develop a clustering algorithm that uses the user session information in query logs in addition to query URL entries to identify cluster of queries having the same intention. The proposed approach has been studied through case examples from the actual log data from AOL, and the clustering algorithm is shown to be successful in discerning the user intentions.
Resumo:
In this paper, we define and present a comprehensive classification of user intent for Web searching. The classification consists of three hierarchical levels of informational, navigational, and transactional intent. After deriving attributes of each, we then developed a software application that automatically classified queries using a Web search engine log of over a million and a half queries submitted by several hundred thousand users. Our findings show that more than 80% of Web queries are informational in nature, with about 10% each being navigational and transactional. In order to validate the accuracy of our algorithm, we manually coded 400 queries and compared the results from this manual classification to the results determined by the automated method. This comparison showed that the automatic classification has an accuracy of 74%. Of the remaining 25% of the queries, the user intent is vague or multi-faceted, pointing to the need for probabilistic classification. We discuss how search engines can use knowledge of user intent to provide more targeted and relevant results in Web searching.
Resumo:
Purpose: Web search engines are frequently used by people to locate information on the Internet. However, not all queries have an informational goal. Instead of information, some people may be looking for specific web sites or may wish to conduct transactions with web services. This paper aims to focus on automatically classifying the different user intents behind web queries. Design/methodology/approach: For the research reported in this paper, 130,000 web search engine queries are categorized as informational, navigational, or transactional using a k-means clustering approach based on a variety of query traits. Findings: The research findings show that more than 75 percent of web queries (clustered into eight classifications) are informational in nature, with about 12 percent each for navigational and transactional. Results also show that web queries fall into eight clusters, six primarily informational, and one each of primarily transactional and navigational. Research limitations/implications: This study provides an important contribution to web search literature because it provides information about the goals of searchers and a method for automatically classifying the intents of the user queries. Automatic classification of user intent can lead to improved web search engines by tailoring results to specific user needs. Practical implications: The paper discusses how web search engines can use automatically classified user queries to provide more targeted and relevant results in web searching by implementing a real time classification method as presented in this research. Originality/value: This research investigates a new application of a method for automatically classifying the intent of user queries. There has been limited research to date on automatically classifying the user intent of web queries, even though the pay-off for web search engines can be quite beneficial. © Emerald Group Publishing Limited.
Resumo:
Entity-oriented retrieval aims to return a list of relevant entities rather than documents to provide exact answers for user queries. The nature of entity-oriented retrieval requires identifying the semantic intent of user queries, i.e., understanding the semantic role of query terms and determining the semantic categories which indicate the class of target entities. Existing methods are not able to exploit the semantic intent by capturing the semantic relationship between terms in a query and in a document that contains entity related information. To improve the understanding of the semantic intent of user queries, we propose concept-based retrieval method that not only automatically identifies the semantic intent of user queries, i.e., Intent Type and Intent Modifier but introduces concepts represented by Wikipedia articles to user queries. We evaluate our proposed method on entity profile documents annotated by concepts from Wikipedia category and list structure. Empirical analysis reveals that the proposed method outperforms several state-of-the-art approaches.
Resumo:
In most intent recognition studies, annotations of query intent are created post hoc by external assessors who are not the searchers themselves. It is important for the field to get a better understanding of the quality of this process as an approximation for determining the searcher's actual intent. Some studies have investigated the reliability of the query intent annotation process by measuring the interassessor agreement. However, these studies did not measure the validity of the judgments, that is, to what extent the annotations match the searcher's actual intent. In this study, we asked both the searchers themselves and external assessors to classify queries using the same intent classification scheme. We show that of the seven dimensions in our intent classification scheme, four can reliably be used for query annotation. Of these four, only the annotations on the topic and spatial sensitivity dimension are valid when compared with the searcher's annotations. The difference between the interassessor agreement and the assessor-searcher agreement was significant on all dimensions, showing that the agreement between external assessors is not a good estimator of the validity of the intent classifications. Therefore, we encourage the research community to consider using query intent classifications by the searchers themselves as test data.
Resumo:
Expert searchers engage with information as information brokers, researchers, reference librarians, information architects, faculty who teach advanced search, and in a variety of other information-intensive professions. Their experiences are characterized by a profound understanding of information concepts and skills and they have an agile ability to apply this knowledge to interacting with and having an impact on the information environment. This study explored the learning experiences of searchers to understand the acquisition of search expertise. The research question was: What can be learned about becoming an expert searcher from the learning experiences of proficient novice searchers and highly experienced searchers? The key objectives were: (1) to explore the existence of threshold concepts in search expertise; (2) to improve our understanding of how search expertise is acquired and how novice searchers, intent on becoming experts, can learn to search in more expertlike ways. The participant sample drew from two population groups: (1) highly experienced searchers with a minimum of 20 years of relevant professional experience, including LIS faculty who teach advanced search, information brokers, and search engine developers (11 subjects); and (2) MLIS students who had completed coursework in information retrieval and online searching and demonstrated exceptional ability (9 subjects). Using these two groups allowed a nuanced understanding of the experience of learning to search in expertlike ways, with data from those who search at a very high level as well as those who may be actively developing expertise. The study used semi-structured interviews, search tasks with think-aloud narratives, and talk-after protocols. Searches were screen-captured with simultaneous audio-recording of the think-aloud narrative. Data were coded and analyzed using NVivo9 and manually. Grounded theory allowed categories and themes to emerge from the data. Categories represented conceptual knowledge and attributes of expert searchers. In accord with grounded theory method, once theoretical saturation was achieved, during the final stage of analysis the data were viewed through lenses of existing theoretical frameworks. For this study, threshold concept theory (Meyer & Land, 2003) was used to explore which concepts might be threshold concepts. Threshold concepts have been used to explore transformative learning portals in subjects ranging from economics to mathematics. A threshold concept has five defining characteristics: transformative (causing a shift in perception), irreversible (unlikely to be forgotten), integrative (unifying separate concepts), troublesome (initially counter-intuitive), and may be bounded. Themes that emerged provided evidence of four concepts which had the characteristics of threshold concepts. These were: information environment: the total information environment is perceived and understood; information structures: content, index structures, and retrieval algorithms are understood; information vocabularies: fluency in search behaviors related to language, including natural language, controlled vocabulary, and finesse using proximity, truncation, and other language-based tools. The fourth threshold concept was concept fusion, the integration of the other three threshold concepts and further defined by three properties: visioning (anticipating next moves), being light on one's 'search feet' (dancing property), and profound ontological shift (identity as searcher). In addition to the threshold concepts, findings were reported that were not concept-based, including praxes and traits of expert searchers. A model of search expertise is proposed with the four threshold concepts at its core that also integrates the traits and praxes elicited from the study, attributes which are likewise long recognized in LIS research as present in professional searchers. The research provides a deeper understanding of the transformative learning experiences involved in the acquisition of search expertise. It adds to our understanding of search expertise in the context of today's information environment and has implications for teaching advanced search, for research more broadly within library and information science, and for methodologies used to explore threshold concepts.
Resumo:
In this paper we define two models of users that require diversity in search results; these models are theoretically grounded in the notion of intrinsic and extrinsic diversity. We then examine Intent-Aware Expected Reciprocal Rank (ERR-IA), one of the official measures used to assess diversity in TREC 2011-12, with respect to the proposed user models. By analyzing ranking preferences as expressed by the user models and those estimated by ERR-IA, we investigate whether ERR-IA assesses document rankings according to the requirements of the diversity retrieval task expressed by the two models. Empirical results demonstrate that ERR-IA neglects query-intents coverage by attributing excessive importance to redundant relevant documents. ERR-IA behavior is contrary to the user models that require measures to first assess diversity through the coverage of intents, and then assess the redundancy of relevant intents. Furthermore, diversity should be considered separately from document relevance and the documents positions in the ranking.
Resumo:
The problem of identifying user intent has received considerable attention in recent years, particularly in the context of improving the search experience via query contextualization. Intent can be characterized by multiple dimensions, which are often not observed from query words alone. Accurate identification of Intent from query words remains a challenging problem primarily because it is extremely difficult to discover these dimensions. The problem is often significantly compounded due to lack of representative training sample. We present a generic, extensible framework for learning the multi-dimensional representation of user intent from the query words. The approach models the latent relationships between facets using tree structured distribution which leads to an efficient and convergent algorithm, FastQ, for identifying the multi-faceted intent of users based on just the query words. We also incorporated WordNet to extend the system capabilities to queries which contain words that do not appear in the training data. Empirical results show that FastQ yields accurate identification of intent when compared to a gold standard.
Resumo:
A intenção deste trabalho é explorar dinâmicas de competição por meio de “simulação baseada em agentes”. Apoiando-se em um crescente número de estudos no campo da estratégia e teoria das organizações que utilizam métodos de simulação, desenvolveu-se um modelo computacional para simular situações de competição entre empresas e observar a eficiência relativa dos métodos de busca de melhoria de desempenho teorizados. O estudo também explora possíveis explicações para a persistência de desempenho superior ou inferior das empresas, associados às condições de vantagem ou desvantagem competitiva