896 resultados para Web, Search Engine, Overlap


Relevância:

100.00% 100.00%

Publicador:

Resumo:

As a model for knowledge description and formalization, ontologies are widely used to represent user profiles in personalized web information gathering. However, when representing user profiles, many models have utilized only knowledge from either a global knowledge base or a user local information. In this paper, a personalized ontology model is proposed for knowledge representation and reasoning over user profiles. This model learns ontological user profiles from both a world knowledge base and user local instance repositories. The ontology model is evaluated by comparing it against benchmark models in web information gathering. The results show that this ontology model is successful.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Performance comparisons between File Signatures and Inverted Files for text retrieval have previously shown several significant shortcomings of file signatures relative to inverted files. The inverted file approach underpins most state-of-the-art search engine algorithms, such as Language and Probabilistic models. It has been widely accepted that traditional file signatures are inferior alternatives to inverted files. This paper describes TopSig, a new approach to the construction of file signatures. Many advances in semantic hashing and dimensionality reduction have been made in recent times, but these were not so far linked to general purpose, signature file based, search engines. This paper introduces a different signature file approach that builds upon and extends these recent advances. We are able to demonstrate significant improvements in the performance of signature file based indexing and retrieval, performance that is comparable to that of state of the art inverted file based systems, including Language models and BM25. These findings suggest that file signatures offer a viable alternative to inverted files in suitable settings and positions the file signatures model in the class of Vector Space retrieval models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The interoperable and loosely-coupled web services architecture, while beneficial, can be resource-intensive, and is thus susceptible to denial of service (DoS) attacks in which an attacker can use a relatively insignificant amount of resources to exhaust the computational resources of a web service. We investigate the effectiveness of defending web services from DoS attacks using client puzzles, a cryptographic countermeasure which provides a form of gradual authentication by requiring the client to solve some computationally difficult problems before access is granted. In particular, we describe a mechanism for integrating a hash-based puzzle into existing web services frameworks and analyze the effectiveness of the countermeasure using a variety of scenarios on a network testbed. Client puzzles are an effective defence against flooding attacks. They can also mitigate certain types of semantic-based attacks, although they may not be the optimal solution.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Since manually constructing domain-specific sentiment lexicons is extremely time consuming and it may not even be feasible for domains where linguistic expertise is not available. Research on the automatic construction of domain-specific sentiment lexicons has become a hot topic in recent years. The main contribution of this paper is the illustration of a novel semi-supervised learning method which exploits both term-to-term and document-to-term relations hidden in a corpus for the construction of domain specific sentiment lexicons. More specifically, the proposed two-pass pseudo labeling method combines shallow linguistic parsing and corpusbase statistical learning to make domain-specific sentiment extraction scalable with respect to the sheer volume of opinionated documents archived on the Internet these days. Another novelty of the proposed method is that it can utilize the readily available user-contributed labels of opinionated documents (e.g., the user ratings of product reviews) to bootstrap the performance of sentiment lexicon construction. Our experiments show that the proposed method can generate high quality domain-specific sentiment lexicons as directly assessed by human experts. Moreover, the system generated domain-specific sentiment lexicons can improve polarity prediction tasks at the document level by 2:18% when compared to other well-known baseline methods. Our research opens the door to the development of practical and scalable methods for domain-specific sentiment analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Intuitively, any ‘bag of words’ approach in IR should benefit from taking term dependencies into account. Unfortunately, for years the results of exploiting such dependencies have been mixed or inconclusive. To improve the situation, this paper shows how the natural language properties of the target documents can be used to transform and enrich the term dependencies to more useful statistics. This is done in three steps. The term co-occurrence statistics of queries and documents are each represented by a Markov chain. The paper proves that such a chain is ergodic, and therefore its asymptotic behavior is unique, stationary, and independent of the initial state. Next, the stationary distribution is taken to model queries and documents, rather than their initial distributions. Finally, ranking is achieved following the customary language modeling paradigm. The main contribution of this paper is to argue why the asymptotic behavior of the document model is a better representation then just the document’s initial distribution. A secondary contribution is to investigate the practical application of this representation in case the queries become increasingly verbose. In the experiments (based on Lemur’s search engine substrate) the default query model was replaced by the stable distribution of the query. Just modeling the query this way already resulted in significant improvements over a standard language model baseline. The results were on a par or better than more sophisticated algorithms that use fine-tuned parameters or extensive training. Moreover, the more verbose the query, the more effective the approach seems to become.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Major Web search engines, such as AltaVista, are essential tools in the quest to locate online information. This article reports research that used transaction log analysis to examine the characteristics and changes in AltaVista Web searching that occurred from 1998 to 2002. The research questions we examined are (1) What are the changes in AltaVista Web searching from 1998 to 2002? (2) What are the current characteristics of AltaVista searching, including the duration and frequency of search sessions? (3) What changes in the information needs of AltaVista users occurred between 1998 and 2002? The results of our research show (1) a move toward more interactivity with increases in session and query length, (2) with 70% of session durations at 5 minutes or less, the frequency of interaction is increasing, but it is happening very quickly, and (3) a broadening range of Web searchers' information needs, with the most frequent terms accounting for less than 1% of total term usage. We discuss the implications of these findings for the development of Web search engines. © 2005 Wiley Periodicals, Inc.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Discovering proper search intents is a vi- tal process to return desired results. It is constantly a hot research topic regarding information retrieval in recent years. Existing methods are mainly limited by utilizing context-based mining, query expansion, and user profiling techniques, which are still suffering from the issue of ambiguity in search queries. In this pa- per, we introduce a novel ontology-based approach in terms of a world knowledge base in order to construct personalized ontologies for identifying adequate con- cept levels for matching user search intents. An iter- ative mining algorithm is designed for evaluating po- tential intents level by level until meeting the best re- sult. The propose-to-attempt approach is evaluated in a large volume RCV1 data set, and experimental results indicate a distinct improvement on top precision after compared with baseline models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The existing Collaborative Filtering (CF) technique that has been widely applied by e-commerce sites requires a large amount of ratings data to make meaningful recommendations. It is not directly applicable for recommending products that are not frequently purchased by users, such as cars and houses, as it is difficult to collect rating data for such products from the users. Many of the e-commerce sites for infrequently purchased products are still using basic search-based techniques whereby the products that match with the attributes given in the target user's query are retrieved and recommended to the user. However, search-based recommenders cannot provide personalized recommendations. For different users, the recommendations will be the same if they provide the same query regardless of any difference in their online navigation behaviour. This paper proposes to integrate collaborative filtering and search-based techniques to provide personalized recommendations for infrequently purchased products. Two different techniques are proposed, namely CFRRobin and CFAg Query. Instead of using the target user's query to search for products as normal search based systems do, the CFRRobin technique uses the products in which the target user's neighbours have shown interest as queries to retrieve relevant products, and then recommends to the target user a list of products by merging and ranking the returned products using the Round Robin method. The CFAg Query technique uses the products that the user's neighbours have shown interest in to derive an aggregated query, which is then used to retrieve products to recommend to the target user. Experiments conducted on a real e-commerce dataset show that both the proposed techniques CFRRobin and CFAg Query perform better than the standard Collaborative Filtering (CF) and the Basic Search (BS) approaches, which are widely applied by the current e-commerce applications. The CFRRobin and CFAg Query approaches also outperform the e- isting query expansion (QE) technique that was proposed for recommending infrequently purchased products.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

For more than a decade research in the field of context aware computing has aimed to find ways to exploit situational information that can be detected by mobile computing and sensor technologies. The goal is to provide people with new and improved applications, enhanced functionality and better use experience (Dey, 2001). Early applications focused on representing or computing on physical parameters, such as showing your location and the location of people or things around you. Such applications might show where the next bus is, which of your friends is in the vicinity and so on. With the advent of social networking software and microblogging sites such as Facebook and Twitter, recommender systems and so on context-aware computing is moving towards mining the social web in order to provide better representations and understanding of context, including social context. In this paper we begin by recapping different theoretical framings of context. We then discuss the problem of context- aware computing from a design perspective.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Many user studies in Web information searching have found the significant effect of task types on search strategies. However, little attention was given to Web image searching strategies, especially the query reformulation activity despite that this is a crucial part in Web image searching. In this study, we investigated the effects of topic domains and task types on user’s image searching behavior and query reformulation strategies. Some significant differences in user’s tasks specificity and initial concepts were identified among the task domains. Task types are also found to influence participant’s result reviewing behavior and query reformulation strategies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses users’ query reformulation behaviour while searching information on the Web. Query reformulations have emerged as an important component of Web search behaviour and human-computer interaction (HCI) because a user’s success of information retrieval (IR) depends on how he or she formulates queries. There are various factors, such as cognitive styles, that influence users’ query reformulation behaviour. Understanding how users with different cognitive styles formulate their queries while performing Web searches can help HCI researchers and information systems (IS) developers to provide assistance to the users. This paper aims to examine the effects of users’ cognitive styles on their query reformation behaviour. To achieve the goal of the study, a user study was conducted in which a total of 3613 search terms and 872 search queries were submitted by 50 users who engaged in 150 scenario-based search tasks. Riding’s (1991) Cognitive Style Analysis (CSA) test was used to assess users’ cognitive style as wholist or analytic, and verbaliser or imager. The study findings show that users’ query reformulation behaviour is affected by their cognitive styles. The results reveal that analytic users tended to prefer Add queries while all other users preferred New queries. A significant difference was found among wholists and analytics in the manner they performed Remove query reformulations. Future HCI researchers and IS developers can utilize the study results to develop interactive and user-cantered search model, and to provide context-based query suggestions for users.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Success of query reformulation and relevant information retrieval depends on many factors, such as users’ prior knowledge, age, gender, and cognitive styles. One of the important factors that affect a user’s query reformulation behaviour is that of the nature of the search tasks. Limited studies have examined the impact of the search task types on query reformulation behaviour while performing Web searches. This paper examines how the nature of the search tasks affects users’ query reformulation behaviour during information searching. The paper reports empirical results from a user study in which 50 participants performed a set of three Web search tasks – exploratory, factorial and abstract. Users’ interactions with search engines were logged by using a monitoring program. 872 unique search queries were classified into five query types – New, Add, Remove, Replace and Repeat. Users submitted fewer queries for the factual task, which accounted for 26%. They completed a higher number of queries (40% of the total queries) while carrying out the exploratory task. A one-way MANOVA test indicated a significant effect of search task types on users’ query reformulation behaviour. In particular, the search task types influenced the manner in which users reformulated the New and Repeat queries.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recent Australian early childhood policy and curriculum guidelines promoting the use of technologies invite investigations of young children’s practices in classrooms. This study examined the practices of one preparatory year classroom, to show teacher and child interactions as they engaged in Web searching. The study investigated the in situ practices of the teacher and children to show how they accomplished the Web search. The data corpus consists of eight hours of videorecorded interactions over three days where children and teachers engaged in Web searching. One episode was selected that showed a teacher and two children undertaking a Web search. The episode is shown to consist of four phases: deciding on a new search subject, inputting the search query, considering the result options, and exploring the selected result. The sociological perspectives of ethnomethodology and conversation analysis were employed as the conceptual and methodological frameworks of the study, to analyse the video-recorded teacher and child interactions as they co-constructed a Web search. Ethnomethodology is concerned with how people make ‘sense’ in everyday interactions, and conversation analysis focuses on the sequential features of interaction to show how the interaction unfolds moment by moment. This extended single case analysis showed how the Web search was accomplished over multiple turns, and how the children and teacher collaboratively engaged in talk. There are four main findings. The first was that Web searching featured sustained teacher-child interaction, requiring a particular sort of classroom organisation to enable the teacher to work in this sustained way. The second finding was that the teacher’s actions recognised the children’s interactional competence in situ, orchestrating an interactional climate where everyone was heard. The third finding was that the teacher drew upon a range of interactional resources designed to progress the activity at hand, that of accomplishing the Web search. The teacher drew upon the interactional resources of interrogatives, discourse markers, and multi-unit turns during the Web search, and these assisted the teacher and children to co-construct their discussion, decide upon and co-ordinate their future actions, and accomplish the Web search in a timely way. The fourth finding explicates how particular social and pedagogic orders are accomplished through talk, where children collaborated with each other and with the teacher to complete the Web search. The study makes three key recommendations for the field of early childhood education. The study’s first recommendation is that fine-grained transcription and analysis of interaction aids in understanding interactional practices of Web searching. This study offers material for use in professional development, such as using transcribed and videorecorded interactions to highlight how teachers strategically engage with children, that is, how talk works in classroom settings. Another strategy is to focus on the social interactions of members engaging in Web searches, which is likely to be of interest to teachers as they work to engage with children in an increasingly online environment. The second recommendation involves classroom organisation; how teachers consider and plan for extended periods of time for Web searching, and how teachers accommodate children’s prior knowledge of Web searching in their classrooms. The third recommendation is in relation to future empirical research, with suggested possible topics focusing on the social interactions of children as they engage with peers as they Web search, as well as investigations of techno-literacy skills as children use the Internet in the early years.