837 results for Search Engine Indexing
Abstract:
Search engines sometimes search the full text of documents or web pages, and sometimes only selected parts of the documents, e.g. their titles. Full-text search may consume a lot of computing resources and time. It may be possible to save resources by searching document titles only, assuming that the title of a document provides a concise representation of its content. We tested this assumption using the Google search engine. We ran search queries defined by users, distinguishing between two types of queries/users: queries of users who are familiar with the area of the search, and queries of users who are not. We found that searches which use titles provide similar and sometimes even (slightly) better results compared to searches which use the full text. These results hold for both types of queries/users. Moreover, we found an advantage in title search when searching in unfamiliar areas, because the general terms used in queries in unfamiliar areas match better with the general terms which tend to be used in document titles.
Abstract:
Most Internet search engines are keyword-based. They are not efficient for queries where geographical location is important, such as finding hotels within an area or close to a place of interest. A natural interface for spatial searching is a map, which can be used not only to display locations of search results but also to assist in forming search conditions. A map-based search engine requires a well-designed visual interface that is intuitive to use yet flexible and expressive enough to support various types of spatial queries as well as aspatial queries. Similar to hyperlinks for text and images in an HTML page, spatial objects in a map should support hyperlinks. Such an interface needs to be scalable with the size of the geographical regions and the number of websites it covers. Despite typically handling a very large amount of spatial data, a map-based search interface should meet the expectation of fast response time for interactive applications. In this paper we discuss general requirements and the design for a new map-based web search interface, focusing on integration with the WWW and the visual spatial query interface. A number of current and future research issues are discussed, and a prototype for the University of Queensland is presented. (C) 2001 Published by Elsevier Science Ltd.
Abstract:
Yandex is the dominant search engine in Russia, followed by the world leader Google. This study focuses on the performance differences between the two in search advertising in the context of tourism, by running two identical campaigns and measuring KPIs, such as CPA (cost-per-action), on both campaigns. Search engine advertising is a new and fast-changing form of advertising, which should be studied frequently in order to keep up with the changes. The research was done as an experimental study in cooperation with a Finnish tourism company, and the data was gathered from the clickstream and not from questionnaires, which is the method recommended by the literature. The results of the study suggest that Yandex.Direct performed better in the selected niche and that individual campaign planning for Yandex.Direct and Google AdWords is an important part of the optimization of search advertising in Russia.
Abstract:
Given the significant growth of the Internet in recent years, marketers have been striving for new techniques and strategies to prosper in the online world. Statistically, search engines have been the most dominant channels of Internet marketing in recent years. However, the mechanics of advertising in such a marketplace have created a challenging environment for marketers to position their ads among their competitors. This study uses a unique cross-sectional dataset of the top 500 Internet retailers in North America and hierarchical multiple regression analysis to empirically investigate the effect of keyword competition on the relationship between ad position and its determinants in the sponsored search market. To this end, the study utilizes the literature on consumer search behavior, keyword auction mechanism design, and search advertising performance as the theoretical foundation. This study is the first of its kind to examine the sponsored search market characteristics in a cross-sectional setting where the level of keyword competition is explicitly captured in terms of the number of Internet retailers competing for similar keywords. Internet retailing provides an appropriate setting for this study given the high-stakes battle for market share and intense competition for keywords in the sponsored search marketplace. The findings of this study indicate that bid values and ad relevancy metrics as well as their interaction affect the position of ads on the search engine result pages (SERPs). These results confirm some of the findings from previous studies that examined sponsored search advertising performance at a keyword level. Furthermore, the study finds that the position of ads for web-only retailers is dependent on bid values and ad relevancy metrics, whereas multi-channel retailers are more reliant on their bid values. This difference between web-only and multi-channel retailers is also observed in the moderating effect of keyword competition on the relationships between ad position and its key determinants. Specifically, this study finds that keyword competition has significant moderating effects only for multi-channel retailers.
Abstract:
Search engines are part of our daily lives. Currently, more than a third of the world's population uses the Internet. Search engines allow these users to quickly find the information or products they want. Information retrieval (IR) is the foundation of modern search engines. Traditional IR approaches assume that index terms are independent. However, terms that appear in the same context are often dependent. Failing to take these dependencies into account is one cause of noise in the results (non-relevant results). Some studies have proposed integrating certain types of dependency, such as proximity, co-occurrence, adjacency and grammatical dependency. In most cases, the dependency models are built separately and then combined with the traditional word-based model using a constant weight. Consequently, they cannot properly capture variable dependency and dependency strength. For example, the dependency between the adjacent words "Black Friday" is stronger than that between the words "road constructions". In this thesis, we study different approaches to capturing term relationships and their dependency strengths. We propose the following methods. First, we re-examine the combination approach using different indexing units for monolingual Chinese IR and cross-language IR between English and Chinese. In addition to words, we study the possibility of using bigrams and unigrams as translation units for Chinese. Several translation models are built to translate English words into Chinese unigrams, bigrams and words using a parallel corpus. An English query is then translated in several ways, and a ranking score is produced with each translation. The final ranking score combines all these types of translation. Second, we consider the dependency between terms using the Dempster-Shafer theory of evidence. An occurrence of a text fragment (of several words) in a document is considered to represent the set of all its constituent terms. Probability is assigned to such a set of terms rather than to each individual term. At query evaluation time, this probability is redistributed to the query terms if the latter are different. This approach allows us to integrate dependency relationships between terms. Third, we propose a discriminative model to integrate the different types of dependency according to their strength and their usefulness for IR. In particular, we consider adjacency and co-occurrence dependencies at different distances, i.e. bigrams and term pairs within a window of 2, 4, 8 and 16 words. The weight of a bigram or a pair of dependent terms is determined from a set of features, using SVM regression. All proposed methods are evaluated on several English and/or Chinese collections, and the experimental results show that these methods yield substantial improvements over the state of the art.
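The adjacency and co-occurrence units mentioned in this abstract (bigrams, and term pairs within windows of 2, 4, 8 and 16 words) can be illustrated with a minimal Python sketch. The function names, the toy sentence and the choice to count unordered pairs are assumptions made for illustration; the thesis's actual feature extraction and its SVM-regression weighting are not reproduced here.

```python
from collections import Counter


def bigrams(tokens):
    """Count adjacent ordered word pairs (adjacency dependency)."""
    return Counter(zip(tokens, tokens[1:]))


def cooccurrence_pairs(tokens, window):
    """Count unordered pairs of distinct terms that fall inside a span of
    `window` consecutive tokens (hypothetical helper, not from the thesis)."""
    counts = Counter()
    for i, left in enumerate(tokens):
        # Positions i+1 .. i+window-1 share a window of `window` tokens with i.
        for right in tokens[i + 1 : i + window]:
            if left != right:
                counts[frozenset((left, right))] += 1
    return counts


# Toy input, assumed for illustration only.
tokens = "black friday sales start after black friday eve".split()

# One feature table per window size named in the abstract.
features = {w: cooccurrence_pairs(tokens, w) for w in (2, 4, 8, 16)}

pair = frozenset(("black", "friday"))
print(bigrams(tokens)[("black", "friday")])   # adjacent occurrences
print({w: features[w][pair] for w in features})  # counts per window size
```

In a model of the kind the abstract describes, counts like these would be inputs to a learned weighting function rather than being combined with a fixed constant.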
Abstract:
The ISO 9241 series of standards states criteria for the ergonomics of human-system interaction. In markets with a huge variety of offers and little possibility of differentiation, providers can gain a decisive competitive advantage through user-oriented interfaces. A precondition for this is that relevant information can be obtained for entrepreneurial decisions in this regard. To test how users of universal search result pages use those pages and pay attention to different elements, an eye tracking experiment with a mixed design was developed. Twenty subjects were confronted with search engine result pages (SERPs) and were instructed to make a decision under the conditions “national vs. international city” and “with vs. without miniaturized Google map”. Different parameters such as fixation count, duration and time to first fixation were computed from the eye tracking raw data and supplemented by click rate data as well as data from questionnaires. Results of this pilot study revealed some remarkable facts, such as a vampire effect on miniaturized Google maps. Furthermore, Google maps did not shorten the process of decision making, Google ads were not fixated, and visual attention on SERPs was influenced by the position of the elements on the SERP and by the users’ familiarity with the search target. These results support the theory of Amount of Invested Mental Effort (AIME) and give providers empirical evidence to take users’ expectations into account. Furthermore, the results indicated that the task-oriented goal mode of participants was a moderator for the attention spent on ads. Most importantly, SERPs with images attracted the viewers’ attention much longer than those without images. This unique selling proposition may lead to a distortion of competition on markets.
Abstract:
This paper describes the implementation of a semantic web search engine on conversation-style transcripts. Our choice of data is Hansard, a publicly available conversation-style transcript of parliamentary debates. The current search engine implementation on Hansard is limited to running search queries based on keywords or phrases, and hence lacks the ability to make semantic inferences from user queries. By making use of knowledge such as the relationships between members of parliament, constituencies and terms of office, as well as topics of debates, the search results can be improved in terms of both relevance and coverage. Our contribution is not algorithmic; instead, we describe how we exploit a collection of external data sources, ontologies, semantic web vocabularies and named entity extraction in the analysis of the underlying semantics of user queries, as well as the semantic enrichment of the search index, thereby improving the quality of results.
Abstract:
We analyze the impact on consumer prices of the size and bias of price comparison search engines. In the context of a model related to Burdett and Judd (1983) and Varian (1980), we develop and test experimentally several theoretical predictions. The experimental results confirm the model’s predictions regarding the impact of the number of firms, and the type of bias of the search engine, but reject the model’s predictions regarding changes in the size of the index. The explanatory power of an econometric model for the price distributions is significantly improved when variables accounting for risk attitudes are introduced.
Abstract:
The promise of search-driven development is that developers will save time and resources by reusing external code in their local projects. To efficiently integrate this code, users must be able to trust it, thus trustability of code search results is just as important as their relevance. In this paper, we introduce a trustability metric to help users assess the quality of code search results and therefore ease the cost-benefit analysis they undertake trying to find suitable integration candidates. The proposed trustability metric incorporates both user votes and cross-project activity of developers to calculate a "karma" value for each developer. Through the karma value of all its developers a project is ranked on a trustability scale. We present JBENDER, a proof-of-concept code search engine which implements our trustability metric and we discuss preliminary results from an evaluation of the prototype.
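The abstract states only that the metric combines user votes and cross-project developer activity into a per-developer "karma" value, and that a project is ranked through the karma of all its developers. The Python sketch below is one possible reading under loudly stated assumptions: the linear formula, the `activity_weight` value, the use of a mean, and all names are hypothetical and are not taken from JBENDER.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Developer:
    name: str
    votes: int      # user votes received (assumed input)
    projects: int   # number of projects the developer is active in (assumed input)


def karma(dev, activity_weight=2.0):
    """Hypothetical karma: votes plus weighted cross-project activity."""
    return dev.votes + activity_weight * dev.projects


def project_trust(developers):
    """Hypothetical project score: mean karma of the project's developers."""
    return mean(karma(d) for d in developers)


# Invented example data, for illustration only.
alice = Developer("alice", votes=10, projects=3)
bob = Developer("bob", votes=2, projects=1)
projects = {"lib_a": [alice, bob], "lib_b": [bob]}

# Rank projects from most to least trustable under this toy metric.
ranking = sorted(projects, key=lambda p: project_trust(projects[p]), reverse=True)
print(ranking)
```

Averaging is only one aggregation choice; a sum or a maximum would rank large teams differently, and the abstract does not say which the authors use.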
Abstract:
Our research project develops an intranet search engine with concept-browsing functionality, where the user is able to navigate the conceptual level in an interactive, automatically generated knowledge map. This knowledge map visualizes tacit, implicit knowledge, extracted from the intranet, as a network of semantic concepts. Inductive and deductive methods are combined; a text analytics engine extracts knowledge structures from data inductively, and the enterprise ontology provides a backbone structure to the process deductively. In addition to performing conventional keyword search, the user can browse the semantic network of concepts and associations to find documents and data records. Also, the user can expand and edit the knowledge network directly. As a vision, we propose a knowledge-management system that provides concept-browsing, based on a knowledge warehouse layer on top of a heterogeneous knowledge base with various systems interfaces. Such a concept browser will empower knowledge workers to interact with knowledge structures.
Abstract:
After leading the research and indexing of information for nearly a decade, the Google search engine has become an economic system that influences our contemporary world, contributing greatly to the transformation of our world into a single virtual globe. In recent years, Google has begun offering global users applications or software that are used for our cultural heritage. This software is highlighted here for its potential from an economic, cultural and tourism point of view. We try to describe the most important Google software (such as Google Maps, Google Street View, Google Earth, Google SketchUp, Google Books and Google Art Project), with the greatest and most evident impact on the cultural and tourism sectors. This essay shows the digitization and promotion of Italian cultural heritage on Google, through its software (for example, Google Street View, which has led to the use of remote three-dimensional views of some of Italy's most important monuments and archaeological sites; the use of Google SketchUp has led to the design of a three-dimensional reconstruction of the historic city centre of L'Aquila, devastated by the earthquake of April 2009 and never rebuilt), and through various specific partnership programmes with the Italian Ministry of Culture.
Abstract:
Analyzing how software engineers use the Integrated Development Environment (IDE) is essential to better understanding how engineers carry out their daily tasks. Spotter is a code search engine for the Pharo programming language. Since its inception, Spotter has been rapidly and broadly adopted within the Pharo community. However, little is known about how practitioners employ Spotter to search and navigate within the Pharo code base. This paper evaluates how software engineers use Spotter in practice. To achieve this, we remotely gather user actions called events. These events are then visually rendered using an adequate navigation tool chain. Sequences of events are represented using a visual alphabet. We found a number of usage patterns and identified underused Spotter features. Such findings are essential for improving Spotter.
Abstract:
When a query is passed to multiple search engines, each search engine returns a ranked list of documents. Researchers have demonstrated that combining results, in the form of a "metasearch engine", produces a significant improvement in coverage and search effectiveness. This paper proposes a linear programming mathematical model for optimizing the ranked list result of a given group of Web search engines for an issued query. An application with a numerical illustration shows the advantages of the proposed method. © 2011 Elsevier Ltd. All rights reserved.
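The idea of merging ranked lists from several engines can be illustrated with a small Python sketch. Note that this uses a Borda count, a simple classical rank-aggregation rule swapped in for illustration; it is not the linear programming model proposed in the paper, whose formulation is not given in the abstract. The engine outputs and document identifiers are invented.

```python
from collections import defaultdict


def borda_fuse(ranked_lists):
    """Merge ranked lists with a Borda count: a document at 0-based position p
    earns n - p points, where n is the number of distinct documents overall;
    documents an engine does not return earn nothing from that engine."""
    universe = {doc for ranking in ranked_lists for doc in ranking}
    n = len(universe)
    scores = defaultdict(int)
    for ranking in ranked_lists:
        for position, doc in enumerate(ranking):
            scores[doc] += n - position
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(universe, key=lambda d: (-scores[d], d))


# Hypothetical result lists from three engines for one query.
engine_results = [
    ["d1", "d2", "d3"],
    ["d2", "d1", "d4"],
    ["d2", "d3", "d1"],
]

fused = borda_fuse(engine_results)
print(fused)  # → ['d2', 'd1', 'd3', 'd4']
```

An optimization-based approach such as the one the abstract describes would instead choose the merged ranking by solving for an objective over the engines' lists, rather than by summing fixed positional points.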