930 results for Web Search Behaviour
Abstract:
The iSAC project (Intelligent Citizen Attention Service via the web) began in January 2006, drawing on new scientific knowledge in intelligent agents together with the application of Information and Communication Technologies (ICT) and search engines. The current citizen attention service comprises two areas: face-to-face attention at the offices and telephone attention through the call centre. Staffing and opening-hour limitations make this service lose effectiveness. The aim is to develop a product with a technology capable of expanding and improving the capacity and quality of citizen attention in public administrations of any size. Even so, this project will be exploited especially by town councils, which citizens approach with all kinds of questions and doubts, usually not restricted to the local sphere. More specifically, the goal is to automate citizen attention through a web portal in order to provide a more effective service.
Abstract:
This paper describes the implementation of a semantic web search engine over conversation-style transcripts. Our choice of data is Hansard, a publicly available conversation-style transcript of parliamentary debates. The current search engine implementation on Hansard is limited to running queries based on keywords or phrases and hence lacks the ability to make semantic inferences from user queries. By making use of knowledge such as the relationships between members of parliament, constituencies, terms of office, and topics of debate, the search results can be improved in terms of both relevance and coverage. Our contribution is not algorithmic; instead, we describe how we exploit a collection of external data sources, ontologies, semantic web vocabularies and named entity extraction in the analysis of the underlying semantics of user queries, as well as in the semantic enrichment of the search index, thereby improving the quality of results.
Abstract:
A location-based search engine must be able to find and assign proper locations to Web resources. Host, content and metadata location information are not sufficient to describe the location of resources, as they are ambiguous or unavailable for many documents. We introduce the target location, defined as the location of the users of a Web resource. Target location is content-independent and can be applied to all types of Web resources. A novel method is introduced that uses log files and IP addresses to track the visitors of websites. The experiments show that a target location can be computed for almost all documents on the Web at country level, and for the majority of them at state and city level. It can be assigned to Web resources as a new definition and dimension of location, and can be used on its own or together with other relevant locations to define the geography of Web resources. This compensates for insufficient geographical information on Web resources and would facilitate the design and development of location-based search engines.
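The aggregation step described above can be sketched as follows, assuming the IP addresses from the access log have already been resolved to country codes by an external geolocation database. The function name and the majority threshold are illustrative assumptions, not taken from the paper:

```python
from collections import Counter

def target_location(visitor_countries, threshold=0.5):
    """Return the dominant visitor country for a Web resource, or None.

    `visitor_countries` holds one country code per visit, resolved from
    the IP addresses in the server's access log (the IP-to-country
    lookup itself is assumed to come from a geolocation database)."""
    if not visitor_countries:
        return None
    country, hits = Counter(visitor_countries).most_common(1)[0]
    # Assign a target location only when one country clearly dominates.
    return country if hits / len(visitor_countries) >= threshold else None

# A log where most visits come from Germany:
print(target_location(["DE", "DE", "DE", "AT", "CH", "DE"]))  # → DE
```

The same counting could be repeated over state or city codes to obtain the finer-grained levels the paper reports.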
Abstract:
When a query is passed to multiple search engines, each engine returns a ranked list of documents. Researchers have demonstrated that combining these results, in the form of a "metasearch engine", produces a significant improvement in coverage and search effectiveness. This paper proposes a linear programming model for optimizing the combined ranked list produced by a given group of Web search engines for an issued query. An application with a numerical illustration shows the advantages of the proposed method. © 2011 Elsevier Ltd. All rights reserved.
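The fusion idea can be illustrated with a much simpler baseline than the paper's linear programming model: Borda counting, where a document at rank i in a list of length n earns n − i points from that engine. All names below are hypothetical:

```python
from collections import defaultdict

def borda_fuse(ranked_lists):
    """Fuse several engines' ranked lists by Borda counting.

    This is a classic rank-fusion baseline, not the linear programming
    model of the paper; it only illustrates the metasearch idea."""
    scores = defaultdict(int)
    for ranking in ranked_lists:
        n = len(ranking)
        for i, doc in enumerate(ranking):
            scores[doc] += n - i  # rank 0 earns n points, rank 1 earns n-1, ...
    return sorted(scores, key=scores.get, reverse=True)

engine_a = ["d1", "d2", "d3"]
engine_b = ["d2", "d1", "d4"]
print(borda_fuse([engine_a, engine_b]))  # → ['d1', 'd2', 'd3', 'd4']
```

Documents ranked highly by several engines accumulate the most points, which is the coverage and effectiveness gain the abstract refers to.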
Abstract:
This article presents a new method for data collection in regional dialectology based on site-restricted web searches. The method measures the usage and determines the distribution of lexical variants across a region of interest using common web search engines, such as Google or Bing. The method involves estimating the proportions of the variants of a lexical alternation variable over a series of cities by counting the number of webpages that contain the variants on newspaper websites originating from these cities through site-restricted web searches. The method is evaluated by mapping the 26 variants of 10 lexical variables with known distributions in American English. In almost all cases, the maps based on site-restricted web searches align closely with traditional dialect maps based on data gathered through questionnaires, demonstrating the accuracy of this method for the observation of regional linguistic variation. However, unlike collecting dialect data using traditional methods, which is a relatively slow process, the use of site-restricted web searches allows for dialect data to be collected from across a region as large as the United States in a matter of days.
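The proportion-estimation step can be sketched as follows; the hit counts are invented for illustration and stand in for the page counts returned by site-restricted queries such as `soda site:somecitypaper.com`:

```python
def variant_proportions(hit_counts):
    """Convert per-city hit counts for the variants of a lexical
    alternation (e.g. 'soda' vs. 'pop') into per-city proportions.

    `hit_counts` maps city -> {variant: number of matching webpages},
    as counted via site-restricted searches on that city's newspapers."""
    result = {}
    for city, counts in hit_counts.items():
        total = sum(counts.values())
        result[city] = {v: c / total for v, c in counts.items()} if total else {}
    return result

# Hypothetical counts, made up for illustration:
hits = {"Boston": {"soda": 80, "pop": 20}, "Detroit": {"soda": 30, "pop": 70}}
print(variant_proportions(hits))
```

Mapping these proportions across many cities yields the dialect maps the article validates against questionnaire-based surveys.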
Abstract:
This thesis investigates how web search evaluation can be improved using historical interaction data. Modern search engines combine offline and online evaluation approaches in a sequence of steps that a tested change needs to pass through to be accepted as an improvement and subsequently deployed. We refer to such a sequence of steps as an evaluation pipeline, and consider it to contain three sequential steps: an offline evaluation step, an online evaluation scheduling step, and an online evaluation step. We show that historical user interaction data can aid in improving the accuracy or efficiency of each of these steps, and that as a result the overall efficiency of the entire pipeline is increased. Firstly, we investigate how user interaction data can be used to build accurate offline evaluation methods for query auto-completion mechanisms. We propose a family of offline evaluation metrics for query auto-completion that represent the effort the user has to spend in order to submit their query. The parameters of our proposed metrics are trained against a set of user interactions recorded in the search engine's query logs. From our experimental study, we observe that our proposed metrics are significantly more correlated with an online user satisfaction indicator than the metrics proposed in the existing literature. Hence, fewer changes will pass the offline evaluation step only to be rejected after the online evaluation step, which allows a higher efficiency of the entire evaluation pipeline. Secondly, we state the problem of the optimised scheduling of online experiments. We tackle this problem by considering a greedy scheduler that prioritises the evaluation queue according to the predicted likelihood of success of a particular experiment.
This predictor is trained on a set of online experiments, and uses a diverse set of features to represent an online experiment. Our study demonstrates that a higher number of successful experiments per unit of time can be achieved by deploying such a scheduler on the second step of the evaluation pipeline. Consequently, we argue that the efficiency of the evaluation pipeline can be increased. Next, to improve the efficiency of the online evaluation step, we propose the Generalised Team Draft interleaving framework. Generalised Team Draft considers both the interleaving policy (how often a particular combination of results is shown) and click scoring (how important each click is) as parameters in a data-driven optimisation of the interleaving sensitivity. Further, Generalised Team Draft is applicable beyond domains with a list-based representation of results, i.e. in domains with a grid-based representation, such as image search. Our study using datasets of interleaving experiments performed both in document and image search domains demonstrates that Generalised Team Draft achieves the highest sensitivity. A higher sensitivity indicates that the interleaving experiments can be deployed for a shorter period of time or use a smaller sample of users. Importantly, Generalised Team Draft optimises the interleaving parameters w.r.t. historical interaction data recorded in the interleaving experiments. Finally, we propose to apply the sequential testing methods to reduce the mean deployment time for the interleaving experiments. We adapt two sequential tests for the interleaving experimentation. We demonstrate that one can achieve a significant decrease in experiment duration by using such sequential testing methods. The highest efficiency is achieved by the sequential tests that adjust their stopping thresholds using historical interaction data recorded in diagnostic experiments. 
Our further experimental study demonstrates that cumulative gains in the online experimentation efficiency can be achieved by combining the interleaving sensitivity optimisation approaches, including Generalised Team Draft, and the sequential testing approaches. Overall, the central contributions of this thesis are the proposed approaches to improve the accuracy or efficiency of the steps of the evaluation pipeline: the offline evaluation frameworks for the query auto-completion, an approach for the optimised scheduling of online experiments, a general framework for the efficient online interleaving evaluation, and a sequential testing approach for the online search evaluation. The experiments in this thesis are based on massive real-life datasets obtained from Yandex, a leading commercial search engine. These experiments demonstrate the potential of the proposed approaches to improve the efficiency of the evaluation pipeline.
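As background for the interleaving discussion, here is a minimal sketch of classic Team Draft interleaving, the baseline that Generalised Team Draft extends (the thesis variant additionally optimises the interleaving policy and click scoring from historical data, which is not shown here):

```python
import random

def team_draft_interleave(ranking_a, ranking_b, rng=random):
    """Classic Team Draft interleaving: in each round a coin toss decides
    which ranking drafts first; each side then adds its best not-yet-shown
    result and is credited for any click on that result."""
    interleaved = []  # (document, credited_team) pairs
    used = set()

    def pick(ranking, team):
        for doc in ranking:
            if doc not in used:
                used.add(doc)
                interleaved.append((doc, team))
                return True
        return False

    while True:
        order = [("A", ranking_a), ("B", ranking_b)]
        if rng.random() < 0.5:
            order.reverse()  # coin toss: B drafts first this round
        progressed = False
        for team, ranking in order:
            progressed |= pick(ranking, team)
        if not progressed:
            return interleaved

# Reproducible demo with a seeded generator:
print(team_draft_interleave(["d1", "d2", "d3"], ["d2", "d4"], random.Random(0)))
```

At evaluation time, clicks on documents credited to "A" count as wins for ranker A and vice versa; the thesis's contribution is to tune the policy and click weights so that fewer impressions are needed to detect a winner.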
Abstract:
Eating disorders (ED) are the psychological pathologies that have increased most in recent years. One of the factors behind the high prevalence of eating disorders in our society is widespread ignorance about nutrition, which may stem from consulting online resources with no scientific validity. The aim of this work was to analyse the scientific quality and search-engine ranking of Spanish-language websites with information on nutrition, eating disorders and obesity. Material and methods: a search for web pages was carried out in the Google Chrome browser with the keywords diet, anorexia, bulimia, nutrition and obesity, selecting the first 20 results of each search according to the ranking indices provided by SEOquake (PageRank, Alexa Rank and SEMrush Rank). The variables analysed were: information related to diets and eating habits, information on healthy eating, information on eating disorders and their diagnostic criteria, and educational information on general health topics for professionals. Only 50% of the websites found met the study's inclusion criteria. Most did not follow the quality guidelines established by e-Europe. The median PageRank was 2, except for sites associated with prestigious health institutions. Given the scarcity of health websites with scientific rigour, it is essential to review the existing ones and to create new online spaces supervised by professionals specialised in health and nutrition.
Abstract:
This study examines the status quo of enterprise-wide search in large Austrian companies and sheds light on the factors that influence it. From the analysis of the current state, the need for enterprise search software is derived and the conditions for its successful introduction are outlined. The investigation is based on an online survey of 469 large Austrian companies conducted in 2009 (response rate 22%) and subsequent guided interviews with twelve participants of the online survey. The theoretical part situates the work in the context of information and knowledge management, focusing on the concept of enterprise search, its distinction from searching the Internet, and its range of capabilities. The empirical part first shows how the companies organise their information and which problems arise in doing so, followed by an analysis of the status quo of information search within the companies. Finally, awareness and use of enterprise search software in the target group are examined, and the conditions necessary for introducing this software are identified. The respondents see deficits in particular with regard to company-wide search and the search for subject-matter experts, revealing gaps in knowledge management. 29% of the respondents to the online survey also state that poor information situations occasionally to frequently lead to bad decisions in their companies. Enterprise search software is in use in 17% of the companies that took part in the online survey, and the changes it brings about are judged positively overall. All in all, the results show that enterprise search strategies can only succeed if they are embedded in comprehensive information and knowledge management measures.
Abstract:
The roles of herbivory and predation in determining the structure and diversity of communities have been tested across most intertidal systems. In contrast, the importance of omnivorous consumers remains untested in many rocky shore communities. We tested the role of a small omnivorous crab in an intertidal food web on rocky shores of the sub-tropical southwest Atlantic. Exclusion of the grapsid crab Pachygrapsus transversus in the field resulted in significant changes in the abundance of functional groups in the sublittoral fringe of sheltered shores, where the dominant cover changed from a suite of macroalgae to an assemblage of filter-feeding animals (ascidians, sponges, mussels). In contrast, limpets, whelks, large crabs and fish did not significantly affect community composition of the assemblage. To examine the omnivorous feeding pattern of P. transversus, we did laboratory experiments to test its foraging behaviour among animal and algal groups. The crab showed selective behaviour, preferring invertebrate groups to macroalgae, and opportunistic behaviour among types of prey within those major groups. According to our results, the role of slow-moving and large fast-moving consumers is apparently negligible compared to the effect of an omnivorous consumer. P. transversus plays an important role in determining the intertidal community composition on these subtropical rocky shores, causing changes in the balance of functional groups and controlling invasive species.
Abstract:
Semantic Web technologies such as RDF, OWL and SPARQL have seen strong growth and acceptance in recent years. Projects such as DBPedia and Open Street Map are beginning to show the true potential of Linked Open Data. However, semantic search engines still lag behind this rise of semantic technologies: the available solutions rely mostly on natural language processing resources, and powerful Semantic Web tools such as ontologies, inference engines and semantic query languages are not yet common. In addition, there are certain difficulties in implementing a semantic search engine. As demonstrated in this dissertation, a federated architecture is necessary in order to exploit the full potential of Linked Open Data. However, a federated system in that environment presents performance problems that must be solved through cooperation between data sources, and the current standard query language for the Semantic Web, SPARQL, does not offer a mechanism for such cooperation. This dissertation proposes a federated architecture with mechanisms that enable cooperation between data sources. It addresses the performance problem by proposing a centrally managed index as well as mappings between the data models of each data source. The proposed architecture is modular, allowing repositories and functionality to grow simply and in a decentralised way, much like Linked Open Data and the World Wide Web itself. The architecture handles both natural language term searches and formal SPARQL queries; the repositories considered, however, contain only data in RDF format. This dissertation builds on multiple shared and interlinked ontologies.
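In its simplest form, a centrally managed index of the kind proposed could map terms to the repositories known to hold matching RDF data, so that a federated query is routed only to relevant sources instead of being broadcast to all of them. This sketch and all its names are illustrative, not the dissertation's implementation:

```python
def route_query(terms, central_index):
    """Return the repositories a federated query should visit.

    `central_index` maps lower-cased terms to the names of repositories
    known to hold matching data; unknown terms simply match nothing."""
    sources = set()
    for term in terms:
        sources.update(central_index.get(term.lower(), ()))
    return sorted(sources)

# Hypothetical index over two Linked Open Data sources:
index = {
    "berlin": ["dbpedia"],
    "street": ["openstreetmap"],
    "museum": ["dbpedia", "openstreetmap"],
}
print(route_query(["Berlin", "museum"], index))  # → ['dbpedia', 'openstreetmap']
```

Pruning the set of contacted endpoints this way is one common answer to the federation performance problem the abstract describes.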
Abstract:
We assessed whether a 1-year nationwide, government-supported programme is effective in significantly increasing the number of smoking cessation clinics at major Swiss hospitals, as well as in providing basic training for the staff running them. We conducted a baseline evaluation of hospital services for smoking cessation, hypertension and obesity by web search and telephone contact, followed by personal visits between October 2005 and January 2006 to 44 major public hospitals in the 26 cantons of Switzerland; we compared the number of active smoking cessation services and trained personnel between baseline and 1 year after starting the programme, which included a training workshop for doctors and nurses from all hospitals as well as two further follow-up visits. At baseline, 9 (21%) hospitals had active smoking cessation services, whereas 43 (98%) and 42 (96%) offered medical services for hypertension and obesity respectively. The hospital directors and heads of Internal Medicine of 43 hospitals were interested in offering some form of help to smokers provided they received outside support, primarily funding to get started or to continue. At two identical workshops, 100 health professionals (27 in Lausanne, 73 in Zurich) were trained for one day. After the programme, 22 (50%) hospitals had an active smoking cessation service staffed with at least 1 trained doctor and 1 nurse. A one-year, government-supported national intervention resulted in a substantial increase in the number of hospitals allocating trained staff and offering smoking cessation services to smokers. Compared with the services for hypertension and obesity, however, this offer is still insufficient.
Abstract:
Nowadays an information network like the Internet would be unthinkable without search engines. Thanks to them, any user can obtain information on any topic simply by sending a query from their computer and receiving an answer in a matter of seconds. Among users of Internet search engines it is very common for queries to refer to the company where we work, the city where we live, the places we visit, or even the problems we have or the illnesses we suffer from, with the aim of finding opinions, advice or solutions. In short, through our queries we users provide search engines every day with information about ourselves and our identity which, together with the IP address of the machine from which we issue the queries, makes us lose our anonymity within their systems. This is the problem addressed by this final-year project. It implements a solution to the proposal specified by Alexandre Viejo and Jordi Castellà-Roca in their publication "Using social networks to distort users' profiles generated by web search engines", which documents a series of security protocols and protection and distribution algorithms that guarantee the privacy of users' identities within search engines by exploiting the existing relationships between those users in social networks.
Abstract:
Construction of an automated news search service. The search is performed on the RSS feeds to which the user wishes to subscribe.
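A minimal sketch of such a service's core step — searching a subscribed RSS feed for a keyword — using only the Python standard library; the function name and the sample feed are illustrative:

```python
import xml.etree.ElementTree as ET

def search_feed(rss_xml, keyword):
    """Return the titles of RSS <item> entries whose title or
    description contains the keyword (case-insensitive)."""
    root = ET.fromstring(rss_xml)
    matches = []
    for item in root.iter("item"):
        title = item.findtext("title", "")
        desc = item.findtext("description", "")
        if keyword.lower() in (title + " " + desc).lower():
            matches.append(title)
    return matches

# A tiny, made-up RSS 2.0 document for illustration:
sample = """<rss version="2.0"><channel>
  <item><title>Local elections</title><description>City hall news</description></item>
  <item><title>Sports roundup</title><description>Match results</description></item>
</channel></rss>"""
print(search_feed(sample, "city"))  # → ['Local elections']
```

A full service would fetch each subscribed feed URL periodically and run the user's queries over the downloaded XML in the same way.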
Abstract:
The increased use of interlibrary loan and document supply (ILL/DS) opened a debate within the Consorci de Biblioteques Universitàries de Catalunya on how to improve its accessibility and management, especially for users, who had to work with several different applications in order to request documents via ILL/DS. As a result of this reflection, a further step has been taken in improving and consolidating the service by integrating into the web query interface of the Catàleg Col·lectiu de les Universitats de Catalunya the possibility of requesting documents online simply by entering a user ID. The application, developed in PL/SQL by the Universitat Oberta de Catalunya within the GALA project (Global Access to Local Applications and services), automatically captures the data of the bibliographic record and exports it to the ILL/DS form of the institution to which the user belongs, generating a request that is finally validated by that institution.