930 results for Web search engines


Relevance:

100.00%

Publisher:

Abstract:

Programming Assignment on Search Engines

Abstract:

Search engines - such as Google - have been characterized as "Databases of intentions". This class will focus on different aspects of intentionality on the web, including goal mining, goal modeling and goal-oriented search. Readings: M. Strohmaier, M. Lux, M. Granitzer, P. Scheir, S. Liaskos, E. Yu, How Do Users Express Goals on the Web? - An Exploration of Intentional Structures in Web Search, We Know'07 International Workshop on Collaborative Knowledge Management for Web Information Systems in conjunction with WISE'07, Nancy, France, 2007. [Web link] Readings: Automatic identification of user goals in web search, U. Lee and Z. Liu and J. Cho WWW '05: Proceedings of the 14th International World Wide Web Conference 391--400 (2005) [Web link]

Abstract:

Slides and an essay on the Web graph, search engines and how Google calculates PageRank.
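The slides themselves are not reproduced in this listing, so the following is only a minimal sketch of the commonly described power-iteration form of PageRank. The four-page graph `toy_web`, the damping factor of 0.85 and the handling of pages without outlinks are illustrative assumptions, not details taken from the entry above.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict mapping page -> list of linked pages."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outlinks.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Assumption: a page with no outlinks spreads its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical toy graph: D links to C, but nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
```

In this toy graph the page with the most incoming links (`C`) ends up with the highest score, and the page nobody links to (`D`) keeps only the random-jump share.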

Abstract:

Search engines exploit the Web's hyperlink structure to help infer information content. The new phenomenon of personal Web logs, or 'blogs', encourages more extensive annotation of Web content. If their resulting link structures bias the Web crawling applications that search engines depend upon, there are implications for another form of annotation rapidly on the rise, the Semantic Web. We conducted a Web crawl of 160 000 pages in which the link structure of the Web is compared with that of several thousand blogs. Results show that the two link structures are significantly different. We analyse the differences and infer the likely effect upon the performance of existing and future Web agents. The Semantic Web offers new opportunities to navigate the Web, but Web agents should be designed to take advantage of the emerging link structures, or their effectiveness will diminish.

Abstract:

This paper presents an approach called the Co-Recommendation Algorithm, which combines features of the recommendation rule and the co-citation algorithm. The algorithm addresses some challenges that are essential for further searching and recommendation algorithms. It does not require users to provide a lot of interactive communication. Furthermore, it supports other kinds of queries, such as keyword, URL and document investigations. Compared with other algorithms, it scales noticeably more easily. It achieves high online performance, and its repository computation can reach a high group-forming accuracy using only a fraction of the Web pages in a cluster.
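The abstract does not specify the Co-Recommendation Algorithm itself, so the sketch below shows only standard co-citation counting, one of the two ingredients it names: two pages are treated as related when the same source pages link to both. The function names and the example link table are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(links):
    """Count, for each unordered pair of pages, how many sources link to both."""
    counts = Counter()
    for targets in links.values():
        # Each pair of distinct targets on one source page is co-cited once.
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return counts


def related_pages(url, links, top_k=3):
    """Rank pages by how often they are co-cited with `url` (illustrative only)."""
    scores = Counter()
    for (a, b), count in cocitation_counts(links).items():
        if a == url:
            scores[b] += count
        elif b == url:
            scores[a] += count
    return scores.most_common(top_k)


# Hypothetical link table: source page -> pages it links to.
example_links = {
    "p1": ["x", "y", "z"],
    "p2": ["x", "y"],
    "p3": ["y", "z"],
}
suggestions = related_pages("x", example_links)
```

Here `x` and `y` are cited together by two sources and `x` and `z` by one, so `y` is suggested ahead of `z` for a query on `x`.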

Abstract:

The rapid increase in the complexity and size of the web makes web search results far from satisfactory in many cases, owing to the huge amount of information returned by search engines. Finding intrinsic relationships among web pages at a higher level, in order to manage and retrieve searched web information efficiently, is becoming a challenging problem. In this paper, we propose an approach to measuring web page similarity. This approach takes hyperlink transitivity and page importance into consideration. From this new similarity measurement, an effective hierarchical web page clustering algorithm is proposed. The primary evaluations show the effectiveness of the new similarity measurement and the improvement of web page clustering. The proposed page similarity, as well as the matrix-based hyperlink analysis methods, could be applied to other web-based research areas.
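The abstract gives no formula for its similarity measure, so the following is purely a hypothetical illustration of how hyperlink transitivity and page importance could enter a similarity score; it is not the authors' method. It assumes direct links weigh 1.0, two-hop links weigh `decay`, optional per-page `importance` multipliers, and cosine similarity between the resulting vectors.

```python
import math


def link_vector(page, links, decay=0.5):
    """Weights of pages reachable from `page`: 1.0 if direct, `decay` if two hops away."""
    vec = {}
    for target in links.get(page, []):
        vec[target] = max(vec.get(target, 0.0), 1.0)
        for second in links.get(target, []):
            if second != page:
                # Transitivity: a link of a link counts, but with reduced weight.
                vec[second] = max(vec.get(second, 0.0), decay)
    return vec


def similarity(p, q, links, importance=None, decay=0.5):
    """Cosine similarity of importance-weighted link vectors (assumed form)."""
    importance = importance or {}
    vp = {k: w * importance.get(k, 1.0) for k, w in link_vector(p, links, decay).items()}
    vq = {k: w * importance.get(k, 1.0) for k, w in link_vector(q, links, decay).items()}
    dot = sum(vp[k] * vq[k] for k in vp.keys() & vq.keys())
    norm_p = math.sqrt(sum(w * w for w in vp.values()))
    norm_q = math.sqrt(sum(w * w for w in vq.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # a page with no outgoing links has nothing to compare
    return dot / (norm_p * norm_q)


# Hypothetical link table: a and b both link to c, which links on to e.
toy_links = {"a": ["c", "d"], "b": ["c"], "c": ["e"], "d": [], "e": []}
score_ab = similarity("a", "b", toy_links)
```

A pairwise score of this kind could then feed any standard hierarchical clustering routine; which clustering scheme the paper uses is not stated in the abstract.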

Abstract:

Purpose – “Google” econometrics (Geco) has evolved rapidly in recent years and can be applied in various fields of research. Based on accepted theories in existing economic literature, this paper seeks to contribute to the innovative use of research on Google search query data to provide a new, innovative approach to property research.

Design/methodology/approach – In this study, existing data from Google Insights for Search (GI4S) is extended into a new potential source of consumer sentiment data based on visits to a commonly-used UK online real-estate agent platform (Rightmove.co.uk). In order to contribute to knowledge about the use of Geco's black box, namely the unknown sampling population and the specific search queries influencing the variables, the GI4S series are compared to direct web navigation.

Findings – The main finding from this study is that GI4S data produce immediate real-time results with a high level of reliability in explaining the future volume of transactions and house prices in comparison to the direct website data. Furthermore, the results reveal that the number of visits to Rightmove.co.uk is driven by GI4S data and vice versa, and indeed without a contemporaneous relationship.

Originality/value – This study contributes to the new emerging and innovative field of research involving search engine data. It also contributes to the knowledge base about the increasing use of online consumer data in economic research in property markets.

Abstract:

Each year, search engines like Google, Bing and Yahoo complete trillions of search queries online. Students are especially dependent on these search tools because of their popularity, convenience and accessibility. However, what students are unaware of, by choice or naiveté, is the amount of personal information that is collected during each search session, how that data is used and who is interested in their online behavior profile. Privacy policies are frequently updated in favor of the search companies, but they are lengthy and are often perused briefly or ignored entirely, with little thought about how personal web habits are being exploited for analytics and marketing. As an Information Literacy instructor, and a member of the Electronic Frontier Foundation, I believe in the importance of educating college students and web users in general that they have a right to privacy online. Class discussions on the topic of web privacy have yielded an interesting perspective on internet search usage. Students are unaware of how their online behavior is recorded and have consistently expressed their hesitancy to use tools that disguise or delete their IP address because of the stigma that it may imply they have something to hide or are engaging in illegal activity. Additionally, students fear they will have to surrender the convenience of uber connectivity in their applications to maintain their privacy. The purpose of this lightning presentation is to provide educators with a lesson plan highlighting and simplifying the privacy terms for the three major search engines, Google, Bing and Yahoo. This presentation focuses on what data these search engines collect about users, how that data is used and alternative search solutions, like DuckDuckGo, for increased privacy. Students will directly benefit from this lesson because informed internet users can protect their data, feel safer online and become more effective web searchers.

Abstract:

The popularization of the Internet has stimulated the appearance of search engines whose objective is to aid users in searching for information on the Web. However, it is common for users to make queries and receive results that do not satisfy their initial needs. The Information Retrieval in Context (IRiX) technique allows information related to a specific theme to be associated with the initial user query, thereby enabling better results. This study presents a prototype of a search engine based on contexts built from linguistic gatherings and on relationships defined by the user. The context information can be shared with other software and with other users of the tool, with the objective of promoting a socialization of contexts.

Abstract:

The traditional characteristics and challenges of organizing and searching information on the World Wide Web are outlined and reviewed. The classification features of two such methods are analyzed: Google, as an example of automated search engines, and Yahoo! Directory, as an example of subject directories. Recent advances in the Semantic Web, particularly the growing application of ontologies and Linked Data, are also reviewed. Finally, some problems and prospects related to the use of classification and indexing on the World Wide Web are discussed, emphasizing the need to rethink the role of classification in the organization of these resources and outlining the possibilities of applying Ranganathan's facet theories of classification.

Abstract:

This thesis studies in depth the forms of commercial promotion found on the Internet, which are characterized less by ordinary evolution than by continual metamorphoses that redefine the concept of advertising every day. The aim is to analyse the legal framework applicable to Web advertising, given the variety of forms and modes it can take. The work reviews the characteristics that distinguish online commercial advertising from traditional advertising; among these, particular importance attaches to its capacity to establish a direct, unmediated relationship between business and consumer. It then addresses the problem of identifying, given the a-territorial nature of the Internet, the law applicable to web advertising, before surveying the relevant European and Italian rules, without neglecting those issued through self-regulation. Finally, ample space is devoted to examining the various and most recent advertising techniques, highlighting their technical and computing aspects, which are indispensable for a correct assessment of the legal issue. In particular, the thesis examines the paid placement service offered by the main search engines (keyword advertising) and the tools for tracking users' online "behaviour", which make targeted advertising campaigns possible (online behavioural advertising). The Web, in fact, no longer offers only the possibility of overcoming spatial, linguistic or temporal barriers and of widening one's sphere of visibility, but also that of reaching the "interested" user who is, therefore, a potential buyer.
The most critical aspects of these new forms of advertising are assessed, and the legal rules that may apply are examined, also in light of the main national and European court decisions on the subject, as well as North American legal experience and self-regulatory practice.

Abstract:

This thesis presents and discusses two activities related to websites: localization and search engine optimization (SEO). The latter aims to help websites achieve a better position on search engine results pages and thus be more visible to users. Because SEO involves various interventions on websites, some of which entail manipulating HTML code, it is often regarded as a strictly IT activity. The aim of this thesis, therefore, is to show how translators can use their linguistic skills not only to localize websites but also to optimize them for search engines. To demonstrate the applicability of these techniques, the website of "Il Palio di San Donato" was used as a practical example; the site is managed by the Municipality of Cividale del Friuli and describes the town's historical re-enactment of the same name. The thesis consists of four chapters. The first chapter introduces the theoretical principles underlying website localization, SEO, writing for the web and translation for the tourism sector. The second chapter describes the Palio di San Donato website, examining in particular its structure and content. The third chapter describes the localization project carried out on the site. Finally, the fourth chapter contains a brief commentary on the linguistic, cultural and technological issues encountered during the translation process, and a list of SEO strategies applied to five pages of the site, selected because they illustrate the largest possible number of SEO interventions that translators can carry out.

Abstract:

BACKGROUND: Many users search the Internet for answers to health questions. Complementary and alternative medicine (CAM) is a particularly common search topic. Because many CAM therapies do not require a clinician's prescription, false or misleading CAM information may be more dangerous than information about traditional therapies. Many quality criteria have been suggested to filter out potentially harmful online health information. However, assessing the accuracy of CAM information is uniquely challenging since CAM is generally not supported by conventional literature. OBJECTIVE: The purpose of this study is to determine whether domain-independent technical quality criteria can identify potentially harmful online CAM content. METHODS: We analyzed 150 Web sites retrieved from a search for the three most popular herbs (ginseng, ginkgo and St. John's wort) and their purported uses on the ten most commonly used search engines. The presence of technical quality criteria as well as potentially harmful statements (commissions) and vital information that should have been mentioned (omissions) was recorded. RESULTS: Thirty-eight sites (25%) contained statements that could lead to direct physical harm if acted upon. One hundred forty-five sites (97%) had omitted information. We found no relationship between technical quality criteria and potentially harmful information. CONCLUSIONS: Current technical quality criteria do not identify potentially harmful CAM information online. Consumers should be warned to use other means of validation or to trust only known sites. Quality criteria that consider the uniqueness of CAM must be developed and validated.

Abstract:

Complementary and alternative medicine (CAM) use is growing rapidly. As CAM is relatively unregulated, it is important to evaluate the type and availability of CAM information. The goal of this study is to determine the prevalence, content and readability of online CAM information based on searches for arthritis, diabetes and fibromyalgia using four common search engines. Fifty-eight of 599 web pages retrieved by a "condition search" (9.6%) were CAM-oriented. Of 216 CAM pages found by the "condition" and "condition + herbs" searches, 78% were authored by commercial organizations, whose purpose involved commerce 69% of the time, and 52.3% had no references. Although 98% of the CAM information was intended for consumers, the mean readability was at grade level 11. We conclude that consumers searching the web for health information are likely to encounter consumer-oriented CAM advertising, which is difficult to read and is not supported by the conventional literature.

Abstract:

Current teen pregnancy and repeat pregnancy rates reveal that there is a pressing need for comprehensive care for pregnant and parenting teens to address their unique needs. The Internet has become a source of various types of information and, as a result, several efforts have begun to assess the quality of health information provided on websites. The objective of this study was to assess the functionality and quality of websites containing health information and resources for pregnant and parenting teens. The three currently most widely used search engines (Google, MSN, and Yahoo) were searched using three general search terms: “teen pregnancy”, “pregnant teen”, and “teen parent”. The first 5 pages of each search were reviewed and categorized to yield 12 websites that met the inclusion criteria for content evaluation. The 12 websites were rated using a pre-existing instrument encompassing two domains: functionality and content analysis. Within the functionality domain, this sample highlighted the need to improve accessibility and credibility for the target population. The content analysis revealed that, among the topics recommended for pregnant and parenting teens, those most commonly covered were mental health and primary and preventive health care. The majority of websites neglected sexual health topics, including STIs and family planning. This study provides the first glimpse into health information and resources for pregnant and parenting teens on the Internet. Researchers, health care providers, social workers, health educators, and website sponsors can use these results to maintain and recommend websites that offer easily accessible, accurate, and practical information for pregnant and parenting teens.