Digital Library

61 results for 080505 Web Technologies (excl. Web Search)

in Deakin Research Online - Australia

Web search activity data accurately predict population chronic disease risk in the USA

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND: The WHO framework for non-communicable disease (NCD) describes risks and outcomes comprising the majority of the global burden of disease. These factors are complex and interact at biological, behavioural, environmental and policy levels presenting challenges for population monitoring and intervention evaluation. This paper explores the utility of machine learning methods applied to population-level web search activity behaviour as a proxy for chronic disease risk factors. METHODS: Web activity output for each element of the WHO's Causes of NCD framework was used as a basis for identifying relevant web search activity from 2004 to 2013 for the USA. Multiple linear regression models with regularisation were used to generate predictive algorithms, mapping web search activity to Centers for Disease Control and Prevention (CDC) measured risk factor/disease prevalence. Predictions for subsequent target years not included in the model derivation were tested against CDC data from population surveys using Pearson correlation and Spearman's r. RESULTS: For 2011 and 2012, predicted prevalence was very strongly correlated with measured risk data ranging from fruits and vegetables consumed (r=0.81; 95% CI 0.68 to 0.89) to alcohol consumption (r=0.96; 95% CI 0.93 to 0.98). Mean difference between predicted and measured differences by State ranged from 0.03 to 2.16. Spearman's r for state-wise predicted versus measured prevalence varied from 0.82 to 0.93. CONCLUSIONS: The high predictive validity of web search activity for NCD risk has potential to provide real-time information on population risk during policy implementation and other population-level NCD prevention efforts.
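The abstract above evaluates predictions with Pearson correlation and Spearman's r. A minimal sketch of both statistics follows, using only the standard library. The function names and the sample prevalence values are illustrative assumptions and are not taken from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical state-level prevalence values (made up for illustration).
predicted = [10.0, 12.5, 15.0, 20.0]
measured = [11.0, 12.0, 16.0, 19.0]
print(pearson(predicted, measured), spearman(predicted, measured))
```

Pearson measures linear agreement between predicted and measured values, while Spearman only checks whether the two orderings of states agree, which is why a study might report both.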

Context-aware meta search engine for distributed web service

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The ubiquity of the Internet and Web has led to the emergence of several Web search engines with varying capabilities. A weakness of existing search engines is the very large number of hits that they can produce. Moreover, only a small number of web users actually know how to utilize the true power of Web search engines. Therefore, there is a need for a search infrastructure to help ease and guide the searching efforts of web users toward their desired objectives. In this paper, we propose a context-based meta-search engine and discuss its implementation on top of the actual Google.com search engine. The proposed meta-search engine benefits the user the most when the user does not know exactly which document he or she is looking for. Comparison of the context-based meta-search engine with both Google and Guided Google shows that the results returned by the context-based meta-search engine are much more intuitive and accurate than the results returned by both Google and Guided Google.

Effectively finding relevant web pages from linkage information

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper presents two hyperlink analysis-based algorithms to find relevant pages for a given Web page (URL). The first algorithm comes from the extended cocitation analysis of the Web pages. It is intuitive and easy to implement. The second one takes advantage of linear algebra theories to reveal deeper relationships among the Web pages and to identify relevant pages more precisely and effectively. The experimental results show the feasibility and effectiveness of the algorithms. These algorithms could be used for various Web applications, such as enhancing Web search. The ideas and techniques in this work would be helpful to other Web-related research.
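To illustrate the idea behind cocitation analysis, here is a minimal sketch of plain cocitation counting: two pages are considered related when the same parent pages link to both. This is not the paper's extended cocitation algorithm, whose details the abstract does not give. The `cocitation_scores` function and the toy link graph are assumptions made for illustration.

```python
from collections import Counter

def cocitation_scores(links, target):
    """Rank pages by how many parent pages link to both them and `target`.

    `links` maps each parent page to the set of pages it links to.
    """
    scores = Counter()
    for children in links.values():
        if target in children:
            for child in children:
                if child != target:
                    scores[child] += 1
    # Highest cocitation count first.
    return scores.most_common()

# Hypothetical link graph: parents p1..p3 and the pages they cite.
links = {
    "p1": {"A", "B", "C"},
    "p2": {"A", "B"},
    "p3": {"B", "C"},
}
print(cocitation_scores(links, "A"))
```

In this toy graph, page B is cited together with A by two parents and page C by one, so B ranks as the more relevant page for A.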

A maximal frequent itemset approach for web document clustering

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Clustering Web documents efficiently yet accurately is of great interest to Web users and is a key component of the search accuracy of a Web search engine. To achieve this, this paper introduces a new approach for the clustering of Web documents, called the maximal frequent itemset (MFI) approach. Iterative clustering algorithms, such as K-means and expectation-maximization (EM), are sensitive to their initial conditions. The MFI approach first locates the center points of high-density clusters precisely. These center points are then used as initial points for the K-means algorithm. Our experimental results on 3 Web document sets show that our MFI approach outperforms the other methods we compared in most cases, particularly when the Web document sets contain a large number of categories.
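As a rough illustration of what a maximal frequent itemset is, the sketch below mines MFIs from documents represented as sets of terms, using brute-force enumeration. It is an assumption-laden toy: the abstract does not say how the authors mine itemsets or turn them into K-means starting points, and the function name, the `min_support` parameter and the sample documents are hypothetical.

```python
from itertools import combinations

def maximal_frequent_itemsets(docs, min_support):
    """Return term sets that occur in at least `min_support` documents
    and have no frequent proper superset. Brute force; small inputs only."""
    vocab = sorted(set().union(*docs))
    frequent = []
    for size in range(1, len(vocab) + 1):
        found = [
            frozenset(candidate)
            for candidate in combinations(vocab, size)
            if sum(1 for doc in docs if set(candidate) <= doc) >= min_support
        ]
        if not found:
            # Every subset of a frequent itemset is itself frequent, so if no
            # itemset of this size is frequent, no larger one can be either.
            break
        frequent.extend(found)
    # Keep only itemsets that are not strictly contained in another frequent one.
    return [s for s in frequent if not any(s < t for t in frequent)]

# Hypothetical documents, each reduced to its set of terms.
docs = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}]
print(maximal_frequent_itemsets(docs, min_support=2))
```

One plausible use, stated here only as an assumption, is to treat the documents containing each maximal itemset as a dense group and take that group's centroid as an initial point for K-means.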

Web communities analysis and construction

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Yanchun Zhang and his co-authors explain how to construct and analyse Web communities based on information like Web document contents, hyperlinks, or user access logs. Their approaches combine results from Web search algorithms, Web clustering methods, and Web usage mining. They also detail the preliminaries needed to understand the algorithms presented, and they discuss several successful existing applications. Researchers and students in information retrieval and Web search will find here all the necessary basics and methods to create and understand Web communities. Professionals developing Web applications will additionally benefit from the samples presented for their own designs and implementations.

Our anonymous online research participants are not always anonymous: Is this a problem?

Relevance:

100.00% 100.00%

Publisher:

Abstract:

When educational research is conducted online, we sometimes promise our participants that they will be anonymous—but do we deliver on this promise? We have been warned since 1996 to be careful when using direct quotes in Internet research, as full-text web search engines make it easy to find chunks of text online. This paper details an empirical study into the prevalence of direct quotes from participants in a subset of the educational technology literature. Using basic web search techniques, the source of direct quotes could be found in 10 of 112 articles. Analysis of the articles revealed previously undiscussed threats from data triangulation and expert analysis/diagnosis. Issues of ethical obliviousness, obscurity and concern for future privacy-invasive technologies are also discussed. Recommendations for researchers, journals and institutional ethics review boards are made for how to better protect participants' anonymity against current and future threats.

Ipoll: Automatic polling using online search

Relevance:

100.00% 100.00%

Publisher:

Abstract:

For years, opinion polls have relied on data collected through telephone or person-to-person surveys. The process is costly, inconvenient, and slow. Recently, online search data has emerged as a potential proxy for survey data. However, considerable human involvement is still needed for the selection of search indices, a task that requires knowledge of both the target issue and how search terms are used by the online community. The robustness of such manually selected search indices can be questionable. In this paper, we propose an automatic polling system through a novel application of machine learning. In this system, the need for examining, comparing, and selecting search indices has been eliminated through automatic generation of candidate search indices and intelligent combination of the indices. The results include a publicly accessible web application that provides real-time, robust, and accurate measurements of public opinion on several subjects of general interest.

Spectral kernels for classification

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Spectral methods, as unsupervised techniques, have been used with success in data mining, for example LSI in information retrieval, HITS and PageRank in Web search engines, and spectral clustering in machine learning. The essence of success in these applications is the spectral information that captures the semantics inherent in the large amount of data required during unsupervised learning. In this paper, we ask if spectral methods can also be used in supervised learning, e.g., classification. In an attempt to answer this question, our research reveals a novel kernel in which spectral clustering information can be easily exploited and extended to new incoming data during classification tasks. In our experimental results, the proposed Spectral Kernel is shown to speed up classification tasks without compromising accuracy.

Algorithms and applications of preference based ranking for information retrieval

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this thesis, the author designed three sets of preference-based ranking algorithms for information retrieval and provided corresponding applications for the algorithms. The main goal is to return recommended, highly similar and valuable ranking results to users.

Image retrieval based on bag of images

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Conventional relevance feedback schemes may not be suitable for all practical applications of content-based image retrieval (CBIR), since most ordinary users would like to complete their search in a single interaction, especially in web search. In this paper, we explore a new approach to improve the retrieval performance based on a new concept, bag of images, rather than relevance feedback. We consider that an image collection comprises image bags instead of independent individual images. Each image bag includes some relevant images with the same perceptual meaning. A theoretical case study demonstrates that image retrieval can benefit from the new concept. A number of experimental results show that the CBIR scheme based on bag of images can improve the retrieval performance dramatically.

Statistical comparisons of non-deterministic IR systems using two dimensional variance

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Retrieval systems with non-deterministic output are widely used in information retrieval. Common examples include sampling, approximation algorithms, or interactive user input. The effectiveness of such systems differs not just for different topics, but also for different instances of the system. The inherent variance presents a dilemma - What is the best way to measure the effectiveness of a non-deterministic IR system? Existing approaches to IR evaluation do not consider this problem, or the potential impact on statistical significance. In this paper, we explore how such variance can affect system comparisons, and propose an evaluation framework and methodologies capable of doing this comparison. Using the context of distributed information retrieval as a case study for our investigation, we show that the approaches provide a consistent and reliable methodology to compare the effectiveness of a non-deterministic system with a deterministic or another non-deterministic system. In addition, we present a statistical best-practice that can be used to safely show how a non-deterministic IR system has equivalent effectiveness to another IR system, and how to avoid the common pitfall of misusing a lack of significance as a proof that two systems have equivalent effectiveness.

Current technologies and development of web-based databases : a survey

Relevance:

60.00% 60.00%

Publisher:

The use of Web 2.0 Technologies to promote higher order thinking skills

Relevance:

60.00% 60.00%

Publisher:

Abstract:

During 2007 several independent Victorian secondary schools participated in a study exploring the ways in which the use of learning technologies can support the development of higher order thinking skills for students. This paper focuses on the use of Information and Communications Technologies (ICT) including Web 2.0 technologies for promoting effective teaching and learning in science. A case study methodology was used to describe how individual teachers used ICT and Web 2.0 in their settings. Data included interviews (focus group and individual), questionnaires, monitoring of teacher and student use of smart tools, analysis of curriculum documents and delivery methods and of student work samples. The evaluation used an interpretive methodology to investigate five research areas: Higher-order thinking, Metacognitive awareness, Team work/collaboration, Affect towards school/learning and Ownership of learning. Three cases are reported on in this paper. Each describes how student engagement and learning increased and how teachers’ attitudes and skills developed. Examples of student and teacher blogs are provided to illustrate how such technologies encourage students and teachers to look beyond text science.

Examining how teachers use web 2.0 technologies in Science lessons to promote higher order thinking in teaching science

Relevance:

60.00% 60.00%

Publisher:

Abstract:

During 2007 several independent Victorian secondary schools participated in a study exploring the ways in which the use of learning technologies can support the development of higher order thinking skills for students. This paper focuses on the use of Information and Communications Technologies (ICT) including Web 2.0 technologies for promoting effective teaching and learning in science. A case study methodology was used to describe how individual teachers used ICT and Web 2.0 in their settings. Data included interviews (focus group and individual), questionnaires, monitoring of teacher and student use of smart tools, analysis of curriculum documents and delivery methods and of student work samples. The evaluation used an interpretive methodology to investigate five research areas: Higher-order thinking, Metacognitive awareness, Team work/collaboration, Affect towards school/learning and Ownership of learning. Three cases are reported on in this paper. Each describes how student engagement and learning increased and how teachers' attitudes and skills developed. Examples of student and teacher blogs are provided to illustrate how such technologies encourage students and teachers to look beyond text science.

Tim O'Reilly and web 2.0 : the economics of memetic liberty and control

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This article presents an account of the role of Tim O'Reilly, both as an individual and as a corporate entity (O'Reilly Group), in the creation, spread and use of the concept of Web 2.0. It demonstrates that, whatever Web 2.0's current uses to describe variously the technologies, politics, commerce or social meaning of the Internet, it originates as a deliberately open signifier of novel and potential internet development in the mid-2000s. The article argues that O'Reilly has promoted the diversity of the term's meanings and uses - celebrating textual liberties - but has also emphasised the special role that O'Reilly plays in providing the authoritative definition of that term. In essence, O'Reilly profits from this 'control' of the idea of Web 2.0 but, to enjoy that control, O'Reilly must also allow differences in meaning. The article concludes by suggesting that Web 2.0 therefore signifies a new kind of economics that brings together freedom and control in a new way.
