984 results for Web documents


Relevance: 20.00%

Abstract:

Purpose – The work presented in this paper aims to provide an approach to classifying web logs by personal properties of users. Design/methodology/approach – The authors describe an iterative system that begins with a small set of manually labeled terms, which are used to label queries from the log. A set of background knowledge related to these labeled queries is acquired by combining web search results on these queries. This background set is used to obtain many terms that are related to the classification task. The system then ranks each of the related terms, choosing those that best fit the personal properties of the users. These terms are then used to begin the next iteration. Findings – The authors identify the difficulties of classifying web logs by approaching this problem from a machine learning perspective. By applying the approach developed, the authors are able to show that many queries in a large query log can be classified. Research limitations/implications – Testing results in this type of classification work is difficult, as the true personal properties of web users are unknown. Evaluation of the classification results by comparing classified queries to well-known age-related sites is a direction that is currently being explored. Practical implications – This research is background work that can be incorporated in search engines or other web-based applications, to help marketing companies and advertisers. Originality/value – This research enhances the current state of knowledge in short-text classification and query log learning.

Keywords: Classification schemes, Computer networks, Information retrieval, Man-machine systems, User interfaces

Relevance: 20.00%

Abstract:

Detecting query reformulations within a session by a Web searcher is an important area of research for designing more helpful searching systems and targeting content to particular users. Methods explored by other researchers include both qualitative methods (i.e., the use of human judges to manually analyze query patterns on usually small samples) and nondeterministic algorithms, typically using large amounts of training data to predict query modification during sessions. In this article, we explore three alternative methods for detection of session boundaries. All three methods are computationally straightforward and therefore easily implemented for detection of session changes. We examine 2,465,145 interactions from 534,507 users of Dogpile.com on May 6, 2005. We compare session analysis using (a) Internet Protocol address and cookie; (b) Internet Protocol address, cookie, and a temporal limit on intrasession interactions; and (c) Internet Protocol address, cookie, and query reformulation patterns. Overall, our analysis shows that defining sessions by query reformulation along with Internet Protocol address and cookie provides the best measure, resulting in an 82% increase in the count of sessions. Regardless of the method used, the mean session length was fewer than three queries, and the mean session duration was less than 30 min. Searchers most often modified their query by changing query terms (nearly 23% of all query modifications) rather than adding or deleting terms. The implication is that, for measuring searching traffic, unique sessions may be a better indicator than the common metric of unique visitors. This research also sheds light on the more complex aspects of Web searching involving query modifications and may lead to advances in searching tools.
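As an illustration of method (b), the following minimal sketch counts sessions by grouping interactions on Internet Protocol address and cookie and starting a new session whenever the gap between consecutive interactions exceeds a temporal limit. The function name `count_sessions`, the tuple layout of the log, and the 30-minute default limit are assumptions made for this example; the abstract does not state the threshold the authors used.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def count_sessions(interactions, max_gap=timedelta(minutes=30)):
    """Count sessions in an iterable of (ip, cookie, timestamp) tuples.

    A session is a run of interactions from the same (ip, cookie) pair
    in which no two consecutive interactions are more than max_gap apart.
    The 30-minute default is an assumed value, not taken from the study.
    """
    by_user = defaultdict(list)
    for ip, cookie, timestamp in interactions:
        by_user[(ip, cookie)].append(timestamp)

    sessions = 0
    for stamps in by_user.values():
        stamps.sort()
        sessions += 1  # the first interaction always opens a session
        for previous, current in zip(stamps, stamps[1:]):
            if current - previous > max_gap:
                sessions += 1  # gap too long: a new session starts
    return sessions


# Hypothetical log entries for illustration only.
start = datetime(2005, 5, 6, 9, 0)
log = [
    ("1.1.1.1", "a", start),
    ("1.1.1.1", "a", start + timedelta(minutes=5)),
    ("1.1.1.1", "a", start + timedelta(minutes=50)),
    ("2.2.2.2", "b", start),
]
print(count_sessions(log))
```

Passing a very large `max_gap` reduces this to method (a), where every (address, cookie) pair counts as a single session.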

Relevance: 20.00%

Abstract:

This paper reports on an exploratory study of the role of web and social media in e-governments, especially in the context of Malaysia, with some comparisons and contrasts from other countries where such governmental efforts have been underway for a while. It describes the current e-government efforts in Malaysia, and proposes that applying a theoretical framework would help in understanding the context and streamlining these ongoing efforts. Specifically, it lays out a theoretical and cultural framework based on Mary Douglas’ (1996) Grid-Group Theory, Mircea Georgescu’s (2005) Three Pillars of E-Government, and Gerald Grant’s and Derek Chau’s (2006) Generic Framework for E-Government. Although this study is in its early stages, it has relevance to everyone who is interested in e-government efforts across the world, and is especially relevant to developing countries.

Relevance: 20.00%

Abstract:

Quantum theory has recently been employed to further advance the theory of information retrieval (IR). A challenging research topic is to investigate the so-called quantum-like interference in users’ relevance judgement process, where users are asked to judge the degree of relevance of each document with respect to a given query. In this process, users’ relevance judgement for the current document is often interfered with by the judgement for previous documents, due to the interference on users’ cognitive status. Research from cognitive science has demonstrated some initial evidence of quantum-like cognitive interference in human decision making, which underpins the user’s relevance judgement process. This motivates us to model such cognitive interference in the relevance judgement process, which we believe will lead to better modeling and explanation of user behaviors in the relevance judgement process for IR and eventually lead to more user-centric IR models. In this paper, we propose to use the probabilistic automaton (PA) and the quantum finite automaton (QFA), which are suitable for representing the transition of user judgement states, to dynamically model the cognitive interference when the user is judging a list of documents.
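To make the idea of a judgement state that evolves over a list of documents concrete, here is a minimal sketch of a classical probabilistic automaton only (it does not model the quantum case). The two states, the two input symbols, and every number in the transition matrices are hypothetical values chosen for illustration; none of them come from the paper.

```python
def step(distribution, matrix):
    """Advance a state distribution by one row-stochastic transition matrix."""
    size = len(matrix[0])
    return [
        sum(distribution[i] * matrix[i][j] for i in range(len(distribution)))
        for j in range(size)
    ]


def run(initial, transitions, symbols):
    """Feed a sequence of input symbols to the automaton and return the
    final distribution over judgement states."""
    distribution = list(initial)
    for symbol in symbols:
        distribution = step(distribution, transitions[symbol])
    return distribution


# States: index 0 = "judges relevant", index 1 = "judges not relevant".
# Input symbols: the kind of document just shown (assumed labels).
transitions = {
    "strong": [[0.9, 0.1], [0.6, 0.4]],
    "weak": [[0.5, 0.5], [0.2, 0.8]],
}
initial = [0.5, 0.5]

strong_then_weak = run(initial, transitions, ["strong", "weak"])
weak_then_strong = run(initial, transitions, ["weak", "strong"])
print(strong_then_weak, weak_then_strong)
```

With these assumed matrices the final distribution depends on the order in which the two documents are shown, which is the kind of order effect a judgement-state model is meant to capture.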

Relevance: 20.00%

Abstract:

With the growing number of XML documents on the Web it becomes essential to effectively organise these XML documents in order to retrieve useful information from them. A possible solution is to apply clustering on the XML documents to discover knowledge that promotes effective data management, information retrieval and query processing. However, many issues arise in discovering knowledge from these types of semi-structured documents due to their heterogeneity and structural irregularity. Most of the existing research on clustering techniques focuses only on one feature of the XML documents, this being either their structure or their content due to scalability and complexity problems. The knowledge gained in the form of clusters based on the structure or the content is not suitable for real-life datasets. It therefore becomes essential to include both the structure and content of XML documents in order to improve the accuracy and meaning of the clustering solution. However, the inclusion of both these kinds of information in the clustering process results in a huge overhead for the underlying clustering algorithm because of the high dimensionality of the data. The overall objective of this thesis is to address these issues by: (1) proposing methods to utilise frequent pattern mining techniques to reduce the dimension; (2) developing models to effectively combine the structure and content of XML documents; and (3) utilising the proposed models in clustering. This research first determines the structural similarity in the form of frequent subtrees and then uses these frequent subtrees to represent the constrained content of the XML documents in order to determine the content similarity. A clustering framework with two types of models, implicit and explicit, is developed. The implicit model uses a Vector Space Model (VSM) to combine the structure and the content information.
The explicit model uses a higher order model, namely a 3-order Tensor Space Model (TSM), to explicitly combine the structure and the content information. This thesis also proposes a novel incremental technique to decompose large-sized tensor models to utilise the decomposed solution for clustering the XML documents. The proposed framework and its components were extensively evaluated on several real-life datasets exhibiting extreme characteristics to understand the usefulness of the proposed framework in real-life situations. Additionally, this research evaluates the outcome of the clustering process on the collection selection problem in the information retrieval on the Wikipedia dataset. The experimental results demonstrate that the proposed frequent pattern mining and clustering methods outperform the related state-of-the-art approaches. In particular, the proposed framework of utilising frequent structures for constraining the content shows an improvement in accuracy over content-only and structure-only clustering results. The scalability evaluation experiments conducted on large-scale datasets clearly show the strengths of the proposed methods over state-of-the-art methods. In particular, this thesis work contributes to effectively combining the structure and the content of XML documents for clustering, in order to improve the accuracy of the clustering solution. In addition, it also contributes by addressing the research gaps in frequent pattern mining to generate efficient and concise frequent subtrees with various node relationships that could be used in clustering.

Relevance: 20.00%

Abstract:

Nowadays, everyone can effortlessly access a range of information on the World Wide Web (WWW). As information resources on the web continue to grow tremendously, it becomes progressively more difficult to meet the high expectations of users and find relevant information. Although existing search engine technologies can find valuable information, they suffer from the problems of information overload and information mismatch. This paper presents a hybrid Web Information Retrieval approach allowing personalised search using ontology, user profile and collaborative filtering. This approach uses ontology to find the context of a user query with minimal user involvement. Simultaneously, it uses time-based automatic user profile updating to follow the user’s changing behaviour. Subsequently, it uses recommendations from similar users obtained through a collaborative filtering technique. The proposed method is evaluated with the FIRE 2010 dataset and a manually generated dataset. Empirical analysis reveals that the Precision, Recall and F-Score of most of the queries for many users are improved with the proposed method.
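For readers unfamiliar with the evaluation measures named above, the following short sketch computes set-based Precision, Recall and the balanced F-Score (F1) for a single query. The function name and the document identifiers are hypothetical, and the abstract does not say which F-Score weighting the authors used, so the balanced form is an assumption.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, F1) for one query.

    retrieved: documents the system returned
    relevant:  documents judged relevant for the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical result list and relevance judgements.
precision, recall, f1 = precision_recall_f1(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5"],
)
print(precision, recall, f1)
```

Here two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents are retrieved (recall 2/3), giving an F1 of 4/7.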

Relevance: 20.00%

Abstract:

In Bowenbrae Pty Ltd v Flying Fighters Maintenance and Restoration [2010] QDC 347 Reid DCJ made orders requiring the plaintiffs to make application under the Freedom of Information Act 1982 (Cth) (“the FOI Act”) for documents sought by the defendant.

Relevance: 20.00%

Abstract:

Purpose – To investigate and identify the patterns of interaction between searchers and search engine during web searching. Design/methodology/approach – The authors examined 2,465,145 interactions from 534,507 users of Dogpile.com submitted on May 6, 2005, and compared query reformulation patterns. They investigated the type of query modifications and query modification transitions within sessions. Findings – The paper identifies three strong query reformulation transition patterns: between specialization and generalization; between video and audio; and between content change and system assistance. In addition, the findings show that web and image content were the most popular media collections. Originality/value – This research sheds light on the more complex aspects of web searching involving query modifications.
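One simple way to label the transition between two consecutive queries is to compare their term sets, as in the minimal sketch below. Only the labels "specialization" and "generalization" appear in the abstract; the other labels ("same", "reformulation", "new"), the function name, and the whitespace tokenisation are assumptions for illustration and are not the authors' classification scheme.

```python
def classify_transition(previous_query, next_query):
    """Label the change from one query to the next by comparing term sets."""
    previous_terms = set(previous_query.lower().split())
    next_terms = set(next_query.lower().split())

    if previous_terms == next_terms:
        return "same"
    if previous_terms < next_terms:
        return "specialization"  # terms were only added
    if next_terms < previous_terms:
        return "generalization"  # terms were only removed
    if previous_terms & next_terms:
        return "reformulation"  # some terms kept, some changed
    return "new"  # no terms in common


print(classify_transition("jaguar", "jaguar car"))
print(classify_transition("jaguar car price", "jaguar car"))
print(classify_transition("jaguar car", "jaguar cat"))
```

Counting these labels over consecutive query pairs within a session gives the kind of transition statistics the abstract describes.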

Relevance: 20.00%

Abstract:

Many user studies in Web information searching have found a significant effect of task types on search strategies. However, little attention has been given to Web image searching strategies, especially the query reformulation activity, even though this is a crucial part of Web image searching. In this study, we investigated the effects of topic domains and task types on users’ image searching behavior and query reformulation strategies. Some significant differences in users’ task specificity and initial concepts were identified among the task domains. Task types were also found to influence participants’ result reviewing behavior and query reformulation strategies.

Relevance: 20.00%

Abstract:

There is extensive uptake of ICT in the teaching of science but more evidence is needed on how ICT impacts on the learning practice and the learning outcomes at the classroom level. In this study, a physics website (Getsmart) was developed using the cognitive apprenticeship framework for students at a high school in Australia. This website was designed to enhance students’ knowledge of concepts in physics. Reflexive pedagogies were used in the delivery of learning materials in a blended learning environment. The students in the treatment group accessed the website over a 10-week period. Pre-test and post-test results of the treatment group (N = 48) and comparison group (N = 32) were compared. The MANCOVA analysis showed that the web-based learning experience benefited the students in the treatment group. It not only impacted on the learning outcomes, but qualitative data from the students suggested that it had a positive impact on their attitudes towards studying physics in a blended environment.

Relevance: 20.00%

Abstract:

The LiteSteel beam (LSB) is a new cold-formed steel hollow flange channel section produced using a patented manufacturing process involving simultaneous cold-forming and dual electric resistance welding. LSBs are commonly used as floor joists and bearers with web openings in residential, industrial and commercial buildings. Their shear strengths are considerably reduced when web openings are included for the purpose of locating building services. However, no research has been undertaken on the shear behaviour and strength of LSBs with web openings. Therefore experimental and numerical studies were undertaken to investigate the shear behaviour and strength of LSBs with web openings. In this research, finite element models of LSBs with web openings in shear were developed to simulate the shear behaviour and strength of LSBs including their buckling characteristics. They were then validated by comparing their results with available experimental test results and used in a detailed parametric study. The results showed that the current design rules in cold-formed steel structures design codes are very conservative for the shear design of LSBs with web openings. Improved design equations have been proposed for the shear capacity of LSBs with web openings based on both experimental and parametric study results. An alternative shear design method based on an equivalent reduced web thickness was also proposed. It was found that the same shear strength design rules developed for LSBs without web openings can be used for LSBs with web openings provided the equivalent reduced web thickness equation developed in this paper is used. This is a significant advancement as it considerably simplifies the shear design methods of LSBs with web openings.

Relevance: 20.00%

Abstract:

The LiteSteel Beam (LSB) is a new cold-formed steel hollow flange channel beam recently developed in Australia. It is commonly used as a floor joist or bearer in buildings. Current practice in flooring systems is to include openings in the web element of floor joists or bearers so that building services can be located within them. The shear behaviour of LSBs with web openings is more complicated, and their shear strengths are considerably reduced by the presence of web openings. However, no research has been undertaken on the shear behaviour and strength of LSBs with web openings. Therefore a detailed experimental study involving 26 shear tests was undertaken on simply supported LSB test specimens with web openings and an aspect ratio of 1.5. This paper presents the details of this experimental study and the resulting shear capacities and behavioural characteristics. Experimental results showed that the current design rules in cold-formed steel structures design codes are very conservative for the shear design of LSBs with web openings. Improved design equations have been proposed for the shear strength of LSBs with web openings based on the experimental results from this study.

Relevance: 20.00%

Abstract:

Classroom learning environments are rapidly changing as new digital technologies become more education-friendly. What are students’ perceptions of their technology-rich learning environments? This question is critical as it may have an impact on the effectiveness of the new technologies in classrooms. There are numerous reliable and valid learning environment instruments which have been used to ascertain students’ perceptions of their learning environments. This chapter focuses on one of these instruments, the Web-based Learning Environment Instrument (WEBLEI) (Chang & Fisher, 2003). Since its initial development, this instrument has been used to study a range of learning environments and this chapter presents the findings of two example case-studies that involve such environments.

Relevance: 20.00%

Abstract:

The convergence of Internet marketplaces and service-oriented architectures has spurred the growth of Web service ecosystems. This paper articulates a vision for Web service ecosystems, discusses early manifestations of this vision, and presents a unifying architecture to support the emergence of larger and more sophisticated ecosystems.

Relevance: 20.00%

Abstract:

In information retrieval (IR) research, more and more focus has been placed on optimizing a query language model by detecting and estimating the dependencies between the query and the observed terms occurring in the selected relevance feedback documents. In this paper, we propose a novel Aspect Language Modeling framework featuring term association acquisition, document segmentation, query decomposition, and an Aspect Model (AM) for parameter optimization. Through the proposed framework, we advance the theory and practice of applying high-order and context-sensitive term relationships to IR. We first decompose a query into subsets of query terms. Then we segment the relevance feedback documents into chunks using multiple sliding windows. Finally we discover the higher order term associations, that is, the terms in these chunks with a high degree of association to the subsets of the query. In this process, we adopt an approach that combines the AM with Association Rule (AR) mining. In our approach, the AM not only considers the subsets of a query as “hidden” states and estimates their prior distributions, but also evaluates the dependencies between the subsets of a query and the observed terms extracted from the chunks of feedback documents. The AR provides a reasonable initial estimation of the high-order term associations by discovering the association rules from the document chunks. Experimental results on various TREC collections verify the effectiveness of our approach, which significantly outperforms a baseline language model and two state-of-the-art query language models, namely the Relevance Model and the Information Flow model.
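The three preparatory steps described above (query decomposition, sliding-window segmentation, and an association-rule style estimate) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' framework: the function names, the window sizes, the step size, and the use of plain rule confidence as the association score are all hypothetical choices, and the Aspect Model itself is not implemented.

```python
from itertools import combinations


def query_subsets(query_terms):
    """Decompose a query into all non-empty subsets of its terms."""
    terms = sorted(set(query_terms))
    return [
        frozenset(combo)
        for size in range(1, len(terms) + 1)
        for combo in combinations(terms, size)
    ]


def sliding_chunks(tokens, window_sizes=(3, 5), step=1):
    """Segment a token list into overlapping chunks, one pass per window size.

    A document shorter than the window yields a single chunk.
    """
    for size in window_sizes:
        last_start = max(len(tokens) - size + 1, 1)
        for start in range(0, last_start, step):
            yield tokens[start:start + size]


def rule_confidence(chunks, subset, term):
    """Confidence of the rule subset -> term: the share of chunks
    containing the query subset that also contain the term."""
    matching = [chunk for chunk in chunks if subset <= set(chunk)]
    if not matching:
        return 0.0
    return sum(term in chunk for chunk in matching) / len(matching)


# Hypothetical feedback document and query subset.
tokens = "quantum query language model for query expansion".split()
chunks = list(sliding_chunks(tokens, window_sizes=(3,)))
confidence = rule_confidence(chunks, frozenset({"query"}), "language")
print(len(chunks), confidence)
```

In a fuller pipeline, scores like this would serve only as the initial estimate that a model such as the AM then refines.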