967 resultados para Web Mining


Relevância:

20.00% 20.00%

Publicador:

Resumo:

With the increasing number of XML documents in varied domains, it has become essential to identify ways of finding interesting information from these documents. Data mining techniques were used to derive this interesting information. Mining on XML documents is impacted by its model due to the semi-structured nature of these documents. Hence, in this chapter we present an overview of the various models of XML documents, how these models were used for mining and some of the issues and challenges in these models. In addition, this chapter also provides some insights into the future models of XML documents for effectively capturing the two important features namely structure and content of XML documents for mining.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose – The work presented in this paper aims to provide an approach to classifying web logs by personal properties of users. Design/methodology/approach – The authors describe an iterative system that begins with a small set of manually labeled terms, which are used to label queries from the log. A set of background knowledge related to these labeled queries is acquired by combining web search results on these queries. This background set is used to obtain many terms that are related to the classification task. The system then ranks each of the related terms, choosing those that most fit the personal properties of the users. These terms are then used to begin the next iteration. Findings – The authors identify the difficulties of classifying web logs, by approaching this problem from a machine learning perspective. By applying the approach developed, the authors are able to show that many queries in a large query log can be classified. Research limitations/implications – Testing results in this type of classification work is difficult, as the true personal properties of web users are unknown. Evaluation of the classification results in terms of the comparison of classified queries to well known age-related sites is a direction that is currently being exploring. Practical implications – This research is background work that can be incorporated in search engines or other web-based applications, to help marketing companies and advertisers. Originality/value – This research enhances the current state of knowledge in short-text classification and query log learning. Classification schemes, Computer networks, Information retrieval, Man-machine systems, User interfaces

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Detecting query reformulations within a session by a Web searcher is an important area of research for designing more helpful searching systems and targeting content to particular users. Methods explored by other researchers include both qualitative (i.e., the use of human judges to manually analyze query patterns on usually small samples) and nondeterministic algorithms, typically using large amounts of training data to predict query modification during sessions. In this article, we explore three alternative methods for detection of session boundaries. All three methods are computationally straightforward and therefore easily implemented for detection of session changes. We examine 2,465,145 interactions from 534,507 users of Dogpile.com on May 6, 2005. We compare session analysis using (a) Internet Protocol address and cookie; (b) Internet Protocol address, cookie, and a temporal limit on intrasession interactions; and (c) Internet Protocol address, cookie, and query reformulation patterns. Overall, our analysis shows that defining sessions by query reformulation along with Internet Protocol address and cookie provides the best measure, resulting in an 82% increase in the count of sessions. Regardless of the method used, the mean session length was fewer than three queries, and the mean session duration was less than 30 min. Searchers most often modified their query by changing query terms (nearly 23% of all query modifications) rather than adding or deleting terms. Implications are that for measuring searching traffic, unique sessions may be a better indicator than the common metric of unique visitors. This research also sheds light on the more complex aspects of Web searching involving query modifications and may lead to advances in searching tools.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Australia is currently in the midst of a major resources boom. Resultant growing demands for labour in regional and remote areas have accelerated the recruitment of non resident workers, mostly contractors, who work extended block rosters of 12-hour shifts and are accommodated in work camps, often adjacent to established mining towns. Serious social impacts of these practices, including violence and crime, have generally escaped industry, government and academic scrutiny. This paper highlights some of these impacts on affected regional communities and workers and argues that post-industrial mining regimes serve to mask and privatize these harms and risks, shifting them on to workers, families and communities.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper reports on an exploratory study of the role of web and social media in e-governments, especially in the context of Malaysia, with some comparisons and contrasts from other countries where such governmental efforts have been underway for awhile. It describes the current e-government efforts in Malaysia, and proposes that applying a theoretical framework would help understand the context and streamline these ongoing efforts. Specifically, it lays out a theoretical and cultural framework based on Mary Douglas’ (1996) Grid-Group Theory, Mircea Georgescu’s (2005) Three Pillars of E-Government, and Gerald Grant’s and Derek Chau’s (2006) Generic Framework for E-Government. Although this study is in its early stages, it has relevance to everyone who is interested in e-government efforts across the world, and especially relevant to developing countries.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While the role of executives’ cognition in organisations’ responses to change is a central topic in strategic cognition research, changes in firms’ environment are typically not measured directly but described either as an event (for example, new industry legislation) or represented by a time period (e.g. when a new technology impacted an industry). The Australian mining sector has witnessed a historically significant change in demand for its products and we begin by developing measures of changes in supply and demand for key commodities during the period 1992-2008. We identify sub-groups of firms based on their activities and commodity sector and examine the relation of these variables to executives’ cognition and to firms’ CapEx. We find industry, firm and cognitive variables are related to both strategic cognition and firms’ CapEx.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the long term, with development of skill, knowledge, exposure and confidence within the engineering profession, rigorous analysis techniques have the potential to become a reliable and far more comprehensive method for design and verification of the structural adequacy of OPS, write Nimal J Perera, David P Thambiratnam and Brian Clark. This paper explores the potential to enhance operator safety of self-propelled mechanical plant subjected to roll over and impact of falling objects using the non-linear and dynamic response simulation capabilities of analytical processes to supplement quasi-static testing methods prescribed in International and Australian Codes of Practice for bolt on Operator Protection Systems (OPS) that are post fitted. The paper is based on research work carried out by the authors at the Queensland University of Technology (QUT) over a period of three years by instrumentation of prototype tests, scale model tests in the laboratory and rigorous analysis using validated Finite Element (FE) Models. The FE codes used were ABAQUS for implicit analysis and LSDYNA for explicit analysis. The rigorous analysis and dynamic simulation technique described in the paper can be used to investigate the structural response due to accident scenarios such as multiple roll over, impact of multiple objects and combinations of such events and thereby enhance the safety and performance of Roll Over and Falling Object Protection Systems (ROPS and FOPS). The analytical techniques are based on sound engineering principles and well established practice for investigation of dynamic impact on all self propelled vehicles. They are used for many other similar applications where experimental techniques are not feasible.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Handling information overload online, from the user's point of view is a big challenge, especially when the number of websites is growing rapidly due to growth in e-commerce and other related activities. Personalization based on user needs is the key to solving the problem of information overload. Personalization methods help in identifying relevant information, which may be liked by a user. User profile and object profile are the important elements of a personalization system. When creating user and object profiles, most of the existing methods adopt two-dimensional similarity methods based on vector or matrix models in order to find inter-user and inter-object similarity. Moreover, for recommending similar objects to users, personalization systems use the users-users, items-items and users-items similarity measures. In most cases similarity measures such as Euclidian, Manhattan, cosine and many others based on vector or matrix methods are used to find the similarities. Web logs are high-dimensional datasets, consisting of multiple users, multiple searches with many attributes to each. Two-dimensional data analysis methods may often overlook latent relationships that may exist between users and items. In contrast to other studies, this thesis utilises tensors, the high-dimensional data models, to build user and object profiles and to find the inter-relationships between users-users and users-items. To create an improved personalized Web system, this thesis proposes to build three types of profiles: individual user, group users and object profiles utilising decomposition factors of tensor data models. A hybrid recommendation approach utilising group profiles (forming the basis of a collaborative filtering method) and object profiles (forming the basis of a content-based method) in conjunction with individual user profiles (forming the basis of a model based approach) is proposed for making effective recommendations. A tensor-based clustering method is proposed that utilises the outcomes of popular tensor decomposition techniques such as PARAFAC, Tucker and HOSVD to group similar instances. An individual user profile, showing the user's highest interest, is represented by the top dimension values, extracted from the component matrix obtained after tensor decomposition. A group profile, showing similar users and their highest interest, is built by clustering similar users based on tensor decomposed values. A group profile is represented by the top association rules (containing various unique object combinations) that are derived from the searches made by the users of the cluster. An object profile is created to represent similar objects clustered on the basis of their similarity of features. Depending on the category of a user (known, anonymous or frequent visitor to the website), any of the profiles or their combinations is used for making personalized recommendations. A ranking algorithm is also proposed that utilizes the personalized information to order and rank the recommendations. The proposed methodology is evaluated on data collected from a real life car website. Empirical analysis confirms the effectiveness of recommendations made by the proposed approach over other collaborative filtering and content-based recommendation approaches based on two-dimensional data analysis methods.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Open pit mine operations are complex businesses that demand a constant assessment of risk. This is because the value of a mine project is typically influenced by many underlying economic and physical uncertainties, such as metal prices, metal grades, costs, schedules, quantities, and environmental issues, among others, which are not known with much certainty at the beginning of the project. Hence, mining projects present a considerable challenge to those involved in associated investment decisions, such as the owners of the mine and other stakeholders. In general terms, when an option exists to acquire a new or operating mining project, , the owners and stock holders of the mine project need to know the value of the mining project, which is the fundamental criterion for making final decisions about going ahead with the venture capital. However, obtaining the mine project’s value is not an easy task. The reason for this is that sophisticated valuation and mine optimisation techniques, which combine advanced theories in geostatistics, statistics, engineering, economics and finance, among others, need to be used by the mine analyst or mine planner in order to assess and quantify the existing uncertainty and, consequently, the risk involved in the project investment. Furthermore, current valuation and mine optimisation techniques do not complement each other. That is valuation techniques based on real options (RO) analysis assume an expected (constant) metal grade and ore tonnage during a specified period, while mine optimisation (MO) techniques assume expected (constant) metal prices and mining costs. These assumptions are not totally correct since both sources of uncertainty—that of the orebody (metal grade and reserves of mineral), and that about the future behaviour of metal prices and mining costs—are the ones that have great impact on the value of any mining project. Consequently, the key objective of this thesis is twofold. The first objective consists of analysing and understanding the main sources of uncertainty in an open pit mining project, such as the orebody (in situ metal grade), mining costs and metal price uncertainties, and their effect on the final project value. The second objective consists of breaking down the wall of isolation between economic valuation and mine optimisation techniques in order to generate a novel open pit mine evaluation framework called the ―Integrated Valuation / Optimisation Framework (IVOF)‖. One important characteristic of this new framework is that it incorporates the RO and MO valuation techniques into a single integrated process that quantifies and describes uncertainty and risk in a mine project evaluation process, giving a more realistic estimate of the project’s value. To achieve this, novel and advanced engineering and econometric methods are used to integrate financial and geological uncertainty into dynamic risk forecasting measures. The proposed mine valuation/optimisation technique is then applied to a real gold disseminated open pit mine deposit to estimate its value in the face of orebody, mining costs and metal price uncertainties.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In Central Queensland Mining Supplies Pty Ltd v Columbia Steel Casting Co Ltd [2011] QSC 183 Applegarth J considered complaints made by the defendant about the approach the plaintiff had taken in its endeavour to comply with its disclosure obligation under r 211 of the Uniform Civil Procedure Rules 1999 (Qld). The judgment also provides an indication of the direction the court is taking in relation to disclosure and document management in matters involving large numbers of documents.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The existing Collaborative Filtering (CF) technique that has been widely applied by e-commerce sites requires a large amount of ratings data to make meaningful recommendations. It is not directly applicable for recommending products that are not frequently purchased by users, such as cars and houses, as it is difficult to collect rating data for such products from the users. Many of the e-commerce sites for infrequently purchased products are still using basic search-based techniques whereby the products that match with the attributes given in the target user's query are retrieved and recommended to the user. However, search-based recommenders cannot provide personalized recommendations. For different users, the recommendations will be the same if they provide the same query regardless of any difference in their online navigation behaviour. This paper proposes to integrate collaborative filtering and search-based techniques to provide personalized recommendations for infrequently purchased products. Two different techniques are proposed, namely CFRRobin and CFAg Query. Instead of using the target user's query to search for products as normal search based systems do, the CFRRobin technique uses the products in which the target user's neighbours have shown interest as queries to retrieve relevant products, and then recommends to the target user a list of products by merging and ranking the returned products using the Round Robin method. The CFAg Query technique uses the products that the user's neighbours have shown interest in to derive an aggregated query, which is then used to retrieve products to recommend to the target user. Experiments conducted on a real e-commerce dataset show that both the proposed techniques CFRRobin and CFAg Query perform better than the standard Collaborative Filtering (CF) and the Basic Search (BS) approaches, which are widely applied by the current e-commerce applications. The CFRRobin and CFAg Query approaches also outperform the e- isting query expansion (QE) technique that was proposed for recommending infrequently purchased products.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

From location-aware computing to mining the social web, representations of context have promised to make better software applications. The opportunities and challenges of context-aware computing from representational, situated and interactional perspectives have been well documented, but arguments from the perspective of design are somewhat disparate. This paper draws on both theoretical perspectives and a design framing, using the problem of designing a social mobile agile ridesharing system, in order to reflect upon and call for broader design approaches for context-aware computing and human-computer Interaction research in general.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For more than a decade research in the field of context aware computing has aimed to find ways to exploit situational information that can be detected by mobile computing and sensor technologies. The goal is to provide people with new and improved applications, enhanced functionality and better use experience (Dey, 2001). Early applications focused on representing or computing on physical parameters, such as showing your location and the location of people or things around you. Such applications might show where the next bus is, which of your friends is in the vicinity and so on. With the advent of social networking software and microblogging sites such as Facebook and Twitter, recommender systems and so on context-aware computing is moving towards mining the social web in order to provide better representations and understanding of context, including social context. In this paper we begin by recapping different theoretical framings of context. We then discuss the problem of context- aware computing from a design perspective.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose – To investigate and identify the patterns of interaction between searchers and search engine during web searching. Design/methodology/approach – The authors examined 2,465,145 interactions from 534,507 users of Dogpile.com submitted on May 6, 2005, and compared query reformulation patterns. They investigated the type of query modifications and query modification transitions within sessions. Findings – The paper identifies three strong query reformulation transition patterns: between specialization and generalization; between video and audio, and between content change and system assistance. In addition, the findings show that web and images content were the most popular media collections. Originality/value – This research sheds light on the more complex aspects of web searching involving query modifications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Many user studies in Web information searching have found the significant effect of task types on search strategies. However, little attention was given to Web image searching strategies, especially the query reformulation activity despite that this is a crucial part in Web image searching. In this study, we investigated the effects of topic domains and task types on user’s image searching behavior and query reformulation strategies. Some significant differences in user’s tasks specificity and initial concepts were identified among the task domains. Task types are also found to influence participant’s result reviewing behavior and query reformulation strategies.