143 resultados para Recommender System, Opinion Mining, Association Rule Mining, User Review

em CentAUR: Central Archive University of Reading - UK


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Recently major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single core CPUs, the trend clearly goes towards multi core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be on the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution focuses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach for dis- tributed memory systems of the frequent subgraphs mining problem. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational envi- ronments. The forth paper describes the combined use and the customization of software packages to facilitate a top down parallelism in the tuning of Support Vector Machines (SVM) and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution finally focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that has provided detailed reviews for each paper. We would like to also thank Matthew Otey who helped with publicity for the workshop.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Twitter network has been labelled the most commonly used microblogging application around today. With about 500 million estimated registered users as of June, 2012, Twitter has become a credible medium of sentiment/opinion expression. It is also a notable medium for information dissemination; including breaking news on diverse issues since it was launched in 2007. Many organisations, individuals and even government bodies follow activities on the network in order to obtain knowledge on how their audience reacts to tweets that affect them. We can use postings on Twitter (known as tweets) to analyse patterns associated with events by detecting the dynamics of the tweets. A common way of labelling a tweet is by including a number of hashtags that describe its contents. Association Rule Mining can find the likelihood of co-occurrence of hashtags. In this paper, we propose the use of temporal Association Rule Mining to detect rule dynamics, and consequently dynamics of tweets. We coined our methodology Transaction-based Rule Change Mining (TRCM). A number of patterns are identifiable in these rule dynamics including, new rules, emerging rules, unexpected rules and ?dead' rules. Also the linkage between the different types of rule dynamics is investigated experimentally in this paper.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper considers the use of Association Rule Mining (ARM) and our proposed Transaction based Rule Change Mining (TRCM) to identify the rule types present in tweet’s hashtags over a specific consecutive period of time and their linkage to real life occurrences. Our novel algorithm was termed TRCM-RTI in reference to Rule Type Identification. We created Time Frame Windows (TFWs) to detect evolvement statuses and calculate the lifespan of hashtags in online tweets. We link RTI to real life events by monitoring and recording rule evolvement patterns in TFWs on the Twitter network.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In order to gain insights into events and issues that may cause errors and outages in parts of IP networks, intelligent methods that capture and express causal relationships online (in real-time) are needed. Whereas generalised rule induction has been explored for non-streaming data applications, its application and adaptation on streaming data is mostly undeveloped or based on periodic and ad-hoc training with batch algorithms. Some association rule mining approaches for streaming data do exist, however, they can only express binary causal relationships. This paper presents the ongoing work on Online Generalised Rule Induction (OGRI) in order to create expressive and adaptive rule sets real-time that can be applied to a broad range of applications, including network telemetry data streams.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Garment information tracking is required for clean room garment management. In this paper, we present a camera-based robust system with implementation of Optical Character Reconition (OCR) techniques to fulfill garment label recognition. In the system, a camera is used for image capturing; an adaptive thresholding algorithm is employed to generate binary images; Connected Component Labelling (CCL) is then adopted for object detection in the binary image as a part of finding the ROI (Region of Interest); Artificial Neural Networks (ANNs) with the BP (Back Propagation) learning algorithm are used for digit recognition; and finally the system is verified by a system database. The system has been tested. The results show that it is capable of coping with variance of lighting, digit twisting, background complexity, and font orientations. The system performance with association to the digit recognition rate has met the design requirement. It has achieved real-time and error-free garment information tracking during the testing.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this article, we examine the case of a system that cooperates with a “direct” user to plan an activity that some “indirect” user, not interacting with the system, should perform. The specific application we consider is the prescription of drugs. In this case, the direct user is the prescriber and the indirect user is the person who is responsible for performing the therapy. Relevant characteristics of the two users are represented in two user models. Explanation strategies are represented in planning operators whose preconditions encode the cognitive state of the indirect user; this allows tailoring the message to the indirect user's characteristics. Expansion of optional subgoals and selection among candidate operators is made by applying decision criteria represented as metarules, that negotiate between direct and indirect users' views also taking into account the context where explanation is provided. After the message has been generated, the direct user may ask to add or remove some items, or change the message style. The system defends the indirect user's needs as far as possible by mentioning the rationale behind the generated message. If needed, the plan is repaired and the direct user model is revised accordingly, so that the system learns progressively to generate messages suited to the preferences of people with whom it interacts.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Healthcare information systems have the potential to enhance productivity, lower costs, and reduce medication errors by automating business processes. However, various issues such as system complexity and system abilities in a relation to user requirements as well as rapid changes in business needs have an impact on the use of these systems. In many cases failure of a system to meet business process needs has pushed users to develop alternative work processes (workarounds) to fill this gap. Some research has been undertaken on why users are motivated to perform and create workarounds. However, very little research has assessed the consequences on patient safety. Moreover, the impact of performing these workarounds on the organisation and how to quantify risks and benefits is not well analysed. Generally, there is a lack of rigorous understanding and qualitative and quantitative studies on healthcare IS workarounds and their outcomes. This project applies A Normative Approach for Modelling Workarounds to develop A Model of Motivation, Constraints, and Consequences. It aims to understand the phenomenon in-depth and provide guidelines to organisations on how to deal with workarounds. Finally the method is demonstrated on a case study example and its relative merits discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A programmable data acquisition system to allow novel use of meteorological radiosondes for atmospheric science measurements is described. In its basic form it supports four analogue inputs at 16 bit resolution, and up to two further inputs at lower resolution configurable instead for digital instruments. It also provides multiple instrument power supplies (+8V, +16V, +5V and -8V) from the 9V radiosonde battery. During a balloon flight encountering air temperatures from +17°C to -66°C, the worst case voltage drift in the 5V unipolar digitisation circuitry was 20mV. The system liberates a new range of low cost atmospheric research measurements, by utilising radiosondes routinely launched internationally for weather forecasting purposes. No additional receiving equipment is required. Comparisons between the specially instrumented and standard meteorological radiosondes show negligible effect of the additional instrumentation on the standard meteorological data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Arctic has undergone substantial changes over the last few decades in various cryospheric and derivative systems and processes. Of these, the Arctic sea ice regime has seen some of the most rapid change and is one of the most visible markers of Arctic change outside the scientific community. This has drawn considerable attention not only from the natural sciences, but increasingly, from the political and commercial sectors as they begin to grapple with the problems and opportunities that are being presented. The possible impacts of past and projected changes in Arctic sea ice, especially as it relates to climatic response, are of particular interest and have been the subject of increasing research activity. A review of the current knowledge of the role of sea ice in the climate system is therefore timely. We present a review that examines both the current state of understanding, as regards the impacts of sea-ice loss observed to date, and climate model projections, to highlight hypothesised future changes and impacts on storm tracks and the North Atlantic Oscillation. Within the broad climate-system perspective, the topics of storminess and large-scale variability will be specifically considered. We then consider larger-scale impacts on the climatic system by reviewing studies that have focused on the interaction between sea-ice extent and the North Atlantic Oscillation. Finally, an overview of the representation of these topics in the literature in the context of IPCC climate projections is presented. While most agree on the direction of Arctic sea-ice change, the rates amongst the various projections vary greatly. Similarly, the response of storm tracks and climate variability are uncertain, exacerbated possibly by the influence of other factors. A variety of scientific papers on the relationship between sea-ice changes and atmospheric variability have brought to light important aspects of this complex topic. Examples are an overall reduction in the number of Arctic winter storms, a northward shift of mid-latitude winter storms in the Pacific and a delayed negative NAO-like response in autumn/winter to a reduced Arctic sea-ice cover (at least in some months). This review paper discusses this research and the disagreements, bringing about a fresh perspective on this issue.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We present a general Multi-Agent System framework for distributed data mining based on a Peer-to-Peer model. Agent protocols are implemented through message-based asynchronous communication. The framework adopts a dynamic load balancing policy that is particularly suitable for irregular search algorithms. A modular design allows a separation of the general-purpose system protocols and software components from the specific data mining algorithm. The experimental evaluation has been carried out on a parallel frequent subgraph mining algorithm, which has shown good scalability performances.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Aircraft Maintenance, Repair and Overhaul (MRO) agencies rely largely on row-data based quotation systems to select the best suppliers for the customers (airlines). The data quantity and quality becomes a key issue to determining the success of an MRO job, since we need to ensure we achieve cost and quality benchmarks. This paper introduces a data mining approach to create an MRO quotation system that enhances the data quantity and data quality, and enables significantly more precise MRO job quotations. Regular Expression was utilized to analyse descriptive textual feedback (i.e. engineer’s reports) in order to extract more referable highly normalised data for job quotation. A text mining based key influencer analysis function enables the user to proactively select sub-parts, defects and possible solutions to make queries more accurate. Implementation results show that system data would improve cost quotation in 40% of MRO jobs, would reduce service cost without causing a drop in service quality.