844 resultados para Data mining and knowledge discovery
Resumo:
Electricity market price forecast is a changeling yet very important task for electricity market managers and participants. Due to the complexity and uncertainties in the power grid, electricity prices are highly volatile and normally carry with spikes. which may be (ens or even hundreds of times higher than the normal price. Such electricity spikes are very difficult to be predicted. So far. most of the research on electricity price forecast is based on the normal range electricity prices. This paper proposes a data mining based electricity price forecast framework, which can predict the normal price as well as the price spikes. The normal price can be, predicted by a previously proposed wavelet and neural network based forecast model, while the spikes are forecasted based on a data mining approach. This paper focuses on the spike prediction and explores the reasons for price spikes based on the measurement of a proposed composite supply-demand balance index (SDI) and relative demand index (RDI). These indices are able to reflect the relationship among electricity demand, electricity supply and electricity reserve capacity. The proposed model is based on a mining database including market clearing price, trading hour. electricity), demand, electricity supply and reserve. Bayesian classification and similarity searching techniques are used to mine the database to find out the internal relationships between electricity price spikes and these proposed. The mining results are used to form the price spike forecast model. This proposed model is able to generate forecasted price spike, level of spike and associated forecast confidence level. The model is tested with the Queensland electricity market data with promising results. Crown Copyright (C) 2004 Published by Elsevier B.V. All rights reserved.
Resumo:
Virtual teams differ from tradidonal, co-located teams in that they primarily communicate via informadon technolog}' such as email, video conferencing and web based coUaboradve environments rather than in a face-to-face medium. There has been a lack of empirical research into the influence that leadership has within virtual teams upon key outcomes such as performance and knowledge sharing. This paper examines antecedents of knowledge sharing and performance, namely role clarit)' and trust in a team leader. We predicted that transformadonal leadership would posidvely influence both performance and knowledge sharing within virtual teams. We also h^'pothesised that trust in a leader and role clarit)' would mediate both the associadon between transformadonal leadership and performance as well as the associadon between transformadonal leadership and knowledge sharing within virtual teams. Data was collected from a public sector organisadon using virtual teams, Pardcipants responded to a self-report quesdonnaire. Supervisor radngs of performance and knowledge sharing were also obtained. In general we found support for a posidve reladonship between transformadonal leadership and performance and knowledge sharing within virtual teams. Using mediated muldple regression, we found support for the mediadng role of trust in the leader and role clarit}' between transformadonal leadership and performance and knowledge sharing. Implicadons of the results are provided.
Resumo:
Objective: An estimation of cut-off points for the diagnosis of diabetes mellitus (DM) based on individual risk factors. Methods: A subset of the 1991 Oman National Diabetes Survey is used, including all patients with a 2h post glucose load >= 200 mg/dl (278 subjects) and a control group of 286 subjects. All subjects previously diagnosed as diabetic and all subjects with missing data values were excluded. The data set was analyzed by use of the SPSS Clementine data mining system. Decision Tree Learners (C5 and CART) and a method for mining association rules (the GRI algorithm) are used. The fasting plasma glucose (FPG), age, sex, family history of diabetes and body mass index (BMI) are input risk factors (independent variables), while diabetes onset (the 2h post glucose load >= 200 mg/dl) is the output (dependent variable). All three techniques used were tested by use of crossvalidation (89.8%). Results: Rules produced for diabetes diagnosis are: A- GRI algorithm (1) FPG>=108.9 mg/dl, (2) FPG>=107.1 and age>39.5 years. B- CART decision trees: FPG >=110.7 mg/dl. C- The C5 decision tree learner: (1) FPG>=95.5 and 54, (2) FPG>=106 and 25.2 kg/m2. (3) FPG>=106 and =133 mg/dl. The three techniques produced rules which cover a significant number of cases (82%), with confidence between 74 and 100%. Conclusion: Our approach supports the suggestion that the present cut-off value of fasting plasma glucose (126 mg/dl) for the diagnosis of diabetes mellitus needs revision, and the individual risk factors such as age and BMI should be considered in defining the new cut-off value.
Resumo:
The data available during the drug discovery process is vast in amount and diverse in nature. To gain useful information from such data, an effective visualisation tool is required. To provide better visualisation facilities to the domain experts (screening scientist, biologist, chemist, etc.),we developed a software which is based on recently developed principled visualisation algorithms such as Generative Topographic Mapping (GTM) and Hierarchical Generative Topographic Mapping (HGTM). The software also supports conventional visualisation techniques such as Principal Component Analysis, NeuroScale, PhiVis, and Locally Linear Embedding (LLE). The software also provides global and local regression facilities . It supports regression algorithms such as Multilayer Perceptron (MLP), Radial Basis Functions network (RBF), Generalised Linear Models (GLM), Mixture of Experts (MoE), and newly developed Guided Mixture of Experts (GME). This user manual gives an overview of the purpose of the software tool, highlights some of the issues to be taken care while creating a new model, and provides information about how to install & use the tool. The user manual does not require the readers to have familiarity with the algorithms it implements. Basic computing skills are enough to operate the software.
Resumo:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. miniDVMS v1.8 provides a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualisation domain. The advantage of this interface is that the user is directly involved in the data mining process. Principled projection methods, such as generative topographic mapping (GTM) and hierarchical GTM (HGTM), are integrated with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, and user interaction facilities, to provide this integrated visual data mining framework. The software also supports conventional visualisation techniques such as principal component analysis (PCA), Neuroscale, and PhiVis. This user manual gives an overview of the purpose of the software tool, highlights some of the issues to be taken care while creating a new model, and provides information about how to install and use the tool. The user manual does not require the readers to have familiarity with the algorithms it implements. Basic computing skills are enough to operate the software.
Resumo:
The work reported in this paper is part of a project simulating maintenance operations in an automotive engine production facility. The decisions made by the people in charge of these operations form a crucial element of this simulation. Eliciting this knowledge is problematic. One approach is to use the simulation model as part of the knowledge elicitation process. This paper reports on the experience so far with using a simulation model to support knowledge management in this way. Issues are discussed regarding the data available, the use of the model, and the elicitation process itself. © 2004 Elsevier B.V. All rights reserved.
Resumo:
Today, the data available to tackle many scientific challenges is vast in quantity and diverse in nature. The exploration of heterogeneous information spaces requires suitable mining algorithms as well as effective visual interfaces. Most existing systems concentrate either on mining algorithms or on visualization techniques. Though visual methods developed in information visualization have been helpful, for improved understanding of a complex large high-dimensional dataset, there is a need for an effective projection of such a dataset onto a lower-dimension (2D or 3D) manifold. This paper introduces a flexible visual data mining framework which combines advanced projection algorithms developed in the machine learning domain and visual techniques developed in the information visualization domain. The framework follows Shneiderman’s mantra to provide an effective user interface. The advantage of such an interface is that the user is directly involved in the data mining process. We integrate principled projection methods, such as Generative Topographic Mapping (GTM) and Hierarchical GTM (HGTM), with powerful visual techniques, such as magnification factors, directional curvatures, parallel coordinates, billboarding, and user interaction facilities, to provide an integrated visual data mining framework. Results on a real life high-dimensional dataset from the chemoinformatics domain are also reported and discussed. Projection results of GTM are analytically compared with the projection results from other traditional projection methods, and it is also shown that the HGTM algorithm provides additional value for large datasets. The computational complexity of these algorithms is discussed to demonstrate their suitability for the visual data mining framework.
Resumo:
Background: Research into mental-health risks has tended to focus on epidemiological approaches and to consider pieces of evidence in isolation. Less is known about the particular factors and their patterns of occurrence that influence clinicians’ risk judgements in practice. Aims: To identify the cues used by clinicians to make risk judgements and to explore how these combine within clinicians’ psychological representations of suicide, self-harm, self-neglect, and harm to others. Method: Content analysis was applied to semi-structured interviews conducted with 46 practitioners from various mental-health disciplines, using mind maps to represent the hierarchical relationships of data and concepts. Results: Strong consensus between experts meant their knowledge could be integrated into a single hierarchical structure for each risk. This revealed contrasting emphases between data and concepts underpinning risks, including: reflection and forethought for suicide; motivation for self-harm; situation and context for harm to others; and current presentation for self-neglect. Conclusions: Analysis of experts’ risk-assessment knowledge identified influential cues and their relationships to risks. It can inform development of valid risk-screening decision support systems that combine actuarial evidence with clinical expertise.
Resumo:
We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
Resumo:
This dissertation examines internationalisation of small and medium sized enterprises. There has been a journey to achieve this. The research has started as an action research as Teaching Company Scheme Associate. This has been done in two research cycles, which investigated factors for successful internationalisation of a small and medium sized UK manufacturing enterprise. This has revealed that successful internationalisation requires good technology and knowledge transfer to the new operations. The action research is followed by a survey that has been conducted within UK manufacturing companies. The data collected was analysed under three models: entry mode selection, role of factory and level of internationalisation. The first two models explain two major aspects of internationalisation decision. The last is showing what makes successful internationalising small and medium sized companies. These models provided several important results. The small and medium sized enterprise internationalisation is harder to achieve because most of these organisations do not have experience in technology and knowledge transfer. The success of internationalisation depends on the success of the transfer. This is achieved through employee ownership of the new knowledge. There are many factors affecting this result such as the network relationships such as trust, control and commitment and cognitive distance between two organisations. The last is a product of the difference between prior knowledge and the required level of knowledge. The entry mode and role of factory are decided through these factors while the level of internationalisation can only be explained by absorptive capacity of the recipient organisation and the technology transfer ability of the host organisation.
Resumo:
Despite years of effort in building organisational taxonomies, the potential of ontologies to support knowledge management in complex technical domains is under-exploited. The authors of this chapter present an approach to using rich domain ontologies to support sense-making tasks associated with resolving mechanical issues. Using Semantic Web technologies, the authors have built a framework and a suite of tools which support the whole semantic knowledge lifecycle. These are presented by describing the process of issue resolution for a simulated investigation concerning failure of bicycle brakes. Foci of the work have included ensuring that semantic tasks fit in with users’ everyday tasks, to achieve user acceptability and support the flexibility required by communities of practice with differing local sub-domains, tasks, and terminology.
Resumo:
This paper describes the knowledge elicitation and knowledge representation aspects of a system being developed to help with the design and maintenance of relational data bases. The size algorithmic components. In addition, the domain contains multiple experts, but any given expert's knowledge of this large domain is only partial. The paper discusses the methods and techniques used for knowledge elicitation, which was based on a "broad and shallow" approach at first, moving to a "narrow and deep" one later, and describes the models used for knowledge representation, which were based on a layered "generic and variants" approach. © 1995.
Resumo:
The management and sharing of complex data, information and knowledge is a fundamental and growing concern in the Water and other Industries for a variety of reasons. For example, risks and uncertainties associated with climate, and other changes require knowledge to prepare for a range of future scenarios and potential extreme events. Formal ways in which knowledge can be established and managed can help deliver efficiencies on acquisition, structuring and filtering to provide only the essential aspects of the knowledge really needed. Ontologies are a key technology for this knowledge management. The construction of ontologies is a considerable overhead on any knowledge management programme. Hence current computer science research is investigating generating ontologies automatically from documents using text mining and natural language techniques. As an example of this, results from application of the Text2Onto tool to stakeholder documents for a project on sustainable water cycle management in new developments are presented. It is concluded that by adopting ontological representations sooner, rather than later in an analytical process, decision makers will be able to make better use of highly knowledgeable systems containing automated services to ensure that sustainability considerations are included.