48 resultados para Data mining and knowledge discovery
Resumo:
Two hazard risk assessment matrices for the ranking of occupational health risks are described. The qualitative matrix uses qualitative measures of probability and consequence to determine risk assessment codes for hazard-disease combinations. A walk-through survey of an underground metalliferous mine and concentrator is used to demonstrate how the qualitative matrix can be applied to determine priorities for the control of occupational health hazards. The semi-quantitative matrix uses attributable risk as a quantitative measure of probability and uses qualitative measures of consequence. A practical application of this matrix is the determination of occupational health priorities using existing epidemiological studies. Calculated attributable risks from epidemiological studies of hazard-disease combinations in mining and minerals processing are used as examples. These historic response data do not reflect the risks associated with current exposures. A method using current exposure data, known exposure-response relationships and the semi-quantitative matrix is proposed for more accurate and current risk rankings.
Resumo:
This paper proposes a novel application of fuzzy logic to web data mining for two basic problems of a website: popularity and satisfaction. Popularity means that people will visit the website while satisfaction refers to the usefulness of the site. We will illustrate that the popularity of a website is a fuzzy logic problem. It is an important characteristic of a website in order to survive in Internet commerce. The satisfaction of a website is also a fuzzy logic problem that represents the degree of success in the application of information technology to the business. We propose a framework of fuzzy logic for the representation of these two problems based on web data mining techniques to fuzzify the attributes of a website.
Resumo:
Electricity market price forecast is a changeling yet very important task for electricity market managers and participants. Due to the complexity and uncertainties in the power grid, electricity prices are highly volatile and normally carry with spikes. which may be (ens or even hundreds of times higher than the normal price. Such electricity spikes are very difficult to be predicted. So far. most of the research on electricity price forecast is based on the normal range electricity prices. This paper proposes a data mining based electricity price forecast framework, which can predict the normal price as well as the price spikes. The normal price can be, predicted by a previously proposed wavelet and neural network based forecast model, while the spikes are forecasted based on a data mining approach. This paper focuses on the spike prediction and explores the reasons for price spikes based on the measurement of a proposed composite supply-demand balance index (SDI) and relative demand index (RDI). These indices are able to reflect the relationship among electricity demand, electricity supply and electricity reserve capacity. The proposed model is based on a mining database including market clearing price, trading hour. electricity), demand, electricity supply and reserve. Bayesian classification and similarity searching techniques are used to mine the database to find out the internal relationships between electricity price spikes and these proposed. The mining results are used to form the price spike forecast model. This proposed model is able to generate forecasted price spike, level of spike and associated forecast confidence level. The model is tested with the Queensland electricity market data with promising results. Crown Copyright (C) 2004 Published by Elsevier B.V. All rights reserved.
Resumo:
Virtual teams differ from tradidonal, co-located teams in that they primarily communicate via informadon technolog}' such as email, video conferencing and web based coUaboradve environments rather than in a face-to-face medium. There has been a lack of empirical research into the influence that leadership has within virtual teams upon key outcomes such as performance and knowledge sharing. This paper examines antecedents of knowledge sharing and performance, namely role clarit)' and trust in a team leader. We predicted that transformadonal leadership would posidvely influence both performance and knowledge sharing within virtual teams. We also h^'pothesised that trust in a leader and role clarit)' would mediate both the associadon between transformadonal leadership and performance as well as the associadon between transformadonal leadership and knowledge sharing within virtual teams. Data was collected from a public sector organisadon using virtual teams, Pardcipants responded to a self-report quesdonnaire. Supervisor radngs of performance and knowledge sharing were also obtained. In general we found support for a posidve reladonship between transformadonal leadership and performance and knowledge sharing within virtual teams. Using mediated muldple regression, we found support for the mediadng role of trust in the leader and role clarit}' between transformadonal leadership and performance and knowledge sharing. Implicadons of the results are provided.
Resumo:
Objective: An estimation of cut-off points for the diagnosis of diabetes mellitus (DM) based on individual risk factors. Methods: A subset of the 1991 Oman National Diabetes Survey is used, including all patients with a 2h post glucose load >= 200 mg/dl (278 subjects) and a control group of 286 subjects. All subjects previously diagnosed as diabetic and all subjects with missing data values were excluded. The data set was analyzed by use of the SPSS Clementine data mining system. Decision Tree Learners (C5 and CART) and a method for mining association rules (the GRI algorithm) are used. The fasting plasma glucose (FPG), age, sex, family history of diabetes and body mass index (BMI) are input risk factors (independent variables), while diabetes onset (the 2h post glucose load >= 200 mg/dl) is the output (dependent variable). All three techniques used were tested by use of crossvalidation (89.8%). Results: Rules produced for diabetes diagnosis are: A- GRI algorithm (1) FPG>=108.9 mg/dl, (2) FPG>=107.1 and age>39.5 years. B- CART decision trees: FPG >=110.7 mg/dl. C- The C5 decision tree learner: (1) FPG>=95.5 and 54, (2) FPG>=106 and 25.2 kg/m2. (3) FPG>=106 and =133 mg/dl. The three techniques produced rules which cover a significant number of cases (82%), with confidence between 74 and 100%. Conclusion: Our approach supports the suggestion that the present cut-off value of fasting plasma glucose (126 mg/dl) for the diagnosis of diabetes mellitus needs revision, and the individual risk factors such as age and BMI should be considered in defining the new cut-off value.
Resumo:
The enormous amount of information generated through sequencing of the human genome has increased demands for more economical and flexible alternatives in genomics, proteomics and drug discovery. Many companies and institutions have recognised the potential of increasing the size and complexity of chemical libraries by producing large chemical libraries on colloidal support beads. Since colloid-based compounds in a suspension are randomly located, an encoding system such as optical barcoding is required to permit rapid elucidation of the compound structures. We describe in this article innovative methods for optical barcoding of colloids for use as support beads in both combinatorial and non-combinatorial libraries. We focus in particular on the difficult problem of barcoding extremely large libraries, which if solved, will transform the manner in which genomics, proteomics and drug discovery research is currently performed.