50 results for Data Pre-Processing and Performance Evaluation

in University of Queensland eSpace - Australia


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Identifying and predicting non-technical losses (NTL) are important tasks for many utilities. Data from a customer information system (CIS) can be used for NTL analysis. However, to perform NTL analysis accurately and efficiently, the original CIS data need to be pre-processed before any detailed analysis can be carried out. In this paper, we propose a feature-selection-based method for CIS data pre-processing that extracts the most relevant information for further analysis such as clustering and classification. By removing irrelevant and redundant features, feature selection finds an optimal subset of features; it is an essential step in the data mining process, improving the quality of results through faster processing, higher accuracy and simpler results with fewer features. Detailed feature selection analysis is presented in the paper. Both time-domain and load shape data are compared based on the accuracy, consistency and statistical dependencies between features.
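The abstract does not say which feature selection technique the authors use. As an illustration only, the general idea of removing irrelevant and redundant features can be sketched with a simple filter: drop constant columns, then drop any column that is almost perfectly correlated with one already kept. The function names, the 0.95 threshold and the sample records are all hypothetical and not taken from the paper.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_features(columns, redundancy_threshold=0.95):
    """Keep features that vary and are not near-duplicates of a kept one.

    columns: dict mapping feature name -> list of numeric values.
    The threshold is an assumed value chosen for illustration.
    """
    kept = []
    for name, values in columns.items():
        if len(set(values)) <= 1:
            continue  # constant column: treated here as irrelevant
        if any(abs(pearson(values, columns[k])) >= redundancy_threshold
               for k in kept):
            continue  # redundant with an already selected feature
        kept.append(name)
    return kept

# Hypothetical CIS-style records (illustrative values only).
customers = {
    "monthly_kwh": [310, 450, 120, 980, 640],
    "monthly_wh": [310000, 450000, 120000, 980000, 640000],  # same data, other unit
    "tariff_code": [1, 1, 1, 1, 1],                          # constant
    "peak_ratio": [0.42, 0.31, 0.77, 0.25, 0.58],
}
selected = select_features(customers)
print(selected)
```

With these sample values the filter keeps `monthly_kwh` and `peak_ratio`. A real NTL pipeline would more likely rank features against a target label as well, which this sketch does not attempt.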

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The movement of chemicals through the soil to groundwater, or their discharge to surface waters, degrades these resources. In many cases, serious human and stock health implications are associated with this form of pollution. The chemicals of interest include nutrients, pesticides, salts, and industrial wastes. Recent studies have shown that current models and methods do not adequately describe the leaching of nutrients through soil, often underestimating the risk of groundwater contamination by surface-applied chemicals, and overestimating the concentration of resident solutes. This inaccuracy results primarily from ignoring soil structure and nonequilibrium between soil constituents, water, and solutes. A multiple sample percolation system (MSPS), consisting of 25 individual collection wells, was constructed to study the effects of localized soil heterogeneities on the transport of nutrients (NO₃⁻, Cl⁻, PO₄³⁻) in the vadose zone of an agricultural soil dominated by clay. Highly significant variations in drainage patterns across a small spatial scale were observed (one-way ANOVA, p < 0.001), indicating considerable heterogeneity in water flow patterns and nutrient leaching. Using data collected from the multiple sample percolation experiments, this paper compares the performance of two mathematical models for predicting solute transport: the advective-dispersion model with a reaction term (ADR), and a two-region preferential flow model (TRM) suitable for modelling nonequilibrium transport. These results have implications for modelling solute transport and predicting nutrient loading on a larger scale. (C) 2001 Elsevier Science Ltd. All rights reserved.
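The abstract does not give the equations of the two models. For orientation, the forms below are the common one-dimensional textbook versions, assuming a first-order reaction term for the ADR and a mobile-immobile formulation for the two-region model; the paper's exact formulation may differ. The symbols are assumptions: C is solute concentration, z depth, t time, D the dispersion coefficient, v the pore-water velocity, μ a first-order rate constant, θ_m and θ_im the mobile and immobile water contents, q the water flux and α the mass-transfer coefficient.

```latex
% ADR: advection-dispersion with an assumed first-order reaction term
\frac{\partial C}{\partial t}
  = D \frac{\partial^2 C}{\partial z^2}
  - v \frac{\partial C}{\partial z}
  - \mu C

% TRM: mobile region exchanging solute with an immobile region
\theta_m \frac{\partial C_m}{\partial t}
  + \theta_{im} \frac{\partial C_{im}}{\partial t}
  = \theta_m D_m \frac{\partial^2 C_m}{\partial z^2}
  - q \frac{\partial C_m}{\partial z}

\theta_{im} \frac{\partial C_{im}}{\partial t}
  = \alpha \left( C_m - C_{im} \right)
```

The exchange term α(C_m − C_im) is what lets the two-region form represent nonequilibrium between fast-flowing and stagnant water.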

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective: To describe the workload profile in a network of Australian skin cancer clinics. Design and setting: Analysis of billing data for the first 6 months of 2005 in a primary-care skin cancer clinic network, consisting of seven clinics and staffed by 20 doctors, located in the Northern Territory, Queensland and New South Wales. Main outcome measures: Consultation to biopsy ratio (CBR); biopsy to treatment ratio (BTR); number of benign naevi excised per melanoma (number needed to treat [NNT]). Results: Of 69 780 billed activities, 34 622 (49.6%) were consultations, 19 358 (27.7%) biopsies, 8055 (11.5%) surgical excisions, 2804 (4.0%) additional surgical repairs, 1613 (2.3%) non-surgical treatments of cancers and 3328 (4.8%) treatments of premalignant or non-malignant lesions. A total of 6438 cancers were treated (116 melanomas by excision, 4709 non-melanoma skin cancers [NMSCs] by excision, and 1613 NMSCs non-surgically); 5251 (65.2%) surgical wounds were repaired by direct suture, 2651 (32.9%) by a flap (of which 44.8% were simple flaps), 42 (0.5%) by wedge excision and 111 (1.4%) by grafts. The CBR was 1.79, the BTR was 3.1 and the NNT was 28.6. Conclusions: In this network of Australian skin cancer clinics, one in three biopsies identified a skin cancer (BTR, 3.1), and about 29 benign lesions were excised per melanoma (NNT, 28.6). The estimated NNT was similar to that reported previously in general practice. More data are needed on health outcomes, including effectiveness of treatment and surgical repair.
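The activity shares and the consultation to biopsy ratio can be rechecked directly from the counts in the abstract. The short sketch below does only that. It does not recompute the BTR or the NNT, because the abstract does not give the exact denominators used for them. The variable names are illustrative.

```python
# Billed activity counts as reported in the abstract.
billed = {
    "consultations": 34622,
    "biopsies": 19358,
    "surgical excisions": 8055,
    "additional surgical repairs": 2804,
    "non-surgical cancer treatments": 1613,
    "premalignant or non-malignant treatments": 3328,
}

total = sum(billed.values())

# Share of each activity type, as a percentage rounded to one decimal.
shares = {name: round(100 * count / total, 1) for name, count in billed.items()}

# Consultation to biopsy ratio (CBR): consultations per biopsy.
cbr = round(billed["consultations"] / billed["biopsies"], 2)

# Cancers treated: melanomas excised + NMSCs excised + NMSCs treated non-surgically.
cancers_treated = 116 + 4709 + 1613

print(total, cbr, cancers_treated)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The total, the percentages, the CBR of 1.79 and the 6438 treated cancers all match the figures quoted in the abstract.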

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this study was to establish the effect that pre-cooling the skin without a concomitant reduction in core temperature has on subsequent self-paced cycling performance under warm humid (31 °C and 60% relative humidity) conditions. Seven moderately trained males performed a 30 min self-paced cycling trial on two separate occasions. The conditions were counterbalanced as control or whole-body pre-cooling by water immersion so that resting skin temperature was reduced by approximately 5-6 °C. After pre-cooling, mean skin temperature was lower throughout exercise and rectal temperature was lower (P < 0.05) between 15 and 25 min of exercise. Consequently, heat storage increased (P < 0.003) from 84.0 ± 8.8 W·m⁻² to 153 ± 13.1 W·m⁻² (mean ± s_x̄) after pre-cooling, while total body sweat fell from 1.7 ± 0.1 l·h⁻¹ to 1.2 ± 0.1 l·h⁻¹ (P < 0.05). The distance cycled increased from 14.9 ± 0.8 to 15.8 ± 0.7 km (P < 0.05) after pre-cooling. The results indicate that skin pre-cooling in the absence of a reduced rectal temperature is effective in reducing thermal strain and increasing the distance cycled in 30 min under warm humid conditions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In simultaneous analyses of multiple data partitions, the trees relevant when measuring support for a clade are the optimal tree and the best tree lacking the clade (i.e., the most reasonable alternative). The parsimony-based method of partitioned branch support (PBS) forces each data set to arbitrate between the two relevant trees. This value is the amount each data set contributes to clade support in the combined analysis, and can be very different from the support apparent in separate analyses. The approach used in PBS can also be employed in likelihood: a simultaneous analysis of all data retrieves the maximum likelihood tree, and the best tree without the clade of interest is also found. Each data set is fitted to the two trees and the log-likelihood difference calculated, giving partitioned likelihood support (PLS) for each data set. These calculations can be performed regardless of the complexity of the ML model adopted. The significance of PLS can be evaluated using a variety of resampling methods, such as the Kishino-Hasegawa test, the Shimodaira-Hasegawa test, or likelihood weights, although the appropriateness and assumptions of these tests remain debated.
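The PLS calculation described above reduces to a per-partition subtraction once each data set has been fitted to both trees. The sketch below assumes those per-partition log-likelihoods are already available from a phylogenetics package; the partition names and values are invented for illustration, and the function name is hypothetical.

```python
def partitioned_likelihood_support(lnl_ml_tree, lnl_alt_tree):
    """Per-partition lnL on the ML tree minus lnL on the best tree lacking the clade.

    Both arguments map partition name -> log-likelihood of that partition
    when fitted to the respective tree.
    """
    return {p: lnl_ml_tree[p] - lnl_alt_tree[p] for p in lnl_ml_tree}

# Hypothetical log-likelihoods (illustrative values only).
lnl_ml = {"gene_A": -2310.4, "gene_B": -1875.9, "morphology": -412.7}
lnl_alt = {"gene_A": -2318.1, "gene_B": -1874.2, "morphology": -415.0}

pls = partitioned_likelihood_support(lnl_ml, lnl_alt)

# If the combined log-likelihood is the sum over partitions, the PLS values
# sum to the total log-likelihood difference between the two trees.
total_support = sum(pls.values())

for partition, value in pls.items():
    print(f"{partition}: {value:+.1f}")
print(f"total: {total_support:+.1f}")
```

In this made-up example `gene_B` has negative PLS: it fits the alternative tree better even though the combined data support the clade. This is the kind of hidden conflict that separate analyses can mask.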

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper assesses the importance of fund flows in the performance evaluation of Australian international equity funds. Two concepts of fund flows are considered in the context of a conditional asset pricing model. The first measure is net fund flow relative to fund size and the second is net fund flow relative to sector flows. We find that incorporating a fund flow measure relative to the sector flow results in a reduction of measured perverse market timing. The results indicate that, at the individual fund level, cash flows are relevant in assessing management outcomes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The schema of an information system can significantly impact the ability of end users to efficiently and effectively retrieve the information they need. Quickly obtaining the appropriate data increases the likelihood that an organization will make good decisions and respond adeptly to challenges. This research presents and validates a methodology for evaluating, ex ante, the relative desirability of alternative instantiations of a model of data. In contrast to prior research, each instantiation is based on a different formal theory. This research theorizes that the instantiation that yields the lowest weighted average query complexity for a representative sample of information requests is the most desirable instantiation for end-user queries. The theory was validated by an experiment that compared end-user performance using an instantiation of a data structure based on the relational model of data with performance using the corresponding instantiation of the data structure based on the object-relational model of data. Complexity was measured using three different Halstead metrics: program length, difficulty, and effort. For a representative sample of queries, the average complexity using each instantiation was calculated. As theorized, end users querying the instantiation with the lower average complexity made fewer semantic errors, i.e., were more effective at composing queries. (c) 2005 Elsevier B.V. All rights reserved.
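The three Halstead metrics named in the abstract have standard definitions in terms of operator and operand counts. The sketch below applies those textbook formulas and then takes a weighted average across queries. How the authors tokenised queries into operators and operands, and how they chose the weights, is not stated in the abstract, so the counts and weights here are purely hypothetical.

```python
from math import log2

def halstead(n1, n2, N1, N2):
    """Standard Halstead metrics.

    n1, n2: number of distinct operators and distinct operands.
    N1, N2: total occurrences of operators and of operands.
    """
    length = N1 + N2                      # program length N
    volume = length * log2(n1 + n2)       # volume V = N * log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)     # difficulty D
    effort = difficulty * volume          # effort E = D * V
    return {"length": length, "difficulty": difficulty, "effort": effort}

def weighted_average(values, weights):
    """Average of values weighted by, for example, how often each query is asked."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical token counts for two queries against one schema instantiation.
simple_query = halstead(n1=4, n2=4, N1=6, N2=6)
complex_query = halstead(n1=8, n2=8, N1=16, N2=16)

# Assumed weights: the simple query is asked three times as often.
avg_effort = weighted_average(
    [simple_query["effort"], complex_query["effort"]], weights=[3, 1]
)
print(simple_query, complex_query, avg_effort)
```

Repeating the calculation for the same information requests on a second instantiation, then comparing the two weighted averages, follows the comparison the abstract describes.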

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Frequent itemset mining is well explored for various data types, and its computational complexity is well understood. There are methods to deal effectively with computational problems. This paper shows another approach to further performance enhancements of frequent itemset computation. We have made a series of observations that led us to devise data pre-processing methods such that the final step of the Partition algorithm, where a combination of all local candidate sets must be processed, is executed on substantially smaller input data. The paper shows results from several experiments that confirmed our general and formally presented observations.
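For context, the baseline two-phase Partition algorithm that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard algorithm only. The pre-processing methods proposed in the paper are not described in the abstract and are not reproduced here. The local mining step uses brute-force subset enumeration, which is acceptable only for tiny examples, and the function names and sample transactions are hypothetical.

```python
from itertools import combinations

def local_frequent_itemsets(transactions, min_support):
    """Itemsets contained in at least min_support * len(transactions) transactions.

    Brute force: enumerates every subset of every transaction.
    """
    threshold = min_support * len(transactions)
    counts = {}
    for t in transactions:
        for size in range(1, len(t) + 1):
            for itemset in combinations(sorted(t), size):
                counts[itemset] = counts.get(itemset, 0) + 1
    return {s for s, c in counts.items() if c >= threshold}

def partition_frequent_itemsets(transactions, min_support, n_partitions):
    """Two-phase Partition algorithm over a list of transactions (sets of items)."""
    # Phase 1: mine each partition separately and merge the local results.
    size = -(-len(transactions) // n_partitions)  # ceiling division
    candidates = set()
    for start in range(0, len(transactions), size):
        chunk = transactions[start:start + size]
        candidates |= local_frequent_itemsets(chunk, min_support)

    # Phase 2 (the final step): count every merged candidate on the full database.
    threshold = min_support * len(transactions)
    result = {}
    for cand in candidates:
        count = sum(1 for t in transactions if set(cand) <= t)
        if count >= threshold:
            result[cand] = count
    return result

# Hypothetical transaction database.
db = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}, {"a", "b"}, {"c"}]
frequent = partition_frequent_itemsets(db, min_support=0.5, n_partitions=2)
print(frequent)
```

Any globally frequent itemset must be locally frequent in at least one partition, so the merged candidate set is complete. It can also contain false positives such as `("a", "c")` here, which is why the second pass is needed and why shrinking its input is attractive.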