774 results for "data warehouse tuning aggregato business intelligence performance"


Relevance: 40.00%

Abstract:

As more diagnostic testing options become available to physicians, it becomes more difficult to combine various types of medical information to optimize the overall diagnosis. To improve diagnostic performance, we introduce an approach that optimizes a decision-fusion technique for combining heterogeneous information, such as information from different modalities, feature categories, or institutions. For classifier comparison we used two performance metrics: the area under the receiver operating characteristic (ROC) curve (AUC) and the normalized partial area under the curve (pAUC). This study used four classifiers: linear discriminant analysis (LDA), an artificial neural network (ANN), and two variants of our decision-fusion technique, AUC-optimized (DF-A) and pAUC-optimized (DF-P) decision fusion. We applied each of these classifiers with 100-fold cross-validation to two heterogeneous breast cancer data sets: one of mass lesion features and a much more challenging one of microcalcification lesion features. For the calcification data set, DF-A outperformed the other classifiers in terms of AUC (p < 0.02), achieving AUC = 0.85 ± 0.01, while DF-P surpassed the other classifiers in terms of pAUC (p < 0.01), reaching pAUC = 0.38 ± 0.02. For the mass data set, DF-A outperformed both the ANN and the LDA (p < 0.04), achieving AUC = 0.94 ± 0.01. Although there were no statistically significant differences among the classifiers' pAUC values for this data set (pAUC = 0.57 ± 0.07 to 0.67 ± 0.05, p > 0.10), DF-P did significantly improve specificity versus the LDA at both 98% and 100% sensitivity (p < 0.04). In conclusion, decision fusion directly optimized clinically significant performance measures, such as AUC and pAUC, and sometimes outperformed two well-known machine-learning techniques when applied to two different breast cancer data sets.
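The decision-fusion classifiers themselves are not spelled out in the abstract, but the two figures of merit they optimize are straightforward to compute. Below is a minimal sketch using synthetic scores; note that scikit-learn's `max_fpr` argument yields a standardized partial AUC over a low-false-positive-rate region, which is used here only as a stand-in assumption for the paper's high-sensitivity pAUC definition.

```python
# Sketch of the two performance metrics compared in the study: full AUC
# and a partial AUC. Scores are synthetic; the pAUC region here (FPR <= 0.1,
# McClish-standardized) is an assumption, not the paper's exact definition.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)                   # 0 = benign, 1 = malignant
y_score = y_true * 0.8 + rng.normal(0, 0.6, size=500)   # noisy classifier output

auc = roc_auc_score(y_true, y_score)                    # full area under ROC curve
pauc = roc_auc_score(y_true, y_score, max_fpr=0.1)      # standardized partial AUC
print(f"AUC = {auc:.3f}, pAUC (FPR <= 0.1) = {pauc:.3f}")
```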

Relevance: 40.00%

Abstract:

BACKGROUND: Dropouts and missing data are nearly ubiquitous in obesity randomized controlled trials, threatening the validity and generalizability of conclusions. Herein, we meta-analytically evaluate the extent of missing data, the frequency with which various analytic methods are employed to accommodate dropouts, and the performance of multiple statistical methods. METHODOLOGY/PRINCIPAL FINDINGS: We searched the PubMed and Cochrane databases (2000-2006) for articles published in English and manually searched bibliographic references. Articles on pharmaceutical randomized controlled trials with weight loss or weight gain prevention as major endpoints were included. Two authors independently reviewed each publication for inclusion; 121 articles met the inclusion criteria. Two authors independently extracted treatment, sample size, dropout rates, study duration, and the statistical method used to handle missing data from all articles, and resolved disagreements by consensus. In the meta-analysis, dropout rates were substantial, with the survival (non-dropout) rates approximated by an exponential decay curve, e^(-λt), where λ was estimated to be 0.0088 (95% bootstrap confidence interval: 0.0076 to 0.0100) and t represents time in weeks. The estimated dropout rate at 1 year was 37%. Most studies used last observation carried forward as the primary analytic method to handle missing data. We also obtained 12 raw obesity randomized controlled trial datasets for empirical analyses. Analyses of the raw randomized controlled trial data suggested that both mixed models and multiple imputation performed well, but that multiple imputation may be more robust when missing data are extensive. CONCLUSION/SIGNIFICANCE: Our analysis offers an equation for predicting dropout rates that is useful for future study planning. Our raw data analyses suggest that multiple imputation is better than other methods for handling missing data in obesity randomized controlled trials, followed closely by mixed models. We suggest these methods supplant last observation carried forward as the primary method of analysis.
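The reported dropout equation can be applied directly for study planning; the short sketch below evaluates the fitted curve and reproduces the 1-year figure quoted above.

```python
# The dropout model reported above: retention decays exponentially,
# S(t) = e^(-lambda * t), with lambda = 0.0088 per week.
import math

lam = 0.0088          # estimated decay rate per week (95% CI: 0.0076 to 0.0100)

def dropout_rate(weeks: float) -> float:
    """Expected fraction of participants lost by `weeks` of follow-up."""
    return 1.0 - math.exp(-lam * weeks)

print(f"{dropout_rate(52):.0%}")  # ~37% at 1 year, matching the paper's estimate
```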

Relevance: 40.00%

Abstract:

The performance of the register insertion protocol for mixed voice-data traffic is investigated by simulation. The simulation model incorporates a common insertion buffer for station and ring packets. Bandwidth allocation is achieved by imposing a queue limit at each node. A simple priority scheme is introduced by allowing the queue limit to vary from node to node. This enables voice traffic to be given priority over data. The effect on performance of various operational and design parameters such as ratio of voice to data traffic, queue limit and voice packet size is investigated. Comparisons are made where possible with related work on other protocols proposed for voice-data integration. The main conclusions are: (a) there is a general degradation of performance as the ratio of voice traffic to data traffic increases, (b) substantial improvement in performance can be achieved by restricting the queue length at data nodes and (c) for a given ring utilisation, smaller voice packets result in lower delays for both voice and data traffic.
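The per-node queue limit is the key mechanism here: giving voice nodes a larger limit than data nodes is what creates the priority. The following is a deliberately simplified, illustrative sketch of that idea, not the paper's simulator; all parameters and names are assumptions.

```python
# Illustrative sketch of the priority scheme described above: each node
# enforces a queue limit on its insertion buffer, and voice nodes get a
# larger limit than data nodes, so voice traffic is favoured.
from collections import deque

class Node:
    def __init__(self, name: str, queue_limit: int):
        self.name = name
        self.queue_limit = queue_limit  # per-node limit imposed by the protocol
        self.buffer = deque()
        self.blocked = 0                # packets refused because the queue was full

    def offer(self, packet: str) -> None:
        if len(self.buffer) < self.queue_limit:
            self.buffer.append(packet)
        else:
            self.blocked += 1

voice = Node("voice", queue_limit=8)    # higher limit -> priority for voice
data = Node("data", queue_limit=2)      # restricted limit at data nodes

for t in range(100):
    voice.offer(f"v{t}")                # offered load: one packet per node per tick
    data.offer(f"d{t}")
    if t % 2 == 0:                      # the ring drains one packet per node every
        for n in (voice, data):         # other tick (service rate below load)
            if n.buffer:
                n.buffer.popleft()

for n in (voice, data):
    print(f"{n.name}: blocked {n.blocked} of 100 offered packets")
```

Running this shows the data node blocking far more of its offered packets than the voice node, which is the qualitative effect behind conclusion (b) above.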

Relevance: 40.00%

Abstract:

Many Web applications walk the thin line between the need for dynamic data and the need to meet user performance expectations. In environments where funds are not available to constantly upgrade hardware in line with user demand, alternative approaches need to be considered. This paper introduces a ‘data farming’ model whereby dynamic data, which is ‘grown’ in operational applications, is ‘harvested’ and ‘packaged’ for various consumer markets. Like any well-managed agricultural operation, crops are harvested according to historical and perceived demand, as inferred by a self-optimising process. This approach aims to make enhanced use of available resources through better utilisation of system downtime, thereby improving application performance and increasing the availability of key business data.
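A minimal sketch of the idea, under stated assumptions: during low-load windows, the most demanded dynamic results are pre-computed ("harvested") into a static cache ("packaged"), with demand inferred from request history. The names (`harvest`, `demand_counts`, etc.) are illustrative, not the paper's API.

```python
# 'Data farming' sketch: harvest popular dynamic results during downtime,
# serve the packaged copies at request time, fall back to live computation.
import time

demand_counts: dict[str, int] = {}   # perceived demand, updated per request
harvested: dict[str, str] = {}       # packaged results served to consumers

def run_query_live(query: str) -> str:
    return f"result({query})@{time.time():.0f}"   # stand-in for expensive work

def harvest(top_n: int = 3) -> None:
    """Run during system downtime: pre-compute the most demanded queries."""
    popular = sorted(demand_counts, key=demand_counts.get, reverse=True)[:top_n]
    for q in popular:
        harvested[q] = run_query_live(q)

def serve(query: str) -> str:
    demand_counts[query] = demand_counts.get(query, 0) + 1
    return harvested.get(query) or run_query_live(query)  # packaged if available
```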

Relevance: 40.00%

Abstract:

Purpose: Environmental turbulence, including rapid changes in technology and markets, has resulted in the need for new approaches to performance measurement and benchmarking. There is a need for studies that attempt to measure and benchmark upstream, leading or developmental aspects of organisations. Therefore, the aim of this paper is twofold. The first is to conduct an in-depth case analysis of lead performance measurement and benchmarking, leading to the further development of a conceptual model derived from the extant literature and initial survey data. The second is to outline future research agendas that could further develop the framework and the subject area.

Design/methodology/approach: A multiple case analysis involving repeated in-depth interviews with managers in organisational areas of upstream influence in the case organisations.

Findings: It was found that the effect of external drivers for lead performance measurement and benchmarking was mediated by organisational context factors such as level of progression in business improvement methods. Moreover, the legitimation of the business improvement methods used for this purpose, although typical, had been extended beyond their original purpose with the development of bespoke sets of lead measures.

Practical implications: Examples of methods and lead measures are given that can be used by organisations in developing a programme of lead performance measurement and benchmarking.

Originality/value: There is a paucity of in-depth studies relating to the theory and practice of lead performance measurement and benchmarking in organisations.

Relevance: 40.00%

Abstract:

In the last decade, data mining has emerged as one of the most dynamic and lively areas in information technology. Although many algorithms and techniques for data mining have been proposed, they focus either on domain-independent techniques or on very specific domain problems. A general requirement in bridging the gap between academia and business is to cater to the domain-related issues surrounding real-life applications, such as constraints, organizational factors, domain expert knowledge, domain adaptation, and operational knowledge. Unfortunately, these either have not been addressed, or have not been sufficiently addressed, in current data mining research and development. Domain-Driven Data Mining (D3M) aims to develop general principles, methodologies, and techniques for modeling and merging comprehensive domain-related factors and synthesized ubiquitous intelligence surrounding problem domains with the data mining process, and for discovering knowledge to support business decision-making. This paper aims to report original, cutting-edge, and state-of-the-art progress in D3M. It covers theoretical and applied contributions aiming to: 1) propose next-generation data mining frameworks and processes for actionable knowledge discovery; 2) investigate effective (automated, human- and machine-centered, and/or human-machine-cooperated) principles and approaches for acquiring, representing, modeling, and engaging ubiquitous intelligence in real-world data mining; and 3) develop workable and operational systems that balance technical significance and application concerns, converting and delivering actionable knowledge as operational application rules that seamlessly engage application processes and systems.
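One concrete way domain factors enter the mining loop is as expert-supplied constraints that filter candidate patterns before they reach decision-makers. The sketch below is a minimal illustration of that idea only; the `Rule` format and the constraint are assumptions, not the paper's method.

```python
# Illustrative D3M-style step: screen mined rules against a domain
# constraint so only actionable knowledge is delivered to the business.
from dataclasses import dataclass

@dataclass
class Rule:
    antecedent: frozenset
    consequent: str
    confidence: float

def domain_acceptable(rule: Rule) -> bool:
    # Hypothetical expert constraint: rules targeting dormant accounts are
    # not actionable for this business process, whatever their confidence.
    return "dormant_account" not in rule.antecedent and rule.confidence >= 0.7

candidates = [
    Rule(frozenset({"high_balance"}), "offer_upgrade", 0.82),
    Rule(frozenset({"dormant_account"}), "offer_upgrade", 0.91),
]
actionable = [r for r in candidates if domain_acceptable(r)]
print(len(actionable), "actionable rule(s)")   # 1
```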