31 results for Data-driven analysis
in University of Queensland eSpace - Australia
Abstract:
The integration of geo-information from multiple sources and of diverse nature in developing mineral favourability indexes (MFIs) is a well-known problem in mineral exploration and mineral resource assessment. Fuzzy set theory provides a convenient framework to combine and analyse qualitative and quantitative data independently of their source or characteristics. A novel, data-driven formulation for calculating MFIs based on fuzzy analysis is developed in this paper. Different geo-variables are considered fuzzy sets and their appropriate membership functions are defined and modelled. A new weighted average-type aggregation operator is then introduced to generate a new fuzzy set representing mineral favourability. The membership grades of the new fuzzy set are considered as the MFI. The weights for the aggregation operation combine the individual membership functions of the geo-variables, and are derived using information from training areas and L1 regression. The technique is demonstrated in a case study of skarn tin deposits and is used to integrate geological, geochemical and magnetic data. The study area covers a total of 22.5 km² and is divided into 349 cells, which include nine control cells. Nine geo-variables are considered in this study. Depending on the nature of the various geo-variables, four different types of membership functions are used to model the fuzzy membership of the geo-variables involved. © 2002 Elsevier Science Ltd. All rights reserved.
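The weighted-average aggregation this abstract describes can be sketched in a few lines. This is a minimal illustration only: the membership grades, weights, and cell counts below are hypothetical, not the paper's fitted membership functions or regression-derived weights.

```python
import numpy as np

# Hypothetical membership grades for three geo-variables in four cells,
# each value in [0, 1] after applying a membership function to raw data.
membership = np.array([
    [0.9, 0.7, 0.8],   # cell 1
    [0.2, 0.4, 0.1],   # cell 2
    [0.6, 0.9, 0.5],   # cell 3
    [0.1, 0.0, 0.3],   # cell 4
])

# Hypothetical weights (e.g. derived by regression on control cells),
# normalised so they sum to 1.
w = np.array([0.5, 0.3, 0.2])
w = w / w.sum()

# Weighted-average aggregation: the membership grade of the combined
# fuzzy set is the mineral favourability index (MFI) for each cell.
mfi = membership @ w

# Cells ordered from most to least favourable.
ranking = np.argsort(mfi)[::-1]
```

Because the weights are normalised and every membership grade lies in [0, 1], the resulting MFI is itself a valid membership grade in [0, 1], which is what allows it to be read as the grade of a new fuzzy set.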
Abstract:
Performance indicators in the public sector have often been criticised for being inadequate and not conducive to analysing efficiency. The main objective of this study is to use data envelopment analysis (DEA) to examine the relative efficiency of Australian universities. Three performance models are developed, namely, overall performance, performance on delivery of educational services, and performance on fee-paying enrolments. The findings based on 1995 data show that the university sector was performing well on technical and scale efficiency but there was room for improving performance on fee-paying enrolments. There were also small slacks in input utilisation. More universities were operating at decreasing returns to scale, indicating a potential to downsize. DEA helps in identifying the reference sets for inefficient institutions and objectively determines productivity improvements. As such, it can be a valuable benchmarking tool for educational administrators and assist in more efficient allocation of scarce resources. In the absence of market mechanisms to price educational outputs, which renders traditional production or cost functions inappropriate, universities are particularly obliged to seek alternative efficiency analysis methods such as DEA.
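As a rough illustration of what a DEA efficiency score involves, the input-oriented, constant-returns-to-scale (CCR) envelopment problem can be solved with one small linear programme per decision-making unit. This is a sketch only: the study's actual models, inputs, and outputs are more elaborate, and the toy data below are invented.

```python
import numpy as np
from scipy.optimize import linprog

def dea_ccr_input(X, Y):
    """Input-oriented CCR efficiency for each DMU.
    X: (n_dmus, n_inputs), Y: (n_dmus, n_outputs)."""
    n, m = X.shape
    s = Y.shape[1]
    scores = []
    for o in range(n):
        # Variables: [theta, lambda_1..lambda_n]; minimise theta.
        c = np.r_[1.0, np.zeros(n)]
        # Inputs:  sum_j lambda_j * x_ij - theta * x_io <= 0
        A_in = np.c_[-X[o].reshape(m, 1), X.T]
        # Outputs: y_ro - sum_j lambda_j * y_rj <= 0
        A_out = np.c_[np.zeros((s, 1)), -Y.T]
        A_ub = np.vstack([A_in, A_out])
        b_ub = np.r_[np.zeros(m), -Y[o]]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * (n + 1), method="highs")
        scores.append(res.fun)
    return np.array(scores)

# Toy data: one input (e.g. staff), one output (e.g. graduates).
X = np.array([[2.0], [4.0], [8.0]])
Y = np.array([[1.0], [2.0], [2.0]])
print(dea_ccr_input(X, Y))  # units 1 and 2 are efficient (score 1), unit 3 is not
```

An inefficient unit's optimal lambda weights identify its reference set, the peer units whose combination dominates it, which is the benchmarking information the abstract highlights.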
Abstract:
The tests that are currently available for the measurement of overexpression of the human epidermal growth factor receptor 2 (HER2) in breast cancer have shown considerable problems in accuracy and interlaboratory reproducibility. Although these problems are partly alleviated by the use of validated, standardised 'kits', there may be considerable cost involved in their use. Prior to testing it may therefore be an advantage to be able to predict from basic pathology data whether a cancer is likely to overexpress HER2. In this study, we have correlated pathology features of cancers with the frequency of HER2 overexpression assessed by immunohistochemistry (IHC) using HercepTest (Dako). In addition, fluorescence in situ hybridisation (FISH) has been used to re-test the equivocal cancers and interobserver variation in assessing HER2 overexpression has been examined by a slide circulation scheme. Of the 1536 cancers, 1144 (74.5%) did not overexpress HER2. Unequivocal overexpression (3+ by IHC) was seen in 186 cancers (12%) and an equivocal result (2+ by IHC) was seen in 206 cancers (13%). Of the 156 IHC 3+ cancers for which complete data were available, 149 (95.5%) were ductal NST and 152 (97%) were histological grade 2 or 3. Only 1 of 124 infiltrating lobular carcinomas (0.8%) showed HER2 overexpression. None of the 49 'special types' of carcinoma showed HER2 overexpression. Re-testing by FISH of a proportion of the IHC 2+ cancers showed that only 25 (23%) of those assessable exhibited HER2 gene amplification, but 46 of the 47 IHC 3+ cancers (98%) were confirmed as showing gene amplification. Circulating slides for the assessment of HER2 score showed a moderate level of agreement between pathologists (kappa 0.4). As a result of this study we would advocate consideration of a triage approach to HER2 testing.
Infiltrating lobular and special types of carcinoma may not need to be routinely tested at presentation, nor may grade 1 NST carcinomas, of which only 1.4% have been shown to overexpress HER2. Testing of these carcinomas may be performed when HER2 status is required to assist in therapeutic or other clinical/prognostic decision-making. The highest yield of HER2 overexpressing carcinomas is seen in the grade 3 NST subgroup, in which 24% are positive by IHC. © 2003 Elsevier Science Ltd. All rights reserved.
Abstract:
Australian banks are currently generating huge profits but are they sustainable? NECMI AVKIRAN suggests that banks will need to scrutinise the performance of their networks to ensure future profits.
Abstract:
Functional magnetic resonance imaging (FMRI) analysis methods can be quite generally divided into hypothesis-driven and data-driven approaches. The former are utilised in the majority of FMRI studies, where a specific haemodynamic response is modelled utilising knowledge of event timing during the scan, and is tested against the data using a t test or a correlation analysis. These approaches often lack the flexibility to account for variability in haemodynamic response across subjects and brain regions, which is of specific interest in high temporal-resolution event-related studies. Current data-driven approaches attempt to identify components of interest in the data, but currently do not utilise any physiological information for the discrimination of these components. Here we present a hypothesis-driven approach that is an extension of Friman's maximum correlation modelling method (NeuroImage 16, 454–464, 2002) specifically focused on discriminating the temporal characteristics of event-related haemodynamic activity. Test analyses, on both simulated and real event-related FMRI data, will be presented.
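The basic hypothesis-driven pipeline the abstract contrasts against can be sketched as follows: build a modelled regressor from known event timing and test it against a voxel time series with a correlation t test. This is an illustrative stand-in, not Friman's constrained method; the gamma-shaped response, timings, and noise level are all assumed.

```python
import numpy as np

rng = np.random.default_rng(1)

t = np.arange(0, 200, 2.0)            # scan times (s), TR = 2 s
onsets = np.arange(10, 190, 30.0)     # hypothetical event onsets (s)

def hrf(x):
    """Simple gamma-shaped canonical response, peak normalised to 1."""
    x = np.clip(x, 0.0, None)
    return x ** 5 * np.exp(-x) / (5 ** 5 * np.exp(-5))

# Modelled regressor: superposed responses to the known events.
regressor = sum(hrf(t - on) for on in onsets)

# Synthetic voxel time series: scaled regressor plus Gaussian noise.
voxel = 0.8 * regressor + rng.normal(0.0, 0.5, t.size)

r = np.corrcoef(regressor, voxel)[0, 1]
t_stat = r * np.sqrt((t.size - 2) / (1 - r ** 2))   # correlation t test
```

The rigidity the abstract criticises is visible here: the shape of `hrf` is fixed in advance, so any subject- or region-specific departure from it lowers `r` even when a genuine response is present.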
Abstract:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is based on maximum likelihood on the basis of the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonic increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonic increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright © 2003 John Wiley & Sons, Ltd.
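The structure of the mixture likelihood can be illustrated with a fully parametric analogue: logistic mixing probability, exponential component hazards, and direct maximisation of the full likelihood. This is a hypothetical sketch of the kind of parametric comparator the abstract mentions; the paper's semi-parametric ECM method leaves the baseline hazards unspecified and is not reproduced here.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def negloglik(par, t, delta, cause, x):
    """Full likelihood for a two-type competing-risks mixture:
    logistic model for P(type 1 | x), exponential component hazards."""
    b0, b1, g1, g2 = par
    p = expit(b0 + b1 * x)                       # P(failure type 1 | x)
    lam1, lam2 = np.exp(g1), np.exp(g2)          # component hazard rates
    dens1 = p * lam1 * np.exp(-lam1 * t)         # failed from cause 1
    dens2 = (1 - p) * lam2 * np.exp(-lam2 * t)   # failed from cause 2
    surv = p * np.exp(-lam1 * t) + (1 - p) * np.exp(-lam2 * t)  # censored
    contrib = np.where(delta == 0, surv, np.where(cause == 1, dens1, dens2))
    return -np.sum(np.log(contrib))

# Simulated data from the model (true values: b0=0.5, b1=1, lam1=1, lam2=2).
rng = np.random.default_rng(2)
n = 2000
x = rng.integers(0, 2, n).astype(float)
cause = np.where(rng.random(n) < expit(0.5 + 1.0 * x), 1, 2)
t = np.where(cause == 1, rng.exponential(1.0, n), rng.exponential(0.5, n))
delta = (t < 3.0).astype(int)      # administrative censoring at t = 3
tobs = np.minimum(t, 3.0)

res = minimize(negloglik, np.zeros(4), args=(tobs, delta, cause, x),
               method="Nelder-Mead")
lam1_hat = np.exp(res.x[2])        # recovered cause-1 hazard rate
```

Note how censored observations contribute the mixture survivor function rather than a component density, since their failure type is unobserved; that is the feature that makes the joint estimation of the logistic and hazard parameters non-trivial.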
Abstract:
The aim of this study was to apply multifailure survival methods to analyze time to multiple occurrences of basal cell carcinoma (BCC). Data from 4.5 years of follow-up in a randomized controlled trial, the Nambour Skin Cancer Prevention Trial (1992-1996), to evaluate skin cancer prevention were used to assess the influence of sunscreen application on the time to first BCC and the time to subsequent BCCs. Three different approaches of time to ordered multiple events were applied and compared: the Andersen-Gill, Wei-Lin-Weissfeld, and Prentice-Williams-Peterson models. Robust variance estimation approaches were used for all multifailure survival models. Sunscreen treatment was not associated with time to first occurrence of a BCC (hazard ratio = 1.04, 95% confidence interval: 0.79, 1.45). Time to subsequent BCC tumors using the Andersen-Gill model resulted in a lower estimated hazard among the daily sunscreen application group, although statistical significance was not reached (hazard ratio = 0.82, 95% confidence interval: 0.59, 1.15). Similarly, both the Wei-Lin-Weissfeld marginal-hazards and the Prentice-Williams-Peterson gap-time models revealed trends toward a lower risk of subsequent BCC tumors among the sunscreen intervention group. These results demonstrate the importance of conducting multiple-event analysis for recurring events, as risk factors for a single event may differ from those where repeated events are considered.
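For readers unfamiliar with how recurrent-event models of this kind are fitted, the key preparatory step is arranging each subject's follow-up as (start, stop] at-risk intervals. A hypothetical sketch of the counting-process layout (invented values, not the trial's data):

```python
import pandas as pd

# Hypothetical subject with BCCs at 1.2 and 3.0 years, censored at 4.5 years.
# Andersen-Gill: one row per at-risk interval on a common baseline timescale.
ag = pd.DataFrame({
    "id":        [7, 7, 7],
    "start":     [0.0, 1.2, 3.0],
    "stop":      [1.2, 3.0, 4.5],
    "event":     [1, 1, 0],      # 1 = BCC occurred at `stop`
    "sunscreen": [1, 1, 1],      # daily-application arm
})

# Prentice-Williams-Peterson gap-time models instead reset the clock after
# each event and stratify on the event number:
ag["gap"] = ag["stop"] - ag["start"]
ag["event_number"] = range(1, len(ag) + 1)   # stratum for PWP
```

The Wei-Lin-Weissfeld marginal approach differs again: each event number gets its own time-from-entry risk set, and all three models use robust (sandwich) variance estimates to account for within-subject correlation, as the abstract notes.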
Abstract:
The second edition of An Introduction to Efficiency and Productivity Analysis is designed to be a general introduction for those who wish to study efficiency and productivity analysis. The book provides an accessible, well-written introduction to the four principal methods involved: econometric estimation of average response models; index numbers; data envelopment analysis (DEA); and stochastic frontier analysis (SFA). For each method, a detailed introduction to the basic concepts is presented, numerical examples are provided, and some of the more important extensions to the basic methods are discussed. Of special interest is the systematic use of detailed empirical applications using real-world data throughout the book. In recent years, there have been a number of excellent advanced-level books published on performance measurement. This book, however, is the first systematic survey of performance measurement with the express purpose of introducing the field to a wide audience of students, researchers, and practitioners. Indeed, the 2nd Edition maintains its uniqueness: (1) It is a well-written introduction to the field. (2) It outlines, discusses and compares the four principal methods for efficiency and productivity analysis in a well-motivated presentation. (3) It provides detailed advice on computer programs that can be used to implement these performance measurement methods. The book contains computer instructions and output listings for the SHAZAM, LIMDEP, TFPIP, DEAP and FRONTIER computer programs. More extensive listings of data and computer instruction files are available on the book's website: (www.uq.edu.au/economics/cepa/crob2005).
Abstract:
The effect of number of samples and selection of data for analysis on the calculation of surface motor unit potential (SMUP) size in the statistical method of motor unit number estimates (MUNE) was determined in 10 normal subjects and 10 with amyotrophic lateral sclerosis (ALS). We recorded 500 sequential compound muscle action potentials (CMAPs) at three different stable stimulus intensities (10–50% of maximal CMAP). Estimated mean SMUP sizes were calculated using Poisson statistical assumptions from the variance of 500 sequential CMAPs obtained at each stimulus intensity. The results with the 500 data points were compared with smaller subsets from the same data set. The results using a range of 50–80% of the 500 data points were compared with the full 500. The effect of restricting analysis to data between 5 and 20% of the CMAP and to standard deviation limits was also assessed. No differences in mean SMUP size were found with stimulus intensity or use of different ranges of data. Consistency was improved with a greater sample number. Data within 5% of CMAP size gave both increased consistency and reduced mean SMUP size in many subjects, but excluded valid responses present at that stimulus intensity. These changes were more prominent in ALS patients, in whom the presence of isolated SMUP responses was a striking difference from normal subjects. Noise, spurious data, and large SMUPs limited the Poisson assumptions. When these factors are considered, consistent statistical MUNE can be calculated from a continuous sequence of data points. A 2 to 2.5 SD or 10% window are reasonable methods of limiting data for analysis. Muscle Nerve 27: 320–331, 2003
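The Poisson reasoning behind the statistical MUNE method can be sketched numerically. The values below are hypothetical, not the study's recordings: if the number of units activated per stimulus at a fixed submaximal intensity is treated as Poisson, the variance-to-mean ratio of the CMAP responses recovers the mean single-unit (SMUP) size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical: at a fixed submaximal intensity, the number of activated
# units per stimulus is ~Poisson(lam); each contributes smup_true (mV).
smup_true, lam = 0.05, 8.0
counts = rng.poisson(lam, size=500)    # 500 sequential stimuli
cmap = counts * smup_true              # recorded CMAP sizes (mV)

# For a scaled Poisson variable, variance = mean * unit size, so:
smup_est = cmap.var() / cmap.mean()    # estimated mean SMUP size (mV)
mune = 25.0 / smup_est                 # e.g. with a 25 mV maximal CMAP
```

This also makes the study's caveats concrete: noise and isolated large SMUPs inflate the variance and hence the estimated unit size, which is why windowing the data (e.g. a 2 to 2.5 SD limit) changes the estimate.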
Abstract:
Remotely sensed data have been used extensively for environmental monitoring and modeling at a number of spatial scales; however, a limited range of satellite imaging systems often constrained the scales of these analyses. A wider variety of data sets is now available, allowing image data to be selected to match the scale of environmental structure(s) or process(es) being examined. A framework is presented for use by environmental scientists and managers, enabling their spatial data collection needs to be linked to a suitable form of remotely sensed data. A six-step approach is used, combining image spatial analysis and scaling tools, within the context of hierarchy theory. The main steps involved are: (1) identification of information requirements for the monitoring or management problem; (2) development of ideal image dimensions (scene model); (3) exploratory analysis of existing remotely sensed data using scaling techniques; (4) selection and evaluation of suitable remotely sensed data based on the scene model; (5) selection of suitable spatial analytic techniques to meet information requirements; and (6) cost-benefit analysis. Results from a case study show that the framework provided an objective mechanism to identify relevant aspects of the monitoring problem and environmental characteristics for selecting remotely sensed data and analysis techniques.
Abstract:
We have constructed cDNA microarrays for soybean (Glycine max L. Merrill), containing approximately 4,100 Unigene ESTs derived from axenic roots, to evaluate their application and utility for functional genomics of organ differentiation in legumes. We assessed microarray technology by conducting studies to evaluate the accuracy of microarray data and have found them to be both reliable and reproducible in repeat hybridisations. Several ESTs showed high levels (>50-fold) of differential expression in either root or shoot tissue of soybean. A small number of physiologically interesting, and differentially expressed, sequences found by microarray analysis were verified by both quantitative real-time RT-PCR and Northern blot analysis. There was a linear correlation (r² = 0.99, over 5 orders of magnitude) between microarray and quantitative real-time RT-PCR data. Microarray analysis of soybean has enormous potential not only for the discovery of new genes involved in tissue differentiation and function, but also to study the expression of previously characterised genes, gene networks and gene interactions in wild-type, mutant or transgenic plants.
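A cross-platform agreement check of the kind reported here is usually computed on a log scale, since the measurements span several orders of magnitude. A minimal sketch with invented values (not the paper's expression data):

```python
import numpy as np

# Hypothetical paired expression measures spanning ~5 orders of magnitude.
microarray = np.array([1.2, 8.0, 95.0, 1.1e3, 9.0e3, 1.05e5])
qpcr       = np.array([1.0, 10.0, 1.0e2, 1.0e3, 1.0e4, 1.0e5])

# Pearson correlation of the log-transformed values; r^2 near 1 indicates
# linear agreement between the two platforms across the full range.
r = np.corrcoef(np.log10(microarray), np.log10(qpcr))[0, 1]
r_squared = r ** 2
```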
Abstract:
As for other complex diseases, linkage analyses of schizophrenia (SZ) have produced evidence for numerous chromosomal regions, with inconsistent results reported across studies. The presence of locus heterogeneity appears likely and may reduce the power of linkage analyses if homogeneity is assumed. In addition, when multiple heterogeneous datasets are pooled, intersample variation in the proportion of linked families ( a) may diminish the power of the pooled sample to detect susceptibility loci, in spite of the larger sample size obtained. We compare the significance of linkage. findings obtained using allele- sharing LOD scores ( LODexp) - which assume homogeneity - and heterogeneity LOD scores ( HLOD) in European American and African American NIMH SZ families. We also pool these two samples and evaluate the relative power of the LODexp and two different heterogeneity statistics. One of these ( HLOD- P) estimates the heterogeneity parameter a only in aggregate data, while the second ( HLOD- S) determines a separately for each sample. In separate and combined data, we show consistently improved performance of HLOD scores over LODexp. Notably, genome-wide significant evidence for linkage is obtained at chromosome 10p in the European American sample using a recessive HLOD score. When the two samples are combined, linkage at the 10p locus also achieves genome-wide significance under HLOD- S, but not HLOD- P. Using HLOD- S, improved evidence for linkage was also obtained for a previously reported region on chromosome 15q. In linkage analyses of complex disease, power may be maximised by routinely modelling locus heterogeneity within individual datasets, even when multiple datasets are combined to form larger samples.
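In its standard admixture-model form, the heterogeneity LOD score maximises over the linked-family proportion α, and the HLOD-P versus HLOD-S distinction is just whether one α is fitted to the pooled data or one per sample. A schematic sketch, assuming per-family LOD scores are available (the per-family values below are invented):

```python
import numpy as np

def hlod(family_lods, grid=np.linspace(0.0, 1.0, 1001)):
    """Admixture-model heterogeneity LOD: maximise over alpha, the
    proportion of linked families (one alpha for the whole input,
    i.e. HLOD-P style when the input is pooled data)."""
    lr = 10.0 ** np.asarray(family_lods)   # per-family likelihood ratios
    scores = [np.sum(np.log10(a * lr + (1 - a))) for a in grid]
    i = int(np.argmax(scores))
    return scores[i], grid[i]

def hlod_s(samples):
    """HLOD-S style: estimate alpha separately per sample and sum."""
    return sum(hlod(s)[0] for s in samples)

# Hypothetical per-family LODs: half the families show positive evidence.
fams = np.r_[np.full(10, 0.6), np.full(10, -0.4)]
score, alpha_hat = hlod(fams)
```

At α = 1 the HLOD reduces to the ordinary summed LOD, and at α = 0 it is zero, so the HLOD can never fall below either; under heterogeneity, as here, the maximum sits at an intermediate α and exceeds the homogeneity LOD, which is the power advantage the abstract reports.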