22 results for Statistics for life sciences
in University of Queensland eSpace - Australia
Abstract:
We present a novel nonparametric density estimator and a new data-driven bandwidth selection method with excellent properties. The approach is inspired by the principles of the generalized cross entropy method. The proposed density estimation procedure has numerous advantages over the traditional kernel density estimator methods. Firstly, for the first time in the nonparametric literature, the proposed estimator allows for a genuine incorporation of prior information in the density estimation procedure. Secondly, the approach provides the first data-driven bandwidth selection method that is guaranteed to provide a unique bandwidth for any data. Lastly, simulation examples suggest the proposed approach outperforms the current state of the art in nonparametric density estimation in terms of accuracy and reliability.
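For context, the traditional kernel density estimator that the abstract compares against can be sketched as follows. This is only the baseline, not the proposed generalized cross entropy estimator: the Gaussian kernel, the rule-of-thumb (Silverman) bandwidth, and the sample values are all assumptions chosen for illustration.

```python
import math
import statistics


def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(sd, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)


def gaussian_kde(data, bandwidth):
    """Return a function x -> kernel density estimate at x."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps centred on each observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density


# Hypothetical sample, for illustration only.
sample = [1.2, 1.9, 2.1, 2.4, 2.8, 3.3, 3.7, 4.0, 4.6, 5.5]
h = silverman_bandwidth(sample)
f_hat = gaussian_kde(sample, h)
```

Calling `f_hat(3.0)` gives the estimated density at 3.0. A rule-of-thumb bandwidth like this one is a fixed formula; the abstract's claim concerns a different, data-driven selector with a uniqueness guarantee.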
Abstract:
Estimating energy requirements is necessary in clinical practice when indirect calorimetry is impractical. This paper systematically reviews current methods for estimating energy requirements. Conclusions include: there is discrepancy between the characteristics of populations upon which predictive equations are based and current populations; tools are not well understood, and patient care can be compromised by inappropriate application of the tools. Data comparing tools and methods are presented and issues for practitioners are discussed. (C) 2003 International Life Sciences Institute.
Abstract:
In this study we present a novel automated strategy for predicting infarct evolution, based on MR diffusion and perfusion images acquired in the acute stage of stroke. The validity of this methodology was tested on novel patient data including data acquired from an independent stroke clinic. Regions-of-interest (ROIs) defining the initial diffusion lesion and tissue with abnormal hemodynamic function as defined by the mean transit time (MTT) abnormality were automatically extracted from DWI/PI maps. Quantitative measures of cerebral blood flow (CBF) and volume (CBV) along with ratio measures defined relative to the contralateral hemisphere (r(a)CBF and r(a)CBV) were calculated for the MTT ROIs. A parametric normal classifier algorithm incorporating these measures was used to predict infarct growth. The mean r(a)CBF and r(a)CBV values for eventually infarcted MTT tissue were 0.70 ± 0.19 and 1.20 ± 0.36. For recovered tissue the mean values were 0.99 ± 0.25 and 1.87 ± 0.71, respectively. There was a significant difference between these two regions for both measures (P
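To make the idea of a parametric normal classifier on ratio measures concrete, here is a minimal sketch. It plugs the means and standard deviations reported in the abstract into class-conditional normal densities. Treating the two measures as independent, using equal class priors, and the function names themselves are assumptions made for illustration; the abstract does not describe the study's classifier at this level of detail.

```python
import math

# (mean, sd) per class and measure, taken from the values reported above.
PARAMS = {
    "infarcted": {"rCBF": (0.70, 0.19), "rCBV": (1.20, 0.36)},
    "recovered": {"rCBF": (0.99, 0.25), "rCBV": (1.87, 0.71)},
}


def ratio_to_contralateral(ipsilateral, contralateral):
    """Ratio measure: value in the ROI relative to the mirrored region."""
    return ipsilateral / contralateral


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def classify(r_cbf, r_cbv):
    """Pick the class with the larger log-likelihood.

    Assumes independent features and equal priors (illustrative only).
    """
    scores = {}
    for label, p in PARAMS.items():
        scores[label] = (
            log_normal_pdf(r_cbf, *p["rCBF"]) + log_normal_pdf(r_cbv, *p["rCBV"])
        )
    return max(scores, key=scores.get)
```

For example, `classify(ratio_to_contralateral(35.0, 50.0), 1.20)` evaluates a region whose flow is 70% of the contralateral value. The input numbers in that call are hypothetical.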
Abstract:
We present the conditional quantum dynamics of an electron tunneling between two quantum dots subject to a measurement using a low transparency point contact or tunnel junction. The double dot system forms a single qubit and the measurement corresponds to a continuous in time readout of the occupancy of the quantum dot. We illustrate the difference between conditional and unconditional dynamics of the qubit. The conditional dynamics is discussed in two regimes depending on the rate of tunneling through the point contact: quantum jumps, in which individual electron tunneling current events can be distinguished, and a diffusive dynamics in which individual events are ignored, and the time-averaged current is considered as a continuous diffusive variable. We include the effect of inefficient measurement and the influence of the relative phase between the two tunneling amplitudes of the double dot/point contact system.
Abstract:
Binning and truncation of data are common in data analysis and machine learning. This paper addresses the problem of fitting mixture densities to multivariate binned and truncated data. The EM approach proposed by McLachlan and Jones (Biometrics, 44: 2, 571-578, 1988) for the univariate case is generalized to multivariate measurements. The multivariate solution requires the evaluation of multidimensional integrals over each bin at each iteration of the EM procedure. Naive implementation of the procedure can lead to computationally inefficient results. To reduce the computational cost a number of straightforward numerical techniques are proposed. Results on simulated data indicate that the proposed methods can achieve significant computational gains with no loss in the accuracy of the final parameter estimates. Furthermore, experimental results suggest that with a sufficient number of bins and data points it is possible to estimate the true underlying density almost as well as if the data were not binned. The paper concludes with a brief description of an application of this approach to diagnosis of iron deficiency anemia, in the context of binned and truncated bivariate measurements of volume and hemoglobin concentration from an individual's red blood cells.
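The quantity that an EM procedure for binned and truncated data increases can be sketched in the univariate case. With counts n_j in observed bins and bin probabilities P_j under the mixture, the log-likelihood is the sum of n_j log(P_j / P), where P is the total probability of the observed bins, so that mass outside them is treated as truncated. The sketch below only evaluates this log-likelihood for a normal mixture. It does not implement the EM iterations or the multivariate integrals discussed in the abstract, and all function names and data are assumptions.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def mixture_bin_prob(lo, hi, weights, means, sds):
    """Probability that a draw from the normal mixture falls in [lo, hi)."""
    return sum(
        w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
        for w, m, s in zip(weights, means, sds)
    )


def binned_truncated_loglik(edges, counts, weights, means, sds):
    """Log-likelihood of bin counts, conditioning on the observed range.

    edges has len(counts) + 1 entries; data outside [edges[0], edges[-1])
    is assumed truncated, so bin probabilities are renormalised.
    """
    probs = [
        mixture_bin_prob(edges[j], edges[j + 1], weights, means, sds)
        for j in range(len(counts))
    ]
    total = sum(probs)
    return sum(n * math.log(p / total) for n, p in zip(counts, probs) if n > 0)


# Hypothetical bin edges and counts, for illustration only.
edges = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
counts = [5, 20, 10, 8, 22, 6]
ll_two_modes = binned_truncated_loglik(
    edges, counts, [0.5, 0.5], [1.5, 4.5], [0.6, 0.6]
)
ll_one_mode = binned_truncated_loglik(
    edges, counts, [0.5, 0.5], [3.0, 3.0], [0.6, 0.6]
)
```

In one dimension each bin probability is a difference of two CDF values. In several dimensions the same term becomes an integral over a hyper-rectangle, which is the computational cost the abstract refers to.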
Abstract:
Motivation: This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. Results: The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets.
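The gene-selection step described above can be illustrated with a simplified sketch. For each gene it fits one-component and two-component mixtures across tissues, computes the likelihood ratio statistic, and keeps the gene only if both the statistic and the smaller cluster exceed thresholds. Note the substitutions: normal components are used here instead of the t distributions named in the abstract, the EM initialisation and variance floor are ad hoc, and the threshold values and function names are hypothetical. This is not the EMMIX-GENE implementation.

```python
import math


def normal_logpdf(x, mean, var):
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def loglik_one_component(x):
    """Maximised log-likelihood of a single normal fitted to x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return sum(normal_logpdf(v, mean, var) for v in x)


def fit_two_components(x, iters=200, var_floor=1e-3):
    """EM for a two-component normal mixture.

    Returns (log-likelihood, size of the smaller hard-assigned cluster).
    """
    xs = sorted(x)
    n = len(xs)
    half = n // 2
    # Initialise the means from the lower and upper halves of the data.
    m1 = sum(xs[:half]) / half
    m2 = sum(xs[half:]) / (n - half)
    mean = sum(xs) / n
    v1 = v2 = max(sum((v - mean) ** 2 for v in xs) / n, var_floor)
    w = 0.5
    resp = [0.5] * n
    loglik = 0.0
    for _ in range(iters + 1):
        # E-step: responsibilities and log-likelihood at current parameters.
        loglik = 0.0
        for i, v in enumerate(xs):
            a = math.log(w) + normal_logpdf(v, m1, v1)
            b = math.log(1.0 - w) + normal_logpdf(v, m2, v2)
            mx = max(a, b)
            ea, eb = math.exp(a - mx), math.exp(b - mx)
            resp[i] = ea / (ea + eb)
            loglik += mx + math.log(ea + eb)
        # M-step: weighted means and variances, with guards against collapse.
        s1 = min(max(sum(resp), 1e-6), n - 1e-6)
        s2 = n - s1
        m1 = sum(r * v for r, v in zip(resp, xs)) / s1
        m2 = sum((1.0 - r) * v for r, v in zip(resp, xs)) / s2
        v1 = max(sum(r * (v - m1) ** 2 for r, v in zip(resp, xs)) / s1, var_floor)
        v2 = max(
            sum((1.0 - r) * (v - m2) ** 2 for r, v in zip(resp, xs)) / s2, var_floor
        )
        w = s1 / n
    size1 = sum(1 for r in resp if r > 0.5)
    return loglik, min(size1, n - size1)


def gene_is_relevant(x, stat_threshold=8.0, min_cluster_size=3):
    """Apply both thresholds; the default values are hypothetical."""
    ll2, smallest = fit_two_components(x)
    stat = 2.0 * (ll2 - loglik_one_component(x))
    return stat, smallest, (stat > stat_threshold and smallest >= min_cluster_size)


# Hypothetical expression values for two genes across ten tissues.
bimodal_gene = [0.1, -0.2, 0.0, 0.15, -0.1, 3.0, 3.2, 2.9, 3.1, 2.8]
unimodal_gene = [-1.2, -0.5, 0.3, 0.1, -0.3, 0.8, 1.1, -0.9, 0.4, 0.0]
stat_bi, size_bi, keep_bi = gene_is_relevant(bimodal_gene)
stat_uni, size_uni, keep_uni = gene_is_relevant(unimodal_gene)
```

Genes whose expression splits the tissues into two clear groups produce a large statistic, which is why this statistic is used to screen genes before clustering the tissues.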