957 resultados para Statistical approach
Resumo:
A version of the thermodynamic perturbation theory based on a scaling transformation of the partition function has been applied to the statistical derivation of the equation of state in a highpressure region. Two modifications of the equations of state have been obtained on the basis of the free energy functional perturbation series. The comparative analysis of the experimental PV T- data on the isothermal compression for the supercritical fluids of inert gases has been carried out. © 2012.
Resumo:
DNA-binding proteins are crucial for various cellular processes and hence have become an important target for both basic research and drug development. With the avalanche of protein sequences generated in the postgenomic age, it is highly desired to establish an automated method for rapidly and accurately identifying DNA-binding proteins based on their sequence information alone. Owing to the fact that all biological species have developed beginning from a very limited number of ancestral species, it is important to take into account the evolutionary information in developing such a high-throughput tool. In view of this, a new predictor was proposed by incorporating the evolutionary information into the general form of pseudo amino acid composition via the top-n-gram approach. It was observed by comparing the new predictor with the existing methods via both jackknife test and independent data-set test that the new predictor outperformed its counterparts. It is anticipated that the new predictor may become a useful vehicle for identifying DNA-binding proteins. It has not escaped our notice that the novel approach to extract evolutionary information into the formulation of statistical samples can be used to identify many other protein attributes as well.
Resumo:
In the nonparametric framework of Data Envelopment Analysis the statistical properties of its estimators have been investigated and only asymptotic results are available. For DEA estimators results of practical use have been proved only for the case of one input and one output. However, in the real world problems the production process is usually well described by many variables. In this paper a machine learning approach to variable aggregation based on Canonical Correlation Analysis is presented. This approach is applied for efficiency estimation of all the farms in Terceira Island of the Azorean archipelago.
Resumo:
In nonlinear and stochastic control problems, learning an efficient feed-forward controller is not amenable to conventional neurocontrol methods. For these approaches, estimating and then incorporating uncertainty in the controller and feed-forward models can produce more robust control results. Here, we introduce a novel inversion-based neurocontroller for solving control problems involving uncertain nonlinear systems which could also compensate for multi-valued systems. The approach uses recent developments in neural networks, especially in the context of modelling statistical distributions, which are applied to forward and inverse plant models. Provided that certain conditions are met, an estimate of the intrinsic uncertainty for the outputs of neural networks can be obtained using the statistical properties of networks. More generally, multicomponent distributions can be modelled by the mixture density network. Based on importance sampling from these distributions a novel robust inverse control approach is obtained. This importance sampling provides a structured and principled approach to constrain the complexity of the search space for the ideal control law. The developed methodology circumvents the dynamic programming problem by using the predicted neural network uncertainty to localise the possible control solutions to consider. A nonlinear multi-variable system with different delays between the input-output pairs is used to demonstrate the successful application of the developed control algorithm. The proposed method is suitable for redundant control systems and allows us to model strongly non-Gaussian distributions of control signal as well as processes with hysteresis. © 2004 Elsevier Ltd. All rights reserved.
Resumo:
2002 Mathematics Subject Classification: 65C05
Resumo:
2000 Mathematics Subject Classification: 62P10, 92D10, 92D30, 94A17, 62L10.
Resumo:
2000 Mathematics Subject Classification: 78A50
Resumo:
Background: Intensive risk factor management is recommended for individuals with diabetes. However, it is not known if such an approach is appropriate in the elderly with multiple comorbidities and limited life expectancy. The aim of this study was to characterise a cohort of very elderly individuals with diabetes and assess the impact of known risk factors on mortality. Methods: This was a retrospective audit approved by the clinical audit lead. All patients aged >80 years who attended diabetes outpatient clinics 2 years prior to the date of the audit (April 2012) were identified from clinic records. A detailed history including demographics, comorbidities and treatment were collected. Blood pressure readings, HbA1c, cholesterol and renal function were extracted and the mean of these readings was recorded. Survival status at 2 years was recorded for all patients. Statistical analysis was performed using SPSS19. Results: Data were available for 864 (381 male, 483 female) patients. The majority (75%) lived in their own home. More than 60% had multiple comorbidities and 25% had a prior history of cardiovascular disease. Two-thirds of the patients had more than one hospital admission in 2 years and a third had more than three admissions. 60% were on either insulin or a sulfonylurea. Mean HbA1c was 7.6%, cholesterol 4.2mmol/l, systolic blood pressure 145mmHg and eGFR 53ml/min. Over 2 years, 174 (20%)had died. Age, creatinine and previous coronary heart disease were significant predictors of death. Conclusion: The benefits of intensive diabetes management appear to be uncertain in very elderly patients. The need for intensive treatment must therefore be individualised to each patient.
Resumo:
Biometrics is afield of study which pursues the association of a person's identity with his/her physiological or behavioral characteristics.^ As one aspect of biometrics, face recognition has attracted special attention because it is a natural and noninvasive means to identify individuals. Most of the previous studies in face recognition are based on two-dimensional (2D) intensity images. Face recognition based on 2D intensity images, however, is sensitive to environment illumination and subject orientation changes, affecting the recognition results. With the development of three-dimensional (3D) scanners, 3D face recognition is being explored as an alternative to the traditional 2D methods for face recognition.^ This dissertation proposes a method in which the expression and the identity of a face are determined in an integrated fashion from 3D scans. In this framework, there is a front end expression recognition module which sorts the incoming 3D face according to the expression detected in the 3D scans. Then, scans with neutral expressions are processed by a corresponding 3D neutral face recognition module. Alternatively, if a scan displays a non-neutral expression, e.g., a smiling expression, it will be routed to an appropriate specialized recognition module for smiling face recognition.^ The expression recognition method proposed in this dissertation is innovative in that it uses information from 3D scans to perform the classification task. A smiling face recognition module was developed, based on the statistical modeling of the variance between faces with neutral expression and faces with a smiling expression.^ The proposed expression and face recognition framework was tested with a database containing 120 3D scans from 30 subjects (Half are neutral faces and half are smiling faces). It is shown that the proposed framework achieves a recognition rate 10% higher than attempting the identification with only the neutral face recognition module.^
Resumo:
The microarray technology provides a high-throughput technique to study gene expression. Microarrays can help us diagnose different types of cancers, understand biological processes, assess host responses to drugs and pathogens, find markers for specific diseases, and much more. Microarray experiments generate large amounts of data. Thus, effective data processing and analysis are critical for making reliable inferences from the data. ^ The first part of dissertation addresses the problem of finding an optimal set of genes (biomarkers) to classify a set of samples as diseased or normal. Three statistical gene selection methods (GS, GS-NR, and GS-PCA) were developed to identify a set of genes that best differentiate between samples. A comparative study on different classification tools was performed and the best combinations of gene selection and classifiers for multi-class cancer classification were identified. For most of the benchmarking cancer data sets, the gene selection method proposed in this dissertation, GS, outperformed other gene selection methods. The classifiers based on Random Forests, neural network ensembles, and K-nearest neighbor (KNN) showed consistently god performance. A striking commonality among these classifiers is that they all use a committee-based approach, suggesting that ensemble classification methods are superior. ^ The same biological problem may be studied at different research labs and/or performed using different lab protocols or samples. In such situations, it is important to combine results from these efforts. The second part of the dissertation addresses the problem of pooling the results from different independent experiments to obtain improved results. Four statistical pooling techniques (Fisher inverse chi-square method, Logit method. Stouffer's Z transform method, and Liptak-Stouffer weighted Z-method) were investigated in this dissertation. These pooling techniques were applied to the problem of identifying cell cycle-regulated genes in two different yeast species. As a result, improved sets of cell cycle-regulated genes were identified. The last part of dissertation explores the effectiveness of wavelet data transforms for the task of clustering. Discrete wavelet transforms, with an appropriate choice of wavelet bases, were shown to be effective in producing clusters that were biologically more meaningful. ^
Resumo:
Current reform initiatives recommend that school geometry teaching and learning include the study of three-dimensional geometric objects and provide students with opportunities to use spatial abilities in mathematical tasks. Two ways of using Geometer's Sketchpad (GSP), a dynamic and interactive computer program, in conjunction with manipulatives enable students to investigate and explore geometric concepts, especially when used in a constructivist setting. Research on spatial abilities has focused on visual reasoning to improve visualization skills. This dissertation investigated the hypothesis that connecting visual and analytic reasoning may better improve students' spatial visualization abilities as compared to instruction that makes little or no use of the connection of the two. Data were collected using the Purdue Spatial Visualization Tests (PSVT) administered as a pretest and posttest to a control and two experimental groups. Sixty-four 10th grade students in three geometry classrooms participated in the study during 6 weeks. Research questions were answered using statistical procedures. An analysis of covariance was used for a quantitative analysis, whereas a description of students' visual-analytic processing strategies was presented using qualitative methods. The quantitative results indicated that there were significant differences in gender, but not in the group factor. However, when analyzing a sub sample of 33 participants with pretest scores below the 50th percentile, males in one of the experimental groups significantly benefited from the treatment. A review of previous research also indicated that students with low visualization skills benefited more than those with higher visualization skills. The qualitative results showed that girls were more sophisticated in their visual-analytic processing strategies to solve three-dimensional tasks. It is recommended that the teaching and learning of spatial visualization start in the middle school, prior to students' more rigorous mathematics exposure in high school. A duration longer than 6 weeks for treatments in similar future research studies is also recommended.
Resumo:
As users continually request additional functionality, software systems will continue to grow in their complexity, as well as in their susceptibility to failures. Particularly for sensitive systems requiring higher levels of reliability, faulty system modules may increase development and maintenance cost. Hence, identifying them early would support the development of reliable systems through improved scheduling and quality control. Research effort to predict software modules likely to contain faults, as a consequence, has been substantial. Although a wide range of fault prediction models have been proposed, we remain far from having reliable tools that can be widely applied to real industrial systems. For projects with known fault histories, numerous research studies show that statistical models can provide reasonable estimates at predicting faulty modules using software metrics. However, as context-specific metrics differ from project to project, the task of predicting across projects is difficult to achieve. Prediction models obtained from one project experience are ineffective in their ability to predict fault-prone modules when applied to other projects. Hence, taking full benefit of the existing work in software development community has been substantially limited. As a step towards solving this problem, in this dissertation we propose a fault prediction approach that exploits existing prediction models, adapting them to improve their ability to predict faulty system modules across different software projects.
Resumo:
Conceptual database design is an unusually difficult and error-prone task for novice designers. This study examined how two training approaches---rule-based and pattern-based---might improve performance on database design tasks. A rule-based approach prescribes a sequence of rules for modeling conceptual constructs, and the action to be taken at various stages while developing a conceptual model. A pattern-based approach presents data modeling structures that occur frequently in practice, and prescribes guidelines on how to recognize and use these structures. This study describes the conceptual framework, experimental design, and results of a laboratory experiment that employed novice designers to compare the effectiveness of the two training approaches (between-subjects) at three levels of task complexity (within subjects). Results indicate an interaction effect between treatment and task complexity. The rule-based approach was significantly better in the low-complexity and the high-complexity cases; there was no statistical difference in the medium-complexity case. Designer performance fell significantly as complexity increased. Overall, though the rule-based approach was not significantly superior to the pattern-based approach in all instances, it out-performed the pattern-based approach at two out of three complexity levels. The primary contributions of the study are (1) the operationalization of the complexity construct to a degree not addressed in previous studies; (2) the development of a pattern-based instructional approach to database design; and (3) the finding that the effectiveness of a particular training approach may depend on the complexity of the task.
Resumo:
A certain type of bacterial inclusion, known as a bacterial microcompartment, was recently identified and imaged through cryo-electron tomography. A reconstructed 3D object from single-axis limited angle tilt-series cryo-electron tomography contains missing regions and this problem is known as the missing wedge problem. Due to missing regions on the reconstructed images, analyzing their 3D structures is a challenging problem. The existing methods overcome this problem by aligning and averaging several similar shaped objects. These schemes work well if the objects are symmetric and several objects with almost similar shapes and sizes are available. Since the bacterial inclusions studied here are not symmetric, are deformed, and show a wide range of shapes and sizes, the existing approaches are not appropriate. This research develops new statistical methods for analyzing geometric properties, such as volume, symmetry, aspect ratio, polyhedral structures etc., of these bacterial inclusions in presence of missing data. These methods work with deformed and non-symmetric varied shaped objects and do not necessitate multiple objects for handling the missing wedge problem. The developed methods and contributions include: (a) an improved method for manual image segmentation, (b) a new approach to 'complete' the segmented and reconstructed incomplete 3D images, (c) a polyhedral structural distance model to predict the polyhedral shapes of these microstructures, (d) a new shape descriptor for polyhedral shapes, named as polyhedron profile statistic, and (e) the Bayes classifier, linear discriminant analysis and support vector machine based classifiers for supervised incomplete polyhedral shape classification. Finally, the predicted 3D shapes for these bacterial microstructures belong to the Johnson solids family, and these shapes along with their other geometric properties are important for better understanding of their chemical and biological characteristics.
Resumo:
Dynamic positron emission tomography (PET) imaging can be used to track the distribution of injected radio-labelled molecules over time in vivo. This is a powerful technique, which provides researchers and clinicians the opportunity to study the status of healthy and pathological tissue by examining how it processes substances of interest. Widely used tracers include 18F-uorodeoxyglucose, an analog of glucose, which is used as the radiotracer in over ninety percent of PET scans. This radiotracer provides a way of quantifying the distribution of glucose utilisation in vivo. The interpretation of PET time-course data is complicated because the measured signal is a combination of vascular delivery and tissue retention effects. If the arterial time-course is known, the tissue time-course can typically be expressed in terms of a linear convolution between the arterial time-course and the tissue residue function. As the residue represents the amount of tracer remaining in the tissue, this can be thought of as a survival function; these functions been examined in great detail by the statistics community. Kinetic analysis of PET data is concerned with estimation of the residue and associated functionals such as ow, ux and volume of distribution. This thesis presents a Markov chain formulation of blood tissue exchange and explores how this relates to established compartmental forms. A nonparametric approach to the estimation of the residue is examined and the improvement in this model relative to compartmental model is evaluated using simulations and cross-validation techniques. The reference distribution of the test statistics, generated in comparing the models, is also studied. We explore these models further with simulated studies and an FDG-PET dataset from subjects with gliomas, which has previously been analysed with compartmental modelling. We also consider the performance of a recently proposed mixture modelling technique in this study.