983 results for Statistical Prediction


Relevance: 30.00%

Abstract:

Two types of prediction problem can be solved using a regression line: prediction of the 'population' regression line at a point x, and prediction of an 'individual' new member of the population, y1, for which x1 has been measured. The second problem is probably the more commonly encountered and the more relevant to calibration studies. A regression line is likely to be most useful for calibration if the range of values of the X variable is large, if the (x, y) values are well represented across that range, and if several estimates of y are made at each x. It is poor statistical practice to use a regression line for calibration or prediction beyond the limits of the data.
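The distinction between the two problems shows up in the interval widths: the standard error for the mean response lacks the extra "+1" variance term that an individual prediction carries. A minimal sketch with synthetic data (all values invented for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical calibration data: y responds linearly to x with noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 25)
y = 2.0 + 1.5 * x + rng.normal(scale=1.0, size=x.size)

n = x.size
xbar, ybar = x.mean(), y.mean()
Sxx = np.sum((x - xbar) ** 2)
b1 = np.sum((x - xbar) * (y - ybar)) / Sxx      # slope
b0 = ybar - b1 * xbar                           # intercept
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))       # residual standard error
t = stats.t.ppf(0.975, df=n - 2)                # 95% two-sided t quantile

x0 = 5.0                                        # new point *within* the data range
yhat = b0 + b1 * x0
# 1) CI for the 'population' regression line at x0 (no "+1" term).
se_mean = s * np.sqrt(1.0 / n + (x0 - xbar) ** 2 / Sxx)
# 2) Prediction interval for an 'individual' new y at x0 (extra "+1" for
#    the new observation's own noise).
se_pred = s * np.sqrt(1.0 + 1.0 / n + (x0 - xbar) ** 2 / Sxx)

print(f"fit at x0: {yhat:.2f}")
print(f"95% CI for mean response: +/- {t * se_mean:.2f}")
print(f"95% prediction interval:  +/- {t * se_pred:.2f}")
```

Both intervals widen as x0 moves away from the mean of the data, which is one reason extrapolating beyond the limits of the data is poor practice.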

Relevance: 30.00%

Abstract:

This thesis describes the development of a simple and accurate method for estimating the quantity and composition of household waste arisings. The method is based on the fundamental tenet that waste arisings can be predicted from information on the demographic and socio-economic characteristics of households, reducing the need for direct measurement of waste arisings to that necessary for calibrating a prediction model. The aim of the research is twofold: firstly, to investigate the generation of waste arisings at the household level; and secondly, to devise a method for supplying information on waste arisings to meet the needs of waste collection and disposal authorities, policy makers at both national and European level, and the manufacturers of plant and equipment for waste sorting and treatment. The research was carried out in three phases: theoretical, empirical and analytical. In the theoretical phase, specific testable hypotheses were formulated concerning the process of waste generation at the household level. The empirical phase involved an initial questionnaire survey of 1277 households to obtain data on their socio-economic characteristics, followed by the sorting of waste arisings from each of the households surveyed. The analytical phase was divided between (a) testing the research hypotheses by matching each household's waste against its demographic and socio-economic characteristics, (b) developing statistical models capable of predicting the waste arisings from an individual household, and (c) developing a practical method for obtaining area-based estimates of waste arisings using readily available data from the national census. The latter method was found to represent a substantial improvement over conventional methods of waste estimation in terms of both accuracy and spatial flexibility.
The research therefore represents a substantial contribution both to scientific knowledge of the process of household waste generation, and to the practical management of waste arisings.

Relevance: 30.00%

Abstract:

Membrane proteins, which constitute approximately 20% of most genomes, are poorly tractable targets for experimental structure determination, so analysis by prediction and modelling makes an important contribution to their ongoing study. Membrane proteins form two main classes: alpha-helical and beta-barrel transmembrane proteins. Using a method based on Bayesian networks, which provide a flexible and powerful framework for statistical inference, we addressed alpha-helical topology prediction. This method has accuracies of 77.4% for prokaryotic proteins and 61.4% for eukaryotic proteins. The method described here represents an important advance in the computational determination of membrane protein topology and offers a useful, complementary tool for the analysis of membrane proteins in a range of applications.
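As a toy illustration only (not the published model), a hidden Markov model — the simplest dynamic Bayesian network — can decode a per-residue membrane/loop topology from a crude hydrophobicity encoding. Every state, probability and the hydrophobic alphabet below is invented for the sketch:

```python
import numpy as np

# Toy two-state model: M = membrane helix, L = loop. Emissions are a crude
# hydrophobic/polar binary encoding of residues. All parameters are invented;
# the published method uses a richer Bayesian network.
states = ["M", "L"]
start = np.log(np.array([0.5, 0.5]))
trans = np.log(np.array([[0.9, 0.1],     # M -> M, M -> L
                         [0.1, 0.9]]))   # L -> M, L -> L
emit = np.log(np.array([[0.8, 0.2],      # P(hydrophobic|M), P(polar|M)
                        [0.3, 0.7]]))    # P(hydrophobic|L), P(polar|L)

hydrophobic = set("AILMFWVC")            # rough hydrophobic residue set

def viterbi(seq):
    """Most probable state path for a residue sequence (log-space Viterbi)."""
    obs = [0 if aa in hydrophobic else 1 for aa in seq]
    v = start + emit[:, obs[0]]
    back = []
    for o in obs[1:]:
        scores = v[:, None] + trans           # scores[i, j]: from state i to j
        back.append(scores.argmax(axis=0))
        v = scores.max(axis=0) + emit[:, o]
    path = [int(v.argmax())]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    path.reverse()
    return "".join(states[i] for i in path)

print(viterbi("MKKLLILVVAAAVVLKKDDE"))
```

Strong self-transition probabilities make the decoder prefer long runs of one state, which is what gives topology segments rather than per-residue noise.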

Relevance: 30.00%

Abstract:

Membrane proteins, which constitute approximately 20% of most genomes, form two main classes: alpha-helical and beta-barrel transmembrane proteins. Using methods based on Bayesian networks, a powerful framework for statistical inference, we have addressed beta-barrel topology prediction. The beta-barrel topology predictor reports individual strand accuracies of 88.6%. The method outlined here represents a potentially important advance in the computational determination of membrane protein topology.

Relevance: 30.00%

Abstract:

Background - MHC Class I molecules present antigenic peptides to cytotoxic T cells, a process that forms an integral part of the adaptive immune response. Peptides are bound within a groove formed by the MHC heavy chain. Previous approaches to MHC Class I-peptide binding prediction have largely concentrated on the peptide anchor residues located at the P2 and C-terminus positions. Results - A large dataset comprising MHC-peptide structural complexes was created by re-modelling pre-determined X-ray crystallographic structures. Static energetic analysis, following energy minimisation, was performed on the dataset in order to characterise interactions between bound peptides and the MHC Class I molecule, partitioning the interactions within the groove into van der Waals, electrostatic and total non-bonded energy contributions. Conclusion - The QSAR techniques of Genetic Function Approximation (GFA) and Genetic Partial Least Squares (G/PLS) were used to identify key interactions between the two molecules by comparing the calculated energy values with experimentally determined BL50 data. Although the peptide termini binding interactions help ensure the stability of the MHC Class I-peptide complex, the central region of the peptide is also important in defining the specificity of the interaction. As thermodynamic studies indicate that peptide association and dissociation may be driven entropically, it may be necessary to incorporate entropic contributions into future calculations.

Relevance: 30.00%

Abstract:

With its implications for vaccine discovery, the accurate prediction of T cell epitopes is one of the key aspirations of computational vaccinology. We have developed a robust multivariate statistical method, based on partial least squares, for the quantitative prediction of peptide binding to major histocompatibility complexes (MHC), the principal checkpoint on the antigen presentation pathway. As a service to the immunobiology community, we have made a Perl implementation of the method available via a World Wide Web server. We call this server MHCPred. Access to the server is freely available from the URL: http://www.jenner.ac.uk/MHCPred. We have exemplified our method with a model for peptides binding to the common human MHC molecule HLA-B*3501.
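A hedged sketch of the underlying technique: PLS1 (NIPALS) regression of a synthetic "log-affinity" on a synthetic peptide descriptor matrix. The descriptors, affinities and component count are invented for illustration; MHCPred's own models are trained on curated binding data:

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 via NIPALS: return centering terms and a regression vector B."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xc.T @ yc                     # weight: direction of max covariance
        w /= np.linalg.norm(w)
        t = Xc @ w                        # score vector
        tt = t @ t
        p = Xc.T @ t / tt                 # X loading
        qk = yc @ t / tt                  # y loading
        Xc = Xc - np.outer(t, p)          # deflate X
        yc = yc - qk * t                  # deflate y
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    B = W @ np.linalg.solve(P.T @ W, q)   # regression coefficients
    return x_mean, y_mean, B

def pls1_predict(model, X):
    x_mean, y_mean, B = model
    return (X - x_mean) @ B + y_mean

# Synthetic data: 40 "peptides", 8 descriptors, linear "log-affinity".
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 8))
beta = np.array([1.0, -0.5, 0.0, 2.0, 0.0, 0.3, 0.0, -1.2])
y = X @ beta + rng.normal(scale=0.05, size=40)

model = pls1_fit(X, y, n_components=4)
pred = pls1_predict(model, X)
print("RMSE:", np.sqrt(np.mean((pred - y) ** 2)))
```

PLS is preferred over ordinary least squares here because peptide descriptors are typically many and collinear; a few latent components capture most of the covariance with affinity.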

Relevance: 30.00%

Abstract:

Accurate T-cell epitope prediction is a principal objective of computational vaccinology. As a service to the immunology and vaccinology communities at large, we have implemented, as a server on the World Wide Web, a partial least squares-based multivariate statistical approach to the quantitative prediction of peptide binding to major histocompatibility complexes (MHC), the key checkpoint on the antigen presentation pathway within adaptive, cellular immunity. MHCPred implements robust statistical models for both Class I alleles (HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3301, HLA-A*6801, HLA-A*6802 and HLA-B*3501) and Class II alleles (HLA-DRB*0401 and HLA-DRB*0701).

Relevance: 30.00%

Abstract:

* The work is supported by RFBR, grant 04-01-00858-a

Relevance: 30.00%

Abstract:

As users continually request additional functionality, software systems will continue to grow in complexity, as well as in their susceptibility to failures. Particularly for sensitive systems requiring high levels of reliability, faulty system modules may increase development and maintenance costs. Identifying them early would therefore support the development of reliable systems through improved scheduling and quality control, and research effort to predict software modules likely to contain faults has consequently been substantial. Although a wide range of fault prediction models has been proposed, we remain far from having reliable tools that can be widely applied to real industrial systems. For projects with known fault histories, numerous studies show that statistical models can predict faulty modules from software metrics with reasonable accuracy. However, as context-specific metrics differ from project to project, prediction across projects is difficult to achieve: models obtained from one project's experience are ineffective at predicting fault-prone modules when applied to other projects. Taking full advantage of the existing work in the software development community has therefore been substantially limited. As a step towards solving this problem, this dissertation proposes a fault prediction approach that exploits existing prediction models, adapting them to improve their ability to predict faulty system modules across different software projects.
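One way to picture why raw cross-project transfer fails, and what "adapting" a model can mean in the simplest case, is per-project standardisation of metrics. The data, metrics and effect sizes below are synthetic, and this is an illustration, not the dissertation's algorithm:

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-np.clip(v, -30, 30)))

def make_project(n, loc_scale, cc_scale):
    """Synthetic module metrics; fault risk driven by within-project z-scores."""
    loc = rng.exponential(loc_scale, n)          # lines of code
    cc = rng.exponential(cc_scale, n)            # cyclomatic complexity
    X = np.column_stack([loc, cc])
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    y = (rng.random(n) < sigmoid(1.5 * Z[:, 0] + 1.0 * Z[:, 1] - 0.5)).astype(int)
    return X, y

def fit_logistic(X, y, lr=0.5, steps=500):
    """Plain gradient-descent logistic regression (intercept appended)."""
    Xb = np.column_stack([X, np.ones(len(X))])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def accuracy(w, X, y):
    Xb = np.column_stack([X, np.ones(len(X))])
    return np.mean((sigmoid(Xb @ w) > 0.5) == y)

Xa, ya = make_project(2000, loc_scale=300, cc_scale=5)
Xb_, yb = make_project(2000, loc_scale=3000, cc_scale=40)   # different context

mu_a, sd_a = Xa.mean(axis=0), Xa.std(axis=0)
w = fit_logistic((Xa - mu_a) / sd_a, ya)                    # train on project A

naive = accuracy(w, (Xb_ - mu_a) / sd_a, yb)     # reuse A's scaling: miscalibrated
adapted = accuracy(w, (Xb_ - Xb_.mean(axis=0)) / Xb_.std(axis=0), yb)
print(f"cross-project accuracy, naive: {naive:.2f}, adapted: {adapted:.2f}")
```

Because project B's metric distributions sit on a different scale, the naively transferred model saturates and degenerates toward a constant prediction; rescaling to B's own distribution recovers most of the accuracy.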

Relevance: 30.00%

Abstract:

Bio-systems are inherently complex information-processing systems. The physiological complexity of biological systems limits both the formulation of hypotheses about their behavior and the ability to test those hypotheses. More importantly, the identification and classification of mutations in patients is a central topic in today's cancer research. Next-generation sequencing (NGS) technologies can provide genome-wide coverage at single-nucleotide resolution and at reasonable speed and cost. The unprecedented molecular characterization provided by NGS offers the potential for an individualized approach to treatment. These advances in cancer genomics have enabled scientists to interrogate cancer-specific genomic variants and compare them with the normal variants in the same patient. Analysis of these data provides a catalog of somatic variants present in the tumor genome but not in the normal tissue DNA. In this dissertation, we present a new computational framework for the problem of predicting the number of mutations on a chromosome for a given patient, a fundamental problem in clinical and research settings. We begin with the development of a framework capable of utilizing published data from a longitudinal study of patients with acute myeloid leukemia (AML), whose DNA from both normal and malignant tissues was subjected to NGS analysis at various points in time. By processing the sequencing data at the time of cancer diagnosis using the components of our framework, we tested it by predicting the genomic regions to be mutated at the time of relapse and, later, comparing our results with the actual regions that showed mutations (discovered at relapse time). We demonstrate that this algorithm pipeline can drastically improve the search for a reliable molecular signature.
Arguably, the most important result of our research is its superior performance to other methods such as Radial Basis Function Networks, Sequential Minimal Optimization and Gaussian Processes. In the final part of this dissertation, we present a detailed significance, stability and statistical analysis of our model, together with a performance comparison of the results. This work lays a solid foundation for future research on other types of cancer.

Relevance: 30.00%

Abstract:

Energy efficiency and user comfort have recently become priorities in the Facility Management (FM) sector. This has resulted in the use of innovative building components, such as thermal solar panels and heat pumps, as they have the potential to provide better performance, energy savings and increased user comfort. However, as the complexity of components increases, so does the requirement for maintenance management. The standard routine for building maintenance is inspection, leading to repair or replacement when a fault is found. This routine produces unnecessary inspections, which incur costs in component downtime and work hours. This research proposes an alternative routine: performing building maintenance at the point in time when the component is degrading and requires maintenance, thus reducing the frequency of unnecessary inspections. This thesis demonstrates that statistical techniques can be used as part of a maintenance management methodology to invoke maintenance before failure occurs. The proposed FM process is presented through a scenario utilising current Building Information Modelling (BIM) technology and innovative contractual and organisational models. This FM scenario supports a Degradation-based Maintenance (DbM) scheduling methodology, implemented using two statistical techniques: Particle Filters (PFs) and Gaussian Processes (GPs). DbM consists of extracting and tracking a degradation metric for a component. Limits for the degradation metric are identified using one of a number of proposed processes, which determine the limits according to the maturity of the historical information available. DbM is implemented for three case study components: a heat exchanger, a heat pump and a set of bearings. The degradation points identified for each case study by a PF, a GP and a hybrid (combined PF and GP) DbM implementation are assessed against known degradation points.
The GP implementations are successful for all components. For the PF implementations, the results presented in this thesis find that the extracted metrics and limits identify degradation occurrences accurately for components in continuous operation; for components with seasonal operational periods, the PF may wrongly identify degradation. The GP performs more robustly than the PF, but the PF, on average, produces fewer false positives. The hybrid implementations, which combine GP and PF results, are successful for two of the three case studies and are not affected by seasonal data. Overall, DbM is effectively applied for the three case study components. The accuracy of the implementations is dependent on the relationships modelled by the PF and GP, and on the type and quantity of data available. This novel maintenance process can improve equipment performance and reduce energy wastage from the operation of BSCs.
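A minimal GP-regression sketch of the DbM idea: smooth a degradation metric over time and flag the first crossing of a limit. The metric, kernel settings and limit below are invented; the thesis derives limits from historical data and also uses particle filters:

```python
import numpy as np

def rbf(a, b, length=5.0, var=1.0):
    """Squared-exponential kernel between two 1-D time grids."""
    d = a[:, None] - b[None, :]
    return var * np.exp(-0.5 * (d / length) ** 2)

def gp_posterior_mean(t_train, y_train, t_test, noise=0.05):
    """GP regression posterior mean (zero prior mean, RBF kernel)."""
    K = rbf(t_train, t_train) + noise * np.eye(len(t_train))
    Ks = rbf(t_test, t_train)
    return Ks @ np.linalg.solve(K, y_train)

# Synthetic degradation metric: flat, then a ramp starting at t = 30
# (e.g. weekly readings from a heat exchanger's fouling indicator).
rng = np.random.default_rng(3)
t = np.arange(0.0, 60.0)
true_metric = 0.02 * np.maximum(t - 30.0, 0.0)
y = true_metric + rng.normal(scale=0.03, size=t.size)

mean = gp_posterior_mean(t, y, t)
limit = 0.2                               # hypothetical degradation limit
crossed = t[mean > limit]
if crossed.size:
    print(f"schedule maintenance near t = {crossed[0]:.0f}")
```

Smoothing through the GP keeps single noisy readings from triggering maintenance, while the limit crossing still precedes outright failure.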

Relevance: 30.00%

Abstract:

Thesis (Master's)--University of Washington, 2016-08

Relevance: 30.00%

Abstract:

The current anode quality-control strategy is inadequate for detecting defective anodes before they are set in the electrolysis cells. Previous work addressed modelling the anode manufacturing process in order to predict anode properties directly after baking, using multivariate statistical methods. The core-sampling strategy used at the partner plant means that this model can only be used to predict the properties of anodes baked at the hottest and coldest positions of the baking furnace. The present work proposes a strategy for taking into account the thermal history of anodes baked at any position, allowing their properties to be predicted. It is shown that by combining binary variables defining the pit and baking position with the routine data measured on the baking furnace, the temperature profiles of anodes baked at different positions can be predicted. These data were also included in the model for predicting anode properties. The prediction results were validated by additional core sampling, and the model's performance is conclusive for apparent and real density, compressive strength, air reactivity and Lc, regardless of baking position.

Relevance: 30.00%

Abstract:

Protective factors are neglected in risk assessment in adult psychiatric and criminal justice populations. This review investigated the predictive efficacy of selected tools that assess protective factors. Five databases were searched using comprehensive terms for records up to June 2014, resulting in 17 studies (n = 2,198). Results were combined in a multilevel meta-analysis using the metafor package (Viechtbauer, 2010) in R (R Core Team, 2015). Prediction of outcomes was poor relative to a reference category of violent offending, with the exception of prediction of discharge from secure units. There were no significant differences between the predictive efficacy of risk scales, protective scales and summary judgments. Protective factor assessment may be clinically useful, but more development is required. Claims that use of these tools is therapeutically beneficial require testing.

Relevance: 30.00%

Abstract:

In this thesis, wind-wave prediction and analysis in the southern Caspian Sea are surveyed. The subject is of great importance and application in reducing loss of life and financial damage, and so receives attention from marine activities such as monitoring marine pollution, designing marine structures, shipping, fishing, the offshore industry and tourism. For wave prediction, this study uses Caspian Sea topography data extracted from the Caspian Sea hydrography map of the Iran Armed Forces Geographical Organization, together with 10-meter wind-field data extracted from the GTS synoptic data transmitted by regional centers to the Forecasting Center of the Iran Meteorological Organization; for wave analysis, it uses the 20012 waves recorded by the oil company's buoy located 28 kilometers off the Neka shore. The results of this research are as follows:
- Because the predictions of the SMB method disagree with the wave data from the Anzali and Neka buoys, the SMB method is unable to predict wave characteristics in the southern Caspian Sea.
- Because the WAM model output agrees relatively well with the wave data from the Anzali buoy, the WAM model is able to predict wave characteristics in the southern Caspian Sea with relatively high accuracy.
- The extreme wave-height distribution function fitted to the southern Caspian Sea wave data is obtained by determining the free parameters of the Poisson-Gumbel function through the method of moments. These parameters are A = 2.41 and B = 0.33. The maximum relative error between the 4-year return value of significant wave height estimated by this function and the wave data of the Neka buoy is about 35%. The 100-year return value of significant wave height for the southern Caspian Sea is about 4.97 meters.
- The maximum relative error between the 4-year return value of significant wave height estimated by the statistical peak-over-threshold model and the wave data of the Neka buoy is about 2.28%.
- The parametric relation fitted to the southern Caspian Sea frequency spectra is obtained by determining the free parameters of the multi-peak spectrum of Strekalov, Massel and Krylov et al. through a mathematical method. These parameters are A = 2.9, B = 26.26, C = 0.0016, m = 0.19 and n = 3.69. The maximum relative error between the calculated free parameters of the southern Caspian Sea multi-peak spectrum and the free parameters of the double-peaked spectrum proposed by Massel and Strekalov from experimental Caspian Sea data is about 36.1% in the energetic part of the spectrum and about 74M% in the high-frequency part.
- The peak-over-threshold wave rose of the southern Caspian Sea shows that the maximum occurrence probability corresponds to waves of 2-2.5 meters in height.
The error sources in the statistical analysis are mainly due to: 1) wave data missing over a 2-year period owing to battery discharge of the Neka buoy; and 2) the 15% deviation of the single-year annual mean significant height from the long-period average value, caused by the lack of adequate measurements of oceanic waves. The error sources in the spectral analysis are mainly due to the above items and to the low accuracy of the free parameters proposed for the double-peaked spectrum from the experimental Caspian Sea data.
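For illustration of the return-value calculation only, here is a plain annual-maxima Gumbel fit by the method of moments on synthetic data. This is not the thesis's Poisson-Gumbel model, and its parameters (A = 2.41, B = 0.33) are not reproduced; the 30 "years" of data are invented:

```python
import numpy as np

# 30 synthetic years of annual maximum significant wave height Hs (meters),
# drawn from a Gumbel with invented location 2.5 m and scale 0.4 m.
rng = np.random.default_rng(4)
annual_max_hs = 2.5 + 0.4 * rng.gumbel(size=30)

# Method-of-moments Gumbel fit: scale from the std, location from the mean.
scale = np.sqrt(6.0) / np.pi * annual_max_hs.std(ddof=1)
loc = annual_max_hs.mean() - np.euler_gamma * scale

def return_value(T):
    """T-year return level: the 1 - 1/T quantile of the fitted Gumbel."""
    return loc - scale * np.log(-np.log(1.0 - 1.0 / T))

print(f"100-year Hs: {return_value(100.0):.2f} m")
```

The return level grows roughly linearly in ln(T) under a Gumbel model, which is why the 100-year value sits well above the 4-year value used for validation in the text.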