91 resultados para Prediction algorithms

em University of Queensland eSpace - Australia


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Promiscuous human leukocyte antigen (HLA) binding peptides are ideal targets for vaccine development. Existing computational models for prediction of promiscuous peptides used hidden Markov models and artificial neural networks as prediction algorithms. We report a system based on support vector machines that outperforms previously published methods. Preliminary testing showed that it can predict peptides binding to HLA-A2 and -A3 super-type molecules with excellent accuracy, even for molecules where no binding data are currently available.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The current prediction or genes in the Plasmodium falciparum genome database relies upon a limited number of specially developed computer algorithms. We have re-annotated the sequence of chromosome 2 of P. falciparum by a computer-assisted manual analysis. which is described here. Of 161 newly predicted introns, we have experimentally confirmed 98. We regard 110 introns from the previously published analyses as probable, we delete 3, change 26 and add 135. We recognise 214 genes in chromosome 2. We have predicted introns in 121 genes. The increased complexity or gene structure on chromosome 2 is likely to be mirrored by the entire genome. (C) 2001 Elsevier Science B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Objective: Inpatient length of stay (LOS) is an important measure of hospital activity, health care resource consumption, and patient acuity. This research work aims at developing an incremental expectation maximization (EM) based learning approach on mixture of experts (ME) system for on-line prediction of LOS. The use of a batchmode learning process in most existing artificial neural networks to predict LOS is unrealistic, as the data become available over time and their pattern change dynamically. In contrast, an on-line process is capable of providing an output whenever a new datum becomes available. This on-the-spot information is therefore more useful and practical for making decisions, especially when one deals with a tremendous amount of data. Methods and material: The proposed approach is illustrated using a real example of gastroenteritis LOS data. The data set was extracted from a retrospective cohort study on all infants born in 1995-1997 and their subsequent admissions for gastroenteritis. The total number of admissions in this data set was n = 692. Linked hospitalization records of the cohort were retrieved retrospectively to derive the outcome measure, patient demographics, and associated co-morbidities information. A comparative study of the incremental learning and the batch-mode learning algorithms is considered. The performances of the learning algorithms are compared based on the mean absolute difference (MAD) between the predictions and the actual LOS, and the proportion of predictions with MAD < 1 day (Prop(MAD < 1)). The significance of the comparison is assessed through a regression analysis. Results: The incremental learning algorithm provides better on-line prediction of LOS when the system has gained sufficient training from more examples (MAD = 1.77 days and Prop(MAD < 1) = 54.3%), compared to that using the batch-mode learning. The regression analysis indicates a significant decrease of MAD (p-value = 0.063) and a significant (p-value = 0.044) increase of Prop(MAD

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Prediction of peroxisomal matrix proteins generally depends on the presence of one of two distinct motifs at the end of the amino acid sequence. PTS1 peroxisomal proteins have a well conserved tripeptide at the C-terminal end. However, the preceding residues in the sequence arguably play a crucial role in targeting the protein to the peroxisome. Previous work in applying machine learning to the prediction of peroxisomal matrix proteins has failed W capitalize on the full extent of these dependencies. We benchmark a range of machine learning algorithms, and show that a classifier - based on the Support Vector Machine - produces more accurate results when dependencies between the conserved motif and the preceding section are exploited. We publish an updated and rigorously curated data set that results in increased prediction accuracy of most tested models.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years, the phrase 'genomic medicine' has increasingly been used to describe a new development in medicine that holds great promise for human health. This new approach to health care uses the knowledge of an individual's genetic make-up to identify those that are at a higher risk of developing certain diseases and to intervene at an earlier stage to prevent these diseases. Identifying genes that are involved in disease aetiology will provide researchers with tools to develop better treatments and cures. A major role within this field is attributed to 'predictive genomic medicine', which proposes screening healthy individuals to identify those who carry alleles that increase their susceptibility to common diseases, such as cancers and heart disease. Physicians could then intervene even before the disease manifests and advise individuals with a higher genetic risk to change their behaviour - for instance, to exercise or to eat a healthier diet - or offer drugs or other medical treatment to reduce their chances of developing these diseases. These promises have fallen on fertile ground among politicians, health-care providers and the general public, particularly in light of the increasing costs of health care in developed societies. Various countries have established databases on the DNA and health information of whole populations as a first step towards genomic medicine. Biomedical research has also identified a large number of genes that could be used to predict someone's risk of developing a certain disorder. But it would be premature to assume that genomic medicine will soon become reality, as many problems remain to be solved. Our knowledge about most disease genes and their roles is far from sufficient to make reliable predictions about a patient’s risk of actually developing a disease. In addition, genomic medicine will create new political, social, ethical and economic challenges that will have to be addressed in the near future.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Despite many successes of conventional DNA sequencing methods, some DNAs remain difficult or impossible to sequence. Unsequenceable regions occur in the genomes of many biologically important organisms, including the human genome. Such regions range in length from tens to millions of bases, and may contain valuable information such as the sequences of important genes. The authors have recently developed a technique that renders a wide range of problematic DNAs amenable to sequencing. The technique is known as sequence analysis via mutagenesis (SAM). This paper presents a number of algorithms for analysing and interpreting data generated by this technique.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been observed in the literature so far. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. In this paper, feature selection techniques are firstly described to identify the attributes relevant to the occurrence of spikes. A simple introduction to the classification techniques is given for completeness. Two algorithms: support vector machine and probability classifier are chosen to be the spike occurrence predictors and are discussed in details. Realistic market data are used to test the proposed model with promising results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The BR algorithm is a novel and efficient method to find all eigenvalues of upper Hessenberg matrices and has never been applied to eigenanalysis for power system small signal stability. This paper analyzes differences between the BR and the QR algorithms with performance comparison in terms of CPU time based on stopping criteria and storage requirement. The BR algorithm utilizes accelerating strategies to improve its performance when computing eigenvalues of narrowly banded, nearly tridiagonal upper Hessenberg matrices. These strategies significantly reduce the computation time at a reasonable level of precision. Compared with the QR algorithm, the BR algorithm requires fewer iteration steps and less storage space without depriving of appropriate precision in solving eigenvalue problems of large-scale power systems. Numerical examples demonstrate the efficiency of the BR algorithm in pursuing eigenanalysis tasks of 39-, 68-, 115-, 300-, and 600-bus systems. Experiment results suggest that the BR algorithm is a more efficient algorithm for large-scale power system small signal stability eigenanalysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a fast method for finding optimal parameters for a low-resolution (threading) force field intended to distinguish correct from incorrect folds for a given protein sequence. In contrast to other methods, the parameterization uses information from >10(7) misfolded structures as well as a set of native sequence-structure pairs. In addition to testing the resulting force field's performance on the protein sequence threading problem, results are shown that characterize the number of parameters necessary for effective structure recognition.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multi-frequency bioimpedance analysis (MFBIA) was used to determine the impedance, reactance and resistance of 103 lamb carcasses (17.1-34.2 kg) immediately after slaughter and evisceration. Carcasses were halved, frozen and one half subsequently homogenized and analysed for water, crude protein and fat content. Three measures of carcass length were obtained. Diagonal length between the electrodes (right side biceps femoris to left side of neck) explained a greater proportion of the variance in water mass than did estimates of spinal length and was selected for use in the index L-2/Z to predict the mass of chemical components in the carcass. Use of impedance (Z) measured at the characteristic frequency (Z(c)) instead of 50 kHz (Z(50)) did not improve the power of the model to predict the mass of water, protein or fat in the carcass. While L-2/Z(50) explained a significant proportion of variation in the masses of body water (r(2) 0.64), protein (r(2) 0.34) and fat (r(2) 0.35), its inclusion in multi-variate indices offered small or no increases in predictive capacity when hot carcass weight (HCW) and a measure of rib fat-depth (GR) were present in the model. Optimized equations were able to account for 65-90 % of the variance observed in the weight of chemical components in the carcass. It is concluded that single frequency impedance data do not provide better prediction of carcass composition than can be obtained from measures of HCW and GR. Indices of intracellular water mass derived from impedance at zero frequency and the characteristic frequency explained a similar proportion of the variance in carcass protein mass as did the index L-2/Z(50).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Algorithms for explicit integration of structural dynamics problems with multiple time steps (subcycling) are investigated. Only one such algorithm, due to Smolinski and Sleith has proved to be stable in a classical sense. A simplified version of this algorithm that retains its stability is presented. However, as with the original version, it can be shown to sacrifice accuracy to achieve stability. Another algorithm in use is shown to be only statistically stable, in that a probability of stability can be assigned if appropriate time step limits are observed. This probability improves rapidly with the number of degrees of freedom in a finite element model. The stability problems are shown to be a property of the central difference method itself, which is modified to give the subcycling algorithm. A related problem is shown to arise when a constraint equation in time is introduced into a time-continuous space-time finite element model. (C) 1998 Elsevier Science S.A.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Motivation: Prediction methods for identifying binding peptides could minimize the number of peptides required to be synthesized and assayed, and thereby facilitate the identification of potential T-cell epitopes. We developed a bioinformatic method for the prediction of peptide binding to MHC class II molecules. Results: Experimental binding data and expert knowledge of anchor positions and binding motifs were combined with an evolutionary algorithm (EA) and an artificial neural network (ANN): binding data extraction --> peptide alignment --> ANN training and classification. This method, termed PERUN, was implemented for the prediction of peptides that bind to HLA-DR4(B1*0401). The respective positive predictive values of PERUN predictions of high-, moderate-, low- and zero-affinity binder-a were assessed as 0.8, 0.7, 0.5 and 0.8 by cross-validation, and 1.0, 0.8, 0.3 and 0.7 by experimental binding. This illustrates the synergy between experimentation and computer modeling, and its application to the identification of potential immunotheraaeutic peptides.