978 results for Prediction algorithms


Relevance: 100.00%

Abstract:

In order to assist in comparing the computational techniques used in different models, the authors propose a standardized set of one-dimensional numerical experiments that could be completed for each model. The results of these experiments, with a simplified form of the computational representation for advection, diffusion, pressure gradient term, Coriolis term, and filter used in the models, should be reported in the peer-reviewed literature. Specific recommendations are described in this paper.
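The abstract above proposes standardized one-dimensional numerical experiments covering terms such as advection. As a purely illustrative sketch (not the authors' benchmark specification), a first-order upwind scheme for 1-D linear advection on a periodic grid looks like this:

```python
import math

# Minimal 1-D linear advection solver, du/dt + c * du/dx = 0, using the
# first-order upwind scheme with periodic boundaries (c > 0 assumed).
def advect_upwind(u0, c, dx, dt, steps):
    u = list(u0)
    n = len(u)
    for _ in range(steps):
        # u[i - 1] wraps to u[-1] at i = 0, giving periodic boundaries
        u = [u[i] - c * dt / dx * (u[i] - u[i - 1]) for i in range(n)]
    return u

# Advect a Gaussian pulse once around a periodic domain of 100 cells.
n, dx, c = 100, 1.0, 1.0
dt = 0.5 * dx / c                    # Courant number 0.5 (stable)
u0 = [math.exp(-((i - 20) * dx) ** 2 / 20.0) for i in range(n)]
u = advect_upwind(u0, c, dx, dt, steps=200)   # 200 * c * dt = one period
print(abs(sum(u) - sum(u0)) < 1e-9)           # True: the scheme conserves mass
```

A standardized experiment of this kind would report, for each model, the amplitude and phase errors after a fixed number of periods.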

Relevance: 70.00%

Abstract:

The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains, and sudden cardiac death continues to be a presenting feature for some patients subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia-related complications. Stress electrocardiography/exercise testing is predictive of 10-year risk of CVD events, and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to these data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers and calculation of moving averages, as well as to data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone, and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population, and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature-reduced datasets reveals the filter method, Cfs subset evaluation, to be the most consistently effective, although Consistency-derived subsets tended to slightly increase accuracy at the cost of markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This could be eliminated by consideration of the AUC or Kappa statistic, as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection reduces MR to between 9.8 and 10.16, with time-segmented summary data (dataset F) at MR 9.8 and raw time-series summary data (dataset A) at 9.92. However, for all datasets based on time-series data alone, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series-only datasets, but models derived from these subsets consist of a single leaf only. MR values are consistent with the class distribution in the subset folds evaluated in the n-fold cross-validation method. For models based on Cfs-selected time-series-derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) at 8.85 and dataset RF_F (time-segmented time-series variables and RF) at 9.09. 
The models based on counts of outliers and counts of data points outside the normal range (dataset RF_E), and on derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (dataset RF_G), perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR, of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are the most comprehensible and clinically relevant. The increase in predictive accuracy achieved by the addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend towards improved performance. Data mining of feature-reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared with the use of risk factors alone is consistent with recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used together as model input. 
In the absence of risk factor input, the use of time-series variables after outlier removal, and of time-series variables based on physiological values falling outside the accepted normal range, is associated with some improvement in model performance.
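The abstract notes that misclassification rate is influenced by class distribution, while the Kappa statistic and majority-class under-sampling are not fooled. A toy illustration (hypothetical numbers, not the study's data) makes this concrete:

```python
from collections import Counter

# Why misclassification rate (MR) is misleading under class imbalance,
# while the Kappa statistic corrects for chance agreement.
def mr_and_kappa(y_true, y_pred):
    n = len(y_true)
    po = sum(t == p for t, p in zip(y_true, y_pred)) / n  # observed agreement
    ct, cp = Counter(y_true), Counter(y_pred)
    # expected agreement from the marginal class frequencies
    pe = sum(ct[c] * cp[c] for c in ct) / n ** 2
    kappa = (po - pe) / (1 - pe) if pe < 1 else 0.0
    return 1 - po, kappa  # (misclassification rate, kappa)

# 90:10 imbalance; a trivial "always majority class" model looks good on MR
y_true = [0] * 90 + [1] * 10
mr, kappa = mr_and_kappa(y_true, [0] * 100)
print(round(mr, 2), kappa)       # → 0.1 0.0  (low MR, but zero skill)

# Under-sampling the majority class to balance removes the illusion
mr_b, kappa_b = mr_and_kappa([0] * 10 + [1] * 10, [0] * 20)
print(round(mr_b, 2))            # → 0.5
```

The same effect explains why a model's MR can track the fold class distribution in cross-validation, as the abstract observes.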

Relevance: 70.00%

Abstract:

Background: Current approaches to predicting protein functions from a protein-protein interaction (PPI) dataset are based on the assumption that the available functions of the proteins (a.k.a. annotated proteins) will determine the functions of the proteins whose functions are as yet unknown (a.k.a. un-annotated proteins). Under this assumption, protein function prediction is a mono-directed and one-off procedure, i.e. from annotated proteins to un-annotated proteins. However, the interactions between proteins are mutual rather than static and mono-directed, even though the functions of some proteins are currently unknown. This means that when a similarity-based approach is used to predict the functions of un-annotated proteins, those proteins, once their functions are predicted, will affect the similarities between proteins, which in turn will affect the prediction results. In other words, function prediction is a dynamic and mutual procedure. This dynamic feature of protein interactions, however, is not considered in existing prediction algorithms.

Results: In this paper, we propose a new prediction approach that predicts protein functions iteratively. This iterative approach incorporates the dynamic and mutual features of PPI interactions, as well as the local and global semantic influence of protein functions, into the prediction. To enable iterative prediction, we propose a new protein similarity measure derived from protein functions. We adapt new evaluation metrics to evaluate the prediction quality of our algorithm and of other similar algorithms. Experiments on real PPI datasets were conducted to evaluate the effectiveness of the proposed approach in predicting unknown protein functions.

Conclusions:
The iterative approach is more likely to reflect the real biological nature between proteins when predicting functions. A proper definition of protein similarity from protein functions is the key to predicting functions iteratively. The evaluation results demonstrated that in most cases, the iterative approach outperformed non-iterative ones with higher prediction quality in terms of prediction precision, recall and F-value.
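The iterative idea can be sketched on a toy PPI graph (hypothetical data; the paper's actual similarity measure and update rule are not reproduced here). The key point is that newly predicted proteins vote in later rounds, so prediction is mutual rather than one-off:

```python
# Toy weighted PPI graph: edge weights act as protein similarities.
edges = {
    ("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "D"): 0.7, ("A", "D"): 0.2,
}
functions = {"A": {"transport"}, "D": {"binding"}}  # annotated proteins

def neighbors(p):
    for (x, y), w in edges.items():
        if x == p: yield y, w
        elif y == p: yield x, w

def predict_iteratively(functions, proteins, max_rounds=10):
    ann = {p: set(f) for p, f in functions.items()}
    for _ in range(max_rounds):
        changed = False
        for p in proteins:
            if p in ann:
                continue
            votes = {}
            for q, w in neighbors(p):
                for f in ann.get(q, ()):   # predicted proteins vote too
                    votes[f] = votes.get(f, 0.0) + w
            if votes:
                ann[p] = {max(votes, key=votes.get)}
                changed = True
        if not changed:
            break
    return ann

result = predict_iteratively(functions, ["A", "B", "C", "D"])
print(result["B"], result["C"])
```

Here C ends up with "transport": a one-off pass would only see its annotated neighbour D ("binding"), but once B is predicted it also influences C, which is precisely the dynamic, mutual behaviour the abstract describes.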

Relevance: 70.00%

Abstract:

In recent years, significant effort has been devoted to predicting protein functions from protein interaction data generated by high-throughput techniques. However, predicting protein functions correctly and reliably remains a challenge. Many computational methods have recently been proposed for predicting protein functions, and among these, clustering-based methods are the most promising. The existing methods, however, mainly focus on protein relationship modelling and on prediction algorithms that statically predict functions from the clusters related to the unannotated proteins. In fact, clustering itself is a dynamic process, and function prediction should take this dynamic feature of clustering into consideration. Unfortunately, it is ignored in the existing prediction methods. In this paper, we propose an innovative progressive-clustering-based prediction method that traces the functions of relevant annotated proteins across all clusters generated through the progressive clustering of proteins. A set of prediction criteria is proposed to predict the functions of unannotated proteins from all relevant clusters and traced functions. The method was evaluated on real protein interaction datasets, and the results demonstrated its effectiveness compared with representative existing methods.
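One way to picture progressive clustering (an illustrative sketch only; the paper's clustering procedure and prediction criteria are not specified here) is to form clusters at progressively relaxed similarity thresholds and trace the annotated proteins that co-cluster with the target at each level:

```python
# Hypothetical similarities and annotations for a four-protein toy example.
sims = {("P1", "P2"): 0.9, ("P2", "P3"): 0.6, ("P3", "P4"): 0.3}
annotations = {"P1": "kinase", "P4": "ligase"}

def clusters_at(threshold, proteins):
    # clusters = connected components of the graph keeping edges >= threshold
    adj = {p: set() for p in proteins}
    for (a, b), s in sims.items():
        if s >= threshold:
            adj[a].add(b); adj[b].add(a)
    seen, comps = set(), []
    for p in proteins:
        if p in seen:
            continue
        stack, comp = [p], set()
        while stack:
            q = stack.pop()
            if q in comp:
                continue
            comp.add(q)
            stack.extend(adj[q] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def predict(target, proteins, levels=(0.9, 0.6, 0.3)):
    votes = {}
    for t in levels:                       # progressive relaxation
        for comp in clusters_at(t, proteins):
            if target in comp:
                for p in comp:
                    f = annotations.get(p)
                    if f:                  # trace annotated co-members;
                        votes[f] = votes.get(f, 0.0) + t  # tighter clusters weigh more
    return max(votes, key=votes.get) if votes else None

print(predict("P3", ["P1", "P2", "P3", "P4"]))   # → kinase
```

P3 meets the kinase-annotated P1 already at the 0.6 level, whereas the ligase-annotated P4 only joins at 0.3, so the traced evidence across all levels favours "kinase".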

Relevance: 70.00%

Abstract:

The research presented here develops a robust reliability algorithm for identifying reliable protein interactions, which can be combined with a gene expression dataset to improve algorithm performance, together with novel breast-cancer-based diagnostic, prognostic and treatment prediction algorithms that take the existing issues into account in order to provide a fair estimate of their performance.

Relevance: 70.00%

Abstract:

Recently, the telecommunication industry has benefited from infrastructure sharing, one of the most fundamental enablers of cloud computing, leading to the emergence of the Mobile Virtual Network Operator (MVNO) concept. The principal aims of this approach are to support on-demand provisioning and elasticity of virtualized mobile network components, based on data traffic load. To realize this, during operation and management procedures the virtualized services need to be triggered in order to scale an instance up/down or out/in. In this paper we propose an architecture called MOBaaS (Mobility and Bandwidth Availability Prediction as a Service), comprising two algorithms that predict user mobility and network link bandwidth availability. It can be implemented in a cloud-based mobile network structure and used as a support service by any other virtualized mobile network service. MOBaaS can provide prediction information to generate the triggers required for on-demand deployment, provisioning and disposal of virtualized network components. This information can also be used for self-adaptation procedures and optimal network function configuration during run-time operation. Through preliminary experiments with a prototype implementation on the OpenStack platform, we evaluated and confirmed the feasibility and effectiveness of the prediction algorithms and the proposed architecture.

Relevance: 70.00%

Abstract:

An earthquake of magnitude M and linear source dimension L(M) is preceded, within a few years, by certain patterns of seismicity in the magnitude range down to about (M - 3) in an area of linear dimension about 5L-10L. Prediction algorithms based on such patterns may allow one to predict approximately 80% of strong earthquakes, with alarms occupying altogether 20-30% of the time-space considered. An area of alarm can be narrowed down to 2L-3L when observations include lower magnitudes, down to about (M - 4). In spite of their limited accuracy, such predictions open a possibility of preventing considerable damage. The following findings may provide for further development of prediction methods: (i) long-range correlations in fault system dynamics and, accordingly, the large size of the areas over which different observed fields can be averaged and analyzed jointly; (ii) specific symptoms of an approaching strong earthquake; (iii) the partial similarity of these symptoms worldwide; (iv) the fact that some of them are not Earth-specific: we have probably encountered in seismicity the symptoms of instability common to a wide class of nonlinear systems.
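The quality of an alarm-based scheme of this kind is scored by two numbers: the fraction of strong events caught inside declared alarms, and the fraction of time-space occupied by the alarms. A toy evaluation (illustrative numbers, not the cited algorithms) looks like this:

```python
# Score alarm windows against event times: hit rate vs. alarm occupancy.
def score_alarms(alarms, quakes, t_total):
    # alarms: list of (start, end) windows; quakes: list of event times
    hits = sum(any(a <= q < b for a, b in alarms) for q in quakes)
    alarm_time = sum(b - a for a, b in alarms)
    return hits / len(quakes), alarm_time / t_total

alarms = [(10, 18), (45, 53), (74, 84)]   # hypothetical alarm windows
quakes = [12, 50, 60, 78, 83]             # 4 of 5 events fall inside an alarm
hit_rate, alarm_fraction = score_alarms(alarms, quakes, t_total=100)
print(hit_rate, alarm_fraction)           # → 0.8 0.26
```

The numbers mirror the regime quoted in the abstract: about 80% of events predicted with roughly a quarter of the time in a state of alarm.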

Relevance: 70.00%

Abstract:

Background
It is generally acknowledged that a functional understanding of a biological system can only be obtained through an understanding of the collective of molecular interactions in the form of biological networks. Protein networks are one network type of special importance, because proteins form the functional base units of every biological cell. On the mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins, allowing one to gain insight into fundamental organizational principles of biological cells.
Results
In this paper, we provide a comparative analysis of five popular and four novel module detection algorithms. We study these module prediction methods on simulated benchmark networks as well as on 10 biological protein interaction networks (PINs). A particular focus of our analysis is placed on the biological meaning of the predicted modules, utilizing the Gene Ontology (GO) database as the gold standard for the definition of biological processes. Furthermore, we investigate the robustness of the results by perturbing the PINs, simulating in this way our incomplete knowledge of protein networks.
Conclusions
Overall, our study reveals that there is large heterogeneity among the different module prediction algorithms if one zooms in to the level of biological processes in the form of GO terms, and that all methods are severely affected by slight perturbations of the networks. However, we also find pathways that are enriched in multiple modules, which could provide important information about the hierarchical organization of the system.

Relevance: 70.00%

Abstract:

Mobile and wireless networks have long exploited mobility predictions, focused on predicting the future location of given users, to perform more efficient network resource management. In this paper, we present a new approach in which predictions are provided as a probability distribution of the likelihood of moving to a set of future locations. This approach gives wireless services a greater amount of knowledge and enables them to perform more effectively. We present a framework for the evaluation of this new type of predictor, and develop two new predictors, HEM and G-Stat. We evaluate our predictors' accuracy in predicting future cells for mobile users using two large geolocation data sets, from MDC [11], [12] and Crawdad [13]. We show that our predictors can successfully predict with an average inaccuracy as low as 2.2% in certain scenarios.
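A distribution-valued predictor of this kind can be sketched with a first-order Markov model over a user's cell trace (in the spirit of the approach described; HEM and G-Stat themselves are not specified in the abstract, and the trace below is hypothetical):

```python
from collections import Counter, defaultdict

# Build, for each cell, a probability distribution over the next cell
# from observed transitions, instead of a single point prediction.
def next_cell_distribution(trace):
    counts = defaultdict(Counter)
    for here, nxt in zip(trace, trace[1:]):
        counts[here][nxt] += 1
    return {
        cell: {n: c / sum(ctr.values()) for n, c in ctr.items()}
        for cell, ctr in counts.items()
    }

# Hypothetical cell-ID trace for one user
trace = ["c1", "c2", "c1", "c3", "c1", "c2", "c1", "c2"]
dist = next_cell_distribution(trace)
print(dist["c1"])   # → {'c2': 0.75, 'c3': 0.25}
```

A service consuming this output can, for example, pre-provision resources in every cell whose likelihood exceeds a threshold, rather than betting on a single predicted location.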

Relevance: 60.00%

Abstract:

This project was designed to provide the structural softwood processing industry with the basis for improved green and dry grading, to maximise MGP grade yields, deliver consistent product performance and reduce processing costs. To achieve this, advanced statistical techniques were used in conjunction with state-of-the-art property measurement systems. Specifically, the project aimed to make two significant steps forward for the Australian structural softwood industry:
• assessment of technologies, both existing and novel, that may lead to selection of a consistent, reliable and accurate device for the log yard and green mill, the purpose being to more accurately identify and reject material that will not make a minimum grade of MGP10 downstream;
• improved correlation of grading MOE and MOR parameters in the dry mill using new analytical methods and a combination of devices.
The three populations tested were stiffness-limited radiata pine, strength-limited radiata pine and Caribbean pine. Resonance tests were conducted on logs prior to sawmilling, and on boards. Raw data from existing in-line systems were captured for the green and dry boards. The dataset was analysed using classical and advanced statistical tools to provide correlations between data sets and to develop efficient strength and stiffness prediction equations. Stiffness and strength prediction algorithms were developed from raw and combined parameters. Parameters were analysed for comparison of prediction capabilities using in-line parameters, off-line parameters and a combination of the two. The results show that acoustic resonance techniques have potential for log assessment, to sort for low stiffness and/or low strength, depending on the resource. From the log measurements, a strong correlation was found between the average static MOE of the dried boards within a log and the predicted value. 
These results have application in segregating logs into structural and non-structural uses. Some commercial technologies, such as the Hitman LG640, are already available for this application. For green boards it was found that in-line and laboratory acoustic devices can provide a good prediction of dry static MOE and a moderate prediction of MOR. There is high potential for segregating boards at this stage of processing. Grading after log breakdown can significantly improve the effectiveness of the mill, and reductions in non-structural volumes can subsequently be achieved. Depending on the resource, it can be expected that 5 to 8% fewer non-structural boards will be dried, with an associated saving of $70 to 85/m3. For dry boards, vibration and a standard Metriguard CLT/HCLT provided a similar level of prediction on the stiffness-limited resource. However, Metriguard provides a better strength prediction in strength-limited resources (due to this equipment's ability to measure local characteristics). The combination of grading equipment specifically for stiffness-related predictors (Metriguard or vibration) with defect detection systems (optical or X-ray scanner) provides a higher level of prediction, especially for MOR. Several commercial technologies are already available for acoustic grading of boards, such as those from Microtec, Luxscan, Falcon Engineering or Dynalyse AB. Differing combinations of equipment, and their strategic location within the processing chain, can dramatically improve the efficiency of the mill, the degree of improvement varying depending on the resource. For example, initial acoustic sorting of green boards, combined with an optical scanner associated with an acoustic system for grading dry boards, can result in a large reduction in the proportion of low-value non-structural product. The application of classical MLR on several predictors proved to be effective, in particular for MOR predictions. 
However, the use of a modern statistical approach (chemometrics tools) such as PLS proved more efficient at improving the level of prediction. Compared to existing technologies, the results of the project indicate good potential for improved grading in the green mill, ahead of kiln drying and subsequent cost-adding processes. The next stage is the development and refinement of systems for this purpose.
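The acoustic resonance measurement at the heart of this work can be sketched numerically (all board data below are hypothetical, and this single-predictor fit is only a stand-in for the MLR/PLS models the project used): the fundamental longitudinal resonance frequency f of a board of length L gives a velocity v = 2·L·f and a dynamic MOE estimate E = ρ·v², which then serves as a predictor of strength (MOR):

```python
# Dynamic MOE from the fundamental longitudinal resonance of a board.
def dynamic_moe(rho, length_m, f_hz):
    v = 2.0 * length_m * f_hz          # acoustic velocity, m/s
    return rho * v * v                 # modulus of elasticity, Pa

# Hypothetical boards: (density kg/m^3, length m, resonance Hz, measured MOR MPa)
boards = [(450, 4.8, 420, 28.0), (500, 4.8, 450, 34.0),
          (480, 4.8, 400, 25.0), (520, 4.8, 470, 38.0)]

# Closed-form least squares of MOR on dynamic MOE (one-predictor regression)
x = [dynamic_moe(r, l, f) / 1e9 for r, l, f, _ in boards]   # E_dyn in GPa
y = [m for *_, m in boards]
n = len(x)
xb, yb = sum(x) / n, sum(y) / n
b = (sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
     / sum((xi - xb) ** 2 for xi in x))
a = yb - b * xb
print("E_dyn (GPa):", [round(v, 1) for v in x])
print("MOR ~ %.2f + %.3f * E_dyn" % (a, b))
```

In the project itself, several such predictors (in-line and off-line) were combined via MLR and PLS; the positive slope here simply reflects the stiffness-strength correlation the grading exploits.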

Relevance: 60.00%

Abstract:

Background: A nucleosome is the fundamental repeating unit of the eukaryotic chromosome. It has been shown that the positioning of a majority of nucleosomes is primarily controlled by factors other than the intrinsic preference of the DNA sequence. One of the key questions in this context is the role, if any, that can be played by the variability of nucleosomal DNA structure. Results: In this study, we have addressed this question by analysing the variability at the dinucleotide and trinucleotide as well as at longer length scales in a dataset of nucleosome X-ray crystal structures. We observe that the nucleosome structure displays remarkable local-level structural versatility within the B-DNA family. The nucleosomal DNA also incorporates a large number of kinks. Conclusions: Based on our results, we propose that the local and global versatility of B-DNA structure may be a significant factor modulating the formation of nucleosomes in the vicinity of high-plasticity genes, and in varying the probability of binding by regulatory proteins. Hence, these factors should be incorporated in prediction algorithms, and there may not be a unique 'template' for predicting putative nucleosome sequences. In addition, the multimodal distribution of dinucleotide parameters for some steps, and the presence of a large number of kinks in the nucleosomal DNA structure, indicate that the linear elastic model, used by several algorithms to predict the energetic cost of nucleosome formation, may lead to incorrect results.
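The linear elastic model criticized in the conclusions scores the cost of deforming each dinucleotide step as a harmonic penalty around its mean geometry. A toy sketch (the step parameters below are hypothetical, not fitted values, and only one of the six step parameters is shown) illustrates why a sharp kink breaks this model:

```python
# Harmonic (linear elastic) deformation energy over dinucleotide steps.
def elastic_energy(steps, params):
    # params[step] = (mean_roll_deg, force_constant); energy in arbitrary units
    e = 0.0
    for step, roll in steps:
        mean, k = params[step]
        e += 0.5 * k * (roll - mean) ** 2
    return e

params = {"AA": (0.5, 0.04), "GC": (5.0, 0.02), "TA": (2.5, 0.01)}

# A mildly deformed fragment vs. the same fragment with a sharply kinked
# TA step: the quadratic penalty explodes, whereas a real kink is better
# described as a discrete alternative state with its own (finite) cost.
smooth = elastic_energy([("AA", 1.0), ("GC", 6.0), ("TA", 3.0)], params)
kinked = elastic_energy([("AA", 1.0), ("GC", 6.0), ("TA", 25.0)], params)
print(smooth, kinked)
```

Because the quadratic term grows without bound, a single kink dominates the predicted cost, which is one way the linear elastic assumption can mislead nucleosome-formation energy estimates.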

Relevance: 60.00%

Abstract:

Computer-assisted topology predictions are widely used to build low-resolution structural models of integral membrane proteins (IMPs). Experimental validation of these models by traditional methods is labor intensive and requires modifications that might alter the IMP native conformation. This work employs oxidative labeling coupled with mass spectrometry (MS) as a validation tool for computer-generated topology models. ·OH exposure introduces oxidative modifications in solvent-accessible regions, whereas buried segments (e.g., transmembrane helices) are non-oxidizable. The Escherichia coli protein WaaL (O-antigen ligase) is predicted to have 12 transmembrane helices and a large extramembrane domain (Pérez et al., Mol. Microbiol. 2008, 70, 1424). Tryptic digestion and LC-MS/MS were used to map the oxidative labeling behavior of WaaL. Met and Cys exhibit high intrinsic reactivities with ·OH, making them sensitive probes for solvent accessibility assays. Overall, the oxidation pattern of these residues is consistent with the originally proposed WaaL topology. One residue (M151), however, undergoes partial oxidation despite being predicted to reside within a transmembrane helix. Using an improved computer algorithm, a slightly modified topology model was generated that places M151 closer to the membrane interface. On the basis of the labeling data, it is concluded that the refined model more accurately reflects the actual topology of WaaL. We propose that the combination of oxidative labeling and MS represents a useful strategy for assessing the accuracy of IMP topology predictions, supplementing data obtained in traditional biochemical assays. In the future, it might be possible to incorporate oxidative labeling data directly as constraints in topology prediction algorithms.

Relevance: 60.00%

Abstract:

This paper compares the most common digital signal processing methods for exon prediction in eukaryotes, and also proposes a technique for noise suppression in exon prediction. The specimen used here, which has relevance in medical research, has been taken from the public genomic database GenBank. Exon prediction has been performed using the digital signal processing methods, viz. the binary method, the EIIP (electron-ion interaction pseudopotential) method and filter methods. Under the filter method, two filter designs, and two approaches using these designs, have been tried. The discrete wavelet transform has been used for de-noising of the exon plots. Results of exon prediction based on the methods mentioned above, which give values closest to the ones found in the NCBI database, are given here, together with the exon plot de-noised using the discrete wavelet transform. The authors' alterations to the proven methods improve the performance of exon prediction algorithms, and the discrete wavelet transform is shown to be an effective de-noising tool that can be used with exon prediction algorithms.
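The binary method mentioned above maps each base to a 0/1 indicator sequence and looks for the characteristic period-3 power peak of coding regions at the DFT frequency k = N/3. A minimal sketch on toy sequences (not the paper's GenBank specimen):

```python
import cmath

# Total spectral power of the four base-indicator sequences at the
# period-3 frequency k = N/3 (the classic coding-region signature).
def period3_power(seq):
    n = len(seq)
    k = n // 3
    total = 0.0
    for base in "ACGT":
        x = sum((1.0 if s == base else 0.0)
                * cmath.exp(-2j * cmath.pi * k * i / n)
                for i, s in enumerate(seq))
        total += abs(x) ** 2
    return total

coding_like = "ATG" * 30            # strong 3-base periodicity
uniform = "A" * 90                  # no power at k = N/3
print(period3_power(coding_like) > period3_power(uniform))  # True
```

In practice this statistic is computed in a sliding window along the genome, and the resulting exon plot is what the de-noising step (here, the discrete wavelet transform) is applied to.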