847 results for classification methods
Abstract:
This item provides supplementary materials for the paper mentioned in the title, specifically a range of organisms used in the study. The full abstract for the main paper is as follows: Next Generation Sequencing (NGS) technologies have revolutionised molecular biology, allowing clinical sequencing to become a matter of routine. NGS data sets consist of short sequence reads obtained from the machine, given context and meaning through downstream assembly and annotation. For these techniques to operate successfully, the collected reads must be consistent with the assumed species or species group, and not corrupted in some way. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans, with some strains exhibiting antibiotic resistance. In this paper, we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from alternative pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
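The k-mer representation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `kmer_counts` and `kmer_vector`, the choice to skip k-mers containing characters other than A, C, G, T, and the use of relative frequencies are all assumptions made for illustration. A vector built this way is the kind of fixed-length input an SVM classifier could consume.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def kmer_counts(sequence, k):
    """Count overlapping k-mers in a read, skipping any k-mer that
    contains a character other than A, C, G or T (an assumption)."""
    sequence = sequence.upper()
    counts = Counter()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if set(kmer) <= VALID_BASES:
            counts[kmer] += 1
    return counts


def kmer_vector(sequence, k, vocabulary):
    """Map a read to relative k-mer frequencies over a fixed vocabulary,
    so that every read yields a vector of the same length."""
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    if total == 0:
        return [0.0 for _ in vocabulary]
    return [counts[kmer] / total for kmer in vocabulary]
```

For example, `kmer_counts("ACGTACGT", 4)` sees five overlapping 4-mers, two of which are `ACGT`.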
Abstract:
This paper gives a review of recent progress in the design of numerical methods for computing the trajectories (sample paths) of solutions to stochastic differential equations. We give a brief survey of the area focusing on a number of application areas where approximations to strong solutions are important, with a particular focus on computational biology applications, and give the necessary analytical tools for understanding some of the important concepts associated with stochastic processes. We present the stochastic Taylor series expansion as the fundamental mechanism for constructing effective numerical methods, give general results that relate local and global order of convergence and mention the Magnus expansion as a mechanism for designing methods that preserve the underlying structure of the problem. We also present various classes of explicit and implicit methods for strong solutions, based on the underlying structure of the problem. Finally, we discuss implementation issues relating to maintaining the Brownian path, efficient simulation of stochastic integrals and variable-step-size implementations based on various types of control.
Abstract:
The pioneering work of Runge and Kutta a hundred years ago has ultimately led to suites of sophisticated numerical methods suitable for solving complex systems of deterministic ordinary differential equations. However, in many modelling situations, the appropriate representation is a stochastic differential equation and here numerical methods are much less sophisticated. In this paper a very general class of stochastic Runge-Kutta methods is presented and much more efficient classes of explicit methods than previous extant methods are constructed. In particular, a method of strong order 2 with a deterministic component based on the classical Runge-Kutta method is constructed and some numerical results are presented to demonstrate the efficacy of this approach.
Abstract:
In this paper, general order conditions and a global convergence proof are given for stochastic Runge-Kutta methods applied to stochastic ordinary differential equations (SODEs) of Stratonovich type. This work generalizes the ideas of B-series as applied to deterministic ordinary differential equations (ODEs) to the stochastic case and allows a completely general formalism for constructing high order stochastic methods, either explicit or implicit. Some numerical results will be given to illustrate this theory.
Abstract:
Stochastic differential equations (SDEs) arise from physical systems where the parameters describing the system can only be estimated or are subject to noise. There has been much work done recently on developing numerical methods for solving SDEs. This paper will focus on stability issues and variable stepsize implementation techniques for numerically solving SDEs effectively. (C) 2000 Elsevier Science B.V. All rights reserved.
Abstract:
In recent years considerable attention has been paid to the numerical solution of stochastic ordinary differential equations (SODEs), as SODEs are often more appropriate than their deterministic counterparts in many modelling situations. However, unlike the deterministic case, numerical methods for SODEs are considerably less sophisticated due to the difficulty in representing the (possibly large number of) random variable approximations to the stochastic integrals. Although Burrage and Burrage [High strong order explicit Runge-Kutta methods for stochastic ordinary differential equations, Applied Numerical Mathematics 22 (1996) 81-101] were able to construct strong local order 1.5 stochastic Runge-Kutta methods for certain cases, it is known that all extant stochastic Runge-Kutta methods suffer an order reduction down to strong order 0.5 if there is non-commutativity between the functions associated with the multiple Wiener processes. This order reduction down to that of the Euler-Maruyama method imposes severe difficulties in obtaining meaningful solutions in a reasonable time frame and this paper attempts to circumvent these difficulties by some new techniques. An additional difficulty in solving SODEs arises even in the linear case since it is not possible to write the solution analytically in terms of matrix exponentials unless there is a commutativity property between the functions associated with the multiple Wiener processes. Thus, in the present paper, the work of Magnus [On the exponential solution of differential equations for a linear operator, Communications on Pure and Applied Mathematics 7 (1954) 649-673] (applied to deterministic non-commutative linear problems) will first be applied to non-commutative linear SODEs and methods of strong order 1.5 for arbitrary, linear, non-commutative SODE systems will be constructed - hence giving an accurate approximation to the general linear problem.
Secondly, for general nonlinear non-commutative systems with an arbitrary number (d) of Wiener processes it is shown that strong local order 1 Runge-Kutta methods with d + 1 stages can be constructed by evaluating a set of Lie brackets as well as the standard function evaluations. A method is then constructed which can be efficiently implemented in a parallel environment for this arbitrary number of Wiener processes. Finally some numerical results are presented which illustrate the efficacy of these approaches. (C) 1999 Elsevier Science B.V. All rights reserved.
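The Euler-Maruyama method named in this abstract as the strong order 0.5 baseline can be sketched in a few lines of Python. This is only the baseline scheme, not the Magnus-based or Lie-bracket methods the paper constructs. The test equation (a scalar linear SDE dX = aX dt + bX dW with a single Wiener process, read in the Ito sense), the function names and the parameter names are all assumptions chosen for illustration.

```python
import math
import random


def euler_maruyama_gbm(x0, a, b, t_end, n_steps, seed=0):
    """Euler-Maruyama for the scalar Ito SDE dX = a*X dt + b*X dW.

    Returns the approximation at t_end and the final value of the
    simulated Wiener path, so the same path can be reused elsewhere.
    """
    rng = random.Random(seed)
    h = t_end / n_steps
    x, w = x0, 0.0
    for _ in range(n_steps):
        # Wiener increment over one step: normal with variance h.
        dw = rng.gauss(0.0, math.sqrt(h))
        x += a * x * h + b * x * dw
        w += dw
    return x, w


def exact_gbm(x0, a, b, t, w):
    """Closed-form Ito solution of the same SDE for a given W(t)."""
    return x0 * math.exp((a - 0.5 * b * b) * t + b * w)
```

Comparing `euler_maruyama_gbm(...)` with `exact_gbm(...)` evaluated on the returned Wiener value, for a decreasing sequence of step sizes and averaged over many seeds, is one way to observe the slow strong convergence that motivates higher-order methods.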
Abstract:
In many modeling situations in which parameter values can only be estimated or are subject to noise, the appropriate mathematical representation is a stochastic ordinary differential equation (SODE). However, unlike the deterministic case in which there are suites of sophisticated numerical methods, numerical methods for SODEs are much less sophisticated. Until a recent paper by K. Burrage and P.M. Burrage (1996), the highest strong order of a stochastic Runge-Kutta method was one. But K. Burrage and P.M. Burrage (1996) showed that by including additional random variable terms representing approximations to the higher order Stratonovich (or Ito) integrals, higher order methods could be constructed. However, this analysis applied only to the one Wiener process case. In this paper, it will be shown that in the multiple Wiener process case all known stochastic Runge-Kutta methods can suffer a severe order reduction if there is non-commutativity between the functions associated with the Wiener processes. Importantly, however, it is also suggested how this order can be repaired if certain commutator operators are included in the Runge-Kutta formulation. (C) 1998 Elsevier Science B.V. and IMACS. All rights reserved.
Abstract:
In Burrage and Burrage [1] it was shown that by introducing a very general formulation for stochastic Runge-Kutta methods, the previous strong order barrier of order one could be broken without having to use higher derivative terms. In particular, methods of strong order 1.5 were developed in which a Stratonovich integral of order one and one of order two were present in the formulation. In this present paper, general order results are proven about the maximum attainable strong order of these stochastic Runge-Kutta methods (SRKs) in terms of the order of the Stratonovich integrals appearing in the Runge-Kutta formulation. In particular, it will be shown that if an s-stage SRK contains Stratonovich integrals up to order p then the strong order of the SRK cannot exceed min{(p + 1)/2, (s - 1)/2} for p ≥ 2 and s ≥ 3, or 1 if p = 1.
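The order bound stated at the end of this abstract can be evaluated directly. The sketch below is only a transcription of that bound into Python; the function name `max_strong_order` is hypothetical, and treating the p = 1 case as independent of the stage count s is an assumption about how the abstract should be read.

```python
from fractions import Fraction


def max_strong_order(p, s):
    """Upper bound on the strong order of an s-stage SRK whose
    formulation contains Stratonovich integrals up to order p."""
    if p < 1 or s < 1:
        raise ValueError("p and s must be positive integers")
    if p == 1:
        # Special case quoted in the abstract: the bound is 1.
        return Fraction(1)
    if s < 3:
        raise ValueError("the bound is stated only for s >= 3 when p >= 2")
    return min(Fraction(p + 1, 2), Fraction(s - 1, 2))
```

Under this reading, integrals up to order 2 with 4 stages give a bound of 3/2, matching the strong order 1.5 methods mentioned earlier in the abstract.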
Abstract:
The fungal genera Ustilago, Sporisorium and Macalpinomyces represent an unresolved complex. Taxa within the complex often possess characters that occur in more than one genus, creating uncertainty for species placement. Previous studies have indicated that the genera cannot be separated by morphology alone. Here we chronologically review the history of the Ustilago-Sporisorium-Macalpinomyces complex, argue for its resolution and suggest methods to accomplish a stable taxonomy. A combined molecular and morphological approach is required to identify synapomorphic characters that underpin a new classification. Ustilago, Sporisorium and Macalpinomyces require explicit re-description and new genera, based on monophyletic groups, are needed to accommodate taxa that no longer fit the emended descriptions. A resolved classification will end the taxonomic confusion that surrounds generic placement of these smut fungi.
Abstract:
Nitrogen balance is increasingly used as an indicator of the environmental performance of the agricultural sector in national, international, and global contexts. There are three main methods of accounting for the national nitrogen balance: farm gate, soil surface, and soil system. OECD (2008) recently reported the nitrogen and phosphorus balances for member countries for the 1985 - 2004 period using the soil surface method. The farm gate and soil system methods were also used in some international projects. Some studies have compared these methods, and their conclusions are mixed. The motivation of the present paper was to combine these three methods to provide a more detailed auditing of the nitrogen balance and flows for national agricultural production. In addition, the present paper also provided a new strategy of using reliable international and national data sources to calculate nitrogen balance using the farm gate method. The empirical study focused on the nitrogen balance of OECD countries for the period from 1985 to 2003. The N surplus sent to the total environment of OECD surged dramatically in the early 1980s, gradually decreased during the 1990s but exhibited an increasing trend in the early 2000s. The overall N efficiency, however, fluctuated without a clear increasing trend. The eco-environmental ranking shows that Australia and Ireland were the worst while Korea and Greece were the best.
Abstract:
Finding and labelling semantic feature patterns of documents in a large, spatial corpus is a challenging problem. Text documents have characteristics that make semantic labelling difficult; the rapidly increasing volume of online documents creates a bottleneck in finding meaningful textual patterns. Aiming to deal with these issues, we propose an unsupervised document labelling approach based on semantic content and feature patterns. A world ontology with extensive topic coverage is exploited to supply controlled, structured subjects for labelling. An algorithm is also introduced to reduce dimensionality based on the study of ontological structure. The proposed approach was evaluated by comparison with typical machine learning methods including SVMs, Rocchio, and kNN, with promising results.
Abstract:
Background: Predicting protein subnuclear localization is a challenging problem. Some previous works based on non-sequence information, including Gene Ontology annotations and kernel fusion, have respective limitations. The aim of this work is twofold: first, to propose a novel individual feature extraction method; second, to develop an ensemble method to improve prediction performance using comprehensive information represented in the form of a high-dimensional feature vector obtained by 11 feature extraction methods. Methodology/Principal Findings: A novel two-stage multiclass support vector machine is proposed to predict protein subnuclear localizations. It only considers those feature extraction methods based on amino acid classifications and physicochemical properties. In order to speed up our system, an automatic search method for the kernel parameter is used. The prediction performance of our method is evaluated on four datasets: the Lei dataset, a multi-localization dataset, the SNL9 dataset and a new independent dataset. The overall prediction accuracy for 6 localizations on the Lei dataset is 75.2% and that for 9 localizations on the SNL9 dataset is 72.1% in leave-one-out cross validation; the accuracies are 71.7% for the multi-localization dataset and 69.8% for the new independent dataset. Comparisons with existing methods show that our method performs better for both single-localization and multi-localization proteins and achieves more balanced sensitivities and specificities on large-size and small-size subcellular localizations. The overall accuracy improvements are 4.0% and 4.7% for single-localization proteins and 6.5% for multi-localization proteins. The reliability and stability of our classification model are further confirmed by permutation analysis. Conclusions: It can be concluded that our method is effective and valuable for predicting protein subnuclear localizations. A web server has been designed to implement the proposed method. It is freely available at http://bioinformatics.awowshop.com/snlpred_page.php.
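The leave-one-out cross validation used to report accuracy in this abstract can be sketched as follows. The sketch deliberately swaps in a 1-nearest-neighbour classifier with squared Euclidean distance as a stand-in, because the authors' two-stage multiclass SVM and feature extraction methods are not described in enough detail here to reproduce. The function name `loo_accuracy` and the toy inputs are assumptions.

```python
def loo_accuracy(features, labels):
    """Leave-one-out accuracy with a 1-nearest-neighbour stand-in.

    Each sample is held out in turn, predicted from all remaining
    samples, and the fraction of correct predictions is returned.
    """
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(features, labels)):
        best_label, best_dist = None, float("inf")
        for j, (other, other_label) in enumerate(zip(features, labels)):
            if j == i:
                continue  # the held-out sample is never its own neighbour
            dist = sum((a - b) ** 2 for a, b in zip(held_out, other))
            if dist < best_dist:
                best_label, best_dist = other_label, dist
        if best_label == true_label:
            correct += 1
    return correct / len(labels)
```

Because every sample is tested exactly once against a model built from all the others, the procedure uses the data fully, at the cost of one training run per sample.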
Abstract:
Railway bridges deteriorate with age. Factors such as environmental effects on different materials of a bridge, variation of loads, fatigue, etc., will reduce the remaining life of bridges. Bridges are currently rated individually for maintenance and repair actions according to the structural conditions of their elements. Dealing with thousands of bridges and several factors that cause deterioration makes the rating process extremely complicated. Current simplified but practical rating methods are not based on an accurate structural condition assessment system. On the other hand, the sophisticated but more accurate methods are only used for a single bridge or particular types of bridges. It is therefore necessary to develop a practical and accurate system which will be capable of rating a network of railway bridges. This paper introduces a new method for rating a network of bridges based on their current and future structural conditions. The method identifies typical bridges representing a group of railway bridges. The most crucial agents will be determined and categorized into criticality and vulnerability factors. Classification based on structural configuration, loading, and critical deterioration factors will be conducted. Finally, a rating method for a network of railway bridges that takes into account the effects of damaged structural components due to variations in loading and environmental conditions on the integrity of the whole structure will be proposed. The outcome of this research is expected to significantly improve the rating methods for railway bridges by considering the unique characteristics of different factors and incorporating the correlation between them.
Abstract:
Automatic pain monitoring has the potential to greatly improve patient diagnosis and outcomes by providing a continuous objective measure. One of the most promising methods is to do this via automatically detecting facial expressions. However, current approaches have failed due to their inability to: 1) integrate the rigid and non-rigid head motion into a single feature representation, and 2) incorporate the salient temporal patterns into the classification stage. In this paper, we tackle the first problem by developing a “histogram of facial action units” representation using Active Appearance Model (AAM) face features, and then utilize a Hidden Conditional Random Field (HCRF) to overcome the second issue. We show that both of these methods improve the performance on the task of pain detection at the sequence level compared to current state-of-the-art methods on the UNBC-McMaster Shoulder Pain Archive.