68 resultados para DNA Sequence, Hidden Markov Model, Bayesian Model, Sensitive Analysis, Markov Chain Monte Carlo


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In biological mass spectrometry (MS), two ionization techniques are predominantly employed for the analysis of larger biomolecules, such as polypeptides. These are nano-electrospray ionization [1, 2] (nanoESI) and matrix-assisted laser desorption/ionization [3, 4] (MALDI). Both techniques are considered to be “soft”, allowing the desorption and ionization of intact molecular analyte species and thus their successful mass-spectrometric analysis. One of the main differences between these two ionization techniques lies in their ability to produce multiply charged ions. MALDI typically generates singly charged peptide ions whereas nanoESI easily provides multiply charged ions, even for peptides as low as 1000 Da in mass. The production of highly charged ions is desirable as this allows the use of mass analyzers, such as ion traps (including orbitraps) and hybrid quadrupole instruments, which typically offer only a limited m/z range (< 2000–4000). It also enables more informative fragmentation spectra using techniques such as collisioninduced dissociation (CID) and electron capture/transfer dissociation (ECD/ETD) in combination with tandem MS (MS/MS). [5, 6] Thus, there is a clear advantage of using ESI in research areas where peptide sequencing, or in general, the structural elucidation of biomolecules by MS/MS is required. Nonetheless, MALDI with its higher tolerance to contaminants and additives, ease-of-operation, potential for highspeed and automated sample preparation and analysis as well as its MS imaging capabilities makes it an ionization technique that can cover bioanalytical areas for which ESI is less suitable. [7, 8] If these strengths could be combined with the analytical power of multiply charged ions, new instrumental configurations and large-scale proteomic analyses based on MALDI MS(/MS) would become feasible.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Monte Carlo algorithms often aim to draw from a distribution π by simulating a Markov chain with transition kernel P such that π is invariant under P. However, there are many situations for which it is impractical or impossible to draw from the transition kernel P. For instance, this is the case with massive datasets, where is it prohibitively expensive to calculate the likelihood and is also the case for intractable likelihood models arising from, for example, Gibbs random fields, such as those found in spatial statistics and network analysis. A natural approach in these cases is to replace P by an approximation Pˆ. Using theory from the stability of Markov chains we explore a variety of situations where it is possible to quantify how ’close’ the chain given by the transition kernel Pˆ is to the chain given by P . We apply these results to several examples from spatial statistics and network analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The species limits and infraspecific DNA sequence diversity of Cyclamen libanoticum was examined. Analysis of the chloroplast DNA from six regions shows that, in C. libanoticum, only one base-pair difference is found among samples within the analysed 7066 base-pairs. This one base-pair difference (9 ‘A’s vs 10 ‘A’s) was found in the samples collected from a single site and could represent a very minor change in DNA sequence or even show the limits or accuracy of the sequencing system used. C. libanoticum is highly distinct from other congeners in DNA sequence.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Models for which the likelihood function can be evaluated only up to a parameter-dependent unknown normalizing constant, such as Markov random field models, are used widely in computer science, statistical physics, spatial statistics, and network analysis. However, Bayesian analysis of these models using standard Monte Carlo methods is not possible due to the intractability of their likelihood functions. Several methods that permit exact, or close to exact, simulation from the posterior distribution have recently been developed. However, estimating the evidence and Bayes’ factors for these models remains challenging in general. This paper describes new random weight importance sampling and sequential Monte Carlo methods for estimating BFs that use simulation to circumvent the evaluation of the intractable likelihood, and compares them to existing methods. In some cases we observe an advantage in the use of biased weight estimates. An initial investigation into the theoretical and empirical properties of this class of methods is presented. Some support for the use of biased estimates is presented, but we advocate caution in the use of such estimates.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Land cover data derived from satellites are commonly used to prescribe inputs to models of the land surface. Since such data inevitably contains errors, quantifying how uncertainties in the data affect a model’s output is important. To do so, a spatial distribution of possible land cover values is required to propagate through the model’s simulation. However, at large scales, such as those required for climate models, such spatial modelling can be difficult. Also, computer models often require land cover proportions at sites larger than the original map scale as inputs, and it is the uncertainty in these proportions that this article discusses. This paper describes a Monte Carlo sampling scheme that generates realisations of land cover proportions from the posterior distribution as implied by a Bayesian analysis that combines spatial information in the land cover map and its associated confusion matrix. The technique is computationally simple and has been applied previously to the Land Cover Map 2000 for the region of England and Wales. This article demonstrates the ability of the technique to scale up to large (global) satellite derived land cover maps and reports its application to the GlobCover 2009 data product. The results show that, in general, the GlobCover data possesses only small biases, with the largest belonging to non–vegetated surfaces. In vegetated surfaces, the most prominent area of uncertainty is Southern Africa, which represents a complex heterogeneous landscape. It is also clear from this study that greater resources need to be devoted to the construction of comprehensive confusion matrices.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

There are now considerable expectations that semi-distributed models are useful tools for supporting catchment water quality management. However, insufficient attention has been given to evaluating the uncertainties inherent to this type of model, especially those associated with the spatial disaggregation of the catchment. The Integrated Nitrogen in Catchments model (INCA) is subjected to an extensive regionalised sensitivity analysis in application to the River Kennet, part of the groundwater-dominated upper Thames catchment, UK The main results are: (1) model output was generally insensitive to land-phase parameters, very sensitive to groundwater parameters, including initial conditions, and significantly sensitive to in-river parameters; (2) INCA was able to produce good fits simultaneously to the available flow, nitrate and ammonium in-river data sets; (3) representing parameters as heterogeneous over the catchment (206 calibrated parameters) rather than homogeneous (24 calibrated parameters) produced a significant improvement in fit to nitrate but no significant improvement to flow and caused a deterioration in ammonium performance; (4) the analysis indicated that calibrating the flow-related parameters first, then calibrating the remaining parameters (as opposed to calibrating all parameters together) was not a sensible strategy in this case; (5) even the parameters to which the model output was most sensitive suffered from high uncertainty due to spatial inconsistencies in the estimated optimum values, parameter equifinality and the sampling error associated with the calibration method; (6) soil and groundwater nutrient and flow data are needed to reduce. uncertainty in initial conditions, residence times and nitrogen transformation parameters, and long-term historic data are needed so that key responses to changes in land-use management can be assimilated. The results indicate the general, difficulty of reconciling the questions which catchment nutrient models are expected to answer with typically limited data sets and limited knowledge about suitable model structures. The results demonstrate the importance of analysing semi-distributed model uncertainties prior to model application, and illustrate the value and limitations of using Monte Carlo-based methods for doing so. (c) 2005 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nonlinear adjustment toward long-run price equilibrium relationships in the sugar-ethanol-oil nexus in Brazil is examined. We develop generalized bivariate error correction models that allow for cointegration between sugar, ethanol, and oil prices, where dynamic adjustments are potentially nonlinear functions of the disequilibrium errors. A range of models are estimated using Bayesian Monte Carlo Markov Chain algorithms and compared using Bayesian model selection methods. The results suggest that the long-run drivers of Brazilian sugar prices are oil prices and that there are nonlinearities in the adjustment processes of sugar and ethanol prices to oil price but linear adjustment between ethanol and sugar prices.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

When a computer program requires legitimate access to confidential data, the question arises whether such a program may illegally reveal sensitive information. This paper proposes a policy model to specify what information flow is permitted in a computational system. The security definition, which is based on a general notion of information lattices, allows various representations of information to be used in the enforcement of secure information flow in deterministic or nondeterministic systems. A flexible semantics-based analysis technique is presented, which uses the input-output relational model induced by an attacker's observational power, to compute the information released by the computational system. An illustrative attacker model demonstrates the use of the technique to develop a termination-sensitive analysis. The technique allows the development of various information flow analyses, parametrised by the attacker's observational power, which can be used to enforce what declassification policies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Water quality models generally require a relatively large number of parameters to define their functional relationships, and since prior information on parameter values is limited, these are commonly defined by fitting the model to observed data. In this paper, the identifiability of water quality parameters and the associated uncertainty in model simulations are investigated. A modification to the water quality model `Quality Simulation Along River Systems' is presented in which an improved flow component is used within the existing water quality model framework. The performance of the model is evaluated in an application to the Bedford Ouse river, UK, using a Monte-Carlo analysis toolbox. The essential framework of the model proved to be sound, and calibration and validation performance was generally good. However some supposedly important water quality parameters associated with algal activity were found to be completely insensitive, and hence non-identifiable, within the model structure, while others (nitrification and sedimentation) had optimum values at or close to zero, indicating that those processes were not detectable from the data set examined. (C) 2003 Elsevier Science B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper reports an uncertainty analysis of critical loads for acid deposition for a site in southern England, using the Steady State Mass Balance Model. The uncertainty bounds, distribution type and correlation structure for each of the 18 input parameters was considered explicitly, and overall uncertainty estimated by Monte Carlo methods. Estimates of deposition uncertainty were made from measured data and an atmospheric dispersion model, and hence the uncertainty in exceedance could also be calculated. The uncertainties of the calculated critical loads were generally much lower than those of the input parameters due to a "compensation of errors" mechanism - coefficients of variation ranged from 13% for CLmaxN to 37% for CL(A). With 1990 deposition, the probability that the critical load was exceeded was > 0.99; to reduce this probability to 0.50, a 63% reduction in deposition is required; to 0.05, an 82% reduction. With 1997 deposition, which was lower than that in 1990, exceedance probabilities declined and uncertainties in exceedance narrowed as deposition uncertainty had less effect. The parameters contributing most to the uncertainty in critical loads were weathering rates, base cation uptake rates, and choice of critical chemical value, indicating possible research priorities. However, the different critical load parameters were to some extent sensitive to different input parameters. The application of such probabilistic results to environmental regulation is discussed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Measurements of anthropogenic tracers such as chlorofluorocarbons and tritium must be quantitatively combined with ocean general circulation models as a component of systematic model development. The authors have developed and tested an inverse method, using a Green's function, to constrain general circulation models with transient tracer data. Using this method chlorofluorocarbon-11 and -12 (CFC-11 and -12) observations are combined with a North Atlantic configuration of the Miami Isopycnic Coordinate Ocean Model with 4/3 degrees resolution. Systematic differences can be seen between the observed CFC concentrations and prior CFC fields simulated by the model. These differences are reduced by the inversion, which determines the optimal gas transfer across the air-sea interface, accounting for uncertainties in the tracer observations. After including the effects of unresolved variability in the CFC fields, the model is found to be inconsistent with the observations because the model/data misfit slightly exceeds the error estimates. By excluding observations in waters ventilated north of the Greenland-Scotland ridge (sigma (0) < 27.82 kg m(-3); shallower than about 2000 m), the fit is improved, indicating that the Nordic overflows are poorly represented in the model. Some systematic differences in the model/data residuals remain and are related, in part, to excessively deep model ventilation near Rockall and deficient ventilation in the main thermocline of the eastern subtropical gyre. Nevertheless, there do not appear to be gross errors in the basin-scale model circulation. Analysis of the CFC inventory using the constrained model suggests that the North Atlantic Ocean shallower than about 2000 m was near 20% saturated in the mid-1990s. Overall, this basin is a sink to 22% of the total atmosphere-to-ocean CFC-11 flux-twice the global average value. The average water mass formation rates over the CFC transient are 7.0 and 6.0 Sv (Sv = 10(6) m(3) s(-1)) for subtropical mode water and subpolar mode water, respectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Numerous techniques exist which can be used for the task of behavioural analysis and recognition. Common amongst these are Bayesian networks and Hidden Markov Models. Although these techniques are extremely powerful and well developed, both have important limitations. By fusing these techniques together to form Bayes-Markov chains, the advantages of both techniques can be preserved, while reducing their limitations. The Bayes-Markov technique forms the basis of a common, flexible framework for supplementing Markov chains with additional features. This results in improved user output, and aids in the rapid development of flexible and efficient behaviour recognition systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Danish Eulerian Model (DEM) is a powerful air pollution model, designed to calculate the concentrations of various dangerous species over a large geographical region (e.g. Europe). It takes into account the main physical and chemical processes between these species, the actual meteorological conditions, emissions, etc.. This is a huge computational task and requires significant resources of storage and CPU time. Parallel computing is essential for the efficient practical use of the model. Some efficient parallel versions of the model were created over the past several years. A suitable parallel version of DEM by using the Message Passing Interface library (AIPI) was implemented on two powerful supercomputers of the EPCC - Edinburgh, available via the HPC-Europa programme for transnational access to research infrastructures in EC: a Sun Fire E15K and an IBM HPCx cluster. Although the implementation is in principal, the same for both supercomputers, few modifications had to be done for successful porting of the code on the IBM HPCx cluster. Performance analysis and parallel optimization was done next. Results from bench marking experiments will be presented in this paper. Another set of experiments was carried out in order to investigate the sensitivity of the model to variation of some chemical rate constants in the chemical submodel. Certain modifications of the code were necessary to be done in accordance with this task. The obtained results will be used for further sensitivity analysis Studies by using Monte Carlo simulation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As the sister group to vertebrates, amphioxus is consistently used as a model of genome evolution for understanding the invertebrate/vertebrate transition. The amphioxus genome has not undergone massive duplications like those in the vertebrates or disruptive rearrangements like in the genome of Ciona, a urochordate, making it an ideal evolutionary model. Transposable elements have been linked to many genomic evolutionary changes including increased genome size, modified gene expression, massive gene rearrangements, and possibly intron evolution. Despite their importance in genome evolution, few previous examples of transposable elements have been identified in amphioxus. We report five novel Miniature Inverted-repeat Transposable Elements (MITEs) identified by an analysis of amphioxus DNA sequence, which we have named LanceleTn-1, LanceleTn-2, LanceleTn-3a, LanceleTn-3b and LanceleTn-4. Several of the LanceleTn elements were identified in the amphioxus ParaHox cluster, and we suggest these have had important implications for the evolution of this highly conserved gene cluster. The estimated high copy numbers of these elements implies that MITEs are probably the most abundant type of mobile element in amphioxus, and are thus likely to have been of fundamental importance in shaping the evolution of the amphioxus genome.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A new approach to the study of the local organization in amorphous polymer materials is presented. The method couples neutron diffraction experiments that explore the structure on the spatial scale 1–20 Å with the reverse Monte Carlo fitting procedure to predict structures that accurately represent the experimental scattering results over the whole momentum transfer range explored. Molecular mechanics and molecular dynamics techniques are also used to produce atomistic models independently from any experimental input, thereby providing a test of the viability of the reverse Monte Carlo method in generating realistic models for amorphous polymeric systems. An analysis of the obtained models in terms of single chain properties and of orientational correlations between chain segments is presented. We show the viability of the method with data from molten polyethylene. The analysis derives a model with average C-C and C-H bond lengths of 1.55 Å and 1.1 Å respectively, average backbone valence angle of 112, a torsional angle distribution characterized by a fraction of trans conformers of 0.67 and, finally, a weak interchain orientational correlation at around 4 Å.