918 results for Stochastic sequences.


Relevance:

100.00% 100.00%

Publisher:

Abstract:

On cover: C00-1469-145.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Hidden Markov models (HMMs) are probabilistic models that are well adapted to many tasks in bioinformatics, for example, predicting the occurrence of specific motifs in biological sequences. MAMOT is a command-line program for Unix-like operating systems, including MacOS X, that we developed to allow scientists to apply HMMs more easily in their research. One can define the architecture and initial parameters of the model in a text file and then use MAMOT for parameter optimization on example data, decoding (such as predicting motif occurrence in sequences) and the production of stochastic sequences generated according to the probabilistic model. Two examples for which models are provided are coiled-coil domains in protein sequences and protein binding sites in DNA. Useful features include the use of pseudocounts, state tying and fixing of selected parameters in learning, and the inclusion of prior probabilities in decoding. AVAILABILITY: MAMOT is implemented in C++, and is distributed under the GNU General Public Licence (GPL). The software, documentation, and example model files can be found at http://bcf.isb-sib.ch/mamot
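
The decoding step mentioned in this abstract can be illustrated with a minimal Viterbi sketch in Python. It is not MAMOT's code or file format: the two states, the probabilities and the function name `viterbi` are all hypothetical, chosen only to show how a most probable state path marks a motif-like region in a DNA string.

```python
import math

def viterbi(obs, states, start, trans, emit):
    """Most probable state path for a non-empty obs under a discrete HMM."""
    log = math.log
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log(start[s]) + log(emit[s][obs[0]]) for s in states}]
    back = []
    for symbol in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + log(trans[p][s]))
            scores[s] = best[-1][prev] + log(trans[prev][s]) + log(emit[s][symbol])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

# Hypothetical toy model: a GC-rich "motif" state inside a uniform background.
states = ("background", "motif")
start = {"background": 0.9, "motif": 0.1}
trans = {"background": {"background": 0.9, "motif": 0.1},
         "motif": {"background": 0.2, "motif": 0.8}}
emit = {"background": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
        "motif": {"A": 0.01, "C": 0.49, "G": 0.49, "T": 0.01}}

path = viterbi("ATATGCGCGCGCATAT", states, start, trans, emit)
```

With these assumed parameters the decoded path labels the central GC run as `motif` and the flanking AT bases as `background`.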

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The classical Box-Jenkins approach to time series analysis assumes that the observed series fluctuates around changing levels with constant variance; that is, the series is assumed to be homoscedastic. Financial time series, however, exhibit heteroscedasticity, in the sense that their conditional variance given past observations is not constant. Analysing financial time series therefore requires modelling these variances, which may depend on time-dependent factors or on their own past values. This led to the introduction of several classes of models for studying the behaviour of financial time series; see Taylor (1986), Tsay (2005), Rachev et al. (2007). The class of models used to describe the evolution of conditional variances is referred to as stochastic volatility models. The stochastic models available for analysing conditional variances are based on either normal or log-normal distributions. One of the objectives of the present study is to explore the possibility of employing non-Gaussian distributions to model the volatility sequences and then to study the behaviour of the resulting return series. This led us to work on the related problem of statistical inference, which is the main contribution of the thesis.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Motivation: A consensus sequence for a family of related sequences is, as the name suggests, a sequence that captures the features common to most members of the family. Consensus sequences are important in various DNA sequencing applications and are a convenient way to characterize a family of molecules. Results: This paper describes a new algorithm for finding a consensus sequence, using the popular optimization method known as simulated annealing. Unlike the conventional approach of finding a consensus sequence by first forming a multiple sequence alignment, this algorithm searches for a sequence that minimises the sum of pairwise distances to each of the input sequences. The resulting consensus sequence can then be used to induce a multiple sequence alignment. The time required by the algorithm scales linearly with the number of input sequences and quadratically with the length of the consensus sequence. We present results demonstrating the high quality of the consensus sequences and alignments produced by the new algorithm. For comparison, we also present similar results obtained using ClustalW. The new algorithm outperforms ClustalW in many cases.
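
The idea in this abstract, searching directly for a sequence that minimises the summed distance to the inputs, can be sketched as follows. This is an illustrative Python sketch, not the paper's algorithm: the use of plain edit distance, the three move types, the geometric cooling schedule and all names and default values are assumptions.

```python
import math
import random

def edit_distance(a, b):
    """Levenshtein distance via the standard row-by-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def total_distance(candidate, seqs):
    """Objective: sum of distances from the candidate to every input sequence."""
    return sum(edit_distance(candidate, s) for s in seqs)

def mutate(seq, alphabet, rng):
    """Propose a neighbour by one random substitution, insertion or deletion."""
    pos = rng.randrange(len(seq) + 1)
    move = rng.choice(("substitute", "insert", "delete"))
    if move == "insert" or not seq:
        return seq[:pos] + rng.choice(alphabet) + seq[pos:]
    pos = min(pos, len(seq) - 1)
    if move == "delete" and len(seq) > 1:
        return seq[:pos] + seq[pos + 1:]
    return seq[:pos] + rng.choice(alphabet) + seq[pos + 1:]

def anneal_consensus(seqs, alphabet="ACGT", steps=2000,
                     t_start=2.0, t_end=0.01, seed=0):
    """Simulated annealing over candidate consensus strings (assumed schedule)."""
    rng = random.Random(seed)
    current = seqs[0]
    current_cost = total_distance(current, seqs)
    best, best_cost = current, current_cost
    for step in range(steps):
        temperature = t_start * (t_end / t_start) ** (step / steps)
        candidate = mutate(current, alphabet, rng)
        cost = total_distance(candidate, seqs)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if cost <= current_cost or rng.random() < math.exp((current_cost - cost) / temperature):
            current, current_cost = candidate, cost
            if cost < best_cost:
                best, best_cost = candidate, cost
    return best, best_cost

seqs = ["ACGTACGT", "ACGTTCGT", "ACGAACGT", "ACGTACG"]
consensus, cost = anneal_consensus(seqs, steps=300)
```

Each evaluation of the objective costs one dynamic programme per input sequence, which is consistent with the scaling stated in the abstract: linear in the number of inputs and quadratic in the sequence length.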

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Forest fire sequences can be modelled as a stochastic point process where events are characterized by their spatial locations and occurrence in time. Cluster analysis permits the detection of the space/time pattern distribution of forest fires. These analyses are useful to assist fire-managers in identifying risk areas, implementing preventive measures and conducting strategies for an efficient distribution of the firefighting resources. This paper aims to identify hot spots in forest fire sequences by means of the space-time scan statistics permutation model (STSSP) and a geographical information system (GIS) for data and results visualization. The scan statistical methodology uses a scanning window, which moves across space and time, detecting local excesses of events in specific areas over a certain period of time. Finally, the statistical significance of each cluster is evaluated through Monte Carlo hypothesis testing. The case study is the forest fires registered by the Forest Service in Canton Ticino (Switzerland) from 1969 to 2008. This dataset consists of geo-referenced single events including the location of the ignition points and additional information. The data were aggregated into three sub-periods (considering important preventive legal dispositions) and two main ignition-causes (lightning and anthropogenic causes). Results revealed that forest fire events in Ticino are mainly clustered in the southern region where most of the population is settled. Our analysis uncovered local hot spots arising from extemporaneous arson activities. Results regarding the naturally-caused fires (lightning fires) disclosed two clusters detected in the northern mountainous area.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper presents gamma stochastic volatility models and investigates their distributional and time series properties. The parameter estimators obtained by the method of moments are shown analytically to be consistent and asymptotically normal. The simulation results indicate that the estimators behave well. The in-sample analysis shows that return models with gamma autoregressive stochastic volatility processes capture the leptokurtic nature of return distributions and the slowly decaying autocorrelation functions of squared stock index returns for the USA and UK. In comparison with GARCH and EGARCH models, the gamma autoregressive model picks up the persistence in volatility for the US and UK index returns but not the volatility persistence for the Canadian and Japanese index returns. The out-of-sample analysis indicates that the gamma autoregressive model has a superior volatility forecasting performance compared to GARCH and EGARCH models.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper investigates random number generators in stochastic iteration algorithms that require infinite-uniform sequences. We take a simple model of the general transport equation and solve it using a linear congruential generator, the Mersenne twister, the mother-of-all generators, and a true random number generator based on quantum effects. With this simple model we show that, for reasonably contractive operators, sequences that are theoretically not infinite-uniform also perform well. Finally, we demonstrate the power of stochastic iteration for the solution of the light transport problem.
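
As a rough illustration of the setting in this abstract, the sketch below drives a stochastic iteration with a linear congruential generator. Everything here is an assumption made for illustration: the scalar equation x = a*x + b stands in for the transport equation, the multiplier and increment are the widely published 32-bit constants 1664525 and 1013904223 (the abstract does not say which generator parameters were used), and the class and function names are hypothetical.

```python
class LinearCongruential:
    """Linear congruential generator: state <- (a * state + c) mod m."""

    def __init__(self, seed=0, a=1664525, c=1013904223, m=2**32):
        self.a, self.c, self.m = a, c, m
        self.state = seed % m

    def random(self):
        """Return the next value in [0, 1)."""
        self.state = (self.a * self.state + self.c) % self.m
        return self.state / self.m

def stochastic_iteration(rng, a=0.5, b=1.0, steps=20000):
    """Average the iterates of x <- T*(x) + b, where E[T*(x)] = a * x.

    The random operator keeps x with probability a and discards it otherwise,
    so its expectation is the contraction x -> a * x.
    """
    x, total = 0.0, 0.0
    for _ in range(steps):
        x = (x if rng.random() < a else 0.0) + b
        total += x
    return total / steps

estimate = stochastic_iteration(LinearCongruential(seed=0))
exact = 1.0 / (1.0 - 0.5)  # fixed point of x = 0.5 * x + 1
```

Because `stochastic_iteration` only calls `rng.random()`, an instance of Python's `random.Random` (a Mersenne twister) can be passed in place of the hand-written generator to compare the two.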

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We present a method for the recognition of complex actions. Our method combines automatic learning of simple actions and manual definition of complex actions in a single grammar. Contrary to the general trend in complex action recognition, which divides recognition into two stages, our method performs recognition of simple and complex actions in a unified way. This is done by encoding simple action HMMs within the stochastic grammar that models complex actions. This unified approach lets the higher activity layers influence the recognition of simple actions more effectively, which leads to a substantial improvement in the classification of complex actions. We consider the recognition of complex actions based on person transits between areas in the scene. As input, our method receives crossings of tracks along a set of zones, which are derived using unsupervised learning of the movement patterns of the objects in the scene. We evaluate our method on a large dataset showing normal, suspicious and threat behaviour in a parking lot. Experiments show an improvement of approximately 30% in the recognition of both high-level scenarios and their constituent simple actions with respect to a two-stage approach. Experiments with synthetic noise simulating the most common tracking failures show that our method experiences only a limited decrease in performance when moderate amounts of noise are added.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The transcription process is crucial to life, and the enzyme RNA polymerase (RNAP) is the major component of the transcription machinery. The development of single-molecule techniques, such as magnetic and optical tweezers, atomic-force microscopy and single-molecule fluorescence, has increased our understanding of the transcription process and complements traditional biochemical studies. Based on these studies, theoretical models have been proposed to explain and predict the kinetics of the RNAP during polymerization; notable results have been achieved by models based on the thermodynamic stability of the transcription elongation complex. However, experiments showed that if more than one RNAP initiates from the same promoter, the transcription behavior changes slightly and new phenomena are observed. We proposed and implemented a theoretical model that considers collisions between RNAPs and predicts their cooperative behavior during multi-round transcription, generalizing the Bai et al. stochastic sequence-dependent model. In our approach, collisions between elongating enzymes modify their transcription rate values. We performed the simulations in Mathematica® and compared the results of the single and the multiple-molecule transcription with experimental results and other theoretical models. Our multi-round approach can recover several expected behaviors, showing that the transcription process for the studied sequences can be accelerated by up to 48% when collisions are allowed: the dwell times on pause sites are reduced, as is the distance that the RNAPs backtrack from backtracking sites. © 2013 Costa et al.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The aim of modelling DNA strings is to formulate mathematical models that generate sequences of nitrogenous bases compatible with existing genomes. This thesis examines mathematical models that preserve an important property, discovered in 1952 by the biochemist Erwin Chargaff and known today as "Chargaff's second rule". Mathematical models that account for Chargaff's symmetries fall mainly into two lines of thought: one regards the rule as a result of evolution acting on the genome, while the other hypothesizes that it is peculiar to a primitive genome and has not been affected by the changes introduced by evolution. This thesis analyses a model of the second type. In particular, we took inspiration from the model defined by Sobottka and Hart. After a critical analysis and study of the authors' work, we extended the model to a wider set of cases. We used stochastic processes such as Bernoulli schemes and Markov chains to construct a possible generalization of the structure proposed in the article, analysing the conditions that imply the validity of Chargaff's rule. The models examined consist of simple stationary processes or concatenations of stationary processes. The first chapter introduces some notions of biology. The second gives a critical and forward-looking description of the model proposed by Sobottka and Hart, introducing the formal definitions for the general case presented in the third chapter, where the theoretical apparatus of the general model is developed.
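
A minimal Python sketch of the simplest case named in this abstract, a Bernoulli scheme, is given below. It is not the Sobottka and Hart model or the thesis's generalization: it only checks the single-base form of Chargaff's second rule (within one strand, the frequency of A is close to that of T and the frequency of C is close to that of G) on an i.i.d. sequence whose base probabilities are chosen symmetric by assumption. The function names and the value `p_at=0.6` are hypothetical.

```python
import random
from collections import Counter

def bernoulli_dna(length, p_at, rng):
    """Draw i.i.d. bases with P(A) = P(T) = p_at / 2 and P(C) = P(G) = (1 - p_at) / 2."""
    weights = [p_at / 2, (1 - p_at) / 2, (1 - p_at) / 2, p_at / 2]
    return "".join(rng.choices("ACGT", weights=weights, k=length))

def base_frequencies(seq):
    """Relative frequency of each base in a single strand."""
    counts = Counter(seq)
    return {base: counts[base] / len(seq) for base in "ACGT"}

rng = random.Random(0)
freqs = base_frequencies(bernoulli_dna(100_000, p_at=0.6, rng=rng))

# Under the single-base form of the rule, both skews should be close to zero.
skew_at = abs(freqs["A"] - freqs["T"])
skew_cg = abs(freqs["C"] - freqs["G"])
```

For a Bernoulli scheme the rule holds in expectation exactly when the base probabilities are symmetric in this way; Markov chains and concatenated processes require conditions on the transition structure, which this sketch does not cover.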

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Formal grammars can be used for describing complex repeatable structures such as DNA sequences. In this paper, we describe the structural composition of DNA sequences using a context-free stochastic L-grammar. L-grammars are a special class of parallel grammars that can model the growth of living organisms, e.g. plant development, and the morphology of a variety of organisms. We believe that parallel grammars can also be used for modeling genetic mechanisms and sequences such as promoters. Promoters are short regulatory DNA sequences located upstream of a gene. Detection of promoters in DNA sequences is important for successful gene prediction. Promoters can be recognized by certain patterns that are conserved within a species, but there are many exceptions, which makes promoter recognition a complex problem. We replace the problem of promoter recognition by induction of context-free stochastic L-grammar rules, which are later used for the structural analysis of promoter sequences. L-grammar rules are derived automatically from the Drosophila and vertebrate promoter datasets using a genetic programming technique, and their fitness is evaluated using a Support Vector Machine (SVM) classifier. The artificial promoter sequences generated using the derived L-grammar rules are analyzed and compared with natural promoter sequences.
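
To make the notion of a context-free stochastic L-grammar concrete, here is a minimal Python sketch of parallel rewriting with probabilistic rules. The rule set, the weights and the function names are hypothetical and are not the rules induced in the paper; the sketch only shows the mechanism of rewriting every symbol simultaneously in each generation.

```python
import random

def rewrite(symbol, rules, rng):
    """Pick one replacement for a symbol; symbols without rules are copied unchanged."""
    options = rules.get(symbol)
    if not options:
        return symbol
    replacements = [replacement for replacement, _ in options]
    weights = [weight for _, weight in options]
    return rng.choices(replacements, weights=weights, k=1)[0]

def derive(axiom, rules, generations, rng):
    """Apply the rules to all symbols in parallel, once per generation."""
    word = axiom
    for _ in range(generations):
        word = "".join(rewrite(symbol, rules, rng) for symbol in word)
    return word

# Hypothetical stochastic rules over the DNA alphabet (weights need not sum to 1).
dna_rules = {
    "A": [("AT", 0.6), ("A", 0.4)],
    "T": [("TA", 0.5), ("G", 0.5)],
    "G": [("GC", 0.7), ("G", 0.3)],
}
artificial = derive("TATA", dna_rules, generations=4, rng=random.Random(1))
```

With deterministic rules the same function reproduces a classical L-system: `derive("A", {"A": [("AB", 1.0)], "B": [("A", 1.0)]}, 3, random.Random(0))` yields `"ABAAB"`.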