895 results for Homogeneous and Non-Homogeneous Markov Chains
Abstract:
In this thesis we consider systems of finitely many particles moving on paths given by a strong Markov process and undergoing branching and reproduction at random times. The branching rate of a particle, its number of offspring and their spatial distribution are allowed to depend on the particle's position and possibly on the configuration of coexisting particles. In addition there is immigration of new particles, with the rate of immigration and the distribution of immigrants possibly depending on the configuration of pre-existing particles as well. In the first two chapters of this work, we concentrate on the case that the joint motion of particles is governed by a diffusion with interacting components. The resulting process of particle configurations was studied by E. Löcherbach (2002, 2004) and is known as a branching diffusion with immigration (BDI). Chapter 1 contains a detailed introduction of the basic model assumptions, in particular an assumption of ergodicity which guarantees that the BDI process is positive Harris recurrent with finite invariant measure on the configuration space. This object and a closely related quantity, namely the invariant occupation measure on the single-particle space, are investigated in Chapter 2 where we study the problem of the existence of Lebesgue-densities with nice regularity properties. For example, it turns out that the existence of a continuous density for the invariant measure depends on the mechanism by which newborn particles are distributed in space, namely whether branching particles reproduce at their death position or their offspring are distributed according to an absolutely continuous transition kernel. In Chapter 3, we assume that the quantities defining the model depend only on the spatial position but not on the configuration of coexisting particles. 
In this framework (which was considered by Höpfner and Löcherbach (2005) in the special case that branching particles reproduce at their death position), the particle motions are independent, and we can allow for more general Markov processes instead of diffusions. The resulting configuration process is a branching Markov process in the sense introduced by Ikeda, Nagasawa and Watanabe (1968), complemented by an immigration mechanism. Generalizing results obtained by Höpfner and Löcherbach (2005), we give sufficient conditions for ergodicity in the sense of positive recurrence of the configuration process and finiteness of the invariant occupation measure in the case of general particle motions and offspring distributions.
Abstract:
This thesis presents some of the most important definitions of a function computable by an algorithm: a first description is given via recursive functions, a second approach is given in terms of Turing machines, and finally Markov algorithms are considered. All these definitions are shown to be equivalent. The thesis concludes with a brief mention of the lambda-K-calculus.
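As an illustration of the third formalism mentioned above, here is a minimal sketch of an interpreter for Markov (normal) algorithms. It is not taken from the thesis; the rule format `(pattern, replacement, is_terminal)`, the function name `run_markov` and the step limit are assumptions made for this example. At each step the first rule whose pattern occurs in the word is applied to the leftmost occurrence; the run stops after a terminal rule or when no rule applies.

```python
from typing import List, Tuple

# Hypothetical rule format for this sketch: (pattern, replacement, is_terminal).
Rule = Tuple[str, str, bool]


def run_markov(rules: List[Rule], word: str, max_steps: int = 10_000) -> str:
    """Run a Markov normal algorithm on `word` and return the final word."""
    for _ in range(max_steps):
        for pattern, replacement, terminal in rules:
            pos = word.find(pattern)  # leftmost occurrence, -1 if absent
            if pos != -1:
                word = word[:pos] + replacement + word[pos + len(pattern):]
                if terminal:
                    return word
                break  # restart the scan from the first rule
        else:
            # No rule matched: the algorithm halts.
            return word
    raise RuntimeError("step limit reached; the algorithm may not terminate")


# Unary addition: deleting "+" concatenates the two blocks of strokes.
UNARY_ADD: List[Rule] = [("+", "", False)]

# Binary-to-unary conversion (a standard textbook rule set).
BINARY_TO_UNARY: List[Rule] = [
    ("|0", "0||", False),
    ("1", "0|", False),
    ("0", "", False),
]

if __name__ == "__main__":
    print(run_markov(UNARY_ADD, "|||+||"))     # |||||
    print(run_markov(BINARY_TO_UNARY, "101"))  # |||||
```

The step limit is only a safeguard for the sketch: a Markov algorithm need not terminate, which is consistent with its equivalence to Turing machines.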
Abstract:
Communication is currently shifting from a centralized scenario, in which media such as newspapers, radio and TV programs produce information and people are merely consumers, to a decentralized scenario in which anyone can potentially produce information through social networks, blogs and forums that allow real-time worldwide information exchange. As a result of their widespread diffusion, these new instruments have started to play an important socio-economic role. They are the most used communication media and, as a consequence, they constitute the main source of information that enterprises, political parties and other organizations can rely on. Analyzing data stored in servers all over the world is feasible by means of Text Mining techniques such as Sentiment Analysis, which aims to extract opinions from huge amounts of unstructured text. This could make it possible to determine, for instance, the degree of user satisfaction with products, services, politicians and so on. In this context, this dissertation presents new Document Sentiment Classification methods based on the mathematical theory of Markov chains. All these approaches rely on a Markov chain based model, which is language independent and whose key features are simplicity and generality, which make it interesting compared with previous, more sophisticated techniques. Every technique discussed has been tested in both Single-Domain and Cross-Domain Sentiment Classification, and its performance has been compared with that of two previous works. The analysis shows that some of the examined algorithms produce results comparable with the best methods in the literature, for both single-domain and cross-domain tasks, in 2-class (i.e. positive and negative) Document Sentiment Classification. 
However, there is still room for improvement: this work also indicates how performance could be enhanced, namely that a good novel feature selection process would be enough to outperform the state of the art. Furthermore, since some of the proposed approaches show promising results in 2-class Single-Domain Sentiment Classification, future work will validate these results in tasks with more than 2 classes.
Abstract:
This work introduces the concept of a hidden Markov chain: a pair of stochastic processes (X,O), where X is a Markov chain that cannot be observed directly and O is the stochastic process of observations, which at each instant depends only on the current state of the chain X. First, we illustrate methods for solving three classical problems, given a hidden Markov model and a sequence of observed signals: evaluating the probability of the observation under the model, finding the most probable hidden sequence of states, and updating the model to make the observation more probable. Second, the model is applied to stochastic games, in the case where only one of the players does not know which game is being played at each turn but can try to obtain useful information by observing the moves of the informed opponent. In particular, strategies based on the concept of hidden Markov chains are sought, and the results obtained are analyzed to assess the efficiency of the approach.
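The first two classical problems listed above are usually solved with the forward algorithm and the Viterbi algorithm. The sketch below is a generic illustration, not code from the thesis; the two-state model (`STATES`, `START`, `TRANS`, `EMIT`) and its numbers are invented for the example. The third problem (re-estimating the model, typically via Baum-Welch) is not covered here.

```python
from typing import Dict, List, Sequence, Tuple

Dist = Dict[str, float]

# Toy model, assumed purely for illustration.
STATES: List[str] = ["A", "B"]
START: Dist = {"A": 0.6, "B": 0.4}
TRANS: Dict[str, Dist] = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT: Dict[str, Dist] = {"A": {"x": 0.5, "y": 0.5}, "B": {"x": 0.1, "y": 0.9}}


def forward(obs: Sequence[str], states: List[str], start: Dist,
            trans: Dict[str, Dist], emit: Dict[str, Dist]) -> float:
    """Probability of the observation sequence under the model."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {
            s: emit[s][o] * sum(alpha[r] * trans[r][s] for r in states)
            for s in states
        }
    return sum(alpha.values())


def viterbi(obs: Sequence[str], states: List[str], start: Dist,
            trans: Dict[str, Dist], emit: Dict[str, Dist]
            ) -> Tuple[List[str], float]:
    """Most probable hidden state path and its joint probability."""
    delta = {s: start[s] * emit[s][obs[0]] for s in states}
    backpointers: List[Dict[str, str]] = []
    for o in obs[1:]:
        prev = delta
        pointers: Dict[str, str] = {}
        delta = {}
        for s in states:
            best = max(states, key=lambda r: prev[r] * trans[r][s])
            pointers[s] = best
            delta[s] = prev[best] * trans[best][s] * emit[s][o]
        backpointers.append(pointers)
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]


if __name__ == "__main__":
    observed = ["x", "y"]
    print(forward(observed, STATES, START, TRANS, EMIT))
    print(viterbi(observed, STATES, START, TRANS, EMIT))
```

For long observation sequences the products underflow, so a practical implementation would work with log-probabilities or rescale `alpha` at each step; the sketch omits this for readability.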
Abstract:
The topics covered in this thesis are reversible Markov chains and some applications to the Markov chain Monte Carlo method. First, some fundamental properties of Markov chains, and in particular of reversible Markov chains, are described. Next, the Markov chain Monte Carlo method is described, which, through the simulation of Markov chains, seeks to estimate the distribution of a random variable or of a vector of random variables with a given probability distribution. The final part is devoted to an example in which some of the aspects studied in the thesis are illustrated using Matlab.
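The link between reversibility and Markov chain Monte Carlo can be illustrated with a Metropolis chain on a small finite state space. The sketch below is written in Python rather than the Matlab used in the thesis, and it is not the thesis's example; the ring-shaped state space, the neighbour proposal and the function names are assumptions for illustration. The chain is built so that detailed balance, pi_i P_ij = pi_j P_ji, holds for the target distribution pi proportional to the given weights.

```python
import random
from typing import List


def metropolis_matrix(weights: List[float]) -> List[List[float]]:
    """Transition matrix of a Metropolis chain on states 0..n-1 on a ring.

    Assumed proposal: move to the left or right neighbour with
    probability 1/2 each; accept with probability min(1, w_j / w_i).
    """
    n = len(weights)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in ((i - 1) % n, (i + 1) % n):
            P[i][j] += 0.5 * min(1.0, weights[j] / weights[i])
        # Rejected proposals keep the chain in place.
        P[i][i] += 1.0 - sum(P[i])
    return P


def simulate(weights: List[float], steps: int, seed: int = 0) -> List[float]:
    """Run the same chain and return the empirical state frequencies."""
    rng = random.Random(seed)
    n = len(weights)
    state = 0
    counts = [0] * n
    for _ in range(steps):
        proposal = (state + rng.choice((-1, 1))) % n
        if rng.random() < min(1.0, weights[proposal] / weights[state]):
            state = proposal
        counts[state] += 1
    return [c / steps for c in counts]


if __name__ == "__main__":
    w = [1.0, 2.0, 3.0, 4.0]
    pi = [x / sum(w) for x in w]
    P = metropolis_matrix(w)
    # Detailed balance check: largest violation over all state pairs.
    print(max(abs(pi[i] * P[i][j] - pi[j] * P[j][i])
              for i in range(4) for j in range(4)))
    print(simulate(w, 50_000))  # should be close to pi
```

Only the ratios of the weights enter the acceptance step, so the normalizing constant of the target distribution is never needed.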
Abstract:
This thesis belongs to the study of stochastic models applied to DNA sequences. Random walks and Markov chains are among the random processes most widely used in applications, thanks to their ability to capture the salient features of many complex systems while keeping their description simple. Specifically, the discussion focuses on applying these processes to the statistical analysis of genomic sequences. To a first approximation, DNA can be represented as a sequence of nucleotides that is well reproduced by the Markov chain model; this is the starting point for studying the statistical properties of DNA chains. The topic is developed by analyzing a study that aims to characterize DNA sequences through the distributions of inter-dinucleotide distances. Its results are discussed in order to show the potential of these models to bring out relevant features in other fields, in this case biology.
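The two ingredients named above can be sketched in a few lines: estimating first-order Markov transition probabilities from a nucleotide sequence, and tabulating the distances between consecutive occurrences of a given dinucleotide. This is an illustrative sketch only. The exact definition of inter-dinucleotide distance used in the study (for example, how overlapping occurrences or reading frames are handled) is not given in the abstract, so the definition below, like the function names and the toy sequence, is an assumption.

```python
from collections import Counter, defaultdict
from typing import Dict


def transition_probabilities(seq: str) -> Dict[str, Dict[str, float]]:
    """Maximum-likelihood first-order transition probabilities."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    return {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in counts.items()
    }


def inter_dinucleotide_distances(seq: str, dinucleotide: str) -> Counter:
    """Histogram of gaps between consecutive occurrences of a dinucleotide.

    Assumed definition: the distance is the difference between the start
    positions of two successive (possibly overlapping) occurrences.
    """
    positions = [i for i in range(len(seq) - 1)
                 if seq[i:i + 2] == dinucleotide]
    return Counter(b - a for a, b in zip(positions, positions[1:]))


if __name__ == "__main__":
    toy = "ACGACGTACG"  # invented sequence, not real genomic data
    print(transition_probabilities(toy))
    print(inter_dinucleotide_distances(toy, "CG"))
```

Comparing the observed distance histogram with the one implied by the fitted Markov chain is one way such an analysis could reveal structure the simple model does not capture.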
Abstract:
We propose a new and clinically oriented approach to perform atlas-based segmentation of brain tumor images. A mesh-free method is used to model tumor-induced soft tissue deformations in a healthy brain atlas image with subsequent registration of the modified atlas to a pathologic patient image. The atlas is seeded with a tumor position prior and tumor growth simulating the tumor mass effect is performed with the aim of improving the registration accuracy in case of patients with space-occupying lesions. We perform tests on 2D axial slices of five different patient data sets and show that the approach gives good results for the segmentation of white matter, grey matter, cerebrospinal fluid and the tumor.
Abstract:
We present an automatic method to segment brain tissues from volumetric MRI brain tumor images. The method is based on non-rigid registration of an average atlas in combination with a biomechanically justified tumor growth model to simulate soft-tissue deformations caused by the tumor mass-effect. The tumor growth model, which is formulated as a mesh-free Markov Random Field energy minimization problem, ensures correspondence between the atlas and the patient image, prior to the registration step. The method is non-parametric, simple and fast compared to other approaches while maintaining similar accuracy. It has been evaluated qualitatively and quantitatively with promising results on eight datasets comprising simulated images and real patient data.
Abstract:
An important aspect of the QTL mapping problem is the treatment of missing genotype data. If complete genotype data were available, QTL mapping would reduce to the problem of model selection in linear regression. However, in the consideration of loci in the intervals between the available genetic markers, genotype data is inherently missing. Even at the typed genetic markers, genotype data is seldom complete, as a result of failures in the genotyping assays or for the sake of economy (for example, in the case of selective genotyping, where only individuals with extreme phenotypes are genotyped). We discuss the use of algorithms developed for hidden Markov models (HMMs) to deal with the missing genotype data problem.
Abstract:
In Malani and Neilsen (1992) we proposed alternative estimates of the survival function (for time to disease) using a simple marker that describes time to some intermediate stage in a disease process. In this paper we derive the asymptotic variance of one such proposed estimator using two different methods and compare terms of order 1/n when there is no censoring. In the absence of censoring, the asymptotic variance obtained using the Greenwood-type approach converges to the exact variance up to terms involving 1/n. However, the asymptotic variance obtained using the theory of counting processes and results from Voelkel and Crowley (1984) on semi-Markov processes has a different term of order 1/n. It is not clear to us at this point why the variance formulae using the latter approach give different results.
Abstract:
Genomic alterations have been linked to the development and progression of cancer. The technique of Comparative Genomic Hybridization (CGH) yields data consisting of fluorescence intensity ratios of test and reference DNA samples. The intensity ratios provide information about the number of copies in DNA. Practical issues such as the contamination of tumor cells in tissue specimens and normalization errors necessitate the use of statistics for learning about the genomic alterations from array-CGH data. As increasing amounts of array CGH data become available, there is a growing need for automated algorithms for characterizing genomic profiles. Specifically, there is a need for algorithms that can identify gains and losses in the number of copies based on statistical considerations, rather than merely detect trends in the data. We adopt a Bayesian approach, relying on the hidden Markov model to account for the inherent dependence in the intensity ratios. Posterior inferences are made about gains and losses in copy number. Localized amplifications (associated with oncogene mutations) and deletions (associated with mutations of tumor suppressors) are identified using posterior probabilities. Global trends such as extended regions of altered copy number are detected. Since the posterior distribution is analytically intractable, we implement a Metropolis-within-Gibbs algorithm for efficient simulation-based inference. Publicly available data on pancreatic adenocarcinoma, glioblastoma multiforme and breast cancer are analyzed, and comparisons are made with some widely-used algorithms to illustrate the reliability and success of the technique.
Abstract:
Latent class regression models are useful tools for assessing associations between covariates and latent variables. However, evaluation of key model assumptions cannot be performed using methods from standard regression models due to the unobserved nature of latent outcome variables. This paper presents graphical diagnostic tools to evaluate whether or not latent class regression models adhere to standard assumptions of the model: conditional independence and non-differential measurement. An integral part of these methods is the use of a Markov Chain Monte Carlo estimation procedure. Unlike standard maximum likelihood implementations for latent class regression model estimation, the MCMC approach allows us to calculate posterior distributions and point estimates of any functions of parameters. It is this convenience that allows us to provide the diagnostic methods that we introduce. As a motivating example we present an analysis focusing on the association between depression and socioeconomic status, using data from the Epidemiologic Catchment Area study. We consider a latent class regression analysis investigating the association between depression and socioeconomic status measures, where the latent variable depression is regressed on education and income indicators, in addition to age, gender, and marital status variables. While the fitted latent class regression model yields interesting results, the model parameters are found to be invalid due to the violation of model assumptions. The violation of these assumptions is clearly identified by the presented diagnostic plots. These methods can be applied to standard latent class and latent class regression models, and the general principle can be extended to evaluate model assumptions in other types of models.
Abstract:
Markov chain Monte Carlo is a method of producing a correlated sample in order to estimate features of a complicated target distribution via simple ergodic averages. A fundamental question in MCMC applications is when should the sampling stop? That is, when are the ergodic averages good estimates of the desired quantities? We consider a method that stops the MCMC sampling the first time the width of a confidence interval based on the ergodic averages is less than a user-specified value. Hence calculating Monte Carlo standard errors is a critical step in assessing the output of the simulation. In particular, we consider the regenerative simulation and batch means methods of estimating the variance of the asymptotic normal distribution. We describe sufficient conditions for the strong consistency and asymptotic normality of both methods and investigate their finite sample properties in a variety of examples.
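The fixed-width stopping rule described above can be sketched with the batch means estimator of the Monte Carlo standard error. The code below is a generic illustration, not the authors' implementation. The square-root batch count, the normal quantile 1.96, the checking schedule, the use of a half-width rather than a full width, and the AR(1) test chain are all assumptions made for the example.

```python
import math
import random
from typing import Callable, List, Optional, Tuple


def batch_means_se(draws: List[float], n_batches: Optional[int] = None) -> float:
    """Batch means estimate of the standard error of the sample mean."""
    n = len(draws)
    if n_batches is None:
        n_batches = int(math.sqrt(n))  # assumed default: about sqrt(n) batches
    if n_batches < 2 or n // n_batches < 1:
        raise ValueError("need at least two non-empty batches")
    size = n // n_batches
    means = [sum(draws[k * size:(k + 1) * size]) / size
             for k in range(n_batches)]
    grand = sum(means) / n_batches
    var_of_means = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    return math.sqrt(var_of_means / n_batches)


def run_until_precise(step: Callable[[float], float], x0: float,
                      half_width: float, min_draws: int = 1000,
                      check_every: int = 1000, max_draws: int = 1_000_000,
                      z: float = 1.96) -> Tuple[float, float, int]:
    """Sample until z * SE drops below `half_width` (or max_draws is hit)."""
    draws: List[float] = []
    x = x0
    while len(draws) < max_draws:
        x = step(x)
        draws.append(x)
        n = len(draws)
        if n >= min_draws and n % check_every == 0:
            if z * batch_means_se(draws) < half_width:
                break
    return sum(draws) / len(draws), z * batch_means_se(draws), len(draws)


def make_ar1_step(rho: float, seed: int) -> Callable[[float], float]:
    """AR(1) chain x' = rho * x + N(0, 1), used here as a stand-in sampler."""
    rng = random.Random(seed)
    return lambda x: rho * x + rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    mean, hw, n = run_until_precise(make_ar1_step(0.5, seed=1), 0.0, 0.1)
    print(f"mean={mean:.3f}  half-width={hw:.3f}  draws={n}")
```

Checking the criterion only every `check_every` draws keeps the cost of recomputing the batch means small relative to the sampling itself.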