902 resultados para Markov chains hidden Markov models Viterbi algorithm Forward-Backward algorithm maximum likelihood


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Multitype branching processes (MTBP) model branching structures, where the nodes of the resulting tree are particles of different types. Usually such a process is not observable in the sense of the whole tree, but only as the “generation” at a given moment in time, which consists of the number of particles of every type. This requires an EM-type algorithm to obtain a maximum likelihood (ML) estimate of the parameters of the branching process. Using a version of the inside-outside algorithm for stochastic context-free grammars (SCFG), such an estimate could be obtained for the offspring distribution of the process.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a novel approach to the computation of primitive geometrical structures, where no prior knowledge about the visual scene is available and a high level of noise is expected. We based our work on the grouping principles of proximity and similarity, of points and preliminary models. The former was realized using Minimum Spanning Trees (MST), on which we apply a stable alignment and goodness of fit criteria. As for the latter, we used spectral clustering of preliminary models. The algorithm can be generalized to various model fitting settings, without tuning of run parameters. Experiments demonstrate the significant improvement in the localization accuracy of models in plane, homography and motion segmentation examples. The efficiency of the algorithm is not dependent on fine tuning of run parameters like most others in the field.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

2000 Mathematics Subject Classification: 60J80, 60J85, 62P10, 92D25.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The correlated probit model is frequently used for multiple ordered data since it allows to incorporate seamlessly different correlation structures. The estimation of the probit model parameters based on direct maximization of the limited information maximum likelihood is a numerically intensive procedure. We propose an extension of the EM algorithm for obtaining maximum likelihood estimates for a correlated probit model for multiple ordinal outcomes. The algorithm is implemented in the free software environment for statistical computing and graphics R. We present two simulation studies to examine the performance of the developed algorithm. We apply the model to data on 121 women with cervical or endometrial cancer. Patients developed normal tissue reactions as a result of post-operative external beam pelvic radiotherapy. In this work we focused on modeling the effects of a genetic factor on early skin and early urogenital tissue reactions and on assessing the strength of association between the two types of reactions. We established that there was an association between skin reactions and polymorphism XRCC3 codon 241 (C>T) (rs861539) and that skin and urogenital reactions were positively correlated. ACM Computing Classification System (1998): G.3.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

My dissertation has three chapters which develop and apply microeconometric tech- niques to empirically relevant problems. All the chapters examines the robustness issues (e.g., measurement error and model misspecification) in the econometric anal- ysis. The first chapter studies the identifying power of an instrumental variable in the nonparametric heterogeneous treatment effect framework when a binary treat- ment variable is mismeasured and endogenous. I characterize the sharp identified set for the local average treatment effect under the following two assumptions: (1) the exclusion restriction of an instrument and (2) deterministic monotonicity of the true treatment variable in the instrument. The identification strategy allows for general measurement error. Notably, (i) the measurement error is nonclassical, (ii) it can be endogenous, and (iii) no assumptions are imposed on the marginal distribution of the measurement error, so that I do not need to assume the accuracy of the measure- ment. Based on the partial identification result, I provide a consistent confidence interval for the local average treatment effect with uniformly valid size control. I also show that the identification strategy can incorporate repeated measurements to narrow the identified set, even if the repeated measurements themselves are endoge- nous. Using the the National Longitudinal Study of the High School Class of 1972, I demonstrate that my new methodology can produce nontrivial bounds for the return to college attendance when attendance is mismeasured and endogenous.

The second chapter, which is a part of a coauthored project with Federico Bugni, considers the problem of inference in dynamic discrete choice problems when the structural model is locally misspecified. We consider two popular classes of estimators for dynamic discrete choice models: K-step maximum likelihood estimators (K-ML) and K-step minimum distance estimators (K-MD), where K denotes the number of policy iterations employed in the estimation problem. These estimator classes include popular estimators such as Rust (1987)’s nested fixed point estimator, Hotz and Miller (1993)’s conditional choice probability estimator, Aguirregabiria and Mira (2002)’s nested algorithm estimator, and Pesendorfer and Schmidt-Dengler (2008)’s least squares estimator. We derive and compare the asymptotic distributions of K- ML and K-MD estimators when the model is arbitrarily locally misspecified and we obtain three main results. In the absence of misspecification, Aguirregabiria and Mira (2002) show that all K-ML estimators are asymptotically equivalent regardless of the choice of K. Our first result shows that this finding extends to a locally misspecified model, regardless of the degree of local misspecification. As a second result, we show that an analogous result holds for all K-MD estimators, i.e., all K- MD estimator are asymptotically equivalent regardless of the choice of K. Our third and final result is to compare K-MD and K-ML estimators in terms of asymptotic mean squared error. Under local misspecification, the optimally weighted K-MD estimator depends on the unknown asymptotic bias and is no longer feasible. In turn, feasible K-MD estimators could have an asymptotic mean squared error that is higher or lower than that of the K-ML estimators. To demonstrate the relevance of our asymptotic analysis, we illustrate our findings using in a simulation exercise based on a misspecified version of Rust (1987) bus engine problem.

The last chapter investigates the causal effect of the Omnibus Budget Reconcil- iation Act of 1993, which caused the biggest change to the EITC in its history, on unemployment and labor force participation among single mothers. Unemployment and labor force participation are difficult to define for a few reasons, for example, be- cause of marginally attached workers. Instead of searching for the unique definition for each of these two concepts, this chapter bounds unemployment and labor force participation by observable variables and, as a result, considers various competing definitions of these two concepts simultaneously. This bounding strategy leads to partial identification of the treatment effect. The inference results depend on the construction of the bounds, but they imply positive effect on labor force participa- tion and negligible effect on unemployment. The results imply that the difference- in-difference result based on the BLS definition of unemployment can be misleading

due to misclassification of unemployment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

While molecular and cellular processes are often modeled as stochastic processes, such as Brownian motion, chemical reaction networks and gene regulatory networks, there are few attempts to program a molecular-scale process to physically implement stochastic processes. DNA has been used as a substrate for programming molecular interactions, but its applications are restricted to deterministic functions and unfavorable properties such as slow processing, thermal annealing, aqueous solvents and difficult readout limit them to proof-of-concept purposes. To date, whether there exists a molecular process that can be programmed to implement stochastic processes for practical applications remains unknown.

In this dissertation, a fully specified Resonance Energy Transfer (RET) network between chromophores is accurately fabricated via DNA self-assembly, and the exciton dynamics in the RET network physically implement a stochastic process, specifically a continuous-time Markov chain (CTMC), which has a direct mapping to the physical geometry of the chromophore network. Excited by a light source, a RET network generates random samples in the temporal domain in the form of fluorescence photons which can be detected by a photon detector. The intrinsic sampling distribution of a RET network is derived as a phase-type distribution configured by its CTMC model. The conclusion is that the exciton dynamics in a RET network implement a general and important class of stochastic processes that can be directly and accurately programmed and used for practical applications of photonics and optoelectronics. Different approaches to using RET networks exist with vast potential applications. As an entropy source that can directly generate samples from virtually arbitrary distributions, RET networks can benefit applications that rely on generating random samples such as 1) fluorescent taggants and 2) stochastic computing.

By using RET networks between chromophores to implement fluorescent taggants with temporally coded signatures, the taggant design is not constrained by resolvable dyes and has a significantly larger coding capacity than spectrally or lifetime coded fluorescent taggants. Meanwhile, the taggant detection process becomes highly efficient, and the Maximum Likelihood Estimation (MLE) based taggant identification guarantees high accuracy even with only a few hundred detected photons.

Meanwhile, RET-based sampling units (RSU) can be constructed to accelerate probabilistic algorithms for wide applications in machine learning and data analytics. Because probabilistic algorithms often rely on iteratively sampling from parameterized distributions, they can be inefficient in practice on the deterministic hardware traditional computers use, especially for high-dimensional and complex problems. As an efficient universal sampling unit, the proposed RSU can be integrated into a processor / GPU as specialized functional units or organized as a discrete accelerator to bring substantial speedups and power savings.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This work represents an original contribution to the methodology for ecosystem models' development as well as the rst attempt of an end-to-end (E2E) model of the Northern Humboldt Current Ecosystem (NHCE). The main purpose of the developed model is to build a tool for ecosystem-based management and decision making, reason why the credibility of the model is essential, and this can be assessed through confrontation to data. Additionally, the NHCE exhibits a high climatic and oceanographic variability at several scales, the major source of interannual variability being the interruption of the upwelling seasonality by the El Niño Southern Oscillation, which has direct e ects on larval survival and sh recruitment success. Fishing activity can also be highly variable, depending on the abundance and accessibility of the main shery resources. This context brings the two main methodological questions addressed in this thesis, through the development of an end-to-end model coupling the high trophic level model OSMOSE to the hydrodynamics and biogeochemical model ROMS-PISCES: i) how to calibrate ecosystem models using time series data and ii) how to incorporate the impact of the interannual variability of the environment and shing. First, this thesis highlights some issues related to the confrontation of complex ecosystem models to data and proposes a methodology for a sequential multi-phases calibration of ecosystem models. We propose two criteria to classify the parameters of a model: the model dependency and the time variability of the parameters. Then, these criteria along with the availability of approximate initial estimates are used as decision rules to determine which parameters need to be estimated, and their precedence order in the sequential calibration process. Additionally, a new Evolutionary Algorithm designed for the calibration of stochastic models (e.g Individual Based Model) and optimized for maximum likelihood estimation has been developed and applied to the calibration of the OSMOSE model to time series data. The environmental variability is explicit in the model: the ROMS-PISCES model forces the OSMOSE model and drives potential bottom-up e ects up the foodweb through plankton and sh trophic interactions, as well as through changes in the spatial distribution of sh. The latter e ect was taken into account using presence/ absence species distribution models which are traditionally assessed through a confusion matrix and the statistical metrics associated to it. However, when considering the prediction of the habitat against time, the variability in the spatial distribution of the habitat can be summarized and validated using the emerging patterns from the shape of the spatial distributions. We modeled the potential habitat of the main species of the Humboldt Current Ecosystem using several sources of information ( sheries, scienti c surveys and satellite monitoring of vessels) jointly with environmental data from remote sensing and in situ observations, from 1992 to 2008. The potential habitat was predicted over the study period with monthly resolution, and the model was validated using quantitative and qualitative information of the system using a pattern oriented approach. The nal ROMS-PISCES-OSMOSE E2E ecosystem model for the NHCE was calibrated using our evolutionary algorithm and a likelihood approach to t monthly time series data of landings, abundance indices and catch at length distributions from 1992 to 2008. To conclude, some potential applications of the model for shery management are presented and their limitations and perspectives discussed.

Relevância:

100.00% 100.00%

Publicador:

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Entangled quantum states can be given a separable decomposition if we relax the restriction that the local operators be quantum states. Motivated by the construction of classical simulations and local hidden variable models, we construct `smallest' local sets of operators that achieve this. In other words, given an arbitrary bipartite quantum state we construct convex sets of local operators that allow for a separable decomposition, but that cannot be made smaller while continuing to do so. We then consider two further variants of the problem where the local state spaces are required to contain the local quantum states, and obtain solutions for a variety of cases including a region of pure states around the maximally entangled state. The methods involve calculating certain forms of cross norm. Two of the variants of the problem have a strong relationship to theorems on ensemble decompositions of positive operators, and our results thereby give those theorems an added interpretation. The results generalise those obtained in our previous work on this topic [New J. Phys. 17, 093047 (2015)].

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Suppose two or more variables are jointly normally distributed. If there is a common relationship between these variables it would be very important to quantify this relationship by a parameter called the correlation coefficient which measures its strength, and the use of it can develop an equation for predicting, and ultimately draw testable conclusion about the parent population. This research focused on the correlation coefficient ρ for the bivariate and trivariate normal distribution when equal variances and equal covariances are considered. Particularly, we derived the maximum Likelihood Estimators (MLE) of the distribution parameters assuming all of them are unknown, and we studied the properties and asymptotic distribution of . Showing this asymptotic normality, we were able to construct confidence intervals of the correlation coefficient ρ and test hypothesis about ρ. With a series of simulations, the performance of our new estimators were studied and were compared with those estimators that already exist in the literature. The results indicated that the MLE has a better or similar performance than the others.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The semiarid region of northeastern Brazil, the Caatinga, is extremely important due to its biodiversity and endemism. Measurements of plant physiology are crucial to the calibration of Dynamic Global Vegetation Models (DGVMs) that are currently used to simulate the responses of vegetation in face of global changes. In a field work realized in an area of preserved Caatinga forest located in Petrolina, Pernambuco, measurements of carbon assimilation (in response to light and CO2) were performed on 11 individuals of Poincianella microphylla, a native species that is abundant in this region. These data were used to calibrate the maximum carboxylation velocity (Vcmax) used in the INLAND model. The calibration techniques used were Multiple Linear Regression (MLR), and data mining techniques as the Classification And Regression Tree (CART) and K-MEANS. The results were compared to the UNCALIBRATED model. It was found that simulated Gross Primary Productivity (GPP) reached 72% of observed GPP when using the calibrated Vcmax values, whereas the UNCALIBRATED approach accounted for 42% of observed GPP. Thus, this work shows the benefits of calibrating DGVMs using field ecophysiological measurements, especially in areas where field data is scarce or non-existent, such as in the Caatinga

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Crop monitoring and more generally land use change detection are of primary importance in order to analyze spatio-temporal dynamics and its impacts on environment. This aspect is especially true in such a region as the State of Mato Grosso (south of the Brazilian Amazon Basin) which hosts an intensive pioneer front. Deforestation in this region as often been explained by soybean expansion in the last three decades. Remote sensing techniques may now represent an efficient and objective manner to quantify how crops expansion really represents a factor of deforestation through crop mapping studies. Due to the special characteristics of the soybean productions' farms in Mato Grosso (area varying between 1000 hectares and 40000 hectares and individual fields often bigger than 100 hectares), the Moderate Resolution Imaging Spectroradiometer (MODIS) data with a near daily temporal resolution and 250 m spatial resolution can be considered as adequate resources to crop mapping. Especially, multitemporal vegetation indices (VI) studies have been currently used to realize this task [1] [2]. In this study, 16-days compositions of EVI (MODQ13 product) data are used. However, although these data are already processed, multitemporal VI profiles still remain noisy due to cloudiness (which is extremely frequent in a tropical region such as south Amazon Basin), sensor problems, errors in atmospheric corrections or BRDF effect. Thus, many works tried to develop algorithms that could smooth the multitemporal VI profiles in order to improve further classification. The goal of this study is to compare and test different smoothing algorithms in order to select the one which satisfies better to the demand which is classifying crop classes. Those classes correspond to 6 different agricultural managements observed in Mato Grosso through an intensive field work which resulted in mapping more than 1000 individual fields. The agricultural managements above mentioned are based on combination of soy, cotton, corn, millet and sorghum crops sowed in single or double crop systems. Due to the difficulty in separating certain classes because of too similar agricultural calendars, the classification will be reduced to 3 classes : Cotton (single crop), Soy and cotton (double crop), soy (single or double crop with corn, millet or sorghum). The classification will use training data obtained in the 2005-2006 harvest and then be tested on the 2006-2007 harvest. In a first step, four smoothing techniques are presented and criticized. Those techniques are Best Index Slope Extraction (BISE) [3], Mean Value Iteration (MVI) [4], Weighted Least Squares (WLS) [5] and Savitzky-Golay Filter (SG) [6] [7]. These techniques are then implemented and visually compared on a few individual pixels so that it allows doing a first selection between the five studied techniques. The WLS and SG techniques are selected according to criteria proposed by [8]. Those criteria are: ability in eliminating frequent noises, conserving the upper values of the VI profiles and keeping the temporality of the profiles. Those selected algorithms are then programmed and applied to the MODIS/TERRA EVI data (16-days composition periods). Tests of separability are realized based on the Jeffries-Matusita distance in order to see if the algorithms managed in improving the potential of differentiation between the classes. Those tests are realized on the overall profile (comprising 23 MODIS images) as well as on each MODIS sub-period of the profile [1]. This last test is a double interest process because it allows comparing the smoothing techniques and also enables to select a set of images which carries more information on the separability between the classes. Those selected dates can then be used to realize a supervised classification. Here three different classifiers are tested to evaluate if the smoothing techniques as a particular effect on the classification depending on the classifiers used. Those classifiers are Maximum Likelihood classifier, Spectral Angle Mapper (SAM) classifier and CHAID Improved Decision tree. It appears through the separability tests on the overall process that the smoothed profiles don't improve efficiently the potential of discrimination between classes when compared with the original data. However, the same tests realized on the MODIS sub-periods show better results obtained with the smoothed algorithms. The results of the classification confirm this first analyze. The Kappa coefficients are always better with the smoothing techniques and the results obtained with the WLS and SG smoothed profiles are nearly equal. However, the results are different depending on the classifier used. The impact of the smoothing algorithms is much better while using the decision tree model. Indeed, it allows a gain of 0.1 in the Kappa coefficient. While using the Maximum Likelihood end SAM models, the gain remains positive but is much lower (Kappa improved of 0.02 only). Thus, this work's aim is to prove the utility in smoothing the VI profiles in order to improve the final results. However, the choice of the smoothing algorithm has to be made considering the original data used and the classifier models used. In that case the Savitzky-Golay filter gave the better results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This thesis provides a necessary and sufficient condition for asymptotic efficiency of a nonparametric estimator of the generalised autocovariance function of a Gaussian stationary random process. The generalised autocovariance function is the inverse Fourier transform of a power transformation of the spectral density, and encompasses the traditional and inverse autocovariance functions. Its nonparametric estimator is based on the inverse discrete Fourier transform of the same power transformation of the pooled periodogram. The general result is then applied to the class of Gaussian stationary ARMA processes and its implications are discussed. We illustrate that for a class of contrast functionals and spectral densities, the minimum contrast estimator of the spectral density satisfies a Yule-Walker system of equations in the generalised autocovariance estimator. Selection of the pooling parameter, which characterizes the nonparametric estimator of the generalised autocovariance, controlling its resolution, is addressed by using a multiplicative periodogram bootstrap to estimate the finite-sample distribution of the estimator. A multivariate extension of recently introduced spectral models for univariate time series is considered, and an algorithm for the coefficients of a power transformation of matrix polynomials is derived, which allows to obtain the Wold coefficients from the matrix coefficients characterizing the generalised matrix cepstral models. This algorithm also allows the definition of the matrix variance profile, providing important quantities for vector time series analysis. A nonparametric estimator based on a transformation of the smoothed periodogram is proposed for estimation of the matrix variance profile.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Universidade Estadual de Campinas . Faculdade de Educação Física

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dados de bovinos compostos foram analisados para avaliar o efeito da epistasia nos modelos de avaliação genética. As características analisadas foram os pesos aos 205 (P205) e 390 dias (P390) e perímetro escrotal aos 390 dias (PE390). As análises foram realizadas pela metodologia de máxima verossimilhança considerando-se dois modelos: o modelo 1 incluiu como covariáveis os efeitos aditivos diretos e maternos, e os não aditivos das heterozigoses para os efeitos diretos e para o materno total, e o modelo 2 considerou também o efeito direto de epistasia. Para comparação dos modelos, foram utilizados o critério de informação de Akaike (AIC) e o critério de informação Bayesiano de Schwartz (BIC), e o teste de razão de verossimilhança. A inclusão da epistasia no modelo de avaliação genética pouco alterou as estimativas de componentes de (co)variâncias genéticas aditivas e, consequentemente, as herdabilidades. O teste de verossimilhança e o critério de Akaike sugeriram que o modelo 2, que inclui a epistasia, apresentou maior aderência aos dados para todas as características analisadas. O critério BIC indicou este modelo como o melhor apenas para P205. Para análise genética dessa população, o modelo que considerou o efeito de epistasia foi o mais adequado.