888 results for likelihood-based inference


Relevance: 90.00%

Abstract:

Heterogeneous datasets arise naturally in most applications due to the use of a variety of sensors and measuring platforms. Such datasets can be heterogeneous in terms of both error characteristics and sensor models. Treating such data is most naturally accomplished using a Bayesian or model-based geostatistical approach; however, such methods generally scale poorly with dataset size and require computationally expensive Monte Carlo based inference. Recently, within the machine learning and spatial statistics communities, many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation chosen so that information loss is minimal. In this paper a sequential Bayesian framework for inference in such projected processes is presented. The observations are considered one at a time, which avoids the need for the high dimensional integrals typically required in a Bayesian approach. A C++ library, gptk, which is part of the INTAMAP web service, is introduced; it implements projected, sequential estimation and adds several novel features. In particular, the library includes the ability to use a generic observation operator, or sensor model, to permit data fusion. It is also possible to cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the covariance parameters is explored, including the impact of the projected process approximation on likelihood profiles. We illustrate the projected sequential method in application to synthetic and real datasets. Limitations and extensions are discussed. © 2010 Elsevier Ltd.

Relevance: 80.00%

Abstract:

The use of graphics processing unit (GPU) parallel processing is becoming a part of mainstream statistical practice. The reliance of Bayesian statistics on Markov Chain Monte Carlo (MCMC) methods makes the applicability of parallel processing not immediately obvious. It is illustrated that computing the likelihood using GPU parallel processing yields substantial reductions in computational time for MCMC and other methods of evaluation. Examples use data from the Global Terrorism Database to model terrorist activity in Colombia from 2000 through 2010 and a likelihood based on the explicit convolution of two negative-binomial processes. Results show decreases in computational time by a factor of over 200. Factors influencing these improvements and guidelines for programming parallel implementations of the likelihood are discussed.
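
As a rough illustration of the kind of likelihood the abstract above describes, the following sketch evaluates the probability mass function of a sum of two independent negative-binomial counts by explicit convolution. The parameterisation (number of failures before r successes, success probability p), the function names and the plain CPU loop are assumptions made for illustration; the paper's actual model and GPU code are not reproduced here. Each term of the sum, and each observation's contribution, is independent of the others, which is what makes this computation a natural candidate for parallel evaluation.

```python
import math


def nb_logpmf(k, r, p):
    """Log pmf of a negative binomial: k failures before r successes,
    success probability p (r may be non-integer). Assumed parameterisation."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))


def conv_nb_pmf(y, r1, p1, r2, p2):
    """P(X1 + X2 = y) for independent negative binomials, by explicit convolution."""
    return sum(
        math.exp(nb_logpmf(k, r1, p1) + nb_logpmf(y - k, r2, p2))
        for k in range(y + 1)
    )


def log_likelihood(counts, r1, p1, r2, p2):
    """Log-likelihood of observed counts; every term is independent of the rest."""
    return sum(math.log(conv_nb_pmf(y, r1, p1, r2, p2)) for y in counts)
```

A convenient sanity check is that when both components share the same p, the convolution reduces to a single negative binomial with shape r1 + r2.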

Relevance: 80.00%

Abstract:

Techniques for evaluating and selecting multivariate volatility forecasts are not yet understood as well as their univariate counterparts. This paper considers the ability of different loss functions to discriminate between a set of competing forecasting models which are subsequently applied in a portfolio allocation context. It is found that a likelihood-based loss function outperforms its competitors, including those based on the given portfolio application. This result indicates that considering the particular application of forecasts is not necessarily the most effective basis on which to select models.

Relevance: 80.00%

Abstract:

We present a new algorithm to compute the voxel-wise genetic contribution to brain fiber microstructure using diffusion tensor imaging (DTI) in a dataset of 25 monozygotic (MZ) and 25 dizygotic (DZ) twin pairs (100 subjects total). First, the structural and DT scans were linearly co-registered. Structural MR scans were nonlinearly mapped via a 3D fluid transformation to a geometrically centered mean template, and the deformation fields were applied to the DTI volumes. After re-orienting the tensors to realign them to the anatomy, we computed several scalar and multivariate DT-derived measures including the geodesic anisotropy (GA), the tensor eigenvalues and the full diffusion tensors. A covariance-weighted distance was measured between twins in the Log-Euclidean framework [2], and used as input to a maximum-likelihood based algorithm to compute the contributions from genetics (A), common environmental factors (C) and unique environmental ones (E) to fiber architecture. Quantitative genetic studies can take advantage of the full information in the diffusion tensor, using covariance-weighted distances and statistics on the tensor manifold.

Relevance: 80.00%

Abstract:

The aim of this paper is to provide a Bayesian formulation of the so-called magnitude-based inference approach to quantifying and interpreting effects, and in a case study example provide accurate probabilistic statements that correspond to the intended magnitude-based inferences. The model is described in the context of a published small-scale athlete study which employed a magnitude-based inference approach to compare the effect of two altitude training regimens (live high-train low (LHTL), and intermittent hypoxic exposure (IHE)) on running performance and blood measurements of elite triathletes. The posterior distributions, and corresponding point and interval estimates, for the parameters and associated effects and comparisons of interest, were estimated using Markov chain Monte Carlo simulations. The Bayesian analysis was shown to provide more direct probabilistic comparisons of treatments and to be able to identify small effects of interest. The approach avoided asymptotic assumptions and overcame issues such as multiple testing. Bayesian analysis of unscaled effects showed a probability of 0.96 that LHTL yields a substantially greater increase in hemoglobin mass than IHE, a 0.93 probability of a substantially greater improvement in running economy and a greater than 0.96 probability that both IHE and LHTL yield a substantially greater improvement in maximum blood lactate concentration compared to a Placebo. The conclusions are consistent with those obtained using a ‘magnitude-based inference’ approach that has been promoted in the field. The paper demonstrates that a fully Bayesian analysis is a simple and effective way of analysing small effects, providing a rich set of results that are straightforward to interpret in terms of probabilistic statements.

Relevance: 80.00%

Abstract:

Approximate Bayesian computation (ABC) is a popular technique for analysing data for complex models where the likelihood function is intractable. It involves using simulation from the model to approximate the likelihood, with this approximate likelihood then being used to construct an approximate posterior. In this paper, we consider methods that estimate the parameters by maximizing the approximate likelihood used in ABC. We give a theoretical analysis of the asymptotic properties of the resulting estimator. In particular, we derive results analogous to those of consistency and asymptotic normality for standard maximum likelihood estimation. We also discuss how sequential Monte Carlo methods provide a natural method for implementing our likelihood-based ABC procedures.
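
The idea of maximising an ABC-style approximate likelihood can be sketched on a toy problem. Everything specific below is an assumption chosen for illustration and is not taken from the paper: the model is Normal(theta, 1) with the sample mean as summary statistic, the approximate likelihood is the fraction of simulated summaries falling within a tolerance eps of the observed one (a uniform kernel), and the maximisation is a plain grid search rather than sequential Monte Carlo.

```python
import random


def simulate_summary(theta, n, rng):
    """Simulate n draws from the toy model Normal(theta, 1) and return their mean."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n)) / n


def abc_likelihood(theta, s_obs, n, eps, n_sims, rng):
    """Approximate likelihood: share of simulated summaries within eps of s_obs."""
    hits = sum(
        abs(simulate_summary(theta, n, rng) - s_obs) <= eps
        for _ in range(n_sims)
    )
    return hits / n_sims


def abc_mle(s_obs, n, grid, eps=0.1, n_sims=300, seed=0):
    """Return the grid value of theta that maximises the approximate likelihood."""
    rng = random.Random(seed)
    return max(grid, key=lambda th: abc_likelihood(th, s_obs, n, eps, n_sims, rng))
```

For example, `abc_mle(1.0, 50, [i / 10 for i in range(0, 21)])` searches theta over 0.0 to 2.0 for an observed sample mean of 1.0. The estimate is itself noisy because the objective is evaluated by simulation; shrinking eps reduces the approximation bias at the cost of more simulation noise.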

Relevance: 80.00%

Abstract:

Fuzzification is introduced into gray-scale mathematical morphology by using two-input one-output fuzzy rule-based inference systems. The fuzzy inferring dilation or erosion is defined from the approximate reasoning of the two consequences of a dilation or an erosion and an extended rank-order operation. The fuzzy inference systems with numbers of rules and fuzzy membership functions are further reduced to a simple fuzzy system formulated by only an exponential two-input one-output function. Such a one-function fuzzy inference system is able to approach complex fuzzy inference systems by using two specified parameters within it: a proportion to characterize the fuzzy degree and an exponent to depict the nonlinearity in the inferring. The proposed fuzzy inferring morphological operators tend to keep the object details comparable to the structuring element and to smooth the conventional morphological operations. Based on digital area coding of a gray-scale image, incoherent optical correlation for neighboring connection, and optical thresholding for rank-order operations, a fuzzy inference system can be realized optically in parallel. © 1996 Society of Photo-Optical Instrumentation Engineers.

Relevance: 80.00%

Abstract:

This paper investigates a method of automatic pronunciation scoring for use in computer-assisted language learning (CALL) systems. The method utilizes a likelihood-based 'Goodness of Pronunciation' (GOP) measure which is extended to include individual thresholds for each phone based on both averaged native confidence scores and on rejection statistics provided by human judges. Further improvements are obtained by incorporating models of the subject's native language and by augmenting the recognition networks to include expected pronunciation errors. The various GOP measures are assessed using a specially recorded database of non-native speakers which has been annotated to mark phone-level pronunciation errors. Since pronunciation assessment is highly subjective, a set of four performance measures has been designed, each of them measuring different aspects of how well computer-derived phone-level scores agree with human scores. These performance measures are used to cross-validate the reference annotations and to assess the basic GOP algorithm and its refinements. The experimental results suggest that a likelihood-based pronunciation scoring metric can achieve usable performance, especially after applying the various enhancements.

Relevance: 80.00%

Abstract:

We conducted phylogenetic analyses to identify the closest living relatives of the Xizang and Sichuan hot-spring snakes (T. baileyi and T. zhaoermii) endemic to the Tibetan Plateau, using mitochondrial DNA sequences (cyt b, ND4) from eight specimen

Relevance: 80.00%

Abstract:

We investigated the molecular evolution of duplicated color vision genes (LWS-1 and SWS2) within cyprinid fish, focusing on the most cavefish-rich genus, Sinocyclocheilus. Maximum likelihood-based codon substitution approaches were used to analyze the evolution of vision genes. We found that the duplicated color vision genes had unequal evolutionary rates, which may lead to a related functional divergence. Divergence of LWS-1 was strongly influenced by positive selection causing an accelerated rate of substitution in the proportion of pocket-forming residues. The SWS2 pigment experienced divergent selection between lineages, and no positively selected site was found. A duplicate copy of LWS-1 in some cyprinine species had become a pseudogene, but all SWS2 sequences remained intact in the regions examined across the cyprinid fishes in this study. The pseudogenization events did not occur randomly in the two copies of LWS-1 within Sinocyclocheilus species. Some cave species of Sinocyclocheilus with numerous morphological specializations that seem to be highly adapted for caves retain both intact copies of color vision genes in their genome. We found some novel amino acid substitutions at key sites, which might represent interesting target sites for future mutagenesis experiments. Our data add to the increasing evidence that duplicate genes experience lower selective constraints and in some cases positive selection following gene duplication. Some of these observations are unexpected and may provide insights into the effect of caves on the evolution of color vision genes in fishes.

Relevance: 80.00%

Abstract:

Fuzzy-neural-network-based inference systems are well-known universal approximators which can produce linguistically interpretable results. Unfortunately, their dimensionality can be extremely high due to an excessive number of inputs and rules, which raises the need for overall structure optimization. In the literature, various input selection methods are available, but they are applied separately from rule selection, often without considering the fuzzy structure. This paper proposes an integrated framework to optimize the number of inputs and the number of rules simultaneously. First, a method is developed to select the most significant rules, along with a refinement stage to remove unnecessary correlations. An improved information criterion is then proposed to find an appropriate number of inputs and rules to include in the model, leading to a balanced tradeoff between interpretability and accuracy. Simulation results confirm the efficacy of the proposed method.

Relevance: 80.00%

Abstract:

The analysis of integer-valued time series has become an important research area in recent years, not only because of its application to count data from many fields of science, but also because it remains little explored compared with the analysis of continuous-valued time series. One class that has gained particular prominence is that of models based on the binomial thinning operator, most notably the integer-valued autoregressive model of order p. This class is very broad, so the aim of this work is to contribute to the statistical analysis of the counting processes that belong to it. The analysis is carried out from the standpoint of event prediction, with which alarm mechanisms are associated, and through the introduction of new models based on this operator. In many phenomena described by stochastic processes, implementing an alarm system can be essential for predicting the occurrence of a future event. This work addresses, from both the classical and the Bayesian perspectives, optimal alarm systems for counting processes whose parameters depend on covariates of interest and vary over time, specifically for the non-negative integer-valued autoregressive model with stochastic coefficients, DSINAR(1). New models belonging to the class based on the binomial thinning operator are introduced in the form of the PINAR(1)T model and the SETINAR(2;1) model. The PINAR(1)T model has a periodic structure, with innovations forming a periodic sequence of independent Poisson-distributed random variables; its probabilistic properties, estimation methods and forecasting are studied in detail.
The SETINAR(2;1) model is an integer-valued autoregressive process defined by self-exciting thresholds, whose innovations form a sequence of independent and identically distributed Poisson random variables. For this model, the probabilistic properties and methods for estimating its parameters are studied. For each model introduced, simulation studies were carried out to compare the estimation methods used.
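
The binomial thinning operator named in the abstract above can be illustrated with the simplest member of the class, a first-order integer-valued autoregressive process with Poisson innovations. This is only a sketch: the function names and parameter values are assumptions, the Poisson sampler is Knuth's multiplication method (suitable for small rates), and none of the DSINAR(1), PINAR(1)T or SETINAR(2;1) extensions are implemented.

```python
import math
import random


def thin(x, alpha, rng):
    """Binomial thinning: each of the x counts survives independently with prob. alpha."""
    return sum(rng.random() < alpha for _ in range(x))


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_inar1(alpha, lam, n, seed=0, x0=0):
    """Simulate X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    x, path = x0, []
    for _ in range(n):
        x = thin(x, alpha, rng) + poisson(lam, rng)
        path.append(x)
    return path
```

For 0 <= alpha < 1 the process is stationary with mean lam / (1 - alpha), which gives a quick check on a long simulated path.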

Relevance: 80.00%

Abstract:

This paper proposes finite-sample procedures for testing the SURE specification in multi-equation regression models, i.e. whether the disturbances in different equations are contemporaneously uncorrelated or not. We apply the technique of Monte Carlo (MC) tests [Dwass (1957), Barnard (1963)] to obtain exact tests based on standard LR and LM zero correlation tests. We also suggest a MC quasi-LR (QLR) test based on feasible generalized least squares (FGLS). We show that the latter statistics are pivotal under the null, which provides the justification for applying MC tests. Furthermore, we extend the exact independence test proposed by Harvey and Phillips (1982) to the multi-equation framework. Specifically, we introduce several induced tests based on a set of simultaneous Harvey/Phillips-type tests and suggest a simulation-based solution to the associated combination problem. The properties of the proposed tests are studied in a Monte Carlo experiment which shows that standard asymptotic tests exhibit important size distortions, while MC tests achieve complete size control and display good power. Moreover, MC-QLR tests performed best in terms of power, a result of interest from the point of view of simulation-based tests. The power of the MC induced tests improves appreciably in comparison to standard Bonferroni tests and, in certain cases, outperforms the likelihood-based MC tests. The tests are applied to data used by Fischer (1993) to analyze the macroeconomic determinants of growth.
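
The Monte Carlo test technique referred to above can be sketched as follows. When a statistic is pivotal under the null, it can be simulated under any convenient null configuration, and the p-value is the rank of the observed statistic among the simulated ones. The setting here is a deliberate simplification and an assumption of this sketch: two raw series stand in for the equation disturbances, the statistic is n times the squared sample correlation (an LM-type zero-correlation statistic), and Gaussian disturbances are assumed. The regression residuals, the LR and QLR statistics and the induced tests of the paper are not reproduced.

```python
import random


def corr_stat(x, y):
    """n times the squared sample correlation; invariant to location and scale."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return n * sxy * sxy / (sxx * syy)


def mc_p_value(s_obs, simulate_stat, n_rep, rng):
    """Monte Carlo p-value: (1 + #{simulated >= observed}) / (n_rep + 1)."""
    exceed = sum(simulate_stat(rng) >= s_obs for _ in range(n_rep))
    return (exceed + 1) / (n_rep + 1)


def mc_zero_corr_test(x, y, n_rep=99, seed=0):
    """Test zero correlation by simulating the statistic under independent normals."""
    rng = random.Random(seed)
    n = len(x)

    def simulate_stat(r):
        u = [r.gauss(0.0, 1.0) for _ in range(n)]
        v = [r.gauss(0.0, 1.0) for _ in range(n)]
        return corr_stat(u, v)

    return mc_p_value(corr_stat(x, y), simulate_stat, n_rep, rng)
```

With n_rep = 99 the smallest attainable p-value is 0.01, so rejecting at the 5% level when the p-value is at most 0.05 gives a test whose size is exact under the stated assumptions, regardless of sample size.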

Relevance: 80.00%

Abstract:

Recent work shows that a low correlation between the instruments and the included variables leads to serious inference problems. We extend the local-to-zero analysis of models with weak instruments to models with estimated instruments and regressors and with higher-order dependence between instruments and disturbances. This makes this framework applicable to linear models with expectation variables that are estimated non-parametrically. Two examples of such models are the risk-return trade-off in finance and the impact of inflation uncertainty on real economic activity. Results show that inference based on Lagrange Multiplier (LM) tests is more robust to weak instruments than Wald-based inference. Using LM confidence intervals leads us to conclude that no statistically significant risk premium is present in returns on the S&P 500 index, excess holding yields between 6-month and 3-month Treasury bills, or in yen-dollar spot returns.

Relevance: 80.00%

Abstract:

Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives: the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977), both for the estimation of mixture components and for coping with the missing data.