938 results for Spatial Mixture Models


Relevance:

90.00%

Publisher:

Abstract:

Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of constructing the nonparametric mixture model estimator are reviewed and set into perspective. The maximum likelihood estimator of the mixing distribution is constructed with the EM algorithm for any number of components up to the global nonparametric maximum likelihood bound. In addition, the estimators of Chao and Zelterman are considered, along with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, special-purpose software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems in using the mixture model-based estimators are highlighted.
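The Chao and Zelterman estimators mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the CAMCR implementation: it assumes the textbook forms, in which n is the number of observed units and f1 and f2 are the numbers of units seen exactly once and exactly twice. The function names and the frequency counts are hypothetical.

```python
import math


def chao_estimate(n, f1, f2):
    """Chao's lower-bound estimator: N = n + f1^2 / (2 * f2)."""
    return n + f1 ** 2 / (2.0 * f2)


def zelterman_estimate(n, f1, f2):
    """Zelterman's estimator: N = n / (1 - exp(-lam)), with lam = 2 * f2 / f1."""
    lam = 2.0 * f2 / f1  # Poisson rate estimated from singletons and doubletons
    return n / (1.0 - math.exp(-lam))


# Hypothetical counts: 100 units observed, 50 seen once, 25 seen twice.
n, f1, f2 = 100, 50, 25
chao = chao_estimate(n, f1, f2)
zelterman = zelterman_estimate(n, f1, f2)
```

Both estimators use only the two lowest frequency counts, which is why they are often regarded as robust to heterogeneity among frequently observed units.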

Relevance:

90.00%

Publisher:

Abstract:

High spatial resolution environmental data give us a better understanding of the environmental factors affecting plant distributions at fine spatial scales. However, large environmental datasets dramatically increase compute times and the size of the output species models, creating the need for an alternative computing solution. Cluster computing offers such a solution by allowing both multiple plant species Environmental Niche Models (ENMs) and individual tiles of high spatial resolution models to be computed concurrently on the same compute cluster. We apply our methodology to a case study of 4,209 species of Mediterranean flora (around 17% of the species believed present in the biome). We demonstrate a 16-fold speed-up of ENM computation time when 16 CPUs were used on the compute cluster. Our custom Java ‘Merge’ and ‘Downsize’ programs reduce ENM output file sizes by 94%. The median test AUC score of 0.98 for the species ENMs is aided by various techniques for filtering species occurrence data. Finally, by calculating the percentage change of individual grid cell values, we map the projected percentages of plant species vulnerable to climate change in the Mediterranean region between 1950–2000 and 2020.

Relevance:

90.00%

Publisher:

Abstract:

This dissertation focuses on spatial stochastic processes defined on a lattice, the so-called Cliff & Ord-type models. My contribution in this thesis is to use Edgeworth and saddlepoint approximations to investigate the finite-sample properties of the test for detecting spatial dependence in SAR (spatial autoregressive) models, and to propose a new class of spatial econometric models in which the parameters that affect the mean structure are distinct from the parameters in the variance structure of the process. This allows a clearer interpretation of the model parameters and generalizes a taxonomy proposed by Anselin (2003). I propose an estimator for the model parameters and derive its asymptotic distribution. The model suggested in the dissertation offers an interesting interpretation of the SARAR model, which is quite common in the literature. The investigation of the finite-sample properties of the tests extends the literature by allowing the neighborhood matrix of the spatial process to be a nonlinear function of the spatial dependence parameter. Using approximations instead of simulations (the more common approach in the literature) provides an easy way to compare the properties of tests under different neighborhood matrices and to correct the size when comparing the power of the tests. I obtain an optimal invariant test that is also locally uniformly most powerful invariant (LUMPI). I construct the power envelope for the LUMPI test and show that it is virtually UMP, since the power of the test is very close to the envelope (for the spatial structures defined in the dissertation). I suggest a practical procedure for constructing a test that has good power in a range of situations where the LUMPI test may not have good properties.
I conclude that the power of the test increases with the sample size and with the spatial dependence parameter (which agrees with the literature). However, I dispute the consensus view that the power of the test decreases as the neighborhood matrix becomes denser. This view reflects a measurement error common in the literature, since the statistical distance between the null hypothesis and the alternative varies greatly with the structure of the matrix. After making the correction, I conclude that the power of the test increases with the distance of the alternative from the null, as expected.
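To make the idea of a test statistic for spatial dependence concrete, the sketch below computes Moran's I from a vector of observations and a neighborhood (weights) matrix. This is an assumption for illustration only: the abstract does not name the statistic it studies, and Moran's I is used here simply because it is the familiar choice for Cliff & Ord-type models. The function name and the toy data are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I = (n / S0) * (z' W z) / (z' z), with z the centered values."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(zi * zi for zi in z)


# Four units on a line; each unit neighbors the adjacent ones.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
trend = morans_i([1, 2, 3, 4], chain)          # similar neighbors
alternating = morans_i([1, -1, 1, -1], chain)  # dissimilar neighbors
```

A smooth trend gives a positive value and an alternating pattern a negative one. Changing the density of the weights matrix changes the scale of the statistic, which is the kind of effect the abstract says must be accounted for when comparing power across matrices.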

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

Most authors struggle to pick a title that adequately conveys all of the material covered in a book. When I first saw Applied Spatial Data Analysis with R, I expected a review of spatial statistical models and their applications in packages (libraries) from the CRAN site of R. The authors’ title is not misleading, but I was very pleasantly surprised by how deep the word “applied” is here. The first half of the book essentially covers how R handles spatial data. To some statisticians this may be boring. Do you want, or need, to know the difference between S3 and S4 classes, how spatial objects in R are organized, and how various methods work on the spatial objects? A few years ago I would have said “no,” especially to the “want” part. Just let me slap my EXCEL spreadsheet into R and run some spatial functions on it. Unfortunately, the world is not so simple, and ultimately we want to minimize effort to get all of our spatial analyses accomplished. The first half of this book certainly convinced me that some extra effort in organizing my data into certain spatial class structures makes the analysis easier and less subject to mistakes. I also admit that I found it very interesting and I learned a lot.

Relevance:

90.00%

Publisher:

Abstract:

There is an emerging interest in modeling spatially correlated survival data in biomedical and epidemiological studies. In this paper, we propose a new class of semiparametric normal transformation models for right-censored spatially correlated survival data. This class of models assumes that survival outcomes marginally follow a Cox proportional hazards model with unspecified baseline hazard, and their joint distribution is obtained by transforming survival outcomes to normal random variables, whose joint distribution is assumed to be multivariate normal with a spatial correlation structure. A key feature of the class of semiparametric normal transformation models is that it provides a rich class of spatial survival models in which regression coefficients have a population-average interpretation and the spatial dependence of survival times is conveniently modeled through the transformed variables by flexible normal random fields. We study the relationship between the spatial correlation structure of the transformed normal variables and the dependence measures of the original survival times. Direct nonparametric maximum likelihood estimation in such models is practically prohibitive because of the high-dimensional intractable integration in the likelihood function and the infinite-dimensional nuisance baseline hazard parameter. We hence develop a class of spatial semiparametric estimating equations, which conveniently estimate the population-level regression coefficients and the dependence parameters simultaneously. We study the asymptotic properties of the proposed estimators and show that they are consistent and asymptotically normal. The proposed method is illustrated with an analysis of data from the East Boston Asthma Study, and its performance is evaluated using simulations.
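The construction described in this abstract can be written out as follows. This is a sketch under assumed notation, not the paper's own: x_i are covariates, Λ_0 is the unspecified baseline cumulative hazard, Φ is the standard normal distribution function, and Γ(θ) is a spatial correlation matrix with parameters θ.

```latex
% Marginal Cox proportional hazards model for subject i
\Lambda_i(t) = \Lambda_0(t)\, e^{x_i^\top \beta}, \qquad
S_i(t) = \exp\{-\Lambda_i(t)\}

% Transformation of the survival time T_i to a standard normal variable
T_i^{*} = \Phi^{-1}\{1 - S_i(T_i)\} \sim N(0, 1)

% Joint model: multivariate normal with a spatial correlation structure
(T_1^{*}, \dots, T_n^{*})^\top \sim N_n\{0, \Gamma(\theta)\}
```

Because 1 - S_i(T_i) is uniformly distributed on (0, 1), each transformed time is marginally standard normal whatever the value of θ, which is why the regression coefficients keep their population-average interpretation.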

Relevance:

90.00%

Publisher:

Abstract:

In many clinical trials to evaluate treatment efficacy, it is believed that there may exist latent treatment effectiveness lag times after which a medical procedure or chemical compound would be in full effect. In this article, semiparametric regression models are proposed and studied to estimate the treatment effect while accounting for such latent lag times. The new models take advantage of the invariance property of the additive hazards model in marginalizing over random effects, so the parameters in the models are easy to estimate and interpret, while the flexibility of leaving the baseline hazard function unspecified is retained. Monte Carlo simulation studies demonstrate the appropriateness of the proposed semiparametric estimation procedure. Data collected in an actual randomized clinical trial, which evaluates the effectiveness of biodegradable carmustine polymers for treatment of recurrent brain tumors, are analyzed.

Relevance:

90.00%

Publisher:

Abstract:

We tested the prediction from spatial competition models that intraspecific aggregation may promote coexistence and thus maintain biodiversity with experimental communities of four annual species. Monocultures, three-species mixtures, and the four-species mixture were sown at two densities and with either random or intraspecifically aggregated distributions. There was a hierarchy of competitive abilities among the four species. The weaker competitors showed higher aboveground biomass in the aggregated distribution compared to the random distribution, especially at high density. In one species, intraspecific aggregation resulted in an 86% increase in the number of flowering individuals and a 171% increase in the reproductive biomass at high density. The competitively superior species had a lower biomass in the aggregated distribution than in the random distribution at high density. The data support the hypothesis that the spatial distribution of plants profoundly affects competition in such a way that weaker competitors increase their fitness while stronger competitors are suppressed when grown in the neighborhood of conspecifics. This implies that the spatial arrangement of plants in a community can be an important determinant of species coexistence and biodiversity.

Relevance:

90.00%

Publisher:

Abstract:

Mixture modeling is commonly used to model categorical latent variables that represent subpopulations in which population membership is unknown but can be inferred from the data. In relatively recent years, finite mixture models have been applied to time-to-event data. However, the commonly used survival mixture model assumes that the effects of the covariates involved in failure times differ across latent classes, but that the covariate distribution is homogeneous. The aim of this dissertation is to develop a method to examine time-to-event data in the presence of unobserved heterogeneity under a framework of mixture modeling. A joint model is developed to incorporate the latent survival trajectory along with the observed information for the joint analysis of a time-to-event variable, its discrete and continuous covariates, and a latent class variable. It is assumed that the effects of covariates on survival times and the distribution of covariates vary across different latent classes. The unobservable survival trajectories are identified by estimating the probability that a subject belongs to a particular class based on observed information. We applied this method to a Hodgkin lymphoma study with long-term follow-up and observed four distinct latent classes in terms of long-term survival and distributions of prognostic factors. Our results from simulation studies and from the Hodgkin lymphoma study demonstrated the superiority of our joint model compared with the conventional survival model. This flexible inference method provides more accurate estimation and accommodates unobservable heterogeneity among individuals while taking involved interactions between covariates into consideration.

Relevance:

90.00%

Publisher:

Abstract:

We consider the problem of assessing the number of clusters in a limited number of tissue samples containing gene expressions for possibly several thousands of genes. It is proposed to use a normal mixture model-based approach to the clustering of the tissue samples. One advantage of this approach is that the question on the number of clusters in the data can be formulated in terms of a test on the smallest number of components in the mixture model compatible with the data. This test can be carried out on the basis of the likelihood ratio test statistic, using resampling to assess its null distribution. The effectiveness of this approach is demonstrated on simulated data and on some microarray datasets, as considered previously in the bioinformatics literature. (C) 2004 Elsevier Inc. All rights reserved.

Relevance:

90.00%

Publisher:

Abstract:

Mixture models implemented via the expectation-maximization (EM) algorithm are being increasingly used in a wide range of problems in pattern recognition such as image segmentation. However, the EM algorithm requires considerable computational time in its application to huge data sets such as a three-dimensional magnetic resonance (MR) image of over 10 million voxels. Recently, it was shown that a sparse, incremental version of the EM algorithm could improve its rate of convergence. In this paper, we show how this modified EM algorithm can be speeded up further by adopting a multiresolution kd-tree structure in performing the E-step. The proposed algorithm outperforms some other variants of the EM algorithm for segmenting MR images of the human brain. (C) 2004 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
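For orientation, the standard EM iteration that this abstract builds on can be sketched for a one-dimensional Gaussian mixture. This is plain EM only: the sparse, incremental and kd-tree variants discussed in the abstract are not implemented here. The function names, starting values and toy data are all hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, means, variances, weights, iterations=50):
    """Run plain EM for a 1-D Gaussian mixture from the given starting values."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the posteriors.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a degenerate component
    return means, variances, weights


# Toy data with two well-separated groups (hypothetical intensities).
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights = em_gaussian_mixture(data, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

The E-step visits every data point on every iteration, which is the cost that becomes prohibitive for an image of millions of voxels and that the modifications described in the abstract aim to reduce.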

Relevance:

90.00%

Publisher:

Abstract:

We analyze a real data set pertaining to reindeer fecal pellet-group counts obtained from a survey conducted in a forest area in northern Sweden. In the data set, over 70% of counts are zeros, and there is high spatial correlation. We use conditionally autoregressive random effects to model spatial correlation in a Poisson generalized linear mixed model (GLMM), a quasi-Poisson hierarchical generalized linear model (HGLM), and zero-inflated Poisson (ZIP) and hurdle models. The quasi-Poisson HGLM allows for both under- and overdispersion with excessive zeros, while the ZIP and hurdle models allow only for overdispersion. In analyzing the real data set, we see that the quasi-Poisson HGLMs can perform better than the other commonly used models, for example, ordinary Poisson HGLMs, spatial ZIP, and spatial hurdle models, and that the underdispersed Poisson HGLMs with spatial correlation fit the reindeer data best. We develop R code for fitting these models using a unified algorithm for the HGLMs. Spatial count responses with an extremely high proportion of zeros and underdispersion can be successfully modeled using the quasi-Poisson HGLM with spatial random effects.
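As a reminder of what a zero-inflated Poisson model means, the sketch below gives the ZIP probability mass function without any spatial random effects. It assumes the usual parameterization, in which a count is a structural zero with probability pi and otherwise Poisson with mean lam. The function names and parameter values are hypothetical and unrelated to the reindeer data.

```python
import math


def poisson_pmf(k, lam):
    """Probability of count k under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: structural zero with probability pi, else Poisson(lam)."""
    if k == 0:
        return pi + (1.0 - pi) * math.exp(-lam)
    return (1.0 - pi) * poisson_pmf(k, lam)


# Hypothetical parameters chosen so that more than 70% of counts are zero.
p_zero = zip_pmf(0, 2.0, 0.7)
total_mass = sum(zip_pmf(k, 2.0, 0.7) for k in range(60))
```

Mixing in extra zeros always pushes the variance above the mean, which is consistent with the abstract's remark that ZIP models accommodate only overdispersion.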

Relevance:

90.00%

Publisher:

Abstract:

This paper studies the relationship between permanent income and homicides by estimating an income-crime elasticity. We assume that this elasticity varies across geographical areas. We estimate different specifications of Spatial Panel Models using data on urban areas of Medellin (Colombia) known as communes. Spatial models take into account each commune's location and the type of neighbors it has. We simulate an intervention on permanent income in order to estimate the income elasticity for each commune and the average income-crime elasticity for the city. We provide evidence of spatial dependence between the homicides in a commune and those in its neighbors, and of a relationship between homicides and neighbors' income. In our case study, a 1% increase in permanent income in a specific commune produces an estimated average decrease of 0.39% in the homicide rate. Finally, permanent income plays a crime-deterrent role, but its effect on crime varies across the city, showing that some areas are strategically located for this kind of intervention.
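The reported elasticity can be read with a short calculation. The sketch below assumes a constant elasticity of -0.39 (the average figure quoted in the abstract) and applies it as a linear approximation for small income changes. The baseline homicide rate and the function name are hypothetical.

```python
def predicted_rate(rate, elasticity, income_change_pct):
    """Approximate rate after an income change, using a constant elasticity."""
    return rate * (1.0 + elasticity * income_change_pct / 100.0)


AVERAGE_ELASTICITY = -0.39  # average income-crime elasticity from the abstract
baseline = 50.0             # hypothetical homicides per 100,000 inhabitants
after_one_pct = predicted_rate(baseline, AVERAGE_ELASTICITY, 1.0)
```

Because the abstract reports that the elasticity varies by commune, the same income change would translate into different reductions in different parts of the city.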

Relevance:

90.00%

Publisher:

Abstract:

In the last two decades, authors have begun to extend classical stochastic frontier (SF) models to include spatial components. Indeed, firms tend to concentrate in clusters, taking advantage of positive agglomeration externalities due to cooperation, shared ideas and emulation, resulting in increased productivity levels. Until now, scholars have introduced spatial dependence into SF models along two different paths: evaluating global and local spatial spillover effects related to the frontier, or considering spatial cross-sectional correlation in the inefficiency and/or in the error term. In this thesis, we extend the current literature on spatial SF models by introducing two novel specifications for panel data. First, besides considering productivity and input spillovers, we introduce the possibility of evaluating the specific spatial effects arising from each inefficiency determinant through their spatial lags, with the aim of also capturing knowledge spillovers. Second, we develop a comprehensive spatial SF model that includes both frontier and error-based spillovers in order to consider four different sources of spatial dependence (i.e. productivity and input spillovers related to the frontier function, and behavioural and environmental correlation associated with the two error terms). Finally, we test the finite-sample properties of the two proposed spatial SF models through simulations, and we provide two empirical applications to the Italian accommodation and agricultural sectors. From a practical perspective, policymakers can use results from these models to obtain precise, detailed and distinct insights into the spillover effects affecting the productive performance of neighbouring spatial units, and thus relevant suggestions for policy decisions.