971 results for Markov Population Processes
Abstract:
Peer reviewed
Abstract:
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, despite the continued relevance of uncertainty quantification in the sciences, where the number of parameters to estimate often exceeds the sample size even after the huge increases in n seen in many fields. Thus, the tendency in some areas of industry to dispense with traditional statistical analysis on the basis that "n=all" is of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.
Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.
One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
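The reduced-rank factorization mentioned above can be illustrated with a minimal sketch. It assumes a standard latent class (PARAFAC-type) model in which the variables are independent given the latent class, so the probability tensor is a weighted sum of rank-one terms. The weights and conditional probabilities below are hypothetical toy values, not taken from the thesis, and the collapsed Tucker decomposition it proposes is not shown.

```python
from itertools import product

# Hypothetical toy example: 3 binary variables, 2 latent classes.
class_weights = [0.6, 0.4]  # P(latent class = h)
# cond_probs[j][h][c] = P(variable j takes category c | latent class h)
cond_probs = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
    [[0.5, 0.5], [0.1, 0.9]],
]


def parafac_pmf(class_weights, cond_probs):
    """Return {cell: probability} for a latent class (PARAFAC-type) model."""
    n_categories = [len(var[0]) for var in cond_probs]
    pmf = {}
    for cell in product(*(range(d) for d in n_categories)):
        total = 0.0
        for h, weight in enumerate(class_weights):
            # Rank-one term: class weight times per-variable conditionals.
            term = weight
            for j, c in enumerate(cell):
                term *= cond_probs[j][h][c]
            total += term
        pmf[cell] = total
    return pmf


pmf = parafac_pmf(class_weights, cond_probs)
```

With two latent classes, the eight cell probabilities are described by 1 + 2 x 3 = 7 free parameters, so this toy case saves nothing over the saturated model; the saving grows quickly as the number of variables increases.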
Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations, and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis-Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis-Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo (MCMC), the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel. Comparatively little attention has been paid to convergence and estimation error in these approximating Markov chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
Data augmentation Gibbs samplers are arguably the most popular class of algorithm for approximately sampling from the posterior distribution for the parameters of generalized linear models. The truncated Normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
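The slow mixing described above can be seen in a small sketch. It assumes the simplest possible setting, an intercept-only probit model with a flat prior, sampled with the truncated Normal data augmentation scheme. The sample size, number of successes, starting value, and function names are illustrative assumptions and do not reproduce the thesis experiments.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_latent(mean, positive, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 (positive) or z <= 0."""
    if positive:
        # z = mean - e, with e ~ N(0, 1) truncated to e < mean.
        upper = STD_NORMAL.cdf(mean)
        v = max(rng.random() * upper, 1e-300)
        return mean - STD_NORMAL.inv_cdf(v)
    # z = mean + e, with e ~ N(0, 1) truncated to e <= -mean.
    upper = STD_NORMAL.cdf(-mean)
    v = max(rng.random() * upper, 1e-300)
    return mean + STD_NORMAL.inv_cdf(v)


def probit_da_gibbs(n, n_success, n_iter, beta0=-3.0, seed=0):
    """Data augmentation Gibbs sampler for an intercept-only probit model."""
    rng = random.Random(seed)
    beta = beta0
    chain = []
    for _ in range(n_iter):
        # Step 1: impute latent Gaussians given beta and the binary outcomes.
        z_sum = 0.0
        for i in range(n):
            z_sum += sample_latent(beta, i < n_success, rng)
        # Step 2: flat prior on beta, so beta | z ~ N(mean(z), 1/n).
        beta = rng.gauss(z_sum / n, 1.0 / math.sqrt(n))
        chain.append(beta)
    return chain


def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation of a sequence."""
    m = sum(xs) / len(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den


# Rare events: 1 success among 1000 observations (hypothetical values).
chain = probit_da_gibbs(n=1000, n_success=1, n_iter=300)
rho = lag1_autocorrelation(chain)
```

In this setting each update moves the intercept by an amount of order 1/sqrt(n), which is small relative to the posterior spread when successes are rare, so `rho` comes out close to 1.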
Abstract:
Light rainfall is the baseline input to the annual water budget in mountainous landscapes through the tropics and at mid-latitudes. In the Southern Appalachians, the contribution from light rainfall ranges from 50-60% during wet years to 80-90% during dry years, with convective activity and tropical cyclone input providing most of the interannual variability. The Southern Appalachians is a region characterized by rich biodiversity that is vulnerable to land use/land cover changes due to its proximity to a rapidly growing population. Persistent near surface moisture and associated microclimates observed in this region have been well documented since the colonization of the area in terms of species health, fire frequency, and overall biodiversity. The overarching objective of this research is to elucidate the microphysics of light rainfall and the dynamics of low level moisture in the inner region of the Southern Appalachians during the warm season, with a focus on orographically mediated processes. The overarching research hypothesis is that physical processes leading to and governing the life cycle of orographic fog, low level clouds, and precipitation, and their interactions, are strongly tied to landform, land cover, and the diurnal cycles of flow patterns, radiative forcing, and surface fluxes at the ridge-valley scale. The following science questions will be addressed specifically: 1) How do orographic clouds and fog affect the hydrometeorological regime from event to annual scale and as a function of terrain characteristics and land cover?; 2) What are the source areas, governing processes, and relevant time-scales of near surface moisture convergence patterns in the region?; and 3) What are the four dimensional microphysical and dynamical characteristics, including variability and controlling factors and processes, of fog and light rainfall?
The research was conducted with two major components: 1) ground-based high-quality observations using multi-sensor platforms and 2) interpretive numerical modeling guided by the analysis of the in situ data collection. Findings illuminate a high level of spatial (down to the ridge scale) and temporal (from event to annual scale) heterogeneity in observations, and a significant impact on the hydrological regime as a result of seeder-feeder interactions among fog, low level clouds, and stratiform rainfall that enhance coalescence efficiency and lead to significantly higher rainfall rates at the land surface. Specifically, results show that enhancement of an event up to one order of magnitude in short-term accumulation can occur as a result of concurrent fog presence. Results also show that events are modulated strongly by terrain characteristics including elevation, slope, geometry, and land cover. These factors produce interactions between highly localized flows and gradients of temperature and moisture with larger scale circulations. Resulting observations of drop size distribution (DSD) and rainfall patterns are stratified by region and altitude and exhibit clear diurnal and seasonal cycles.
Abstract:
Ocean acidification, as a consequence of increasing marine pCO2, may have severe effects on the physiology of marine organisms. However, experimental studies remain scarce, in particular concerning fish. While adults will most likely remain relatively unaffected by changes in seawater pH, early life-history stages are potentially more sensitive - particularly the critical stage of fertilization, in which sperm motility plays a central role. In this study, the effects of ocean acidification (decrease of pHT to 7.55) on sperm motility of Baltic cod, Gadus morhua, were assessed. We found no significant effect of decreased pH on sperm speed, rate of change of direction or percent motility for the population of cod analyzed. We predict that future ocean acidification will probably not pose a problem for sperm behavior, and hence fertilization success, of Baltic cod.
Abstract:
An increasing number of studies are now reporting the effects of ocean acidification on a broad range of marine species, processes and systems. Many of these are investigating the sensitive early life-history stages that several major reviews have highlighted as being potentially most susceptible to ocean acidification. Nonetheless there remain few investigations of the effects of ocean acidification on the very earliest, and critical, process of fertilization, and still fewer that have investigated levels of ocean acidification relevant for the coming century. Here we report the effects of near-future levels of ocean acidification (−0.35 pH unit change) on sperm swimming speed, sperm motility, and fertilization kinetics in a population of the Pacific oyster Crassostrea gigas from western Sweden. We found no significant effect of ocean acidification, a result that was well-supported by power analysis. Similar findings from Japan suggest that this may be a globally robust result, and we emphasise the need for experiments on multiple populations from throughout a species' range. We also discuss the importance of sound experimental design and power analysis in meaningful interpretation of non-significant results.
Abstract:
Study of biogeochemical processes in waters and sediments of the Chukchi Sea in August 2004 revealed atypical maxima of biogenic element (N, P, and Si) concentrations and rate of microbial sulfate reduction in the surface layer (0-3 cm) of marine sediments. The C/N/P ratio in organic matter (OM) of this layer does not fit the Redfield-Richards stoichiometric model. Specific features of biogeochemical processes in the sea are likely related to the complex dynamics of water, high primary productivity (110-1400 mg C/m²/day), low depth of the basin (<50 m for 60% of the water area), reduced food chain due to low population of zooplankton, high density of zoobenthos (up to 4230 g/m²), and high activity of microbial processes. Drastic decreases in concentrations of biogenic elements, iodine, total alkalinity, and population of microorganisms beneath the 0-3 cm layer testify to large-scale OM decay at the water-seafloor barrier. Our original experimental data support a high annual rate of OM mineralization at the bottom of the Chukchi Sea.
Abstract:
Reduction in global ocean pH due to the uptake of increased atmospheric CO2 is expected to negatively affect calcifying organisms, including the planktonic larval stages of many marine invertebrates. Planktonic larvae play crucial roles in the benthic-pelagic life cycle of marine organisms by connecting and sustaining existing populations and colonizing new habitats. Calcified larvae are typically denser than seawater and rely on swimming to navigate vertically structured water columns. Larval sand dollars Dendraster excentricus have calcified skeletal rods supporting their bodies, and propel themselves with ciliated bands looped around projections called arms. Ciliated bands are also used in food capture, and filtration rate is correlated with band length. As a result, swimming and feeding performance are highly sensitive to morphological changes. When reared at an elevated PCO2 level (1000 ppm), larval sand dollars developed significantly narrower bodies at four and six-arm stages. Morphological changes also varied between four observed maternal lineages, suggesting within-population variation in sensitivity to changes in PCO2 level. Despite these morphological changes, PCO2 concentration alone had no significant effect on swimming speeds. However, acidified larvae had significantly smaller larval stomachs and bodies, suggesting reduced feeding performance. Adjustments to larval morphologies in response to ocean acidification may prioritize swimming over feeding, implying that negative consequences of ocean acidification are carried over to later developmental stages.
Abstract:
Future scenarios for the oceans project combined developments of CO2 accumulation and global warming and their impact on marine ecosystems. The synergistic impact of both factors was addressed by studying the effect of elevated CO2 concentrations on thermal tolerance of the cold-eurythermal spider crab Hyas araneus from the population around Helgoland. Here ambient temperatures characterize the southernmost distribution limit of this species. Animals were exposed to present day normocapnia (380 ppm CO2), CO2 levels expected towards 2100 (710 ppm) and beyond (3000 ppm). Heart rate and haemolymph PO2 (PeO2) were measured during progressive short term cooling from 10 to 0°C and during warming from 10 to 25°C. An increase of PeO2 occurred during cooling, the highest values being reached at 0°C under all three CO2 levels. Heart rate increased during warming until a critical temperature (Tc) was reached. The putative Tc under normocapnia was presumably >25°C, from where it fell to 23.5°C under 710 ppm and then 21.1°C under 3000 ppm. At the same time, thermal sensitivity, as seen in the Q10 values of heart rate, rose with increasing CO2 concentration in the warmth. Our results suggest a narrowing of the thermal window of Hyas araneus under moderate increases in CO2 levels by exacerbation of the heat or cold induced oxygen and capacity limitation of thermal tolerance.
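For readers unfamiliar with Q10 values, the following sketch shows the usual definition of the temperature coefficient: the factor by which a rate changes for a 10°C rise in temperature. The function name and the heart rates used are hypothetical and are not data from the study.

```python
def q10(rate_low, rate_high, temp_low, temp_high):
    """Temperature coefficient: factor by which a rate changes per 10 degC."""
    return (rate_high / rate_low) ** (10.0 / (temp_high - temp_low))


# Hypothetical heart rates (beats per minute), not data from the study.
example = q10(rate_low=50.0, rate_high=100.0, temp_low=10.0, temp_high=20.0)
```

A heart rate that doubles between 10 and 20°C gives a Q10 of 2; larger values indicate greater thermal sensitivity.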
Abstract:
The persistence of most coastal marine species depends on larvae finding suitable adult habitat at the end of an offshore dispersive stage that can last weeks or months. We tested the effects that ocean acidification from elevated levels of atmospheric carbon dioxide (CO2) could have on the ability of larvae to detect olfactory cues from adult habitats. Larval clownfish reared in control seawater (pH 8.15) discriminated between a range of cues that could help them locate reef habitat and suitable settlement sites. This discriminatory ability was disrupted when larvae were reared in conditions simulating CO2-induced ocean acidification. Larvae became strongly attracted to olfactory stimuli they normally avoided when reared at levels of ocean pH that could occur ca. 2100 (pH 7.8) and they no longer responded to any olfactory cues when reared at pH levels (pH 7.6) that might be attained later next century on a business-as-usual carbon-dioxide emissions trajectory. If acidification continues unabated, the impairment of sensory ability will reduce population sustainability of many marine species, with potentially profound consequences for marine diversity.
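Because pH is a logarithmic scale, the pH treatments quoted above correspond to sizeable changes in hydrogen ion concentration. The sketch below assumes the textbook relation pH = -log10[H+] and ignores the differences between pH scales used in seawater chemistry. The function name is illustrative.

```python
def hydrogen_ion_fold_change(ph_control, ph_treatment):
    """Ratio of [H+] in the treatment to [H+] in the control."""
    return 10.0 ** (ph_control - ph_treatment)


# pH values quoted in the abstract: control 8.15, treatments 7.8 and 7.6.
fold_2100 = hydrogen_ion_fold_change(8.15, 7.8)
fold_later = hydrogen_ion_fold_change(8.15, 7.6)
```

Under this assumption the two treatments correspond to roughly 2.2-fold and 3.5-fold increases in hydrogen ion concentration relative to the control.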
Abstract:
Two types of health reforms in Latin America are analysed: one based on insurance and service commodification, and the other based on the unified public systems of progressive governments. Health insurance with explicit service packages has not fulfilled its purposes of universal coverage, equal access to necessary health services and improvement of health conditions, but has opened health as a field of profit making for insurance companies and private health providers. National health services, as a state obligation, have developed territorialized health services and substantially widened timely access for the majority of the population. The adoption of an integrated and wide social policy has an impact on population welfare. It faces some problems derived from the old health systems and the power of the insurance and medical complex.
Abstract:
A non-Markovian process is one that retains 'memory' of its past. A systematic understanding of these processes is necessary to fully describe and harness a vast range of complex phenomena; however, no such general characterisation currently exists. This long-standing problem has hindered advances in understanding physical, chemical and biological processes, where often dubious theoretical assumptions are made to render a dynamical description tractable. Moreover, the methods currently available to treat non-Markovian quantum dynamics are plagued with unphysical results, like non-positive dynamics. Here we develop an operational framework to characterise arbitrary non-Markovian quantum processes. We demonstrate the universality of our framework and how the characterisation can be rendered efficient, before formulating a necessary and sufficient condition for quantum Markov processes. Finally, we stress how our framework enables the actual systematic analysis of non-Markovian processes, the understanding of their typicality, and the development of new master equations for the effective description of memory-bearing open-system evolution.
Abstract:
Cognitive deficits are central to psychosis and are observable several years before the first psychotic episode. Impairment of episodic memory is frequently identified as one of the most severe, both in patients and, before the onset of the pathology, in at-risk populations. In psychotic patients, the neuropsychological study of memory processes has led to a better understanding of the origin of this impairment. An alteration of source memory processes, which make it possible to associate a memory with its origin, has thus been identified and has been associated with the positive symptoms of psychosis, mainly hallucinations. However, neither source memory nor the presence of subclinical symptoms has ever been investigated before the onset of the illness in a population at high genetic risk of psychosis (HRG). Yet studying them would show whether source memory deficits and hallucinatory experiences are associated with the onset of psychosis or whether they precede its emergence, in which case they would constitute early indicators of pathology. To study this question, the present thesis pursued three main objectives: 1) to characterize source memory functioning in an HRG population in order to observe whether an impairment of this process precedes the onset of the illness, 2) to assess whether subclinical manifestations of psychotic symptoms, namely hallucinatory experiences, are identifiable in an at-risk population, and 3) to investigate whether a link exists between source memory functioning and subclinical symptomatology in an at-risk population, as has been documented in patients.
The results of the thesis demonstrate that HRG individuals show a source memory impairment specific to the attribution of the temporal context of memories, as well as memory distortions that manifest as fragmentation of memories and a failure of metacognition in memory. It was also observed that subclinical hallucinatory experiences were more frequent in HRG individuals. Associations were documented between certain memory distortions and the propensity to hallucinate. These results identify new clinical and cognitive indicators of the risk of developing psychosis and raise hypotheses linking the attribution of the internal-external source of information to the development of the illness. The empirical, theoretical, methodological, and clinical implications of the thesis are discussed.
Abstract:
Language provides an interesting lens to look at state-building processes because of its cross-cutting nature. For example, in addition to its symbolic value and appeal, a national language has other roles in the process, including: (a) becoming the primary medium of communication which permits the nation to function efficiently in its political and economic life, (b) promoting social cohesion, allowing the nation to develop a common culture, and (c) forming a primordial basis for self-determination. Moreover, because of its cross-cutting nature, language interventions are rarely isolated activities. Languages are adopted by speakers, taking root in and spreading between communities because they are legitimated by legislation, and then reproduced through institutions like the education and military systems. Pádraig Ó Riagáin (1997) makes a case for this, observing that “Language policy is formulated, implemented, and accomplishes its results within a complex interrelated set of economic, social, and political processes which include, inter alia, the operation of other non-language state policies” (p. 45). In the Turkish case, the foundational role of language in the formation of the Turkish nation-state, together with its linkages to human rights issues, raises interesting questions about how socio-cultural practices become reproduced through institutional infrastructure formation. This dissertation is a country-level case study looking at Turkey’s nation-state building process through the lens of its language and education policy development processes with a focus on the early years of the Republic between 1927 and 1970. This project examines how different groups self-identified or were identified (as the case may be) in official Turkish statistical publications (e.g., the Turkish annual statistical yearbooks and the population censuses) during that time period when language and ethnicity data was made publicly available.
The overarching questions this dissertation explores include: 1. What were the geo-political conditions surrounding and influencing the development of the Turkish government’s language and education policies? 2. Are there any observable patterns in the geo-spatial distribution of language, literacy, and education participation rates over time? In what ways are these traditionally linked variables (language, literacy, education participation) problematic? 3. What do changes in population identifiers, e.g., language and ethnicity, suggest about the government’s approach towards nation-state building through the construction of a civic Turkish identity and institution building? Archival secondary source data was digitized, aggregated by categories relevant to this project at national and provincial levels and over the course of time (primarily between 1927 and 2000). The data was then re-aggregated into values that could be longitudinally compared and then layered on aspatial administrative maps. This dissertation contributes to the existing body of social policy literature by taking an interdisciplinary approach in looking at the larger socio-economic contexts in which language and education policies are produced.
Abstract:
Several predictive models of epidemics of arthropod-vectored plant viruses have been studied in an attempt to understand the complex but specific relationships within the three-cornered pathosystem (virus, vector and host plant), as well as their interactions with the environment. A large body of studies focuses mainly on weather-based models as management tools for monitoring pests and diseases, with very few incorporating the contribution of the vector's life processes to the disease dynamics, which is an essential aspect when mitigating virus incidence in a crop stand. In this study, we hypothesized that the multiplication and spread of tomato spotted wilt virus (TSWV) in a crop stand is strongly related to its influence on Frankliniella occidentalis preferential behavior and life expectancy. Models of the dynamics of important aspects of disease development within TSWV-F. occidentalis-host plant interactions were developed, focusing on F. occidentalis' life processes as influenced by TSWV. The results show that the influence of TSWV on F. occidentalis preferential behaviour leads to an estimated increase in relative acquisition rate of the virus, and up to a 33% increase in transmission rate to healthy plants. Also, increased life expectancy, which relates to improved fitness, is dependent on the virus-induced preferential behaviour, consequently promoting multiplication and spread of the virus in a crop stand. The development of vector-based models could further help in elucidating the role of tri-trophic interactions in agricultural disease systems. Use of the model to examine the components of the disease process could also improve our understanding of how specific epidemiological characteristics interact to cause diseases in crops. With this level of understanding we can efficiently develop more precise control strategies for the virus and the vector.