373 results for Metropolis-Hastings
Abstract:
Many modern applications fall into the category of "large-scale" statistical problems, in which both the number of observations n and the number of features or parameters p may be large. Many existing methods focus on point estimation, even though uncertainty quantification remains essential in the sciences, where the number of parameters to estimate often exceeds the sample size despite the huge increases in n typically seen in many fields. The tendency in some areas of industry to dispense with traditional statistical analysis on the grounds that "n = all" is thus of little relevance outside of certain narrow applications. The main result of the Big Data revolution in most fields has instead been to make computation much harder without reducing the importance of uncertainty quantification. Bayesian methods excel at uncertainty quantification, but often scale poorly relative to alternatives. This conflict between the statistical advantages of Bayesian procedures and their substantial computational disadvantages is perhaps the greatest challenge facing modern Bayesian statistics, and is the primary motivation for the work presented here.
Two general strategies for scaling Bayesian inference are considered. The first is the development of methods that lend themselves to faster computation, and the second is design and characterization of computational algorithms that scale better in n or p. In the first instance, the focus is on joint inference outside of the standard problem of multivariate continuous data that has been a major focus of previous theoretical work in this area. In the second area, we pursue strategies for improving the speed of Markov chain Monte Carlo algorithms, and characterizing their performance in large-scale settings. Throughout, the focus is on rigorous theoretical evaluation combined with empirical demonstrations of performance and concordance with the theory.
One topic we consider is modeling the joint distribution of multivariate categorical data, often summarized in a contingency table. Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. In Chapter 2, we derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.
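A latent class (PARAFAC-type) model of the kind discussed above represents the joint probability mass function of several categorical variables as a finite mixture of product distributions, which is exactly a nonnegative low-rank tensor factorization. The following is a minimal sketch of that construction in Python; the class count, table dimensions, and randomly drawn parameters are illustrative only, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

H = 3             # number of latent classes (illustrative)
dims = [2, 3, 4]  # category counts for p = 3 categorical variables

# Mixture weights and per-class marginal pmfs, drawn at random for the sketch
lam = rng.dirichlet(np.ones(H))                          # lambda_h, sums to 1
psi = [rng.dirichlet(np.ones(d), size=H) for d in dims]  # psi[j][h] is a pmf over d_j levels

# Latent class / PARAFAC form: P(y_1,...,y_p) = sum_h lambda_h * prod_j psi^{(j)}_{h, y_j}
pmf = np.zeros(dims)
for h in range(H):
    comp = psi[0][h]
    for j in range(1, len(dims)):
        comp = np.multiply.outer(comp, psi[j][h])
    pmf += lam[h] * comp

# The result is a rank-H nonnegative tensor factorization of the joint pmf
assert pmf.shape == (2, 3, 4)
assert abs(pmf.sum() - 1.0) < 1e-12
```

The rank H here is the number of mixture components; the relationship between such nonnegative ranks and the sparsity of a corresponding log-linear model is the subject of the chapter described above.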
Latent class models for the joint distribution of multivariate categorical data, such as the PARAFAC decomposition, play an important role in the analysis of population structure. In this context, the number of latent classes is interpreted as the number of genetically distinct subpopulations of an organism, an important factor in the analysis of evolutionary processes and conservation status. Existing methods focus on point estimates of the number of subpopulations and lack robust uncertainty quantification. Moreover, whether the number of latent classes in these models is even an identified parameter is an open question. In Chapter 3, we show that when the model is properly specified, the correct number of subpopulations can be recovered almost surely. We then propose an alternative method for estimating the number of latent subpopulations that provides good quantification of uncertainty, and we provide a simple procedure for verifying that the proposed method is consistent for the number of subpopulations. The performance of the model in estimating the number of subpopulations and other common population structure inference problems is assessed in simulations and a real data application.
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis--Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. In Chapter 4 we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis--Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback-Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even in relatively small samples. The proposed approximation provides a computationally scalable and principled approach to regularized estimation and approximate Bayesian inference for log-linear models.
Another challenging and somewhat non-standard joint modeling problem is inference on tail dependence in stochastic processes. In applications where extreme dependence is of interest, data are almost always time-indexed. Existing methods for inference and modeling in this setting often cluster extreme events or choose window sizes with the goal of preserving temporal information. In Chapter 5, we propose an alternative paradigm for inference on tail dependence in stochastic processes with arbitrary temporal dependence structure in the extremes, based on the idea that the information on strength of tail dependence and the temporal structure in this dependence are both encoded in waiting times between exceedances of high thresholds. We construct a class of time-indexed stochastic processes with tail dependence obtained by endowing the support points in de Haan's spectral representation of max-stable processes with velocities and lifetimes. We extend Smith's model to these max-stable velocity processes and obtain the distribution of waiting times between extreme events at multiple locations. Motivated by this result, a new definition of tail dependence is proposed that is a function of the distribution of waiting times between threshold exceedances, and an inferential framework is constructed for estimating the strength of extremal dependence and quantifying uncertainty in this paradigm. The method is applied to climatological, financial, and electrophysiology data.
The remainder of this thesis focuses on posterior computation by Markov chain Monte Carlo (MCMC), the dominant paradigm for posterior computation in Bayesian analysis. It has long been common to control computation time by making approximations to the Markov transition kernel, but comparatively little attention has been paid to convergence and estimation error in the resulting approximating Markov chains. In Chapter 6, we propose a framework for assessing when to use approximations in MCMC algorithms, and how much error in the transition kernel should be tolerated to obtain optimal estimation performance with respect to a specified loss function and computational budget. The results require only ergodicity of the exact kernel and control of the kernel approximation accuracy. The theoretical framework is applied to approximations based on random subsets of data, low-rank approximations of Gaussian processes, and a novel approximating Markov chain for discrete mixture models.
Data augmentation Gibbs samplers are arguably the most popular class of algorithms for approximately sampling from the posterior distribution of the parameters of generalized linear models. The truncated normal and Polya-Gamma data augmentation samplers are standard examples for probit and logit links, respectively. Motivated by an important problem in quantitative advertising, in Chapter 7 we consider the application of these algorithms to modeling rare events. We show that when the sample size is large but the observed number of successes is small, these data augmentation samplers mix very slowly, with a spectral gap that converges to zero at a rate at least proportional to the reciprocal of the square root of the sample size up to a log factor. In simulation studies, moderate sample sizes result in high autocorrelations and small effective sample sizes. Similar empirical results are observed for related data augmentation samplers for multinomial logit and probit models. When applied to a real quantitative advertising dataset, the data augmentation samplers mix very poorly. Conversely, Hamiltonian Monte Carlo and a type of independence chain Metropolis algorithm show good mixing on the same dataset.
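As a point of reference for the Metropolis-type samplers contrasted with data augmentation above, a random-walk Metropolis-Hastings sampler targets a log-posterior directly, with no latent-variable augmentation. The following minimal Python sketch uses a toy standard-normal target; the target, step size, and chain length are illustrative choices, not taken from the thesis.

```python
import numpy as np

def rw_metropolis(log_target, x0, n_iter=5000, step=0.5, seed=1):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    lp = log_target(x)
    chain = np.empty(n_iter)
    accepted = 0
    for i in range(n_iter):
        prop = x + step * rng.standard_normal()
        lp_prop = log_target(prop)
        # Symmetric proposal: the acceptance ratio reduces to the target ratio
        if np.log(rng.random()) < lp_prop - lp:
            x, lp = prop, lp_prop
            accepted += 1
        chain[i] = x
    return chain, accepted / n_iter

# Toy target: standard normal log-density (up to an additive constant)
chain, acc_rate = rw_metropolis(lambda x: -0.5 * x * x, x0=0.0)
assert 0.0 < acc_rate <= 1.0
assert abs(chain.mean()) < 0.5  # rough sanity check: chain centered near 0
```

The mixing behavior studied in the chapter would show up here as high autocorrelation in `chain`; for data augmentation samplers in rare-event regimes that autocorrelation decays very slowly, whereas well-tuned Metropolis or Hamiltonian schemes avoid the problem.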
Abstract:
Thesis digitized by the Direction des bibliothèques of the Université de Montréal.
Abstract:
Care has come to dominate much feminist research on globalized migrations and the transfer of labor from the South to the North, while the older concept of reproduction, long pushed into the background, is now re-emerging in debates on the commodification of care in the household and changes in welfare state policies. This article argues that we could achieve a better understanding of the different modalities and trajectories of care in the reproduction of individuals, families, and communities, both of migrant and nonmigrant populations, by articulating the diverse circuits of migration, in particular those of labor and the family. In doing this, I go back to the earlier North American writing on racialized minorities, migrants, and stratified social reproduction. I also explore insights from current Asian studies of gendered circuits of migration connecting labor and marriage migrations, as well as the notion of global householding, which highlights the gender politics of social reproduction operating within and beyond households in institutional and welfare architectures. In contrast to Asia, European studies have paid relatively little attention to the articulation of labor and family migrations through the lens of social reproduction. Connecting the different types of migration, however, enables us to achieve a more complex understanding of care trajectories and their contribution to social reproduction.
Abstract:
The city of London was, during the years 1940-1941, a city under fire. The metropolis seemed to have two faces, like the Roman deity Janus: the face of the daylight hours, so normal and yet so deceiving in its false quietness; and at nightfall the city turned, and its face was the face of the devil himself, transforming London into a living inferno. This thesis examines the sensescapes of the Blitz through the diaries and memoirs written at the time. The primary sources consist of seven diaries, two autobiographies, and four research volumes that contain multiple diary and memoir entries, mostly from the Mass Observation Archives and the Imperial War Museum. The sensory approach is a new orientation in the field of history: it studies the five senses in their cultural contexts, interpreting the often subtle ways in which the senses affect society, politics, culture, and class hierarchies, to name but a few. The sensory history of war remains largely unexamined; this thesis contributes to this frontier field by unveiling the sensorium of the London bombings, comparing the two halves of the nychthemeron, and examining how the writers communicated the Blitz as a lived, bodily experience. The study reveals the very different sensory worlds in which Londoners lived, during a time often described in terms of a mythical solidarity thought to exist between the people. The reality of the homeless, the working class, and the poor lay in the foul-smelling tubes, Poor Law-era rations, and the smoking ruins of the East End; the contrast with the luxury hotels and restaurants of the upper classes, their opportunities for evacuation, their sheltering possibilities, and their overall comforts of life was massive.
Abstract:
The protein folding problem has been one of the most challenging subjects in biological physics due to its complexity. Energy landscape theory based on statistical mechanics provides a thermodynamic interpretation of the protein folding process. We have been working to answer fundamental questions about protein-protein and protein-water interactions, which are essential for describing the energy landscape surface of proteins correctly. First, we present a new method for computing protein-protein interaction potentials of solvated proteins directly from SAXS data. An ensemble of proteins was modeled by Metropolis Monte Carlo and molecular dynamics simulations, and the global X-ray scattering of the whole model ensemble was computed at each snapshot of the simulation. The interaction potential model was optimized iteratively by a Levenberg-Marquardt algorithm. Second, we report that terahertz spectroscopy directly probes hydration dynamics around proteins and determines the size of the dynamical hydration shell. We also present the sequence and pH dependence of the hydration shell and the effect of hydrophobicity. In addition, kinetic terahertz absorption (KITA) spectroscopy is introduced to study the refolding kinetics of ubiquitin and its mutants. KITA results are compared to small-angle X-ray scattering, tryptophan fluorescence, and circular dichroism results. We propose that KITA monitors the rearrangement of hydrogen bonding during secondary structure formation. Finally, we present the development of the automated single molecule operating system (ASMOS) for a high-throughput single molecule detector, which levitates a single protein molecule in a 10 µm diameter droplet by laser guidance. I have also performed supporting calculations and simulations with my own program codes.
Abstract:
This dissertation examines how Buenos Aires emerged as a creative capital of mass culture and cultural industries in South America during a period when Argentine theater and cinema expanded rapidly, winning over a regional marketplace swelled by transatlantic immigration, urbanization and industrialization. I argue that mass culture across the River Plate developed from a singular dynamic of exchange and competition between Buenos Aires and neighboring Montevideo. The study focuses on the Argentine, Uruguayan, and international performers, playwrights, producers, cultural impresarios, critics, and consumers who collectively built regional cultural industries. The cultural industries in this region blossomed in the interwar period as the advent of new technologies like sound film created profitable opportunities for mass cultural production and new careers for countless theater professionals. Buenos Aires also became a global cultural capital in the wider Hispanic Atlantic world, as its commercial culture served a region composed largely of immigrants and their descendants. From the 1920s through the 1940s, Montevideo maintained a subordinate but symbiotic relationship with Buenos Aires. The two cities shared interlinked cultural marketplaces that attracted performers and directors from the Atlantic world to work in theatre and film productions, especially in times of political upheaval such as the Spanish Civil War and the Perón era in Argentina. As a result of this transnational process, Argentine mass culture became widely consumed throughout South America, competing successfully with Hollywood, European, and other Latin American cinemas and helping transform Buenos Aires into a cosmopolitan metropolis. 
By examining the relationship between regional and national frames of cultural production, my dissertation contributes to the fields of Latin American studies and urban history while seeking to de-center the United States and Europe from the central framing of transnational history.
Abstract:
The question of the retornados is still a sensitive one in our society. Some of us know someone, family or friends, who had to flee the Ultramar, Portugal's overseas territories. Within a few decades, the overseas territory went from Promised Land to nightmare, with thousands of settlers forced to return to the metropole, many with only the clothes on their backs. This article reflects on the reasons why the colonization of Africa began, enumerates the main problems of effective occupation, especially at the beginning of the last century, and discusses social and economic life in the Ultramar up to the independence of the territories, with a focus on the Angolan case. It also presents first-person accounts of the flight from the colonies up to the arrival in Portugal.
Abstract:
Many exchange rate papers articulate the view that instabilities constitute a major impediment to exchange rate predictability. In this thesis we implement Bayesian and other techniques to account for such instabilities, and examine some of the main obstacles to exchange rate models' predictive ability. We first consider in Chapter 2 a time-varying parameter model in which fluctuations in exchange rates are related to short-term nominal interest rates ensuing from monetary policy rules, such as Taylor rules. Unlike existing exchange rate studies, the parameters of our Taylor rules are allowed to change over time, in light of the widespread evidence of shifts in fundamentals, for example in the aftermath of the Global Financial Crisis. Focusing on quarterly data from the crisis onward, we detect forecast improvements upon a random walk (RW) benchmark for at least half, and for as many as seven out of 10, of the currencies considered. Results are stronger when we allow the time-varying parameters of the Taylor rules to differ between countries. In Chapter 3 we look closely at the role of time variation in parameters and other sources of uncertainty in hindering exchange rate models' predictive power. We apply a Bayesian setup that incorporates the notion that the relevant set of exchange rate determinants, and their corresponding coefficients, change over time. Using statistical and economic measures of performance, we first find that predictive models which allow for sudden, rather than smooth, changes in the coefficients yield significant forecast improvements and economic gains at horizons beyond 1 month. At shorter horizons, however, our methods fail to forecast better than the RW, and we identify uncertainty in the estimation of coefficients, and uncertainty about the precise degree of coefficient variability to incorporate in the models, as the main factors obstructing predictive ability.
Chapter 4 focuses on the time-varying predictive ability of economic fundamentals for exchange rates. It uses bootstrap-based methods to uncover the time-specific conditioning information for predicting fluctuations in exchange rates. Employing several metrics for the statistical and economic evaluation of forecasting performance, we find that our approach, based on pre-selecting and validating fundamentals across bootstrap replications, generates more accurate forecasts than the RW. The approach, known as bumping, robustly reveals parsimonious models with out-of-sample predictive power at the 1-month horizon, and outperforms alternative methods, including Bayesian methods, bagging, and standard forecast combinations. Chapter 5 exploits the predictive content of daily commodity prices for monthly commodity-currency exchange rates. It builds on the idea that the effect of daily commodity price fluctuations on commodity currencies is short-lived, and therefore harder to pin down at low frequencies. Using MIxed DAta Sampling (MIDAS) models, and Bayesian estimation methods to account for time variation in predictive ability, the chapter demonstrates the usefulness of suitably exploiting such short-lived effects to improve exchange rate forecasts. It further shows that the usual low-frequency predictors, such as money supply and interest rate differentials, typically receive little support from the data at the monthly frequency, whereas MIDAS models featuring daily commodity prices receive strong support. The chapter also introduces the random walk Metropolis-Hastings technique as a new tool to estimate MIDAS regressions.
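The MIDAS idea mentioned above can be sketched with the widely used exponential Almon weighting scheme, which collapses many daily lags into a single monthly regressor through a small number of parameters. The sketch below is illustrative only: the weight parameters, lag count, and simulated daily series are hypothetical, not estimated from the chapter's data.

```python
import numpy as np

def exp_almon_weights(theta1, theta2, n_lags):
    """Exponential Almon lag weights, normalized to sum to one."""
    j = np.arange(1, n_lags + 1)
    w = np.exp(theta1 * j + theta2 * j**2)
    return w / w.sum()

def midas_aggregate(daily_x, weights):
    """Collapse the most recent len(weights) daily observations into one regressor."""
    recent = daily_x[-len(weights):][::-1]  # most recent lag first
    return float(weights @ recent)

# 22 trading days of a simulated daily commodity price change
rng = np.random.default_rng(7)
daily = rng.standard_normal(22)

w = exp_almon_weights(theta1=0.1, theta2=-0.05, n_lags=22)
x_monthly = midas_aggregate(daily, w)

assert abs(w.sum() - 1.0) < 1e-12
assert w[0] > w[-1]  # declining weights: recent days matter most
```

In a full MIDAS regression the monthly exchange rate change would be regressed on `x_monthly`, with the weight parameters estimated jointly with the slope, for example by the random walk Metropolis-Hastings scheme the chapter introduces.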
Abstract:
Background: West Nile virus (WNV) infection is an arbovirus infection with high morbidity and mortality; the vector responsible for both human and animal transmission is the Culex pipiens complex. Objective: To determine the species distribution and seasonal abundance of Culex pipiens and Culex quinquefasciatus mosquitoes in Abeokuta, Nigeria. Methods: Mosquitoes belonging to the Culex pipiens complex were captured at three different locations within the Abeokuta metropolis between March 2012 and January 2013. Individual species were identified using morphometric methods, and amplification of the Ace2 gene by PCR confirmed the morphometric identification. Results: A total of 751 mosquitoes were captured. Culex quinquefasciatus recorded the highest distribution of vectors with 56.6% and Culex pipiens 43.4% (P > 0.05). The Idi aba community recorded the highest distribution of mosquito vectors with 42.9% (n=322), with Culex quinquefasciatus more abundantly distributed at 183 mosquitoes. The Aro community recorded 32% (n=240) of captured mosquitoes, with Culex quinquefasciatus again more abundant, and Kemta recorded a distribution of 25.1% (n=189). Conclusion: Results from this study show that potential vectors of WNV abound within Abeokuta, putting residents at high risk of West Nile infection. We advocate the introduction of routine testing for WNV in Abeokuta and Nigeria.
Abstract:
Doctoral dissertation, Universidade de Brasília, Instituto de Letras, Departamento de Teoria Literária e Literaturas, Programa de Pós-Graduação em Literatura, 2016.
Abstract:
Doctoral dissertation, Universidade de Brasília, Centro de Desenvolvimento Sustentável, 2015.
Abstract:
Siegfried Kracauer (1889-1966) was trained as an engineer-architect and worked as a journalist. He was a tireless critical observer of the surface of reality, convinced that only by observing surface phenomena could one truly grasp the reality of an epoch. The aim of this thesis is to capture the relationship between form and the critique of reality through essays, newspaper articles, book and film reviews, biographies, and autobiographies. Within this work, certain images and works have been singled out which, in our view, make it possible to grasp the sense of Kracauer's deciphering of modernity: light as an ambiguous figure of phantasmagoria and the metropolis; myth and capitalist rationalization; the figure of Ginster, a clearly autobiographical literary character charged with describing the tensions in the passage from the experience of the First World War to the modern world of improvisation and the loss of boundaries; and, finally, Jacques Offenbach and the operetta, Kracauer's historical excursion in search of a social biography of the city of Paris as an archaeology of modernity, developing a parallel between the epoch of Napoleon III's Second Empire and the advent of Nazism. A chapter is devoted to each of these moments, seeking to trace the continuities and discontinuities of Kracauer's thought with respect to the philosophical and methodological legacies of György Lukács, Georg Simmel, and Karl Marx. Attention is also given to the testimonies of his relationships and confrontations, at times harsh, with his colleagues and friends, starting with the complicated ones with Benjamin and Adorno, which restore the image of an original and complex thinker in search of, and paradoxically on the threshold of, a way of thinking the irruption of mass culture and the subjugating power of images.
Abstract:
Established isotropic tomographic models show the features of subduction zones in terms of seismic velocity anomalies, but they are generally prone to artifacts due to the lack of anisotropy in the forward modelling. There is evidence for a significant influence of seismic anisotropy in the mid-upper mantle, especially for boundary layers such as subducting slabs. As a consequence, artifacts in isotropic models may be misinterpreted as compositional or thermal heterogeneities. In this thesis project, the application of a trans-dimensional Metropolis-Hastings method is investigated in the context of anisotropic seismic tomography. This choice arises as a response to important limitations of traditional inversion methods, which rely on iterative optimization of an objective function. On the basis of a first implementation of the Bayesian sampling algorithm, the code is tested on some two-dimensional Cartesian models, and then extended to polar coordinates and to dimensions typical of subduction zones, the main intended application of the method. Synthetic experiments of increasing complexity are carried out to test the performance of the method and the precautions required in different contexts, also taking into account the possibility of applying seismic ray tracing iteratively. The code developed is tested mainly for 2D inversions; future extensions will allow the anisotropic inversion of seismological data to provide more realistic imaging of real subduction zones, less prone to the generation of artifacts.