989 results for 230203 Statistical Theory
Abstract:
Outcome-dependent, two-phase sampling designs can dramatically reduce the costs of observational studies by judicious selection of the most informative subjects for purposes of detailed covariate measurement. Here we derive asymptotic information bounds and the form of the efficient score and influence functions for the semiparametric regression models studied by Lawless, Kalbfleisch, and Wild (1999) under two-phase sampling designs. We show that the maximum likelihood estimators for both the parametric and nonparametric parts of the model are asymptotically normal and efficient. The efficient influence function for the parametric part agrees with the more general information bound calculations of Robins, Hsieh, and Newey (1995). By verifying the conditions of Murphy and Van der Vaart (2000) for a least favorable parametric submodel, we provide asymptotic justification for statistical inference based on profile likelihood.
Abstract:
High density oligonucleotide expression arrays are a widely used tool for the measurement of gene expression on a large scale. Affymetrix GeneChip arrays appear to dominate this market. These arrays use short oligonucleotides to probe for genes in an RNA sample. Due to optical noise, non-specific hybridization, probe-specific effects, and measurement error, ad hoc measures of expression that summarize probe intensities can lead to imprecise and inaccurate results. Various researchers have demonstrated that expression measures based on simple statistical models can provide great improvements over the ad hoc procedure offered by Affymetrix. Recently, physical models based on molecular hybridization theory have been proposed as useful tools for predicting, for example, non-specific hybridization. These physical models show great potential in terms of improving existing expression measures. In this paper we demonstrate that the system producing the measured intensities is too complex to be fully described with these relatively simple physical models, and we propose empirically motivated stochastic models that complement the above-mentioned molecular hybridization theory to provide a comprehensive description of the data. We discuss how the proposed model can be used to obtain improved measures of expression useful to data analysts.
Abstract:
In the setting of high-dimensional linear models with Gaussian noise, we investigate the possibility of confidence statements connected to model selection. Although there exist numerous procedures for adaptive (point) estimation, the construction of adaptive confidence regions is severely limited (cf. Li in Ann Stat 17:1001–1008, 1989). The present paper sheds new light on this gap. We develop exact and adaptive confidence regions for the best approximating model in terms of risk. One of our constructions is based on a multiscale procedure and a particular coupling argument. Utilizing exponential inequalities for noncentral χ²-distributions, we show that the risk and quadratic loss of all models within our confidence region are uniformly bounded by the minimal risk times a factor close to one.
Abstract:
Barry Saltzman was a giant in the fields of meteorology and climate science. A leading figure in the study of weather and climate for over 40 yr, he has frequently been referred to as the "father of modern climate theory." Ahead of his time in many ways, Saltzman made significant contributions to our understanding of the general circulation and spectral energetics budget of the atmosphere, as well as climate change across a wide spectrum of time scales. In his endeavor to develop a unified theory of how the climate system works, he played a role in the development of energy balance models, statistical dynamical models, and paleoclimate dynamical models. He was a pioneer in developing meteorologically motivated dynamical systems, including the progenitor of Lorenz's famous chaos model. In applying his own dynamical-systems approach to long-term climate change, he recognized the potential for using atmospheric general circulation models in a complementary way. In 1998, he was awarded the Carl-Gustaf Rossby medal, the highest honor of the American Meteorological Society "for his life-long contributions to the study of the global circulation and the evolution of the earth's climate." In this paper, the authors summarize and place into perspective some of the most significant contributions that Barry Saltzman made during his long and distinguished career. This short review also serves as an introduction to the papers in this special issue of the Journal of Climate dedicated to Barry's memory.
Abstract:
In recent years, the econometrics literature has shown a growing interest in the study of partially identified models, in which the object of economic and statistical interest is a set rather than a point. The characterization of this set and the development of consistent estimators and inference procedures for it with desirable properties are the main goals of partial identification analysis. This review introduces the fundamental tools of the theory of random sets, which brings together elements of topology, convex geometry, and probability theory to develop a coherent mathematical framework to analyze random elements whose realizations are sets. It then elucidates how these tools have been fruitfully applied in econometrics to reach the goals of partial identification analysis.
Abstract:
Coalescent theory represents the most significant progress in theoretical population genetics in the past three decades. The coalescent theory states that all genes or alleles in a given population are ultimately inherited from a single ancestor shared by all members of the population, known as the most recent common ancestor. It is now widely recognized as a cornerstone for rigorous statistical analyses of molecular data from populations [1]. Researchers have developed a large number of coalescent models and methods [2,3,4,5,6], which are applied not only in coalescent analysis but also in today’s population genetics and genome studies, and even in public health. This thesis aims to complete a computer-based statistical framework for coalescent analysis. The framework provides a large number of coalescent models and statistical methods to assist students and researchers in coalescent analysis, with results presented in various formats such as text, graphics, and printed pages. In particular, it also supports the creation of new coalescent models and statistical methods.
Abstract:
In this paper we present a tool to carry out the multifractal analysis of binary, two-dimensional images through the calculation of the Rényi D(q) dimensions and associated statistical regressions. The estimation of a (mono)fractal dimension corresponds to the special case where the moment order is q = 0.
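As a rough illustration of the kind of calculation described above, the sketch below estimates D(q) for a binary image by box counting followed by a linear regression in log-log space. It is not the tool presented in the paper: the function names (`box_probabilities`, `renyi_dimension`), the choice of box sizes, and the use of `numpy.polyfit` for the regression are all assumptions made for this example.

```python
import numpy as np

def box_probabilities(image, size):
    """Fraction of foreground pixels falling in each non-empty size x size box."""
    h, w = image.shape
    h_trim, w_trim = h - h % size, w - w % size  # drop edge pixels that do not fill a box
    blocks = image[:h_trim, :w_trim].reshape(h_trim // size, size, w_trim // size, size)
    mass = blocks.sum(axis=(1, 3)).astype(float)
    mass = mass[mass > 0]
    return mass / mass.sum()

def renyi_dimension(image, q, sizes=(1, 2, 4, 8, 16)):
    """Estimate D(q) as the slope of a least-squares fit against log(box size)."""
    log_eps, values = [], []
    for size in sizes:
        p = box_probabilities(image, size)
        if q == 1:
            # Information dimension: limit of the general formula as q -> 1.
            y = np.sum(p * np.log(p))
        else:
            y = np.log(np.sum(p ** q)) / (q - 1)
        log_eps.append(np.log(size))
        values.append(y)
    slope, _intercept = np.polyfit(log_eps, values, 1)
    return slope

# Sanity checks on shapes with known dimensions (hypothetical test images).
filled = np.ones((64, 64), dtype=int)   # a filled square should give D(q) close to 2
line = np.zeros((64, 64), dtype=int)
line[0, :] = 1                          # a straight line should give D(0) close to 1
d0_filled = renyi_dimension(filled, 0)
d0_line = renyi_dimension(line, 0)
```

With q = 0 the sum counts non-empty boxes, so the estimate reduces to the ordinary box-counting (monofractal) dimension mentioned in the abstract.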
Abstract:
The SMS (Simultaneous Multiple Surfaces) design method originated in Nonimaging Optics applications and is now also being applied to Imaging Optics. In this paper the wave aberration function of a selected SMS design is studied. We find that the SMS aberrations can be analyzed with a small set of parameters, sometimes only two. The connection of this model with the conventional aberration expansion is also presented. To verify this mathematical model, two SMS design systems were raytraced and the data were analyzed with classical statistical methods: the plot of discrepancies and the quadratic average error. Both tests show very good agreement with the model for our systems.
Abstract:
Alzheimer's disease (AD) is the most common cause of dementia. Over the last few years, a considerable effort has been devoted to exploring new biomarkers. Nevertheless, a better understanding of brain dynamics is still required to optimize therapeutic strategies. In this regard, the characterization of mild cognitive impairment (MCI) is crucial, due to the high conversion rate from MCI to AD. However, only a few studies have focused on the analysis of magnetoencephalographic (MEG) rhythms to characterize AD and MCI. In this study, we assess the ability of several parameters derived from information theory to describe spontaneous MEG activity from 36 AD patients, 18 MCI subjects and 26 controls. Three entropies (Shannon, Tsallis and Rényi entropies), one disequilibrium measure (based on Euclidean distance ED) and three statistical complexities (based on Lopez Ruiz–Mancini–Calbet complexity LMC) were used to estimate the irregularity and statistical complexity of MEG activity. Statistically significant differences between AD patients and controls were obtained with all parameters (p < 0.01). In addition, statistically significant differences between MCI subjects and controls were achieved by ED and LMC (p < 0.05). In order to assess the diagnostic ability of the parameters, a linear discriminant analysis with a leave-one-out cross-validation procedure was applied. The accuracies reached 83.9% and 65.9% to discriminate AD and MCI subjects from controls, respectively. Our findings suggest that MCI subjects exhibit an intermediate pattern of abnormalities between normal aging and AD. Furthermore, the proposed parameters provide a new description of brain dynamics in AD and MCI.
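The abstract does not give formulas, so the following is only a sketch of the textbook definitions of the quantities it names, applied to a generic discrete probability vector `p`. How the authors obtain `p` from MEG recordings, how they normalize each measure, and which entropic index `q` they use are not stated here; the function names, the default `q = 2`, and the unnormalized forms below are assumptions.

```python
import numpy as np

def shannon(p):
    """Shannon entropy in nats; zero-probability bins contribute nothing."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log(nz)))

def tsallis(p, q=2.0):
    """Tsallis entropy of order q (q != 1)."""
    p = np.asarray(p, dtype=float)
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

def renyi(p, q=2.0):
    """Renyi entropy of order q (q != 1)."""
    p = np.asarray(p, dtype=float)
    return float(np.log(np.sum(p ** q)) / (1.0 - q))

def disequilibrium(p):
    """Squared Euclidean distance between p and the uniform distribution."""
    p = np.asarray(p, dtype=float)
    return float(np.sum((p - 1.0 / p.size) ** 2))

def lmc_complexity(p):
    """LMC-style statistical complexity: entropy times disequilibrium."""
    return shannon(p) * disequilibrium(p)

uniform = np.full(4, 0.25)                 # maximal entropy, zero disequilibrium
peaked = np.array([1.0, 0.0, 0.0, 0.0])    # zero entropy, maximal disequilibrium
```

Both extremes give zero complexity under this product form, which is the property that distinguishes a statistical complexity from a plain irregularity measure.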
Abstract:
In this work, we show how number theoretical problems can be fruitfully approached with the tools of statistical physics. We focus on g-Sidon sets, which describe sequences of integers whose pairwise sums are different, and propose a random decision problem which addresses the probability of a random set of k integers to be g-Sidon. First, we provide numerical evidence showing that there is a crossover between satisfiable and unsatisfiable phases which converts to an abrupt phase transition in a properly defined thermodynamic limit. Initially assuming independence, we then develop a mean-field theory for the g-Sidon decision problem. We further improve the mean-field theory, which is only qualitatively correct, by incorporating deviations from independence, yielding results in good quantitative agreement with the numerics for both finite systems and in the thermodynamic limit. Connections between the generalized birthday problem in probability theory, the number theory of Sidon sets and the properties of q-Potts models in condensed matter physics are briefly discussed.
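The random decision problem can be explored numerically with a few lines of code. The sketch below is an assumption-laden illustration, not the authors' procedure: it takes "g-Sidon" to mean that no integer can be written as a sum a + b (with a ≤ b, both in the set) in more than g ways, so that g = 1 recovers the case of all pairwise sums being different, and it draws the k integers uniformly without replacement from {1, ..., m}. The function names and the sampling range are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations_with_replacement

def is_g_sidon(values, g=1):
    """True if every sum a + b (a <= b, both in values) occurs at most g times."""
    ordered = sorted(set(values))
    counts = Counter(a + b for a, b in combinations_with_replacement(ordered, 2))
    return all(c <= g for c in counts.values())

def estimate_probability(k, m, g=1, trials=2000, seed=0):
    """Monte Carlo estimate of P(a random k-subset of {1..m} is g-Sidon)."""
    rng = random.Random(seed)
    hits = sum(
        is_g_sidon(rng.sample(range(1, m + 1), k), g) for _ in range(trials)
    )
    return hits / trials

# Estimated satisfiability for growing k at fixed m (illustrative parameters).
curve = {k: estimate_probability(k, m=200, trials=500) for k in (2, 5, 10, 20)}
```

Plotting such a curve for several values of m is one simple way to look for the crossover between satisfiable and unsatisfiable regimes that the abstract describes.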
Abstract:
Submitted. Acknowledgments: T. B. acknowledges the financial support from SERB, Department of Science and Technology (DST), India [Project Grant No.: SB/FTP/PS-005/2013]. D. G. acknowledges DST, India, for providing support through the INSPIRE fellowship. J. K. acknowledges Government of the Russian Federation (Agreement No. 14.Z50.31.0033 with Institute of Applied Physics RAS).
Abstract:
Over four hundred years ago, Sir Walter Raleigh asked his mathematical assistant to find formulas for the number of cannonballs in regularly stacked piles. These investigations aroused the curiosity of the astronomer Johannes Kepler and led to a problem that has gone centuries without a solution: why is the familiar cannonball stack the most efficient arrangement possible? Here we discuss the solution that Hales found in 1998. Almost every part of the 282-page proof relies on long computer verifications. Random matrix theory was developed by physicists to describe the spectra of complex nuclei. In particular, the statistical fluctuations of the eigenvalues (“the energy levels”) follow certain universal laws based on symmetry types. We describe these and then discuss the remarkable appearance of these laws for zeros of the Riemann zeta function (which is the generating function for prime numbers and is the last special function from the last century that is not understood today). Explaining this phenomenon is a central problem. These topics are distinct, so we present them separately with their own introductory remarks.
Abstract:
A molecular model of poorly understood hydrophobic effects is heuristically developed using the methods of information theory. Because primitive hydrophobic effects can be tied to the probability of observing a molecular-sized cavity in the solvent, the probability distribution of the number of solvent centers in a cavity volume is modeled on the basis of the two moments available from the density and radial distribution of oxygen atoms in liquid water. The modeled distribution then yields the probability that no solvent centers are found in the cavity volume. This model is shown to account quantitatively for the central hydrophobic phenomena of cavity formation and association of inert gas solutes. The connection of information theory to statistical thermodynamics provides a basis for clarification of hydrophobic effects. The simplicity and flexibility of the approach suggest that it should permit applications to conformational equilibria of nonpolar solutes and hydrophobic residues in biopolymers.
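One way to read the modeling step above is as a maximum-entropy fit: among all distributions over the occupancy number n = 0, 1, 2, ..., choose the one of the form p(n) ∝ exp(λ₁n + λ₂n²) whose mean and variance match the two known moments, then read off p(0). The sketch below implements that reading with a Newton iteration. It is an illustration under stated assumptions, not the authors' code: the flat reference distribution, the truncation at `n_max`, the function name, and the example moment values are all hypothetical.

```python
import numpy as np

def maxent_occupancy(mean, variance, n_max=40, tol=1e-10, max_iter=200):
    """Maximum-entropy p(n) on n = 0..n_max with prescribed mean and variance."""
    n = np.arange(n_max + 1, dtype=float)
    feats = np.vstack([n, n ** 2])                    # constrained features: n and n^2
    target = np.array([mean, variance + mean ** 2])   # first and second raw moments
    lam = np.array([mean / variance, -0.5 / variance])  # Gaussian-like starting guess

    def dist(lam):
        logits = lam @ feats
        logits = logits - logits.max()                # guard against overflow
        w = np.exp(logits)
        return w / w.sum()

    for _ in range(max_iter):
        p = dist(lam)
        moments = feats @ p
        grad = moments - target                       # mismatch in the two moments
        if np.max(np.abs(grad)) < tol:
            break
        centered = feats - moments[:, None]
        hess = (centered * p) @ centered.T            # covariance of the features
        lam = lam - np.linalg.solve(hess, grad)       # Newton step on the multipliers
    return dist(lam)

# Hypothetical moments for a small cavity (not taken from the paper).
p = maxent_occupancy(mean=3.0, variance=1.5)
p0 = p[0]                       # probability of finding the cavity empty
work_in_kT = -np.log(p0)        # cavity formation cost in units of kT, if one
                                # adopts the usual relation to the empty-cavity probability
```

In this reading, the empty-cavity probability p(0) is the quantity the abstract ties to primitive hydrophobic effects; the final line converts it to a free energy in units of kT, which is an added interpretive step rather than something stated in the text above.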