982 results for Probability Distribution
Abstract:
Advanced forecasting of space weather requires simulation of the whole Sun-to-Earth system, which necessitates driving magnetospheric models with the outputs from solar wind models. This presents a fundamental difficulty, as the magnetosphere is sensitive to both large-scale solar wind structures, which can be captured by solar wind models, and small-scale solar wind “noise,” which is far below typical solar wind model resolution and results primarily from stochastic processes. Following similar approaches in terrestrial climate modeling, we propose statistical “downscaling” of solar wind model results prior to their use as input to a magnetospheric model. As magnetospheric response can be highly nonlinear, this is preferable to downscaling the results of magnetospheric modeling. To demonstrate the benefit of this approach, we first approximate solar wind model output by smoothing solar wind observations with an 8 h filter, then add small-scale structure back in through the addition of random noise with the observed spectral characteristics. Here we use a very simple parameterization of noise based upon the observed probability distribution functions of solar wind parameters, but more sophisticated methods will be developed in the future. An ensemble of results from the simple downscaling scheme is tested using a model-independent method and shown to add value to the magnetospheric forecast, both improving the best estimate and quantifying the uncertainty. We suggest a number of features desirable in an operational solar wind downscaling scheme.
Abstract:
Large waves pose risks to ships, offshore structures, coastal infrastructure and ecosystems. This paper analyses 10 years of in-situ measurements of significant wave height (Hs) and maximum wave height (Hmax) from the ocean weather ship Polarfront in the Norwegian Sea. During the period 2000 to 2009, surface elevation was recorded every 0.59 s during sampling periods of 30 min. The Hmax observations scale linearly with Hs on average. A widely-used empirical Weibull distribution is found to estimate average values of Hmax/Hs and Hmax better than a Rayleigh distribution, but tends to underestimate both for all but the smallest waves. In this paper we propose a modified Rayleigh distribution which compensates for the heterogeneity of the observed dataset: the distribution is fitted to the whole dataset and improves the estimate of the largest waves. Over the 10-year period, the Weibull distribution approximates the observed Hs and Hmax well, and an exponential function can be used to predict the probability distribution function of the ratio Hmax/Hs. However, the Weibull distribution tends to underestimate the occurrence of extremely large values of Hs and Hmax. The persistence of Hs and Hmax in winter is also examined. Wave fields with Hs>12 m and Hmax>16 m do not last longer than 3 h. Low-to-moderate wave heights that persist for more than 12 h dominate the relationship of the wave field with the winter NAO index over 2000–2009. In contrast, the inter-annual variability of wave fields with Hs>5.5 m or Hmax>8.5 m and wave fields persisting over ~2.5 days is not associated with the winter NAO index.
Abstract:
Wind generation's contribution to supporting peak electricity demand is one of the key questions in wind integration studies. Unlike conventional units, the available outputs of different wind farms cannot be approximated as statistically independent, and hence near-zero wind output is possible across an entire power system. This paper will review the risk model structures currently used to assess wind's capacity value, along with discussion of the resulting data requirements. A central theme is the benefit of performing statistical estimation of the joint distribution for demand and available wind capacity, focusing attention on uncertainties due to limited histories of wind and demand data; examination of Great Britain data from the last 25 years shows that the data requirements are greater than generally thought. A discussion is therefore presented of how analysis of the types of weather system which have historically driven extreme electricity demands can help to deliver robust insights into wind's contribution to supporting demand, even in the face of such data limitations. The role of the form of the probability distribution for available conventional capacity in driving wind capacity credit results is also discussed.
Abstract:
This paper describes the methodology of providing multiprobability predictions for proteomic mass spectrometry data. The methodology is based on a newly developed machine learning framework called Venn machines. It outputs a valid probability interval. The methodology is designed for mass spectrometry data. For demonstrative purposes, we applied this methodology to MALDI-TOF data sets in order to predict the diagnosis of heart disease and early diagnoses of ovarian cancer and breast cancer. The experiments showed that probability intervals are narrow, that is, the output of the multiprobability predictor is similar to a single probability distribution. In addition, probability intervals produced for heart disease and ovarian cancer data were more accurate than the output of the corresponding probability predictor. When Venn machines were forced to make point predictions, the accuracy of such predictions was, for most of the data, better than the accuracy of the underlying algorithm that outputs a single probability distribution of a label. Application of this methodology to MALDI-TOF data sets empirically demonstrates its validity. The accuracy of the proposed method on ovarian cancer data rises from 66.7 % 11 months in advance of the moment of diagnosis to up to 90.2 % at the moment of diagnosis. The same approach has been applied to heart disease data without time dependency, although the achieved accuracy was not as high (up to 69.9 %). The methodology allowed us to confirm mass spectrometry peaks previously identified as carrying statistically significant information for discrimination between controls and cases.
Abstract:
The co-polar correlation coefficient (ρhv) has many applications, including hydrometeor classification, ground clutter and melting layer identification, interpretation of ice microphysics and the retrieval of rain drop size distributions (DSDs). However, we currently lack the quantitative error estimates that are necessary if these applications are to be fully exploited. Previous error estimates of ρhv rely on knowledge of the unknown "true" ρhv and implicitly assume a Gaussian probability distribution function of ρhv samples. We show that frequency distributions of ρhv estimates are in fact highly negatively skewed. A new variable: L = -log10(1 - ρhv) is defined, which does have Gaussian error statistics, and a standard deviation depending only on the number of independent radar pulses. This is verified using observations of spherical drizzle drops, allowing, for the first time, the construction of rigorous confidence intervals in estimates of ρhv. In addition, we demonstrate how the imperfect co-location of the horizontal and vertical polarisation sample volumes may be accounted for. The possibility of using L to estimate the dispersion parameter (µ) in the gamma drop size distribution is investigated. We find that including drop oscillations is essential for this application, otherwise there could be biases in retrieved µ of up to ~8. Preliminary results in rainfall are presented. In a convective rain case study, our estimates show µ to be substantially larger than 0 (an exponential DSD). In this particular rain event, rain rate would be overestimated by up to 50% if a simple exponential DSD is assumed.
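The transformation described in this abstract can be illustrated with a minimal Python sketch. It maps ρhv to L = -log10(1 - ρhv), builds a symmetric Gaussian interval in L, and maps the bounds back to ρhv. The abstract states that the standard deviation of L depends only on the number of independent radar pulses but does not give the formula, so `sigma_L` is a placeholder input here, and the function names are hypothetical.

```python
import math


def rho_to_L(rho_hv):
    """Transform a co-polar correlation coefficient to L = -log10(1 - rho_hv)."""
    if not 0.0 <= rho_hv < 1.0:
        raise ValueError("rho_hv must lie in [0, 1)")
    return -math.log10(1.0 - rho_hv)


def L_to_rho(L):
    """Inverse transform: rho_hv = 1 - 10**(-L)."""
    return 1.0 - 10.0 ** (-L)


def rho_confidence_interval(rho_hv, sigma_L, z=1.96):
    """Interval for rho_hv from a symmetric Gaussian interval in L.

    sigma_L is an assumed input (standard deviation of L); z = 1.96
    corresponds to a two-sided 95% Gaussian interval.
    """
    L = rho_to_L(rho_hv)
    return L_to_rho(L - z * sigma_L), L_to_rho(L + z * sigma_L)
```

Because the interval is symmetric in L, it is asymmetric in ρhv, with the longer side toward lower values, which is consistent with the negative skew reported for ρhv estimates.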
Abstract:
Data from 58 strong-lensing events surveyed by the Sloan Lens ACS Survey are used to estimate the projected galaxy mass inside their Einstein radii by two independent methods: stellar dynamics and strong gravitational lensing. We perform a joint analysis of these two estimates inside models with up to three degrees of freedom with respect to the lens density profile, stellar velocity anisotropy, and line-of-sight (LOS) external convergence, which incorporates the effect of the large-scale structure on strong lensing. A Bayesian analysis is employed to estimate the model parameters, evaluate their significance, and compare models. We find that the data favor Jaffe's light profile over Hernquist's, but that any particular choice between these two does not change the qualitative conclusions with respect to the features of the system that we investigate. The density profile is compatible with an isothermal profile, being slightly steeper and having an uncertainty in the logarithmic slope of the order of 5% in models that take into account a prior ignorance of anisotropy and external convergence. We identify a considerable degeneracy between the density profile slope and the anisotropy parameter, which largely increases the uncertainties in the estimates of these parameters, but we encounter no evidence in favor of an anisotropic velocity distribution on average for the whole sample. An LOS external convergence following a prior probability distribution given by cosmology has a small effect on the estimation of the lens density profile, but can increase the dispersion of its value by nearly 40%.
Abstract:
In this paper, we present a study on a deterministic partially self-avoiding walk (tourist walk), which provides a novel method for texture feature extraction. The method is able to explore an image on all scales simultaneously. Experiments were conducted using different dynamics concerning the tourist walk. A new strategy, based on histograms, to extract information from its joint probability distribution is presented. The promising results are discussed and compared to the best-known methods for texture description reported in the literature. © 2009 Elsevier Ltd. All rights reserved.
Abstract:
We consider bipartitions of one-dimensional extended systems whose probability distribution functions describe stationary states of stochastic models. We define estimators of the information shared between the two subsystems. If the correlation length is finite, the estimators stay finite for large system sizes. If the correlation length diverges, so do the estimators. The definition of the estimators is inspired by information theory. We look at several models and compare the behaviors of the estimators in the finite-size scaling limit. Analytical and numerical methods as well as Monte Carlo simulations are used. We show how the finite-size scaling functions change for various phase transitions, including the case where one has conformal invariance.
Abstract:
Canalizing genes possess such broad regulatory power, and their action sweeps across such a wide swath of processes, that the full set of affected genes is not highly correlated under normal conditions. When not active, the controlling gene will not be predictable to any significant degree by its subject genes, either alone or in groups, since their behavior will be highly varied relative to the inactive controlling gene. When the controlling gene is active, its behavior is not well predicted by any one of its targets, but can be very well predicted by groups of genes under its control. To investigate this question, we introduce in this paper the concept of intrinsically multivariate predictive (IMP) genes, and present a mathematical study of IMP in the context of binary genes with respect to the coefficient of determination (CoD), which measures the predictive power of a set of genes with respect to a target gene. A set of predictor genes is said to be IMP for a target gene if all properly contained subsets of the predictor set are bad predictors of the target but the full predictor set predicts the target with great accuracy. We show that logic of prediction, predictive power, covariance between predictors, and the entropy of the joint probability distribution of the predictors jointly affect the appearance of IMP genes. In particular, we show that high predictive power, small covariance among predictors, a large entropy of the joint probability distribution of predictors, and certain logics, such as XOR in the 2-predictor case, are factors that favor the appearance of IMP. The IMP concept is applied to characterize the behavior of the gene DUSP1, which exhibits control over a central, process-integrating signaling pathway, thereby providing preliminary evidence that IMP can be used as a criterion for discovery of canalizing genes.
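The XOR case mentioned in this abstract can be checked with a small Python sketch. It assumes the usual definition of the coefficient of determination for binary variables, CoD = (ε0 − ε) / ε0, where ε0 is the error of the best constant guess of the target and ε is the error of the optimal predictor given the chosen genes. The function names and the dictionary representation of the joint distribution are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def best_error(joint, predictor_idx):
    """Error of the optimal predictor of the binary target (last coordinate).

    joint maps tuples (x_1, ..., x_n, y) of 0/1 values to probabilities;
    predictor_idx selects which x coordinates the predictor may observe.
    """
    table = defaultdict(lambda: [0.0, 0.0])
    for state, p in joint.items():
        key = tuple(state[i] for i in predictor_idx)
        table[key][state[-1]] += p
    # For each observed pattern the optimal rule picks the likelier target
    # value, so the error mass is the smaller of the two probabilities.
    return sum(min(p0, p1) for p0, p1 in table.values())


def cod(joint, predictor_idx):
    """Coefficient of determination of the target given the chosen predictors."""
    eps0 = best_error(joint, ())
    if eps0 == 0.0:
        raise ValueError("target is constant; CoD is undefined")
    return (eps0 - best_error(joint, predictor_idx)) / eps0


# Two independent, uniform binary predictors and target y = x1 XOR x2.
xor_joint = {(x1, x2, x1 ^ x2): 0.25 for x1 in (0, 1) for x2 in (0, 1)}
```

With this joint distribution, `cod(xor_joint, (0,))` and `cod(xor_joint, (1,))` are both 0, while `cod(xor_joint, (0, 1))` is 1: each gene alone is uninformative, but the pair predicts the target perfectly, which is the IMP pattern.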
Abstract:
We apply the concept of exchangeable random variables to the case of non-additive probability distributions exhibiting uncertainty aversion, and in the class generated by a convex core (convex non-additive probabilities, with a convex core). We are able to prove two versions of the law of large numbers (de Finetti's theorems). By making use of two definitions of independence we prove two versions of the strong law of large numbers. It turns out that we cannot assure the convergence of the sample averages to a constant. We then model the case where there is a "true" probability distribution behind the successive realizations of the uncertain random variable. In this case convergence occurs. This result is important because it renders true the intuition that it is possible "to learn" the "true" additive distribution behind an uncertain event if one repeatedly observes it (a sufficiently large number of times). We also provide a conjecture regarding the "learning" (or updating) process above, and prove a partial result for the case of the Dempster-Shafer updating rule and binomial trials.
Abstract:
In this paper we prove convergence to chaotic sunspot equilibrium through two learning rules used in the bounded rationality literature. The first one shows the convergence of the actual dynamics generated by simple adaptive learning rules to a probability distribution that is close to the stationary measure of the sunspot equilibrium; since this stationary measure is absolutely continuous, it results in a robust convergence to the stochastic equilibrium. The second one is based on the E-stability criterion for testing stability of rational expectations equilibrium; we show that the conditional probability distribution defined by the sunspot equilibrium is expectationally stable under a reasonable updating rule of this parameter. We also report some numerical simulations of the processes proposed.
Abstract:
This work explores an important concept developed by Breeden & Litzenberger to extract the information contained in interest rate options in the Brazilian market (IDI options, Opção Sobre IDI), traded on the São Paulo Securities, Commodities and Futures Exchange (BM&FBOVESPA), in the days before and after the COPOM decision on the Selic rate. The method consists of determining the probability distribution from the prices of IDI options, after computing the implied volatility surface, using two techniques widely used in the market: cubic spline interpolation and the Black (1976) model. The first four moments of the distribution are analyzed: expected value, variance, skewness and kurtosis, as well as their respective changes.
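The density-extraction step rests on the Breeden & Litzenberger result that the risk-neutral density is the forward-valued second derivative of the call price with respect to strike, f(K) = e^{rT} ∂²C/∂K². The following is a minimal Python sketch under stated assumptions: strikes are uniformly spaced, and the call prices are synthetic, generated from the Black (1976) formula with a flat volatility. All names and parameter values are illustrative and not taken from the work, which first fits a volatility surface to IDI option quotes.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black76_call(F, K, sigma, T, r):
    """Black (1976) price of a European call on a forward F with strike K."""
    d1 = (math.log(F / K) + 0.5 * sigma ** 2 * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return math.exp(-r * T) * (F * norm_cdf(d1) - K * norm_cdf(d2))


def implied_density(strikes, calls, T, r):
    """Risk-neutral density at the interior strikes.

    Uses a central second difference of call prices, which assumes the
    strikes are uniformly spaced, scaled by exp(r * T).
    """
    h = strikes[1] - strikes[0]
    growth = math.exp(r * T)
    return [
        growth * (calls[i - 1] - 2.0 * calls[i] + calls[i + 1]) / h ** 2
        for i in range(1, len(strikes) - 1)
    ]
```

Given such a density on a strike grid, the four moments named in the abstract follow from weighted sums over the grid. With real quotes, the second difference amplifies price noise, which is one reason a smoothed volatility surface is interpolated before differentiating.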
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Knowledge of the spatial distribution model of pests in a crop is fundamental for establishing an adequate sequential sampling plan and thus allowing the correct use of control strategies and the optimization of sampling techniques. This research aimed to study the spatial distribution of Alabama argillacea (Hübner) caterpillars in cotton, cultivar CNPA ITA-90. Data were collected during the 1998/99 growing season at Fazenda Itamarati Sul S.A., located in the municipality of Ponta Porã, MS, in three different areas of 10,000 m² each. Each sampling area consisted of 100 plots of 100 m² each. The small, medium and large caterpillars found on five plants per plot were counted weekly. The aggregation indices (variance/mean ratio and Morisita's index) and the chi-square test of the fit of the observed and expected values to the theoretical frequency distributions (Poisson, positive binomial and negative binomial) showed that all caterpillar stages are distributed according to the contagious distribution model, fitting the negative binomial distribution pattern throughout the infestation period.
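The two aggregation indices named in this abstract can be computed from per-plot counts with a short Python sketch. It assumes the standard definitions: the variance/mean ratio uses the sample variance, and Morisita's index is I = n Σ x(x − 1) / (N (N − 1)), with n plots and N total individuals. Values above 1 suggest an aggregated (contagious) pattern, values near 1 a random one, and values below 1 a regular one. The function names and the counts in the test are made up for illustration and are not data from the study.

```python
import statistics


def variance_mean_ratio(counts):
    """Sample variance divided by the mean of the per-plot counts."""
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("all counts are zero; the ratio is undefined")
    return statistics.variance(counts) / mean


def morisita_index(counts):
    """Morisita's index of dispersion for per-plot counts."""
    n = len(counts)
    total = sum(counts)
    if total <= 1:
        raise ValueError("need at least two individuals in total")
    return n * sum(x * (x - 1) for x in counts) / (total * (total - 1))
```

In practice the indices are screening tools; the abstract's conclusion about the negative binomial pattern comes from the chi-square goodness-of-fit test against the candidate distributions.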