68 resultados para Multivariate statistics
Resumo:
The article presents an essay that deals with the study conducted by Donald MacKenzie and the case studies comparing the use of population statistics in France and Great Britain in the periods of 1825 and 1885. It analyzes Donald MacKenzie's study on the ways professional and political commitments informed the choice of statistical indexes in the British statistical community. Furthermore, the author is interested in knowing how this influenced the development of mathematical statistics in Great Britain. The author concludes that the differences in the debates over population statistics are accounted to the differences in the social and epistemological logics of population statistics.
Resumo:
An evaluation is undertaken of the statistics of daily precipitation as simulated by five regional climate models using comprehensive observations in the region of the European Alps. Four limited area models and one variable-resolution global model are considered, all with a grid spacing of 50 km. The 15-year integrations were forced from reanalyses and observed sea surface temperature and sea ice (global model from sea surface only). The observational reference is based on 6400 rain gauge records (10–50 stations per grid box). Evaluation statistics encompass mean precipitation, wet-day frequency, precipitation intensity, and quantiles of the frequency distribution. For mean precipitation, the models reproduce the characteristics of the annual cycle and the spatial distribution. The domain mean bias varies between −23% and +3% in winter and between −27% and −5% in summer. Larger errors are found for other statistics. In summer, all models underestimate precipitation intensity (by 16–42%) and there is a too low frequency of heavy events. This bias reflects too dry summer mean conditions in three of the models, while it is partly compensated by too many low-intensity events in the other two models. Similar intermodel differences are found for other European subregions. Interestingly, the model errors are very similar between the two models with the same dynamical core (but different parameterizations) and they differ considerably between the two models with similar parameterizations (but different dynamics). Despite considerable biases, the models reproduce prominent mesoscale features of heavy precipitation, which is a promising result for their use in climate change downscaling over complex topography.
Resumo:
The probability of a quantum particle being detected in a given solid angle is determined by the S-matrix. The explanation of this fact in time-dependent scattering theory is often linked to the quantum flux, since the quantum flux integrated against a (detector-) surface and over a time interval can be viewed as the probability that the particle crosses this surface within the given time interval. Regarding many particle scattering, however, this argument is no longer valid, as each particle arrives at the detector at its own random time. While various treatments of this problem can be envisaged, here we present a straightforward Bohmian analysis of many particle potential scattering from which the S-matrix probability emerges in the limit of large distances.
Resumo:
Synoptic climatology relates the atmospheric circulation with the surface environment. The aim of this study is to examine the variability of the surface meteorological patterns, which are developing under different synoptic scale categories over a suburban area with complex topography. Multivariate Data Analysis techniques were performed to a data set with surface meteorological elements. Three principal components related to the thermodynamic status of the surface environment and the two components of the wind speed were found. The variability of the surface flows was related with atmospheric circulation categories by applying Correspondence Analysis. Similar surface thermodynamic fields develop under cyclonic categories, which are contrasted with the anti-cyclonic category. A strong, steady wind flow characterized by high shear values develops under the cyclonic Closed Low and the anticyclonic H–L categories, in contrast to the variable weak flow under the anticyclonic Open Anticyclone category.
Resumo:
Cross-bred cow adoption is an important and potent policy variable precipitating subsistence household entry into emerging milk markets. This paper focuses on the problem of designing policies that encourage and sustain milkmarket expansion among a sample of subsistence households in the Ethiopian highlands. In this context it is desirable to measure households’ ‘proximity’ to market in terms of the level of deficiency of essential inputs. This problem is compounded by four factors. One is the existence of cross-bred cow numbers (count data) as an important, endogenous decision by the household; second is the lack of a multivariate generalization of the Poisson regression model; third is the censored nature of the milk sales data (sales from non-participating households are, essentially, censored at zero); and fourth is an important simultaneity that exists between the decision to adopt a cross-bred cow, the decision about how much milk to produce, the decision about how much milk to consume and the decision to market that milk which is produced but not consumed internally by the household. Routine application of Gibbs sampling and data augmentation overcome these problems in a relatively straightforward manner. We model the count data from two sites close to Addis Ababa in a latent, categorical-variable setting with known bin boundaries. The single-equation model is then extended to a multivariate system that accommodates the covariance between crossbred-cow adoption, milk-output, and milk-sales equations. The latent-variable procedure proves tractable in extension to the multivariate setting and provides important information for policy formation in emerging-market settings
Conditioning model output statistics of regional climate model precipitation on circulation patterns
Resumo:
Dynamical downscaling of Global Climate Models (GCMs) through regional climate models (RCMs) potentially improves the usability of the output for hydrological impact studies. However, a further downscaling or interpolation of precipitation from RCMs is often needed to match the precipitation characteristics at the local scale. This study analysed three Model Output Statistics (MOS) techniques to adjust RCM precipitation; (1) a simple direct method (DM), (2) quantile-quantile mapping (QM) and (3) a distribution-based scaling (DBS) approach. The modelled precipitation was daily means from 16 RCMs driven by ERA40 reanalysis data over the 1961–2000 provided by the ENSEMBLES (ENSEMBLE-based Predictions of Climate Changes and their Impacts) project over a small catchment located in the Midlands, UK. All methods were conditioned on the entire time series, separate months and using an objective classification of Lamb's weather types. The performance of the MOS techniques were assessed regarding temporal and spatial characteristics of the precipitation fields, as well as modelled runoff using the HBV rainfall-runoff model. The results indicate that the DBS conditioned on classification patterns performed better than the other methods, however an ensemble approach in terms of both climate models and downscaling methods is recommended to account for uncertainties in the MOS methods.
Resumo:
We discuss the modeling of dielectric responses of electromagnetically excited networks which are composed of a mixture of capacitors and resistors. Such networks can be employed as lumped-parameter circuits to model the response of composite materials containing conductive and insulating grains. The dynamics of the excited network systems are studied using a state space model derived from a randomized incidence matrix. Time and frequency domain responses from synthetic data sets generated from state space models are analyzed for the purpose of estimating the fraction of capacitors in the network. Good results were obtained by using either the time-domain response to a pulse excitation or impedance data at selected frequencies. A chemometric framework based on a Successive Projections Algorithm (SPA) enables the construction of multiple linear regression (MLR) models which can efficiently determine the ratio of conductive to insulating components in composite material samples. The proposed method avoids restrictions commonly associated with Archie’s law, the application of percolation theory or Kohlrausch-Williams-Watts models and is applicable to experimental results generated by either time domain transient spectrometers or continuous-wave instruments. Furthermore, it is quite generic and applicable to tomography, acoustics as well as other spectroscopies such as nuclear magnetic resonance, electron paramagnetic resonance and, therefore, should be of general interest across the dielectrics community.
Resumo:
Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data with summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. Although these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and we then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics that can result in substantially more accurate ABC analyses than the ad hoc choices of summary statistics that have been proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.
Resumo:
We consider methods of evaluating multivariate density forecasts. A recently proposed method is found to lack power when the correlation structure is mis-specified. Tests that have good power to detect mis-specifications of this sort are described. We also consider the properties of the tests in the presence of more general mis-specifications.
Resumo:
This note caveats standard statistics which accompany chess endgame tables, EGTs. It refers to Nalimov's double-counting of pawnless positions with both Kings on a long diagonal, and to the inclusion of positions which are not reachable from the initial position.
Resumo:
Sensory thresholds are often collected through ascending forced-choice methods. Group thresholds are important for comparing stimuli or populations; yet, the method has two problems. An individual may correctly guess the correct answer at any concentration step and might detect correctly at low concentrations but become adapted or fatigued at higher concentrations. The survival analysis method deals with both issues. Individual sequences of incorrect and correct answers are adjusted, taking into account the group performance at each concentration. The technique reduces the chance probability where there are consecutive correct answers. Adjusted sequences are submitted to survival analysis to determine group thresholds. The technique was applied to an aroma threshold and a taste threshold study. It resulted in group thresholds similar to ASTM or logarithmic regression procedures. Significant differences in taste thresholds between younger and older adults were determined. The approach provides a more robust technique over previous estimation methods.
Resumo:
For certain observing types, such as those that are remotely sensed, the observation errors are correlated and these correlations are state- and time-dependent. In this work, we develop a method for diagnosing and incorporating spatially correlated and time-dependent observation error in an ensemble data assimilation system. The method combines an ensemble transform Kalman filter with a method that uses statistical averages of background and analysis innovations to provide an estimate of the observation error covariance matrix. To evaluate the performance of the method, we perform identical twin experiments using the Lorenz ’96 and Kuramoto-Sivashinsky models. Using our approach, a good approximation to the true observation error covariance can be recovered in cases where the initial estimate of the error covariance is incorrect. Spatial observation error covariances where the length scale of the true covariance changes slowly in time can also be captured. We find that using the estimated correlated observation error in the assimilation improves the analysis.
Resumo:
We consider the forecasting of macroeconomic variables that are subject to revisions, using Bayesian vintage-based vector autoregressions. The prior incorporates the belief that, after the first few data releases, subsequent ones are likely to consist of revisions that are largely unpredictable. The Bayesian approach allows the joint modelling of the data revisions of more than one variable, while keeping the concomitant increase in parameter estimation uncertainty manageable. Our model provides markedly more accurate forecasts of post-revision values of inflation than do other models in the literature.
Resumo:
With a rapidly increasing fraction of electricity generation being sourced from wind, extreme wind power generation events such as prolonged periods of low (or high) generation and ramps in generation, are a growing concern for the efficient and secure operation of national power systems. As extreme events occur infrequently, long and reliable meteorological records are required to accurately estimate their characteristics. Recent publications have begun to investigate the use of global meteorological “reanalysis” data sets for power system applications, many of which focus on long-term average statistics such as monthly-mean generation. Here we demonstrate that reanalysis data can also be used to estimate the frequency of relatively short-lived extreme events (including ramping on sub-daily time scales). Verification against 328 surface observation stations across the United Kingdom suggests that near-surface wind variability over spatiotemporal scales greater than around 300 km and 6 h can be faithfully reproduced using reanalysis, with no need for costly dynamical downscaling. A case study is presented in which a state-of-the-art, 33 year reanalysis data set (MERRA, from NASA-GMAO), is used to construct an hourly time series of nationally-aggregated wind power generation in Great Britain (GB), assuming a fixed, modern distribution of wind farms. The resultant generation estimates are highly correlated with recorded data from National Grid in the recent period, both for instantaneous hourly values and for variability over time intervals greater than around 6 h. This 33 year time series is then used to quantify the frequency with which different extreme GB-wide wind power generation events occur, as well as their seasonal and inter-annual variability. Several novel insights into the nature of extreme wind power generation events are described, including (i) that the number of prolonged low or high generation events is well approximated by a Poission-like random process, and (ii) whilst in general there is large seasonal variability, the magnitude of the most extreme ramps is similar in both summer and winter. An up-to-date version of the GB case study data as well as the underlying model are freely available for download from our website: http://www.met.reading.ac.uk/~energymet/data/Cannon2014/.