924 resultados para Multivariate statistics
Resumo:
We consider the forecasting of macroeconomic variables that are subject to revisions, using Bayesian vintage-based vector autoregressions. The prior incorporates the belief that, after the first few data releases, subsequent ones are likely to consist of revisions that are largely unpredictable. The Bayesian approach allows the joint modelling of the data revisions of more than one variable, while keeping the concomitant increase in parameter estimation uncertainty manageable. Our model provides markedly more accurate forecasts of post-revision values of inflation than do other models in the literature.
Resumo:
With a rapidly increasing fraction of electricity generation being sourced from wind, extreme wind power generation events such as prolonged periods of low (or high) generation and ramps in generation, are a growing concern for the efficient and secure operation of national power systems. As extreme events occur infrequently, long and reliable meteorological records are required to accurately estimate their characteristics. Recent publications have begun to investigate the use of global meteorological “reanalysis” data sets for power system applications, many of which focus on long-term average statistics such as monthly-mean generation. Here we demonstrate that reanalysis data can also be used to estimate the frequency of relatively short-lived extreme events (including ramping on sub-daily time scales). Verification against 328 surface observation stations across the United Kingdom suggests that near-surface wind variability over spatiotemporal scales greater than around 300 km and 6 h can be faithfully reproduced using reanalysis, with no need for costly dynamical downscaling. A case study is presented in which a state-of-the-art, 33 year reanalysis data set (MERRA, from NASA-GMAO), is used to construct an hourly time series of nationally-aggregated wind power generation in Great Britain (GB), assuming a fixed, modern distribution of wind farms. The resultant generation estimates are highly correlated with recorded data from National Grid in the recent period, both for instantaneous hourly values and for variability over time intervals greater than around 6 h. This 33 year time series is then used to quantify the frequency with which different extreme GB-wide wind power generation events occur, as well as their seasonal and inter-annual variability. Several novel insights into the nature of extreme wind power generation events are described, including (i) that the number of prolonged low or high generation events is well approximated by a Poission-like random process, and (ii) whilst in general there is large seasonal variability, the magnitude of the most extreme ramps is similar in both summer and winter. An up-to-date version of the GB case study data as well as the underlying model are freely available for download from our website: http://www.met.reading.ac.uk/~energymet/data/Cannon2014/.
Resumo:
Theory predicts the emergence of generalists in variable environments and antagonistic pleiotropy to favour specialists in constant environments, but empirical data seldom support such generalist–specialist trade-offs. We selected for generalists and specialists in the dung fly Sepsis punctum (Diptera: Sepsidae) under conditions that we predicted would reveal antagonistic pleiotropy and multivariate trade-offs underlying thermal reaction norms for juvenile development. We performed replicated laboratory evolution using four treatments: adaptation at a hot (31 °C) or a cold (15 °C) temperature, or under regimes fluctuating between these temperatures, either within or between generations. After 20 generations, we assessed parental effects and genetic responses of thermal reaction norms for three correlated life-history traits: size at maturity, juvenile growth rate and juvenile survival. We find evidence for antagonistic pleiotropy for performance at hot and cold temperatures, and a temperature-mediated trade-off between juvenile survival and size at maturity, suggesting that trade-offs associated with environmental tolerance can arise via intensified evolutionary compromises between genetically correlated traits. However, despite this antagonistic pleiotropy, we found no support for the evolution of increased thermal tolerance breadth at the expense of reduced maximal performance, suggesting low genetic variance in the generalist–specialist dimension.
Resumo:
Accurate monitoring of degradation levels in soils is essential in order to understand and achieve complete degradation of petroleum hydrocarbons in contaminated soils. We aimed to develop the use of multivariate methods for the monitoring of biodegradation of diesel in soils and to determine if diesel contaminated soils could be remediated to a chemical composition similar to that of an uncontaminated soil. An incubation experiment was set up with three contrasting soil types. Each soil was exposed to diesel at varying stages of degradation and then analysed for key hydrocarbons throughout 161 days of incubation. Hydrocarbon distributions were analysed by Principal Coordinate Analysis and similar samples grouped by cluster analysis. Variation and differences between samples were determined using permutational multivariate analysis of variance. It was found that all soils followed trajectories approaching the chemical composition of the unpolluted soil. Some contaminated soils were no longer significantly different to that of uncontaminated soil after 161 days of incubation. The use of cluster analysis allows the assignment of a percentage chemical similarity of a diesel contaminated soil to an uncontaminated soil sample. This will aid in the monitoring of hydrocarbon contaminated sites and the establishment of potential endpoints for successful remediation.
Resumo:
To improve the quantity and impact of observations used in data assimilation it is necessary to take into account the full, potentially correlated, observation error statistics. A number of methods for estimating correlated observation errors exist, but a popular method is a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. The accuracy of the results it yields is unknown as the diagnostic is sensitive to the difference between the exact background and exact observation error covariances and those that are chosen for use within the assimilation. It has often been stated in the literature that the results using this diagnostic are only valid when the background and observation error correlation length scales are well separated. Here we develop new theory relating to the diagnostic. For observations on a 1D periodic domain we are able to the show the effect of changes in the assumed error statistics used in the assimilation on the estimated observation error covariance matrix. We also provide bounds for the estimated observation error variance and eigenvalues of the estimated observation error correlation matrix. We demonstrate that it is still possible to obtain useful results from the diagnostic when the background and observation error length scales are similar. In general, our results suggest that when correlated observation errors are treated as uncorrelated in the assimilation, the diagnostic will underestimate the correlation length scale. We support our theoretical results with simple illustrative examples. These results have potential use for interpreting the derived covariances estimated using an operational system.
Resumo:
Although the sunspot-number series have existed since the mid-19th century, they are still the subject of intense debate, with the largest uncertainty being related to the "calibration" of the visual acuity of individual observers in the past. Daisy-chain regression methods are applied to inter-calibrate the observers which may lead to significant bias and error accumulation. Here we present a novel method to calibrate the visual acuity of the key observers to the reference data set of Royal Greenwich Observatory sunspot groups for the period 1900-1976, using the statistics of the active-day fraction. For each observer we independently evaluate their observational thresholds [S_S] defined such that the observer is assumed to miss all of the groups with an area smaller than S_S and report all the groups larger than S_S. Next, using a Monte-Carlo method we construct, from the reference data set, a correction matrix for each observer. The correction matrices are significantly non-linear and cannot be approximated by a linear regression or proportionality. We emphasize that corrections based on a linear proportionality between annually averaged data lead to serious biases and distortions of the data. The correction matrices are applied to the original sunspot group records for each day, and finally the composite corrected series is produced for the period since 1748. The corrected series displays secular minima around 1800 (Dalton minimum) and 1900 (Gleissberg minimum), as well as the Modern grand maximum of activity in the second half of the 20th century. The uniqueness of the grand maximum is confirmed for the last 250 years. It is shown that the adoption of a linear relationship between the data of Wolf and Wolfer results in grossly inflated group numbers in the 18th and 19th centuries in some reconstructions.
Resumo:
With the development of convection-permitting numerical weather prediction the efficient use of high resolution observations in data assimilation is becoming increasingly important. The operational assimilation of these observations, such as Dopplerradar radial winds, is now common, though to avoid violating the assumption of un- correlated observation errors the observation density is severely reduced. To improve the quantity of observations used and the impact that they have on the forecast will require the introduction of the full, potentially correlated, error statistics. In this work, observation error statistics are calculated for the Doppler radar radial winds that are assimilated into the Met Office high resolution UK model using a diagnostic that makes use of statistical averages of observation-minus-background and observation-minus-analysis residuals. This is the first in-depth study using the diagnostic to estimate both horizontal and along-beam correlated observation errors. By considering the new results obtained it is found that the Doppler radar radial wind error standard deviations are similar to those used operationally and increase as the observation height increases. Surprisingly the estimated observation error correlation length scales are longer than the operational thinning distance. They are dependent on both the height of the observation and on the distance of the observation away from the radar. Further tests show that the long correlations cannot be attributed to the use of superobservations or the background error covariance matrix used in the assimilation. The large horizontal correlation length scales are, however, in part, a result of using a simplified observation operator.
Resumo:
In this Letter, we determine the kappa-distribution function for a gas in the presence of an external field of force described by a potential U(r). In the case of a dilute gas, we show that the kappa-power law distribution including the potential energy factor term can rigorously be deduced in the framework of kinetic theory with basis on the Vlasov equation. Such a result is significant as a preliminary to the discussion on the role of long range interactions in the Kaniadakis thermostatistics and the underlying kinetic theory. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
The South American (SA) rainy season is studied in this paper through the application of a multivariate Empirical Orthogonal Function (EOF) analysis to a SA gridded precipitation analysis and to the components of Lorenz Energy Cycle (LEC) derived from the National Centers for Environmental Prediction (NCEP) reanalysis. The EOF analysis leads to the identification of patterns of the rainy season and the associated mechanisms in terms of their energetics. The first combined EOF represents the northwest-southeast dipole of the precipitation between South and Central America, the South American Monsoon System (SAMS). The second combined EOF represents a synoptic pattern associated with the SACZ (South Atlantic convergence zone) and the third EOF is in spatial quadrature to the second EOF. The phase relationship of the EOFs, as computed from the principal components (PCs), suggests a nonlinear transition from the SACZ to the fully developed SAMS mode by November and between both components describing the SACZ by September-October (the rainy season onset). According to the LEC, the first mode is dominated by the eddy generation term at its maximum, the second by both baroclinic and eddy generation terms and the third by barotropic instability previous to the connection to the second mode by September-October. The predominance of the different LEC components at each phase of the SAMS can be used as an indicator of the onset of the rainy season in terms of physical processes, while the existence of the outstanding spectral peaks in the time dependence of the EOFs at the intraseasonal time scale could be used for monitoring purposes. Copyright (C) 2009 Royal Meteorological Society
Resumo:
This paper presents a GIS-based multicriteria flood risk assessment and mapping approach applied to coastal drainage basins where hydrological data are not available. It involves risk to different types of possible processes: coastal inundation (storm surge), river, estuarine and flash flood, either at urban or natural areas, and fords. Based on the causes of these processes, several environmental indicators were taken to build-up the risk assessment. Geoindicators include geological-geomorphologic proprieties of Quaternary sedimentary units, water table, drainage basin morphometry, coastal dynamics, beach morphodynamics and microclimatic characteristics. Bioindicators involve coastal plain and low slope native vegetation categories and two alteration states. Anthropogenic indicators encompass land use categories properties such as: type, occupation density, urban structure type and occupation consolidation degree. The selected indicators were stored within an expert Geoenvironmental Information System developed for the State of Sao Paulo Coastal Zone (SIIGAL), which attributes were mathematically classified through deterministic approaches, in order to estimate natural susceptibilities (Sn), human-induced susceptibilities (Sa), return period of rain events (Ri), potential damages (Dp) and the risk classification (R), according to the equation R=(Sn.Sa.Ri).Dp. Thematic maps were automatically processed within the SIIGAL, in which automata cells (""geoenvironmental management units"") aggregating geological-geomorphologic and land use/native vegetation categories were the units of classification. The method has been applied to the Northern Littoral of the State of Sao Paulo (Brazil) in 32 small drainage basins, demonstrating to be very useful for coastal zone public politics, civil defense programs and flood management.
Resumo:
The topology of real-world complex networks, such as in transportation and communication, is always changing with time. Such changes can arise not only as a natural consequence of their growth, but also due to major modi. cations in their intrinsic organization. For instance, the network of transportation routes between cities and towns ( hence locations) of a given country undergo a major change with the progressive implementation of commercial air transportation. While the locations could be originally interconnected through highways ( paths, giving rise to geographical networks), transportation between those sites progressively shifted or was complemented by air transportation, with scale free characteristics. In the present work we introduce the path-star transformation ( in its uniform and preferential versions) as a means to model such network transformations where paths give rise to stars of connectivity. It is also shown, through optimal multivariate statistical methods (i.e. canonical projections and maximum likelihood classification) that while the US highways network adheres closely to a geographical network model, its path-star transformation yields a network whose topological properties closely resembles those of the respective airport transportation network.
Resumo:
Canalizing genes possess such broad regulatory power, and their action sweeps across a such a wide swath of processes that the full set of affected genes are not highly correlated under normal conditions. When not active, the controlling gene will not be predictable to any significant degree by its subject genes, either alone or in groups, since their behavior will be highly varied relative to the inactive controlling gene. When the controlling gene is active, its behavior is not well predicted by any one of its targets, but can be very well predicted by groups of genes under its control. To investigate this question, we introduce in this paper the concept of intrinsically multivariate predictive (IMP) genes, and present a mathematical study of IMP in the context of binary genes with respect to the coefficient of determination (CoD), which measures the predictive power of a set of genes with respect to a target gene. A set of predictor genes is said to be IMP for a target gene if all properly contained subsets of the predictor set are bad predictors of the target but the full predictor set predicts the target with great accuracy. We show that logic of prediction, predictive power, covariance between predictors, and the entropy of the joint probability distribution of the predictors jointly affect the appearance of IMP genes. In particular, we show that high-predictive power, small covariance among predictors, a large entropy of the joint probability distribution of predictors, and certain logics, such as XOR in the 2-predictor case, are factors that favor the appearance of IMP. The IMP concept is applied to characterize the behavior of the gene DUSP1, which exhibits control over a central, process-integrating signaling pathway, thereby providing preliminary evidence that IMP can be used as a criterion for discovery of canalizing genes.