42 resultados para Data sets storage


Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we develop a method, termed the Interaction Distribution (ID) method, for analysis of quantitative ecological network data. In many cases, quantitative network data sets are under-sampled, i.e. many interactions are poorly sampled or remain unobserved. Hence, the output of statistical analyses may fail to differentiate between patterns that are statistical artefacts and those which are real characteristics of ecological networks. The ID method can support assessment and inference of under-sampled ecological network data. In the current paper, we illustrate and discuss the ID method based on the properties of plant-animal pollination data sets of flower visitation frequencies. However, the ID method may be applied to other types of ecological networks. The method can supplement existing network analyses based on two definitions of the underlying probabilities for each combination of pollinator and plant species: (1), pi,j: the probability for a visit made by the i’th pollinator species to take place on the j’th plant species; (2), qi,j: the probability for a visit received by the j’th plant species to be made by the i’th pollinator. The method applies the Dirichlet distribution to estimate these two probabilities, based on a given empirical data set. The estimated mean values for pi,j and qi,j reflect the relative differences between recorded numbers of visits for different pollinator and plant species, and the estimated uncertainty of pi,j and qi,j decreases with higher numbers of recorded visits.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We propose a new class of neurofuzzy construction algorithms with the aim of maximizing generalization capability specifically for imbalanced data classification problems based on leave-one-out (LOO) cross validation. The algorithms are in two stages, first an initial rule base is constructed based on estimating the Gaussian mixture model with analysis of variance decomposition from input data; the second stage carries out the joint weighted least squares parameter estimation and rule selection using orthogonal forward subspace selection (OFSS)procedure. We show how different LOO based rule selection criteria can be incorporated with OFSS, and advocate either maximizing the leave-one-out area under curve of the receiver operating characteristics, or maximizing the leave-one-out Fmeasure if the data sets exhibit imbalanced class distribution. Extensive comparative simulations illustrate the effectiveness of the proposed algorithms.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Within the SPARC Data Initiative, the first comprehensive assessment of the quality of 13 water vapor products from 11 limb-viewing satellite instruments (LIMS, SAGE II, UARS-MLS, HALOE, POAM III, SMR, SAGE III, MIPAS, SCIAMACHY, ACE-FTS, and Aura-MLS) obtained within the time period 1978-2010 has been performed. Each instrument's water vapor profile measurements were compiled into monthly zonal mean time series on a common latitude-pressure grid. These time series serve as basis for the "climatological" validation approach used within the project. The evaluations include comparisons of monthly or annual zonal mean cross sections and seasonal cycles in the tropical and extratropical upper troposphere and lower stratosphere averaged over one or more years, comparisons of interannual variability, and a study of the time evolution of physical features in water vapor such as the tropical tape recorder and polar vortex dehydration. Our knowledge of the atmospheric mean state in water vapor is best in the lower and middle stratosphere of the tropics and midlatitudes, with a relative uncertainty of. 2-6% (as quantified by the standard deviation of the instruments' multiannual means). The uncertainty increases toward the polar regions (+/- 10-15%), the mesosphere (+/- 15%), and the upper troposphere/lower stratosphere below 100 hPa (+/- 30-50%), where sampling issues add uncertainty due to large gradients and high natural variability in water vapor. The minimum found in multiannual (1998-2008) mean water vapor in the tropical lower stratosphere is 3.5 ppmv (+/- 14%), with slightly larger uncertainties for monthly mean values. The frequently used HALOE water vapor data set shows consistently lower values than most other data sets throughout the atmosphere, with increasing deviations from the multi-instrument mean below 100 hPa in both the tropics and extratropics. The knowledge gained from these comparisons and regarding the quality of the individual data sets in different regions of the atmosphere will help to improve model-measurement comparisons (e.g., for diagnostics such as the tropical tape recorder or seasonal cycles), data merging activities, and studies of climate variability.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A comprehensive quality assessment of the ozone products from 18 limb-viewing satellite instruments is provided by means of a detailed intercomparison. The ozone climatologies in form of monthly zonal mean time series covering the upper troposphere to lower mesosphere are obtained from LIMS, SAGE I/II/III, UARS-MLS, HALOE, POAM II/III, SMR, OSIRIS, MIPAS, GOMOS, SCIAMACHY, ACE-FTS, ACE-MAESTRO, Aura-MLS, HIRDLS, and SMILES within 1978–2010. The intercomparisons focus on mean biases of annual zonal mean fields, interannual variability, and seasonal cycles. Additionally, the physical consistency of the data is tested through diagnostics of the quasi-biennial oscillation and Antarctic ozone hole. The comprehensive evaluations reveal that the uncertainty in our knowledge of the atmospheric ozone mean state is smallest in the tropical and midlatitude middle stratosphere with a 1σ multi-instrument spread of less than ±5%. While the overall agreement among the climatological data sets is very good for large parts of the stratosphere, individual discrepancies have been identified, including unrealistic month-to-month fluctuations, large biases in particular atmospheric regions, or inconsistencies in the seasonal cycle. Notable differences between the data sets exist in the tropical lower stratosphere (with a spread of ±30%) and at high latitudes (±15%). In particular, large relative differences are identified in the Antarctic during the time of the ozone hole, with a spread between the monthly zonal mean fields of ±50%. The evaluations provide guidance on what data sets are the most reliable for applications such as studies of ozone variability, model-measurement comparisons, detection of long-term trends, and data-merging activities.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We present the first comprehensive intercomparison of currently available satellite ozone climatologies in the upper troposphere/lower stratosphere (UTLS) (300–70 hPa) as part of the Stratosphere-troposphere Processes and their Role in Climate (SPARC) Data Initiative. The Tropospheric Emission Spectrometer (TES) instrument is the only nadir-viewing instrument in this initiative, as well as the only instrument with a focus on tropospheric composition. We apply the TES observational operator to ozone climatologies from the more highly vertically resolved limb-viewing instruments. This minimizes the impact of differences in vertical resolution among the instruments and allows identification of systematic differences in the large-scale structure and variability of UTLS ozone. We find that the climatologies from most of the limb-viewing instruments show positive differences (ranging from 5 to 75%) with respect to TES in the tropical UTLS, and comparison to a “zonal mean” ozonesonde climatology indicates that these differences likely represent a positive bias for p ≤ 100 hPa. In the extratropics, there is good agreement among the climatologies regarding the timing and magnitude of the ozone seasonal cycle (differences in the peak-to-peak amplitude of <15%) when the TES observational operator is applied, as well as very consistent midlatitude interannual variability. The discrepancies in ozone temporal variability are larger in the tropics, with differences between the data sets of up to 55% in the seasonal cycle amplitude. However, the differences among the climatologies are everywhere much smaller than the range produced by current chemistry-climate models, indicating that the multiple-instrument ensemble is useful for quantitatively evaluating these models.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Stratospheric water vapour is a powerful greenhouse gas. The longest available record from balloon observations over Boulder, Colorado, USA shows increases in stratospheric water vapour concentrations that cannot be fully explained by observed changes in the main drivers, tropical tropopause temperatures and methane. Satellite observations could help resolve the issue, but constructing a reliable long-term data record from individual short satellite records is challenging. Here we present an approach to merge satellite data sets with the help of a chemistry–climate model nudged to observed meteorology. We use the models’ water vapour as a transfer function between data sets that overcomes issues arising from instrument drift and short overlap periods. In the lower stratosphere, our water vapour record extends back to 1988 and water vapour concentrations largely follow tropical tropopause temperatures. Lower and mid-stratospheric long-term trends are negative, and the trends from Boulder are shown not to be globally representative. In the upper stratosphere, our record extends back to 1986 and shows positive long-term trends. The altitudinal differences in the trends are explained by methane oxidation together with a strengthened lower-stratospheric and a weakened upper stratospheric circulation inferred by this analysis. Our results call into question previous estimates of surface radiative forcing based on presumed global long-term increases in water vapour concentrations in the lower stratosphere.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper presents a novel approach to the automatic classification of very large data sets composed of terahertz pulse transient signals, highlighting their potential use in biochemical, biomedical, pharmaceutical and security applications. Two different types of THz spectra are considered in the classification process. Firstly a binary classification study of poly-A and poly-C ribonucleic acid samples is performed. This is then contrasted with a difficult multi-class classification problem of spectra from six different powder samples that although have fairly indistinguishable features in the optical spectrum, they also possess a few discernable spectral features in the terahertz part of the spectrum. Classification is performed using a complex-valued extreme learning machine algorithm that takes into account features in both the amplitude as well as the phase of the recorded spectra. Classification speed and accuracy are contrasted with that achieved using a support vector machine classifier. The study systematically compares the classifier performance achieved after adopting different Gaussian kernels when separating amplitude and phase signatures. The two signatures are presented as feature vectors for both training and testing purposes. The study confirms the utility of complex-valued extreme learning machine algorithms for classification of the very large data sets generated with current terahertz imaging spectrometers. The classifier can take into consideration heterogeneous layers within an object as would be required within a tomographic setting and is sufficiently robust to detect patterns hidden inside noisy terahertz data sets. The proposed study opens up the opportunity for the establishment of complex-valued extreme learning machine algorithms as new chemometric tools that will assist the wider proliferation of terahertz sensing technology for chemical sensing, quality control, security screening and clinic diagnosis. Furthermore, the proposed algorithm should also be very useful in other applications requiring the classification of very large datasets.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

With a rapidly increasing fraction of electricity generation being sourced from wind, extreme wind power generation events such as prolonged periods of low (or high) generation and ramps in generation, are a growing concern for the efficient and secure operation of national power systems. As extreme events occur infrequently, long and reliable meteorological records are required to accurately estimate their characteristics. Recent publications have begun to investigate the use of global meteorological “reanalysis” data sets for power system applications, many of which focus on long-term average statistics such as monthly-mean generation. Here we demonstrate that reanalysis data can also be used to estimate the frequency of relatively short-lived extreme events (including ramping on sub-daily time scales). Verification against 328 surface observation stations across the United Kingdom suggests that near-surface wind variability over spatiotemporal scales greater than around 300 km and 6 h can be faithfully reproduced using reanalysis, with no need for costly dynamical downscaling. A case study is presented in which a state-of-the-art, 33 year reanalysis data set (MERRA, from NASA-GMAO), is used to construct an hourly time series of nationally-aggregated wind power generation in Great Britain (GB), assuming a fixed, modern distribution of wind farms. The resultant generation estimates are highly correlated with recorded data from National Grid in the recent period, both for instantaneous hourly values and for variability over time intervals greater than around 6 h. This 33 year time series is then used to quantify the frequency with which different extreme GB-wide wind power generation events occur, as well as their seasonal and inter-annual variability. Several novel insights into the nature of extreme wind power generation events are described, including (i) that the number of prolonged low or high generation events is well approximated by a Poission-like random process, and (ii) whilst in general there is large seasonal variability, the magnitude of the most extreme ramps is similar in both summer and winter. An up-to-date version of the GB case study data as well as the underlying model are freely available for download from our website: http://www.met.reading.ac.uk/~energymet/data/Cannon2014/.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Catastrophe risk models used by the insurance industry are likely subject to significant uncertainty, but due to their proprietary nature and strict licensing conditions they are not available for experimentation. In addition, even if such experiments were conducted, these would not be repeatable by other researchers because commercial confidentiality issues prevent the details of proprietary catastrophe model structures from being described in public domain documents. However, such experimentation is urgently required to improve decision making in both insurance and reinsurance markets. In this paper we therefore construct our own catastrophe risk model for flooding in Dublin, Ireland, in order to assess the impact of typical precipitation data uncertainty on loss predictions. As we consider only a city region rather than a whole territory and have access to detailed data and computing resources typically unavailable to industry modellers, our model is significantly more detailed than most commercial products. The model consists of four components, a stochastic rainfall module, a hydrological and hydraulic flood hazard module, a vulnerability module, and a financial loss module. Using these we undertake a series of simulations to test the impact of driving the stochastic event generator with four different rainfall data sets: ground gauge data, gauge-corrected rainfall radar, meteorological reanalysis data (European Centre for Medium-Range Weather Forecasts Reanalysis-Interim; ERA-Interim) and a satellite rainfall product (The Climate Prediction Center morphing method; CMORPH). Catastrophe models are unusual because they use the upper three components of the modelling chain to generate a large synthetic database of unobserved and severe loss-driving events for which estimated losses are calculated. We find the loss estimates to be more sensitive to uncertainties propagated from the driving precipitation data sets than to other uncertainties in the hazard and vulnerability modules, suggesting that the range of uncertainty within catastrophe model structures may be greater than commonly believed.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

As we enter an era of ‘big data’, asset information is becoming a deliverable of complex projects. Prior research suggests digital technologies enable rapid, flexible forms of project organizing. This research analyses practices of managing change in Airbus, CERN and Crossrail, through desk-based review, interviews, visits and a cross-case workshop. These organizations deliver complex projects, rely on digital technologies to manage large data-sets; and use configuration management, a systems engineering approach with mid-20th century origins, to establish and maintain integrity. In them, configuration management has become more, rather than less, important. Asset information is structured, with change managed through digital systems, using relatively hierarchical, asynchronous and sequential processes. The paper contributes by uncovering limits to flexibility in complex projects where integrity is important. Challenges of managing change are discussed, considering the evolving nature of configuration management; potential use of analytics on complex projects; and implications for research and practice.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Geotechnical systems, such as landfills, mine tailings storage facilities (TSFs), slopes, and levees, are required to perform safely throughout their service life, which can span from decades for levees to “in perpetuity” for TSFs. The conventional design practice by geotechnical engineers for these systems utilizes the as-built material properties to predict its performance throughout the required service life. The implicit assumption in this design methodology is that the soil properties are stable through time. This is counter to long-term field observations of these systems, particularly where ecological processes such as plant, animal, biological, and geochemical activity are present. Plant roots can densify soil and/or increase hydraulic conductivity, burrowing animals can increase seepage, biological activity can strengthen soil, geochemical processes can increase stiffness, etc. The engineering soil properties naturally change as a stable ecological system is gradually established following initial construction, and these changes alter system performance. This paper presents an integrated perspective and new approach to this issue, considering ecological, geotechnical, and mining demands and constraints. A series of data sets and case histories are utilized to examine these issues and to propose a more integrated design approach, and consideration is given to future opportunities to manage engineered landscapes as ecological systems. We conclude that soil scientists and restoration ecologists must be engaged in initial project design and geotechnical engineers must be active in long-term management during the facility’s service life. For near-surface geotechnical structures in particular, this requires an interdisciplinary perspective and the embracing of soil as a living ecological system rather than an inert construction material.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A quality assessment of the CFC-11 (CCl3F), CFC-12 (CCl2F2), HF, and SF6 products from limb-viewing satellite instruments is provided by means of a detailed intercomparison. The climatologies in the form of monthly zonal mean time series are obtained from HALOE, MIPAS, ACE-FTS, and HIRDLS within the time period 1991–2010. The intercomparisons focus on the mean biases of the monthly and annual zonal mean fields and aim to identify their vertical, latitudinal and temporal structure. The CFC evaluations (based on MIPAS, ACE-FTS and HIRDLS) reveal that the uncertainty in our knowledge of the atmospheric CFC-11 and CFC-12 mean state, as given by satellite data sets, is smallest in the tropics and mid-latitudes at altitudes below 50 and 20 hPa, respectively, with a 1σ multi-instrument spread of up to ±5 %. For HF, the situation is reversed. The two available data sets (HALOE and ACE-FTS) agree well above 100 hPa, with a spread in this region of ±5 to ±10 %, while at altitudes below 100 hPa the HF annual mean state is less well known, with a spread ±30 % and larger. The atmospheric SF6 annual mean states derived from two satellite data sets (MIPAS and ACE-FTS) show only very small differences with a spread of less than ±5 % and often below ±2.5 %. While the overall agreement among the climatological data sets is very good for large parts of the upper troposphere and lower stratosphere (CFCs, SF6) or middle stratosphere (HF), individual discrepancies have been identified. Pronounced deviations between the instrument climatologies exist for particular atmospheric regions which differ from gas to gas. Notable features are differently shaped isopleths in the subtropics, deviations in the vertical gradients in the lower stratosphere and in the meridional gradients in the upper troposphere, and inconsistencies in the seasonal cycle. Additionally, long-term drifts between the instruments have been identified for the CFC-11 and CFC-12 time series. The evaluations as a whole provide guidance on what data sets are the most reliable for applications such as studies of atmospheric transport and variability, model–measurement comparisons and detection of long-term trends. The data sets will be publicly available from the SPARC Data Centre and through PANGAEA (doi:10.1594/PANGAEA.849223).