39 results for data sets

in CentAUR: Central Archive, University of Reading - UK


Relevance: 100.00%

Abstract:

The common GIS-based approach to regional analyses of soil organic carbon (SOC) stocks and changes is to define geographic layers for which unique sets of driving variables are derived, including land use, climate, and soils. These GIS layers, with their associated attribute data, can then be fed into a range of empirical and dynamic models. Common methodologies for collating and formatting regional data sets on land use, climate, and soils were adopted for the project Assessment of Soil Organic Carbon Stocks and Changes at National Scale (GEFSOC). This permitted the development of a uniform protocol for handling the various inputs for the dynamic GEFSOC Modelling System. Consistent soil data sets for Amazon-Brazil, the Indo-Gangetic Plains (IGP) of India, Jordan and Kenya, the case study areas considered in the GEFSOC project, were prepared using methodologies developed for the World Soils and Terrain Database (SOTER). The approach involved three main stages: (1) compiling new soil geographic and attribute data in SOTER format; (2) using expert estimates and common sense to fill selected gaps in the measured or primary data; (3) using a scheme of taxonomy-based pedotransfer rules and expert rules to derive soil parameter estimates for similar soil units with missing soil analytical data. The most appropriate approach varied from country to country, depending largely on the overall accessibility and quality of the primary soil data available in the case study areas. The secondary SOTER data sets discussed here are appropriate for a wide range of environmental applications at national scale. These include agro-ecological zoning, land evaluation, modelling of soil C stocks and changes, and studies of soil vulnerability to pollution. Estimates of national-scale stocks of SOC, calculated using SOTER methods, are presented as a first example of database application. Independent estimates of SOC stocks are needed to evaluate the outcome of the GEFSOC Modelling System for current conditions of land use and climate. (C) 2007 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

Many kernel classifier construction algorithms adopt classification accuracy as the performance metric in model evaluation. Moreover, equal weighting is often applied to each data sample in parameter estimation. These modeling practices often become problematic if the data sets are imbalanced. We present a kernel classifier construction algorithm using orthogonal forward selection (OFS) to optimize model generalization for imbalanced two-class data sets. The algorithm is based on a new regularized orthogonal weighted least squares (ROWLS) estimator and a model selection criterion of maximal leave-one-out area under the curve (LOO-AUC) of the receiver operating characteristic (ROC). It is shown that, owing to the orthogonalization procedure, the LOO-AUC can be calculated via an analytic formula based on the new ROWLS parameter estimator, without actually splitting the estimation data set. The proposed algorithm achieves minimal computational expense via a set of forward recursive updating formulae when searching model terms with maximal incremental LOO-AUC value. Numerical examples are used to demonstrate the efficacy of the algorithm.
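The AUC criterion that the selection procedure above maximizes can be illustrated with a plain computation. This is a generic rank-based AUC, not the paper's analytic LOO-AUC formula; the scores and labels are invented for illustration:

```python
def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity:
    the probability that a randomly chosen positive sample scores higher
    than a randomly chosen negative one, with ties counting half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1 for p in pos for n in neg if p > n)
    ties = sum(1 for p in pos for n in neg if p == n)
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Perfectly separated scores give the maximal AUC.
print(auc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # -> 1.0
```

Unlike accuracy, this criterion is insensitive to the class ratio, which is why it suits imbalanced two-class problems.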

Relevance: 100.00%

Abstract:

Sub-seasonal variability, including equatorial waves, significantly influences the dehydration and transport processes in the tropical tropopause layer (TTL). This study investigates the wave activity in the TTL in 7 reanalysis data sets (RAs: NCEP1, NCEP2, ERA40, ERA-Interim, JRA25, MERRA, and CFSR) and 4 chemistry climate models (CCMs: CCSRNIES, CMAM, MRI, and WACCM) using the zonal wave number-frequency spectral analysis method with equatorially symmetric-antisymmetric decomposition. Analyses are made for temperature and horizontal winds at 100 hPa in the RAs and CCMs and for outgoing longwave radiation (OLR), a proxy for the convective activity that generates tropopause-level disturbances, in satellite data and the CCMs. Particular focus is placed on equatorial Kelvin waves, mixed Rossby-gravity (MRG) waves, and the Madden-Julian Oscillation (MJO). The wave activity is defined as the variance, i.e., the power spectral density integrated over a particular zonal wave number-frequency region. It is found that the TTL wave activities differ significantly among the RAs, ranging from ∼0.7 times (for NCEP1 and NCEP2) to ∼1.4 times (for ERA-Interim, MERRA, and CFSR) the average of the RAs. The TTL wave activities in the CCMs lie generally within the range of those in the RAs, with a few exceptions. However, the spectral features in OLR for all the CCMs are very different from those in the observations, and the OLR wave activities are too low for CCSRNIES, CMAM, and MRI. It is concluded that the broad range of wave activity found in the different RAs decreases our confidence in their validity, and in particular their value for validating CCM performance in the TTL, thereby limiting our quantitative understanding of the dehydration and transport processes in the TTL.
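The zonal wave number-frequency spectral analysis used above can be sketched with a two-dimensional FFT of a synthetic longitude-time field. The grid and wave parameters below are invented for illustration; a real analysis would also apply windowing and the symmetric-antisymmetric decomposition:

```python
import numpy as np

# Synthetic longitude-time field containing a single eastward-propagating
# wave: zonal wavenumber 1 with a 20-day period (Kelvin-wave-like),
# sampled daily on a 2.5-degree longitude grid.
nlon, ntime = 144, 360
lon = np.linspace(0.0, 2.0 * np.pi, nlon, endpoint=False)
t = np.arange(ntime)
field = np.cos(lon[None, :] - 2.0 * np.pi * t[:, None] / 20.0)  # (time, lon)

# Zonal wavenumber-frequency power spectrum via a 2-D FFT.
power = np.abs(np.fft.fft2(field)) ** 2 / (nlon * ntime)

# The spectral peak sits at zonal wavenumber 1 and frequency 1/20
# cycles per day (the real cosine splits into two conjugate FFT bins).
freq_idx, wn_idx = np.unravel_index(np.argmax(power), power.shape)
print(freq_idx, wn_idx)
```

Integrating `power` over a chosen wavenumber-frequency box then gives the "wave activity" variance the abstract refers to.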

Relevance: 100.00%

Abstract:

Variability in the strength of the stratospheric Lagrangian mean meridional (Brewer-Dobson) circulation and in horizontal mixing into the tropics over the past three decades is examined using observations of stratospheric mean age of air and ozone. We use a simple representation of the stratosphere, the tropical leaky pipe (TLP) model, guided by mean meridional circulation and horizontal mixing changes in several reanalysis data sets and chemistry climate model (CCM) simulations, to help elucidate reasons for the observed changes in stratospheric mean age and ozone. We find that the TLP model is able to accurately simulate multiyear variability in ozone following recent major volcanic eruptions and the early-2000s sea surface temperature changes, as well as the lasting impact on mean age of relatively short-term circulation perturbations. We also find that the best quantitative agreement with the observed mean age and ozone trends over the past three decades is obtained assuming a small strengthening of the mean circulation in the lower stratosphere, a moderate weakening of the mean circulation in the middle and upper stratosphere, and a moderate increase in horizontal mixing into the tropics. The mean age trends are strongly sensitive to trends in the horizontal mixing into the tropics, and the uncertainty in the mixing trends causes uncertainty in the mean circulation trends. Comparisons of the mean circulation and mixing changes suggested by the measurements with those from a recent suite of CCM runs reveal significant differences that may have important implications for the accurate simulation of future stratospheric climate.

Relevance: 100.00%

Abstract:

The Advanced Along-Track Scanning Radiometer (AATSR) was launched on Envisat in March 2002. The AATSR instrument is designed to retrieve precise and accurate global sea surface temperature (SST) which, combined with the large data set collected from its predecessors, ATSR and ATSR-2, will provide a long-term record of SST data spanning more than 15 years. This record can be used for independent monitoring and detection of climate change. The AATSR validation programme has successfully completed its initial phase. The programme involves validation of the AATSR-derived SST values using in situ radiometers, in situ buoys, and global SST fields from other data sets. The results of the initial programme presented here demonstrate that the AATSR instrument is currently close to meeting its scientific objective of determining global SST to an accuracy of 0.3 K (one sigma). For night-time data, the analysis gives warm biases of between +0.04 K (0.28 K) for buoys and +0.06 K (0.20 K) for radiometers, with slightly higher errors observed for daytime data, showing warm biases of between +0.02 K (0.39 K) for buoys and +0.11 K (0.33 K) for radiometers. These results show that the ATSR series of instruments continues to be the world leader in delivering accurate space-based observations of SST, a key climate parameter.

Relevance: 100.00%

Abstract:

A method is proposed for merging different nadir-sounding climate data records using measurements from high-resolution limb sounders to provide a transfer function between the different nadir measurements. The two nadir-sounding records need not be overlapping so long as the limb-sounding record bridges between them. The method is applied to global-mean stratospheric temperatures from the NOAA Climate Data Records based on the Stratospheric Sounding Unit (SSU) and the Advanced Microwave Sounding Unit-A (AMSU), extending the SSU record forward in time to yield a continuous data set from 1979 to present, and providing a simple framework for extending the SSU record into the future using AMSU. SSU and AMSU are bridged using temperature measurements from the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS), which is of high enough vertical resolution to accurately represent the weighting functions of both SSU and AMSU. For this application, a purely statistical approach is not viable since the different nadir channels are not sufficiently linearly independent, statistically speaking. The near-global-mean linear temperature trends for extended SSU for 1980–2012 are −0.63 ± 0.13, −0.71 ± 0.15 and −0.80 ± 0.17 K decade−1 (95 % confidence) for channels 1, 2 and 3, respectively. The extended SSU temperature changes are in good agreement with those from the Microwave Limb Sounder (MLS) on the Aura satellite, with both exhibiting a cooling trend of ~ 0.6 ± 0.3 K decade−1 in the upper stratosphere from 2004 to 2012. The extended SSU record is found to be in agreement with high-top coupled atmosphere–ocean models over the 1980–2012 period, including the continued cooling over the first decade of the 21st century.
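The bridging idea, using an overlapping third record as a transfer function between two non-overlapping records, can be sketched with synthetic series. The records, biases, and noise levels below are invented; the actual method maps MIPAS profiles through the SSU and AMSU weighting functions rather than removing simple mean offsets:

```python
import numpy as np

# Two non-overlapping nadir-like records measuring the same underlying
# signal with different constant biases, plus a limb-like record that
# overlaps both and serves as the bridge. All series are hypothetical.
rng = np.random.default_rng(0)
truth = np.cumsum(rng.normal(0.0, 0.05, 120))    # underlying monthly anomaly

record_a = truth[:60] + 1.0                       # first record, bias +1.0
record_b = truth[60:] - 0.5                       # second record, bias -0.5
limb = truth + rng.normal(0.0, 0.01, 120)         # bridging record

# Transfer step: estimate each record's offset from the bridge over its
# own period, then remove both offsets to place the records on a common
# baseline before concatenating them.
offset_a = (record_a - limb[:60]).mean()
offset_b = (record_b - limb[60:]).mean()
merged = np.concatenate([record_a - offset_a, record_b - offset_b])

# The merged series should show no artificial jump at the seam.
print(abs(merged[59] - merged[60]))
```

The key property, as in the abstract, is that the two primary records never need to overlap each other, only the bridge.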

Relevance: 100.00%

Abstract:

The stratospheric mean meridional circulation (MMC) and eddy mixing are compared among six meteorological reanalysis data sets (NCEP-NCAR, NCEP-CFSR, ERA-40, ERA-Interim, JRA-25, and JRA-55) for the period 1979–2012. The reanalysis data sets produced using advanced systems (NCEP-CFSR, ERA-Interim, and JRA-55) generally reveal a weaker MMC in the Northern Hemisphere (NH) compared with those produced using older systems (NCEP-NCAR, ERA-40, and JRA-25). The mean mixing strength differs considerably among the data products. In the NH lower stratosphere, the contribution of planetary-scale mixing is larger in the new data sets than in the old data sets, whereas that of small-scale mixing is weaker in the new data sets. Conventional data assimilation techniques introduce analysis increments without maintaining physical balance, which may have caused an overly strong MMC and spurious small-scale eddies in the old data sets. At NH mid-latitudes, only ERA-Interim reveals a weakening MMC trend in the deep branch of the Brewer–Dobson circulation (BDC). The relative importance of eddy mixing compared with mean meridional transport in the subtropical lower stratosphere shows increasing trends in ERA-Interim and JRA-55; together with the weakened MMC in the deep branch, this may imply an increasing age of air (AoA) in the NH middle stratosphere in ERA-Interim. Overall, discrepancies between the different variables, and the trends therein, as derived from the different reanalyses are still relatively large, suggesting that further investment in these products is needed to obtain a consolidated picture of observed changes in the BDC and the mechanisms that drive them.

Relevance: 70.00%

Abstract:

Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers belonging to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram; this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different sizes that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging. The results showed that where skewness was caused by outliers the variograms retained their general shape but showed an increase in the nugget and sill variances and in the nugget:sill ratios. This effect was only slightly greater for the smallest data set than for the two larger data sets, and there was little difference between the results for the latter. Overall, the effect of the size of the data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near the edge or the centre of the field. The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging. (C) 2007 Elsevier Ltd. All rights reserved.
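Matheron's classical estimator, and the way a single outlier inflates it, can be shown on a one-dimensional transect. The data are synthetic (the paper works with simulated 2-D fields), and the outlier magnitude is arbitrary:

```python
import random

# Synthetic 1-D transect of a standard normal process.
random.seed(42)
z = [random.gauss(0.0, 1.0) for _ in range(200)]

def matheron(z, lag):
    """Matheron's semivariance: gamma(h) = sum((z[i+h]-z[i])^2) / (2*N(h))."""
    diffs = [(z[i + lag] - z[i]) ** 2 for i in range(len(z) - lag)]
    return sum(diffs) / (2 * len(diffs))

clean = matheron(z, 1)

# Contaminate with one outlier from a "secondary process"; every squared
# difference involving that point grows by roughly the outlier magnitude
# squared, inflating the estimated semivariance.
z_out = z[:]
z_out[100] += 10.0
contaminated = matheron(z_out, 1)
print(clean, contaminated)
```

This inflation of the nugget and sill is exactly the effect the abstract reports, and it motivates the robust estimators compared there.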

Relevance: 70.00%

Abstract:

Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of constructing the nonparametric mixture model estimator are reviewed and set into perspective. The maximum likelihood estimator of the mixing distribution is constructed for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered, with some generalisations of Zelterman's estimator. All computations are done with CAMCR, a special software package developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems in using the mixture model-based estimators are highlighted.
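Of the estimators mentioned, Chao's lower bound is simple enough to compute directly from capture frequencies. The frequency data below are made up for illustration:

```python
from collections import Counter

# Chao's lower-bound estimator of population size:
#   N_hat = n + f1^2 / (2 * f2)
# where n is the number of distinct units observed, and f1, f2 are the
# counts of units captured exactly once and exactly twice.
captures = [1] * 50 + [2] * 20 + [3] * 8 + [4] * 2  # times each unit was seen

freq = Counter(captures)
n = len(captures)            # 80 distinct units observed
f1, f2 = freq[1], freq[2]    # 50 singletons, 20 doubletons
n_chao = n + f1 ** 2 / (2 * f2)
print(n_chao)                # -> 142.5  (80 + 2500/40)
```

The many singletons relative to doubletons push the estimate well above the observed count, reflecting the units never captured.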

Relevance: 70.00%

Abstract:

We describe a general likelihood-based 'mixture model' for inferring phylogenetic trees from gene-sequence or other character-state data. The model accommodates cases in which different sites in the alignment evolve in qualitatively distinct ways, but does not require prior knowledge of these patterns or partitioning of the data. We call this qualitative variability in the pattern of evolution across sites "pattern-heterogeneity" to distinguish it both from a homogeneous process of evolution and from one characterized principally by differences in rates of evolution. We present studies to show that the model correctly retrieves the signals of pattern-heterogeneity from simulated gene-sequence data, and we apply the method to protein-coding genes and to a ribosomal 12S data set. The mixture model outperforms conventional partitioning in both these data sets. We implement the mixture model such that it can simultaneously detect rate- and pattern-heterogeneity. The model simplifies to a homogeneous model or a rate-variability model as special cases, and therefore always performs at least as well as these two approaches, and often considerably improves upon them. We make the model available within a Bayesian Markov chain Monte Carlo framework for phylogenetic inference, as an easy-to-use computer program.
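The core of any such mixture model is that each site's likelihood is a weighted sum over candidate evolutionary patterns, so no site need be pre-assigned to a partition. The per-pattern site likelihoods and weights below are invented; a real implementation would compute them with Felsenstein's pruning algorithm under each pattern's substitution matrix:

```python
import math

# Hypothetical per-site likelihoods: one row per alignment site, one
# column per candidate pattern (invented numbers for illustration).
site_likelihoods = [
    [0.020, 0.002],
    [0.001, 0.015],
    [0.010, 0.012],
]
weights = [0.6, 0.4]  # mixture weights over patterns, summing to 1

# Total log-likelihood: each site sums over patterns before taking logs,
# so sites are never hard-assigned to a single pattern.
log_l = sum(
    math.log(sum(w * l for w, l in zip(weights, row)))
    for row in site_likelihoods
)
print(round(log_l, 3))  # -> -13.907
```

Note that site 1 is dominated by the first pattern and site 2 by the second; the weighted sum lets the data decide per site, which is the "pattern-heterogeneity" the abstract describes.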

Relevance: 70.00%

Abstract:

Resolving the relationships between Metazoa and other eukaryotic groups, as well as between metazoan phyla, is central to understanding the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (~30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.

Relevance: 70.00%

Abstract:

Once unit-cell dimensions have been determined from a powder diffraction data set, and therefore the crystal system is known (e.g. orthorhombic), the method presented by Markvardsen, David, Johnson & Shankland [Acta Cryst. (2001), A57, 47-54] can be used to generate a table ranking the extinction symbols of the given crystal system according to probability. Markvardsen et al. tested a computer program (ExtSym) implementing the method against Pawley refinement outputs generated using the TF12LS program [David, Ibberson & Matthewman (1992). Report RAL-92-032. Rutherford Appleton Laboratory, Chilton, Didcot, Oxon, UK]. Here, it is shown that ExtSym can be used successfully with many well-known powder diffraction analysis packages, namely DASH [David, Shankland, van de Streek, Pidcock, Motherwell & Cole (2006). J. Appl. Cryst. 39, 910-915], FullProf [Rodriguez-Carvajal (1993). Physica B, 192, 55-69], GSAS [Larson & Von Dreele (1994). Report LAUR 86-748. Los Alamos National Laboratory, New Mexico, USA], PRODD [Wright (2004). Z. Kristallogr. 219, 1-11] and TOPAS [Coelho (2003). Bruker AXS GmbH, Karlsruhe, Germany]. In addition, a precise description of the optimal input for ExtSym is given to enable other software packages to interface with ExtSym and to allow the improvement/modification of existing interfacing scripts. ExtSym takes as input the powder data in the form of integrated intensities and error estimates for these intensities. The output returned by ExtSym is demonstrated to be strongly dependent on the accuracy of these error estimates, and the reason for this is explained. ExtSym is tested against a wide range of data sets, confirming the algorithm to be very successful at ranking the published extinction symbol as the most likely. (C) 2008 International Union of Crystallography. Printed in Singapore - all rights reserved.