68 resultados para Collection of Network Data
em CentAUR: Central Archive University of Reading - UK
Resumo:
The unparalleled collection of clinical data and biomaterials within the EHDN's REGISTRY can expedite the search for disease modifiers (genetic and environmental) of age at onset and disease progression that could be harnessed for the development of novel treatments.
Resumo:
In this paper, we develop a method, termed the Interaction Distribution (ID) method, for analysis of quantitative ecological network data. In many cases, quantitative network data sets are under-sampled, i.e. many interactions are poorly sampled or remain unobserved. Hence, the output of statistical analyses may fail to differentiate between patterns that are statistical artefacts and those which are real characteristics of ecological networks. The ID method can support assessment and inference of under-sampled ecological network data. In the current paper, we illustrate and discuss the ID method based on the properties of plant-animal pollination data sets of flower visitation frequencies. However, the ID method may be applied to other types of ecological networks. The method can supplement existing network analyses based on two definitions of the underlying probabilities for each combination of pollinator and plant species: (1), pi,j: the probability for a visit made by the i’th pollinator species to take place on the j’th plant species; (2), qi,j: the probability for a visit received by the j’th plant species to be made by the i’th pollinator. The method applies the Dirichlet distribution to estimate these two probabilities, based on a given empirical data set. The estimated mean values for pi,j and qi,j reflect the relative differences between recorded numbers of visits for different pollinator and plant species, and the estimated uncertainty of pi,j and qi,j decreases with higher numbers of recorded visits.
Resumo:
GODIVA2 is a dynamic website that provides visual access to several terabytes of physically distributed, four-dimensional environmental data. It allows users to explore large datasets interactively without the need to install new software or download and understand complex data. Through the use of open international standards, GODIVA2 maintains a high level of interoperability with third-party systems, allowing diverse datasets to be mutually compared. Scientists can use the system to search for features in large datasets and to diagnose the output from numerical simulations and data processing algorithms. Data providers around Europe have adopted GODIVA2 as an INSPIRE-compliant dynamic quick-view system for providing visual access to their data.
Resumo:
The complexity inherent in climate data makes it necessary to introduce more than one statistical tool to the researcher to gain insight into the climate system. Empirical orthogonal function (EOF) analysis is one of the most widely used methods to analyze weather/climate modes of variability and to reduce the dimensionality of the system. Simple structure rotation of EOFs can enhance interpretability of the obtained patterns but cannot provide anything more than temporal uncorrelatedness. In this paper, an alternative rotation method based on independent component analysis (ICA) is considered. The ICA is viewed here as a method of EOF rotation. Starting from an initial EOF solution rather than rotating the loadings toward simplicity, ICA seeks a rotation matrix that maximizes the independence between the components in the time domain. If the underlying climate signals have an independent forcing, one can expect to find loadings with interpretable patterns whose time coefficients have properties that go beyond simple noncorrelation observed in EOFs. The methodology is presented and an application to monthly means sea level pressure (SLP) field is discussed. Among the rotated (to independence) EOFs, the North Atlantic Oscillation (NAO) pattern, an Arctic Oscillation–like pattern, and a Scandinavian-like pattern have been identified. There is the suggestion that the NAO is an intrinsic mode of variability independent of the Pacific.
Resumo:
To provide reliable estimates for mapping soil properties for precision agriculture requires intensive sampling and costly laboratory analyses. If the spatial structure of ancillary data, such as yield, digital information from aerial photographs, and soil electrical conductivity (EC) measurements, relates to that of soil properties they could be used to guide the sampling intensity for soil surveys. Variograins of permanent soil properties at two study sites on different parent materials were compared with each other and with those for ancillary data. The ranges of spatial dependence identified by the variograms of both sets of properties are of similar orders of magnitude for each study site, Maps of the ancillary data appear to show similar patterns of variation and these seem to relate to those of the permanent properties of the soil. Correlation analysis has confirmed these relations. Maps of kriged estimates from sub-sampled data and the original variograrns showed that the main patterns of variation were preserved when a sampling interval of less than half the average variogram range of ancillary data was used. Digital data from aerial photographs for different years and EC appear to show a more consistent relation with the soil properties than does yield. Aerial photographs, in particular those of bare soil, seem to be the most useful ancillary data and they are often cheaper to obtain than yield and EC data.
Resumo:
Matheron's usual variogram estimator can result in unreliable variograms when data are strongly asymmetric or skewed. Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. This paper examines the effects of underlying asymmetry on the variogram and on the accuracy of prediction, and the second one examines the effects arising from outliers. Standard geostatistical texts suggest ways of dealing with underlying asymmetry; however, this is based on informed intuition rather than detailed investigation. To determine whether the methods generally used to deal with underlying asymmetry are appropriate, the effects of different coefficients of skewness on the shape of the experimental variogram and on the model parameters were investigated. Simulated annealing was used to create normally distributed random fields of different size from variograms with different nugget:sill ratios. These data were then modified to give different degrees of asymmetry and the experimental variogram was computed in each case. The effects of standard data transformations on the form of the variogram were also investigated. Cross-validation was used to assess quantitatively the performance of the different variogram models for kriging. The results showed that the shape of the variogram was affected by the degree of asymmetry, and that the effect increased as the size of data set decreased. Transformations of the data were more effective in reducing the skewness coefficient in the larger sets of data. Cross-validation confirmed that variogram models from transformed data were more suitable for kriging than were those from the raw asymmetric data. The results of this study have implications for the 'standard best practice' in dealing with asymmetry in data for geostatistical analyses. (C) 2007 Elsevier Ltd. All rights reserved.
Resumo:
Asymmetry in a distribution can arise from a long tail of values in the underlying process or from outliers that belong to another population that contaminate the primary process. The first paper of this series examined the effects of the former on the variogram and this paper examines the effects of asymmetry arising from outliers. Simulated annealing was used to create normally distributed random fields of different size that are realizations of known processes described by variograms with different nugget:sill ratios. These primary data sets were then contaminated with randomly located and spatially aggregated outliers from a secondary process to produce different degrees of asymmetry. Experimental variograms were computed from these data by Matheron's estimator and by three robust estimators. The effects of standard data transformations on the coefficient of skewness and on the variogram were also investigated. Cross-validation was used to assess the performance of models fitted to experimental variograms computed from a range of data contaminated by outliers for kriging. The results showed that where skewness was caused by outliers the variograms retained their general shape, but showed an increase in the nugget and sill variances and nugget:sill ratios. This effect was only slightly more for the smallest data set than for the two larger data sets and there was little difference between the results for the latter. Overall, the effect of size of data set was small for all analyses. The nugget:sill ratio showed a consistent decrease after transformation to both square roots and logarithms; the decrease was generally larger for the latter, however. Aggregated outliers had different effects on the variogram shape from those that were randomly located, and this also depended on whether they were aggregated near to the edge or the centre of the field. The results of cross-validation showed that the robust estimators and the removal of outliers were the most effective ways of dealing with outliers for variogram estimation and kriging. (C) 2007 Elsevier Ltd. All rights reserved.
Impact of hydrographic data assimilation on the modelled Atlantic meridional overturning circulation
Resumo:
Here we make an initial step toward the development of an ocean assimilation system that can constrain the modelled Atlantic Meridional Overturning Circulation (AMOC) to support climate predictions. A detailed comparison is presented of 1° and 1/4° resolution global model simulations with and without sequential data assimilation, to the observations and transport estimates from the RAPID mooring array across 26.5° N in the Atlantic. Comparisons of modelled water properties with the observations from the merged RAPID boundary arrays demonstrate the ability of in situ data assimilation to accurately constrain the east-west density gradient between these mooring arrays. However, the presence of an unconstrained "western boundary wedge" between Abaco Island and the RAPID mooring site WB2 (16 km offshore) leads to the intensification of an erroneous southwards flow in this region when in situ data are assimilated. The result is an overly intense southward upper mid-ocean transport (0–1100 m) as compared to the estimates derived from the RAPID array. Correction of upper layer zonal density gradients is found to compensate mostly for a weak subtropical gyre circulation in the free model run (i.e. with no assimilation). Despite the important changes to the density structure and transports in the upper layer imposed by the assimilation, very little change is found in the amplitude and sub-seasonal variability of the AMOC. This shows that assimilation of upper layer density information projects mainly on the gyre circulation with little effect on the AMOC at 26° N due to the absence of corrections to density gradients below 2000 m (the maximum depth of Argo). The sensitivity to initial conditions was explored through two additional experiments using a climatological initial condition. These experiments showed that the weak bias in gyre intensity in the control simulation (without data assimilation) develops over a period of about 6 months, but does so independently from the overturning, with no change to the AMOC. However, differences in the properties and volume transport of North Atlantic Deep Water (NADW) persisted throughout the 3 year simulations resulting in a difference of 3 Sv in AMOC intensity. The persistence of these dense water anomalies and their influence on the AMOC is promising for the development of decadal forecasting capabilities. The results suggest that the deeper waters must be accurately reproduced in order to constrain the AMOC.
Resumo:
1. Suction sampling is a popular method for the collection of quantitative data on grassland invertebrate populations, although there have been no detailed studies into the effectiveness of the method. 2. We investigate the effect of effort (duration and number of suction samples) and sward height on the efficiency of suction sampling of grassland beetle, true bug, planthopper and spider Populations. We also compare Suction sampling with an absolute sampling method based on the destructive removal of turfs. 3. Sampling for durations of 16 seconds was sufficient to collect 90% of all individuals and species of grassland beetles, with less time required for the true bugs, spiders and planthoppers. The number of samples required to collect 90% of the species was more variable, although in general 55 sub-samples was sufficient for all groups, except the true bugs. Increasing sward height had a negative effect on the capture efficiency of suction sampling. 4. The assemblage structure of beetles, planthoppers and spiders was independent of the sampling method (suction or absolute) used. 5. Synthesis and applications. In contrast to other sampling methods used in grassland habitats (e.g. sweep netting or pitfall trapping), suction sampling is an effective quantitative tool for the measurement of invertebrate diversity and assemblage structure providing sward height is included as a covariate. The effective sampling of beetles, true bugs, planthoppers and spiders altogether requires a minimum sampling effort of 110 sub-samples of duration of 16 seconds. Such sampling intensities can be adjusted depending on the taxa sampled, and we provide information to minimize sampling problems associated with this versatile technique. Suction sampling should remain an important component in the toolbox of experimental techniques used during both experimental and management sampling regimes within agroecosystems, grasslands or other low-lying vegetation types.