96 results for Data-Intensive Science
Abstract:
Site-specific management requires accurate knowledge of the spatial variation in a range of soil properties within fields. This involves considerable sampling effort, which is costly. Ancillary data, such as crop yield, elevation and apparent electrical conductivity (ECa) of the soil, can provide insight into the spatial variation of some soil properties. A multivariate classification with spatial constraint imposed by the variogram was used to classify data from two arable crop fields. The yield data comprised 5 years of crop yield, and the ancillary data 3 years of yield data, elevation and ECa. Information on soil chemical and physical properties was provided by intensive surveys of the soil. Multivariate variograms computed from these data were used to constrain sites spatially within classes to increase their contiguity. The constrained classifications resulted in coherent classes, and those based on the ancillary data were similar to those from the soil properties. The ancillary data seemed to identify areas in the field where the soil is reasonably homogeneous. The results of targeted sampling showed that these classes could be used as a basis for management and to guide future sampling of the soil.
Abstract:
Providing reliable estimates for mapping soil properties for precision agriculture requires intensive sampling and costly laboratory analyses. If the spatial structure of ancillary data, such as yield, digital information from aerial photographs, and soil electrical conductivity (EC) measurements, relates to that of soil properties, they could be used to guide the sampling intensity for soil surveys. Variograms of permanent soil properties at two study sites on different parent materials were compared with each other and with those for ancillary data. The ranges of spatial dependence identified by the variograms of both sets of properties are of similar orders of magnitude for each study site. Maps of the ancillary data appear to show similar patterns of variation and these seem to relate to those of the permanent properties of the soil. Correlation analysis has confirmed these relations. Maps of kriged estimates from sub-sampled data and the original variograms showed that the main patterns of variation were preserved when a sampling interval of less than half the average variogram range of ancillary data was used. Digital data from aerial photographs for different years and EC appear to show a more consistent relation with the soil properties than does yield. Aerial photographs, in particular those of bare soil, seem to be the most useful ancillary data and they are often cheaper to obtain than yield and EC data.
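As a rough illustration of the quantities discussed in this abstract, the sketch below computes an experimental variogram and the "half the average range" bound on the sampling interval. It assumes the usual method-of-moments estimator, which this abstract does not name; the function names, the lag binning by rounding, and the numbers in the usage note are hypothetical and not taken from the paper.

```python
import math
from collections import defaultdict


def empirical_variogram(coords, values, lag_width, max_lag):
    """Method-of-moments semivariances, keyed by nominal lag distance.

    Assumption: each pair is assigned to the nearest multiple of lag_width.
    """
    sq_sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])
            k = round(h / lag_width)  # index of the nearest nominal lag
            if k == 0 or h > max_lag:
                continue  # skip near-coincident pairs and pairs beyond max_lag
            sq_sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    # gamma(h) = sum of squared differences / (2 * number of pairs)
    return {k * lag_width: sq_sums[k] / (2 * counts[k]) for k in sorted(counts)}


def sampling_interval_bound(ranges):
    """Upper bound on the sampling interval: half the average variogram range."""
    return 0.5 * sum(ranges) / len(ranges)
```

For example, with made-up ancillary variogram ranges of 100 m and 140 m, `sampling_interval_bound([100.0, 140.0])` gives 60.0, so the rule in the abstract would call for a sampling interval of less than 60 m.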
Abstract:
Data such as digitized aerial photographs, electrical conductivity and yield are intensive and relatively inexpensive to obtain compared with collecting soil data by sampling. If such ancillary data are co-regionalized with the soil data they should be suitable for co-kriging. The latter requires that information for both variables is co-located at several locations; this is rarely so for soil and ancillary data. To solve this problem, we have derived values for the ancillary variable at the soil sampling locations by averaging the values within a radius of 15 m, taking the nearest-neighbour value, kriging over 5 m blocks, and punctual kriging. The cross-variograms from these data with clay content and also the pseudo cross-variogram were used to co-krige to validation points and the root mean squared errors (RMSEs) were calculated. In general, the data averaged within 15 m and the punctually kriged values resulted in more accurate predictions.
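Two of the four co-location methods listed above (averaging within a radius and taking the nearest neighbour) and the RMSE criterion are simple enough to sketch. This is a minimal illustration that assumes planar coordinates in metres; the function names are hypothetical, and the two kriging-based methods are not covered.

```python
import math


def mean_within_radius(target, points, values, radius=15.0):
    """Average the ancillary values lying within `radius` of a soil sampling site."""
    near = [v for p, v in zip(points, values) if math.dist(target, p) <= radius]
    return sum(near) / len(near) if near else None  # None if no point is close enough


def nearest_neighbour(target, points, values):
    """Take the ancillary value at the point closest to the soil sampling site."""
    i = min(range(len(points)), key=lambda k: math.dist(target, points[k]))
    return values[i]


def rmse(predicted, observed):
    """Root mean squared error between predictions and validation observations."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)
```

Averaging within a radius smooths short-range noise in the ancillary data, whereas the nearest-neighbour value keeps it, which may be one reason the two approaches predict differently.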
Abstract:
Maps of kriged soil properties for precision agriculture are often based on a variogram estimated from too few data because the costs of sampling and analysis are often prohibitive. If the variogram has been computed by the usual method of moments, it is likely to be unstable when there are fewer than 100 data. The scale of variation in soil properties should be investigated prior to sampling by computing a variogram from ancillary data, such as an aerial photograph of the bare soil. If the sampling interval suggested by this is large in relation to the size of the field there will be too few data to estimate a reliable variogram for kriging. Standardized variograms from aerial photographs can be used with standardized soil data that are sparse, provided the data are spatially structured and the nugget:sill ratio is similar to that of a reliable variogram of the property. The problem remains of how to set this ratio in the absence of an accurate variogram. Several methods of estimating the nugget:sill ratio for selected soil properties are proposed and evaluated. Standardized variograms with nugget:sill ratios set by these methods are more similar to those computed from intensive soil data than are variograms computed from sparse soil data. The results of cross-validation and mapping show that the standardized variograms provide more accurate estimates, and preserve the main patterns of variation better than those computed from sparse data.
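To make the role of the nugget:sill ratio concrete, the sketch below evaluates a standardized variogram, scaled to a sill of 1, and rescales it by the variance of the sparse soil data. The spherical model, the rescaling step and all names are assumptions made for illustration; the abstract does not say which model was fitted or how the ratio-setting methods work.

```python
def standardized_spherical(h, range_, nugget_ratio):
    """Spherical variogram scaled to unit sill (assumed model).

    nugget_ratio is the nugget:sill ratio, so the nugget is nugget_ratio
    and the spatially structured component is 1 - nugget_ratio.
    """
    if h <= 0:
        return 0.0  # semivariance is zero at zero separation
    if h >= range_:
        return 1.0  # sill of the standardized variogram
    s = h / range_
    return nugget_ratio + (1.0 - nugget_ratio) * (1.5 * s - 0.5 * s ** 3)


def rescale(gamma_standardized, sample_variance):
    """Return to the units of the soil property via the sample variance (assumption)."""
    return gamma_standardized * sample_variance
```

With a hypothetical range of 100 m and a nugget:sill ratio of 0.25, `standardized_spherical(50.0, 100.0, 0.25)` evaluates to 0.765625. A different ratio changes how much of the variance is treated as spatially unstructured.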
Abstract:
The chess endgame is increasingly being seen through the lens of, and therefore effectively defined by, a data ‘model’ of itself. It is vital that such models are clearly faithful to the reality they purport to represent. This paper examines that issue and systems engineering responses to it, using the chess endgame as the exemplar scenario. A structured survey has been carried out of the intrinsic challenges and complexity of creating endgame data by reviewing the past pattern of errors during work in progress, surfacing in publications and occurring after the data was generated. Specific measures are proposed to counter observed classes of error-risk, including a preliminary survey of techniques for using state-of-the-art verification tools to generate EGTs that are correct by construction. The approach may be applied generically beyond the game domain.
Abstract:
While Nalimov’s endgame tables for Western Chess are the most used today, their Depth-to-Mate metric is not the most efficient or effective in use. The authors have developed and used new programs to create tables to alternative metrics and recommend better strategies for endgame play.
Abstract:
In the past decade, the amount of data in the biological field has grown steadily; bio-techniques for the analysis of biological data have been developed and new tools have been introduced. Several computational methods are based on unsupervised neural network algorithms that are widely used for multiple purposes, including clustering and visualization, such as the Self Organizing Map (SOM). Unfortunately, even though this method is unsupervised, its performance in terms of quality of results and learning speed depends strongly on the initialization of the neuron weights. In this paper we present a new initialization technique based on a totally connected undirected graph that reports relations among some interesting features of the input data. The results of experimental tests, in which the proposed algorithm is compared with the original initialization techniques, show that our technique assures faster learning and better performance in terms of quantization error.
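The quantization error used as the evaluation criterion here is commonly defined as the mean distance between each input vector and the weight vector of its best-matching unit. The sketch below assumes that definition with Euclidean distance; the function names are hypothetical, and the graph-based initialization proposed in the paper is not reproduced.

```python
import math


def best_matching_unit(x, weights):
    """Index of the neuron whose weight vector is closest to input x."""
    return min(range(len(weights)), key=lambda k: math.dist(x, weights[k]))


def quantization_error(data, weights):
    """Mean Euclidean distance from each input to its best-matching unit."""
    total = 0.0
    for x in data:
        total += math.dist(x, weights[best_matching_unit(x, weights)])
    return total / len(data)
```

Evaluating this measure on the freshly initialized weights, before any training, is one simple way to compare initialization schemes: a lower starting error suggests the map begins closer to the data.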
Abstract:
More data will be produced in the next five years than in the entire history of human kind, a digital deluge that marks the beginning of the Century of Information. Through a year-long consultation with UK researchers, a coherent strategy has been developed, which will nurture Century-of-Information Research (CIR); it crystallises the ideas developed by the e-Science Directors' Forum Strategy Working Group. This paper is an abridged version of their latest report which can be found at: http://wikis.nesc.ac.uk/escienvoy/Century_of_Information_Research_Strategy which also records the consultation process and the affiliations of the authors. This document is derived from a paper presented at the Oxford e-Research Conference 2008 and takes into account suggestions made in the ensuing panel discussion. The goals of the CIR Strategy are to facilitate the growth of UK research and innovation that is data and computationally intensive and to develop a new culture of 'digital-systems judgement' that will equip research communities, businesses, government and society as a whole, with the skills essential to compete and prosper in the Century of Information. The CIR Strategy identifies a national requirement for a balanced programme of coordination, research, infrastructure, translational investment and education to empower UK researchers, industry, government and society. The Strategy is designed to deliver an environment which meets the needs of UK researchers so that they can respond agilely to challenges, can create knowledge and skills, and can lead new kinds of research. It is a call to action for those engaged in research, those providing data and computational facilities, those governing research and those shaping education policies. The ultimate aim is to help researchers strengthen the international competitiveness of the UK research base and increase its contribution to the economy. 
The objectives of the Strategy are to better enable UK researchers across all disciplines to contribute world-leading fundamental research; to accelerate the translation of research into practice; and to develop improved capabilities, facilities and context for research and innovation. It envisages a culture that is better able to grasp the opportunities provided by the growing wealth of digital information. Computing has, of course, already become a fundamental tool in all research disciplines. The UK e-Science programme (2001-06)—since emulated internationally—pioneered the invention and use of new research methods, and a new wave of innovations in digital-information technologies which have enabled them. The Strategy argues that the UK must now harness and leverage its own, plus the now global, investment in digital-information technology in order to spread the benefits as widely as possible in research, education, industry and government. Implementing the Strategy would deliver the computational infrastructure and its benefits as envisaged in the Science & Innovation Investment Framework 2004-2014 (July 2004), and in the reports developing those proposals. To achieve this, the Strategy proposes the following actions: support the continuous innovation of digital-information research methods; provide easily used, pervasive and sustained e-Infrastructure for all research; enlarge the productive research community which exploits the new methods efficiently; generate capacity, propagate knowledge and develop skills via new curricula; and develop coordination mechanisms to improve the opportunities for interdisciplinary research and to make digital-infrastructure provision more cost effective. To gain the best value for money strategic coordination is required across a broad spectrum of stakeholders. 
A coherent strategy is essential in order to establish and sustain the UK as an international leader of well-curated national data assets and computational infrastructure, which is expertly used to shape policy, support decisions, empower researchers and to roll out the results to the wider benefit of society. The value of data as a foundation for wellbeing and a sustainable society must be appreciated; national resources must be more wisely directed to the collection, curation, discovery, widening access, analysis and exploitation of these data. Every researcher must be able to draw on skills, tools and computational resources to develop insights, test hypotheses and translate inventions into productive use, or to extract knowledge in support of governmental decision making. This foundation plus the skills developed will launch significant advances in research, in business, in professional practice and in government with many consequent benefits for UK citizens. The Strategy presented here addresses these complex and interlocking requirements.
Impact of hydrographic data assimilation on the modelled Atlantic meridional overturning circulation
Abstract:
Here we make an initial step toward the development of an ocean assimilation system that can constrain the modelled Atlantic Meridional Overturning Circulation (AMOC) to support climate predictions. A detailed comparison is presented of 1° and 1/4° resolution global model simulations with and without sequential data assimilation, to the observations and transport estimates from the RAPID mooring array across 26.5° N in the Atlantic. Comparisons of modelled water properties with the observations from the merged RAPID boundary arrays demonstrate the ability of in situ data assimilation to accurately constrain the east-west density gradient between these mooring arrays. However, the presence of an unconstrained "western boundary wedge" between Abaco Island and the RAPID mooring site WB2 (16 km offshore) leads to the intensification of an erroneous southwards flow in this region when in situ data are assimilated. The result is an overly intense southward upper mid-ocean transport (0–1100 m) as compared to the estimates derived from the RAPID array. Correction of upper layer zonal density gradients is found to compensate mostly for a weak subtropical gyre circulation in the free model run (i.e. with no assimilation). Despite the important changes to the density structure and transports in the upper layer imposed by the assimilation, very little change is found in the amplitude and sub-seasonal variability of the AMOC. This shows that assimilation of upper layer density information projects mainly on the gyre circulation with little effect on the AMOC at 26° N due to the absence of corrections to density gradients below 2000 m (the maximum depth of Argo). The sensitivity to initial conditions was explored through two additional experiments using a climatological initial condition. 
These experiments showed that the weak bias in gyre intensity in the control simulation (without data assimilation) develops over a period of about 6 months, but does so independently from the overturning, with no change to the AMOC. However, differences in the properties and volume transport of North Atlantic Deep Water (NADW) persisted throughout the 3 year simulations resulting in a difference of 3 Sv in AMOC intensity. The persistence of these dense water anomalies and their influence on the AMOC is promising for the development of decadal forecasting capabilities. The results suggest that the deeper waters must be accurately reproduced in order to constrain the AMOC.