63 results for computation- and data-intensive applications


Relevance: 100.00%

Abstract:

The task of assessing the likelihood and extent of coastal flooding is hampered by the lack of detailed information on near-shore bathymetry. This is required as an input for coastal inundation models, and in some cases the variability in the bathymetry can impact the prediction of those areas likely to be affected by flooding in a storm. The constant monitoring and data collection that would be required to characterise the near-shore bathymetry over large coastal areas is impractical, leaving the option of running morphodynamic models to predict the likely bathymetry at any given time. However, if the models are inaccurate the errors may be significant if incorrect bathymetry is used to predict possible flood risks. This project is assessing the use of data assimilation techniques to improve the predictions from a simple model, by rigorously incorporating observations of the bathymetry into the model, to bring the model closer to the actual situation. Currently we are concentrating on Morecambe Bay as a primary study site, as it has a highly dynamic inter-tidal zone, with changes in the course of channels in this zone impacting the likely locations of flooding from storms. We are working with SAR images, LiDAR, and swath bathymetry to give us the observations over a 2.5 year period running from May 2003 – November 2005. We have a LiDAR image of the entire inter-tidal zone for November 2005 to use as validation data. We have implemented a 3D-Var data assimilation scheme, to investigate the improvements in performance of the data assimilation compared to the previous scheme which was based on the optimal interpolation method. We are currently evaluating these different data assimilation techniques, using 22 SAR data observations. We will also include the LiDAR data and swath bathymetry to improve the observational coverage, and investigate the impact of different types of observation on the predictive ability of the model. 
We are also assessing the ability of the data assimilation scheme to recover the correct bathymetry after storm events, which can dramatically change the bathymetry in a short period of time.
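The core idea behind both optimal interpolation and 3D-Var can be sketched for a single grid point: the model (background) depth is pulled towards an observation in proportion to their relative error variances. The sketch below is a minimal illustration only, not the project's scheme. The function names and variance values are hypothetical, and it assumes uncorrelated errors, so an observation only corrects the point where it was taken.

```python
def assimilate_point(background, observation, var_background, var_observation):
    """Scalar best-linear-unbiased update for one bathymetry grid point.

    The gain lies between 0 (trust the model) and 1 (trust the observation).
    """
    gain = var_background / (var_background + var_observation)
    return background + gain * (observation - background)


def assimilate_profile(background, observations, var_background, var_observation):
    """Update every observed point; keep the model value where nothing was observed.

    `observations` maps a grid index to an observed depth (hypothetical layout).
    Assumption: errors are uncorrelated between grid points.
    """
    analysis = list(background)
    for index, observed in observations.items():
        analysis[index] = assimilate_point(
            background[index], observed, var_background, var_observation
        )
    return analysis


# Model depths at three points, one observation at index 1, equal error variances.
analysis = assimilate_profile([1.0, 2.0, 3.0], {1: 4.0}, 1.0, 1.0)
```

With equal variances the analysis at the observed point lies halfway between model and observation. A full scheme would use a spatially correlated background covariance so that each observation also adjusts neighbouring points.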

Relevance: 100.00%

Abstract:

This paper focuses on active network applications and, in particular, on the possible interactions among them. Active networking is a promising, recently developed research field that poses several interesting challenges to network designers. A number of proposals for efficient active network architectures can already be found in the literature. However, how two or more active network applications may interact has not been investigated so far. In this work, we consider a number of applications designed to exploit the main features of active networks and discuss the main benefits these applications may derive from them. We then introduce some forms of interaction, including interference and communication among applications, and identify the components of an active network architecture needed to support them. We conclude with a brief example of an active network application that exploits the concept of interaction.

Relevance: 100.00%

Abstract:

More data will be produced in the next five years than in the entire history of human kind, a digital deluge that marks the beginning of the Century of Information. Through a year-long consultation with UK researchers, a coherent strategy has been developed, which will nurture Century-of-Information Research (CIR); it crystallises the ideas developed by the e-Science Directors' Forum Strategy Working Group. This paper is an abridged version of their latest report which can be found at: http://wikis.nesc.ac.uk/escienvoy/Century_of_Information_Research_Strategy which also records the consultation process and the affiliations of the authors. This document is derived from a paper presented at the Oxford e-Research Conference 2008 and takes into account suggestions made in the ensuing panel discussion. The goals of the CIR Strategy are to facilitate the growth of UK research and innovation that is data and computationally intensive and to develop a new culture of 'digital-systems judgement' that will equip research communities, businesses, government and society as a whole, with the skills essential to compete and prosper in the Century of Information. The CIR Strategy identifies a national requirement for a balanced programme of coordination, research, infrastructure, translational investment and education to empower UK researchers, industry, government and society. The Strategy is designed to deliver an environment which meets the needs of UK researchers so that they can respond agilely to challenges, can create knowledge and skills, and can lead new kinds of research. It is a call to action for those engaged in research, those providing data and computational facilities, those governing research and those shaping education policies. The ultimate aim is to help researchers strengthen the international competitiveness of the UK research base and increase its contribution to the economy. 
The objectives of the Strategy are to better enable UK researchers across all disciplines to contribute world-leading fundamental research; to accelerate the translation of research into practice; and to develop improved capabilities, facilities and context for research and innovation. It envisages a culture that is better able to grasp the opportunities provided by the growing wealth of digital information. Computing has, of course, already become a fundamental tool in all research disciplines. The UK e-Science programme (2001-06)—since emulated internationally—pioneered the invention and use of new research methods, and a new wave of innovations in digital-information technologies which have enabled them. The Strategy argues that the UK must now harness and leverage its own, plus the now global, investment in digital-information technology in order to spread the benefits as widely as possible in research, education, industry and government. Implementing the Strategy would deliver the computational infrastructure and its benefits as envisaged in the Science & Innovation Investment Framework 2004-2014 (July 2004), and in the reports developing those proposals. To achieve this, the Strategy proposes the following actions: support the continuous innovation of digital-information research methods; provide easily used, pervasive and sustained e-Infrastructure for all research; enlarge the productive research community which exploits the new methods efficiently; generate capacity, propagate knowledge and develop skills via new curricula; and develop coordination mechanisms to improve the opportunities for interdisciplinary research and to make digital-infrastructure provision more cost effective. To gain the best value for money strategic coordination is required across a broad spectrum of stakeholders. 
A coherent strategy is essential in order to establish and sustain the UK as an international leader of well-curated national data assets and computational infrastructure, which is expertly used to shape policy, support decisions, empower researchers and to roll out the results to the wider benefit of society. The value of data as a foundation for wellbeing and a sustainable society must be appreciated; national resources must be more wisely directed to the collection, curation, discovery, widening access, analysis and exploitation of these data. Every researcher must be able to draw on skills, tools and computational resources to develop insights, test hypotheses and translate inventions into productive use, or to extract knowledge in support of governmental decision making. This foundation plus the skills developed will launch significant advances in research, in business, in professional practice and in government with many consequent benefits for UK citizens. The Strategy presented here addresses these complex and interlocking requirements.

Relevance: 100.00%

Abstract:

This chapter describes the present status and future prospects for transgenic (genetically modified) crops. It concentrates on the most recent data obtained from patent databases and field trial applications, as well as the usual scientific literature. By these means, it is possible to obtain a useful perspective into future commercial products and international trends. The various research areas are subdivided on the basis of those associated with input (agronomic) traits and those concerned with output (e.g., food quality) characteristics. Among the former group are new methods of improving stress resistance, and among the latter are many examples of producing pharmaceutical compounds in plants.

Relevance: 100.00%

Abstract:

Answering many of the critical questions in conservation, development and environmental management requires integrating the social and natural sciences. However, understanding the array of available quantitative methods and their associated terminology presents a major barrier to successful collaboration. We provide an overview of quantitative socio-economic methods that distils their complexity into a simple taxonomy. We outline how each has been used in conjunction with ecological models to address questions relating to the management of socio-ecological systems. We review the application of social and ecological quantitative concepts to agro-ecology and classify the approaches used to integrate the two disciplines. Our review included all published integrated models from 2003 to 2008 in 27 journals that publish agricultural modelling research. Although our focus is on agro-ecology, many of the results are broadly applicable to other fields involving an interaction between human activities and ecology. We found 36 papers that integrated social and ecological concepts in a quantitative model. Four different approaches to integration were used, depending on the scale at which human welfare was quantified. Most models viewed humans as pure profit maximizers, both when calculating welfare and predicting behaviour. Synthesis and applications. We reached two main conclusions based on our taxonomy and review. The first is that quantitative methods that extend predictions of behaviour and measurements of welfare beyond a simple market value basis are underutilized by integrated models. The second is that the accuracy of prediction for integrated models remains largely unquantified. Addressing both problems requires researchers to reach a common understanding of modelling goals and data requirements during the early stages of a project.

Relevance: 100.00%

Abstract:

Heterogeneity in lifetime data may be modelled by multiplying an individual's hazard by an unobserved frailty. We test for the presence of frailty of this kind in univariate and bivariate data with Weibull distributed lifetimes, using statistics based on the ordered Cox-Snell residuals from the null model of no frailty. The form of the statistics is suggested by outlier testing in the gamma distribution. We find through simulation that the sum of the k largest or k smallest order statistics, for suitably chosen k, provides a powerful test when the frailty distribution is assumed to be gamma or positive stable, respectively. We provide recommended values of k for sample sizes up to 100 and simple formulae for estimated critical values for tests at the 5% level.
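The test statistic described above can be illustrated with a minimal sketch. It assumes uncensored lifetimes and a Weibull null model with cumulative hazard (t / scale) ** shape, so that each Cox-Snell residual is simply that cumulative hazard evaluated at the observed lifetime. The function names are hypothetical. In practice the shape and scale would be estimated from the data, and the recommended k and critical values would come from the paper; neither is reproduced here.

```python
def cox_snell_residuals(lifetimes, shape, scale):
    """Cox-Snell residuals under a Weibull null model (no frailty).

    Assumption: uncensored data and cumulative hazard H(t) = (t / scale) ** shape.
    """
    return [(t / scale) ** shape for t in lifetimes]


def tail_sum_statistic(residuals, k, largest=True):
    """Sum of the k largest (or k smallest) ordered residuals."""
    if not 1 <= k <= len(residuals):
        raise ValueError("k must be between 1 and the sample size")
    ordered = sorted(residuals, reverse=largest)
    return sum(ordered[:k])


# Illustrative values only; shape and scale are fixed here rather than estimated.
residuals = cox_snell_residuals([1.0, 2.0, 3.0, 4.0], shape=2.0, scale=2.0)
upper_tail = tail_sum_statistic(residuals, k=2, largest=True)
lower_tail = tail_sum_statistic(residuals, k=2, largest=False)
```

Following the abstract, the upper-tail sum would be the statistic of interest for a gamma frailty alternative and the lower-tail sum for a positive stable alternative.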

Relevance: 100.00%

Abstract:

1. There is concern over the possibility of unwanted environmental change following transgene movement from genetically modified (GM) rapeseed Brassica napus to its wild and weedy relatives. 2. The aim of this research was to develop a remote sensing-assisted methodology to help quantify gene flow from crops to their wild relatives over wide areas. Emphasis was placed on locating sites of sympatry, where the frequency of gene flow is likely to be highest, and on measuring the size of rapeseed fields to allow spatially explicit modelling of wind-mediated pollen-dispersal patterns. 3. Remote sensing was used as a tool to locate rapeseed fields, and a variety of image-processing techniques was adopted to facilitate the compilation of a spatially explicit profile of sympatry between the crop and Brassica rapa. 4. Classified satellite images containing rapeseed fields were first used to infer the spatial relationship between donor rapeseed fields and recipient riverside B. rapa populations. Such images also have utility for improving the efficiency of ground surveys by identifying probable sites of sympatry. The same data were then also used for the calculation of mean field size. 5. This paper forms a companion paper to Wilkinson et al. (2003), in which these elements were combined to produce a spatially explicit profile of hybrid formation over the UK. The current paper demonstrates the value of remote sensing and image processing for large-scale studies of gene flow, and describes a generic method that could be applied to a variety of crops in many countries. 6. Synthesis and applications. The decision to approve or prevent the release of a GM cultivar is made at a national rather than regional level. It is highly desirable that data relating to the decision-making process are collected at the same scale, rather than relying on extrapolation from smaller experiments designed at the plot, field or even regional scale.
It would be extremely difficult and labour intensive to attempt to carry out such large-scale investigations without the use of remote-sensing technology. This study used rapeseed in the UK as a model to demonstrate the value of remote sensing in assembling empirical information at a national level.

Relevance: 100.00%

Abstract:

Event-related functional magnetic resonance imaging (efMRI) has emerged as a powerful technique for detecting brains' responses to presented stimuli. A primary goal in efMRI data analysis is to estimate the Hemodynamic Response Function (HRF) and to locate activated regions in human brains when specific tasks are performed. This paper develops new methodologies that are important improvements not only to parametric but also to nonparametric estimation and hypothesis testing of the HRF. First, an effective and computationally fast scheme for estimating the error covariance matrix for efMRI is proposed. Second, methodologies for estimation and hypothesis testing of the HRF are developed. Simulations support the effectiveness of our proposed methods. When applied to an efMRI dataset from an emotional control study, our method reveals more meaningful findings than the popular methods offered by AFNI and FSL. (C) 2008 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

Increasingly, distributed systems are being used to host all manner of applications. While these platforms provide a relatively cheap and effective means of executing applications, there has so far been little work on developing tools and utilities that help application developers understand problems with the supporting software or the executing applications. To fully understand why an application executing on a distributed system is not behaving as expected, it is important to analyse not only the application but also the underlying middleware and the operating system; otherwise issues could be missed, and overall performance profiling and fault diagnosis would be harder to interpret. We believe that one approach to profiling and analysing distributed systems and their applications is via the plethora of log files generated at runtime. In this paper we report on a system (Slogger) that uses various emerging Semantic Web technologies to gather the heterogeneous log files generated by the various layers in a distributed system and unify them in a common data store. Once unified, the log data can be queried and visualised in order to highlight potential problems that may be occurring in the supporting software or in the application itself.

Relevance: 100.00%

Abstract:

A unified approach is proposed for data modelling that includes supervised regression and classification applications as well as unsupervised probability density function estimation. The orthogonal-least-squares regression based on the leave-one-out test criteria is formulated within this unified data-modelling framework to construct sparse kernel models that generalise well. Examples from regression, classification and density estimation applications are used to illustrate the effectiveness of this generic data-modelling approach for constructing parsimonious kernel models with excellent generalisation capability. (C) 2008 Elsevier B.V. All rights reserved.

Relevance: 100.00%

Abstract:

In a world of almost permanent and rapidly increasing electronic data availability, techniques for filtering, compressing, and interpreting this data to transform it into valuable and easily comprehensible information are of utmost importance. One key topic in this area is the capability to deduce future system behavior from a given data input. This book brings together for the first time the complete theory of data-based neurofuzzy modelling and the linguistic attributes of fuzzy logic in a single cohesive mathematical framework. After introducing the basic theory of data-based modelling, new concepts including extended additive and multiplicative submodels are developed and their extensions to state estimation and data fusion are derived. All these algorithms are illustrated with benchmark and real-life examples to demonstrate their efficiency. Chris Harris and his group have carried out pioneering work which has tied together the fields of neural networks and linguistic rule-based algorithms. This book is aimed at researchers and scientists in time series modeling, empirical data modeling, knowledge discovery, data mining, and data fusion.

Relevance: 100.00%

Abstract:

The real-time parallel computation of histograms using an array of pipelined cells is proposed and prototyped in this paper with application to consumer imaging products. The array operates in two modes: histogram computation and histogram reading. The proposed parallel computation method does not use any memory blocks. The resulting histogram bins can be stored into an external memory block in a pipelined fashion for subsequent reading or streaming of the results. The array of cells can be tuned to accommodate the required data path width in a VLSI image processing engine as present in many imaging consumer devices. FPGA syntheses of the architectures presented in this paper are shown to compute the real-time histogram of images streamed at over 36 megapixels at 30 frames/s by processing 1, 2 or 4 pixels per clock cycle in parallel.
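As a rough software analogy for processing several pixels per clock cycle, the sketch below consumes a pixel stream in groups of `lanes` values, lets each lane keep its own partial counts, and merges them at the end. This is not the paper's architecture: the pipelined cell array and its memory-free design are not modelled, and the function name and parameters are hypothetical.

```python
def histogram_parallel_lanes(pixels, bins=256, lanes=4):
    """Histogram a pixel stream, taking `lanes` pixels per simulated cycle.

    Assumption: every pixel value is an integer in the range 0 .. bins - 1.
    """
    # One partial histogram per lane, so lanes never update the same counter.
    partial = [[0] * bins for _ in range(lanes)]
    for cycle_start in range(0, len(pixels), lanes):
        group = pixels[cycle_start:cycle_start + lanes]
        for lane, value in enumerate(group):
            partial[lane][value] += 1
    # Merge step: add the per-lane counts bin by bin.
    return [sum(lane_counts[b] for lane_counts in partial) for b in range(bins)]


stream = [0, 1, 1, 3, 3, 3, 2]
counts = histogram_parallel_lanes(stream, bins=4, lanes=2)
```

The result does not depend on the number of lanes; only the number of simulated cycles needed to consume the stream changes.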

Relevance: 100.00%

Abstract:

Major processor manufacturers have recently announced a dramatic shift in how they plan to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single-core CPUs, the trend is clearly towards multi-core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Work on parallel algorithms is not new per se, but concentrated efforts in the many application domains are still missing. Multi-core systems, clusters of workstations and even large-scale distributed computing infrastructures provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high-performance computing systems, research on the corresponding algorithms must be at the forefront of parallel algorithm research in order to keep making data mining and machine learning applications more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism. The first contribution addresses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. The second presents a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared-memory systems. The next paper discusses the design of a parallel approach to the frequent subgraph mining problem for distributed-memory systems. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational environments. The fourth paper describes the combined use and customization of software packages to facilitate top-down parallelism in the tuning of Support Vector Machines (SVMs), and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution focuses on very efficient feature selection and describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that provided detailed reviews for each paper. We would also like to thank Matthew Otey, who helped with publicity for the workshop.

Relevance: 100.00%

Abstract:

The Mitigation Options for Phosphorus and Sediment (MOPS) project investigated the effectiveness of within-field control measures (tramline management, straw residue management, type of cultivation and direction, and vegetative buffers) in terms of mitigating sediment and phosphorus loss from winter-sown combinable cereal crops using three case study sites. To determine the cost of the approaches, simple financial spreadsheet models were constructed at both farm and regional levels. Taking into account crop areas, crop rotation margins per hectare were calculated to reflect the costs of crop establishment, fertiliser and agro-chemical applications, harvesting, and the associated labour and machinery costs. Variable and operating costs associated with each mitigation option were then incorporated to demonstrate the impact on the relevant crop enterprise and crop rotation margins. These costs were then compared to runoff, sediment and phosphorus loss data obtained from monitoring hillslope-length scale field plots. Each of the mitigation options explored in this study had potential for reducing sediment and phosphorus losses from arable land under cereal crops. Reductions in sediment loss ranged from 9 kg ha−1 to as much as 4780 kg ha−1, with corresponding reductions in phosphorus loss of 0.03 kg ha−1 to 2.89 kg ha−1. In percentage terms, reductions in phosphorus loss were between 9% and 99%. Impacts on crop rotation margins also varied. Minimum tillage resulted in cost savings (up to £50 ha−1) whilst other options showed increased costs (up to £19 ha−1 for straw residue incorporation). Overall, the results indicate that each of the options has potential for on-farm implementation.
However, tramline management appeared to have the greatest potential for reducing runoff, sediment, and phosphorus losses from arable land (between 69% and 99%) and is likely to be considered cost-effective with only a small additional cost of £2–4 ha−1, although further work is needed to evaluate alternative tramline management methods. Tramline management is also the only option not incorporated within current policy mechanisms associated with reducing soil erosion and phosphorus loss and in light of its potential is an approach that should be encouraged once further evidence is available.

Relevance: 100.00%

Abstract:

Advances in hardware and software in the past decade make it possible to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence of these advances in order to cope with the real-time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data from continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, some applications require the data to be analysed in real time, as soon as it is captured, for example when the data stream is infinite, fast changing, or simply too large to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule-based adaptive classifier for data streams, based on an evolving set of rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new rules and removing old ones. It differs from the more popular decision-tree-based classifiers in that it tends to leave data instances unclassified rather than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting, which will also address the problem of changing feature domain values.
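The abstaining behaviour described above can be illustrated with a minimal sketch. This is not the eRules algorithm: rule induction, evaluation and removal are not shown, and the rule representation (a dictionary of attribute conditions paired with a label) and the function name are hypothetical.

```python
def classify(rules, instance):
    """Return the label of the first rule covering the instance, else None.

    Each rule is a (conditions, label) pair, where `conditions` maps an
    attribute name to the value it must take. Returning None means the
    instance is left unclassified instead of being forced into a class.
    """
    for conditions, label in rules:
        if all(instance.get(attr) == value for attr, value in conditions.items()):
            return label
    return None


# Hypothetical rule set and instances, for illustration only.
rules = [
    ({"outlook": "sunny", "windy": False}, "play"),
    ({"outlook": "rainy"}, "stay"),
]
covered = classify(rules, {"outlook": "sunny", "windy": False})
uncovered = classify(rules, {"outlook": "overcast", "windy": True})
```

A decision tree, by contrast, routes every instance to some leaf and therefore always returns a label.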