836 results for Data fusion applications
Abstract:
In the Biodiversity World (BDW) project we have created a flexible and extensible Web Services-based Grid environment for biodiversity researchers to solve problems in biodiversity and analyse biodiversity patterns. In this environment, heterogeneous and globally distributed biodiversity-related resources such as data sets and analytical tools are made available to be accessed and assembled by users into workflows to perform complex scientific experiments. One such experiment is bioclimatic modelling of the geographical distribution of individual species using climate variables in order to predict past and future climate-related changes in species distribution. Data sources and analytical tools required for such analysis of species distribution are widely dispersed, available on heterogeneous platforms, present data in different formats and lack interoperability. The BDW system brings all these disparate units together so that the user can combine tools with little thought as to their availability, data formats and interoperability. The current Web Services-based Grid environment enables execution of the BDW workflow tasks on remote nodes, but with a limited scope. The next step in the evolution of the BDW architecture is to enable workflow tasks to utilise computational resources available within and outside the BDW domain. We describe the present BDW architecture and its transition to a new framework which provides a distributed computational environment for mapping and executing workflows in addition to bringing together heterogeneous resources and analytical tools.
Abstract:
Details about the parameters of kinetic systems are crucial for progress in both medical and industrial research, including drug development, clinical diagnosis and biotechnology applications. Such details must be collected by a series of kinetic experiments and investigations. The correct design of the experiment is essential to collecting data suitable for analysis, modelling and deriving the correct information. We have developed a systematic and iterative Bayesian method and sets of rules for the design of enzyme kinetic experiments. Our method selects the optimum design to collect data suitable for accurate modelling and analysis and minimises the error in the parameters estimated. The rules select features of the design such as the substrate range and the number of measurements. We show here that this method can be directly applied to the study of other important kinetic systems, including drug transport, receptor binding, microbial culture and cell transport kinetics. It is possible to reduce the errors in the estimated parameters and, most importantly, to increase efficiency and cost-effectiveness by reducing the number of experiments and data points that must be measured. (C) 2003 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Abstract:
Fifty years ago Carl Sauer suggested, controversially and on the basis of theory rather than evidence, that Southeast Asia was the source area for agriculture throughout the Old World, including the Pacific. Since then, the archaeobotanical record (macroscopic and microscopic) from the Pacific islands has increased, leading to suggestions, also still controversial, that Melanesia was a center of origin of agriculture independent of Southeast Asia, based on tree fruits and nuts and vegetatively propagated starchy staples. Such crops generally lack morphological markers of domestication, so exploitation, cultivation and domestication cannot easily be distinguished in the archaeological record. Molecular studies involving techniques such as chromosome painting, DNA fingerprinting and DNA sequencing can potentially complement the archaeological record by suggesting where species which were spread through the Pacific by man originated and by what routes they attained their present distributions. A combination of archaeobotanical and molecular studies should therefore eventually enable the rival claims of Melanesia versus Southeast Asia as independent centers of invention of agriculture to be assessed.
Abstract:
Because of the importance and potential usefulness of construction market statistics to firms and government, consistency between different sources of data is examined with a view to building a predictive model of construction output using construction data alone. However, a comparison of Department of Trade and Industry (DTI) and Office for National Statistics (ONS) series shows that the correlation coefficient (used as a measure of consistency) of the DTI output and DTI orders data and the correlation coefficient of the DTI output and ONS output data are low. It is not possible to derive a predictive model of DTI output based on DTI orders data alone. The question arises whether or not an alternative independent source of data may be used to predict DTI output data. Independent data produced by Emap Glenigan (EG), based on planning applications, potentially offers such a source of information. The EG data records the value of planning applications and their planned start and finish dates. However, as this data is ex ante and is not correlated with DTI output, it is not possible to use this data to describe the volume of actual construction output. Nor is it possible to use the EG planning data to predict DTI construction orders data. Further consideration of the issues raised reveals that it is not practically possible to develop a consistent predictive model of construction output using construction statistics gathered at different stages in the development process.
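The consistency measure named in this abstract is the correlation coefficient between two series. The abstract does not say which coefficient was computed, so the sketch below assumes the ordinary Pearson coefficient; the function name is hypothetical and no DTI, ONS or EG figures are reproduced here.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient of two equally long numeric series.

    Assumption: the abstract's "correlation coefficient" is taken to be
    the Pearson coefficient; the source does not confirm this.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of centred products; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A value near +1 or -1 would indicate that two series move together closely, while a value near 0 corresponds to the low consistency the abstract reports.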
Abstract:
A wireless sensor network (WSN) is a group of sensors linked by a wireless medium to perform distributed sensing tasks. WSNs have attracted wide interest from academia and industry alike due to their diversity of applications, including home automation, smart environments, and emergency services, in various buildings. The primary goal of a WSN is to collect data sensed by sensors. These data are typically very noisy and exhibit temporal and spatial correlation. As this paper demonstrates, various techniques are needed to analyse such data and extract useful information from them. Data mining is a process in which a wide spectrum of data analysis methods is used. It is applied in the paper to analyse data collected from WSNs monitoring an indoor environment in a building. A case study is given to demonstrate how data mining can be used to optimise the use of the office space in a building.
Abstract:
We have designed and implemented a low-cost digital system using closed-circuit television cameras coupled to a digital acquisition system for the recording of in vivo behavioral data in rodents and for allowing observation and recording of more than 10 animals simultaneously at a reduced cost, as compared with commercially available solutions. This system has been validated using two experimental rodent models: one involving chemically induced seizures and one assessing appetite and feeding. We present observational results showing comparable or improved levels of accuracy and observer consistency between this new system and traditional methods in these experimental models, discuss advantages of the presented system over conventional analog systems and commercially available digital systems, and propose possible extensions to the system and applications to non-rodent studies.
Abstract:
In the past decade, airborne LIght Detection And Ranging (LIDAR) has been recognised by both the commercial and public sectors as a reliable and accurate source for land surveying in environmental, engineering and civil applications. Commonly, the first task in investigating LIDAR point clouds is to separate ground and object points. Skewness Balancing has been proven to be an efficient non-parametric unsupervised classification algorithm to address this challenge. Initially developed for moderate terrain, this algorithm needs to be adapted to handle sloped terrain. This paper addresses the difficulty of object and ground point separation in LIDAR data in hilly terrain. A case study has been carried out on a LIDAR data set that is diverse in terms of data provider, resolution and LIDAR echo. Several sites in urban and rural areas with man-made structures and vegetation in moderate and hilly terrain have been investigated, and three categories have been identified. An urban scene with a river bank has been selected for deeper investigation in order to extend the existing algorithm. The results show that an iterative use of Skewness Balancing is suitable for sloped terrain.
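The abstract names Skewness Balancing without spelling out its steps. The sketch below follows the basic form in which the algorithm is usually described: the highest remaining point is removed repeatedly while the skewness of the remaining elevations is positive, removed points are labelled as object points and the rest as ground. It is an illustration under that assumption, not the authors' implementation, and it does not include the sloped-terrain extension discussed in the paper. All function names are hypothetical.

```python
from statistics import fmean


def skewness(values):
    """Population skewness (third standardised moment) of a list of numbers."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0  # all values identical: treat as balanced
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def skewness_balancing(heights):
    """Split point indices into (ground, objects) by elevation.

    Assumption: basic, single-pass form of the algorithm; the highest
    remaining point is peeled off while the distribution is still
    positively skewed.
    """
    # Indices ordered by ascending elevation, so pop() yields the highest point.
    ground = sorted(range(len(heights)), key=lambda i: heights[i])
    objects = []
    while len(ground) > 2 and skewness([heights[i] for i in ground]) > 0:
        objects.append(ground.pop())
    return sorted(ground), sorted(objects)
```

The attraction of this formulation, as the abstract notes, is that it is unsupervised and non-parametric: the stopping rule comes from the data themselves rather than from a height threshold chosen by the user.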
Abstract:
Knowledge-elicitation is a common technique used to produce rules about the operation of a plant from the knowledge that is available from human expertise. Similarly, data-mining is becoming a popular technique to extract rules from the data available from the operation of a plant. In the work reported here knowledge was required to enable the supervisory control of an aluminium hot strip mill by the determination of mill set-points. A method was developed to fuse knowledge-elicitation and data-mining to incorporate the best aspects of each technique, whilst avoiding known problems. Utilisation of the knowledge was through an expert system, which determined schedules of set-points and provided information to human operators. The results show that the method proposed in this paper was effective in producing rules for the on-line control of a complex industrial process. (C) 2005 Elsevier Ltd. All rights reserved.
Abstract:
A unified approach is proposed for data modelling that includes supervised regression and classification applications as well as unsupervised probability density function estimation. The orthogonal-least-squares regression based on the leave-one-out test criteria is formulated within this unified data-modelling framework to construct sparse kernel models that generalise well. Examples from regression, classification and density estimation applications are used to illustrate the effectiveness of this generic data-modelling approach for constructing parsimonious kernel models with excellent generalisation capability. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
We agree with Duckrow and Albano [Phys. Rev. E 67, 063901 (2003)] and Quian Quiroga et al. [Phys. Rev. E 67, 063902 (2003)] that mutual information (MI) is a useful measure of dependence for electroencephalogram (EEG) data, but we show that the improvement seen in the performance of MI on extracting dependence trends from EEG depends more on the type of MI estimator than on any embedding technique used. In an independent study we conducted in search of an optimal MI estimator, in particular for EEG applications, we examined the performance of a number of MI estimators on the data set used by Quian Quiroga et al. in their original study, where the performance of different dependence measures on real data was investigated [Phys. Rev. E 65, 041903 (2002)]. We show that for EEG applications the best performance among the investigated estimators is achieved by k-nearest neighbors, which supports the conjecture by Quian Quiroga et al. in Phys. Rev. E 67, 063902 (2003) that the nearest neighbor estimator is the most precise method for estimating MI.
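To make the idea of a k-nearest-neighbour MI estimator concrete, the sketch below implements the first estimator of Kraskov, Stögbauer and Grassberger for two scalar series. Whether this is the exact variant evaluated in the study is an assumption, as are the function names and the default k = 3. The brute-force neighbour search is quadratic in the number of samples and is meant only for small illustrations.

```python
EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma function at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_mutual_information(xs, ys, k=3):
    """k-nearest-neighbour MI estimate (in nats) for two scalar series.

    Assumption: Kraskov-Stoegbauer-Grassberger estimator, variant 1:
    I = psi(k) + psi(N) - mean(psi(n_x + 1) + psi(n_y + 1)).
    """
    n = len(xs)
    if n != len(ys) or n <= k:
        raise ValueError("need two series of equal length greater than k")
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to all other points in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbour
        # Count neighbours strictly closer than eps in each marginal space.
        n_x = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        n_y = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n
```

For independent series the estimate should fluctuate around zero (it can be slightly negative), and it grows as the dependence between the two series becomes stronger.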
Abstract:
The General Packet Radio Service (GPRS) was developed to allow packet data to be transported efficiently over an existing circuit-switched radio network. The main applications for GPRS are in transporting IP datagrams between the user’s mobile Internet browser and the Internet, or in telemetry equipment. A simple Error Detection and Correction (EDC) scheme to improve the GPRS Block Error Rate (BLER) performance is presented, particularly for coding scheme 4 (CS-4), although gains in other coding schemes are also seen. A GPRS radio block that is corrected by the EDC scheme does not need to be retransmitted, which releases bandwidth in the channel and improves throughput and the user’s application data rate. As GPRS requires intensive processing in the baseband, a viable hardware solution for a GPRS BLER co-processor, currently implemented in a Field Programmable Gate Array (FPGA), is discussed.
Abstract:
We propose a unified data modeling approach that is equally applicable to supervised regression and classification applications, as well as to unsupervised probability density function estimation. A particle swarm optimization (PSO) aided orthogonal forward regression (OFR) algorithm based on leave-one-out (LOO) criteria is developed to construct parsimonious radial basis function (RBF) networks with tunable nodes. Each stage of the construction process determines the center vector and diagonal covariance matrix of one RBF node by minimizing the LOO statistics. For regression applications, the LOO criterion is chosen to be the LOO mean square error, while the LOO misclassification rate is adopted in two-class classification applications. By adopting the Parzen window estimate as the desired response, the unsupervised density estimation problem is transformed into a constrained regression problem. This PSO aided OFR algorithm for tunable-node RBF networks is capable of constructing very parsimonious RBF models that generalize well, and our analysis and experimental results demonstrate that the algorithm is computationally even simpler than the efficient regularization assisted orthogonal least square algorithm based on LOO criteria for selecting fixed-node RBF models. Another significant advantage of the proposed learning procedure is that it does not have learning hyperparameters that have to be tuned using costly cross validation. The effectiveness of the proposed PSO aided OFR construction procedure is illustrated using several examples taken from regression and classification, as well as density estimation applications.
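The abstract uses the Parzen window estimate as the desired response that turns density estimation into a regression problem. As a minimal illustration of that target only (not of the PSO aided OFR algorithm), the sketch below evaluates a one-dimensional Parzen window estimate. The Gaussian kernel, the fixed bandwidth and the function name are assumptions; the abstract specifies none of them.

```python
import math


def parzen_window_density(x, samples, bandwidth):
    """One-dimensional Parzen window density estimate at point x.

    Assumption: Gaussian kernel with a single fixed bandwidth, i.e.
    p(x) = 1 / (N h sqrt(2 pi)) * sum_i exp(-(x - x_i)^2 / (2 h^2)).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )
```

Evaluating this function at each training point would give the target values that a sparse kernel model is then fitted to, which is how the unsupervised problem can be recast as a constrained regression.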
Abstract:
Light Detection And Ranging (LIDAR) is an important modality in terrain and land surveying for many environmental, engineering and civil applications. This paper presents the framework for a recently developed unsupervised classification algorithm called Skewness Balancing for object and ground point separation in airborne LIDAR data. The main advantages of the algorithm are threshold-freedom and independence from LIDAR data format and resolution, while preserving object and terrain details. The framework for Skewness Balancing has been built in this contribution with a prediction model in which unknown LIDAR tiles can be categorised as “hilly” or “moderate” terrains. Accuracy assessment of the model is carried out using cross-validation, with an overall accuracy of 95%. An extension to the algorithm is developed to address the overclassification issue for hilly terrain. For moderate terrain, the results show that detached objects (buildings and vegetation) and attached objects (bridges and motorway junctions) in the classified tiles are separated from bare earth (ground, roads and yards), which makes Skewness Balancing well suited for integration into geographic information system (GIS) software packages.