935 results for test data generation
Abstract:
In this paper, we address issues in the segmentation of remotely sensed LIDAR (LIght Detection And Ranging) data. The LIDAR data, captured by an airborne laser scanner, contain 2.5-dimensional (2.5D) terrain surface height information, e.g. houses, vegetation, flat fields, rivers and basins. Our aim in this paper is to segment ground (flat field) from non-ground (houses and high vegetation) in hilly urban areas. By projecting the 2.5D data onto a surface, we obtain a texture map as a grey-level image. Based on this image, Gabor wavelet filters are applied to generate Gabor wavelet features. These features are then grouped into various windows. Among these windows, a combination of their first- and second-order statistics is used as a measure to determine surface properties. The test results show that ground areas can be successfully segmented from LIDAR data. Most buildings and high vegetation can be detected. In addition, the Gabor wavelet transform can partially remove hill or slope effects in the original data by tuning the Gabor parameters.
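As a rough illustration of the feature-extraction step described above, the sketch below applies Gabor filters to a grey-level image and summarises each window by the mean and variance of the filter responses. It is a minimal sketch, not the authors' implementation; the frequencies, window size and use of scikit-image's `gabor` filter are assumptions.

```python
# Minimal sketch, assuming scikit-image's gabor filter; frequencies, window
# size and the mean/variance summary are illustrative choices, not the
# authors' exact parameters.
import numpy as np
from skimage.filters import gabor

def gabor_window_features(image, frequencies=(0.1, 0.2, 0.4), window=16):
    """Per-window mean and variance (first- and second-order statistics)
    of Gabor filter responses, one (mean, var) pair per frequency."""
    h, w = image.shape
    feats = []
    for freq in frequencies:
        real, _ = gabor(image, frequency=freq)  # real part of the response
        stats = [(real[i:i + window, j:j + window].mean(),
                  real[i:i + window, j:j + window].var())
                 for i in range(0, h - window + 1, window)
                 for j in range(0, w - window + 1, window)]
        feats.append(np.asarray(stats))
    return np.concatenate(feats, axis=1)  # shape: (n_windows, 2 * n_frequencies)
```

Thresholding or clustering these per-window feature vectors would then separate smooth, flat-field windows (low response variance) from windows containing buildings or high vegetation.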
Abstract:
As terabyte datasets become the norm, the focus has shifted away from our ability to produce and store ever larger amounts of data and onto its utilization. It is becoming increasingly difficult to gain meaningful insights into the data produced, and many forms of the data we are currently producing cannot easily fit into traditional visualization methods. This paper presents a novel visualization technique based on the concept of a Data Forest. Our Data Forest has been designed to use virtual reality (VR) as its presentation method. VR is a natural medium for investigating large datasets. Our approach can easily be adapted to a variety of settings, from a stand-alone single-user environment to large multi-user collaborative environments. A test application using multi-dimensional data is presented to demonstrate the concepts involved.
Abstract:
As we increase our ability to produce and store ever larger amounts of data, it is becoming increasingly difficult to understand what the data are telling us. Not all the data we currently produce fit easily into traditional visualization methods. This paper presents a novel visualization technique based on the concept of a Data Forest. Our Data Forest has been developed for use with virtual reality (VR) systems. VR is a natural information medium. This approach can easily be adapted for collaborative environments. A test application has been developed to demonstrate the concepts involved, and a collaborative version has been tested.
Abstract:
A unified approach is proposed for data modelling that includes supervised regression and classification applications as well as unsupervised probability density function estimation. Orthogonal-least-squares regression based on the leave-one-out test criterion is formulated within this unified data-modelling framework to construct sparse kernel models that generalise well. Examples from regression, classification and density estimation applications are used to illustrate the effectiveness of this generic data-modelling approach for constructing parsimonious kernel models with excellent generalisation capability.
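The sketch below illustrates the general idea of forward selection of kernel regressors scored by a leave-one-out criterion. It is not the paper's algorithm (which orthogonalises the regressors and updates the criterion recursively); the Gaussian kernel, its width and the stopping rule are assumptions.

```python
# Minimal sketch, not the paper's recursive OLS formulation: greedy forward
# selection of Gaussian kernel regressors, scoring each candidate by the
# exact leave-one-out MSE of the enlarged linear-in-parameters model.
import numpy as np

def loo_mse(Phi, y):
    """Exact leave-one-out MSE for y ~ Phi @ w via the hat-matrix identity."""
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    H = Phi @ np.linalg.pinv(Phi.T @ Phi) @ Phi.T
    e = y - Phi @ w
    return np.mean((e / (1.0 - np.diag(H))) ** 2)  # LOO residual: e_i / (1 - H_ii)

def forward_select(X, y, width=1.0, max_terms=10):
    """Grow a sparse kernel model one centre at a time; X has shape (n, d)."""
    K = np.exp(-((X[:, None, :] - X[None, :, :]) ** 2).sum(-1) / (2 * width**2))
    chosen = []
    for _ in range(max_terms):
        best = min((i for i in range(len(X)) if i not in chosen),
                   key=lambda i: loo_mse(K[:, chosen + [i]], y))
        if chosen and loo_mse(K[:, chosen + [best]], y) >= loo_mse(K[:, chosen], y):
            break  # stop once the LOO criterion no longer improves
        chosen.append(best)
    return chosen  # indices of the selected kernel centres
```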
Abstract:
Agri-environment schemes (AESs) have been implemented across EU member states in an attempt to reconcile agricultural production methods with protection of the environment and maintenance of the countryside. To determine the extent to which such policy objectives are being fulfilled, participating countries are obliged to monitor and evaluate the environmental, agricultural and socio-economic impacts of their AESs. However, few evaluations measure precise environmental outcomes and, critically, there are no agreed methodologies to evaluate the benefits of particular agri-environmental measures, or to track the environmental consequences of changing agricultural practices. In response to these issues, the Agri-Environmental Footprint project developed a common methodology for assessing the environmental impact of European AESs. The Agri-Environmental Footprint Index (AFI) is a farm-level, adaptable methodology that aggregates measurements of agri-environmental indicators based on Multi-Criteria Analysis (MCA) techniques. The method was developed specifically to allow assessment of differences in the environmental performance of farms according to participation in agri-environment schemes. The AFI methodology is constructed so that high values represent good environmental performance. This paper explores the use of the AFI methodology in combination with Farm Business Survey data collected in England for the Farm Accountancy Data Network (FADN), to test whether its use could be extended for the routine surveillance of environmental performance of farming systems using established data sources. Overall, the aim was to measure the environmental impact of three different types of agriculture (arable, lowland livestock and upland livestock) in England and to identify differences in AFI due to participation in agri-environment schemes. However, because farm size, farmer age, level of education and region are also likely to influence the environmental performance of a holding, these factors were also considered. Application of the methodology revealed that only arable holdings participating in agri-environment schemes showed greater environmental performance than non-participants, although responses differed between regions. Of the other explanatory variables explored, the key factors determining the environmental performance of lowland livestock holdings were farm size, farmer age and level of education. In contrast, the AFI value of upland livestock holdings differed only between regions. The paper demonstrates that the AFI methodology can be used readily with English FADN data and therefore has the potential to be applied more widely to similar data sources routinely collected across the EU-27 in a standardised manner.
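A hypothetical sketch of the kind of MCA-style aggregation an AFI-like index performs: normalised indicator scores are combined with weights so that higher values represent better environmental performance. The indicator names, scores and weights below are invented for illustration.

```python
# Hypothetical sketch of an MCA-style weighted aggregation; the indicator
# names, 0-10 scores and weights are invented for illustration.
def afi_score(indicators, weights):
    """Weighted mean of normalised (0-10) indicator scores; higher is better."""
    total_weight = sum(weights.values())
    return sum(indicators[name] * w for name, w in weights.items()) / total_weight

farm = {"soil_cover": 8.0, "pesticide_use": 4.5, "hedgerow_length": 6.0}
weights = {"soil_cover": 0.40, "pesticide_use": 0.35, "hedgerow_length": 0.25}
print(afi_score(farm, weights))  # farm-level AFI on a 0-10 scale
```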
Abstract:
Resistance baselines were obtained for the first-generation anticoagulant rodenticides chlorophacinone and diphacinone using laboratory, caesarean-derived Norway rats (Rattus norvegicus) as the susceptible strain and the blood clotting response test method. The ED99 estimates for a quantal response were: chlorophacinone, males 0.86 mg kg⁻¹, females 1.03 mg kg⁻¹; diphacinone, males 1.26 mg kg⁻¹, females 1.60 mg kg⁻¹. The dose-response data also showed that chlorophacinone was significantly (p < 0.0001) more potent than diphacinone for both male and female rats, and that male rats were more susceptible than females to both compounds (p < 0.002). The ED99 doses were then given to groups of five male and five female rats of the Welsh and Hampshire warfarin-resistant strains. Twenty-four hours later, prothrombin times were slightly elevated in both strains, but all the animals were classified as resistant to the two compounds, indicating cross-resistance from warfarin to diphacinone and chlorophacinone. When rats of the two resistant strains were fed for six consecutive days on baits containing either diphacinone or chlorophacinone, many animals survived, indicating that their resistance might enable them to survive treatments with these compounds in the field.
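For illustration, an ED99 for a quantal response can be estimated by fitting a dose-response curve and inverting it at 99%; the sketch below fits a logistic model on log-dose. The doses and response fractions are invented, and the paper's estimates come from the blood clotting response test, not from this model.

```python
# Hedged sketch: fit a logistic model on log-dose to invented quantal data
# and invert it at p = 0.99; not the paper's blood-clotting-response method.
import numpy as np
from scipy.optimize import curve_fit

def logistic(logdose, b0, b1):
    return 1.0 / (1.0 + np.exp(-(b0 + b1 * logdose)))

dose = np.array([0.25, 0.5, 1.0, 2.0, 4.0])       # mg/kg (illustrative)
responded = np.array([0.1, 0.3, 0.6, 0.9, 1.0])    # fraction responding

(b0, b1), _ = curve_fit(logistic, np.log(dose), responded, p0=[0.0, 1.0])
# Invert the fitted curve: logit(0.99) = b0 + b1 * log(ED99).
ed99 = np.exp((np.log(0.99 / 0.01) - b0) / b1)
print(f"ED99 ≈ {ed99:.2f} mg/kg")
```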
Abstract:
This study investigates the intonation of Chinese and Arabic learners of English using the computerized test battery Profiling Elements of Prosody for Speech and Communication (PEPS-C). The aims were to ascertain which aspects of intonation are difficult for these learners, and to determine whether PEPS-C can be used to assess the intonation of adult learners. Although some results were significantly different from native-speaker data, raw scores showed that the learner groups performed well in most tasks, which may indicate that the learners' level is too high for PEPS-C to be useful. However, PEPS-C did reveal that Arabic learners performed significantly worse at contrastive stress placement, and Chinese learners performed significantly worse at assessing likes and dislikes.
Abstract:
This study details the validation of two separate multiplex STR systems for use in paternity investigations: the Second Generation Multiplex (SGM) developed by the UK Forensic Science Service and the PowerPlex 1 multiplex commercially available from Promega Inc. (Madison, WI, USA). These multiplexes contain 12 different STR systems (two are duplicated in the two systems). Population databases from Caucasian, Asian and Afro-Caribbean populations have been compiled for all loci. In all but two of the 36 STR/ethnic group combinations, no evidence was obtained to indicate inconsistency with Hardy-Weinberg (HW) proportions. Empirical and theoretical approaches have been taken to validate these systems for paternity testing. Samples from 121 cases of disputed paternity were analysed using established Single Locus Probe (SLP) tests currently in use, and also using the two multiplex STR systems. Results of all three test systems were compared and no non-conformities in the conclusions were observed, although four examples of apparent germ-line mutations in the STR systems were identified. The data were analysed to give information on expected paternity indices and exclusion rates for these STR systems. The 12 systems combined comprise a highly discriminating test suitable for paternity testing: 99.96% of non-fathers are excluded from paternity on two or more STR systems, and where no exclusion is found, Paternity Index (PI) values of > 10,000 are expected in > 96% of cases.
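As a simplified illustration of how per-locus results combine into PI figures of the size quoted above: in the simplest trio case the single-locus PI is 1/p when the alleged father is homozygous for the paternal allele of population frequency p, and 1/(2p) when heterozygous, and the combined PI is the product over independent loci. The frequencies and genotypes below are invented, and real casework calculations are more involved.

```python
# Simplified illustration, not the SGM/PowerPlex casework calculation:
# single-locus PI is 1/p (alleged father homozygous for the paternal allele
# of frequency p) or 1/(2p) (heterozygous); independent loci multiply.
from math import prod

def locus_pi(p, father_homozygous):
    return 1.0 / p if father_homozygous else 1.0 / (2.0 * p)

# (paternal allele frequency, alleged father homozygous?) for 12 loci
loci = [(0.11, False), (0.21, True), (0.08, False), (0.15, False),
        (0.19, True), (0.07, False), (0.12, False), (0.25, False),
        (0.09, True), (0.18, False), (0.14, False), (0.10, False)]

combined_pi = prod(locus_pi(p, homo) for p, homo in loci)
print(f"Combined PI ≈ {combined_pi:.3g}")  # values > 10,000 are typical here
```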
Abstract:
An important goal in computational neuroanatomy is the complete and accurate simulation of neuronal morphology. We are developing computational tools to model three-dimensional dendritic structures based on sets of stochastic rules. This paper reports an extensive, quantitative anatomical characterization of simulated motoneurons and Purkinje cells. We used several local and global algorithms implemented in the L-Neuron and ArborVitae programs to generate sets of virtual neurons. Parameter statistics for all algorithms were measured from experimental data, thus providing a compact and consistent description of these morphological classes. We compared the emergent anatomical features of each group of virtual neurons with those of the experimental database in order to gain insight into the plausibility of the model assumptions, potential improvements to the algorithms, and non-trivial relations among morphological parameters. Algorithms mainly based on local constraints (e.g., branch diameter) were successful in reproducing many morphological properties of both motoneurons and Purkinje cells (e.g., total length, asymmetry, number of bifurcations). The addition of global constraints (e.g., trophic factors) improved the angle-dependent emergent characteristics (average Euclidean distance from the soma to the dendritic terminations, dendritic spread). Virtual neurons systematically displayed greater anatomical variability than real cells, suggesting the need for additional constraints in the models. For several emergent anatomical properties, a specific algorithm reproduced the experimental statistics better than the others. However, relative performances were often reversed for different anatomical properties and/or morphological classes. Thus, combining the strengths of alternative generative models could lead to comprehensive algorithms for the complete and accurate simulation of dendritic morphology.
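A toy sketch in the spirit of local-rule stochastic generation (not the actual L-Neuron or ArborVitae algorithms): each branch samples a length from a diameter-dependent distribution and bifurcates with a diameter-dependent probability. All distributions and constants below are invented stand-ins for measured parameter statistics.

```python
# Toy sketch of local-rule stochastic growth (not L-Neuron/ArborVitae):
# branch length is sampled from a diameter-dependent Gaussian and the
# bifurcation probability scales with diameter. All constants are invented.
import random

def grow(diameter, depth=0, max_depth=12):
    """Recursively grow a tree; returns a list of (depth, length, diameter)."""
    length = max(random.gauss(50.0 * diameter, 10.0), 1.0)  # local length rule
    segments = [(depth, length, diameter)]
    p_branch = min(1.0, diameter / 2.0)  # thicker branches bifurcate more often
    if depth < max_depth and random.random() < p_branch:
        # Rall-style split: conserve d^1.5 across the two daughters.
        d_child = (diameter ** 1.5 / 2.0) ** (1.0 / 1.5)
        segments += grow(d_child, depth + 1, max_depth)
        segments += grow(d_child, depth + 1, max_depth)
    return segments

tree = grow(diameter=2.0)
print(len(tree), "segments; total length =", sum(s[1] for s in tree))
```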
Abstract:
A role for sequential test procedures is emerging in genetic and epidemiological studies using banked biological resources. This stems from the methodology's potential for improved use of information relative to comparable fixed-sample designs. Studies in which cost, time and ethics feature prominently are particularly suited to a sequential approach. In this paper, sequential procedures for matched case–control studies with binary data are investigated and assessed. Design issues such as sample size evaluation and error rates are identified and addressed. The methodology is illustrated and evaluated using both real and simulated data sets.
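As a generic illustration of a sequential procedure for matched pairs with binary data (not the specific procedures assessed in the paper), the sketch below runs a sequential probability ratio test on the discordant pairs, which carry the information in a matched design: under the null hypothesis the exposed member of a discordant pair is equally likely to be the case or the control, while under an odds ratio psi the case is exposed with probability psi/(1 + psi). The error rates and psi are illustrative.

```python
# Generic SPRT sketch for matched pairs with binary exposure; boundaries
# use the standard Wald approximations. Illustrative, not the paper's method.
import math

def sprt_matched(discordant_pairs, psi=2.0, alpha=0.05, beta=0.10):
    """discordant_pairs: iterable of 1 (case exposed) or 0 (control exposed)."""
    p0, p1 = 0.5, psi / (1.0 + psi)
    upper = math.log((1 - beta) / alpha)  # cross above: reject H0
    lower = math.log(beta / (1 - alpha))  # cross below: accept H0
    llr, n = 0.0, 0
    for n, x in enumerate(discordant_pairs, start=1):
        llr += math.log((p1 if x else 1 - p1) / (p0 if x else 1 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    return "continue sampling", n

print(sprt_matched([1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]))
```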
Abstract:
Over recent years there has been an increasing deployment of renewable energy generation technologies, particularly large-scale wind farms. As wind farm deployment increases, it is vital to gain a good understanding of how the energy produced is affected by climate variations, over a wide range of time-scales, from short (hours to weeks) to long (months to decades) periods. By relating wind speed at specific sites in the UK to a large-scale climate pattern (the North Atlantic Oscillation or "NAO"), the power generated by a modelled wind turbine under three different NAO states is calculated. It was found that the wind conditions under these NAO states may yield a difference in the mean wind power output of up to 10%. A simple model is used to demonstrate that forecasts of future NAO states can potentially be used to improve month-ahead statistical forecasts of monthly-mean wind power generation. The results confirm that the NAO has a significant impact on the hourly-, daily- and monthly-mean power output distributions from the turbine with important implications for (a) the use of meteorological data (e.g. their relationship to large scale climate patterns) in wind farm site assessment and, (b) the utilisation of seasonal-to-decadal climate forecasts to estimate future wind farm power output. This suggests that further research into the links between large-scale climate variability and wind power generation is both necessary and valuable.
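A minimal sketch of the wind-to-power step: site wind speeds are passed through an idealised turbine power curve and mean output is compared across climate states. The cut-in/rated/cut-out speeds, rated power and the Weibull wind samples for each NAO state are invented, not the paper's data.

```python
# Minimal sketch with an idealised power curve and invented Weibull wind
# samples per NAO state; all turbine parameters are placeholders.
import numpy as np

def power_curve(v, cut_in=3.5, rated=13.0, cut_out=25.0, p_rated=2000.0):
    """Power (kW) from wind speed v (m/s): cubic ramp between cut-in and rated."""
    v = np.asarray(v, dtype=float)
    ramp = p_rated * (v**3 - cut_in**3) / (rated**3 - cut_in**3)
    p = np.where((v >= cut_in) & (v < rated), ramp, 0.0)
    return np.where((v >= rated) & (v < cut_out), p_rated, p)

rng = np.random.default_rng(0)
for state, mean_speed in [("NAO+", 9.0), ("NAO0", 8.0), ("NAO-", 7.2)]:
    # Weibull(k=2) samples scaled to the target monthly-mean speed.
    winds = rng.weibull(2.0, 720) * mean_speed / 0.886
    print(state, f"mean power ≈ {power_curve(winds).mean():.0f} kW")
```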
Abstract:
Variational data assimilation systems for numerical weather prediction rely on a transformation of model variables to a set of control variables that are assumed to be uncorrelated. Most implementations of this transformation are based on the assumption that the balanced part of the flow can be represented by the vorticity. However, this assumption is likely to break down in dynamical regimes characterized by low Burger number. It has recently been proposed that a variable transformation based on potential vorticity should lead to control variables that are uncorrelated over a wider range of regimes. In this paper we test the assumption that a transform based on vorticity and one based on potential vorticity produce an uncorrelated set of control variables. Using a shallow-water model, we calculate the correlations between the transformed variables under the different methods. We show that the control variables resulting from a vorticity-based transformation may retain large correlations in some dynamical regimes, whereas a potential-vorticity-based transformation successfully produces a set of uncorrelated control variables. Calculations of spatial correlations show that the benefit of the potential vorticity transformation is linked to its ability to capture more accurately the balanced component of the flow.
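For concreteness, the shallow-water potential vorticity underlying the proposed transform is q = (ζ + f)/h, with ζ the relative vorticity, f the Coriolis parameter and h the fluid depth. The sketch below computes vorticity and a PV perturbation from gridded fields plus a simple correlation diagnostic; the fields are random stand-ins, so the numbers illustrate the calculation only, not the paper's regime-dependent result.

```python
# Hedged sketch of the diagnostic: relative vorticity and a shallow-water
# PV perturbation from gridded fields, with a correlation check. Fields are
# random stand-ins, not model output.
import numpy as np

rng = np.random.default_rng(1)
f, H = 1e-4, 1000.0                      # Coriolis parameter (1/s), depth (m)
ny, nx, dx = 64, 64, 1e4                 # grid size and spacing (m)
u = rng.normal(0.0, 1.0, (ny, nx))       # zonal wind stand-in
v = rng.normal(0.0, 1.0, (ny, nx))       # meridional wind stand-in
eta = rng.normal(0.0, 1.0, (ny, nx))     # height perturbation stand-in

zeta = np.gradient(v, dx, axis=1) - np.gradient(u, dx, axis=0)  # vorticity
q = (zeta + f) / (H + eta) - f / H       # shallow-water PV perturbation

def corr(a, b):
    return np.corrcoef(a.ravel(), b.ravel())[0, 1]

print("corr(vorticity, height):", corr(zeta, eta))
print("corr(PV, height):       ", corr(q, eta))
```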
Abstract:
PV generates electricity only during daylight hours, and primarily over summer. In the UK, the carbon intensity of grid electricity is higher during the daytime and over winter. This work investigates whether the grid electricity displaced by PV is high or low carbon compared to the annual mean carbon intensity, using carbon factors at higher temporal resolutions (half-hourly and daily). UK policy for carbon reporting requires savings to be calculated using the annual mean carbon intensity of grid electricity; this work offers an insight into whether that technique is appropriate. Using half-hourly data on the generating plant supplying the grid from November 2008 to May 2010, carbon factors for grid electricity at half-hourly and daily resolution have been derived using technology-specific generation emission factors. Applying these factors to generation data from PV systems installed on schools, it is possible to assess the variation in the carbon savings from displacing grid electricity with PV generation using carbon factors with different time resolutions. The data have been analyzed for a period of 363 to 370 days and so cannot account for inter-year variations in the relationship between PV generation and the carbon intensity of the electricity grid. This analysis suggests that PV displaces more carbon-intensive electricity using half-hourly carbon factors than using daily factors, but less compared with annual ones. A similar methodology could provide useful insights for other variable renewable and demand-side technologies, and in other countries where PV performance and grid behavior are different.
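A minimal sketch of the accounting comparison described above: carbon displaced by PV generation scored with half-hourly carbon factors versus a single annual-mean factor. The intensity and generation series are invented, with daytime intensity set higher so the two methods visibly diverge.

```python
# Minimal sketch: time-matched vs annual-mean carbon accounting for PV.
# Both series are invented; daytime grid intensity is set higher so the
# two accounting methods give different totals.
import numpy as np

rng = np.random.default_rng(2)
n = 48 * 365                                    # half-hourly periods in a year
hour = (np.arange(n) % 48) / 2.0
daytime = (hour >= 8) & (hour <= 16)
carbon = 0.40 + 0.06 * daytime + 0.01 * rng.standard_normal(n)  # kgCO2/kWh
pv = np.where(daytime, np.clip(rng.normal(1.0, 0.3, n), 0, None), 0.0)  # kWh

savings_halfhourly = np.sum(pv * carbon)        # time-matched factors
savings_annual = np.sum(pv) * carbon.mean()     # single annual-mean factor
print(f"half-hourly: {savings_halfhourly:.0f} kgCO2")
print(f"annual-mean: {savings_annual:.0f} kgCO2")
```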
Abstract:
The performance of flood inundation models is often assessed using satellite-observed data; however, these data have inherent uncertainty. In this study we assess the impact of this uncertainty when calibrating a flood inundation model (LISFLOOD-FP) for a flood event in December 2006 on the River Dee, North Wales, UK. The flood extent is delineated from an ERS-2 SAR image of the event using an active contour model (snake), and water levels at the flood margin are calculated through intersection of the shoreline vector with LiDAR topographic data. Gauged water levels are used to create a reference water surface slope for comparison with the satellite-derived water levels. Residuals between the satellite-observed data points and those from the reference line are spatially clustered into groups of similar values. We show that model calibration achieved using pattern matching of observed and predicted flood extent is negatively influenced by this spatial dependency in the data. By contrast, model calibration using water elevations produces realistic calibrated optimum friction parameters even when spatial dependency is present. To test the impact of removing spatial dependency, a new method of evaluating flood inundation model performance is developed using multiple random subsamples of the water surface elevation data points. By testing for spatial dependency using Moran’s I, multiple subsamples of water elevations that have no significant spatial dependency are selected. The model is then calibrated against these data and the results averaged. This gives a near-identical result to calibration using spatially dependent data, but has the advantage of being a statistically robust assessment of model performance in which we can have more confidence. Moreover, by using the variations found in the subsamples of the observed data it is possible to assess the effects of observational uncertainty on the assessment of flooding risk.
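A sketch of the subsampling idea: draw random subsets of the water-elevation points, retain only subsets showing no notable spatial autocorrelation under Moran's I, and calibrate against those. The inverse-distance weights, the simple |I| threshold standing in for a proper significance test, and the point data are all assumptions for illustration.

```python
# Sketch of the subsample screen: inverse-distance Moran's I on random
# subsets of invented water-elevation points; the 0.1 threshold is a crude
# stand-in for a significance test.
import numpy as np

def morans_i(xy, values):
    """Moran's I with inverse-distance weights and a zero diagonal."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    w = np.where(d > 0, 1.0 / np.maximum(d, 1e-9), 0.0)
    z = values - values.mean()
    return (len(values) / w.sum()) * (z @ w @ z) / (z @ z)

rng = np.random.default_rng(3)
xy = rng.uniform(0, 1000, (200, 2))      # point locations (m)
z = rng.normal(10.0, 0.3, 200)           # water surface elevations (m)

kept = [idx for idx in (rng.choice(200, size=30, replace=False)
                        for _ in range(100))
        if abs(morans_i(xy[idx], z[idx])) < 0.1]
print(len(kept), "subsamples retained for calibration")
```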
Abstract:
The transition to a low-carbon economy urgently demands better information on the drivers of energy consumption. UK government policy has prioritized energy efficiency in the built stock as a means of carbon reduction, but the sector is historically information-poor, particularly the non-domestic building stock. This paper presents the results of a pilot study that investigated whether and how property and energy consumption data might be combined for non-domestic energy analysis. These data were combined in a ‘Non-Domestic Energy Efficiency Database’ to describe the location and physical attributes of each property and its energy consumption. The aim was to support the generation of a range of energy-efficiency statistics for the industrial, commercial and institutional sectors of the non-domestic building stock, and to provide robust evidence for national energy-efficiency and carbon-reduction policy development and monitoring. The work has brought together non-domestic energy data, property data and mapping in a ‘data framework’ for the first time. The results show what is possible when these data are integrated, and the associated difficulties. A data framework offers the potential to inform energy-efficiency policy formation and to support its monitoring at a level of detail not previously possible.