893 resultados para data driven approach


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Nitrogen adsorption on carbon nanotubes is wide- ly studied because nitrogen adsorption isotherm measurement is a standard method applied for porosity characterization. A further reason is that carbon nanotubes are potential adsorbents for separation of nitrogen from oxygen in air. The study presented here describes the results of GCMC simulations of nitrogen (three site model) adsorption on single and multi walled closed nanotubes. The results obtained are described by a new adsorption isotherm model proposed in this study. The model can be treated as the tube analogue of the GAB isotherm taking into account the lateral adsorbate-adsorbate interactions. We show that the model describes the simulated data satisfactorily. Next this new approach is applied for a description of experimental data measured on different commercially available (and characterized using HRTEM) carbon nanotubes. We show that generally a quite good fit is observed and therefore it is suggested that the observed mechanism of adsorption in the studied materials is mainly determined by adsorption on tubes separated at large distances, so the tubes behave almost independently.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Airborne lidar provides accurate height information of objects on the earth and has been recognized as a reliable and accurate surveying tool in many applications. In particular, lidar data offer vital and significant features for urban land-cover classification, which is an important task in urban land-use studies. In this article, we present an effective approach in which lidar data fused with its co-registered images (i.e. aerial colour images containing red, green and blue (RGB) bands and near-infrared (NIR) images) and other derived features are used effectively for accurate urban land-cover classification. The proposed approach begins with an initial classification performed by the Dempster–Shafer theory of evidence with a specifically designed basic probability assignment function. It outputs two results, i.e. the initial classification and pseudo-training samples, which are selected automatically according to the combined probability masses. Second, a support vector machine (SVM)-based probability estimator is adopted to compute the class conditional probability (CCP) for each pixel from the pseudo-training samples. Finally, a Markov random field (MRF) model is established to combine spatial contextual information into the classification. In this stage, the initial classification result and the CCP are exploited. An efficient belief propagation (EBP) algorithm is developed to search for the global minimum-energy solution for the maximum a posteriori (MAP)-MRF framework in which three techniques are developed to speed up the standard belief propagation (BP) algorithm. Lidar and its co-registered data acquired by Toposys Falcon II are used in performance tests. The experimental results prove that fusing the height data and optical images is particularly suited for urban land-cover classification. There is no training sample needed in the proposed approach, and the computational cost is relatively low. An average classification accuracy of 93.63% is achieved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In order to best utilize the limited resource of medical resources, and to reduce the cost and improve the quality of medical treatment, we propose to build an interoperable regional healthcare systems among several levels of medical treatment organizations. In this paper, our approaches are as follows:(1) the ontology based approach is introduced as the methodology and technological solution for information integration; (2) the integration framework of data sharing among different organizations are proposed(3)the virtual database to realize data integration of hospital information system is established. Our methods realize the effective management and integration of the medical workflow and the mass information in the interoperable regional healthcare system. Furthermore, this research provides the interoperable regional healthcare system with characteristic of modularization, expansibility and the stability of the system is enhanced by hierarchy structure.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

It is well known that there is a dynamic relationship between cerebral blood flow (CBF) and cerebral blood volume (CBV). With increasing applications of functional MRI, where the blood oxygen-level-dependent signals are recorded, the understanding and accurate modeling of the hemodynamic relationship between CBF and CBV becomes increasingly important. This study presents an empirical and data-based modeling framework for model identification from CBF and CBV experimental data. It is shown that the relationship between the changes in CBF and CBV can be described using a parsimonious autoregressive with exogenous input model structure. It is observed that neither the ordinary least-squares (LS) method nor the classical total least-squares (TLS) method can produce accurate estimates from the original noisy CBF and CBV data. A regularized total least-squares (RTLS) method is thus introduced and extended to solve such an error-in-the-variables problem. Quantitative results show that the RTLS method works very well on the noisy CBF and CBV data. Finally, a combination of RTLS with a filtering method can lead to a parsimonious but very effective model that can characterize the relationship between the changes in CBF and CBV.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Traditionally, the formal scientific output in most fields of natural science has been limited to peer- reviewed academic journal publications, with less attention paid to the chain of intermediate data results and their associated metadata, including provenance. In effect, this has constrained the representation and verification of the data provenance to the confines of the related publications. Detailed knowledge of a dataset’s provenance is essential to establish the pedigree of the data for its effective re-use, and to avoid redundant re-enactment of the experiment or computation involved. It is increasingly important for open-access data to determine their authenticity and quality, especially considering the growing volumes of datasets appearing in the public domain. To address these issues, we present an approach that combines the Digital Object Identifier (DOI) – a widely adopted citation technique – with existing, widely adopted climate science data standards to formally publish detailed provenance of a climate research dataset as an associated scientific workflow. This is integrated with linked-data compliant data re-use standards (e.g. OAI-ORE) to enable a seamless link between a publication and the complete trail of lineage of the corresponding dataset, including the dataset itself.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The complexity of current and emerging high performance architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven performance modelling approach is outlined that is appro- priate for modern multicore architectures. The approach is demonstrated by constructing a model of a simple shallow water code on a Cray XE6 system, from application-specific benchmarks that illustrate precisely how architectural char- acteristics impact performance. The model is found to recre- ate observed scaling behaviour up to 16K cores, and used to predict optimal rank-core affinity strategies, exemplifying the type of problem such a model can be used for.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Urbanization is one of the major forms of habitat alteration occurring at the present time. Although this is typically deleterious to biodiversity, some species flourish within these human-modified landscapes, potentially leading to negative and/or positive interactions between people and wildlife. Hence, up-to-date assessment of urban wildlife populations is important for developing appropriate management strategies. Surveying urban wildlife is limited by land partition and private ownership, rendering many common survey techniques difficult. Garnering public involvement is one solution, but this method is constrained by the inherent biases of non-standardised survey effort associated with voluntary participation. We used a television-led media approach to solicit national participation in an online sightings survey to investigate changes in the distribution of urban foxes in Great Britain and to explore relationships between urban features and fox occurrence and sightings density. Our results show that media-based approaches can generate a large national database on the current distribution of a recognisable species. Fox distribution in England and Wales has changed markedly within the last 25 years, with sightings submitted from 91% of urban areas previously predicted to support few or no foxes. Data were highly skewed with 90% of urban areas having <30 fox sightings per 1000 people km-2. The extent of total urban area was the only variable with a significant impact on both fox occurrence and sightings density in urban areas; longitude and percentage of public green urban space were respectively, significantly positively and negatively associated with sightings density only. Latitude, and distance to nearest neighbouring conurbation had no impact on either occurrence or sightings density. Given the limitations associated with this method, further investigations are needed to determine the association between sightings density and actual fox density, and variability of fox density within and between urban areas in Britain.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Advances in hardware technologies allow to capture and process data in real-time and the resulting high throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real-time. The creation and real-time adaption of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even so these algorithms are fast, they are challenged by high velocity data streams, where data instances are incoming at a fast rate. This is problematic if the applications desire that there is no or only a very little delay between changes in the patterns of the stream and absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The Environmental Data Abstraction Library provides a modular data management library for bringing new and diverse datatypes together for visualisation within numerous software packages, including the ncWMS viewing service, which already has very wide international uptake. The structure of EDAL is presented along with examples of its use to compare satellite, model and in situ data types within the same visualisation framework. We emphasize the value of this capability for cross calibration of datasets and evaluation of model products against observations, including preparation for data assimilation.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The complexity of current and emerging architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven model is developed for a simple shallow water code on a Cray XE6 system, to explore how deployment choices such as domain decomposition and core affinity affect performance. The resource sharing present in modern multi-core architectures adds various levels of heterogeneity to the system. Shared resources often includes cache, memory, network controllers and in some cases floating point units (as in the AMD Bulldozer), which mean that the access time depends on the mapping of application tasks, and the core's location within the system. Heterogeneity further increases with the use of hardware-accelerators such as GPUs and the Intel Xeon Phi, where many specialist cores are attached to general-purpose cores. This trend for shared resources and non-uniform cores is expected to continue into the exascale era. The complexity of these systems means that various runtime scenarios are possible, and it has been found that under-populating nodes, altering the domain decomposition and non-standard task to core mappings can dramatically alter performance. To find this out, however, is often a process of trial and error. To better inform this process, a performance model was developed for a simple regular grid-based kernel code, shallow. The code comprises two distinct types of work, loop-based array updates and nearest-neighbour halo-exchanges. Separate performance models were developed for each part, both based on a similar methodology. Application specific benchmarks were run to measure performance for different problem sizes under different execution scenarios. These results were then fed into a performance model that derives resource usage for a given deployment scenario, with interpolation between results as necessary.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Bloom filters are a data structure for storing data in a compressed form. They offer excellent space and time efficiency at the cost of some loss of accuracy (so-called lossy compression). This work presents a yes-no Bloom filter, which as a data structure consisting of two parts: the yes-filter which is a standard Bloom filter and the no-filter which is another Bloom filter whose purpose is to represent those objects that were recognised incorrectly by the yes-filter (that is, to recognise the false positives of the yes-filter). By querying the no-filter after an object has been recognised by the yes-filter, we get a chance of rejecting it, which improves the accuracy of data recognition in comparison with the standard Bloom filter of the same total length. A further increase in accuracy is possible if one chooses objects to include in the no-filter so that the no-filter recognises as many as possible false positives but no true positives, thus producing the most accurate yes-no Bloom filter among all yes-no Bloom filters. This paper studies how optimization techniques can be used to maximize the number of false positives recognised by the no-filter, with the constraint being that it should recognise no true positives. To achieve this aim, an Integer Linear Program (ILP) is proposed for the optimal selection of false positives. In practice the problem size is normally large leading to intractable optimal solution. Considering the similarity of the ILP with the Multidimensional Knapsack Problem, an Approximate Dynamic Programming (ADP) model is developed making use of a reduced ILP for the value function approximation. Numerical results show the ADP model works best comparing with a number of heuristics as well as the CPLEX built-in solver (B&B), and this is what can be recommended for use in yes-no Bloom filters. In a wider context of the study of lossy compression algorithms, our researchis an example showing how the arsenal of optimization methods can be applied to improving the accuracy of compressed data.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

ISO19156 Observations and Measurements (O&M) provides a standardised framework for organising information about the collection of information about the environment. Here we describe the implementation of a specialisation of O&M for environmental data, the Metadata Objects for Linking Environmental Sciences (MOLES3). MOLES3 provides support for organising information about data, and for user navigation around data holdings. The implementation described here, “CEDA-MOLES”, also supports data management functions for the Centre for Environmental Data Archival, CEDA. The previous iteration of MOLES (MOLES2) saw active use over five years, being replaced by CEDA-MOLES in late 2014. During that period important lessons were learnt both about the information needed, as well as how to design and maintain the necessary information systems. In this paper we review the problems encountered in MOLES2; how and why CEDA-MOLES was developed and engineered; the migration of information holdings from MOLES2 to CEDA-MOLES; and, finally, provide an early assessment of MOLES3 (as implemented in CEDA-MOLES) and its limitations. Key drivers for the MOLES3 development included the necessity for improved data provenance, for further structured information to support ISO19115 discovery metadata export (for EU INSPIRE compliance), and to provide appropriate fixed landing pages for Digital Object Identifiers (DOIs) in the presence of evolving datasets. Key lessons learned included the importance of minimising information structure in free text fields, and the necessity to support as much agility in the information infrastructure as possible without compromising on maintainability both by those using the systems internally and externally (e.g. citing in to the information infrastructure), and those responsible for the systems themselves. The migration itself needed to ensure continuity of service and traceability of archived assets.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

A new class of accelerating cosmological models driven by a one-parameter version of the general Chaplygin-type equation of state is proposed. The simplified version is naturally obtained from causality considerations with basis on the adiabatic sound speed vs plus the observed accelerating stage of the universe. We show that very stringent constraints on the unique free parameter a describing the simplified Chaplygin model can be obtained from a joint analysis involving the latest SNe type la data and the recent Sloan Digital Sky Survey measurement of baryon acoustic oscillations (BAO). In our analysis we have considered separately the SNe type la gold sample measured by [A.G. Riess et al.. Astrophys. J. 607 (2004) 665] and the supernova legacy survey (SNLS) from [P. Astier et al., Astron. Astrophys. 447 (2006) 31]. At 95.4% (c.l.), we find for BAO + gold sample, 0.91 <= alpha <= 1.0 and Omega(m) = 0.28(-0.048)(+0.043) while BAO + SNLS analysis provides 0.94 <= alpha <= 1.0 and Omega(m) = 0.27(-0.045)(+0.048). (C) 2008 Elsevier B.V. All rights reserved.