917 results for structuration of lexical data bases


Relevance: 100.00%

Abstract:

The virtual quadrilateral is the coalescence of novel data structures that reduces the storage requirements of spatial data without jeopardizing the quality and operability of the inherent information. The data representative of the observed area are parsed to ascertain the contiguous measures that, when contained, implicitly define a quadrilateral. The virtual quadrilateral then represents a geolocated area of the observed space where all of the measures are the same. The area, contoured as a rectangle, is pseudo-delimited by the opposite coordinates of the bounding area. Once defined, the virtual quadrilateral represents an area in the observed space and is stored in a database as the attributes of its bounding coordinates and the measure of its contiguous space. Virtual quadrilaterals have been found to ensure a lossless reduction of the physical storage, maintain the implied features of the data, facilitate the rapid retrieval of vast amounts of the represented spatial data, and accommodate complex queries. The methods presented herein demonstrate that virtual quadrilaterals are created quite easily, are stable and versatile objects in a database, and have proven beneficial to exigent spatial data applications such as geographic information systems.
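
A minimal sketch of the idea, under assumed names and a greedy row-merging strategy that the dissertation does not necessarily use: a rectangular region of uniform measure is stored only as its opposite corner coordinates plus the shared value, and the full grid can be reconstructed losslessly from that list.

```python
# Hypothetical sketch of the "virtual quadrilateral" idea described above.
# Names and the greedy merging strategy are illustrative assumptions, not
# the dissertation's actual algorithm.
from dataclasses import dataclass

@dataclass
class VirtualQuadrilateral:
    row0: int      # top-left corner (inclusive)
    col0: int
    row1: int      # bottom-right corner (inclusive)
    col1: int
    measure: float # the single value shared by every cell in the rectangle

def compress(grid):
    """Greedily cover a 2D grid with virtual quadrilaterals of equal value."""
    rows, cols = len(grid), len(grid[0])
    covered = [[False] * cols for _ in range(rows)]
    quads = []
    for r in range(rows):
        for c in range(cols):
            if covered[r][c]:
                continue
            v = grid[r][c]
            # Grow rightward along the row while the measure stays the same.
            c2 = c
            while c2 + 1 < cols and not covered[r][c2 + 1] and grid[r][c2 + 1] == v:
                c2 += 1
            # Grow downward while every cell in the candidate row matches.
            r2 = r
            while r2 + 1 < rows and all(
                not covered[r2 + 1][k] and grid[r2 + 1][k] == v
                for k in range(c, c2 + 1)
            ):
                r2 += 1
            for rr in range(r, r2 + 1):
                for cc in range(c, c2 + 1):
                    covered[rr][cc] = True
            quads.append(VirtualQuadrilateral(r, c, r2, c2, v))
    return quads

if __name__ == "__main__":
    grid = [
        [1, 1, 2],
        [1, 1, 2],
        [3, 3, 3],
    ]
    for q in compress(grid):  # three quadrilaterals replace nine cells
        print(q)
```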

Relevance: 100.00%

Abstract:

The primary aim of this dissertation is to develop data mining tools for knowledge discovery in biomedical data when multiple (homogeneous or heterogeneous) sources of data are available. The central hypothesis is that, when information from multiple sources of data is used appropriately and effectively, knowledge discovery can be better achieved than is possible from a single source alone.

Recent advances in high-throughput technology have enabled biomedical researchers to generate large volumes of diverse types of data on a genome-wide scale. These data include DNA sequences, gene expression measurements, and much more; they provide the motivation for building analysis tools to elucidate the modular organization of the cell. The challenges include efficiently and accurately extracting information from the multiple data sources, representing the information effectively, developing analytical tools, and interpreting the results in the context of the domain.

The first part considers the application of feature-level integration to design classifiers that discriminate between soil types. The machine learning tools, SVM and KNN, were used to successfully distinguish between several soil samples.

The second part considers clustering using multiple heterogeneous data sources. The resulting Multi-Source Clustering (MSC) algorithm was shown to perform better than clustering methods that use only a single data source or a simple feature-level integration of heterogeneous data sources.

The third part proposes a new approach to effectively incorporate incomplete data into clustering analysis. Adapted from the K-means algorithm, the Generalized Constrained Clustering (GCC) algorithm makes use of incomplete data in the form of constraints to perform exploratory analysis. Novel approaches for extracting constraints were proposed. For sufficiently large constraint sets, the GCC algorithm outperformed the MSC algorithm.

The last part considers the problem of providing a theme-specific environment for mining multi-source biomedical data. The database PlasmoTFBM, focusing on gene regulation of Plasmodium falciparum, contains diverse information and has a simple interface that allows biologists to explore the data. It provided a framework for comparing different analytical tools for predicting regulatory elements and for designing useful data mining tools.

The conclusion is that the experiments reported in this dissertation strongly support the central hypothesis.
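
The feature-level integration used in the first part can be illustrated with a short sketch: feature vectors from two sources describing the same samples are concatenated into one representation before training a classifier. The synthetic data, the two-source split, and the choice of scikit-learn's SVC are illustrative assumptions, not the dissertation's actual datasets or pipeline.

```python
# Illustrative sketch of feature-level integration: two heterogeneous feature
# sets for the same samples are scaled and concatenated column-wise before
# classification. All data here are synthetic assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, size=n)

# Two hypothetical sources measuring the same samples with different features.
source_a = rng.normal(labels[:, None], 1.0, size=(n, 10))  # e.g. spectral data
source_b = rng.normal(labels[:, None], 2.0, size=(n, 5))   # e.g. chemical assays

# Feature-level integration: scale each source, then concatenate.
fused = np.hstack([StandardScaler().fit_transform(source_a),
                   StandardScaler().fit_transform(source_b)])

X_train, X_test, y_train, y_test = train_test_split(fused, labels, random_state=0)
clf = SVC().fit(X_train, y_train)
print(f"held-out accuracy: {clf.score(X_test, y_test):.2f}")
```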

Relevance: 100.00%

Abstract:

Approaches to quantifying organic carbon accumulation on a global scale generally do not consider the small-scale variability of sedimentary and oceanographic boundary conditions along continental margins. In this study, we present a new approach to regionalize the total organic carbon (TOC) content in surface sediments (<5 cm sediment depth). It is based on a compilation of more than 5500 single measurements from various sources. Global TOC distribution was determined by the application of a combined qualitative and quantitative geostatistical method. Overall, 33 benthic TOC-based provinces were defined and used to map the global distribution pattern of TOC content in surface sediments at a 1°×1° grid resolution. Regional dependencies of data points within each province are expressed by modeled semi-variograms. Measured and estimated TOC values show good correlation, supporting the applicability of the method. The accumulation of organic carbon in marine surface sediments is a key parameter in the control of mineralization processes and the material exchange between the sediment and the ocean water. Our approach will help to improve global budgets of nutrient and carbon cycles.
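
The geostatistical core of such a regionalization can be sketched with an empirical semi-variogram estimated from scattered measurements within one province; a model (e.g. spherical) would then be fitted to these bin estimates before interpolation. The coordinates and TOC values below are synthetic assumptions for illustration.

```python
# Minimal sketch of an empirical semi-variogram for scattered TOC measurements
# within one benthic province. All data are synthetic assumptions.
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """gamma(h): mean of 0.5*(z_i - z_j)^2 over pairs whose separation falls in each bin."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)  # count each pair once
    dists, semis = d[iu], sq[iu]
    gamma = np.full(len(bin_edges) - 1, np.nan)
    for k in range(len(bin_edges) - 1):
        mask = (dists >= bin_edges[k]) & (dists < bin_edges[k + 1])
        if mask.any():
            gamma[k] = semis[mask].mean()
    return gamma

rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(150, 2))                    # km within a province
values = np.sin(coords[:, 0] / 20) + rng.normal(0, 0.1, 150)   # synthetic TOC (%)
print(empirical_semivariogram(coords, values, np.linspace(0, 50, 6)))
```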

Relevance: 100.00%

Abstract:

In this study we present a global distribution pattern and budget of the minimum flux of particulate organic carbon to the sea floor (J_POC^alpha). The estimations are based on regionally specific correlations between the diffusive oxygen flux across the sediment-water interface, the total organic carbon content in surface sediments, and the oxygen concentration in bottom waters. For this, we modified the principal equation of Cai and Reimers [1995] into a basic Monod reaction rate, applied within 11 regions where in situ measurements of diffusive oxygen uptake exist. By applying the resulting transfer functions to other regions with similar sedimentary conditions, and by areal interpolation, we calculated a minimum global budget of particulate organic carbon that actually reaches the sea floor of ~0.5 GtC yr⁻¹ (>1000 m water depth (wd)), whereas approximately 0.002-0.12 GtC yr⁻¹ is buried in the sediments (0.01-0.4% of surface primary production). Although our global budget is in good agreement with previous studies, we found conspicuous differences among the distribution patterns of primary production, calculations based on particle trap collections of the POC flux, and the J_POC^alpha of this study. These deviations, located especially in the southeastern and southwestern Atlantic Ocean, the Greenland and Norwegian Seas, and the entire equatorial Pacific Ocean, strongly indicate a considerable influence of lateral particle transport on the vertical link between surface waters and underlying sediments. This observation is supported by sediment trap data. Furthermore, local differences in the availability and quality of the organic matter, as well as different transport mechanisms through the water column, are discussed.
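
The transfer-function idea can be sketched as a Monod-type rate: diffusive oxygen uptake saturates in bottom-water oxygen and scales with surface-sediment TOC. The functional form, parameter names, and values below are assumptions for illustration only, not the study's fitted regional equations.

```python
# Hedged illustration of a Monod-type transfer function in the spirit of the
# study. Parameters a and k_o2 are placeholders that would be fitted per
# region from in situ diffusive oxygen uptake (DOU) measurements; the values
# here are arbitrary assumptions.
def dou_monod(toc_percent: float, o2_bottom_umol: float,
              a: float = 1.0, k_o2: float = 30.0) -> float:
    """DOU (mmol O2 m^-2 d^-1) ~ a * TOC * O2 / (K + O2)."""
    return a * toc_percent * o2_bottom_umol / (k_o2 + o2_bottom_umol)

# The POC flux reaching the sea floor can then be estimated from DOU via a
# carbon-to-oxygen stoichiometric conversion (another assumption here).
print(dou_monod(toc_percent=1.0, o2_bottom_umol=150.0))
```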

Relevance: 100.00%

Abstract:

We thank Orkney Islands Council for access to Eynhallow and Talisman Energy (UK) Ltd and Marine Scotland for fieldwork and equipment support. Handling and tagging of fulmars was conducted under licences from the British Trust for Ornithology and the UK Home Office. EE was funded by a Marine Alliance for Science and Technology for Scotland/University of Aberdeen College of Life Sciences and Medicine studentship and LQ was supported by a NERC Studentship. Thanks also to the many colleagues who assisted with fieldwork during the project, and to Helen Bailey and Arliss Winship for advice on implementing the state-space model.

Relevance: 100.00%

Abstract:

Background: Statin therapy reduces the risk of occlusive vascular events, but uncertainty remains about potential effects on cancer. We sought to provide a detailed assessment of any effects on cancer of lowering LDL cholesterol (LDL-C) with a statin, using individual patient records from 175,000 patients in 27 large-scale statin trials.

Methods and Findings: Individual records of 134,537 participants in 22 randomised trials of statin versus control (median duration 4.8 years) and 39,612 participants in 5 trials of more intensive versus less intensive statin therapy (median duration 5.1 years) were obtained. Reducing LDL-C with a statin for about 5 years had no effect on newly diagnosed cancer or on death from such cancers, either in the trials of statin versus control (cancer incidence: 3755 [1.4% per year [py]] versus 3738 [1.4% py], RR 1.00 [95% CI 0.96-1.05]; cancer mortality: 1365 [0.5% py] versus 1358 [0.5% py], RR 1.00 [95% CI 0.93-1.08]) or in the trials of more versus less statin (cancer incidence: 1466 [1.6% py] versus 1472 [1.6% py], RR 1.00 [95% CI 0.93-1.07]; cancer mortality: 447 [0.5% py] versus 481 [0.5% py], RR 0.93 [95% CI 0.82-1.06]). Moreover, there was no evidence of any effect of reducing LDL-C with statin therapy on cancer incidence or mortality for any of 23 individual site categories, with increasing years of treatment, for any individual statin, or in any given subgroup. In particular, among individuals with low baseline LDL-C (<2 mmol/L), there was no evidence that further LDL-C reduction (from about 1.7 to 1.3 mmol/L) increased cancer risk (381 [1.6% py] versus 408 [1.7% py]; RR 0.92 [99% CI 0.76-1.10]).

Conclusions: In 27 randomised trials, a median of five years of statin therapy had no effect on the incidence of, or mortality from, any type of cancer (or the aggregate of all cancer).
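
As a back-of-envelope check, the headline rate ratio and confidence interval for cancer incidence in the statin-versus-control trials can be reproduced from the event counts alone, under the simplifying assumption of equal person-years in both arms (the trials' actual analyses used stratified methods, so this is illustrative only).

```python
# Unstratified approximation of the cancer-incidence rate ratio reported above,
# assuming equal person-years in both arms.
import math

events_statin, events_control = 3755, 3738
rr = events_statin / events_control
se_log_rr = math.sqrt(1 / events_statin + 1 / events_control)
lo, hi = (rr * math.exp(s * 1.96 * se_log_rr) for s in (-1, 1))
print(f"RR = {rr:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # RR = 1.00 (0.96-1.05)
```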

Relevance: 100.00%

Abstract:

Systematic, high-quality observations of the atmosphere, oceans and terrestrial environments are required to improve understanding of climate characteristics and the consequences of climate change. The overall aim of this report is to carry out a comparative assessment of the approaches taken to addressing the state of European observation systems, and the related data analysis, by some leading actors in the field. This research reports on approaches to climate observations and analyses in Ireland, Switzerland, Germany, the Netherlands and Austria, and explores options for a more coordinated approach to national responses to climate observations in Europe. The key aspects addressed are: an assessment of approaches to developing GCOS and providing analysis of GCOS data; an evaluation of how these countries are reporting the development of GCOS; the highlighting of best practice in advancing GCOS implementation, including the analysis of Essential Climate Variables (ECVs); a comparative summary of the differences and synergies in the reporting of climate observations; and an overview of relevant European initiatives, with recommendations on how identified gaps might be addressed in the short to medium term.

Relevance: 100.00%

Abstract:

Multi-frequency eddy current measurements are employed to estimate the pressure tube (PT) to calandria tube (CT) gap in CANDU fuel channels, a critical inspection activity required to ensure the fitness for service of fuel channels. In this thesis, a comprehensive characterization of eddy current gap data is laid out in order to extract further information on fuel channel condition and to identify generalized applications for multi-frequency eddy current data. A surface profiling technique, generalizable to multiple probe and conductive material configurations, has been developed. This technique has allowed for the identification of various pressure tube artefacts, has been independently validated (using ultrasonic measurements), and has been deployed and commissioned at Ontario Power Generation. Dodd and Deeds solutions to the electromagnetic boundary value problem associated with the PT to CT gap probe configuration were experimentally validated for amplitude response to changes in gap. Using the validated Dodd and Deeds solutions, principal components analysis (PCA) was employed to identify independence and redundancies in multi-frequency eddy current data, allowing for an enhanced visualization of the factors affecting gap measurement. Results of the PCA of simulation data are consistent with the skin depth equation and are validated against PCA of physical experiments. Finally, compressed data acquisition has been realized, allowing faster data acquisition for multi-frequency eddy current systems with hardware limitations; the approach is generalizable to other applications where real-time acquisition of large data sets is prohibitive.
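
To make the PCA step concrete, a minimal sketch is given below: simulated multi-frequency channel responses, correlated through an assumed exponential skin-depth attenuation, are decomposed to show how much variance a single component carries. The frequencies, signal model, and noise level are illustrative assumptions, not the thesis's probe model.

```python
# Illustrative PCA of multi-frequency eddy current responses. The synthetic
# skin-depth attenuation model and frequencies are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_scans = 500
freqs_khz = np.array([4.0, 8.0, 16.0, 32.0])

gap_mm = rng.uniform(0.0, 16.0, n_scans)  # PT-to-CT gap driving the response
# Lower frequencies penetrate deeper (skin effect), so channels are correlated:
# assumed decay length scales as 1/sqrt(frequency).
X = np.exp(-gap_mm[:, None] / (10.0 / np.sqrt(freqs_khz)))  # (scans, channels)
X += rng.normal(0, 0.01, X.shape)

Xc = X - X.mean(axis=0)                      # center each channel
_, s, _ = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / np.sum(s**2)
print("variance explained per component:", np.round(explained, 3))
# A dominant first component indicates the channels carry largely redundant
# gap information, consistent with the skin depth equation noted above.
```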

Relevance: 100.00%

Abstract:

The convex hull describes the extent or shape of a set of data and is used ubiquitously in computational geometry. Common algorithms to construct the convex hull of a finite set of n points (x, y) range from O(n log n) time to O(n) time. However, a heuristic procedure is often applied first to reduce the original set of n points to a set of s < n points that contains the hull, and so accelerates the final hull-finding procedure. We present an algorithm to precondition data before building a 2D convex hull with integer coordinates, with three distinct advantages. First, for all practical purposes, it is linear; second, no explicit sorting of the data is required; and third, the reduced set of s points forms an ordered set that can be pipelined directly into an O(n) time convex hull algorithm. Under these criteria, a fast (or O(n)) preconditioner in principle creates a fast convex hull (approximately O(n)) for an arbitrary set of points. The paper empirically evaluates and quantifies the acceleration the method delivers against the most common convex hull algorithms, finding an extra speedup of at least four times over existing preconditioning methods in experiments on a dataset.
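
For comparison, a minimal sketch of the classic Akl-Toussaint "throw-away" preconditioning heuristic illustrates the general idea of reducing n points to s < n hull candidates; this is an assumed stand-in, not the paper's own preconditioner, which additionally emits the survivors in an order ready for pipelining into an O(n) hull algorithm.

```python
# Akl-Toussaint throw-away heuristic: discard points strictly inside the
# quadrilateral formed by the four extreme points. Assumes the four extreme
# points are distinct (degenerate inputs would need extra handling).
def precondition(points):
    min_x = min(points, key=lambda p: p[0])
    max_x = max(points, key=lambda p: p[0])
    min_y = min(points, key=lambda p: p[1])
    max_y = max(points, key=lambda p: p[1])
    quad = [min_x, min_y, max_x, max_y]  # counter-clockwise extreme quadrilateral

    def inside(p):
        # p is strictly inside iff it lies strictly left of every directed edge
        for a, b in zip(quad, quad[1:] + quad[:1]):
            cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
            if cross <= 0:
                return False
        return True

    return [p for p in points if not inside(p)]

pts = [(0, 2), (2, 0), (5, 3), (3, 5), (2, 2), (3, 3)]
print(precondition(pts))  # interior points (2, 2) and (3, 3) are discarded
```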
