36 resultados para Processing wikipedia data
Resumo:
The increasing volume of data describing humandisease processes and the growing complexity of understanding, managing, and sharing such data presents a huge challenge for clinicians and medical researchers. This paper presents the@neurIST system, which provides an infrastructure for biomedical research while aiding clinical care, by bringing together heterogeneous data and complex processing and computing services. Although @neurIST targets the investigation and treatment of cerebral aneurysms, the system’s architecture is generic enough that it could be adapted to the treatment of other diseases.Innovations in @neurIST include confining the patient data pertaining to aneurysms inside a single environment that offers cliniciansthe tools to analyze and interpret patient data and make use of knowledge-based guidance in planning their treatment. Medicalresearchers gain access to a critical mass of aneurysm related data due to the system’s ability to federate distributed informationsources. A semantically mediated grid infrastructure ensures that both clinicians and researchers are able to seamlessly access andwork on data that is distributed across multiple sites in a secure way in addition to providing computing resources on demand forperforming computationally intensive simulations for treatment planning and research.
Resumo:
The spectral efficiency achievable with joint processing of pilot and data symbol observations is compared with that achievable through the conventional (separate) approach of first estimating the channel on the basis of the pilot symbols alone, and subsequently detecting the datasymbols. Studied on the basis of a mutual information lower bound, joint processing is found to provide a non-negligible advantage relative to separate processing, particularly for fast fading. It is shown that, regardless of the fading rate, only a very small number of pilot symbols (at most one per transmit antenna and per channel coherence interval) shouldbe transmitted if joint processing is allowed.
Resumo:
Polyphenols are a major class of bioactive phytochemicals whose consumption may play a role in the prevention of a number of chronic diseases such as cardiovascular diseases, type II diabetes and cancers. Phenol-Explorer, launched in 2009, is the only freely available web-based database on the content of polyphenols in food and their in vivo metabolism and pharmacokinetics. Here we report the third release of the database (Phenol-Explorer 3.0), which adds data on the effects of food processing on polyphenol contents in foods. Data on >100 foods, covering 161 polyphenols or groups of polyphenols before and after processing, were collected from 129 peer-reviewed publications and entered into new tables linked to the existing relational design. The effect of processing on polyphenol content is expressed in the form of retention factor coefficients, or the proportion of a given polyphenol retained after processing, adjusted for change in water content. The result is the first database on the effects of food processing on polyphenol content and, following the model initially defined for Phenol-Explorer, all data may be traced back to original sources. The new update will allow polyphenol scientists to more accurately estimate polyphenol exposure from dietary surveys. Database URL: http://www.phenol-explorer.eu
Resumo:
Peer-reviewed
Resumo:
This PhD project aims to study paraphrasing, initially understood as the different ways in which the same content is expressed linguistically. We will go into that concept in depth trying to define and delimit its scope more accurately. In that sense, we also aim to discover which kind of structures and phenomena it covers. Although there exist some paraphrasing typologies, the great majority of them only apply to English, and focus on lexical and syntactic transformations. Our intention is to go further into this subject and propose a paraphrasing typology for Spanish and Catalan combining lexical, syntactic, semantic and pragmatic knowledge. We apply a bottom-up methodology trying to collect evidence of this phenomenon from the data. For this purpose, we are initially using the Spanish Wikipedia as our corpus. The internal structure of this encyclopedia makes it a good resource for extracting paraphrasing examples for our investigation. This empirical approach will be complemented with the use of linguistic knowledge, and by comparing and contrasting our results to previously proposed paraphrasing typologies in order to enlarge the possible paraphrasing forms found in our corpus. The fact that the same content can be expressed in many different ways presents a major challenge for Natural Language Processing (NLP) applications. Thus, research on paraphrasing has recently been attracting increasing attention in the fields of NLP and Computational Linguistics. The results obtained in this investigation would be of great interest in many of these applications.
Resumo:
Configuració d'un entorn de desenvolupament en el IDE Eclipse. Introducció als SIG. Usos, utilitats i exemples. Conèixer la eina gvSIG. Conèixer els estàndards més estesos de l'Open Geospatial Consortium (OGC) i en especial del Web Processing Services. Analitzar, dissenyar i desenvolupar un client capaç de consumir serveis wps.
Resumo:
Estudi dels estàndards definits per l'Open Geospatial Consortium, i més concretament en l'estàndard Web Processing Service (wps). Així mateix, ha tingut una component pràctica que ha consistit en el disseny i desenvolupament d'un client capaç de consumir serveis Web creats segons wps i integrat a la plataforma gvSIG.
Resumo:
We present a georeferenced photomosaic of the Lucky Strike hydrothermal vent field (Mid-Atlantic Ridge, 37°18’N). The photomosaic was generated from digital photographs acquired using the ARGO II seafloor imaging system during the 1996 LUSTRE cruise, which surveyed a ~1 km2 zone and provided a coverage of ~20% of the seafloor. The photomosaic has a pixel resolution of 15 mm and encloses the areas with known active hydrothermal venting. The final mosaic is generated after an optimization that includes the automatic detection of the same benthic features across different images (feature-matching), followed by a global alignment of images based on the vehicle navigation. We also provide software to construct mosaics from large sets of images for which georeferencing information exists (location, attitude, and altitude per image), to visualize them, and to extract data. Georeferencing information can be provided by the raw navigation data (collected during the survey) or result from the optimization obtained from imatge matching. Mosaics based solely on navigation can be readily generated by any user but the optimization and global alignment of the mosaic requires a case-by-case approach for which no universally software is available. The Lucky Strike photomosaics (optimized and navigated-only) are publicly available through the Marine Geoscience Data System (MGDS, http://www.marine-geo.org). The mosaic-generating and viewing software is available through the Computer Vision and Robotics Group Web page at the University of Girona (http://eia.udg.es/_rafa/mosaicviewer.html)
Resumo:
Gaia is the most ambitious space astrometry mission currently envisaged and is a technological challenge in all its aspects. We describe a proposal for the payload data handling system of Gaia, as an example of a high-performance, real-time, concurrent, and pipelined data system. This proposal includes the front-end systems for the instrumentation, the data acquisition and management modules, the star data processing modules, and the payload data handling unit. We also review other payload and service module elements and we illustrate a data flux proposal.
Resumo:
Drift is an important issue that impairs the reliability of gas sensing systems. Sensor aging, memory effects and environmental disturbances produce shifts in sensor responses that make initial statistical models for gas or odor recognition useless after a relatively short period (typically few weeks). Frequent recalibrations are needed to preserve system accuracy. However, when recalibrations involve numerous samples they become expensive and laborious. An interesting and lower cost alternative is drift counteraction by signal processing techniques. Orthogonal Signal Correction (OSC) is proposed for drift compensation in chemical sensor arrays. The performance of OSC is also compared with Component Correction (CC). A simple classification algorithm has been employed for assessing the performance of the algorithms on a dataset composed by measurements of three analytes using an array of seventeen conductive polymer gas sensors over a ten month period.
Resumo:
When dealing with nonlinear blind processing algorithms (deconvolution or post-nonlinear source separation), complex mathematical estimations must be done giving as a result very slow algorithms. This is the case, for example, in speech processing, spike signals deconvolution or microarray data analysis. In this paper, we propose a simple method to reduce computational time for the inversion of Wiener systems or the separation of post-nonlinear mixtures, by using a linear approximation in a minimum mutual information algorithm. Simulation results demonstrate that linear spline interpolation is fast and accurate, obtaining very good results (similar to those obtained without approximation) while computational time is dramatically decreased. On the other hand, cubic spline interpolation also obtains similar good results, but due to its intrinsic complexity, the global algorithm is much more slow and hence not useful for our purpose.
Resumo:
In this work we present a simulation of a recognition process with perimeter characterization of a simple plant leaves as a unique discriminating parameter. Data coding allowing for independence of leaves size and orientation may penalize performance recognition for some varieties. Border description sequences are then used to characterize the leaves. Independent Component Analysis (ICA) is then applied in order to study which is the best number of components to be considered for the classification task, implemented by means of an Artificial Neural Network (ANN). Obtained results with ICA as a pre-processing tool are satisfactory, and compared with some references our system improves the recognition success up to 80.8% depending on the number of considered independent components.
Resumo:
Background Nowadays, combining the different sources of information to improve the biological knowledge available is a challenge in bioinformatics. One of the most powerful methods for integrating heterogeneous data types are kernel-based methods. Kernel-based data integration approaches consist of two basic steps: firstly the right kernel is chosen for each data set; secondly the kernels from the different data sources are combined to give a complete representation of the available data for a given statistical task. Results We analyze the integration of data from several sources of information using kernel PCA, from the point of view of reducing dimensionality. Moreover, we improve the interpretability of kernel PCA by adding to the plot the representation of the input variables that belong to any dataset. In particular, for each input variable or linear combination of input variables, we can represent the direction of maximum growth locally, which allows us to identify those samples with higher/lower values of the variables analyzed. Conclusions The integration of different datasets and the simultaneous representation of samples and variables together give us a better understanding of biological knowledge.
Resumo:
DnaSP is a software package for a comprehensive analysis of DNA polymorphism data. Version 5 implements a number of new features and analytical methods allowing extensive DNA polymorphism analyses on large datasets. Among other features, the newly implemented methods allow for: (i) analyses on multiple data files; (ii) haplotype phasing; (iii) analyses on insertion/deletion polymorphism data; (iv) visualizing sliding window results integrated with available genome annotations in the UCSC browser.
Resumo:
Peer-reviewed