885 results for Semantic Publishing, Linked Data, Bibliometrics, Informetrics, Data Retrieval, Citations
Abstract:
Personal data is a key asset for many companies, since it is essential to providing personalized services. Not all companies, and in particular new market entrants, have the opportunity to access the data they need to run their business. In this paper, we describe a comprehensive personal data framework that allows service providers to share and exchange personal data and knowledge about users, while enabling users to decide who can access which data and why. We analyze the challenges related to personal data collection, integration, retrieval, and identity and privacy management, and present the framework architecture that addresses them. We also include the validation of the framework in a banking scenario, where social and financial data is collected and properly combined to generate new socio-economic knowledge about users that is then used by a personal lending service.
Abstract:
This paper forms part of Lukasz Mikolajczyk's PhD dissertation, which is supervised by Karen Milek
Abstract:
The HIV Reverse Transcriptase and Protease Sequence Database is an on-line relational database that catalogs evolutionary and drug-related sequence variation in the human immunodeficiency virus (HIV) reverse transcriptase (RT) and protease enzymes, the molecular targets of anti-HIV therapy (http://hivdb.stanford.edu). The database contains a compilation of nearly all published HIV RT and protease sequences, including submissions from International Collaboration databases and sequences published in journal articles. Sequences are linked to data about the source of the sequence sample and the antiretroviral drug treatment history of the individual from whom the isolate was obtained. During the past year 3500 sequences have been added and the data model has been expanded to include drug susceptibility data on sequenced isolates. Database content has also been integrated with didactic text and the output of two sequence analysis programs.
Abstract:
Controversy still exists over the adaptive nature of variation of enzyme loci. In conifers, random amplified polymorphic DNAs (RAPDs) represent a class of marker loci that is unlikely to fall within or be strongly linked to coding DNA. We have compared the genetic diversity in natural populations of black spruce [Picea mariana (Mill.) B.S.P.] using genotypic data at allozyme loci and RAPD loci as well as phenotypic data from inferred RAPD fingerprints. The genotypic data for both allozymes and RAPDs were obtained from at least six haploid megagametophytes for each of 75 sexually mature individuals distributed in five populations. Heterozygosities and population fixation indices were in complete agreement between allozyme loci and RAPD loci. In black spruce, it is more likely that the similar levels of variation detected at both enzyme and RAPD loci are due to such evolutionary forces as migration and the mating system, rather than to balancing selection and overdominance. Furthermore, we show that biased estimates of expected heterozygosity and among-population differentiation are obtained when using allele frequencies derived from dominant RAPD phenotypes.
Abstract:
We present a comprehensive study of the influence of the geomagnetic field on the energy estimation of extensive air showers with a zenith angle smaller than 60 degrees, detected at the Pierre Auger Observatory. The geomagnetic field induces an azimuthal modulation of the estimated energy of cosmic rays up to the ~2% level at large zenith angles. We present a method to account for this modulation of the reconstructed energy. We analyse the effect of the modulation on large scale anisotropy searches in the arrival direction distributions of cosmic rays. At a given energy, the geomagnetic effect is shown to induce a pseudo-dipolar pattern at the percent level in the declination distribution that needs to be accounted for.
Open business intelligence: on the importance of data quality awareness in user-friendly data mining
Abstract:
Citizens demand more and more data for making decisions in their daily life. Therefore, mechanisms that allow citizens to understand and analyze linked open data (LOD) in a user-friendly manner are much needed. To this end, the concept of Open Business Intelligence (OpenBI) is introduced in this position paper. OpenBI helps non-expert users to (i) analyze and visualize LOD, thus generating actionable information by means of reporting, OLAP analysis, dashboards or data mining; and to (ii) share the newly acquired information as LOD to be reused by anyone. One of the most challenging issues of OpenBI is related to data mining, since non-experts (such as citizens) need guidance during preprocessing and application of mining algorithms due to the complexity of the mining process and the low quality of the data sources. This is even worse when dealing with LOD, not only because of the different kinds of links among data, but also because of its high dimensionality. As a consequence, in this position paper we advocate that data mining for OpenBI requires data quality-aware mechanisms for guiding non-expert users in obtaining and sharing the most reliable knowledge from the available LOD.
Abstract:
There is currently an overwhelming number of scientific publications in Life Sciences, especially in Genetics and Biotechnology. This huge amount of information is structured in corporate Data Warehouses (DW) or in Biological Databases (e.g. UniProt, RCSB Protein Data Bank, CEREALAB or GenBank), whose main drawback is the cost of updating them, which means they easily become obsolete. However, these Databases are the main tool for enterprises when they want to update their internal information, for example when a plant breeder enterprise needs to enrich its genetic information (internal structured Database) with recently discovered genes related to specific phenotypic traits (external unstructured data) in order to choose the desired parentals for breeding programs. In this paper, we propose to complement the internal information with external data from the Web using Question Answering (QA) techniques. We go a step further by providing a complete framework for integrating unstructured and structured information by combining traditional Databases and DW architectures with QA systems. The great advantage of our framework is that decision makers can instantly compare internal data with external data from competitors, thereby enabling quick strategic decisions based on richer data.
Abstract:
Information retrieval is concerned with, among other things, answering questions such as: Is a document relevant to a query? Are two queries or two documents similar? How can the similarity between two queries or documents be used to improve the estimation of relevance? To answer these questions, each document and query must be associated with computer-interpretable representations. Once these representations are estimated, similarity can correspond, for example, to a distance or a divergence operating in the representation space. It is generally accepted that the quality of a representation has a direct impact on the estimation error with respect to true relevance as judged by a human. Estimating good representations of documents and queries has long been a central problem in information retrieval. The aim of this thesis is to propose new methods for estimating the representations of documents and queries and the relevance relation between them, and thereby to modestly advance the state of the art in the field. We present four articles published in international conferences and one article published in an evaluation forum. The first two articles concern methods that build the representation space according to prior knowledge about the features that are important for the task at hand. These lead us to present a new information retrieval model that differs from existing models in its theoretical basis and experimental effectiveness. The last two articles mark a fundamental change in the approach to building representations. They benefit in particular from the recent research interest in deep learning techniques based on neural networks.
These learning models automatically elicit the features that are important for the given task from a large amount of data. We are interested in modeling the semantic relations between documents and queries as well as between two or more queries. These last articles mark the first applications of neural-network representation learning to information retrieval. The proposed models also produced improved performance on standard test collections. Our work leads us to the following general conclusion: performance in information retrieval could be drastically improved by relying on representation learning approaches.
Abstract:
The Middle Valley segment at the northern end of the Juan de Fuca Ridge is a deep extensional rift blanketed with 200-500 m of Pleistocene turbiditic sediment. Sites 857 and 858 were drilled during Ocean Drilling Program Leg 139 to determine whether these two sites were hydrologically linked end members of an active hydrothermal circulation system. Site 858 was placed in an area of active hydrothermal discharge with fluids up to 270°C venting through anhydrite-bearing mounds on top of altered sediment. The shallow basement of fine-grained basalt that underlies the vents at Site 858 is interpreted as a seamount that was subsequently buried by turbidites. Site 857 was placed 1.6 km south of the Site 858 vents in a zone of high heat flow and numerous seismically imaged ridge-parallel faults. Drilling at Site 857 encountered sediments that are increasingly altered with depth and that overlie a series of mafic sills at depths of 460-940 m below sea floor. Sill margins and adjacent baked sediment are highly altered to magnesian chlorite and crosscut with veins filled with quartz, chlorite, sulfides, epidote, and wairakite. The sill interiors vary from slightly altered, with unaltered plagioclase and clinopyroxene in a mesostasis replaced by chlorite, to local zones of intense alteration and brecciation. In these latter zones, the sill interiors are pervasively replaced by chlorite, epidote, quartz, pyrite, titanite, and rare actinolite. The most complete replacement is associated with brecciated horizons with low recovery and slickensides on fracture surfaces, which we interpret as intersections between faults and the sills. Geochemically, the alteration of the sill complex is reflected in significant whole-rock depletions in Ca, Sr, and Na with corresponding enrichments in Mg, Al, and most metals. The latter results from the formation of conspicuous sulfide poikiloblasts. 
In contrast, metamorphism of the Site 858 seamount includes incomplete albitization of plagioclase phenocrysts and replacement of sparse mafic phenocrysts. Much of the basement alteration at Site 858 is confined to crosscutting veins except for a highly altered and veined horizon at the contact between basaltic basement and the overlying sediment. The sill complex at Site 857 is more highly depleted in ¹⁸O (δ¹⁸O = 2.4‰ to 4.7‰) and more pervasively replaced by secondary minerals relative to the extrusives at Site 858 (δ¹⁸O = 4.5‰ to 5.5‰). There is no evidence of significant albitization of the plagioclase at Site 857, suggesting high Ca/Na in the pore fluids. Fluid-inclusion data from hydrothermal minerals in altered mafic rocks and veins at Sites 857 and 858 show a consistency of homogenization temperatures, varying from 245 to 270°C, which is within the range of temperatures observed for the fluids venting at Site 858. The consistency of the fluid inclusion temperatures, the lack of albitization within the Site 857 sills, and the apparently low water/rock ratio collectively suggest that the sill complex at Site 857 is in thermal equilibrium and being altered by a highly evolved Ca-rich fluid similar to the fluids now venting at Site 858. The alteration evident in these two deep crustal drillsites is a result of the ongoing hydrothermal circulation and is consistent with downhole logging results, instrumented borehole results, and hydrothermal fluid chemistry. The pervasive alteration of the laterally extensive sill-sediment complex at Site 857 determines the chemistry of the fluids that are venting at Site 858. The limited alteration of the Site 858 lavas suggests that this basement edifice acts as a penetrator or ventilator for the regional hydrothermal reservoir with much of the flow focussed at the highly altered and veined sediment-basalt contact.
Abstract:
The early Eocene represents a time of major changes in the global carbon cycle and fluctuations in global temperatures on both short and long time scales. These perturbations of the ocean-atmosphere system have been linked to orbital forcing and changes in net organic carbon burial, but accurate age models are required to disentangle the various forcing mechanisms and assess causal relationships. Discrepancies between the employed astrochronological and radioisotopic dating techniques prevent the construction of a robust time frame between ~49 and ~54 Ma. Here we present an astronomically tuned age model for this critical time period based on a new high-resolution benthic δ¹³C record of ODP Site 1263, SE Atlantic. First, we assess three possible tuning options to the stable long-eccentricity cycle (405-kyr), starting from Eocene Thermal Maximum 2 (ETM2, ~54 Ma). Next we compare our record to the existing bulk carbonate δ¹³C record from the equatorial Atlantic (Demerara Rise, ODP Site 1258) to evaluate our three initial age models and compare them with alternative age models previously established for this site. Finally, we refine our preferred age model by expanding our tuning to the 100-kyr eccentricity cycle of the La2010d solution. This solution appears to accurately reflect the long- and short-term eccentricity-related patterns in our benthic δ¹³C record of ODP Site 1263 back to at least 52 Ma and possibly to 54 Ma. Our time scale not only aims to provide a new detailed age model for this period, but it may also serve to enhance our understanding of the response of the climate system to orbital forcing during this super greenhouse period as well as trends in its background state.
Abstract:
Prepared by Cynthia Norris Graae, Program analyst, Office of Federal Civil Rights Evaluation.
Abstract:
"B-241021"--P. l.
Abstract:
"Incorporates, after careful revision, a number of ... essays which appeared as 'Locomotive running'". -- Pref.