966 results for: PDF, estrazione, Linked Open Data, dataset RDF


Relevance: 100.00%

Publisher:

Abstract:

Information extraction or knowledge discovery from large data sets should be linked to a data aggregation process. Data aggregation can yield a new data representation with a reduced number of objects in a given set. A deterministic approach to separable data aggregation means a smaller number of objects without mixing of objects from different categories. A statistical approach is less restrictive and allows almost separable data aggregation with a low level of mixing between categories. Layers of formal neurons can be designed for data aggregation in both the deterministic and the statistical approach. The proposed design method is based on minimization of convex and piecewise-linear (CPL) criterion functions.
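As a minimal illustration of the idea, not the authors' exact formulation: a convex, piecewise-linear criterion of the perceptron type, Phi(w) = sum_i max(0, margin - y_i * <w, x_i>), is zero exactly when a hyperplane separates the two categories without mixing, and can be driven to zero by subgradient descent. All data and parameter choices below are invented for illustration.

```python
import numpy as np

def cpl_criterion(w, X, y, margin=1.0):
    # Convex, piecewise-linear penalty: positive while any object
    # lies on the wrong side of (or too close to) the hyperplane w.
    slack = margin - y * (X @ w)
    return np.maximum(0.0, slack).sum()

def minimize_cpl(X, y, margin=1.0, lr=0.1, steps=200):
    # Subgradient descent on the CPL criterion.
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        slack = margin - y * (X @ w)
        active = slack > 0                                   # violated pieces
        grad = -(y[active, None] * X[active]).sum(axis=0)    # subgradient
        w -= lr * grad
    return w

# Two linearly separable categories, labels +1 / -1 (toy data)
X = np.array([[2.0, 2.0], [3.0, 1.0], [-2.0, -1.0], [-1.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w = minimize_cpl(X, y)
assert cpl_criterion(w, X, y) == 0.0   # separable aggregation: no mixing
```

A layer of such formal neurons would stack several hyperplanes of this kind; the statistical variant would tolerate a small residual value of the criterion instead of demanding zero.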

Relevance: 100.00%

Publisher:

Abstract:

While openness is well established in software development and exploitation (open source), and has been successfully applied to new business models (open innovation), fundamental and applied research seems to lag behind. Even after decades of advocacy, in 2011 only 50% of publicly funded research was freely available and accessible (Archambault et al., 2013). Current research workflows, stemming from a pre-internet age, result in lost opportunity not only for the researchers themselves (cf. the extensive literature on the topic at the Open Access citation project, http://opcit.eprints.org/), but also slow down innovation and the application of research results (Houghton & Swan, 2011). Recent studies continue to suggest that lack of awareness among researchers, rather than lack of e-infrastructure and methodology, is a key reason for this loss of opportunity (Graziotin 2014). The session will focus on why Open Science is ideally suited to achieving tenure-relevant researcher impact in a "Publish or Perish" reality. Open Science encapsulates tools and approaches for each step of the research cycle: from Open Notebook Science to Open Data and Open Access, all setting researchers up to capitalise on social media in order to promote and discuss their work and establish unexpected collaborations. Incorporating these new approaches into an updated personal research workflow is strategically beneficial for young researchers, and will prepare them for an expected long-term trend among funders towards greater openness and demand for greater return on investment (ROI) on public funds.

Relevance: 100.00%

Publisher:

Abstract:

Global databases of calcium carbonate concentrations and mass accumulation rates in Holocene and last glacial maximum sediments were used to estimate the deep-sea sedimentary calcium carbonate burial rate during these two time intervals. Sparse calcite mass accumulation rate data were extrapolated across regions of varying calcium carbonate concentration using a gridded map of calcium carbonate concentrations and the assumption that accumulation of noncarbonate material is uncorrelated with calcite concentration within a given geographical region. Mean noncarbonate accumulation rates were estimated within each of nine regions, determined by the distribution and nature of the accumulation rate data. For core-top sediments the regions of reasonable data coverage encompass 67% of the high-calcite (>75%) sediments globally, and within these regions we estimate an accumulation rate of 55.9 ± 3.6 x 10**11 mol/yr. The same regions cover 48% of glacial high-CaCO3 sediments (the smaller fraction is due to a shift of calcite deposition to the poorly sampled South Pacific) and total 44.1 ± 6.0 x 10**11 mol/yr. Projecting both estimates to 100% coverage yields accumulation estimates of 8.3 x 10**12 mol/yr today and 9.2 x 10**12 mol/yr during glacial time. This is little better than a guess given the incomplete data coverage, but it suggests that the glacial deep-sea calcite burial rate was probably not considerably faster than today's, in spite of a presumed decrease in shallow-water burial during glacial time.
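The projection to 100% coverage quoted above is consistent with simple proportional scaling of the within-region estimates by the covered fraction, which can be checked directly (values taken from the abstract; the proportional-scaling assumption is ours):

```python
# Within-region burial estimates and the fraction of high-CaCO3
# sediments that the sampled regions cover (from the abstract).
holocene_within = 55.9e11   # mol/yr, regions cover 67% of high-calcite seds
glacial_within  = 44.1e11   # mol/yr, regions cover 48%

# Scale up to 100% coverage, assuming the unsampled areas behave alike.
holocene_global = holocene_within / 0.67
glacial_global  = glacial_within / 0.48

print(round(holocene_global / 1e12, 1))  # -> 8.3 (x 10**12 mol/yr today)
print(round(glacial_global / 1e12, 1))   # -> 9.2 (x 10**12 mol/yr, glacial)
```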

Relevance: 100.00%

Publisher:

Abstract:

The DTRF2014 is a realization of the fundamental Earth-fixed coordinate system, the International Terrestrial Reference System (ITRS). It was computed by the Deutsches Geodätisches Forschungsinstitut der Technischen Universität München (DGFI-TUM). The DTRF2014 consists of station positions and velocities of 1712 globally distributed geodetic observing stations of the observation techniques VLBI, SLR, GNSS and DORIS. Additionally, for the first time, non-tidal atmospheric and hydrological loading is considered in the solution. The DTRF2014 was released in August 2016 and incorporates observation data of the four techniques up to 2014. The observation data were processed and submitted by the corresponding technique services: IGS (International GNSS Service, http://igscb.jpl.nasa.gov), IVS (International VLBI Service, http://ivscc.gsfc.nasa.gov), ILRS (International Laser Ranging Service, http://ilrs.gsfc.nasa.gov) and IDS (International DORIS Service, http://ids-doris.org). The DTRF2014 is an independent ITRS realization, computed on the basis of the same input data as the realizations JTRF2014 (JPL, Pasadena) and ITRF2014 (IGN, Paris). The three realizations of the ITRS differ conceptually: while DTRF2014 and ITRF2014 are based on station positions at a reference epoch and station velocities, the JTRF2014 is based on time series of station positions. DTRF2014 and ITRF2014 also result from different combination strategies: the ITRF2014 is based on the combination of solutions, whereas the DTRF2014 is computed by the combination of normal equations. The DTRF2014 comprises 3D coordinates and coordinate changes of 1347 GNSS, 113 VLBI, 99 SLR and 153 DORIS stations. The reference epoch is 1 January 2005, 0h UTC. The Earth Orientation Parameters (EOP), i.e. the coordinates of the terrestrial and the celestial pole, UT1-UTC and the Length of Day (LOD), were estimated simultaneously with the station coordinates. The EOP time series cover the period from 1979.7 to 2015.0.
The station names are the official IERS identifiers: CDP numbers or 4-character IDs and DOMES numbers (http://itrf.ensg.ign.fr/doc_ITRF/iers_sta_list.txt). The DTRF2014 solution is available as one comprehensive SINEX file and four technique-specific SINEX files, see below. A detailed description of the solution is given on the website of DGFI-TUM (http://www.dgfi.tum.de/en/science-data-products/dtrf2014/). More information can be made available on request.
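A secular frame of this kind (positions at a reference epoch plus velocities) is used by propagating each station's coordinates linearly in time. A minimal sketch of that propagation follows; the XYZ position and velocity values are invented for illustration and are not taken from the DTRF2014 solution.

```python
import numpy as np

T0 = 2005.0  # DTRF2014 reference epoch, in decimal years

def position_at_epoch(x0, v, t):
    """Propagate a station position linearly from the reference epoch.

    x0 : geocentric XYZ position [m] at epoch T0
    v  : station velocity [m/yr]
    t  : target epoch [decimal years]
    """
    return x0 + v * (t - T0)

# Hypothetical station (values made up for illustration)
x0 = np.array([4075539.700, 931735.300, 4801629.400])  # m
v  = np.array([-0.0145, 0.0170, 0.0095])               # m/yr

x_2015 = position_at_epoch(x0, v, 2015.0)
# ten years of motion, e.g. X decreases by 10 * 0.0145 m = 0.145 m
```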

Relevance: 100.00%

Publisher:

Abstract:

The Internet has revolutionized the way people communicate. We are witnessing the birth and development of an era characterized by the availability of free information accessible to everyone. In recent years, thanks to the spread of smartphones, tablets and other types of connected devices, the focus of innovation has shifted from people to objects. Thus arose the concept of the Internet of Things, a term used to describe the communication network created among the various devices connected to the Internet and capable of interacting autonomously. The application domains of the Internet of Things range from home automation to healthcare, from environmental monitoring to smart cities, and so on. The main goal of the discipline is to improve people's lives through systems able to interact without the need for human intervention. Precisely because of the heterogeneous nature of the discipline and its diverse application domains, the Internet of Things can run into problems arising from the presence of different technologies or heterogeneous ways of storing data. In this regard the concept of a collaborative Internet of Things is introduced, a term denoting the goal of building applications that can guarantee interoperability between the different ecosystems and the different sources the Internet of Things draws on, exploiting the availability of Open Data publication platforms. The goal of this thesis was to create a system for aggregating data from two platforms, ThingSpeak and Sparkfun, in order to unify them in a single database and to extract meaningful information from the data through two Data Mining techniques: Dictionary Learning and Affinity Propagation. The two methodologies, which belong respectively to classification and clustering techniques, are illustrated.
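As a hedged sketch of the clustering step mentioned above, here is Affinity Propagation applied to toy sensor readings with scikit-learn; the thesis' actual pipeline (ThingSpeak/Sparkfun ingestion, the unified database, Dictionary Learning) is not reproduced, and the data are invented:

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Toy feature vectors, e.g. (temperature, humidity) readings per channel;
# two clearly separated regimes are planted for illustration.
readings = np.array([
    [21.0, 40.0], [21.5, 42.0], [20.8, 41.0],   # warm/dry regime
    [5.0, 80.0], [4.5, 78.0], [5.2, 81.0],      # cold/humid regime
])

# Affinity Propagation picks "exemplar" points and assigns the rest to them;
# unlike k-means, the number of clusters is not fixed in advance.
ap = AffinityPropagation(random_state=0).fit(readings)
print(ap.labels_)            # cluster assignment per reading
print(ap.cluster_centers_)   # exemplar readings chosen by the algorithm
```

On data this cleanly separated the algorithm recovers the two planted regimes; on real mixed-platform sensor data the `preference` parameter usually needs tuning.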

Relevance: 100.00%

Publisher:

Abstract:

Today, many governments are publishing (or intend to publish shortly) thousands of datasets so that people and organizations can use them. As a consequence, the number of applications based on Open Data is increasing. However, each government has its own procedures for publishing its data, which leads to a variety of formats, since there is no international standard specifying them. The main objective of this work is a comparative analysis of environmental data in open databases (Open Data) belonging to different governments. Because of this variety of formats, we must build a data-integration process capable of joining all the format types. The work involves pre-processing, cleaning and integration of the different data sources. Many applications have been developed to support the integration process, for example Data Tamer and Data Wrangler, as explained in this document. The problem with these applications is that they require user interaction as a fundamental part of the integration process. In this work we try to avoid human supervision by exploiting the similarities of datasets from the same domain, which in our case is the environmental domain. In this way the processes can be automated with suitable programming. To achieve this, the main idea of this work is to build ad hoc processes adapted to each government's sources in order to achieve automatic integration. Specifically, this work focuses on environmental data such as temperature, energy consumption, air quality, solar radiation, wind speed, etc. For the past two years the government of Madrid has been publishing its environmental-indicator data in real time.
Likewise, other governments have published environmental Open Data datasets (such as Andalusia or Bilbao), but all these data come in different formats. This work presents a solution capable of integrating all of them, which also allows the user to visualize and analyze the data in real time. Once the integration process is complete, all the data from each government share the same format and analysis processes can be run in a more computational manner. The work has three main parts: 1. a study of Open Data environments and the related literature; 2. development of an integration process; and 3. development of a graphical and analytical interface. Although in a first phase the integration processes were implemented with Java and Oracle and the graphical interface with Java (JSP), in a later phase the whole implementation was redone in the R language, with the graphical interface built with its libraries, mainly Shiny. The result is an application that provides a set of integrated real-time environmental data for two very different governments in Spain, available to any developer who wishes to build their own applications.
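The per-source ad hoc integration idea can be sketched as follows: each portal gets its own loader that maps its format onto one shared schema, so downstream analysis sees uniform data. The column names and CSV layouts below are hypothetical stand-ins, not the actual Madrid or Andalusia feeds, and the sketch uses pandas rather than the R implementation described above.

```python
import pandas as pd
from io import StringIO

# One shared target schema for all sources
SCHEMA = ["city", "timestamp", "indicator", "value"]

def load_source_a(csv_text):
    # Hypothetical portal A: semicolon-separated, columns date;sensor;reading
    df = pd.read_csv(StringIO(csv_text), sep=";")
    df = df.rename(columns={"date": "timestamp",
                            "sensor": "indicator",
                            "reading": "value"})
    df["city"] = "A"
    return df[SCHEMA]

def load_source_b(csv_text):
    # Hypothetical portal B: comma-separated, columns when,what,how_much
    df = pd.read_csv(StringIO(csv_text))
    df = df.rename(columns={"when": "timestamp",
                            "what": "indicator",
                            "how_much": "value"})
    df["city"] = "B"
    return df[SCHEMA]

a = load_source_a("date;sensor;reading\n2024-01-01;NO2;41.0\n")
b = load_source_b("when,what,how_much\n2024-01-01,NO2,38.5\n")
merged = pd.concat([a, b], ignore_index=True)   # one uniform dataset
```

Because each loader is specific to its source, adding a new government means writing one more loader, not changing the shared schema or the analysis code.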

Relevance: 100.00%

Publisher:

Abstract:

Two concepts in rural economic development policy have been the focus of much research and policy action: the identification and support of clusters or networks of firms, and the availability and adoption by rural businesses of Information and Communication Technologies (ICT). From a theoretical viewpoint these policies are based on two contrasting models, with clustering seen as a process of economic agglomeration, and ICT-mediated communication as a means of facilitating economic dispersion. The study's conceptual framework is based on four interrelated elements: location, interaction, knowledge, and advantage, together with the concept of networks, which serves as an operationally and theoretically unifying device. The research questions are developed in four successive categories: Policy, Theory, Networks, and Method. The questions are approached using a study of two contrasting groups of rural small businesses in West Cork, Ireland: (a) Speciality Foods, and (b) firms in Digital Products and Services. The study combines Social Network Analysis (SNA) with Qualitative Thematic Analysis, using data collected from semi-structured interviews with 58 owners or managers of these businesses. Data comprise relational network data on the firms' connections to suppliers, customers, allies and competitors, together with linked qualitative data on how the firms established connections, and how tacit and codified knowledge was sourced and utilised. The research finds that the key characteristics identified in the cluster literature are evident in the sample of Speciality Food businesses, in relation to flows of tacit knowledge, social embedding, and the development of forms of social capital. In particular the research identified the presence of two distinct forms of collective social capital in this network, termed "community" and "reputation".
By contrast, the sample of Digital Products and Services businesses does not have the form of a cluster, but matches dispersive models, or "chain" structures, more closely. Much of the economic and social structure of this set of firms is best explained in terms of "project organisation", and by the operation of an individual rather than collective form of "reputation". The rural setting in which these firms are located has resulted in their being service-centric, and consequently they rely on ICT-mediated communication in order to exchange tacit knowledge "at a distance". It is this factor, rather than inputs of codified knowledge, that most strongly influences their operation and their need for the availability and adoption of high-quality communication technologies. The findings thus have applicability to theory in Economic Geography and to policy and practice in Rural Development. In addition, the research contributes to methodological questions in SNA, and to methodological questions about the combination or mixing of quantitative and qualitative methods.
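The cluster-versus-chain contrast described above can be made concrete with standard SNA measures. The sketch below builds two tiny firm networks with NetworkX and compares their density and clustering coefficient; the firms and ties are invented for illustration and are not the study's data.

```python
import networkx as nx

# Densely interlinked group of firms (cluster-like structure)
cluster = nx.Graph([("f1", "f2"), ("f2", "f3"), ("f1", "f3"),
                    ("f3", "f4"), ("f1", "f4")])

# Dispersed, project-organised group (chain-like structure)
chain = nx.Graph([("g1", "g2"), ("g2", "g3"), ("g3", "g4")])

# Density: fraction of possible ties that exist.
# Average clustering: how often a firm's partners are also tied to each other.
print(nx.density(cluster), nx.average_clustering(cluster))
print(nx.density(chain), nx.average_clustering(chain))
# the cluster-like group scores higher on both measures than the chain
```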

Relevance: 100.00%

Publisher:

Abstract:

Few hydrological studies have been carried out in Greenland, other than on glacial hydrology associated with the ice sheet. Understanding permafrost hydrology and hydroclimatic change and variability, however, provides key information for understanding climate-change effects and feedbacks in the Arctic landscape. This paper presents a new extensive and detailed hydrological and meteorological open-access dataset, with high temporal resolution, from a 1.56 km**2 permafrost catchment with a lake underlain by a through talik close to the ice sheet in the Kangerlussuaq region, western Greenland. The paper describes the hydrological site investigations and the equipment used, as well as the data collection and processing. The investigations were performed between 2010 and 2013. The dataset's high spatial resolution within the investigated area makes it highly suitable for various detailed hydrological and ecological studies at catchment scale.
