922 results for data storage concept
Abstract:
Data integration systems offer uniform access to a set of autonomous and heterogeneous data sources. One of the main challenges in data integration is reconciling semantic differences among data sources. Approaches that have been used to solve this problem can be categorized as schema-based and attribute-based. Schema-based approaches use schema information to identify semantic similarity in data; furthermore, they focus on reconciling types before reconciling attributes. In contrast, attribute-based approaches use statistical and structural information of attributes to identify the semantic similarity of data in different sources. This research examines an approach to semantic reconciliation based on integrating properties expressed at different levels of abstraction or granularity using the concept of property precedence. Property precedence reconciles the meaning of attributes by identifying similarities between attributes based on what these attributes represent in the real world. In order to use property precedence for semantic integration, we need to identify the precedence of attributes within and across data sources. The goal of this research is to develop and evaluate a method and algorithms that will identify precedence relations among attributes and build a property precedence graph (PPG) that can be used to support integration.
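The abstract does not specify how precedence relations are detected, but the idea of building a PPG can be sketched. The following is an illustrative sketch, not the thesis's algorithm: it infers that attribute a precedes (is broader than) attribute b when most of b's observed values also appear under a, a simple statistical signal chosen here for illustration.

```python
# Hypothetical sketch of building a property precedence graph (PPG):
# an edge a -> b means attribute a precedes (generalises) attribute b.
from collections import defaultdict

def value_containment(a_vals, b_vals, threshold=0.9):
    """a precedes b if at least `threshold` of b's values also occur under a."""
    a, b = set(a_vals), set(b_vals)
    if not b:
        return False
    return len(a & b) / len(b) >= threshold

def build_ppg(attributes):
    """attributes: dict mapping attribute name -> iterable of observed values.
    Returns an adjacency dict: edge a -> b means a precedes b."""
    ppg = defaultdict(set)
    names = list(attributes)
    for a in names:
        for b in names:
            if a != b and value_containment(attributes[a], attributes[b]):
                ppg[a].add(b)
    return dict(ppg)

data = {
    "location": ["NY", "LA", "SF", "Boston"],  # broader attribute
    "city":     ["NY", "LA", "SF"],            # narrower attribute
}
print(build_ppg(data))  # {'location': {'city'}}
```

The containment threshold and the value-overlap heuristic are assumptions for the sketch; the thesis may well derive precedence from richer statistical or structural evidence.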
Abstract:
This thesis stems from a project with the real-time environmental monitoring company EMSAT Corporation, who were looking for methods to automatically flag spikes and other anomalies in their environmental sensor data streams. The problem presents several challenges: near real-time anomaly detection, absence of labeled data, and time-changing data streams. Here, we address this problem using both a statistical parametric approach and a non-parametric approach, Kernel Density Estimation (KDE). The main contribution of this thesis is extending KDE to work more effectively for evolving data streams, particularly in the presence of concept drift. To address that, we have developed a framework for integrating the Adaptive Windowing (ADWIN) change-detection algorithm with KDE. We have tested this approach on several real-world data sets and received positive feedback from our industry collaborator. Some results appearing in this thesis have been presented at the ECML PKDD 2015 Doctoral Consortium.
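The combination described above can be sketched in a few lines: score each new stream value by a Gaussian KDE over a recent window, flag low-density points as anomalies, and reset the window when a drift test fires. This is a minimal illustration, not the thesis's implementation; in particular, the crude mean-shift check below is a stand-in for ADWIN, and the bandwidth and thresholds are arbitrary.

```python
# Minimal sketch: KDE-based anomaly scoring on a stream, with a crude
# drift heuristic (standing in for ADWIN) that drops the stale window.
import math
from collections import deque

def kde_density(x, window, bandwidth=1.0):
    """Gaussian kernel density estimate of x given the recent window."""
    if not window:
        return 0.0
    k = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in window)
    return k / (len(window) * bandwidth * math.sqrt(2 * math.pi))

def stream_anomalies(stream, win=50, density_floor=1e-3):
    window, flags = deque(maxlen=win), []
    for x in stream:
        warm = len(window) >= 10          # need some history before scoring
        flags.append(warm and kde_density(x, window) < density_floor)
        # Crude drift check (ADWIN stand-in): on a large mean shift,
        # forget the old regime so the KDE adapts to the new one.
        if warm and abs(x - sum(window) / len(window)) > 15:
            window.clear()
        window.append(x)
    return flags

data = [0.1 * i % 1 for i in range(100)] + [50.0] + [0.5] * 20
flags = stream_anomalies(data)
print(flags[100])  # True: the spike at index 100 is flagged
```

In the thesis's setting, ADWIN would manage the window size adaptively and with statistical guarantees, rather than relying on a fixed threshold as here.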
Abstract:
With the advent of the Internet, the number of users with effective network access and the ability to share information with the whole world has grown continuously over the years. With the introduction of social media, moreover, users are led to transfer a large amount of personal information to the web, making it available to various companies. Furthermore, the Internet of Things, through which sensors and machines become agents on the network, gives each user a growing number of devices, directly connected to each other and to the global network. In proportion to these factors, the amount of data generated and stored is also increasing dramatically, giving rise to a new concept: Big Data. Consequently, new tools are needed that can exploit the computing power offered by today's more complex architectures, which combine, under a single system, a set of hosts useful for analysis. Such a vast quantity of data, routine when speaking of Big Data, combined with equally high transmission and transfer rates, makes data storage difficult, all the more so if the storage techniques are traditional DBMSs. A classical relational solution would allow data to be processed only on request, producing delays, significant latencies, and the inevitable loss of fractions of the dataset. It is therefore necessary to resort to new technologies and tools suited to needs other than classical batch analysis. In particular, this thesis considers Data Stream Processing, designing and prototyping a system based on Apache Storm, with cyber security chosen as the field of application.
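Storm topologies are typically written in Java, but the spout-to-bolt dataflow the thesis prototypes can be illustrated in plain Python. The toy rule below (flagging IPs with many failed logins) is invented for the example and is not the thesis's actual detection logic.

```python
# Illustrative spout -> bolt dataflow, mimicking a Storm topology in plain
# Python. Event data and the alert rule are hypothetical.
def log_spout(events):
    """Spout: emits one tuple per raw log event."""
    for ip, failed in events:
        yield {"ip": ip, "failed_logins": failed}

def alert_bolt(tuples, threshold=5):
    """Bolt: flags IPs whose failed-login count exceeds the threshold."""
    for t in tuples:
        if t["failed_logins"] > threshold:
            yield ("ALERT", t["ip"])

events = [("10.0.0.1", 2), ("10.0.0.2", 9), ("10.0.0.3", 1)]
alerts = list(alert_bolt(log_spout(events)))
print(alerts)  # [('ALERT', '10.0.0.2')]
```

In an actual Storm deployment, spouts and bolts run as distributed components with tuple acknowledgement and parallelism hints, which this single-process sketch does not capture.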
Abstract:
Acknowledgements: We thank Iain Malcolm of Marine Scotland Science for access to data from the Girnock, and the Scottish Environment Protection Agency for historical stage-discharge relationships. CS's contributions to this paper were supported in part by the NERC/JPI SIWA project (NE/M019896/1).
Abstract:
Acknowledgements The authors would like to thank Jonathan Dick, Maria Blumstock, Claire Tunaley and Jason Lessels for assistance with the field work and Audrey Innes for lab sample preparation. Climatic data were provided by Iain Malcolm and Marine Scotland Fisheries at the Freshwater Lab, Pitlochry. Additional precipitation and temperature data were provided by the UK Meteorological Office and the British Atmospheric Data Centre (BADC). We are grateful for the careful and constructive comments of two anonymous reviewers that helped to improve an earlier version of this manuscript. The European Research Council ERC (project GA 335910) is thanked for funding.
Abstract:
Historically, the concepts of field-independence, closure flexibility, and weak central coherence have been used to denote a locally, rather than globally, dominated perceptual style. To date, there has been little attempt to clarify the relationship between these constructs, or to examine the convergent validity of the various tasks purported to measure them. To address this, we administered 14 tasks that have been used to study visual perceptual styles to a group of 90 neuro-typical adults. The data were subjected to exploratory factor analysis. We found evidence for the existence of a narrowly defined weak central coherence (field-independence) factor that received loadings from only a few of the tasks used to operationalise this concept. This factor can most aptly be described as representing the ability to dis-embed a simple stimulus from a more complex array. The results suggest that future studies of perceptual styles should include tasks whose theoretical validity is empirically verified, as such validity cannot be established merely on the basis of a priori task analysis. Moreover, the use of multiple indices is required to reliably capture the latent dimensions of perceptual styles.
Abstract:
To explore the feasibility of processing Compact Muon Solenoid (CMS) analysis jobs across the wide area network, the FIU CMS Tier-3 center and the Florida CMS Tier-2 center designed a remote data access strategy. A Kerberized Lustre test bed was installed at the Tier-2, designed to provide storage resources to private-facing worker nodes at the Tier-3. However, the Kerberos security layer cannot authenticate resources behind a private network. As a remedy, an xrootd server was installed on a public-facing node at the Tier-3 to export the file system to the private-facing worker nodes. We report the performance of CMS analysis jobs processed by the Tier-3 worker nodes accessing data from the Kerberized Lustre file system. The processing performance of this configuration is benchmarked against a direct connection to the Lustre file system and, separately, against a configuration in which the xrootd server is near the Lustre file system.
Abstract:
Our ability to project the impact of global change on marine ecosystems is limited by our poor understanding of how to predict species sensitivity. For example, the impact of ocean acidification is highly species-specific, even in closely related taxa. The aim of this study was to test the hypothesis that the tolerance range of a given species to decreased pH corresponds to its natural range of exposure. Larvae of the green sea urchin Strongylocentrotus droebachiensis were cultured from fertilization to metamorphic competence (29 days) under a wide range of pH (from pHT = 8.0/pCO2 ~ 480 µatm to pHT = 6.5/pCO2 ~ 20,000 µatm), covering present variability (from pHT 8.7 to 7.6), projected near-future variability (from pHT 8.3 to 7.2) and beyond. Decreasing pH impacted all tested parameters (mortality, symmetry, growth, morphometry and respiration). Development of normal swimming larvae, albeit with morphological plasticity, was possible down to pHT >= 7.0. Within that range, decreasing pH increased mortality and asymmetry and decreased body length (BL) growth rate. Larvae raised at lowered pH with similar BL had shorter arms and a wider body. Relative to a given BL, respiration rates and stomach volume both increased with decreasing pH, suggesting changes in energy budget. At the lowest pH levels (pHT <= 6.5), all tested parameters were strongly negatively affected and no larva survived past 13 days post-fertilization. In conclusion, sea urchin larvae appeared to be highly plastic when exposed to decreased pH down to a physiological tipping point at pHT = 7.0. However, this plasticity was associated with direct (increased mortality) and indirect (decreased growth) consequences for fitness.
Abstract:
To project the future development of the soil organic carbon (SOC) storage in permafrost environments, the spatial and vertical distribution of key soil properties and their landscape controls need to be understood. This article reports findings from the Arctic Lena River Delta, where we sampled 50 soil pedons. These were classified according to the U.S.D.A. Soil Taxonomy and fall mostly into the Gelisol soil order used for permafrost-affected soils. Soil profiles were sampled for the active layer (mean depth 58±10 cm) and the upper permafrost to one meter depth. We analyze SOC stocks and key soil properties (C%, N%, C/N, bulk density, visible ice and water content) and compare them across different landscape groupings of pedons, defined by geomorphology, soil and land cover, and across different vertical depth increments. High-vertical-resolution plots are used to understand soil development; these show that SOC storage can be highly variable with depth. We recommend treating permafrost-affected soils according to subdivisions into: the surface organic layer, mineral subsoil in the active layer, organic-enriched cryoturbated or buried horizons, and the mineral subsoil in the permafrost. The major geomorphological units of a subregion of the Lena River Delta were mapped with a landform classification using a data-fusion approach of optical satellite imagery and digital elevation data to upscale SOC storage. Landscape mean SOC storage is estimated at 19.2±2.0 kg C/m². Our results show that the geomorphological setting explains more soil variability than soil taxonomy classes or vegetation cover. The soils from the oldest, Pleistocene-aged unit of the delta store the highest amount of SOC per m², followed by the Holocene river terrace. The Pleistocene terrace affected by thermal degradation, the recent floodplain, and bare alluvial sediments store considerably less SOC, in descending order.
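The upscaling step described above amounts to an area-weighted average of per-unit SOC storage over the mapped geomorphological units. The numbers below are invented for illustration (only the ordering of units follows the abstract); the method is the point.

```python
# Hypothetical worked example of the upscaling arithmetic: landscape mean
# SOC storage as an area-weighted average over mapped landform units.
# Area fractions and per-unit SOC means are made up for illustration.
units = {                         # unit: (area fraction, mean SOC in kg C/m2)
    "Pleistocene terrace":     (0.30, 29.0),
    "Holocene river terrace":  (0.35, 18.0),
    "degraded Pleistocene":    (0.20, 14.0),
    "recent floodplain":       (0.10, 9.0),
    "bare alluvial sediments": (0.05, 4.0),
}
assert abs(sum(frac for frac, _ in units.values()) - 1.0) < 1e-9  # fractions sum to 1

mean_soc = sum(frac * soc for frac, soc in units.values())
print(round(mean_soc, 2), "kg C/m2")  # 18.9 kg C/m2 under these assumed numbers
```

The study's reported 19.2±2.0 kg C/m² would come from the real mapped areas and measured stocks; the uncertainty term requires propagating per-unit variability, which this sketch omits.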
Abstract:
The recently proposed global monsoon hypothesis interprets monsoon systems as part of one global-scale atmospheric overturning circulation, implying a connection between the regional monsoon systems and an in-phase behaviour of all northern-hemisphere monsoons on annual timescales (Trenberth et al., 2000). Whether this concept can be applied to past climates and to variability on longer timescales is still under debate, because the monsoon systems exhibit different regional characteristics, such as different seasonality (i.e. onset, peak, and withdrawal). To investigate the interconnection of different monsoon systems during the pre-industrial Holocene, five transient global climate model simulations have been analysed with respect to the rainfall trend and variability in different sub-domains of the Afro-Asian monsoon region. Our analysis suggests that on millennial timescales with varying orbital forcing, the monsoons do not behave as a tightly connected global system. According to the models, the Indian and North African monsoons are coupled, showing similar rainfall trends and moderate correlation in rainfall variability in all models. The East Asian monsoon changes independently during the Holocene. The dissimilarities in the seasonality of the monsoon sub-systems lead to a stronger response of the North African and Indian monsoon systems to the Holocene insolation forcing than of the East Asian monsoon, and affect the seasonal distribution of Holocene rainfall variations. Within the Indian and North African monsoon domains, precipitation changes solely during the summer months, showing a decreasing Holocene precipitation trend. In the East Asian monsoon region, the precipitation signal is determined by an increasing precipitation trend during spring and a decreasing precipitation change during summer, partly balancing each other.
A synthesis of reconstructions and the model results do not reveal an impact of the different seasonality on the timing of the Holocene rainfall optimum in the different sub-monsoon systems. They rather indicate locally inhomogeneous rainfall changes and show that single palaeo-records should not be used to characterise the rainfall change and monsoon evolution of entire monsoon sub-systems.
Abstract:
Although various models have been proposed to explain the origin of manganese nodules (see Goldberg and Arrhenius), two major hypotheses have received extensive attention. One concept suggests that manganese nodules form as the result of interaction between submarine volcanic products and sea water. The common association of manganese nodules with volcanic materials constitutes the main evidence for this theory. The second theory involves a direct inorganic precipitation of manganese from sea water. Goldberg and Arrhenius view this process as the oxidation of divalent manganese to tetravalent manganese by oxygen under the catalytic action of particulate iron hydroxides. Manganese accumulation by the Goldberg and Arrhenius theory would be a relatively slow and comparatively steady process, whereas Bonatti and Nayudu believe manganese nodule formation takes place subsequent to the eruption of submarine volcanoes by the acidic leaching of lava.
Abstract:
Cloud computing offers the massive scalability and elasticity required by many scientific and commercial applications. Combining the computational and data handling capabilities of clouds with parallel processing also has the potential to tackle Big Data problems efficiently. Science gateway frameworks and workflow systems enable application developers to implement complex applications and make these available to end-users via simple graphical user interfaces. The integration of such frameworks with Big Data processing tools on the cloud opens new opportunities for application developers. This paper investigates how workflow systems and science gateways can be extended with Big Data processing capabilities. A generic approach based on infrastructure-aware workflows is suggested, and a proof of concept is implemented based on the WS-PGRADE/gUSE science gateway framework and its integration with the Hadoop parallel data processing solution, based on the MapReduce paradigm, in the cloud. The provided analysis demonstrates that the methods described to integrate Big Data processing with workflows and science gateways work well in different cloud infrastructures and application scenarios, and can be used to create massively parallel applications for scientific analysis of Big Data.
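The MapReduce paradigm the gateway integration builds on can be shown in miniature: a map phase emits key-value pairs, a shuffle groups them by key, and a reduce phase aggregates each group. This plain-Python word count stands in for a Hadoop job purely for illustration; Hadoop distributes each phase across the cluster.

```python
# Minimal single-process illustration of MapReduce (word count); in
# Hadoop the three phases below run distributed across many nodes.
from collections import defaultdict

def map_phase(docs):
    """Map: emit (word, 1) for every word in every document."""
    for doc in docs:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    """Shuffle: group emitted values by key."""
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reduce_phase(groups):
    """Reduce: aggregate each key's values."""
    return {k: sum(vs) for k, vs in groups.items()}

docs = ["big data on the cloud", "big workflows for big data"]
print(reduce_phase(shuffle(map_phase(docs))))  # {'big': 3, 'data': 2, ...}
```

A workflow node in a science gateway would typically submit such a job to the Hadoop cluster and collect its output, rather than run the phases in-process as here.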
Abstract:
Cloud storage has rapidly become a cornerstone of many businesses and has moved from an early-adopter stage to an early majority, where we typically see explosive deployments. As companies rush to join the cloud revolution, it has become vital to create the necessary tools that will effectively protect users' data from unauthorized access. Nevertheless, sharing data between multiple users under the same domain in a secure and efficient way is not trivial. In this paper, we propose Sharing in the Rain – a protocol that allows cloud users to securely share their data based on predefined policies. The proposed protocol is based on Attribute-Based Encryption (ABE) and allows users to encrypt data under certain policies and attributes. Moreover, we use a Key-Policy Attribute-Based technique through which access revocation is optimized. More precisely, we show how to securely and efficiently remove access to a file for a certain user who is misbehaving or is no longer part of a user group, without having to decrypt and re-encrypt the original data with a new key or a new policy.
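The access-control logic of Key-Policy ABE can be illustrated without any cryptography: the ciphertext is labelled with attributes, each user's key embeds a policy over attributes, and decryption succeeds iff the attributes satisfy the policy. The toy below shows only this satisfaction check; real KP-ABE realizes it with pairing-based cryptography, and the example policy and attribute names are invented.

```python
# Toy illustration of the KP-ABE access model only -- no actual crypto.
# A key's policy is a nested ('AND'|'OR', subpolicies...) tree whose
# leaves are attribute names; a ciphertext carries a set of attributes.
def satisfies(policy, attributes):
    """True iff the attribute set satisfies the policy tree."""
    if isinstance(policy, str):          # leaf: a single required attribute
        return policy in attributes
    op, *subs = policy
    results = (satisfies(s, attributes) for s in subs)
    return all(results) if op == "AND" else any(results)

key_policy = ("AND", "research-group", ("OR", "senior", "admin"))
print(satisfies(key_policy, {"research-group", "senior"}))  # True
print(satisfies(key_policy, {"research-group"}))            # False
```

This also hints at why revocation can avoid re-encryption: removing an attribute from the material a user can match against makes the policy check fail for that user, while the stored ciphertext stays untouched. The paper's optimized revocation mechanism is, of course, more involved than this sketch.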