955 results for Large datasets
Abstract:
Marine dissolved organic matter (DOM) represents one of the largest active carbon reservoirs on Earth. Changes in pool size or composition could have major impacts on the global carbon cycle. Ocean acidification is a potential driver for these changes because it influences marine primary production and heterotrophic respiration. Here we show that ocean acidification as expected for a 'business-as-usual' emission scenario in the year 2100 (900 µatm) does not affect the DOM pool with respect to its size and molecular composition. We applied ultrahigh-resolution mass spectrometry to monitor the production and turnover of 7,360 distinct molecular DOM features in an unprecedented long-term mesocosm study in a Swedish fjord, covering a full cycle of marine production. DOM concentration and molecular composition did not differ significantly between present-day and year 2100 CO2 levels. Our findings are likely applicable to other productive coastal marine ecosystems.
Abstract:
The relationship between mesoscale hydrodynamics and the distribution of large particulate matter (LPM, particles larger than 200 µm) in the first 1000 m of the Western Mediterranean basin was studied with a microprocessor-driven CTD-video package, the Underwater Video Profiler (UVP). Observations made during the last decade showed that, in late spring and summer, LPM concentration was high in the coastal part of the Western Mediterranean basin at the shelf break and near the continental slope (computed maximum: 149 µg C/l between 0 and 100 m near the Spanish coast of the Gibraltar Strait). LPM concentration decreased further offshore into the central Mediterranean Sea where, below 100 m, it remained uniformly low, ranging from 2 to 4 µg C/l. However, a strong variability was observed in the different mesoscale structures such as the Almeria-Oran jet in the Alboran Sea or the Algerian eddies. LPM concentration was up to one order of magnitude higher in fronts and eddies than in the adjacent oligotrophic Mediterranean waters (i.e. 35 vs. 8 µg C/l in the Alboran Sea or 16 vs. 3 µg C/l in a small shear cyclonic eddy). Our observations suggest that LPM spatial heterogeneity generated by the upper layer mesoscale hydrodynamics extends into deeper layers. Consequently, the superficial mesoscale dynamics may significantly contribute to the biogeochemical cycling between the upper and meso-pelagic layers.
Abstract:
The Andaman Sea and other macrotidal semi-enclosed tropical seas feature large amplitude internal waves (LAIW). Although LAIW induce strong fluctuations of, for example, temperature, pH, and nutrients, their influence on reef development is so far unknown. A better-known source of disturbance is the monsoon, which affects corals through turbulent mixing and sedimentation. Because in the Andaman Sea both LAIW and monsoon act from the same westerly direction, their relative contributions to reef development are difficult to discern. Here, we explore framework development at a number of offshore island locations subjected to differential LAIW and SW-monsoon impact to address this open question. Cumulative negative temperature anomalies, a proxy for LAIW impact, explained a higher percentage of the variability in coral reef framework height than sedimentation rates, which resulted mainly from the monsoon. Temperature anomalies and sediment grain size provided the best correlation with framework height, suggesting that previously neglected subsurface processes (LAIW) play a significant role in shaping coral reefs.
Abstract:
Managing large medical image collections is an increasingly important and demanding task in many hospitals and other medical settings. A huge amount of this information is generated daily, which requires robust and agile systems. In this paper we present a distributed multi-agent system capable of managing very large medical image datasets. In this approach, agents extract low-level information from images and store it in a data structure implemented in a relational database. The data structure can also store semantic information related to images and particular regions. A distinctive aspect of our work is that a single image can be divided so that the resulting sub-images can be stored and managed separately by different agents to improve performance in data access and processing. The system also offers the possibility of applying region-based operations and filters to images, facilitating image classification. These operations can be performed directly on the data structures in the database.
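The sub-image idea above can be sketched in a few lines. This is a minimal illustration, not the paper's actual design: the image is modelled as a plain 2-D grid of pixel values, and the tile size, the round-robin "agent" assignment, and all function names are assumptions made for the sketch.

```python
# Hypothetical sketch of splitting an image into sub-images that can be
# stored and managed separately (e.g. by different agents). The 2-D list
# representation and round-robin assignment are illustrative assumptions.

def split_into_tiles(image, tile_h, tile_w):
    """Split a 2-D image into (row, col, tile) records."""
    tiles = []
    for r in range(0, len(image), tile_h):
        for c in range(0, len(image[0]), tile_w):
            tile = [row[c:c + tile_w] for row in image[r:r + tile_h]]
            tiles.append((r, c, tile))
    return tiles

def assign_to_agents(tiles, n_agents):
    """Round-robin assignment of tiles to agents for separate management."""
    agents = [[] for _ in range(n_agents)]
    for i, t in enumerate(tiles):
        agents[i % n_agents].append(t)
    return agents

def reassemble(tiles, height, width):
    """Rebuild the full image from its (row, col, tile) records."""
    image = [[None] * width for _ in range(height)]
    for r, c, tile in tiles:
        for dr, row in enumerate(tile):
            for dc, v in enumerate(row):
                image[r + dr][c + dc] = v
    return image
```

Because each tile record carries its own offset, tiles can be stored and processed independently and still be reassembled losslessly into the original image.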
Abstract:
Recently we have seen a large increase in the amount of geospatial data that is being published using RDF and Linked Data principles. Efforts such as the W3C Geo XG and, most recently, the GeoSPARQL initiative are providing the necessary vocabularies to publish this kind of information on the Web of Data. In this context it is necessary to develop applications that consume and take advantage of these geospatial datasets. In this paper we present map4rdf, a faceted browsing tool for exploring and visualizing RDF datasets enhanced with geospatial information.
Abstract:
The well-documented re-colonisation of the large French river basins of the Loire and Rhone by the European otter and beaver allowed the analysis of explanatory factors and threats to species movement in the river corridor. To what extent anthropogenic disturbance of the riparian zone influences corridor functioning is a central question in the understanding of ecological networks and the definition of restoration goals for river networks. The generalist or specialist nature of target species may determine their responses to habitat quality and barriers in the riparian corridor. Detailed datasets of land use, human stressors and hydro-morphological characteristics of river segments for the entire river basins allowed identifying the habitat requirements of the two species for the riparian zone. The identified critical factors were entered in a network analysis based on the ecological niche factor approach. Both species responded significantly to riparian corridor quality in terms of forest cover, channel straightening, and urbanisation and infrastructure in the riparian zone, so they may well serve as indicators of corridor functioning. The hypothesis that generalists are less sensitive to human disturbance was rejected, since the otter, a generalist species, responded most strongly to hydro-morphological alterations and human presence in general. The beaver responded most strongly to the physical environment, as expected for this specialist species. The difference in responses between generalist and specialist species is clearly present, and the two species have a strong complementary indicator value. The interpretation of the network analysis outcomes stresses the need to estimate the ecological requirements of more species when evaluating riparian corridor functioning and in conservation planning.
Abstract:
This paper describes the main goals and outcomes of the EU-funded Framework 7 project entitled Semantic Evaluation at Large Scale (SEALS). The growth and success of the Semantic Web is built upon a wide range of Semantic technologies, from ontology engineering tools through to semantic web service discovery and semantic search. The evaluation of such technologies (and, indeed, assessments of their mutual compatibility) is critical for their sustained improvement and adoption. The SEALS project is creating an open and sustainable platform on which all aspects of an evaluation can be hosted and executed, and it has been designed to accommodate most technology types. It is envisaged that the platform will become the de facto repository of test datasets and will allow anyone to organise, execute and store the results of technology evaluations free of charge and without corporate bias. The demonstration will show how individual tools can be prepared for evaluation, uploaded to the platform, evaluated according to some criteria and the subsequent results viewed. In addition, the demonstration will show the flexibility and power of the SEALS Platform for evaluation organisers by highlighting some of the key technologies used.
Abstract:
Modern sensor technologies and simulators applied to large and complex dynamic systems (such as road traffic networks, sets of river channels, etc.) produce large amounts of behavior data that are difficult for users to interpret and analyze. Software tools that generate presentations combining text and graphics can help users understand this data. In this paper we describe the results of our research on automatic multimedia presentation generation (including text, graphics, maps, images, etc.) for interactive exploration of behavior datasets. We designed a novel user interface that combines automatically generated text and graphical resources. We describe the general knowledge-based design of our presentation generation tool. We also present applications that we developed to validate the method, and a comparison with related work.
Abstract:
Large-scale circulation patterns (ENSO, NAO) have been shown to have a significant impact on seasonal weather, and therefore on crop yield, over many parts of the world (Garnett and Khandekar, 1992; Aasa et al., 2004; Rozas and Garcia-Gonzalez, 2012). In this study, we analyze the influence of large-scale circulation patterns and regional climate on the principal components of maize yield variability in the Iberian Peninsula (IP) using reanalysis datasets. Additionally, we investigate the modulation of these relationships by multidecadal patterns. This study is performed by analyzing long time series of maize yield, dependent on climate only, computed with the crop model CERES-maize (Jones and Kiniry, 1986) included in the Decision Support System for Agrotechnology Transfer (DSSAT v.4.5).
Abstract:
New findings of well-preserved Early Cretaceous planktonic foraminiferal assemblages from the Cismon core (NE Italy), Calabianca (NW Sicily), Lesches en Diois (SE France) and DSDP Site 545 (off Morocco) sections allow a better understanding of the morphological features of several taxa. This paper deals with the revision of the small, planispiral individuals that several authors include in the genus Blowiella Krechmar and Gorbachik. Comparison of morphological characteristics between Blowiella and the genus Globigerinelloides Cushman and ten Dam has resulted in retention of the latter as senior synonym of Blowiella. In fact, the morphological differences (i.e. the number of chambers in the outer whorl, the width of the umbilical area, and the size and spacing of pores) used to distinguish Blowiella from Globigerinelloides cannot, in our opinion, be used in discriminating genera, but can only be applied at species level. The small, few-chambered species of the genus Globigerinelloides retained here are Globigerinelloides blowi (Bolli), Globigerinelloides duboisi (Chevalier), Globigerinelloides maridalensis (Bolli), and Globigerinelloides paragottisi sp. nov. (= Globigerinelloides gottisi auctorum). Stratigraphically, in the sections studied Globigerinelloides blowi and Globigerinelloides paragottisi sp. nov. are first recorded from the mid-Upper Barremian in the Cismon core and Calabianca section, while rare individuals belonging to Globigerinelloides maridalensis and Globigerinelloides duboisi occur intermittently from the Barremian/Aptian boundary and from the Lower Aptian, respectively. All of these taxa become more frequent and abundant just above the Selli Level (OAE1a, Lower Aptian), within the Leupoldina cabri Zone (Upper Aptian).
Based on the DSDP Site 545 succession, all four globigerinelloidid taxa range up to the Ticinella bejaouaensis Zone (uppermost Aptian), with Globigerinelloides maridalensis disappearing at the base of the zone, followed in close succession by the disappearance of G. blowi, G. paragottisi and finally G. duboisi.
Abstract:
Tropical scleractinian corals are particularly vulnerable to global warming as elevated sea surface temperatures (SST) disrupt the delicate balance between the coral host and their algal endosymbionts, leading to symbiont expulsion, mass bleaching and mortality. While satellite sensing of SST has proven a good predictor of coral bleaching at the regional scale, there are large deviations in bleaching severity and mortality on the local scale, which are only poorly understood. Here, we show that internal waves play a major role in explaining local coral bleaching and mortality patterns in the Andaman Sea. In spite of a severe region-wide SST anomaly in May 2010, frequent upslope intrusions of cold sub-pycnocline waters due to breaking large amplitude internal waves (LAIW) alleviated heating and mitigated coral bleaching and mortality in shallow LAIW-exposed waters. In LAIW-sheltered waters, by contrast, bleaching-susceptible species suffered severe bleaching and total mortality. These findings suggest that LAIW, which are ubiquitous in tropical stratified waters, benefit coral reefs during thermal stress and provide local refugia for bleaching-susceptible corals. The swash zones of LAIW may thus be important, though so far overlooked, conservation areas for the maintenance of coral diversity in a warming climate. The consideration of LAIW can significantly improve coral bleaching predictions and provide a valuable tool for coral reef conservation and management.
Abstract:
With rapid advances in video processing technologies and ever-faster increases in network bandwidth, the popularity of video content publishing and sharing has made similarity search an indispensable operation for retrieving videos of user interest. Video similarity is usually measured by the percentage of similar frames shared by two video sequences, and each frame is typically represented as a high-dimensional feature vector. Unfortunately, the high complexity of video content has posed the following major challenges for fast retrieval: (a) effective and compact video representations, (b) efficient similarity measurements, and (c) efficient indexing on the compact representations. In this paper, we propose a number of methods to achieve fast similarity search for very large video databases. First, each video sequence is summarized into a small number of clusters, each of which contains similar frames and is represented by a novel compact model called Video Triplet (ViTri). ViTri models a cluster as a tightly bounded hypersphere described by its position, radius, and density. The ViTri similarity is measured by the volume of intersection between two hyperspheres multiplied by the minimal density, i.e., the estimated number of similar frames shared by two clusters. The total number of similar frames is then estimated to derive the overall similarity between two video sequences. Hence the time complexity of the video similarity measure can be reduced greatly. To further reduce the number of similarity computations on ViTris, we introduce a new one-dimensional transformation technique which rotates and shifts the original axis system using PCA in such a way that the original inter-distance between two high-dimensional vectors can be maximally retained after mapping. An efficient B+-tree is then built on the transformed one-dimensional values of the ViTris' positions.
Such a transformation enables the B+-tree to achieve its optimal performance by quickly filtering out a large portion of non-similar ViTris. Our extensive experiments on real large video datasets demonstrate the effectiveness of our proposals, which significantly outperform existing methods.
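The one-dimensional filtering idea described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: positions are projected onto the first principal component, and because the 1-D projected distance never exceeds the true Euclidean distance, it can safely prune non-similar candidates; a sorted index (the paper's B+-tree) would then serve fast range queries on the projected values. The power-iteration PCA and all names here are assumptions made for the sketch.

```python
import math
import random

# Sketch (not the authors' code) of PCA-based 1-D filtering: project
# high-dimensional positions onto the first principal component. Since
# |proj(a) - proj(b)| <= ||a - b||, any point whose projected distance
# to the query exceeds the radius can be discarded without a full
# distance computation.

def first_pc(points, iters=200):
    """Mean and leading covariance eigenvector via power iteration."""
    d = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(d)]
    centred = [[p[i] - mean[i] for i in range(d)] for p in points]
    v = [random.random() + 0.1 for _ in range(d)]
    for _ in range(iters):
        # Multiply the covariance matrix by v without forming it.
        w = [0.0] * d
        for x in centred:
            s = sum(x[i] * v[i] for i in range(d))
            for i in range(d):
                w[i] += s * x[i]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return mean, v

def project(p, mean, v):
    """Signed 1-D coordinate of p along the first principal component."""
    return sum((p[i] - mean[i]) * v[i] for i in range(len(p)))

def candidates(points, query, radius):
    """Keep only points whose 1-D projected distance is within radius."""
    mean, v = first_pc(points)
    q = project(query, mean, v)
    return [p for p in points if abs(project(p, mean, v) - q) <= radius]
```

The guarantee behind the filter is the Cauchy-Schwarz inequality: with a unit-norm direction v, |(p - q) · v| <= ||p - q||, so no true neighbour within the radius is ever pruned.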
Abstract:
Very large spatially-referenced datasets, for example those derived from satellite-based sensors which sample across the globe or from large monitoring networks of individual sensors, are becoming increasingly common and more widely available for use in environmental decision making. In large or dense sensor networks, huge quantities of data can be collected over small time periods. In many applications the generation of maps, or predictions at specific locations, from the data in (near) real-time is crucial. Geostatistical operations such as interpolation are vital in this map-generation process, and in emergency situations the resulting predictions need to be available almost instantly, so that decision makers can make informed decisions and define risk and evacuation zones. It is also helpful when analysing data in less time-critical applications, for example when interacting directly with the data for exploratory analysis, that the algorithms are responsive within a reasonable time frame. Performing geostatistical analysis on such large spatial datasets can present a number of problems, particularly in the case where maximum likelihood estimation is used. Although the storage requirements only scale linearly with the number of observations in the dataset, the computational complexity, in terms of memory and speed, scales quadratically and cubically respectively. Most modern commodity hardware has at least two processor cores, if not more. Other mechanisms for allowing parallel computation, such as Grid-based systems, are also becoming increasingly available. However, currently there seems to be little interest in exploiting this extra processing power within the context of geostatistics. In this paper we review the existing parallel approaches for geostatistics.
By recognising that different natural parallelisms exist and can be exploited depending on whether the dataset is sparsely or densely sampled with respect to the range of variation, we introduce two contrasting novel implementations of parallel algorithms based on approximating the data likelihood, extending the methods of Vecchia [1988] and Tresp [2000]. Using parallel maximum likelihood variogram estimation and parallel prediction algorithms, we show that computational time can be significantly reduced. We demonstrate this with both sparsely and densely sampled data on a variety of architectures, ranging from the common dual-core processor found in many modern desktop computers to large multi-node supercomputers. To highlight the strengths and weaknesses of the different methods we employ synthetic data sets, and go on to show how the methods allow maximum likelihood based inference on the exhaustive Walker Lake data set.
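The Vecchia [1988]-style likelihood approximation mentioned above can be illustrated with a minimal sketch (assumptions, not the paper's code): the exact Gaussian log-likelihood requires factorising the full n-by-n covariance matrix at O(n^3) cost, whereas conditioning each ordered observation on only a few predecessors reduces this to a sum of cheap low-dimensional terms, one per observation, which can also be evaluated in parallel. For a 1-D exponential covariance the process is Markov, so conditioning on the single previous point is exact, which makes for a convenient correctness check.

```python
import math

# Sketch of a Vecchia-style likelihood approximation for a zero-mean
# Gaussian process with exponential covariance on ordered 1-D locations.
# All function names and parameters are illustrative assumptions.

def exp_cov(x, sigma2, length):
    """Exponential covariance matrix K_ij = sigma2 * exp(-|xi - xj| / length)."""
    return [[sigma2 * math.exp(-abs(a - b) / length) for b in x] for a in x]

def full_loglik(y, K):
    """Exact zero-mean Gaussian log-likelihood via Cholesky, O(n^3)."""
    n = len(y)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    z = []
    for i in range(n):  # forward-solve L z = y
        z.append((y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (sum(zi * zi for zi in z) + logdet + n * math.log(2 * math.pi))

def vecchia_loglik(x, y, sigma2, length):
    """Approximate log-likelihood: condition each point on its predecessor, O(n)."""
    ll = -0.5 * (y[0] ** 2 / sigma2 + math.log(2 * math.pi * sigma2))
    for i in range(1, len(y)):
        rho = math.exp(-(x[i] - x[i - 1]) / length)
        mean, var = rho * y[i - 1], sigma2 * (1.0 - rho * rho)
        ll += -0.5 * ((y[i] - mean) ** 2 / var + math.log(2 * math.pi * var))
    return ll
```

Each conditional term depends only on a point and its predecessor, so the sum parallelises trivially across observations or blocks; with richer conditioning sets the same scheme approximates likelihoods for non-Markov covariances.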
Abstract:
Background - Problems of quality and safety persist in health systems worldwide. We conducted a large research programme to examine culture and behaviour in the English National Health Service (NHS). Methods - Mixed-methods study involving collection and triangulation of data from multiple sources, including interviews, surveys, ethnographic case studies, board minutes and publicly available datasets. We narratively synthesised data across the studies to produce a holistic picture, and in this paper we present a high-level summary. Results - We found an almost universal desire to provide the best quality of care. We identified many 'bright spots' of excellent caring and practice and high-quality innovation across the NHS, but also considerable inconsistency. Consistent achievement of high-quality care was challenged by unclear goals, overlapping priorities that distracted attention, and compliance-oriented bureaucratised management. The institutional and regulatory environment was populated by multiple external bodies serving different but overlapping functions. Some organisations found it difficult to obtain valid insights into the quality of the care they provided. Poor organisational and information systems sometimes left staff struggling to deliver care effectively and disempowered them from initiating improvement. Good staff support and management were also highly variable, though they were fundamental to culture and were directly related to patient experience, safety and quality of care. Conclusions - Our results highlight the importance of clear, challenging goals for high-quality care. Organisations need to put the patient at the centre of all they do, get smart intelligence, focus on improving organisational systems, and nurture caring cultures by ensuring that staff feel valued, respected, engaged and supported.
Abstract:
Analysing the molecular polymorphism and interactions of DNA, RNA and proteins is of fundamental importance in biology. Predicting the functions of polymorphic molecules is important in order to design more effective medicines. Analysing major histocompatibility complex (MHC) polymorphism is important for mate choice, epitope-based vaccine design, transplantation rejection, etc. Most of the existing exploratory approaches cannot analyse these datasets because of the large number of molecules with a high number of descriptors per molecule. This thesis develops novel methods for data projection in order to explore high-dimensional biological datasets by visualising them in a low-dimensional space. With increasing dimensionality, some existing data visualisation methods such as generative topographic mapping (GTM) become computationally intractable. We propose variants of these methods, where we use log-transformations at certain steps of the expectation maximisation (EM) based parameter learning process, to make them tractable for high-dimensional datasets. We demonstrate these proposed variants on both a synthetic dataset and an electrostatic potential dataset of MHC class-I. We also propose to extend a latent trait model (LTM), suitable for visualising high-dimensional discrete data, to simultaneously estimate feature saliency as an integrated part of the parameter learning process of a visualisation model. This LTM variant not only gives better visualisation by modifying the projection map based on feature relevance, but also helps users to assess the significance of each feature. Another problem which is not addressed much in the literature is the visualisation of mixed-type data. We propose to combine GTM and LTM in a principled way, where appropriate noise models are used for each type of data, in order to visualise mixed-type data in a single plot. We call this model a generalised GTM (GGTM).
We also propose to extend the GGTM model to estimate feature saliencies while training a visualisation model; this is called GGTM with feature saliency (GGTM-FS). We demonstrate the effectiveness of these proposed models on both synthetic and real datasets. We evaluate visualisation quality using quality metrics such as a distance distortion measure and rank-based measures: trustworthiness, continuity, and mean relative rank errors with respect to data space and latent space. In cases where the labels are known, we also use the quality metrics of KL divergence and nearest-neighbour classification error in order to determine the separation between classes. We demonstrate the efficacy of these proposed models on both synthetic and real biological datasets, with a main focus on the MHC class-I dataset.
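A standard log-domain device of the kind the thesis alludes to is the log-sum-exp trick (this is a generic sketch, not necessarily the thesis's exact formulation): in high dimensions the per-component Gaussian densities in GTM's E-step underflow to 0.0, so the posterior responsibilities are computed from log-densities instead.

```python
import math

# Generic sketch of a log-transformed E-step computation: posterior
# responsibilities r_k = exp(a_k) / sum_j exp(a_j) computed stably from
# log-densities a_k via the log-sum-exp trick, avoiding underflow when
# all a_k are large negative numbers (as happens in high dimensions).

def responsibilities(log_densities):
    """Stable normalised responsibilities from component log-densities."""
    m = max(log_densities)
    shifted = [a - m for a in log_densities]  # largest term becomes 0
    log_norm = m + math.log(sum(math.exp(s) for s in shifted))
    return [math.exp(a - log_norm) for a in log_densities]
```

With log-densities like [-5000.0, -5001.0], a naive exp() underflows both terms to 0.0 and the normalisation becomes 0/0, while the log-domain version returns approximately [0.731, 0.269].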