874 results for Distributed data access


Relevance:

40.00%

Publisher:

Abstract:

In the last decade the principle of Open Access to publicly funded research has received growing support from policy makers and funders across Europe, both at the national level and within the European Union context. At the European level, some of the first relevant steps were taken by the European Research Council (ERC), with a statement supporting Open Access (2006) shortly followed by guidelines for researchers funded by the ERC (2007) stating that all peer-reviewed publications from ERC-funded projects should be made openly accessible shortly after their publication. Those guidelines were revised in October 2013, reinforcing the mandatory character of the requirements and expanding them to monographs.

Relevance:

40.00%

Publisher:

Abstract:

This work was supported in part by the EU project "2nd Generation Open Access Infrastructure for Research in Europe" (OpenAIRE+). The autumn training school Development and Promotion of Open Access to Scientific Information and Research is organized in the framework of the Fourth International Conference on Digital Presentation and Preservation of Cultural and Scientific Heritage (DiPP2014, September 18–21, 2014, Veliko Tarnovo, Bulgaria, http://dipp2014.math.bas.bg/), held under UNESCO patronage. The main organizer is the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, with the support of the EU project FOSTER (http://www.fosteropenscience.eu/) and the P. R. Slaveykov Regional Public Library in Veliko Tarnovo, Bulgaria.

Relevance:

40.00%

Publisher:

Abstract:

In this paper we evaluate and compare two representative and popular distributed processing engines for large-scale big data analytics: Spark and the graph-based engine GraphLab. We design a benchmark suite including representative algorithms and datasets to compare the performance of the computing engines in terms of running time, memory and CPU usage, and network and I/O overhead. The benchmark suite is tested on both a local computer cluster and virtual machines in the cloud. By varying the number of computers and the amount of memory, we examine the scalability of the computing engines with increasing computing resources (such as CPU and memory). We also run cross-evaluation of generic and graph-based analytic algorithms over graph-processing and generic platforms to identify the potential performance degradation if only one processing engine is available. It is observed that both computing engines show good scalability with an increase in computing resources. While GraphLab largely outperforms Spark for graph algorithms, its running time is close to Spark's for non-graph algorithms. Additionally, the running time of Spark for graph algorithms over cloud virtual machines is observed to increase by almost 100% compared to local computer clusters.
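
As a rough illustration of what a single timed run in such a benchmark suite might look like (this is not the paper's actual suite; the toy edge list and iteration count are placeholders), here is a PySpark PageRank sketch that records wall-clock running time:

```python
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("pagerank-bench").getOrCreate()
sc = spark.sparkContext

# Toy edge list; a real benchmark would load a large graph from HDFS.
edges = sc.parallelize([(1, 2), (2, 3), (3, 1), (3, 2)])
links = edges.groupByKey().cache()      # node -> outgoing neighbors
ranks = links.mapValues(lambda _: 1.0)  # initial rank per node

start = time.time()
for _ in range(10):  # fixed number of PageRank iterations
    contribs = links.join(ranks).flatMap(
        lambda kv: [(dst, kv[1][1] / len(kv[1][0])) for dst in kv[1][0]])
    ranks = contribs.reduceByKey(lambda a, b: a + b) \
                    .mapValues(lambda r: 0.15 + 0.85 * r)
ranks.collect()  # force evaluation before stopping the clock
print(f"PageRank wall-clock time: {time.time() - start:.2f} s")
spark.stop()
```

Memory, CPU and network figures would come from external monitoring of the cluster nodes rather than from the timed job itself.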

Relevance:

40.00%

Publisher:

Abstract:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process the resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help, database queries can be distributed over both local and Web data sources within the MSemODB framework. Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold "custom wrapper" approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize a specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development, we thoroughly investigate issues associated with Web site selection, analysis and processing. Data Extractor is designed to act as a data retrieval server as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. This study confirms the feasibility of building custom wrappers for Web sites. The approach provides accuracy of data retrieval, and power and flexibility in handling complex cases.
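
A toy Python sketch of the custom-wrapper idea (the site, URL and field names are hypothetical; the thesis's own scripting language and Java wrapper library are not reproduced here):

```python
import re
import urllib.request

def fetch(url: str) -> str:
    """Act as the intermediary between the application and the site."""
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("utf-8", errors="replace")

def wrap_quotes(html: str):
    """Wrapper: turn a hypothetical stock-quote page into (symbol, price) rows."""
    pattern = re.compile(
        r'<td class="sym">(\w+)</td>\s*<td class="px">([\d.]+)</td>')
    return [(sym, float(px)) for sym, px in pattern.findall(html)]

# Querying the "site as a data source": filter the extracted rows.
rows = wrap_quotes(fetch("https://example.com/quotes"))  # hypothetical URL
print([r for r in rows if r[1] > 100.0])
```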

Relevance:

40.00%

Publisher:

Abstract:

Access control (AC) is a necessary defense against a large variety of security attacks on the resources of distributed enterprise applications. However, to be effective, AC in some application domains has to be fine-grained, support the use of application-specific factors in authorization decisions, and consistently and reliably enforce organization-wide authorization policies across enterprise applications. Because existing middleware technologies do not provide a complete solution, application developers resort to embedding AC functionality in application systems. This coupling of AC functionality with application logic causes significant problems, including tremendously difficult, costly and error-prone development, integration, and overall ownership of application software. The way AC for application systems is engineered needs to be changed. In this dissertation, we propose an architectural approach for engineering AC mechanisms to address the above problems. First, we develop a framework for implementing the role-based access control (RBAC) model using AC mechanisms provided by CORBA Security. For those application domains where the granularity of CORBA controls and the expressiveness of the RBAC model suffice, our framework addresses the stated problem. In the second and main part of our approach, we propose an architecture for an authorization service, RAD, to address the problem of controlling access to distributed application resources when the granularity of, and support for complex policies by, middleware AC mechanisms are inadequate. Applying this architecture, we developed a CORBA-based application authorization service (CAAS). Using CAAS, we studied the main properties of the architecture and showed how they can be substantiated by employing CORBA and Java technologies. Our approach enables a wide-ranging solution for controlling the resources of distributed enterprise applications.
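
The core RBAC decision that such an authorization service enforces can be sketched in a few lines of Python (the users, roles and permissions below are invented for illustration; CAAS itself is built with CORBA and Java):

```python
from dataclasses import dataclass, field

@dataclass
class RbacPolicy:
    role_perms: dict = field(default_factory=dict)  # role -> {(resource, op)}
    user_roles: dict = field(default_factory=dict)  # user -> {role}

    def grant(self, role, resource, op):
        self.role_perms.setdefault(role, set()).add((resource, op))

    def assign(self, user, role):
        self.user_roles.setdefault(user, set()).add(role)

    def check(self, user, resource, op) -> bool:
        """Authorization decision: does any of the user's roles permit op?"""
        return any((resource, op) in self.role_perms.get(r, set())
                   for r in self.user_roles.get(user, ()))

policy = RbacPolicy()
policy.grant("teller", "account", "read")
policy.assign("alice", "teller")
assert policy.check("alice", "account", "read")
assert not policy.check("alice", "account", "delete")
```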

Relevance:

40.00%

Publisher:

Abstract:

The research presented in this dissertation comprises several parts which jointly attain the goal of Semantic Distributed Database Management with Applications to Internet Dissemination of Environmental Data. Part of the research into more effective and efficient data management has been pursued through enhancements to the Semantic Binary Object-Oriented database (Sem-ODB), such as more effective load-balancing techniques for the database engine and the use of Sem-ODB as a tool for integrating structured and unstructured heterogeneous data sources. Another part of the research in data management has pursued methods for optimizing queries in distributed databases through the intelligent use of network bandwidth; this has applications in networks that provide varying levels of Quality of Service or throughput. The application of the Semantic Binary database model as a tool for relational database modeling has also been pursued. This has resulted in database applications that are used by researchers at the Everglades National Park to store environmental data and remotely sensed imagery. The areas of research described above have contributed to the creation of TerraFly, which provides for the dissemination of geospatial data via the Internet. TerraFly research presented herein ranges from the development of TerraFly's back-end database and interfaces, through the features that are presented to the public (such as the ability to provide autopilot scripts and on-demand data about a point), to applications of TerraFly in the areas of hazard mitigation, recreation, and aviation.
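
The bandwidth-aware query optimization mentioned above can be illustrated with a toy cost model (this is not Sem-ODB's actual optimizer; the replica names, bandwidths and latencies are placeholders): route a subquery to the replica with the lowest estimated transfer time.

```python
def best_replica(result_bytes: float, replicas: dict) -> str:
    """replicas: name -> (bandwidth_bytes_per_s, latency_s); values assumed
    to come from QoS measurements of each network path."""
    def est_time(stats):
        bw, lat = stats
        return lat + result_bytes / bw  # startup latency plus transfer time
    return min(replicas, key=lambda name: est_time(replicas[name]))

replicas = {"miami": (2e6, 0.01), "remote": (10e6, 0.20)}  # hypothetical
print(best_replica(5e6, replicas))  # prefers the higher-bandwidth site here
```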

Relevance:

40.00%

Publisher:

Abstract:

With the recent explosion in the complexity and amount of digital multimedia data, there has been a huge impact on the operations of various organizations in distinct areas such as government services, education, medical care, business, and entertainment. To satisfy the growing demand for multimedia data management systems, an integrated framework called DIMUSE is proposed and deployed for distributed multimedia applications, offering a full scope of multimedia-related tools and providing appealing experiences for the users. This research mainly focuses on video database modeling and retrieval by addressing a set of core challenges. First, a comprehensive multimedia database modeling mechanism called Hierarchical Markov Model Mediator (HMMM) is proposed to model high-dimensional media data including video objects, low-level visual/audio features, as well as historical access patterns and frequencies. The associated retrieval and ranking algorithms are designed to support not only general queries but also complicated temporal event pattern queries. Second, system training and learning methodologies are incorporated so that user interests are mined efficiently to improve retrieval performance. Third, video clustering techniques are proposed to further increase search speed and accuracy by architecting a more efficient multimedia database structure. A distributed video management and retrieval system is designed and implemented to demonstrate the overall performance. The proposed approach is further customized for a mobile-based video retrieval system to address the perception subjectivity issue by considering an individual user's profile. Moreover, to deal with security and privacy issues and concerns in distributed multimedia applications, DIMUSE also incorporates a practical framework called SMARXO, which supports multilevel multimedia security control. SMARXO efficiently combines role-based access control (RBAC), XML and an object-relational database management system (ORDBMS) to achieve proficient security control. A distributed multimedia management system named DMMManager (Distributed MultiMedia Manager) is developed with the proposed framework DIMUSE to support multimedia capturing, analysis, retrieval, authoring and presentation in one single framework.
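
A heavily simplified sketch of the access-pattern side of HMMM-style ranking (the actual model is hierarchical and also uses low-level visual/audio features; the session data here are invented): candidates that were historically co-accessed with the user's current hit rank higher.

```python
from collections import defaultdict

# Estimate co-access counts from past sessions (toy history).
co_access = defaultdict(int)
history = [["v1", "v3"], ["v1", "v3", "v7"], ["v2", "v7"]]
for session in history:
    for a in session:
        for b in session:
            if a != b:
                co_access[(a, b)] += 1

def rank_from(current: str, candidates):
    """Rank candidates by transition probability estimated from
    historical access frequencies."""
    total = sum(co_access[(current, c)] for c in candidates) or 1
    return sorted(candidates,
                  key=lambda c: co_access[(current, c)] / total,
                  reverse=True)

print(rank_from("v1", ["v2", "v3", "v7"]))  # ['v3', 'v7', 'v2']
```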

Relevance:

40.00%

Publisher:

Abstract:

Background: Biologists often need to assess whether unfamiliar datasets warrant the time investment required for more detailed exploration. Basing such assessments on brief descriptions provided by data publishers is unwieldy for large datasets that contain insights dependent on specific scientific questions. Alternatively, using complex software systems for a preliminary analysis may be deemed too time-consuming in itself, especially for unfamiliar data types and formats. This may lead to wasted analysis time and the discarding of potentially useful data. Results: We present an exploration of design opportunities that the Google Maps interface offers to biomedical data visualization. In particular, we focus on synergies between visualization techniques and Google Maps that facilitate the development of biological visualizations which have both low overhead and sufficient expressivity to support the exploration of data at multiple scales. The methods we explore rely on displaying pre-rendered visualizations of biological data in browsers, with sparse yet powerful interactions, by using the Google Maps API. We structure our discussion around five visualizations: a gene co-regulation visualization, a heatmap viewer, a genome browser, a protein interaction network, and a planar visualization of white matter in the brain. Feedback from collaborative work with domain experts suggests that our Google Maps visualizations offer multiple, scale-dependent perspectives and can be particularly helpful for unfamiliar datasets due to their accessibility. We also find that users, particularly those less experienced with computer use, are attracted by the familiarity of the Google Maps API. Our five implementations introduce design elements that can benefit visualization developers. Conclusions: We describe a low-overhead approach that lets biologists access readily analyzed views of unfamiliar scientific datasets. We rely on pre-computed visualizations prepared by data experts, accompanied by sparse and intuitive interactions, and distributed via the familiar Google Maps framework. Our contributions are an evaluation demonstrating the validity and opportunities of this approach, a set of design guidelines benefiting those wanting to create such visualizations, and five concrete example visualizations.
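
One ingredient of this approach, slicing a pre-rendered visualization into the 256-pixel z/x/y tiles that the Google Maps API serves, can be sketched as follows (the file name, image size and zoom level are assumptions, not the authors' pipeline):

```python
import os
from PIL import Image

TILE = 256
img = Image.open("heatmap_zoom3.png")  # hypothetical pre-rendered 2048x2048 image
zoom = 3
for x in range(img.width // TILE):
    for y in range(img.height // TILE):
        # Cut out one 256x256 tile and store it in the z/x/y layout
        # that a Maps custom tile layer fetches.
        tile = img.crop((x * TILE, y * TILE, (x + 1) * TILE, (y + 1) * TILE))
        os.makedirs(f"tiles/{zoom}/{x}", exist_ok=True)
        tile.save(f"tiles/{zoom}/{x}/{y}.png")
```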


Relevance:

40.00%

Publisher:

Abstract:

We provide a compilation of downward fluxes (total mass, POC, PON, BSiO2, CaCO3, PIC and lithogenic/terrigenous fluxes) from over 6000 sediment trap measurements distributed in the Atlantic Ocean, from 30 degrees North to 49 degrees South, covering the period 1982-2011. Data from the Mediterranean Sea are also included. Data were compiled from different sources: data repositories (BCO-DMO, PANGAEA), time-series sites (BATS, CARIACO), published scientific papers and/or personal communications from PIs. All sources are specified in the data set. Data from the World Ocean Atlas 2009 were extracted to provide each flux observation with contextual environmental data, such as temperature, salinity, oxygen (concentration, AOU and percentage saturation), nitrate, phosphate and silicate.
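
A hypothetical sketch of how such a compilation could be joined with gridded contextual fields (the file and column names are invented; the actual data set layout may differ):

```python
import pandas as pd

fluxes = pd.read_csv("sediment_trap_fluxes.tab", sep="\t")  # hypothetical file
woa = pd.read_csv("woa09_profile_grid.tab", sep="\t")       # hypothetical file

# Round trap positions to a 1-degree grid, then merge on grid cell and depth.
for df in (fluxes, woa):
    df["lat_bin"] = df["latitude"].round()
    df["lon_bin"] = df["longitude"].round()

merged = fluxes.merge(
    woa[["lat_bin", "lon_bin", "depth", "temperature", "nitrate", "silicate"]],
    on=["lat_bin", "lon_bin", "depth"], how="left")
print(merged[["poc_flux", "temperature", "nitrate"]].describe())
```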

Relevance:

40.00%

Publisher:

Abstract:

Marine phytoplankton can evolve rapidly when confronted with aspects of climate change because of their large population sizes and fast generation times. Despite this, the importance of environmental fluctuations, a key feature of climate change, has received little attention: selection experiments with marine phytoplankton are usually carried out in stable environments and use single or few representatives of a species, genus or functional group. Here we investigate whether and by how much environmental fluctuations contribute to changes in ecologically important phytoplankton traits such as C:N ratios and cell size, and test the variability of changes in these traits within the globally distributed species Ostreococcus. We have evolved 16 physiologically distinct lineages of Ostreococcus at stable high CO2 (1031 ± 87 µatm CO2, SH) and fluctuating high CO2 (1012 ± 244 µatm CO2, FH) for 400 generations. We find that although both fluctuation and high CO2 drive evolution, FH-evolved lineages are smaller, have reduced C:N ratios and respond more strongly to further increases in CO2 than do SH-evolved lineages. This indicates that environmental fluctuations are an important factor to consider when predicting how the characteristics of future phytoplankton populations will have an impact on biogeochemical cycles and higher trophic levels in marine food webs.

Relevance:

40.00%

Publisher:

Abstract:

The exponential growth of studies on the biological response to ocean acidification over the last few decades has generated a large amount of data. To facilitate data comparison, a data compilation hosted at the data publisher PANGAEA was initiated in 2008 and is updated on a regular basis (doi:10.1594/PANGAEA.149999). By January 2015, a total of 581 data sets (over 4,000,000 data points) from 539 papers had been archived. Here we present the developments of this data compilation in the five years since its first description by Nisumaa et al. (2010). Most of the study sites from which data have been archived are still in the Northern Hemisphere, and the number of archived data sets from the Southern Hemisphere and polar oceans is still relatively low. Data from 60 studies that investigated the response of a mix of organisms or natural communities were all added after 2010, indicating a welcome shift from the study of individual organisms to communities and ecosystems. The initial imbalance of considerably more data archived on calcification and primary production than on other processes has improved. There is also a clear tendency towards more data archived from multifactorial studies after 2010. For easier and more effective access to ocean acidification data, the ocean acidification community is strongly encouraged to contribute to the data archiving effort, and to help develop standard vocabularies describing the variables and define best practices for archiving ocean acidification data.

Relevance:

40.00%

Publisher:

Abstract:

The DTRF2014 is a realization of the fundamental Earth-fixed coordinate system, the International Terrestrial Reference System (ITRS). It has been computed by the Deutsches Geodätisches Forschungsinstitut der Technischen Universität München (DGFI-TUM). The DTRF2014 consists of station positions and velocities of 1712 globally distributed geodetic observing stations of the observation techniques VLBI, SLR, GNSS and DORIS. Additionally, for the first time, non-tidal atmospheric and hydrological loading is considered in the solution. The DTRF2014 was released in August 2016 and incorporates observation data of the four techniques up to 2014. The observation data were processed and submitted by the corresponding technique services: IGS (International GNSS Service, http://igscb.jpl.nasa.gov), IVS (International VLBI Service, http://ivscc.gsfc.nasa.gov), ILRS (International Laser Ranging Service, http://ilrs.gsfc.nasa.gov) and IDS (International DORIS Service, http://ids-doris.org). The DTRF2014 is an independent ITRS realization. It is computed on the basis of the same input data as the realizations JTRF2014 (JPL, Pasadena) and ITRF2014 (IGN, Paris). The three realizations of the ITRS differ conceptually. While DTRF2014 and ITRF2014 are based on station positions at a reference epoch and velocities, the JTRF2014 is based on time series of station positions. DTRF2014 and ITRF2014 also result from different combination strategies: the ITRF2014 is based on the combination of solutions, while the DTRF2014 is computed by the combination of normal equations. The DTRF2014 comprises 3D coordinates and coordinate changes of 1347 GNSS, 113 VLBI, 99 SLR and 153 DORIS stations. The reference epoch is 1.1.2005, 0h UTC. The Earth Orientation Parameters (EOP), i.e. the coordinates of the terrestrial and the celestial pole, UT1-UTC and the Length of Day (LOD), were estimated simultaneously with the station coordinates. The EOP time series cover the period from 1979.7 to 2015.0. The station names are the official IERS identifiers: CDP numbers or 4-character IDs and DOMES numbers (http://itrf.ensg.ign.fr/doc_ITRF/iers_sta_list.txt). The DTRF2014 solution is available in one comprehensive SINEX file and four technique-specific SINEX files, see below. A detailed description of the solution is given on the website of DGFI-TUM (http://www.dgfi.tum.de/en/science-data-products/dtrf2014/). More information can be made available on request.
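
Position-plus-velocity frames such as DTRF2014 are used by propagating station coordinates linearly from the reference epoch; a minimal sketch (the coordinates and velocity below are placeholders, not actual DTRF2014 values):

```python
def propagate(xyz0, vel, epoch, ref_epoch=2005.0):
    """Linear station motion: x(t) = x0 + v * (t - t0), metres and m/yr."""
    dt = epoch - ref_epoch
    return tuple(x + v * dt for x, v in zip(xyz0, vel))

x0 = (4075580.0, 931855.0, 4801568.0)  # placeholder geocentric XYZ in metres
v = (-0.016, 0.017, 0.010)             # placeholder velocity in m/yr
print(propagate(x0, v, 2014.5))        # coordinates at epoch 2014.5
```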

Relevance:

40.00%

Publisher:

Abstract:

The program PanTool was developed as a toolbox, like a Swiss Army knife, for data conversion and recalculation, written to harmonize individual data collections to the standard import format used by PANGAEA. The input files PanTool needs are tables saved as plain ASCII. The user can create these files with a spreadsheet program like MS Excel or with the system text editor. PanTool is distributed as freeware for the operating systems Microsoft Windows, Apple OS X and Linux.
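
A small sketch of the kind of harmonization step PanTool automates (this is not PanTool itself; the file names and delimiter handling are assumptions): convert a comma-separated collection to tab-delimited plain ASCII.

```python
import csv

# Read a comma-separated source and write the tab-delimited ASCII table
# that an import pipeline like PANGAEA's expects.
with open("collection.csv", newline="") as src, \
     open("collection_import.tab", "w", newline="") as dst:
    reader = csv.reader(src)
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(cell.strip() for cell in row)
```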