995 results for Data Management


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The iRODS system, created by the San Diego Supercomputer Center, is a rule-oriented data management system that allows the user to create sets of rules to define how the data is to be managed. Each rule corresponds to a particular action or operation (such as checksumming a file), and the system is flexible enough to allow the user to create new rules for new types of operations. The iRODS system can interface to any storage system (provided an iRODS driver is built for that system) and relies on its metadata catalogue to provide a virtual file system that can handle files of any size and type. However, some storage systems (such as tape systems) do not handle small files efficiently and prefer small files to be packaged up (or “bundled”) into larger units. We have developed a system that can bundle small data files of any type into larger units, called mounted collections. The system can create collection families and contains its own extensible metadata, including metadata on which family the collection belongs to. The mounted collection system can work standalone and is being incorporated into the iRODS system to enhance the system's flexibility in handling small files. In this paper we describe the motivation for creating a mounted collection system, its architecture and how it has been incorporated into the iRODS system. We describe the different technologies used to create the mounted collection system and provide some performance numbers.
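As a rough illustration of the bundling idea in this abstract, the sketch below packs several small files into one archive and stores a manifest recording the collection family and extra metadata. It is not the authors' mounted collection format or any iRODS API: the tar container, the `MANIFEST.json` name, and the functions `bundle_files` and `read_member` are all assumptions made for this example.

```python
import io
import json
import tarfile


def bundle_files(files, family, extra_metadata=None):
    """Pack small files (name -> bytes) into one in-memory tar archive.

    A JSON manifest stored inside the archive records the collection family
    and any extra metadata. All names here are hypothetical.
    """
    manifest = {
        "family": family,
        "members": {name: len(data) for name, data in files.items()},
        "metadata": dict(extra_metadata or {}),
    }
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        entries = dict(files)
        # The manifest travels with the bundle, so it is self-describing.
        entries["MANIFEST.json"] = json.dumps(manifest, sort_keys=True).encode("utf-8")
        for name, data in entries.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def read_member(bundle, name):
    """Return one member's bytes from a bundle without unpacking the rest."""
    with tarfile.open(fileobj=io.BytesIO(bundle), mode="r") as archive:
        return archive.extractfile(name).read()
```

A storage back end that prefers large objects would then receive the single bundle, while individual files remain addressable by name through `read_member`.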

Relevance:

100.00% 100.00%

Publisher:

Abstract:

There is remarkable agreement in expectations today for vastly improved ocean data management a decade from now, with capabilities that will help to bring significant benefits to ocean research and to society. Advancing data management to such a degree, however, will require cultural and policy changes that are slow to effect. The technological foundations upon which data management systems are built are certain to continue advancing rapidly in parallel. These considerations argue for adopting attitudes of pragmatism and realism when planning data management strategies. In this paper we adopt those attitudes as we outline opportunities for progress in ocean data management. We begin with a synopsis of expectations for integrated ocean data management a decade from now. We discuss factors that should be considered by those evaluating candidate “standards”. We highlight challenges and opportunities in a number of technical areas, including “Web 2.0” applications, data modeling, data discovery and metadata, real-time operational data, archival of data, biological data management and satellite data management. We discuss the importance of investments in the development of software toolkits to accelerate progress. We conclude the paper by recommending a few specific, short-term targets for implementation that we believe to be both significant and achievable, and by calling for action by community leadership to effect these advancements.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Climate-G is a large-scale distributed testbed devoted to climate change research. It is an unfunded effort, started in 2008, involving a wide community in both Europe and the US. The testbed is an interdisciplinary effort involving partners from several institutions and combining expertise in the fields of climate change and computational science. Its main goal is to allow scientists to carry out geographical and cross-institutional data discovery, access, analysis, visualization and sharing of climate data. It represents an attempt to address, in a real environment, challenging data and metadata management issues. This paper presents an overview of the Climate-G testbed, highlighting the most important results achieved since the beginning of the project.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Purpose: To investigate the relationship between research data management (RDM) and data sharing in the formulation of RDM policies and development of practices in higher education institutions (HEIs). Design/methodology/approach: Two strands of work were undertaken sequentially: firstly, content analysis of 37 RDM policies from UK HEIs; secondly, two detailed case studies of institutions with different approaches to RDM based on semi-structured interviews with staff involved in the development of RDM policy and services. The data are interpreted using insights from Actor Network Theory. Findings: RDM policy formation and service development have created a complex set of networks within and beyond institutions, involving different professional groups whose widely varying priorities shape activities. Data sharing is considered an important activity in the policies and services of the HEIs studied, but its prominence can in most cases be attributed to the positions adopted by large research funders. Research limitations/implications: The case studies, as research based on qualitative data, cannot be assumed to be universally applicable but do illustrate a variety of issues and challenges experienced more generally, particularly in the UK. Practical implications: The research may help to inform development of policy and practice in RDM in HEIs and funder organisations. Originality/value: This paper makes an early contribution to the RDM literature on the specific topic of the relationship between RDM policy and services, and openness – a topic which to date has received limited attention.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The availability of critical services and their data can be significantly increased, even in the face of system and network failures, by replicating them on multiple interconnected systems. On some platforms, such as peer-to-peer (P2P) systems, inherent characteristics mandate some form of replication to provide acceptable service to users. However, how best to replicate data to build highly available peer-to-peer systems remains an open problem. In this paper, we propose an approach to address the data replication problem on P2P systems. The proposed scheme is compared with other techniques and is shown to require less communication cost per operation as well as to provide a higher degree of data availability.
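The abstract does not describe the proposed scheme, so the following is only a generic back-of-the-envelope sketch of why replication raises availability. It assumes that each replica is reachable independently with probability `p` and that a read succeeds if any one replica is reachable. Both assumptions, and the function names, are introduced here for illustration and are not taken from the paper.

```python
import math


def read_availability(p, n):
    """Probability that at least one of n independent replicas is reachable."""
    return 1.0 - (1.0 - p) ** n


def replicas_needed(p, target):
    """Smallest replica count whose read availability meets the target."""
    if not (0.0 < p < 1.0 and 0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= target for n and round up.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))
```

Under these assumptions, peers that are online half the time need seven replicas to reach 99% read availability; schemes that also guarantee consistent writes generally pay extra communication cost, which is the trade-off the abstract refers to.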

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The peer-to-peer content distribution network (PCDN) has recently become a hot topic, and it has huge potential for massive data-intensive applications on the Internet. One of the challenges in PCDN is routing for data sources and data deliveries. In this paper, we study a type of network model formed by dynamic autonomy areas, structured source servers and proxy servers. Based on this network model, we propose a number of algorithms to address the routing and data delivery issues. To accommodate the highly dynamic nature of the autonomy area, we established dynamic tree structure proliferation system routing, proxy routing and resource searching algorithms. The simulation results showed that the performance of the proposed network model and algorithms is stable.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The aim of this manual is to provide a comprehensive practical tool for the generation and analysis of genetic data for subsequent application in aquatic resources management in relation to genetic stock identification in inland fisheries and aquaculture. The material covers only general background on genetics in relation to aquaculture and fisheries resource management, together with the techniques and relevant methods of data analysis that are commonly used to address questions relating to genetic resource characterisation and population genetic analyses. No attempt is made to include applications of genetic improvement techniques, e.g. selective breeding or producing genetically modified organisms (GMOs). The manual includes two ‘stand-alone’ parts, of which this is the second volume: Part 1 – Conceptual basis of population genetic approaches: will provide a basic foundation on genetics in general, and concepts of population genetics. Issues on the choices of molecular markers and project design are also discussed. Part 2 – Laboratory protocols, data management and analysis: will provide step-by-step protocols of the most commonly used molecular genetic techniques utilised in population genetics and systematic studies. In addition, a brief discussion and explanation of how these data are managed and analysed is also included. This manual is expected to enable NACA member country personnel to be trained to undertake molecular genetic studies in their own institutions, and as such is aimed at middle and higher level technical grades. The manual can also provide useful teaching material for specialised advanced level university courses in the region and for postgraduate students. The manual has gone through two development/improvement stages. The initial material was tested at a regional workshop, and at the second stage feedback from participants was used to improve the contents.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Wireless sensor networks (WSNs) are proposed as a powerful means for fine-grained monitoring in different classes of applications at very low cost and for extended periods of time. Among various solutions, supporting WSNs with intelligent mobile platforms for handling data management has proved beneficial for extending the network lifetime and enhancing its performance. The mobility model applied strongly affects the data latency in the network as well as the sensors’ energy consumption levels. Intelligent-based models that take the network runtime conditions into consideration are adopted to overcome such problems. In this chapter, existing proposals that use intelligent mobility for managing the data in WSNs are surveyed. Different classifications are presented throughout the chapter to give a complete view of the solutions in this domain. Furthermore, these models are compared with respect to various metrics and design goals.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Recently, fields with substantial computing requirements have turned to cloud computing for economical, scalable, and on-demand provisioning of required execution environments. However, current cloud offerings focus on providing individual servers while tasks such as application distribution and data preparation are left to cloud users. This article presents a new form of cloud called HPC Hybrid Deakin (H2D) cloud; an experimental hybrid cloud capable of utilising both local and remote computational services for large embarrassingly parallel applications. As well as supporting execution, H2D also provides a new service, called DataVault, that provides transparent data management services so all cloud-hosted clusters have required datasets before commencing execution.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Open data has created an unprecedented opportunity, along with new challenges, for ecosystem scientists. Skills in data management are essential to acquire, manage, publish, access and re-use data. These skills span many disciplines and require trans-disciplinary collaboration. Science synthesis centres support analysis and synthesis through collaborative 'Working Groups' where domain specialists work together to synthesise existing information to provide insight into critical problems. The Australian Centre for Ecological Analysis and Synthesis (ACEAS) served a wide range of stakeholders, from scientists to policy-makers to managers. This paper investigates the level of sophistication in data management in the ecosystem science community through the lens of the ACEAS experience, and identifies the important factors required to enable us to benefit from this new data world and produce innovative science. ACEAS promoted the analysis and synthesis of data to solve transdisciplinary questions, and promoted the publication of the synthesised data. To do so, it provided support in many of the key skillsets required. Analysis and synthesis in multi-disciplinary and multi-organisational teams, and publishing data, were new for most. Data were difficult to discover and access, and to make ready for analysis, largely due to lack of metadata. Data use and publication were hampered by concerns about data ownership and a desire for data citation. A web portal was created to visualise geospatial datasets to maximise data interpretation. By the end of the experience there was a significant increase in appreciation of the importance of a Data Management Plan. It is extremely doubtful that the work would have occurred or data been delivered without the support of the synthesis centre, as few of the participants had the necessary networks or skills. It is argued that participation in the Centre provided an important learning opportunity, and has resulted in improved knowledge and understanding of good data management practices.