912 results for meta-data


Relevance:

100.00%

Publisher:

Abstract:

This paper presents a generic framework that can be used to describe study plans using meta-data. The context of this research and the associated technologies and standards are presented. The approach adopted here has been developed within the mENU project, which aims to provide a model for a European Networked University. The methodology for the design of the generic framework is discussed and the main design requirements are presented. The approach is based on a set of templates containing the meta-data required for the description of programs of study, consisting of generic building elements annotated appropriately. The process followed to develop the templates is presented, together with a set of evaluation criteria to test the suitability of the approach. The structure of the templates is presented and example templates are shown. A first evaluation of the approach has shown that the proposed framework can provide a flexible and capable means for the generic description of study plans for the purposes of a networked university.
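
As an illustration only (the field and element names below are hypothetical and not taken from the mENU templates), a study-plan template built from annotated generic building elements might be sketched as a nested structure:

```python
# Hypothetical sketch of a study-plan template made of generic,
# annotated building elements; all field names are illustrative only.
study_plan_template = {
    "programme": {
        "title": None,                # to be filled by the provider
        "awarding_institution": None,
        "language_of_instruction": None,
    },
    "building_elements": [
        {
            "type": "module",         # generic element type
            "annotations": {          # meta-data describing the element
                "title": None,
                "credits": None,
                "prerequisites": [],
                "learning_outcomes": [],
            },
        },
    ],
}

def missing_annotations(template: dict) -> list[str]:
    """Return the required annotations that have not yet been filled in."""
    missing = []
    for element in template.get("building_elements", []):
        for key, value in element.get("annotations", {}).items():
            if value in (None, [], ""):
                missing.append(f"{element['type']}.{key}")
    return missing

print(missing_annotations(study_plan_template))
```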

Relevance:

70.00%

Publisher:

Abstract:

This poster covers projects funded by the Australian National Data Service (ANDS). The specific projects that were funded included: a) the Greenhouse Gas Emissions Project (N2O) with Prof. Peter Grace from QUT's Institute of Sustainable Resources; b) the Q150 Project for the management of multimedia data collected at festival events with Prof. Phil Graham from QUT's Institute of Creative Industries; and c) bio-diversity environmental sensing with Prof. Paul Roe from the QUT Microsoft eResearch Centre. For the purposes of these projects the Eclipse Rich Client Platform (Eclipse RCP) was chosen as an appropriate software development framework within which to develop the respective software. The poster presents a brief overview of the requirements of the projects and of the project team's experiences in using Eclipse RCP, reports on the advantages and disadvantages of using Eclipse, and gives the team's perspective on Eclipse as an integrated tool for supporting future data management requirements.

Relevance:

70.00%

Publisher:

Abstract:

In this paper we propose a new method of data handling for web servers, which we call Network Aware Buffering and Caching (NABC). NABC reduces data copies in a web server's data sending path by doing three things: (1) laying out the data in main memory in a way that allows protocol processing to be done without data copies, (2) keeping a unified cache of data in the kernel and ensuring safe access to it by various processes and the kernel, and (3) passing only the necessary meta-data between processes so that the time spent on bulk data handling during IPC is reduced. We realize NABC by implementing a set of system calls and a user library. The end product of the implementation is a set of APIs specifically designed for use by web servers. We port an in-house web server called SWEET to the NABC APIs and evaluate performance using a range of workloads, both simulated and real. The results show a gain of 12% to 21% in throughput for static file serving and a 1.6 to 4 times gain in throughput for lightweight dynamic content serving for a server using the NABC APIs over one using the UNIX APIs.
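
The NABC system calls themselves are kernel-level and are not reproduced here. As a loose, hedged analogy only, the copy-reduction idea behind point (1) can be illustrated in Python by contrasting a read-then-send loop, which copies file data through user space, with a zero-copy send via os.sendfile (a standard Unix facility, not the NABC API):

```python
import os
import socket

def send_with_copies(conn: socket.socket, path: str) -> None:
    # Naive path: file data is copied into user space and back into the kernel.
    with open(path, "rb") as f:
        while chunk := f.read(64 * 1024):
            conn.sendall(chunk)

def send_zero_copy(conn: socket.socket, path: str) -> None:
    # Zero-copy path (Unix): the kernel moves file data directly to the socket,
    # in the spirit of NABC's goal of removing copies from the sending path.
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        offset = 0
        while offset < size:
            sent = os.sendfile(conn.fileno(), f.fileno(), offset, size - offset)
            if sent == 0:
                break
            offset += sent
```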

Relevance:

70.00%

Publisher:

Abstract:

OBJECTIVES: The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations are usually on the order of tens of nanoseconds, generate a large amount of conformational data and are computationally expensive. More and more groups run such simulations and generate large volumes of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. METHODS: To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. RESULTS: To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. CONCLUSIONS: Web and grid services, especially pre-defined data mining services that can run on or 'near' the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
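
Purely as an illustrative sketch (the warehouse's actual schema is not given in the abstract), trajectory meta-data records of the kind described, e.g. for WT and L55P-TTR runs, could be catalogued and queried roughly like this; the field names and URIs are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class TrajectoryMetaData:
    # Illustrative fields only; not the warehouse's actual schema.
    protein: str          # e.g. "human TTR monomer"
    variant: str          # e.g. "WT" or "L55P"
    simulated_ns: float   # simulated time in nanoseconds
    n_frames: int         # number of stored conformations
    trajectory_uri: str   # where the bulk trajectory data lives (hypothetical URI)

catalogue = [
    TrajectoryMetaData("human TTR monomer", "WT", 10.0, 5000, "grid://ttr/wt/run1"),
    TrajectoryMetaData("human TTR monomer", "L55P", 10.0, 5000, "grid://ttr/l55p/run1"),
]

def runs_for_variant(variant: str) -> list[TrajectoryMetaData]:
    """Select all runs of a given variant from the meta-data catalogue."""
    return [m for m in catalogue if m.variant == variant]

print(runs_for_variant("L55P"))
```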

Relevance:

70.00%

Publisher:

Abstract:

The Short-term Water Information and Forecasting Tools (SWIFT) is a suite of tools for flood and short-term streamflow forecasting, consisting of a collection of hydrologic model components and utilities. Catchments are modelled using conceptual subareas and a node-link structure for channel routing. The tools comprise modules for calibration, model state updating, output error correction, ensemble runs and data assimilation. Given the combinatorial nature of the modelling experiments and the sub-daily time steps typically used for simulations, the volume of model configurations and time series data is substantial and its management is not trivial. SWIFT is currently used mostly for research purposes but has also been used operationally, with intersecting but significantly different requirements. Early versions of SWIFT used mostly ad hoc text files handled via Fortran code, with limited use of netCDF for time series data. The configuration and data handling modules have since been redesigned. The model configuration now follows a design where the data model is decoupled from the on-disk persistence mechanism. For research purposes the preferred on-disk format is JSON, to leverage numerous software libraries in a variety of languages, while retaining the legacy option of custom tab-separated text formats where that is the preferred access arrangement for the researcher. By decoupling the data model from data persistence, it is much easier to use, for instance, relational databases interchangeably to provide stricter provenance and audit trail capabilities in an operational flood forecasting context. For the time series data, given the volume and required throughput, text-based formats are usually inadequate. A schema derived from the CF conventions has been designed to handle time series for SWIFT efficiently.
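
A minimal sketch of the decoupling described above, assuming hypothetical class and method names (this is not SWIFT's actual API): the in-memory configuration model never knows whether it is persisted as JSON, tab-separated text, or something else, so backends can be swapped without touching the model code.

```python
import json
from typing import Protocol

class ModelConfigStore(Protocol):
    # Persistence interface: the in-memory data model never depends on the format.
    def save(self, config: dict, path: str) -> None: ...
    def load(self, path: str) -> dict: ...

class JsonStore:
    def save(self, config: dict, path: str) -> None:
        with open(path, "w") as f:
            json.dump(config, f, indent=2)

    def load(self, path: str) -> dict:
        with open(path) as f:
            return json.load(f)

class TabSeparatedStore:
    # Legacy-style flat key/value persistence, kept as an option for researchers.
    def save(self, config: dict, path: str) -> None:
        with open(path, "w") as f:
            for key, value in config.items():
                f.write(f"{key}\t{value}\n")

    def load(self, path: str) -> dict:
        with open(path) as f:
            return dict(line.rstrip("\n").split("\t", 1) for line in f if line.strip())

def persist(config: dict, store: ModelConfigStore, path: str) -> None:
    # Swapping the store (e.g. for a relational backend with audit trails)
    # requires no change to the calling code or the data model.
    store.save(config, path)

persist({"subarea_count": "12", "routing": "node-link"}, JsonStore(), "config.json")
```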

Relevance:

70.00%

Publisher:

Abstract:

Recently, two international standards organizations, ISO and OGC, have carried out standardization work for GIS. Current standardization work for providing interoperability among GIS databases focuses on the design of open interfaces, but it has not considered procedures and methods for designing river geospatial data. Consequently, river geospatial data has its own model, and when the data are shared through open interfaces among heterogeneous GIS databases, differences between models result in the loss of information. In this study a plan was suggested both to respond to these changes in the information environment and to provide a future Smart River-based river information service, by reviewing the current state of the river geospatial data model and by improving and redesigning the database. Primary and foreign keys, which distinguish attribute information and entity linkages, were redefined to increase usability. The database structure for attribute information and the entity-relationship diagram were also redefined to redesign the linkages among tables from the perspective of a river standard database. In addition, this study was undertaken to expand the current supplier-oriented operating system into a demand-oriented operating system by establishing efficient management of river-related information and a utilization system capable of adapting to changes in the river management paradigm.
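
The redesigned schema itself is not given in the abstract; as a hedged illustration of the primary/foreign-key idea, linking a river entity table to an attribute table could look like the following sketch, where the table and column names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforce referential integrity in SQLite

# Hypothetical river entity table with an explicit primary key.
conn.execute("""
    CREATE TABLE river_reach (
        reach_id   TEXT PRIMARY KEY,
        reach_name TEXT NOT NULL
    )
""")

# Hypothetical attribute table linked to the entity via a foreign key,
# so attribute information and entity linkages remain clearly distinguished.
conn.execute("""
    CREATE TABLE reach_attribute (
        attribute_id INTEGER PRIMARY KEY AUTOINCREMENT,
        reach_id     TEXT NOT NULL REFERENCES river_reach(reach_id),
        name         TEXT NOT NULL,
        value        TEXT
    )
""")

conn.execute("INSERT INTO river_reach VALUES (?, ?)", ("R001", "Example reach"))
conn.execute(
    "INSERT INTO reach_attribute (reach_id, name, value) VALUES (?, ?, ?)",
    ("R001", "mean_width_m", "42.0"),
)
conn.commit()
```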

Relevance:

70.00%

Publisher:

Abstract:

Linked data offers a promising setting in which to encode, publish and share metadata about resources. It has already been adopted by data producers such as the European Environment Agency and the US and some EU governments, whose first ambition is to share (meta)data to make their processes more effective and transparent. This increasing interest and involvement of data providers is a genuine testament to the success of the web of data, but in the longer term, frameworks supporting linked data consumers in their decision-making processes will be a compelling need. In this respect, the talk introduces SSONDE, a framework enabling detailed comparison, ranking and selection of linked data resources through the analysis of their RDF, ontology-driven metadata. SSONDE implements an instance similarity measure designed specifically to support resource selection, namely the process stakeholders engage in to choose a set of resources suitable for a given analysis purpose: (i) it deploys an asymmetric similarity assessment to emphasize information about the gains and losses stakeholders incur by adopting one resource in place of another; (ii) it relies on an explicit formalization of contexts to tailor the similarity assessment to specific user-defined selection goals. The talk aims to provide insight into the SSONDE instance similarity and briefly describes some examples of SSONDE deployment in the context of linked data consumption.
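
SSONDE's actual measure is not reproduced here. As a hedged, Tversky-style stand-in, an asymmetric similarity over sets of metadata features that makes the gains and losses of swapping one resource for another explicit, and that can be restricted to a selection context, might be sketched as:

```python
def asymmetric_similarity(current: set[str], candidate: set[str],
                          alpha: float = 1.0, beta: float = 0.5) -> float:
    """Tversky-style asymmetric similarity between feature sets.

    Illustrative stand-in, not SSONDE's actual measure. 'alpha' weights
    features lost by moving away from 'current', 'beta' weights features
    gained from 'candidate'; unequal weights make the measure asymmetric.
    """
    common = len(current & candidate)
    losses = len(current - candidate)   # what the stakeholder would give up
    gains = len(candidate - current)    # what the stakeholder would gain
    denom = common + alpha * losses + beta * gains
    return common / denom if denom else 0.0

def contextual_similarity(current: set[str], candidate: set[str],
                          context: set[str]) -> float:
    # Tailor the assessment by restricting features to a user-defined selection goal.
    return asymmetric_similarity(current & context, candidate & context)

print(contextual_similarity({"co2", "no2", "pm10"}, {"co2", "o3"}, {"co2", "no2", "o3"}))
```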

Relevance:

70.00%

Publisher:

Abstract:

Background: One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide the meta-data necessary for a scientist to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study in which we analysed human metabolite variation by workflows. Results: We present the application of the workflow-centric RO model to our bioinformatics case study. Three workflows were produced following recently defined Best Practices for workflow design. By modelling the experiment as an RO, we were able to automatically query the experiment and answer questions such as “which particular data was input to a particular workflow to test a particular hypothesis?” and “which particular conclusions were drawn from a particular workflow?”. Conclusions: Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data. The RO model is an extendable reference model that can be used by other systems as well.
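
A hedged sketch, not the RO model's actual RDF vocabularies or API: once an experiment is represented as an aggregation of annotated resources, questions like "which data was input to which workflow" become simple queries over the aggregation.

```python
# Toy aggregation of resources with simple annotations, standing in for the
# RDF-based Research Object vocabularies; identifiers are illustrative only.
research_object = {
    "resources": [
        {"id": "workflow-1", "type": "workflow",
         "annotations": {"tests_hypothesis": "example hypothesis"}},
        {"id": "dataset-A", "type": "dataset",
         "annotations": {"input_to": "workflow-1"}},
        {"id": "conclusion-1", "type": "conclusion",
         "annotations": {"drawn_from": "workflow-1"}},
    ]
}

def inputs_of(ro: dict, workflow_id: str) -> list[str]:
    """Which data were input to a particular workflow?"""
    return [r["id"] for r in ro["resources"]
            if r["type"] == "dataset" and r["annotations"].get("input_to") == workflow_id]

def conclusions_of(ro: dict, workflow_id: str) -> list[str]:
    """Which conclusions were drawn from a particular workflow?"""
    return [r["id"] for r in ro["resources"]
            if r["type"] == "conclusion" and r["annotations"].get("drawn_from") == workflow_id]

print(inputs_of(research_object, "workflow-1"), conclusions_of(research_object, "workflow-1"))
```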

Relevance:

70.00%

Publisher:

Abstract:

Every Argo data file submitted by a DAC for distribution on the GDAC has its format and data consistency checked by the Argo FileChecker. Two types of checks are applied: (1) format checks, which ensure that the file formats match the Argo standards precisely, and (2) data consistency checks, which are performed on a file after it passes the format checks. The consistency checks do not duplicate any of the quality control checks performed elsewhere; they can be thought of as “sanity checks” that ensure the data are consistent with each other. They enforce data standards and ensure that certain data values are reasonable and/or consistent with other information in the files. Examples of the “data standard” checks are the “mandatory parameters” defined for meta-data files and the technical parameter names in technical data files. Files with format or consistency errors are rejected by the GDAC and are not distributed. Less serious problems generate warnings, and the file is still distributed on the GDAC. Reference tables and data standards: many of the consistency checks involve comparing the data to the published reference tables and data standards. These tables are documented in the User’s Manual. (The FileChecker implements “text versions” of these tables.)
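
A hedged sketch of the two-stage checking pattern described above; the function names and the particular rules are illustrative and are not taken from the FileChecker's implementation. Format errors reject the file outright, consistency errors also reject, and less serious issues only raise warnings.

```python
def check_format(record: dict) -> list[str]:
    # Stage 1: format/structure errors; illustrative rules only.
    errors = []
    for field in ("PLATFORM_NUMBER", "DATA_TYPE"):
        if field not in record:
            errors.append(f"missing mandatory parameter: {field}")
    return errors

def check_consistency(record: dict) -> tuple[list[str], list[str]]:
    # Stage 2: "sanity checks", run only after the format checks pass.
    errors, warnings = [], []
    lat = record.get("LATITUDE")
    if lat is not None and not -90.0 <= lat <= 90.0:
        errors.append(f"latitude out of range: {lat}")
    if not record.get("PROJECT_NAME"):
        warnings.append("PROJECT_NAME is empty")
    return errors, warnings

def file_checker(record: dict) -> str:
    errors = check_format(record)
    warnings: list[str] = []
    if not errors:
        errors, warnings = check_consistency(record)
    if errors:
        return "REJECTED: " + "; ".join(errors)              # not distributed
    if warnings:
        return "ACCEPTED WITH WARNINGS: " + "; ".join(warnings)
    return "ACCEPTED"

print(file_checker({"PLATFORM_NUMBER": "1900001", "DATA_TYPE": "Argo profile", "LATITUDE": 12.5}))
```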

Relevance:

60.00%

Publisher:

Abstract:

Developed economies are moving from an economy of corporations to an economy of people. More than ever, people produce and share value amongst themselves, and create value for corporations through co-creation and by sharing their data. This data remains in the hands of corporations and governments, but people want to regain control. Digital identity 3.0 gives people that control, and much more. In this paper we describe a concept for a digital identity platform that goes substantially beyond common concepts that provide authentication services. Instead, the notion of digital identity 3.0 empowers people to decide who creates, updates, reads and deletes their data, and to bring their own data into interactions with organisations, governments and peers. To the extent that the user allows, this data is updated and expanded based on automatic, integrated and predictive learning, enabling trusted third-party providers (e.g., retailers, banks, the public sector) to proactively provide services. Consumers can also add desired meta-data and attribute values to their digital identity, allowing them to design their own personal data record and to facilitate individualised experiences. We discuss the essential features of digital identity 3.0, reflect on relevant stakeholders and outline possible usage scenarios in selected industries.
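
As a hedged, minimal sketch of the user-controlled create/update/read/delete idea (the platform itself is only described conceptually in the paper, and all names below are hypothetical), a personal data record could gate access per party and per operation:

```python
class PersonalDataRecord:
    """Toy model of a user-controlled data record; names are illustrative only."""

    def __init__(self, owner: str):
        self.owner = owner
        self._data: dict[str, str] = {}
        # permissions[party] is the set of operations the owner has granted.
        self._permissions: dict[str, set[str]] = {}

    def grant(self, party: str, operations: set[str]) -> None:
        # Only the owner decides who may create, read, update or delete.
        self._permissions[party] = operations

    def _allowed(self, party: str, operation: str) -> bool:
        return party == self.owner or operation in self._permissions.get(party, set())

    def read(self, party: str, key: str) -> str:
        if not self._allowed(party, "read"):
            raise PermissionError(f"{party} may not read {key}")
        return self._data[key]

    def update(self, party: str, key: str, value: str) -> None:
        if not self._allowed(party, "update"):
            raise PermissionError(f"{party} may not update {key}")
        self._data[key] = value

record = PersonalDataRecord(owner="alice")
record.update("alice", "preferred_store", "example retailer")
record.grant("bank", {"read"})
print(record.read("bank", "preferred_store"))
```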

Relevance:

60.00%

Publisher:

Abstract:

A nursery site for the Alaska skate (Bathyraja parmifera) was sampled seasonally from June 2004 to July 2005. At the small nursery site (~2 km²), located in a highly productive area near the shelf-slope interface at the head of Bering Canyon in the eastern Bering Sea, reproductive males and females dominated the catch, and neonate and juvenile skates were rare. Seasonal samples showed summertime (June and July) to be the peak reproductive period in the nursery, although some reproduction occurred throughout the year. Time-series analysis of embryo length frequencies revealed that three cohorts were developing simultaneously; the period of embryonic development was estimated at 3.5 years and the average embryo growth rate at 0.2 mm/day. Estimated egg case deposition occurred mainly during summertime and hatching occurred during the winter months. Protracted hatching times may be common for oviparous elasmobranch species and may be directly correlated with ambient temperatures, as evident from a meta-data analysis. Evidence indicates that the Alaska skate uses the eastern Bering Sea outer continental shelf region for reproduction and the middle and inner shelf regions as habitat for immature and subadult individuals. Skate nurseries may be vulnerable to disturbance because they are located in highly productive areas and because embryos develop slowly.