8 results for Meta Data, Semantic Web, Software Maintenance, Software Metrics

in Archivo Digital para la Docencia y la Investigación - Repositorio Institucional de la Universidad del País Vasco


Relevance: 100.00%

Abstract:

Background: Two distinct trends are emerging with respect to how data is shared, collected, and analyzed within the bioinformatics community. First, Linked Data, exposed as SPARQL endpoints, promises to make data easier to collect and integrate by moving towards the harmonization of data syntax, descriptive vocabularies, and identifiers, as well as providing a standardized mechanism for data access. Second, Web Services, often linked together into workflows, normalize data access and create transparent, reproducible scientific methodologies that can, in principle, be re-used and customized to suit new scientific questions. Constructing queries that traverse semantically-rich Linked Data requires substantial expertise, yet traditional RESTful or SOAP Web Services cannot adequately describe the content of a SPARQL endpoint. We propose that content-driven Semantic Web Services can enable facile discovery of Linked Data, independent of their location. Results: We use a well-curated Linked Dataset - OpenLifeData - and utilize its descriptive metadata to automatically configure a series of more than 22,000 Semantic Web Services that expose all of its content via the SADI set of design principles. The OpenLifeData SADI services are discoverable via queries to the SHARE registry and easy to integrate into new or existing bioinformatics workflows and analytical pipelines. We demonstrate the utility of this system through comparison of Web Service-mediated data access with traditional SPARQL, and note that this approach not only simplifies data retrieval, but simultaneously provides protection against resource-intensive queries. Conclusions: We show, through a variety of different clients and examples of varying complexity, that data from the myriad OpenLifeData can be recovered without any need for prior knowledge of the content or structure of the SPARQL endpoints. We also demonstrate that, via clients such as SHARE, the complexity of federated SPARQL queries is dramatically reduced.

Relevance: 100.00%

Abstract:

The tool aims to integrate publicly accessible UPV/EHU data and to build bridges between different repositories of academic information. To that end, we make use of Linked Data, "the Semantic Web's way of linking the different data distributed across the Web", and the standards it defines. The repositories chosen for this work are ADDI, Bilatu, the pages of all UPV/EHU centres on the Gipuzkoa Campus, and DBLP. Most of the application's functionality is generic, so it could easily be applied to the repositories of other institutions. The system is a prototype that demonstrates the feasibility of the integration goal and is open to the incorporation of further datasets, following the same methodology used in the development of this project.

Relevance: 100.00%

Abstract:

205 p.

Relevance: 100.00%

Abstract:

Background: Protein inference from peptide identifications in shotgun proteomics must deal with ambiguities that arise from peptides shared between different proteins, which are common in higher eukaryotes. Recently, data-independent acquisition (DIA) approaches have emerged as an alternative to the traditional data-dependent acquisition (DDA) in shotgun proteomics experiments. MSE is the name of one of the DIA approaches used in QTOF instruments. MSE data require specialized software to process acquired spectra and to perform peptide and protein identifications. However, the software currently available does not group the identified proteins in a transparent way that takes peptide evidence categories into account. Furthermore, inspecting, comparing, and reporting the results requires tedious manual intervention. Here we report a software tool that addresses these limitations for MSE data. Results: In this paper we present PAnalyzer, a software tool focused on the protein inference process of shotgun proteomics. Our approach considers all the identified proteins and groups them when necessary, indicating their confidence using different evidence categories. PAnalyzer can read protein identification files in the XML output format of the ProteinLynx Global Server (PLGS) software provided by Waters Corporation for their MSE data, and also in the mzIdentML format recently standardized by HUPO-PSI. Multiple files can also be read simultaneously, in which case they are treated as technical replicates. Results are saved to CSV, HTML and mzIdentML (in the case of a single mzIdentML input file) files. An MSE analysis of a real sample is presented to compare the results of PAnalyzer and ProteinLynx Global Server. Conclusions: We present a software tool to deal with the ambiguities that arise in the protein inference process. Key contributions are support for MSE data analysis by ProteinLynx Global Server and the integration of technical replicates. PAnalyzer is an easy-to-use, multiplatform, free software tool.
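
The grouping idea in this abstract can be illustrated with a minimal sketch. It is not PAnalyzer's actual algorithm: the function name `group_proteins`, the three labels, and the sample identifiers are assumptions chosen for illustration. It only shows how shared peptides make some proteins harder to confirm than others.

```python
from collections import defaultdict


def group_proteins(protein_peptides):
    """Label each protein by the kind of peptide evidence supporting it.

    protein_peptides maps a protein id to the set of peptides identified
    for it. The labels used here are illustrative, not PAnalyzer's own.
    """
    # Invert the mapping: which proteins does each peptide point to?
    peptide_proteins = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            peptide_proteins[peptide].add(protein)

    labels = {}
    for protein, peptides in protein_peptides.items():
        # A peptide seen in exactly one protein is unambiguous evidence.
        has_unique = any(len(peptide_proteins[p]) == 1 for p in peptides)
        # Proteins with identical peptide sets cannot be told apart.
        has_twin = any(
            other != protein and other_peptides == peptides
            for other, other_peptides in protein_peptides.items()
        )
        if has_unique:
            labels[protein] = "conclusive"
        elif has_twin:
            labels[protein] = "indistinguishable"
        else:
            labels[protein] = "ambiguous"
    return labels


# Hypothetical identifications: P2 shares its only peptide with P1,
# while P3 and P4 are supported by exactly the same peptides.
example = {
    "P1": {"pepA", "pepB"},
    "P2": {"pepB"},
    "P3": {"pepC", "pepD"},
    "P4": {"pepC", "pepD"},
}
labels = group_proteins(example)
```

Treating technical replicates, as the abstract describes, would require merging the peptide sets from several files before labelling; that step is left out of this sketch.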

Relevance: 100.00%

Abstract:

More and more users aim to take advantage of the existing Linked Open Data environment by formulating a query over one dataset and then processing the same query over different datasets, one after another, in order to obtain a broader set of answers. However, the heterogeneity of the vocabularies used in the datasets on the one hand, and the scarcity of alignments among those datasets on the other, make that querying task difficult. Considering this scenario, we present in this paper a proposal that allows on-demand translation of queries formulated over an original dataset into queries expressed using the vocabulary of a targeted dataset. Our approach relieves users from knowing the vocabulary used in the targeted datasets and, moreover, considers situations where alignments do not exist or are not suitable for the formulated query. Therefore, in order to favour the possibility of getting answers, there is sometimes no guarantee of obtaining a semantically equivalent translation. The core component of our proposal is a query rewriting model that considers a set of transformation rules devised from a pragmatic point of view. The feasibility of our scheme has been validated with queries defined in well-known benchmarks and SPARQL endpoint logs, as the obtained results confirm.
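
To make the idea of rule-based query translation concrete, here is a minimal sketch under stated assumptions. It is not the rewriting model from the paper: the function `rewrite_query`, the two rule tables, and the `src:`/`tgt:` prefixes are hypothetical. A query is reduced to a list of triple patterns; one rule substitutes an aligned equivalent term, and a fallback rule substitutes a broader term, in which case the translation is flagged as not guaranteed to be semantically equivalent.

```python
def rewrite_query(patterns, equivalents, broader=None):
    """Translate triple patterns into a target vocabulary.

    patterns    -- list of (subject, predicate, object) strings
    equivalents -- source term -> equivalent target term (an alignment)
    broader     -- source term -> broader target term (fallback rule)
    Returns (rewritten_patterns, is_equivalent).
    """
    broader = broader or {}
    rewritten = []
    is_equivalent = True
    for triple in patterns:
        new_triple = []
        for term in triple:
            if term.startswith("?"):
                # Variables are independent of any vocabulary.
                new_triple.append(term)
            elif term in equivalents:
                # Rule 1: an alignment exists, substitute the equivalent term.
                new_triple.append(equivalents[term])
            elif term in broader:
                # Rule 2: no alignment, fall back to a broader term and
                # record that the result may return different answers.
                new_triple.append(broader[term])
                is_equivalent = False
            else:
                # No rule applies: keep the term unchanged
                # (for example shared vocabulary such as rdf:type).
                new_triple.append(term)
        rewritten.append(tuple(new_triple))
    return rewritten, is_equivalent


# Hypothetical source query and rule tables.
source_patterns = [
    ("?film", "rdf:type", "src:Movie"),
    ("?film", "src:director", "?person"),
]
translated, is_equivalent = rewrite_query(
    source_patterns,
    equivalents={"src:Movie": "tgt:Film"},
    broader={"src:director": "tgt:contributor"},
)
```

Returning the equivalence flag alongside the rewritten patterns mirrors the trade-off the abstract describes: an answer is preferred over no answer, but the caller can see when exact equivalence was given up.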

Relevance: 100.00%

Abstract:

Background: In recent years Galaxy has become a popular workflow management system in bioinformatics, due to its ease of installation, use and extension. The availability of Semantic Web-oriented tools in Galaxy, however, is limited. This is also the case for Semantic Web Services such as those provided by the SADI project, i.e. services that consume and produce RDF. Here we present SADI-Galaxy, a tool generator that deploys selected SADI Services as typical Galaxy tools. Results: SADI-Galaxy is a Galaxy tool generator: through SADI-Galaxy, any SADI-compliant service becomes a Galaxy tool that can participate in other outstanding features of Galaxy such as data storage, history, workflow creation, and publication. Galaxy can also be used to execute and combine SADI services as it does with other Galaxy tools. Finally, we have semi-automated the packing and unpacking of data into RDF such that other Galaxy tools can easily be combined with SADI services, plugging the rich SADI Semantic Web Service environment into the popular Galaxy ecosystem. Conclusions: SADI-Galaxy bridges the gap between Galaxy, an easy-to-use but "static" workflow system with a wide user base, and SADI, a sophisticated, semantic, discovery-based framework for Web Services, thus benefiting both user communities.
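
The packing and unpacking step mentioned above can be pictured with a small sketch. It does not reproduce SADI-Galaxy's code: the functions `pack_rows` and `unpack_triples`, the `example.org` IRIs, and the choice of N-Triples as the serialization are all assumptions. It handles only plain string literals without line breaks, which is enough to show tabular rows going into RDF and coming back out.

```python
import re

# One triple per line: <subject> <predicate> "plain literal" .
TRIPLE = re.compile(r'^<([^>]+)> <([^>]+)> "((?:[^"\\]|\\.)*)" \.$')


def pack_rows(rows, base_iri, predicate_iri):
    """Turn (identifier, value) rows into N-Triples text."""
    lines = []
    for identifier, value in rows:
        # Escape backslashes first, then double quotes.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{base_iri}{identifier}> <{predicate_iri}> "{escaped}" .')
    return "\n".join(lines)


def unpack_triples(ntriples, base_iri):
    """Recover (identifier, value) rows from N-Triples text."""
    rows = []
    for line in ntriples.splitlines():
        match = TRIPLE.match(line.strip())
        if match is None:
            continue  # skip lines this simple sketch does not understand
        subject, _predicate, literal = match.groups()
        if subject.startswith(base_iri):
            identifier = subject[len(base_iri):]
        else:
            identifier = subject
        # Drop the escaping backslash in front of each escaped character.
        value = re.sub(r"\\(.)", r"\1", literal)
        rows.append((identifier, value))
    return rows


# Hypothetical base IRI, predicate, and rows.
BASE = "http://example.org/protein/"
LABEL = "http://example.org/vocab/label"
table = [("P12345", "kinase"), ("Q9", 'says "hi"')]
packed = pack_rows(table, BASE, LABEL)
unpacked = unpack_triples(packed, BASE)
```

A production pipeline would more likely rely on an RDF library for parsing and serialization; the hand-written version is used here only to keep the example free of external dependencies.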

Relevance: 100.00%

Abstract:

The digital management of collections in museums, archives, libraries and galleries is an increasingly important part of cultural heritage studies. This paper describes a representation for folk song metadata, based on the Web Ontology Language (OWL) implementation of the CIDOC Conceptual Reference Model. The OWL representation facilitates encoding and reasoning over a genre ontology, while the CIDOC model enables a representation of complex spatial containment and proximity relations among geographic regions. It is shown how complex queries of folk song metadata, relying on inference and not only retrieval, can be expressed in OWL and solved using a description logic reasoner.
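
The difference between retrieval and inference in the last sentence can be shown with a minimal sketch. It is plain Python, not OWL and not a description logic reasoner, and the place names, genres, and song titles are invented for illustration. The query asks for songs of a genre collected within a region, and the answer depends on following the genre hierarchy and the spatial containment chain rather than on matching stored values directly.

```python
def ancestors(node, parent_of):
    """Return the node followed by everything reachable via parent links."""
    chain = []
    while node is not None and node not in chain:
        chain.append(node)
        node = parent_of.get(node)
    return chain


def find_songs(songs, genre, region, broader_genre, contained_in):
    """Songs whose genre falls under `genre` and whose place lies in `region`.

    songs maps a title to a (genre, place) pair. Both hierarchies are
    followed transitively, so a song recorded in a village matches a
    query about the country that contains the village.
    """
    return [
        title
        for title, (song_genre, place) in songs.items()
        if genre in ancestors(song_genre, broader_genre)
        and region in ancestors(place, contained_in)
    ]


# Hypothetical genre ontology and spatial containment relations.
broader_genre = {"cradle song": "lullaby", "lullaby": "song"}
contained_in = {"Village A": "Province B", "Province B": "Country C"}
songs = {
    "Song 1": ("cradle song", "Village A"),
    "Song 2": ("work song", "Province B"),
    "Song 3": ("lullaby", "Elsewhere"),
}
matches = find_songs(songs, "lullaby", "Country C", broader_genre, contained_in)
```

No record states that "Song 1" is a lullaby from "Country C"; the match follows only from the two hierarchies. The proximity relations mentioned in the abstract are not modelled here.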