948 results for Query-by-example
Abstract:
Query rewriting is one of the fundamental steps in ontology-based data access (OBDA) approaches. It takes as input an ontology and a query written according to that ontology, and produces as output a set of queries that must be evaluated to account for the inferences entailed by that query and ontology. Different query rewriting systems support different ontology languages of varying expressiveness, and the rewritten queries they produce also vary in expressiveness. This heterogeneity has traditionally made it difficult to compare different approaches, and the area generally lacks commonly agreed benchmarks that could be used not only for such comparisons but also for improving OBDA support. In this paper we compile data, dimensions and measurements that have been used to evaluate some of the most recent systems, we analyse and characterise these assets, and we provide a unified set of them that could be used as a starting point towards a more systematic benchmarking process for such systems. Finally, we apply this initial benchmark to some of the most relevant OBDA approaches in the state of the art.
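As a minimal sketch of the idea, assuming a toy ontology that contains only subclass axioms (the class names below are hypothetical and not taken from the paper), an atomic query over a class can be rewritten into a union of atomic queries over that class and all of its subclasses. Real rewriting systems handle far more expressive axioms than this.

```python
from collections import deque

# Hypothetical subclass axioms as (subclass, superclass) pairs.
# These names are illustrative only; they do not come from the paper.
SUBCLASS_OF = [
    ("Professor", "Teacher"),
    ("Lecturer", "Teacher"),
    ("FullProfessor", "Professor"),
]


def rewrite_atomic_query(target_class, axioms):
    """Return the classes whose instances answer a query for target_class.

    The result stands for a union of atomic queries, one per class.
    """
    rewritten = [target_class]
    seen = {target_class}
    queue = deque([target_class])
    while queue:
        current = queue.popleft()
        for sub, sup in axioms:
            # Every subclass of an already selected class also answers the query.
            if sup == current and sub not in seen:
                seen.add(sub)
                rewritten.append(sub)
                queue.append(sub)
    return rewritten


queries = rewrite_atomic_query("Teacher", SUBCLASS_OF)
```

Here `queries` lists the classes to be queried over the data source; the number of rewritten queries grows with the size of the ontology, which is one reason the size of the output is a natural dimension to measure.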
Abstract:
Ontology-based data access (OBDA) systems use ontologies to provide views over relational databases. Most of these systems work with ontologies implemented in description logic families of reduced expressiveness, which allows efficient query rewriting techniques to be applied for query answering. In this paper we describe a set of optimisations that are applicable to one of the most expressive families used in this context (ELHIO¬). The resulting system exhibits behaviour comparable to that of systems that handle less expressive logics.
Abstract:
A methodology is presented to determine both the short-term and the long-term influence of spectral variations on the performance of Multi-Junction (MJ) solar cells and Concentrating Photovoltaic (CPV) modules. Component cells with the same optical behavior as MJ solar cells are used to characterize the spectrum. A set of parameters, namely Spectral Matching Ratios (SMRs), is used to spectrally characterize a particular Direct Normal Irradiance (DNI) by comparison with the reference spectrum (AM1.5D-ASTM-G173-03). Furthermore, the spectrally corrected DNI for a given MJ solar cell technology is defined, providing a way to estimate the losses associated with spectral variations. The last section analyzes how the spectrum evolves throughout a year in a given place, and the set of SMRs representative of that location is calculated. This information can be used to maximize the energy harvested by the MJ solar cell throughout the year. As an example, three years of data recorded in Madrid show that losses lower than 5% are expected due to current mismatch for state-of-the-art MJ solar cells. "This is the peer reviewed version of the following article: R. Núñez, C. Domínguez, S. Askins, M. Victoria, R. Herrero, I. Antón, and G. Sala, “Determination of spectral variations by means of component cells useful for CPV rating and design,” Prog. Photovolt: Res. Appl., 2015, which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1002/pip.2715/full. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving [http://olabout.wiley.com/WileyCDA/Section/id-820227.html#terms]."
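A minimal sketch of how such a ratio can be computed, assuming the commonly used definition in which each component-cell photocurrent is first normalised by its value under the reference spectrum (this definition is an assumption here, not quoted from the abstract, and the currents below are hypothetical values):

```python
def spectral_matching_ratio(i_a, i_a_ref, i_b, i_b_ref):
    """Ratio of the normalised photocurrents of two component cells.

    i_a, i_b         -- currents measured under the actual spectrum
    i_a_ref, i_b_ref -- currents of the same cells under the reference spectrum
    """
    return (i_a / i_a_ref) / (i_b / i_b_ref)


# Hypothetical currents (mA) for a top and a middle component cell.
smr_top_mid = spectral_matching_ratio(i_a=12.6, i_a_ref=14.0, i_b=13.3, i_b_ref=14.0)

# Under the reference spectrum itself the ratio is 1 by construction.
smr_reference = spectral_matching_ratio(14.0, 14.0, 14.0, 14.0)
```

With this definition, a value below 1 means the measured spectrum delivers relatively less current to the first cell than the reference spectrum does, and a value above 1 means relatively more.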
Abstract:
RDB to RDF Mapping Language (R2RML) is a W3C recommendation that allows specifying rules for transforming relational databases into RDF. This RDF data can be materialized and stored in a triple store, so that SPARQL queries can be evaluated by the triple store. However, there are several cases where materialization is not adequate or possible, for example, if the underlying relational database is updated frequently. In those cases, RDF data is better kept virtual, and hence SPARQL queries over it have to be translated into SQL queries to the underlying relational database system, in a translation process that takes into account the specified R2RML mappings. The first part of this thesis focuses on query translation. We discuss the formalization of the translation from SPARQL to SQL queries that takes into account R2RML mappings. Furthermore, we propose several optimization techniques so that the translation procedure generates SQL queries that can be evaluated more efficiently over the underlying databases. We evaluate our approach using a synthetic benchmark and several real cases, and obtain positive results. Direct Mapping (DM) is another W3C recommendation for the generation of RDF data from relational databases.
While R2RML allows users to specify their own transformation rules, DM establishes fixed transformation rules. Although both recommendations were published at the same time, in September 2012, no formal study of the relationship between them has yet been carried out. The second part of this thesis focuses on the study of the relationship between R2RML and DM. We divide this study into two directions: from R2RML to DM, and from DM to R2RML. From R2RML to DM, we study a fragment of R2RML that has the same expressive power as DM. From DM to R2RML, we represent DM transformation rules as R2RML mappings, and also add the implicit semantics encoded in databases, such as subclass, 1-N and M-N relationships. This thesis shows that by formalizing and optimizing R2RML-based SPARQL to SQL query translation, it is possible to use R2RML engines in real cases, as the resulting SQL is efficient enough to be evaluated by the underlying relational databases. In addition, this thesis facilitates the understanding of the bidirectional relationship between the two W3C recommendations, something that had not been studied before.
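The translation step can be illustrated with a deliberately simplified sketch. It assumes a hypothetical mapping table that links each predicate IRI to one relational table and two columns; this is only a stand-in for real R2RML triples maps, which also define how IRIs are built from column values. The table, column and IRI names are invented for illustration and the thesis's actual algorithm is not reproduced here.

```python
# Hypothetical mapping: predicate IRI -> (table, subject column, object column).
# A greatly simplified stand-in for R2RML triples maps.
MAPPINGS = {
    "http://example.org/name": ("person", "id", "name"),
    "http://example.org/worksFor": ("employment", "person_id", "company_id"),
}


def translate_triple_pattern(predicate, mappings):
    """Translate the SPARQL pattern `?s <predicate> ?o` into a SQL string."""
    table, subject_col, object_col = mappings[predicate]
    # Rows whose object column is NULL produce no triple, so they are filtered out.
    return (
        f"SELECT {subject_col} AS s, {object_col} AS o "
        f"FROM {table} WHERE {object_col} IS NOT NULL"
    )


sql = translate_triple_pattern("http://example.org/name", MAPPINGS)
```

Because the SQL is generated on demand, the relational data never has to be materialized as RDF, which is the virtual approach described above.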
Abstract:
Ontology-Based Data Access (OBDA) allows accessing different kinds of data sources (traditionally databases) using a more abstract model provided by an ontology. Query rewriting uses such an ontology to rewrite a query into a rewritten query that can be evaluated on the data source. The rewritten queries retrieve the answers that are entailed by the combination of the data explicitly stored in the data source, the original query and the ontology. By working only on the queries, query rewriting enables OBDA over any data source that can be queried, regardless of whether it can be modified. However, producing and evaluating the rewritten queries are both costly processes that generally become more complex as the expressiveness and size of the ontology and queries increase. In this thesis we explore several optimisations that can be performed both in the rewriting process and in the rewritten queries to improve the applicability of OBDA in real contexts. Our main technical contribution is a query rewriting system that implements the optimisations presented in this thesis. These optimisations are the core contributions of the thesis and fall into three groups: (i) optimisations that can be applied when considering the predicates in the ontology that are actually mapped to the data sources; (ii) engineering optimisations that handle the query rewriting process in a way that reduces the computational load of generating the rewritten queries; (iii) optimisations that can be applied when considering additional metainformation about the characteristics of the ABox.
In this thesis we provide formal proofs of the correctness and completeness of the proposed optimisations, and an empirical evaluation of their impact. As an additional contribution, part of this empirical approach, we propose a benchmark for the evaluation of query rewriting systems. We also provide some guidelines for the creation and expansion of this kind of benchmark.
Abstract:
On-line partial discharge (PD) measurements have become a common technique for assessing the insulation condition of installed high voltage (HV) insulated cables. When on-line tests are performed in noisy environments, or when more than one source of pulse-shaped signals is present in a cable system, it is difficult to perform accurate diagnoses. In these cases, an adequate selection of the non-conventional measuring technique and the implementation of effective signal processing tools are essential for a correct evaluation of the insulation degradation. Once a specific noise rejection filter is applied, many signals can be identified as potential PD pulses; therefore, a classification tool to discriminate the PD sources involved is required. This paper proposes an efficient method for the classification of PD signals and pulse-type noise interferences measured in power cables with high frequency current transformer (HFCT) sensors. By using a signal feature generation algorithm, representative parameters associated with the waveform of each acquired pulse are calculated so that the pulses can be separated into different clusters. The efficiency of the proposed clustering technique is demonstrated through an example with three different PD sources and several pulse-shaped interferences measured simultaneously in a cable system with an HFCT.
Abstract:
Import of DNA into mammalian nuclei is generally inefficient. Therefore, one of the current challenges in human gene therapy is the development of efficient DNA delivery systems. Here we tested whether bacterial proteins could be used to target DNA to mammalian cells. Agrobacterium tumefaciens, a plant pathogen, efficiently transfers DNA as a nucleoprotein complex to plant cells. Agrobacterium-mediated T-DNA transfer to plant cells is the only known example of interkingdom DNA transfer and is widely used for plant transformation. Agrobacterium virulence proteins VirD2 and VirE2 perform important functions in this process. We reconstituted complexes consisting of the bacterial virulence proteins VirD2, VirE2, and single-stranded DNA (ssDNA) in vitro. These complexes were tested for import into HeLa cell nuclei. Import of ssDNA required both VirD2 and VirE2 proteins. A VirD2 mutant lacking its C-terminal nuclear localization signal was deficient in import of the ssDNA–protein complexes into nuclei. Import of VirD2–ssDNA–VirE2 complexes was fast and efficient, and was shown to depend on importin α, Ran, and an energy source. We report here that the bacterium-derived and plant-adapted protein–DNA complex, made in vitro, can be efficiently imported into mammalian nuclei following the classical importin-dependent nuclear import pathway. This demonstrates the potential of our approach to enhance gene transfer to animal cells.
Abstract:
Epithelial Na+ channels are expressed widely in absorptive epithelia such as the renal collecting duct and the colon and play a critical role in fluid and electrolyte homeostasis. Recent studies have shown that these channels interact via PY motifs in the C terminals of their α, β, and γ subunits with the WW domains of the ubiquitin-protein ligase Nedd4. Mutation or deletion of these PY motifs (as occurs, for example, in the heritable form of hypertension known as Liddle’s syndrome) leads to increased Na+ channel activity. Thus, binding of Nedd4 by the PY motifs would appear to be part of a physiological control system for down-regulation of Na+ channel activity. The nature of this control system is, however, unknown. In the present paper, we show that Nedd4 mediates the ubiquitin-dependent down-regulation of Na+ channel activity in response to increased intracellular Na+. We further show that Nedd4 operates downstream of Go in this feedback pathway. We find, however, that Nedd4 is not involved in the feedback control of Na+ channels by intracellular anions. Finally, we show that Nedd4 has no influence on Na+ channel activity when the Na+ and anion feedback systems are inactive. We conclude that Nedd4 normally mediates feedback control of epithelial Na+ channels by intracellular Na+, and we suggest that the increased Na+ channel activity observed in Liddle’s syndrome is attributable to the loss of this regulatory feedback system.
Abstract:
Linkage disequilibrium analysis can provide high resolution in the mapping of disease genes because it incorporates information on recombinations that have occurred during the entire period from the mutational event to the present. A circumstance particularly favorable for high-resolution mapping is when a single founding mutation segregates in an isolated population. We review here the population structure of Finland in which a small founder population some 100 generations ago has expanded into 5.1 million people today. Among the 30-odd autosomal recessive disorders that are more prevalent in Finland than elsewhere, several appear to have segregated for this entire period in the “panmictic” southern Finnish population. Linkage disequilibrium analysis has allowed precise mapping and determination of genetic distances at the 0.1-cM level in several of these disorders. Estimates of genetic distance have proven accurate, but previous calculations of the confidence intervals were too small because sampling variation was ignored. In the north and east of Finland the population can be viewed as having been “founded” only after 1500. Disease mutations that have undergone such a founding bottleneck only 20 or so generations ago exhibit linkage disequilibrium and haplotype sharing over long genetic distances (5–15 cM). These features have been successfully exploited in the mapping and cloning of many genes. We review the statistical issues of fine mapping by linkage disequilibrium and suggest that improved methodologies may be necessary to map diseases of complex etiology that may have arisen from multiple founding mutations.
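The contrast between the old and the young founder populations can be illustrated with a textbook approximation, which is an assumption of this sketch rather than the statistical method reviewed in the paper: the expected fraction of disease chromosomes that still carry the ancestral marker allele is (1 − θ)^g, where θ is the recombination fraction per generation and g the number of generations since founding. The sketch ignores mutation, drift and population growth, and approximates θ as the distance in cM divided by 100, which is only reasonable for short distances.

```python
def ancestral_haplotype_fraction(distance_cm, generations):
    """Expected fraction of disease chromosomes still carrying the founder allele."""
    theta = distance_cm / 100.0  # approximation: 1 cM ~ 1% recombination per generation
    return (1.0 - theta) ** generations


# Old founding event (about 100 generations), marker very close to the gene.
old_close = ancestral_haplotype_fraction(0.1, 100)

# Old founding event, marker 5 cM away: the association has almost vanished.
old_far = ancestral_haplotype_fraction(5.0, 100)

# Young founding event (about 20 generations), marker 5 cM away.
young_far = ancestral_haplotype_fraction(5.0, 20)
```

Under these assumptions, roughly 90% of chromosomes keep the founder allele at 0.1 cM after 100 generations but almost none do at 5 cM, whereas after 20 generations about a third still do at 5 cM. This is consistent with fine-scale mapping in the older population and long-range haplotype sharing in the younger one.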
Abstract:
Temporal patterning of biological variables, in the form of oscillations and rhythms on many time scales, is ubiquitous. Altering the temporal pattern of an input variable greatly affects the output of many biological processes. We develop here a conceptual framework for a quantitative understanding of such pattern dependence, focusing particularly on nonlinear, saturable, time-dependent processes that abound in biophysics, biochemistry, and physiology. We show theoretically that pattern dependence is governed by the nonlinearity of the input–output transformation as well as its time constant. As a result, only patterns on certain time scales permit the expression of pattern dependence, and processes with different time constants can respond preferentially to different patterns. This has implications for temporal coding and decoding, and allows differential control of processes through pattern. We show how pattern dependence can be quantitatively predicted using only information from steady, unpatterned input. To apply our ideas, we analyze, in an experimental example, how muscle contraction depends on the pattern of motorneuron firing.
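The interplay between nonlinearity and time constant can be made concrete with a toy model of our own choosing, which is not the framework developed in the paper: the input is first smoothed by a first-order process with time constant `tau` and then passed through a saturating function z/(z + K). A bursty input and a steady input with the same mean are compared; all parameter values are hypothetical.

```python
def saturating(z, k=1.0):
    """Saturable input-output function (hypothetical choice)."""
    return z / (z + k)


def mean_output(tau, burst, periods=5, steps_per_period=1000):
    """Mean output over the last period for a bursty or a steady input.

    The period is 1 time unit. The bursty input is 4 during the first
    quarter of each period and 0 otherwise, so both inputs have mean 1.
    """
    dt = 1.0 / steps_per_period
    z = 1.0  # start the smoothed input at the mean to shorten the transient
    total = 0.0
    for step in range(periods * steps_per_period):
        phase = step % steps_per_period
        if burst:
            x = 4.0 if phase < steps_per_period // 4 else 0.0
        else:
            x = 1.0
        z += dt * (x - z) / tau  # forward Euler step of dz/dt = (x - z) / tau
        if step >= (periods - 1) * steps_per_period:
            total += saturating(z)
    return total / steps_per_period


steady = mean_output(tau=0.01, burst=False)      # unpatterned input
fast_burst = mean_output(tau=0.01, burst=True)   # tau much shorter than the period
slow_burst = mean_output(tau=100.0, burst=True)  # tau much longer than the period
```

In this toy model the fast process responds much less to the bursty input than to the steady one, because the saturating function is evaluated on the bursts themselves, while the slow process averages the input before the nonlinearity acts and so barely distinguishes the two patterns.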
Abstract:
Iron regulatory proteins (IRPs) are cytoplasmic RNA binding proteins that are central components of a sensory and regulatory network that modulates vertebrate iron homeostasis. IRPs regulate iron metabolism by binding to iron responsive element(s) (IREs) in the 5′ or 3′ untranslated region of ferritin or transferrin receptor (TfR) mRNAs. Two IRPs, IRP1 and IRP2, have been identified previously. IRP1 exhibits two mutually exclusive functions as an RNA binding protein or as the cytosolic isoform of aconitase. We demonstrate that the Ba/F3 family of murine pro-B lymphocytes represents the first example of a mammalian cell line that fails to express IRP1 protein or mRNA. First, all of the IRE binding activity in Ba/F3-gp55 cells is attributable to IRP2. Second, synthesis of IRP2, but not of IRP1, is detectable in Ba/F3-gp55 cells. Third, the Ba/F3 family of cells express IRP2 mRNA at a level similar to other murine cell lines, but IRP1 mRNA is not detectable. In the Ba/F3 family of cells, alterations in iron status modulated ferritin biosynthesis and TfR mRNA level over as much as a 20- and 14-fold range, respectively. We conclude that IRP1 is not essential for regulation of ferritin or TfR expression by iron and that IRP2 can act as the sole IRE-dependent mediator of cellular iron homeostasis.
Abstract:
By using a simplified model of small open liquid-like clusters with surface effects, in the gas phase, it is shown how the statistical thermodynamics of small systems can be extended to include metastable supersaturated gaseous states not too far from the gas–liquid equilibrium transition point. To accomplish this, one has to distinguish between mathematical divergence and physical convergence of the open-system partition function.
Abstract:
The nature of chaperone action in the eukaryotic cytosol that assists newly translated cytosolic proteins to reach the native state has remained poorly defined. Actin, tubulin, and Gα transducin are assisted by the cytosolic chaperonin, CCT, but many other proteins, for example, ornithine transcarbamoylase (OTC), a cytosolic homotrimeric enzyme of yeast, do not require CCT action. Here, we observe that yeast cytosolic OTC is assisted to its native state by the SSA class of yeast cytosolic Hsp70 proteins. In vitro, refolding of OTC diluted from denaturant was assisted by crude yeast cytosol and ATP and found to be directed by SSA1/2. In vivo, when OTC was induced in a temperature-sensitive SSA-deficient strain, it exhibited reduced specific activity, and nonnative subunits were detected in the soluble fraction. These findings indicate that, in vivo, the Hsp70 system assists in folding at least some newly translated cytosolic enzymes, most likely functioning in a posttranslational manner.
Abstract:
ETS transcription factors play important roles in hematopoiesis, angiogenesis, and organogenesis during murine development. The ETS genes also have a role in neoplasia, for example in Ewing’s sarcomas and retrovirally induced cancers. The ETS genes encode transcription factors that bind to specific DNA sequences and activate transcription of various cellular and viral genes. To isolate novel ETS target genes, we used two approaches. In the first approach, we isolated genes by the RNA differential display technique. Previously, we have shown that the overexpression of ETS1 and ETS2 genes effects transformation of NIH 3T3 cells and specific transformants produce high levels of the ETS proteins. To isolate ETS1 and ETS2 responsive genes in these transformed cells, we prepared RNA from ETS1, ETS2 transformants, and normal NIH 3T3 cell lines and converted it into cDNA. This cDNA was amplified by PCR and displayed on sequencing gels. The differentially displayed bands were subcloned into plasmid vectors. By Northern blot analysis, several clones showed differential patterns of mRNA expression in the NIH 3T3-, ETS1-, and ETS2-expressing cell lines. Sixteen clones were analyzed by DNA sequence analysis, and 13 of them appeared to be unique because their DNA sequences did not match with any of the known genes present in the gene bank. Three known genes were found to be identical to the CArG box binding factor, phospholipase A2-activating protein, and early growth response 1 (Egr1) genes. In the second approach, to isolate ETS target promoters directly, we performed ETS1 binding with MboI-cleaved genomic DNA in the presence of a specific mAb followed by whole genome PCR. The immune complex-bound ETS binding sites containing DNA fragments were amplified and subcloned into pBluescript and subjected to DNA sequence and computer analysis. We found that, of a large number of clones isolated, 43 represented unique sequences not previously identified. 
Three clones turned out to contain regulatory sequences derived from human serglycin, preproapolipoprotein C II, and Egr1 genes. The ETS binding sites derived from these three regulatory sequences showed specific binding with recombinant ETS proteins. Of interest, Egr1 was identified by both of these techniques, suggesting strongly that it is indeed an ETS target gene.
Abstract:
Growth of a glutamate transport-deficient mutant of Rhodobacter sphaeroides on glutamate as sole carbon and nitrogen source can be restored by the addition of millimolar amounts of Na+. Uptake of glutamate (Kt of 0.2 μM) by the mutant strictly requires Na+ (Km of 25 mM) and is inhibited by ionophores that collapse the proton motive force (pmf). The activity is osmotic-shock-sensitive and can be restored in spheroplasts by the addition of osmotic shock fluid. Transport of glutamate is also observed in membrane vesicles when Na+, a proton motive force, and purified glutamate binding protein are present. Both transport and binding are highly specific for glutamate. The Na+-dependent glutamate transporter of Rb. sphaeroides is an example of a secondary transport system that requires a periplasmic binding protein and may define a new family of bacterial transport proteins.
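To give a feel for what a Km of 25 mM for Na+ implies, the sketch below assumes a simple hyperbolic (Michaelis–Menten type) dependence of the uptake rate on Na+ concentration. The abstract reports only the Km value, so the functional form is an assumption of this example.

```python
def relative_uptake_rate(na_mm, km_mm=25.0):
    """Uptake rate as a fraction of the maximal rate, assuming hyperbolic kinetics."""
    return na_mm / (km_mm + na_mm)


at_km = relative_uptake_rate(25.0)      # Na+ equal to Km
low_na = relative_uptake_rate(2.5)      # one tenth of Km
high_na = relative_uptake_rate(250.0)   # ten times Km
```

Under this assumption the rate is half-maximal at 25 mM Na+, about 9% of maximal at 2.5 mM and about 91% at 250 mM, which fits the observation that millimolar amounts of Na+ are needed to restore growth.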