977 resultados para approximate KNN query


Relevância:

20.00% 20.00%

Publicador:

Resumo:

La seguridad verificada es una metodología para demostrar propiedades de seguridad de los sistemas informáticos que se destaca por las altas garantías de corrección que provee. Los sistemas informáticos se modelan como programas probabilísticos y para probar que verifican una determinada propiedad de seguridad se utilizan técnicas rigurosas basadas en modelos matemáticos de los programas. En particular, la seguridad verificada promueve el uso de demostradores de teoremas interactivos o automáticos para construir demostraciones completamente formales cuya corrección es certificada mecánicamente (por ordenador). La seguridad verificada demostró ser una técnica muy efectiva para razonar sobre diversas nociones de seguridad en el área de criptografía. Sin embargo, no ha podido cubrir un importante conjunto de nociones de seguridad “aproximada”. La característica distintiva de estas nociones de seguridad es que se expresan como una condición de “similitud” entre las distribuciones de salida de dos programas probabilísticos y esta similitud se cuantifica usando alguna noción de distancia entre distribuciones de probabilidad. Este conjunto incluye destacadas nociones de seguridad de diversas áreas como la minería de datos privados, el análisis de flujo de información y la criptografía. Ejemplos representativos de estas nociones de seguridad son la indiferenciabilidad, que permite reemplazar un componente idealizado de un sistema por una implementación concreta (sin alterar significativamente sus propiedades de seguridad), o la privacidad diferencial, una noción de privacidad que ha recibido mucha atención en los últimos años y tiene como objetivo evitar la publicación datos confidenciales en la minería de datos. La falta de técnicas rigurosas que permitan verificar formalmente este tipo de propiedades constituye un notable problema abierto que tiene que ser abordado. En esta tesis introducimos varias lógicas de programa quantitativas para razonar sobre esta clase de propiedades de seguridad. Nuestra principal contribución teórica es una versión quantitativa de una lógica de Hoare relacional para programas probabilísticos. Las pruebas de correción de estas lógicas son completamente formalizadas en el asistente de pruebas Coq. Desarrollamos, además, una herramienta para razonar sobre propiedades de programas a través de estas lógicas extendiendo CertiCrypt, un framework para verificar pruebas de criptografía en Coq. Confirmamos la efectividad y aplicabilidad de nuestra metodología construyendo pruebas certificadas por ordendor de varios sistemas cuyo análisis estaba fuera del alcance de la seguridad verificada. Esto incluye, entre otros, una meta-construcción para diseñar funciones de hash “seguras” sobre curvas elípticas y algoritmos diferencialmente privados para varios problemas de optimización combinatoria de la literatura reciente. ABSTRACT The verified security methodology is an emerging approach to build high assurance proofs about security properties of computer systems. Computer systems are modeled as probabilistic programs and one relies on rigorous program semantics techniques to prove that they comply with a given security goal. In particular, it advocates the use of interactive theorem provers or automated provers to build fully formal machine-checked versions of these security proofs. The verified security methodology has proved successful in modeling and reasoning about several standard security notions in the area of cryptography. However, it has fallen short of covering an important class of approximate, quantitative security notions. The distinguishing characteristic of this class of security notions is that they are stated as a “similarity” condition between the output distributions of two probabilistic programs, and this similarity is quantified using some notion of distance between probability distributions. This class comprises prominent security notions from multiple areas such as private data analysis, information flow analysis and cryptography. These include, for instance, indifferentiability, which enables securely replacing an idealized component of system with a concrete implementation, and differential privacy, a notion of privacy-preserving data mining that has received a great deal of attention in the last few years. The lack of rigorous techniques for verifying these properties is thus an important problem that needs to be addressed. In this dissertation we introduce several quantitative program logics to reason about this class of security notions. Our main theoretical contribution is, in particular, a quantitative variant of a full-fledged relational Hoare logic for probabilistic programs. The soundness of these logics is fully formalized in the Coq proof-assistant and tool support is also available through an extension of CertiCrypt, a framework to verify cryptographic proofs in Coq. We validate the applicability of our approach by building fully machine-checked proofs for several systems that were out of the reach of the verified security methodology. These comprise, among others, a construction to build “safe” hash functions into elliptic curves and differentially private algorithms for several combinatorial optimization problems from the recent literature.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Query rewriting is one of the fundamental steps in ontologybased data access (OBDA) approaches. It takes as inputs an ontology and a query written according to that ontology, and produces as an output a set of queries that should be evaluated to account for the inferences that should be considered for that query and ontology. Different query rewriting systems give support to different ontology languages with varying expressiveness, and the rewritten queries obtained as an output do also vary in expressiveness. This heterogeneity has traditionally made it difficult to compare different approaches, and the area lacks in general commonly agreed benchmarks that could be used not only for such comparisons but also for improving OBDA support. In this paper we compile data, dimensions and measurements that have been used to evaluate some of the most recent systems, we analyse and characterise these assets, and provide a unified set of them that could be used as a starting point towards a more systematic benchmarking process for such systems. Finally, we apply this initial benchmark with some of the most relevant OBDA approaches in the state of the art.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The problem of parameterizing approximately algebraic curves and surfaces is an active research field, with many implications in practical applications. The problem can be treated locally or globally. We formally state the problem, in its global version for the case of algebraic curves (planar or spatial), and we report on some algorithms approaching it, as well as on the associated error distance analysis.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

La obtención de materiales monofásicos con respuesta ferroeléctrica y (anti-)ferromagnética simultánea y acoplada resulta problemática debido a limitaciones intrínsecas de tipo físico, estructural y electrónico. En este sentido una alternativa más realista, y en cierto modo con mayor flexibilidad a la hora de diseñar futuros dispositivos multiferroicos, consiste en preparar materiales compuestos en los cuales el acoplamiento magnetoeléctrico se puede alcanzar explotando los efectos interfaciales entre fases disimilares. Tal es el caso de los materiales compuestos basados en BaTiO3 (fase ferroeléctrica) y NiFe2O4 (fase magnética), que ya se han empezado a preparar fundamentalmente por medio de técnicas de deposición altamente energéticas. Sin embargo de cara a su aplicación práctica, sería interesante poder preparar esos materiales por métodos más sostenibles y menos costosos. De acuerdo con ello, en este trabajo se presenta un estudio preliminar en torno a la evolución microestructural experimentada por los materiales basados en NiFe2O4-BaTiO3 cuando son preparados mediante una técnica de procesamiento suave en disolución como es la síntesis hidrotermal. En concreto se ha analizado la influencia que diversos parámetros característicos del procesamiento hidrotermal pueden tener sobre la generación y distribución de fases e interfases durante la posterior consolidación térmica de estos materiales compuestos.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Multilayered, counterflow, parallel-plate heat exchangers are analyzed numerically and theoretically. The analysis, carried out for constant property fluids, considers a hydrodynamically developed laminar flow and neglects longitudinal conduction both in the fluid and in the plates. The solution for the temperature field involves eigenfunction expansions that can be solved in terms of Whittaker functions using standard symbolic algebra packages, leading to analytical expressions that provide the eigenvalues numerically. It is seen that the approximate solution obtained by retaining the first two modes in the eigenfunction expansion provides an accurate representation for the temperature away from the entrance regions, specially for long heat exchangers, thereby enabling simplified expressions for the wall and bulk temperatures, local heat-transfer rate, overall heat-transfer coefficient, and outlet bulk temperatures. The agreement between the numerical and theoretical results suggests the possibility of using the analytical solutions presented herein as benchmark problems for computational heat-transfer codes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Los fundamentos de la Teoría de la Decisión Bayesiana proporcionan un marco coherente en el que se pueden resolver los problemas de toma de decisiones. La creciente disponibilidad de ordenadores potentes está llevando a tratar problemas cada vez más complejos con numerosas fuentes de incertidumbre multidimensionales; varios objetivos conflictivos; preferencias, metas y creencias cambiantes en el tiempo y distintos grupos afectados por las decisiones. Estos factores, a su vez, exigen mejores herramientas de representación de problemas; imponen fuertes restricciones cognitivas sobre los decisores y conllevan difíciles problemas computacionales. Esta tesis tratará estos tres aspectos. En el Capítulo 1, proporcionamos una revisión crítica de los principales métodos gráficos de representación y resolución de problemas, concluyendo con algunas recomendaciones fundamentales y generalizaciones. Nuestro segundo comentario nos lleva a estudiar tales métodos cuando sólo disponemos de información parcial sobre las preferencias y creencias del decisor. En el Capítulo 2, estudiamos este problema cuando empleamos diagramas de influencia (DI). Damos un algoritmo para calcular las soluciones no dominadas en un DI y analizamos varios conceptos de solución ad hoc. El último aspecto se estudia en los Capítulos 3 y 4. Motivado por una aplicación de gestión de embalses, introducimos un método heurístico para resolver problemas de decisión secuenciales. Como muestra resultados muy buenos, extendemos la idea a problemas secuenciales generales y cuantificamos su bondad. Exploramos después en varias direcciones la aplicación de métodos de simulación al Análisis de Decisiones. Introducimos primero métodos de Monte Cario para aproximar el conjunto no dominado en problemas continuos. Después, proporcionamos un método de Monte Cario basado en cadenas de Markov para problemas con información completa con estructura general: las decisiones y las variables aleatorias pueden ser continuas, y la función de utilidad puede ser arbitraria. Nuestro esquema es aplicable a muchos problemas modelizados como DI. Finalizamos con un capítulo de conclusiones y problemas abiertos.---ABSTRACT---The foundations of Bayesian Decisión Theory provide a coherent framework in which decisión making problems may be solved. With the advent of powerful computers and given the many challenging problems we face, we are gradually attempting to solve more and more complex decisión making problems with high and multidimensional uncertainty, múltiple objectives, influence of time over decisión tasks and influence over many groups. These complexity factors demand better representation tools for decisión making problems; place strong cognitive demands on the decison maker judgements; and lead to involved computational problems. This thesis will deal with these three topics. In recent years, many representation tools have been developed for decisión making problems. In Chapter 1, we provide a critical review of most of them and conclude with recommendations and generalisations. Given our second query, we could wonder how may we deal with those representation tools when there is only partial information. In Chapter 2, we find out how to deal with such a problem when it is structured as an influence diagram (ID). We give an algorithm to compute nondominated solutions in ID's and analyse several ad hoc solution concepts.- The last issue is studied in Chapters 3 and 4. In a reservoir management case study, we have introduced a heuristic method for solving sequential decisión making problems. Since it shows very good performance, we extend the idea to general problems and quantify its goodness. We explore then in several directions the application of simulation based methods to Decisión Analysis. We first introduce Monte Cario methods to approximate the nondominated set in continuous problems. Then, we provide a Monte Cario Markov Chain method for problems under total information with general structure: decisions and random variables may be continuous, and the utility function may be arbitrary. Our scheme is applicable to many problems modeled as IDs. We conclude with discussions and several open problems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we study query answering and rewriting in ontologybased data access. Specifically, we present an algorithm for computing a perfect rewriting of unions of conjunctive queries posed over ontologies expressed in the description logic ELHIO, which covers the OWL 2 QL and OWL 2 EL profiles. The novelty of our algorithm is the use of a set of ABox dependencies, which are compiled into a so-called EBox, to limit the expansion of the rewriting. So far, EBoxes have only been used in query rewriting in the case of DL-Lite, which is less expressive than ELHIO. We have extensively evaluated our new query rewriting technique, and in this paper we discuss the tradeoff between the reduction of the size of the rewriting and the computational cost of our approach.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

When applying computational mathematics in practical applications, even though one may be dealing with a problem that can be solved algorithmically, and even though one has good algorithms to approach the solution, it can happen, and often it is the case, that the problem has to be reformulated and analyzed from a different computational point of view. This is the case of the development of approximate algorithms. This paper frames in the research area of approximate algebraic geometry and commutative algebra and, more precisely, on the problem of the approximate parametrization.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

R2RML is used to specify transformations of data available in relational databases into materialised or virtual RDF datasets. SPARQL queries evaluated against virtual datasets are translated into SQL queries according to the R2RML mappings, so that they can be evaluated over the underlying relational database engines. In this paper we describe an extension of a well-known algorithm for SPARQL to SQL translation, originally formalised for RDBMS-backed triple stores, that takes into account R2RML mappings. We present the result of our implementation using queries from a synthetic benchmark and from three real use cases, and show that SPARQL queries can be in general evaluated as fast as the SQL queries that would have been generated by SQL experts if no R2RML mappings had been used.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Query rewriting is one of the fundamental steps in ontologybased data access (OBDA) approaches. It takes as inputs an ontology and a query written according to that ontology, and produces as an output a set of queries that should be evaluated to account for the inferences that should be considered for that query and ontology. Different query rewriting systems give support to different ontology languages with varying expressiveness, and the rewritten queries obtained as an output do also vary in expressiveness. This heterogeneity has traditionally made it difficult to compare different approaches, and the area lacks in general commonly agreed benchmarks that could be used not only for such comparisons but also for improving OBDA support. In this paper we compile data, dimensions and measurements that have been used to evaluate some of the most recent systems, we analyse and characterise these assets, and provide a unified set of them that could be used as a starting point towards a more systematic benchmarking process for such systems. Finally, we apply this initial benchmark with some of the most relevant OBDA approaches in the state of the art.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Ontology-based data access (OBDA) systems use ontologies to provide views over relational databases. Most of these systems work with ontologies implemented in description logic families of reduced expressiveness, what allows applying efficient query rewriting techniques for query answering. In this paper we describe a set of optimisations that are applicable with one of the most expressive families used in this context (ELHIO¬). Our resulting system exhibits a behaviour that is comparable to the one shown by systems that handle less expressive logics.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Ontology-Based Data Access (OBDA) permite el acceso a diferentes tipos de fuentes de datos (tradicionalmente bases de datos) usando un modelo más abstracto proporcionado por una ontología. La reescritura de consultas (query rewriting) usa una ontología para reescribir una consulta en una consulta reescrita que puede ser evaluada en la fuente de datos. Las consultas reescritas recuperan las respuestas que están implicadas por la combinación de los datos explicitamente almacenados en la fuente de datos, la consulta original y la ontología. Al trabajar sólo sobre las queries, la reescritura de consultas permite OBDA sobre cualquier fuente de datos que puede ser consultada, independientemente de las posibilidades para modificarla. Sin embargo, producir y evaluar las consultas reescritas son procesos costosos que suelen volverse más complejos conforme la expresividad y tamaño de la ontología y las consultas aumentan. En esta tesis exploramos distintas optimizaciones que peuden ser realizadas tanto en el proceso de reescritura como en las consultas reescritas para mejorar la aplicabilidad de OBDA en contextos realistas. Nuestra contribución técnica principal es un sistema de reescritura de consultas que implementa las optimizaciones presentadas en esta tesis. Estas optimizaciones son las contribuciones principales de la tesis y se pueden agrupar en tres grupos diferentes: -optimizaciones que se pueden aplicar al considerar los predicados en la ontología que no están realmente mapeados con las fuentes de datos. -optimizaciones en ingeniería que se pueden aplicar al manejar el proceso de reescritura de consultas en una forma que permite reducir la carga computacional del proceso de generación de consultas reescritas. -optimizaciones que se pueden aplicar al considerar metainformación adicional acerca de las características de la ABox. En esta tesis proporcionamos demostraciones formales acerca de la corrección y completitud de las optimizaciones propuestas, y una evaluación empírica acerca del impacto de estas optimizaciones. Como contribución adicional, parte de este enfoque empírico, proponemos un banco de pruebas (benchmark) para la evaluación de los sistemas de reescritura de consultas. Adicionalmente, proporcionamos algunas directrices para la creación y expansión de esta clase de bancos de pruebas. ABSTRACT Ontology-Based Data Access (OBDA) allows accessing different kinds of data sources (traditionally databases) using a more abstract model provided by an ontology. Query rewriting uses such ontology to rewrite a query into a rewritten query that can be evaluated on the data source. The rewritten queries retrieve the answers that are entailed by the combination of the data explicitly stored in the data source, the original query and the ontology. However, producing and evaluating the rewritten queries are both costly processes that become generally more complex as the expressiveness and size of the ontology and queries increase. In this thesis we explore several optimisations that can be performed both in the rewriting process and in the rewritten queries to improve the applicability of OBDA in real contexts. Our main technical contribution is a query rewriting system that implements the optimisations presented in this thesis. These optimisations are the core contributions of the thesis and can be grouped into three different groups: -optimisations that can be applied when considering the predicates in the ontology that are actually mapped to the data sources. -engineering optimisations that can be applied by handling the process of query rewriting in a way that permits to reduce the computational load of the query generation process. -optimisations that can be applied when considering additional metainformation about the characteristics of the ABox. In this thesis we provide formal proofs for the correctness of the proposed optimisations, and an empirical evaluation about the impact of the optimisations. As an additional contribution, part of this empirical approach, we propose a benchmark for the evaluation of query rewriting systems. We also provide some guidelines for the creation and expansion of this kind of benchmarks.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Previously conducted sequence analysis of Arabidopsis thaliana (ecotype Columbia-0) reported an insertion of 270-kb mtDNA into the pericentric region on the short arm of chromosome 2. DNA fiber-based fluorescence in situ hybridization analyses reveal that the mtDNA insert is 618 ± 42 kb, ≈2.3 times greater than that determined by contig assembly and sequencing analysis. Portions of the mitochondrial genome previously believed to be absent were identified within the insert. Sections of the mtDNA are repeated throughout the insert. The cytological data illustrate that DNA contig assembly by using bacterial artificial chromosomes tends to produce a minimal clone path by skipping over duplicated regions, thereby resulting in sequencing errors. We demonstrate that fiber-fluorescence in situ hybridization is a powerful technique to analyze large repetitive regions in the higher eukaryotic genomes and is a valuable complement to ongoing large genome sequencing projects.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Natural Language Interfaces to Query Databases (NLIDBs) have been an active research field since the 1960s. However, they have not been widely adopted. This article explores some of the biggest challenges and approaches for building NLIDBs and proposes techniques to reduce implementation and adoption costs. The article describes {AskMe*}, a new system that leverages some of these approaches and adds an innovative feature: query-authoring services, which lower the entry barrier for end users. Advantages of these approaches are proven with experimentation. Results confirm that, even when {AskMe*} is automatically reconfigurable against multiple domains, its accuracy is comparable to domain-specific NLIDBs.