959 results for In-memory databases


Relevance:

80.00% 80.00%

Publisher:

Abstract:

The integration of powerful partial evaluation methods into practical compilers for logic programs is still far from reality. This is related both to 1) efficiency issues and to 2) the complications of dealing with practical programs. Regarding efficiency, the most successful unfolding rules used nowadays are based on structural orders applied over (covering) ancestors, i.e., a subsequence of the atoms selected during a derivation. Unfortunately, maintaining the structure of the ancestor relation during unfolding introduces significant overhead. We propose an efficient, practical local unfolding rule based on the notion of covering ancestors which can be used in combination with any structural order and allows a stack-based implementation without losing any opportunities for specialization. Regarding the second issue, we propose assertion-based techniques which allow our approach to deal with real programs that include (Prolog) built-ins and external predicates in a very extensible manner. Finally, we report on our implementation of these techniques in a practical partial evaluator, embedded in a state-of-the-art compiler which uses global analysis extensively (the Ciao compiler and, specifically, its preprocessor CiaoPP). The performance analysis of the resulting system shows that our techniques, in addition to dealing with practical programs, are also significantly more efficient in time and somewhat more efficient in memory than traditional tree-based implementations.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Information about the computational cost of programs is potentially useful for a variety of purposes, including selecting among different algorithms, guiding program transformations, informing granularity control and mapping decisions in parallelizing compilers, and optimizing queries in deductive databases. Cost analysis of logic programs is complicated by nondeterminism: on the one hand, procedures can return multiple solutions, making it necessary to estimate the number of solutions in order to give nontrivial upper bound cost estimates; on the other hand, the possibility of failure has to be taken into account while estimating lower bounds. Here we discuss techniques to address these problems to some extent.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Finding useful sharing information between instances in object-oriented programs has recently been the focus of much research. The applications of such static analysis are multiple: by knowing which variables definitely do not share in memory we can apply conventional compiler optimizations, find coarse-grained parallelism opportunities, or, more importantly, verify certain correctness aspects of programs even in the absence of annotations. In this paper we introduce a framework for deriving precise sharing information based on abstract interpretation for a Java-like language. Our analysis achieves precision in various ways, including supporting multivariance, which allows separating different contexts. We propose a combined Set Sharing + Nullity + Classes domain which captures which instances do not share and which ones are definitely null, and which uses the classes to refine the static information when inheritance is present. The use of a set sharing abstraction allows a more precise representation of the existing sharings and is crucial in achieving precision during interprocedural analysis. Carrying the domains in a combined way facilitates the interaction among them in the presence of multivariance in the analysis. We show through examples and experimentally that both the set sharing part of the domain and the combined domain provide more accurate information than previous work based on pair sharing domains, at reasonable cost.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Finding useful sharing information between instances in object-oriented programs has recently been the focus of much research. The applications of such static analysis are multiple: by knowing which variables share in memory we can apply conventional compiler optimizations, find coarse-grained parallelism opportunities, or, more importantly, verify certain correctness aspects of programs even in the absence of annotations. In this paper we introduce a framework for deriving precise sharing information based on abstract interpretation for a Java-like language. Our analysis achieves precision in various ways. The analysis is multivariant, which allows separating different contexts. We propose a combined Set Sharing + Nullity + Classes domain which captures which instances share and which ones do not or are definitely null, and which uses the classes to refine the static information when inheritance is present. Carrying the domains in a combined way facilitates the interaction among the domains in the presence of multivariance in the analysis. We show that both the set sharing part of the domain and the combined domain provide more accurate information than previous work based on pair sharing domains, at reasonable cost.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The boundary element method (BEM) has been applied successfully to many engineering problems during the last decades. Compared with domain-type methods such as the finite element method (FEM) or the finite difference method (FDM), the BEM handles problems in which the medium extends to infinity much more easily, as there is no need to develop special boundary conditions (quiet or absorbing boundaries) or infinite elements at the boundaries introduced to limit the domain studied. The determination of the dynamic stiffness of arbitrarily shaped footings is just one of the fields where the BEM has been the method of choice, especially in the 1980s. With the continuous development of computer technology and the available hardware, the size of the problems under study grew and, as the flop count for solving the resulting linear system of equations grows with the third power of the number of equations, there was a need for iterative methods with better performance. In [1] the GMRES algorithm was presented, which is now widely used for implementations of the collocation BEM. While the FEM results in sparsely populated coefficient matrices, the BEM leads, in general, to fully or densely populated ones, depending on the number of subregions, posing a serious memory problem even for today's computers. If the geometry of the problem permits the surface of the domain to be meshed with equally shaped elements, many of the resulting coefficients will be calculated and stored repeatedly. The present paper shows how these unnecessary operations can be avoided, reducing the calculation time as well as the storage requirement. To this end a similar coefficient identification algorithm (SCIA) has been developed and implemented in a program written in Fortran 90. The vertical dynamic stiffness of a single pile in layered soil has been chosen to test the performance of the implementation.
The results obtained with the 3D model may be compared with those obtained with an axisymmetric formulation, which are considered to be the reference values as the mesh quality is much better. The entire 3D model comprises more than 35,000 degrees of freedom (dofs), the largest single region being a soil region with 21,168 dofs. Note that the memory necessary to store all coefficients of this single region is about 6.8 GB, an amount which is usually not available on personal computers. In the problem under study the interface zone between the two adjacent soil regions as well as the surface of the top layer may be meshed with equally sized elements. In this case the application of the SCIA leads to a substantial reduction in memory requirements. The maximum memory used during the calculation has been reduced to 1.2 GB. The application of the SCIA thus permits problems to be solved on personal computers which otherwise would require much more powerful hardware.
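The idea of not recomputing coefficients on a mesh of equally shaped elements can be illustrated with a small sketch. This is not the paper's Fortran 90 SCIA: the 1/r stand-in kernel, the regular grid of square elements, and the names `kernel` and `assemble_with_reuse` are assumptions made only for illustration. Coefficients are cached by the relative offset between two elements, so each distinct offset is evaluated once.

```python
import math
from itertools import product


def kernel(dx: float, dy: float) -> float:
    """Stand-in influence coefficient (hypothetical): 1/r between centroids."""
    r = math.hypot(dx, dy)
    return 0.0 if r == 0.0 else 1.0 / r


def assemble_with_reuse(nx: int, ny: int, h: float = 1.0):
    """Assemble a dense matrix for an nx-by-ny grid of equal square elements.

    Entries that share the same relative offset between source and field
    element are computed once and then looked up in a cache.
    """
    cells = list(product(range(nx), range(ny)))
    cache = {}
    matrix = []
    for (ix, iy) in cells:
        row = []
        for (jx, jy) in cells:
            offset = (jx - ix, jy - iy)  # integer key, compared exactly
            if offset not in cache:
                cache[offset] = kernel(offset[0] * h, offset[1] * h)
            row.append(cache[offset])
        matrix.append(row)
    return matrix, len(cache)


matrix, unique = assemble_with_reuse(4, 3)
total = len(matrix) ** 2
print(f"{unique} kernel evaluations for {total} matrix entries")
```

The sketch still stores the full matrix and only reduces the number of evaluations. Reducing storage as well would mean keeping just the cache plus an index per entry, which is left out here.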

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This paper proposes a highly automated mechanism for easily building an undo facility into a new or existing system. Our proposal is based on the observation that, for a large set of operators, it is not necessary to store in-memory object states or executed system commands to undo an action; storing the input data is enough. This strategy greatly simplifies the design of the undo process and encapsulates most of the required functionality in a framework structure similar to many object-oriented programming frameworks.
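One way to read this idea is that an action can be undone by discarding its stored input and rebuilding the state from the remaining inputs. The sketch below assumes deterministic operators that return a new state and a known initial state. The class `ReplayUndo` and its methods are hypothetical names, not part of the framework described in the paper.

```python
from typing import Callable, Generic, List, TypeVar

S = TypeVar("S")  # state type
I = TypeVar("I")  # input type


class ReplayUndo(Generic[S, I]):
    """Keeps only the inputs; the state is rebuilt by replaying them."""

    def __init__(self, initial: S, apply: Callable[[S, I], S]) -> None:
        self._initial = initial
        self._apply = apply  # must not mutate the state it receives
        self._inputs: List[I] = []

    def do(self, data: I) -> S:
        """Record the input and return the resulting state."""
        self._inputs.append(data)
        return self.state()

    def undo(self) -> S:
        """Drop the most recent input, if any, and rebuild the state."""
        if self._inputs:
            self._inputs.pop()
        return self.state()

    def state(self) -> S:
        """Replay every stored input from the initial state."""
        current = self._initial
        for data in self._inputs:
            current = self._apply(current, data)
        return current


editor = ReplayUndo("", lambda text, typed: text + typed)
editor.do("in-memory ")
editor.do("databases")
print(editor.undo())  # prints "in-memory "
```

Replaying trades time for memory: no snapshots are kept, but every call to `state()` costs one pass over the stored inputs.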

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The purpose of this thesis is the implementation of efficient adjoint-based mesh adaptation methods within the framework of finite volume discretizations for unstructured meshes. The adjoint-based methodology optimizes the mesh by refining it appropriately in order to improve the accuracy with which a given output functional is computed. The functional is usually a scalar quantity of engineering interest obtained by post-processing the solution, such as aerodynamic drag or lift. Typically, the adjoint adaptation method is based on an a posteriori estimate of the error in the output functional, obtained by weighting the numerical residual with the adjoint variables: the "Dual Weighted Residual" (DWR) method. These variables are obtained by solving the adjoint problem for the selected functional. The usual procedure for introducing this method into codes based on finite volume discretizations involves an auxiliary embedded mesh obtained by uniform refinement of the initial mesh. Using this mesh implies a significant increase in computational resources (for example, in 3D cases the memory required can be up to an order of magnitude greater than that needed by the initial flow problem). This thesis proposes an alternative method based on reformulating the functional error estimate on a coarser auxiliary mesh and using a truncation error estimation technique, called τ-estimation, to estimate the residuals involved in the DWR method. Using this error estimate, a mesh adaptation algorithm is designed that retains the basic ingredients of standard adjoint adaptation but at a substantially lower computational cost.
Both the standard adjoint adaptation methodology and the one proposed in the thesis have been implemented in a finite volume code commonly used in the European aeronautical industry. The influence of the various numerical parameters involved in the algorithm has been investigated. Finally, the proposed method is compared with other mesh adaptation methodologies, and its computational efficiency is demonstrated on a series of representative cases of aeronautical interest. ABSTRACT The purpose of this thesis is the implementation of efficient grid adaptation methods based on the adjoint equations within the framework of finite volume methods (FVM) for unstructured grid solvers. The adjoint-based methodology aims at adapting grids to improve the accuracy of a functional output of interest, for example the aerodynamic drag or lift. The adjoint methodology is based on a posteriori functional error estimation using the adjoint/dual-weighted residual (DWR) method. In this method the error in a functional output can be directly related to local residual errors of the primal solution through the adjoint variables. These variables are obtained by solving the corresponding adjoint problem for the chosen functional. The common approach to introducing the DWR method within the FVM framework involves the use of an auxiliary embedded grid. The storage of this mesh demands high computational resources, i.e. over one order of magnitude increase in memory relative to the initial problem for 3D cases. In this thesis, an alternative methodology for adapting the grid is proposed. Specifically, the DWR approach for error estimation is re-formulated on a coarser mesh level using the τ-estimation method to approximate the truncation error. Then, an output-based adaptive algorithm is designed in such a way that the basic ingredients of the standard adjoint method are retained but the computational cost is significantly reduced.
The standard and the new proposed adjoint-based adaptive methodologies have been incorporated into a flow solver commonly used in the EU aeronautical industry. The influence of different numerical settings has been investigated. The proposed method has been compared against different grid adaptation approaches and the computational efficiency of the new method has been demonstrated on some representative aeronautical test cases.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This work was carried out within the framework of the EURECA (Enabling information re-Use by linking clinical REsearch and Care) and INTEGRATE (Integrative Cancer Research Through Innovative Biomedical Infrastructures) projects, in which the Biomedical Informatics Group of the UPM collaborates with other European universities and healthcare institutions. Both projects develop services and infrastructures whose main goal is to store clinical information from diverse sources (for example, hospital electronic health records, clinical trials, or biomedical research articles) in a common form that is easy to access and query, so as to facilitate research in these areas as much as possible and in a collaborative manner across institutions. This is the main idea behind semantic interoperability, on which both projects focus and which is key to the correct operation of their software: the exchange of data using a shared, common, and unambiguous representation model in which each clinical concept, term, or data item has a single form of representation. This enables knowledge inference and fits perfectly in the context of medical research. Specifically, the tool developed in this work is also oriented toward maximizing semantic interoperability, since it handles the loading of clinical information in a standardized format into a common data storage model implemented in relational databases. The work was carried out between February 3 and June 6, 2014. A waterfall life cycle was followed to organize the tasks that make up the project, so that a phase cannot begin until the previous phase has been completed, reviewed, and accepted.
The exception is the task of documenting the work (to prepare this report), which was carried out in parallel with all the others. ABSTRACT The project has been developed during the second semester of the 2013/2014 academic year. This project was carried out within the EURECA and INTEGRATE European biomedical research projects, in which the GIB (Biomedical Informatics Group) of the UPM is a partner. Both projects aim to develop platforms and services with the main goal of storing clinical information (e.g. information from hospital electronic health records (EHRs), clinical trials or research articles) in a common form that is easy to access and query, in order to support medical research. The whole software environment of these projects is based on the idea of semantic interoperability, which means the ability of computer systems to exchange data with unambiguous and shared meaning. This idea allows knowledge inference, which fits perfectly in the medical research context. The tool developed in this project is also oriented toward semantic interoperability. Its purpose is to store standardized clinical information in a common data model, implemented in relational databases. The project was carried out between February 3rd and June 6th, 2014. It followed a "Waterfall model" of software development, in which progress is seen as flowing steadily downwards through its phases. Each phase starts when the previous phase has been completed and reviewed. The task of documenting the project's work is an exception; it was performed in parallel with the rest of the tasks.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This Bachelor's Thesis (Proyecto Fin de Grado) works toward improving and extending Pegaso and Gades, two Expert Systems in the field of e-Health. These systems, which were already in operation before this work began, support decision-making in Primary Care. That is, they make it possible to assess the level of language acquisition in children aged 0 to 6 through their respective web applications. They also allow these assessments to be stored and consulted later, together with the system decisions associated with them. Pegaso and Gades follow a three-tier architecture and are developed mainly using Java components. As part of this work, some problems in the behavior of both systems are first resolved, such as their incompatibility with Java SE 7. Next, an application is developed that generates an ontology in the OWL language from Java code. To this end, the concept of an ontology, the OWL language, and the various existing Java libraries for generating OWL ontologies are studied first. In addition, some of the functionality of the original systems is improved, and a new feature is developed for exploiting the data stored in the databases of both systems. This new feature consists of a module responsible for generating statistics from the data of the language assessments that have been performed and, therefore, stored in the databases. These statistics, which can be consulted by all users of Pegaso and Gades, make it possible to establish correlations between the various data sets of the language assessments. Finally, the statistics are displayed on screen as several types of charts and tables so that expert users can analyze the information they contain. ABSTRACT.
This Bachelor's Thesis works towards improving and expanding Pegaso and Gades, two Expert Systems that belong to the e-Health field. These systems, which were already operational before this work started, support the decision-making process in Primary Care. That is, they make it possible to evaluate the language acquisition level of children from 0 to 6 years old. They also allow these evaluations to be stored and consulted afterwards, together with the decisions associated with each of them. Pegaso and Gades follow a three-tier architecture and are developed mainly using Java components. As part of this work, some of the behavioural problems of both systems are fixed, such as their incompatibility with Java SE 7. Next, an application that generates an OWL ontology from Java code is developed. To that end, the concept of ontology, the OWL language and the existing Java libraries for generating OWL ontologies are studied. In addition, some of the functionalities of the initial systems are improved and a new functionality to exploit the data stored in the databases of both systems is developed. This new functionality consists of a module responsible for generating statistics from the data of the language evaluations that have been performed and, thus, stored in the databases. These statistics, which can be consulted by all users of Pegaso and Gades, make it possible to establish correlations between the diverse sets of data from the language evaluations. Finally, the statistics are presented to the user on screen as various types of charts and tables, so that expert users can analyse the information contained in them.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Nowadays, organizations hold large amounts of data in databases (DB), which contain invaluable information. Decision Support Systems (DSS) provide the support needed to manage this information and to plan the medium- and long-term "modus operandi" of these organizations. Despite the growing importance of these systems, most proposals do not cover their complete development and are mostly limited to the development of isolated parts, which often have serious integration problems. Hence, methodologies are needed that include models and processes that consider every factor. This paper tries to fill this void by proposing an approach for developing spatial DSS driven by the development of their associated Data Warehouse (DW), without neglecting their other components. To frame the proposal, different Software Engineering approaches (the Software Engineering Process and Model Driven Architecture) are used and coupled with the DB development methodology, both adapted to the peculiarities of DW. Finally, an example illustrates the proposal.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method on a national hydrologic sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact on the generation of sensor descriptions.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

R2RML is used to specify transformations of data available in relational databases into materialised or virtual RDF datasets. SPARQL queries evaluated against virtual datasets are translated into SQL queries according to the R2RML mappings, so that they can be evaluated over the underlying relational database engines. In this paper we describe an extension of a well-known algorithm for SPARQL to SQL translation, originally formalised for RDBMS-backed triple stores, that takes into account R2RML mappings. We present the result of our implementation using queries from a synthetic benchmark and from three real use cases, and show that SPARQL queries can be in general evaluated as fast as the SQL queries that would have been generated by SQL experts if no R2RML mappings had been used.
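As a rough illustration of mapping-driven translation, the sketch below turns a single triple pattern into SQL using a hand-written dictionary that plays the role of an R2RML mapping. The mapping structure, the `translate_pattern` function, the `person` table, and the `ex:name` predicate are all hypothetical. Real R2RML mappings are RDF documents, and the algorithm described in the paper handles full SPARQL queries. Python's standard `sqlite3` module supplies the in-memory database.

```python
import sqlite3

# Hypothetical stand-in for an R2RML mapping:
# predicate -> (table, subject column, object column)
MAPPING = {
    "ex:name": ("person", "id", "name"),
}


def translate_pattern(subject_var: str, predicate: str, object_var: str) -> str:
    """Translate the triple pattern `?s predicate ?o` into a SQL query.

    Identifiers come from the trusted mapping above, not from user input.
    Rows with a NULL object are filtered out because they yield no triple.
    """
    table, subject_col, object_col = MAPPING[predicate]
    return (
        f"SELECT {subject_col} AS {subject_var}, {object_col} AS {object_var} "
        f"FROM {table} WHERE {object_col} IS NOT NULL"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, "Ada"), (2, None), (3, "Alan")],
)

sql = translate_pattern("s", "ex:name", "n")
rows = conn.execute(sql + " ORDER BY s").fetchall()
print(sql)
print(rows)
```

Because the pattern becomes a plain `SELECT` over the mapped table, the relational engine can evaluate it directly, which is the property the abstract relies on when comparing against hand-written SQL.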

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Part of current biomedical research is centered on the analysis of heterogeneous data. These data may differ in origin, structure, and semantics. A large amount of data of interest to researchers resides in public databases, which collect information from different sources and make it available to the community free of charge. To homogenize these public data sources with privately held ones, there are various tools and techniques that automate the homogenization of heterogeneous data. The Biomedical Informatics Group (GIB) [1] of the Universidad Politécnica de Madrid collaborates in the European project P-medicine [2], whose purpose is to develop an infrastructure that facilitates the evolution of current medical procedures toward personalized medicine. One of the tasks assigned to the group within the P-medicine project is to build tools that help users in the process of integrating data held in heterogeneous information sources. Some of these information sources are public biomedical databases hosted on the NCBI [3] (National Center for Biotechnology Information) platform. One of the tools the group is developing to integrate data sources is Ontology Annotator. In one of its phases, the user's task is to retrieve information from a public database and manually select the relevant results. To automate the process of searching for and selecting relevant results, there is, on the one hand, great interest in generating queries that lead to results that are as precise and accurate as possible; on the other hand, there is great interest in extracting relevant information from large numbers of documents, which requires systems that analyze and weight the data that characterize them.
In the computing field of artificial intelligence, within the branch of information retrieval, there are various studies on query expansion from relevance feedback that could be very useful for addressing this problem. These studies focus on techniques for reformulating or expanding the initial query using as feedback the results that were initially relevant to the user, so that the new result set is closer to what the user actually wants. The goal of this final degree project is the study, implementation, and experimental evaluation of methods that automate the process of extracting significant information from documents and use it to expand or reformulate queries. The aim is to improve the precision and ranking of the associated results. These methods will be integrated into the Ontology Annotator tool and targeted at the PubMed data source [4]. ABSTRACT Part of current biomedical research is focused on the analysis of heterogeneous data. These data may have different origin, structure and semantics. A large quantity of interesting data is contained in public databases, which gather information from different sources and make it open and free to be used by the community. In order to homogenize these sources of public data with others of private origin, there are some tools and techniques that allow automating the processes of integrating heterogeneous data. The Biomedical Informatics Group of the Universidad Politécnica de Madrid cooperates with the European project P-medicine, whose main purpose is to create an infrastructure and models to facilitate the transition from current medical practice to personalized medicine.
One of the tasks of the project that the group is in charge of consists of developing tools that will help users in the process of integrating data from diverse sources. Some of the sources are public biomedical databases from the NCBI platform (National Center for Biotechnology Information). One of the tools on which the group is currently working for the integration of data sources is called the Ontology Annotator. In this tool there is a phase in which the user has to retrieve information from a public database and manually select the relevant data contained in it. For automating the process of searching and selecting data there is, on the one hand, an interest in automatically generating queries that lead to results that are as precise as possible. On the other hand, there is an interest in retrieving relevant information from large quantities of documents. The solution requires systems that analyze and weigh the data, allowing the relevant items to be located. In the computer science field of artificial intelligence, in the branch of information retrieval, there are diverse studies about query expansion from relevance feedback that could be used to solve the problem. The main purpose of these studies is to obtain a set of results that is as close as possible to the information that the user really wants to retrieve. In order to achieve this, different techniques are used to reformulate or expand the initial query using as feedback the results that were relevant to the user; with this method, the new set of results will be closer to the ones that the user really desires. The goal of this final dissertation project consists of the study, implementation and experimentation of methods that automate the process of extracting relevant information from documents and use this information to expand queries. In this way, the precision and the ranking of the associated results will be improved.
These methods will be integrated in the Ontology Annotator tool and will focus on the PubMed data source.
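A minimal sketch of query expansion from relevance feedback follows. It assumes a simple term-frequency score rather than the specific methods studied in the project. The function `expand_query`, the stop-word list, and the sample documents are hypothetical.

```python
from collections import Counter
from typing import Iterable, List

# Small illustrative stop-word list (an assumption, not from the project).
STOP_WORDS = {"the", "of", "in", "and", "a", "for", "to", "with"}


def expand_query(query: str, relevant_docs: Iterable[str], extra_terms: int = 2) -> List[str]:
    """Append the most frequent new terms from documents marked relevant."""
    query_terms = query.lower().split()
    counts = Counter()
    for doc in relevant_docs:
        for term in doc.lower().split():
            if term not in STOP_WORDS and term not in query_terms:
                counts[term] += 1
    # Sort by descending frequency, then alphabetically for a stable result.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return query_terms + [term for term, _ in ranked[:extra_terms]]


relevant = [
    "breast cancer gene expression profiling",
    "gene expression in breast tumours",
]
print(expand_query("breast cancer", relevant))
```

The expanded term list would then be submitted as a new query. A weighted scheme such as TF-IDF could replace the raw counts without changing the overall flow.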

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this literature review, carried out through a search of several databases (PubMed, SportDiscus, Scielo) as well as journals such as Elsevier and search engines such as Google, we look for evidence concerning spinal pathologies in childhood, educational programs for prevention and treatment, and the role that physical education can play in spinal pathologies in general and in hyperlordosis specifically. The literature considered had to fall between 2005 and 2015. As an overall view of this review, we could say that back problems in childhood are very common, although less frequent than in adult populations, and that they are still regarded as a clinical challenge because, in most cases, they are accompanied by more complex pathologies. The most prevalent problems include hyperlordosis, genu valgum, shoulder imbalance, lateral pelvic tilt, scoliosis, trunk rotation, and thoracic hyperkyphosis, among others. In addition to the most common spinal problems in childhood, the review presents possible causes and various prevention and intervention programs and, finally, discusses the importance of postural education, the role of the physical education teacher in the prevention, detection, and treatment of these pathologies, and the vital role that physical education can play for these children. ABSTRACT This literature review was carried out through a search in different databases (PubMed, SportDiscus, Scielo) as well as in journals such as Elsevier and, finally, in Google. Evidence related to pathologies of the spine in children, as well as educational programs for prevention and treatment, was searched.
The role that educational programs can play in the prevention of spine pathologies in general, and of hyperlordosis specifically, was also analyzed. The literature review period was from 2005 to 2015. Results showed that back problems in childhood are very common, although the prevalence is lower than in adults. The fact that these pathologies are normally associated with other, more serious problems makes spine diseases a medical challenge. Among the most prevalent problems we can find hyperlordosis, genu valgum, lateral pelvic tilt, scoliosis, trunk rotation, uneven shoulders and thoracic hyperkyphosis, among others. The most common problems of the vertebral column in childhood, the possible causes, and different prevention and intervention programs were also reviewed. The importance of postural education in schools, as well as the role of the physical education teacher in prevention, detection and treatment, was analyzed.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This work has focused on researching solutions to automate the task of enriching sensor network data sources with linguistic descriptions, in order to facilitate the subsequent generation of natural language texts. The use of natural language descriptions makes the data accessible to a wider diversity of users and, as a consequence, allows investments in sensor networks to be better exploited. The work considers the use of open databases to address the need for a large volume and diversity of geographic knowledge. Data enrichment has also been analyzed within methodological approaches to data curation and natural language generation methods. As a result of the work, a general method has been proposed based on a generate-and-test strategy that includes a way of representing and using heuristic knowledge, with several reasoning stages, for constructing linguistic descriptions for data enrichment. Three scenarios were used to evaluate the general proposal, two of them for generating geographic references over complex, real-scale sensor networks and another for generating temporal references. The evaluation results have shown the practical validity of the general proposal, exhibiting performance improvements over other approaches. In addition, the analysis of the results has made it possible to identify and quantify the foreseeable impact of various lines of improvement in open databases. ABSTRACT This work has focused on the search for solutions to automate the task of enriching sensor-network-based data sources with textual descriptions, so as to facilitate the generation of natural language texts. Using natural language descriptions facilitates data access for a wider range of users and, therefore, allows investments in sensor networks to be better leveraged.
In this work we have considered the use of open databases to address the need for a large volume and diversity of geographical knowledge. We have also analyzed data enrichment within methodological approaches to data curation and natural language generation methods. As a result, a general method has been proposed based on a generate-and-test strategy that includes a representation and use of heuristic knowledge with several stages of reasoning for the construction of linguistic descriptions for data enrichment. In assessing the overall proposal, three scenarios have been addressed, two of them in the environmental domain with complex, real-scale sensor networks and another in the time domain. The evaluation results have shown the validity and practicality of our proposal, showing performance improvements over other approaches. Furthermore, the analysis of the results has allowed us to identify and quantify the expected impact of various lines of improvement in open databases.