584 results for Annotations
Abstract:
Automatic cost analysis of programs has traditionally been studied in terms of a number of concrete, predefined resources such as execution steps, time, or memory. However, the increasing relevance of analysis applications such as static debugging and/or certification of user-level properties (including for mobile code) makes it interesting to develop analyses for resource notions that are actually application-dependent. This may include, for example, bytes sent or received by an application, number of files left open, number of SMSs sent or received, number of accesses to a database, money spent, energy consumption, etc. We present a fully automated analysis for inferring upper bounds on the usage that a Java bytecode program makes of a set of application programmer-definable resources. In our context, a resource is defined by programmer-provided annotations which state the basic consumption that certain program elements make of that resource. From these definitions our analysis derives functions which return an upper bound on the usage that the whole program (and individual blocks) make of that resource for any given set of input data sizes. The proposed analysis is independent of the particular resource. We also present some experimental results from a prototype implementation of the approach covering an ample set of interesting resources.
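To make the idea of annotation-driven bounds concrete, the following is a minimal Python sketch, not the analysis described in the abstract (which works on Java bytecode). The cost table `COSTS`, the tuple-based program representation, and the function `upper_bound` are all hypothetical names introduced for illustration. It assumes each loop comes with a known iteration bound expressed as a function of the input size.

```python
# Hypothetical annotation table: the basic consumption each primitive
# operation makes of each resource (names and numbers are made up).
COSTS = {
    "send_sms": {"sms": 1, "cents": 5},
    "log": {"bytes": 20},
}


def upper_bound(node, resource, n):
    """Return an upper bound on `resource` usage for input size n."""
    kind = node[0]
    if kind == "call":   # annotated primitive: look up its basic cost
        return COSTS.get(node[1], {}).get(resource, 0)
    if kind == "seq":    # sequence: costs add up
        return sum(upper_bound(child, resource, n) for child in node[1])
    if kind == "if":     # branch: take the more expensive side
        return max(upper_bound(node[1], resource, n),
                   upper_bound(node[2], resource, n))
    if kind == "loop":   # loop: iteration bound times body bound
        return node[1](n) * upper_bound(node[2], resource, n)
    raise ValueError(f"unknown node kind: {kind}")


# Log once, then for each of n contacts either send an SMS or log.
program = ("seq", [
    ("call", "log"),
    ("loop", lambda n: n, ("if", ("call", "send_sms"), ("call", "log"))),
])

print(upper_bound(program, "sms", 10))    # 10
print(upper_bound(program, "bytes", 10))  # 220
```

The same traversal serves every resource; only the annotation table changes, which mirrors the abstract's point that the analysis is independent of the particular resource.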
Abstract:
Finding useful sharing information between instances in object-oriented programs has recently been the focus of much research. The applications of such static analysis are multiple: by knowing which variables definitely do not share in memory we can apply conventional compiler optimizations, find coarse-grained parallelism opportunities, or, more importantly, verify certain correctness aspects of programs even in the absence of annotations. In this paper we introduce a framework for deriving precise sharing information based on abstract interpretation for a Java-like language. Our analysis achieves precision in various ways, including supporting multivariance, which allows separating different contexts. We propose a combined Set Sharing + Nullity + Classes domain which captures which instances do not share and which ones are definitely null, and which uses the classes to refine the static information when inheritance is present. The use of a set sharing abstraction allows a more precise representation of the existing sharings and is crucial in achieving precision during interprocedural analysis. Carrying the domains in a combined way facilitates the interaction among them in the presence of multivariance in the analysis. We show through examples and experimentally that both the set sharing part of the domain and the combined domain provide more accurate information than previous work based on pair sharing domains, at reasonable cost.
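The precision gap between set sharing and pair sharing can be illustrated with a small Python sketch. It assumes the usual reading of a set-sharing abstraction, in which each group lists variables that may all reach one common object; the helper names are hypothetical, and the Nullity and Classes components of the combined domain are not modeled.

```python
from itertools import combinations


def to_pair_sharing(set_sharing):
    """Project a set-sharing abstraction onto the pairs that may share."""
    return {pair
            for group in set_sharing
            for pair in combinations(sorted(group), 2)}


def may_share(set_sharing, a, b):
    """True if some sharing group contains both variables."""
    return any(a in group and b in group for group in set_sharing)


# One object possibly reachable from x, y and z at the same time ...
three_way = {frozenset({"x", "y", "z"})}
# ... versus three separate objects, each reachable from only two variables.
pairwise = {frozenset({"x", "y"}), frozenset({"y", "z"}), frozenset({"x", "z"})}

# Set sharing tells the two situations apart; pair sharing cannot.
print(three_way != pairwise)                                   # True
print(to_pair_sharing(three_way) == to_pair_sharing(pairwise))  # True
```

Two variables that never appear together in any group definitely do not share, which is the kind of fact the abstract mentions as useful for optimization and verification.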
Abstract:
Finding useful sharing information between instances in object-oriented programs has recently been the focus of much research. The applications of such static analysis are multiple: by knowing which variables share in memory we can apply conventional compiler optimizations, find coarse-grained parallelism opportunities, or, more importantly, verify certain correctness aspects of programs even in the absence of annotations. In this paper we introduce a framework for deriving precise sharing information based on abstract interpretation for a Java-like language. Our analysis achieves precision in various ways. The analysis is multivariant, which allows separating different contexts. We propose a combined Set Sharing + Nullity + Classes domain which captures which instances share and which ones do not or are definitely null, and which uses the classes to refine the static information when inheritance is present. Carrying the domains in a combined way facilitates the interaction among the domains in the presence of multivariance in the analysis. We show that both the set sharing part of the domain and the combined domain provide more accurate information than previous work based on pair sharing domains, at reasonable cost.
Abstract:
A range of methodologies and techniques are available to guide the design and implementation of language extensions and domain-specific languages. A simple yet powerful technique is based on source-to-source transformations interleaved across the compilation passes of a base language. Despite being a successful approach, it has the main drawback that the input source code is lost in the process. When considering the whole workflow of program development (warning and error reporting, debugging, or even program analysis), program translations are no more powerful than a glorified macro language. In this paper, we propose an augmented approach to language extensions for Prolog, where symbolic annotations are included in the target program. These annotations allow selectively reversing the translated code. We illustrate the approach by showing that coupling it with minimal extensions to a generic Prolog debugger allows us to provide users with a familiar, source-level view during the debugging of programs which use a variety of language extensions, such as functional notation, DCGs, or CLP{Q,R}.
Abstract:
Idea Management Systems implement the notion of open innovation in the Web environment using crowdsourcing techniques. In this area, one of the popular methods for coping with large amounts of data is duplicate detection. Our research asks whether there is room to introduce more relationship types, and to what degree this change would affect the amount of idea metadata and its diversity. Furthermore, based on hierarchical dependencies between idea relationships and on relationship transitivity, we propose a number of methods for dataset summarization. To evaluate our hypotheses we annotate idea datasets with new relationships, using the contemporary methods of Idea Management Systems to detect idea similarity. With datasets with relationship annotations at our disposal, we determine whether idea features not related to idea topic (e.g. innovation size) have any relation to how annotators perceive types of idea similarity or dissimilarity.
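The abstract does not spell out its summarization methods. As one possible illustration of how relationship transitivity can shrink an annotated dataset, the Python sketch below removes every relationship that is already implied by a chain of others (a transitive reduction). It assumes a single transitive, acyclic relationship type; the function name and the idea identifiers are hypothetical.

```python
def transitive_reduction(edges):
    """Drop every edge of a DAG that is implied by a longer path."""
    successors = {}
    for source, target in edges:
        successors.setdefault(source, set()).add(target)

    def reachable_indirectly(source, target):
        # Depth-first search from source that ignores the direct edge.
        stack, seen = [source], {source}
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if node == source and nxt == target:
                    continue
                if nxt == target:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False

    return {(s, t) for (s, t) in edges if not reachable_indirectly(s, t)}


# Hypothetical "is more general than" annotations between three ideas.
relations = {
    ("idea_a", "idea_b"),
    ("idea_b", "idea_c"),
    ("idea_a", "idea_c"),  # implied by the two relations above
}
summary = transitive_reduction(relations)
```

Here `summary` keeps only the two relations that cannot be derived from the others, so the full set can still be recovered by applying transitivity.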
Abstract:
In the information society large amounts of information are being generated and transmitted constantly, especially in the most natural way for humans, i.e., natural language. Social networks, blogs, forums, and Q&A sites are a dynamic Large Knowledge Repository. Web 2.0 thus contains structured data, but most of its information is still expressed in natural language. Linguistic structures for text recognition enable the extraction of structured information from texts. However, the expressiveness of the current structures is limited, as they have been designed with a strict order in their phrases, which limits their applicability to other languages and makes them more sensitive to grammatical errors. To overcome these limitations, in this paper we present a linguistic structure named "linguistic schema", with a richer expressiveness that introduces fewer implicit constraints over annotations.
Abstract:
A workflow-centric research object bundles a workflow, the provenance of the results obtained by its enactment, other digital objects that are relevant for the experiment (papers, datasets, etc.), and annotations that semantically describe all these objects. In this paper, we propose a model to specify workflow-centric research objects, and show how the model can be grounded using semantic technologies and existing vocabularies, in particular the Object Reuse and Exchange (ORE) model and the Annotation Ontology (AO). We describe the life-cycle of a research object, which resembles the life-cycle of a scientific experiment.
Abstract:
In this paper we describe the specification of a model for the semantically interoperable representation of language resources for sentiment analysis. The model integrates "lemon", an RDF-based model for the specification of ontology-lexica (Buitelaar et al. 2009), which is increasingly used for the representation of language resources as Linked Data, with Marl, an RDF-based model for the representation of sentiment annotations (Westerski et al., 2011; Sánchez-Rada et al., 2013).
Abstract:
Over recent years, the unstoppable growth of biomedical data sources, driven by the development of massive data generation techniques (mainly in genomics) and the spread of technologies for communicating and sharing information, has led biomedical research to rely almost exclusively on the distributed analysis of information and on the search for relationships between different data sources. This is a complex task because of the heterogeneity of the sources involved (different formats, technologies, or domain models). Some projects aim to homogenize these sources so that the information can be presented in an integrated way, as if it came from a single database. However, no existing work fully automates this process of semantic integration. There are two main approaches to integrating heterogeneous data sources: centralized and distributed. Both require translating data from one model to another. This translation relies on formalizations of the semantic relationships between the underlying models and the central model, commonly called annotations. In the context of semantic integration, database annotations define relationships between terms with the same meaning so that information can be translated automatically. Depending on the problem at hand, these relationships hold between individual concepts or between whole sets of concepts (views). This work focuses on the latter. The European project p-medicine (FP7-ICT-2009-270089) follows the centralized approach and uses view-based annotations over databases modeled in RDF. The data extracted from the different sources are translated and integrated into a Data Warehouse. Within the p-medicine platform, the Biomedical Informatics Group (GIB) of the Polytechnic University of Madrid, where I carried out my work, provides a tool for generating the required annotations of RDF databases. This tool, called Ontology Annotator, supports the manual creation of view-based annotations. However, although it displays the data sources to be annotated graphically, most users find it difficult to use and spend too much time on the annotation process. Hence the need for a more advanced tool that can assist the user in annotating databases in p-medicine. The aim is to automate the most complex parts of the annotation process and to present the information about RDF database annotations in a natural and understandable way. This tool is called Ontology Annotator Assistant, and this work describes its design and development, together with some novel algorithms created by the author to make it work. The tool offers features not previously available in any other tool in the area of automatic annotation and semantic integration of databases.
Abstract:
There has long been great interest in automating all kinds of tasks in which human intervention is essential for successful completion. This interest is even greater when the tasks are perfectly reproducible and either require highly trained specialists or are very time-consuming. This project looks for methods to automate the annotation of medical images. Specifically, it focuses on delineating regions of interest (ROIs) in PET images, which are frequently used together with CT images in oncology to delineate volumes affected by cancer. The goal is to help hospitals organize and structure their patients' images and relate them to clinical notes. We call these the image annotation process and its integration with clinical note annotation, respectively. This document describes the initial objectives, the steps taken to achieve them, and the difficulties encountered along the way. From the techniques available in the literature, four segmentation techniques were chosen: according to the literature, two had been tested on real patients and the other two only on phantoms. In our case, the tests were carried out on PET images of six real patients diagnosed with cancer. The results have been analyzed and presented.
Abstract:
This work develops a REST service that transforms natural language sentences into RDF graphs. The generated graphs are directed graphs in which the nodes are formed from the nouns or adjectives of the sentences and the arcs are formed from the verbs. The service is used within the p-medicine project to support the following functionality:
• Natural language queries: the p-medicine platform currently provides a programmatic interface for SPARQL queries. The developed service would generate those queries automatically from natural language sentences.
• Database annotation using natural language: the p-medicine platform incorporates a tool, developed by the Group of Biomedical Engineering at the Polytechnic University of Madrid, for annotating RDF databases. These annotations are needed for the subsequent translation of the databases to a central schema. The annotation process requires the user to build the RDF views to be annotated manually, which involves displaying the RDF schema graphically and having the user select the necessary classes and relationships. This process is often complex and too difficult for a user without a technical background. The system will be incorporated so that these views can be built using natural language.
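The mapping from sentences to graphs (nouns or adjectives as nodes, verbs as arcs) can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions, not the service described above: the part-of-speech tags come from a tiny hand-written `LEXICON` rather than a real tagger, the output is a list of plain (subject, predicate, object) tuples rather than RDF, and all names are hypothetical.

```python
# Hypothetical mini-lexicon standing in for a real part-of-speech tagger.
LEXICON = {
    "the": "DET", "a": "DET",
    "patient": "NOUN", "tumor": "NOUN", "chemotherapy": "NOUN",
    "has": "VERB", "receives": "VERB",
}


def sentence_to_triples(sentence):
    """Turn 'noun verb noun' patterns into (subject, predicate, object)."""
    triples = []
    subject, verb = None, None
    for token in sentence.lower().rstrip(".").split():
        tag = LEXICON.get(token)
        if tag in ("NOUN", "ADJ"):
            # A node: close the pending arc if a subject and verb are waiting.
            if subject is not None and verb is not None:
                triples.append((subject, verb, token))
                verb = None
            subject = token
        elif tag == "VERB":
            # An arc label: remember it until the next node appears.
            verb = token
    return triples


print(sentence_to_triples("The patient has a tumor."))
# [('patient', 'has', 'tumor')]
```

Each tuple corresponds to one directed arc of the graph; a real implementation would also need to map the words to URIs in the target schema.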
Abstract:
The Web currently provides an immense collection of services (WS-*, RESTful, OGC WFS), normally exposed through different standards that make it possible to locate and invoke them. These services are usually described using textual information, without a formal description; that is, existing service descriptions are purely syntactic. To make such services easier to understand and use, they need to be annotated formally by means of descriptive metadata. The objective of this thesis is to propose an approach for the semantic annotation of Web services in the geospatial domain. The approach automates some stages of the annotation process by combining ontological resources and external services. It has been successfully evaluated with a set of services in the geospatial domain. The main contribution of this work is the partial automation of the semantic annotation of RESTful and WFS services, which improves the state of the art in this area. The detailed list of contributions is:
• A model for representing Web services from the syntactic and semantic points of view, taking into account the schema and the instances.
• A method for annotating Web services using ontologies and external resources.
• A system that implements the proposed annotation process.
• A gold standard for the semantic annotation of RESTful and OGC WFS services, and algorithms for evaluating the annotations.
Abstract:
Background: One of the main challenges for biomedical research lies in the computer-assisted integrative study of large and increasingly complex combinations of data in order to understand molecular mechanisms. The preservation of the materials and methods of such computational experiments with clear annotations is essential for understanding an experiment, and this is increasingly recognized in the bioinformatics community. Our assumption is that offering means of digital, structured aggregation and annotation of the objects of an experiment will provide the metadata a scientist needs to understand and recreate the results of an experiment. To support this we explored a model for the semantic description of a workflow-centric Research Object (RO), where an RO is defined as a resource that aggregates other resources, e.g., datasets, software, spreadsheets, text, etc. We applied this model to a case study where we analysed human metabolite variation by workflows. Results: We present the application of the workflow-centric RO model for our bioinformatics case study. Three workflows were produced following recently defined Best Practices for workflow design. By modelling the experiment as an RO, we were able to automatically query the experiment and answer questions such as “which particular data was input to a particular workflow to test a particular hypothesis?”, and “which particular conclusions were drawn from a particular workflow?”. Conclusions: Applying a workflow-centric RO model to aggregate and annotate the resources used in a bioinformatics experiment allowed us to retrieve the conclusions of the experiment in the context of the driving hypothesis, the executed workflows and their input data. The RO model is an extendable reference model that can be used by other systems as well.
Abstract:
This paper describes a framework for annotation on travel blogs based on subjectivity (FATS). The framework can automatically annotate, sentence by sentence, sections (posts) of Spanish-language blogs about travelling. FATS is used in this experiment to annotate components from travel blogs in order to create a corpus of 300 annotated posts. Each subjective element in a sentence is annotated as positive or negative as appropriate. Currently, correct annotations add up to about 95 per cent in our subset of the travel domain. By means of an iterative process of annotation we can create a subjectively annotated, domain-specific corpus.
Abstract:
The importance of vision-based systems for Sense-and-Avoid is increasing as remotely piloted and autonomous UAVs become part of the non-segregated airspace. The development and evaluation of these systems demand flight scenario images, which are expensive and risky to obtain. Augmented Reality techniques now allow real flight scenario images to be composited with 3D aircraft models, producing useful, realistic images for system development and benchmarking at a much lower cost and risk. With the techniques presented in this paper, 3D aircraft models are first positioned in a simulated 3D scene with controlled illumination and rendering parameters. Realistic simulated images are then obtained using an image processing algorithm which fuses the images obtained from the 3D scene with images from real UAV flights, taking into account on-board camera vibrations. Since the intruder and camera poses are user-defined, ground truth data is available. These ground truth annotations allow aircraft detection and tracking algorithms to be developed and quantitatively evaluated. This paper presents the software developed to create a public dataset of 24 videos together with their annotations and some tracking application results.
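The abstract does not say which metric is used for quantitative evaluation. One common way to score a detection against a ground truth annotation is intersection over union (IoU) of bounding boxes; the Python sketch below assumes axis-aligned boxes given as (x1, y1, x2, y2) pixel coordinates, which is an assumption about the annotation format rather than something stated in the source.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1)
             - intersection)
    return intersection / union if union > 0 else 0.0


ground_truth = (0, 0, 10, 10)   # annotated intruder position (made-up values)
detection = (5, 5, 15, 15)      # box reported by a detector (made-up values)
score = iou(ground_truth, detection)
```

A detection is then typically counted as correct when its score exceeds a chosen threshold; the threshold itself is a design choice and is not given in the abstract.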