31 resultados para medical information extraction
Resumo:
Virtual reality (VR) techniques to understand and obtain conclusions of data in an easy way are being used by the scientific community. However, these techniques are not used frequently for analyzing large amounts of data in life sciences, particularly in genomics, due to the high complexity of data (curse of dimensionality). Nevertheless, new approaches that allow to bring out the real important data characteristics, arise the possibility of constructing VR spaces to visually understand the intrinsic nature of data. It is well known the benefits of representing high dimensional data in tridimensional spaces by means of dimensionality reduction and transformation techniques, complemented with a strong component of interaction methods. Thus, a novel framework, designed for helping to visualize and interact with data about diseases, is presented. In this paper, the framework is applied to the Van't Veer breast cancer dataset is used, while oncologists from La Paz Hospital (Madrid) are interacting with the obtained results. That is to say a first attempt to generate a visually tangible model of breast cancer disease in order to support the experience of oncologists is presented.
Resumo:
Background: Minimally invasive surgery creates two technological opportunities: (1) the development of better training and objective evaluation environments, and (2) the creation of image guided surgical systems.
Resumo:
La nanotecnología es un área de investigación de reciente creación que trata con la manipulación y el control de la materia con dimensiones comprendidas entre 1 y 100 nanómetros. A escala nanométrica, los materiales exhiben fenómenos físicos, químicos y biológicos singulares, muy distintos a los que manifiestan a escala convencional. En medicina, los compuestos miniaturizados a nanoescala y los materiales nanoestructurados ofrecen una mayor eficacia con respecto a las formulaciones químicas tradicionales, así como una mejora en la focalización del medicamento hacia la diana terapéutica, revelando así nuevas propiedades diagnósticas y terapéuticas. A su vez, la complejidad de la información a nivel nano es mucho mayor que en los niveles biológicos convencionales (desde el nivel de población hasta el nivel de célula) y, por tanto, cualquier flujo de trabajo en nanomedicina requiere, de forma inherente, estrategias de gestión de información avanzadas. Desafortunadamente, la informática biomédica todavía no ha proporcionado el marco de trabajo que permita lidiar con estos retos de la información a nivel nano, ni ha adaptado sus métodos y herramientas a este nuevo campo de investigación. En este contexto, la nueva área de la nanoinformática pretende detectar y establecer los vínculos existentes entre la medicina, la nanotecnología y la informática, fomentando así la aplicación de métodos computacionales para resolver las cuestiones y problemas que surgen con la información en la amplia intersección entre la biomedicina y la nanotecnología. Las observaciones expuestas previamente determinan el contexto de esta tesis doctoral, la cual se centra en analizar el dominio de la nanomedicina en profundidad, así como en el desarrollo de estrategias y herramientas para establecer correspondencias entre las distintas disciplinas, fuentes de datos, recursos computacionales y técnicas orientadas a la extracción de información y la minería de textos, con el objetivo final de hacer uso de los datos nanomédicos disponibles. El autor analiza, a través de casos reales, alguna de las tareas de investigación en nanomedicina que requieren o que pueden beneficiarse del uso de métodos y herramientas nanoinformáticas, ilustrando de esta forma los inconvenientes y limitaciones actuales de los enfoques de informática biomédica a la hora de tratar con datos pertenecientes al dominio nanomédico. Se discuten tres escenarios diferentes como ejemplos de actividades que los investigadores realizan mientras llevan a cabo su investigación, comparando los contextos biomédico y nanomédico: i) búsqueda en la Web de fuentes de datos y recursos computacionales que den soporte a su investigación; ii) búsqueda en la literatura científica de resultados experimentales y publicaciones relacionadas con su investigación; iii) búsqueda en registros de ensayos clínicos de resultados clínicos relacionados con su investigación. El desarrollo de estas actividades requiere el uso de herramientas y servicios informáticos, como exploradores Web, bases de datos de referencias bibliográficas indexando la literatura biomédica y registros online de ensayos clínicos, respectivamente. Para cada escenario, este documento proporciona un análisis detallado de los posibles obstáculos que pueden dificultar el desarrollo y el resultado de las diferentes tareas de investigación en cada uno de los dos campos citados (biomedicina y nanomedicina), poniendo especial énfasis en los retos existentes en la investigación nanomédica, campo en el que se han detectado las mayores dificultades. El autor ilustra cómo la aplicación de metodologías provenientes de la informática biomédica a estos escenarios resulta efectiva en el dominio biomédico, mientras que dichas metodologías presentan serias limitaciones cuando son aplicadas al contexto nanomédico. Para abordar dichas limitaciones, el autor propone un enfoque nanoinformático, original, diseñado específicamente para tratar con las características especiales que la información presenta a nivel nano. El enfoque consiste en un análisis en profundidad de la literatura científica y de los registros de ensayos clínicos disponibles para extraer información relevante sobre experimentos y resultados en nanomedicina —patrones textuales, vocabulario en común, descriptores de experimentos, parámetros de caracterización, etc.—, seguido del desarrollo de mecanismos para estructurar y analizar dicha información automáticamente. Este análisis concluye con la generación de un modelo de datos de referencia (gold standard) —un conjunto de datos de entrenamiento y de test anotados manualmente—, el cual ha sido aplicado a la clasificación de registros de ensayos clínicos, permitiendo distinguir automáticamente los estudios centrados en nanodrogas y nanodispositivos de aquellos enfocados a testear productos farmacéuticos tradicionales. El presente trabajo pretende proporcionar los métodos necesarios para organizar, depurar, filtrar y validar parte de los datos nanomédicos existentes en la actualidad a una escala adecuada para la toma de decisiones. Análisis similares para otras tareas de investigación en nanomedicina ayudarían a detectar qué recursos nanoinformáticos se requieren para cumplir los objetivos actuales en el área, así como a generar conjunto de datos de referencia, estructurados y densos en información, a partir de literatura y otros fuentes no estructuradas para poder aplicar nuevos algoritmos e inferir nueva información de valor para la investigación en nanomedicina. ABSTRACT Nanotechnology is a research area of recent development that deals with the manipulation and control of matter with dimensions ranging from 1 to 100 nanometers. At the nanoscale, materials exhibit singular physical, chemical and biological phenomena, very different from those manifested at the conventional scale. In medicine, nanosized compounds and nanostructured materials offer improved drug targeting and efficacy with respect to traditional formulations, and reveal novel diagnostic and therapeutic properties. Nevertheless, the complexity of information at the nano level is much higher than the complexity at the conventional biological levels (from populations to the cell). Thus, any nanomedical research workflow inherently demands advanced information management. Unfortunately, Biomedical Informatics (BMI) has not yet provided the necessary framework to deal with such information challenges, nor adapted its methods and tools to the new research field. In this context, the novel area of nanoinformatics aims to build new bridges between medicine, nanotechnology and informatics, allowing the application of computational methods to solve informational issues at the wide intersection between biomedicine and nanotechnology. The above observations determine the context of this doctoral dissertation, which is focused on analyzing the nanomedical domain in-depth, and developing nanoinformatics strategies and tools to map across disciplines, data sources, computational resources, and information extraction and text mining techniques, for leveraging available nanomedical data. The author analyzes, through real-life case studies, some research tasks in nanomedicine that would require or could benefit from the use of nanoinformatics methods and tools, illustrating present drawbacks and limitations of BMI approaches to deal with data belonging to the nanomedical domain. Three different scenarios, comparing both the biomedical and nanomedical contexts, are discussed as examples of activities that researchers would perform while conducting their research: i) searching over the Web for data sources and computational resources supporting their research; ii) searching the literature for experimental results and publications related to their research, and iii) searching clinical trial registries for clinical results related to their research. The development of these activities will depend on the use of informatics tools and services, such as web browsers, databases of citations and abstracts indexing the biomedical literature, and web-based clinical trial registries, respectively. For each scenario, this document provides a detailed analysis of the potential information barriers that could hamper the successful development of the different research tasks in both fields (biomedicine and nanomedicine), emphasizing the existing challenges for nanomedical research —where the major barriers have been found. The author illustrates how the application of BMI methodologies to these scenarios can be proven successful in the biomedical domain, whilst these methodologies present severe limitations when applied to the nanomedical context. To address such limitations, the author proposes an original nanoinformatics approach specifically designed to deal with the special characteristics of information at the nano level. This approach consists of an in-depth analysis of the scientific literature and available clinical trial registries to extract relevant information about experiments and results in nanomedicine —textual patterns, common vocabulary, experiment descriptors, characterization parameters, etc.—, followed by the development of mechanisms to automatically structure and analyze this information. This analysis resulted in the generation of a gold standard —a manually annotated training or reference set—, which was applied to the automatic classification of clinical trial summaries, distinguishing studies focused on nanodrugs and nanodevices from those aimed at testing traditional pharmaceuticals. The present work aims to provide the necessary methods for organizing, curating and validating existing nanomedical data on a scale suitable for decision-making. Similar analysis for different nanomedical research tasks would help to detect which nanoinformatics resources are required to meet current goals in the field, as well as to generate densely populated and machine-interpretable reference datasets from the literature and other unstructured sources for further testing novel algorithms and inferring new valuable information for nanomedicine.
Resumo:
In the context of the Semantic Web, natural language descriptions associated with ontologies have proven to be of major importance not only to support ontology developers and adopters, but also to assist in tasks such as ontology mapping, information extraction, or natural language generation. In the state-of-the-art we find some attempts to provide guidelines for URI local names in English, and also some disagreement on the use of URIs for describing ontology elements. When trying to extrapolate these ideas to a multilingual scenario, some of these approaches fail to provide a valid solution. On the basis of some real experiences in the translation of ontologies from English into Spanish, we provide a preliminary set of guidelines for naming and labeling ontologies in a multilingual scenario.
Resumo:
This paper introduces a semantic language developed with the objective to be used in a semantic analyzer based on linguistic and world knowledge. Linguistic knowledge is provided by a Combinatorial Dictionary and several sets of rules. Extra-linguistic information is stored in an Ontology. The meaning of the text is represented by means of a series of RDF-type triples of the form predicate (subject, object). Semantic analyzer is one of the options of the multifunctional ETAP-3 linguistic processor. The analyzer can be used for Information Extraction and Question Answering. We describe semantic representation of expressions that provide an assessment of the number of objects involved and/or give a quantitative evaluation of different types of attributes. We focus on the following aspects: 1) parametric and non-parametric attributes; 2) gradable and non-gradable attributes; 3) ontological representation of different classes of attributes; 4) absolute and relative quantitative assessment; 5) punctual and interval quantitative assessment; 6) intervals with precise and fuzzy boundaries
Resumo:
El trabajo se enmarca dentro de los proyecto INTEGRATE y EURECA, cuyo objetivo es el desarrollo de una capa de interoperabilidad semántica que permita la integración de datos e investigación clínica, proporcionando una plataforma común que pueda ser integrada en diferentes instituciones clínicas y que facilite el intercambio de información entre las mismas. De esta manera se promueve la mejora de la práctica clínica a través de la cooperación entre instituciones de investigación con objetivos comunes. En los proyectos se hace uso de estándares y vocabularios clínicos ya existentes, como pueden ser HL7 o SNOMED, adaptándolos a las necesidades particulares de los datos con los que se trabaja en INTEGRATE y EURECA. Los datos clínicos se representan de manera que cada concepto utilizado sea único, evitando ambigüedades y apoyando la idea de plataforma común. El alumno ha formado parte de un equipo de trabajo perteneciente al Grupo de Informática de la UPM, que a su vez trabaja como uno de los socios de los proyectos europeos nombrados anteriormente. La herramienta desarrollada, tiene como objetivo realizar tareas de homogenización de la información almacenada en las bases de datos de los proyectos haciendo uso de los mecanismos de normalización proporcionados por el vocabulario médico SNOMED-CT. Las bases de datos normalizadas serán las utilizadas para llevar a cabo consultas por medio de servicios proporcionados en la capa de interoperabilidad, ya que contendrán información más precisa y completa que las bases de datos sin normalizar. El trabajo ha sido realizado entre el día 12 de Septiembre del año 2014, donde comienza la etapa de formación y recopilación de información, y el día 5 de Enero del año 2015, en el cuál se termina la redacción de la memoria. El ciclo de vida utilizado ha sido el de desarrollo en cascada, en el que las tareas no comienzan hasta que la etapa inmediatamente anterior haya sido finalizada y validada. Sin embargo, no todas las tareas han seguido este modelo, ya que la realización de la memoria del trabajo se ha llevado a cabo de manera paralela con el resto de tareas. El número total de horas dedicadas al Trabajo de Fin de Grado es 324. Las tareas realizadas y el tiempo de dedicación de cada una de ellas se detallan a continuación: Formación. Etapa de recopilación de información necesaria para implementar la herramienta y estudio de la misma [30 horas. Especificación de requisitos. Se documentan los diferentes requisitos que ha de cumplir la herramienta [20 horas]. Diseño. En esta etapa se toman las decisiones de diseño de la herramienta [35 horas]. Implementación. Desarrollo del código de la herramienta [80 horas]. Pruebas. Etapa de validación de la herramienta, tanto de manera independiente como integrada en los proyectos INTEGRATE y EURECA [70 horas]. Depuración. Corrección de errores e introducción de mejoras de la herramienta [45 horas]. Realización de la memoria. Redacción de la memoria final del trabajo [44 horas].---ABSTRACT---This project belongs to the semantic interoperability layer developed in the European projects INTEGRATE and EURECA, which aims to provide a platform to promote interchange of medical information from clinical trials to clinical institutions. Thus, research institutions may cooperate to enhance clinical practice. Different health standards and clinical terminologies has been used in both INTEGRATE and EURECA projects, e.g. HL7 or SNOMED-CT. These tools have been adapted to the projects data requirements. Clinical data are represented by unique concepts, avoiding ambiguity problems. The student has been working in the Biomedical Informatics Group from UPM, partner of the INTEGRATE and EURECA projects. The tool developed aims to perform homogenization tasks over information stored in databases of the project, through normalized representation provided by the SNOMED-CT terminology. The data query is executed against the normalized version of the databases, since the information retrieved will be more informative than non-normalized databases. The project has been performed from September 12th of 2014, when initiation stage began, to January 5th of 2015, when the final report was finished. The waterfall model for software development was followed during the working process. Therefore, a phase may not start before the previous one finishes and has been validated, except from the final report redaction, which has been carried out in parallel with the others phases. The tasks that have been developed and time for each one are detailed as follows: Training. Gathering the necessary information to develop the tool [30 hours]. Software requirement specification. Requirements the tool must accomplish [20 hours]. Design. Decisions on the design of the tool [35 hours]. Implementation. Tool development [80 hours]. Testing. Tool evaluation within the framework of the INTEGRATE and EURECA projects [70 hours]. Debugging. Improve efficiency and correct errors [45 hours]. Documenting. Final report elaboration [44 hours].
Resumo:
The scientific method is a methodological approach to the process of inquiry { in which empirically grounded theory of nature is constructed and verified [14]. It is a hard, exhaustive and dedicated multi-stage procedure that a researcher must perform to achieve valuable knowledge. Trying to help researchers during this process, a recommender system, intended as a researcher assistant, is designed to provide them useful tools and information for each stage of the procedure. A new similarity measure between research objects and a representational model, based on domain spaces, to handle them in dif ferent levels are created as well as a system to build them from OAI-PMH (and RSS) resources. It tries to represents a sound balance between scientific insight into individual scientific creative processes and technical implementation using innovative technologies in information extraction, document summarization and semantic analysis at a large scale.
Resumo:
La tesis que se presenta tiene como propósito la construcción automática de ontologías a partir de textos, enmarcándose en el área denominada Ontology Learning. Esta disciplina tiene como objetivo automatizar la elaboración de modelos de dominio a partir de fuentes información estructurada o no estructurada, y tuvo su origen con el comienzo del milenio, a raíz del crecimiento exponencial del volumen de información accesible en Internet. Debido a que la mayoría de información se presenta en la web en forma de texto, el aprendizaje automático de ontologías se ha centrado en el análisis de este tipo de fuente, nutriéndose a lo largo de los años de técnicas muy diversas provenientes de áreas como la Recuperación de Información, Extracción de Información, Sumarización y, en general, de áreas relacionadas con el procesamiento del lenguaje natural. La principal contribución de esta tesis consiste en que, a diferencia de la mayoría de las técnicas actuales, el método que se propone no analiza la estructura sintáctica superficial del lenguaje, sino que estudia su nivel semántico profundo. Su objetivo, por tanto, es tratar de deducir el modelo del dominio a partir de la forma con la que se articulan los significados de las oraciones en lenguaje natural. Debido a que el nivel semántico profundo es independiente de la lengua, el método permitirá operar en escenarios multilingües, en los que es necesario combinar información proveniente de textos en diferentes idiomas. Para acceder a este nivel del lenguaje, el método utiliza el modelo de las interlinguas. Estos formalismos, provenientes del área de la traducción automática, permiten representar el significado de las oraciones de forma independiente de la lengua. Se utilizará en concreto UNL (Universal Networking Language), considerado como la única interlingua de propósito general que está normalizada. La aproximación utilizada en esta tesis supone la continuación de trabajos previos realizados tanto por su autor como por el equipo de investigación del que forma parte, en los que se estudió cómo utilizar el modelo de las interlinguas en las áreas de extracción y recuperación de información multilingüe. Básicamente, el procedimiento definido en el método trata de identificar, en la representación UNL de los textos, ciertas regularidades que permiten deducir las piezas de la ontología del dominio. Debido a que UNL es un formalismo basado en redes semánticas, estas regularidades se presentan en forma de grafos, generalizándose en estructuras denominadas patrones lingüísticos. Por otra parte, UNL aún conserva ciertos mecanismos de cohesión del discurso procedentes de los lenguajes naturales, como el fenómeno de la anáfora. Con el fin de aumentar la efectividad en la comprensión de las expresiones, el método provee, como otra contribución relevante, la definición de un algoritmo para la resolución de la anáfora pronominal circunscrita al modelo de la interlingua, limitada al caso de pronombres personales de tercera persona cuando su antecedente es un nombre propio. El método propuesto se sustenta en la definición de un marco formal, que ha debido elaborarse adaptando ciertas definiciones provenientes de la teoría de grafos e incorporando otras nuevas, con el objetivo de ubicar las nociones de expresión UNL, patrón lingüístico y las operaciones de encaje de patrones, que son la base de los procesos del método. Tanto el marco formal como todos los procesos que define el método se han implementado con el fin de realizar la experimentación, aplicándose sobre un artículo de la colección EOLSS “Encyclopedia of Life Support Systems” de la UNESCO. ABSTRACT The purpose of this thesis is the automatic construction of ontologies from texts. This thesis is set within the area of Ontology Learning. This discipline aims to automatize domain models from structured or unstructured information sources, and had its origin with the beginning of the millennium, as a result of the exponential growth in the volume of information accessible on the Internet. Since most information is presented on the web in the form of text, the automatic ontology learning is focused on the analysis of this type of source, nourished over the years by very different techniques from areas such as Information Retrieval, Information Extraction, Summarization and, in general, by areas related to natural language processing. The main contribution of this thesis consists of, in contrast with the majority of current techniques, the fact that the method proposed does not analyze the syntactic surface structure of the language, but explores his deep semantic level. Its objective, therefore, is trying to infer the domain model from the way the meanings of the sentences are articulated in natural language. Since the deep semantic level does not depend on the language, the method will allow to operate in multilingual scenarios, where it is necessary to combine information from texts in different languages. To access to this level of the language, the method uses the interlingua model. These formalisms, coming from the area of machine translation, allow to represent the meaning of the sentences independently of the language. In this particular case, UNL (Universal Networking Language) will be used, which considered to be the only interlingua of general purpose that is standardized. The approach used in this thesis corresponds to the continuation of previous works carried out both by the author of this thesis and by the research group of which he is part, in which it is studied how to use the interlingua model in the areas of multilingual information extraction and retrieval. Basically, the procedure defined in the method tries to identify certain regularities at the UNL representation of texts that allow the deduction of the parts of the ontology of the domain. Since UNL is a formalism based on semantic networks, these regularities are presented in the form of graphs, generalizing in structures called linguistic patterns. On the other hand, UNL still preserves certain mechanisms of discourse cohesion from natural languages, such as the phenomenon of the anaphora. In order to increase the effectiveness in the understanding of expressions, the method provides, as another significant contribution, the definition of an algorithm for the resolution of pronominal anaphora limited to the model of the interlingua, in the case of third person personal pronouns when its antecedent is a proper noun. The proposed method is based on the definition of a formal framework, adapting some definitions from Graph Theory and incorporating new ones, in order to locate the notions of UNL expression and linguistic pattern, as well as the operations of pattern matching, which are the basis of the method processes. Both the formal framework and all the processes that define the method have been implemented in order to carry out the experimentation, applying on an article of the "Encyclopedia of Life Support Systems" of the UNESCO-EOLSS collection.
Resumo:
The focus of this chapter is to study feature extraction and pattern classification methods from two medical areas, Stabilometry and Electroencephalography (EEG). Stabilometry is the branch of medicine responsible for examining balance in human beings. Balance and dizziness disorders are probably two of the most common illnesses that physicians have to deal with. In Stabilometry, the key nuggets of information in a time series signal are concentrated within definite time periods are known as events. In this chapter, two feature extraction schemes have been developed to identify and characterise the events in Stabilometry and EEG signals. Based on these extracted features, an Adaptive Fuzzy Inference Neural network has been applied for classification of Stabilometry and EEG signals.
Resumo:
A particle accelerator is any device that, using electromagnetic fields, is able to communicate energy to charged particles (typically electrons or ionized atoms), accelerating and/or energizing them up to the required level for its purpose. The applications of particle accelerators are countless, beginning in a common TV CRT, passing through medical X-ray devices, and ending in large ion colliders utilized to find the smallest details of the matter. Among the other engineering applications, the ion implantation devices to obtain better semiconductors and materials of amazing properties are included. Materials supporting irradiation for future nuclear fusion plants are also benefited from particle accelerators. There are many devices in a particle accelerator required for its correct operation. The most important are the particle sources, the guiding, focalizing and correcting magnets, the radiofrequency accelerating cavities, the fast deflection devices, the beam diagnostic mechanisms and the particle detectors. Most of the fast particle deflection devices have been built historically by using copper coils and ferrite cores which could effectuate a relatively fast magnetic deflection, but needed large voltages and currents to counteract the high coil inductance in a response in the microseconds range. Various beam stability considerations and the new range of energies and sizes of present time accelerators and their rings require new devices featuring an improved wakefield behaviour and faster response (in the nanoseconds range). This can only be achieved by an electromagnetic deflection device based on a transmission line. The electromagnetic deflection device (strip-line kicker) produces a transverse displacement on the particle beam travelling close to the speed of light, in order to extract the particles to another experiment or to inject them into a different accelerator. The deflection is carried out by the means of two short, opposite phase pulses. The diversion of the particles is exerted by the integrated Lorentz force of the electromagnetic field travelling along the kicker. This Thesis deals with a detailed calculation, manufacturing and test methodology for strip-line kicker devices. The methodology is then applied to two real cases which are fully designed, built, tested and finally installed in the CTF3 accelerator facility at CERN (Geneva). Analytical and numerical calculations, both in 2D and 3D, are detailed starting from the basic specifications in order to obtain a conceptual design. Time domain and frequency domain calculations are developed in the process using different FDM and FEM codes. The following concepts among others are analyzed: scattering parameters, resonating high order modes, the wakefields, etc. Several contributions are presented in the calculation process dealing specifically with strip-line kicker devices fed by electromagnetic pulses. Materials and components typically used for the fabrication of these devices are analyzed in the manufacturing section. Mechanical supports and connexions of electrodes are also detailed, presenting some interesting contributions on these concepts. The electromagnetic and vacuum tests are then analyzed. These tests are required to ensure that the manufactured devices fulfil the specifications. Finally, and only from the analytical point of view, the strip-line kickers are studied together with a pulsed power supply based on solid state power switches (MOSFETs). The solid state technology applied to pulsed power supplies is introduced and several circuit topologies are modelled and simulated to obtain fast and good flat-top pulses.
Resumo:
The electroencephalograph (EEG) signal is one of the most widely used signals in the biomedicine field due to its rich information about human tasks. This research study describes a new approach based on i) build reference models from a set of time series, based on the analysis of the events that they contain, is suitable for domains where the relevant information is concentrated in specific regions of the time series, known as events. In order to deal with events, each event is characterized by a set of attributes. ii) Discrete wavelet transform to the EEG data in order to extract temporal information in the form of changes in the frequency domain over time- that is they are able to extract non-stationary signals embedded in the noisy background of the human brain. The performance of the model was evaluated in terms of training performance and classification accuracies and the results confirmed that the proposed scheme has potential in classifying the EEG signals.
Resumo:
Managing large medical image collections is an increasingly demanding important issue in many hospitals and other medical settings. A huge amount of this information is daily generated, which requires robust and agile systems. In this paper we present a distributed multi-agent system capable of managing very large medical image datasets. In this approach, agents extract low-level information from images and store them in a data structure implemented in a relational database. The data structure can also store semantic information related to images and particular regions. A distinctive aspect of our work is that a single image can be divided so that the resultant sub-images can be stored and managed separately by different agents to improve performance in data accessing and processing. The system also offers the possibility of applying some region-based operations and filters on images, facilitating image classification. These operations can be performed directly on data structures in the database.
Resumo:
Folksonomies emerge as the result of the free tagging activity of a large number of users over a variety of resources. They can be considered as valuable sources from which it is possible to obtain emerging vocabularies that can be leveraged in knowledge extraction tasks. However, when it comes to understanding the meaning of tags in folksonomies, several problems mainly related to the appearance of synonymous and ambiguous tags arise, specifically in the context of multilinguality. The authors aim to turn folksonomies into knowledge structures where tag meanings are identified, and relations between them are asserted. For such purpose, they use DBpedia as a general knowledge base from which they leverage its multilingual capabilities.
Resumo:
In the spinal cord of the anesthetized cat, spontaneous cord dorsum potentials (CDPs) appear synchronously along the lumbo-sacral segments. These CDPs have different shapes and magnitudes. Previous work has indicated that some CDPs appear to be specially associated with the activation of spinal pathways that lead to primary afferent depolarization and presynaptic inhibition. Visual detection and classification of these CDPs provides relevant information on the functional organization of the neural networks involved in the control of sensory information and allows the characterization of the changes produced by acute nerve and spinal lesions. We now present a novel feature extraction approach for signal classification, applied to CDP detection. The method is based on an intuitive procedure. We first remove by convolution the noise from the CDPs recorded in each given spinal segment. Then, we assign a coefficient for each main local maximum of the signal using its amplitude and distance to the most important maximum of the signal. These coefficients will be the input for the subsequent classification algorithm. In particular, we employ gradient boosting classification trees. This combination of approaches allows a faster and more accurate discrimination of CDPs than is obtained by other methods.
Resumo:
Background. Over the last years, the number of available informatics resources in medicine has grown exponentially. While specific inventories of such resources have already begun to be developed for Bioinformatics (BI), comparable inventories are as yet not available for Medical Informatics (MI) field, so that locating and accessing them currently remains a hard and time-consuming task. Description. We have created a repository of MI resources from the scientific literature, providing free access to its contents through a web-based service. Relevant information describing the resources is automatically extracted from manuscripts published in top-ranked MI journals. We used a pattern matching approach to detect the resources? names and their main features. Detected resources are classified according to three different criteria: functionality, resource type and domain. To facilitate these tasks, we have built three different taxonomies by following a novel approach based on folksonomies and social tagging. We adopted the terminology most frequently used by MI researchers in their publications to create the concepts and hierarchical relationships belonging to the taxonomies. The classification algorithm identifies the categories associated to resources and annotates them accordingly. The database is then populated with this data after manual curation and validation. Conclusions. We have created an online repository of MI resources to assist researchers in locating and accessing the most suitable resources to perform specific tasks. The database contained 282 resources at the time of writing. We are continuing to expand the number of available resources by taking into account further publications as well as suggestions from users and resource developers.