969 results for Automatic classification


Relevance:

60.00% 60.00%

Publisher:

Abstract:

The manifestation of pathogens in plantations is the largest cause of damage in several cultivars, which may lead to higher prices and loss of crop quality. This paper presents a method for automatic classification of cotton diseases through feature extraction of leaf symptoms from digital images. Wavelet transform energy has been used for feature extraction, while a Support Vector Machine has been used for classification. Five situations have been diagnosed, namely: healthy crop, Ramularia disease, Bacterial Blight, Ascochyta Blight, and unspecified disease. © 2012 Taylor & Francis Group.
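
The wavelet-energy idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say which wavelet or how many decomposition levels were used, so the sketch uses the Haar wavelet on a single 1D row of pixel intensities (`row` is made-up data) rather than a full 2D image transform.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail

def wavelet_energy_features(signal, levels=3):
    """Energy (sum of squares) of each detail band, then of the final approximation."""
    features = []
    current = list(signal)
    for _ in range(levels):
        if len(current) < 2:
            break
        current, detail = haar_step(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features

# Hypothetical row of pixel intensities from a leaf image (illustrative values only).
row = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
features = wavelet_energy_features(row, levels=3)
print(features)
```

A feature vector of this kind would then be passed to a classifier; the paper names a Support Vector Machine, which is not reproduced here.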

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Graduate Program in Computer Science - IBILCE

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This work addresses the automatic classification of short-circuit faults in transmission lines. Most transmission systems have three phases (A, B and C). For example, a short circuit between phases A and B can be identified as an "AB" fault. Considering the possibility of a short circuit to ground (T), the task throughout this work is to classify a time series into one of 11 possible faults: AT, BT, CT, AB, AC, BC, ABC, ABT, ACT, BCT, ABCT. These faults are responsible for most disturbances in the electrical system. Each short circuit is represented by a sequence (time series), and both types of classification are investigated: on-line (for each short segment extracted from the signal) and off-line (which takes the whole sequence into account). To overcome the current lack of labeled data, the Alternative Transient Program (ATP) simulator is used to create a labeled database, which is made publicly available. Some works in the literature do not distinguish between ABC and ABCT faults. Results that distinguish these two fault types are therefore presented, using preprocessing techniques, different front ends (for example, wavelets) and learning algorithms (decision trees and neural networks). The computational cost estimated during the test stage of some classifiers is investigated, and the classifier parameters are chosen through automatic model selection. The results indicate that decision trees and neural networks perform better than the other classifiers.
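
The count of 11 fault classes follows from a simple rule: every non-empty set of phases may be shorted to ground, and any set of two or more phases may also be shorted without ground. The sketch below only enumerates the labels listed in the abstract; the function name `fault_labels` is illustrative and not taken from the work.

```python
from itertools import combinations

PHASES = "ABC"  # the three phases; "T" marks a fault involving ground

def fault_labels():
    """Enumerate short-circuit fault labels for a three-phase line."""
    labels = []
    for size in range(1, len(PHASES) + 1):
        for combo in combinations(PHASES, size):
            name = "".join(combo)
            if size >= 2:
                labels.append(name)    # phase-to-phase fault, no ground
            labels.append(name + "T")  # the same phases shorted to ground
    return labels

labels = fault_labels()
print(len(labels), labels)
```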

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Statistical modelling and statistical learning theory are two powerful analytical frameworks for analyzing signals and developing efficient processing and classification algorithms. In this thesis, these frameworks are applied to modelling and processing biomedical signals in two different contexts: ultrasound medical imaging systems, and primate neural activity analysis and modelling. In the context of ultrasound medical imaging, two main applications are explored: deconvolution of signals measured from an ultrasonic transducer, and automatic image segmentation and classification of prostate ultrasound scans. In the former application, a stochastic model of the radio frequency signal measured from an ultrasonic transducer is derived. This model is then used to develop, in a statistical framework, a regularized deconvolution procedure for enhancing signal resolution. In the latter application, different statistical models are used to characterize images of prostate tissues, extracting different features. These features are then used to segment the images into regions of interest by means of an automatic procedure based on a statistical model of the extracted features. Finally, machine learning techniques are used for automatic classification of the different regions of interest. In the context of neural activity signals, an example of a bio-inspired dynamical network was developed to help in studies of motor-related processes in the brain of primate monkeys. The presented model aims to mimic the abstract functionality of a cell population in the 7a parietal region of primate monkeys during the execution of learned behavioural tasks.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

During the previous 10 years, global R&D expenditure in the pharmaceuticals and biotechnology sector has steadily increased, without a corresponding increase in output of new medicines. To address this situation, the biopharmaceutical industry's greatest need is to predict failures at the earliest possible stage of the drug development process. A major key to reducing failures in drug screening is the development and use of preclinical models that are more predictive of efficacy and safety in clinical trials. Further, relevant animal models are needed to allow wider testing of novel hypotheses. Key to this is developing, refining, and validating complex animal models that directly link therapeutic targets to the phenotype of disease, allowing earlier prediction of human response to medicines and identification of safety biomarkers. Moreover, well-designed animal studies are essential to bridge the gap between tests in cell cultures and in people. Zebrafish is emerging, complementary to other models, as a powerful system for cancer studies and drug discovery. We aim to investigate this research area by designing a new preclinical cancer model based on in vivo imaging of zebrafish embryogenesis. Technological advances in imaging have made it feasible to acquire nondestructive in vivo images of fluorescently labeled structures, such as cell nuclei and membranes, throughout early zebrafish embryogenesis. This in vivo image-based investigation provides measurements of a large number of features at the cellular level and of events including nuclei movements, cell counts, and mitosis detection, thereby enabling the estimation of more significant parameters such as proliferation rate, which is highly relevant for investigating anticancer drug effects. In this work, we designed a standardized procedure for assessing drug activity at the cellular level in live zebrafish embryos.
The procedure includes methodologies and tools that combine imaging and fully automated measurements of embryonic cell proliferation rate. We achieved proliferation rate estimation through the automatic classification and density measurement of epithelial enveloping layer and deep layer cells. Automatic classification of embryonic cells provides the basis for measuring the variability of relevant parameters, such as cell density, in different classes of cells and is aimed at estimating the efficacy and selectivity of anticancer drugs. Through these methodologies we were able to evaluate and measure in vivo the therapeutic potential and overall toxicity of the anticancer molecules Dbait and Irinotecan. Results achieved on these anticancer molecules are presented and discussed; furthermore, extensive accuracy measurements are provided to investigate the robustness of the proposed procedure. Altogether, these observations indicate that the zebrafish embryo can be a useful and cost-effective alternative to some mammalian models for the preclinical testing of anticancer drugs, and it might also provide, in the near future, opportunities to accelerate the process of drug discovery.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Detection of arrhythmic atrial beats in surface ECGs can be challenging when they are masked by the R or T wave, or do not affect the RR-interval. Here, we present a solution using a high-resolution esophageal long-term ECG that offers a detailed view on the atrial electrical activity. The recorded ECG shows atrial ectopic beats with long coupling intervals, which can only be successfully classified using additional morphology criteria. Esophageal high-resolution ECGs provide this information, whereas surface long-term ECGs show poor atrial signal quality. This new method is a promising tool for the long-term rhythm monitoring with software-based automatic classification of atrial beats.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

In recent years significant efforts have been devoted to the development of advanced data analysis tools, both to predict the occurrence of disruptions and to investigate the operational spaces of devices, with the long-term goals of advancing the understanding of the physics of these events and preparing for ITER. On JET the latest generation of the disruption predictor called APODIS has been deployed in the real-time network during the last campaigns with the new metallic wall. Even though it was trained only with discharges with the carbon wall, it has reached very good performance, with both missed alarms and false alarms on the order of a few percent (and strategies to improve the performance have already been identified). Since predicting the type of disruption is also considered very important for optimising the mitigation measures, a new clustering method, based on the geodesic distance on a probabilistic manifold, has been developed. This technique allows automatic classification of an incoming disruption with a success rate better than 85%. Various other manifold learning tools, particularly Principal Component Analysis and Self Organised Maps, are also producing very interesting results in the comparative analysis of JET and ASDEX Upgrade (AUG) operational spaces, on the route to developing predictors capable of extrapolating from one device to another.
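
For readers unfamiliar with the two error figures quoted above, the sketch below shows one common way to compute them. It is an illustration under assumptions, not the APODIS evaluation: it assumes the missed-alarm rate is taken over disruptive discharges and the false-alarm rate over non-disruptive ones, and the discharge records are made-up toy data.

```python
def alarm_rates(is_disruptive, alarm_raised):
    """Return (missed_alarm_rate, false_alarm_rate) over paired discharge records."""
    pairs = list(zip(is_disruptive, alarm_raised))
    missed = sum(1 for d, a in pairs if d and not a)      # disruption with no alarm
    false_alarms = sum(1 for d, a in pairs if not d and a)  # alarm with no disruption
    n_disruptive = sum(1 for d, _ in pairs if d)
    n_safe = len(pairs) - n_disruptive
    missed_rate = missed / n_disruptive if n_disruptive else 0.0
    false_rate = false_alarms / n_safe if n_safe else 0.0
    return missed_rate, false_rate

# Toy records (hypothetical): 4 disruptive and 6 non-disruptive discharges.
is_disruptive = [True, True, True, True, False, False, False, False, False, False]
alarm_raised = [True, True, True, False, False, False, True, False, False, False]
missed_rate, false_rate = alarm_rates(is_disruptive, alarm_raised)
print(missed_rate, false_rate)
```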

Relevance:

60.00% 60.00%

Publisher:

Abstract:

In this work we investigated whether there is a relationship between the dominant behaviour of dialogue participants and their verbal intelligence. The analysis is based on a corpus containing 56 dialogues and the verbal intelligence scores of the test persons. All the dialogues were divided into three groups: H-H is a group of dialogues between higher verbal intelligence participants, L-L is a group of dialogues between lower verbal intelligence participants, and L-H is a group of all the other dialogues. The dominance scores of the dialogue partners from each group were analysed. The analysis showed that differences between dominance scores and verbal intelligence coefficients for L-L were positively correlated. Verbal intelligence scores of the test persons were compared to other features that may reflect dominant behaviour. The analysis showed that the number of interruptions, long utterances, times grabbed the floor, the influence diffusion model, the number of agreements and several acoustic features may be related to verbal intelligence. These features were used for the automatic classification of the dialogue partners into two groups (lower and higher verbal intelligence participants); the achieved accuracy was 89.36%.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Foods are complex systems, formed by diverse structures at different scales: macroscopic and microscopic. Many food properties that are important for processing, quality and postharvest treatment are related to their microstructure. This doctoral thesis proposes a complete methodology for determining food structure from a multiscale point of view, based on Nuclear Magnetic Resonance (NMR) methods. NMR techniques are non-invasive and non-destructive and allow the study of both macro- and microstructure. Different NMR procedures were used depending on the level to be studied. At the macrostructural level, Magnetic Resonance Imaging (MRI) proved very useful for food characterization. For microstructural study, MRI requires long acquisition times, which makes it very difficult to transfer this technique to industrial applications. Therefore, the optimization of NMR procedures based on 2D T1/T2 relaxometry sequences was a key strategy in this thesis. These NMR protocols were successfully implemented for the first time at high magnetic field. The microstructure of whole food products was characterized for the first time using this type of protocol. Two types of products were used as samples: food models and real foods (apples). In addition, as a first step toward later implementation in the agri-food industry, a conveyor line specially designed to work under NMR conditions in previous work by the LPF-TAGRALIA group was improved. The fastest and most suitable sequences were studied and selected for detecting two types of internal disorders in apples: watercore and internal breakdown. Motion correction of the images is performed in real time.
Artificial vision protocols were also used for the automatic classification of apples potentially affected by watercore. This document is divided into several chapters: Chapter 2 explains the background of this thesis and the framework of the project in which it was developed. Chapter 3 covers the state of the art. Chapter 4 sets out the objectives of this doctoral thesis. The results are divided into five subsections (within Chapter 5) corresponding to work published in peer-reviewed journals, at international conferences or as peer-reviewed book chapters. Section 5.1 is a study of watercore development in apples using MRI and relates it to the position of the fruit within the tree canopy. Section 5.2 presents a study of macro- and microstructure in food models. Section 5.3 is an article under review at a peer-reviewed journal, presenting a non-destructive microstructural study using 2D T1/T2 relaxometry. Section 5.4 compares watercore-affected apples using two techniques: X-ray tomography and MRI. Finally, Section 5.5 presents a study of on-line MRI sequences for evaluating internal quality in apples. The following chapters offer a discussion and conclusions (Chapters 6 and 7, respectively) covering all chapters of this doctoral thesis. Finally, three appendices have been added: the first gives an introduction to the basic principles of nuclear magnetic resonance (NMR), and the other two present studies on the effect of fiber on the rehydration of extruded breakfast cereals, using various techniques. Both studies were presented at an international conference.
The most relevant results of this doctoral thesis can be divided into three main blocks: results on macrostructure, results on microstructure and results on on-line MRI. Results on macrostructure: - Magnetic resonance imaging (MRI) was successfully applied to macrostructure characterization. In particular, 3D reconstruction of magnetic resonance images made it possible to identify and characterize two distinct types of watercore in apples, central and radial, which are characterized by the percentage of damage and the connectivity (Euler number). - MRI provided better contrast for watercore-affected apples than X-ray computed tomography (X-Ray CT) images, as verified on identical apple samples. In addition, the X-ray tomography acquisition time was around 12 times longer (25 minutes) than that of the magnetic resonance images (2 minutes 2 seconds). Results on microstructure: - 2D T1/T2 relaxometry sequences were used successfully for the study of microstructure (subcellular level). These sequences were used for the first time at high field and on whole food pieces, providing a non-destructive way of carrying out microstructure studies. - The use of MRI together with 2D T1/T2 relaxometry allows non-destructive multiscale studies of food. Results on on-line MRI: - On-line magnetic resonance imaging was feasible for identifying two types of internal disorders in apples: watercore and internal breakdown. FLASH imaging sequences proved suitable for the on-line identification of watercore in apples. Imaging was performed without slice selection, because watercore can develop anywhere in the apple volume.
The acquisition time was reduced so that 1.3 fruits per second could be acquired (758 ms per fruit). UFLARE imaging sequences were suitable for the on-line detection of internal breakdown in apples. In this case slice selection was used, since this disorder is usually located in the central part of the apple volume. The acquisition rate reached 0.67 fruits per second (1475 ms per fruit). In both cases (FLASH and UFLARE), algorithms were needed to correct image motion in real time. ABSTRACT Food is a complex system formed by several structures at different scales: macroscopic and microscopic. Many properties of foods that are relevant to process engineering or quality and postharvest treatments are related to their microstructure. This Ph.D. thesis proposes a complete multiscale methodology for food structure determination based on the Nuclear Magnetic Resonance (NMR) phenomenon, since NMR techniques are non-invasive and non-destructive and allow the study of both macro- and microstructure. Different NMR procedures are used depending on the structure level under study. For the macrostructure level, Magnetic Resonance Imaging (MRI) proved useful for food characterization. For microstructure insight, MRI required long acquisition times, which is a hindrance to transfer to industrial applications. Therefore, optimization of NMR procedures based on T1/T2 relaxometry sequences was a key strategy in this thesis. These NMR relaxometry protocols were successfully implemented at high magnetic field. The microstructure of entire food products has been characterized for the first time using these protocols. Two different types of food products have been studied: food models and actual food (apples).
Furthermore, as a first step toward implementation in the food industry, a grading line system, specially designed for working under NMR conditions in previous work by the LPF-TAGRALIA group, was improved. The most suitable rapid sequences for detecting two different types of disorders in apples (watercore and internal breakdown) were studied and selected, and real-time image motion correction was applied. In addition, artificial vision protocols for the automatic classification of apples potentially affected by watercore were applied. This document is divided into seven different chapters: Chapter 2 explains the thesis background and the framework of the project in which it was carried out. Chapter 3 comprises the state of the art. Chapter 4 establishes the objectives of this Ph.D. thesis. The results are divided into five different sections (in Chapter 5) that correspond to published peer-reviewed works. Section 5.1 assesses watercore development in apples with MRI and studies the effect of fruit location in the canopy. Section 5.2 is an MRI and 2D relaxometry study for macro- and microstructure assessment in food models. Section 5.3 is a non-destructive microstructural study using 2D T1/T2 relaxometry on watercore-affected apples. Section 5.4 compares X-ray CT and MRI for watercore disorder in different apple cultivars. Section 5.5 is a study of on-line MRI sequences for the evaluation of apple internal quality. The subsequent chapters offer a general discussion and conclusions (Chapter 6 and Chapter 7, respectively) of all the works performed in the frame of this Ph.D. thesis (two peer-reviewed journals, one book chapter and one international congress). Finally, three appendices are included, in which an introduction to NMR principles is offered and two published proceedings regarding the effect of fiber on the rehydration of extruded breakfast cereal are presented.
The most relevant results can be summarized in three sections: results on macrostructure, results on microstructure and results on on-line MRI. Results on macrostructure: - MRI was successfully used for macrostructure characterization. Indeed, 3D reconstruction of MRI in apples allows the identification of two different types of watercore (radial and block), which are characterized by the percentage of damage and the connectivity (Euler number). - MRI provides better contrast for watercore than X-Ray CT, as verified on identical samples. Furthermore, X-Ray CT image acquisition time was around 12 times longer (25 minutes) than MRI acquisition time (2 minutes 2 seconds). Results on microstructure: - 2D T1/T2 relaxometry was successfully applied for microstructure (subcellular level) characterization. 2D T1/T2 relaxometry sequences have been applied for the first time at high field on entire food pieces, providing a non-destructive way to study microstructure. - The use of MRI together with 2D T1/T2 relaxometry sequences allows a non-destructive multiscale study of food. Results on on-line MRI: - The use of on-line MRI was successful for the identification of two different internal disorders in apples: watercore and internal breakdown. FLASH imaging was a suitable technique for the on-line detection of watercore disorder in apples, with no slice selection, since watercore is a physiological disorder that may develop anywhere in the apple volume. 1.3 fruits were imaged per second (768 ms per fruit). UFLARE imaging is a suitable sequence for the on-line detection of internal breakdown disorder in apples. Slice selection was used, as internal breakdown is usually located in the central slice of the apple volume. 0.67 fruits were imaged per second (1475 ms per fruit). In both cases (FLASH and UFLARE) motion correction was performed in real time, during the acquisition of the images.
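
The throughput and acquisition-time figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the English abstract; the function name is illustrative.

```python
def fruits_per_second(ms_per_fruit):
    """Convert a per-fruit acquisition time in milliseconds to a throughput."""
    return 1000.0 / ms_per_fruit

flash_rate = fruits_per_second(768)    # FLASH: 768 ms per fruit
uflare_rate = fruits_per_second(1475)  # UFLARE: 1475 ms per fruit

# X-ray CT (25 minutes) versus MRI (2 minutes 2 seconds), both in seconds.
ct_vs_mri = (25 * 60) / (2 * 60 + 2)

print(flash_rate, uflare_rate, ct_vs_mri)
```

The results are close to the rounded values reported in the abstract (1.3 and 0.67 fruits per second, and "around 12 times").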

Relevance:

60.00% 60.00%

Publisher:

Abstract:

BACKGROUND: Clinical Trials (CTs) are essential for bridging the gap between experimental research on new drugs and their clinical application. Just as CTs for traditional drugs and biologics have helped accelerate the translation of biomedical findings into medical practice, CTs for nanodrugs and nanodevices could advance novel nanomaterials as agents for diagnosis and therapy. Although there is publicly available information about nanomedicine-related CTs, the online archiving of this information is carried out without adhering to criteria that discriminate between studies involving nanomaterials or nanotechnology-based processes (nano), and CTs that do not involve nanotechnology (non-nano). Finding out whether nanodrugs and nanodevices were involved in a study from CT summaries alone is a challenging task. At the time of writing, CTs archived in the well-known online registry ClinicalTrials.gov are not easily told apart as nano or non-nano CTs, even by domain experts, due to the lack of both a common definition of nanotechnology and of standards for reporting nanomedical experiments and results. METHODS: We propose a supervised learning approach for classifying CT summaries from ClinicalTrials.gov according to whether they fall into the nano or the non-nano categories. Our method involves several stages: i) extraction and manual annotation of CTs as nano vs. non-nano, ii) pre-processing and automatic classification, and iii) performance evaluation using several state-of-the-art classifiers under different transformations of the original dataset. RESULTS AND CONCLUSIONS: The performance of the best automated classifier closely matches that of experts (AUC over 0.95), suggesting that it is feasible to automatically detect the presence of nanotechnology products in CT summaries with a high degree of accuracy.
This can significantly speed up the process of finding whether reports on ClinicalTrials.gov might be relevant to a particular nanoparticle or nanodevice, which is essential to discover any precedents for nanotoxicity events or advantages for targeted drug therapy.
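
The AUC figure reported above is the area under the ROC curve, which equals the probability that a randomly chosen positive example (here, a nano CT) receives a higher classifier score than a randomly chosen negative one. A minimal sketch of that pairwise definition follows; the scores and labels are made-up toy values, not data from the study.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy classifier scores for six summaries; label 1 = nano, 0 = non-nano (hypothetical).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
value = auc(scores, labels)
print(value)
```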

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This project presents software for the analysis of dermoscopic images of melanocytic lesions, with the aim of classifying them as benign lesions or melanoma. The system performs automatic segmentation of the lesion and processes it in several stages, extracting features of diagnostic relevance: asymmetry, colors, border irregularity, and the presence of structures such as atypical pigmented networks or blue-whitish veil. It also provides a tool for the manual labeling of additional structures. Automatic classification of the lesions is based on the most commonly used diagnostic methods: the ABCD and Menzies rules, the 7-point checklist, CASH, and CHAOS & CLUES. The classification system is evaluated on a database of dermoscopic images, and the results obtained with each diagnostic method are compared. ABSTRACT. This project presents software for the analysis of dermoscopic images of melanocytic lesions, and their classification into benign lesions and melanoma. The system performs automatic segmentation of the lesion and goes through several stages of extraction of certain characteristics relevant to the diagnosis, such as asymmetry, border irregularity, or presence of structures like atypical pigmented network or blue-whitish veil. Automatic classification of the lesions is accomplished by means of the most commonly used diagnostic methods, such as the ABCD and Menzies rules, the 7-point checklist, CASH, and CHAOS & CLUES. The classification system is evaluated using a dermoscopic image database, and a comparison of the results yielded by the different diagnostic methods is performed.
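
As an illustration of how one of the listed rule-based methods turns extracted features into a decision, the sketch below scores a lesion with the ABCD rule of dermoscopy in its commonly cited form (weights 1.3, 0.1, 0.5, 0.5 and cut-offs 4.75 and 5.45). The abstract does not state which weights or thresholds the project uses, so these values, the function names and the example inputs are assumptions for illustration only.

```python
def total_dermoscopy_score(asymmetry, border, colors, structures):
    """ABCD rule: weighted sum of the four sub-scores (commonly cited weights)."""
    if not 0 <= asymmetry <= 2:
        raise ValueError("asymmetry is scored 0-2")
    if not 0 <= border <= 8:
        raise ValueError("border is scored 0-8")
    if not 1 <= colors <= 6:
        raise ValueError("colors is scored 1-6")
    if not 1 <= structures <= 5:
        raise ValueError("structures is scored 1-5")
    return 1.3 * asymmetry + 0.1 * border + 0.5 * colors + 0.5 * structures

def interpret(tds):
    """Map a total dermoscopy score to the usual three-way reading."""
    if tds < 4.75:
        return "benign"
    if tds <= 5.45:
        return "suspicious"
    return "highly suspicious for melanoma"

# Hypothetical sub-scores, as a segmentation and feature-extraction stage might output.
low = total_dermoscopy_score(asymmetry=1, border=2, colors=2, structures=2)
high = total_dermoscopy_score(asymmetry=2, border=5, colors=4, structures=3)
print(low, interpret(low))
print(high, interpret(high))
```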

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Nanotechnology is a recently created research area that deals with the manipulation and control of matter with dimensions between 1 and 100 nanometers. At the nanometric scale, materials exhibit singular physical, chemical and biological phenomena, very different from those they show at the conventional scale. In medicine, compounds miniaturized to the nanoscale and nanostructured materials offer greater efficacy than traditional chemical formulations, as well as improved targeting of the drug toward the therapeutic target, thereby revealing new diagnostic and therapeutic properties. At the same time, the complexity of information at the nano level is much greater than at conventional biological levels (from the population level down to the cell level), and therefore any workflow in nanomedicine inherently requires advanced information management strategies. Unfortunately, biomedical informatics has not yet provided the framework needed to deal with these information challenges at the nano level, nor has it adapted its methods and tools to this new research field. In this context, the new area of nanoinformatics aims to detect and establish the existing links between medicine, nanotechnology and informatics, thus promoting the application of computational methods to solve the questions and problems that arise with information at the broad intersection between biomedicine and nanotechnology.
The observations above set the context for this doctoral thesis, which focuses on analyzing the nanomedicine domain in depth and on developing strategies and tools to establish correspondences between the different disciplines, data sources, computational resources, and techniques for information extraction and text mining, with the ultimate goal of making use of the available nanomedical data. Through real cases, the author analyzes some of the research tasks in nanomedicine that require or can benefit from nanoinformatics methods and tools, thereby illustrating the current drawbacks and limitations of biomedical informatics approaches when dealing with data from the nanomedical domain. Three different scenarios are discussed as examples of activities that researchers carry out during their research, comparing the biomedical and nanomedical contexts: i) searching the Web for data sources and computational resources that support their research; ii) searching the scientific literature for experimental results and publications related to their research; iii) searching clinical trial registries for clinical results related to their research. Carrying out these activities requires the use of informatics tools and services, such as Web browsers, bibliographic reference databases indexing the biomedical literature, and online clinical trial registries, respectively.
For each scenario, this document provides a detailed analysis of the possible obstacles that can hinder the development and outcome of the different research tasks in each of the two fields (biomedicine and nanomedicine), with special emphasis on the challenges in nanomedical research, the field in which the greatest difficulties were found. The author illustrates how applying methodologies from biomedical informatics to these scenarios is effective in the biomedical domain, whereas those methodologies show serious limitations when applied to the nanomedical context. To address these limitations, the author proposes an original nanoinformatics approach designed specifically to deal with the special characteristics of information at the nano level. The approach consists of an in-depth analysis of the scientific literature and the available clinical trial registries to extract relevant information about experiments and results in nanomedicine (textual patterns, common vocabulary, experiment descriptors, characterization parameters, etc.), followed by the development of mechanisms to structure and analyze this information automatically. This analysis concludes with the generation of a reference data model (gold standard), a manually annotated set of training and test data, which was applied to the classification of clinical trial records, making it possible to automatically distinguish studies focused on nanodrugs and nanodevices from those aimed at testing traditional pharmaceutical products. This work aims to provide the methods needed to organize, curate, filter and validate part of the currently existing nanomedical data at a scale suitable for decision-making.
Similar analyses for other research tasks in nanomedicine would help to detect which nanoinformatics resources are required to meet current goals in the area, and to generate structured, information-dense reference datasets from the literature and other unstructured sources, so that new algorithms can be applied and new information of value for nanomedicine research can be inferred. ABSTRACT Nanotechnology is a research area of recent development that deals with the manipulation and control of matter with dimensions ranging from 1 to 100 nanometers. At the nanoscale, materials exhibit singular physical, chemical and biological phenomena, very different from those manifested at the conventional scale. In medicine, nanosized compounds and nanostructured materials offer improved drug targeting and efficacy with respect to traditional formulations, and reveal novel diagnostic and therapeutic properties. Nevertheless, the complexity of information at the nano level is much higher than the complexity at the conventional biological levels (from populations to the cell). Thus, any nanomedical research workflow inherently demands advanced information management. Unfortunately, Biomedical Informatics (BMI) has not yet provided the necessary framework to deal with such information challenges, nor adapted its methods and tools to the new research field. In this context, the novel area of nanoinformatics aims to build new bridges between medicine, nanotechnology and informatics, allowing the application of computational methods to solve informational issues at the wide intersection between biomedicine and nanotechnology.
The above observations determine the context of this doctoral dissertation, which focuses on analyzing the nanomedical domain in depth and on developing nanoinformatics strategies and tools to map across disciplines, data sources, computational resources, and information extraction and text mining techniques, in order to leverage available nanomedical data. The author analyzes, through real-life case studies, some research tasks in nanomedicine that would require or could benefit from the use of nanoinformatics methods and tools, illustrating present drawbacks and limitations of BMI approaches when dealing with data belonging to the nanomedical domain. Three different scenarios, comparing both the biomedical and nanomedical contexts, are discussed as examples of activities that researchers would perform while conducting their research: i) searching the Web for data sources and computational resources supporting their research; ii) searching the literature for experimental results and publications related to their research; and iii) searching clinical trial registries for clinical results related to their research. These activities depend on the use of informatics tools and services, such as web browsers, databases of citations and abstracts indexing the biomedical literature, and web-based clinical trial registries, respectively. For each scenario, this document provides a detailed analysis of the potential information barriers that could hamper the successful development of the different research tasks in both fields (biomedicine and nanomedicine), emphasizing the existing challenges for nanomedical research, where the major barriers have been found. The author illustrates how the application of BMI methodologies to these scenarios proves successful in the biomedical domain, whilst these methodologies present severe limitations when applied to the nanomedical context.
To address such limitations, the author proposes an original nanoinformatics approach specifically designed to deal with the special characteristics of information at the nano level. This approach consists of an in-depth analysis of the scientific literature and available clinical trial registries to extract relevant information about experiments and results in nanomedicine (textual patterns, common vocabulary, experiment descriptors, characterization parameters, etc.), followed by the development of mechanisms to automatically structure and analyze this information. This analysis resulted in the generation of a gold standard (a manually annotated training or reference set), which was applied to the automatic classification of clinical trial summaries, distinguishing studies focused on nanodrugs and nanodevices from those aimed at testing traditional pharmaceuticals. The present work aims to provide the necessary methods for organizing, curating and validating existing nanomedical data on a scale suitable for decision-making. Similar analyses for different nanomedical research tasks would help to detect which nanoinformatics resources are required to meet current goals in the field, as well as to generate densely populated and machine-interpretable reference datasets from the literature and other unstructured sources for testing novel algorithms and inferring new valuable information for nanomedicine.
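
The classification step described above (separating trial summaries about nanodrugs and nanodevices from those about traditional pharmaceuticals, using a manually annotated set) can be illustrated with a small sketch. The abstract does not say which classifier the author used, so the multinomial naive Bayes model below is only an assumed stand-in, and the training summaries, the labels `nano` and `non-nano`, and the names `tokenize`, `NaiveBayes` and `TRAIN` are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a summary and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Assumed stand-in: the source does not name the classifier it used.
    """

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokenize(text):
                if tok in self.vocab:  # skip words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy "gold standard"; real annotated trial records would go here.
TRAIN = [
    ("liposomal doxorubicin nanoparticle formulation in breast cancer", "nano"),
    ("albumin bound paclitaxel nanoparticle for pancreatic cancer", "nano"),
    ("gold nanoshell device for thermal ablation of tumors", "nano"),
    ("oral metformin tablet in type two diabetes", "non-nano"),
    ("aspirin tablet for prevention of cardiovascular events", "non-nano"),
    ("intravenous antibiotic infusion for pneumonia", "non-nano"),
]

model = NaiveBayes().fit([t for t, _ in TRAIN], [l for _, l in TRAIN])
print(model.predict("polymeric nanoparticle formulation of docetaxel"))
```

With only six invented examples this shows the mechanics and nothing more; a realistic evaluation would hold out part of the annotated set for testing, as the abstract's mention of training and test data suggests.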

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Objectives: A recently introduced pragmatic scheme promises to be a useful catalog of interneuron names. We sought to automatically classify digitally reconstructed interneuronal morphologies according to this scheme. Simultaneously, we sought to discover possible subtypes of these types that might emerge during automatic classification (clustering). We also investigated which morphometric properties were most relevant for this classification. Materials and methods: We used a set of 118 digitally reconstructed interneuronal morphologies, classified into the common basket (CB), horse-tail (HT), large basket (LB), and Martinotti (MA) interneuron types by 42 of the world's leading neuroscientists, each quantified by five simple morphometric properties of the axon and four of the dendrites. We labeled each neuron with the type most commonly assigned to it by the experts. We then removed this class information for each type separately, and applied semi-supervised clustering to those cells (keeping the others' cluster membership fixed), to assess separation from other types and look for the formation of new groups (subtypes). We performed this same experiment unlabeling the cells of two types at a time, and of half the cells of a single type at a time. The clustering model is a finite mixture of Gaussians which we adapted for the estimation of local (per-cluster) feature relevance. We performed the described experiments on three different subsets of the data, formed according to how many experts agreed on type membership: at least 18 experts (the full data set), at least 21 (73 neurons), and at least 26 (47 neurons). Results: Interneurons with more reliable type labels were classified more accurately. We classified HT cells with 100% accuracy, MA cells with 73% accuracy, and CB and LB cells with 56% and 58% accuracy, respectively. We identified three subtypes of the MA type, one subtype of CB and LB types each, and no subtypes of HT (it was a single, homogeneous type). We got maximum (adapted) Silhouette width and ARI values of 1, 0.83, 0.79, and 0.42, when unlabeling the HT, CB, LB, and MA types, respectively, confirming the quality of the formed cluster solutions. The subtypes identified when unlabeling a single type also emerged when unlabeling two types at a time, confirming their validity. Axonal morphometric properties were more relevant than dendritic ones, with the axonal polar histogram length in the [π, 2π) angle interval being particularly useful. Conclusions: The applied semi-supervised clustering method can accurately discriminate among CB, HT, LB, and MA interneuron types while discovering potential subtypes, and is therefore useful for neuronal classification. The discovery of potential subtypes suggests that some of these types are more heterogeneous than previously thought. Finally, axonal variables seem to be more relevant than dendritic ones for distinguishing among the CB, HT, LB, and MA interneuron types.
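
The core idea of the semi-supervised clustering described above, fitting a Gaussian mixture while keeping the cluster membership of labeled cells fixed, can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' model: it uses diagonal covariances, omits their per-cluster feature relevance estimation, requires at least one labeled point per cluster (so it corresponds to the "half the cells of a single type" setting and cannot discover new subtypes), and runs on invented two-dimensional points in place of the nine morphometric properties. The names `log_gauss` and `semi_supervised_gmm` are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at point x."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def semi_supervised_gmm(points, labels, n_clusters, n_iter=50, min_var=1e-3):
    """EM for a Gaussian mixture in which labeled points keep their cluster.

    labels[i] is a cluster index for a labeled point and None for an
    unlabeled one. Returns (hard assignments, cluster means).
    """
    if any(k not in labels for k in range(n_clusters)):
        raise ValueError("each cluster needs at least one labeled point")
    n, dim = len(points), len(points[0])
    # Responsibilities: one-hot for labeled points, uniform for unlabeled.
    resp = []
    for lab in labels:
        if lab is None:
            resp.append([1.0 / n_clusters] * n_clusters)
        else:
            resp.append([1.0 if k == lab else 0.0 for k in range(n_clusters)])
    for _ in range(n_iter):
        # M-step: weighted mixing proportions, means and variances.
        weights, means, variances = [], [], []
        for k in range(n_clusters):
            nk = sum(r[k] for r in resp)
            mean = [sum(r[k] * p[d] for r, p in zip(resp, points)) / nk
                    for d in range(dim)]
            var = [max(min_var,
                       sum(r[k] * (p[d] - mean[d]) ** 2
                           for r, p in zip(resp, points)) / nk)
                   for d in range(dim)]
            weights.append(nk / n)
            means.append(mean)
            variances.append(var)
        # E-step: update unlabeled points only; labeled ones stay fixed.
        for i, lab in enumerate(labels):
            if lab is not None:
                continue
            logs = [math.log(weights[k])
                    + log_gauss(points[i], means[k], variances[k])
                    for k in range(n_clusters)]
            top = max(logs)  # subtract the maximum for numerical stability
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp[i] = [e / total for e in exps]
    assignments = [max(range(n_clusters), key=lambda k: r[k]) for r in resp]
    return assignments, means


# Invented toy data: two well-separated groups, half of each unlabeled.
points = [(0.0, 0.2), (0.3, -0.1), (-0.2, 0.1), (0.1, -0.3),
          (5.0, 5.1), (4.8, 5.3), (5.2, 4.9), (5.1, 5.2)]
labels = [0, 0, None, None, 1, 1, None, None]
assignments, means = semi_supervised_gmm(points, labels, n_clusters=2)
print(assignments)
```

Allowing more mixture components than labeled types, which is what would let subtypes emerge as in the study, needs a different initialization for the unlabeled components and is left out of this sketch.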