871 results for Classifier Generalization Ability


Relevance:

20.00%

Publisher:

Abstract:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises a multiple-testing problem and produces false-positive results. Although this problem can be dealt with effectively through approaches such as Bonferroni correction, permutation testing and false discovery rates, patterns of joint effects of several genes, each with a weak effect, might not be detectable. With the availability of high-throughput genotyping technology, searching for multiple SNPs scattered over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for the millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations.

In this study, we take two steps to achieve the goal. First, we selected 1000 SNPs through an effective filter method; then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square tests to look at the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of small subsets of one, two or three SNPs based on the best 100 composite 2-SNPs can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter because of overfitting from observing more complex subset states.

Our results also indicate that HMSS, as a criterion to evaluate the classification ability of a function, can be used on imbalanced data without modifying the original dataset, in contrast to classification accuracy. Our four studies suggest that Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome and that its ability to detect the target status is superior to that of traditional LDA in the study.

From our results we can see that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be at least 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study through chi-square tests shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham Heart Study of CVD. Study results in WTCCC can only detect two significant SNPs that are associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through the chi-square test at the cut-off value 1.11E-07.

Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available in our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and SNPs with good discriminant power are not necessarily causal markers for the disease.
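The abstract above argues that HMSS is a fairer criterion than plain accuracy on imbalanced data. The sketch below illustrates that point using the standard definition of a harmonic mean, HMSS = 2·Se·Sp / (Se + Sp). The function names and the toy confusion counts are illustrative assumptions, not taken from the study.

```python
def hmss(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity from confusion counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if sensitivity + specificity == 0.0:
        return 0.0
    return 2.0 * sensitivity * specificity / (sensitivity + specificity)


def accuracy(tp, fn, tn, fp):
    """Plain classification accuracy from the same counts."""
    return (tp + tn) / (tp + fn + tn + fp)


# Hypothetical imbalanced data: 10 cases, 90 controls.
# A classifier that labels everything "control" looks good on accuracy
# but detects no cases, so its HMSS collapses to zero.
always_control = dict(tp=0, fn=10, tn=90, fp=0)
# A classifier that finds 8 of 10 cases at the cost of 45 false alarms.
balanced = dict(tp=8, fn=2, tn=45, fp=45)

for name, counts in [("always_control", always_control), ("balanced", balanced)]:
    print(name, round(accuracy(**counts), 3), round(hmss(**counts), 3))
```

Because the harmonic mean is dominated by the smaller of its two arguments, a classifier cannot score well on HMSS by ignoring the minority class, which is why the dataset does not need to be resampled first.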

Relevance:

20.00%

Publisher:

Abstract:

We designed and synthesized a novel daunorubicin (DNR) analogue that effectively circumvents P-glycoprotein (P-gp)-mediated drug resistance. The fully protected carbohydrate intermediate 1,2-dibromoacosamine was prepared from acosamine and effectively coupled to daunomycinone in high yield. Deprotection under alkaline conditions yielded 2′-bromo-4′-epidaunorubicin (WP401). The in vitro cytotoxicity and cellular and molecular pharmacology of WP401 were compared with those of DNR in a panel of wild-type cell lines (KB-3-1, P388S, and HL60S) and their multidrug-resistant (MDR) counterparts (KB-V1, P388/DOX, and HL60/DOX). Fluorescence spectrophotometry, flow cytometry, and confocal laser scanning microscopy were used to measure intracellular accumulation, retention, and subcellular distribution of these agents. All MDR cell lines exhibited reduced DNR uptake that was restored, upon incubation with either verapamil (VER) or cyclosporin A (CSA), to the level found in sensitive cell lines. In contrast, the uptake of WP401 was essentially the same in the absence or presence of VER or CSA in all tested cell lines. The in vitro cytotoxicity of WP401 was similar to that of DNR in the sensitive cell lines but significantly higher in resistant cell lines (resistance index (RI) of 2-6 for WP401 vs 75-85 for DNR). To ascertain whether drug-mediated cytotoxicity and retention were accompanied by DNA strand breaks, DNA single- and double-strand breaks were assessed by alkaline elution. High levels of such breaks were obtained using 0.1-2 µg/mL of WP401 in both sensitive and resistant cells. In contrast, DNR caused strand breaks in sensitive cells but few in resistant cells. We also compared drug-induced DNA fragmentation: in sensitive cells, WP401 induced DNA fragmentation similar to that induced by DNR, whereas in P-gp-positive cells, WP401 induced 2- to 5-fold more DNA fragmentation than DNR.
This increased DNA strand breakage by WP401 was correlated with its increased uptake and cytotoxicity in these cell lines. Overall, these results indicate that WP401 is more cytotoxic than DNR in MDR cells and that this phenomenon might be related to the reduced basicity of the amino group and increased lipophilicity of WP401.

Relevance:

20.00%

Publisher:

Abstract:

Ozone (O3) phytotoxicity has been reported on a wide range of crops and wild Central European plant species; however, no information has been provided regarding the sensitivity of plant species from dehesa Mediterranean therophytic grasslands, in spite of their great plant species richness and the high O3 levels that are recorded in this area. A study was carried out in open-top chambers (OTCs) to assess the effects of O3 and competition on the reproductive ability of three clover species: Trifolium cherleri, Trifolium subterraneum and Trifolium striatum. A phytometer approach was followed; therefore, plants of these species were grown in mesocosms composed of monocultures of four plants of each species, of three plants of each species competing against a Briza maxima individual, or of a single plant of each clover species competing with three B. maxima plants. Three O3 treatments were adopted: charcoal filtered air (CFA), non-filtered air (NFA) and non-filtered air supplemented with 40 nl l⁻¹ of O3 (NFA+). The different mesocosms were exposed to the different O3 treatments for 45 days and then they remained in the open. Ozone exposure caused reductions in the flower biomass of the three clover species assessed. In the case of T. cherleri and T. subterraneum this effect was found following their exposure to the different O3 treatments during their vegetative period. An attenuation of these effects was found when the plants remained in the open. Ozone-induced detrimental effects on the seed output of T. striatum were also observed. The flower biomass of the clover plants grown in monocultures was greater than when competing with one or three B. maxima individuals. An increased flower biomass was found in the CFA monoculture mesocosms of T. cherleri when compared with the remaining mesocosms, once the plants were exposed in the open for 60 days. The implications of these effects for the performance of dehesa acid grasslands and for the definition of O3 critical levels are discussed.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents an automatic modulation classifier (AMC) for electronic warfare applications. It is a pattern-recognition modulation classifier based on statistical features of the phase and instantaneous frequency. The classifier runs in real time at sampling rates in excess of 1 Gsample/s. The hardware platform for this application is a Field Programmable Gate Array (FPGA). The AMC is a subsystem of a digital channelised receiver that is also implemented on the same platform.
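The abstract does not say which statistics of the phase and instantaneous frequency are used. As a rough offline illustration only, the NumPy sketch below derives both signals from complex baseband (I/Q) samples and summarises them with simple means and standard deviations. The function name, the choice of statistics and the test tone are all assumptions. The paper's classifier runs on an FPGA, not in Python.

```python
import numpy as np


def phase_and_frequency_features(iq, sample_rate_hz):
    """Summarise unwrapped phase and instantaneous frequency of I/Q samples."""
    iq = np.asarray(iq, dtype=complex)
    # Unwrapped phase removes the 2*pi jumps introduced by np.angle.
    phase = np.unwrap(np.angle(iq))
    # Instantaneous frequency is the discrete derivative of the phase.
    inst_freq_hz = np.diff(phase) * sample_rate_hz / (2.0 * np.pi)
    # Remove the linear phase ramp caused by any carrier offset,
    # so the phase statistic reflects modulation rather than tuning error.
    n = np.arange(len(phase))
    slope, intercept = np.polyfit(n, phase, 1)
    detrended_phase = phase - (slope * n + intercept)
    return {
        "phase_std": float(np.std(detrended_phase)),
        "freq_mean_hz": float(np.mean(inst_freq_hz)),
        "freq_std_hz": float(np.std(inst_freq_hz)),
    }


# Hypothetical check signal: an unmodulated 50 Hz tone sampled at 1 kHz.
fs = 1000.0
samples = np.arange(256)
tone = np.exp(2j * np.pi * 50.0 * samples / fs)
features = phase_and_frequency_features(tone, fs)
print(features)
```

For an unmodulated tone both spread statistics are essentially zero, whereas phase- or frequency-modulated signals would show larger values in one or both. A pattern-recognition classifier can exploit that kind of separation.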

Relevance:

20.00%

Publisher:

Abstract:

Integrated Pest Management of insects includes several control tactics, such as the use of photoselective nets, which may reduce the flight activity of insects. Limiting the dispersal of pests such as aphids and whiteflies is important because of their major role as vectors of plant viruses, while a minor impact on natural enemies is desired. In this study, we examined for the first time the dispersal ability of three vector species, Bemisia tabaci (Gennadius) (Hemiptera: Aleyrodidae), Macrosiphum euphorbiae (Thomas) (Hemiptera: Aphididae) and Myzus persicae (Sulzer) (Hemiptera: Aphididae), in cages covered with photoselective nets. Contrary to the results obtained with aphids, the ability of the whitefly B. tabaci, to reach the target plant was reduced by photoselective nets. In a second set of experiments, the impact of UV-absorbing nets on the visual cues of two important predator species, Orius laevigatus (Fieber) (Hemiptera: Anthocoridae) and Amblyseius swirskii Athias-Henriot (Acari: Phytoseiidae), was evaluated. The anthocorid was caught in higher numbers in traps placed under regular nets, whereas the mites preferably chose environments in which the UV radiation was attenuated. We have observed a wide range of effects that impedes generalization, although photoselective nets have a positive effect on pest management of whiteflies and aphids under protected environments.

Relevance:

20.00%

Publisher:

Abstract:

In ubiquitous data stream mining applications, different devices often aim to learn concepts that are similar to some extent. In these applications, such as spam filtering or news recommendation, the concept underlying the data stream (e.g., interesting mail/news) is likely to change over time. Therefore, the resulting model must be continuously adapted to such changes. This paper presents a novel Collaborative Data Stream Mining (Coll-Stream) approach that exploits the similarities in the knowledge available from other devices to improve local classification accuracy. Coll-Stream integrates the community knowledge using an ensemble method in which the classifiers are selected and weighted based on their local accuracy for different partitions of the feature space. We evaluate Coll-Stream classification accuracy under concept drift, noise, and varying partition granularity and concept similarity in relation to the local underlying concept. The experimental results show that the resulting Coll-Stream model achieves stability and accuracy in a variety of situations using both synthetic and real-world datasets.
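The core idea described above, weighting community classifiers by how accurate they have been locally within each region of the feature space, can be sketched in a few lines. This is a simplified illustration and not the Coll-Stream algorithm itself: the class name, the partition function and the toy classifiers are hypothetical, every classifier votes (no selection step), and nothing is forgotten over time as concept drift would require.

```python
from collections import defaultdict


class RegionWeightedEnsemble:
    """Weighted vote where each weight is a classifier's accuracy in the region of x."""

    def __init__(self, classifiers, partition):
        self.classifiers = classifiers  # list of callables: x -> label
        self.partition = partition      # callable: x -> region identifier
        self.hits = defaultdict(lambda: [0] * len(classifiers))
        self.seen = defaultdict(int)

    def update(self, x, y):
        """Record, for the region of x, which classifiers predicted label y."""
        region = self.partition(x)
        self.seen[region] += 1
        for i, clf in enumerate(self.classifiers):
            if clf(x) == y:
                self.hits[region][i] += 1

    def weights(self, x):
        """Per-classifier local accuracy; uniform if the region is unseen."""
        region = self.partition(x)
        n = self.seen[region]
        if n == 0:
            return [1.0 / len(self.classifiers)] * len(self.classifiers)
        return [h / n for h in self.hits[region]]

    def predict(self, x):
        votes = defaultdict(float)
        for w, clf in zip(self.weights(x), self.classifiers):
            votes[clf(x)] += w
        return max(votes, key=votes.get)


# Toy setup: two "community" models, each right in only one half of the space.
always_one = lambda x: 1
always_zero = lambda x: 0
ensemble = RegionWeightedEnsemble(
    [always_one, always_zero],
    partition=lambda x: "neg" if x < 0 else "pos",
)
for x, y in [(-2, 1), (-1, 1), (1, 0), (2, 0)]:  # local labelled records
    ensemble.update(x, y)

print(ensemble.predict(-3), ensemble.predict(3))
```

A drift-aware variant would typically replace the cumulative counters with a sliding window or decay factor so that weights track the current local concept.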

Relevance:

20.00%

Publisher:

Abstract:

Performing activity recognition using the information provided by the different sensors embedded in a smartphone faces limitations due to the capabilities of those devices when the computations are carried out on the terminal. In this work, a fuzzy inference module is implemented to decide which classifier is the most appropriate at a specific moment, given the application requirements and the device context, characterized by its battery level, available memory and CPU load. The set of classifiers considered is composed of Decision Tables and Trees that have been trained using different numbers of sensors and features. In addition, some classifiers perform activity recognition regardless of the on-body device position, while others rely on prior recognition of that position in order to use a classifier trained with measurements gathered with the mobile placed in that specific position. The implemented modules show that an evaluation of the classifiers allows them to be ranked, so that the fuzzy inference module can periodically choose the one that best suits the device context and application requirements.

Relevance:

20.00%

Publisher:

Abstract:

Neuronal morphology is a key feature in the study of brain circuits, as it is highly related to information processing and functional identification. Neuronal morphology affects the process of integration of inputs from other neurons and determines the neurons which receive the output of the neurons. Different parts of the neurons can operate semi-independently according to the spatial location of the synaptic connections. As a result, there is considerable interest in the analysis of the microanatomy of nervous cells since it constitutes an excellent tool for better understanding cortical function. However, the morphologies, molecular features and electrophysiological properties of neuronal cells are extremely variable. Except for some special cases, this variability makes it hard to find a set of features that unambiguously define a neuronal type. In addition, there are distinct types of neurons in particular regions of the brain. This morphological variability makes the analysis and modeling of neuronal morphology a challenge. Uncertainty is a key feature in many complex real-world problems. Probability theory provides a framework for modeling and reasoning with uncertainty. Probabilistic graphical models combine statistical theory and graph theory to provide a tool for managing domains with uncertainty. In particular, we focus on Bayesian networks, the most commonly used probabilistic graphical model. In this dissertation, we design new methods for learning Bayesian networks and apply them to the problem of modeling and analyzing morphological data from neurons. The morphology of a neuron can be quantified using a number of measurements, e.g., the length of the dendrites and the axon, the number of bifurcations, the direction of the dendrites and the axon, etc. These measurements can be modeled as discrete or continuous data. The continuous data can be linear (e.g., the length or the width of a dendrite) or directional (e.g., the direction of the axon). 
These data may follow complex probability distributions and may not fit any known parametric distribution. Modeling these kinds of problems using hybrid Bayesian networks with discrete, linear and directional variables poses a number of challenges regarding learning from data, inference, etc. In this dissertation, we propose a method for modeling and simulating basal dendritic trees from pyramidal neurons using Bayesian networks to capture the interactions between the variables in the problem domain. A complete set of variables is measured from the dendrites, and a learning algorithm is applied to find the structure and estimate the parameters of the probability distributions included in the Bayesian networks. Then, a simulation algorithm is used to build the virtual dendrites by sampling values from the Bayesian networks, and a thorough evaluation is performed to show the model’s ability to generate realistic dendrites. In this first approach, the variables are discretized so that discrete Bayesian networks can be learned and simulated. Then, we address the problem of learning hybrid Bayesian networks with different kinds of variables. Mixtures of polynomials have been proposed as a way of representing probability densities in hybrid Bayesian networks. We present a method for learning mixture-of-polynomials approximations of one-dimensional, multidimensional and conditional probability densities from data. The method is based on basis spline interpolation, where a density is approximated as a linear combination of basis splines. The proposed algorithms are evaluated using artificial datasets. We also use the proposed methods as a non-parametric density estimation technique in Bayesian network classifiers. Next, we address the problem of including directional data in Bayesian networks. These data have some special properties that rule out the use of classical statistics.
Therefore, different distributions and statistics, such as the univariate von Mises and the multivariate von Mises–Fisher distributions, should be used to deal with this kind of information. In particular, we extend the naive Bayes classifier to the case where the conditional probability distributions of the predictive variables given the class follow either of these distributions. We consider the simple scenario, where only directional predictive variables are used, and the hybrid case, where discrete, Gaussian and directional distributions are mixed. The classifier decision functions and their decision surfaces are studied at length. Artificial examples are used to illustrate the behavior of the classifiers. The proposed classifiers are empirically evaluated over real datasets. We also study the problem of interneuron classification. An extensive group of experts is asked to classify a set of neurons according to their most prominent anatomical features. A web application is developed to retrieve the experts’ classifications. We compute agreement measures to analyze the consensus between the experts when classifying the neurons. Using Bayesian networks and clustering algorithms on the resulting data, we investigate the suitability of the anatomical terms and neuron types commonly used in the literature. Additionally, we apply supervised learning approaches to automatically classify interneurons using the values of their morphological measurements. Then, a methodology for building a model which captures the opinions of all the experts is presented. First, one Bayesian network is learned for each expert, and we propose an algorithm for clustering Bayesian networks corresponding to experts with similar behaviors. Then, a Bayesian network which represents the opinions of each group of experts is induced. Finally, a consensus Bayesian multinet which models the opinions of the whole group of experts is built. 
A thorough analysis of the consensus model identifies different behaviors between the experts when classifying the interneurons in the experiment. A set of characterizing morphological traits for the neuronal types can be defined by performing inference in the Bayesian multinet. These findings are used to validate the model and to gain some insights into neuron morphology. Finally, we study a classification problem where the true class label of the training instances is not known. Instead, a set of class labels is available for each instance. This is inspired by the neuron classification problem, where a group of experts is asked to individually provide a class label for each instance. We propose a novel approach for learning Bayesian networks using count vectors which represent the number of experts who selected each class label for each instance. These Bayesian networks are evaluated using artificial datasets from supervised learning problems.
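As an illustration of the naive Bayes extension to directional predictors described in this abstract, the sketch below fits one univariate von Mises density per class and per angular feature and classifies by maximum log-posterior. It is a minimal sketch under stated assumptions, not the dissertation's implementation. The concentration κ is estimated with a common closed-form approximation based on the mean resultant length, κ is capped for numerical safety, the class and function names are hypothetical, and the multivariate von Mises–Fisher and hybrid cases are not covered.

```python
import numpy as np

MAX_KAPPA = 500.0  # assumption: cap so that np.i0 does not overflow


def fit_von_mises(angles):
    """Estimate (mu, kappa) of a von Mises density from angles in radians."""
    c, s = np.mean(np.cos(angles)), np.mean(np.sin(angles))
    mu = np.arctan2(s, c)                      # circular mean direction
    r = min(np.hypot(c, s), 1.0 - 1e-9)        # mean resultant length
    kappa = r * (2.0 - r**2) / (1.0 - r**2)    # closed-form approximation
    return mu, min(kappa, MAX_KAPPA)


def log_von_mises(theta, mu, kappa):
    """Log density: kappa*cos(theta - mu) - log(2*pi*I0(kappa))."""
    return kappa * np.cos(theta - mu) - np.log(2.0 * np.pi * np.i0(kappa))


class VonMisesNaiveBayes:
    """Naive Bayes whose predictors are all angles (radians)."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        self.log_prior_, self.params_ = {}, {}
        for c in self.classes_:
            Xc = X[y == c]
            self.log_prior_[c] = np.log(len(Xc) / len(X))
            self.params_[c] = [fit_von_mises(Xc[:, j]) for j in range(X.shape[1])]
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        scores = []
        for c in self.classes_:
            score = np.full(len(X), self.log_prior_[c])
            for j, (mu, kappa) in enumerate(self.params_[c]):
                score = score + log_von_mises(X[:, j], mu, kappa)
            scores.append(score)
        return self.classes_[np.argmax(np.array(scores), axis=0)]


# Toy data: class 0 points near angle 0, class 1 points near angle pi.
# Class 1 straddles the +pi/-pi wrap-around, which a Gaussian model would mishandle.
X_train = np.array([[-0.3], [-0.1], [0.0], [0.1], [0.3],
                    [np.pi - 0.3], [np.pi - 0.1], [np.pi],
                    [-np.pi + 0.1], [-np.pi + 0.3]])
y_train = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
model = VonMisesNaiveBayes().fit(X_train, y_train)
predictions = model.predict(np.array([[0.2], [3.0], [-3.0]]))
print(predictions)
```

The angles 3.0 and -3.0 are numerically far apart but lie on either side of π on the circle. A directional density assigns both to the same class, which is the property that motivates replacing classical (linear) statistics for this kind of data.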

Relevance:

20.00%

Publisher:

Abstract:

“Polymer crystallization is therefore assumed, and in theories often described, to be a multi-step process with many physico-chemical and structural aspects influencing it. Because of the chain structure itself, it is easy to understand that a process which is thermodynamically forced to increase its local ordering is geometrically hindered and therefore cannot proceed to a final equilibrium state. As a result, non-equilibrium structures are usually formed, with different characteristics depending on the temperature, pressure, shearing and other physico-chemical parameters of the system.” These words, recently stated by Professor Bernhard Wunderlich, one of the most prominent physical chemists to have studied the physical state of macromolecules in recent decades, anticipate what is in some way made explicit in this dissertation and constitute the leitmotiv of this thesis work. The mechanism of polymer crystallization is still under debate in the polymer physics community, and most experimental approaches are explained through the LH theory. This classical theory, due to Lauritzen and Hoffman (LH), which is a generalization of the theory of crystallization of a small molecule from the vapor phase, satisfactorily describes many experimental observations, although it is far from explaining the complex phenomenon of polymer crystallization. In fact, the original formulation of this theory at the National Bureau of Standards in the early 1970s underwent several important reformulations throughout the 1980s, seeking to adapt it to the experimental findings.
Thus was born crystallization regime III, which allows the creation of molecular niches on the surface and which gave rise to the paradigm offered by Sadler et al. to account for the results obtained by neutron scattering and other techniques such as the droplet technique or rapid cooling. Above all, the great success of the theory lies in its explanation of the inverse dependence between the molecular fold size and the supercooling, the latter defined as the temperature interval between the equilibrium temperature and the crystallization temperature. The specific problem addressed in this thesis is the study of the ordering processes of polyolefins with different degrees of branching by means of numerical simulations. The copolymers studied in this thesis are considered model materials of great molecular homogeneity in terms of the distribution of sizes and of branches along the polymer chain. These polyolefins were chosen because of the great experimental interest in understanding how the physical properties of the materials change with the type and amount of comonomer used. In addition, they are models for which a vast amount of experimental information exists, which is always a concern when creating a virtual reality such as a simulation. The experience of the Biophym group is that simulation results must always have a more or less close experimental counterpart, and this argument is used throughout this dissertation. Empirically, it is well known that the physical properties of polyolefins ultimately depend on the type and amount of branching in the polymeric material. However, as explained above, there are no adequate theoretical models that explain the underlying mechanisms of the effects of the branches. This dissertation is extensive because of the complexity of the subject.
It begins with an extensive introduction to the basic concepts of a macromolecule that are relevant for understanding the rest of the dissertation. The concepts of flexible macromolecule, distributions and moments, and their behavior in solution and in the melt are defined, together with the corresponding characteristic parameters. Special emphasis is placed on the concept of entanglement, since it is considered key when dealing with macromolecules longer than the critical entanglement length. The introduction ends with a review of the state of the art in the simulation of crystallization processes. The second chapter describes in detail the methodology used for each group of cases. The first results chapter discusses simulation studies in dilute solution for linear and branched single-chain systems. This simplest case clearly depends on the torsion potential chosen, as discussed throughout the text. The formation of the “baby” nuclei proposed by Muthukumar appears to be a consequence of the torsion potential, since it favors the most stable torsional states. The analysis of other commonly used potentials is therefore proposed, and the results obtained on crystallization are discussed accordingly. Next, in a second results chapter, linear and branched long-chain alkane molecules in a melt are studied by atomistic simulations as a model of polyethylene. The atomistic results, despite their great detail, do not fully capture the experimental effects observed in supercooled melts in the stage prior to the ordered state. For this reason, results chapters 3 and 4 discuss systems of short and long chains using two coarse-grained models (CG-PVA and CG-PE). The CG-PE model was developed during the thesis.
Coarse-grained models are computationally more efficient than atomistic models and are sufficient to show the phenomena at the scale relevant to crystallization. In all of these studies, the evolution of the ordering and melting processes is followed in isothermal and non-isothermal relaxation simulations. From the simulation models, several physical properties have been evaluated, such as the ordered segment length, the crystallinity and the melting/crystallization temperatures, which allows comparison with experimental results. It is clearly shown that branching delays and hinders the ordering of the polymer chain, so that the ordered crystalline regions shrink as the branches grow. As a general conclusion, there appears to be a tendency to form locally ordered structures that grow as blocks to fill the crystallization space accessible at a given temperature and time scale. Finally, the observed effects agree with other theoretical, simulation and experimental results discussed throughout this report. They are summarized in a chapter of conclusions and of the future lines of research that this report opens. It should be mentioned that the pace of research increased markedly in the last year of work, partly because of the notable advantages of the coarse-grained methodology, which, although very important for this report, does not translate easily into publishable work. This explains why a large part of the results are still in the process of publication. Abstract “Polymer crystallization is therefore assumed, and in theories often described, to be a multi step process with many influencing aspects. 
Because of the chain structure, it is easy to understand that a process which is thermodynamically forced to increase local ordering but is geometrically hindered cannot proceed into a final equilibrium state. As a result, nonequilibrium structures with different characteristics are usually formed, which depend on temperature, pressure, shearing and other parameters”. These words, recently written by Professor Bernhard Wunderlich, one of the most prominent researchers in polymer physics, highlight the leitmotiv of this thesis. The crystallization mechanism of polymers is still under debate in the physics community, and most experimental findings are still explained by invoking the Lauritzen-Hoffman (LH) theory. This classical theory is in essence a generalization of the theory of crystallization of small molecules from the vapor phase. Even though it satisfactorily describes many experimental observations, it is far from explaining the complex phenomenon of polymer crystallization. The theory was first devised in the early 1970s at the National Bureau of Standards and was successively reformulated during the 1980s to fit the experimental findings. Thus, crystallization regime III was introduced into the theory to explain the results of neutron scattering, droplet and quenching experiments. This concept defines the roughness of the crystallization surface, leading to the paradigm proposed by Sadler et al. The great success of the theory is its ability to explain the inverse dependence of the molecular fold size on the supercooling, the latter defined as the temperature interval between the equilibrium temperature and the crystallization temperature. The main scope of this thesis is the study of ordering processes in polyolefins with different degrees of branching using computer simulations. 
The copolymers studied in this work are considered materials of high molecular homogeneity in terms of both the size and the branching distributions of the polymer chain. These polyolefins were selected because of the great interest in understanding their structure-property relationships. It is important to note that there is a vast amount of experimental data on these materials, which is essential when creating a virtual reality such as a simulation. The Biophym research group has wide experience in correlating simulation data with experimental results, and this idea runs throughout this work. Empirically, it is well known that the physical properties of polyolefins depend on the type and amount of branches present in the polymeric material. However, there are no suitable models to explain the underlying mechanisms associated with branching. This report is extensive because of the complexity of the topic under study. It begins with a general introduction to the basic concepts of macromolecular physics, which is relevant for understanding the rest of the document. Several concepts are defined in this section, among them the flexibility of macromolecules, size distributions and moments, and the behavior in solution and in the melt along with the corresponding characteristic parameters. Special emphasis is placed on the concept of "entanglement", which is key when dealing with macromolecules whose molecular size exceeds the critical entanglement length. The introduction finishes with a review of the state of the art in the simulation of crystallization processes. The second chapter of the thesis describes in detail the computational methodology used in each study. In the first results section, we discuss the simulation studies in dilute solution for linear and short-chain-branched single-chain models. This simplest case clearly depends on the selected torsion potential, as discussed throughout the text. 
For example, the formation of baby nuclei proposed by Muthukumar seems to result from the effects of the torsion potential. We therefore propose the analysis of other torsion potentials that are also used by other research groups, and the results obtained on crystallization processes are discussed accordingly. Then, in a second results section, we study linear and branched long-chain alkane molecules in a melt by atomistic simulations as a polyethylene-like model. In spite of the great detail given by atomistic simulations, they are not able to fully capture the experimental facts observed in supercooled melts, in particular the pre-ordered states. For this reason, in the third and fourth results sections we discuss short- and long-chain systems using two coarse-grained models (CG-PVA and CG-PE). The CG-PE model was developed during the thesis. Coarse-grained models are computationally more efficient than atomistic models and are sufficient to show the phenomena at the scale relevant to crystallization. In all these analyses we follow the evolution of the ordering and melting processes by both isothermal and non-isothermal simulations. During this thesis we have obtained different physical properties such as stem length, crystallinity and melting/crystallization temperatures. We show that branches in the chains delay crystallization and hinder the ordering of the polymer chain. Therefore, crystalline regions decrease in size as branching increases. As a general conclusion, there seems to be a tendency in macromolecular systems to form ordered structures that grow locally as blocks, occupying the crystallization space available at a given temperature and time scale. Finally, it should be noted that the observed effects are consistent with other theoretical, simulation and experimental results. The summary is provided in the conclusions chapter, along with the future research lines that open as a result of this report. 
It should be mentioned that the research work sped up markedly in the last year, in part because of the remarkable benefits of the coarse-grained methodology, which, despite being very important for this thesis, is not easily publishable by itself. All this explains why most of the results are still in the publication phase.
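The supercooling and the inverse dependence of fold size on it, both mentioned in the abstract above, can be written compactly. The expression below is the standard textbook form of the Lauritzen-Hoffman result and is not quoted from the thesis; the symbols $\sigma_e$ (fold-surface free energy), $\Delta h_f$ (heat of fusion per unit volume), $T_m^0$ (equilibrium melting temperature), $T_c$ (crystallization temperature) and $\delta l$ (a small additive length) are notation assumed here for illustration.

```latex
\Delta T = T_m^0 - T_c,
\qquad
l^{*} \simeq \frac{2\,\sigma_e\,T_m^0}{\Delta h_f\,\Delta T} + \delta l
```

Under these assumptions, halving the supercooling roughly doubles the leading term of the fold length.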

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This Thesis analyzes the possibilities that speech technologies currently offer for detecting clinical pathologies associated with the upper airway. The study of speech, which traditionally covers both production and the transformation of the message and the signals involved on their way from speaker to listener, offers an alternative way to study these pathologies. The fact that the emitted signal contains not only this message but also information about the speaker has motivated the development of systems for identifying and verifying speakers' identity. This work has recently received fresh impetus, turning both to the characterization of traits common to several speakers and to the differences between recordings of the same speaker. The former are especially relevant to this Thesis, since such traits could reveal characteristics related to a condition shared by several speakers, independent of their identity. Such is the case addressed in this Thesis, where the traits identified would be related to a particular pathology directly linked to the physical system that shapes speech. The Sleep Apnea-Hypopnea Syndrome (SAHS) is a paradigmatic case. It is a pathology with high prevalence worldwide, which increases with age. Patients experience episodes of involuntary cessation of breathing during sleep, which last several seconds and recur throughout the night, preventing proper rest. In obstructive apnea, these episodes are due to the inability to keep a path open through the airway, so that the air flow is interrupted. 
At present these patients are diagnosed through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep and requires the patient to stay in hospital for a night. The complexity and high cost of these procedures, together with growing waiting lists, have shown the need for rapid screening techniques which, although they might not reach such high rates, would allow waiting lists to be reorganized according to the severity of the pathology in each patient. Among others, diagnostic imaging systems and the anthropometric characterization of patients have revealed anatomical patterns that would have a direct influence on speech. Studies of how SAHS affects speech have been scarce, and some of them even contradictory. Nevertheless, specific patterns relating to articulation, phonation and resonance have been known since the late 1980s. Their description was hard to exploit in an automatic recognition system, but it pointed to a link between voice and SAHS. In recent years automatic processing techniques have enabled systems that can already identify significant differences in the speech of SAHS patients and distinguish them from healthy speakers. By contrast, little is known about the connection between these new results, those obtained in the past and the pathogenesis of SAHS. 
This Thesis continues the work in this area by considering specifically: the study of how SAHS affects patients' speech, the improvement of automatic classification rates, and the combination of the information obtained with the predictors used by clinical specialists in their preliminary assessments. The first two tasks pose symbiotic but different problems. Whereas studying the connection between SAHS and speech requires small models that can be interpreted easily, recognition systems rely on a large number of dimensions to characterize and then identify patterns. Thus the first task should help us advance in the second, as should the incorporation of the predictors used by clinical specialists. The Thesis studies both continuous and sustained speech in order to exploit the synergies and differences between them. For continuous speech, a previously evaluated scheme was taken as the starting point, on which the representation of speech was evaluated and optimized and the specific patterns associated with SAHS were characterized. This revealed the connection between SAHS and the fundamental elements of the voice signal: the formants. The results show that the success of these systems is due mainly to the ability of these representations to describe those components while ignoring dimensions that are noisy or have little discriminative power. The resulting scheme gives an error rate below 18%, using classifiers notably less complex than those described in the state of the art and a single short voice recording. 
Regarding the connection between SAHS and the observed patterns, it was necessary to consider inter- and intra-group differences, focusing on the speaker's characteristic articulation and replacing the complex classification models with the study of spectral averages. The result points clearly to certain regions of the frequency axis, suggesting a systematic narrowing of the tract section in the region of the oropharynx, as already anticipated in the pathogenesis of this syndrome. For sustained speech, the studies carried out on continuous speech were repeated on recordings of the sustained vowel /a/. The results are qualitatively analogous to the earlier ones, although in this case the classification rates are lower. To understand this result, the study of spectral averages and of inter- and intra-group variability was repeated. Both studies showed important differences from the earlier ones that could explain these results. However, sustained speech offers other opportunities by providing a controlled setting for the study of phonation, which had also been identified as a source of information for detecting SAHS. This study showed that, in the available data set, there are no variations that could easily be associated with phonation. Only the dimensions that describe the distribution of energy along the frequency axis showed significant differences, pointing once again toward the spectral resonances. Having analyzed these results, the Thesis addresses the fusion of both sources of information into a single classification system. This makes it possible to improve classification rates, under the hypothesis that the information present in continuous and in sustained speech is fundamentally different. 
This was done through a simple fusion scheme that achieved 88.6% correct classification (11.4% error rate), a significant improvement over the state of the art. Finally, combining this classifier with the predictors used by clinical specialists gave a rate of 91.3% (8.7% error rate), which is within the range offered by more costly and intrusive schemes that, unlike the one proposed, cannot be used in the preliminary assessment of patients. Overall, the Thesis offers a clear view of the relationship between SAHS and speech, showing the degree of maturity reached by speech technology in characterizing and detecting SAHS, making clear that its use for patient assessment would already be possible, and leaving the door open to future research that continues the work begun here. ABSTRACT This Thesis explores the potential of speech technologies for the detection of clinical disorders connected to the upper airway. The study of speech traditionally covers both the production process and the post-processing of the signals involved, from the speaker to the listener, offering an alternative path to study these pathologies. The fact that utterances embed not just the encoded message but also information about the speaker has motivated the development of automatic systems for the identification and verification of the speaker’s identity. These have recently been boosted and reoriented either towards the characterization of traits that are common to several speakers, or towards the differences between recordings of the same speaker collected under different conditions. The former are particularly relevant to this Thesis, as these patterns could reveal the presence of features related to a common condition shared among different speakers, regardless of their identity. 
Such is the case faced in this Thesis, where the traits identified would relate to a particular pathology directly connected to the speech production system. Obstructive Sleep Apnea syndrome (OSA) is a paradigmatic case for analysis. It is a disorder with high prevalence among adults, affecting a larger number of them as they grow older. Patients suffering from this disorder experience episodes of involuntary cessation of breathing during sleep that may last a few seconds and recur throughout the night, preventing proper rest. In the case of obstructive apnea, these episodes are related to the collapse of the pharynx, which interrupts the air flow. Currently, OSA is diagnosed through a polysomnographic study, which focuses on the analysis of apnea episodes during sleep and requires the patient to stay at the hospital for the whole night. The complexity and high cost of the procedures involved, combined with the waiting lists, have shown the need for screening techniques, which perhaps would not achieve outstanding performance rates but would allow clinicians to reorganize these lists, ranking patients according to the severity of their condition. Among others, diagnostic imaging and anthropometric characterization of patients have revealed anatomical patterns related to OSA that have a direct influence on speech. Contributions devoted to the study of how this disorder affects speech are scarce and somewhat contradictory. However, the existence of specific patterns related to articulation, phonation and resonance has been known since the late 1980s. At that time these descriptions were virtually useless for developing an automatic system, but they pointed to a link between speech and OSA. In recent years automatic processing techniques have evolved and are now able to identify significant differences in the speech of OSA patients when compared to recordings from healthy subjects. 
Nevertheless, little is known about the connection between these new results, those published in the past and the pathogenesis of the OSA syndrome. This Thesis aims to progress beyond previous research in this area by addressing: the study of how OSA affects patients’ speech, the enhancement of automatic OSA classification based on speech analysis, and its integration with the information embedded in the predictors generally used by clinicians in the preliminary examination of patients. The first two tasks, though they may appear symbiotic at first, are quite different. While studying the connection between speech and OSA requires simple, narrow models that can be easily interpreted, classification requires larger models with a large number of dimensions for the characterization and subsequent identification of the observed patterns. In any case, progress made in the first task should allow us to improve performance on the second one, and the incorporation of the predictors used by clinicians should contribute in the same direction. The Thesis considers both continuous and sustained speech analysis, to exploit the synergies and differences between them. For continuous speech analysis, a conventional speech processing scheme, designed and evaluated before this Thesis, was taken as a baseline. On top of this initial system, several alternative representations of the speech information were proposed, optimized and tested to select those most suitable for the characterization of OSA-specific patterns. Evidence was found of a connection between OSA and the fundamental constituents of speech: the formants. Experimental results showed that the success of the proposed solution is well explained by the ability of speech representations to describe these specific OSA-related components, ignoring the noisy ones as well as those with low discrimination capability. 
The resulting scheme obtained an error rate below 18% with a classification scheme significantly less complex than those described in the literature and operating on a single speech recording. Regarding the connection between OSA and the observed patterns, it was necessary to consider inter- and intra-group differences for this analysis, and to focus on the articulation, replacing the complex classification models with the long-term average spectra. Results clearly point to certain regions on the frequency axis, suggesting the existence of a systematic narrowing in the vocal tract section at the oropharynx. This was already described in the pathogenesis of this syndrome. Regarding sustained speech, experiments similar to those conducted on continuous speech were reproduced on sustained phonations of the vowel /a/. Results were qualitatively similar to the previous ones, though in this case performance rates were found to be noticeably lower. To derive further knowledge from this result, experiments on the long-term average spectra and on intra- and inter-group variability ratios were also reproduced on sustained speech recordings. Results of both experiments showed significant differences from those obtained on continuous speech, which could explain the differences observed in performance. However, sustained speech also provided the opportunity to study phonation within the controlled framework it provides. Phonation was also identified in the literature as a source of information for the detection of OSA. In this study it was found that, for the available dataset, no systematic differences related to phonation could be found between the two groups of speakers. Only those dimensions that describe the energy distribution along the frequency axis provided significant differences, pointing once again towards the resonant components. 
Once classification schemes on both continuous and sustained speech were developed, the Thesis addressed their combination into a single classification system. Under the assumption that the information in continuous and sustained speech is fundamentally different, it should be possible to merge the two successfully. This was tested through a simple fusion scheme that obtained 88.6% correct classification (11.4% error rate), which represents a significant improvement over the state of the art. Finally, the combination of this classifier with the variables used by clinicians obtained 91.3% accuracy (8.7% error rate). This is within the range of alternative but costly and intrusive schemes, which, unlike the one proposed, cannot be used in the preliminary assessment of patients’ condition. In the end, this Thesis has shed new light on the underlying connection between OSA and speech, and has shown the degree of maturity reached by speech technology in OSA characterization and detection, leaving the door open for future research in the multiple directions that have been pointed out and left as future work.
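The abstract does not say how the "simple fusion scheme" works. As a minimal sketch of one common option, assuming score-level fusion by a weighted average of two classifiers' scores, the example below shows how complementary errors can cancel and how accuracy and error rate relate. The function names, the 0.5 threshold and the toy scores are all hypothetical and invented for illustration; they are not the thesis data or method.

```python
import numpy as np

def fuse_scores(score_a, score_b, weight_a=0.5):
    """Weighted average of two per-speaker score vectors (assumed fusion rule)."""
    a = np.asarray(score_a, dtype=float)
    b = np.asarray(score_b, dtype=float)
    return weight_a * a + (1.0 - weight_a) * b

def accuracy(scores, labels, threshold=0.5):
    """Fraction of speakers whose thresholded score matches the label."""
    predictions = (np.asarray(scores) >= threshold).astype(int)
    return float(np.mean(predictions == np.asarray(labels)))

# Made-up toy data: 1 = patient, 0 = control.
labels = np.array([1, 1, 1, 0, 0, 0])
continuous = np.array([0.9, 0.4, 0.7, 0.2, 0.6, 0.1])   # hypothetical scores
sustained = np.array([0.6, 0.8, 0.45, 0.4, 0.3, 0.3])   # hypothetical scores

fused = fuse_scores(continuous, sustained)
for name, scores in [("continuous", continuous),
                     ("sustained", sustained),
                     ("fused", fused)]:
    acc = accuracy(scores, labels)
    # Error rate is the complement of accuracy, as in 88.6% vs 11.4%.
    print(f"{name}: accuracy={acc:.3f}, error rate={1.0 - acc:.3f}")
```

In this toy case each single classifier misclassifies different speakers, so the average corrects both; with real data the gain depends on how independent the two sources actually are.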

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The main objective of the current research was to find the optimum method for separating the most frequent commercial color quality classes of tobacco leaves (cv. "Virginia"). These color classes cover the whole continuous color scale, between "Pale Lemon" and "Oxidated Brown". The usual expert classification carries a significant level of uncertainty. Within this research, several methods for data discrimination were tested in order to reduce this uncertainty. Classification errors below 5% were obtained with the proposed classifier over two different seasons (1994 and 1995).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent years, several strategies for moving object detection based on non-parametric background-foreground modeling have been proposed. To combine both models and obtain the probability that a pixel belongs to the foreground, these strategies use Bayesian classifiers. However, these classifiers cannot take advantage of additional prior information at different pixels. We therefore propose a novel and efficient alternative Bayesian classifier that suits this kind of strategy and allows the use of any prior information. Additionally, we present an effective method to dynamically estimate the prior probability from the result of a particle filter-based tracking strategy.
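The idea of a per-pixel prior can be illustrated with plain Bayes' rule. The sketch below is not the classifier proposed in the paper, whose details the abstract does not give. It only assumes that foreground and background likelihood maps are already available (for example from non-parametric models) and shows how replacing a constant prior with a spatially varying prior map changes the foreground posterior. The function name and the toy values are hypothetical.

```python
import numpy as np

def foreground_posterior(lik_fg, lik_bg, prior_fg):
    """Per-pixel P(foreground | x) from likelihood maps and a prior.

    prior_fg may be a scalar (constant prior) or an array with the same
    shape as the likelihood maps (spatially varying prior).
    """
    lik_fg = np.asarray(lik_fg, dtype=float)
    lik_bg = np.asarray(lik_bg, dtype=float)
    prior_fg = np.broadcast_to(np.asarray(prior_fg, dtype=float), lik_fg.shape)
    numerator = lik_fg * prior_fg
    evidence = numerator + lik_bg * (1.0 - prior_fg)
    # Where both likelihoods vanish the data say nothing; fall back to the prior.
    safe_evidence = np.where(evidence > 0, evidence, 1.0)
    return np.where(evidence > 0, numerator / safe_evidence, prior_fg)

# Toy 2x2 image with made-up likelihood values.
lik_fg = np.array([[0.6, 0.6], [0.1, 0.0]])
lik_bg = np.array([[0.2, 0.2], [0.3, 0.0]])

constant = foreground_posterior(lik_fg, lik_bg, 0.5)
# Hypothetical prior map, e.g. higher where a tracker expects the object.
prior_map = np.array([[0.5, 0.9], [0.5, 0.2]])
spatial = foreground_posterior(lik_fg, lik_bg, prior_map)

print(constant)
print(spatial)
```

The two top pixels have identical likelihoods, so a constant prior gives them the same posterior, while the prior map raises the posterior of the pixel where the object is expected.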

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Electronic devices endowed with camera platforms require new and powerful machine vision applications, which commonly include moving object detection strategies. To obtain high-quality results, the most recent strategies estimate background and foreground models nonparametrically and combine them by means of a Bayesian classifier. However, typical classifiers are limited by the use of constant prior values and do not allow the inclusion of additional spatially dependent prior information. In this Letter, we propose an alternative Bayesian classifier that, unlike those reported before, allows the use of additional prior information obtained from any source and depending on the spatial position of each pixel.