750 resultados para CLASSIFIER
Resumo:
MFCC coefficients extracted from the power spectral density of speech as a whole, seems to have become the de facto standard in the area of speaker recognition, as demonstrated by its use in almost all systems submitted to the 2013 Speaker Recognition Evaluation (SRE) in Mobile Environment [1], thus relegating to background this component of the recognition systems. However, in this article we will show that selecting the adequate speaker characterization system is as important as the selection of the classifier. To accomplish this we will compare the recognition rates achieved by different recognition systems that relies on the same classifier (GMM-UBM) but connected with different feature extraction systems (based on both classical and biometric parameters). As a result we will show that a gender dependent biometric parameterization with a simple recognition system based on GMM- UBM paradigm provides very competitive or even better recognition rates when compared to more complex classification systems based on classical features
Resumo:
In this paper, we analyze the performance of several well-known pattern recognition and dimensionality reduction techniques when applied to mass-spectrometry data for odor biometric identification. Motivated by the successful results of previous works capturing the odor from other parts of the body, this work attempts to evaluate the feasibility of identifying people by the odor emanated from the hands. By formulating this task according to a machine learning scheme, the problem is identified with a small-sample-size supervised classification problem in which the input data is formed by mass spectrograms from the hand odor of 13 subjects captured in different sessions. The high dimensionality of the data makes it necessary to apply feature selection and extraction techniques together with a simple classifier in order to improve the generalization capabilities of the model. Our experimental results achieve recognition rates over 85% which reveals that there exists discriminatory information in the hand odor and points at body odor as a promising biometric identifier.
Resumo:
BACKGROUND: Clinical Trials (CTs) are essential for bridging the gap between experimental research on new drugs and their clinical application. Just like CTs for traditional drugs and biologics have helped accelerate the translation of biomedical findings into medical practice, CTs for nanodrugs and nanodevices could advance novel nanomaterials as agents for diagnosis and therapy. Although there is publicly available information about nanomedicine-related CTs, the online archiving of this information is carried out without adhering to criteria that discriminate between studies involving nanomaterials or nanotechnology-based processes (nano), and CTs that do not involve nanotechnology (non-nano). Finding out whether nanodrugs and nanodevices were involved in a study from CT summaries alone is a challenging task. At the time of writing, CTs archived in the well-known online registry ClinicalTrials.gov are not easily told apart as to whether they are nano or non-nano CTs-even when performed by domain experts, due to the lack of both a common definition for nanotechnology and of standards for reporting nanomedical experiments and results. METHODS: We propose a supervised learning approach for classifying CT summaries from ClinicalTrials.gov according to whether they fall into the nano or the non-nano categories. Our method involves several stages: i) extraction and manual annotation of CTs as nano vs. non-nano, ii) pre-processing and automatic classification, and iii) performance evaluation using several state-of-the-art classifiers under different transformations of the original dataset. RESULTS AND CONCLUSIONS: The performance of the best automated classifier closely matches that of experts (AUC over 0.95), suggesting that it is feasible to automatically detect the presence of nanotechnology products in CT summaries with a high degree of accuracy. This can significantly speed up the process of finding whether reports on ClinicalTrials.gov might be relevant to a particular nanoparticle or nanodevice, which is essential to discover any precedents for nanotoxicity events or advantages for targeted drug therapy.
Resumo:
La heterogeneidad del medio geológico introduce en el proyecto de obra subterránea un alto grado de incertidumbre que debe ser debidamente gestionado a fin de reducir los riesgos asociados, que son fundamentalmente de tipo geotécnico. Entre los principales problemas a los que se enfrenta la Mecánica de Rocas moderna en el ámbito de la construcción subterránea, se encuentran la fluencia de roca en túneles (squeezing) y la rotura de pilares de carbón. Es ampliamente conocido que su aparición causa importantes perjuicios en el coste y la seguridad de los proyectos por lo que su estudio, ha estado tradicionalmente vinculado a la predicción de su ocurrencia. Entre las soluciones existentes para la determinación de estos problemas se encuentran las que se basan en métodos analíticos y numéricos. Estas metodologías son capaces de proporcionar un alto nivel de representatividad respecto del comportamiento geotécnico real, sin embargo, su utilización solo es posible cuando se dispone de una suficiente caracterización geotécnica y por tanto de una detallada definición de los parámetros que alimentan los complejos modelos constitutivos y criterios de rotura que los fenómenos estudiados requieren. Como es lógico, este nivel de definición solo es posible cuando se alcanzan etapas avanzadas de proyecto, incluso durante la propia construcción, a fin de calibrar adecuadamente los parámetros introducidos en los modelos, lo que supone una limitación de uso en etapas iniciales, cuando su predicción tiene verdadero sentido. Por su parte, los métodos empíricos permiten proporcionar soluciones a estos complejos problemas de un modo sencillo, con una baja parametrización y, dado su eminente enfoque observacional, de gran fiabilidad cuando se implementan sobre condiciones de contorno similares a las originales. La sencillez y escasez de los parámetros utilizados permiten a estas metodologías ser utilizadas desde las fases preliminares del proyecto, ya que estos constituyen en general, información habitual de fácil y económica adquisición. Este aspecto permite por tanto incorporar la predicción desde el principio del proceso de diseño, anticipando el riesgo en origen. En esta tesis doctoral, se presenta una nueva metodología empírica que sirve para proporcionar predicciones para la ocurrencia de squeezing y el fallo de pilares de carbón basada en una extensa recopilación de información de casos reales de túneles y minas en las que ambos fenómenos fueron evaluados. Esta información, recogida de referencias bibliográficas de prestigio, ha permitido recopilar una de las más extensas bases de datos existentes hasta la fecha relativa a estos fenómenos, lo que supone en sí mismo una importante contribución sobre el estado del arte. Con toda esta información, y con la ayuda de la teoría de clasificadores estadísticos, se ha implementado sobre las bases de datos un clasificador lineal de tipo regresión logística que permite hacer predicciones sobre la ocurrencia de ambos fenómenos en términos de probabilidad, y por tanto ponderar la incertidumbre asociada a la heterogeneidad incorporada por el medio geológico. Este aspecto del desarrollo es el verdadero valor añadido proporcionado por la tesis y la principal ventaja de la solución propuesta respecto de otras metodologías empíricas. Esta capacidad de ponderación probabilística permite al clasificador constituir una solución muy interesante como metodología para la evaluación de riesgo geotécnico y la toma de decisiones. De hecho, y como ejercicio de validación práctica, se ha implementado la solución desarrollada en un modelo coste-beneficio asociado a la optimización del diseño de pilares involucrados en una de mina “virtual” explotada por tajos largos. La capacidad del clasificador para cuantificar la probabilidad de fallo del diseño, junto con una adecuada cuantificación de las consecuencias de ese fallo, ha permitido definir una ley de riesgo que se ha incorporado al balance de costes y beneficios, que es capaz, a partir del redimensionamiento iterativo del sistema de pilares y de la propia configuración de la mina, maximizar el resultado económico del proyecto minero bajo unas condiciones de seguridad aceptables, fijadas de antemano. Geological media variability introduces to the subterranean project a high grade of uncertainty that should be properly managed with the aim to reduce the associated risks, which are mainly geotechnical. Among the major problems facing the modern Rock Mechanics in the field of underground construction are both, the rock squeezing while tunneling and the failure of coal pillars. Given their harmfulness to the cost and safety of the projects, their study has been traditionally linked to the determination of its occurrence. Among the existing solutions for the determination of these problems are those that are based on analytical and numerical methods. Those methodologies allow providing a high level of reliability of the geotechnical behavior, and therefore a detailed definition of the parameters that feed the complex constitutive models and failure criteria that require the studied phenomena. Obviously, this level of definition is only possible when advanced stages of the project are achieved and even during construction in order to properly calibrate the parameters entered in the models, which suppose a limited use in early stages, when the prediction has true sense. Meanwhile, empirical methods provide solutions to these complex problems in a simple way, with low parameterization and, given his observational scope, with highly reliability when implemented on similar conditions to the original context. The simplicity and scarcity of the parameters used allow these methodologies be applied in the early stages of the project, since that information should be commonly easy and cheaply to get. This aspect can therefore incorporate the prediction from the beginning of the design process, anticipating the risk beforehand. This thesis, based on the extensive data collection of case histories of tunnels and underground mines, presents a novel empirical approach used to provide predictions for the occurrence of both, squeezing and coal pillars failures. The information has been collected from prestigious references, providing one of the largest databases to date concerning phenomena, a fact which provides an important contribution to the state of the art. With all this information, and with the aid of the theory of statistical classifiers, it has been implemented on both databases, a type linear logistic regression classifier that allows predictions about the occurrence of these phenomena in terms of probability, and therefore weighting the uncertainty associated with geological variability. This aspect of the development is the real added value provided by the thesis and the main advantage of the proposed solution over other empirical methodologies. This probabilistic weighting capacity, allows being the classifier a very interesting methodology for the evaluation of geotechnical risk and decision making. In fact, in order to provide a practical validation, we have implemented the developed solution within a cost-benefit analysis associated with the optimization of the design of coal pillar systems involved in a "virtual" longwall mine. The ability of the classifier to quantify the probability of failure of the design along with proper quantification of the consequences of that failure, has allowed defining a risk law which is introduced into the cost-benefits model, which is able, from iterative resizing of the pillar system and the configuration of the mine, maximize the economic performance of the mining project under acceptable safety conditions established beforehand.
Resumo:
La rápida adopción de dispositivos electrónicos en el automóvil, ha contribuido a mejorar en gran medida la seguridad y el confort. Desde principios del siglo 20, la investigación en sistemas de seguridad activa ha originado el desarrollo de tecnologías como ABS (Antilock Brake System), TCS (Traction Control System) y ESP (Electronic Stability Program). El coste de despliegue de estos sistemas es crítico: históricamente, sólo han sido ampliamente adoptados cuando el precio de los sensores y la electrónica necesarios para su construcción ha caído hasta un valor marginal. Hoy en día, los vehículos a motor incluyen un amplio rango de sensores para implementar las funciones de seguridad. La incorporación de sistemas que detecten la presencia de agua, hielo o nieve en la vía es un factor adicional que podría ayudar a evitar situaciones de riesgo. Existen algunas implementaciones prácticas capaces de detectar carreteras mojadas, heladas y nevadas, aunque con limitaciones importantes. En esta tesis doctoral, se propone una aproximación novedosa al problema, basada en el análisis del ruido de rodadura generado durante la conducción. El ruido de rodadura es capturado y preprocesado. Después es analizado utilizando un clasificador basado en máquinas de vectores soporte (SVM), con el fin de generar una estimación del estado del firme. Todas estas operaciones se realizan en el propio vehículo. El sistema propuesto se ha desarrollado y evaluado utilizando Matlabr, mostrando tasas de aciertos de más del 90%. Se ha realizado una implementación en tiempo real, utilizando un prototipo basado en DSP. Después se han introducido varias optimizaciones para permitir que el sistema sea realizable usando un microcontrolador de propósito general. Finalmente se ha realizado una implementación hardware basada en un microcontrolador, integrándola estrechamente con las ECU del vehículo, pudiendo obtener datos capturados por los sensores del mismo y enviar las estimaciones del estado del firme. El sistema resultante ha sido patentado, y destaca por su elevada tasa de aciertos con un tamaño, consumo y coste reducidos. ABSTRACT Proliferation of automotive electronics, has greatly improved driving safety and comfort. Since the beginning of the 20th century, investigation in active safety systems has resulted in the development of technologies such as ABS (Antilock Brake System), TCS (Traction Control System) and ESP (Electronic Stability Program). Deployment cost of these systems is critical: historically, they have been widely adopted only when the price of the sensors and electronics needed to build them has been cut to a marginal value. Nowadays, motor vehicles include a wide range of sensors to implement the safety functions. Incorporation of systems capable of detecting water, ice or snow on the road is an additional factor that could help avoiding risky situations. There are some implementations capable of detecting wet, icy and snowy roads, although with important limitations. In this PhD Thesis, a novel approach is proposed, based on the analysis of the tyre/road noise radiated during driving. Tyre/road noise is captured and pre-processed. Then it is analysed using a Support Vector Machine (SVM) based classifier, to output an estimation of the road status. All these operations are performed on-board. Proposed system is developed and evaluated using Matlabr, showing success rates greater than 90%. A real time implementation is carried out using a DSP based prototype. Several optimizations are introduced enabling the system to work using a low-cost general purpose microcontroller. Finally a microcontroller based hardware implementation is developed. This implementation is tightly integrated with the vehicle ECUs, allowing it to obtain data captured by its sensors, and to send the road status estimations. Resulting system has been patented, and is notable because of its high hit rate, small size, low power consumption and low cost.
Diseño de algoritmos de guerra electrónica y radar para su implementación en sistemas de tiempo real
Resumo:
Esta tesis se centra en el estudio y desarrollo de algoritmos de guerra electrónica {electronic warfare, EW) y radar para su implementación en sistemas de tiempo real. La llegada de los sistemas de radio, radar y navegación al terreno militar llevó al desarrollo de tecnologías para combatirlos. Así, el objetivo de los sistemas de guerra electrónica es el control del espectro electomagnético. Una de la funciones de la guerra electrónica es la inteligencia de señales {signals intelligence, SIGINT), cuya labor es detectar, almacenar, analizar, clasificar y localizar la procedencia de todo tipo de señales presentes en el espectro. El subsistema de inteligencia de señales dedicado a las señales radar es la inteligencia electrónica {electronic intelligence, ELINT). Un sistema de tiempo real es aquel cuyo factor de mérito depende tanto del resultado proporcionado como del tiempo en que se da dicho resultado. Los sistemas radar y de guerra electrónica tienen que proporcionar información lo más rápido posible y de forma continua, por lo que pueden encuadrarse dentro de los sistemas de tiempo real. La introducción de restricciones de tiempo real implica un proceso de realimentación entre el diseño del algoritmo y su implementación en plataformas “hardware”. Las restricciones de tiempo real son dos: latencia y área de la implementación. En esta tesis, todos los algoritmos presentados se han implementado en plataformas del tipo field programmable gate array (FPGA), ya que presentan un buen compromiso entre velocidad, coste total, consumo y reconfigurabilidad. La primera parte de la tesis está centrada en el estudio de diferentes subsistemas de un equipo ELINT: detección de señales mediante un detector canalizado, extracción de los parámetros de pulsos radar, clasificación de modulaciones y localization pasiva. La transformada discreta de Fourier {discrete Fourier transform, DFT) es un detector y estimador de frecuencia quasi-óptimo para señales de banda estrecha en presencia de ruido blanco. El desarrollo de algoritmos eficientes para el cálculo de la DFT, conocidos como fast Fourier transform (FFT), han situado a la FFT como el algoritmo más utilizado para la detección de señales de banda estrecha con requisitos de tiempo real. Así, se ha diseñado e implementado un algoritmo de detección y análisis espectral para su implementación en tiempo real. Los parámetros más característicos de un pulso radar son su tiempo de llegada y anchura de pulso. Se ha diseñado e implementado un algoritmo capaz de extraer dichos parámetros. Este algoritmo se puede utilizar con varios propósitos: realizar un reconocimiento genérico del radar que transmite dicha señal, localizar la posición de dicho radar o bien puede utilizarse como la parte de preprocesado de un clasificador automático de modulaciones. La clasificación automática de modulaciones es extremadamente complicada en entornos no cooperativos. Un clasificador automático de modulaciones se divide en dos partes: preprocesado y el algoritmo de clasificación. Los algoritmos de clasificación basados en parámetros representativos calculan diferentes estadísticos de la señal de entrada y la clasifican procesando dichos estadísticos. Los algoritmos de localization pueden dividirse en dos tipos: triangulación y sistemas cuadráticos. En los algoritmos basados en triangulación, la posición se estima mediante la intersección de las rectas proporcionadas por la dirección de llegada de la señal. En cambio, en los sistemas cuadráticos, la posición se estima mediante la intersección de superficies con igual diferencia en el tiempo de llegada (time difference of arrival, TDOA) o diferencia en la frecuencia de llegada (frequency difference of arrival, FDOA). Aunque sólo se ha implementado la estimación del TDOA y FDOA mediante la diferencia de tiempos de llegada y diferencia de frecuencias, se presentan estudios exhaustivos sobre los diferentes algoritmos para la estimación del TDOA, FDOA y localización pasiva mediante TDOA-FDOA. La segunda parte de la tesis está dedicada al diseño e implementación filtros discretos de respuesta finita (finite impulse response, FIR) para dos aplicaciones radar: phased array de banda ancha mediante filtros retardadores (true-time delay, TTD) y la mejora del alcance de un radar sin modificar el “hardware” existente para que la solución sea de bajo coste. La operación de un phased array de banda ancha mediante desfasadores no es factible ya que el retardo temporal no puede aproximarse mediante un desfase. La solución adoptada e implementada consiste en sustituir los desfasadores por filtros digitales con retardo programable. El máximo alcance de un radar depende de la relación señal a ruido promedio en el receptor. La relación señal a ruido depende a su vez de la energía de señal transmitida, potencia multiplicado por la anchura de pulso. Cualquier cambio hardware que se realice conlleva un alto coste. La solución que se propone es utilizar una técnica de compresión de pulsos, consistente en introducir una modulación interna a la señal, desacoplando alcance y resolución. ABSTRACT This thesis is focused on the study and development of electronic warfare (EW) and radar algorithms for real-time implementation. The arrival of radar, radio and navigation systems to the military sphere led to the development of technologies to fight them. Therefore, the objective of EW systems is the control of the electromagnetic spectrum. Signals Intelligence (SIGINT) is one of the EW functions, whose mission is to detect, collect, analyze, classify and locate all kind of electromagnetic emissions. Electronic intelligence (ELINT) is the SIGINT subsystem that is devoted to radar signals. A real-time system is the one whose correctness depends not only on the provided result but also on the time in which this result is obtained. Radar and EW systems must provide information as fast as possible on a continuous basis and they can be defined as real-time systems. The introduction of real-time constraints implies a feedback process between the design of the algorithms and their hardware implementation. Moreover, a real-time constraint consists of two parameters: Latency and area of the implementation. All the algorithms in this thesis have been implemented on field programmable gate array (FPGAs) platforms, presenting a trade-off among performance, cost, power consumption and reconfigurability. The first part of the thesis is related to the study of different key subsystems of an ELINT equipment: Signal detection with channelized receivers, pulse parameter extraction, modulation classification for radar signals and passive location algorithms. The discrete Fourier transform (DFT) is a nearly optimal detector and frequency estimator for narrow-band signals buried in white noise. The introduction of fast algorithms to calculate the DFT, known as FFT, reduces the complexity and the processing time of the DFT computation. These properties have placed the FFT as one the most conventional methods for narrow-band signal detection for real-time applications. An algorithm for real-time spectral analysis for user-defined bandwidth, instantaneous dynamic range and resolution is presented. The most characteristic parameters of a pulsed signal are its time of arrival (TOA) and the pulse width (PW). The estimation of these basic parameters is a fundamental task in an ELINT equipment. A basic pulse parameter extractor (PPE) that is able to estimate all these parameters is designed and implemented. The PPE may be useful to perform a generic radar recognition process, perform an emitter location technique and can be used as the preprocessing part of an automatic modulation classifier (AMC). Modulation classification is a difficult task in a non-cooperative environment. An AMC consists of two parts: Signal preprocessing and the classification algorithm itself. Featurebased algorithms obtain different characteristics or features of the input signals. Once these features are extracted, the classification is carried out by processing these features. A feature based-AMC for pulsed radar signals with real-time requirements is studied, designed and implemented. Emitter passive location techniques can be divided into two classes: Triangulation systems, in which the emitter location is estimated with the intersection of the different lines of bearing created from the estimated directions of arrival, and quadratic position-fixing systems, in which the position is estimated through the intersection of iso-time difference of arrival (TDOA) or iso-frequency difference of arrival (FDOA) quadratic surfaces. Although TDOA and FDOA are only implemented with time of arrival and frequency differences, different algorithms for TDOA, FDOA and position estimation are studied and analyzed. The second part is dedicated to FIR filter design and implementation for two different radar applications: Wideband phased arrays with true-time delay (TTD) filters and the range improvement of an operative radar with no hardware changes to minimize costs. Wideband operation of phased arrays is unfeasible because time delays cannot be approximated by phase shifts. The presented solution is based on the substitution of the phase shifters by FIR discrete delay filters. The maximum range of a radar depends on the averaged signal to noise ratio (SNR) at the receiver. Among other factors, the SNR depends on the transmitted signal energy that is power times pulse width. Any possible hardware change implies high costs. The proposed solution lies in the use of a signal processing technique known as pulse compression, which consists of introducing an internal modulation within the pulse width, decoupling range and resolution.
Resumo:
A nivel mundial, el cáncer de mama es el tipo de cáncer más frecuente además de una de las principales causas de muerte entre la población femenina. Actualmente, el método más eficaz para detectar lesiones mamarias en una etapa temprana es la mamografía. Ésta contribuye decisivamente al diagnóstico precoz de esta enfermedad que, si se detecta a tiempo, tiene una probabilidad de curación muy alta. Uno de los principales y más frecuentes hallazgos en una mamografía, son las microcalcificaciones, las cuales son consideradas como un indicador importante de cáncer de mama. En el momento de analizar las mamografías, factores como la capacidad de visualización, la fatiga o la experiencia profesional del especialista radiólogo hacen que el riesgo de omitir ciertas lesiones presentes se vea incrementado. Para disminuir dicho riesgo es importante contar con diferentes alternativas como por ejemplo, una segunda opinión por otro especialista o un doble análisis por el mismo. En la primera opción se eleva el coste y en ambas se prolonga el tiempo del diagnóstico. Esto supone una gran motivación para el desarrollo de sistemas de apoyo o asistencia en la toma de decisiones. En este trabajo de tesis se propone, se desarrolla y se justifica un sistema capaz de detectar microcalcificaciones en regiones de interés extraídas de mamografías digitalizadas, para contribuir a la detección temprana del cáncer demama. Dicho sistema estará basado en técnicas de procesamiento de imagen digital, de reconocimiento de patrones y de inteligencia artificial. Para su desarrollo, se tienen en cuenta las siguientes consideraciones: 1. Con el objetivo de entrenar y probar el sistema propuesto, se creará una base de datos de imágenes, las cuales pertenecen a regiones de interés extraídas de mamografías digitalizadas. 2. Se propone la aplicación de la transformada Top-Hat, una técnica de procesamiento digital de imagen basada en operaciones de morfología matemática. La finalidad de aplicar esta técnica es la de mejorar el contraste entre las microcalcificaciones y el tejido presente en la imagen. 3. Se propone un algoritmo novel llamado sub-segmentación, el cual está basado en técnicas de reconocimiento de patrones aplicando un algoritmo de agrupamiento no supervisado, el PFCM (Possibilistic Fuzzy c-Means). El objetivo es encontrar las regiones correspondientes a las microcalcificaciones y diferenciarlas del tejido sano. Además, con la finalidad de mostrar las ventajas y desventajas del algoritmo propuesto, éste es comparado con dos algoritmos del mismo tipo: el k-means y el FCM (Fuzzy c-Means). Por otro lado, es importante destacar que en este trabajo por primera vez la sub-segmentación es utilizada para detectar regiones pertenecientes a microcalcificaciones en imágenes de mamografía. 4. Finalmente, se propone el uso de un clasificador basado en una red neuronal artificial, específicamente un MLP (Multi-layer Perceptron). El propósito del clasificador es discriminar de manera binaria los patrones creados a partir de la intensidad de niveles de gris de la imagen original. Dicha clasificación distingue entre microcalcificación y tejido sano. ABSTRACT Breast cancer is one of the leading causes of women mortality in the world and its early detection continues being a key piece to improve the prognosis and survival. Currently, the most reliable and practical method for early detection of breast cancer is mammography.The presence of microcalcifications has been considered as a very important indicator ofmalignant types of breast cancer and its detection and classification are important to prevent and treat the disease. However, the detection and classification of microcalcifications continue being a hard work due to that, in mammograms there is a poor contrast between microcalcifications and the tissue around them. Factors such as visualization, tiredness or insufficient experience of the specialist increase the risk of omit some present lesions. To reduce this risk, is important to have alternatives such as a second opinion or a double analysis for the same specialist. In the first option, the cost increases and diagnosis time also increases for both of them. This is the reason why there is a great motivation for development of help systems or assistance in the decision making process. This work presents, develops and justifies a system for the detection of microcalcifications in regions of interest extracted fromdigitizedmammographies to contribute to the early detection of breast cancer. This systemis based on image processing techniques, pattern recognition and artificial intelligence. For system development the following features are considered: With the aim of training and testing the system, an images database is created, belonging to a region of interest extracted from digitized mammograms. The application of the top-hat transformis proposed. This image processing technique is based on mathematical morphology operations. The aim of this technique is to improve the contrast betweenmicrocalcifications and tissue present in the image. A novel algorithm called sub-segmentation is proposed. The sub-segmentation is based on pattern recognition techniques applying a non-supervised clustering algorithm known as Possibilistic Fuzzy c-Means (PFCM). The aim is to find regions corresponding to the microcalcifications and distinguish them from the healthy tissue. Furthermore,with the aim of showing themain advantages and disadvantages this is compared with two algorithms of same type: the k-means and the fuzzy c-means (FCM). On the other hand, it is important to highlight in this work for the first time the sub-segmentation is used for microcalcifications detection. Finally, a classifier based on an artificial neural network such as Multi-layer Perceptron is used. The purpose of this classifier is to discriminate froma binary perspective the patterns built from gray level intensity of the original image. This classification distinguishes between microcalcifications and healthy tissue.
Resumo:
This paper describes our participation at SemEval- 2014 sentiment analysis task, in both contextual and message polarity classification. Our idea was to com- pare two different techniques for sentiment analysis. First, a machine learning classifier specifically built for the task using the provided training corpus. On the other hand, a lexicon-based approach using natural language processing techniques, developed for a ge- neric sentiment analysis task with no adaptation to the provided training corpus. Results, though far from the best runs, prove that the generic model is more robust as it achieves a more balanced evaluation for message polarity along the different test sets.
Resumo:
This paper describes our participation at the RepLab 2014 reputation dimensions scenario. Our idea was to evaluate the best combination strategy of a machine learning classifier with a rule-based algorithm based on logical expressions of terms. Results show that our baseline experiment using just Naive Bayes Multinomial with a term vector model representation of the tweet text is ranked second among runs from all participants in terms of accuracy.
Resumo:
This paper describes our participation at PAN 2014 author profiling task. Our idea was to define, develop and evaluate a simple machine learning classifier able to guess the gender and the age of a given user based on his/her texts, which could become part of the solution portfolio of the company. We were interested in finding not the best possible classifier that achieves the highest accuracy, but to find the optimum balance between performance and throughput using the most simple strategy and less dependent of external systems. Results show that our software using Naive Bayes Multinomial with a term vector model representation of the text is ranked quite well among the rest of participants in terms of accuracy.
Resumo:
This paper presents a robust approach for recognition of thermal face images based on decision level fusion of 34 different region classifiers. The region classifiers concentrate on local variations. They use singular value decomposition (SVD) for feature extraction. Fusion of decisions of the region classifier is done by using majority voting technique. The algorithm is tolerant against false exclusion of thermal information produced by the presence of inconsistent distribution of temperature statistics which generally make the identification process difficult. The algorithm is extensively evaluated on UGC-JU thermal face database, and Terravic facial infrared database and the recognition performance are found to be 95.83% and 100%, respectively. A comparative study has also been made with the existing works in the literature.
Resumo:
This work proposes an optimization of a semi-supervised Change Detection methodology based on a combination of Change Indices (CI) derived from an image multitemporal data set. For this purpose, SPOT 5 Panchromatic images with 2.5 m spatial resolution have been used, from which three Change Indices have been calculated. Two of them are usually known indices; however the third one has been derived considering the Kullbak-Leibler divergence. Then, these three indices have been combined forming a multiband image that has been used in as input for a Support Vector Machine (SVM) classifier where four different discriminant functions have been tested in order to differentiate between change and no_change categories. The performance of the suggested procedure has been assessed applying different quality measures, reaching in each case highly satisfactory values. These results have demonstrated that the simultaneous combination of basic change indices with others more sophisticated like the Kullback-Leibler distance, and the application of non-parametric discriminant functions like those employees in the SVM method, allows solving efficiently a change detection problem.
Resumo:
Nonlinear analysis tools for studying and characterizing the dynamics of physiological signals have gained popularity, mainly because tracking sudden alterations of the inherent complexity of biological processes might be an indicator of altered physiological states. Typically, in order to perform an analysis with such tools, the physiological variables that describe the biological process under study are used to reconstruct the underlying dynamics of the biological processes. For that goal, a procedure called time-delay or uniform embedding is usually employed. Nonetheless, there is evidence of its inability for dealing with non-stationary signals, as those recorded from many physiological processes. To handle with such a drawback, this paper evaluates the utility of non-conventional time series reconstruction procedures based on non uniform embedding, applying them to automatic pattern recognition tasks. The paper compares a state of the art non uniform approach with a novel scheme which fuses embedding and feature selection at once, searching for better reconstructions of the dynamics of the system. Moreover, results are also compared with two classic uniform embedding techniques. Thus, the goal is comparing uniform and non uniform reconstruction techniques, including the one proposed in this work, for pattern recognition in biomedical signal processing tasks. Once the state space is reconstructed, the scheme followed characterizes with three classic nonlinear dynamic features (Largest Lyapunov Exponent, Correlation Dimension and Recurrence Period Density Entropy), while classification is carried out by means of a simple k-nn classifier. In order to test its generalization capabilities, the approach was tested with three different physiological databases (Speech Pathologies, Epilepsy and Heart Murmurs). In terms of the accuracy obtained to automatically detect the presence of pathologies, and for the three types of biosignals analyzed, the non uniform techniques used in this work lightly outperformed the results obtained using the uniform methods, suggesting their usefulness to characterize non-stationary biomedical signals in pattern recognition applications. On the other hand, in view of the results obtained and its low computational load, the proposed technique suggests its applicability for the applications under study.
Resumo:
A novel GPU-based nonparametric moving object detection strategy for computer vision tools requiring real-time processing is proposed. An alternative and efficient Bayesian classifier to combine nonparametric background and foreground models allows increasing correct detections while avoiding false detections. Additionally, an efficient region of interest analysis significantly reduces the computational cost of the detections.
Resumo:
This paper presents a novel background modeling system that uses a spatial grid of Support Vector Machines classifiers for segmenting moving objects, which is a key step in many video-based consumer applications. The system is able to adapt to a large range of dynamic background situations since no parametric model or statistical distribution are assumed. This is achieved by using a different classifier per image region that learns the specific appearance of that scene region and its variations (illumination changes, dynamic backgrounds, etc.). The proposed system has been tested with a recent public database, outperforming other state-of-the-art algorithms.