929 results for Principal component analysis discriminant analysis
Abstract:
BACKGROUND AND AIMS Inflammatory bowel disease (IBD) frequently manifests during childhood and adolescence. To provide a comprehensive picture of a patient's health status, health-related quality of life (HRQoL) instruments are an essential complement to clinical symptoms and functional limitations. Currently, the IMPACT-III questionnaire is one of the most frequently used disease-specific HRQoL instruments among patients with IBD. However, there is a lack of studies examining the validity and reliability of this instrument. METHODS A total of 146 paediatric IBD patients from the multicenter Swiss IBD paediatric cohort study database were included in the study. Medical and laboratory data were extracted from the hospital records. HRQoL data were assessed by means of standardized questionnaires filled out by the patients in a face-to-face interview. RESULTS The original six IMPACT-III domain scales could not be replicated in the current sample. A principal component analysis with the extraction of four factor scores revealed the most robust solution. The four factors indicated good internal reliability (Cronbach's alpha=.64-.86), good concurrent validity measured by correlations with the generic KIDSCREEN-27 scales, and excellent discriminant validity for the dimension of physical functioning measured by HRQoL differences between active and inactive severity groups (p<.001, d=1.04). CONCLUSIONS This study of Swiss children with IBD indicates good validity and reliability of the IMPACT-III questionnaire. However, our findings suggest a slightly different factor structure than originally proposed. The IMPACT-III questionnaire can be recommended for use in clinical practice. The factor structure should be further examined in other samples.
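As a reading aid for the reliability figures quoted in this abstract, the following is a minimal sketch of how Cronbach's alpha can be computed from item-level scores. The function name and the toy scores are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a scale.

    `items` holds one list of scores per questionnaire item; every inner
    list has one entry per respondent, in the same respondent order.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score of each respondent across all items.
    totals = [sum(scores) for scores in zip(*items)]
    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores: three items answered by five respondents.
toy_items = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 3],
    [3, 5, 2, 4, 4],
]
alpha = cronbach_alpha(toy_items)
print(f"alpha = {alpha:.2f}")
```

Alpha approaches 1 when the items vary together across respondents, which is why it is read as a measure of internal consistency.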
Abstract:
Time domain laser reflectance spectroscopy (TDRS) was applied for the first time to evaluate internal fruit quality. This technique, established in medical fields, had not previously been used in agricultural or food research. It allows the simultaneous non-destructive measurement of two optical characteristics of the tissues: light scattering and absorption. Models to measure firmness and sugar and acid contents in kiwifruit, tomato, apple, peach, nectarine and other fruits were built using sequential statistical techniques: principal component analysis, multiple stepwise linear regression, clustering and discriminant analysis. Consistent correlations were established between the two parameters measured with TDRS, i.e. the absorption and transport scattering coefficients, and chemical constituents (sugars and acids) and firmness, respectively. Classification models were built to sort fruits into three quality grades according to their firmness, soluble solids and acidity.
Abstract:
Mulch materials of different origins have been introduced into the agricultural sector in recent years as alternatives to standard polyethylene because of its environmental impact. This study aimed to evaluate the multivariate response of mulch materials over three consecutive years in a processing tomato (Solanum lycopersicon L.) crop in Central Spain. Two biodegradable plastic mulches (BD1, BD2), one oxo-biodegradable material (OB), two types of paper (PP1, PP2), and one barley straw cover (BS) were compared with two control treatments (standard black polyethylene [PE] and manual weed control [MW]). A total of 17 variables relating to yield, fruit quality, and weed control were investigated. Several multivariate statistical techniques were applied, including principal component analysis, cluster analysis, and discriminant analysis. A group of mulch materials comprising OB and BD2 was found to be comparable to black polyethylene regarding all the variables considered. The weed control variables were found to be an important source of discrimination. The two paper mulches tested did not share the same treatment group membership in any case: PP2 presented a multivariate response more similar to the biodegradable plastics, while PP1 was more similar to BS and MW. Based on our multivariate approach, the materials OB and BD2 can be used as an effective, more environmentally friendly alternative to polyethylene mulches.
Abstract:
The main goal of this project is to develop a facial recognition system. Its specific objectives are: to survey the face recognition techniques currently available, to choose an application where face recognition might be useful, to design and develop a face recognition program in MATLAB, and to evaluate the performance of the implemented system. This document is divided into four parts: INTRODUCTION, THEORETICAL FRAMEWORK, IMPLEMENTATION, and RESULTS, CONCLUSIONS AND FUTURE RESEARCH. The first part introduces the current state of facial recognition and briefly reviews the techniques available for developing a biometric system of this kind. It also justifies the techniques chosen for the implementation. The second part, the theoretical framework, explains the general structure of a biometric recognition system, its two basic modes of operation, and the error rates used to evaluate and compare performance. It also describes in more depth the concepts and methods used for face detection and recognition in the third part. The third part gives a detailed description of the proposed solution: its design, characteristics, and application. The implementation is a MATLAB program with a graphical user interface that uses four face recognition systems, each based on a different technique: Principal Component Analysis (PCA), Fisher's Linear Discriminant (FLD), Gabor wavelets, and Elastic Graph Matching (EGM). The program also makes it possible to create and edit one's own tagged database, which gives it a direct application.
A set of features is also proposed to extend and improve the functionality of the program. Three of them stand out: a proposed hybrid verification mode applicable to any branch of biometrics; an evaluation program capable of measuring, plotting, and comparing the configurations of each implemented recognition system; and a tool for creating custom graphs and generating models (a tagged graph associated with an image of a person), which is applicable to object recognition in general. The fourth and last part first presents the results, comparing the different configurations of the implemented recognition systems on three databases (one of them built from images acquired under uncontrolled conditions). The error rates of the proposed hybrid verification mode are measured as well. Finally, conclusions are drawn and future lines of research are proposed.
Abstract:
The early detection of spoilage metabolites in contaminated food is a very important tool for quality control. Some volatile compounds produce unpleasant odours at very low concentrations, making their early detection very challenging. This is the case for 1,3-pentadiene, produced by microorganisms through decarboxylation of the preservative sorbate. In this work, we have developed a methodology to use the data produced by a low-cost, compact MWIR (Mid-Wave IR) spectrometry device without moving parts, which is based on a linear array of 128 elements of VPD PbSe coupled to a linear variable filter (LVF) working in the spectral range between 3 and 4.6 µm. This device is able to analyze food headspace gases through a dedicated sample presentation setup. This methodology enables the detection of CO2 and the volatile compound 1,3-pentadiene, as compared with synthetic standards. Data analysis is based on automated multidimensional dynamic processing of the MWIR spectra. Principal component and discriminant analysis allow four yeast strains, including producers and non-producers, to be separated. The segregation power is used as a measure of the discrimination quality.
Abstract:
This work presents multi-element geochemical results for stream sediments in the state of São Paulo, obtained through the institutional project of the Geological Survey of Brazil entitled "Levantamento Geoquímico de Baixa Densidade no Brasil" (Low-Density Geochemical Survey in Brazil). Analytical data from 1422 stream sediment samples, obtained by ICP-MS (Inductively Coupled Plasma Mass Spectrometry) for 32 chemical elements (Al, Ba, Be, Ca, Ce, Co, Cr, Cs, Cu, Fe, Ga, Hf, K, La, Mg, Mn, Mo, Nb, Ni, P, Pb, Rb, Sc, Sn, Sr, Th, Ti, U, V, Y, Zn, and Zr), were processed and examined through univariate and multivariate statistical analysis. Treatment of the data with univariate statistical techniques yielded the geochemical background values of the 32 elements for the entire state of São Paulo. Georeferenced analysis of the single-element geochemical distributions revealed the geological compartmentalization of the area. The two main geological provinces of the state of São Paulo, the Paraná Basin and the Crystalline Complex, stand out clearly in most of the geochemical distributions. Geological units of greater extent, such as the Serra Geral Formation and the Bauru Group, were also clearly highlighted. Other geochemical features indicated possibly contaminated areas and unmapped geological units. Applying multivariate statistical methods to the geochemical data with 24 variables (Al, Ba, Ce, Co, Cr, Cs, Cu, Fe, Ga, La, Mn, Nb, Ni, Pb, Rb, Sc, Sr, Th, Ti, U, V, Y, Zn, and Zr) made it possible to define the main geochemical signatures and associations across the state of São Paulo and to correlate them with the main lithological domains. Q-mode cluster analysis yielded eight groups of geochemically correlatable samples which, once georeferenced, reproduced the main geological compartments of the state: the Crystalline Complex, the Itararé and Passa Dois Groups, the Serra Geral Formation, and the Bauru and Caiuá Groups.
Multigroup discriminant analysis statistically confirmed the classification of the groups formed by the cluster analysis and provided the main discriminating variables: Fe, Co, Sc, V, and Cu. Principal component analysis, used together with factor analysis with varimax rotation, provided the main multivariate factors and their respective element associations. Georeferencing the multivariate factor scores delimited the areas where the element associations occur and provided multivariate maps for the entire state. Finally, it is concluded that the statistical methods applied are indispensable in the treatment, presentation, and interpretation of geochemical data. Moreover, based on an integrated view of the results, this work recommends: (1) carrying out low-density geochemical surveys throughout the country as a matter of priority, since they are highly effective in defining regional backgrounds and delimiting geochemical provinces of metallogenic and environmental interest; (2) carrying out continuous geological mapping at an adequate scale (larger than 1:100,000) in areas that point to the possible existence of units not mapped on current geological maps.
Abstract:
Background: Identifying biological markers to aid diagnosis of bipolar disorder (BD) is critically important. To be considered a possible biological marker, neural patterns in BD should be discriminable from those in healthy individuals (HI). We examined patterns of neuromagnetic responses revealed by magnetoencephalography (MEG) during implicit emotion processing using emotional (happy, fearful, sad) and neutral facial expressions, in sixteen individuals with BD and sixteen age- and gender-matched HI. Methods: Neuromagnetic data were recorded using a 306-channel whole-head MEG ELEKTA Neuromag System, and preprocessed using Signal Space Separation as implemented in MaxFilter (ELEKTA). Custom MATLAB programs removed EOG and ECG signals from filtered MEG data, and computed means of epoched data (0-250 ms, 250-500 ms, 500-750 ms). A generalized linear model with three factors (individual, emotion intensity and time) compared BD and HI. A principal component analysis of normalized mean channel data in selected brain regions identified principal components that explained 95% of data variation. These components were used in a quadratic support vector machine (SVM) pattern classifier. SVM classifier performance was assessed using the leave-one-out approach. Results: BD and HI showed significantly different patterns of activation for 0-250 ms within both left occipital and temporal regions, specifically for neutral facial expressions. PCA revealed significant differences between BD and HI for mild fearful, happy, and sad facial expressions within 250-500 ms. The quadratic SVM classifier showed greatest accuracy (84%) and sensitivity (92%) for neutral faces, in left occipital regions within 500-750 ms. Conclusions: MEG responses may be used in the search for disease-specific neural markers.
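Two steps named in this abstract, keeping the components that explain 95% of the variation and scoring a classifier with the leave-one-out approach, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the eigenvalues and feature values are made up, and a nearest-centroid rule stands in for the quadratic SVM so that the example needs only the standard library.

```python
def components_for_variance(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose eigenvalues
    account for at least `threshold` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(ordered)


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (assumption): label of the closest class mean."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(centroids, key=lambda l: squared_distance(centroids[l], x))


def leave_one_out_accuracy(X, y, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, count correct labels."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += predict(train_X, train_y, X[i]) == y[i]
    return hits / len(X)


# Hypothetical eigenvalues and one-dimensional component scores.
n_kept = components_for_variance([7.0, 2.0, 0.8, 0.2])
scores = [[0.0], [0.2], [0.1], [1.0], [1.2], [1.1]]
labels = ["HI", "HI", "HI", "BD", "BD", "BD"]
accuracy = leave_one_out_accuracy(scores, labels)
```

Leave-one-out is a common choice for small samples such as the 32 participants here, because every observation is used for testing exactly once while the remaining ones are used for training.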
Abstract:
Opinion mining and sentiment analysis are important research areas within Natural Language Processing (NLP), and NLP tools have become viable alternatives for automatically extracting the affective information found in texts. Our aim is to build an NLP model to analyze gamers' sentiments and opinions expressed in a corpus of 9750 game reviews. A Principal Component Analysis using sentiment analysis features explained 51.2% of the variance of the reviews and provided an integrated view of the major sentiment and topic-related dimensions expressed in game reviews. A Discriminant Function Analysis based on the emerging components classified game reviews into positive, neutral and negative ratings with 55% accuracy.
Abstract:
Water quality management programs are necessary for the preservation and sustainable use of water resources. An important issue in determining river water quality is the design of effective monitoring networks, so that the quality variables measured at the stations are, as far as possible, indicative of overall changes in water quality. One way to achieve this goal is to increase the number of monitoring stations and sampling instances. Since this dramatically increases the annual cost of monitoring, deciding which stations and parameters are the most important, and how often to sample so that the greatest change in the system under study is captured, can inform future decisions on optimizing the efficacy of the existing monitoring network, removing or adding stations or parameters, and decreasing or increasing sampling instances. To this end, the efficiency of multivariate statistical procedures was studied in this thesis. Given their features, multivariate statistical procedures can be used as a practical and useful method for recognizing and analyzing river pollution and, consequently, for understanding, reasoning, control, and sound decision-making in water quality management. This research used multivariate statistical techniques to analyze water quality and to monitor the variables affecting it in the Gharasou river, in Ardabil province in northwest Iran. Over one year, 28 physical and chemical parameters were sampled at 11 stations. The results of these measurements were analyzed by multivariate procedures: Cluster Analysis (CA), Principal Component Analysis (PCA), Factor Analysis (FA), and Discriminant Analysis (DA).
Based on the findings from cluster analysis, principal component analysis, and factor analysis, the stations were divided into three groups: highly polluted (HP), moderately polluted (MP), and less polluted (LP) stations. Thus, this study illustrates the usefulness of multivariate statistical techniques for the analysis and interpretation of complex data sets, and in water quality assessment, identification of pollution sources and factors, and understanding of spatial variations in water quality for effective river water quality management. This study also shows the effectiveness of these techniques for obtaining better information about water quality and for designing a monitoring network for effective management of water resources. Based on the results, a water quality monitoring program for the Gharasou river was developed and presented.
Abstract:
Produced water is one of the most common wastes generated during oil exploration and production. This work aims to develop methodologies based on comparative statistical analysis of the hydrogeochemistry of production zones in order to reduce the need for high-cost interventions to perform fluid identification tests (TIF). For the study, 27 samples were collected from five different production zones, and a total of 50 chemical species were measured. After the chemical analysis, the data were analyzed statistically using the R statistical software, version 2.11.1. Statistical analysis was performed in three steps. In the first stage, the objective was to investigate the behavior of the chemical species in each production zone through descriptive graphical analysis. The second stage was to identify a function that classifies each sample into its production zone, using discriminant analysis. In the training stage, the correct classification rate of the discriminant function was 85.19%. In the next stage, Principal Component Analysis was used to reduce the number of variables through linear combinations of the chemical species, in an attempt to improve the discriminant function obtained in the second stage and increase the discrimination power of the data, but the result was not satisfactory. In the Profile Analysis, curves were obtained for each production zone based on the characteristics of the chemical species present in each zone. With this study it was possible to develop a method using hydrochemistry and statistical analysis that can be used to distinguish the water produced in mature oil fields, so that the production zone contributing to the excessive increase in water volume can be identified.
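The reported classification rate can be related back to sample counts with a quick arithmetic check. The sketch below assumes, hypothetically, that all 27 samples were used in the training stage, which the abstract does not state; the function name is illustrative.

```python
def matching_counts(rate_percent, n_samples, decimals=2):
    """Counts of correctly classified samples whose percentage,
    rounded to `decimals` places, equals the reported rate."""
    return [
        k
        for k in range(n_samples + 1)
        if round(100 * k / n_samples, decimals) == rate_percent
    ]


# Assumption: the training set contained all 27 samples.
candidates = matching_counts(85.19, 27)
print(candidates)
```

Under that assumption, the only count consistent with 85.19% is 23 of 27 samples, that is, four misclassified samples.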
Abstract:
This thesis analyzes different techniques for detecting active, constant jammers in a satellite uplink communication. The aim is to identify the presence of a jammer by observing a limited number of received samples. To this end, the following binary classifiers were implemented: support vector machine (SVM), multilayer perceptron (MLP), spectrum guarding, and autoencoder. These machine learning algorithms depend on the features they receive as input, so particular attention was paid to feature selection. Accordingly, the accuracies of detectors trained on different types of information were compared: raw time-domain signals, statistical features, wavelet transforms, and the cyclic spectrum. The patterns produced by extracting these features from satellite signals can be high-dimensional, so before detection the following dimensionality reduction algorithms are applied: principal component analysis (PCA) and linear discriminant analysis (LDA). The purpose of this step is not to discard the least relevant features but to combine them so as to preserve as much information as possible, while avoiding overfitting and underfitting problems. The numerical simulations showed that the cyclic spectrum provides the best features for detection but produces high-dimensional patterns, which made dimensionality reduction algorithms necessary. In particular, PCA extracted better information than LDA, whose accuracies depended too heavily on the type of jammer used in the training phase. Finally, the algorithm with the best performance was the multilayer perceptron, which required short training times and achieved high accuracy values.
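The difference between the two reduction methods compared in this abstract can be illustrated on a two-dimensional toy problem. The sketch below is not the thesis code: the points, the class names, and the helper functions are hypothetical, and the closed-form 2x2 formulas are used only to keep the example within the standard library. PCA picks the direction of largest overall variance and ignores the labels, while Fisher's LDA picks the direction that best separates the labeled classes.

```python
import math


def mean2(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, center):
    """Entries (sxx, sxy, syy) of the 2x2 scatter matrix around `center`."""
    sxx = sum((x - center[0]) ** 2 for x, _ in points)
    syy = sum((y - center[1]) ** 2 for _, y in points)
    sxy = sum((x - center[0]) * (y - center[1]) for x, y in points)
    return sxx, sxy, syy


def pca_direction(points):
    """Unit vector along the first principal component (labels unused)."""
    sxx, sxy, syy = scatter2(points, mean2(points))
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(angle), math.sin(angle))


def lda_direction(class_a, class_b):
    """Unit vector along Fisher's discriminant: Sw^-1 (mean_b - mean_a)."""
    ma, mb = mean2(class_a), mean2(class_b)
    a, b = scatter2(class_a, ma), scatter2(class_b, mb)
    sxx, sxy, syy = a[0] + b[0], a[1] + b[1], a[2] + b[2]
    det = sxx * syy - sxy * sxy
    dx, dy = mb[0] - ma[0], mb[1] - ma[1]
    # Multiply the inverse of the within-class scatter by the mean difference.
    wx = (syy * dx - sxy * dy) / det
    wy = (-sxy * dx + sxx * dy) / det
    norm = math.hypot(wx, wy)
    return (wx / norm, wy / norm)


def project(points, direction):
    return [x * direction[0] + y * direction[1] for x, y in points]


# Hypothetical features: large spread along x, class offset along y.
clean = [(-3, 0.1), (-1, -0.1), (1, 0.1), (3, -0.1)]
jammed = [(-3, 1.1), (-1, 0.9), (1, 1.1), (3, 0.9)]

pca = pca_direction(clean + jammed)
lda = lda_direction(clean, jammed)
```

In this toy case the PCA direction follows the x axis and the LDA direction follows the y axis. Because the LDA direction is derived from the labeled training classes, it changes when those classes change, which is in line with the abstract's remark that the LDA accuracies depended on the type of jammer used for training.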
Abstract:
Hydrophilic and lipophilic extracts of ten cultivars of Highbush and Rabbiteye Brazilian blueberries (Vaccinium corymbosum L. and Vaccinium ashei Reade, respectively) that are used for commercial production were analysed for antioxidant activity by the FRAP, ORAC, ABTS and β-carotene-linoleate methods. Results were correlated to the amounts of carotenoids, total phenolics and anthocyanins. Brazilian blueberries had relatively high concentrations of total phenolics (1,622-3,457 mg gallic acid equivalents per 100 g DW) and total anthocyanins (140-318 mg cyanidin-3-glucoside equivalents per 100 g DW), as well as being a good source of carotenoids. There was a higher positive correlation between the amounts of these compounds and the antioxidant activity of hydrophilic compared to lipophilic extracts. There were also significant differences in the level of bioactive compounds and antioxidant activities between different cultivars, production locations and years of cultivation.
Abstract:
Flavanones (hesperidin, naringenin, naringin, and poncirin) in industrial orange juices, hand-squeezed orange juices, and orange juices from fresh-squeeze machines were determined by HPLC/DAD analysis using a previously described liquid-liquid extraction method. Method validation, including accuracy, was performed using recovery tests. Thirty-six samples collected from different Brazilian locations and brands were analyzed. Concentrations were determined using an external standard curve. The calculated limits of detection (LOD) were 0.0037, 1.87, 0.0147, and 0.0066 mg 100 g⁻¹ and the limits of quantification (LOQ) were 0.0089, 7.84, 0.0302, and 0.0200 mg 100 g⁻¹ for naringin, hesperidin, poncirin, and naringenin, respectively. The results demonstrated that hesperidin was present at the highest concentration levels, especially in the industrial orange juices. Its average content and concentration range were 69.85 and 18.80-139.00 mg 100 g⁻¹. The other flavanones showed the lowest concentration levels. Their average contents and concentration ranges were 0.019 and 0.01-0.30, 0.12 and 0.1-0.17, and 0.13 and 0.01-0.36 mg 100 g⁻¹, respectively. The results were also evaluated using principal component analysis (PCA), a multivariate analysis technique, which showed that poncirin, naringenin, and naringin were the principal elements that contributed to the variability in the sample concentrations.
Abstract:
Dulce de leche samples available in the Brazilian market were submitted to sensory profiling by quantitative descriptive analysis and acceptance test, as well as sensory evaluation using the just-about-right scale and purchase intent. External preference mapping and the ideal sensory characteristics of dulce de leche were determined. The results were also evaluated by principal component analysis, hierarchical cluster analysis, partial least squares regression, artificial neural networks, and logistic regression. Overall, significant product acceptance was related to intermediate scores of the sensory attributes in the descriptive test, and this trend was observed even after consumer segmentation. The results obtained by sensometric techniques showed that optimizing an ideal dulce de leche from the sensory standpoint is a multidimensional process, with adjustments needed to the appearance, aroma, taste, and texture attributes of the product for better consumer acceptance and purchase. The optimum dulce de leche was characterized by high scores for the attributes sweet taste, caramel taste, brightness, color, and caramel aroma, in accordance with the preference mapping findings. In industrial terms, this means changing the parameters used in the thermal treatment and making quantitative changes in the ingredients used in formulations.
Abstract:
In this work, we discuss the use of multi-way principal component analysis combined with comprehensive two-dimensional gas chromatography to study the volatile metabolites of the saprophytic fungus Memnoniella sp. isolated in vivo by headspace solid-phase microextraction. This fungus has been identified as having the ability to induce plant resistance against pathogens, possibly through its volatile metabolites. A suitable culture medium was inoculated, and its headspace was then sampled with a solid-phase microextraction fiber and chromatographed every 24 h over seven days. Processing the raw chromatograms with multi-way principal component analysis allowed the determination of the inoculation period during which the concentration of volatile metabolites was maximized, as well as the discrimination of the appropriate peaks from the complex culture media background. Several volatile metabolites not previously described in the literature on biocontrol fungi were observed, as well as sesquiterpenes and aliphatic alcohols. These results stress that, due to the complexity of multidimensional chromatographic data, multivariate tools might be mandatory even for apparently trivial tasks, such as the determination of the temporal profile of metabolite production and extinction. However, when compared with conventional gas chromatography, the complex data processing yields a considerable improvement in the information obtained from the samples.