913 results for INDEPENDENT COMPONENT ANALYSIS
Abstract:
In the current context of serious climate change, where the increased frequency of some extreme events can raise the number of periods prone to high-intensity forest fires, the National Forest Authority regularly implements, in several Portuguese forest areas, a set of measures to control the amount of available fuel mass (PNDFCI, 2008). In the present work we present a preliminary analysis of the consequences of prescribed fire measures, used to control the amount of fuel mass, for soil recovery, in particular in terms of water retention capacity, organic matter content, pH and iron content. This work is part of a larger study (Meira-Castro, 2009(a); Meira-Castro, 2009(b)). In line with the established practice for data collection, embodied in multidimensional matrices of n columns (variables under analysis) by p rows (areas sampled at different depths), and considering the quantitative nature of the data in this study, we chose a methodological approach based on multivariate statistical analysis, in particular Principal Component Analysis (PCA) (Góis, 2004). The experiments were carried out on a soil cover over a natural site of andalusitic schist in Gramelas, Caminha, NW Portugal, which had remained free of prescribed burnings for four years and was subjected to prescribed fire in March 2008. Soil samples were collected from five different plots at six different time periods. The adopted methodology allowed us to identify the most relevant relational structures within the n variables, within the p samples, and in both sets at the same time (Garcia-Pereira, 1990). Consequently, in addition to the traditional PCA outputs, we analyzed the influence of both sampling depth and geomorphological environment on the behavior of all variables involved.
Abstract:
This study aims to optimize the water quality monitoring of a polluted watercourse (Leça River, Portugal) through principal component analysis (PCA) and cluster analysis (CA). These statistical methodologies were applied to physicochemical, bacteriological and ecotoxicological data (with the marine bacterium Vibrio fischeri and the green alga Chlorella vulgaris) obtained from the analysis of water samples collected monthly at seven monitoring sites during five campaigns (February, May, June, August, and September 2006). The results of some variables were assigned to water quality classes according to national guidelines. Chemical and bacteriological quality data led to classifying the Leça River water quality as "bad" or "very bad". PCA and CA identified monitoring sites with similar pollution patterns, giving site 1 (located in the upstream stretch of the river) a profile distinct from all other sampling sites downstream. Ecotoxicity results corroborated this classification, revealing differences in space and time. The present study includes not only physical, chemical and bacteriological but also ecotoxicological parameters, which opens new perspectives in river water characterization. Moreover, the application of PCA and CA is very useful for optimizing water quality monitoring networks, defining the minimum number of sites and their location. Thus, these tools can support appropriate management decisions.
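The PCA step described in this abstract can be sketched in a few lines. The data below are synthetic stand-ins for the physicochemical variables (the study's Leça River measurements are not reproduced here), and the dimensions are assumptions for illustration only.

```python
import numpy as np

# Illustrative PCA on standardized water-quality data (synthetic values).
rng = np.random.default_rng(0)
# rows = 7 monitoring sites x 5 campaigns; columns = hypothetical variables
# (e.g. BOD, dissolved O2, conductivity, faecal coliforms)
X = rng.normal(size=(35, 4))

# standardize each variable, so PCA works on the correlation structure
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# eigendecomposition of the covariance matrix of the standardized data
R = np.cov(Z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]        # sort PCs by explained variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Z @ eigvecs                      # site/campaign scores on each PC
explained = eigvals / eigvals.sum()       # fraction of variance per PC
print(np.round(explained, 3))
```

Plotting the first two columns of `scores` is what lets sites with similar pollution patterns cluster together, as in the abstract.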
Abstract:
High-content analysis has revolutionized cancer drug discovery by identifying substances that alter the phenotype of a cell, preventing tumor growth and metastasis. The high-resolution biofluorescence images from these assays allow precise quantitative measures, enabling small molecules of a host cell to be distinguished from a tumor. In this work, we are particularly interested in the application of deep neural networks (DNNs), a cutting-edge machine learning method, to the classification of compounds into chemical mechanisms of action (MOAs). Compound classification has previously been performed using image-based profiling methods, sometimes combined with feature reduction methods such as principal component analysis or factor analysis. In this article, we map the input features of each cell to a particular MOA class without using any treatment-level profiles or feature reduction methods. To the best of our knowledge, this is the first application of DNNs in this domain leveraging single-cell information. Furthermore, we use deep transfer learning (DTL) to alleviate the intensive and computationally demanding effort of searching the huge parameter space of a DNN. Results show that this approach yields a 30% speedup and a 2% accuracy improvement.
Abstract:
BACKGROUND: Wireless capsule endoscopy (CE) has been introduced as an innovative, non-invasive diagnostic technique for evaluation of the gastrointestinal tract, reaching places where conventional endoscopy cannot. However, the output of this technique is an 8-hour video, whose analysis by the expert physician is very time-consuming. Thus, a computer-assisted diagnosis tool to help physicians evaluate CE exams faster and more accurately is an important technical challenge and an excellent economic opportunity. METHOD: The set of features proposed in this paper to code textural information is based on statistical modeling of second-order textural measures extracted from co-occurrence matrices. To cope with both joint and marginal non-Gaussianity of the second-order textural measures, higher-order moments are used. These statistical moments are taken from the two-dimensional color-scale feature space, where two different scales are considered. Second- and higher-order moments of the textural measures are computed from co-occurrence matrices of images synthesized by the inverse wavelet transform of the wavelet transform containing only the selected scales for the three color channels. The dimensionality of the data is reduced by principal component analysis. RESULTS: The proposed textural features are then used as the input of a classifier based on artificial neural networks. Classification performances of 93.1% specificity and 93.9% sensitivity are achieved on real data. These promising results open the path towards a deeper study of the applicability of this algorithm in computer-aided diagnosis systems to assist physicians in their clinical practice.
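A co-occurrence matrix, the structure from which second-order textural measures like those above are derived, can be sketched as follows. The toy 4-level image and the single horizontal offset are illustrative assumptions, not the paper's color-scale, multi-channel setup.

```python
import numpy as np

# Grey-level co-occurrence matrix (GLCM) for one offset: count how often
# grey level a sits immediately left of grey level b, then normalize.
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])
levels = 4

glcm = np.zeros((levels, levels))
for i in range(img.shape[0]):
    for j in range(img.shape[1] - 1):     # horizontal neighbour pairs
        glcm[img[i, j], img[i, j + 1]] += 1
glcm /= glcm.sum()                        # joint probabilities p(a, b)

# one classical second-order measure derived from the GLCM: contrast
idx = range(levels)
contrast = sum(glcm[a, b] * (a - b) ** 2 for a in idx for b in idx)
print(round(contrast, 3))                 # -> 0.583 for this toy image
```

Statistical moments of measures like `contrast`, computed over many image patches, are what feed the neural network classifier in the paper.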
Abstract:
The main objective of this survey was to perform a descriptive analysis of crime evolution in Portugal between 1995 and 2013, with a particular focus on spatial crime evolution patterns in Portuguese NUTS III regions. The most important crime types were included in the analysis. The main idea was to uncover the relation between local patterns and global crime evolution: to identify the regions that have contributed to the global evolution of specific crime types, and to determine how they have contributed. Many statistical reports and scientific papers have analysed particular crime types, but a global spatial-temporal analysis has not been found. Principal Component Analysis and the multidimensional descriptive data analysis technique STATIS formed the basis of the analysis. The results of this survey have shown that strong spatial and temporal crime patterns exist. It was possible to describe global crime evolution patterns and to define crime evolution patterns in NUTS III regions. It was also possible to define three to four groups of crimes, where each group shows similar spatial crime dynamics.
Abstract:
ABSTRACT - Objectives: Every year around 1.3 million people die worldwide as a result of road accidents. In addition, more than 20 million people suffer minor or serious injuries in road accidents, resulting in temporary or permanent disability. Road accidents are therefore considered a serious public health problem, with high costs for societies, affecting the health of populations and the economy of each country. This study aimed to describe and characterize drivers of light vehicles residing in mainland Portugal, covering sociodemographic characteristics, driving experience and questions on attitudes, opinions and behaviours. It also sought to analyse the association between self-reported opinions, attitudes and behaviours and the occurrence of a road accident in the previous three years, in order to build a final predictive model of the risk of suffering a road accident. Method: A cross-sectional analytical observational study was conducted, based on a questionnaire translated into Portuguese and originating from the European project SARTRE 4. The target population was all light-vehicle drivers holding a driving licence and residing in mainland Portugal, based on a sample of the same size as that defined in the European SARTRE 4 study (600 light-vehicle drivers). From the 52 available questions, principal component analysis (PCA) was used to select potentially independent and complementary variables for the opinion, attitude and behaviour components. In addition to the usual descriptive measures, binary logistic regression was used to analyse associations and obtain a model for estimating the probability of suffering a road accident as a function of the selected variables on self-reported opinions, attitudes and behaviours.
Results: Of the 612 drivers surveyed, 62.7% (383) reported not having suffered any road accident in the previous three years, while 37.3% (228) reported having been involved in at least one road accident with material damage or injuries in the same period. In general, the typical driver who reported having suffered an accident in the previous three years is a man over 65 years of age, with primary education, widowed and without children, not employed and living in an urban area. Drivers living in a suburban area had a 5.368 times higher risk of suffering a road accident than drivers living in a rural area (95% CI: 2.344-12.297; p<0.001). Drivers who had been checked for alcohol only once by the police in the previous three years while driving had a 3.009 times higher risk of suffering a road accident than drivers who had never been checked (95% CI: 1.949-4.647; p<0.001). Drivers who reported very frequently stopping to sleep when feeling tired while driving had an 81% lower probability of suffering a road accident than drivers who never do so (95% CI: 0.058-0.620; p=0.006). Drivers who, when tired, rarely drink a coffee or energy drink had a 4.829 times higher risk of suffering a road accident than drivers who reported always doing so (95% CI: 1.807-12.903; p=0.002). Conclusions: The results obtained for behavioural factors are in line with most of the risk factors associated with road accidents reported in the literature. Nevertheless, new associations were identified between accident risk and self-reported opinions and attitudes, which could be explored further in studies with larger populations.
This work reinforces the urgent need for new intervention strategies, particularly on the behavioural component, targeted at risk groups, while maintaining the existing ones.
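The binary logistic regression step behind the reported odds ratios can be illustrated with a minimal sketch. The survey data are not public, so the design matrix below is synthetic and the coefficients invented; only the estimation procedure mirrors the abstract.

```python
import numpy as np

# Synthetic stand-in for PCA-selected opinion/attitude/behaviour predictors.
rng = np.random.default_rng(1)
n, p = 600, 3
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, p))])  # intercept + predictors
true_beta = np.array([-0.5, 1.0, -0.8, 0.3])               # invented coefficients
y = (rng.random(n) < 1 / (1 + np.exp(-X @ true_beta))).astype(float)

# fit by plain gradient ascent on the log-likelihood
beta = np.zeros(p + 1)
for _ in range(5000):
    mu = 1 / (1 + np.exp(-X @ beta))      # fitted accident probabilities
    beta += 0.01 * X.T @ (y - mu) / n

odds_ratios = np.exp(beta[1:])            # OR > 1 means higher accident risk
print(np.round(odds_ratios, 2))
```

The exponentiated coefficients play the role of the risk multipliers quoted in the abstract (e.g. "5.368 times higher risk").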
Analysis of metabolic flux distributions in relation to the extracellular environment in Avian cells
Abstract:
Continuous cell lines that proliferate in chemically defined and simple media have been highly regarded as suitable alternatives for vaccine production. One such cell line is the AG1.CR.pIX avian cell line developed by PROBIOGEN. This cell line can be cultivated in a fully scalable suspension culture and adapted to grow in a chemically defined, calf-serum-free medium [1]–[5]. The medium composition and cultivation strategy are important factors for reaching high virus titers. In this project, a series of computational methods was used to simulate the cell's response to different environments. The study is based on the metabolic model of the central metabolism proposed in [1]. In a first step, Metabolic Flux Analysis (MFA) was used along with measured uptake and secretion fluxes to estimate intracellular flux values. The network and data were found to be consistent. In a second step, Flux Balance Analysis (FBA) was performed to assess the cell's biological objective. The objective whose predicted results best fit the experimental data was the minimization of oxidative phosphorylation. Employing this objective, Flux Variability Analysis (FVA) was next used to characterize the flux solution space. Furthermore, various scenarios were simulated in which a reaction was deleted (the corresponding compound eliminated from the medium), and the flux solution space for each scenario was calculated. Growth restrictions caused by essential and non-essential amino acids were accurately predicted. Fluxes related to essential amino acid uptake and catabolism, lipid synthesis and ATP production via the TCA cycle were found to be essential to exponential growth. Finally, the data gathered during the previous steps were analyzed using principal component analysis (PCA) in order to assess potential changes in the physiological state of the cell. Three metabolic states were found, corresponding to zero, partial and maximum biomass growth rate.
Elimination of non-essential amino acids or pyruvate from the medium showed no impact on the cell's assumed normal metabolic state.
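Flux Balance Analysis reduces to a linear program: optimize an objective flux subject to steady-state mass balances (S v = 0) and flux bounds. A toy three-reaction network is sketched below; it is not the avian-cell central-metabolism model from [1], and the bounds are invented.

```python
import numpy as np
from scipy.optimize import linprog

# Toy FBA: one internal metabolite A, three reactions.
#            v_uptake  v_biomass  v_byproduct
S = np.array([[1.0,     -1.0,      -1.0]])   # uptake makes A; the others consume it
bounds = [(0, 10), (0, None), (0, 2)]        # uptake capped at 10, byproduct at 2

# maximize biomass flux; linprog minimizes, so negate the objective
res = linprog(c=[0, -1, 0], A_eq=S, b_eq=[0.0], bounds=bounds)
v = res.x
print(np.round(v, 4))                        # optimal flux distribution
```

A "reaction deletion" scenario like those in the abstract corresponds to forcing one bound to (0, 0) and re-solving; FVA corresponds to re-optimizing each flux in turn while holding the objective at its optimum.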
Abstract:
Introduction: Coordination is a strategy chosen by the central nervous system to control movement and maintain stability during gait. Coordinated multi-joint movements require a complex interaction between nervous outputs, biomechanical constraints, and proprioception. Quantitatively understanding and modeling gait coordination remains a challenge. Surgeons lack a way to model and appreciate the coordination of patients before and after surgery of the lower limbs. Patients alter their gait patterns and their kinematic synergies when they walk faster or slower than normal speed in order to maintain stability and minimize the energy cost of locomotion. The goal of this study was to provide a dynamical-system approach to quantitatively describe human gait coordination and to apply it to patients before and after total knee arthroplasty. Methods: A new method of quantitative analysis of interjoint coordination during gait was designed, providing a general model that captures the whole dynamics and shows the kinematic synergies at various walking speeds. The proposed model imposed a relationship among the lower-limb joint angles (hips and knees) to parameterize the dynamics of locomotion of each individual. An integration of different analysis tools, namely harmonic analysis, Principal Component Analysis, and Artificial Neural Networks, helped overcome the high dimensionality, temporal dependence, and non-linear relationships of the gait patterns. Ten patients were studied using an ambulatory gait device (Physilog®). Each participant was asked to perform two 30 m walking trials at 3 different speeds and to complete an EQ-5D questionnaire, a WOMAC and a Knee Society Score. Lower-limb rotations were measured by four miniature angular rate sensors mounted, respectively, on each shank and thigh.
The outcomes of the eight patients undergoing total knee arthroplasty, recorded pre-operatively and post-operatively at 6 weeks, 3 months, 6 months and 1 year, were compared to those of 2 age-matched healthy subjects. Results: The new method provided coordination scores at various walking speeds, ranging between 0 and 10. It determined the overall coordination of the lower limbs as well as the contribution of each joint to the total coordination. The differences between pre-operative and post-operative coordination values were correlated with the improvements in the subjective outcome scores. Although the study group was small, the results showed a new way to objectively quantify the gait coordination of patients undergoing total knee arthroplasty, using only portable body-fixed sensors. Conclusion: A new method for objective gait coordination analysis has been developed, with very encouraging results regarding the objective outcome of lower-limb surgery.
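The harmonic-analysis ingredient of such a method can be sketched on a synthetic joint-angle curve (the patient recordings are not available here): a gait cycle is periodic, so a few Fourier harmonics capture most of its shape and give a compact, speed-comparable representation.

```python
import numpy as np

# Synthetic knee-angle curve over one normalized gait cycle (degrees);
# the amplitudes and phase below are invented for illustration.
t = np.linspace(0, 1, 200, endpoint=False)
knee = 30 + 25 * np.sin(2 * np.pi * t) + 8 * np.sin(4 * np.pi * t + 0.5)

# harmonic decomposition: amplitude of the first five harmonics
coeffs = np.fft.rfft(knee) / len(knee)
amplitudes = 2 * np.abs(coeffs[1:6])
print(np.round(amplitudes, 2))            # dominated by harmonics 1 and 2
```

Feeding such per-joint harmonic descriptors into PCA (and then a neural network) is one way the abstract's pipeline can reduce high-dimensional, time-dependent gait data.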
Abstract:
The introduction of culture-independent molecular screening techniques, especially based on 16S rRNA gene sequences, has allowed microbiologists to examine a facet of microbial diversity not necessarily reflected by the results of culturing studies. The bacterial community structure was studied for a pesticide-contaminated site that was subsequently remediated using an efficient degradative strain Arthrobacter protophormiae RKJ100. The efficiency of the bioremediation process was assessed by monitoring the depletion of the pollutant, and the effect of addition of an exogenous strain on the existing soil community structure was determined using molecular techniques. The 16S rRNA gene pool amplified from the soil metagenome was cloned and restriction fragment length polymorphism studies revealed 46 different phylotypes on the basis of similar banding patterns. Sequencing of representative clones of each phylotype showed that the community structure of the pesticide-contaminated soil was mainly constituted by Proteobacteria and Actinomycetes. Terminal restriction fragment length polymorphism analysis showed only nonsignificant changes in community structure during the process of bioremediation. Immobilized cells of strain RKJ100 enhanced pollutant degradation but seemed to have no detectable effects on the existing bacterial community structure.
Abstract:
The aim of this work is to evaluate the capabilities and limitations of chemometric methods and other mathematical treatments applied to spectroscopic data, and more specifically to paint samples. The uniqueness of spectroscopic data comes from the fact that they are multivariate (a few thousand variables) and highly correlated. Statistical methods are used to study and discriminate samples. A collection of 34 red paint samples was measured by infrared and Raman spectroscopy. Data pretreatment and variable selection demonstrated that the use of the Standard Normal Variate (SNV), together with removal of the noisy variables by selecting the ranges 650-1830 cm−1 and 2730-3600 cm−1, provided the optimal results for the infrared analysis. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were then used as exploratory techniques to provide evidence of structure in the data, to find clusters, and to detect outliers. With the FTIR spectra, the principal components (PCs) correspond to binder types and to the presence or absence of calcium carbonate; 83% of the total variance is explained by the first four PCs. As for the Raman spectra, plotting the first two PCs, which account for 37% and 20% of the total variance respectively, reveals six different clusters corresponding to the different pigment compositions. In conclusion, the use of chemometrics for the forensic analysis of paints provides a valuable tool for objective decision-making, a reduction of possible classification errors, and better efficiency, yielding robust results with time-saving data treatments.
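The SNV pretreatment named above is a per-spectrum standardization: each spectrum is centred and scaled by its own mean and standard deviation before PCA, removing multiplicative scatter effects. A minimal sketch on synthetic spectra (the 34 paint spectra are not reproduced here; the matrix shape is an assumption):

```python
import numpy as np

# rows = samples, columns = wavenumber channels (synthetic data)
rng = np.random.default_rng(2)
spectra = rng.normal(loc=5.0, scale=2.0, size=(34, 1180))

# Standard Normal Variate: standardize each spectrum by its own statistics
snv = (spectra - spectra.mean(axis=1, keepdims=True)) \
      / spectra.std(axis=1, keepdims=True)

# after SNV every spectrum has zero mean and unit standard deviation
print(np.round(snv.mean(axis=1)[:3], 6), np.round(snv.std(axis=1)[:3], 6))
```

In the paper's workflow, the noisy-variable removal (keeping only the selected wavenumber ranges) would be applied before this step, and PCA/HCA after it.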
Abstract:
Laser desorption ionisation mass spectrometry (LDI-MS) has been shown to be an excellent analytical method for the forensic analysis of inks on a questioned document. The ink can be analysed directly on its substrate (paper) and hence offers a fast method of analysis, as sample preparation is kept to a minimum and, more importantly, damage to the document is minimised. LDI-MS has also previously been reported to provide a high power of discrimination in the statistical comparison of ink samples and has the potential to be introduced as part of routine ink analysis. This paper examines the methodology further and statistically evaluates the reproducibility and the influence of paper on black gel pen ink LDI-MS spectra, by comparing spectra of three different black gel pen inks on three different paper substrates. Although generally minimal, the influences of sample homogeneity and paper type were found to be sample dependent. This should be taken into account to avoid the risk of false differentiation of black gel pen ink samples. Other statistical approaches, such as principal component analysis (PCA), proved to be a good alternative to correlation coefficients for the comparison of whole mass spectra.
Abstract:
In an earlier investigation (Burger et al., 2000) five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) for the extraction of endmembers and evaluating the spatial and temporal variation of geochemical signals. Three main factors of sedimentation were expected by the marine geologists: a volcano-genetic, a hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason for this may be that only traditional statistical methods were applied, i.e. the untransformed components were used and the cosine-theta coefficient as similarity measure. During the last decade considerable progress in compositional data analysis was made and many case studies were published using new tools for exploratory analysis of these data. Therefore it makes sense to check if the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of earlier results and to answers to open questions. In this paper we follow the lines of a paper of R. Tolosana-Delgado et al. (2005), starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transformation of the components and visualization of the factor scores in a spatial context: the compositional factors will be plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process.
Key words: compositional data analysis, biplot, deep sea sediments
Abstract:
In order to obtain a high-resolution Pleistocene stratigraphy, eleven continuously cored boreholes, 100 to 220 m deep, were drilled in the northern part of the Po Plain by Regione Lombardia in the last five years. Quantitative provenance analysis (QPA; Weltje and von Eynatten, 2004) of Pleistocene sands was carried out using multivariate statistical analysis (principal component analysis, PCA, and similarity analysis) on an integrated data set, including high-resolution bulk petrography and heavy-mineral analyses of the Pleistocene sands and of 250 major and minor modern rivers draining the southern flank of the Alps from west to east (Garzanti et al., 2004; 2006). Prior to the onset of major Alpine glaciations, metamorphic and quartzofeldspathic detritus from the Western and Central Alps was carried from the axial belt to the Po basin by a trunk river running longitudinally, parallel to the Southalpine belt (Vezzoli and Garzanti, 2008). This scenario rapidly changed during marine isotope stage 22 (0.87 Ma), with the onset of the first major Pleistocene glaciation in the Alps (Muttoni et al., 2003). PCA and similarity analysis of the core samples show that the longitudinal trunk river at this time was shifted southward by the rapid southward and westward progradation of transverse alluvial river systems fed from the Central and Southern Alps. Sediments were transported southward by braided river systems, and glacial sediments carried by Alpine valley glaciers invaded the alluvial plain.
Key words: detrital modes; modern sands; provenance; Principal Component Analysis; similarity; Canberra Distance; palaeodrainage
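The keyword list names the Canberra distance as the similarity measure; a minimal implementation follows, applied to two invented petrographic compositions (the actual core and river data are not reproduced).

```python
import numpy as np

def canberra(u, v):
    """Canberra distance: sum of |u_i - v_i| / (|u_i| + |v_i|)."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    denom = np.abs(u) + np.abs(v)
    mask = denom > 0                      # convention: 0/0 terms contribute 0
    return float(np.sum(np.abs(u - v)[mask] / denom[mask]))

# hypothetical detrital modes (%): quartz, feldspar, lithics, heavy minerals
core_sample  = [40.0, 35.0, 15.0, 10.0]
modern_river = [42.0, 30.0, 18.0, 10.0]
print(round(canberra(core_sample, modern_river), 4))   # -> 0.1922
```

Because each term is normalized by the magnitude of the pair, the Canberra distance weights minor components (e.g. heavy minerals) comparably to abundant ones, which suits provenance comparisons.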
Abstract:
In this work, a new method is proposed for real-time estimation of final product quality in batch processes. This method reduces the time needed to obtain the quality results of laboratory analyses. A principal component analysis (PCA) model built with historical data from normal operating conditions is used to discern whether a finished batch is normal or not. A fault signature is computed for the abnormal batches and passed through a classification model for its estimation. The study proposes a method for using the information in contribution plots based on fault signatures, where the indicators represent the behaviour of the variables throughout the different stages of the process. A data set composed of the fault signatures of historical abnormal batches is built in order to find patterns and to train the classification models that estimate the outcome of future batches. The proposed methodology has been applied to a sequencing batch reactor (SBR). Several classification algorithms are tested to demonstrate the possibilities of the proposed methodology.
Abstract:
The first discussion of compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on the algebraic structure of the simplex, more than twenty years after Aitchison's idea of log-transformations of closed data, the scientific literature is again full of statistical treatments of this type of data using traditional methodologies. This is particularly true in environmental geochemistry where, besides the problem of closure, the spatial structure (dependence) of the data has to be considered. In this work we propose the use of log-contrast values, obtained by a simplicial principal component analysis, as indicators of given environmental conditions. The investigation of the log-contrast frequency distributions allows pointing out the statistical laws able to generate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or with other reference parameters, allow defining monitors to be used to assess the extent of possible environmental contamination. A case study on running and ground waters from the Chiavenna Valley (Northern Italy), using Na+, K+, Ca2+, Mg2+, HCO3−, SO42− and Cl− concentrations, will be illustrated.
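The log-ratio machinery behind a simplicial PCA can be sketched with the centred log-ratio (clr) transform, the simplest family of log-contrasts: compositions (parts summing to a constant) are divided by their geometric mean and logged, mapping them out of the simplex. The ion compositions below are invented, not the Chiavenna Valley analyses.

```python
import numpy as np

def clr(x):
    """Centred log-ratio transform of compositions (rows of x, all parts > 0)."""
    x = np.asarray(x, float)
    g = np.exp(np.mean(np.log(x), axis=-1, keepdims=True))  # geometric mean
    return np.log(x / g)

# hypothetical water analyses closed to 100%: Na+, K+, Ca2+, Mg2+
comps = np.array([[60.0, 5.0, 25.0, 10.0],
                  [30.0, 10.0, 40.0, 20.0]])

z = clr(comps)
print(np.round(z.sum(axis=1), 10))   # clr coordinates sum to zero by construction
```

A PCA of the clr (or, equivalently up to a rotation, ilr) coordinates is what yields the log-contrast values that the abstract proposes as environmental indicators.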