951 results for Principal component analysis (PCA)


Relevance: 100.00%

Abstract:

A compositional multivariate approach is used to analyse regional-scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey of Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20 cm depths on a non-aligned grid at one site per 2 km². Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI, initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or the superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques, including compositional Principal Component Analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis, were used to determine the influence of underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was determined by the first four principal components (PCs), implying “significant” structure in the data. Analysis of variance showed that only 10 PCs were necessary to classify the soil geochemical data. To consider an improvement over PCA that uses the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors.
Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons, and linear discriminant analysis (LDA) was undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat-covered areas since the superficial deposit classification was created. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this on classification analysis.
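
The centered log-ratio (clr) step mentioned in this abstract can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' workflow: the function name `clr` and the three-part sample are assumptions made for the example, and the values are made up rather than taken from the Tellus data.

```python
import math

def clr(composition):
    """Centered log-ratio transform of one strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean of the parts
    return [v - mean_log for v in logs]

# Hypothetical three-part sample (made-up values, not Tellus data).
sample = [0.2, 0.3, 0.5]
clr_sample = clr(sample)

# The same sample expressed in percent instead of proportions.
clr_scaled = clr([20.0, 30.0, 50.0])
```

Two properties make the transform suitable as a preprocessing step before PCA: the clr coordinates of a sample sum to zero, and rescaling the whole sample (proportions versus percent) leaves them unchanged. Zero or censored values must be imputed first, since the logarithm is undefined at zero.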

Relevance: 100.00%

Abstract:

Microsecond-long Molecular Dynamics (MD) trajectories of biomolecular processes are now possible due to advances in computer technology. Soon, trajectories long enough to probe dynamics over many milliseconds will become available. Since these timescales match the physiological timescales over which many small proteins fold, all-atom MD simulations of protein folding are now becoming popular. To distill features of such large folding trajectories, we must develop methods that can both compress trajectory data to enable visualization and lend themselves to further analysis, such as the finding of collective coordinates and reduction of the dynamics. Conventionally, clustering has been the most popular MD trajectory analysis technique, followed by principal component analysis (PCA). Simple clustering used in MD trajectory analysis suffers from various serious drawbacks, namely, (i) it is not data driven, (ii) it is unstable to noise and changes in cutoff parameters, and (iii) since it does not take into account interrelationships amongst data points, the separation of data into clusters can often be artificial. Usually, partitions generated by clustering techniques are validated visually, but such validation is not possible for MD trajectories of protein folding, as the underlying structural transitions are not well understood. Rigorous cluster validation techniques may be adapted, but it is more crucial to reduce the dimensions in which MD trajectories reside, while still preserving their salient features. PCA has often been used for dimension reduction; while it is computationally inexpensive, being a linear method, it does not achieve good data compression. In this thesis, I propose a different method, a nonmetric multidimensional scaling (nMDS) technique, which achieves superior data compression by virtue of being nonlinear, and also provides a clear insight into the structural processes underlying MD trajectories.
I illustrate the capabilities of nMDS by analyzing three complete villin headpiece folding and six norleucine mutant (NLE) folding trajectories simulated by Freddolino and Schulten [1]. Using these trajectories, I make comparisons between nMDS, PCA and clustering to demonstrate the superiority of nMDS. The three villin headpiece trajectories showed great structural heterogeneity. Apart from a few trivial features like early formation of secondary structure, no commonalities between trajectories were found. There were no units of residues or atoms found moving in concert across the trajectories. A flipping transition, corresponding to the flipping of helix 1 relative to the plane formed by helices 2 and 3, was observed towards the end of the folding process in all trajectories, when nearly all native contacts had been formed. However, the transition occurred through a different series of steps in each trajectory, indicating that it may not be a common transition in villin folding. All trajectories showed competition between local structure formation/hydrophobic collapse and global structure formation. Our analysis of the NLE trajectories confirms the notion that a tight hydrophobic core inhibits correct 3-D rearrangement. Only one of the six NLE trajectories folded, and it showed no flipping transition. All the other trajectories became trapped in hydrophobically collapsed states. The NLE residues were found to be buried deeply in the core, compared to the corresponding lysines in the villin headpiece, thereby making the core tighter and harder to undo for 3-D rearrangement. Our results suggest that the NLE may not be a fast folder as experiments suggest. The tightness of the hydrophobic core may be a very important factor in the folding of larger proteins. It is likely that chaperones like GroEL act to undo the tight hydrophobic core of proteins, after most secondary structure elements have been formed, so that global rearrangement is easier.
I conclude by presenting facts about chaperone-protein complexes and propose further directions for the study of protein folding.

Relevance: 100.00%

Abstract:

Phenotypic variation in plants can be evaluated by morphological characterization using visual attributes. Fruits have been the major descriptors for identification of different varieties of fruit crops. However, even in their absence, farmers, breeders and interested stakeholders need to distinguish between different mango varieties. This study aimed at determining diversity in mango germplasm from the Upper Athi River (UAR) and providing useful alternative descriptors for the identification of different mango varieties in the absence of fruits. A total of 20 International Plant Genetic Resources Institute (IPGRI) descriptors for mango were selected for use in the visual assessment of 98 mango accessions from 15 sites of the UAR region of eastern Kenya. Purposive sampling was used to identify farmers growing diverse varieties of mangoes. Evaluation of the descriptors was performed on-site and the data collected were then subjected to multivariate analysis including Principal Component Analysis (PCA) and cluster analysis, one-way analysis of variance (ANOVA) and chi-square tests. Results classified the accessions into two major groups corresponding to indigenous and exotic varieties. The PCA showed the first six principal components accounting for 75.12% of the total variance. A strong and highly significant correlation was observed between the color of fully grown leaves, leaf blade width, leaf blade length and petiole length, and also between the leaf attitude, color of young leaf, stem circumference, tree height, leaf margin, growth habit and fragrance. Useful descriptors for morphological evaluation were 14 out of the selected 20; however, ANOVA and chi-square tests revealed that diversity in the accessions was mainly the result of variations in color of young leaves, leaf attitude, leaf texture, growth habit, leaf blade length, leaf blade width and petiole length.
These results reveal that mango germplasm in the UAR has significant diversity and that other morphological traits apart from fruits can be useful in morphological characterization of mango.
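
A figure such as "the first six principal components account for 75.12% of the total variance" is a cumulative share of the PCA eigenvalues. The sketch below shows that arithmetic under stated assumptions: the function names and the four eigenvalues are hypothetical and are not taken from the mango study.

```python
def cumulative_explained_variance(eigenvalues):
    """Cumulative percentage of total variance, largest eigenvalue first."""
    total = sum(eigenvalues)
    running = 0.0
    percentages = []
    for value in sorted(eigenvalues, reverse=True):
        running += value
        percentages.append(100.0 * running / total)
    return percentages

def components_needed(eigenvalues, threshold_percent):
    """Smallest number of components whose cumulative share meets the threshold."""
    shares = cumulative_explained_variance(eigenvalues)
    for k, share in enumerate(shares, start=1):
        if share >= threshold_percent:
            return k
    return len(eigenvalues)

# Hypothetical eigenvalues of a covariance or correlation matrix.
eigenvalues = [4.0, 3.0, 2.0, 1.0]
cumulative = cumulative_explained_variance(eigenvalues)
needed_for_75 = components_needed(eigenvalues, 75.0)
```

With these made-up eigenvalues the first two components reach 70% and the third is needed to pass 75%, which is the same kind of reading used to decide how many components to retain.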

Relevance: 100.00%

Abstract:

A Flood Vulnerability Index (FloodVI) was developed using Principal Component Analysis (PCA) and a new aggregation method based on Cluster Analysis (CA). PCA simplifies a large number of variables into a few uncorrelated factors representing the social, economic, physical and environmental dimensions of vulnerability. CA groups areas that have the same characteristics in terms of vulnerability into vulnerability classes. The grouping of the areas determines their classification, contrary to other aggregation methods, in which the areas' classification determines their grouping. Other aggregation methods distribute the areas into classes in an artificial manner, by imposing a certain probability for an area to belong to a certain class, as determined by the assumption that the aggregation measure used is normally distributed; CA does not constrain the distribution of the areas across the classes. FloodVI was designed at the neighbourhood level and was applied to the Portuguese municipality of Vila Nova de Gaia, where several flood events have taken place in the recent past. The FloodVI sensitivity was assessed using three different aggregation methods: the sum of component scores, the first component score and the weighted sum of component scores. The results highlight the sensitivity of the FloodVI to different aggregation methods. The sum of component scores and the weighted sum of component scores showed similar results. The first component score aggregation method classifies almost all areas as having medium vulnerability, whereas the results obtained using CA show a distinct differentiation of vulnerability in which hot spots can be clearly identified. The information provided by records of previous flood events corroborates the results obtained with CA, because the inundated areas with greater damage are those identified as high and very high vulnerability areas by CA. This supports the conclusion that CA provides a reliable FloodVI.
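
The three score-based aggregation methods named in this abstract can be sketched as below. This is an illustration only: the abstract does not state how the weights are chosen, so weighting each component by its share of explained variance is an assumption, and the function names, scores and variances are hypothetical.

```python
def sum_of_scores(scores):
    """Index = plain sum of all component scores for each area."""
    return [sum(row) for row in scores]

def first_component_score(scores):
    """Index = score on the first principal component only."""
    return [row[0] for row in scores]

def weighted_sum_of_scores(scores, explained_variance):
    """Index = sum of scores weighted by each component's variance share.

    Assumption: weights proportional to explained variance; the source
    abstract does not specify the weighting scheme.
    """
    total = sum(explained_variance)
    weights = [v / total for v in explained_variance]
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]

# Hypothetical component scores: one row per area, one column per component.
scores = [[1.0, -0.5], [-1.0, 0.5], [0.0, 2.0]]
explained_variance = [3.0, 1.0]  # hypothetical variances of the two components

index_sum = sum_of_scores(scores)
index_first = first_component_score(scores)
index_weighted = weighted_sum_of_scores(scores, explained_variance)
```

In this toy case the third area ranks highest under the plain sum but only in the middle under the first-component score, which illustrates why the resulting index is sensitive to the aggregation method.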

Relevance: 100.00%

Abstract:

Flavanones (hesperidin, naringenin, naringin, and poncirin) in industrial orange juices, hand-squeezed orange juices, and orange juices from fresh-in-squeeze machines were determined by HPLC/DAD analysis using a previously described liquid-liquid extraction method. Method validation, including accuracy, was performed using recovery tests. Thirty-six samples collected from different Brazilian locations and brands were analyzed. Concentrations were determined using an external standard curve. The calculated limits of detection (LOD) were 0.0037, 1.87, 0.0147, and 0.0066 mg 100 g⁻¹ and the limits of quantification (LOQ) were 0.0089, 7.84, 0.0302, and 0.0200 mg 100 g⁻¹ for naringin, hesperidin, poncirin, and naringenin, respectively. The results demonstrated that hesperidin was present at the highest concentration levels, especially in the industrial orange juices. Its average content and concentration range were 69.85 and 18.80-139.00 mg 100 g⁻¹. The other flavanones were present at much lower concentration levels. The average contents and concentration ranges found were 0.019 and 0.01-0.30, 0.12 and 0.1-0.17, and 0.13 and 0.01-0.36 mg 100 g⁻¹, respectively. The results were also evaluated using principal component analysis (PCA), which showed that poncirin, naringenin, and naringin were the principal elements contributing to the variability in the sample concentrations.

Relevance: 100.00%

Abstract:

This manuscript presents the basic concepts and practical application of Principal Component Analysis (PCA) as a tutorial for beginners, undergraduate and graduate students, using the Matlab or Octave computing environment. As a practical example, the exploratory analysis of edible vegetable oils by mid-infrared spectroscopy is shown.
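
The basic PCA steps that such a tutorial covers (mean-centering, covariance matrix, leading eigenvector, scores) can be sketched as follows. The tutorial itself uses Matlab or Octave; this sketch uses standard-library Python instead, with power iteration standing in for a full eigendecomposition. The function names and the tiny data set are hypothetical and are unrelated to the vegetable-oil spectra.

```python
import math

def mean_center(data):
    """Subtract each column mean so every variable has mean zero."""
    n = len(data)
    means = [sum(column) / n for column in zip(*data)]
    return [[x - m for x, m in zip(row, means)] for row in data]

def covariance(centered):
    """Sample covariance matrix of mean-centered data (rows are samples)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(cov, iterations=500):
    """Leading eigenvalue and eigenvector of cov by power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(a * b for a, b in zip(v, cov_v))  # Rayleigh quotient
    return eigenvalue, v

# Hypothetical data: four samples, two variables lying on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = mean_center(data)
eigenvalue, loadings = first_component(covariance(centered))
# Scores: projection of each centered sample onto the first component.
scores = [sum(x * w for x, w in zip(row, loadings)) for row in centered]
```

Because the toy data lie exactly on one line, the first component captures all of the variance and its loadings point along that line. For real spectra one would normally rely on a tested numerical routine (an SVD or eigendecomposition from a numerical library) rather than hand-written power iteration.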

Relevance: 100.00%

Abstract:

Descriptive terminology and sensory profiles of three Brazilian varietal white wines (cultivars Riesling, Gewürztraminer and Chardonnay) were developed by a methodology based on Quantitative Descriptive Analysis (QDA). The sensory panel consensually defined the sensory descriptors, their respective reference materials and the descriptive evaluation ballot. Ten individuals were selected as judges based on their discrimination, reproducibility and individual consensus with the sensory panel. Twelve descriptors were generated, showing similarities and differences among the wine samples. Each descriptor was evaluated using a nine-centimeter unstructured scale with the intensity terms anchored at its ends. The collected data were analysed by ANOVA, Tukey test and Principal Component Analysis (PCA). The results showed a great difference within the sensory profiles of Riesling and Gewürztraminer wines, whereas Chardonnay wines showed less variation. PCA separated the samples into two groups: a first group formed by wines higher in sweetness and fruity flavor and aroma, and a second group of wines higher in sourness, astringency, bitterness, and alcoholic and fermented flavors.

Relevance: 100.00%

Abstract:

Medium density fiberboard (MDF) is an engineered wood product formed by breaking down selected lignocellulosic material residuals into fibers, combining them with wax and a resin binder, and then forming panels by applying high temperature and pressure. Because the raw material in the industrial process is ever-changing, the panel industry requires methods for monitoring the composition of its products. The aim of this study was to estimate the ratio of sugarcane (SC) bagasse to Eucalyptus wood in MDF panels using near infrared (NIR) spectroscopy. Principal component analysis (PCA) and partial least squares (PLS) regressions were performed. MDF panels having different bagasse contents were easily distinguished from each other by the PCA of their NIR spectra, with clearly different patterns of response. The PLS-R models for SC content of these MDF samples presented a strong coefficient of determination (0.96) between the NIR-predicted and lab-determined values and a low standard error of prediction (~1.5%) in the cross-validations. A key role of resins (adhesives), cellulose, and lignin for such PLS-R calibrations was shown. The PLS-DA model correctly classified ninety-four percent of MDF samples in cross-validation and ninety-eight percent of the panels in an independent test set. These NIR-based models can be useful to quickly estimate the sugarcane bagasse vs. Eucalyptus wood content ratio in unknown MDF samples and to verify the quality of these engineered wood products in an online process.
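
Figures of merit like those quoted for the PLS-R models compare predicted values with laboratory reference values. The sketch below computes a coefficient of determination and a root mean square error of prediction (RMSEP, a close relative of the standard error of prediction). The exact formulas used in the study are not given in the abstract, so these definitions are assumptions, and the function names and numbers are hypothetical.

```python
import math

def coefficient_of_determination(reference, predicted):
    """R^2 = 1 - SS_res / SS_tot, computed against the reference values."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

def rmsep(reference, predicted):
    """Root mean square error of prediction, in the units of the reference."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical bagasse contents (%): lab-determined vs. NIR-predicted.
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 31.0, 39.0]

r_squared = coefficient_of_determination(reference, predicted)
error = rmsep(reference, predicted)
```

For these made-up values the residual sum of squares is 10 against a total of 500, so R² is 0.98 and the RMSEP is about 1.58 percentage points.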

Relevance: 100.00%

Abstract:

Natural products have widespread biological activities, including inhibition of mitochondrial enzyme systems. Some of these activities, for example cytotoxicity, may be the result of alteration of cellular bioenergetics. Based on previous computer-aided drug design (CADD) studies and considering reported data on structure-activity relationships (SAR), one assumption regarding the mechanism of action of natural products against parasitic infections involves NADH-oxidase inhibition. In this study, chemometric tools such as Principal Component Analysis (PCA), Consensus PCA (CPCA), and partial least squares regression (PLS) were applied to a set of forty natural compounds acting as NADH-oxidase inhibitors. The calculations were performed using the VolSurf+ program. The formalisms employed generated good exploratory and predictive results. The independent variables or descriptors having a hydrophobic profile were strongly correlated with the biological data.

Relevance: 100.00%

Abstract:

This paper proposes a novel computer vision approach that processes video sequences of people walking and then recognises those people by their gait. Human motion carries different information that can be analysed in various ways. The skeleton carries motion information about human joints, and the silhouette carries information about boundary motion of the human body. Moreover, binary and gray-level images contain different information about human movements. This work proposes to recover these different kinds of information to interpret the global motion of the human body based on four different segmented image models, using a fusion model to improve classification. Our proposed method considers the set of the segmented frames of each individual as a distinct class and each frame as an object of this class. The methodology applies background extraction using the Gaussian Mixture Model (GMM), a scale reduction based on the Wavelet Transform (WT) and feature extraction by Principal Component Analysis (PCA). We propose four new schemas for motion information capture: the Silhouette-Gray-Wavelet model (SGW) captures motion based on gray-level variations; the Silhouette-Binary-Wavelet model (SBW) captures motion based on binary information; the Silhouette-Edge-Wavelet model (SEW) captures motion based on edge information; and the Silhouette-Skeleton-Wavelet model (SSW) captures motion based on skeleton movement. The classification rates obtained separately from these four different models are then merged using a newly proposed fusion technique. The results suggest excellent performance in terms of recognising people by their gait.

Relevance: 100.00%

Abstract:

Sigma phase is a deleterious phase that can form in duplex stainless steels during heat treatment or welding. To follow this transformation, ferrite and sigma percentages and hardness were measured on samples of a UNS S31803 duplex stainless steel submitted to heat treatment. These results were compared to measurements obtained from ultrasound and eddy current techniques, i.e., velocity and impedance, respectively. Additionally, backscattered signals produced by wave propagation were acquired during ultrasonic inspection, as well as magnetic Barkhausen noise during magnetic inspection. Both signal types were processed via a combination of detrended-fluctuation analysis (DFA) and principal component analysis (PCA). The techniques used proved to be sensitive to changes in the samples related to sigma phase formation due to heat treatment. Furthermore, these methods have the advantage of being nondestructive. (C) 2010 Elsevier B.V. All rights reserved.
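
The detrended-fluctuation analysis (DFA) step named in this abstract can be sketched for a single window size as below. This is a generic first-order DFA illustration, not the authors' processing chain: the function name, the linear detrending order, the use of non-overlapping windows and the toy signal are all assumptions.

```python
import math

def dfa_fluctuation(signal, window):
    """Root-mean-square fluctuation F(window) with linear detrending.

    Assumes window >= 2 and len(signal) >= window; trailing samples that
    do not fill a complete window are ignored.
    """
    mean = sum(signal) / len(signal)
    # Profile: cumulative sum of the mean-subtracted signal.
    profile, running = [], 0.0
    for x in signal:
        running += x - mean
        profile.append(running)
    t = list(range(window))
    t_mean = sum(t) / window
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    squared, count = 0.0, 0
    for start in range(0, len(profile) - window + 1, window):
        segment = profile[start:start + window]
        y_mean = sum(segment) / window
        # Least-squares straight line fitted to this window of the profile.
        slope = sum((ti - t_mean) * (yi - y_mean)
                    for ti, yi in zip(t, segment)) / sxx
        intercept = y_mean - slope * t_mean
        squared += sum((yi - (intercept + slope * ti)) ** 2
                       for ti, yi in zip(t, segment))
        count += window
    return math.sqrt(squared / count)

# Hypothetical alternating signal standing in for an inspection signal.
signal = [1.0, -1.0] * 8
fluctuation = dfa_fluctuation(signal, 4)
```

In a full analysis the fluctuation is computed for a range of window sizes, and the resulting curve of F against window size (usually on log-log axes) provides the features that a later step such as PCA can work on.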

Relevance: 100.00%

Abstract:

In this work, chemometric methods are reported as potential tools for monitoring the authenticity of Brazilian ultra-high temperature (UHT) milk processed in industrial plants located in different regions of the country. A total of 100 samples were submitted to qualitative analysis for adulterants such as starch, chlorine, formol, hydrogen peroxide and urine. Except for starch, all the samples showed the presence of at least one adulterant. The use of chemometric methodologies such as Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA) enabled the verification of the occurrence of certain adulterations in specific regions. The proposed multivariate approaches may allow the sanitary agency authorities to optimise material, human and financial resources, as they associate the occurrence of adulterations with the geographical location of the industrial plants. (c) 2010 Elsevier Ltd. All rights reserved.

Relevance: 100.00%

Abstract:

The concentrations of major, minor and trace metals were measured in water samples collected from five shallow Antarctic lakes (Carezza, Edmonson Point (No 14 and 15a), Inexpressible Island and Tarn Flat) found in Terra Nova Bay (northern Victoria Land, Antarctica) during the Italian Expeditions of 1993-2001. The total concentrations of a large suite of elements (Al, As, Ba, Ca, Cd, Ce, Co, Cr, Cs, Cu, Fe, Ga, Gd, K, La, Li, Mg, Mn, Mo, Na, Nd, Ni, Pb, Pr, Rb, Sc, Si, Sr, Ta, Ti, U, V, Y, W, Zn and Zr) were determined using spectroscopic techniques (ICP-AES, GF-AAS and ICP-MS). The results are similar to those obtained for the freshwater lakes of the Larsemann Hills, East Antarctica, and for the McMurdo Dry Valleys. Principal Component Analysis (PCA) and Cluster Analysis (CA) were performed to identify groups of samples with similar characteristics and to find correlations between the variables. The variability observed within the water samples is closely connected to the sea spray input; hence, it is primarily a consequence of geographical and meteorological factors, such as distance from the ocean and time of year. The trace element levels, in particular those of heavy metals, are very low, suggesting an origin from natural sources rather than from anthropogenic contamination.

Relevance: 100.00%

Abstract:

Summary: Plant biologists in the fields of ecology, evolution, genetics and breeding frequently use multivariate methods. This paper illustrates Principal Component Analysis (PCA) and Gabriel's biplot as applied to microarray expression data from plant pathology experiments. Availability: An example program in the publicly distributed statistical language R is available from the web site (www.tpp.uq.edu.au) and by e-mail from the contact. Contact: scott.chapman@csiro.au.

Relevance: 100.00%

Abstract:

Particulate matter, especially PM2.5, is associated with increased morbidity and mortality from respiratory diseases. Studies that focus on the chemical composition of the material are frequent in the literature, but those that characterize the biological fraction are rare. The objectives of this study were to characterize samples collected in São Paulo, Brazil, for the quantity of fungi and endotoxins associated with PM2.5, and to correlate these with the mass of particulate matter, chemical composition and meteorological parameters. This was done using Principal Component Analysis (PCA) and multiple linear regressions. The results showed that fungi and endotoxins represent a significant portion of PM2.5, reaching average concentrations of 772.23 spores µg⁻¹ of PM2.5 (SD: 400.37) and 5.52 EU mg⁻¹ of PM2.5 (SD: 4.51 EU mg⁻¹), respectively. Hyaline basidiospores, Cladosporium and total spore counts were correlated with the factor Ba/Ca/Fe/Zn/K/Si of PM2.5 (p < 0.05). The genera Pen/Asp were correlated with the total mass of PM2.5 (p < 0.05) and colorless ascospores were correlated with humidity (p < 0.05). Endotoxin was positively correlated with atmospheric temperature (p < 0.05). This study has shown that bioaerosol is present in considerable amounts in PM2.5 in the atmosphere of São Paulo, Brazil. Some fungi were correlated with soil particle resuspension and the mass of particulate matter. Therefore, the relative contribution of bioaerosol to PM2.5 should be considered in future studies aimed at evaluating the clinical impact of exposure to air pollution. (C) 2010 Elsevier Ltd. All rights reserved.