922 resultados para principal component regression
Resumo:
BACKGROUND Compared to food patterns, nutrient patterns have been rarely used particularly at international level. We studied, in the context of a multi-center study with heterogeneous data, the methodological challenges regarding pattern analyses. METHODOLOGY/PRINCIPAL FINDINGS We identified nutrient patterns from food frequency questionnaires (FFQ) in the European Prospective Investigation into Cancer and Nutrition (EPIC) Study and used 24-hour dietary recall (24-HDR) data to validate and describe the nutrient patterns and their related food sources. Associations between lifestyle factors and the nutrient patterns were also examined. Principal component analysis (PCA) was applied on 23 nutrients derived from country-specific FFQ combining data from all EPIC centers (N = 477,312). Harmonized 24-HDRs available for a representative sample of the EPIC populations (N = 34,436) provided accurate mean group estimates of nutrients and foods by quintiles of pattern scores, presented graphically. An overall PCA combining all data captured a good proportion of the variance explained in each EPIC center. Four nutrient patterns were identified explaining 67% of the total variance: Principle component (PC) 1 was characterized by a high contribution of nutrients from plant food sources and a low contribution of nutrients from animal food sources; PC2 by a high contribution of micro-nutrients and proteins; PC3 was characterized by polyunsaturated fatty acids and vitamin D; PC4 was characterized by calcium, proteins, riboflavin, and phosphorus. The nutrients with high loadings on a particular pattern as derived from country-specific FFQ also showed high deviations in their mean EPIC intakes by quintiles of pattern scores when estimated from 24-HDR. Center and energy intake explained most of the variability in pattern scores. CONCLUSION/SIGNIFICANCE The use of 24-HDR enabled internal validation and facilitated the interpretation of the nutrient patterns derived from FFQs in term of food sources. These outcomes open research opportunities and perspectives of using nutrient patterns in future studies particularly at international level.
Resumo:
Three multivariate statistical tools (principal component analysis, factor analysis, analysis discriminant) have been tested to characterize and model the sags registered in distribution substations. Those models use several features to represent the magnitude, duration and unbalanced grade of sags. They have been obtained from voltage and current waveforms. The techniques are tested and compared using 69 registers of sags. The advantages and drawbacks of each technique are listed
Resumo:
A statistical method for classification of sags their origin downstream or upstream from the recording point is proposed in this work. The goal is to obtain a statistical model using the sag waveforms useful to characterise one type of sags and to discriminate them from the other type. This model is built on the basis of multi-way principal component analysis an later used to project the available registers in a new space with lower dimension. Thus, a case base of diagnosed sags is built in the projection space. Finally classification is done by comparing new sags against the existing in the case base. Similarity is defined in the projection space using a combination of distances to recover the nearest neighbours to the new sag. Finally the method assigns the origin of the new sag according to the origin of their neighbours
Resumo:
The work presented in this paper belongs to the power quality knowledge area and deals with the voltage sags in power transmission and distribution systems. Propagating throughout the power network, voltage sags can cause plenty of problems for domestic and industrial loads that can financially cost a lot. To impose penalties to responsible party and to improve monitoring and mitigation strategies, sags must be located in the power network. With such a worthwhile objective, this paper comes up with a new method for associating a sag waveform with its origin in transmission and distribution networks. It solves this problem through developing hybrid methods which hire multiway principal component analysis (MPCA) as a dimension reduction tool. MPCA reexpresses sag waveforms in a new subspace just in a few scores. We train some well-known classifiers with these scores and exploit them for classification of future sags. The capabilities of the proposed method for dimension reduction and classification are examined using the real data gathered from three substations in Catalonia, Spain. The obtained classification rates certify the goodness and powerfulness of the developed hybrid methods as brand-new tools for sag classification
Resumo:
Evaluation of segmentation methods is a crucial aspect in image processing, especially in the medical imaging field, where small differences between segmented regions in the anatomy can be of paramount importance. Usually, segmentation evaluation is based on a measure that depends on the number of segmented voxels inside and outside of some reference regions that are called gold standards. Although some other measures have been also used, in this work we propose a set of new similarity measures, based on different features, such as the location and intensity values of the misclassified voxels, and the connectivity and the boundaries of the segmented data. Using the multidimensional information provided by these measures, we propose a new evaluation method whose results are visualized applying a Principal Component Analysis of the data, obtaining a simplified graphical method to compare different segmentation results. We have carried out an intensive study using several classic segmentation methods applied to a set of MRI simulated data of the brain with several noise and RF inhomogeneity levels, and also to real data, showing that the new measures proposed here and the results that we have obtained from the multidimensional evaluation, improve the robustness of the evaluation and provides better understanding about the difference between segmentation methods.
Resumo:
First discussion on compositional data analysis is attributable to Karl Pearson, in 1897. However, notwithstanding the recent developments on algebraic structure of the simplex, more than twenty years after Aitchison’s idea of log-transformations of closed data, scientific literature is again full of statistical treatments of this type of data by using traditional methodologies. This is particularly true in environmental geochemistry where besides the problem of the closure, the spatial structure (dependence) of the data have to be considered. In this work we propose the use of log-contrast values, obtained by asimplicial principal component analysis, as LQGLFDWRUV of given environmental conditions. The investigation of the log-constrast frequency distributions allows pointing out the statistical laws able togenerate the values and to govern their variability. The changes, if compared, for example, with the mean values of the random variables assumed as models, or other reference parameters, allow definingmonitors to be used to assess the extent of possible environmental contamination. Case study on running and ground waters from Chiavenna Valley (Northern Italy) by using Na+, K+, Ca2+, Mg2+, HCO3-, SO4 2- and Cl- concentrations will be illustrated
Resumo:
BACKGROUND: It is unknown why patients with extensive ulcerative colitis (UC) have a higher risk of colorectal cancer compared with patients with left-sided UC. This study characterizes the inflammatory processes in left-sided UC, pancolitis, and UC-associated dysplasia at the transcriptional level to identify potential biomarkers and transcripts of importance for the carcinogenic behavior of chronic inflammation. METHODS: The Affymetrix GeneChip Human Genome U133 Plus 2.0 was applied on colonic biopsies from UC patients with left-sided UC, pancolitis, dysplasia, and controls. Reverse transcription polymerase chain reaction and immunohistochemistry were performed for validating selected transcripts in the initial cohort and in 2 independent cohorts of patients with UC. Microarray data were analyzed by principal component analysis, and reverse transcription polymerase chain reaction and immunohistochemistry data by the Wilcoxon's rank-sum test. RESULTS: The principal component analysis results revealed separate clusters for left-sided UC, pancolitis, dysplasia, and controls. Close clustering of dysplastic and pancolitic samples indicated similarities in gene expression. Indeed, 101 and 656 parallel upregulated and downregulated transcripts, respectively, were identified in specimens from dysplasia and pancolitis. Validation of selected transcripts hereof identified insulin receptor alpha (INSRA) and MAP kinase interacting serine/threonine kinase 2 (MKNK2) with an enhanced expression in dysplasia compared with left-sided UC and controls, whereas laminin γ2 (LAMC2) was found with a lower expression in dysplasia compared with the remaining 3 groups. CONCLUSIONS: This study demonstrates pancolitis and left-sided UC as distinct inflammatory processes at the transcriptional level, and identifies INSRA, MKNK2, and LAMC2 as potential critical transcripts in the inflammation-driven preneoplastic process of UC.
Resumo:
Although Leontopodium alpinum is considered to be threatened in many countries, only limited scientific information about its autecology is available. In this study, we aim to define the most important ecological factors which influence the distribution of L. alpinum in the Swiss Alps. These were assessed at the national scale using species distribution models based on topoclimatic predictors and at the community scale using exhaustive plant inventories. The latter were analysed using hierarchical clustering and principal component analysis, and the results were interpreted using ecological indicator values. L. alpinum was found almost exclusively on base-rich bedrocks (limestone and ultramaphic rocks). The species distribution models showed that the available moisture (dry regions, mostly in the Inner Alps), elevation (mostly above 2000 m.a.s.l.) and slope (mostly >30°) were the most important predictors. The relevés showed that L. alpinum is present in a wide range of plant communities, all subalpine-alpine open grasslands, with a low grass cover. As a light-demanding and short species, L. alpinum requires light at ground level; hence, it can only grow in open, nutrient-poor grasslands. These conditions are met in dry conditions (dry, summer-warm climate, rocky and draining soil, south-facing aspect and/or steep slope), at high elevations, on oligotrophic soils and/or on windy ridges. Base-rich soils appear to also be essential, although it is still unclear if this corresponds to physiological or ecological (lower competition) requirements.
Resumo:
A cultivation-independent approach based on polymerase chain reaction (PCR)-amplified partial small subunit rRNA genes was used to characterize bacterial populations in the surface soil of a commercial pear orchard consisting of different pear cultivars during two consecutive growing seasons. Pyrus communis L. cvs Blanquilla, Conference, and Williams are among the most widely cultivated cultivars in Europe and account for the majority of pear production in Northeastern Spain. To assess the heterogeneity of the community structure in response to environmental variables and tree phenology, bacterial populations were examined using PCR-denaturing gradient gel electrophoresis (DGGE) followed by cluster analysis of the 16S ribosomal DNA profiles by means of the unweighted pair group method with arithmetic means. Similarity analysis of the band patterns failed to identify characteristic fingerprints associated with the pear cultivars. Both environmentally and biologically based principal-component analyses showed that the microbial communities changed significantly throughout the year depending on temperature and, to a lesser extent, on tree phenology and rainfall. Prominent DGGE bands were excised and sequenced to gain insight into the identities of the predominant bacterial populations. Most DGGE band sequences were related to bacterial phyla, such as Bacteroidetes, Cyanobacteria, Acidobacteria, Proteobacteria, Nitrospirae, and Gemmatimonadetes, previously associated with typical agronomic crop environments
Resumo:
Aim The spotted knapweed (Centaurea stoebe), a plant native to south-east and central Europe, is highly invasive in North America. We investigated the spatio-temporal climatic niche dynamics of the spotted knapweed in North America along two putative eastern and western invasion routes. We then considered the patterns observed in the light of historical, ecological and evolutionary factors. Location Europe and North America. Methods The niche characteristics of the east and west invasive populations of spotted knapweed in North America were determined from documented occurrences over 120 consecutive years (1890-2010). The 2.5 and 97.5 percentiles of values along temperature and precipitation gradients, as given by the two first axes of a principal component axis (PCA), were then calculated. We additionally measured the climatic dissimilarity between invaded and native niches using a multivariate environmental similarity surface (MESS) analysis. Results Along both invasion routes, the species established in regions with climatic conditions that were similar to those in the native range in Europe. An initial spread in ruderal habitats always preceded spread in (semi-)natural habitats. In the east, the niche gradually increased over time until it reached limits similar to the native niche. Conversely, in the west the niche abruptly expanded after an extended time lag into climates not occupied in the native range; only the native cold niche limit was conserved. Main conclusions Our study reveals that different niche dynamics have taken place during the eastern and western invasions. This pattern indicates different combinations of historical, ecological and evolutionary factors in the two ranges. We hypothesize that the lack of a well-developed transportation network in the west at the time of the introduction of spotted knapweed confined the species to a geographically and climatically isolated region. The invasion of dry rangelands may have been favoured during the agricultural transition in the 1930s by release from natural enemies, local adaptation and less competitive vegetation, but further experimental and molecular studies are needed to explain these contrasting niche patterns fully. Our study illustrates the need and benefit of applying large-scale, temporally explicit approaches to understanding biological invasions.
Resumo:
Rho GTPases are conformational switches that control a wide variety of signaling pathways critical for eukaryotic cell development and proliferation. They represent attractive targets for drug design as their aberrant function and deregulated activity is associated with many human diseases including cancer. Extensive high-resolution structures (.100) and recent mutagenesis studies have laid the foundation for the design of new structure-based chemotherapeutic strategies. Although the inhibition of Rho signaling with drug-like compounds is an active area of current research, very little attention has been devoted to directly inhibiting Rho by targeting potential allosteric non-nucleotide binding sites. By avoiding the nucleotide binding site, compounds may minimize the potential for undesirable off-target interactions with other ubiquitous GTP and ATP binding proteins. Here we describe the application of molecular dynamics simulations, principal component analysis, sequence conservation analysis, and ensemble small-molecule fragment mapping to provide an extensive mapping of potential small-molecule binding pockets on Rho family members. Characterized sites include novel pockets in the vicinity of the conformationaly responsive switch regions as well as distal sites that appear to be related to the conformations of the nucleotide binding region. Furthermore the use of accelerated molecular dynamics simulation, an advanced sampling method that extends the accessible time-scale of conventional simulations, is found to enhance the characterization of novel binding sites when conformational changes are important for the protein mechanism.
Resumo:
Els avenços en tècniques de genotipat de polimorfismes genètics a gran escala estan liderant una revolució en el camp de l’epidemiologia genètica i la genètica de poblacions humanes. La informació aportada per aquestes tècniques ha evidenciat l’existència d’estructuracions poblacionals que poden augmentar l’error en els estudis d’associació a escala genòmica (GWAS, genome-wide association studies). Estudis recents han demostrat la presència d’aquestes estructuracions a nivell interregional i intrarregional a Europa. El present projecte ha avaluat el grau d’estructuració genètica en poblacions de la Península Ibèrica i altres regions del sudoest europeu (Itàlia i França) per quantificar l’impacte que aquesta potencial estructuració pot tenir en el disseny d’estudis d’associació GWAS i reconstruir la història demogràfica de les poblacions de la Mediterrània. Per aconseguir aquests objectius, s’han analitzat mostres de DNA de 770 individus de 26 poblacions de la Península Ibèrica, França, Itàlia i d’altres països de la Mediterrània. Aquestes mostres van ser genotipades per 240000 SNPs utilitzant l’array 250K StyI d’Affymetrix en el marc d’aquest projecte o mitjançant altres arrays d’Affymetrix en els projectes internacionals HapMap i POPRES. S’han realitzat anàlisis estadístiques incloent anàlisis de components principals, Fst, identitat per descendència, desequilibri de lligament, barreres genètiques, etc. Aquests resultats han permés construir un marc de referència de la variabilitat en aquesta regió, avaluar el seu impacte en estudis d’associació i proposar mesures per evitar l’increment de qualsevol tipus d’error (tipus I i II) en estudis nacionals i internacionals. A més, també han permés reconstruir la història de les poblacions humanes de la Mediterrània així com analitzar les seves relacions demogràfiques. Donada la duració limitada d’aquesta acció (24 mesos, d’octubre de 2010 a setembre de 2012), els resultats d’aquest projecte es troben actualment en fase de redacció i conduiran a diverses publicacions en revistes internacionals i a la preparació de comunicacions a congressos.
Resumo:
Background: Peach fruit undergoes a rapid softening process that involves a number of metabolic changes. Storing fruit at low temperatures has been widely used to extend its postharvest life. However, this leads to undesired changes, such as mealiness and browning, which affect the quality of the fruit. In this study, a 2-D DIGE approach was designed to screen for differentially accumulated proteins in peach fruit during normal softening as well as under conditions that led to fruit chilling injury. Results:The analysis allowed us to identify 43 spots -representing about 18% of the total number analyzed- that show statistically significant changes. Thirty-nine of the proteins could be identified by mass spectrometry. Some of the proteins that changed during postharvest had been related to peach fruit ripening and cold stress in the past. However, we identified other proteins that had not been linked to these processes. A graphical display of the relationship between the differentially accumulated proteins was obtained using pairwise average-linkage cluster analysis and principal component analysis. Proteins such as endopolygalacturonase, catalase, NADP-dependent isocitrate dehydrogenase, pectin methylesterase and dehydrins were found to be very important for distinguishing between healthy and chill injured fruit. A categorization of the differentially accumulated proteins was performed using Gene Ontology annotation. The results showed that the 'response to stress', 'cellular homeostasis', 'metabolism of carbohydrates' and 'amino acid metabolism' biological processes were affected the most during the postharvest. Conclusions: Using a comparative proteomic approach with 2-D DIGE allowed us to identify proteins that showed stage-specific changes in their accumulation pattern. Several proteins that are related to response to stress, cellular homeostasis, cellular component organization and carbohydrate metabolism were detected as being differentially accumulated. Finally, a significant proportion of the proteins identified had not been associated with softening, cold storage or chilling injury-altered fruit before; thus, comparative proteomics has proven to be a valuable tool for understanding fruit softening and postharvest.
Resumo:
Objective: To assess the factorial validity of the Portuguese version of the Maslach Burnout Inventory - Human Services Survey (MBI-HSS). Methods: Between November 2010 and November 2011 a Portuguese version of the MBI-HSS was applied to 151 Portuguese family doctors (55% women, median age 54 years). The factorial structure of the MBI-HSS was examined by principal component analysis (PCA) and confirmatory factor analysis (CFA). Internal consistency estimates of the MBI-HSS were determined with Cronbach's alpha. Results: The fit of the hypothesized three-factor model to the data was superior to the alternative two-factor and four-factor models. CFA supported MBI-HSS as an acceptable measure to evaluate burnout and deletion of items 12 and 16 improved the goodness of fit of the model. In PCA, the three-factor model explained 50.58% of the variance and the four-factor model did not lead to understandable components. Item 12 was also found to be problematic in PCA. The Cronbach's alpha was satisfactory for emotional exhaustion (alpha=0.90), lack of personal accomplishment (alpha=0.73), and depersonalization (alpha=0.64). Conclusion: The Portuguese version of the MBI-HSS was found to be reliable to measure burnout among Portuguese medical doctors. We also recommend the deletion of items 12 and 16 from the MBI-HSS.
Resumo:
The fatty acids from cocoa butters of different origins, varieties, and suppliers and a number of cocoa butter equivalents (Illexao 30-61, Illexao 30-71, Illexao 30-96, Choclin, Coberine, Chocosine-Illipe, Chocosine-Shea, Shokao, Akomax, Akonord, and Ertina) were investigated by bulk stable carbon isotope analysis and compound specific isotope analysis. The interpretation is based on principal component analysis combining the fatty acid concentrations and the bulk and molecular isotopic data. The scatterplot of the two first principal components allowed detection of the addition of vegetable fats to cocoa butters. Enrichment in heavy carbon isotope (C-13) of the bulk cocoa butter and of the individual fatty acids is related to mixing with other vegetable fats and possibly to thermally or oxidatively induced degradation during processing (e.g., drying and roasting of the cocoa beans or deodorization of the pressed fat) or storage. The feasibility of the analytical approach for authenticity assessment is discussed.