928 results for improved principal components analysis (IPCA) algorithm


Relevância:

100.00%

Resumo:

OBJECTIVES: This study aimed to assess the validity of COOP charts in a general population sample, to examine whether illustrations contribute to instrument validity, and to establish general population norms. METHODS: A general population mail survey was conducted among residents of the Swiss canton of Vaud aged 20-79 years. Participants were invited to complete the COOP charts and the SF-36 Health Survey; they also provided data on health service use in the previous month. Two thirds of the respondents received standard COOP charts; the rest received charts without illustrations. RESULTS: Overall, 1250 persons responded (54%). The presence of illustrations did not affect score distributions, except that the illustrated 'physical fitness' chart drew greater non-response (10% vs. 3%, p < 0.001). Validity tests were similar for illustrated and picture-less charts. Factor analysis yielded two principal components, corresponding to physical and mental health. Six COOP charts showed strong and nearly linear relationships with the corresponding SF-36 scores (all p < 0.001), demonstrating concurrent validity. Similarly, most COOP charts were associated with the use of medical services in the past month. Only the 'social support' chart partly deviated from the construct validity hypotheses. Population norms revealed a generally lower health status in women and an age-related decline in physical health. CONCLUSIONS: COOP charts can be used to assess the health status of a general population. Their validity is good, with the possible exception of the 'social support' chart. The illustrations do not affect the properties of this instrument.

Relevância:

100.00%

Resumo:

Background: Exposure to fine particulate air pollutants (PM2.5) affects heart rate variability parameters and levels of serum proteins associated with inflammation, hemostasis, and thrombosis. This study investigated the sources potentially responsible for cardiovascular and hematological effects in highway patrol troopers. Results: Nine healthy young non-smoking male troopers working from 3 PM to midnight were studied on four consecutive days, during their shift and the following night. Sources of in-vehicle PM2.5 were identified with variance-maximizing rotational principal factor analysis of PM2.5 components and associated pollutants. Two source models were calculated. The sources of in-vehicle PM2.5 identified were 1) crustal material, 2) wear of steel automotive components, 3) gasoline combustion, and 4) speed-changing traffic with engine emissions and brake wear. In one model, sources 1 and 2 collapsed into a single source. Source factor scores were compared to cardiac and blood parameters measured ten and fifteen hours, respectively, after each shift. The "speed-change" factor was significantly associated with mean heart cycle length (MCL, +7% per standard deviation increase in the factor score), heart rate variability (+16%), supraventricular ectopic beats (+39%), % neutrophils (+7%), % lymphocytes (-10%), mean red blood cell volume (MCV, +1%), von Willebrand factor (+9%), blood urea nitrogen (+7%), and protein C (-11%). The "crustal" factor (but not the "collapsed" source) was associated with MCL (+3%) and serum uric acid concentration (+5%). Controlling for potential confounders had little influence on the effect estimates. Conclusion: PM2.5 originating from speed-changing traffic modulates the autonomic control of the heart rhythm, increases the frequency of premature supraventricular beats, and elicits proinflammatory and prothrombotic responses in healthy young men.
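The variance-maximizing (varimax) rotated principal factor analysis used above for source apportionment can be sketched in a few lines of numpy. Everything below is illustrative: the data are synthetic stand-ins for PM2.5 component measurements (two latent sources driving six species), not the study's data, and the score computation is a simple projection rather than a regression-based estimator.

```python
import numpy as np

def varimax(loadings, n_iter=100, tol=1e-6):
    """Varimax rotation of a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    last = 0.0
    for _ in range(n_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(loadings.T @ (
            rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p))
        rotation = u @ vt
        if s.sum() - last < tol:
            break
        last = s.sum()
    return loadings @ rotation

rng = np.random.default_rng(0)
# Synthetic stand-in: two latent sources driving six measured species.
sources = rng.normal(size=(500, 2))
mixing = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.1],   # species tied to source 1
                   [0.1, 0.8], [0.0, 0.9], [0.1, 0.7]])  # species tied to source 2
data = sources @ mixing.T + 0.1 * rng.normal(size=(500, 6))

# Standardize, extract two principal factors, rotate, and project scores.
z = (data - data.mean(0)) / data.std(0)
eigval, eigvec = np.linalg.eigh(np.cov(z, rowvar=False))
loadings = eigvec[:, -2:] * np.sqrt(eigval[-2:])   # two retained factors
rotated = varimax(loadings)
scores = z @ rotated                               # simple-projection factor scores
```

After rotation, each species loads mainly on one factor, which is what makes the factors interpretable as distinct emission sources.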

Relevância:

100.00%

Resumo:

Tire traces can be observed at several crime scenes, as vehicles are often used by criminals. Tread abrasion on the road, while braking or skidding, produces small rubber particles that can be collected for comparison purposes. This research focused on the statistical comparison of Py-GC/MS profiles of tire traces and tire treads. The optimisation of the analytical method was carried out using experimental designs, with the aim of determining the pyrolysis parameters that give the most repeatable results. The effect of each pyrolysis factor could thus also be calculated: the pyrolysis temperature was found to be five times more important than the time. Finally, pyrolysis at 650 °C for 15 s was selected. Ten tires of different manufacturers and models were used for this study. Several samples were collected from each tire, and several replicates were carried out to study the variability within each tire (intravariability). More than eighty compounds were integrated for each analysis, and the variability study showed that more than 75% presented a relative standard deviation (RSD) below 5% for the ten tires, thus supporting a low intravariability. The variability between the ten tires (intervariability) presented higher values, and the ten most variant compounds had RSD values above 13%, supporting their high potential for discrimination between the tires tested. Principal Component Analysis (PCA) was able to fully discriminate the ten tires with the help of the first three principal components. The ten tires were finally used to perform braking tests on a racetrack with a vehicle equipped with an anti-lock braking system. The resulting tire traces were collected using sheets of white gelatine. As for the tires, the intravariability of the traces was found to be lower than the intervariability. Clustering methods were applied, and Ward's method based on the squared Euclidean distance correctly grouped all of the tire-trace replicates in the same cluster as the replicates of their corresponding tire. Blind tests on traces were performed, and the traces were correctly assigned to their tire source. These results support the hypothesis that the tested tires, of different manufacturers and models, can be discriminated by a statistical comparison of their chemical profiles. The traces were found to be indistinguishable from their source but differentiable from all the other tires in the subset. The results are promising and will be extended to a larger sample set.
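The grouping step can be sketched with scipy's hierarchical clustering. The profiles below are synthetic stand-ins for Py-GC/MS peak areas (ten hypothetical tires, three replicates each, ~3% relative standard deviation within a tire), not the study's data; scipy's `ward` linkage implements Ward's minimum-variance criterion, which is defined on squared Euclidean distances.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
# Hypothetical profiles: 10 tires x 3 replicates x 80 integrated compounds,
# with ~3% relative standard deviation within each tire (low intravariability).
centers = rng.uniform(1.0, 100.0, size=(10, 80))
profiles = np.vstack([c + rng.normal(0.0, 0.03 * c, size=(3, 80)) for c in centers])
tire_of = np.repeat(np.arange(10), 3)

# Ward's method (squared-Euclidean criterion), cut into ten clusters.
linkage_matrix = linkage(profiles, method='ward')
clusters = fcluster(linkage_matrix, t=10, criterion='maxclust')
```

Under these assumptions (intravariability much smaller than intervariability), every replicate lands in the cluster of its own tire, mirroring the reported result.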

Relevância:

100.00%

Resumo:

The objective of this work was to determine the genetic differences among eight Brazilian populations of the tomato leafminer Tuta absoluta (Meyrick) (Lepidoptera: Gelechiidae), from the states of Espírito Santo (Santa Tereza), Goiás (Goianápolis), Minas Gerais (Uberlândia and Viçosa), Pernambuco (Camocim de São Félix), Rio de Janeiro (São João da Barra), and São Paulo (Paulínia and Sumaré), using the amplified fragment length polymorphism (AFLP) technique. Fifteen combinations of EcoRI and MseI primers were used to assess divergence among the populations. The data were analyzed using unweighted pair-group method with arithmetic averages (UPGMA) bootstrap analysis and principal coordinate analysis. Using a multilocus approach, these populations were divided into two groups, based on genetic fingerprints. The populations from Goianápolis, Santa Tereza, and Viçosa formed one group; those from Camocim de São Félix, Paulínia, São João da Barra, Sumaré, and Uberlândia fell into the second group. These results were congruent with differences in the susceptibility of this insect to insecticides previously identified by other authors.
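UPGMA clustering of binary AFLP fingerprints can be sketched with scipy. The band patterns below are synthetic (two hypothetical ancestral patterns, four populations derived from each with ~5% band flips), not the study's AFLP data; Jaccard dissimilarity is a common choice for presence/absence fingerprints.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(1)
# Two hypothetical ancestral band patterns over 100 AFLP loci (1 = band present).
ancestral = rng.integers(0, 2, size=(2, 100))
# Eight populations: four derived from each ancestor, with ~5% band flips.
populations = np.vstack([
    np.where(rng.random(100) < 0.05, 1 - ancestral[i // 4], ancestral[i // 4])
    for i in range(8)])

distances = pdist(populations.astype(bool), metric='jaccard')  # binary dissimilarity
tree = linkage(distances, method='average')                    # UPGMA
groups = fcluster(tree, t=2, criterion='maxclust')
```

Cutting the UPGMA tree into two clusters recovers the two ancestral lineages, analogous to the two population groups reported above.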

Relevância:

100.00%

Resumo:

In this work, we present a simulation of a recognition process that uses the perimeter characterization of simple plant leaves as the sole discriminating parameter. Data coding that makes the description independent of leaf size and orientation may penalize recognition performance for some varieties. Border description sequences are then used, and Principal Component Analysis (PCA) is applied in order to study the best number of components for the classification task, which is implemented by means of a Support Vector Machine (SVM) system. The results obtained are satisfactory: compared with [4], our system improves recognition success while diminishing the variance.
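The border-descriptor → PCA → SVM chain can be sketched with scikit-learn. The contour signatures below are synthetic stand-ins (lobe counts standing in for leaf varieties), and the component count and kernel are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical border-description sequences: 64 samples per variety, each a
# 32-point radial contour signature; three varieties with distinct lobe counts.
t = np.linspace(0, 2 * np.pi, 32, endpoint=False)
X, y = [], []
for label, lobes in enumerate([3, 4, 5]):
    base = 1 + 0.3 * np.cos(lobes * t)
    X.append(base + rng.normal(0, 0.05, (64, 32)))
    y += [label] * 64
X, y = np.vstack(X), np.array(y)

# Standardize, reduce with PCA, then classify with an RBF-kernel SVM.
clf = make_pipeline(StandardScaler(), PCA(n_components=8), SVC(kernel='rbf'))
scores = cross_val_score(clf, X, y, cv=5)
```

In a real study, the PCA component count would be swept (as the abstract describes) and the cross-validated accuracy tracked for each setting.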

Relevância:

100.00%

Resumo:

The model plant Arabidopsis thaliana was studied in the search for new metabolites involved in wound signalling. Diverse LC approaches were compared in terms of efficiency and analysis time, and a 7-min gradient on a UPLC-TOF-MS system with a short column was chosen for metabolite fingerprinting. This screening step was designed to allow the comparison of a high number of samples over a wide range of time points after stress induction, in both positive and negative ionisation modes. Data treatment provided clear discrimination, yielding lists of potential stress-induced ions. In a second step, the fingerprinting conditions were transferred to a longer column, whose higher peak capacity demonstrated the presence of isomers among the highlighted compounds.

Relevância:

100.00%

Resumo:

The aim of this work is to study the influence of several analytical parameters on the variability of Raman spectra of paint samples. In the present study, microtome thin sectioning and direct analysis (no preparation) are considered as sample preparation methods. In order to evaluate their influence on the measurements, a fractional factorial experimental design with seven factors (including the sampling process) is applied, for a total of 32 experiments representing 160 measurements. Once the influence of sample preparation was highlighted, a depth profile of a paint sample was acquired by changing the focusing plane in order to measure the colored layer under a clearcoat, thereby avoiding sample preparation such as microtome sectioning. Finally, chemometric treatments such as principal component analysis are applied to the resulting spectra. The findings of this study indicate the importance of sample preparation, or more specifically of surface roughness, on the variability of measurements on the same sample. Moreover, the depth-profile experiment highlights the influence of the refractive index of the upper layer (clearcoat) when measuring through a transparent layer.

Relevância:

100.00%

Resumo:

The objective of this work was to assess and characterize two clones, 169 and 685, of Cabernet Sauvignon grapes and to evaluate the wine produced from them. The experiment was carried out in São Joaquim, SC, Brazil, during the 2009 harvest season. During grape ripening, the evolution of physicochemical properties, phenolic compounds, organic acids, and anthocyanins was evaluated. At harvest, yield components were determined for each clone. Individual and total phenolics, individual and total anthocyanins, and antioxidant activity were evaluated in the wine. The clones were also assessed regarding the duration of their phenological cycle. During ripening, the evolution of phenolic compounds and of physicochemical parameters was similar for both clones; however, at harvest, significant differences were observed in yield, number of bunches per plant and berries per bunch, leaf area, and organic acid, polyphenol, and anthocyanin contents. The wines produced from these clones showed significant differences in chemical composition. The clones showed a similar phenological cycle and similar responses to bioclimatic parameters. Principal component analysis showed that clone 685 is strongly correlated with color characteristics, mainly monomeric anthocyanins, while clone 169 is correlated with individual phenolic compounds.

Relevância:

100.00%

Resumo:

The objective of this work was to evaluate the biochemical composition of six berry types belonging to the genera Fragaria, Rubus, Vaccinium, and Ribes. Fruit samples were collected in triplicate (50 fruits each) from 18 different species or cultivars of the mentioned genera during three years (2008 to 2010). The contents of individual sugars, organic acids, flavonols, and phenolic acids were determined by high-performance liquid chromatography (HPLC), while total phenolic content (TPC) and total antioxidant capacity (TAC) were determined by spectrophotometry. Principal component analysis (PCA) and hierarchical cluster analysis (CA) were performed to evaluate the differences in fruit biochemical profiles. The highest contents of bioactive components were found in Ribes nigrum and in Fragaria vesca, Rubus plicatus, and Vaccinium myrtillus. PCA and CA were able to partially discriminate between berries on the basis of their biochemical composition. Individual and total sugars, myricetin, ellagic acid, TPC, and TAC had the highest impact on the biochemical composition of the berry fruits. CA separated blackberry, raspberry, and blueberry as isolated groups, while strawberry, black currant, and red currant were not classified into specific groups. There is large variability both between and within the different types of berries. Metabolite fingerprinting of the evaluated berries revealed unique biochemical profiles and specific combinations of bioactive compound contents.

Relevância:

100.00%

Resumo:

The objective of this work was to evaluate the suitability of the multivariate method of principal component analysis (PCA), using the GGE biplot software, for grouping sunflower genotypes according to their reaction to Alternaria leaf spot disease (Alternariaster helianthi) and to their yield and oil content. Sixty-nine genotypes were evaluated for disease severity in the field, at the R3 growth stage, in seven growing seasons in Londrina, in the state of Paraná, Brazil, using a diagrammatic scale developed for this disease. Yield and oil content were also evaluated. Data were standardized using the Statistica software, and GGE biplot was used for PCA and graphical display of the data. The first two principal components explained 77.9% of the total variation. According to the polygonal biplot based on the first two principal components and three response variables, the genotypes were divided into seven sectors. Genotypes located in sectors 1 and 2 showed high yield and high oil content, respectively, and those located in sector 7 showed tolerance to the disease and high yield, despite the high disease severity. Principal component analysis using GGE biplot is an efficient method for grouping sunflower genotypes based on the studied variables.
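The core of such a principal component biplot is standardization followed by a singular value decomposition: genotype scores and variable loadings on the first two components give the biplot coordinates. The numpy sketch below uses invented correlations among three hypothetical response variables (GGE biplot itself adds model-specific scalings and the polygon construction).

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical genotype-by-variable table: 69 genotypes, 3 correlated
# response variables (disease severity, yield, oil content). Values invented.
severity = rng.normal(50, 10, 69)
yield_ = 100 - 0.8 * severity + rng.normal(0, 5, 69)
oil = 0.5 * yield_ + rng.normal(0, 5, 69)
X = np.column_stack([severity, yield_, oil])

# Standardize, then PCA via SVD; first-two-PC coordinates drive the biplot.
Z = (X - X.mean(0)) / X.std(0)
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / (s**2).sum()     # variance share of each component
scores = U[:, :2] * s[:2]           # genotype coordinates
loadings = Vt[:2].T                 # variable coordinates
```

Plotting `scores` as points and `loadings` as arrows on the same axes yields the biplot; sectors are then drawn from the convex hull of the extreme genotypes.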

Relevância:

100.00%

Resumo:

Technological progress has made a huge amount of data available at increasing spatial and spectral resolutions. Therefore, the compression of hyperspectral data is an area of active research. In some fields, the original quality of a hyperspectral image cannot be compromised, and in these cases lossless compression is mandatory. The main goal of this thesis is to provide improved methods for the lossless compression of hyperspectral images. Both prediction- and transform-based methods are studied. Two kinds of prediction-based methods are considered: in the first, the spectra of a hyperspectral image are first clustered and an optimized linear predictor is calculated for each cluster; in the second, the linear prediction coefficients are not fixed but are recalculated for each pixel. A parallel implementation of the above-mentioned linear prediction method is also presented. Two transform-based methods are presented as well: vector quantization (VQ) is used together with a new coding of the residual image, and a new back end is developed for a compression method utilizing Principal Component Analysis (PCA) and the Integer Wavelet Transform (IWT). The performance of the compression methods is compared to that of other compression methods. The results show that the proposed linear prediction methods outperform the previous ones. In addition, a novel fast exact nearest-neighbor search method is developed and used to speed up the Linde-Buzo-Gray (LBG) clustering method.
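The linear-prediction idea behind such coders can be sketched in numpy: predict each band from the preceding bands with least squares and keep the residuals, which an entropy coder then compresses losslessly. The cube below is synthetic, and the fixed two-band predictor is a simplification of the cluster-wise and per-pixel predictors the thesis describes.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical hyperspectral cube: 20 bands of 32x32 pixels with strong
# inter-band correlation (each band a scaled base image plus sensor noise).
base = rng.integers(0, 256, (32, 32)).astype(float)
cube = np.stack([base * (0.9 + 0.01 * b) + rng.normal(0, 2, (32, 32))
                 for b in range(20)])

# Predict band b from the two preceding bands with a least-squares linear
# predictor (plus offset); a lossless coder entropy-codes the residuals.
residuals = []
for b in range(2, 20):
    A = np.column_stack([cube[b - 1].ravel(), cube[b - 2].ravel(), np.ones(1024)])
    coef, *_ = np.linalg.lstsq(A, cube[b].ravel(), rcond=None)
    residuals.append(cube[b].ravel() - A @ coef)
residuals = np.array(residuals)
```

Because the residuals have far smaller variance than the raw bands, they need far fewer bits per sample, which is where the compression gain comes from.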

Relevância:

100.00%

Resumo:

The objectives of this study were to evaluate the performance of cultivars, and to quantify the variability and estimate the genetic distances of 66 wine grape accessions in the Grape Germplasm Bank of Embrapa Semi-Arid, in Juazeiro, BA, Brazil, through the characterization of discrete and continuous phenotypic variables. Multivariate statistics, such as principal components, Tocher's optimization procedure, and the graphical display of distances, were efficient in grouping the more similar genotypes according to their phenotypic characteristics. There was no agreement in the formation of groups between continuous and discrete morpho-agronomic traits when Tocher's optimization procedure was used. Discrete variables allowed the separation of Vitis vinifera and hybrids into different groups. Significant positive correlations were observed between the weight, length, and width of bunches, and a negative correlation between titratable acidity and the TSS/TTA ratio. The major part (84.12%) of the total variation present in the original data was explained by the four principal components. The results revealed little variability among the wine grape accessions in the Grape Germplasm Bank of Embrapa Semi-Arid.

Relevância:

100.00%

Resumo:

In the present work, we propose a feature reduction system for facial biometric identification, using transform domains such as the discrete cosine transform (DCT) and the discrete wavelet transform (DWT) for parameterization, and Support Vector Machines (SVM) and Neural Networks (NN) as classifiers. Dimensionality reduction has been done with Principal Component Analysis (PCA) and with Independent Component Analysis (ICA). The system presents similar success rates, about 98%, for both the DWT-SVM and the DWT-PCA-SVM configurations. The computational load in training mode is reduced owing to the smaller input size and the lower complexity of the classifier.

Relevância:

100.00%

Resumo:

Our consumption of groundwater, in particular as drinking water and for irrigation, has considerably increased over the years, and groundwater is becoming an increasingly scarce and endangered resource. Nowadays, we face many problems, ranging from water prospection to sustainable management and remediation of polluted aquifers. Independently of the hydrogeological problem, the main challenge remains dealing with the incomplete knowledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of geostatistical realizations. The main limitation of these approaches is the computational cost associated with performing complex flow simulations for each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knowledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the flow response of each realization. Due to computational constraints, state-of-the-art methods make use of approximate flow simulations to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy flow model is then run for this subset, based on which inference is made. Our objective is to increase the performance of this approach by using all of the available information, not solely the subset of exact responses. Error models are proposed to correct the approximate responses following a machine-learning approach: for the subset identified by a classical approach (here the distance kernel method), both the approximate and the exact responses are known, and this information is used to construct an error model that corrects the ensemble of approximate responses and predicts the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational cost and leads to an increase in the accuracy and robustness of the uncertainty propagation.
The strategy explored in the first chapter consists in learning, from a subset of realizations, the relationship between proxy and exact curves. In the second part of the thesis, this strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allows a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a pollution problem involving a non-aqueous phase liquid, and the error model allows a strong reduction of the computational cost while providing a good estimate of the uncertainty.
The individual correction of the proxy response by the error model leads to an excellent prediction of the exact response, opening the door to many applications. The concept of a functional error model is useful not only in the context of uncertainty propagation but also, and maybe even more so, for Bayesian inference. Markov chain Monte Carlo (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. However, this approach suffers from a low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. This led to the introduction of two-stage MCMC, where the computational cost is decreased by avoiding unnecessary simulations of the exact flow model thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy coupled to an error model provides this preliminary evaluation in the two-stage MCMC set-up. We demonstrate an increase in acceptance rate by a factor of 1.5 to 3 with respect to one-stage MCMC. An open question remains: how to choose the size of the learning set, and how to identify the realizations that optimize the construction of the error model. This requires devising an iterative strategy, such that, as new flow simulations are performed, the error model is improved by incorporating the new information. This is discussed in the fourth part of the thesis, in which we apply the methodology to a problem of saline intrusion in a coastal aquifer.
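The functional error model can be sketched in numpy: on a small training subset where both responses are known, learn a linear map between principal component scores of the proxy curves and of the exact curves (ordinary PCA on discretized curves standing in here for FPCA), then predict the exact response for every remaining realization. The response curves below are synthetic toys, not the thesis's flow simulations.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical ensemble: 200 realizations, each giving a breakthrough-like
# response curve at 50 time steps; the "exact" curve is a mild nonlinear
# distortion of the cheap "proxy" curve.
t = np.linspace(0, 1, 50)
scale = rng.uniform(0.5, 2.0, 200)
proxy = scale[:, None] * np.tanh(5 * t)          # cheap-model responses
exact = 1.1 * proxy**1.2 + 0.05 * np.sin(6 * t)  # costly-model responses

train = np.arange(30)  # subset where the exact model was actually run

def pca_scores(Y, k):
    """Center Y and return (scores, mean, k-component basis)."""
    mean = Y.mean(0)
    _, _, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    return (Y - mean) @ Vt[:k].T, mean, Vt[:k]

Sp, _, _ = pca_scores(proxy, 3)                  # proxy scores, all realizations
Se, me, Ve = pca_scores(exact[train], 3)         # exact scores, training subset

# Linear regression (with intercept) from proxy scores to exact scores.
A = np.column_stack([Sp[train], np.ones(len(train))])
B, *_ = np.linalg.lstsq(A, Se, rcond=None)
pred_exact = np.column_stack([Sp, np.ones(200)]) @ B @ Ve + me
err = float(np.abs(pred_exact - exact).mean())
```

The corrected curves track the exact responses far better than the raw proxy does, which is the gain the error model delivers both for uncertainty propagation and as the first stage of two-stage MCMC.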

Relevância:

100.00%

Resumo:

In recent years there has been growing interest in composite indicators as an efficient tool of analysis and a method of prioritizing policies. This paper presents a composite index of intermediary determinants of child health using a multivariate statistical approach. The index shows how specific determinants of child health vary across Colombian departments (administrative subdivisions). We used data collected from the 2010 Colombian Demographic and Health Survey (DHS) for 32 departments and the capital city, Bogotá. Adapting the conceptual framework of the Commission on Social Determinants of Health (CSDH), five dimensions related to child health are represented in the index: material circumstances, behavioural factors, psychosocial factors, biological factors, and the health system. In order to generate the variable weights, and taking into account the discrete nature of the data, principal component analysis (PCA) using polychoric correlations was employed in constructing the index. Five principal components were selected by this method, and the index was estimated as a weighted average of the retained components. A hierarchical cluster analysis was also carried out. The results show that the biggest differences in intermediary determinants of child health are associated with health care before and during delivery.
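The index-construction machinery can be sketched in numpy. Polychoric correlation requires a dedicated estimator (maximum likelihood on the ordinal contingency tables), so the sketch below substitutes Pearson correlations on synthetic ordinal indicators; the units, indicators, thresholds, and eigenvalue-greater-than-one retention rule are all illustrative assumptions, not the paper's specification.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical department-level data: 33 units, 10 ordinal indicators driven
# by a shared latent determinant (which induces the correlation structure).
latent = rng.normal(0, 1, 33)
raw = latent[:, None] + rng.normal(0, 1, (33, 10))
ordinal = np.digitize(raw, [-1, 0, 1])          # 4-level ordinal coding

# PCA on the correlation matrix (Pearson here as a stand-in for polychoric).
Z = (ordinal - ordinal.mean(0)) / ordinal.std(0)
eigval, eigvec = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
order = np.argsort(eigval)[::-1]
eigval, eigvec = eigval[order], eigvec[:, order]

# Retain components with eigenvalue > 1 and weight them by explained variance.
k = int((eigval > 1).sum())
scores = Z @ eigvec[:, :k]
weights = eigval[:k] / eigval[:k].sum()
index = scores @ weights                        # composite index per unit
```

The resulting index recovers the latent determinant (up to sign, since principal component directions are sign-ambiguous), which is what makes the weighted-average-of-components construction usable for ranking units.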