928 results for improved principal components analysis (IPCA) algorithm


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2016-07

Relevance:

100.00% 100.00%

Publisher:

Abstract:

BACKGROUND: Health-related quality of life (HRQL) assessment is an important measure of the impact of a wide range of disease processes on an individual. To date, no HRQL tool has been evaluated in an Iranian population with cardiovascular disorders, specifically myocardial infarction, a major cause of mortality and morbidity. The MacNew Heart Disease Health-related Quality of Life instrument is a disease-specific HRQL questionnaire with satisfactory validity and reliability when applied cross-culturally. METHOD: A Persian version of MacNew was prepared by both forward and backward translation by bilinguals, after which a feasibility test was performed. Consecutive patients (n = 51) admitted to a coronary care unit with acute myocardial infarction were recruited for measurement of their HRQL, with a retest one month after discharge in the follow-up clinic. Principal components analysis, intra-class correlation reliability, internal consistency, and test-retest reliability were assessed. RESULTS: Trivial rates of missing data confirmed the acceptability of the tool. Principal component analysis revealed that the three domains, emotional, social and physical, performed as well as in the original studies. Internal consistency was high and comparable to other studies, ranging from 0.92 for the emotional and physical domains, to 0.94 for the social domain, and to 0.95 for the Global score. Domain means of 5, 5.3 and 4.9 for emotional, physical and social respectively indicate that our Iranian population has similar emotional and physical but worse social HRQL scores. Test-retest analysis showed significant correlation in the emotional and physical domains (P < 0.05). CONCLUSION: The Persian version of the MacNew questionnaire is comparable to the English version. 
It has high internal consistency and reasonable reproducibility, making it an appropriate specific quality of life tool for population-based studies and clinical practice in Iran in patients who have survived an acute myocardial infarction. Further studies are needed to confirm its validity in larger populations with cardiovascular disease.
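The abstract above reports internal consistency values between 0.92 and 0.95 without naming the statistic. Assuming it is Cronbach's alpha (an assumption, although it is the usual choice for questionnaire domains), the calculation can be sketched as follows. The function name and the toy responses are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[j] for row in responses]) for j in range(n_items)
    ]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical 7-point responses from five respondents to a three-item domain.
toy_responses = [
    [5, 6, 5],
    [3, 3, 4],
    [6, 6, 7],
    [2, 3, 2],
    [4, 5, 4],
]
alpha = cronbach_alpha(toy_responses)
print(f"alpha = {alpha:.2f}")
```

Alpha approaches 1 when the items of a domain rise and fall together across respondents, which is why values above 0.9 are read as high internal consistency.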

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In an industrial environment, knowing the process one is working with is crucial to ensuring that it runs well. In the present work, carried out at the Prio Biocombustíveis S.A. facilities, the methanol recovery process was characterized using both process data collected during the work and historical process data, starting with the characterization of key process streams. Based on the information retrieved from the stream characterization, Aspen Plus® process simulation software was used to replicate the process and perform a sensitivity analysis with the objective of assessing the relative importance of certain key process variables (reflux/feed ratio, reflux temperature, reboiler outlet temperature, and methanol, glycerol and water feed compositions). The work proceeded with the application of a set of statistical tools, starting with Principal Components Analysis (PCA), which was used to study the interactions between process variables and their contribution to process variability. Next, Design of Experiments (DoE) was to be used to acquire experimental data and, with it, create a model for the amount of water in the distillate. However, the conditions necessary to perform this method were not met, so it was abandoned. Multiple Linear Regression (MLR) was then applied to the available data, yielding several empirical models for the water content of the distillate; the best-fitting model had an R² of 92.93% and an AARD of 19.44%. Although the AARD is still relatively high, the model is adequate for making fast estimates of the distillate's quality. As for fouling, its presence was noticed many times during this work. Since fouling could not be measured directly, the reboiler inlet steam pressure was used as an indicator of fouling growth and of how that growth varies with the amount of Used Cooking Oil incorporated in the whole process. 
Comparing the steam cost of operating the reboiler when fouling is low (steam pressure of 1.5 bar) with the cost when fouling is high (steam pressure of 3 bar) shows an increase of about 58%.
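The two fit measures quoted in this abstract can be computed as in the minimal sketch below. It assumes that AARD stands for average absolute relative deviation, which the abstract does not spell out, and it uses made-up observed and predicted values rather than plant data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def aard_percent(observed, predicted):
    """Average absolute relative deviation, in percent (assumed definition)."""
    deviations = [abs((p - o) / o) for o, p in zip(observed, predicted)]
    return 100 * sum(deviations) / len(deviations)


# Hypothetical water contents in the distillate (arbitrary units).
observed = [1.0, 2.0, 4.0]
predicted = [1.1, 1.8, 4.0]
print(f"R2 = {r_squared(observed, predicted):.4f}")
print(f"AARD = {aard_percent(observed, predicted):.2f}%")
```

Because AARD divides each error by the observed value, small observed values inflate it, so a model can combine a high R² with a fairly high AARD.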

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A revised taxonomy and knowledge of species limits remain important in biodiversity hotspots such as the Antilles, where many endemic species are found. Divergent species limits imply a different number of species in an ecosystem, which can influence decisions on conservation issues. The genera Gesneria and Rhytidophyllum, the main representatives of the family Gesneriaceae in the Antilles, include several taxa with ambiguous species limits and some species with recognized subspecies. This is the case for Gesneria viridiflora (Decne.) Kuntze, which comprises four geographically isolated subspecies with similar and variable vegetative and reproductive characters. An in-depth species delimitation of this species complex is carried out here using an integrative taxonomy approach that considers morphological, genetic and bioclimatic data. Quantitative and qualitative morphological data obtained from herbarium specimens are used to delimit morphological groups by means of a principal coordinates analysis. These groups are then tested with DNA sequences from four nuclear regions using a Bayesian method based on coalescent theory. Finally, the occurrences and the values of the temperature and precipitation variables prevailing at them are used in a bioclimatic principal components analysis to compare the morphologically and genetically delimited groups. The results of the multivariate morphological analysis support the distinction between the groups formed by the currently recognized subspecies of G. viridiflora. 
The results, including genetic data, suggest a previously unsuspected distinctness of the populations of the Massif de la Hotte in southwestern Haiti, which are genetically closer to the populations of Cuba than to those of Hispaniola. Bioclimatically, the groups delimited by the morphological and genetic analyses are distinct. The integrative taxonomy approach made it possible to distinguish five distinct species rather than the four subspecies accepted until now. These species are: G. acrochordonanthe, G. quisqueyana, G. sintenisii, G. sylvicola and G. viridiflora. A geographic distribution map, a table of the new applicable taxonomy and an identification key to the species are presented. The new taxonomy established in this study reveals unsuspected endemism in several regions of the Antilles biodiversity hotspot and underlines the importance of investigating species limits in diverse groups that include taxa with poorly understood species limits.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Dissertation (Master's), Universidade de Brasília, Instituto de Física, Programa de Pós-Graduação em Física, 2016.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Produced water is one of the most common wastes generated during oil exploration and production. This work aims to develop methodologies based on comparative statistical analysis of the hydrogeochemistry of production zones, in order to reduce the need for high-cost interventions to perform fluid identification tests (TIF). For the study, 27 samples were collected from five different production zones, and a total of 50 chemical species were measured. After the chemical analysis, the data were analysed statistically using the R statistical software, version 2.11.1. The statistical analysis was performed in three steps. In the first step, the objective was to investigate the behavior of the chemical species under study in each production zone through descriptive graphical analysis. The second step was to identify, using discriminant analysis, a function that classifies each sample into its production zone. In the training stage, the correct classification rate of the discriminant function was 85.19%. The next step applied Principal Component Analysis to reduce the number of variables through linear combinations of the chemical species, in an attempt to improve the discriminant function obtained in the second step and increase the discrimination power of the data, but the result was not satisfactory. In the Profile Analysis, curves were obtained for each production zone, based on the characteristics of the chemical species present in each zone. This study made it possible to develop a method based on hydrochemistry and statistical analysis that can be used to distinguish the water produced in mature oil fields, so that the production zone contributing to an excessive rise in water volume can be identified.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Undergraduate psychology students rated expectations of a bogus professor (randomly designated a man or woman and hot versus not hot) based on an online rating and sample comments as found on RateMyProfessors.com (RMP). Five professor qualities were derived using principal components analysis (PCA): dedication, attractiveness, enhancement, fairness, and clarity. Participants rated current psychology professors on the same qualities. Current professors were divided based on gender (man or woman), age (under 35 or 35 and older), and attractiveness (at or below the median or above the median). Using multivariate analysis of covariance (MANCOVA), students expected hot professors to be more attractive but lower in clarity. They rated current professors as lowest in clarity when a man and 35 or older. Current professors were rated significantly lower in dedication, enhancement, fairness, and clarity when rated at or below the median on attractiveness. Together with previous research, the results suggest that numerous factors, largely out of professors’ control, influence how students interpret and create professor ratings. Caution is therefore warranted in using online ratings to select courses or make hiring and promotion decisions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The development of high spatial resolution airborne and spaceborne sensors has improved the capability of ground-based data collection in the fields of agriculture, geography, geology, mineral identification, detection [2, 3], and classification [4–8]. The signal read by the sensor from a given spatial element of resolution and at a given spectral band is a mixture of components originating from the constituent substances, termed endmembers, located at that element of resolution. This chapter addresses hyperspectral unmixing, which is the decomposition of the pixel spectra into a collection of constituent spectra, or spectral signatures, and their corresponding fractional abundances indicating the proportion of each endmember present in the pixel [9, 10]. Depending on the mixing scales at each pixel, the observed mixture is either linear or nonlinear [11, 12]. The linear mixing model holds when the mixing scale is macroscopic [13]. The nonlinear model holds when the mixing scale is microscopic (i.e., intimate mixtures) [14, 15]. The linear model assumes negligible interaction among distinct endmembers [16, 17]. The nonlinear model assumes that incident solar radiation is scattered by the scene through multiple bounces involving several endmembers [18]. Under the linear mixing model and assuming that the number of endmembers and their spectral signatures are known, hyperspectral unmixing is a linear problem, which can be addressed, for example, under the maximum likelihood setup [19], the constrained least-squares approach [20], the spectral signature matching [21], the spectral angle mapper [22], and the subspace projection methods [20, 23, 24]. Orthogonal subspace projection [23] reduces the data dimensionality, suppresses undesired spectral signatures, and detects the presence of a spectral signature of interest. The basic concept is to project each pixel onto a subspace that is orthogonal to the undesired signatures. 
As shown in Settle [19], the orthogonal subspace projection technique is equivalent to the maximum likelihood estimator. This projection technique was extended by three unconstrained least-squares approaches [24] (signature space orthogonal projection, oblique subspace projection, target signature space orthogonal projection). Other works using maximum a posteriori probability (MAP) framework [25] and projection pursuit [26, 27] have also been applied to hyperspectral data. In most cases the number of endmembers and their signatures are not known. Independent component analysis (ICA) is an unsupervised source separation process that has been applied with success to blind source separation, to feature extraction, and to unsupervised recognition [28, 29]. ICA consists of finding a linear decomposition of observed data yielding statistically independent components. Given that hyperspectral data are, in given circumstances, linear mixtures, ICA comes to mind as a possible tool to unmix this class of data. In fact, the application of ICA to hyperspectral data has been proposed in reference 30, where endmember signatures are treated as sources and the mixing matrix is composed of the abundance fractions, and in references 9, 25, and 31–38, where sources are the abundance fractions of each endmember. In the first approach, we face two problems: (1) The number of samples is limited to the number of channels and (2) the process of pixel selection, playing the role of mixed sources, is not straightforward. In the second approach, ICA is based on the assumption of mutually independent sources, which is not the case for hyperspectral data, since the sum of the abundance fractions is constant, implying dependence among abundances. This dependence compromises ICA applicability to hyperspectral images. In addition, hyperspectral data are immersed in noise, which degrades the ICA performance. 
Independent factor analysis (IFA) [39] was introduced as a method for recovering independent hidden sources from their observed noisy mixtures. IFA implements two steps. First, source densities and noise covariance are estimated from the observed data by maximum likelihood. Second, sources are reconstructed by an optimal nonlinear estimator. Although IFA is a well-suited technique to unmix independent sources under noisy observations, the dependence among abundance fractions in hyperspectral imagery compromises, as in the ICA case, the IFA performance. Considering the linear mixing model, hyperspectral observations are in a simplex whose vertices correspond to the endmembers. Several approaches [40–43] have exploited this geometric feature of hyperspectral mixtures [42]. Minimum volume transform (MVT) algorithm [43] determines the simplex of minimum volume containing the data. The MVT-type approaches are complex from the computational point of view. Usually, these algorithms first find the convex hull defined by the observed data and then fit a minimum volume simplex to it. Aiming at a lower computational complexity, some algorithms such as the vertex component analysis (VCA) [44], the pixel purity index (PPI) [42], and the N-FINDR [45] still find the minimum volume simplex containing the data cloud, but they assume the presence in the data of at least one pure pixel of each endmember. This is a strong requirement that may not hold in some data sets. In any case, these algorithms find the set of most pure pixels in the data. Hyperspectral sensors collect spatial images over many narrow contiguous bands, yielding large amounts of data. For this reason, very often, the processing of hyperspectral data, including unmixing, is preceded by a dimensionality reduction step to reduce computational complexity and to improve the signal-to-noise ratio (SNR). 
Principal component analysis (PCA) [46], maximum noise fraction (MNF) [47], and singular value decomposition (SVD) [48] are three well-known projection techniques widely used in remote sensing in general and in unmixing in particular. The newly introduced method [49] exploits the structure of hyperspectral mixtures, namely the fact that spectral vectors are nonnegative. The computational complexity associated with these techniques is an obstacle to real-time implementations. To overcome this problem, band selection [50] and non-statistical [51] algorithms have been introduced. This chapter addresses hyperspectral data source dependence and its impact on ICA and IFA performances. The study considers simulated and real data and is based on mutual information minimization. Hyperspectral observations are described by a generative model. This model takes into account the degradation mechanisms normally found in hyperspectral applications, namely signature variability [52–54], abundance constraints, topography modulation, and system noise. The computation of mutual information is based on fitting mixtures of Gaussians (MOG) to data. The MOG parameters (number of components, means, covariances, and weights) are inferred using the minimum description length (MDL) based algorithm [55]. We study the behavior of the mutual information as a function of the unmixing matrix. The conclusion is that the unmixing matrix minimizing the mutual information might be very far from the true one. Nevertheless, some abundance fractions might be well separated, mainly in the presence of strong signature variability, a large number of endmembers, and high SNR. We end this chapter by sketching a new methodology to blindly unmix hyperspectral data, where abundance fractions are modeled as a mixture of Dirichlet sources. This model enforces positivity and constant sum sources (full additivity) constraints. The mixing matrix is inferred by an expectation-maximization (EM)-type algorithm. 
This approach is in the vein of references 39 and 56, replacing independent sources represented by MOG with a mixture of Dirichlet sources. Compared with the geometric-based approaches, the advantage of this model is that there is no need to have pure pixels in the observations. The chapter is organized as follows. Section 6.2 presents a spectral radiance model and formulates the spectral unmixing as a linear problem accounting for abundance constraints, signature variability, topography modulation, and system noise. Section 6.3 presents a brief review of ICA and IFA algorithms. Section 6.4 illustrates the performance of IFA and of some well-known ICA algorithms with experimental data. Section 6.5 studies the ICA and IFA limitations in unmixing hyperspectral data. Section 6.6 presents results of ICA based on real data. Section 6.7 describes the new blind unmixing scheme and some illustrative examples. Section 6.8 concludes with some remarks.
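The orthogonal subspace projection idea described in this abstract (project each pixel onto the subspace orthogonal to the undesired signatures, then look for the signature of interest) can be illustrated with a small sketch. This is not the chapter's implementation: the function names and the three-band spectra are hypothetical, and Gram-Schmidt orthogonalisation is used here in place of an explicit projection matrix.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(pixel, undesired):
    """Remove from `pixel` its component in the span of the undesired signatures."""
    basis = []
    for signature in undesired:
        # Gram-Schmidt: orthogonalise each signature against the basis so far.
        v = list(signature)
        for b in basis:
            coeff = dot(v, b) / dot(b, b)
            v = [vi - coeff * bi for vi, bi in zip(v, b)]
        if dot(v, v) > 1e-12:  # skip signatures that add no new direction
            basis.append(v)
    residual = list(pixel)
    for b in basis:
        coeff = dot(residual, b) / dot(b, b)
        residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
    return residual


def osp_score(pixel, target, undesired):
    """Correlate the target signature with the pixel after suppression."""
    return dot(target, project_out(pixel, undesired))


# Hypothetical three-band spectra: the pixel is 0.3 * target + 0.7 * background.
target = [0.0, 1.0, 0.0]
background = [1.0, 0.0, 0.0]
pixel = [0.7, 0.3, 0.0]
score = osp_score(pixel, target, [background])
```

The raw score is proportional to the abundance of the target, so a pixel made only of background material scores zero under the linear mixing model.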

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The developmental processes and functions of an organism are controlled by the genes and the proteins that are derived from these genes. The identification of key genes and the reconstruction of gene networks can provide a model to help us understand the regulatory mechanisms for the initiation and progression of biological processes or functional abnormalities (e.g. diseases) in living organisms. In this dissertation, I have developed statistical methods to identify the genes and transcription factors (TFs) involved in biological processes, constructed their regulatory networks, and also evaluated some existing association methods to find robust methods for coexpression analyses. Two kinds of data sets were used for this work: genotype data and gene expression microarray data. On the basis of these data sets, this dissertation has two major parts, together forming six chapters. The first part deals with developing association methods for rare variants using genotype data (chapters 4 and 5). The second part deals with developing and/or evaluating statistical methods to identify genes and TFs involved in biological processes, and with the construction of their regulatory networks using gene expression data (chapters 2, 3, and 6). For the first part, I have developed two methods to find the groupwise association of rare variants with given diseases or traits. The first method is based on kernel machine learning and can be applied to both quantitative and qualitative traits. Simulation results showed that the proposed method has improved power over the existing weighted sum method (WS) in most settings. The second method uses multiple phenotypes to select a few top significant genes. It then finds the association of each gene with each phenotype while controlling for population stratification by adjusting the data for ancestry using principal components. This method was applied to GAW 17 data and was able to find several disease risk genes. 
For the second part, I have worked on three problems. The first problem involved the evaluation of eight gene association methods. A very comprehensive comparison of these methods, with further analysis, clearly demonstrates the distinct and common performance of these eight gene association methods. For the second problem, an algorithm named the bottom-up graphical Gaussian model was developed to identify the TFs that regulate pathway genes and reconstruct their hierarchical regulatory networks. This algorithm has produced very significant results, and this is the first report to produce such hierarchical networks for these pathways. The third problem dealt with developing another algorithm, called the top-down graphical Gaussian model, that identifies the network governed by a specific TF. The network produced by the algorithm is proven to be of very high accuracy.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The identification, modeling, and analysis of interactions between nodes of neural systems in the human brain have become the focus of many studies in neuroscience. The complex neural network structure and its correlations with brain functions have played a role in all areas of neuroscience, including the comprehension of cognitive and emotional processing. Indeed, understanding how information is stored, retrieved, processed, and transmitted is one of the ultimate challenges in brain research. In this context, in functional neuroimaging, connectivity analysis is a major tool for the exploration and characterization of the information flow between specialized brain regions. In most functional magnetic resonance imaging (fMRI) studies, connectivity analysis is carried out by first selecting regions of interest (ROI) and then calculating an average BOLD time series (across the voxels in each cluster). Some studies have shown that the average may not be a good choice and have suggested, as an alternative, the use of principal component analysis (PCA) to extract the principal eigen-time series from the ROI(s). In this paper, we introduce a novel approach called cluster Granger analysis (CGA) to study connectivity between ROIs. The main aim of this method was to employ multiple eigen-time series in each ROI to avoid temporal information loss during identification of Granger causality. Such information loss is inherent in averaging (e.g., to yield a single "representative" time series per ROI). This, in turn, may lead to a lack of power in detecting connections. The proposed approach is based on multivariate statistical analysis and integrates PCA and partial canonical correlation in a framework of Granger causality for clusters (sets) of time series. We also describe an algorithm for statistical significance testing based on bootstrapping. 
By using Monte Carlo simulations, we show that the proposed approach outperforms conventional Granger causality analysis (i.e., using representative time series extracted by signal averaging or first principal components estimation from ROIs). The usefulness of the CGA approach in real fMRI data is illustrated in an experiment using human faces expressing emotions. With this data set, the proposed approach suggested the presence of significantly more connections between the ROIs than were detected using a single representative time series in each ROI. (c) 2010 Elsevier Inc. All rights reserved.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper describes a method for analyzing scoliosis trunk deformities using Independent Component Analysis (ICA). Our hypothesis is that ICA can capture the scoliosis deformities visible on the trunk. Unlike Principal Component Analysis (PCA), ICA gives local shape variation and assumes that the data distribution is not normal. 3D torso images of 56 subjects, including 28 patients with adolescent idiopathic scoliosis and 28 healthy subjects, are analyzed using ICA. First, we observe that the independent components capture local scoliosis deformities such as the shoulder variation, the scapula asymmetry and the waist deformation. Second, we note that the different scoliosis curve types are characterized by different combinations of specific independent components.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Multi-factor approaches to analysis of real estate returns have, since the pioneering work of Chan, Hendershott and Sanders (1990), emphasised a macro-variables approach in preference to the latent factor approach that formed the original basis of the arbitrage pricing theory. With increasing use of high frequency data and trading strategies and with a growing emphasis on the risks of extreme events, the macro-variable procedure has some deficiencies. This paper explores a third way, with the use of an alternative to the standard principal components approach: independent components analysis (ICA). ICA seeks higher moment independence and maximises in relation to a chosen risk parameter. We apply ICA, using a kurtosis-maximising algorithm, to weekly US REIT data. The results show that ICA is successful in capturing the kurtosis characteristics of REIT returns, offering possibilities for the development of risk management strategies that are sensitive to extreme events and tail distributions.
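The statistic at the centre of this abstract is kurtosis. As a minimal sketch, the function below computes the excess kurtosis of a sample, the kind of quantity a kurtosis-maximising ICA would evaluate for candidate projections of the return series. It illustrates the statistic only, not the authors' algorithm, and the two samples are made up.

```python
def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment / variance^2 - 3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Hypothetical weekly returns: mostly quiet, with two extreme weeks.
heavy_tailed = [0.0] * 8 + [-4.0, 4.0]
# Hypothetical two-valued series with no extreme observations.
light_tailed = [-1.0, 1.0, -1.0, 1.0]

print(excess_kurtosis(heavy_tailed), excess_kurtosis(light_tailed))
```

Positive excess kurtosis signals heavier tails than a normal distribution, which is why components chosen to maximise it are of interest when extreme events matter.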

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the present research, headspace solid-phase microextraction (HS-SPME) followed by gas chromatography–mass spectrometry (GC–qMS) was evaluated as a reliable and improved alternative to the commonly used liquid–liquid extraction (LLE) technique for establishing the pattern of hydrolytically released components of 7 Vitis vinifera L. grape varieties commonly used to produce the world-famous Madeira wine. Since there are no data available on their glycosidic fractions, as a first step two hydrolysis procedures, acid and enzymatic, were carried out using Boal grapes as the matrix. Several parameters that could influence the hydrolytic process were studied. The best results, expressed as GC peak area, number of identified components and reproducibility, were obtained using ProZym M with β-glucosidase activity at 35 °C for 42 h. For the extraction of hydrolytically released components, the HS-SPME technique was evaluated as a reliable and improved alternative to the conventional extraction technique, LLE (ethyl acetate). HS-SPME using DVB/CAR/PDMS as the fiber coating displayed an extraction capacity twofold higher than LLE (ethyl acetate). The hydrolyzed fraction was mainly characterized by the occurrence of aliphatic and aromatic alcohols, followed by acids, esters, carbonyl compounds, terpenoids, and volatile phenols. The contribution of terpenoids to the total hydrolyzed fraction is highest for Malvasia Cândida (23%) and Malvasia Roxa (13%), and according to previous studies their presence, even at low concentration, is important from a sensorial point of view (they can impart floral notes to the wines) because of their low odor threshold (μg/L). According to the data obtained by principal component analysis (PCA), the sensorial properties of Madeira wines produced from Malvasia Cândida and Malvasia Roxa could be improved by the hydrolysis procedure, since their hydrolyzed fraction is mainly characterized by terpenoids (e.g. linalool, geraniol), which are responsible for floral notes. 
Bual and Sercial grapes are characterized by aromatic alcohols (e.g. benzyl alcohol, 2-phenylethyl alcohol), so an improvement in the sensorial characteristics (citrus, sweet and floral odors) of the corresponding wines, as a result of the hydrolytic process, is expected.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

When searching for prospective novel peptides, it is difficult to determine the biological activity of a peptide based only on its sequence. The trial and error approach is generally laborious, expensive and time consuming due to the large number of different experimental setups required to cover a reasonable number of biological assays. To simulate a virtual model for Hymenoptera insects, 166 peptides were selected from the venoms and hemolymphs of wasps, bees and ants and applied to a mathematical model of multivariate analysis, with nine different chemometric components: GRAVY, aliphaticity index, number of disulfide bonds, total residues, net charge, pI value, Boman index, percentage of alpha helix, and flexibility prediction. Principal component analysis (PCA) with non-linear iterative projections by alternating least-squares (NIPALS) algorithm was performed, without including any information about the biological activity of the peptides. This analysis permitted the grouping of peptides in a way that strongly correlated to the biological function of the peptides. Six different groupings were observed, which seemed to correspond to the following groups: chemotactic peptides, mastoparans, tachykinins, kinins, antibiotic peptides, and a group of long peptides with one or two disulfide bonds and with biological activities that are not yet clearly defined. The partial overlap between the mastoparans group and the chemotactic peptides, tachykinins, kinins and antibiotic peptides in the PCA score plot may be used to explain the frequent reports in the literature about the multifunctionality of some of these peptides. The mathematical model used in the present investigation can be used to predict the biological activities of novel peptides in this system, and it may also be easily applied to other biological systems. © 2011 Elsevier Inc.
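The NIPALS approach to PCA mentioned in this abstract extracts one component at a time by alternating between a loading estimate and a score estimate, then deflating the data. The sketch below is a generic textbook version under stated assumptions: the function name, tolerance and toy data are hypothetical, variables are only mean-centred (not scaled), and the data must have at least as many independent directions as requested components.

```python
from math import sqrt


def nipals_pca(rows, n_components=2, tol=1e-12, max_iter=500):
    """Return (scores, loadings), one list per component, for mean-centred data."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    X = [[r[j] - means[j] for j in range(m)] for r in rows]
    scores, loadings = [], []
    for _ in range(n_components):
        # Start from the column with the largest remaining sum of squares.
        start = max(range(m), key=lambda j: sum(x[j] ** 2 for x in X))
        t = [x[start] for x in X]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            # Loading: regress the columns of X on the current score vector.
            p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
            norm = sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]
            # Score: project the rows of X on the unit-length loading.
            t_new = [sum(X[i][j] * p[j] for j in range(m)) for i in range(n)]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        scores.append(t)
        loadings.append(p)
        # Deflate: remove the extracted component before finding the next one.
        X = [[X[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
    return scores, loadings


# Hypothetical data: four samples described by two variables.
toy = [[2.0, 0.0], [0.0, 1.0], [-2.0, 0.0], [0.0, -1.0]]
toy_scores, toy_loadings = nipals_pca(toy, n_components=2)
```

Plotting the first two score vectors against each other gives the kind of score plot the abstract uses to group peptides; in practice the descriptors would usually be scaled first because they are measured in different units.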

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Piotr Omenzetter and Simon Hoell’s work within the Lloyd’s Register Foundation Centre for Safety and Reliability Engineering at the University of Aberdeen is supported by Lloyd’s Register Foundation. The Foundation helps to protect life and property by supporting engineering-related education, public engagement and the application of research.