928 resultados para improved principal components analysis (IPCA) algorithm
Resumo:
As one of the newest members in the field of articial immune systems (AIS), the Dendritic Cell Algorithm (DCA) is based on behavioural models of natural dendritic cells (DCs). Unlike other AIS, the DCA does not rely on training data, instead domain or expert knowledge is required to predetermine the mapping between input signals from a particular instance to the three categories used by the DCA. This data preprocessing phase has received the criticism of having manually over-fitted the data to the algorithm, which is undesirable. Therefore, in this paper we have attempted to ascertain if it is possible to use principal component analysis (PCA) techniques to automatically categorise input data while still generating useful and accurate classication results. The integrated system is tested with a biometrics dataset for the stress recognition of automobile drivers. The experimental results have shown the application of PCA to the DCA for the purpose of automated data preprocessing is successful.
Resumo:
Three-dimensional spectroscopy techniques are becoming more and more popular, producing an increasing number of large data cubes. The challenge of extracting information from these cubes requires the development of new techniques for data processing and analysis. We apply the recently developed technique of principal component analysis (PCA) tomography to a data cube from the center of the elliptical galaxy NGC 7097 and show that this technique is effective in decomposing the data into physically interpretable information. We find that the first five principal components of our data are associated with distinct physical characteristics. In particular, we detect a low-ionization nuclear-emitting region (LINER) with a weak broad component in the Balmer lines. Two images of the LINER are present in our data, one seen through a disk of gas and dust, and the other after scattering by free electrons and/or dust particles in the ionization cone. Furthermore, we extract the spectrum of the LINER, decontaminated from stellar and extended nebular emission, using only the technique of PCA tomography. We anticipate that the scattered image has polarized light due to its scattered nature.
Resumo:
Aims. A model-independent reconstruction of the cosmic expansion rate is essential to a robust analysis of cosmological observations. Our goal is to demonstrate that current data are able to provide reasonable constraints on the behavior of the Hubble parameter with redshift, independently of any cosmological model or underlying gravity theory. Methods. Using type Ia supernova data, we show that it is possible to analytically calculate the Fisher matrix components in a Hubble parameter analysis without assumptions about the energy content of the Universe. We used a principal component analysis to reconstruct the Hubble parameter as a linear combination of the Fisher matrix eigenvectors (principal components). To suppress the bias introduced by the high redshift behavior of the components, we considered the value of the Hubble parameter at high redshift as a free parameter. We first tested our procedure using a mock sample of type Ia supernova observations, we then applied it to the real data compiled by the Sloan Digital Sky Survey (SDSS) group. Results. In the mock sample analysis, we demonstrate that it is possible to drastically suppress the bias introduced by the high redshift behavior of the principal components. Applying our procedure to the real data, we show that it allows us to determine the behavior of the Hubble parameter with reasonable uncertainty, without introducing any ad-hoc parameterizations. Beyond that, our reconstruction agrees with completely independent measurements of the Hubble parameter obtained from red-envelope galaxies.
Resumo:
In this paper a methodology for integrated multivariate monitoring and control of biological wastewater treatment plants during extreme events is presented. To monitor the process, on-line dynamic principal component analysis (PCA) is performed on the process data to extract the principal components that represent the underlying mechanisms of the process. Fuzzy c-means (FCM) clustering is used to classify the operational state. Performing clustering on scores from PCA solves computational problems as well as increases robustness due to noise attenuation. The class-membership information from FCM is used to derive adequate control set points for the local control loops. The methodology is illustrated by a simulation study of a biological wastewater treatment plant, on which disturbances of various types are imposed. The results show that the methodology can be used to determine and co-ordinate control actions in order to shift the control objective and improve the effluent quality.
Resumo:
OBJECTIVE To analyze the association between concentrations of air pollutants and admissions for respiratory causes in children. METHODS Ecological time series study. Daily figures for hospital admissions of children aged < 6, and daily concentrations of air pollutants (PM10, SO2, NO2, O3 and CO) were analyzed in the Região da Grande Vitória, ES, Southeastern Brazil, from January 2005 to December 2010. For statistical analysis, two techniques were combined: Poisson regression with generalized additive models and principal model component analysis. Those analysis techniques complemented each other and provided more significant estimates in the estimation of relative risk. The models were adjusted for temporal trend, seasonality, day of the week, meteorological factors and autocorrelation. In the final adjustment of the model, it was necessary to include models of the Autoregressive Moving Average Models (p, q) type in the residuals in order to eliminate the autocorrelation structures present in the components. RESULTS For every 10:49 μg/m3 increase (interquartile range) in levels of the pollutant PM10 there was a 3.0% increase in the relative risk estimated using the generalized additive model analysis of main components-seasonal autoregressive – while in the usual generalized additive model, the estimate was 2.0%. CONCLUSIONS Compared to the usual generalized additive model, in general, the proposed aspect of generalized additive model − principal component analysis, showed better results in estimating relative risk and quality of fit.
Resumo:
Understanding the different background landscapes in which malaria transmission occurs is fundamental to understanding malaria epidemiology and to designing effective local malaria control programs. Geology, geomorphology, vegetation, climate, land use, and anopheline distribution were used as a basis for an ecological classification of the state of Roraima, Brazil, in the northern Amazon Basin, focused on the natural history of malaria and transmission. We used unsupervised maximum likelihood classification, principal components analysis, and weighted overlay with equal contribution analyses to fine-scale thematic maps that resulted in clustered regions. We used ecological niche modeling techniques to develop a fine-scale picture of malaria vector distributions in the state. Eight ecoregions were identified and malaria-related aspects are discussed based on this classification, including 5 types of dense tropical rain forest and 3 types of savannah. Ecoregions formed by dense tropical rain forest were named as montane (ecoregion I), submontane (II), plateau (III), lowland (IV), and alluvial (V). Ecoregions formed by savannah were divided into steppe (VI, campos de Roraima), savannah (VII, cerrado), and wetland (VIII, campinarana). Such ecoregional mappings are important tools in integrated malaria control programs that aim to identify specific characteristics of malaria transmission, classify transmission risk, and define priority areas and appropriate interventions. For some areas, extension of these approaches to still-finer resolutions will provide an improved picture of malaria transmission patterns.
Resumo:
In human Population Genetics, routine applications of principal component techniques are oftenrequired. Population biologists make widespread use of certain discrete classifications of humansamples into haplotypes, the monophyletic units of phylogenetic trees constructed from severalsingle nucleotide bimorphisms hierarchically ordered. Compositional frequencies of the haplotypesare recorded within the different samples. Principal component techniques are then required as adimension-reducing strategy to bring the dimension of the problem to a manageable level, say two,to allow for graphical analysis.Population biologists at large are not aware of the special features of compositional data and normally make use of the crude covariance of compositional relative frequencies to construct principalcomponents. In this short note we present our experience with using traditional linear principalcomponents or compositional principal components based on logratios, with reference to a specificdataset
Resumo:
Functional connectivity (FC) as measured by correlation between fMRI BOLD time courses of distinct brain regions has revealed meaningful organization of spontaneous fluctuations in the resting brain. However, an increasing amount of evidence points to non-stationarity of FC; i.e., FC dynamically changes over time reflecting additional and rich information about brain organization, but representing new challenges for analysis and interpretation. Here, we propose a data-driven approach based on principal component analysis (PCA) to reveal hidden patterns of coherent FC dynamics across multiple subjects. We demonstrate the feasibility and relevance of this new approach by examining the differences in dynamic FC between 13 healthy control subjects and 15 minimally disabled relapse-remitting multiple sclerosis patients. We estimated whole-brain dynamic FC of regionally-averaged BOLD activity using sliding time windows. We then used PCA to identify FC patterns, termed "eigenconnectivities", that reflect meaningful patterns in FC fluctuations. We then assessed the contributions of these patterns to the dynamic FC at any given time point and identified a network of connections centered on the default-mode network with altered contribution in patients. Our results complement traditional stationary analyses, and reveal novel insights into brain connectivity dynamics and their modulation in a neurodegenerative disease.
Resumo:
A new drift compensation method based on Common Principal Component Analysis (CPCA) is proposed. The drift variance in data is found as the principal components computed by CPCA. This method finds components that are common for all gasses in feature space. The method is compared in classification task with respect to the other approaches published where the drift direction is estimated through a Principal Component Analysis (PCA) of a reference gas. The proposed new method ¿ employing no specific reference gas, but information from all gases ¿has shown the same performance as the traditional approach with the best-fitted reference gas. Results are shown with data lasting 7-months including three gases at different concentrations for an array of 17 polymeric sensors.
Resumo:
In this paper, we propose a multispectral analysis system using wavelet based Principal Component Analysis (PCA), to improve the brain tissue classification from MRI images. Global transforms like PCA often neglects significant small abnormality details, while dealing with a massive amount of multispectral data. In order to resolve this issue, input dataset is expanded by detail coefficients from multisignal wavelet analysis. Then, PCA is applied on the new dataset to perform feature analysis. Finally, an unsupervised classification with Fuzzy C-Means clustering algorithm is used to measure the improvement in reproducibility and accuracy of the results. A detailed comparative analysis of classified tissues with those from conventional PCA is also carried out. Proposed method yielded good improvement in classification of small abnormalities with high sensitivity/accuracy values, 98.9/98.3, for clinical analysis. Experimental results from synthetic and clinical data recommend the new method as a promising approach in brain tissue analysis.
Resumo:
In human Population Genetics, routine applications of principal component techniques are often required. Population biologists make widespread use of certain discrete classifications of human samples into haplotypes, the monophyletic units of phylogenetic trees constructed from several single nucleotide bimorphisms hierarchically ordered. Compositional frequencies of the haplotypes are recorded within the different samples. Principal component techniques are then required as a dimension-reducing strategy to bring the dimension of the problem to a manageable level, say two, to allow for graphical analysis. Population biologists at large are not aware of the special features of compositional data and normally make use of the crude covariance of compositional relative frequencies to construct principal components. In this short note we present our experience with using traditional linear principal components or compositional principal components based on logratios, with reference to a specific dataset
Resumo:
Constrained principal component analysis (CPCA) with a finite impulse response (FIR) basis set was used to reveal functionally connected networks and their temporal progression over a multistage verbal working memory trial in which memory load was varied. Four components were extracted, and all showed statistically significant sensitivity to the memory load manipulation. Additionally, two of the four components sustained this peak activity, both for approximately 3 s (Components 1 and 4). The functional networks that showed sustained activity were characterized by increased activations in the dorsal anterior cingulate cortex, right dorsolateral prefrontal cortex, and left supramarginal gyrus, and decreased activations in the primary auditory cortex and "default network" regions. The functional networks that did not show sustained activity were instead dominated by increased activation in occipital cortex, dorsal anterior cingulate cortex, sensori-motor cortical regions, and superior parietal cortex. The response shapes suggest that although all four components appear to be invoked at encoding, the two sustained-peak components are likely to be additionally involved in the delay period. Our investigation provides a unique view of the contributions made by a network of brain regions over the course of a multiple-stage working memory trial.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
A total of 61,528 weight records from 22,246 Nellore animals born between 1984 and 2002 were used to compare different multiple-trait analysis methods for birth to mature weights. The following models were used: standard multivarite model (MV), five reduced-rank models fitting the first 1, 2, 3, 4 and 5 genetic principal components, and five models using factor analysis with 1, 2, 3, 4 and 5 factors. Direct additive genetic random effects and residual effects were included in all models. In addition, maternal genetic and maternal permanent environmental effects were included as random effects for birth and weaning weight. The models included contemporary group as fixed effect and age of animal at recording (except for birth weight) and age of dam at calving as linear and quadratic effects (for birth weight and weaning weight). The maternal genetic, maternal permanent environmental and residual (co)variance matrices were assumed to be full rank. According to model selection criteria, the model fitting the three first principal components (PC3) provided the best fit, without the need for factor analysis models. Similar estimates of phenotypic, direct additive and maternal genetic, maternal permanent environmental and residual (co)variances were obtained with models MV and PC3. Direct heritability ranged from 0.21 (birth weight) to 0.45 (weight at 6 years of age). The genetic and phenotypic correlations obtained with model PC3 were slightly higher than those estimated with model MV. In general, the reduced-rank model substantially decreased the number of parameters in the analyses without reducing the goodness-of-fit. © 2013 Elsevier B.V.
Resumo:
Phenotypic data from female Canchim beef cattle were used to obtain estimates of genetic parameters for reproduction and growth traits using a linear animal mixed model. In addition, relationships among animal estimated breeding values (EBVs) for these traits were explored using principal component analysis. The traits studied in female Canchim cattle were age at first calving (AFC), age at second calving (ASC), calving interval (CI), and bodyweight at 420 days of age (BW420). The heritability estimates for AFC, ASC, CI and BW420 were 0.03±0.01, 0.07±0.01, 0.06±0.02, and 0.24±0.02, respectively. The genetic correlations for AFC with ASC, AFC with CI, AFC with BW420, ASC with CI, ASC with BW420, and CI with BW420 were 0.87±0.07, 0.23±0.02, -0.15±0.01, 0.67±0.13, -0.07±0.13, and 0.02±0.14, respectively. Standardised EBVs for AFC, ASC and CI exhibited a high association with the first principal component, whereas the standardised EBV for BW420 was closely associated with the second principal component. The heritability estimates for AFC, ASC and CI suggest that these traits would respond slowly to selection. However, selection response could be enhanced by constructing selection indices based on the principal components. © CSIRO 2013.