914 results for Minor Component Analysis


Relevance: 90.00%

Abstract:

We consider two fundamental properties in the analysis of two-way tables of positive data: the principle of distributional equivalence, one of the cornerstones of correspondence analysis of contingency tables, and the principle of subcompositional coherence, which forms the basis of compositional data analysis. For an analysis to be subcompositionally coherent, it suffices to analyse the ratios of the data values. The usual approach to dimension reduction in compositional data analysis is to perform principal component analysis on the logarithms of ratios, but this method does not obey the principle of distributional equivalence. We show that by introducing weights for the rows and columns, the method achieves this desirable property. This weighted log-ratio analysis is theoretically equivalent to spectral mapping, a multivariate method developed almost 30 years ago for displaying ratio-scale data from biological activity spectra. The close relationship between spectral mapping and correspondence analysis is also explained, as well as their connection with association modelling. The weighted log-ratio methodology is applied here to frequency data in linguistics and to chemical compositional data in archaeology.
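As a rough illustration of the weighted log-ratio idea described above, the following sketch (synthetic data; the row and column masses serve as the weights, as in correspondence analysis) double-centres the log-transformed table with those weights and extracts map coordinates from a weighted SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
N = rng.uniform(1.0, 10.0, size=(6, 4))     # positive two-way table

P = N / N.sum()
r = P.sum(axis=1)                           # row masses, used as row weights
c = P.sum(axis=0)                           # column masses, used as column weights

L = np.log(N)
Lc = L - (L @ c)[:, None]                   # subtract weighted row means
Z = Lc - np.outer(np.ones(len(r)), r @ Lc)  # subtract weighted column means

# weighted SVD: scale rows/columns by the square roots of the masses
S = np.sqrt(r)[:, None] * Z * np.sqrt(c)[None, :]
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

# row principal coordinates for a low-dimensional map
F = (U / np.sqrt(r)[:, None]) * sing
```

Double-centring with the masses is what gives the method both distributional equivalence and subcompositional coherence; the first two columns of `F` would be plotted as the map.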

Relevance: 90.00%

Abstract:

We consider the joint visualization of two matrices which have common rows and columns, for example multivariate data observed at two time points or split according to a dichotomous variable. Methods of interest include principal components analysis for interval-scaled data, or correspondence analysis for frequency data or ratio-scaled variables on commensurate scales. A simple result in matrix algebra shows that by setting up the matrices in a particular block format, matrix sum and difference components can be visualized. The case when we have more than two matrices is also discussed and the methodology is applied to data from the International Social Survey Program.
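The matrix-algebra result mentioned above can be checked numerically: for the block matrix [[X, Y], [Y, X]], the singular values are exactly those of X+Y and X−Y pooled together, so the block format separates sum and difference components (the matrices below are synthetic):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 3))          # e.g. data at time point 1
Y = rng.normal(size=(5, 3))          # same rows and columns at time point 2

B = np.block([[X, Y], [Y, X]])       # the special block format

s_block = np.sort(np.linalg.svd(B, compute_uv=False))
s_sum = np.linalg.svd(X + Y, compute_uv=False)
s_diff = np.linalg.svd(X - Y, compute_uv=False)
s_parts = np.sort(np.concatenate([s_sum, s_diff]))

print(np.allclose(s_block, s_parts))   # True
```

The identity follows from rotating B by the orthogonal matrix (1/√2)[[I, I], [I, −I]] on both sides, which block-diagonalizes it into X+Y and X−Y.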

Relevance: 90.00%

Abstract:

Dual scaling of a subjects-by-objects table of dominance data (preferences, paired comparisons and successive categories data) has been contrasted with correspondence analysis, as if the two techniques were somehow different. In this note we show that dual scaling of dominance data is equivalent to the correspondence analysis of a table which is doubled with respect to subjects. We also show that the results of both methods can be recovered from a principal components analysis of the undoubled dominance table which is centred with respect to subject means.
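A minimal sketch of the doubling construction, on invented rank data: each subject contributes a "positive" and a "negative" row, and the equivalent PCA operates on the subject-centred undoubled table:

```python
import numpy as np

rng = np.random.default_rng(2)
n_subj, n_obj = 8, 5
# each subject ranks the objects 1..n_obj (1 = most preferred)
D = np.array([rng.permutation(n_obj) + 1 for _ in range(n_subj)])

# doubling with respect to subjects: a "positive" and a "negative" row each
pos = n_obj - D                     # large when the object is preferred
neg = D - 1                         # large when the object is dominated
doubled = np.vstack([pos, neg])     # the table analysed by correspondence analysis

# the equivalent PCA operates on the subject-centred (undoubled) table
C = D - D.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(C, full_matrices=False)
```

Each doubled pair of rows sums to the constant n_obj − 1, which is why the doubled table and the subject-centred table carry the same information.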

Relevance: 90.00%

Abstract:

The spatial variability of soil and plant properties exerts great influence on the yield of agricultural crops. This study analyzed the spatial variability of the fertility of a Humic Rhodic Hapludox with Arabica coffee, using principal component analysis, cluster analysis and geostatistics in combination. The experiment was carried out in an area under Coffea arabica L., variety Catucai 20/15 - 479. The soil was sampled at a depth of 0.20 m, at 50 points of a sampling grid. The following chemical properties were determined: P, K+, Ca2+, Mg2+, Na+, S, Al3+, pH, H + Al, SB, t, T, V, m, OM, Na saturation index (SSI), remaining phosphorus (P-rem), and micronutrients (Zn, Fe, Mn, Cu and B). The data were analyzed with descriptive statistics, followed by principal component and cluster analyses. Geostatistics was used to check and quantify the degree of spatial dependence of the properties represented by the principal components. The principal component analysis allowed a dimensional reduction of the problem, providing interpretable components with little information loss. Despite the characteristic information loss of principal component analysis, the combination of this technique with geostatistical analysis was efficient for quantifying and determining the structure of spatial dependence of soil fertility. In general, the availability of soil mineral nutrients was low and the levels of acidity and exchangeable Al were high.
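The workflow described above can be sketched roughly as follows (all values simulated, not the study's soil data): PCA on the property matrix, then an empirical semivariogram of the first component's scores over the sampling grid:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
coords = rng.uniform(0, 100, size=(n, 2))   # the 50 sampling-grid points
X = rng.normal(size=(n, 6))                 # stand-ins for standardized soil properties

# PCA via SVD of the centred data matrix
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ Vt[0]                            # scores on the first principal component
explained = s**2 / (s**2).sum()             # variance share per component

# empirical semivariogram of the PC1 scores over the grid
i, j = np.triu_indices(n, k=1)
d = np.linalg.norm(coords[i] - coords[j], axis=1)   # pairwise distances
sq = 0.5 * (pc1[i] - pc1[j]) ** 2
edges = np.linspace(0.0, d.max() / 2, 6)            # 5 lag bins up to half the range
gamma = [sq[(d >= a) & (d < b)].mean() for a, b in zip(edges[:-1], edges[1:])]
```

Fitting a variogram model to `gamma` would then quantify the degree of spatial dependence of the component scores.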

Relevance: 90.00%

Abstract:

Bovine secretory IgA (SIgA), recently identified in colostrum, was shown to be homologous to human SIgA by immunologic cross-reaction. A quantitative study indicated that bovine SIgA, a minor component of colostrum, is a major immunoglobulin in most other external secretions including saliva, spermatic fluid, lacrimal, nasal and gastrointestinal secretions. SIgA was isolated from saliva. The free form of secretory component was found to be abundant in milk. A normal lactating cow produces about 1.2 g of this protein per day. Two forms of IgA were identified in serum: a normal serum IgA with no secretory antigenic determinant, and a small amount of SIgA. In vitro synthesis of SIgA by the salivary gland was studied by tissue cultures with incorporation of labeled amino acids.

Relevance: 90.00%

Abstract:

The model plant Arabidopsis thaliana was studied in the search for new metabolites involved in wound signalling. Diverse LC approaches were considered in terms of efficiency and analysis time, and a 7-min gradient on a UPLC-TOF-MS system with a short column was chosen for metabolite fingerprinting. This screening step was designed to allow the comparison of a high number of samples over a wide range of time points after stress induction, in positive and negative ionisation modes. After data treatment, clear discrimination was obtained, providing lists of potential stress-induced ions. In a second step, the fingerprinting conditions were transferred to a longer column, providing a higher peak capacity able to demonstrate the presence of isomers among the highlighted compounds.

Relevance: 90.00%

Abstract:

The aim of this work is to study the influence of several analytical parameters on the variability of Raman spectra of paint samples. In the present study, microtome thin sections and direct analysis (no preparation) are considered as sample preparation methods. In order to evaluate their influence on the measurements, a fractional factorial experimental design with seven factors (including the sampling process) is applied, for a total of 32 experiments representing 160 measurements. Once the influence of sample preparation is highlighted, a depth profile of a paint sample is carried out by changing the focusing plane in order to measure the colored layer under a clearcoat. This is undertaken in order to avoid sample preparation such as microtome sectioning. Finally, chemometric treatments such as principal component analysis are applied to the resulting spectra. The findings of this study indicate the importance of sample preparation, or more specifically of surface roughness, on the variability of the measurements on the same sample. Moreover, the depth profile experiment highlights the influence of the refractive index of the upper layer (clearcoat) when measuring through a transparent layer.

Relevance: 90.00%

Abstract:

Recognition by the T-cell receptor (TCR) of immunogenic peptides presented by class I major histocompatibility complexes (MHCs) is the determining event in the specific cellular immune response against virus-infected cells or tumor cells. It is of great interest, therefore, to elucidate the molecular principles upon which the selectivity of a TCR is based. These principles can in turn be used to design therapeutic approaches, such as peptide-based immunotherapies of cancer. In this study, free energy simulation methods are used to analyze the binding free energy difference of a particular TCR (A6) for a wild-type peptide (Tax) and a mutant peptide (Tax P6A), both presented in HLA A2. The computed free energy difference is 2.9 kcal/mol, in good agreement with the experimental value. This agreement makes it possible to use the simulation results to understand the origin of the free energy difference, an understanding not available from the experimental results. A free energy component analysis decomposes the binding free energy difference between the wild-type and mutant peptides into its contributions. Of particular interest is the fact that better solvation of the mutant peptide when bound to the MHC molecule is an important contribution to the greater affinity of the TCR for that peptide. The results identify the residues of the TCR that are important for the selectivity, providing an understanding of the molecular principles that govern the recognition. The possibility of using free energy simulations in designing peptide derivatives for cancer immunotherapy is briefly discussed.

Relevance: 90.00%

Abstract:

This Master's thesis concerns the study of spectral images from the viewpoint of a statistical image model. In the first part of the thesis, the effect of the distributions of the statistical parameters on colours and highlights is examined under different illumination conditions. It was observed that the relationships between the statistical parameters do not depend on the illumination conditions, but do depend on the noise level of the image. It also emerged that high kurtosis may be caused by colour saturation. In addition, a texture-synthesis algorithm based on the statistical spectral model was developed; it achieved good results when the dependency relationships between the statistical parameters held. In the second part of the work, various spectral images were studied using independent component analysis (ICA). The following ICA algorithms were examined: JADE, fixed-point ICA and moment-based ICA, with the emphasis on the quality of the separation. The best separation was achieved with the JADE algorithm, although the differences between the algorithms were not significant. The algorithm divided the image into two independent components, either a highlight and a non-highlight component or a chromatic and an achromatic component. Finally, the relation of kurtosis to image properties such as highlights and colour saturation is discussed. The last part of the thesis proposes possible directions for future research.
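The fixed-point ICA examined in the thesis corresponds to the FastICA algorithm; a minimal separation example on synthetic one-dimensional signals (not the thesis' spectral images) might look like this:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(4)
t = np.linspace(0, 8, 2000)
s1 = np.sign(np.sin(3 * t))              # square-wave source
s2 = np.sin(5 * t)                       # sinusoidal source
S = np.c_[s1, s2]

A = np.array([[1.0, 0.5], [0.4, 1.0]])   # mixing matrix
X = S @ A.T                              # two observed mixtures

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)             # sources recovered up to order, scale and sign
```

For spectral images the same call would be applied to the pixels-by-bands matrix, splitting the image into independent components as described above.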

Relevance: 90.00%

Abstract:

The uncertainty of any analytical determination depends on both analysis and sampling. Uncertainty arising from sampling is usually not controlled, and methods for its evaluation are still little known. Pierre Gy’s sampling theory is currently the most complete theory about sampling, which also takes the design of the sampling equipment into account. Guides dealing with the practical issues of sampling also exist, published by international organizations such as EURACHEM, IUPAC (International Union of Pure and Applied Chemistry) and ISO (International Organization for Standardization). In this work Gy’s sampling theory was applied to several cases, including the analysis of chromite concentration estimated on SEM (Scanning Electron Microscope) images and estimation of the total uncertainty of a drug dissolution procedure. The results clearly show that Gy’s sampling theory can be utilized in both of the above-mentioned cases and that the uncertainties achieved are reliable. Variographic experiments introduced in Gy’s sampling theory are beneficially applied in analyzing the uncertainty of auto-correlated data sets such as industrial process data and environmental discharges. The periodic behaviour of these kinds of processes can be observed by variographic analysis, as well as with fast Fourier transformation and auto-correlation functions. With variographic analysis, the uncertainties are estimated as a function of the sampling interval. This is advantageous when environmental or process data are analyzed, as it can easily be estimated how the sampling interval affects the overall uncertainty. If the sampling frequency is too high, unnecessary resources will be used; on the other hand, if the frequency is too low, the uncertainty of the determination may be unacceptably high. Variographic methods can also be utilized to estimate the uncertainty of spectral data produced by modern instruments. Since spectral data are multivariate, methods such as Principal Component Analysis (PCA) are needed when the data are analyzed. Optimization of a sampling plan increases the reliability of the analytical process, which might in the end have beneficial effects on the economics of chemical analysis.
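A variographic experiment of the kind described above can be sketched on a synthetic auto-correlated series (an AR(1) process standing in for industrial process data); the variogram value at lag j estimates the uncertainty associated with sampling interval j:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):                    # AR(1) process with strong autocorrelation
    x[t] = 0.9 * x[t - 1] + rng.normal()

def variogram(series, max_lag):
    """V(j) = mean of 0.5*(x[t+j] - x[t])^2 over t, for j = 1..max_lag."""
    return np.array([0.5 * np.mean((series[j:] - series[:-j]) ** 2)
                     for j in range(1, max_lag + 1)])

V = variogram(x, 50)
# for an auto-correlated process V rises with the lag: a longer sampling
# interval carries a larger uncertainty, as discussed in the abstract
```

Plotting `V` against the lag is exactly the "uncertainty as a function of the sampling interval" view used to choose a sampling frequency.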

Relevance: 90.00%

Abstract:

Raw measurement data do not always immediately convey useful information, but applying statistical analysis tools to the data can improve the situation. Data analysis offers benefits such as acquiring meaningful insight from the dataset, basing critical decisions on the findings, and ruling out human bias through proper statistical treatment. In this thesis we analyze data from an industrial mineral processing plant with the aim of studying the possibility of forecasting the quality of the final product, given by one variable, with a model based on the other variables. For the study, mathematical tools like Qlucore Omics Explorer (QOE) and Sparse Bayesian regression (SB) are used. Linear regression is then used to build a model based on a subset of variables that have the most significant weights in the SB model. The results obtained from QOE show that the variable representing the desired final product does not correlate with the other variables. For SB and linear regression, the results show that both models built on 1-day averaged data seriously underestimate the variance of the true data, whereas the two models built on 1-month averaged data are reliable and able to explain a larger proportion of the variability in the available data, making them suitable for prediction purposes. However, it is concluded that no single model fits the whole available dataset well; it is therefore proposed as future work to build piecewise non-linear regression models if the same dataset is used, or for the plant to provide another dataset, collected in a more systematic fashion than the present data, for further analysis.
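A minimal sketch of the final modelling step (simulated data; the variable subset is assumed to come from the sparse-regression weights): fit an ordinary linear regression on the selected predictors and compare the variance of the fitted values with that of the response:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
X = rng.normal(size=(n, 5))                    # five candidate process variables
beta = np.array([2.0, -1.0, 0.0, 0.0, 0.5])   # only three matter (unknown in practice)
y = X @ beta + rng.normal(scale=0.5, size=n)  # stand-in for final-product quality

selected = [0, 1, 4]                           # subset with the largest SB weights (assumed)
Xs = np.c_[np.ones(n), X[:, selected]]         # design matrix with intercept
coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
y_hat = Xs @ coef

ratio = y_hat.var() / y.var()                  # an underestimating model has ratio << 1
```

Comparing this variance ratio across models is one simple way to detect the kind of variance underestimation reported for the 1-day averaged data.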

Relevance: 90.00%

Abstract:

The hydroalcoholic extracts prepared from standard leaves of Maytenus ilicifolia and commercial samples of espinheira-santa were evaluated qualitatively (fingerprinting) and quantitatively. In this paper, fingerprint chromatograms coupled with Principal Component Analysis (PCA) are described for the metabolomic analysis of standard and commercial espinheira-santa samples. Epicatechin was used as an external standard for the development and validation of a quantitative method for the analysis of herbal medicines using a photodiode array detector. This method was applied to the quantification of epicatechin in herbal medicines commercialized as espinheira-santa in Brazil and in the standard sample of M. ilicifolia.
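Quantification against an external standard typically reduces to a linear calibration; a hypothetical sketch (invented concentrations and peak areas, not the paper's validation data):

```python
import numpy as np

# standard concentrations (ug/mL) and detector responses -- invented numbers
conc = np.array([5.0, 10.0, 20.0, 40.0, 80.0])
area = np.array([12.1, 24.3, 47.8, 96.0, 191.5])

slope, intercept = np.polyfit(conc, area, 1)   # linear calibration curve

sample_area = 70.4                             # peak area measured for a sample
sample_conc = (sample_area - intercept) / slope
```

Method validation would additionally check linearity, range, precision and accuracy of this calibration before it is applied to commercial samples.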

Relevance: 90.00%

Abstract:

The aim of this study was to investigate the effect of pre-slaughter handling on the occurrence of PSE (Pale, Soft, and Exudative) meat in swine slaughtered at a commercial slaughterhouse located in the metropolitan region of Dourados, Mato Grosso do Sul, Brazil. Based on the database (n=1,832 carcasses), it was possible to apply an integrated multivariate analysis for the purpose of identifying, among the selected variables, those of greatest relevance to this study. Results of the Principal Component Analysis showed that the first five components explained 89.28% of the total variance. In the Factor Analysis, the first factor represented the thermal stress and fatiguing conditions for swine during pre-slaughter handling. In general, this study indicated the importance of the pre-slaughter handling stages, evidencing those of greatest stress and threat to animal welfare and pork quality, which are transport time, resting period, lairage time before unloading, unloading time, and ambience.
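The explained-variance computation behind a statement like "the first five components explained 89.28% of total variance" can be sketched on simulated data (the actual share depends entirely on the dataset):

```python
import numpy as np

rng = np.random.default_rng(9)
X = rng.normal(size=(300, 8)) @ rng.normal(size=(8, 8))  # correlated variables

Xc = (X - X.mean(axis=0)) / X.std(axis=0)                # standardized data
s = np.linalg.svd(Xc, compute_uv=False)
explained = s**2 / (s**2).sum()                          # variance share per PC
cum5 = explained[:5].sum()                               # cumulative share of the first 5
```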

Relevance: 90.00%

Abstract:

This study aimed to develop a methodology based on multivariate statistical analysis of principal components and cluster analysis, in order to identify the most representative variables in studies of minimum streamflow regionalization, and to optimize the identification of hydrologically homogeneous regions for the Doce river basin. Ten variables referring to the climatic and morphometric characteristics of the river basin were used, individualized for each of the 61 gauging stations: three dependent variables indicative of minimum streamflow (Q7,10, Q90 and Q95), and seven independent variables concerning climatic and morphometric characteristics of the basin (total annual rainfall, Pa; total semiannual rainfall of the dry and of the rainy season, Pss and Psc; watershed drainage area, Ad; length of the main river, Lp; total length of the rivers, Lt; and average watershed slope, SL). The results of the principal component analysis pointed out that the variable SL was the least representative for the study, and it was therefore discarded. The most representative independent variables were Ad and Psc. The best divisions into hydrologically homogeneous regions for the three studied flow characteristics were obtained using the Mahalanobis similarity matrix and the complete linkage clustering method. The cluster analysis enabled the identification of four hydrologically homogeneous regions in the Doce river basin.
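The clustering step described above can be sketched with SciPy (station data simulated; the study combines Mahalanobis distances with complete-linkage clustering and obtains four regions):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(7)
X = rng.normal(size=(61, 4))           # simulated station variables (e.g. Pa, Psc, Ad, Lp)

D = pdist(X, metric="mahalanobis")     # pairwise Mahalanobis distances
Z = linkage(D, method="complete")      # complete-linkage hierarchy
regions = fcluster(Z, t=4, criterion="maxclust")   # cut the dendrogram into four groups
```

`pdist` with `metric="mahalanobis"` uses the inverse sample covariance of the data by default, so correlated variables do not dominate the grouping.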

Relevance: 90.00%

Abstract:

Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model, where ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications, where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data are also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space. The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but also has potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
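As a toy illustration of seeking a maximum of a Gaussian kernel density (the thesis' actual method is a trust-region Newton projection onto ridges; mean-shift below is a much simpler fixed-point scheme for the zero-dimensional case, a mode):

```python
import numpy as np

rng = np.random.default_rng(8)
data = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(200, 2))  # one 2-D cluster
h = 0.4                                                      # kernel bandwidth

def mean_shift_step(x, pts, h):
    """One fixed-point step toward a mode of the Gaussian kernel density."""
    w = np.exp(-np.sum((pts - x) ** 2, axis=1) / (2.0 * h * h))
    return (w[:, None] * pts).sum(axis=0) / w.sum()

x = np.array([1.5, -1.0])              # start away from the cluster
for _ in range(100):
    x = mean_shift_step(x, data, h)
# x ends up near the density mode, close to the cluster centre
```

A ridge of dimension one generalizes this: instead of ascending the full gradient, the projection is constrained to the subspace of the smallest Hessian eigenvectors, which is where the Newton machinery of the thesis comes in.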