842 results for MULTIVARIATE FACTORIAL ANALYSIS
Abstract:
This paper describes a geostatistical method, known as factorial kriging analysis, which is well suited for analyzing multivariate spatial information. The method involves multivariate variogram modeling, principal component analysis, and cokriging. It uses several separate correlation structures, each corresponding to a specific spatial scale, and yields a set of regionalized factors summarizing the main features of the data for each spatial scale. This method is applied to an area of high manganese-ore mining activity in Amapa State, North Brazil. Two scales of spatial variation (0.33 and 2.0 km) are identified and interpreted. The results indicate that, for the short-range structure, manganese, arsenic, iron, and cadmium are associated with human activities due to the mining work, while for the long-range structure, the high aluminum, selenium, copper, and lead concentrations seem to be related to the natural environment. At each scale, the correlation structure is analyzed, and regionalized factors are estimated by cokriging and then mapped.
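For orientation, factorial kriging analysis is commonly formulated on the linear model of coregionalization. The sketch below uses assumed notation (p variables, S nested structures, symbols chosen for this note) and is not taken from the paper itself.

```latex
\Gamma(h) = \sum_{u=1}^{S} B_u \, g_u(h), \qquad
B_u = A_u A_u^{\top}, \qquad
Z_i(x) = m_i + \sum_{u=1}^{S} \sum_{k=1}^{p} a_{ik}^{u} \, Y_k^{u}(x)
```

Here \Gamma(h) is the p x p matrix of direct and cross variograms, g_u(h) is the basic variogram structure for spatial scale u (in the abstract, S = 2 with ranges of 0.33 and 2.0 km), B_u is the positive semi-definite coregionalization matrix for that scale, A_u = [a_{ik}^{u}] is obtained from a principal component analysis of B_u, and Y_k^{u}(x) are the mutually uncorrelated regionalized factors that are estimated by cokriging.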
Abstract:
This thesis presents a creative and practical approach to dealing with the problem of selection bias. Selection bias may be the most vexing problem in program evaluation or in any line of research that attempts to assert causality. Some of the greatest minds in economics and statistics have scrutinized the problem of selection bias, and the resulting approaches, Rubin’s Potential Outcome Approach (Rosenbaum and Rubin, 1983; Rubin, 1991, 2001, 2004) or Heckman’s Selection Model (Heckman, 1979), are widely accepted and used as the best fixes. These solutions to the bias that arises in particular from self-selection are imperfect, and many researchers, when feasible, reserve their strongest causal inference for data from experimental rather than observational studies. The innovative aspect of this thesis is to propose a data transformation that allows the presence of selection bias to be measured and tested in an automatic and multivariate way. The approach involves the construction of a multi-dimensional conditional space of the X matrix in which the bias associated with the treatment assignment has been eliminated. Specifically, we propose the use of a partial dependence analysis of the X-space as a tool for investigating the dependence relationship between a set of observable pre-treatment categorical covariates X and a treatment indicator variable T, in order to obtain a measure of bias according to their dependence structure. The measure of selection bias is then expressed in terms of the inertia due to the dependence between X and T that has been eliminated. Given the measure of selection bias, we propose a multivariate test of imbalance to check whether the detected bias is significant, using the asymptotic distribution of the inertia due to T (Estadella et al., 2005) and preserving the multivariate nature of the data.
Further, we propose the use of a clustering procedure as a tool to find groups of comparable units on which to estimate local causal effects, and the use of the multivariate test of imbalance as a stopping rule in choosing the best cluster solution set. The method is nonparametric: it does not call for modeling the data on the basis of some underlying theory or assumption about the selection process, but instead uses the existing variability within the data and lets the data speak. The idea of proposing this multivariate approach to measuring selection bias and testing balance comes from the consideration that in applied research all aspects of multivariate balance not represented in the univariate variable-by-variable summaries are ignored. The first part contains an introduction to evaluation methods as part of public and private decision processes and a review of the literature on evaluation methods. Attention is focused on Rubin’s Potential Outcome Approach, matching methods, and, briefly, Heckman’s Selection Model. The second part focuses on some limitations of conventional methods, with particular attention to the problem of how to test balance correctly. The third part contains the original contribution, a simulation study that checks the performance of the method for a given dependence setting, and an application to a real data set. Finally, we discuss the results, conclude, and outline future perspectives.
Abstract:
In this paper, spatially offset Raman spectroscopy (SORS) is demonstrated for non-invasively investigating the composition of drug mixtures inside an opaque plastic container. The mixtures consisted of three components including a target drug (acetaminophen or phenylephrine hydrochloride) and two diluents (glucose and caffeine). The target drug concentrations ranged from 5% to 100%. After conducting SORS analysis to ascertain the Raman spectra of the concealed mixtures, principal component analysis (PCA) was performed on the SORS spectra to reveal trends within the data. Partial least squares (PLS) regression was used to construct models that predicted the concentration of each target drug, in the presence of the other two diluents. The PLS models were able to predict the concentration of acetaminophen in the validation samples with a root-mean-square error of prediction (RMSEP) of 3.8% and the concentration of phenylephrine hydrochloride with an RMSEP of 4.6%. This work demonstrates the potential of SORS, used in conjunction with multivariate statistical techniques, to perform non-invasive, quantitative analysis on mixtures inside opaque containers. This has applications for pharmaceutical analysis, such as monitoring the degradation of pharmaceutical products on the shelf, in forensic investigations of counterfeit drugs, and for the analysis of illicit drug mixtures which may contain multiple components.
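The RMSEP figures quoted above summarise how far predicted concentrations fall from reference values in a validation set. A minimal sketch of that calculation follows; the function name `rmsep` and the sample values are hypothetical illustrations, not data or code from the study.

```python
import math


def rmsep(reference, predicted):
    """Root-mean-square error of prediction over a validation set."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical validation concentrations in percent; NOT values from the paper.
reference = [5.0, 25.0, 50.0, 75.0, 100.0]
predicted = [8.0, 21.0, 50.0, 79.0, 97.0]
print(round(rmsep(reference, predicted), 2))
```

Because the errors are squared before averaging, a few large misses raise the RMSEP more than many small ones, which is why it is usually reported in the same units as the predicted quantity.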
Abstract:
Concerns regarding groundwater contamination with nitrate and the long-term sustainability of groundwater resources have prompted the development of a multi-layered three-dimensional (3D) geological model to characterise the aquifer geometry of the Wairau Plain, Marlborough District, New Zealand. The 3D geological model, which consists of eight litho-stratigraphic units, has subsequently been used to synthesise hydrogeological and hydrogeochemical data for different aquifers in an approach that aims to demonstrate how integration of water chemistry data within the physical framework of a 3D geological model can help to better understand and conceptualise groundwater systems in complex geological settings. Multivariate statistical techniques (e.g. Principal Component Analysis and Hierarchical Cluster Analysis) were applied to groundwater chemistry data to identify hydrochemical facies which are characteristic of distinct evolutionary pathways and a common hydrologic history of groundwaters. Principal Component Analysis on hydrochemical data demonstrated that natural water-rock interactions, redox potential and human agricultural impact are the key controls of groundwater quality in the Wairau Plain. Hierarchical Cluster Analysis revealed distinct hydrochemical water quality groups in the Wairau Plain groundwater system. Visualisation of the results of the multivariate statistical analyses and distribution of groundwater nitrate concentrations in the context of aquifer lithology highlighted the link between groundwater chemistry and the lithology of host aquifers. The methodology followed in this study can be applied in a variety of hydrogeological settings to synthesise geological, hydrogeological and hydrochemical data and present them in a format readily understood by a wide range of stakeholders. This enables a more efficient communication of the results of scientific studies to the wider community.
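As a rough illustration of the kind of Principal Component Analysis step mentioned above, the sketch below standardises a small table of water chemistry values and extracts the leading component of its correlation matrix by power iteration. It is not the authors' workflow: the function names, the choice of power iteration, and the sample values are all assumptions made for this example.

```python
import math


def standardize(rows):
    """Centre each column and scale it to unit sample standard deviation.

    Assumes every column varies; a constant column would divide by zero.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardised rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_component(corr, iterations=500):
    """Power iteration: leading eigenvalue and unit loading vector.

    The start vector (1, 2, ..., p) is an arbitrary choice that is unlikely
    to be orthogonal to the leading eigenvector.
    """
    p = len(corr)
    v = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cv = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v[i] * cv[i] for i in range(p))
    return eigenvalue, v


# Hypothetical samples (columns: Na, Cl, NO3 in mg/L); NOT data from the study.
samples = [[12.0, 18.0, 1.2], [30.0, 44.0, 0.8], [55.0, 80.0, 6.5],
           [20.0, 31.0, 3.9], [41.0, 60.0, 2.1]]
eigenvalue, loadings = first_component(correlation_matrix(standardize(samples)))
print("share of variance on PC1:", eigenvalue / len(loadings))
print("loadings:", loadings)
```

Standardising first means the components come from the correlation matrix, so ions measured on very different concentration scales contribute comparably. In practice a numerical library would normally be used instead of hand-written power iteration.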
Abstract:
The Clarence-Moreton Basin (CMB) covers approximately 26,000 km² and is the only sub-basin of the Great Artesian Basin (GAB) in which there is flow to both the south-west and the east, although flow to the south-west is predominant. In many parts of the basin, including catchments of the Bremer, Logan and upper Condamine Rivers in southeast Queensland, the Walloon Coal Measures are under exploration for Coal Seam Gas (CSG). In order to assess spatial variations in groundwater flow and hydrochemistry at a basin-wide scale, a 3D hydrogeological model of the Queensland section of the CMB has been developed using GoCAD modelling software. Prior to any large-scale CSG extraction, it is essential to understand the existing hydrochemical character of the different aquifers and to establish any potential linkage. To make effective use of the large amount of existing water chemistry data for assessing hydrochemical evolution within the different lithostratigraphic units, multivariate statistical techniques were employed.
Abstract:
A catchment-scale multivariate statistical analysis of hydrochemistry enabled assessment of interactions between alluvial groundwater and Cressbrook Creek, an intermittent drainage system in southeast Queensland, Australia. Hierarchical cluster analyses and principal component analysis were applied to time-series data to evaluate the hydrochemical evolution of groundwater during periods of extreme drought and severe flooding. A simple three-dimensional geological model was developed to conceptualise the catchment morphology and the stratigraphic framework of the alluvium. The alluvium forms a two-layer system with a basal coarse-grained layer overlain by a clay-rich low-permeability unit. In the upper and middle catchment, alluvial groundwater is chemically similar to streamwater, particularly near the creek (reflected by high HCO3/Cl and K/Na ratios and low salinities), indicating a high degree of connectivity. In the lower catchment, groundwater is more saline with lower HCO3/Cl and K/Na ratios, notably during dry periods. Groundwater salinity substantially decreased following severe flooding in 2011, notably in the lower catchment, confirming that flooding is an important mechanism for both recharge and maintaining groundwater quality. The integrated approach used in this study enabled effective interpretation of hydrological processes and can be applied to a variety of hydrological settings to synthesise and evaluate large hydrochemical datasets.
Abstract:
Several genetic variants are thought to influence white matter (WM) integrity, measured with diffusion tensor imaging (DTI). Voxel-based methods can test genetic associations, but heavy multiple comparisons corrections are required to adjust for searching the whole brain and for all genetic variants analyzed. Thus, genetic associations are hard to detect even in large studies. Using a recently developed multi-SNP analysis, we examined the joint predictive power of a group of 18 cholesterol-related single nucleotide polymorphisms (SNPs) on WM integrity, measured by fractional anisotropy. To boost power, we limited the analysis to brain voxels that showed significant associations with total serum cholesterol levels. From this space, we identified two genes with effects that replicated in individual voxel-wise analyses of the whole brain. Multivariate analyses of genetic variants on a reduced anatomical search space may help to identify the SNPs with the strongest effects on the brain from a broad panel of genes.
Abstract:
The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images, PCA could be used effectively to find histological classes including glassy (hard) and floury (soft) endosperm. PCA illustrated a distinct difference between glassy and floury endosperm along principal component (PC) three on the MatrixNIR and PC two on the sisuChema, with two distinguishable clusters. Subsequently, partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) resulted in a root mean square error of prediction (RMSEP) of 0.18. This was repeated on the MatrixNIR image of the 24 kernels, which also resulted in an RMSEP of 0.18. The sisuChema image yielded an RMSEP of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has real potential for future classification uses.
Abstract:
The purpose of the present study was to use attenuated total reflectance-Fourier transform infrared spectroscopy (ATR-FTIR) and target factor analysis (TFA) to investigate the permeation of model drugs and formulation components through Carbosil® membrane and human skin. Diffusion studies of saturated solutions in 50:50 water/ethanol of methyl paraben (MP), ibuprofen (IBU) and caffeine (CF) were performed on Carbosil® membrane. The spectroscopic data were analysed by target factor analysis, and evolution profiles of the signal for each component (i.e. the drug, water, ethanol and membrane) over time were obtained. Results showed that the data were successfully deconvoluted, as correlations between factors from the data and reference spectra of the components were above 0.8 in all cases. Good reproducibility over three runs for the evolution profiles was obtained. From the evolution profiles it was observed that water diffused better through the Carbosil® membrane than ethanol, confirming the hydrophilic properties of the Carbosil® membrane used. IBU diffused more slowly than MP and CF. The evolution profile of CF was very similar to that of water, probably because of the high solubility of CF in water, indicating that both compounds diffuse concurrently. The second part of the work involved a study of the evolution profiles of the components of a commercial topical gel containing 5% (w/w) of ibuprofen as it permeated through human skin. Although the system was much more complex, the data were still successfully deconvoluted and the different components of the formulation identified, except for benzyl alcohol, which might be attributed to the low concentrations of benzyl alcohol used in topical formulations. (C) 2009 Elsevier B.V. All rights reserved.
Abstract:
This paper introduces the application of linear multivariate statistical techniques, including partial least squares (PLS), canonical correlation analysis (CCA) and reduced rank regression (RRR), into the area of Systems Biology. This new approach aims to extract the important proteins embedded in complex signal transduction pathway models. The analysis is performed on a model of intracellular signalling along the Janus-associated kinases/signal transducers and transcription factors (JAK/STAT) and mitogen activated protein kinases (MAPK) signal transduction pathways in interleukin-6 (IL6) stimulated hepatocytes, which produce signal transducer and activator of transcription factor 3 (STAT3). A region of redundancy within the MAPK pathway that does not affect the STAT3 transcription was identified using CCA. This is the core finding of this analysis and cannot be obtained by inspecting the model by eye. In addition, RRR was found to isolate terms that do not significantly contribute to changes in protein concentrations, while the application of PLS does not provide such a detailed picture by virtue of its construction. This analysis has a similar objective to conventional model reduction techniques with the advantage of maintaining the meaning of the states prior to and after the reduction process. A significant model reduction is performed, with a marginal loss in accuracy, offering a more concise model while maintaining the main influencing factors on the STAT3 transcription. The findings offer a deeper understanding of the reaction terms involved, confirm the relevance of several proteins to the production of Acute Phase Proteins and complement existing findings regarding cross-talk between the two signalling pathways.
Abstract:
Baking and 2-g mixograph analyses were performed for 55 cultivars (19 spring and 36 winter wheat) from various quality classes from the 2002 harvest in Poland. An instrumented 2-g direct-drive mixograph was used to study the mixing characteristics of the wheat cultivars. A number of parameters were extracted automatically from each mixograph trace and correlated with baking volume and flour quality parameters (protein content and high molecular weight glutenin subunit [HMW-GS] composition by SDS-PAGE) using multiple linear regression statistical analysis. Principal component analysis of the mixograph data discriminated between four flour quality classes, and predictions of baking volume were obtained using several selected mixograph parameters, chosen using a best subsets regression routine, giving R² values of 0.862-0.866. In particular, three new spring wheat strains (CHD 502a-c) recently registered in Poland were highly discriminated and predicted to give high baking volume on the basis of two mixograph parameters: peak bandwidth and 10-min bandwidth.
Abstract:
Synoptic climatology relates the atmospheric circulation to the surface environment. The aim of this study is to examine the variability of the surface meteorological patterns that develop under different synoptic-scale categories over a suburban area with complex topography. Multivariate data analysis techniques were applied to a data set of surface meteorological elements. Three principal components related to the thermodynamic status of the surface environment and the two components of the wind speed were found. The variability of the surface flows was related to atmospheric circulation categories by applying Correspondence Analysis. Similar surface thermodynamic fields develop under cyclonic categories, in contrast to the anti-cyclonic category. A strong, steady wind flow characterized by high shear values develops under the cyclonic Closed Low and the anticyclonic H–L categories, in contrast to the variable weak flow under the anticyclonic Open Anticyclone category.