937 results for principal components analysis (PCA) algorithm
Abstract:
In this thesis some multivariate spectroscopic methods for the analysis of solutions are proposed. Spectroscopy and multivariate data analysis form a powerful combination for obtaining both quantitative and qualitative information and it is shown how spectroscopic techniques in combination with chemometric data evaluation can be used to obtain rapid, simple and efficient analytical methods. These spectroscopic methods consisting of spectroscopic analysis, a high level of automation and chemometric data evaluation can lead to analytical methods with a high analytical capacity, and for these methods, the term high-capacity analysis (HCA) is suggested. It is further shown how chemometric evaluation of the multivariate data in chromatographic analyses decreases the need for baseline separation. The thesis is based on six papers and the chemometric tools used are experimental design, principal component analysis (PCA), soft independent modelling of class analogy (SIMCA), partial least squares regression (PLS) and parallel factor analysis (PARAFAC). The analytical techniques utilised are scanning ultraviolet-visible (UV-Vis) spectroscopy, diode array detection (DAD) used in non-column chromatographic diode array UV spectroscopy, high-performance liquid chromatography with diode array detection (HPLC-DAD) and fluorescence spectroscopy. The methods proposed are exemplified in the analysis of pharmaceutical solutions and serum proteins. In Paper I a method is proposed for the determination of the content and identity of the active compound in pharmaceutical solutions by means of UV-Vis spectroscopy, orthogonal signal correction and multivariate calibration with PLS and SIMCA classification. Paper II proposes a new method for the rapid determination of pharmaceutical solutions by the use of non-column chromatographic diode array UV spectroscopy, i.e. a conventional HPLC-DAD system without any chromatographic column connected. 
In Paper III the ability of a control sample of known content and identity to diagnose and correct errors in multivariate predictions is investigated; together with the use of multivariate residuals, this can make it possible to use the same calibration model over time. In Paper IV a method is proposed for simultaneous determination of serum proteins with fluorescence spectroscopy and multivariate calibration. Paper V proposes a method for the determination of chromatographic peak purity by means of PCA of HPLC-DAD data. In Paper VI PARAFAC is applied for the decomposition of DAD data of some partially separated peaks into the pure chromatographic, spectral and concentration profiles.
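The idea behind the peak-purity check in Paper V can be illustrated with a small sketch; it is not the thesis's actual procedure, which the abstract does not detail. A time-by-wavelength DAD matrix from a single compound is approximately rank one, so a second non-negligible singular value hints at a co-eluting species. The peak shapes, positions and the 20% impurity level are made-up values, and `gaussian` and `singular_value_ratio` are hypothetical helpers.

```python
import numpy as np

def gaussian(x, center, width):
    """Unnormalised Gaussian used as a stand-in for elution and spectral profiles."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

def singular_value_ratio(matrix):
    """Ratio of the second to the first singular value of a data matrix."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return s[1] / s[0]

# Hypothetical axes: retention time (min) and wavelength (nm).
time = np.linspace(0.0, 10.0, 200)
wavelength = np.linspace(200.0, 400.0, 100)

# Made-up profiles for a main compound (a) and an impurity (b).
elution_a = gaussian(time, 5.0, 0.5)
elution_b = gaussian(time, 5.4, 0.5)
spectrum_a = gaussian(wavelength, 260.0, 20.0)
spectrum_b = gaussian(wavelength, 300.0, 25.0)

# A pure peak is the outer product of one elution profile and one spectrum.
pure = np.outer(elution_a, spectrum_a)
# An impure peak adds a second, partially overlapping component.
mixed = pure + 0.2 * np.outer(elution_b, spectrum_b)

pure_ratio = singular_value_ratio(pure)
mixed_ratio = singular_value_ratio(mixed)
```

With real data the ratio for a pure peak would sit at the noise level rather than near zero, so a threshold would have to be chosen from replicate measurements.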
Abstract:
This thesis is focused on the metabolomic study of human cancer tissues by ex vivo High Resolution-Magic Angle Spinning (HR-MAS) nuclear magnetic resonance (NMR) spectroscopy. This new technique allows for the acquisition of spectra directly on intact tissues (biopsy or surgery), and it has become very important for integrated metabonomics studies. The objective is to identify metabolites that can be used as markers for the discrimination of the different types of cancer, for the grading, and for the assessment of the evolution of the tumour. Furthermore, an attempt was made to recognize metabolites that, although involved in the metabolism of tumoral tissues in low concentration, can be important modulators of neoplastic proliferation. In addition, NMR data were integrated with statistical techniques in order to obtain semi-quantitative information about the metabolite markers. In the case of gliomas, the NMR study was correlated with gene expression of neoplastic tissues. Chapter 1 begins with a general description of a new “omics” discipline, metabolomics. The study of metabolism can contribute significantly to biomedical research and, ultimately, to clinical medical practice. This rapidly developing discipline involves the study of the metabolome: the total repertoire of small molecules present in cells, tissues, organs, and biological fluids. Metabolomic approaches are becoming increasingly popular in disease diagnosis and will play an important role in improving our understanding of cancer mechanisms. Chapter 2 addresses in more detail the basis of NMR spectroscopy, presenting the new HR-MAS NMR tool, which is gaining importance in the examination of tumour tissues and in the assessment of tumour grade. Some advanced chemometric methods, used in an attempt to enhance the interpretation and quantitative information of the HR-MAS NMR data, are presented in Chapter 3. 
Chemometric methods seem to have a high potential in the study of human diseases, as they permit the extraction of new and relevant information from spectroscopic data, allowing a better interpretation of the results. Chapter 4 reports results obtained from HR-MAS NMR analyses performed on different brain tumours: medulloblastoma, meningiomas and gliomas. The medulloblastoma study is a case report of a primitive neuroectodermal tumor (PNET) localised in the cerebellar region by Magnetic Resonance Imaging (MRI) in a 3-year-old child. In vivo single voxel 1H MRS shows high specificity in detecting the main metabolic alterations in the primitive cerebellar lesion, which consist of very high amounts of the choline-containing compounds and of very low levels of creatine derivatives and N-acetylaspartate. Ex vivo HR-MAS NMR, performed at 9.4 Tesla on the neoplastic specimen collected during surgery, allows the unambiguous identification of several metabolites, giving a more in-depth evaluation of the metabolic pattern of the lesion. The ex vivo HR-MAS NMR spectra show higher detail than that obtained in vivo. In addition, the spectroscopic data appear to correlate with some morphological features of the medulloblastoma. The present study shows that ex vivo HR-MAS 1H NMR is able to strongly improve the clinical possibility of in vivo MRS and can be used in conjunction with in vivo spectroscopy for clinical purposes. Three histological subtypes of meningiomas (meningothelial, fibrous and oncocytic) were analysed both by in vivo and ex vivo MRS experiments. The ex vivo HR-MAS investigations are very helpful for the assignment of the in vivo resonances of human meningiomas and for the validation of the quantification procedure of in vivo MR spectra. By using one- and two-dimensional experiments, several metabolites were identified in different histological subtypes of meningiomas. 
The spectroscopic data confirmed the presence of the typical metabolites of these benign neoplasms and, at the same time, that meningiomas with different morphological characteristics have different metabolic profiles, particularly regarding macromolecules and lipids. The profile of total choline metabolites (tCho) and the expression of the Kennedy pathway genes in biopsies of human gliomas were also investigated using HR-MAS NMR and microfluidic genomic cards. 1H HR-MAS spectra allowed the resolution and relative quantification by LCModel of the resonances from choline (Cho), phosphorylcholine (PC) and glycerophosphorylcholine (GPC), the three main components of the combined tCho peak observed in gliomas by in vivo 1H MR spectroscopy. All glioma biopsies showed an increase in tCho as calculated from the addition of Cho, PC and GPC HR-MAS resonances. However, the increase was consistently derived from augmented GPC in low grade gliomas or increased PC content in high grade gliomas, respectively. This circumstance allowed the unambiguous discrimination of high and low grade gliomas by 1H HR-MAS, which could not be achieved by calculating the tCho/Cr ratio commonly used by in vivo 1H MR spectroscopy. The expression of the genes involved in choline metabolism was investigated in the same biopsies. The present findings offer a convenient procedure to accurately classify glioma grade using 1H HR-MAS, providing in addition the genetic background for the alterations of choline metabolism observed in high and low grade gliomas. Chapter 5 reports the study on human gastrointestinal tract (stomach and colon) neoplasms. Healthy human gastric mucosa and the characteristics of the biochemical profile of human gastric adenocarcinoma, in comparison with that of healthy gastric mucosa, were analyzed using ex vivo HR-MAS NMR. Healthy human mucosa is mainly characterized by the presence of small metabolites (more than 50 identified) and macromolecules. 
The adenocarcinoma spectra were dominated by the presence of signals due to triglycerides, which are usually very low in healthy gastric mucosa. The use of spin-echo experiments enabled us to detect some metabolites in the unhealthy tissues and to determine their variation with respect to the healthy ones. Then, the ex vivo HR-MAS NMR analysis was applied to human gastric tissue to obtain information on the molecular steps involved in gastric carcinogenesis. A microscopic investigation was also carried out in order to identify and locate the lipids in the cellular and extra-cellular environments. The morphological changes detected by transmission (TEM) and scanning (SEM) electron microscopy were correlated with the metabolic profile of gastric mucosa in healthy subjects and in subjects with autoimmune atrophic gastritis (AAG), Helicobacter pylori-related gastritis and adenocarcinoma. These ultrastructural studies of AAG and gastric adenocarcinoma revealed intra- and extra-cellular lipid accumulation associated with severe prenecrotic hypoxia and mitochondrial degeneration. A deep insight into the metabolic profile of human healthy and neoplastic colon tissues was gained using ex vivo HR-MAS NMR spectroscopy in combination with multivariate methods: Principal Component Analysis (PCA) and Partial Least Squares Discriminant Analysis (PLS-DA). The NMR spectra of healthy tissues highlight different metabolic profiles with respect to those of neoplastic and microscopically normal colon specimens (these last obtained at least 15 cm from the adenocarcinoma). Furthermore, metabolic variations are detected not only for neoplastic tissues with different histological diagnosis, but also for those classified as identical by histological analysis. These findings suggest that the same subclass of colon carcinoma is characterized, to a certain degree, by metabolic heterogeneity. 
The statistical multivariate approach applied to the NMR data is crucial in order to find metabolic markers of the neoplastic state of colon tissues, and to correctly classify the samples. Significantly different levels of choline-containing compounds, taurine and myoinositol were observed. Chapter 6 deals with the metabolic profile of normal and tumoral human renal tissues obtained by ex vivo HR-MAS NMR. The spectra of human normal cortex and medulla show the presence of differently distributed osmolytes as markers of physiological renal condition. The marked decrease or disappearance of these metabolites and the high lipid content (triglycerides and cholesteryl esters) are typical of clear cell renal carcinoma (RCC), while papillary RCC is characterized by the absence of lipids and very high amounts of taurine. This research is a contribution to the biochemical classification of renal neoplastic pathologies, especially for RCCs, which can be evaluated by in vivo MRS for clinical purposes. Moreover, these data help to gain a better knowledge of the molecular processes involved in the onset of renal carcinogenesis.
Abstract:
Coastal sand dunes are valuable first of all as a defense against storm waves and saltwater intrusion; moreover, these morphological elements constitute a unique transitional ecosystem between the marine and the terrestrial environment. Research on dune systems has been an important part of coastal science since the last century. Nowadays this branch has assumed even more importance for two reasons: on the one hand, the emergence of brand new technologies, especially those related to Remote Sensing, has broadened the possibilities open to researchers; on the other hand, the intense urbanization of recent years has strongly limited the dunes' potential for development and fragmented what remained from the last century. This is particularly true in the Ravenna area, where industrialization, combined with the tourist economy and intense subsidence, has left only a few residual dune ridges still active. In this work three different foredune ridges along the Ravenna coast have been studied with Laser Scanner technology. This research was not limited to analyzing volume or spatial differences, but also tried to find new ways and new features to monitor this environment. Moreover, the author planned a series of tests to validate data from the Terrestrial Laser Scanner (TLS), with the additional aim of finalizing a methodology to test 3D survey accuracy. Data acquired by TLS were then used, on the one hand, to test some brand new applications, such as the Digital Shoreline Analysis System (DSAS) and Computational Fluid Dynamics (CFD), to prove their efficacy in this field; on the other hand, the author used TLS data to look for correlations with meteorological indexes (Forcing Factors) linked to sea and wind (Fryberger's method), applying statistical tools such as Principal Component Analysis (PCA).
Abstract:
A critical point in the analysis of ground displacement time series is the development of data-driven methods that allow the different sources that generate the observed displacements to be discerned and characterised. A widely used multivariate statistical technique is Principal Component Analysis (PCA), which reduces the dimensionality of the data space while retaining most of the variance of the dataset. However, PCA does not perform well in finding the solution to the so-called Blind Source Separation (BSS) problem, i.e. in recovering and separating the original sources that generated the observed data. This is mainly due to the assumptions on which PCA relies: it looks for a new Euclidean space where the projected data are uncorrelated. Independent Component Analysis (ICA) is a popular technique adopted to approach this problem. However, the independence condition is not easy to impose, and it is often necessary to introduce some approximations. To work around this problem, I use a variational Bayesian ICA (vbICA) method, which models the probability density function (pdf) of each source signal using a mixture of Gaussian distributions. This technique allows for more flexibility in the description of the pdf of the sources, giving a more reliable estimate of them. Here I present the application of the vbICA technique to GPS position time series. First, I use vbICA on synthetic data that simulate a seismic cycle (interseismic + coseismic + postseismic + seasonal + noise) and a volcanic source, and I study the ability of the algorithm to recover the original (known) sources of deformation. Secondly, I apply vbICA to different tectonically active scenarios, such as the 2009 L'Aquila (central Italy) earthquake, the 2012 Emilia (northern Italy) seismic sequence, and the 2006 Guerrero (Mexico) Slow Slip Event (SSE).
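The PCA property described in this abstract, projections that are mutually uncorrelated, can be sketched as follows. This is a minimal illustration on an assumed synthetic data set (a linear trend plus an annual signal mixed into four hypothetical stations); it is not the vbICA method used in the work, and the `pca` helper and the mixing weights are invented for the example.

```python
import numpy as np

def pca(data):
    """PCA via SVD. Rows are epochs, columns are observed series.

    Returns the component scores, the principal directions (rows),
    and the fraction of total variance explained by each component.
    """
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s                       # projections onto the components
    explained = s ** 2 / np.sum(s ** 2)  # variance fraction per component
    return scores, vt, explained

# Hypothetical sources: a linear trend and a seasonal (annual) term.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0, 400)  # time in years
sources = np.column_stack([t, np.sin(2.0 * np.pi * t)])

# Made-up mixing of the two sources into four station series, plus small noise.
mixing = np.array([[1.0, 0.5, -0.3, 0.8],
                   [0.2, 1.0, 0.7, -0.4]])
data = sources @ mixing + 0.01 * rng.standard_normal((t.size, 4))

scores, components, explained = pca(data)

# Covariance of the scores: off-diagonal terms vanish up to rounding error.
score_cov = np.cov(scores, rowvar=False)
off_diagonal = score_cov - np.diag(np.diag(score_cov))
```

The scores are uncorrelated by construction, yet each component is generally still a blend of the trend and the seasonal signal, which is the limitation that motivates ICA-type methods.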
Abstract:
Classical liquid-state high-resolution (HR) NMR spectroscopy has proved a powerful tool in the metabonomic analysis of liquid food samples like fruit juices. In this paper the application of 1H high-resolution magic angle spinning (HR-MAS) NMR spectroscopy to apple tissue is presented, probing its potential for metabonomic studies. The 1H HR-MAS NMR spectra are discussed in terms of the chemical composition of apple tissue and compared to liquid-state NMR spectra of apple juice. Differences indicate that specific metabolic changes are induced by juice preparation. The feasibility of HR-MAS NMR-based multivariate analysis is demonstrated by a study distinguishing three different apple cultivars by principal component analysis (PCA). Preliminary results are shown from subsequent studies comparing three different cultivation methods by means of PCA and partial least squares discriminant analysis (PLS-DA) of the HR-MAS NMR data. The compounds responsible for discriminating organically grown apples are discussed. Finally, an outlook on our ongoing work is given, including a longitudinal study on apples.
Abstract:
Background: Inflammation is implicated in the development of cancer related fatigue (CRF). However, there is limited literature on the mediators of inflammation, namely cytokines and their receptors, associated with clinically significant fatigue and response to treatment. Methods: We reviewed 37 advanced cancer patients with fatigue (≥4/10) who participated in two Randomized Controlled Trials of anti-inflammatory agents (Thalidomide and Dexamethasone) for CRF. Responders showed improvement in FACIT-F subscale at the end of study (Day 15). Baseline patient characteristics and symptoms were assessed by FACIT-F and ESAS; serum cytokine [IL-1β and receptor antagonist (IL-1RA), IL-6, IL-6R, TNF-α and sTNF-R1 and R2, IL-8, IL-10, IL-17] levels were measured by Luminex. Data were analyzed using principal component analysis (PCA) [reporting cumulative variance (variance) for the first four components] to determine their association with fatigue and response to treatment. Results: Females were 54%. Mean (SD) was as follows for age, 61(14); baseline FACIT (F) scores, 21.4(8.6); ESAS Fatigue item, 6.5(1.9); and FACIT-F change, 6.4(9.7); ESAS (fatigue) change, -2 (2.41). Baseline median in pg/mL for IL-6, TNF-α, IL-1β were 31.9; 18.9; 0.55, respectively. Change in IL-6 negatively correlated with change in FACIT-F scores (p=0.02). Baseline CRF (FACIT-F score) was associated with IL-6, IL-6R and IL-17, Variance = 78% whereas IL-10, IL-1RA, TNF-α and IL-1β were associated with improvement of CRF, Variance=74%. Conversely, IL-6 and IL-8 were associated with no improvement or worsening of CRF, Variance= 93%. Conclusions: Change in IL-6 negatively correlated with change in FACIT-F scores. IL-6, IL-6R and IL-17 are associated with CRF while IL-6 and IL-8 were associated with no improvement of CRF. Further studies are warranted to confirm our findings.
Abstract:
OBJECTIVES Molecular subclassification of non-small-cell lung cancer (NSCLC) is essential to improve clinical outcome. This study assessed the prognostic and predictive value of circulating micro-RNA (miRNA) in patients with non-squamous NSCLC enrolled in the phase II SAKK (Swiss Group for Clinical Cancer Research) trial 19/05, receiving uniform treatment with first-line bevacizumab and erlotinib followed by platinum-based chemotherapy at progression. MATERIALS AND METHODS Fifty patients with baseline and 24 h blood samples were included from SAKK 19/05. The primary study endpoint was to identify prognostic (overall survival, OS) miRNAs. Patient samples were analyzed with Agilent human miRNA 8x60K microarrays, each glass slide formatted with eight high-definition 60K arrays. Each array contained 40 probes targeting each of the 1347 miRNAs. Data preprocessing included quantile normalization using the robust multi-array average (RMA) algorithm. Prognostic and predictive miRNA expression profiles were identified by Spearman's rank correlation test (percentage tumor shrinkage) or log-rank testing (for time-to-event endpoints). RESULTS Data preprocessing kept 49 patients and 424 miRNAs for further analysis. Ten miRNAs were significantly associated with OS, with hsa-miR-29a being the strongest prognostic marker (HR=6.44, 95%-CI 2.39-17.33). Patients with high hsa-miR-29a expression had a significantly lower survival at 10 months compared to patients with a low expression (54% versus 83%). Six out of the 10 miRNAs (hsa-miR-29a, hsa-miR-542-5p, hsa-miR-502-3p, hsa-miR-376a, hsa-miR-500a, hsa-miR-424) were insensitive to perturbations according to jackknife cross-validation on their HR for OS. The respective principal component analysis (PCA) defined a meta-miRNA signature including the same 6 miRNAs, resulting in a HR of 0.66 (95%-CI 0.53-0.82). 
CONCLUSION Cell-free circulating miRNA-profiling successfully identified a highly prognostic 6-gene signature in patients with advanced non-squamous NSCLC. Circulating miRNA profiling should further be validated in external cohorts for the selection and monitoring of systemic treatment in patients with advanced NSCLC.
Abstract:
Over the past few decades, the advantages of the visible-near infra-red (VisNIR) diffuse reflectance spectrometer (DRS) method have enabled prediction of soil organic carbon (SOC). In this study, SOC was predicted using regression models for samples taken from three sites (Gununo, Maybar and Anjeni) in Ethiopia. SOC was characterized in the laboratory using conventional wet chemistry and VisNIR-DRS methods. Principal component analysis (PCA), principal component regression (PCR) and partial least squares regression (PLS) models were developed using Unscrambler X 10.2. PCA results show that the first two components accounted for a minimum of 96% variation which increased for individual sites and with data treatments. Correlation (r), coefficient of determination (R2) and residual prediction deviation (RPD) were used to rate the four models built. PLS model (r, R2, RPD) values for Anjeni were 0.9, 0.9 and 3.6; for Gununo values 0.6, 0.3 and 1.2; for Maybar values 0.6, 0.3 and 0.9, and for the three sites values 0.7, 0.6 and 1.5, respectively. PCR model values (r, R2, RPD) for Anjeni were 0.9, 0.8 and 2.7; for Gununo values 0.5, 0.3 and 1; for Maybar values 0.5, 0.1 and 0.7, and for the three sites values 0.7, 0.5 and 1.2, respectively. Comparison and testing of models shows superior performance of PLS to PCR. Models were rated as very poor (Maybar), poor (Gununo and three sites) and excellent (Anjeni). A robust model, Anjeni, is recommended for prediction of SOC in Ethiopia.
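The three rating statistics named in this abstract can be computed as in the sketch below. The abstract does not state which formulas were used, so the definitions here are assumptions: Pearson correlation for r, one minus the ratio of residual to total sum of squares for R2, and the standard deviation of the reference values divided by the root mean square error of prediction for RPD. The function name and the sample values are hypothetical.

```python
import math
import statistics

def prediction_metrics(reference, predicted):
    """Return (r, r2, rpd) for paired reference and predicted values.

    Assumed definitions:
      r   : Pearson correlation coefficient
      r2  : 1 - SS_residual / SS_total
      rpd : sample standard deviation of reference / RMSE of prediction
    """
    n = len(reference)
    mean_ref = statistics.fmean(reference)
    mean_pred = statistics.fmean(predicted)

    sxy = sum((x - mean_ref) * (y - mean_pred) for x, y in zip(reference, predicted))
    sxx = sum((x - mean_ref) ** 2 for x in reference)
    syy = sum((y - mean_pred) ** 2 for y in predicted)
    ss_res = sum((x - y) ** 2 for x, y in zip(reference, predicted))

    r = sxy / math.sqrt(sxx * syy)
    r2 = 1.0 - ss_res / sxx
    rmse = math.sqrt(ss_res / n)
    rpd = statistics.stdev(reference) / rmse
    return r, r2, rpd

# Made-up SOC values (percent), for illustration only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.1]
r, r2, rpd = prediction_metrics(reference, predicted)
```

Under these definitions R2 can fall below the square of r, or even below zero, when predictions are biased, so values of r and R2 reported together need not satisfy R2 = r².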
Abstract:
• Premise of the study: Isometric and allometric scaling of a conserved floral plan could provide a parsimonious mechanism for rapid and reversible transitions between breeding systems. This scaling may occur during transitions between predominant autogamy and xenogamy, contributing to the maintenance of a stable mixed mating system. • Methods: We compared nine disjunct populations of the polytypic, mixed mating species Oenothera flava (Onagraceae) to two parapatric relatives, the obligately xenogamous species O. acutissima and the mixed mating species O. triloba. We compared floral morphology of all taxa using principal component analysis (PCA) and developmental trajectories of floral organs using ANCOVA homogeneity of slopes. • Key results: The PCA revealed both isometric and allometric scaling of a conserved floral plan. Three principal components (PCs) explained 92.5% of the variation in the three species. PC1 predominantly loaded on measures of floral size and accounts for 36% of the variation. PC2 accounted for 35% of the variation, predominantly in traits that influence pollinator handling. PC3 accounted for 22% of the variation, primarily in anther–stigma distance (herkogamy). During O. flava subsp. taraxacoides development, style elongation was accelerated relative to anthers, resulting in positive herkogamy. During O. flava subsp. flava development, style elongation was decelerated, resulting in zero or negative herkogamy. Of the two populations with intermediate morphology, style elongation was accelerated in one population and decelerated in the other. • Conclusions: Isometric and allometric scaling of floral organs in North American Oenothera section Lavauxia drive variation in breeding system. Multiple developmental paths to intermediate phenotypes support the likelihood of multiple mating system transitions.
Abstract:
Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothened t-test, can better identify differentially expressed genes, especially when the number of replicates and mean fold change are low. The analysis of the simulations also showed that overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. 
Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots.
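For readers unfamiliar with the Moffat function, the sketch below evaluates the standard circular profile from stellar photometry, I(r) = A (1 + (r/α)²)^(−β), together with its full width at half maximum. The dissertation's exact parameterisation and fitting procedure are not given in the abstract, so the elliptical variant shown here (separate widths along two fixed axes) is only one assumed way of adapting the shape, and all parameter values are made up.

```python
import math

def moffat(r, amplitude, alpha, beta):
    """Circular Moffat profile: amplitude * (1 + (r / alpha)^2)^(-beta)."""
    return amplitude * (1.0 + (r / alpha) ** 2) ** (-beta)

def moffat_fwhm(alpha, beta):
    """Full width at half maximum of the circular Moffat profile."""
    return 2.0 * alpha * math.sqrt(2.0 ** (1.0 / beta) - 1.0)

def moffat_elliptical(x, y, amplitude, alpha_x, alpha_y, beta):
    """Axis-aligned elliptical variant (an assumed form) with separate widths in x and y."""
    return amplitude * (1.0 + (x / alpha_x) ** 2 + (y / alpha_y) ** 2) ** (-beta)

# Hypothetical spot parameters, chosen only for illustration.
amplitude, alpha, beta = 1000.0, 3.0, 2.5

peak = moffat(0.0, amplitude, alpha, beta)
half_radius = moffat_fwhm(alpha, beta) / 2.0
half = moffat(half_radius, amplitude, alpha, beta)

# With equal widths the elliptical form reduces to the circular one.
circular_value = moffat(math.hypot(1.0, 2.0), amplitude, alpha, beta)
elliptical_value = moffat_elliptical(1.0, 2.0, amplitude, alpha, alpha, beta)
```

The exponent β controls how heavy the wings of the profile are, which is the flexibility the abstract refers to when comparing the model with fixed-circle segmentation.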
Abstract:
Longitudinal principal components analyses on a combination of four subcutaneous skinfolds (biceps, triceps, subscapular and suprailiac) were performed using data from the London Longitudinal Growth Study. The main objectives were to discover at what age during growth sex differences in body fat distribution occur and to see if there is continuity in body fatness and body fat distribution from childhood into the adult status (18 years). The analyses were done for four age sectors (3mon-3yrs, 3yrs-8yrs, 8yrs-18yrs and 3yrs-18yrs). Longitudinal principal component one (LPC1) for each age interval in both sexes represents the population mean fat curve. Component two (LPC2) is a velocity of fatness component. Component three (LPC3) in the 3mon-3yrs age sector represents infant fat wave in both sexes. In the next two age sectors component three in males represents peaks and shifts in fat growth (change in velocity), while in females it represents body fat distribution. Component four (LPC4) in the same two age sectors is a reversal in the sexes of the patterns seen for component three, i.e., in males it is body fat distribution and in females velocity shifts. Components five and above represent more complicated patterns of change (multiple increases and decreases across the age interval). In both sexes there is strong tracking in fatness from middle childhood to adolescence. In males only there is also a low to moderate tracking of infant fat with middle to late childhood fat. These data are strongly supported in the literature. Several factors are known to predict adult fatness among the most important being previous levels of fatness (at earlier ages) and the age at rebound. In addition we found that the velocity of fat change in middle childhood was highly predictive of later fatness ($r \approx -0.7$), even more so than age at rebound ($r \approx -0.5$). 
In contrast to fatness (LPC1), body fat distribution (LPC3-LPC4) did not track well even though significant components of body fat distribution occur at each age. Tracking of body fat distribution was higher in females than males. Sex differences in body fat distribution are nonexistent. Some sex differences are evident with the peripheral-to-central ratios after age 14 years.