451 resultados para PLS-DA
Resumo:
The aim of this work is to present a tutorial on Multivariate Calibration, a tool which is nowadays necessary in basically most laboratories but very often misused. The basic concepts of preprocessing, principal component analysis (PCA), principal component regression (PCR) and partial least squares (PLS) are given. The two basic steps on any calibration procedure: model building and validation are fully discussed. The concepts of cross validation (to determine the number of factors to be used in the model), leverage and studentized residuals (to detect outliers) for the validation step are given. The whole calibration procedure is illustrated using spectra recorded for ternary mixtures of 2,4,6 trinitrophenolate, 2,4 dinitrophenolate and 2,5 dinitrophenolate followed by the concentration prediction of these three chemical species during a diffusion experiment through a hydrophobic liquid membrane. MATLAB software is used for numerical calculations. Most of the commands for the analysis are provided in order to allow a non-specialist to follow step by step the analysis.
Resumo:
One of the major interests in soil analysis is the evaluation of its chemical, physical and biological parameters, which are indicators of soil quality (the most important is the organic matter). Besides there is a great interest in the study of humic substances and on the assessment of pollutants, such as pesticides and heavy metals, in soils. Chemometrics is a powerful tool to deal with these problems and can help soil researchers to extract much more information from their data. In spite of this, the presence of these kinds of strategies in the literature has obtained projection only recently. The utilization of chemometric methods in soil analysis is evaluated in this article. The applications will be divided in four parts (with emphasis in the first two): (i) descriptive and exploratory methods based on Principal Component Analysis (PCA); (ii) multivariate calibration methods (MLR, PCR and PLS); (iii) methods such as Evolving Factor Analysis and SIMPLISMA; and (iv) artificial intelligence methods, such as Artificial Neural Networks.
Resumo:
Genetic algorithm was used for variable selection in simultaneous determination of mixtures of glucose, maltose and fructose by mid infrared spectroscopy. Different models, using partial least squares (PLS) and multiple linear regression (MLR) with and without data pre-processing, were used. Based on the results obtained, it was verified that a simpler model (multiple linear regression with variable selection by genetic algorithm) produces results comparable to more complex methods (partial least squares). The relative errors obtained for the best model was around 3% for the sugar determination, which is acceptable for this kind of determination.
Resumo:
Rosin is a natural product from pine forests and it is used as a raw material in resinate syntheses. Resinates are polyvalent metal salts of rosin acids and especially Ca- and Ca/Mg- resinates find wide application in the printing ink industry. In this thesis, analytical methods were applied to increase general knowledge of resinate chemistry and the reaction kinetics was studied in order to model the non linear solution viscosity increase during resinate syntheses by the fusion method. Solution viscosity in toluene is an important quality factor for resinates to be used in printing inks. The concept of critical resinate concentration, c crit, was introduced to define an abrupt change in viscosity dependence on resinate concentration in the solution. The concept was then used to explain the non-inear solution viscosity increase during resinate syntheses. A semi empirical model with two estimated parameters was derived for the viscosity increase on the basis of apparent reaction kinetics. The model was used to control the viscosity and to predict the total reaction time of the resinate process. The kinetic data from the complex reaction media was obtained by acid value titration and by FTIR spectroscopic analyses using a conventional calibration method to measure the resinate concentration and the concentration of free rosin acids. A multivariate calibration method was successfully applied to make partial least square (PLS) models for monitoring acid value and solution viscosity in both mid-infrared (MIR) and near infrared (NIR) regions during the syntheses. The calibration models can be used for on line resinate process monitoring. In kinetic studies, two main reaction steps were observed during the syntheses. First a fast irreversible resination reaction occurs at 235 °C and then a slow thermal decarboxylation of rosin acids starts to take place at 265 °C. Rosin oil is formed during the decarboxylation reaction step causing significant mass loss as the rosin oil evaporates from the system while the viscosity increases to the target level. The mass balance of the syntheses was determined based on the resinate concentration increase during the decarboxylation reaction step. A mechanistic study of the decarboxylation reaction was based on the observation that resinate molecules are partly solvated by rosin acids during the syntheses. Different decarboxylation mechanisms were proposed for the free and solvating rosin acids. The deduced kinetic model supported the analytical data of the syntheses in a wide resinate concentration region, over a wide range of viscosity values and at different reaction temperatures. In addition, the application of the kinetic model to the modified resinate syntheses gave a good fit. A novel synthesis method with the addition of decarboxylated rosin (i.e. rosin oil) to the reaction mixture was introduced. The conversion of rosin acid to resinate was increased to the level necessary to obtain the target viscosity for the product at 235 °C. Due to a lower reaction temperature than in traditional fusion synthesis at 265 °C, thermal decarboxylation is avoided. As a consequence, the mass yield of the resinate syntheses can be increased from ca. 70% to almost 100% by recycling the added rosin oil.
Resumo:
A model based on chemical structure was developed for the accurate prediction of octanol/water partition coefficient (K OW) of polychlorinated biphenyls (PCBs), which are molecules of environmental interest. Partial least squares (PLS) was used to build the regression model. Topological indices were used as molecular descriptors. Variable selection was performed by Hierarchical Cluster Analysis (HCA). In the modeling process, the experimental K OW measured for 30 PCBs by thin-layer chromatography - retention time (TLC-RT) has been used. The developed model (Q² = 0,990 and r² = 0,994) was used to estimate the log K OW values for the 179 PCB congeners whose K OW data have not yet been measured by TLC-RT method. The results showed that topological indices can be very useful to predict the K OW.
Resumo:
Dilutions of methylmetacrylate ranging between 1 and 50 ppm were obtained from a stock solution of 1 ml of monomer in 100 ml of deionised water, and were analyzed by an absorption spectrophotometer in the UV-visible. Absorbance values were used to develop a calibration model based on the PLS, with the aim to determine new sample concentrations. The number of latent variables used was 6, with the standard errors of calibration and prediction found to be 0,048 ml/100 ml and 0,058 ml/100 ml. The calibration model was successfully used to calculate the concentration of monomer released in water, where complete dentures were kept for one hour after polymerization.
Resumo:
A method is presented for the choice of spectral regions when absorption measurements are coupled to chemometric tools to perform quantitative analyses. The method is based on the spectral distribution of the relative standard deviation of concentration (s c/c). It has been applied to the development of PLS-FTNIR calibration models for the determination of density and MON of gasoline, and ethanol content and density of ethanol fuel. The new method was also compared with the correlation (R²) method and has proved to generate PLS calibration models that present better accuracy and precision than those based on R².
Resumo:
The application of analytical procedures based on multivariate calibration models has been limited in several areas due to requirements of validation and certification of the model. Procedures for validation are presented based on the determination of figures of merit, such as precision (mean, repeatability, intermediate), accuracy, sensitivity, analytical sensitivity, selectivity, signal-to-noise ratio and confidence intervals for PLS models. An example is discussed of a model for polymorphic purity control of carbamazepine by NIR diffuse reflectance spectroscopy. The results show that multivariate calibration models can be validated to fulfill the requirements imposed by industry and standardization agencies.
Resumo:
A simple method was proposed for determination of paracetamol and ibuprofen in tablets, based on UV measurements and partial least squares. The procedure was performed at pH 10.5, in the concentration ranges 3.00-15.00 µg ml-1 (paracetamol) and 2.40-12.00 µg ml-1 (ibuprofen). The model was able to predict paracetamol and ibuprofen in synthetic mixtures with root mean squares errors of prediction of 0.12 and 0.17 µg ml-1, respectively. Figures of merit (sensitivity, limit of detection and precision) were also estimated. The results achieved for the determination of these drugs in pharmaceutical formulations were in agreement with label claims and verified by HPLC.
Resumo:
In this work, the artificial neural networks (ANN) and partial least squares (PLS) regression were applied to UV spectral data for quantitative determination of thiamin hydrochloride (VB1), riboflavin phosphate (VB2), pyridoxine hydrochloride (VB6) and nicotinamide (VPP) in pharmaceutical samples. For calibration purposes, commercial samples in 0.2 mol L-1 acetate buffer (pH 4.0) were employed as standards. The concentration ranges used in the calibration step were: 0.1 - 7.5 mg L-1 for VB1, 0.1 - 3.0 mg L-1 for VB2, 0.1 - 3.0 mg L-1 for VB6 and 0.4 - 30.0 mg L-1 for VPP. From the results it is possible to verify that both methods can be successfully applied for these determinations. The similar error values were obtained by using neural network or PLS methods. The proposed methodology is simple, rapid and can be easily used in quality control laboratories.
Resumo:
Recent years have produced great advances in the instrumentation technology. The amount of available data has been increasing due to the simplicity, speed and accuracy of current spectroscopic instruments. Most of these data are, however, meaningless without a proper analysis. This has been one of the reasons for the overgrowing success of multivariate handling of such data. Industrial data is commonly not designed data; in other words, there is no exact experimental design, but rather the data have been collected as a routine procedure during an industrial process. This makes certain demands on the multivariate modeling, as the selection of samples and variables can have an enormous effect. Common approaches in the modeling of industrial data are PCA (principal component analysis) and PLS (projection to latent structures or partial least squares) but there are also other methods that should be considered. The more advanced methods include multi block modeling and nonlinear modeling. In this thesis it is shown that the results of data analysis vary according to the modeling approach used, thus making the selection of the modeling approach dependent on the purpose of the model. If the model is intended to provide accurate predictions, the approach should be different than in the case where the purpose of modeling is mostly to obtain information about the variables and the process. For industrial applicability it is essential that the methods are robust and sufficiently simple to apply. In this way the methods and the results can be compared and an approach selected that is suitable for the intended purpose. Differences in data analysis methods are compared with data from different fields of industry in this thesis. In the first two papers, the multi block method is considered for data originating from the oil and fertilizer industries. The results are compared to those from PLS and priority PLS. The third paper considers applicability of multivariate models to process control for a reactive crystallization process. In the fourth paper, nonlinear modeling is examined with a data set from the oil industry. The response has a nonlinear relation to the descriptor matrix, and the results are compared between linear modeling, polynomial PLS and nonlinear modeling using nonlinear score vectors.
Resumo:
Diffuse reflectance near-infrared (DR-NIR) spectroscopy associated with partial least squares (PLS) multivariate calibration is proposed for a direct, non-destructive, determination of total nitrogen in wheat leaves. The procedure was developed for an Analytical Instrumental Analysis course, carried out at the Institute of Chemistry of the State University of Campinas. The DR-NIR results are in good agreement with those obtained by the Kjeldhal standard procedure, with a relative error of less than ± 3% and the method may be used for teaching purposes as well as for routine analysis.
Resumo:
Cooling crystallization is one of the most important purification and separation techniques in the chemical and pharmaceutical industry. The product of the cooling crystallization process is always a suspension that contains both the mother liquor and the product crystals, and therefore the first process step following crystallization is usually solid-liquid separation. The properties of the produced crystals, such as their size and shape, can be affected by modifying the conditions during the crystallization process. The filtration characteristics of solid/liquid suspensions, on the other hand, are strongly influenced by the particle properties, as well as the properties of the liquid phase. It is thus obvious that the effect of the changes made to the crystallization parameters can also be seen in the course of the filtration process. Although the relationship between crystallization and filtration is widely recognized, the number of publications where these unit operations have been considered in the same context seems to be surprisingly small. This thesis explores the influence of different crystallization parameters in an unseeded batch cooling crystallization process on the external appearance of the product crystals and on the pressure filtration characteristics of the obtained product suspensions. Crystallization experiments are performed by crystallizing sulphathiazole (C9H9N3O2S2), which is a wellknown antibiotic agent, from different mixtures of water and n-propanol in an unseeded batch crystallizer. The different crystallization parameters that are studied are the composition of the solvent, the cooling rate during the crystallization experiments carried out by using a constant cooling rate throughout the whole batch, the cooling profile, as well as the mixing intensity during the batch. The obtained crystals are characterized by using an automated image analyzer and the crystals are separated from the solvent through constant pressure batch filtration experiments. Separation characteristics of the suspensions are described by means of average specific cake resistance and average filter cake porosity, and the compressibilities of the cakes are also determined. The results show that fairly large differences can be observed between the size and shape of the crystals, and it is also shown experimentally that the changes in the crystal size and shape have a direct impact on the pressure filtration characteristics of the crystal suspensions. The experimental results are utilized to create a procedure that can be used for estimating the filtration characteristics of solid-liquid suspensions according to the particle size and shape data obtained by image analysis. Multilinear partial least squares regression (N-PLS) models are created between the filtration parameters and the particle size and shape data, and the results presented in this thesis show that relatively obvious correlations can be detected with the obtained models.
Resumo:
Two spectrophotometric methods are described for the simultaneous determination of ezetimibe (EZE) and simvastatin (SIM) in pharmaceutical preparations. The obtained data was evaluated by using two different chemometric techniques, Principal Component Regression (PCR) and Partial Least-Squares (PLS-1). In these techniques, the concentration data matrix was prepared by using the mixtures containing these drugs in methanol. The absorbance data matrix corresponding to the concentration data matrix was obtained by the measurements of absorbances in the range of 240 - 300 nm in the intervals with Δλ = 1 nm at 61 wavelengths in their zero order spectra, then, calibration or regression was obtained by using the absorbance data matrix and concentration data matrix for the prediction of the unknown concentrations of EZE and SIM in their mixture. The procedure did not require any separation step. The linear range was found to be 5 - 20 µg mL-1 for EZE and SIM in both methods. The accuracy and precision of the methods were assessed. These methods were successfully applied to a pharmaceutical preparation, tablet; and the results were compared with each other.
Resumo:
In this work an analytical methodology for the determination of relevant physicochemical parameters of prato cheese is reported, using infrared spectroscopy (DRIFT) and partial least squares regression (PLS). Several multivariate models were developed, using different spectral regions and preprocessing routines. In general, good precision and accuracy was observed for all studied parameters (fat, protein, moisture, total solids, ashes and pH) with standard deviations comparable with those provided by the conventional methodologies. The implantation of this multivariate routine involves significant analytical advantages, including reduction of cost and time of analysis, minimization of human errors, and elimination of chemical residues.