916 resultados para Partial least square regression


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to a Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a power of resolution of ca. 500,000 and a mass accuracy less than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H](-) ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g(-1). To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the dimension of the variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g(-1). By reducing the size of the data, it was possible to relate the selected variables with their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A quantitative structure-activity relationship (QSAR) study of 19 quinone compounds with trypanocidal activity was performed by Partial Least Squares (PLS) and Principal Component Regression (PCR) methods with the use of leave-one-out crossvalidation procedure to build the regression models. The trypanocidal activity of the compounds is related to their first cathodic potential (Ep(c1)). The regression PLS and PCR models built in this study were also used to predict the Ep(c1) of six new quinone compounds. The PLS model was built with three principal components that described 96.50% of the total variance and present Q(2) = 0.83 and R-2 = 0.90. The results obtained with the PCR model were similar to those obtained with the PLS model. The PCR model was also built with three principal components that described 96.67% of the total variance with Q(2) = 0.83 and R-2 = 0.90. The most important descriptors for our PLS and PCR models were HOMO-1 (energy of the molecular orbital below HOMO), Q4 (atomic charge at position 4), MAXDN (maximal electrotopological negative difference), and HYF (hydrophilicity index).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm(2); 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by Partial Least Square (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produce the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively. Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample representativeness, and minimizing the effect of the presence of contaminants.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dimensionality reduction is employed for visual data analysis as a way to obtaining reduced spaces for high dimensional data or to mapping data directly into 2D or 3D spaces. Although techniques have evolved to improve data segregation on reduced or visual spaces, they have limited capabilities for adjusting the results according to user's knowledge. In this paper, we propose a novel approach to handling both dimensionality reduction and visualization of high dimensional data, taking into account user's input. It employs Partial Least Squares (PLS), a statistical tool to perform retrieval of latent spaces focusing on the discriminability of the data. The method employs a training set for building a highly precise model that can then be applied to a much larger data set very effectively. The reduced data set can be exhibited using various existing visualization techniques. The training data is important to code user's knowledge into the loop. However, this work also devises a strategy for calculating PLS reduced spaces when no training data is available. The approach produces increasingly precise visual mappings as the user feeds back his or her knowledge and is capable of working with small and unbalanced training sets.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The advances in computational biology have made simultaneous monitoring of thousands of features possible. The high throughput technologies not only bring about a much richer information context in which to study various aspects of gene functions but they also present challenge of analyzing data with large number of covariates and few samples. As an integral part of machine learning, classification of samples into two or more categories is almost always of interest to scientists. In this paper, we address the question of classification in this setting by extending partial least squares (PLS), a popular dimension reduction tool in chemometrics, in the context of generalized linear regression based on a previous approach, Iteratively ReWeighted Partial Least Squares, i.e. IRWPLS (Marx, 1996). We compare our results with two-stage PLS (Nguyen and Rocke, 2002A; Nguyen and Rocke, 2002B) and other classifiers. We show that by phrasing the problem in a generalized linear model setting and by applying bias correction to the likelihood to avoid (quasi)separation, we often get lower classification error rates.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present an independent calibration model for the determination of biogenic silica (BSi) in sediments, developed from analysis of synthetic sediment mixtures and application of Fourier transform infrared spectroscopy (FTIRS) and partial least squares regression (PLSR) modeling. In contrast to current FTIRS applications for quantifying BSi, this new calibration is independent from conventional wet-chemical techniques and their associated measurement uncertainties. This approach also removes the need for developing internal calibrations between the two methods for individual sediments records. For the independent calibration, we produced six series of different synthetic sediment mixtures using two purified diatom extracts, with one extract mixed with quartz sand, calcite, 60/40 quartz/calcite and two different natural sediments, and a second extract mixed with one of the natural sediments. A total of 306 samples—51 samples per series—yielded BSi contents ranging from 0 to 100 %. The resulting PLSR calibration model between the FTIR spectral information and the defined BSi concentration of the synthetic sediment mixtures exhibits a strong cross-validated correlation ( R2cv = 0.97) and a low root-mean square error of cross-validation (RMSECV = 4.7 %). Application of the independent calibration to natural lacustrine and marine sediments yields robust BSi reconstructions. At present, the synthetic mixtures do not include the variation in organic matter that occurs in natural samples, which may explain the somewhat lower prediction accuracy of the calibration model for organic-rich samples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide−MHC binding affinity. The ISC−PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide−MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek) and included peptides up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited a satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms - q2, SEP, and NC - ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r2 and SEE ranged between 0.98 and 0.995 and 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Motivation: The immunogenicity of peptides depends on their ability to bind to MHC molecules. MHC binding affinity prediction methods can save significant amounts of experimental work. The class II MHC binding site is open at both ends, making epitope prediction difficult because of the multiple binding ability of long peptides. Results: An iterative self-consistent partial least squares (PLS)-based additive method was applied to a set of 66 pep- tides no longer than 16 amino acids, binding to DRB1*0401. A regression equation containing the quantitative contributions of the amino acids at each of the nine positions was generated. Its predictability was tested using two external test sets which gave r pred =0.593 and r pred=0.655, respectively. Furthermore, it was benchmarked using 25 known T-cell epitopes restricted by DRB1*0401 and we compared our results with four other online predictive methods. The additive method showed the best result finding 24 of the 25 T-cell epitopes. Availability: Peptides used in the study are available from http://www.jenner.ac.uk/JenPep. The PLS method is available commercially in the SYBYL molecular modelling software package. The final model for affinity prediction of peptides binding to DRB1*0401 molecule is available at http://www.jenner.ac.uk/MHCPred. Models developed for DRB1*0101 and DRB1*0701 also are available in MHC- Pred

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Los turistas urbanos se caracterizan por ser uno de los segmentos de mayor crecimiento en los mercados turísticos actuales. Monterrey (México), uno de los principales destinos urbanos del país, ha apostado en la actualidad por mejorar su competitividad. Esta investigación se propuso encontrar evidencia acerca de la relación causal de la motivación de viaje sobre la imagen percibida del destino, dos variables importantes por su influencia en la satisfacción de los visitantes. Una revisión de la literatura permitió proponer constructos teóricos integrados en un instrumento para la recogida de datos vía encuesta a una muestra representativa. Por medio del método de regresión y ecuaciones estructurales por mínimos cuadrados parciales (PLS), se identificaron los componentes principales de ambas variables y se obtuvo un modelo explicativo de la imagen percibida del destino en función de la motivación de viaje. Finalmente, se emiten recomendaciones para la gestión del destino urbano en función de los resultados obtenidos. ABSTRACT: Abstract Urban tourists are recognized as one of the fastest growing segments in today’s tourism markets. Monterrey, Mexico, one of the main urban destinations in the country aims at improving its competitiveness. This research work had the purpose of finding evidence on the causal relationship between travel motivation and destination image, two important variables because of their influence on visitors’ satisfaction. A literature review enabled the proposal of a research instrument with theoretically based constructs to gather data through survey from a representative sample. Using regression and structural equations modelling by partial least squares (pls) a set of main components of both variables were identified thus enabling the obtention of a explanatory model of destination image in terms of travel motivations. Finally based on the results some recommendations of tourism management are given.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A miniaturised gas analyser is described and evaluated based on the use of a substrate-integrated hollow waveguide (iHWG) coupled to a microsized near-infrared spectrophotometer comprising a linear variable filter and an array of InGaAs detectors. This gas sensing system was applied to analyse surrogate samples of natural fuel gas containing methane, ethane, propane and butane, quantified by using multivariate regression models based on partial least square (PLS) algorithms and Savitzky-Golay 1(st) derivative data preprocessing. The external validation of the obtained models reveals root mean square errors of prediction of 0.37, 0.36, 0.67 and 0.37% (v/v), for methane, ethane, propane and butane, respectively. The developed sensing system provides particularly rapid response times upon composition changes of the gaseous sample (approximately 2 s) due the minute volume of the iHWG-based measurement cell. The sensing system developed in this study is fully portable with a hand-held sized analyser footprint, and thus ideally suited for field analysis. Last but not least, the obtained results corroborate the potential of NIR-iHWG analysers for monitoring the quality of natural gas and petrochemical gaseous products.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Chlorpheniramine maleate (CLOR) enantiomers were quantified by ultraviolet spectroscopy and partial least squares regression. The CLOR enantiomers were prepared as inclusion complexes with beta-cyclodextrin and 1-butanol with mole fractions in the range from 50 to 100%. For the multivariate calibration the outliers were detected and excluded and variable selection was performed by interval partial least squares and a genetic algorithm. Figures of merit showed results for accuracy of 3.63 and 2.83% (S)-CLOR for root mean square errors of calibration and prediction, respectively. The ellipse confidence region included the point for the intercept and the slope of 1 and 0, respectively. Precision and analytical sensitivity were 0.57 and 0.50% (S)-CLOR, respectively. The sensitivity, selectivity, adjustment, and signal-to-noise ratio were also determined. The model was validated by a paired t test with the results obtained by high-performance liquid chromatography proposed by the European pharmacopoeia and circular dichroism spectroscopy. The results showed there was no significant difference between the methods at the 95% confidence level, indicating that the proposed method can be used as an alternative to standard procedures for chiral analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Human mesenchymal stem/stromal cells (MSCs) have received considerable attention in the field of cell-based therapies due to their high differentiation potential and ability to modulate immune responses. However, since these cells can only be isolated in very low quantities, successful realization of these therapies requires MSCs ex-vivo expansion to achieve relevant cell doses. The metabolic activity is one of the parameters often monitored during MSCs cultivation by using expensive multi-analytical methods, some of them time-consuming. The present work evaluates the use of mid-infrared (MIR) spectroscopy, through rapid and economic high-throughput analyses associated to multivariate data analysis, to monitor three different MSCs cultivation runs conducted in spinner flasks, under xeno-free culture conditions, which differ in the type of microcarriers used and the culture feeding strategy applied. After evaluating diverse spectral preprocessing techniques, the optimized partial least square (PLS) regression models based on the MIR spectra to estimate the glucose, lactate and ammonia concentrations yielded high coefficients of determination (R2 ≥ 0.98, ≥0.98, and ≥0.94, respectively) and low prediction errors (RMSECV ≤ 4.7%, ≤4.4% and ≤5.7%, respectively). Besides PLS models valid for specific expansion protocols, a robust model simultaneously valid for the three processes was also built for predicting glucose, lactate and ammonia, yielding a R2 of 0.95, 0.97 and 0.86, and a RMSECV of 0.33, 0.57, and 0.09 mM, respectively. Therefore, MIR spectroscopy combined with multivariate data analysis represents a promising tool for both optimization and control of MSCs expansion processes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Customer satisfaction and retention are key issues for organizations in today’s competitive market place. As such, much research and revenue has been invested in developing accurate ways of assessing consumer satisfaction at both the macro (national) and micro (organizational) level, facilitating comparisons in performance both within and between industries. Since the instigation of the national customer satisfaction indices (CSI), partial least squares (PLS) has been used to estimate the CSI models in preference to structural equation models (SEM) because they do not rely on strict assumptions about the data. However, this choice was based upon some misconceptions about the use of SEM’s and does not take into consideration more recent advances in SEM, including estimation methods that are robust to non-normality and missing data. In this paper, both SEM and PLS approaches were compared by evaluating perceptions of the Isle of Man Post Office Products and Customer service using a CSI format. The new robust SEM procedures were found to be advantageous over PLS. Product quality was found to be the only driver of customer satisfaction, while image and satisfaction were the only predictors of loyalty, thus arguing for the specificity of postal services

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The purpose of the present article is to take stock of a recent exchange in Organizational Research Methods between critics (Rönkkö & Evermann, 2013) and proponents (Henseler et al., 2014) of partial least squares path modeling (PLS-PM). The two target articles were centered around six principal issues, namely whether PLS-PM: (1) can be truly characterized as a technique for structural equation modeling (SEM); (2) is able to correct for measurement error; (3) can be used to validate measurement models; (4) accommodates small sample sizes; (5) is able to provide null hypothesis tests for path coefficients; and (6) can be employed in an exploratory, model-building fashion. We summarize and elaborate further on the key arguments underlying the exchange, drawing from the broader methodological and statistical literature in order to offer additional thoughts concerning the utility of PLS-PM and ways in which the technique might be improved. We conclude with recommendations as to whether and how PLS-PM serves as a viable contender to SEM approaches for estimating and evaluating theoretical models.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Metastatic melanomas are frequently refractory to most adjuvant therapies such as chemotherapies and radiotherapies. Recently, immunotherapies have shown good results in the treatment of some metastatic melanomas. Immune cell infiltration in the tumor has been associated with successful immunotherapy. More generally, tumor infiltrating lymphocytes (TILs) in the primary tumor and in metastases of melanoma patients have been demonstrated to correlate positively with favorable clinical outcomes. Altogether, these findings suggest the importance of being able to identify, quantify and characterize immune infiltration at the tumor site for a better diagnostic and treatment choice. In this paper, we used Fourier Transform Infrared (FTIR) imaging to identify and quantify different subpopulations of T cells: the cytotoxic T cells (CD8+), the helper T cells (CD4+) and the regulatory T cells (T reg). As a proof of concept, we investigated pure populations isolated from human peripheral blood from 6 healthy donors. These subpopulations were isolated from blood samples by magnetic labeling and purities were assessed by Fluorescence Activated Cell Sorting (FACS). The results presented here show that Fourier Transform Infrared (FTIR) imaging followed by supervised Partial Least Square Discriminant Analysis (PLS-DA) allows an accurate identification of CD4+ T cells and CD8+ T cells (>86%). We then developed a PLS regression allowing the quantification of T reg in a different mix of immune cells (e.g. Peripheral Blood Mononuclear Cells (PBMCs)). Altogether, these results demonstrate the sensitivity of infrared imaging to detect the low biological variability observed in T cell subpopulations.