908 resultados para partial least-squares regression


Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes a chemotaxonomic analysis of a database of triterpenoid compounds from the Celastraceae family using principal component analysis (PCA). The numbers of occurrences of thirty types of triterpene skeleton in different tribes of the family were used as variables. The study shows that PCA applied to chemical data can contribute to an intrafamilial classification of Celastraceae, once some questionable taxa affinity was observed, from chemotaxonomic inferences about genera and they are in agreement with the phylogeny previously proposed. The inclusion of Hippocrateaceae within Celastraceae is supported by the triterpene chemistry.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Following the recent success in quantitative analysis of essential fatty acid compositions in a commercial microencapsulated fish oil (?EFO) supplement, we extended the application of portable attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopic technique and partial least square regression (PLSR) analysis for rapid determination of total protein contents-the other major component in most commercial ?EFO powders. In contrast to the traditional chromatographic methodology used in a routine amino acid analysis (AAA), the ATR-FTIR spectra of the ?EFO powder can be acquired directly from its original powder form with no requirement of any sample preparation, making the technique exceptionally fast, noninvasive, and environmentally friendly as well as being cost effective and hence eminently suitable for routine use by industry. By optimizing the spectral region of interest and number of latent factors through the developed PLSR strategy, a good linear calibration model was produced as indicated by an excellent value of coefficient of determination R2 = 0.9975, using standard ?EFO powders with total protein contents in the range of 140-450 mg/g. The prediction of the protein contents acquired from an independent validation set through the optimized PLSR model was highly accurate as evidenced through (1) a good linear fitting (R2 = 0.9759) in the plot of predicted versus reference values, which were obtained from a standard AAA method, (2) lowest root mean square error of prediction (11.64 mg/g), and (3) high residual predictive deviation (6.83) ranked in very good level of predictive quality indicating high robustness and good predictive performance of the achieved PLSR calibration model. The study therefore demonstrated the potential application of the portable ATR-FTIR technique when used together with PLSR analysis for rapid online monitoring of the two major components (i.e., oil and protein contents) in finished ?EFO powders in the actual manufacturing setting.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Intervention programs aimed at promoting study and work opportunities in the Information and Communications Technology (ICT) field to schoolgirls (Interventions) have been encouraged to combat a decline in the interest among girls to study ICT at school. The goal of our study is to investigate the influence of Interventions on schoolgirls’ intentions to choose a career in the ICT field by analysing the  comprehensive survey data (n = 3577), collected during four interventions in Australia, using the Partial Least Squares method. Our study is also aimed at identifying other factors influencing ICT career intentions. We found that the attitude towards interventions has an indirect influence on ICT career intentions by affecting interest in ICT. Our results also challenge several existing theoretical studies by showing that factors that had previously been suggested as influencers were found to have little or no impact in this study, these being same-sex education and computer usage.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

When applying multivariate analysis techniques in information systems and social science disciplines, such as management information systems (MIS) and marketing, the assumption that the empirical data originate from a single homogeneous population is often unrealistic. When applying a causal modeling approach, such as partial least squares (PLS) path modeling, segmentation is a key issue in coping with the problem of heterogeneity in estimated cause-and-effect relationships. This chapter presents a new PLS path modeling approach which classifies units on the basis of the heterogeneity of the estimates in the inner model. If unobserved heterogeneity significantly affects the estimated path model relationships on the aggregate data level, the methodology will allow homogenous groups of observations to be created that exhibit distinctive path model estimates. The approach will, thus, provide differentiated analytical outcomes that permit more precise interpretations of each segment formed. An application on a large data set in an example of the American customer satisfaction index (ACSI) substantiates the methodology’s effectiveness in evaluating PLS path modeling results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

High-throughput techniques are necessary to efficiently screen potential lignocellulosic feedstocks for the production of renewable fuels, chemicals, and bio-based materials, thereby reducing experimental time and expense while supplanting tedious, destructive methods. The ratio of lignin syringyl (S) to guaiacyl (G) monomers has been routinely quantified as a way to probe biomass recalcitrance. Mid-infrared and Raman spectroscopy have been demonstrated to produce robust partial least squares models for the prediction of lignin S/G ratios in a diverse group of Acacia and eucalypt trees. The most accurate Raman model has now been used to predict the S/G ratio from 269 unknown Acacia and eucalypt feedstocks. This study demonstrates the application of a partial least squares model composed of Raman spectral data and lignin S/G ratios measured using pyrolysis/molecular beam mass spectrometry (pyMBMS) for the prediction of S/G ratios in an unknown data set. The predicted S/G ratios calculated by the model were averaged according to plant species, and the means were not found to differ from the pyMBMS ratios when evaluating the mean values of each method within the 95 % confidence interval. Pairwise comparisons within each data set were employed to assess statistical differences between each biomass species. While some pairwise appraisals failed to differentiate between species, Acacias, in both data sets, clearly display significant differences in their S/G composition which distinguish them from eucalypts. This research shows the power of using Raman spectroscopy to supplant tedious, destructive methods for the evaluation of the lignin S/G ratio of diverse plant biomass materials. © 2015, The Author(s).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Data fluctuation in multiple measurements of Laser Induced Breakdown Spectroscopy (LIBS) greatly affects the accuracy of quantitative analysis. A new LIBS quantitative analysis method based on the Robust Least Squares Support Vector Machine (RLS-SVM) regression model is proposed. The usual way to enhance the analysis accuracy is to improve the quality and consistency of the emission signal, such as by averaging the spectral signals or spectrum standardization over a number of laser shots. The proposed method focuses more on how to enhance the robustness of the quantitative analysis regression model. The proposed RLS-SVM regression model originates from the Weighted Least Squares Support Vector Machine (WLS-SVM) but has an improved segmented weighting function and residual error calculation according to the statistical distribution of measured spectral data. Through the improved segmented weighting function, the information on the spectral data in the normal distribution will be retained in the regression model while the information on the outliers will be restrained or removed. Copper elemental concentration analysis experiments of 16 certified standard brass samples were carried out. The average value of relative standard deviation obtained from the RLS-SVM model was 3.06% and the root mean square error was 1.537%. The experimental results showed that the proposed method achieved better prediction accuracy and better modeling robustness compared with the quantitative analysis methods based on Partial Least Squares (PLS) regression, standard Support Vector Machine (SVM) and WLS-SVM. It was also demonstrated that the improved weighting function had better comprehensive performance in model robustness and convergence speed, compared with the four known weighting functions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Reliable pollutant build-up prediction plays a critical role in the accuracy of urban stormwater quality modelling outcomes. However, water quality data collection is resource demanding compared to streamflow data monitoring, where a greater quantity of data is generally available. Consequently, available water quality data sets span only relatively short time scales unlike water quantity data. Therefore, the ability to take due consideration of the variability associated with pollutant processes and natural phenomena is constrained. This in turn gives rise to uncertainty in the modelling outcomes as research has shown that pollutant loadings on catchment surfaces and rainfall within an area can vary considerably over space and time scales. Therefore, the assessment of model uncertainty is an essential element of informed decision making in urban stormwater management. This paper presents the application of a range of regression approaches such as ordinary least squares regression, weighted least squares Regression and Bayesian Weighted Least Squares Regression for the estimation of uncertainty associated with pollutant build-up prediction using limited data sets. The study outcomes confirmed that the use of ordinary least squares regression with fixed model inputs and limited observational data may not provide realistic estimates. The stochastic nature of the dependent and independent variables need to be taken into consideration in pollutant build-up prediction. It was found that the use of the Bayesian approach along with the Monte Carlo simulation technique provides a powerful tool, which attempts to make the best use of the available knowledge in the prediction and thereby presents a practical solution to counteract the limitations which are otherwise imposed on water quality modelling.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Customer satisfaction and retention are key issues for organizations in today’s competitive market place. As such, much research and revenue has been invested in developing accurate ways of assessing consumer satisfaction at both the macro (national) and micro (organizational) level, facilitating comparisons in performance both within and between industries. Since the instigation of the national customer satisfaction indices (CSI), partial least squares (PLS) has been used to estimate the CSI models in preference to structural equation models (SEM) because they do not rely on strict assumptions about the data. However, this choice was based upon some misconceptions about the use of SEM’s and does not take into consideration more recent advances in SEM, including estimation methods that are robust to non-normality and missing data. In this paper, both SEM and PLS approaches were compared by evaluating perceptions of the Isle of Man Post Office Products and Customer service using a CSI format. The new robust SEM procedures were found to be advantageous over PLS. Product quality was found to be the only driver of customer satisfaction, while image and satisfaction were the only predictors of loyalty, thus arguing for the specificity of postal services

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Phylogenetic generalised least squares (PGLS) is one of the most commonly employed phylogenetic comparative methods. The technique, a modification of generalised least squares, uses knowledge of phylogenetic relationships to produce an estimate of expected covariance in cross-species data. Closely related species are assumed to have more similar traits because of their shared ancestry and hence produce more similar residuals from the least squares regression line. By taking into account the expected covariance structure of these residuals, modified slope and intercept estimates are generated that can account for interspecific autocorrelation due to phylogeny. Here, we provide a basic conceptual background to PGLS, for those unfamiliar with the approach. We describe the requirements for a PGLS analysis and highlight the packages that can be used to implement the method. We show how phylogeny is used to calculate the expected covariance structure in the data and how this is applied to the generalised least squares regression equation. We demonstrate how PGLS can incorporate information about phylogenetic signal, the extent to which closely related species truly are similar, and how it controls for this signal appropriately, thereby negating concerns about unnecessarily ‘correcting’ for phylogeny. In addition to discussing the appropriate way to present the results of PGLS analyses, we highlight some common misconceptions about the approach and commonly encountered problems with the method. These include misunderstandings about what phylogenetic signal refers to in the context of PGLS (residuals errors, not the traits themselves), and issues associated with unknown or uncertain phylogeny.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present an independent calibration model for the determination of biogenic silica (BSi) in sediments, developed from analysis of synthetic sediment mixtures and application of Fourier transform infrared spectroscopy (FTIRS) and partial least squares regression (PLSR) modeling. In contrast to current FTIRS applications for quantifying BSi, this new calibration is independent from conventional wet-chemical techniques and their associated measurement uncertainties. This approach also removes the need for developing internal calibrations between the two methods for individual sediments records. For the independent calibration, we produced six series of different synthetic sediment mixtures using two purified diatom extracts, with one extract mixed with quartz sand, calcite, 60/40 quartz/calcite and two different natural sediments, and a second extract mixed with one of the natural sediments. A total of 306 samples—51 samples per series—yielded BSi contents ranging from 0 to 100 %. The resulting PLSR calibration model between the FTIR spectral information and the defined BSi concentration of the synthetic sediment mixtures exhibits a strong cross-validated correlation ( R2cv = 0.97) and a low root-mean square error of cross-validation (RMSECV = 4.7 %). Application of the independent calibration to natural lacustrine and marine sediments yields robust BSi reconstructions. At present, the synthetic mixtures do not include the variation in organic matter that occurs in natural samples, which may explain the somewhat lower prediction accuracy of the calibration model for organic-rich samples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

There is now an emerging need for an efficient modeling strategy to develop a new generation of monitoring systems. One method of approaching the modeling of complex processes is to obtain a global model. It should be able to capture the basic or general behavior of the system, by means of a linear or quadratic regression, and then superimpose a local model on it that can capture the localized nonlinearities of the system. In this paper, a novel method based on a hybrid incremental modeling approach is designed and applied for tool wear detection in turning processes. It involves a two-step iterative process that combines a global model with a local model to take advantage of their underlying, complementary capacities. Thus, the first step constructs a global model using a least squares regression. A local model using the fuzzy k-nearest-neighbors smoothing algorithm is obtained in the second step. A comparative study then demonstrates that the hybrid incremental model provides better error-based performance indices for detecting tool wear than a transductive neurofuzzy model and an inductive neurofuzzy model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.