45 results for Partial least square regression
in Aston University Research Archive
Abstract:
Levels of lignin and hydroxycinnamic acid wall components in three genera of forage grasses (Lolium, Festuca and Dactylis) have been accurately predicted by Fourier-transform infrared spectroscopy using partial least squares models correlated to analytical measurements. Different models were derived that predicted the concentrations of acid detergent lignin, total hydroxycinnamic acids, total ferulate monomers plus dimers, p-coumarate and ferulate dimers in independent spectral test data from methanol extracted samples of perennial forage grass with accuracies of 92.8%, 86.5%, 86.1%, 59.7% and 84.7% respectively, and analysis of model projection scores showed that the models relied generally on spectral features that are known absorptions of these compounds. Acid detergent lignin was predicted in samples of two species of energy grass (Phalaris arundinacea and Panicum virgatum) with an accuracy of 84.5%.
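As a rough illustration of how a partial least squares model maps spectra to a measured concentration, the following minimal sketch fits a one-component PLS1 model in plain Python. It is not the authors' model: the function name, the single-component restriction and the data layout (one list of absorbances per sample) are assumptions made for illustration only.

```python
def pls1_one_component(X, y):
    """Fit a one-component PLS1 model and return a predict(row) function.

    X: list of samples, each a list of predictor values (e.g. absorbances).
    y: list of measured reference values (e.g. lignin content).
    Hypothetical helper; real calibrations usually use several components.
    """
    n, p = len(X), len(X[0])
    # Mean-centre predictors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores (projection of each sample on w) and regression of y on scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)

    def predict(row):
        score = sum((row[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict
```

With more than one component, the predictors and response are deflated after each step and the procedure is repeated; that part is omitted here.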
Abstract:
Two energy grass species, switch grass, a North American tuft grass, and reed canary grass, a European native, are likely to be important sources of biomass in Western Europe for the production of biorenewable energy. Matching chemical composition to conversion efficiency is a primary goal for improvement programmes and for determining the quality of biomass feed-stocks prior to use and there is a need for methods which allow cost effective characterisation of chemical composition at high rates of sample through-put. In this paper we demonstrate that nitrogen content and alkali index, parameters greatly influencing thermal conversion efficiency, can be accurately predicted in dried samples of these species grown under a range of agronomic conditions by partial least square regression of Fourier transform infrared spectra (R2 values for plots of predicted vs. measured values of 0.938 and 0.937, respectively). We also discuss the prediction of carbon and ash content in these samples and the application of infrared based predictive methods for the breeding improvement of energy grasses.
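The R2 values quoted for plots of predicted vs. measured values can be read as squared Pearson correlations between the two series. The sketch below computes that quantity; the function name is hypothetical and the abstract does not state which exact R2 definition was used, so this is one plausible reading.

```python
def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    m_mean = sum(measured) / n
    p_mean = sum(predicted) / n
    sxy = sum((m - m_mean) * (p - p_mean) for m, p in zip(measured, predicted))
    sxx = sum((m - m_mean) ** 2 for m in measured)
    syy = sum((p - p_mean) ** 2 for p in predicted)
    return sxy * sxy / (sxx * syy)
```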
Abstract:
Relationships among quality factors in retailed free-range, corn-fed, organic, and conventional chicken breasts (9) were modeled using chemometric approaches. Applying principal component analysis (PCA) to neutral lipid composition data explained the majority (93%) of variability (variance) in fatty acid contents in 2 significant multivariate factors. PCA explained 88 and 75% of the variance in 3 factors for, respectively, flame ionization detection (FID) and nitrogen phosphorus (NPD) components in chromatographic flavor data from cooked chicken after simultaneous distillation extraction. Relationships to tissue antioxidant contents were modeled. Partial least square regression (PLS2), interrelating total data matrices, provided no useful models. By using single antioxidants as Y variables in PLS1, good models (r2 values > 0.9) were obtained for alpha-tocopherol, glutathione, catalase, glutathione peroxidase, and reductase and FID flavor components and among the variables total mono and polyunsaturated fatty acids and subsets of FID, and saturated fatty acid and NPD components. Alpha-tocopherol had a modest (r2 = 0.63) relationship with neutral lipid n-3 fatty acid content. Such factors thus relate to flavor development and quality in chicken breast meat.
Abstract:
Objective: In this study, we have used a chemometrics-based method to correlate key liposomal adjuvant attributes with in-vivo immune responses based on multivariate analysis. Methods: The liposomal adjuvant composed of the cationic lipid dimethyldioctadecylammonium bromide (DDA) and trehalose 6,6-dibehenate (TDB) was modified with 1,2-distearoyl-sn-glycero-3-phosphocholine at a range of mol% ratios, and the main liposomal characteristics (liposome size and zeta potential) were measured along with their immunological performance as an adjuvant for the novel, postexposure fusion tuberculosis vaccine, Ag85B-ESAT-6-Rv2660c (H56 vaccine). Partial least square regression analysis was applied to correlate and cluster liposomal adjuvant particle characteristics with in-vivo derived immunological performances (IgG, IgG1, IgG2b, spleen proliferation, IL-2, IL-5, IL-6, IL-10, IFN-γ). Key findings: While a range of factors varied in the formulations, decreasing the 1,2-distearoyl-sn-glycero-3-phosphocholine content (and subsequent zeta potential) together built the strongest variables in the model. Enhanced DDA and TDB content (and subsequent zeta potential) stimulated a response skewed towards a cell mediated immunity, with the model identifying correlations with IFN-γ, IL-2 and IL-6. Conclusion: This study demonstrates the application of chemometrics-based correlations and clustering, which can inform liposomal adjuvant design.
Abstract:
The accurate identification of T-cell epitopes remains a principal goal of bioinformatics within immunology. As the immunogenicity of peptide epitopes is dependent on their binding to major histocompatibility complex (MHC) molecules, the prediction of binding affinity is a prerequisite to the reliable prediction of epitopes. The iterative self-consistent (ISC) partial-least-squares (PLS)-based additive method is a recently developed bioinformatic approach for predicting class II peptide−MHC binding affinity. The ISC−PLS method overcomes many of the conceptual difficulties inherent in the prediction of class II peptide−MHC affinity, such as the binding of a mixed population of peptide lengths due to the open-ended class II binding site. The method has applications in both the accurate prediction of class II epitopes and the manipulation of affinity for heteroclitic and competitor peptides. The method is applied here to six class II mouse alleles (I-Ab, I-Ad, I-Ak, I-As, I-Ed, and I-Ek) and included peptides up to 25 amino acids in length. A series of regression equations highlighting the quantitative contributions of individual amino acids at each peptide position was established. The initial model for each allele exhibited only moderate predictivity. Once the set of selected peptide subsequences had converged, the final models exhibited a satisfactory predictive power. Convergence was reached between the 4th and 17th iterations, and the leave-one-out cross-validation statistical terms - q2, SEP, and NC - ranged between 0.732 and 0.925, 0.418 and 0.816, and 1 and 6, respectively. The non-cross-validated statistical terms r2 and SEE ranged between 0.98 and 0.995 and 0.089 and 0.180, respectively. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen). The PLS method is available commercially in the SYBYL molecular modeling software package. 
The resulting models, which can be used for accurate T-cell epitope prediction, will be made freely available online (http://www.jenner.ac.uk/MHCPred).
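The leave-one-out cross-validated q2 reported above is conventionally computed as 1 - PRESS/SS, where PRESS sums the squared errors of predictions made for each sample by a model fitted without it, and SS is the total sum of squares about the mean. The sketch below shows that bookkeeping. As a loudly labelled assumption, a simple one-variable least-squares line (the hypothetical fit_line) stands in for the PLS model, which is not reproduced here.

```python
def fit_line(xs, ys):
    """Stand-in model (assumption): ordinary least-squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


def loo_q2(xs, ys, fit=fit_line):
    """Leave-one-out q2 = 1 - PRESS / SS for any fit(xs, ys) -> predictor."""
    n = len(ys)
    y_mean = sum(ys) / n
    press = 0.0
    for i in range(n):
        # Refit without sample i, then predict the held-out sample.
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - model(xs[i])) ** 2
    ss = sum((y - y_mean) ** 2 for y in ys)
    return 1.0 - press / ss
```

Because every prediction comes from a model that never saw the held-out sample, q2 is typically lower than the non-cross-validated r2, as in the ranges quoted in the abstract.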
Abstract:
Motivation: The immunogenicity of peptides depends on their ability to bind to MHC molecules. MHC binding affinity prediction methods can save significant amounts of experimental work. The class II MHC binding site is open at both ends, making epitope prediction difficult because of the multiple binding ability of long peptides. Results: An iterative self-consistent partial least squares (PLS)-based additive method was applied to a set of 66 peptides no longer than 16 amino acids, binding to DRB1*0401. A regression equation containing the quantitative contributions of the amino acids at each of the nine positions was generated. Its predictability was tested using two external test sets which gave r_pred = 0.593 and r_pred = 0.655, respectively. Furthermore, it was benchmarked using 25 known T-cell epitopes restricted by DRB1*0401 and we compared our results with four other online predictive methods. The additive method showed the best result, finding 24 of the 25 T-cell epitopes. Availability: Peptides used in the study are available from http://www.jenner.ac.uk/JenPep. The PLS method is available commercially in the SYBYL molecular modelling software package. The final model for affinity prediction of peptides binding to the DRB1*0401 molecule is available at http://www.jenner.ac.uk/MHCPred. Models developed for DRB1*0101 and DRB1*0701 are also available in MHCPred.
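An additive model of this kind scores a nine-residue binding core as a constant plus the sum of per-position amino acid contributions. Because a longer peptide can sit in the open-ended binding site in several registers, one simple reading is to score every nine-residue window and keep the best. The sketch below shows only that scoring step. The coefficient table, the function names and the "take the maximum" rule are hypothetical illustrations, not the published regression equation or the iterative refitting procedure.

```python
def score_ninemer(ninemer, contributions, const=0.0):
    """Additive score: constant plus one contribution per (position, residue).

    contributions: dict mapping (position 1..9, amino acid letter) -> value.
    Missing entries are treated as 0.0 (an assumption of this sketch).
    """
    return const + sum(contributions.get((pos, aa), 0.0)
                       for pos, aa in enumerate(ninemer, start=1))


def best_binding_core(peptide, contributions, const=0.0):
    """Score all 9-residue windows of a peptide; return (score, window)."""
    if len(peptide) < 9:
        raise ValueError("peptide must contain at least nine residues")
    windows = [peptide[i:i + 9] for i in range(len(peptide) - 8)]
    return max((score_ninemer(w, contributions, const), w) for w in windows)
```

For example, with the made-up table `{(1, "F"): 1.0, (9, "L"): 0.5}`, the peptide `"GGFAAAAAAALGG"` yields the window `"FAAAAAAAL"` with a score of 1.5.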
Abstract:
When applying multivariate analysis techniques in information systems and social science disciplines, such as management information systems (MIS) and marketing, the assumption that the empirical data originate from a single homogeneous population is often unrealistic. When applying a causal modeling approach, such as partial least squares (PLS) path modeling, segmentation is a key issue in coping with the problem of heterogeneity in estimated cause-and-effect relationships. This chapter presents a new PLS path modeling approach which classifies units on the basis of the heterogeneity of the estimates in the inner model. If unobserved heterogeneity significantly affects the estimated path model relationships on the aggregate data level, the methodology will allow homogeneous groups of observations to be created that exhibit distinctive path model estimates. The approach will, thus, provide differentiated analytical outcomes that permit more precise interpretations of each segment formed. An application to a large data set from the American customer satisfaction index (ACSI) substantiates the methodology’s effectiveness in evaluating PLS path modeling results.
Abstract:
A new LIBS quantitative analysis method based on adaptive selection of analytical lines and a Relevance Vector Machine (RVM) regression model is proposed. First, a scheme for adaptively selecting analytical lines is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of the regression model are determined adaptively according to the samples for both training and testing. Second, a LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentrations are given in the form of a confidence interval of a probability distribution, which is helpful for evaluating the uncertainty contained in the measured spectra. Chromium concentration analysis experiments on 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experimental results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.
Abstract:
A rapid method for the analysis of biomass feedstocks was established to identify the quality of the pyrolysis products likely to impact on bio-oil production. A total of 15 Lolium and Festuca grasses known to exhibit a range of Klason lignin contents were analysed by pyroprobe-GC/MS (Py-GC/MS) to determine the composition of the thermal degradation products of lignin. The identification of key marker compounds which are the derivatives of the three major lignin subunits (G, H, and S) allowed pyroprobe-GC/MS to be statistically correlated to the Klason lignin content of the biomass using the partial least-square method to produce a calibration model. Data from this multivariate modelling procedure was then applied to identify likely "key marker" ions representative of the lignin subunits from the mass spectral data. The combined total abundance of the identified key markers for the lignin subunits exhibited a linear relationship with the Klason lignin content. In addition, the effect of alkali metal concentration on optimum pyrolysis characteristics was also examined. Washing of the grass samples removed approximately 70% of the metals and changed the characteristics of the thermal degradation process and products. Overall, the data indicate that both the organic and inorganic specification of the biofuel impact on the pyrolysis process and that pyroprobe-GC/MS is a suitable analytical technique to assess lignin composition. © 2007 Elsevier B.V. All rights reserved.
Abstract:
This project explored how consumers in emerging economies evaluate brand extensions, using China as a case. Two separate but related studies were conducted, and university students were used as respondents in both studies. Study one, the replication study, tested Aaker and Keller's brand extension model in China. Following methods similar to Aaker and Keller's, six well-recognised brands were chosen as parent brands and each was extended to three product categories. In total, 469 respondents completed the survey questionnaire. As each was to evaluate six extensions, this yielded 2814 cases. The data were analysed using the Optimal Least Square regression approach and the "residual centred" approach respectively. The results confirmed most of the findings observed in developed countries. Specifically, consumers' attitude towards the extension is primarily driven by the brand affect, the fit between the two product categories and the difficulty of making the extension, and is moderated via the interactions between the brand affect and the fit variables. Study two refined and extended Aaker and Keller's model by adding new variables and making methodological adjustments. The same stimuli and data analysis techniques as those in the replication were employed. 252 respondents participated in the survey and each evaluated six extensions, yielding 1512 cases. In addition to re-verifying the findings of the replication and providing cross validation of these findings, the extended study found that the image consistency between the parent brand and the extension and the competition intensity of the extension product market were important in determining the success of the extension. Further, consumers differed in evaluating durable extensions and non-durable extensions. The thesis detailed the two studies above, and discussed the findings and their implications by relating them to the branding literature, to the general situation of the emerging economies as well as the reality of China. 
It also presented the limitations of the research and the future research directions.
Abstract:
Chicken breast from nine products and from the following production regimes: conventional (chilled and frozen), organic and free range, were analysed for fatty acid composition of total lipids, preventative and chain breaking antioxidant contents and lipid oxidation during 5 days of sub-ambient storage following purchase. Total lipids were extracted with an optimal amount of a cold chloroform methanol solvent. Lipid compositions varied, but there were differences between conventional and organic products in their contents of total polyunsaturated fatty acids and n-3 and n-6 fatty acids and n-6:n-3 ratio. Of the antioxidants, α-tocopherol content was inversely correlated with lipid oxidation. The antioxidant enzyme activities of catalase, glutathione peroxidase and glutathione reductase varied between products. Modelling with partial least squares regression showed no overall relationship between total antioxidants and lipid data, but certain individual antioxidants showed a relationship with specific lipid fractions.
Abstract:
Consumers expect organic, free-range and corn-fed chicken to be nutritionally wholesome and have premium flavour characters. Interrelationships between flavour, fatty acids and antioxidants of retailed breasts were explored using simple correlations and chemometrics. Saturated fatty acid C16:0, and n-6 polyunsaturated C20:4 and C22:4 contents were correlated with lipid oxidation products (thiobarbituric acid reactive substances) and in partial least-squares regression (PLS1) with 32 high-resonance gas chromatography (flame ionization) flavour components (r2>0.90), and also linked (r2>0.80) to antioxidants (α-tocopherol, glutathione and catalase). A further 10 high-resonance gas chromatography nitrogen phosphorus detector flavour components were correlated (r2>0.85) with C18:3(n-3) content. Chicken character was correlated with C18:3(n-3), and C18:3(n-6) inversely with oily, off-flavour and lipid oxidation. Sweet, fruity and oily aromas were linked in PLS1 with 13 specific fatty acids (r2>0.6), and bland taste with total summed (six) fatty acid fractions (r2>0.81). Specific antioxidants were correlated with sweet, fruity and chicken aromas, and α-tocopherol inversely with lipid oxidation. PLS2 confirmed relationships between fatty acid composition, antioxidants and the subsets of 32 and 10 flavour components. Clear relationships were thus observed between lipid and antioxidant compositions and flavour in chicken breast meat.
Abstract:
Biological experiments often produce enormous amounts of data, which are usually analyzed by data clustering. Cluster analysis refers to statistical methods that are used to assign data with similar properties into several smaller, more meaningful groups. Two commonly used clustering techniques are introduced in the following section: principal component analysis (PCA) and hierarchical clustering. PCA calculates the variance between variables and groups them into a few uncorrelated groups or principal components (PCs) that are orthogonal to each other. Hierarchical clustering is carried out by separating data into many clusters and merging similar clusters together. Here, we use an example of human leukocyte antigen (HLA) supertype classification to demonstrate the usage of the two methods. Two programs, Generating Optimal Linear Partial Least Square Estimations (GOLPE) and Sybyl, are used for PCA and hierarchical clustering, respectively. However, the reader should bear in mind that the methods have been incorporated into other software as well, such as SIMCA, statistiXL, and R.
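The merging procedure described for hierarchical clustering (start with many clusters, then repeatedly merge the most similar pair) can be sketched in a few lines of plain Python. The abstract does not say which linkage or distance the cited programs use; single linkage with Euclidean distance is assumed here, and the function names are hypothetical.

```python
def euclidean(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerative(points, n_clusters):
    """Single-linkage agglomerative clustering (assumed linkage).

    Starts with one cluster per point and merges the two closest clusters
    until n_clusters remain. Returns clusters as lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(euclidean(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Recording the distance at each merge, rather than stopping at a fixed cluster count, would give the information needed to draw a dendrogram.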
Abstract:
Purpose - The paper aims to examine the role of market orientation (MO) and innovation capability in determining business performance during an economic upturn and downturn. Design/methodology/approach - The data comprise two national-level surveys conducted in Finland in 2008, representing an economic boom, and in 2010 when the global economic crisis had hit the Finnish market. Partial least square path analysis is used to test the potential mediating effect of innovation capability on the relationship between MO and business performance during economic boom and bust. Findings - The results show that innovation capability fully mediates the performance effects of a MO during an economic upturn, whereas the mediation is only partial during a downturn. Innovation capability also mediates the relationship between a customer orientation and business performance during an upturn, whereas the mediating effect culminates in a competitor orientation during a downturn. Thus, the role of innovation capability as a mediator between the individual market-orientation components varies along the business cycle. Originality/value - This paper is one of the first studies that empirically examine the impact of the economic cycle on the relationship between strategic marketing concepts, such as MO or innovation capability, and the firm's business performance.
Abstract:
Circulating low density lipoproteins (LDL) are thought to play a crucial role in the onset and development of atherosclerosis, though the detailed molecular mechanisms responsible for their biological effects remain controversial. The complexity of biomolecules (lipids, glycans and protein) and structural features (isoforms and chemical modifications) found in LDL particles hampers the complete understanding of the mechanism underlying its atherogenicity. For this reason the screening of LDL for features discriminative of a particular pathology in search of biomarkers is of high importance. Three major biomolecule classes (lipids, protein and glycans) in LDL particles were screened using mass spectrometry coupled to liquid chromatography. Dual-polarity screening resulted in good lipidome coverage, identifying over 300 lipid species from 12 lipid sub-classes. Multivariate analysis was used to investigate potential discriminators in the individual lipid sub-classes for different study groups (age, gender, pathology). Additionally, the high protein sequence coverage of ApoB-100 routinely achieved (≥70%) assisted in the search for protein modifications correlating to aging and pathology. The large size and complexity of the datasets required the use of chemometric methods (Partial Least Square-Discriminant Analysis, PLS-DA) for their analysis and for the identification of ions that discriminate between study groups. The peptide profile from enzymatically digested ApoB-100 can be correlated with the high structural complexity of lipids associated with ApoB-100 using exploratory data analysis. In addition, using targeted scanning modes, glycosylation sites within neutral and acidic sugar residues in ApoB-100 are also being explored. 
Together or individually, knowledge of the profiles and modifications of the major biomolecules in LDL particles will contribute towards an in-depth understanding, will help to map the structural features that contribute to the atherogenicity of LDL, and may allow identification of reliable, pathology-specific biomarkers. This research was supported by a Marie Curie Intra-European Fellowship within the 7th European Community Framework Program (IEF 255076). Work of A. Rudnitskaya was supported by Portuguese Science and Technology Foundation, through the European Social Fund (ESF) and "Programa Operacional Potencial Humano - POPH".