920 results for Least-Squares prediction
Abstract:
Background - MHC Class I molecules present antigenic peptides to cytotoxic T cells, which forms an integral part of the adaptive immune response. Peptides are bound within a groove formed by the MHC heavy chain. Previous approaches to MHC Class I-peptide binding prediction have largely concentrated on the peptide anchor residues located at the P2 and C-terminus positions. Results - A large dataset comprising MHC-peptide structural complexes was created by re-modelling pre-determined x-ray crystallographic structures. Static energetic analysis, following energy minimisation, was performed on the dataset in order to characterise interactions between bound peptides and the MHC Class I molecule, partitioning the interactions within the groove into van der Waals, electrostatic and total non-bonded energy contributions. Conclusion - The QSAR techniques of Genetic Function Approximation (GFA) and Genetic Partial Least Squares (G/PLS) algorithms were used to identify key interactions between the two molecules by comparing the calculated energy values with experimentally-determined BL50 data. Although the peptide termini binding interactions help ensure the stability of the MHC Class I-peptide complex, the central region of the peptide is also important in defining the specificity of the interaction. As thermodynamic studies indicate that peptide association and dissociation may be driven entropically, it may be necessary to incorporate entropic contributions into future calculations.
Abstract:
A new LIBS quantitative analysis method based on adaptive selection of analytical lines and a Relevance Vector Machine (RVM) regression model is proposed. First, a scheme for adaptively selecting analytical lines is put forward to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines used as input variables of the regression model are determined adaptively according to the samples for both training and testing. Second, a LIBS quantitative analysis method based on RVM is presented. The intensities of the analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentrations are given in the form of a confidence interval of a probabilistic distribution, which is helpful for evaluating the uncertainty contained in the measured spectra. Chromium concentration analysis experiments on 23 certified standard high-alloy steel samples were carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experimental results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness than methods based on partial least squares regression, artificial neural networks and the standard support vector machine.
Abstract:
The focus of this study is on governance decisions in a concurrent channels context under uncertainty. The study examines how a firm chooses to deploy its sales force in times of uncertainty, and the subsequent performance outcome of those deployment choices. The theoretical framework is based on multiple theories of governance, including transaction cost analysis (TCA), agency theory, and institutional economics. Three uncertainty variables are investigated in this study. The first two are demand and competitive uncertainty, which are considered to be industry-level forms of market uncertainty. The third, political uncertainty, is chosen because it is an important dimension of institutional environments, capturing non-economic circumstances such as regulations and political systemic issues. The study employs longitudinal secondary data from a Thai hotel chain, comprising monthly observations from January 2007 to December 2012. This hotel chain operates in four countries, Thailand, the Philippines, the United Arab Emirates (Dubai), and Egypt, all of which experienced substantial demand, competitive, and political uncertainty during the study period. This makes them ideal contexts for this study. Two econometric models, both deploying Newey-West estimations, are employed to test 13 hypotheses. The first model considers the relationship between uncertainty and governance. The second model is a version of Newey-West, using an Instrumental Variables (IV) estimator and a Two-Stage Least Squares (2SLS) model, to test the direct effect of uncertainty on performance and the moderating effect of governance on the relationship between uncertainty and performance. The observed relationship between uncertainty and governance follows a core prediction of TCA: that vertical integration is the preferred choice of governance when uncertainty rises.
As for the subsequent performance outcomes, the results corroborate that uncertainty has a negative effect on performance. Importantly, the findings show that becoming more vertically integrated cannot help moderate the effect of demand and competitive uncertainty, but can significantly moderate the effect of political uncertainty. These findings have significant theoretical and practical implications, and extend our knowledge of the impact of uncertainty significantly, as well as bringing an institutional perspective to TCA. Further, they offer managers novel insight into the nature of different types of uncertainty, their impact on performance, and how channel decisions can mitigate these impacts.
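The abstract above names a Two-Stage Least Squares (2SLS) estimator but does not spell it out. As a minimal sketch only, assuming a single endogenous regressor and a single instrument, the two stages can be written as follows. The variable names and the toy numbers are hypothetical and are not taken from the study, and the Newey-West correction of the standard errors mentioned in the abstract is not attempted here.

```python
def ols(x, y):
    """Return (intercept, slope) of a simple least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z for one endogenous regressor x."""
    # Stage 1: project the endogenous regressor on the instrument.
    a0, a1 = ols(z, x)
    x_hat = [a0 + a1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted values from stage 1.
    return ols(x_hat, y)


# Toy data (hypothetical): u is an unobserved confounder that drives both x and y,
# while the instrument z is uncorrelated with u by construction.
z = [-2, -1, 0, 1, 2] * 2
u = [1] * 5 + [-1] * 5
x = [zi + ui for zi, ui in zip(z, u)]
y = [1 + 2 * xi + 3 * ui for xi, ui in zip(x, u)]  # true slope on x is 2

ols_intercept, ols_slope = ols(x, y)                       # slope is pulled away from 2 by u
iv_intercept, iv_slope = two_stage_least_squares(z, x, y)  # slope recovers 2
print(ols_slope, iv_slope)
```

In this toy setting the plain least-squares slope absorbs the confounder, whereas the second-stage slope does not, which is the usual motivation for an IV estimator.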
Abstract:
The paper uses paired comparison-based scoring procedures to determine the results of Swiss system chess team tournaments. We present the main challenges of ranking in these tournaments, the features of individual and team competitions, and the failures of official lexicographical orders. The tournament is represented as a ranking problem on axiomatic grounds, and our model is discussed with respect to the properties of the score, generalised row sum and least squares methods. The proposed method is illustrated with a detailed analysis of two recent chess team European championships. Final rankings are compared through their distances and visualized by multidimensional scaling (MDS). The reasons for differences from the official ranking are revealed by the decomposition of the least squares method. Rankings are evaluated by prediction accuracy, retrodictive performance, and stability. The paper argues for the use of the least squares method with an appropriate generalised results matrix favouring match points.
Abstract:
The paper uses paired comparison-based scoring procedures for ranking the participants of a Swiss system chess team tournament. We present the main challenges of ranking in the Swiss system, the features of individual and team competitions, and the failures of official lexicographical orders. The tournament is represented as a ranking problem, and our model is discussed with respect to the properties of the score, generalized row sum and least squares methods. The proposed procedure is illustrated with a detailed analysis of two recent chess team European championships. Final rankings are compared by their distances and visualized with multidimensional scaling (MDS). Differences from the official ranking are revealed by the decomposition of the least squares method. Rankings are evaluated by prediction accuracy, retrodictive performance, and stability. The paper argues for the use of the least squares method with a results matrix favoring match points.
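To make the least squares method for paired comparisons more concrete, the sketch below rates teams by minimising the squared gap between each observed match result and the difference of the two ratings, which leads to a linear system built from the match graph. This is a generic textbook formulation under stated assumptions, not the paper's exact results matrix: the function names, the result scale and the four-team schedule are all hypothetical, and the ratings are normalised to sum to zero.

```python
def solve(a, b):
    """Solve a small dense linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares_ratings(n_teams, matches):
    """matches: list of (i, j, result), result being the margin from i's point of view.

    Minimises sum((result - (q[i] - q[j]))**2), i.e. solves L q = s with L the
    Laplacian of the match graph, under the normalisation sum(q) == 0.
    Assumes the match graph is connected.
    """
    lap = [[0.0] * n_teams for _ in range(n_teams)]
    s = [0.0] * n_teams
    for i, j, result in matches:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
        s[i] += result
        s[j] -= result
    # The Laplacian is singular, so one equation is replaced by sum(q) == 0.
    lap[-1] = [1.0] * n_teams
    s[-1] = 0.0
    return solve(lap, s)


# Hypothetical two-round schedule: teams 1 and 3 end with the same total score,
# but team 1 lost to the strongest team while team 3 lost to a mid-table team.
matches = [(0, 1, 1.0), (2, 3, 1.0), (0, 2, 1.0), (1, 3, 0.0)]
ratings = least_squares_ratings(4, matches)
print(ratings)
```

Because the schedule is not a round robin, the ratings take opponent strength into account, so two teams with equal scores can still be separated.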
Abstract:
The composition and distribution of diatom algae inhabiting estuaries and coasts of the subtropical Americas are poorly documented, especially relative to the central role diatoms play in coastal food webs and to their potential utility as sentinels of environmental change in these threatened ecosystems. Here, we document the distribution of diatoms among the diverse habitat types and long environmental gradients represented by the shallow topographic relief of the South Florida, USA, coastline. A total of 592 species were encountered from 38 freshwater, mangrove, and marine locations in the Everglades wetland and Florida Bay during two seasonal collections, with the highest diversity occurring at sites of high salinity and low water column organic carbon concentration (WTOC). Freshwater, mangrove, and estuarine assemblages were compositionally distinct, but seasonal differences were only detected in mangrove and estuarine sites where solute concentration differed greatly between wet and dry seasons. Epiphytic, planktonic, and sediment assemblages were compositionally similar, implying a high degree of mixing along the shallow, tidal, and storm-prone coast. The relationships between diatom taxa and salinity, water total phosphorus (WTP), water total nitrogen (WTN), and WTOC concentrations were determined and incorporated into weighted averaging partial least squares regression models. Salinity was the most influential variable, resulting in a highly predictive model (apparent r² = 0.97, jackknife r² = 0.95) that can be used in the future to infer changes in coastal freshwater delivery or sea-level rise in South Florida and compositionally similar environments. Models predicting WTN (apparent r² = 0.75, jackknife r² = 0.46), WTP (apparent r² = 0.75, jackknife r² = 0.49), and WTOC (apparent r² = 0.79, jackknife r² = 0.57) were also strong, suggesting that diatoms can provide reliable inferences of changes in solute delivery to the coastal ecosystem.
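The weighted averaging idea behind such diatom-based inference models can be illustrated with a short sketch. It covers only plain weighted averaging (each taxon's optimum is the abundance-weighted mean of the environmental variable, and a sample's inferred value is the abundance-weighted mean of the optima); the partial least squares components and any deshrinking step used in WA-PLS are deliberately left out. The function names, abundances and salinity values are hypothetical.

```python
def taxon_optima(abundances, env):
    """abundances[i][k] is the abundance of taxon k at training site i.

    Returns, for each taxon, the abundance-weighted mean of the
    environmental variable over the training sites.
    """
    n_taxa = len(abundances[0])
    optima = []
    for k in range(n_taxa):
        weight = sum(site[k] for site in abundances)
        optima.append(sum(site[k] * x for site, x in zip(abundances, env)) / weight)
    return optima


def infer(sample, optima):
    """Infer the environmental value of a sample as the weighted mean of the optima."""
    return sum(a * u for a, u in zip(sample, optima)) / sum(sample)


# Hypothetical training set: three sites, two taxa, salinity as the variable.
training_abundances = [[8, 0], [2, 2], [0, 8]]
salinity = [0.0, 10.0, 30.0]

optima = taxon_optima(training_abundances, salinity)
inferred = infer([5, 5], optima)  # a new sample with equal abundance of both taxa
print(optima, inferred)
```

A jackknife r², as reported in the abstract, would be obtained by repeating this fit with each training site left out in turn and comparing the inferred and observed values.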
Abstract:
The spatial and temporal distribution of planktonic, sediment-associated and epiphytic diatoms among 58 sites in Biscayne Bay, Florida was examined in order to identify diatom taxa indicative of different salinity and water quality conditions, geographic locations and habitat types. Assessments were made in contrasting wet and dry seasons in order to develop robust assessment models for salinity and water quality for this region. We found that diatom assemblages differed between nearshore and offshore locations, especially during the wet season when salinity and nutrient gradients were steepest. In the dry season, habitat structure was the primary determinant of diatom assemblage composition. Among a suite of physicochemical variables, water depth and sediment total phosphorus (STP) were most strongly associated with diatom assemblage composition in the dry season, while salinity and water total phosphorus (TP) were more important in the wet season. We used indicator species analysis (ISA) to identify taxa that were most abundant and frequent at nearshore and offshore locations, in planktonic, epiphytic and benthic habitats and in contrasting salinity and water quality regimes. Because surface water concentrations of salts, total phosphorus, nitrogen (TN) and organic carbon (TOC) are partly controlled by water management in this region, diatom-based models were produced to infer these variables in modern and retrospective assessments of management-driven changes. Weighted averaging (WA) and weighted averaging partial least squares (WA-PLS) regressions produced reliable estimates of salinity, TP, TN and TOC from diatoms (r² = 0.92, 0.77, 0.77 and 0.71, respectively). Because of their sensitivity to salinity, nutrient and TOC concentrations, diatom assemblages should be useful in developing protective nutrient criteria for estuaries and coastal waters of Florida.
Abstract:
Quantitative Structure-Activity Relationship (QSAR) has been applied extensively in predicting the toxicity of Disinfection By-Products (DBPs) in drinking water. Among many toxicological properties, acute and chronic toxicities of DBPs have been widely used in health risk assessment of DBPs. These toxicities are correlated with molecular properties, which are usually correlated with molecular descriptors. The primary goals of this thesis are: (1) to investigate the effects of molecular descriptors (e.g., chlorine number) on molecular properties such as the energy of the lowest unoccupied molecular orbital (ELUMO) via QSAR modelling and analysis; (2) to validate the models by using internal and external cross-validation techniques; (3) to quantify the model uncertainties through Taylor and Monte Carlo Simulation. QSAR analysis is one of the most important ways to predict molecular properties such as ELUMO. In this study, the number of chlorine atoms (NCl) and the number of carbon atoms (NC), as well as the energy of the highest occupied molecular orbital (EHOMO), are used as molecular descriptors. There are typically three approaches used in QSAR model development: (1) Linear or Multi-linear Regression (MLR); (2) Partial Least Squares (PLS); and (3) Principal Component Regression (PCR). In QSAR analysis, a critical step is model validation after QSAR models are established and before applying them to toxicity prediction. The DBPs to be studied include five chemical classes: chlorinated alkanes, alkenes, and aromatics. In addition, validated QSARs are developed to describe the toxicity of selected groups (i.e., chloro-alkane and aromatic compounds with a nitro- or cyano group) of DBP chemicals to three types of organisms (e.g., Fish, T. pyriformis, and P. phosphoreum) based on experimental toxicity data from the literature.
The results show that: (1) QSAR models to predict molecular properties built by MLR, PLS or PCR can be used either to select valid data points or to eliminate outliers; (2) the Leave-One-Out Cross-Validation procedure by itself is not enough to give a reliable representation of the predictive ability of the QSAR models; however, Leave-Many-Out/K-fold cross-validation and external validation can be applied together to achieve more reliable results; (3) ELUMO is shown to correlate highly with NCl for several classes of DBPs; and (4) according to uncertainty analysis using the Taylor method, the uncertainty of the QSAR models is contributed mostly by NCl for all DBP classes.
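The contrast between Leave-One-Out and Leave-Many-Out/K-fold cross-validation drawn above can be sketched for a one-descriptor linear model. The code below computes a cross-validated Q² (one minus the ratio of the prediction error sum of squares to the total sum of squares) for any partition of the samples into held-out folds. It is only an illustration under simple assumptions: the helper names are hypothetical, the descriptor and response values are toy numbers rather than data from the thesis, and external validation on an independent test set is not shown.

```python
def fit_line(x, y):
    """Return (intercept, slope) of a least-squares line through the points."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def loo_folds(n):
    """Leave-one-out: every sample is held out on its own."""
    return [[i] for i in range(n)]


def k_folds(n, k):
    """Leave-many-out: k interleaved folds of roughly n / k samples each."""
    return [list(range(start, n, k)) for start in range(k)]


def cross_validated_q2(x, y, folds):
    """Q2 = 1 - PRESS / SS_tot, with each fold predicted by a model fitted without it."""
    press = 0.0
    for held_out in folds:
        train = [i for i in range(len(x)) if i not in held_out]
        b0, b1 = fit_line([x[i] for i in train], [y[i] for i in train])
        press += sum((y[i] - (b0 + b1 * x[i])) ** 2 for i in held_out)
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_tot


# Toy values (hypothetical): one descriptor and a noisy, roughly linear response.
descriptor = [1, 2, 3, 4, 5, 6]
response = [0.9, 0.5, 0.3, -0.2, -0.4, -0.9]

q2_loo = cross_validated_q2(descriptor, response, loo_folds(len(descriptor)))
q2_3fold = cross_validated_q2(descriptor, response, k_folds(len(descriptor), 3))
print(q2_loo, q2_3fold)
```

Reporting both statistics side by side, rather than the leave-one-out value alone, is in the spirit of result (2) above.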
Abstract:
Based on the quantitative analysis of diatom assemblages preserved in 274 surface sediment samples recovered in the Pacific, Atlantic and western Indian sectors of the Southern Ocean, we have defined a new reference database for quantitative estimation of late-middle Pleistocene Antarctic sea ice fields using the transfer function technique. The Detrended Correspondence Analysis (DCA) of the diatom data set points to a unimodal distribution of the diatom assemblages. Canonical Correspondence Analysis (CCA) indicates that winter sea ice (WSI) but also summer sea surface temperature (SSST) represent the most prominent environmental variables that control the spatial species distribution. To test the applicability of transfer functions for sea ice reconstruction in terms of concentration and occurrence probability, we applied four different methods, the Imbrie and Kipp Method (IKM), the Modern Analog Technique (MAT), Weighted Averaging (WA), and Weighted Averaging Partial Least Squares (WAPLS), using logarithm-transformed diatom data and satellite-derived (1981-2010) sea ice data as a reference. The best performance for IKM results was obtained using a subset of 172 samples with 28 diatom taxa/taxa groups, quadratic regression and a three-factor model (IKM-D172/28/3q), resulting in root mean square errors of prediction (RMSEP) of 7.27% and 11.4% for WSI and summer sea ice (SSI) concentration, respectively. MAT estimates were calculated with different numbers of analogs (4, 6) using a 274-sample/28-taxa reference data set (MAT-D274/28/4an, -6an), resulting in RMSEPs ranging from 5.52% (4an) to 5.91% (6an) for WSI as well as 8.93% (4an) to 9.05% (6an) for SSI. WA and WAPLS performed less well with the D274 data set compared to MAT, achieving WSI concentration RMSEPs of 9.91% with WA and 11.29% with WAPLS, which recommends the use of IKM and MAT.
The application of IKM and MAT to surface sediment data revealed strong relations to the satellite-derived winter and summer sea ice field. Sea ice reconstructions performed on an Atlantic and a Pacific Southern Ocean sediment core, both documenting sea ice variability over the past 150,000 years (MIS 1 - MIS 6), resulted in similar glacial/interglacial trends of IKM and MAT-based sea-ice estimates. On average, however, IKM estimates display smaller WSI and slightly higher SSI concentration and probability at lower variability in comparison with MAT. This pattern is a result of different estimation techniques: IKM integrates WSI and SSI signals in one single factor assemblage, whereas MAT selects specific single samples, thus keeping close to the original diatom database and its included variability. In contrast to the estimation of WSI, reconstructions of past SSI variability remain weaker. Combined with diatom-based estimates, the abundance and flux pattern of biogenic opal represents an additional indication for the WSI and SSI extent.
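The Modern Analog Technique lends itself to a compact illustration: a fossil sample is compared with every reference sample, and the environmental value is estimated from the closest analogs. The sketch below is a simplified reading of that idea and not the study's implementation. It assumes the squared chord distance as the dissimilarity measure and an unweighted mean over the k nearest analogs, whereas the abstract states only that logarithm-transformed data and 4 or 6 analogs were used. All names and numbers are hypothetical.

```python
import math


def squared_chord(p, q):
    """Squared chord distance between two assemblages given as proportions."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))


def mat_estimate(sample, ref_assemblages, ref_values, k):
    """Average the environmental values of the k reference samples closest to the sample."""
    ranked = sorted(
        range(len(ref_assemblages)),
        key=lambda i: squared_chord(sample, ref_assemblages[i]),
    )
    nearest = ranked[:k]
    return sum(ref_values[i] for i in nearest) / k


# Hypothetical reference set: proportions of two taxa and winter sea ice concentration (%).
reference_assemblages = [[1.0, 0.0], [0.8, 0.2], [0.5, 0.5], [0.0, 1.0]]
reference_sea_ice = [90.0, 70.0, 40.0, 0.0]

# The two closest analogs of this sample are the first two reference samples.
estimate = mat_estimate([0.9, 0.1], reference_assemblages, reference_sea_ice, k=2)
print(estimate)
```

Because the estimate is built directly from selected reference samples, it stays close to the variability of the reference database, which matches the contrast with IKM described above.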
Abstract:
Biodiesel is a renewable fuel derived from vegetable oils or animal fats, which can be a total or partial substitute for diesel. In 2005, this fuel was introduced into the Brazilian energy matrix through Law 11.097, which determines the percentage of biodiesel added to diesel oil as well as the monitoring of the insertion of this fuel in the market. The National Agency of Petroleum, Natural Gas and Biofuels (ANP) establishes the obligation of adding 7% (v/v) of biodiesel to the diesel commercialized in the country, making analytical control of this content crucial. Therefore, in this study, methodologies based on Mid Infrared Spectroscopy (MIR) and Multivariate Calibration by Partial Least Squares (PLS) were developed and validated to quantify the content of methyl and ethyl biodiesels from cotton and jatropha in binary blends with diesel over the concentration range from 1.00 to 30.00% (v/v), since this is the range specified in standard ABNT NBR 15568. The biodiesels were produced by two routes, using ethanol or methanol, and evaluated according to the parameters oxidative stability, water content, kinematic viscosity and density, presenting results in accordance with ANP Resolution No. 45/2014. The PLS models built were validated on the basis of ASTM E1655-05 for Infrared Spectroscopy and Multivariate Calibration and ABNT NBR 15568, with satisfactory results: RMSEP (Root Mean Square Error of Prediction) values below 0.08% (<0.1%), correlation coefficients (R) above 0.9997 and the absence of systematic error (bias). Therefore, the methodologies developed can be a promising alternative in the quality control of this fuel.
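The validation figures quoted above (RMSEP, correlation coefficient and bias) are simple to compute once reference and predicted concentrations are available. The sketch below shows the usual definitions, offered as an illustration only: it does not reproduce the specific statistical tests prescribed by ASTM E1655-05, and the function names and concentration values are hypothetical.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def bias(reference, predicted):
    """Mean signed error; a value near zero suggests no systematic error."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)


def correlation(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mr, mp = sum(reference) / n, sum(predicted) / n
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


# Hypothetical validation set: biodiesel content in % (v/v).
reference_content = [1.0, 5.0, 10.0, 20.0]
predicted_content = [1.1, 4.9, 10.1, 19.9]

print(rmsep(reference_content, predicted_content))
print(bias(reference_content, predicted_content))
print(correlation(reference_content, predicted_content))
```

A low RMSEP together with a bias close to zero indicates that the prediction errors are both small and not systematically shifted in one direction.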
Abstract:
Motivated by environmental protection concerns, monitoring the flue gas of thermal power plants is now often mandatory due to the need to ensure that emission levels stay within safe limits. Optical-based gas sensing systems are increasingly employed for this purpose, with regression techniques used to relate gas optical absorption spectra to the concentrations of specific gas components of interest (NOx, SO2, etc.). Accurately predicting gas concentrations from absorption spectra remains a challenging problem due to the presence of nonlinearities in the relationships and the high-dimensional and correlated nature of the spectral data. This article proposes a generalized fuzzy linguistic model (GFLM) to address this challenge. The GFLM is made up of a series of “If-Then” fuzzy rules. The absorption spectra are input variables in the rule antecedent. The rule consequent is a general nonlinear polynomial function of the absorption spectra. Model parameters are estimated using least squares and gradient descent optimization algorithms. The performance of GFLM is compared with other traditional prediction models, such as partial least squares, support vector machines, multilayer perceptron neural networks and radial basis function networks, for two real flue gas spectral datasets: one from a coal-fired power plant and one from a gas-fired power plant. The experimental results show that the generalized fuzzy linguistic model has good predictive ability, and is competitive with alternative approaches, while having the added advantage of providing an interpretable model.
Abstract:
This thesis develops bootstrap methods for factor models, which have been widely used to generate forecasts since the pioneering article of Stock and Watson (2002) on diffusion indices. These models allow a large number of macroeconomic and financial variables to be included as predictors, a useful feature for incorporating the diverse information available to economic agents. My thesis therefore proposes econometric tools that improve inference in factor models that use latent factors extracted from a large panel of observed predictors. It is divided into three complementary chapters, the first two written in collaboration with Sílvia Gonçalves and Benoit Perron. In the first chapter, we study how bootstrap methods can be used for inference in forecasting models for a horizon of h periods ahead. To this end, the chapter examines bootstrap inference in a factor-augmented regression setting where the errors may be autocorrelated. It generalizes the results of Gonçalves and Perron (2014) and proposes and justifies two residual-based approaches: the block wild bootstrap and the dependent wild bootstrap. Our simulations show improved coverage rates for the confidence intervals of the estimated coefficients when these approaches are used, compared with asymptotic theory and the wild bootstrap, in the presence of serial correlation in the regression errors. The second chapter proposes bootstrap methods for constructing prediction intervals that relax the assumption of normally distributed innovations. We propose bootstrap prediction intervals for an observation h periods ahead and for its conditional mean. We assume that these forecasts are made using a set of factors extracted from a large panel of variables.
Because we treat these factors as latent, our forecasts depend on both the estimated factors and the estimated regression coefficients. Under regularity conditions, Bai and Ng (2006) proposed the construction of asymptotic intervals under the assumption of Gaussian innovations. The bootstrap allows us to relax this assumption and to construct prediction intervals that are valid under more general assumptions. Moreover, even under Gaussianity, the bootstrap yields more accurate intervals when the cross-sectional dimension is relatively small, because it accounts for the bias of the ordinary least squares estimator, as shown in a recent study by Gonçalves and Perron (2014). In the third chapter, we suggest consistent selection procedures for factor-augmented regressions in finite samples. We first show that the usual cross-validation method is inconsistent, but that its generalization, leave-d-out cross-validation, selects the smallest set of estimated factors for the space spanned by the true factors. The second criterion, whose validity we also establish, generalizes the bootstrap approximation of Shao (1996) to factor-augmented regressions. The simulations show an improved probability of parsimoniously selecting the estimated factors compared with the available selection methods. The empirical application revisits the relationship between macroeconomic and financial factors and excess returns on the U.S. stock market. Among the factors estimated from a large panel of U.S. macroeconomic and financial data, the factors strongly correlated with interest rate spreads and the Fama-French factors have good predictive power for excess returns.
Abstract:
Yield loss in crops is often associated with plant disease or external factors such as environment, water supply and nutrient availability. Improper agricultural practices can also introduce risks into the equation. Herbicide drift can be a combination of improper practices and environmental conditions which can create a potential yield loss. As traditional assessment of plant damage is often imprecise and time consuming, the ability of remote and proximal sensing techniques to monitor various bio-chemical alterations in the plant may offer a faster, non-destructive and reliable approach to predict yield loss caused by herbicide drift. This paper examines the prediction capabilities of partial least squares regression (PLS-R) models for estimating yield. Models were constructed with hyperspectral data of a cotton crop sprayed with three simulated doses of the phenoxy herbicide 2,4-D at three different growth stages. Fibre quality, photosynthesis, conductance, and two main hormones, indole acetic acid (IAA) and abscisic acid (ABA) were also analysed. Except for fibre quality and ABA, Spearman correlations have shown that these variables were highly affected by the chemical. Four PLS-R models for predicting yield were developed according to four timings of data collection: 2, 7, 14 and 28 days after the exposure (DAE). As indicated by the model performance, the analysis revealed that 7 DAE was the best time for data collection purposes (RMSEP = 2.6 and R2 = 0.88), followed by 28 DAE (RMSEP = 3.2 and R2 = 0.84). In summary, the results of this study show that it is possible to accurately predict yield after a simulated herbicide drift of 2,4-D on a cotton crop, through the analysis of hyperspectral data, thereby providing a reliable, effective and non-destructive alternative based on the internal response of the cotton leaves.