918 results for least squares matching
Abstract:
This article analyses the impact of innovation expenditure and of intrasectoral and intersectoral externalities on productivity in Spanish firms. While there is an extensive literature on the relationship between innovation and productivity, far fewer studies examine the importance of sectoral externalities, especially with a focus on Spain. One novelty of the study, which covers the industrial and service sectors, is that we jointly consider the technology level of the sector in which the firm operates and the firm's size. The database used is the Technological Innovation Panel (PITEC), which includes 12,813 firms for the year 2008 and has been little used in this type of study. The estimation method is Iteratively Reweighted Least Squares (IRLS), which is very useful for obtaining robust estimates in the presence of outliers. The results confirm that innovation has a positive effect on productivity, especially in high-tech and large firms. The impact of externalities is more heterogeneous: while intrasectoral externalities have a positive and significant effect, especially in low-tech firms regardless of size, intersectoral externalities have a more ambiguous effect, being clearly significant for advanced industries, in which size has a positive effect.
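To make the robustness claim concrete, here is a minimal IRLS sketch on synthetic data. It is an illustration only: the Huber weight function, the tuning constant 1.345, the MAD scale estimate and all variable names are assumptions, since the abstract does not say which weighting scheme the study used.

```python
import numpy as np

def irls_huber(x, y, delta=1.345, n_iter=50, tol=1e-8):
    """Robust straight-line/linear fit via IRLS with Huber weights (illustrative)."""
    X1 = np.column_stack([np.ones(len(y)), x])       # add an intercept column
    beta = np.linalg.lstsq(X1, y, rcond=None)[0]     # start from ordinary least squares
    for _ in range(n_iter):
        r = y - X1 @ beta                            # current residuals
        # Robust scale: median absolute deviation, rescaled for normal errors
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, delta/u for large ones
        w = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Synthetic example: a straight line with 10% gross outliers
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 2.0 + 0.5 * x + rng.normal(0, 0.1, size=100)
y[::10] += 15.0                                      # contaminate every tenth point

beta_ols = np.linalg.lstsq(np.column_stack([np.ones(100), x]), y, rcond=None)[0]
beta_irls = irls_huber(x, y)
print("OLS :", beta_ols)
print("IRLS:", beta_irls)
```

Each pass solves a weighted least squares problem whose weights shrink for observations with large residuals, so the outliers pull the fit far less than they do under ordinary least squares.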
Abstract:
The determination of zirconium-hafnium mixtures is one of the most critical problems in analytical chemistry, on account of the close similarity of their chemical properties. The spectrophotometric determination proposed by Yagodin et al. has found few practical applications because of significant spectral interference in the 200-220 nm region. In this work we propose the use of a multivariate calibration method, partial least squares (PLS), for the colorimetric determination of these mixtures. Using PLS and 16 calibration mixtures, we obtained a model that permits the determination of zirconium and hafnium with an accuracy of about 1-2% and 10-20%, respectively. With conventional univariate calibration, the inaccuracy of the determination is about 10-25% for zirconium and above 57% for hafnium.
Abstract:
The aim of this work is to present a tutorial on multivariate calibration, a tool that is nowadays necessary in most laboratories but is very often misused. The basic concepts of preprocessing, principal component analysis (PCA), principal component regression (PCR) and partial least squares (PLS) are given. The two basic steps of any calibration procedure, model building and validation, are fully discussed. For the validation step, the concepts of cross-validation (to determine the number of factors to be used in the model) and of leverage and studentized residuals (to detect outliers) are given. The whole calibration procedure is illustrated using spectra recorded for ternary mixtures of 2,4,6-trinitrophenolate, 2,4-dinitrophenolate and 2,5-dinitrophenolate, followed by the concentration prediction of these three chemical species during a diffusion experiment through a hydrophobic liquid membrane. MATLAB software is used for the numerical calculations. Most of the commands for the analysis are provided so that a non-specialist can follow the analysis step by step.
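The model-building and cross-validation steps described above can be sketched as follows. This is not the tutorial's MATLAB code: it is a Python/NumPy illustration of PLS1 (NIPALS) with leave-one-out cross-validation, run on synthetic three-component "spectra". The Gaussian band shapes, noise level and all function names are assumptions made for the example.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """PLS1 regression by NIPALS; returns regression vector b and intercept b0."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # mean-centering (preprocessing)
    W, P, q = [], [], []
    for _ in range(n_comp):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)                 # weight vector
        t = Xr @ w                             # scores
        tt = t @ t
        p = Xr.T @ t / tt                      # X loadings
        c = (yr @ t) / tt                      # y loading
        Xr = Xr - np.outer(t, p)               # deflate X
        yr = yr - c * t                        # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)        # b = W (P'W)^-1 q
    return b, y_mean - x_mean @ b

def loo_press(X, y, max_comp):
    """Leave-one-out PRESS for models with 1..max_comp factors."""
    n = len(y)
    press = np.zeros(max_comp)
    for k in range(1, max_comp + 1):
        for i in range(n):
            keep = np.arange(n) != i           # leave sample i out
            b, b0 = pls1_fit(X[keep], y[keep], k)
            press[k - 1] += (y[i] - (X[i] @ b + b0)) ** 2
    return press

# Synthetic ternary mixtures: three overlapping Gaussian bands plus noise
rng = np.random.default_rng(1)
channels = np.arange(50)
pure = np.array([np.exp(-0.5 * ((channels - c) / 6.0) ** 2) for c in (15, 25, 35)])
conc = rng.uniform(0.0, 1.0, size=(20, 3))
spectra = conc @ pure + rng.normal(0.0, 0.001, size=(20, 50))
y = conc[:, 0]                                 # calibrate for the first species

press = loo_press(spectra, y, max_comp=5)
best_k = int(np.argmin(press)) + 1
print("PRESS per number of factors:", press)
print("factors at minimum PRESS:", best_k)
```

Because three species contribute to the synthetic spectra, one would expect PRESS to fall sharply once three factors are included and to change little afterwards. In practice the smallest number of factors near the PRESS minimum is usually preferred.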
Abstract:
A genetic algorithm was used for variable selection in the simultaneous determination of mixtures of glucose, maltose and fructose by mid-infrared spectroscopy. Different models, using partial least squares (PLS) and multiple linear regression (MLR) with and without data pre-processing, were used. The results showed that a simpler model (multiple linear regression with variable selection by genetic algorithm) produces results comparable to those of more complex methods (partial least squares). The relative error obtained for the best model was around 3% for the sugar determination, which is acceptable for this kind of determination.
Abstract:
Learning of preference relations has recently received significant attention in the machine learning community. It is closely related to classification and regression analysis and can be reduced to these tasks. However, preference learning involves predicting an ordering of the data points rather than a single numerical value, as in regression, or a class label, as in classification. Therefore, studying preference relations within a separate framework not only facilitates a better theoretical understanding of the problem, but also motivates the development of efficient algorithms for the task. Preference learning has many applications in domains such as information retrieval, bioinformatics and natural language processing. For example, algorithms that learn to rank are frequently used in search engines for ordering the documents retrieved by a query. Preference learning methods have also been applied to collaborative filtering problems for predicting individual customer choices from vast amounts of user-generated feedback. In this thesis we propose several algorithms for learning preference relations. These algorithms stem from the well-founded and robust class of regularized least-squares methods and have many attractive computational properties. In order to improve the performance of our methods, we introduce several non-linear kernel functions. Thus, the contribution of this thesis is twofold: kernel functions for structured data, which are used to take advantage of various non-vectorial data representations, and preference learning algorithms suitable for different tasks, namely efficient learning of preference relations, learning with large amounts of training data, and semi-supervised preference learning. The proposed kernel-based algorithms and kernels are applied to the parse ranking task in natural language processing, document ranking in information retrieval, and remote homology detection in bioinformatics.
Training of kernel-based ranking algorithms can be infeasible when the training set is large. This problem is addressed by proposing a preference learning algorithm whose computational complexity scales linearly with the number of training data points. We also introduce a sparse approximation of the algorithm that can be efficiently trained with large amounts of data. For situations in which only a small amount of labeled data but a large amount of unlabeled data is available, we propose a co-regularized preference learning algorithm. To conclude, the methods presented in this thesis address not only the efficient training of the algorithms but also fast regularization parameter selection, multiple output prediction, and cross-validation. Furthermore, the proposed algorithms lead to notably better performance in many of the preference learning tasks considered.
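As a rough illustration of how a regularized least-squares criterion can be turned into a ranking method, the sketch below fits a linear scoring function by penalizing squared errors on all pairwise score differences. It is a simplified stand-in and not the thesis's algorithms: the linear model, the all-pairs loss, the synthetic data and every name in the code are assumptions. The closed form uses the identity that the sum over pairs of (a_i - a_j)² equals aᵀLa with L = nI - 11ᵀ, which lets the normal equations be assembled in time linear in the number of training points.

```python
import numpy as np

def fit_pairwise_rls(X, y, lam=1.0):
    """Minimize sum_{i<j} ((y_i - y_j) - w.(x_i - x_j))^2 + lam * ||w||^2.

    With L = n*I - 1 1^T the solution is w = (X^T L X + lam I)^-1 X^T L y.
    L is never formed explicitly, so the cost is O(n d^2) for n points.
    """
    n, d = X.shape
    sx = X.sum(axis=0)                               # X^T 1
    A = n * (X.T @ X) - np.outer(sx, sx) + lam * np.eye(d)
    b = n * (X.T @ y) - sx * y.sum()
    return np.linalg.solve(A, b)

def pairwise_accuracy(scores, y):
    """Fraction of pairs with different labels that the scores order correctly."""
    ds = scores[:, None] - scores[None, :]
    dy = y[:, None] - y[None, :]
    mask = dy != 0
    return float(np.mean(np.sign(ds[mask]) == np.sign(dy[mask])))

# Synthetic utilities; the constant offset does not matter for ranking
rng = np.random.default_rng(2)
w_true = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
X = rng.normal(size=(200, 5))
y = X @ w_true + 100.0 + rng.normal(0, 0.1, size=200)

w = fit_pairwise_rls(X, y, lam=1.0)
acc = pairwise_accuracy(X @ w, y)
print("estimated w:", np.round(w, 3))
print("pairwise accuracy:", acc)
```

The pairwise loss ignores any constant shift in the labels, which is why no intercept is fitted: only the induced ordering matters.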
Abstract:
The objective of this work was to accomplish the simultaneous determination of several chemical elements by Energy Dispersive X-ray Fluorescence (EDXRF) spectroscopy through multivariate calibration in several sample types. The multivariate calibration models were a Back Propagation neural network, a Levenberg-Marquardt neural network, a Radial Basis Function neural network, fuzzy modeling and Partial Least Squares Regression. The samples were soil standards, plant standards, and mixtures of lead and sulfur salts diluted in silica. The smallest root mean square (RMS) errors were obtained with Back Propagation neural networks, which handled the main EDXRF problems better than the other models.
QSPR study of partition coefficients: quantum-mechanical descriptors and multivariate analysis
Abstract:
Quantum chemistry and multivariate analysis were used to estimate the partition coefficients between n-octanol and water for a series of 188 compounds, with q² values of up to 0.86 in the cross-validation test. The quantum-mechanical descriptors were obtained from ab initio calculations that include solvation effects through the Polarizable Continuum Model. Two different Hartree-Fock basis sets were used, together with two different ways of simulating solvent cavity formation. The results for each case were analyzed, and each proposed methodology is recommended for a particular case.
Abstract:
A model based on chemical structure was developed for the accurate prediction of the octanol/water partition coefficient (K_OW) of polychlorinated biphenyls (PCBs), which are molecules of environmental interest. Partial least squares (PLS) was used to build the regression model. Topological indices were used as molecular descriptors. Variable selection was performed by Hierarchical Cluster Analysis (HCA). In the modeling process, the experimental K_OW values measured for 30 PCBs by thin-layer chromatography - retention time (TLC-RT) were used. The developed model (Q² = 0.990 and r² = 0.994) was used to estimate the log K_OW values for the 179 PCB congeners whose K_OW data have not yet been measured by the TLC-RT method. The results showed that topological indices can be very useful for predicting K_OW.
Abstract:
Dilutions of methyl methacrylate ranging between 1 and 50 ppm were obtained from a stock solution of 1 ml of monomer in 100 ml of deionised water, and were analyzed with a UV-visible absorption spectrophotometer. Absorbance values were used to develop a calibration model based on PLS, with the aim of determining the concentrations of new samples. Six latent variables were used, and the standard errors of calibration and prediction were 0.048 ml/100 ml and 0.058 ml/100 ml, respectively. The calibration model was successfully used to calculate the concentration of monomer released into water in which complete dentures were kept for one hour after polymerization.
Abstract:
In this work, a partial least squares regression routine was used to develop a multivariate calibration model to predict the chemical oxygen demand (COD) in substrates of environmental relevance (paper effluents and landfill leachates) from UV-Vis spectral data. The calibration models permit the fast determination of the COD with typical relative errors below 10% with respect to the conventional methodology.
Abstract:
A simple method was proposed for the determination of paracetamol and ibuprofen in tablets, based on UV measurements and partial least squares. The procedure was performed at pH 10.5, in the concentration ranges 3.00-15.00 µg ml⁻¹ (paracetamol) and 2.40-12.00 µg ml⁻¹ (ibuprofen). The model was able to predict paracetamol and ibuprofen in synthetic mixtures with root mean square errors of prediction of 0.12 and 0.17 µg ml⁻¹, respectively. Figures of merit (sensitivity, limit of detection and precision) were also estimated. The results achieved for the determination of these drugs in pharmaceutical formulations were in agreement with label claims and were verified by HPLC.
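For readers unfamiliar with the figure quoted above, the root mean square error of prediction (RMSEP) is the square root of the mean squared difference between reference and predicted concentrations over a validation set. A minimal sketch follows. The numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a validation set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical reference and predicted concentrations (µg/ml), for illustration only
reference = [3.0, 6.0, 9.0, 12.0]
predicted = [3.1, 5.9, 9.2, 11.8]
value = rmsep(reference, predicted)
print(round(value, 4))  # 0.1581
```

RMSEP carries the units of the measured quantity, so it can be compared directly with the calibration range.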
Abstract:
Least-squares support vector machines (LS-SVM) were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared spectroscopy. Excellent models were built using LS-SVM, as judged by their R², RMSECV and RMSEP values. LS-SVM showed superior performance to PLSR for quantifying starch, whey and sucrose in powdered milk samples. This study shows that the amount of one or two common adulterants can be determined precisely and simultaneously in powdered milk samples using LS-SVM and NIR spectra.
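The following sketch shows the standard LS-SVM regression formulation, in which training reduces to solving one linear system instead of a quadratic program. It is a generic illustration on a synthetic one-dimensional curve, not the study's calibration on NIR spectra. The RBF kernel, the values of `gamma` and `sigma`, and all names are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / (2 * sigma**2))

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """Solve the LS-SVM system  [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(M, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                       # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    """Prediction f(x) = sum_i alpha_i k(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Synthetic nonlinear calibration curve
rng = np.random.default_rng(3)
X_train = np.linspace(0, 2 * np.pi, 40)[:, None]
y_train = np.sin(X_train[:, 0]) + rng.normal(0, 0.05, size=40)

b, alpha = lssvm_fit(X_train, y_train, gamma=100.0, sigma=1.0)
X_test = np.linspace(0, 2 * np.pi, 200)[:, None]
y_hat = lssvm_predict(X_train, b, alpha, X_test, sigma=1.0)
rmse = float(np.sqrt(np.mean((y_hat - np.sin(X_test[:, 0])) ** 2)))
print("RMSE on test grid:", rmse)
```

In a real calibration, `gamma` and `sigma` would normally be tuned by cross-validation, which is where figures such as RMSECV come from.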
Abstract:
EPR users often face the problem of extracting information from frequently low-resolution and complex EPR spectra. Simulation programs that provide a series of parameters, characteristic of the investigated system, have been used to achieve this goal. This work describes the general aspects of one of those programs, the NLSL program, used to fit EPR spectra applying a nonlinear least squares method. Several motion regimes of the probes are included in this computational tool, covering a broad range of spectral changes. The meanings of the different parameters and rotational diffusion models are discussed. The anisotropic case is also treated by including an orienting potential and order parameters. Some examples are presented in order to show its applicability in different systems.
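To illustrate what fitting a spectrum by nonlinear least squares involves, the sketch below adjusts the amplitude, center and width of a single Lorentzian line using a damped Gauss-Newton (Levenberg-Marquardt style) iteration. It is a toy stand-in and not the NLSL program: NLSL fits full EPR line-shape simulations with motional models, whereas the single-line model, the damping schedule, the synthetic data and all names here are assumptions.

```python
import numpy as np

def lorentzian(x, amp, center, width):
    """Lorentzian absorption line with peak height amp and half-width width."""
    return amp * width**2 / ((x - center) ** 2 + width**2)

def jacobian(x, amp, center, width):
    """Partial derivatives of the model with respect to (amp, center, width)."""
    d = (x - center) ** 2 + width**2
    d_amp = width**2 / d
    d_center = 2 * amp * width**2 * (x - center) / d**2
    d_width = 2 * amp * width * (x - center) ** 2 / d**2
    return np.column_stack([d_amp, d_center, d_width])

def fit_lm(x, y, p0, n_iter=100, lam=1e-3):
    """Damped Gauss-Newton fit; the damping lam adapts to step success."""
    p = np.asarray(p0, dtype=float)
    r = y - lorentzian(x, *p)
    cost = r @ r
    for _ in range(n_iter):
        J = jacobian(x, *p)
        A = J.T @ J
        step = np.linalg.solve(A + lam * np.diag(np.diag(A)), J.T @ r)
        p_try = p + step
        r_try = y - lorentzian(x, *p_try)
        cost_try = r_try @ r_try
        if cost_try < cost:                    # accept the step, relax damping
            p, r, cost = p_try, r_try, cost_try
            lam *= 0.3
        else:                                  # reject the step, increase damping
            lam *= 10.0
        if np.linalg.norm(step) < 1e-10:
            break
    return p

# Synthetic noisy line with known parameters
rng = np.random.default_rng(4)
x = np.linspace(-10, 10, 201)
true_params = (2.0, 1.0, 1.5)
y = lorentzian(x, *true_params) + rng.normal(0, 0.01, size=x.size)

fitted = fit_lm(x, y, p0=(1.0, 0.0, 1.0))
print("fitted (amp, center, width):", np.round(fitted, 3))
```

As with any local optimizer, the result depends on the starting guess, so a fit of this kind is normally started from parameters that already roughly reproduce the spectrum.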
Abstract:
In this work, artificial neural networks (ANN) and partial least squares (PLS) regression were applied to UV spectral data for the quantitative determination of thiamine hydrochloride (VB1), riboflavin phosphate (VB2), pyridoxine hydrochloride (VB6) and nicotinamide (VPP) in pharmaceutical samples. For calibration purposes, commercial samples in 0.2 mol L⁻¹ acetate buffer (pH 4.0) were employed as standards. The concentration ranges used in the calibration step were 0.1-7.5 mg L⁻¹ for VB1, 0.1-3.0 mg L⁻¹ for VB2, 0.1-3.0 mg L⁻¹ for VB6 and 0.4-30.0 mg L⁻¹ for VPP. The results show that both methods can be successfully applied to these determinations. Similar error values were obtained with the neural network and PLS methods. The proposed methodology is simple and rapid, and can easily be used in quality control laboratories.
Determination of sulfamethoxazole and trimethoprim mixtures by multivariate electronic spectroscopy
Abstract:
In this work a multivariate spectroscopic methodology is proposed for the quantitative determination of sulfamethoxazole and trimethoprim in pharmaceutical combinations. The multivariate model was developed by partial least-squares regression, using twenty synthetic mixtures and the spectral region between 190 and 350 nm. In the validation stage, which involved the analysis of five synthetic mixtures, prediction errors lower than 3% were observed. The predictive capacity of the multivariate models is seriously affected by spectral changes induced by pH variations, a fact that is highly significant in the analysis of real samples (pharmaceuticals) that contain chemical additives.