922 resultados para principal component regression
Resumo:
Agricultural management with chemicals may contaminate the soil with heavy metals. The objective of this study was to apply Principal Component Analysis and geoprocessing techniques to identify the origin of the metals Cu, Fe, Mn, Zn, Ni, Pb, Cr and Cd as potential contaminants of agricultural soils. The study was developed in an area of vineyard cultivation in the State of São Paulo, Brazil. Soil samples were collected and GPS located under different uses and coverings. The metal concentrations in the soils were determined using the DTPA method. The Cu and Zn content was considered high in most of the samples, and was larger in the areas cultivated with vineyards that had been under the application of fungicides for several decades. The concentrations of Cu and Zn were correlated. The geoprocessing techniques and the Principal Component Analysis confirmed the enrichment of the soil with Cu and Zn because of the use and management of the vineyards with chemicals in the preceding decades.
Resumo:
This study aimed to characterize air pollution and the associated carcinogenic risks of polycyclic aromatic hydrocarbon (PAHs) at an urban site, to identify possible emission sources of PAHs using several statistical methodologies, and to analyze the influence of other air pollutants and meteorological variables on PAH concentrations.The air quality and meteorological data were collected in Oporto, the second largest city of Portugal. Eighteen PAHs (the 16 PAHs considered by United States Environment Protection Agency (USEPA) as priority pollutants, dibenzo[a,l]pyrene, and benzo[j]fluoranthene) were collected daily for 24 h in air (gas phase and in particles) during 40 consecutive days in November and December 2008 by constant low-flow samplers and using polytetrafluoroethylene (PTFE) membrane filters for particulate (PM10 and PM2.5 bound) PAHs and pre-cleaned polyurethane foam plugs for gaseous compounds. The other monitored air pollutants were SO2, PM10, NO2, CO, and O3; the meteorological variables were temperature, relative humidity, wind speed, total precipitation, and solar radiation. Benzo[a]pyrene reached a mean concentration of 2.02 ngm−3, surpassing the EU annual limit value. The target carcinogenic risks were equal than the health-based guideline level set by USEPA (10−6) at the studied site, with the cancer risks of eight PAHs reaching senior levels of 9.98×10−7 in PM10 and 1.06×10−6 in air. The applied statistical methods, correlation matrix, cluster analysis, and principal component analysis, were in agreement in the grouping of the PAHs. The groups were formed according to their chemical structure (number of rings), phase distribution, and emission sources. PAH diagnostic ratios were also calculated to evaluate the main emission sources. Diesel vehicular emissions were the major source of PAHs at the studied site. Besides that source, emissions from residential heating and oil refinery were identified to contribute to PAH levels at the respective area. Additionally, principal component regression indicated that SO2, NO2, PM10, CO, and solar radiation had positive correlation with PAHs concentrations, while O3, temperature, relative humidity, and wind speed were negatively correlated.
Resumo:
Water is a limited resource for which demand is growing. Contaminated water from inadequate wastewater treatment provides one of the greatest health challenges as it restricts development and increases poverty in emerging and developing countries. Therefore, the connection between wastewater and human health is linked to access to sanitation and to human waste disposal. Adequate sanitation is expected to create a barrier between disposed human excreta and sources of drinking water. Different approaches to wastewater management are required for different geographical regions and different stages of economic governance depending on the capacity to manage wastewater. Effective wastewater management can contribute to overcome the challenges of water scarcity. Separate collection of human urine at its source is one promising approach that strongly reduces the economic and load demands on wastewater treatment plants (WWTP). Treatment of source-separated urine appears as a sanitation system that is affordable, produces a valuable fertiliser, reduces pollution of water resources and promotes health. However, the technical realisation of urine separation still faces challenges. Biological hydrolysis of urea causes a strong increase of ammonia and pH. Under these conditions ammonia volatilises which can cause odour problems and significant nitrogen losses. The above problems can be avoided by urine stabilisation. Biological nitrification is a suitable process for stabilisation of urine. Urine is a highly concentrated nutrient solution which can lead to strong inhibition effects during bacterial nitrification. This can further lead to process instabilities. The major cause of instability is accumulation of the inhibitory intermediate compound nitrite, which could lead to process breakdown. Enhanced on-line nitrite monitoring can be applied in biological source-separated urine nitrification reactors as a sustainable and efficient way to improve the reactor performance, avoiding reactor failures and eventual loss of biological activity. Spectrophotometry appears as a promising candidate for the development and application of on-line nitrite monitoring. Spectroscopic methods together with chemometrics are presented in this work as a powerful tool for estimation of nitrite concentrations. Principal component regression (PCR) is applied for the estimation of nitrite concentrations using an immersible UV sensor and off-line spectra acquisition. The effect of particles and the effect of saturation, respectively, on the UV absorbance spectra are investigated. The analysis allows to conclude that (i) saturation has a substantial effect on nitrite estimation; (ii) particles appear to have less impact on nitrite estimation. In addition, improper mixing together with instabilities in the urine nitrification process appears to significantly reduce the performance of the estimation model.
Resumo:
The aim of this work is to present a tutorial on Multivariate Calibration, a tool which is nowadays necessary in basically most laboratories but very often misused. The basic concepts of preprocessing, principal component analysis (PCA), principal component regression (PCR) and partial least squares (PLS) are given. The two basic steps on any calibration procedure: model building and validation are fully discussed. The concepts of cross validation (to determine the number of factors to be used in the model), leverage and studentized residuals (to detect outliers) for the validation step are given. The whole calibration procedure is illustrated using spectra recorded for ternary mixtures of 2,4,6 trinitrophenolate, 2,4 dinitrophenolate and 2,5 dinitrophenolate followed by the concentration prediction of these three chemical species during a diffusion experiment through a hydrophobic liquid membrane. MATLAB software is used for numerical calculations. Most of the commands for the analysis are provided in order to allow a non-specialist to follow step by step the analysis.
Resumo:
Two spectrophotometric methods are described for the simultaneous determination of ezetimibe (EZE) and simvastatin (SIM) in pharmaceutical preparations. The obtained data was evaluated by using two different chemometric techniques, Principal Component Regression (PCR) and Partial Least-Squares (PLS-1). In these techniques, the concentration data matrix was prepared by using the mixtures containing these drugs in methanol. The absorbance data matrix corresponding to the concentration data matrix was obtained by the measurements of absorbances in the range of 240 - 300 nm in the intervals with Δλ = 1 nm at 61 wavelengths in their zero order spectra, then, calibration or regression was obtained by using the absorbance data matrix and concentration data matrix for the prediction of the unknown concentrations of EZE and SIM in their mixture. The procedure did not require any separation step. The linear range was found to be 5 - 20 µg mL-1 for EZE and SIM in both methods. The accuracy and precision of the methods were assessed. These methods were successfully applied to a pharmaceutical preparation, tablet; and the results were compared with each other.
Resumo:
Pós-graduação em Química - IQ
Resumo:
Este artigo apresenta uma aplicação do método para determinação espectrofotométrica simultânea dos íons divalentes de cobre, manganês e zinco à análise de medicamento polivitamínico/polimineral. O método usa 4-(2-piridilazo) resorcinol (PAR), calibração multivariada e técnicas de seleção de variáveis e foi otimizado o empregando-se o algoritmo das projeções sucessivas (APS) e o algoritmo genético (AG), para escolha dos comprimentos de onda mais informativos para a análise. Com essas técnicas, foi possível construir modelos de calibração por regressão linear múltipla (RLM-APS e RLM-AG). Os resultados obtidos foram comparados com modelos de regressão em componentes principais (PCR) e nos mínimos quadrados parciais (PLS). Demonstra-se a partir do erro médio quadrático de previsão (RMSEP) que os modelos apresentam desempenhos semelhantes ao prever as concentrações dos três analitos no medicamento. Todavia os modelos RLM são mais simples pois requerem um número muito menor de comprimentos de onda e são mais fáceis de interpretar que os baseados em variáveis latentes.
Resumo:
Over the past few decades, the advantages of the visible-near infra-red (VisNIR) diffuse reflectance spectrometer (DRS) method have enabled prediction of soil organic carbon (SOC). In this study, SOC was predicted using regression models for samples taken from three sites (Gununo, Maybar and Anjeni) in Ethiopia. SOC was characterized in laboratory using conventional wet chemistry and VisNIR-DRS methods. Principal component analysis (PCA), principal component regression (PCR) and partial least square regression (PLS) models were developed using Unscrambler X 10.2. PCA results show that the first two components accounted for a minimum of 96% variation which increased for individual sites and with data treatments. Correlation (r), coefficient of determination (R2) and residual prediction deviation (RPD) were used to rate four models built. PLS model (r, R2, RPD) values for Anjeni were 0.9, 0.9 and 3.6; for Gununo values 0.6, 0.3 and 1.2; for Maybar values 0.6, 0.3 and 0.9, and for the three sites values 0.7, 0.6 and 1.5, respectively. PCR model values (r, R2, RPD) for Anjeni were 0.9, 0.8 and 2.7; for Gununo values 0.5, 0.3 and 1; for Maybar values 0.5, 0.1 and 0.7, and for the three sites values 0.7, 0.5 and 1.2, respectively. Comparison and testing of models shows superior performance of PLS to PCR. Models were rated as very poor (Maybar), poor (Gununo and three sites) and excellent (Anjeni). A robust model, Anjeni, is recommended for prediction of SOC in Ethiopia.
Resumo:
Radiomics is the high-throughput extraction and analysis of quantitative image features. For non-small cell lung cancer (NSCLC) patients, radiomics can be applied to standard of care computed tomography (CT) images to improve tumor diagnosis, staging, and response assessment. The first objective of this work was to show that CT image features extracted from pre-treatment NSCLC tumors could be used to predict tumor shrinkage in response to therapy. This is important since tumor shrinkage is an important cancer treatment endpoint that is correlated with probability of disease progression and overall survival. Accurate prediction of tumor shrinkage could also lead to individually customized treatment plans. To accomplish this objective, 64 stage NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. Quantitative image features were extracted and principal component regression with simulated annealing subset selection was used to predict shrinkage. Cross validation and permutation tests were used to validate the results. The optimal model gave a strong correlation between the observed and predicted shrinkages with . The second objective of this work was to identify sets of NSCLC CT image features that are reproducible, non-redundant, and informative across multiple machines. Feature sets with these qualities are needed for NSCLC radiomics models to be robust to machine variation and spurious correlation. To accomplish this objective, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. For each machine, quantitative image features with concordance correlation coefficient values greater than 0.90 were considered reproducible. Multi-machine reproducible feature sets were created by taking the intersection of individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering. The findings showed that image feature reproducibility and redundancy depended on both the CT machine and the CT image type (average cine 4D-CT imaging vs. end-exhale cine 4D-CT imaging vs. helical inspiratory breath-hold 3D CT). For each image type, a set of cross-machine reproducible, non-redundant, and informative image features was identified. Compared to end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT derived image features showed superior multi-machine reproducibility and are the best candidates for clinical correlation.
Resumo:
Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm(2); 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by Partial Least Square (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produce the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively. Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample representativeness, and minimizing the effect of the presence of contaminants.
Resumo:
OBJECTIVE To analyze the association between concentrations of air pollutants and admissions for respiratory causes in children. METHODS Ecological time series study. Daily figures for hospital admissions of children aged < 6, and daily concentrations of air pollutants (PM10, SO2, NO2, O3 and CO) were analyzed in the Região da Grande Vitória, ES, Southeastern Brazil, from January 2005 to December 2010. For statistical analysis, two techniques were combined: Poisson regression with generalized additive models and principal model component analysis. Those analysis techniques complemented each other and provided more significant estimates in the estimation of relative risk. The models were adjusted for temporal trend, seasonality, day of the week, meteorological factors and autocorrelation. In the final adjustment of the model, it was necessary to include models of the Autoregressive Moving Average Models (p, q) type in the residuals in order to eliminate the autocorrelation structures present in the components. RESULTS For every 10:49 μg/m3 increase (interquartile range) in levels of the pollutant PM10 there was a 3.0% increase in the relative risk estimated using the generalized additive model analysis of main components-seasonal autoregressive – while in the usual generalized additive model, the estimate was 2.0%. CONCLUSIONS Compared to the usual generalized additive model, in general, the proposed aspect of generalized additive model − principal component analysis, showed better results in estimating relative risk and quality of fit.
Resumo:
This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were evaluated. The crude protein and lipid contents ranged from 97.49 and 122.72 g kg-1 of food, respectively. The polyunsaturated/saturated, and n-6:n-3 fatty acid ratios ranged from 2.82 and 2.59:1, respectively. Granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage; probably due to the low water activity of the formulation, which contributed to inhibit microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the product physical stabilisation was verified by principal component analysis. The use of the affective test acceptance and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.
Resumo:
The problem of regression under Gaussian assumptions is treated generally. The relationship between Bayesian prediction, regularization and smoothing is elucidated. The ideal regression is the posterior mean and its computation scales as O(n3), where n is the sample size. We show that the optimal m-dimensional linear model under a given prior is spanned by the first m eigenfunctions of a covariance operator, which is a trace-class operator. This is an infinite dimensional analogue of principal component analysis. The importance of Hilbert space methods to practical statistics is also discussed.
Resumo:
Dulce de leche samples available in the Brazilian market were submitted to sensory profiling by quantitative descriptive analysis and acceptance test, as well sensory evaluation using the just-about-right scale and purchase intent. External preference mapping and the ideal sensory characteristics of dulce de leche were determined. The results were also evaluated by principal component analysis, hierarchical cluster analysis, partial least squares regression, artificial neural networks, and logistic regression. Overall, significant product acceptance was related to intermediate scores of the sensory attributes in the descriptive test, and this trend was observed even after consumer segmentation. The results obtained by sensometric techniques showed that optimizing an ideal dulce de leche from the sensory standpoint is a multidimensional process, with necessary adjustments on the appearance, aroma, taste, and texture attributes of the product for better consumer acceptance and purchase. The optimum dulce de leche was characterized by high scores for the attributes sweet taste, caramel taste, brightness, color, and caramel aroma in accordance with the preference mapping findings. In industrial terms, this means changing the parameters used in the thermal treatment and quantitative changes in the ingredients used in formulations.