950 results for cross validation


Relevance:

60.00% 60.00%

Publisher:

Abstract:

The composition and abundance of algal pigments provide information on phytoplankton community characteristics such as photoacclimation, overall biomass and taxonomic composition. In particular, pigments play a major role in photoprotection and in the light-driven part of photosynthesis. Most phytoplankton pigments can be measured by high-performance liquid chromatography (HPLC) techniques applied to filtered water samples. This method, as well as other laboratory analyses, is time consuming and therefore limits the number of samples that can be processed in a given time. In order to receive information on phytoplankton pigment composition with a higher temporal and spatial resolution, we have developed a method to assess pigment concentrations from continuous optical measurements. The method applies an empirical orthogonal function (EOF) analysis to remote-sensing reflectance data derived from ship-based hyperspectral underwater radiometry and from multispectral satellite data (using the Medium Resolution Imaging Spectrometer - MERIS - Polymer product developed by Steinmetz et al., 2011, doi:10.1364/OE.19.009783) measured in the Atlantic Ocean. Subsequently we developed multiple linear regression models with measured (collocated) pigment concentrations as the response variable and EOF loadings as predictor variables. The model results show that surface concentrations of a suite of pigments and pigment groups can be well predicted from the ship-based reflectance measurements, even when only a multispectral resolution is chosen (i.e., eight bands, similar to those used by MERIS). Based on the MERIS reflectance data, concentrations of total and monovinyl chlorophyll a and the groups of photoprotective and photosynthetic carotenoids can be predicted with high quality. 
As a demonstration of the utility of the approach, the fitted model based on satellite reflectance data as input was applied to 1 month of MERIS Polymer data to predict the concentration of those pigment groups for the whole eastern tropical Atlantic area. Bootstrapping explorations of cross-validation error indicate that the method can produce reliable predictions with relatively small data sets (e.g., < 50 collocated values of reflectance and pigment concentration). The method allows for the derivation of time series from continuous reflectance data of various pigment groups at various regions, which can be used to study variability and change of phytoplankton composition and photophysiology.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Dynamic positron emission tomography (PET) imaging can be used to track the distribution of injected radio-labelled molecules over time in vivo. This is a powerful technique, which gives researchers and clinicians the opportunity to study the status of healthy and pathological tissue by examining how it processes substances of interest. Widely used tracers include 18F-fluorodeoxyglucose (FDG), an analog of glucose, which is used as the radiotracer in over ninety percent of PET scans. This radiotracer provides a way of quantifying the distribution of glucose utilisation in vivo. The interpretation of PET time-course data is complicated because the measured signal is a combination of vascular delivery and tissue retention effects. If the arterial time-course is known, the tissue time-course can typically be expressed in terms of a linear convolution between the arterial time-course and the tissue residue function. As the residue represents the amount of tracer remaining in the tissue, it can be thought of as a survival function; such functions have been examined in great detail by the statistics community. Kinetic analysis of PET data is concerned with estimation of the residue and associated functionals such as flow, flux and volume of distribution. This thesis presents a Markov chain formulation of blood-tissue exchange and explores how this relates to established compartmental forms. A nonparametric approach to the estimation of the residue is examined, and the improvement of this model relative to compartmental models is evaluated using simulations and cross-validation techniques. The reference distribution of the test statistics generated in comparing the models is also studied. We explore these models further with simulated studies and an FDG-PET dataset from subjects with gliomas, which has previously been analysed with compartmental modelling. We also consider the performance of a recently proposed mixture modelling technique in this study.
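The convolution relationship described in this abstract can be illustrated with a short sketch. It is not the thesis's estimation method: the function name `convolve_tissue`, the `flow` scale factor, the uniform sampling step `dt` and the toy input values are all assumptions made for illustration.

```python
def convolve_tissue(arterial, residue, dt, flow=1.0):
    """Discrete approximation of C_T(t) = flow * integral_0^t C_a(s) R(t - s) ds.

    `arterial` and `residue` are sampled on the same uniform grid with step dt.
    The `flow` scale factor is an assumption of this sketch.
    """
    n = len(arterial)
    tissue = []
    for i in range(n):
        # Riemann sum over all arterial samples up to time index i.
        total = 0.0
        for j in range(i + 1):
            total += arterial[j] * residue[i - j]
        tissue.append(flow * total * dt)
    return tissue


# Hypothetical inputs: a one-sample arterial bolus and a non-increasing residue
# (R(0) = 1, halving at each step), which behaves like a survival function.
dt = 1.0
arterial = [0.0, 1.0, 0.0, 0.0, 0.0]
residue = [0.5 ** k for k in range(5)]
tissue = convolve_tissue(arterial, residue, dt)
```

With an impulse-like arterial input, the tissue curve is simply a delayed copy of the residue, which is why the residue can be read as the fraction of tracer still retained at each time after arrival.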

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Quantitative Structure-Activity Relationship (QSAR) modelling has been applied extensively in predicting the toxicity of Disinfection By-Products (DBPs) in drinking water. Among many toxicological properties, acute and chronic toxicities of DBPs have been widely used in health risk assessment of DBPs. These toxicities are correlated with molecular properties, which are usually correlated with molecular descriptors. The primary goals of this thesis are: 1) to investigate the effects of molecular descriptors (e.g., chlorine number) on molecular properties such as the energy of the lowest unoccupied molecular orbital (ELUMO) via QSAR modelling and analysis; 2) to validate the models by using internal and external cross-validation techniques; 3) to quantify the model uncertainties through the Taylor method and Monte Carlo simulation. QSAR analysis is an important way to predict molecular properties such as ELUMO. In this study, the number of chlorine atoms (NCl), the number of carbon atoms (NC) and the energy of the highest occupied molecular orbital (EHOMO) are used as molecular descriptors. There are typically three approaches used in QSAR model development: 1) Linear or Multi-linear Regression (MLR); 2) Partial Least Squares (PLS); and 3) Principal Component Regression (PCR). In QSAR analysis, model validation is a critical step after QSAR models are established and before they are applied to toxicity prediction. The DBPs to be studied include five chemical classes: chlorinated alkanes, alkenes, and aromatics. In addition, validated QSARs are developed to describe the toxicity of selected groups (i.e., chloro-alkane and aromatic compounds with a nitro- or cyano group) of DBP chemicals to three types of organisms (fish, T. pyriformis, and P. phosphoreum) based on experimental toxicity data from the literature. 
The results show that: 1) QSAR models built by MLR, PLS or PCR to predict molecular properties can be used either to select valid data points or to eliminate outliers; 2) the Leave-One-Out Cross-Validation procedure by itself is not enough to give a reliable representation of the predictive ability of the QSAR models; however, Leave-Many-Out/K-fold cross-validation and external validation can be applied together to achieve more reliable results; 3) ELUMO is shown to correlate highly with NCl for several classes of DBPs; and 4) according to uncertainty analysis using the Taylor method, NCl contributes most of the uncertainty of the QSAR models for all DBP classes.
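The contrast between Leave-One-Out and Leave-Many-Out/K-fold validation can be sketched as follows. This is a minimal illustration, assuming a single-descriptor linear model and the common cross-validated Q² statistic (1 minus PRESS over the total sum of squares); the helper names and the data values are invented and do not come from the thesis.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def q2(xs, ys, folds):
    """Cross-validated Q^2 = 1 - PRESS / total sum of squares for the given folds."""
    press = 0.0
    for held_out in folds:
        train = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        press += sum((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


def k_folds(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(start, n, k)) for start in range(k)]


# Synthetic illustration only: a chlorine-count-like descriptor and a made-up response.
n_cl = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5]
e_lumo = [1.9, 1.4, 1.5, 1.0, 1.1, 0.4, 0.6, 0.1, 0.0, -0.5]

q2_loo = q2(n_cl, e_lumo, k_folds(len(n_cl), len(n_cl)))  # leave-one-out
q2_lmo = q2(n_cl, e_lumo, k_folds(len(n_cl), 5))          # leave-two-out via 5 folds
```

Setting the number of folds equal to the number of samples gives Leave-One-Out; fewer folds hold out several compounds at once, so each refit differs more from the full model and the check is more demanding.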

Relevance:

60.00% 60.00%

Publisher:

Abstract:

A number of studies have shown that Fourier transform infrared spectroscopy (FTIR) can be applied to quantitatively assess lacustrine sediment constituents. In this study, we developed calibration models based on FTIR for the quantitative determination of biogenic silica (BSi; n = 420; gradient: 0.9-56.5%), total organic carbon (TOC; n = 309; gradient: 0-2.9%), and total inorganic carbon (TIC; n = 152; gradient: 0-0.4%) in a 318 m-long sediment record with a basal age of 3.6 million years from Lake El'gygytgyn, Far East Russian Arctic. The developed partial least squares (PLS) regression models yield high cross-validated coefficients of determination (R2CV = 0.86-0.91) and low root mean square errors of cross-validation (RMSECV; 3.1-7.0% of the gradient for the different properties). By applying these models to 6771 samples from the entire sediment record, we obtained detailed insight into bioproductivity variations in Lake El'gygytgyn throughout the middle to late Pliocene and Quaternary. High accumulation rates of BSi indicate a productivity maximum during the middle Pliocene (3.6-3.3 Ma), followed by gradually decreasing rates during the late Pliocene and Quaternary. The average BSi accumulation during the middle Pliocene was ~3 times higher than maximum accumulation rates during the past 1.5 million years. The indicated progressive deterioration of environmental and climatic conditions in the Siberian Arctic starting at ca. 3.3 Ma is consistent with the first occurrence of glacial periods and the eventual full establishment of glacial-interglacial cycles during the Quaternary.
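Reporting RMSECV as a percentage of the gradient can be sketched in a few lines. The sketch assumes that "gradient" means the range (maximum minus minimum) of the reference values, and that the cross-validated predictions have already been produced by some calibration model; the function name and the numbers are illustrative only.

```python
import math


def rmsecv_percent_of_gradient(observed, cv_predicted):
    """Return (RMSECV, RMSECV as a percentage of the observed range)."""
    n = len(observed)
    rmsecv = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, cv_predicted)) / n)
    gradient = max(observed) - min(observed)  # assumed meaning of "gradient"
    return rmsecv, 100.0 * rmsecv / gradient


# Made-up reference values and cross-validated predictions, for illustration only.
observed = [1.0, 11.0, 21.0, 31.0, 41.0]
cv_predicted = [3.0, 9.0, 23.0, 29.0, 43.0]
rmsecv, pct = rmsecv_percent_of_gradient(observed, cv_predicted)
```

Scaling by the gradient makes errors comparable across constituents whose concentration ranges differ by orders of magnitude, as BSi and TIC do here.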

Relevance:

60.00% 60.00%

Publisher:

Abstract:

River runoff is an essential climate variable as it is directly linked to the terrestrial water balance and controls a wide range of climatological and ecological processes. Despite its scientific and societal importance, there are to date no pan-European observation-based runoff estimates available. Here we employ a recently developed methodology to estimate monthly runoff rates on a regular spatial grid in Europe. For this we first assemble an unprecedented collection of river flow observations, combining information from three distinct databases. Observed monthly runoff rates are first tested for homogeneity and then related to gridded atmospheric variables (E-OBS version 12) using machine learning. The resulting statistical model is then used to estimate monthly runoff rates (December 1950 - December 2015) on a 0.5° x 0.5° grid. The performance of the newly derived runoff estimates is assessed in terms of cross validation. The paper closes with example applications, illustrating the potential of the new runoff estimates for climatological assessments and drought monitoring.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

River runoff is an essential climate variable as it is directly linked to the terrestrial water balance and controls a wide range of climatological and ecological processes. Despite its scientific and societal importance, there are to date no pan-European observation-based runoff estimates available. Here we employ a recently developed methodology to estimate monthly runoff rates on a regular spatial grid in Europe. For this we first assemble an unprecedented collection of river flow observations, combining information from three distinct databases. Observed monthly runoff rates are first tested for homogeneity and then related to gridded atmospheric variables (E-OBS version 11) using machine learning. The resulting statistical model is then used to estimate monthly runoff rates (December 1950-December 2014) on a 0.5° × 0.5° grid. The performance of the newly derived runoff estimates is assessed in terms of cross validation. The paper closes with example applications, illustrating the potential of the new runoff estimates for climatological assessments and drought monitoring.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Background. In pre-school and primary education pupils differ in many abilities and competences (‘giftedness’). Yet mainstream educational practice seems rather homogeneous in providing age-based or grade-class subject matter approaches. Aims. To clarify whether pupils scoring initially at high ability level do develop and attain differently at school with respect to language and arithmetic compared with pupils displaying other initial ability levels. To investigate whether specific individual, family or educational variables co-vary with the attainment of these different types of pupils in school. Samples. Data from the large-scale PRIMA cohort study including a total of 8258 grade 2 and 4 pupils from 438 primary schools in The Netherlands. Methods. Secondary analyses were carried out to construct gain scores for both language and arithmetic proficiency and a number of behavioural, attitudinal, family and educational characteristics. The pupils were grouped into different ability categories (highly able; able; above average; average and below). Further analyses used Pearson correlations and analyses of variance both between and within ability categories. Cross-validation was done by introducing a cohort of younger pupils in pre-school and grouping both cohorts into decile groups based on initial ability in language and arithmetic. Results. Highly able pupils generally decreased in attainment in both language and arithmetic, whereas pupils in average and below average groups improved their language and arithmetic scores. Only with highly able pupils were some educational characteristics correlated with the pupils’ development in achievement, behaviour and attitudes. Conclusions. Pre-school and primary education should better match pupils’ differences in abilities and competences from their start in pre-school to improve their functioning, learning processes and outcomes. Recommendations for educational improvement strategies are presented in closing.

Relevance:

60.00% 60.00%

Publisher:

Abstract:


To predict the compressive strength of geopolymers prepared from alumina-silica natural products from the effects of Al2O3/SiO2, Na2O/Al2O3, Na2O/H2O, and Na/[Na+K], more than 50 data points were gathered from the literature. The data were used to train and test a multilayer artificial neural network (ANN). A multilayer feedforward network was designed with the chemical compositions of the alumina silicate and alkali activators as inputs and compressive strength as output. Feedforward networks with various numbers of hidden layers and neurons were tested to select the optimum network architecture. The developed three-layer neural network model, which used the feedforward back-propagation architecture, demonstrated its ability to learn the given input/output patterns. The cross-validation data were used to show the validity and high prediction accuracy of the network. This leads to the optimum chemical composition, from which the best paste can be made from activated alumina-silica natural products using alkaline hydroxide and alkaline silicate. The research results are in agreement with the mechanism of geopolymerization.


Read More: http://ascelibrary.org/doi/abs/10.1061/(ASCE)MT.1943-5533.0000829

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This thesis develops bootstrap methods for factor models, which have been widely used to generate forecasts since the pioneering article by Stock and Watson (2002) on diffusion indexes. These models accommodate a large number of macroeconomic and financial variables as predictors, a useful feature for incorporating the diverse information available to economic agents. My thesis therefore proposes econometric tools that improve inference in factor models using latent factors extracted from a large panel of observed predictors. It is divided into three complementary chapters, the first two written in collaboration with Sílvia Gonçalves and Benoit Perron. In the first article, we study how bootstrap methods can be used for inference in forecasting models with a horizon of h periods ahead. To this end, it examines bootstrap inference in a factor-augmented regression context where the errors may be autocorrelated. It generalizes the results of Gonçalves and Perron (2014) and proposes and then justifies two residual-based approaches: the block wild bootstrap and the dependent wild bootstrap. Our simulations show that, in the presence of serial correlation in the regression errors, these approaches improve the coverage rates of confidence intervals for the estimated coefficients relative to asymptotic theory and the wild bootstrap. The second chapter proposes bootstrap methods for constructing prediction intervals that relax the assumption of normally distributed innovations. We propose bootstrap prediction intervals for an observation h periods ahead and for its conditional mean. We assume that these forecasts are made using a set of factors extracted from a large panel of variables. 
Because we treat these factors as latent, our forecasts depend on both the estimated factors and the estimated regression coefficients. Under regularity conditions, Bai and Ng (2006) proposed the construction of asymptotic intervals under the assumption of Gaussian innovations. The bootstrap allows us to relax this assumption and to construct prediction intervals that are valid under more general assumptions. Moreover, even under Gaussianity, the bootstrap yields more accurate intervals when the cross-sectional dimension is relatively small, because it accounts for the bias of the ordinary least squares estimator, as shown in a recent study by Gonçalves and Perron (2014). In the third chapter, we suggest consistent selection procedures for factor-augmented regressions in finite samples. We first show that the usual cross-validation method is inconsistent, but that its generalization, leave-d-out cross-validation, selects the smallest set of estimated factors for the space spanned by the true factors. The second criterion, whose validity we also establish, generalizes the bootstrap approximation of Shao (1996) to factor-augmented regressions. The simulations show an improvement in the probability of parsimoniously selecting the estimated factors relative to the available selection methods. The empirical application revisits the relationship between macroeconomic and financial factors and the excess return on the U.S. stock market. Among the factors estimated from a large panel of U.S. macroeconomic and financial data, the factors strongly correlated with interest rate spreads and the Fama-French factors have good predictive power for excess returns.
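The idea of leave-d-out cross-validation as a model selection rule can be illustrated with a toy sketch. It is not the factor-augmented procedure of the thesis: it assumes two hypothetical candidate models (a constant mean and a one-regressor line), enumerates every held-out subset of size d, and picks the candidate with the smallest average squared prediction error. All names and data are invented.

```python
from itertools import combinations


def fit_mean(xs, ys):
    """Candidate 1: predict the training mean regardless of x."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: ordinary least squares line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + b * (x - mx)


def leave_d_out_error(xs, ys, fit, d):
    """Mean squared prediction error over all held-out subsets of size d."""
    n = len(xs)
    total, count = 0.0, 0
    for held_out in combinations(range(n), d):
        train = [i for i in range(n) if i not in held_out]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        total += sum((ys[i] - predict(xs[i])) ** 2 for i in held_out)
        count += d
    return total / count


def select_model(xs, ys, candidates, d):
    """Return the name of the candidate with the smallest leave-d-out error."""
    return min(candidates, key=lambda name: leave_d_out_error(xs, ys, candidates[name], d))


# Toy data with an exact linear relationship, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
candidates = {"mean": fit_mean, "line": fit_line}
chosen = select_model(xs, ys, candidates, d=3)
```

Enumerating all subsets grows combinatorially with the sample size, so a practical implementation would more likely sample held-out subsets at random; with d = 1 the procedure reduces to the usual leave-one-out rule.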

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Recommendation systems aim to help users make decisions more efficiently. The most widely used method in recommendation systems is collaborative filtering, a critical step of which is to analyze a user's preferences and recommend products or services based on similarity analysis with other users' ratings. However, collaborative filtering is less usable when facing the "cold start" problem, i.e. when few comments have been given to products or services. To tackle this problem, we propose an improved method that combines collaborative filtering and data classification. We use hotel recommendation data to test the proposed method. The accuracy of the recommendation is determined by the rankings. The accuracies of the Top-3 and Top-10 recommendation lists are evaluated using 10-fold cross-validation and ROC curves. The results show that, under the cold start condition, the Top-3 hotel recommendation list produced by the combined method outperforms the Top-10 list most of the time.
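One way to evaluate Top-N lists under 10-fold cross-validation is sketched below. The abstract does not define its accuracy measure, so the sketch assumes precision at N (the share of the first N recommendations that the user actually liked); the function names and the toy rankings are hypothetical, and the rankings are supplied directly rather than produced by a trained recommender.

```python
def precision_at_n(ranked, relevant, n):
    """Fraction of the first n recommended items that are in the relevant set."""
    top = ranked[:n]
    return sum(1 for item in top if item in relevant) / n


def fold_indices(n_cases, k=10):
    """Interleaved k-fold split of case indices 0..n_cases-1."""
    return [list(range(start, n_cases, k)) for start in range(k)]


def cross_validated_precision(cases, n, k=10):
    """Mean precision@n over k folds; `cases` holds (ranked, relevant) pairs.

    In a full pipeline the ranking for each held-out case would come from a
    model trained on the other k - 1 folds; here rankings are given directly.
    """
    fold_scores = []
    for fold in fold_indices(len(cases), k):
        if not fold:
            continue  # skip empty folds when there are fewer cases than folds
        scores = [precision_at_n(cases[i][0], cases[i][1], n) for i in fold]
        fold_scores.append(sum(scores) / len(scores))
    return sum(fold_scores) / len(fold_scores)


# Toy data: 20 users, each with the same ranking of 10 hotels and 2 liked hotels.
cases = [(["h%d" % j for j in range(10)], {"h0", "h1"}) for _ in range(20)]
p3 = cross_validated_precision(cases, 3)
p10 = cross_validated_precision(cases, 10)
```

Under this measure a short list can score higher than a long one, because the longer list is padded with items the user did not like; other definitions of accuracy, such as hit rate, behave differently.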

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2016-08

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This thesis deals with tensor completion for the solution of multidimensional inverse problems. We study the problem of reconstructing an approximately low rank tensor from a small number of noisy linear measurements. New recovery guarantees, numerical algorithms, non-uniform sampling strategies, and parameter selection algorithms are developed. We derive a fixed point continuation algorithm for tensor completion and prove its convergence. A restricted isometry property (RIP) based tensor recovery guarantee is proved. Probabilistic recovery guarantees are obtained for sub-Gaussian measurement operators and for measurements obtained by non-uniform sampling from a Parseval tight frame. We show how tensor completion can be used to solve multidimensional inverse problems arising in NMR relaxometry. Algorithms are developed for regularization parameter selection, including accelerated k-fold cross-validation and generalized cross-validation. These methods are validated on experimental and simulated data. We also derive condition number estimates for nonnegative least squares problems. Tensor recovery promises to significantly accelerate N-dimensional NMR relaxometry and related experiments, enabling previously impractical experiments. Our methods could also be applied to other inverse problems arising in machine learning, image processing, signal processing, computer vision, and other fields.
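Regularization parameter selection by k-fold cross-validation can be sketched on a much simpler problem than tensor completion. The example below assumes a scalar ridge regression through the origin and a plain (not accelerated) k-fold loop; the function names, the candidate grid and the noise-free data are illustrative assumptions, not the thesis's algorithm.

```python
def ridge_slope(xs, ys, lam):
    """Minimizer of sum((y - b * x)^2) + lam * b^2 over the scalar slope b."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def kfold_cv_error(xs, ys, lam, k):
    """Mean squared held-out error of the ridge fit, using k interleaved folds."""
    n = len(xs)
    total = 0.0
    for start in range(k):
        test = set(range(start, n, k))
        train = [i for i in range(n) if i not in test]
        b = ridge_slope([xs[i] for i in train], [ys[i] for i in train], lam)
        total += sum((ys[i] - b * xs[i]) ** 2 for i in test)
    return total / n


def select_lambda(xs, ys, grid, k=5):
    """Pick the regularization parameter with the smallest k-fold CV error."""
    return min(grid, key=lambda lam: kfold_cv_error(xs, ys, lam, k))


# Noise-free toy data (y = 2x), so the unregularized fit is exact.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
grid = [10.0, 1.0, 0.1, 0.0]
best = select_lambda(xs, ys, grid)
```

With noisy measurements the smallest candidate is no longer guaranteed to win, which is the situation in which a cross-validated choice of the parameter becomes informative.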

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Introduction: Prediction of soft tissue changes following orthognathic surgery has been frequently attempted in the past decades. It has gradually progressed from the classic “cut and paste” of photographs to computer-assisted 2D surgical prediction planning; finally, comprehensive 3D surgical planning was introduced to help surgeons and patients decide on the magnitude and direction of surgical movements as well as the type of surgery to be considered for the correction of facial dysmorphology. A wealth of experience has been gained, and a large body of published literature is available, which has augmented the knowledge of facial soft tissue behaviour and helped to improve the ability to closely simulate facial changes following orthognathic surgery. This was particularly noticeable following the introduction of three-dimensional imaging into medical research and clinical applications. Several approaches have been considered to mathematically predict soft tissue changes in three dimensions following orthognathic surgery. The most common are the finite element model and the mass tensor model. These were developed into software packages which are currently used in clinical practice. In general, these methods produce an acceptable level of prediction accuracy of soft tissue changes following orthognathic surgery. Studies, however, have shown limited prediction accuracy at specific regions of the face, in particular the areas around the lips. Aims: The aim of this project is to conduct a comprehensive assessment of hard and soft tissue changes following orthognathic surgery and to introduce a new method for prediction of facial soft tissue changes. Methodology: The study was carried out on the pre- and post-operative CBCT images of 100 patients who received their orthognathic surgery treatment at Glasgow Dental Hospital and School, Glasgow, UK. 
Three groups of patients were included in the analysis; patients who underwent Le Fort I maxillary advancement surgery; bilateral sagittal split mandibular advancement surgery or bimaxillary advancement surgery. A generic facial mesh was used to standardise the information obtained from individual patient’s facial image and Principal component analysis (PCA) was applied to interpolate the correlations between the skeletal surgical displacement and the resultant soft tissue changes. The identified relationship between hard tissue and soft tissue was then applied on a new set of preoperative 3D facial images and the predicted results were compared to the actual surgical changes measured from their post-operative 3D facial images. A set of validation studies was conducted. To include: • Comparison between voxel based registration and surface registration to analyse changes following orthognathic surgery. The results showed there was no statistically significant difference between the two methods. Voxel based registration, however, showed more reliability as it preserved the link between the soft tissue and skeletal structures of the face during the image registration process. Accordingly, voxel based registration was the method of choice for superimposition of the pre- and post-operative images. The result of this study was published in a refereed journal. • Direct DICOM slice landmarking; a novel technique to quantify the direction and magnitude of skeletal surgical movements. This method represents a new approach to quantify maxillary and mandibular surgical displacement in three dimensions. The technique includes measuring the distance of corresponding landmarks digitized directly on DICOM image slices in relation to three dimensional reference planes. The accuracy of the measurements was assessed against a set of “gold standard” measurements extracted from simulated model surgery. The results confirmed the accuracy of the method within 0.34mm. 
Therefore, the method was applied in this study. The results of this validation were published in a peer refereed journal. • The use of a generic mesh to assess soft tissue changes using stereophotogrammetry. The generic facial mesh played a major role in the soft tissue dense correspondence analysis. The conformed generic mesh represented the geometrical information of the individual’s facial mesh on which it was conformed (elastically deformed). Therefore, the accuracy of generic mesh conformation is essential to guarantee an accurate replica of the individual facial characteristics. The results showed an acceptable overall mean error of the conformation of generic mesh 1 mm. The results of this study were accepted for publication in peer refereed scientific journal. Skeletal tissue analysis was performed using the validated “Direct DICOM slices landmarking method” while soft tissue analysis was performed using Dense correspondence analysis. The analysis of soft tissue was novel and produced a comprehensive description of facial changes in response to orthognathic surgery. The results were accepted for publication in a refereed scientific Journal. The main soft tissue changes associated with Le Fort I were advancement at the midface region combined with widening of the paranasal, upper lip and nostrils. Minor changes were noticed at the tip of the nose and oral commissures. The main soft tissue changes associated with mandibular advancement surgery were advancement and downward displacement of the chin and lower lip regions, limited widening of the lower lip and slight reversion of the lower lip vermilion combined with minimal backward displacement of the upper lip were recorded. Minimal changes were observed on the oral commissures. The main soft tissue changes associated with bimaxillary advancement surgery were generalized advancement of the middle and lower thirds of the face combined with widening of the paranasal, upper lip and nostrils regions. 
In Le Fort I cases, the correlation between the changes of the facial soft tissue and the skeletal surgical movements was assessed using PCA. A statistical method known as leave-one-out cross-validation was applied to the 30 cases that underwent the Le Fort I osteotomy procedure, to make effective use of the data for the prediction algorithm. The prediction accuracy of soft tissue changes showed a mean error ranging from 0.0006 ± 0.582 mm at the nose region to -0.0316 ± 2.1996 mm at the various facial regions.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Given the social and economic importance of milk, the objective of this study was to evaluate the incidence of antimicrobial residues in this food and to quantify them. The samples were collected from dairy industries in the southwestern region of Paraná state so as to cover all ten municipalities in the region of Pato Branco. The work focused on developing appropriate models for the identification and quantification of the analytes tetracycline, sulfamethazine, sulfadimethoxine, chloramphenicol and ampicillin, all antimicrobials of health interest. Calibration and validation of the models used Fourier transform infrared spectroscopy combined with a chemometric method based on partial least squares (PLS) regression. To prepare a working solution of antimicrobials for multiresidue analysis, the five analytes of interest were used in increasing doses: tetracycline from 0 to 0.60 ppm, sulfamethazine from 0 to 0.12 ppm, sulfadimethoxine from 0 to 2.40 ppm, chloramphenicol from 0 to 1.20 ppm and ampicillin from 0 to 1.80 ppm. The performance of the models was evaluated through the figures of merit: mean square error of calibration and of cross-validation, correlation coefficients and offset performance ratio. For the purposes of this work, the models generated for tetracycline, sulfadimethoxine and chloramphenicol were considered viable, having the greatest predictive power and efficiency, and were then employed to evaluate the quality of raw milk from the region of Pato Branco. Among the samples analyzed by NIR, 70% were in conformity with sanitary legislation, and 5% of these samples had concentrations below the maximum residue limit, which is also satisfactory. 
However, 30% of the sample set showed unsatisfactory results with respect to contamination with antimicrobial residues, a non-conformity related to the presence of unauthorized antimicrobials or to concentrations above the permitted limits. This work shows that infrared spectroscopy with multivariate calibration performs well for laboratory tests in the food area, offering fast analysis, reduced costs and minimal generation of laboratory waste. Thus, the proposed alternative method meets the quality concerns and the efficiency desired by industrial sectors and society in general.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Routine analyses for the quantification of organic acids and sugars are generally slow, involve the use and preparation of several reagents, require trained professionals and special equipment, and are expensive. In this context, there has been increasing investment in research aimed at developing alternatives to the reference methods that are faster, cheaper and simpler, and infrared spectroscopy has stood out in this regard. The present study developed multivariate calibration models for the simultaneous quantitative determination of ascorbic, citric, malic and tartaric acids, the sugars sucrose, glucose and fructose, and soluble solids in fruit juices and nectars, as well as classification models based on principal component analysis (PCA). Near-infrared (NIR) spectroscopy was used in association with partial least squares (PLS) regression. Forty-two samples of commercially available fruit juices and nectars from local shops were used. To build the models, reference analyses were performed using high-performance liquid chromatography (HPLC), with refractometry for the analysis of soluble solids. Subsequently, the spectra were acquired in triplicate in the spectral range 12500 to 4000 cm-1. The best models were applied to quantify the analytes under study in natural juices and in juice samples produced in the southwest region of Paraná. The juices used in the application of the models also underwent physical and chemical analysis. Validation of the chromatographic methodology showed satisfactory results: the external calibration curves had coefficients of determination (R2) above 0.98, and the coefficients of variation (%CV) for intermediate precision and repeatability were below 8.83%. 
Principal component analysis (PCA) made it possible to separate the juice samples into two major groups, grape and apple versus tangerine and orange, while the nectars separated into guava and grape versus pineapple and apple. Using different validation methods and pre-processing steps, applied separately and in combination, multivariate calibration models were obtained with root mean square errors of prediction (RMSEP) and of cross-validation (RMSECV) below 1.33 and 1.53 g.100 mL-1, respectively, and R2 above 0.771, except for malic acid. The physicochemical analysis enabled the characterization of the drinks, including the pH working range (2.83 to 5.79) and acidity within the regulatory parameters for each flavor. The regression models demonstrated that ascorbic, citric, malic and tartaric acids, as well as sucrose, glucose and fructose, can be determined successfully from a single spectrum, suggesting that the models are economically viable for quality control and product standardization in the fruit juice and nectar processing industry.