86 results for Cross Validation

in Scielo Saúde Pública - SP


Relevance: 60.00%

Abstract:

Twenty-four hepatitis C virus (HCV) patients coinfected with human T-lymphotropic virus type 1 (HTLV-1) were compared with six coinfected with HTLV-2 and 55 with HCV alone, regarding clinical, epidemiological, laboratory and histopathological data. Fisher's discriminant analysis was applied to define functions capable of differentiating between the study groups (HCV, HCV/HTLV-1 and HCV/HTLV-2). The discriminant accuracy was evaluated by cross-validation. Alcohol consumption, use of intravenous drugs or inhaled cocaine and sexual partnership with intravenous drug users were more frequent in the HCV/HTLV-2 group, whereas patients in the HCV group more often reported abdominal pain or a sexual partner with hepatitis. Coinfected patients presented higher platelet counts, but aminotransferase and gamma-glutamyl transpeptidase levels were higher among subjects infected with HCV alone. No significant difference between the groups was seen regarding liver histopathological findings. Through discriminant analysis, classification functions were defined, including sex, age group, intravenous drug use and sexual partner with hepatitis. Cross-validation revealed high discriminant accuracy for the HCV group.

Relevance: 60.00%

Abstract:

In the areas where irrigated rice is grown in the south of Brazil, few studies have investigated the spatial variability structure of soil properties in order to establish new forms of soil management and to determine applications of soil correctives and fertilizers. This study therefore aimed to evaluate the spatial variability of chemical, physical and biological soil properties in a lowland area under irrigated rice cultivation in the conventional tillage system. For this purpose, a 10 x 10 m grid of 100 points was established in an experimental field of Embrapa Clima Temperado, in the County of Capão do Leão, State of Rio Grande do Sul. The spatial variability structure was evaluated by geostatistical tools, and the number of subsamples required to represent each soil property in future studies was calculated using classical statistics. Results showed that the spatial variability structure of sand, silt, SMP index, cation exchange capacity (pH 7.0), Al³⁺ and total N could be detected by geostatistical analysis. A pure nugget effect was observed for the nutrients K, S and B, as well as for macroporosity, mean weighted diameter of aggregates, and soil water storage. The cross-validation procedure, based on linear regression and the coefficient of determination, was more efficient for evaluating the quality of the fitted mathematical model than the degree of spatial dependence. It was also concluded that combining classical statistics with geostatistics can in many cases simplify the soil sampling process without losing information quality.

Relevance: 60.00%

Abstract:

Soil variables that are not available can be estimated from other related, measured variables by means of pedotransfer functions (PTFs), mainly saving time and reducing cost. Great differences among soils, however, can yield undesirable results when applying this method. This study discusses the application of PTFs developed by several authors, using a variety of soils with different characteristics, to estimate the soil water contents of two Brazilian lowland soils. PTF-estimated data are compared with field-measured data using statistical and geostatistical tools such as mean error, root mean square error, semivariogram, cross-validation, and regression coefficient. The eight PTFs tested to estimate gravimetric soil water content (Ug) at the tensions of 33 kPa and 1,500 kPa tended to overestimate Ug at 33 kPa and underestimate Ug at 1,500 kPa. The PTFs were ranked according to their performance and also according to their potential to describe the structure of the spatial variability of the set of measured values. Although none of the PTFs changed the distribution pattern of the data, all resulted in means and variances statistically different from those observed for the measured values. The PTFs that gave the best predictions of Ug at 33 kPa and Ug at 1,500 kPa were not the same ones that best reproduced the structure of spatial variability of these variables.

Relevance: 60.00%

Abstract:

Is it possible to build predictive models (PMs) of soil particle-size distribution (psd) in a region with complex geology and a young and unstable land surface? The main objective of this study was to answer this question. A set of 339 soil samples from a small slope catchment in Southern Brazil was used to build PMs of psd in the surface soil layer. Multiple linear regression models were constructed using terrain attributes (elevation, slope, catchment area, convergence index, and topographic wetness index). The PMs explained more than half of the data variance. This performance is similar to (or even better than) that of the conventional soil mapping approach. For some size fractions, the PM performance can reach 70%. The largest uncertainties were observed in geologically more complex areas. Therefore, significant improvements in the predictions can only be achieved if accurate geological data are made available. Meanwhile, PMs built on terrain attributes are efficient in predicting the psd of soils in regions of complex geology.

Relevance: 60.00%

Abstract:

The objective of this work was to select semivariogram models to estimate the population density of fig fly (Zaprionus indianus; Diptera: Drosophilidae) throughout the year, using ordinary kriging. Nineteen monitoring sites were demarcated in an area of 8,200 m², cropped with six fruit tree species: persimmon, citrus, fig, guava, apple, and peach. During a 24-month period, 106 weekly evaluations were carried out at these sites. The average number of adult fig flies captured weekly per trap, during each month, was fitted with the circular, spherical, pentaspherical, exponential, Gaussian, rational quadratic, hole effect, K-Bessel, J-Bessel, and stable semivariogram models, using ordinary kriging interpolation. The models with the best fit were selected by cross-validation. Each data set (month) has a particular spatial dependence structure, which makes it necessary to define specific semivariogram models in order to improve the fit to the experimental semivariogram. Therefore, it was not possible to determine a standard semivariogram model; instead, six theoretical models were selected: circular, Gaussian, hole effect, K-Bessel, J-Bessel, and stable.

Relevance: 60.00%

Abstract:

The aim of this work is to present a tutorial on multivariate calibration, a tool that is now necessary in most laboratories but is very often misused. The basic concepts of preprocessing, principal component analysis (PCA), principal component regression (PCR) and partial least squares (PLS) are given. The two basic steps of any calibration procedure, model building and validation, are fully discussed. For the validation step, the concepts of cross-validation (to determine the number of factors to be used in the model) and of leverage and studentized residuals (to detect outliers) are given. The whole calibration procedure is illustrated using spectra recorded for ternary mixtures of 2,4,6-trinitrophenolate, 2,4-dinitrophenolate and 2,5-dinitrophenolate, followed by the concentration prediction of these three chemical species during a diffusion experiment through a hydrophobic liquid membrane. MATLAB software is used for the numerical calculations. Most of the commands for the analysis are provided so that a non-specialist can follow the analysis step by step.

Relevance: 60.00%

Abstract:

Genetic algorithm-partial least squares (GA-PLS) and genetic algorithm-kernel PLS (GA-KPLS) techniques were used to investigate the correlation between retention indices (RI) and descriptors for 117 diverse compounds in essential oils from 5 Pimpinella species gathered from central Turkey, which were obtained by gas chromatography and gas chromatography-mass spectrometry. The squared correlation coefficient of leave-group-out cross-validation (LGO-CV), Q², between experimental and predicted RI for the training set was 0.940 for GA-PLS and 0.963 for GA-KPLS. This indicates that GA-KPLS can be used as an alternative modeling tool for quantitative structure-retention relationship (QSRR) studies.
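
For readers unfamiliar with Q², one commonly used definition of the cross-validated coefficient is given below. This is an assumption for illustration: the abstract does not state which formula the authors used. Here $y_i$ is the experimental value, $\hat{y}_{(i)}$ is the prediction for sample $i$ from a model fitted without the group containing it, and $\bar{y}$ is the mean of the experimental values.

```latex
Q^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_{(i)} \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

Under this definition, values close to 1 mean that the held-out predictions leave little of the variance in the experimental values unexplained.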

Relevance: 60.00%

Abstract:

QSAR modeling is a novel computer program developed to generate and validate QSAR or QSPR (quantitative structure-activity or structure-property relationship) models. With QSAR modeling, users can build partial least squares (PLS) regression models, perform variable selection with the ordered predictors selection (OPS) algorithm, and validate models by using y-randomization and leave-N-out cross-validation. An additional new feature is outlier detection, carried out by simultaneous comparison of sample leverage with the respective Studentized residuals. The program was developed using Java version 6 and runs on any operating system that supports Java Runtime Environment version 6. The use of the program is illustrated. The program is available for download at lqta.iqm.unicamp.br.
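
The idea behind y-randomization can be sketched in a few lines of Python. This is not the program's code: it assumes a single descriptor and an ordinary linear fit instead of PLS, and the function names and data are hypothetical. The response values are shuffled many times; if the fit to the shuffled responses is about as good as the fit to the real ones, the original model is likely a chance correlation.

```python
import random

def r_squared(xs, ys):
    """Squared Pearson correlation, equal to R² of a one-descriptor linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)

def y_randomization(xs, ys, n_rounds=100, seed=0):
    """Return R² for the real data and a list of R² values for shuffled responses."""
    rng = random.Random(seed)
    shuffled_scores = []
    for _ in range(n_rounds):
        shuffled = ys[:]          # copy so the original order is preserved
        rng.shuffle(shuffled)     # break the descriptor-response pairing
        shuffled_scores.append(r_squared(xs, shuffled))
    return r_squared(xs, ys), shuffled_scores

# Hypothetical descriptor and response values (not data from the program).
descriptor = [float(i) for i in range(1, 11)]
response = [2.0 * x + 1.0 for x in descriptor]
true_r2, random_r2 = y_randomization(descriptor, response)
print(true_r2, sum(random_r2) / len(random_r2))
```

A real model is expected to score well above the average of the shuffled runs.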

Relevance: 60.00%

Abstract:

The quantitative structure-property relationship (QSPR) for the boiling point (Tb) of polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans (PCDD/Fs) was investigated. The molecular distance-edge vector (MDEV) index was used as the structural descriptor. The quantitative relationship between the MDEV index and Tb was modeled using both multivariate linear regression (MLR) and an artificial neural network (ANN). Leave-one-out cross-validation and external validation were carried out to assess the prediction performance of the models developed. For the MLR method, the prediction root mean square relative error (RMSRE) of leave-one-out cross-validation and external validation was 1.77 and 1.23, respectively. For the ANN method, the prediction RMSRE of leave-one-out cross-validation and external validation was 1.65 and 1.16, respectively. A quantitative relationship between the MDEV index and Tb of PCDD/Fs was demonstrated. Both MLR and ANN are practicable for modeling this relationship, and the MLR and ANN models developed can be used to predict the Tb of PCDD/Fs. Thus, the Tb of each PCDD/F was predicted by the developed models.
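
A minimal sketch of how a leave-one-out RMSRE can be computed follows. It is an assumption-laden illustration, not the authors' procedure: it uses one descriptor and a simple least-squares line rather than the multivariate MDEV model, it takes RMSRE to be the root mean square of (predicted - observed) / observed, and all names and data are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def loo_rmsre(xs, ys):
    """Leave-one-out RMSRE: each sample is predicted by a model fitted without it."""
    squared_relative_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]   # all samples except sample i
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        predicted = intercept + slope * xs[i]
        squared_relative_errors.append(((predicted - ys[i]) / ys[i]) ** 2)
    return math.sqrt(sum(squared_relative_errors) / len(squared_relative_errors))

# Hypothetical descriptor values and boiling points (not data from the study).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
boiling_point = [310.0, 331.0, 349.0, 372.0, 390.0, 409.0]
print(loo_rmsre(descriptor, boiling_point))
```

The value is returned as a fraction; whether the figures reported in the abstract are fractions or percentages is not stated there.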

Relevance: 60.00%

Abstract:

Information about rainfall erosivity is important for soil and water conservation planning. Thus, the spatial variability of rainfall erosivity in the state of Mato Grosso do Sul was analyzed using ordinary kriging interpolation. For this, three pluviograph stations were used to obtain the regression equations between the erosivity index (EI30) and the rainfall coefficient. The equations obtained were applied to 109 pluviometric stations, resulting in EI30 values. These values were analyzed by a geostatistical technique that can be divided into descriptive statistics, semivariogram fitting, cross-validation, and ordinary kriging to generate the erosivity map. The highest erosivity values were found in the central and northeast regions of the state, while the lowest values were observed in the southern region. In addition, high annual precipitation values do not necessarily produce higher erosivity values.

Relevance: 60.00%

Abstract:

The aim of this study was to compare hydrographically conditioned digital elevation models (HCDEMs) generated from data of the VNIR (Visible Near Infrared) sensor of ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer), from SRTM (Shuttle Radar Topography Mission) data, and from IBGE topographical maps at a scale of 1:50,000, all processed in a Geographical Information System (GIS), for the morphometric characterization of watersheds. The sub-basin of the São Bartolomeu River was taken as the study area, and its morphometric characteristics were obtained from the HCDEMs. Root mean square error (RMSE) and cross-validation were the statistical indices used to evaluate the quality of the HCDEMs. The percentage differences in the morphometric parameters obtained from the three data sets were less than 10%, except for the mean slope (21%). In general, good agreement was observed between the HCDEMs generated from remote sensing data and the IBGE maps. The result of HCDEM ASTER was slightly higher than that from HCDEM SRTM. The HCDEM ASTER was more accurate than the HCDEM SRTM in basins with high altitudes and rugged terrain, since its altimetric frequency distribution was closest to that of the HCDEM IBGE, considered the standard in this study.

Relevance: 60.00%

Abstract:

This study aimed to identify differences in swine vocalization patterns according to animal gender and different stress conditions. A total of 150 barrows and 150 females (Dalland® genetic strain), aged 100 days, were used in the experiment. Pigs were exposed to different stressful situations: thirst (no access to water), hunger (no access to food), and thermal stress (THI exceeding 74). For the control treatment, animals were kept in a comfortable situation (full access to food and water, with environmental THI lower than 70). Acoustic signals were recorded every 30 minutes, totaling six samples for each stress situation. Afterwards, the recordings were analyzed with Praat® 5.1.19 software, generating a sound spectrum. To determine the stress conditions, data were processed with WEKA® 3.5 software, using the C4.5 decision tree algorithm (known as J48 in the software environment) with 10-fold cross-validation, in which each fold holds out 10% of the samples. According to the decision tree, the most important acoustic attribute for the classification of stress conditions was sound intensity (root node). It was not possible to identify the animal gender from the vocal register using the tested attributes. A decision tree was generated for recognizing swine hunger, thirst, and heat stress from records of sound intensity, pitch frequency, and formant 1.
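
The 10-fold scheme mentioned above can be sketched as follows. This only illustrates how samples are partitioned; it does not reproduce WEKA's J48 or its fold assignment, and the function name and the sample count of 300 are hypothetical.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so that each sample is held out exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # randomize before splitting
    folds = [indices[i::k] for i in range(k)]   # k nearly equal-sized folds
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for j, fold in enumerate(folds) if j != held_out for idx in fold]
        yield train, test

# Hypothetical number of samples, chosen only for illustration.
splits = list(k_fold_indices(300, k=10))
for train, test in splits:
    # A classifier would be trained on `train` and scored on `test` here.
    pass
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Averaging the score over the ten held-out folds gives the cross-validated estimate of classifier accuracy.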

Relevance: 60.00%

Abstract:

The aim of this study was to generate maps of intense rainfall equation parameters using interpolated maximum intense rainfall data. The study area comprised Espírito Santo State, Brazil. A total of 59 intense rainfall equations were used to interpolate maximum intense rainfall at a 1 x 1 km spatial resolution. Maximum intense rainfall was interpolated for recurrence periods of 2, 5, 10, 20, 50 and 100 years and durations of 10, 20, 30, 40, 50, 60, 120, 240, 360, 420, 660, 720, 900, 1,140, 1,380 and 1,440 minutes, resulting in 96 maps of maximum intense rainfall. The interpolators used were inverse distance weighting and ordinary kriging, for which the significance level (p-value) and coefficient of determination (R²) were evaluated on the cross-validation data; the method with the better R² was chosen to generate the maps. Finally, the maps of maximum intense precipitation were used to estimate, cell by cell, the intense rainfall equation parameters. In comparison with literature data, the mean percentage error of the estimated intense rainfall equations was 13.8%. The maps of spatialized parameters obtained in this study are simple to use: because they are georeferenced, they can be imported into any geographic information system and used for a specific area of interest.
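
The cross-validation of an interpolator can be sketched as below for inverse distance weighting. This is an illustration under stated assumptions, not the study's workflow: the power of 2, the use of the squared Pearson correlation between observed and predicted values as R², the function names, and the synthetic station data are all hypothetical choices.

```python
import math

def idw_predict(points, target_xy, power=2.0):
    """Inverse distance weighted estimate at target_xy from (x, y, value) points."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in points:
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            return value                 # exact hit on a known station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def loo_r2(points, power=2.0):
    """Leave-one-out R² between observed values and IDW predictions."""
    observed, predicted = [], []
    for i, (x, y, value) in enumerate(points):
        others = points[:i] + points[i + 1:]   # drop the station being predicted
        observed.append(value)
        predicted.append(idw_predict(others, (x, y), power))
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

# Synthetic stations on a 4 x 4 grid with a smooth trend (not data from the study).
stations = [(float(x), float(y), 50.0 + 2.0 * x + y) for x in range(4) for y in range(4)]
print(loo_r2(stations))
```

Running the same leave-one-out loop with a second interpolator and comparing the two R² values mirrors the selection step described in the abstract.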

Relevance: 60.00%

Abstract:

In breast cancer patients submitted to neoadjuvant chemotherapy (4 cycles of doxorubicin and cyclophosphamide, AC), expression of groups of three genes (gene trio signatures) could distinguish responsive from non-responsive tumors, as demonstrated by cDNA microarray profiling in a previous study by our group. In the current study, we determined whether the expression of the same genes would retain its predictive strength when analyzed by a more accessible technique (real-time RT-PCR). We evaluated 28 samples already analyzed by cDNA microarray, as a technical validation procedure, and 14 tumors, as an independent biological validation set. All patients received neoadjuvant chemotherapy (4 AC). Among five trio combinations previously identified, defined by nine genes individually investigated (BZRP, CLPTM1, MTSS1, NOTCH1, NUP210, PRSS11, RPL37A, SMYD2, and XLHSRF-1), the most accurate were the trios based on RPL37A and XLHSRF-1, combined with either NOTCH1 or NUP210. Both trios correctly separated 86% of tumors according to their response to chemotherapy (87% sensitivity and 80% specificity for predicting response; 82% in a leave-one-out cross-validation method). Using the pre-established features obtained by linear discriminant analysis, 71% of samples from the biological validation set were also correctly classified by both trios (72% sensitivity; 66% specificity). Furthermore, we explored other gene combinations to achieve a higher accuracy in the technical validation group (as a training set). A new trio, MTSS1, RPL37 and SMYD2, correctly classified 93% of samples from the technical validation group (95% sensitivity and 80% specificity; 86% accuracy by the cross-validation method) and 79% from the biological validation group (72% sensitivity and 100% specificity). Therefore, the combined expression of MTSS1, RPL37 and SMYD2, as evaluated by real-time RT-PCR, is a potential candidate to predict response to neoadjuvant doxorubicin and cyclophosphamide in breast cancer patients.
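
The relation between accuracy, sensitivity and specificity can be made concrete with a short sketch. The counts below are illustrative assumptions: the abstract does not give the confusion matrix, and these values were picked only because, for 28 samples, they round to figures like those reported. The function name is hypothetical.

```python
def classification_rates(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = true_pos / (true_pos + false_neg)   # responders correctly identified
    specificity = true_neg / (true_neg + false_pos)   # non-responders correctly identified
    accuracy = (true_pos + true_neg) / (true_pos + false_neg + true_neg + false_pos)
    return sensitivity, specificity, accuracy

# Assumed counts for 28 samples (not taken from the study).
sensitivity, specificity, accuracy = classification_rates(
    true_pos=20, false_neg=3, true_neg=4, false_pos=1
)
print(round(sensitivity * 100), round(specificity * 100), round(accuracy * 100))
```

With few non-responders, a single misclassified sample shifts specificity by many percentage points, which is one reason small validation sets give coarse estimates.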

Relevance: 60.00%

Abstract:

High resolution proton nuclear magnetic resonance spectroscopy (¹H MRS) can be used to detect biochemical changes in vitro caused by distinct pathologies. It can reveal distinct metabolic profiles of brain tumors although the accurate analysis and classification of different spectra remains a challenge. In this study, the pattern recognition method partial least squares discriminant analysis (PLS-DA) was used to classify 11.7 T ¹H MRS spectra of brain tissue extracts from patients with brain tumors into four classes (high-grade neuroglial, low-grade neuroglial, non-neuroglial, and metastasis) and a group of control brain tissue. PLS-DA revealed 9 metabolites as the most important in group differentiation: γ-aminobutyric acid, acetoacetate, alanine, creatine, glutamate/glutamine, glycine, myo-inositol, N-acetylaspartate, and choline compounds. Leave-one-out cross-validation showed that PLS-DA was efficient in group characterization. The metabolic patterns detected can be explained on the basis of previous multimodal studies of tumor metabolism and are consistent with neoplastic cell abnormalities possibly related to high turnover, resistance to apoptosis, osmotic stress and tumor tendency to use alternative energetic pathways such as glycolysis and ketogenesis.