917 results for Cross-validation


Relevance:

60.00%

Publisher:

Abstract:

Lateralization of temporal lobe epilepsy (TLE) is critical for a successful outcome of surgery to relieve seizures. TLE affects brain regions beyond the temporal lobes and has been associated with aberrant brain networks, based on evidence from functional magnetic resonance imaging. We present here a machine learning-based method for determining the laterality of TLE, using features extracted from resting-state functional connectivity of the brain. A comprehensive feature space was constructed to include network properties within local brain regions, between brain regions, and across the whole network. Feature selection was performed based on random forest, and a support vector machine was employed to train a linear model to predict the laterality of TLE on unseen patients. A leave-one-patient-out cross validation was carried out on 12 patients and a prediction accuracy of 83% was achieved. The importance of selected features was analyzed to demonstrate the contribution of resting-state connectivity attributes at voxel, region, and network levels to TLE lateralization.
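The leave-one-patient-out scheme described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: a nearest-centroid classifier stands in for the random forest feature selection and linear support vector machine of the study, and the feature values and the function names `nearest_centroid_predict` and `leave_one_out_accuracy` are hypothetical. With 12 patients, an accuracy of 83% is consistent with 10 of 12 held-out patients being classified correctly.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    # sorted() makes tie-breaking deterministic
    return min(sorted(centroids), key=lambda label: dist(centroids[label]))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the rest, and score the held-out one."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            hits += 1
    return hits / len(X)


# Made-up connectivity features for six hypothetical patients (not study data).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
y = ["left", "left", "left", "right", "right", "right"]
print(leave_one_out_accuracy(X, y))
```

Because each patient is scored by a model that never saw that patient, the resulting accuracy estimates performance on unseen patients rather than fit to the training data.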

Relevance:

60.00%

Publisher:

Abstract:

Background Spatial analysis is increasingly important for identifying modifiable geographic risk factors for disease. However, spatial health data from surveys are often incomplete, ranging from missing data for only a few variables, to missing data for many variables. For spatial analyses of health outcomes, selection of an appropriate imputation method is critical in order to produce the most accurate inferences. Methods We present a cross-validation approach to select between three imputation methods for health survey data with correlated lifestyle covariates, using as a case study, type II diabetes mellitus (DM II) risk across 71 Queensland Local Government Areas (LGAs). We compare the accuracy of mean imputation to imputation using multivariate normal and conditional autoregressive prior distributions. Results Choice of imputation method depends upon the application and is not necessarily the most complex method. Mean imputation was selected as the most accurate method in this application. Conclusions Selecting an appropriate imputation method for health survey data, after accounting for spatial correlation and correlation between covariates, allows more complete analysis of geographic risk factors for disease with more confidence in the results to inform public policy decision-making.
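The idea of using cross-validation to choose between imputation methods can be sketched as follows: hide each observed value in turn, impute it with each candidate method, and keep the method with the lowest error. This is a minimal sketch under stated assumptions. The neighbour-average imputer is a crude stand-in for a spatial prior, not the multivariate normal or conditional autoregressive models compared in the study, and the values and adjacency map are invented.

```python
import math


def mean_impute(values, idx, neighbours):
    """Impute position idx with the mean of all other observed values."""
    known = [v for j, v in enumerate(values) if j != idx and v is not None]
    return sum(known) / len(known)


def neighbour_impute(values, idx, neighbours):
    """Impute position idx with the mean of its observed spatial neighbours."""
    known = [values[j] for j in neighbours[idx] if values[j] is not None]
    if not known:  # fall back to the global mean when no neighbour is observed
        return mean_impute(values, idx, neighbours)
    return sum(known) / len(known)


def cv_rmse(values, neighbours, impute):
    """Leave-one-out RMSE: mask each observed value and try to recover it."""
    squared = []
    for i, v in enumerate(values):
        if v is None:
            continue
        masked = list(values)
        masked[i] = None
        squared.append((impute(masked, i, neighbours) - v) ** 2)
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical risk values for five areas arranged in a chain (not study data).
values = [1.0, 2.0, 3.0, 4.0, 5.0]
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
methods = {"mean": mean_impute, "neighbour": neighbour_impute}
best = min(methods, key=lambda name: cv_rmse(values, neighbours, methods[name]))
print(best)
```

On this smooth toy example the neighbour-based method wins; on other data the simplest method can win, which is the point made in the Results above.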

Relevance:

60.00%

Publisher:

Abstract:

An important uncertainty when estimating per capita consumption of, for example, illicit drugs by means of wastewater analysis (sometimes referred to as “sewage epidemiology”) relates to the size and variability of the de facto population in the catchment of interest. In the absence of a day-specific direct population count, any indirect surrogate model to estimate population size lacks a standard to assess associated uncertainties. Therefore, the objective of this study was to collect wastewater samples at a unique opportunity, that is, on a census day, as a basis for a model to estimate the number of people contributing to a given wastewater sample. Mass loads for a wide range of pharmaceuticals and personal care products were quantified in influents of ten sewage treatment plants (STP) serving populations ranging from approximately 3500 to 500 000 people. Separate linear models for population size were estimated with the mass loads of the different chemicals as the explanatory variable: 14 chemicals showed good, linear relationships, with highest correlations for acesulfame and gabapentin. De facto population was then estimated through Bayesian inference, by updating the population size provided by STP staff (prior knowledge) with measured chemical mass loads. Cross validation showed that large populations can be estimated fairly accurately with a few chemical mass loads quantified from 24-h composite samples. In contrast, the prior knowledge for small population sizes cannot be improved substantially despite the information of multiple chemical mass loads. In the future, observations other than chemical mass loads may improve this deficit, since Bayesian inference allows including any kind of information relating to population size.
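The updating step can be illustrated with the simplest conjugate case. The abstract does not state the model form, so this sketch assumes a Gaussian prior on population size and independent Gaussian estimates from each chemical; the function name `update_normal` and all numbers are hypothetical.

```python
def update_normal(prior_mean, prior_sd, estimates):
    """Combine a Gaussian prior with independent Gaussian estimates.

    estimates: list of (mean, sd) population estimates, one per chemical.
    Returns the posterior (mean, sd) under the normal-normal model.
    """
    precision = 1.0 / prior_sd ** 2
    weighted = prior_mean / prior_sd ** 2
    for mean, sd in estimates:
        precision += 1.0 / sd ** 2
        weighted += mean / sd ** 2
    return weighted / precision, (1.0 / precision) ** 0.5


# Hypothetical prior from plant staff and one chemical-based estimate.
post_mean, post_sd = update_normal(100_000, 10_000, [(110_000, 10_000)])
print(post_mean, post_sd)
```

Under these assumptions each additional estimate adds its precision to the total, so the posterior standard deviation can only shrink. An estimate with a large standard deviation adds little, which is one way a prior can remain nearly unchanged despite several measurements.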

Relevance:

60.00%

Publisher:

Abstract:

Recent decreases in costs, and improvements in performance, of silicon array detectors open a range of potential applications of relevance to plant physiologists, associated with spectral analysis in the visible and short-wave near infra-red (far-red) spectrum. The performance characteristics of three commercially available ‘miniature’ spectrometers based on silicon array detectors operating in the 650–1050-nm spectral region (MMS1 from Zeiss, S2000 from Ocean Optics, and FICS from Oriel, operated with a Larry detector) were compared with respect to the application of non-invasive prediction of sugar content of fruit using near infra-red spectroscopy (NIRS). The FICS–Larry gave the best wavelength resolution; however, the narrow slit and small pixel size of the charge-coupled device detector resulted in a very low sensitivity, and this instrumentation was not considered further. Wavelength resolution was poor with the MMS1 relative to the S2000 (e.g. full width at half maximum of the 912 nm Hg peak, 13 and 2 nm for the MMS1 and S2000, respectively), but the large pixel height of the array used in the MMS1 gave it sensitivity comparable to the S2000. The signal-to-signal standard error ratio of spectra was greater by an order of magnitude with the MMS1, relative to the S2000, at both near saturation and low light levels. Calibrations were developed using reflectance spectra of filter paper soaked in a range of concentrations (0–20% w/v) of sucrose, using a modified partial least squares procedure. Calibrations developed with the MMS1 were superior to those developed using the S2000 (e.g. coefficient of correlation of 0.90 and 0.62, and standard error of cross-validation of 1.9 and 5.4%, respectively), indicating the importance of high signal to noise ratio over wavelength resolution to calibration accuracy.
The design of a bench top assembly using the MMS1 for the non-invasive assessment of mesocarp sugar content of (intact) melon fruit is reported in terms of light source and angle between detector and light source, and optimisation of math treatment (derivative condition and smoothing function).

Relevance:

60.00%

Publisher:

Abstract:

The Brix content of pineapple fruit can be non-invasively predicted from the second derivative of near infrared reflectance spectra. Correlations obtained using a NIRSystems 6500 spectrophotometer through multiple linear regression and modified partial least squares analyses using a post-dispersive configuration were comparable with that from a pre-dispersive configuration in terms of accuracy (e.g. coefficient of determination, R2, 0.73; standard error of cross validation, SECV, 1.01°Brix). The effective depth of sample assessed was slightly greater using the post-dispersive technique (about 20 mm for pineapple fruit), as expected in relation to the higher incident light intensity, relative to the pre-dispersive configuration. The effect of such environmental variables as temperature, humidity and external light, and instrumental variables such as the number of scans averaged to form a spectrum, were considered with respect to the accuracy and precision of the measurement of absorbance at 876 nm, as a key term in the calibration for Brix, and predicted Brix. The application of post-dispersive near infrared technology to in-line assessment of intact fruit in a packing shed environment is discussed.

Relevance:

60.00%

Publisher:

Abstract:

The potential of near infra-red (NIR) spectroscopy for non-invasive measurement of fruit quality of pineapple (Ananas comosus var. Smooth Cayenne) and mango (Mangifera indica var. Kensington) fruit was assessed. A remote reflectance fibre optic probe, placed in contact with the fruit skin surface in a light-proof box, was used to deliver monochromatic light to the fruit, and to collect NIR reflectance spectra (760–2500 nm). The probe illuminated and collected reflected radiation from an area of about 16 cm². The NIR spectral attributes were correlated with pineapple juice Brix and with mango flesh dry matter (DM) measured from fruit flesh directly underlying the scanned area. The highest correlations for both fruit were found using the second derivative of the spectra (d2 log 1/R) and an additive calibration equation. Multiple linear regression (MLR) on pineapple fruit spectra (n = 85) gave a calibration equation using d2 log 1/R at wavelengths of 866, 760, 1232 and 832 nm with a multiple coefficient of determination (R2) of 0.75, and a standard error of calibration (SEC) of 1.21 °Brix. Modified partial least squares (MPLS) regression analysis yielded a calibration equation with R2 = 0.91, SEC = 0.69, and a standard error of cross validation (SECV) of 1.09 °Brix. For mango, MLR gave a calibration equation using d2 log 1/R at 904, 872, 1660 and 1516 nm with R2 = 0.90, and SEC = 0.85% DM and a bias of 0.39. Using MPLS analysis, a calibration equation with R2 = 0.98, SEC = 0.54 and SECV = 1.19 was obtained. We conclude that NIR technology offers the potential to assess fruit sweetness in intact whole pineapple and DM in mango fruit, respectively, to within 1 °Brix and 1% DM, and could be used for the grading of fruit in fruit packing sheds.
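The gap between SEC and SECV in the MPLS results (for example 0.69 versus 1.09 for pineapple) reflects the difference between error on the samples used to fit the calibration and error on samples held out of the fit. The sketch below shows both statistics for a one-variable linear calibration. It is an illustration only: the study used MLR and MPLS on multi-wavelength spectra, the divisor conventions for SEC and SECV vary between software packages, and the data are invented.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def sec(xs, ys):
    """Standard error of calibration: residual error on the fitted samples.

    Assumed convention: divide by n - 2 (two fitted parameters).
    """
    slope, intercept = fit_line(xs, ys)
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))


def secv(xs, ys):
    """Standard error of cross validation using leave-one-out refits.

    Assumed convention: root mean square of the held-out residuals.
    """
    squared = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared.append((ys[i] - (slope * xs[i] + intercept)) ** 2)
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical absorbance-derived predictor and reference Brix values.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]
print(sec(xs, ys), secv(xs, ys))
```

A SECV much larger than the SEC is a common warning sign that a calibration is fitted too closely to its own samples.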

Relevance:

60.00%

Publisher:

Abstract:

Volatile chemical compounds responsible for the aroma of wine are derived from a number of different biochemical and chemical pathways. These chemical compounds are formed during grape berry metabolism, crushing of the berries, fermentation processes (i.e. yeast and malolactic bacteria) and also from the ageing and storage of wine. Not surprisingly, there are a large number of chemical classes of compounds found in wine which are present at varying concentrations (ng L−1 to mg L−1), exhibit differing potencies, and have a broad range of volatilities and boiling points. The aim of this work was to investigate the potential use of near infrared (NIR) spectroscopy combined with chemometrics as a rapid and low-cost technique to measure volatile compounds in Riesling wines. Samples of commercial Riesling wine were analyzed using an NIR instrument and volatile compounds by gas chromatography (GC) coupled with selected ion monitoring mass spectrometry. Correlations between the NIR and GC data were developed using partial least-squares (PLS) regression with full cross validation (leave one out). Coefficients of determination in cross validation (R2) and the standard error in cross validation (SECV) were 0.74 (SECV: 313.6 μg L−1) for esters, 0.90 (SECV: 20.9 μg L−1) for monoterpenes and 0.80 (SECV: 1658 μg L−1) for short-chain fatty acids. This study has shown that volatile chemical compounds present in wine can be measured by NIR spectroscopy. Further development with larger data sets will be required to test the predictive ability of the NIR calibration models developed.

Relevance:

60.00%

Publisher:

Abstract:

To facilitate marketing and export, the Australian macadamia industry requires accurate crop forecasts. Each year, two levels of crop predictions are produced for this industry. The first is an overall longer-term forecast based on tree census data of growers in the Australian Macadamia Society (AMS). This data set currently accounts for around 70% of total production, and is supplemented by our best estimates of non-AMS orchards. Given these total tree numbers, average yields per tree are needed to complete the long-term forecasts. Yields from regional variety trials were initially used, but were found to be consistently higher than the average yields that growers were obtaining. Hence, a statistical model was developed using growers' historical yields, also taken from the AMS database. This model accounted for the effects of tree age, variety, year, region and tree spacing, and explained 65% of the total variation in the yield per tree data. The second level of crop prediction is an annual climate adjustment of these overall long-term estimates, taking into account the expected effects on production of the previous year's climate. This adjustment is based on relative historical yields, measured as the percentage deviance between expected and actual production. The dominant climatic variables are observed temperature, evaporation, solar radiation and modelled water stress. Initially, a number of alternate statistical models showed good agreement within the historical data, with jack-knife cross-validation R2 values of 96% or better. However, forecasts varied quite widely between these alternate models. Exploratory multivariate analyses and nearest-neighbour methods were used to investigate these differences. For 2001-2003, the overall forecasts were in the right direction (when compared with the long-term expected values), but were over-estimates. 
In 2004 the forecast was well under the observed production, and in 2005 the revised models produced a forecast within 5.1% of the actual production. Over the first five years of forecasting, the absolute deviance for the climate-adjustment models averaged 10.1%, just outside the targeted objective of 10%.

Relevance:

60.00%

Publisher:

Abstract:

Grass (monocots) and non-grass (dicots) proportions in ruminant diets are important nutritionally because the non-grasses are usually higher in nutritive value, particularly protein, than the grasses, especially in tropical pastures. For ruminants grazing tropical pastures where the grasses are C-4 species and most non-grasses are C-3 species, the ratio of C-13/C-12 in diet and faeces, measured as delta C-13 parts per thousand, is proportional to dietary non-grass%. This paper describes the development of a faecal near infrared (NIR) spectroscopy calibration equation for predicting faecal delta C-13 from which dietary grass and non-grass proportions can be calculated. Calibration development used cattle faeces derived from diets containing only C-3 non-grass and C-4 grass components, and a series of expansion and validation steps was employed to develop robustness and predictive reliability. The final calibration equation contained 1637 samples and faecal delta C-13 range (parts per thousand) of [12.27]-[27.65]. Calibration statistics were: standard error of calibration (SEC) of 0.78, standard error of cross-validation (SECV) of 0.80, standard deviation (SD) of reference values of 3.11 and R-2 of 0.94. Validation statistics for the final calibration equation applied to 60 samples were: standard error of prediction (SEP) of 0.87, bias of -0.15, R-2 of 0.92 and RPD of 3.16. The calibration equation was also tested on faeces from diets containing C-4 non-grass species or temperate C-3 grass species. Faecal delta C-13 predictions indicated that the spectral basis of the calibration was not related to C-13/C-12 ratios per se but to consistent differences between grasses and non-grasses in chemical composition and that the differences were modified by photosynthetic pathway. 
Thus, although the calibration equation could not be used to make valid faecal delta C-13 predictions when the diet contained either C-3 grass or C-4 non-grass, it could be used to make useful estimates of dietary non-grass proportions. It could also be used to make useful estimates of non-grass in mixed C-3 grass/non-grass diets by applying a modified formula to calculate non-grass from predicted faecal delta C-13. The development of a robust faecal-NIR calibration equation for estimating non-grass proportions in the diets of grazing cattle demonstrated a novel and useful application of NIR spectroscopy in agriculture.

Relevance:

60.00%

Publisher:

Abstract:

Acidity in terms of pH and titratable acids influences the texture and flavour of fermented dairy products, such as Kefir. However, the methods for determining pH and titratable acidity (TA) are time consuming. Near infrared (NIR) spectroscopy is a non-destructive method, which simultaneously predicts multiple traits from a single scan and can be used to predict pH and TA. The best pH NIR calibration model was obtained with no spectral pre-treatment applied, whereas smoothing was found to be the best pre-treatment to develop the TA calibration model. Using cross-validation, the prediction results were found acceptable for both pH and TA. With external validation, similar results were found for pH and TA, and both models were found to be acceptable for screening purposes.

Relevance:

60.00%

Publisher:

Abstract:

Hydrogen cyanide (HCN) is a toxic chemical that can potentially cause mild to severe reactions in animals when grazing forage sorghum. Developing technologies to monitor the level of HCN in the growing crop would benefit graziers, so that they can move cattle into paddocks with acceptable levels of HCN. In this study, we developed near-infrared spectroscopy (NIRS) calibrations to estimate HCN in forage sorghum and hay. The full spectral NIRS range (400-2498 nm) was used as well as specific spectral ranges within the full spectral range, i.e., visible (400-750 nm), shortwave (800-1100 nm) and near-infrared (NIR) (1100-2498 nm). Using the full spectrum approach and partial least-squares (PLS), the calibration produced a coefficient of determination (R-2) = 0.838 and standard error of cross-validation (SECV) = 0.040%, while the validation set had an R-2 = 0.824 with a low standard error of prediction (SEP = 0.047%). When using a multiple linear regression (MLR) approach, the best model (NIR spectra) produced an R-2 = 0.847 and standard error of calibration (SEC) = 0.050% and an R-2 = 0.829 and SEP = 0.057% for the validation set. The MLR models built from these spectral regions all used nine wavelengths. Two specific wavelengths, 2034 and 2458 nm, were of interest, with the former associated with C=O carbonyl stretch and the latter associated with C-N-C stretching. The most accurate PLS and MLR models produced a ratio of standard deviation to standard error of prediction of 3.4 and 3.0, respectively, suggesting that the calibrations could be used for screening breeding material. The results indicated that it should be feasible to develop calibrations using PLS or MLR models for a number of users, including breeding programs to screen for genotypes with low HCN, as well as graziers to monitor crop status to help with grazing efficiency.
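The ratio quoted near the end of this abstract is usually called RPD and is computed as the standard deviation of the reference values divided by the standard error of prediction, so larger values indicate a more useful calibration. The sketch below assumes the prediction error is taken as the root mean square error on a validation set; some conventions use a bias-corrected SEP instead. The function names and data are hypothetical.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction on a validation set."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(reference, predicted):
    """Sample standard deviation of reference values over prediction error."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Invented validation data, not values from the study.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.5, 3.5, 5.5]
print(round(rpd(reference, predicted), 2))
```

An RPD near 1 means the calibration predicts no better than the mean of the reference values, which is why values around 3 are described above as adequate for screening.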

Relevance:

60.00%

Publisher:

Abstract:

Context. Irregular plagues of house mice cause high production losses in grain crops in Australia. If plagues can be forecast through broad-scale monitoring or model-based prediction, then mice can be proactively controlled by poison baiting. Aims. To predict mouse plagues in grain crops in Queensland and assess the value of broad-scale monitoring. Methods. Regular trapping of mice at the same sites on the Darling Downs in southern Queensland has been undertaken since 1974. This provides an index of abundance over time that can be related to rainfall, crop yield, winter temperature and past mouse abundance. Other sites have been trapped over a shorter time period elsewhere on the Darling Downs and in central Queensland, allowing a comparison of mouse population dynamics and cross-validation of models predicting mouse abundance. Key results. On the regularly trapped 32-km transect on the Darling Downs, damaging mouse densities occur in 50% of years and a plague in 25% of years, with no detectable increase in mean monthly mouse abundance over the past 35 years. High mouse abundance on this transect is not consistently matched by high abundance in the broader area. Annual maximum mouse abundance in autumn–winter can be predicted (R2 = 57%) from spring mouse abundance and autumn–winter rainfall in the previous year. In central Queensland, mouse dynamics contrast with those on the Darling Downs and lack the distinct annual cycle, with peak abundance occurring in any month outside early spring. On average, damaging mouse densities occur in 1 in 3 years and a plague occurs in 1 in 7 years. The dynamics of mouse populations on two transects ~70 km apart were rarely synchronous. Autumn–winter rainfall can indicate mouse abundance in some seasons (R2 = ~52%). Conclusion. Early warning of mouse plague formation in Queensland grain crops from regional models should trigger farm-based monitoring.
This can be incorporated with rainfall into a simple model predicting future abundance that will determine any need for mouse control. Implications. A model-based warning of a possible mouse plague can highlight the need for local monitoring of mouse activity, which in turn could trigger poison baiting to prevent further mouse build-up.

Relevance:

60.00%

Publisher:

Abstract:

PURPOSE 1) To develop and test decision tree (DT) models to classify physical activity (PA) intensity from accelerometer output and Gross Motor Function Classification System (GMFCS) classification level in ambulatory youth with cerebral palsy (CP); and 2) to compare the classification accuracy of the new DT models to that achieved by previously published cut-points for youth with CP. METHODS Youth with CP (GMFCS Levels I - III) (N=51) completed seven activity trials with increasing PA intensity while wearing a portable metabolic system and ActiGraph GT3X accelerometers. DT models were used to identify vertical axis (VA) and vector magnitude (VM) count thresholds corresponding to sedentary (SED) (<1.5 METs), light PA (LPA) (≥1.5 and <3 METs) and moderate-to-vigorous PA (MVPA) (≥3 METs). Models were trained and cross-validated using the 'rpart' and 'caret' packages within R. RESULTS For the VA (VA_DT) and VM decision trees (VM_DT), a single threshold differentiated LPA from SED, while the threshold for differentiating MVPA from LPA decreased as the level of impairment increased. The average cross-validation accuracy for the VA_DT was 81.1%, 76.7%, and 82.9% for GMFCS levels I, II, and III, respectively. The corresponding cross-validation accuracy for the VM_DT was 80.5%, 75.6%, and 84.2%, respectively. Within each GMFCS level, the decision tree models achieved better PA intensity recognition than previously published cut-points. The accuracy differential was greatest among GMFCS level III participants, in whom the previously published cut-points misclassified 40% of the MVPA activity trials. CONCLUSION GMFCS-specific cut-points provide more accurate assessments of MVPA levels in youth with CP across the full spectrum of ambulatory ability.
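A single count threshold of the kind described in the Results is the simplest possible decision tree, a one-split stump. The sketch below learns such a cut-point and scores it by k-fold cross-validation. It is a Python illustration under stated assumptions, not the 'rpart' and 'caret' workflow the study used in R, and the counts, labels and function names are invented.

```python
def best_threshold(counts, labels):
    """Pick the cut-point that best separates label 0 (below) from label 1.

    Candidates are midpoints between consecutive distinct counts, so the
    data must contain at least two distinct count values.
    """
    vals = sorted(set(counts))
    candidates = [(a + b) / 2 for a, b in zip(vals, vals[1:])]

    def accuracy(t):
        return sum((c >= t) == bool(l) for c, l in zip(counts, labels)) / len(counts)

    return max(candidates, key=accuracy)


def cross_validated_accuracy(counts, labels, k=5):
    """Learn the threshold on k-1 folds and score it on the held-out fold."""
    correct = 0
    for fold in range(k):
        test_idx = set(range(fold, len(counts), k))
        train = [(c, l) for i, (c, l) in enumerate(zip(counts, labels))
                 if i not in test_idx]
        t = best_threshold([c for c, _ in train], [l for _, l in train])
        correct += sum((counts[i] >= t) == bool(labels[i]) for i in test_idx)
    return correct / len(counts)


# Invented accelerometer counts: label 0 = sedentary, label 1 = light activity.
counts = [10, 20, 30, 40, 50, 110, 120, 130, 140, 150]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(best_threshold(counts, labels), cross_validated_accuracy(counts, labels))
```

Fitting a separate threshold per GMFCS level, as the study reports, would amount to running this procedure once on each subgroup.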

Relevance:

60.00%

Publisher:

Abstract:

Four species of large mackerels (Scomberomorus spp.) co-occur in the waters off northern Australia and are important to fisheries in the region. State fisheries agencies monitor these species for fisheries assessment; however, data inaccuracies may exist due to difficulties with identification of these closely related species, particularly when specimens are incomplete from fish processing. This study examined the efficacy of using otolith morphometrics to differentiate among, and predict, the four mackerel species off northeastern Australia. Seven otolith measurements and five shape indices were recorded from 555 mackerel specimens. Multivariate modelling, including linear discriminant analysis (LDA) and support vector machines, successfully differentiated among the four species based on otolith morphometrics. Cross validation determined a predictive accuracy of at least 96% for both models. An optimum predictive model for the four mackerel species was an LDA model that included fork length, feret length, feret width, perimeter, area, roundness, form factor and rectangularity as explanatory variables. This analysis may improve the accuracy of fisheries monitoring, the estimates based on this monitoring (e.g. mortality rate) and the overall management of mackerel species in Australia.

Relevance:

60.00%

Publisher:

Abstract:

BACKGROUND Polygenic risk scores comprising established susceptibility variants have been shown to be informative classifiers for several complex diseases including prostate cancer. For prostate cancer it is unknown if inclusion of genetic markers that have so far not been associated with prostate cancer risk at a genome-wide significant level will improve disease prediction. METHODS We built polygenic risk scores in a large training set comprising over 25,000 individuals. Initially 65 established prostate cancer susceptibility variants were selected. After LD pruning, additional variants were prioritized based on their association with prostate cancer. Six-fold cross validation was performed to assess genetic risk scores and optimize the number of additional variants to be included. The final model was evaluated in an independent study population including 1,370 cases and 1,239 controls. RESULTS The polygenic risk score with 65 established susceptibility variants provided an area under the curve (AUC) of 0.67. Adding an additional 68 novel variants significantly increased the AUC to 0.68 (P = 0.0012) and the net reclassification index by 0.21 (P = 8.5E-08). All novel variants were located in genomic regions established as associated with prostate cancer risk. CONCLUSIONS Inclusion of additional genetic variants from established prostate cancer susceptibility regions improves disease prediction. Prostate 75:1467–1474, 2015. © 2015 Wiley Periodicals, Inc.
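Two building blocks mentioned in this abstract, the fold split and the AUC, can be sketched briefly. This is a generic illustration, not the authors' pipeline: the risk score is assumed to be a weighted sum of risk-allele counts, the AUC is computed by direct pairwise comparison of cases and controls, and all names, weights and genotypes are hypothetical.

```python
def risk_score(allele_counts, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2 per variant)."""
    return sum(w * g for w, g in zip(weights, allele_counts))


def auc(scores, labels):
    """Probability that a random case outscores a random control.

    Ties count as one half. labels: 1 = case, 0 = control.
    """
    cases = [s for s, l in zip(scores, labels) if l == 1]
    controls = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def k_fold_indices(n, k=6):
    """Split indices 0..n-1 into k non-overlapping folds of near-equal size."""
    return [list(range(i, n, k)) for i in range(k)]


# Invented genotypes for four individuals at three variants.
weights = [0.2, 0.1, 0.3]
genotypes = [[2, 1, 2], [1, 2, 1], [0, 1, 0], [0, 0, 1]]
labels = [1, 1, 0, 0]
scores = [risk_score(g, weights) for g in genotypes]
print(auc(scores, labels), k_fold_indices(13))
```

In a cross-validated design, the number of additional variants would be chosen by the AUC on held-out folds only, and the final model would then be scored once on the independent population.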