874 results for Cross-validation
Abstract:
This paper deals with an experimental study of pressure-swirl hydraulic injector nozzles using non-intrusive optical techniques. Experiments were conducted to study atomization characteristics using two nozzles with different orifice diameters, 0.3 mm and 0.5 mm, and injection pressures of 0.3-3.5 MPa, which correspond to Reynolds numbers (Re-p) of 7,000-45,000, depending on the nozzle used. Three laser diagnostic techniques were used: Shadowgraph, PIV (Particle Image Velocimetry), and PDPA (Phase Doppler Particle Anemometry). Measurements made in the spray in both axial and radial directions indicate that velocity, average droplet diameter profiles, and spray dynamics are highly dependent on the nozzle characteristics and injection pressure. Limitations of these techniques in the different flow regimes, related to the primary and secondary breakups as well as coalescence, are provided. Results indicate that all three techniques provide similar results throughout the different regimes. Shadowgraph and PDPA were possible in the secondary atomization and coalescence regimes, while PIV measurements could be made only at the end of secondary atomization and coalescence.
Abstract:
The standard, ad-hoc stopping criteria used in decision tree-based context clustering are known to be sub-optimal and require parameters to be tuned. This paper proposes a new approach for decision tree-based context clustering based on cross validation and hierarchical priors. Combination of cross validation and hierarchical priors within decision tree-based context clustering offers better model selection and more robust parameter estimation than conventional approaches, with no tuning parameters. Experimental results on HMM-based speech synthesis show that the proposed approach achieved significant improvements in naturalness of synthesized speech over the conventional approaches. © 2011 IEEE.
Abstract:
The validation of variable-density flow models simulating seawater intrusion in coastal aquifers requires information about concentration distribution in groundwater. Electrical resistivity tomography (ERT) provides relevant data for this purpose. However, inverse modeling is not accurate because of the non-uniqueness of solutions. Such difficulties in evaluating seawater intrusion can be overcome by coupling geophysical data and groundwater modeling. First, the resistivity distribution obtained by inverse geo-electrical modeling is established. Second, a 3-D variable-density flow hydrogeological model is developed. Third, using Archie's Law, the electrical resistivity model deduced from salt concentration is compared to the formerly interpreted electrical model. Finally, aside from that usual comparison-validation, the theoretical geophysical response of concentrations simulated with the groundwater model can be compared to field-measured resistivity data. This constitutes a cross-validation of both the inverse geo-electrical model and the groundwater model.
[Comte, J.-C., and O. Banton (2007), Cross-validation of geo-electrical and hydrogeological models to evaluate seawater intrusion in coastal aquifers, Geophys. Res. Lett., 34, L10402, doi:10.1029/2007GL029981.]
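As a rough illustration of the third step above, Archie's Law for a fully saturated medium relates bulk resistivity to pore-water resistivity through a formation factor. The sketch below is only a minimal example of that relation: the function name, the default values of the empirical constants `a` and `m`, and the sample inputs are assumptions for illustration and are not taken from the paper. The conversion from simulated salt concentration to pore-water resistivity is not specified in the abstract and is left out.

```python
def archie_bulk_resistivity(fluid_resistivity, porosity, a=1.0, m=2.0):
    """Bulk resistivity (ohm-m) of a fully saturated medium from Archie's Law.

    rho_bulk = a * porosity**(-m) * rho_fluid

    `a` (tortuosity factor) and `m` (cementation exponent) are empirical;
    the defaults here are illustrative assumptions, not values from the paper.
    """
    if not 0.0 < porosity <= 1.0:
        raise ValueError("porosity must be in (0, 1]")
    if fluid_resistivity <= 0.0:
        raise ValueError("fluid resistivity must be positive")
    formation_factor = a * porosity ** (-m)
    return formation_factor * fluid_resistivity


# Hypothetical inputs: saltier pore water has lower resistivity,
# so the predicted bulk resistivity drops as well.
fresh = archie_bulk_resistivity(fluid_resistivity=20.0, porosity=0.25)
salty = archie_bulk_resistivity(fluid_resistivity=0.2, porosity=0.25)
```

A resistivity model computed this way from simulated concentrations is what would be compared against the resistivity model obtained by inversion.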
Abstract:
The paper addresses the choice of bandwidth in semiparametric estimation of the long memory parameter of a univariate time series process. The focus is on the properties of forecasts from the long memory model. A variety of cross-validation methods based on out-of-sample forecasting properties are proposed. These procedures are used for the choice of bandwidth and subsequent model selection. Simulation evidence is presented that demonstrates the advantage of the proposed methodology.
Abstract:
Typically, algorithms for generating stereo disparity maps have been developed to minimise the energy equation of a single image. This paper proposes a method for implementing cross validation in a belief propagation optimisation. When tested using the Middlebury online stereo evaluation, the cross validation improves upon the results of standard belief propagation. Furthermore, it has been shown that regions of homogeneous colour within the images can be used for enforcing the so-called "Segment Constraint". Developing from this, Segment Support is introduced to boost belief between pixels of the same image region and improve propagation into textureless regions.
Abstract:
We present cross-validation of remote sensing measurements of methane profiles in the Canadian high Arctic. Accurate and precise measurements of methane are essential to understand quantitatively its role in the climate system and in global change. Here, we show a cross-validation between three datasets: two from spaceborne instruments and one from a ground-based instrument. All are Fourier Transform Spectrometers (FTSs). We consider the Canadian SCISAT Atmospheric Chemistry Experiment (ACE)-FTS, a solar occultation infrared spectrometer operating since 2004, and the thermal infrared band of the Japanese Greenhouse Gases Observing Satellite (GOSAT) Thermal And Near infrared Sensor for carbon Observation (TANSO)-FTS, a nadir/off-nadir scanning FTS instrument operating at solar and terrestrial infrared wavelengths, since 2009. The ground-based instrument is a Bruker 125HR Fourier Transform Infrared (FTIR) spectrometer, measuring mid-infrared solar absorption spectra at the Polar Environment Atmospheric Research Laboratory (PEARL) Ridge Lab at Eureka, Nunavut (80° N, 86° W) since 2006. For each pair of instruments, measurements are collocated within 500 km and 24 h. An additional criterion based on potential vorticity values was found not to significantly affect differences between measurements. Profiles are regridded to a common vertical grid for each comparison set. To account for differing vertical resolutions, ACE-FTS measurements are smoothed to the resolution of either PEARL-FTS or TANSO-FTS, and PEARL-FTS measurements are smoothed to the TANSO-FTS resolution. Differences for each pair are examined in terms of profile and partial columns. During the period considered, the number of collocations for each pair is large enough to obtain a good sample size (from several hundred to tens of thousands depending on pair and configuration). 
Considering full profiles, the degrees of freedom for signal (DOFS) are between 0.2 and 0.7 for TANSO-FTS and between 1.5 and 3 for PEARL-FTS, while ACE-FTS has considerably more information (roughly 1 degree of freedom per altitude level). We take partial columns between roughly 5 and 30 km for the ACE-FTS–PEARL-FTS comparison, and between 5 and 10 km for the other pairs. The DOFS for the partial columns are between 1.2 and 2 for PEARL-FTS collocated with ACE-FTS, between 0.1 and 0.5 for PEARL-FTS collocated with TANSO-FTS or for TANSO-FTS collocated with either other instrument, while ACE-FTS has much higher information content. For all pairs, the partial column differences are within ± 3 × 10^22 molecules cm−2. Expressed as median ± median absolute deviation (in absolute or relative terms), these differences are 0.11 ± 9.60 × 10^20 molecules cm−2 (0.012 ± 1.018 %) for TANSO-FTS–PEARL-FTS, −2.6 ± 2.6 × 10^21 molecules cm−2 (−1.6 ± 1.6 %) for ACE-FTS–PEARL-FTS, and 7.4 ± 6.0 × 10^20 molecules cm−2 (0.78 ± 0.64 %) for TANSO-FTS–ACE-FTS. The differences for ACE-FTS–PEARL-FTS and TANSO-FTS–PEARL-FTS partial columns decrease significantly as a function of PEARL partial columns, whereas the range of partial column values for TANSO-FTS–ACE-FTS collocations is too small to draw any conclusion on its dependence on ACE-FTS partial columns.
Abstract:
Estimation of the number of mixture components (k) is an unsolved problem. Available methods for estimation of k include bootstrapping the likelihood ratio test statistic and optimizing a variety of validity functionals such as AIC, BIC/MDL, and ICOMP. We investigate the minimization of the distance between the fitted mixture model and the true density as a method for estimating k. The distances considered are Kullback-Leibler (KL) and $L_2$. We estimate these distances using cross validation. A reliable estimate of k is obtained by a vote among B estimates of k corresponding to B cross validation estimates of distance. This estimation method with KL distance is very similar to the Monte Carlo cross validated likelihood methods discussed by Smyth (2000). With a focus on univariate normal mixtures, we present simulation studies that compare the cross validated distance method with AIC, BIC/MDL, and ICOMP. We also apply the cross validation estimate of distance approach, along with the AIC, BIC/MDL, and ICOMP approaches, to data from an osteoporosis drug trial in order to find groups that respond differentially to treatment.
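The voting step described above can be sketched as follows. This is only an illustration of the idea, not the authors' procedure: the function name, the tie-breaking rule (prefer the smaller k), and the distance values are all hypothetical, and the sketch assumes the B cross validation distance estimates have already been computed for each candidate k.

```python
from collections import Counter


def vote_for_k(distance_estimates):
    """Pick k by majority vote over B cross validation replicates.

    distance_estimates: list of B dicts, each mapping a candidate k to the
    estimated distance (e.g. KL or L2) between the fitted k-component
    mixture and the true density for that replicate.
    Ties are broken in favour of the smaller k (an assumption).
    """
    winners = []
    for estimate in distance_estimates:
        # sorted() visits k in ascending order, so min() keeps the
        # smallest k when two candidates have equal distance.
        best_k = min(sorted(estimate), key=lambda k: estimate[k])
        winners.append(best_k)
    counts = Counter(winners)
    top = max(counts.values())
    chosen = min(k for k, c in counts.items() if c == top)
    return chosen, counts


# Hypothetical distance estimates for B = 3 replicates and k = 1, 2, 3.
replicates = [
    {1: 0.42, 2: 0.31, 3: 0.33},
    {1: 0.40, 2: 0.29, 3: 0.30},
    {1: 0.45, 2: 0.36, 3: 0.34},
]
chosen_k, votes = vote_for_k(replicates)
```

Voting over replicates, rather than averaging the distances first, makes the choice less sensitive to a single replicate with an unusually large or small estimate.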
Abstract:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric (MSEP$_m$) and non-parametric (PRESS) assessments in the entire sample, and two data splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on $\beta$) unbiased; however, MSEP$_m$ is unbiased on average and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, MSEP$_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples.
Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias, and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development, and a leave-one-out statistic (e.g. PRESS) be used for assessment.
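For readers unfamiliar with PRESS, the following minimal sketch computes it for a one-predictor least-squares line in two ways: by refitting with each observation left out, and by the standard leverage shortcut, in which each ordinary residual is divided by (1 - h_ii). The single-predictor setting, the function names, and the data are assumptions chosen to keep the example short; the study itself concerns multiple stochastic regressors.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def press_by_refit(xs, ys):
    """PRESS: sum of squared leave-one-out prediction errors."""
    total = 0.0
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (intercept + slope * xs[i])) ** 2
    return total


def press_by_leverage(xs, ys):
    """Same statistic from one fit, using e_i / (1 - h_ii)."""
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    total = 0.0
    for x, y in zip(xs, ys):
        leverage = 1.0 / n + (x - x_bar) ** 2 / sxx
        total += ((y - (intercept + slope * x)) / (1.0 - leverage)) ** 2
    return total


# Hypothetical data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3]
press_refit = press_by_refit(xs, ys)
press_shortcut = press_by_leverage(xs, ys)
```

The two functions agree up to floating-point error, which is why PRESS can be obtained from a single fit on the entire sample without holding any data back.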
Abstract:
This study aimed to replicate and cross-validate the Rapid Screen of Concussion (RSC) for diagnosing mild TBI (mTBI). One hundred (81 male, 19 female) cases of mTBI and 35 (23 male and 12 female) cases of orthopaedic injuries were tested within 24 hr of injury. Double cross-validation was used to examine whether total RSC scores obtained in the current sample generalised to one previously reported. In the new sample, mTBI patients answered fewer orientation questions, recalled fewer words on the learning trial and after a delay, judged fewer sentences in 2 min, and completed fewer symbols in the Digit Symbol Substitution Test than orthopaedic controls. The formulae and cut-offs developed on the original and new samples produced similar sensitivity and overall correct classification rates. Inclusion of the Digit Symbol Substitution Test performance of the new sample improved the sensitivity (80.2%) and specificity (82.6%) in males. It did not improve the correct classification rate in females, which was 89.5% sensitivity and 91.7% specificity before the inclusion of the Digit Symbol Substitution Test. Taken together, these results indicate that a combined score on this 12-min screen yields a measure of level of brain impairment up to 24 hr after mTBI.
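The sensitivity and specificity figures above follow from simple ratios, sketched below. The abstract does not report the underlying classification counts; the counts used here are assumptions, chosen only because, with the stated group sizes (19 female and 81 male mTBI cases, 12 female and 23 male controls), they reproduce the reported percentages.

```python
def sensitivity(true_pos, false_neg):
    """Share of injured cases the screen correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of controls the screen correctly clears."""
    return true_neg / (true_neg + false_pos)


# Assumed counts (not given in the abstract), consistent with group sizes.
female_sens = sensitivity(true_pos=17, false_neg=2)    # 17 of 19 mTBI cases
female_spec = specificity(true_neg=11, false_pos=1)    # 11 of 12 controls
male_sens = sensitivity(true_pos=65, false_neg=16)     # 65 of 81 mTBI cases
male_spec = specificity(true_neg=19, false_pos=4)      # 19 of 23 controls
```

With only 12 female controls, a single misclassified control changes specificity by about 8 percentage points, so the female figures in particular rest on small counts.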
Abstract:
It is known theoretically that an algorithm cannot be good for an arbitrary prior. We show that in practical terms this also applies to the technique of "cross validation", which has been widely regarded as defying this general rule. Numerical examples are analysed in detail. Their implications for research on learning algorithms are discussed.
Abstract:
Poster presented at the First International Congress of CiiEM - From Basic Sciences To Clinical Research. Egas Moniz, Caparica, Portugal, 27-28 November 2015.