936 resultados para leave one out cross validation
Resumo:
An interoperable web processing service (WPS) for the automatic interpolation of environmental data has been developed in the frame of the INTAMAP project. In order to assess the performance of the interpolation method implemented, a validation WPS has also been developed. This validation WPS can be used to perform leave one out and K-fold cross validation: a full dataset is submitted and a range of validation statistics and diagnostic plots (e.g. histograms, variogram of residuals, mean errors) is received in return. This paper presents the architecture of the validation WPS and a case study is used to briefly illustrate its use in practice. We conclude with a discussion on the current limitations of the system and make proposals for further developments.
Resumo:
* Работа выполнена при поддержке РФФИ, гранты 07-01-00331-a и 08-01-00944-a
Resumo:
Prostate cancer is the most common non-dermatological cancer amongst men in the developed world. The current definitive diagnosis is core needle biopsy guided by transrectal ultrasound. However, this method suffers from low sensitivity and specificity in detecting cancer. Recently, a new ultrasound based tissue typing approach has been proposed, known as temporal enhanced ultrasound (TeUS). In this approach, a set of temporal ultrasound frames is collected from a stationary tissue location without any intentional mechanical excitation. The main aim of this thesis is to implement a deep learning-based solution for prostate cancer detection and grading using TeUS data. In the proposed solution, convolutional neural networks are trained to extract high-level features from time domain TeUS data in temporally and spatially adjacent frames in nine in vivo prostatectomy cases. This approach avoids information loss due to feature extraction and also improves cancer detection rate. The output likelihoods of two TeUS arrangements are then combined to form our novel decision support system. This deep learning-based approach results in the area under the receiver operating characteristic curve (AUC) of 0.80 and 0.73 for prostate cancer detection and grading, respectively, in leave-one-patient-out cross-validation. Recently, multi-parametric magnetic resonance imaging (mp-MRI) has been utilized to improve detection rate of aggressive prostate cancer. In this thesis, for the first time, we present the fusion of mp-MRI and TeUS for characterization of prostate cancer to compensates the deficiencies of each image modalities and improve cancer detection rate. The results obtained using TeUS are fused with those attained using consolidated mp-MRI maps from multiple MR modalities and cancer delineations on those by multiple clinicians. The proposed fusion approach yields the AUC of 0.86 in prostate cancer detection. The outcomes of this thesis emphasize the viable potential of TeUS as a tissue typing method. Employing this ultrasound-based intervention, which is non-invasive and inexpensive, can be a valuable and practical addition to enhance the current prostate cancer detection.
Resumo:
Objective: The biochemical alterations between inflammatory fibrous hyperplasia (IFH) and normal tissues of buccal mucosa were probed by using the FT-Raman spectroscopy technique. The aim was to find the minimal set of Raman bands that would furnish the best discrimination. Background: Raman-based optical biopsy is a widely recognized potential technique for noninvasive real-time diagnosis. However, few studies had been devoted to the discrimination of very common subtle or early pathologic states as inflammatory processes that are always present on, for example, cancer lesion borders. Methods: Seventy spectra of IFH from 14 patients were compared with 30 spectra of normal tissues from six patients. The statistical analysis was performed with principal components analysis and soft independent modeling class analogy cross-validated, leave-one-out methods. Results: Bands close to 574, 1,100, 1,250 to 1,350, and 1,500 cm(-1) (mainly amino acids and collagen bands) showed the main intragroup variations that are due to the acanthosis process in the IFH epithelium. The 1,200 (C-C aromatic/DNA), 1,350 (CH(2) bending/collagen 1), and 1,730 cm(-1) (collagen III) regions presented the main intergroup variations. This finding was interpreted as originating in an extracellular matrix-degeneration process occurring in the inflammatory tissues. The statistical analysis results indicated that the best discrimination capability (sensitivity of 95% and specificity of 100%) was found by using the 530-580 cm(-1) spectral region. Conclusions: The existence of this narrow spectral window enabling normal and inflammatory diagnosis also had useful implications for an in vivo dispersive Raman setup for clinical applications.
Resumo:
Objective: Several limitations of published bioelectrical impedance analysis (BIA) equations have been reported. The aims were to develop in a multiethnic, elderly population a new prediction equation and cross-validate it along with some published BIA equations for estimating fat-free mass using deuterium oxide dilution as the reference method. Design and setting: Cross-sectional study of elderly from five developing countries. Methods: Total body water (TBW) measured by deuterium dilution was used to determine fat-free mass (FFM) in 383 subjects. Anthropometric and BIA variables were also measured. Only 377 subjects were included for the analysis, randomly divided into development and cross-validation groups after stratified by gender. Stepwise model selection was used to generate the model and Bland Altman analysis was used to test agreement. Results: FFM = 2.95 - 3.89 (Gender) + 0.514 (Ht(2)/Z) + 0.090 (Waist) + 0.156 (Body weight). The model fit parameters were an R(2), total F-Ratio, and the SEE of 0.88, 314.3, and 3.3, respectively. None of the published BIA equations met the criteria for agreement. The new BIA equation underestimated FFM by just 0.3 kg in the cross-validation sample. The mean of the difference between FFM by TBW and the new BIA equation were not significantly different; 95% of the differences were between the limits of agreement of -6.3 to 6.9 kg of FFM. There was no significant association between the mean of the differences and their averages (r = 0.008 and p = 0.2). Conclusions: This new BIA equation offers a valid option compared with some of the current published BIA equations to estimate FFM in elderly subjects from five developing countries.
Resumo:
BACKGROUND/OBJECTIVES: (1) To cross-validate tetra- (4-BIA) and octopolar (8-BIA) bioelectrical impedance analysis vs dual-energy X-ray absorptiometry (DXA) for the assessment of total and appendicular body composition and (2) to evaluate the accuracy of external 4-BIA algorithms for the prediction of total body composition, in a representative sample of Swiss children. SUBJECTS/METHODS: A representative sample of 333 Swiss children aged 6-13 years from the Kinder-Sportstudie (KISS) (ISRCTN15360785). Whole-body fat-free mass (FFM) and appendicular lean tissue mass were measured with DXA. Body resistance (R) was measured at 50 kHz with 4-BIA and segmental body resistance at 5, 50, 250 and 500 kHz with 8-BIA. The resistance index (RI) was calculated as height(2)/R. Selection of predictors (gender, age, weight, RI4 and RI8) for BIA algorithms was performed using bootstrapped stepwise linear regression on 1000 samples. We calculated 95% confidence intervals (CI) of regression coefficients and measures of model fit using bootstrap analysis. Limits of agreement were used as measures of interchangeability of BIA with DXA. RESULTS: 8-BIA was more accurate than 4-BIA for the assessment of FFM (root mean square error (RMSE)=0.90 (95% CI 0.82-0.98) vs 1.12 kg (1.01-1.24); limits of agreement 1.80 to -1.80 kg vs 2.24 to -2.24 kg). 8-BIA also gave accurate estimates of appendicular body composition, with RMSE < or = 0.10 kg for arms and < or = 0.24 kg for legs. All external 4-BIA algorithms performed poorly with substantial negative proportional bias (r> or = 0.48, P<0.001). CONCLUSIONS: In a representative sample of young Swiss children (1) 8-BIA was superior to 4-BIA for the prediction of FFM, (2) external 4-BIA algorithms gave biased predictions of FFM and (3) 8-BIA was an accurate predictor of segmental body composition.
Resumo:
In the last five years, Deep Brain Stimulation (DBS) has become the most popular and effective surgical technique for the treatent of Parkinson's disease (PD). The Subthalamic Nucleus (STN) is the usual target involved when applying DBS. Unfortunately, the STN is in general not visible in common medical imaging modalities. Therefore, atlas-based segmentation is commonly considered to locate it in the images. In this paper, we propose a scheme that allows both, to perform a comparison between different registration algorithms and to evaluate their ability to locate the STN automatically. Using this scheme we can evaluate the expert variability against the error of the algorithms and we demonstrate that automatic STN location is possible and as accurate as the methods currently used.
Resumo:
In the last five years, Deep Brain Stimulation (DBS) has become the most popular and effective surgical technique for the treatent of Parkinson's disease (PD). The Subthalamic Nucleus (STN) is the usual target involved when applying DBS. Unfortunately, the STN is in general not visible in common medical imaging modalities. Therefore, atlas-based segmentation is commonly considered to locate it in the images. In this paper, we propose a scheme that allows both, to perform a comparison between different registration algorithms and to evaluate their ability to locate the STN automatically. Using this scheme we can evaluate the expert variability against the error of the algorithms and we demonstrate that automatic STN location is possible and as accurate as the methods currently used.
Resumo:
Head space gas chromatography with flame-ionization detection (HS-GC-FID), ancl purge and trap gas chromatography-mass spectrometry (P&T-GC-MS) have been used to determine methyl-tert-butyl ether (MTBE) and benzene, toluene, and the ylenes (BTEX) in groundwater. In the work discussed in this paper measures of quality, e.g. recovery (94-111%), precision (4.6 - 12.2%), limits of detection (0.3 - 5.7 I~g L 1 for HS and 0.001 I~g L 1 for PT), and robust-ness, for both methods were compared. In addition, for purposes of comparison, groundwater samples from areas suffering from odor problems because of fuel spillage and tank leakage were analyzed by use of both techniques. For high concentration levels there was good correlation between results from both methods.
Resumo:
Head space gas chromatography with flame-ionization detection (HS-GC-FID), ancl purge and trap gas chromatography-mass spectrometry (P&T-GC-MS) have been used to determine methyl-tert-butyl ether (MTBE) and benzene, toluene, and the ylenes (BTEX) in groundwater. In the work discussed in this paper measures of quality, e.g. recovery (94-111%), precision (4.6 - 12.2%), limits of detection (0.3 - 5.7 I~g L 1 for HS and 0.001 I~g L 1 for PT), and robust-ness, for both methods were compared. In addition, for purposes of comparison, groundwater samples from areas suffering from odor problems because of fuel spillage and tank leakage were analyzed by use of both techniques. For high concentration levels there was good correlation between results from both methods.
Resumo:
Does Independent Component Analysis (ICA) denature EEG signals? We applied ICA to two groups of subjects (mild Alzheimer patients and control subjects). The aim of this study was to examine whether or not the ICA method can reduce both group di®erences and within-subject variability. We found that ICA diminished Leave-One- Out root mean square error (RMSE) of validation (from 0.32 to 0.28), indicative of the reduction of group di®erence. More interestingly, ICA reduced the inter-subject variability within each group (¾ = 2:54 in the ± range before ICA, ¾ = 1:56 after, Bartlett p = 0.046 after Bonfer- roni correction). Additionally, we present a method to limit the impact of human error (' 13:8%, with 75.6% inter-cleaner agreement) during ICA cleaning, and reduce human bias. These ¯ndings suggests the novel usefulness of ICA in clinical EEG in Alzheimer's disease for reduction of subject variability.
Resumo:
Typically, algorithms for generating stereo disparity maps have been developed to minimise the energy equation of a single image. This paper proposes a method for implementing cross validation in a belief propagation optimisation. When tested using the Middlebury online stereo evaluation, the cross validation improves upon the results of standard belief propagation. Furthermore, it has been shown that regions of homogeneous colour within the images can be used for enforcing the so-called "Segment Constraint". Developing from this, Segment Support is introduced to boost belief between pixels of the same image region and improve propagation into textureless regions.
Resumo:
We propose a new sparse model construction method aimed at maximizing a model’s generalisation capability for a large class of linear-in-the-parameters models. The coordinate descent optimization algorithm is employed with a modified l1- penalized least squares cost function in order to estimate a single parameter and its regularization parameter simultaneously based on the leave one out mean square error (LOOMSE). Our original contribution is to derive a closed form of optimal LOOMSE regularization parameter for a single term model, for which we show that the LOOMSE can be analytically computed without actually splitting the data set leading to a very simple parameter estimation method. We then integrate the new results within the coordinate descent optimization algorithm to update model parameters one at the time for linear-in-the-parameters models. Consequently a fully automated procedure is achieved without resort to any other validation data set for iterative model evaluation. Illustrative examples are included to demonstrate the effectiveness of the new approaches.
Resumo:
An efficient two-level model identification method aiming at maximising a model׳s generalisation capability is proposed for a large class of linear-in-the-parameters models from the observational data. A new elastic net orthogonal forward regression (ENOFR) algorithm is employed at the lower level to carry out simultaneous model selection and elastic net parameter estimation. The two regularisation parameters in the elastic net are optimised using a particle swarm optimisation (PSO) algorithm at the upper level by minimising the leave one out (LOO) mean square error (LOOMSE). There are two elements of original contributions. Firstly an elastic net cost function is defined and applied based on orthogonal decomposition, which facilitates the automatic model structure selection process with no need of using a predetermined error tolerance to terminate the forward selection process. Secondly it is shown that the LOOMSE based on the resultant ENOFR models can be analytically computed without actually splitting the data set, and the associate computation cost is small due to the ENOFR procedure. Consequently a fully automated procedure is achieved without resort to any other validation data set for iterative model evaluation. Illustrative examples are included to demonstrate the effectiveness of the new approaches.
Resumo:
Il presente lavoro di tesi si inserisce nell’ambito della classificazione di dati ad alta dimensionalità, sviluppando un algoritmo basato sul metodo della Discriminant Analysis. Esso classifica i campioni attraverso le variabili prese a coppie formando un network a partire da quelle che hanno una performance sufficientemente elevata. Successivamente, l’algoritmo si avvale di proprietà topologiche dei network (in particolare la ricerca di subnetwork e misure di centralità di singoli nodi) per ottenere varie signature (sottoinsiemi delle variabili iniziali) con performance ottimali di classificazione e caratterizzate da una bassa dimensionalità (dell’ordine di 101, inferiore di almeno un fattore 103 rispetto alle variabili di partenza nei problemi trattati). Per fare ciò, l’algoritmo comprende una parte di definizione del network e un’altra di selezione e riduzione della signature, calcolando ad ogni passaggio la nuova capacità di classificazione operando test di cross-validazione (k-fold o leave- one-out). Considerato l’alto numero di variabili coinvolte nei problemi trattati – dell’ordine di 104 – l’algoritmo è stato necessariamente implementato su High-Performance Computer, con lo sviluppo in parallelo delle parti più onerose del codice C++, nella fattispecie il calcolo vero e proprio del di- scriminante e il sorting finale dei risultati. L’applicazione qui studiata è a dati high-throughput in ambito genetico, riguardanti l’espressione genica a livello cellulare, settore in cui i database frequentemente sono costituiti da un numero elevato di variabili (104 −105) a fronte di un basso numero di campioni (101 −102). In campo medico-clinico, la determinazione di signature a bassa dimensionalità per la discriminazione e classificazione di campioni (e.g. sano/malato, responder/not-responder, ecc.) è un problema di fondamentale importanza, ad esempio per la messa a punto di strategie terapeutiche personalizzate per specifici sottogruppi di pazienti attraverso la realizzazione di kit diagnostici per l’analisi di profili di espressione applicabili su larga scala. L’analisi effettuata in questa tesi su vari tipi di dati reali mostra che il metodo proposto, anche in confronto ad altri metodi esistenti basati o me- no sull’approccio a network, fornisce performance ottime, tenendo conto del fatto che il metodo produce signature con elevate performance di classifica- zione e contemporaneamente mantenendo molto ridotto il numero di variabili utilizzate per questo scopo.