Biblioteca Digital

967 resultados para random forest

SURVIVAL PREDICTION FOR BRAIN TUMOR PATIENTS USING GENE EXPRESSION DATA

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e. short vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Following, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Due to a large heterogeneity observed within prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death with gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model which is flexible to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects which decreases computation time for model fitting. Also, our proposed models provides a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods under the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.

Fully Automatic CT Segmentation for Computer-Assisted Pre-operative Planning of Hip Arthroscopy

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Extraction of both pelvic and femoral surface models of a hip joint from CT data for computer-assisted pre-operative planning of hip arthroscopy is addressed. We present a method for a fully automatic image segmentation of a hip joint. Our method works by combining fast random forest (RF) regression based landmark detection, atlas-based segmentation, with articulated statistical shape model (aSSM) based hip joint reconstruction. The two fundamental contributions of our method are: (1) An improved fast Gaussian transform (IFGT) is used within the RF regression framework for a fast and accurate landmark detection, which then allows for a fully automatic initialization of the atlas-based segmentation; and (2) aSSM based fitting is used to preserve hip joint structure and to avoid penetration between the pelvic and femoral models. Validation on 30 hip CT images show that our method achieves high performance in segmenting pelvis, left proximal femur, and right proximal femur surfaces with an average accuracy of 0.59 mm, 0.62 mm, and 0.58 mm, respectively.

FACTS: Fully Automatic CT Segmentation of a Hip Joint

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Extraction of surface models of a hip joint from CT data is a pre-requisite step for computer assisted diagnosis and planning (CADP) of periacetabular osteotomy (PAO). Most of existing CADP systems are based on manual segmentation, which is time-consuming and hard to achieve reproducible results. In this paper, we present a Fully Automatic CT Segmentation (FACTS) approach to simultaneously extract both pelvic and femoral models. Our approach works by combining fast random forest (RF) regression based landmark detection, multi-atlas based segmentation, with articulated statistical shape model (aSSM) based fitting. The two fundamental contributions of our approach are: (1) an improved fast Gaussian transform (IFGT) is used within the RF regression framework for a fast and accurate landmark detection, which then allows for a fully automatic initialization of the multi-atlas based segmentation; and (2) aSSM based fitting is used to preserve hip joint structure and to avoid penetration between the pelvic and femoral models. Taking manual segmentation as the ground truth, we evaluated the present approach on 30 hip CT images (60 hips) with a 6-fold cross validation. When the present approach was compared to manual segmentation, a mean segmentation accuracy of 0.40, 0.36, and 0.36 mm was found for the pelvis, the left proximal femur, and the right proximal femur, respectively. When the models derived from both segmentations were used to compute the PAO diagnosis parameters, a difference of 2.0 ± 1.5°, 2.1 ± 1.6°, and 3.5 ± 2.3% were found for anteversion, inclination, and acetabular coverage, respectively. The achieved accuracy is regarded as clinically accurate enough for our target applications.

Recognition of activities of daily living in healthy subjects using two ad-hoc classifiers

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Activities of daily living (ADL) are important for quality of life. They are indicators of cognitive health status and their assessment is a measure of independence in everyday living. ADL are difficult to reliably assess using questionnaires due to self-reporting biases. Various sensor-based (wearable, in-home, intrusive) systems have been proposed to successfully recognize and quantify ADL without relying on self-reporting. New classifiers required to classify sensor data are on the rise. We propose two ad-hoc classifiers that are based only on non-intrusive sensor data. METHODS: A wireless sensor system with ten sensor boxes was installed in the home of ten healthy subjects to collect ambient data over a duration of 20 consecutive days. A handheld protocol device and a paper logbook were also provided to the subjects. Eight ADL were selected for recognition. We developed two ad-hoc ADL classifiers, namely the rule based forward chaining inference engine (RBI) classifier and the circadian activity rhythm (CAR) classifier. The RBI classifier finds facts in data and matches them against the rules. The CAR classifier works within a framework to automatically rate routine activities to detect regular repeating patterns of behavior. For comparison, two state-of-the-art [Naïves Bayes (NB), Random Forest (RF)] classifiers have also been used. All classifiers were validated with the collected data sets for classification and recognition of the eight specific ADL. RESULTS: Out of a total of 1,373 ADL, the RBI classifier correctly determined 1,264, while missing 109 and the CAR determined 1,305 while missing 68 ADL. The RBI and CAR classifier recognized activities with an average sensitivity of 91.27 and 94.36%, respectively, outperforming both RF and NB. CONCLUSIONS: The performance of the classifiers varied significantly and shows that the classifier plays an important role in ADL recognition. Both RBI and CAR classifier performed better than existing state-of-the-art (NB, RF) on all ADL. Of the two ad-hoc classifiers, the CAR classifier was more accurate and is likely to be better suited than the RBI for distinguishing and recognizing complex ADL.

Evaluation of Three State-of-the-Art Classifiers for Recognition of Activities of Daily Living from Smart Home Ambient Data

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Smart homes for the aging population have recently started attracting the attention of the research community. The "health state" of smart homes is comprised of many different levels; starting with the physical health of citizens, it also includes longer-term health norms and outcomes, as well as the arena of positive behavior changes. One of the problems of interest is to monitor the activities of daily living (ADL) of the elderly, aiming at their protection and well-being. For this purpose, we installed passive infrared (PIR) sensors to detect motion in a specific area inside a smart apartment and used them to collect a set of ADL. In a novel approach, we describe a technology that allows the ground truth collected in one smart home to train activity recognition systems for other smart homes. We asked the users to label all instances of all ADL only once and subsequently applied data mining techniques to cluster in-home sensor firings. Each cluster would therefore represent the instances of the same activity. Once the clusters were associated to their corresponding activities, our system was able to recognize future activities. To improve the activity recognition accuracy, our system preprocessed raw sensor data by identifying overlapping activities. To evaluate the recognition performance from a 200-day dataset, we implemented three different active learning classification algorithms and compared their performance: naive Bayesian (NB), support vector machine (SVM) and random forest (RF). Based on our results, the RF classifier recognized activities with an average specificity of 96.53%, a sensitivity of 68.49%, a precision of 74.41% and an F-measure of 71.33%, outperforming both the NB and SVM classifiers. Further clustering markedly improved the results of the RF classifier. An activity recognition system based on PIR sensors in conjunction with a clustering classification approach was able to detect ADL from datasets collected from different homes. Thus, our PIR-based smart home technology could improve care and provide valuable information to better understand the functioning of our societies, as well as to inform both individual and collective action in a smart city scenario.

Facial Nerve Image Enhancement from CBCT using Supervised Learning Technique

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Facial nerve segmentation plays an important role in surgical planning of cochlear implantation. Clinically available CBCT images are used for surgical planning. However, its relatively low resolution renders the identification of the facial nerve difficult. In this work, we present a supervised learning approach to enhance facial nerve image information from CBCT. A supervised learning approach based on multi-output random forest was employed to learn the mapping between CBCT and micro-CT images. Evaluation was performed qualitatively and quantitatively by using the predicted image as input for a previously published dedicated facial nerve segmentation, and cochlear implantation surgical planning software, OtoPlan. Results show the potential of the proposed approach to improve facial nerve image quality as imaged by CBCT and to leverage its segmentation using OtoPlan.

Automatic quality control in clinical (1) H MRSI of brain cancer.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

MRSI grids frequently show spectra with poor quality, mainly because of the high sensitivity of MRS to field inhomogeneities. These poor quality spectra are prone to quantification and/or interpretation errors that can have a significant impact on the clinical use of spectroscopic data. Therefore, quality control of the spectra should always precede their clinical use. When performed manually, quality assessment of MRSI spectra is not only a tedious and time-consuming task, but is also affected by human subjectivity. Consequently, automatic, fast and reliable methods for spectral quality assessment are of utmost interest. In this article, we present a new random forest-based method for automatic quality assessment of (1) H MRSI brain spectra, which uses a new set of MRS signal features. The random forest classifier was trained on spectra from 40 MRSI grids that were classified as acceptable or non-acceptable by two expert spectroscopists. To account for the effects of intra-rater reliability, each spectrum was rated for quality three times by each rater. The automatic method classified these spectra with an area under the curve (AUC) of 0.976. Furthermore, in the subset of spectra containing only the cases that were classified every time in the same way by the spectroscopists, an AUC of 0.998 was obtained. Feature importance for the classification was also evaluated. Frequency domain skewness and kurtosis, as well as time domain signal-to-noise ratios (SNRs) in the ranges 50-75 ms and 75-100 ms, were the most important features. Given that the method is able to assess a whole MRSI grid faster than a spectroscopist (approximately 3 s versus approximately 3 min), and without loss of accuracy (agreement between classifier trained with just one session and any of the other labelling sessions, 89.88%; agreement between any two labelling sessions, 89.03%), the authors suggest its implementation in the clinical routine. The method presented in this article was implemented in jMRUI's SpectrIm plugin. Copyright © 2016 John Wiley & Sons, Ltd.

Literature review of elasmobranch bycatch and retention rate

Relevância:

60.00% 60.00%

Publicador:

Resumo:

To address growing concern over the effects of fisheries non-target catch on elasmobranchs worldwide, the accurate reporting of elasmobranch catch is essential. This requires data on a combination of measures, including reported landings, retained and discarded non-target catch, and post-discard survival. Identification of the factors influencing discard vs. retention is needed to improve catch estimates and to determine wasteful fishing practices. To do this we compared retention rates of elasmobranch non-target catch in a broad subset of fisheries throughout the world by taxon, fishing country, and gear. A regression tree and random forest analysis indicated that taxon was the most important determinant of retention in this dataset, but all three factors together explained 59% of the variance. Estimates of total elasmobranch removals were calculated by dividing the FAO global elasmobranch landings by average retention rates and suggest that total elasmobranch removals may exceed FAO reported landings by as much as 400%. This analysis is the first effort to directly characterize global drivers of discards for elasmobranch non-target catch. Our results highlight the importance of accurate quantification of retention and discard rates to improve assessments of the potential impacts of fisheries on these species.

Planktonic foraminiferal assemblages and SST calculation of sediment core TAN1106/28

Relevância:

60.00% 60.00%

Publicador:

Planktonic foraminiferal assemblages and SST calculation of sediment core TAN1106/34

Relevância:

60.00% 60.00%

Publicador:

Planktonic foraminiferal assemblages and SST calculation of sediment core TAN1106/89

Relevância:

60.00% 60.00%

Publicador:

Planktonic foraminiferal assemblages and SST calculation of sediment surface samples of the Solander Trough, New Zealand

Relevância:

60.00% 60.00%

Publicador:

Planktonic foraminiferal assemblages and SST calculation of sediment core TAN0803/9

Relevância:

60.00% 60.00%

Publicador:

Planktonic foraminiferal assemblages and SST calculation of sediment core TAN1106/43

Relevância:

60.00% 60.00%

Publicador:

The aid of machine learning to overcome the classification of real health discharge reports written in Spanish

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Hospitals attached to the Spanish Ministry of Health are currently using the International Classification of Diseases 9 Clinical Modification (ICD9-CM) to classify health discharge records. Nowadays, this work is manually done by experts. This paper tackles the automatic classification of real Discharge Records in Spanish following the ICD9-CM standard. The challenge is that the Discharge Records are written in spontaneous language. We explore several machine learning techniques to deal with the classification problem. Random Forest resulted in the most competitive one, achieving an F-measure of 0.876.

«
1
2
3
4
5
6
7
8
...
64
65
»