447 results for classifiers


Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: Early detection and treatment of colorectal adenomatous polyps (AP) and colorectal cancer (CRC) is associated with decreased mortality for CRC. However, accurate, non-invasive and compliant tests to screen for AP and early stages of CRC are not yet available. A blood-based screening test is highly attractive due to limited invasiveness and high acceptance rate among patients. AIM: To demonstrate whether gene expression signatures in the peripheral blood mononuclear cells (PBMC) were able to detect the presence of AP and early stages CRC. METHODS: A total of 85 PBMC samples derived from colonoscopy-verified subjects without lesion (controls) (n = 41), with AP (n = 21) or with CRC (n = 23) were used as training sets. A 42-gene panel for CRC and AP discrimination, including genes identified by Digital Gene Expression-tag profiling of PBMC, and genes previously characterised and reported in the literature, was validated on the training set by qPCR. Logistic regression analysis followed by bootstrap validation determined CRC- and AP-specific classifiers, which discriminate patients with CRC and AP from controls. RESULTS: The CRC and AP classifiers were able to detect CRC with a sensitivity of 78% and AP with a sensitivity of 46% respectively. Both classifiers had a specificity of 92% with very low false-positive detection when applied on subjects with inflammatory bowel disease (n = 23) or tumours other than CRC (n = 14). CONCLUSION: This pilot study demonstrates the potential of developing a minimally invasive, accurate test to screen patients at average risk for colorectal cancer, based on gene expression analysis of peripheral blood mononuclear cells obtained from a simple blood sample.
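
The sensitivity and specificity figures quoted in this abstract are ratios taken from a confusion matrix. A minimal sketch of that arithmetic follows; the function name `sensitivity_specificity` and the labels are illustrative assumptions, not data or code from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels: 1 = lesion, 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # lesions correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # lesions missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical labels for illustration only (not taken from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Sensitivity is the fraction of true lesions detected, and specificity is the fraction of controls correctly left unflagged.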

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This work presents a methodology for detecting and tracking facial features. In the first step of the procedure, faces are detected using AdaBoost with cascades of weak classifiers. The second step locates the internal features of the face by means of CSR, detecting regions of interest. Once these features have been captured, a tracking process based on the SIFT descriptor, which we have called pseudo-SIFT, stores information about the evolution of movement in the detected regions. In addition, a public dataset has been developed in order to share it with other research on detection, classification and tracking. Real experiments show the robustness of this work and its adaptability for future work.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Introduction: As part of the MicroArray Quality Control (MAQC)-II project, this analysis examines how the choice of univariate feature-selection methods and classification algorithms may influence the performance of genomic predictors under varying degrees of prediction difficulty represented by three clinically relevant endpoints. Methods: We used gene-expression data from 230 breast cancers (grouped into training and independent validation sets), and we examined 40 predictors (five univariate feature-selection methods combined with eight different classifiers) for each of the three endpoints. Their classification performance was estimated on the training set by using two different resampling methods and compared with the accuracy observed in the independent validation set. Results: A ranking of the three classification problems was obtained, and the performance of 120 models was estimated and assessed on an independent validation set. The bootstrapping estimates were closer to the validation performance than were the cross-validation estimates. The required sample size for each endpoint was estimated, and both gene-level and pathway-level analyses were performed on the obtained models. Conclusions: We showed that genomic predictor accuracy is determined largely by an interplay between sample size and classification difficulty. Variations on univariate feature-selection methods and choice of classification algorithm have only a modest impact on predictor performance, and several statistically equally good predictors can be developed for any given classification problem.
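
The abstract compares performance estimates obtained by cross-validation and by bootstrapping on the training set. The sketch below shows how the two resampling schemes partition sample indices. The abstract does not specify the exact variants used, so the shuffled k-fold split and the out-of-bag bootstrap here are assumptions, and the function names are hypothetical.

```python
import random


def kfold_splits(n, k, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def bootstrap_split(n, seed=0):
    """Draw n training indices with replacement; test on the samples never drawn."""
    rng = random.Random(seed)
    train = [rng.randrange(n) for _ in range(n)]
    drawn = set(train)
    test = [i for i in range(n) if i not in drawn]  # out-of-bag samples
    return train, test


splits = list(kfold_splits(10, 5))
boot_train, boot_test = bootstrap_split(10)
```

In both schemes a model would be fitted on the `train` indices and scored on the `test` indices, with the scores averaged over folds or over repeated bootstrap draws.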

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In recent years, kernel methods have proved to be very powerful tools in many application domains in general and in remote sensing image classification in particular. The special characteristics of remote sensing images (high dimension, few labeled samples and different noise sources) are efficiently dealt with by kernel machines. In this paper, we propose the use of structured output learning to improve remote sensing image classification based on kernels. Structured output learning is concerned with the design of machine learning algorithms that not only implement input-output mapping, but also take into account the relations between output labels, thus generalizing unstructured kernel methods. We analyze the framework and introduce it to the remote sensing community. Output similarity is here encoded into SVM classifiers by modifying the model loss function and the kernel function either independently or jointly. Experiments on a very high resolution (VHR) image classification problem show promising results and open a wide field of research with structured output kernel methods.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A parts-based model is a parametrization of an object class using a collection of landmarks following the object structure. The matching of parts-based models is one of the problems where pairwise Conditional Random Fields have been successfully applied. The main reason for their effectiveness is tractable inference and learning due to the simplicity of the graphs involved, usually trees. However, these models do not consider possible patterns of statistics among sets of landmarks, and thus they suffer from using overly myopic information. To overcome this limitation, we propose a novel structure based on hierarchical Conditional Random Fields, which we explain in the first part of this report. We build a hierarchy of combinations of landmarks, where matching is performed taking into account the whole hierarchy. To preserve tractable inference we effectively sample the label set. We test our method on facial feature selection and human pose estimation on two challenging datasets: Buffy and MultiPIE. In the second part of this report, we present a novel approach to multiple kernel combination that relies on stacked classification. This method can be used to evaluate the landmarks of the parts-based model approach. Our method is based on combining the responses of a set of independent classifiers for each individual kernel. Unlike earlier approaches that linearly combine kernel responses, our approach uses them as inputs to another set of classifiers. We will show that we outperform state-of-the-art methods on most of the standard benchmark datasets.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

An active learning method is proposed for the semi-automatic selection of training sets in remote sensing image classification. The method iteratively adds to the current training set the unlabeled pixels for which the predictions of an ensemble of classifiers based on bagged training sets show maximum entropy. This way, the algorithm selects the pixels that are the most uncertain and that will improve the model if added to the training set. The user is asked to label such pixels at each iteration. Experiments using support vector machines (SVM) on an 8-class QuickBird image show the excellent performance of the method, which equals the accuracies of both a model trained with ten times more pixels and a model whose training set has been built using a state-of-the-art SVM-specific active learning method.
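
The selection rule described in this abstract, picking the unlabeled pixels on which an ensemble disagrees most, can be sketched with a vote-entropy score. This is a minimal illustration under stated assumptions: the votes are hard class labels, the pixel identifiers and class names are made up, and the function names are hypothetical. The abstract does not give the authors' exact entropy formulation.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (natural log) of the class labels voted by the ensemble."""
    total = len(votes)
    return -sum((c / total) * math.log(c / total) for c in Counter(votes).values())


def select_most_uncertain(committee_votes, n_select):
    """Return the ids of the n_select pixels with the highest vote entropy."""
    ranked = sorted(
        committee_votes,
        key=lambda pixel_id: vote_entropy(committee_votes[pixel_id]),
        reverse=True,
    )
    return ranked[:n_select]


# Hypothetical votes from four bagged classifiers for three unlabeled pixels.
committee_votes = {
    "pixel_a": ["water", "water", "water", "water"],  # full agreement
    "pixel_b": ["water", "road", "tree", "roof"],     # full disagreement
    "pixel_c": ["road", "road", "tree", "road"],      # mild disagreement
}
picked = select_most_uncertain(committee_votes, 1)
```

In an active learning loop, the selected pixels would be shown to the user for labeling, added to the training set, and the ensemble retrained before the next iteration.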

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The application of support vector machine classification (SVM) to combined information from magnetic resonance imaging (MRI) and [F18]fluorodeoxyglucose positron emission tomography (FDG-PET) has been shown to improve detection and differentiation of Alzheimer's disease dementia (AD) and frontotemporal lobar degeneration. To validate this approach for the most frequent dementia syndrome AD, and to test its applicability to multicenter data, we randomly extracted FDG-PET and MRI data of 28 AD patients and 28 healthy control subjects from the database provided by the Alzheimer's Disease Neuroimaging Initiative (ADNI) and compared them to data of 21 patients with AD and 13 control subjects from our own Leipzig cohort. SVM classification using combined volume-of-interest information from FDG-PET and MRI based on comprehensive quantitative meta-analyses investigating dementia syndromes revealed a higher discrimination accuracy in comparison to single modality classification. For the ADNI dataset accuracy rates of up to 88% and for the Leipzig cohort of up to 100% were obtained. Classifiers trained on the ADNI data discriminated the Leipzig cohorts with an accuracy of 91%. In conclusion, our results suggest SVM classification based on quantitative meta-analyses of multicenter data as a valid method for individual AD diagnosis. Furthermore, combining imaging information from MRI and FDG-PET might substantially improve the accuracy of AD diagnosis.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A recent trend in digital mammography is computer-aided diagnosis systems, which are computerised tools designed to assist radiologists. Most of these systems are used for the automatic detection of abnormalities. However, recent studies have shown that their sensitivity decreases significantly as the density of the breast increases. This dependence is method specific. In this paper we propose a new approach to the classification of mammographic images according to their breast parenchymal density. Our classification uses information extracted from segmentation results and is based on the underlying breast tissue texture. Classification performance was evaluated on a large set of digitised mammograms. Evaluation involves different classifiers and uses a leave-one-out methodology. Results demonstrate the feasibility of estimating breast density using image processing and analysis techniques.
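
The leave-one-out methodology mentioned in this abstract holds out each sample in turn, trains on the rest, and averages the results. The sketch below illustrates the loop with a 1-nearest-neighbour classifier. The classifier choice, the two-dimensional feature vectors and the "fatty"/"dense" labels are assumptions for illustration only; the abstract states only that different classifiers were evaluated.

```python
def nearest_neighbour_label(train, query):
    """Label of the training sample closest to query (squared Euclidean distance)."""
    closest = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    return closest[1]


def leave_one_out_accuracy(samples):
    """Hold out each sample once, classify it from the others, return accuracy."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if nearest_neighbour_label(rest, features) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical texture features (two per image) with made-up density labels.
samples = [
    ((0.10, 0.20), "fatty"),
    ((0.15, 0.25), "fatty"),
    ((0.20, 0.10), "fatty"),
    ((0.80, 0.90), "dense"),
    ((0.85, 0.80), "dense"),
    ((0.90, 0.95), "dense"),
]
accuracy = leave_one_out_accuracy(samples)
```

Leave-one-out uses almost all the data for training in every round, which suits small image collections, at the cost of one training run per sample.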

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Breast cancer is a heterogeneous disease with varied morphological appearances, molecular features, behavior, and response to therapy. Current routine clinical management of breast cancer relies on the availability of robust clinical and pathological prognostic and predictive factors to support clinical and patient decision making in which potentially suitable treatment options are increasingly available. One of the best-established prognostic factors in breast cancer is histological grade, which represents the morphological assessment of tumor biological characteristics and has been shown to be able to generate important information related to the clinical behavior of breast cancers. Genome-wide microarray-based expression profiling studies have unraveled several characteristics of breast cancer biology and have provided further evidence that the biological features captured by histological grade are important in determining tumor behavior. Also, expression profiling studies have generated clinically useful data that have significantly improved our understanding of the biology of breast cancer, and these studies are undergoing evaluation as improved prognostic and predictive tools in clinical practice. Clinical acceptance of these molecular assays will require them to be more than expensive surrogates of established traditional factors such as histological grade. It is essential that they provide additional prognostic or predictive information above and beyond that offered by current parameters. Here, we present an analysis of the validity of histological grade as a prognostic factor and a consensus view on the significance of histological grade and its role in breast cancer classification and staging systems in this era of emerging clinical use of molecular classifiers.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer's Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) systems. METHODS: A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the two latter also analysed with an LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. RESULTS: Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) a linear transformation of the PLS- or PCA-reduced data, ii) a feature reduction technique, and iii) a classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation, yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when an NMSE-PLS-LMNN feature extraction method was used in combination with an SVM classifier, thus outperforming recently reported baseline methods. CONCLUSIONS: All the proposed methods turned out to be a valid solution for the presented problem.
One of the advances is the robustness of the LMNN algorithm, which not only provides a higher separation rate between the classes but also makes this rate more stable (in combination with NMSE and PLS). In addition, their generalization ability is another advance, since several experiments were performed on two image modalities (SPECT and PET).

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The work presented in this paper belongs to the power quality knowledge area and deals with voltage sags in power transmission and distribution systems. Propagating throughout the power network, voltage sags can cause many problems for domestic and industrial loads, at substantial financial cost. To impose penalties on the responsible party and to improve monitoring and mitigation strategies, sags must be located in the power network. With this objective, this paper presents a new method for associating a sag waveform with its origin in transmission and distribution networks. It solves this problem by developing hybrid methods that employ multiway principal component analysis (MPCA) as a dimension reduction tool. MPCA re-expresses sag waveforms in a new subspace using just a few scores. We train some well-known classifiers with these scores and use them to classify future sags. The capabilities of the proposed method for dimension reduction and classification are examined using real data gathered from three substations in Catalonia, Spain. The obtained classification rates confirm the effectiveness of the developed hybrid methods as new tools for sag classification.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

It has been shown that the accuracy of mammographic abnormality detection methods is strongly dependent on the breast tissue characteristics, where a dense breast drastically reduces detection sensitivity. In addition, breast tissue density is widely accepted to be an important risk indicator for the development of breast cancer. Here, we describe the development of an automatic breast tissue classification methodology, which can be summarized in a number of distinct steps: 1) the segmentation of the breast area into fatty versus dense mammographic tissue; 2) the extraction of morphological and texture features from the segmented breast areas; and 3) the use of a Bayesian combination of a number of classifiers. The evaluation, based on a large number of cases from two different mammographic data sets, shows a strong correlation ( and 0.67 for the two data sets) between automatic and expert-based Breast Imaging Reporting and Data System mammographic density assessment.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We are going to implement the "GA-SEFS" by Tsymbal and experimentally analyse its performance depending on the classifier algorithms used in the fitness function (NB, MNge, SMO). We are also going to study the effect of adding to the fitness function a measure to control the complexity of the base classifiers.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this work we present the results of experimental work on the development of lexical class-based lexica by automatic means. Our purpose is to assess the use of linguistic lexical-class based information as a feature selection methodology for the use of classifiers in quick lexical development. The results show that the approach can help reduce the human effort required in the development of language resources significantly.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Lay summary: The development of type II diabetes and obesity is caused by the interaction between susceptibility genes and environmental factors, in particular a calorie-rich diet and insufficient physical activity. To assess the role of diet in the absence of genetic heterogeneity, we fed a genetically pure mouse strain a very high-fat diet. This diet led to different phenotypes among these mice: diabetes and obesity (ObD), diabetes without obesity (LD), or neither diabetes nor obesity (LnD). We hypothesized that these different adaptations to the nutritional stress induced by the high-fat diet were due to the establishment of different genetic programmes in the main organs involved in maintaining energy balance. To test this hypothesis, we developed a DNA microarray containing approximately 700 metabolic genes. By making it possible to measure the expression of many genes simultaneously, this microarray allowed us to establish the gene expression profiles characteristic of each group of mice fed the high-fat diet, in the liver and the skeletal muscle. The data obtained from these expression profiles showed that marked changes in expression occurred in the liver and the muscle between the different groups of mice fed the high-fat diet. Overall, these changes suggest that the development of type II diabetes and obesity induced by a high-fat diet is associated with increased lipid synthesis by the liver and an increased flux of lipids from the liver to the periphery (skeletal muscles). In a second step, these gene expression profiles were used to select a subset of genes sufficiently discriminating to distinguish between the different phenotypes. This subset of genes allowed us to build a phenotypic classifier able to predict the phenotype of the mice with relatively high accuracy. In the future, such gene-expression-based "predictors" could serve as tools for diagnosing metabolism-related pathologies.
Summary (Development of a metabolic DNA microarray and application to the study of a murine model of obesity and type II diabetes): The aetiology of obesity and type II diabetes is multifactorial, involving both genetic and environmental factors, such as calorie-rich diets or lack of exercise. Genetically homogeneous C57BL/6J mice fed a high fat diet (HFD) for up to nine months develop differential adaptation, becoming either obese and diabetic (ObD) or remaining lean in the presence (LD) or absence (LnD) of diabetes development. Each phenotype is associated with diverse metabolic alterations, which may result from diverse molecular adaptations of key organs involved in the control of energy homeostasis. In this study, we evaluated whether specific patterns of gene expression could be associated with each different phenotype of HFD mice in the liver and the skeletal muscles. To do this, we constructed a metabolic cDNA microarray containing approximately 700 cDNA representing genes involved in the main metabolic pathways of energy homeostasis. Our data indicate that the development of diet-induced obesity and type II diabetes is linked to some defects in lipid metabolism, involving a preserved hepatic lipogenesis and increased levels of very low density lipoproteins (VLDL). In skeletal muscles, an increase in fatty acid uptake, as suggested by the increased expression of lipoprotein lipase, would contribute to the increased level of insulin resistance observed in the ObD mice. Conversely, both groups of lean mice showed a reduced expression of lipogenic genes, particularly stearoyl-CoA desaturase 1 (Scd-1), a gene linked to sensitivity to diet-induced obesity. Secondly, we identified a subset of genes from expression profiles that classified the different groups of mice with relative accuracy. Such classifiers may be used in the future as diagnostic tools of each metabolic state in each tissue.