43 results for Unsupervised unmixing


Relevance: 10.00%

Abstract:

Despite recent therapeutic improvements, the clinical course of diffuse large B-cell lymphoma (DLBCL) still differs considerably among patients. We conducted this retrospective multi-centre study to evaluate the impact of genomic aberrations detected using a high-density genome-wide single nucleotide polymorphism-based array on clinical outcome in a population of DLBCL patients treated with R-CHOP-21 (rituximab, cyclophosphamide, doxorubicin, vincristine and prednisone repeated every 21 days). A total of 166 DNA samples were analysed using the GeneChip Human Mapping 250K NspI. Genomic anomalies were analysed regarding their impact on the clinical course of 124 patients treated with R-CHOP-21. Unsupervised clustering was performed to identify genetically related subgroups of patients with different clinical outcomes. Twenty recurrent genetic lesions showed an impact on the clinical course. Loss of genomic material at 8p23.1 showed the strongest statistical significance and was associated with additional aberrations, such as 17p- and 15q-. Unsupervised clustering identified five DLBCL clusters with distinct genetic profiles, clinical characteristics and outcomes. Genetic features and clusters, associated with a different outcome in patients treated with R-CHOP, have been identified by arrayCGH.

Relevance: 10.00%

Abstract:

Colorectal cancer (CRC) is the second leading cause of cancer-related deaths in the Western world. It is becoming increasingly clear that CRC is a diverse disease, as exemplified by the identification of subgroups of CRC tumours that are driven by distinct biology. Recently, a number of studies have begun to define panels of diagnostically relevant markers to align patients into individual subgroups in an attempt to give information on prognosis and treatment response. We examined the immunohistochemical expression profile of 18 markers, each representing a putative role in cancer development, in 493 primary colorectal carcinomas using tissue microarrays. Through unsupervised clustering in stage II cancers, we identified two cluster groups that are broadly defined by inflammatory or immune-related factors (CD3, CD8, COX-2 and FOXP3) and stem-like factors (CD44, LGR5, SOX2, OCT4). The expression of the stem-like group markers was associated with a significantly worse prognosis compared to cases with lower expression. In addition, patients classified in the stem-like subgroup displayed a trend towards a benefit from adjuvant treatment. The biologically relevant and poor prognostic stem-like group could also be identified in early stage I cancers, suggesting a potential opportunity for the identification of aggressive tumors at a very early stage of the disease.

Relevance: 10.00%

Abstract:

Visual salience is an intriguing phenomenon observed in biological neural systems. Numerous attempts have been made to model visual salience mathematically using various feature contrasts, either locally or globally. However, these algorithmic models tend to ignore the problem’s biological solutions, in which visual salience appears to arise during the propagation of visual stimuli along the visual cortex. In this paper, inspired by the conjecture that salience arises from deep propagation along the visual cortex, we present a Deep Salience model, in which a multi-layer model based on successive Markov random fields (sMRF) analyzes the input image successively through its deep belief propagation. As a result, the foreground object can be automatically separated from the background in a fully unsupervised way. Experimental evaluation on the benchmark dataset validated that our Deep Salience model can consistently outperform eleven state-of-the-art salience models, yielding higher rates in the precision-recall tests and attaining the best F-measure and mean-square error in the experiments.

Relevance: 10.00%

Abstract:

Mycosis fungoides (MF) is the most frequent type of cutaneous T-cell lymphoma, whose diagnosis and study are hampered by its morphologic similarity to inflammatory dermatoses (ID) and the low proportion of tumoral cells, which often account for only 5% to 10% of the total tissue cells. cDNA microarray studies using the CNIO OncoChip of 29 MF and 11 ID cases revealed a signature of 27 genes implicated in the tumorigenesis of MF, including tumor necrosis factor receptor (TNFR)-dependent apoptosis regulators, STAT4, CD40L, and other oncogenes and apoptosis inhibitors. Subsequently, a 6-gene prediction model was constructed that is capable of distinguishing MF and ID cases with unprecedented accuracy. This model correctly predicted the class of 97% of cases in a blind test validation using 24 MF patients with low clinical stages. Unsupervised hierarchical clustering revealed 2 major subclasses of MF, one of which tends to include more aggressive-type MF cases, including tumoral MF forms. Furthermore, signatures associated with abnormal immunophenotype (11 genes) and tumor stage disease (5 genes) were identified.

Relevance: 10.00%

Abstract:

In many applications, and especially those where batch processes are involved, a target scalar output of interest is often dependent on one or more time series of data. With the exponential growth in data logging in modern industries such time series are increasingly available for statistical modeling in soft sensing applications. In order to exploit time series data for predictive modelling, it is necessary to summarise the information they contain as a set of features to use as model regressors. Typically this is done in an unsupervised fashion using simple techniques such as computing statistical moments, principal components or wavelet decompositions, often leading to significant information loss and hence suboptimal predictive models. In this paper, a functional learning paradigm is exploited in a supervised fashion to derive continuous, smooth estimates of time series data (yielding aggregated local information), while simultaneously estimating a continuous shape function yielding optimal predictions. The proposed Supervised Aggregative Feature Extraction (SAFE) methodology can be extended to support nonlinear predictive models by embedding the functional learning framework in a Reproducing Kernel Hilbert Spaces setting. SAFE has a number of attractive features including closed form solution and the ability to explicitly incorporate first and second order derivative information. Using simulation studies and a practical semiconductor manufacturing case study we highlight the strengths of the new methodology with respect to standard unsupervised feature extraction approaches.
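
The unsupervised baseline mentioned in this abstract, summarising each time series by its statistical moments, can be sketched as follows. This is only an illustration of the simple moment-based approach that the abstract contrasts SAFE with, not the SAFE method itself. The function name `moment_features` and the choice of the first four moments are assumptions made for this example.

```python
import math


def moment_features(series):
    """Summarise one time series as (mean, variance, skewness, kurtosis).

    Hypothetical helper: population (not sample) moments are used,
    and a constant series gets zero skewness and kurtosis by convention.
    """
    n = len(series)
    mean = sum(series) / n
    centred = [x - mean for x in series]
    var = sum(d * d for d in centred) / n
    if var == 0:
        return (mean, 0.0, 0.0, 0.0)
    std = math.sqrt(var)
    skew = sum((d / std) ** 3 for d in centred) / n
    kurt = sum((d / std) ** 4 for d in centred) / n
    return (mean, var, skew, kurt)


# Each batch run (made-up values) becomes one fixed-length feature vector
# that could serve as the regressors of a predictive model.
batches = [[1, 2, 3, 4, 5], [2.0, 2.1, 1.9, 2.0, 2.0]]
features = [moment_features(b) for b in batches]
```

Because the features are computed without reference to the target output, any information in the shape of the series that the moments do not capture is lost, which is the limitation the abstract points to.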

Relevance: 10.00%

Abstract:

Conventional practice in regional geochemistry includes, as a final step of any geochemical campaign, the generation of a series of maps showing the spatial distribution of each of the components considered. Such maps, though necessary, do not comply with the compositional, relative nature of the data, which unfortunately makes any conclusion based on them sensitive to spurious correlation problems. This is one of the reasons why these maps are never interpreted in isolation. This contribution aims at gathering a series of statistical methods to produce individual maps of multiplicative combinations of components (logcontrasts), much in the flavor of equilibrium constants, which are designed on purpose to capture certain aspects of the data.
We distinguish between supervised and unsupervised methods, where the former require an external, non-compositional variable (besides the compositional geochemical information) available in an analogous training set. This external variable can be a quantity (soil density, collocated magnetics, collocated ratio of Th/U spectral gamma counts, proportion of clay particle fraction, etc.) or a category (rock type, land use type, etc.). In the supervised methods, a regression-like model between the external variable and the geochemical composition is derived in the training set, and then this model is mapped on the whole region. This case is illustrated with the Tellus dataset, covering Northern Ireland at a density of 1 soil sample per 2 square km, where we map the presence of blanket peat and the underlying geology. The unsupervised methods considered include principal components and principal balances (Pawlowsky-Glahn et al., CoDaWork2013), i.e. logcontrasts of the data that are devised to capture very large variability or else be quasi-constant. Using the Tellus dataset again, it is found that geological features are highlighted by the quasi-constant ratios Hf/Nb and their ratio against SiO2; Rb/K2O and Zr/Na2O and the balance between these two groups of two variables; the balance of Al2O3 and TiO2 vs. MgO; or the balance of Cr, Ni and Co vs. V and Fe2O3. The largest variability appears to be related to the presence/absence of peat.
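
A balance between two groups of components, such as Al2O3 and TiO2 vs. MgO, is commonly defined as a normalised log-ratio of the geometric means of the two groups. The sketch below computes that quantity for a single sample. The function name `balance` and the concentration values are assumptions made for illustration and are not taken from the Tellus dataset.

```python
import math


def balance(sample, numerator, denominator):
    """Balance (normalised logcontrast) between two groups of parts.

    Computes sqrt(r*s/(r+s)) * ln(gm(numerator) / gm(denominator)),
    where gm is the geometric mean and r, s are the group sizes.
    """
    r, s = len(numerator), len(denominator)
    log_gm_num = sum(math.log(sample[p]) for p in numerator) / r
    log_gm_den = sum(math.log(sample[p]) for p in denominator) / s
    return math.sqrt(r * s / (r + s)) * (log_gm_num - log_gm_den)


# Hypothetical soil sample (values invented for the example).
sample = {"Al2O3": 12.0, "TiO2": 3.0, "MgO": 6.0}
b = balance(sample, ["Al2O3", "TiO2"], ["MgO"])

# Rescaling every component (e.g. a change of units) leaves the balance
# unchanged, which is why balances respect the relative nature of the data.
rescaled = {k: 10.0 * v for k, v in sample.items()}
b_rescaled = balance(rescaled, ["Al2O3", "TiO2"], ["MgO"])
```

Mapping one such value per sample location gives a single map per logcontrast, rather than one map per raw component.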

Relevance: 10.00%

Abstract:

Clusters of text documents output by clustering algorithms are often hard to interpret. We describe motivating real-world scenarios that necessitate reconfigurability and high interpretability of clusters and outline the problem of generating clusterings with interpretable and reconfigurable cluster models. We develop two clustering algorithms toward the outlined goal of building interpretable and reconfigurable cluster models. They generate clusters with associated rules that are composed of conditions on word occurrences or nonoccurrences. The proposed approaches vary in the complexity of the format of the rules: RGC employs disjunctions and conjunctions in rule generation, whereas RGC-D rules are simple disjunctions of conditions signifying the presence of various words. In both cases, each cluster comprises precisely the set of documents that satisfy the corresponding rule. Rules of the latter kind are easy to interpret, whereas the former leads to more accurate clustering. We show that our approaches outperform the unsupervised decision tree approach for rule-generating clustering and also an approach we provide for generating interpretable models for general clusterings, both by significant margins. We empirically show that the purity and F-measure losses incurred to achieve interpretability can be as little as 3% and 5%, respectively, using the algorithms presented herein.
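
To make the idea of a rule-defined cluster concrete, the sketch below shows how a simple disjunction of word-presence conditions determines cluster membership. It illustrates only the membership test, not how RGC or RGC-D learn their rules, which the abstract does not describe. The function names, the whitespace tokenisation and the sample documents are assumptions made for this example.

```python
def matches_disjunction(document, words):
    """True if the document contains at least one of the rule's words."""
    tokens = set(document.lower().split())
    return any(word in tokens for word in words)


def cluster_by_rule(documents, words):
    """Return exactly the documents that satisfy the disjunctive rule."""
    return [doc for doc in documents if matches_disjunction(doc, words)]


# Hypothetical documents and rule: "cat OR dogs".
documents = ["the cat sat", "dogs bark loudly", "stock prices fell"]
pets = cluster_by_rule(documents, {"cat", "dogs"})
```

Since the cluster is by definition the set of documents matching the rule, editing the word list reconfigures the cluster directly.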

Relevance: 10.00%

Abstract:

The problem of detecting spatially coherent groups of data that exhibit anomalous behavior has started to attract attention due to applications across areas such as epidemic analysis and weather forecasting. Earlier efforts from the data mining community have largely focused on finding outliers, individual data objects that display deviant behavior. Such point-based methods are not easy to extend to find groups of data that exhibit anomalous behavior. Scan statistics are methods from the statistics community that address the problem of identifying regions where data objects exhibit behavior that is atypical of the general dataset. The spatial scan statistic and methods that build upon it mostly adopt the framework of defining a character for regions (e.g., circular or elliptical) of objects and repeatedly sampling regions of such character, followed by applying a statistical test for anomaly detection. In the past decade, there have been efforts from the statistics community to enhance the efficiency of scan statistics as well as to enable discovery of arbitrarily shaped anomalous regions. On the other hand, the data mining community has started to look at determining anomalous regions whose behavior diverges from that of their neighborhood. In this chapter, we survey the space of techniques for detecting anomalous regions in spatial data from across the data mining and statistics communities, while outlining connections to well-studied problems in clustering and image segmentation. We analyze the techniques systematically by categorizing them appropriately to provide a structured bird's-eye view of the work on anomalous region detection; we hope that this will encourage better cross-pollination of ideas across communities to help advance the frontier in anomaly detection.

Relevance: 10.00%

Abstract:

In this paper, we introduce a novel approach to face recognition which simultaneously tackles three combined challenges: 1) uneven illumination; 2) partial occlusion; and 3) limited training data. The new approach performs lighting normalization, occlusion de-emphasis and finally face recognition, based on finding the largest matching area (LMA) at each point on the face, as opposed to traditional fixed-size local area-based approaches. Robustness is achieved with novel approaches for feature extraction, LMA-based face image comparison and unseen data modeling. On the extended YaleB and AR face databases for face identification, our method, using only a single training image per person, outperforms other methods using a single training image, and matches or exceeds methods which require multiple training images. On the Labeled Faces in the Wild face verification database, our method outperforms comparable unsupervised methods. We also show that the new method performs competitively even when the training images are corrupted.

Relevance: 10.00%

Abstract:

This work addresses the problem of detecting human behavioural anomalies in crowded surveillance environments. We focus in particular on the problem of detecting subtle anomalies in a behaviourally heterogeneous surveillance scene. To reach this goal we implement a novel unsupervised context-aware process. We propose and evaluate a method of utilising social context and scene context to improve behaviour analysis. We find that in a crowded scene, applying Mutual Information based social context makes it possible to prevent self-justifying groups and to propagate anomalies in a social network, granting a greater anomaly detection capability. Scene context uniformly improves the detection of anomalies in both datasets. The strength of our contextual features is demonstrated by the detection of subtly abnormal behaviours, which otherwise remain indistinguishable from normal behaviour.

Relevance: 10.00%

Abstract:

To define specific pathways important in the multistep transformation process of normal plasma cells (PCs) to monoclonal gammopathy of uncertain significance (MGUS) and multiple myeloma (MM), we have applied microarray analysis to PCs from 5 healthy donors (N), 7 patients with MGUS, and 24 patients with newly diagnosed MM. Unsupervised hierarchical clustering using 125 genes with a large variation across all samples defined 2 groups: N and MGUS/MM. Supervised analysis identified 263 genes differentially expressed between N and MGUS and 380 genes differentially expressed between N and MM, 197 of which were also differentially regulated between N and MGUS. Only 74 genes were differentially expressed between MGUS and MM samples, indicating that the differences between MGUS and MM are smaller than those between N and MM or N and MGUS. Differentially expressed genes included oncogenes/tumor-suppressor genes (LAF4, RB1, and disabled homolog 2), cell-signaling genes (RAS family members, B-cell signaling and NF-kappaB genes), DNA-binding and transcription-factor genes (XBP1, zinc finger proteins, forkhead box, and ring finger proteins), and developmental genes (WNT and SHH pathways). Understanding the molecular pathogenesis of MM by gene expression profiling has demonstrated sequential genetic changes from N to malignant PCs and highlighted important pathways involved in the transformation of MGUS to MM.
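
Selecting a subset of genes "with a large variation across all samples" before clustering can be sketched as a ranking by variance. The abstract does not state which variation measure or threshold was used, so the use of population variance here is an assumption, as are the function name `top_variance_genes`, the gene labels and the expression values.

```python
from statistics import pvariance


def top_variance_genes(expression, k):
    """Return the k genes whose expression varies most across samples.

    `expression` maps a gene label to its list of values, one per sample.
    Ranking by population variance is an assumption for this sketch.
    """
    ranked = sorted(expression, key=lambda g: pvariance(expression[g]), reverse=True)
    return ranked[:k]


# Hypothetical expression matrix: three genes measured in three samples.
expression = {
    "g1": [1.0, 1.0, 1.0],
    "g2": [0.0, 5.0, 10.0],
    "g3": [2.0, 3.0, 4.0],
}
selected = top_variance_genes(expression, 2)
```

The selected genes would then form the feature set passed to a hierarchical clustering routine. Filtering first keeps near-constant genes from diluting the distances between samples.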

Relevance: 10.00%

Abstract:

OBJECTIVE: To compare the overall performance of specially trained neonatal nurses acting autonomously, unsupervised, and without a protocol with specialist registrars when weaning neonates from mechanical ventilation.

DESIGN: Prospective, randomized, controlled trial.

SETTING: A single neonatal intensive care unit.

PATIENTS: Neonates requiring conventional mechanical ventilation (n = 50).

INTERVENTIONS: Infants on conventional ventilation were randomly assigned to receive either nurse-led (n = 25) or registrar-led (n = 23) weaning. A total of 48 infants completed the study (two infants in the registrar group were excluded when their parents withdrew consent).

MEASUREMENTS AND MAIN RESULTS: The main outcome measure, median weaning time, was 1200 mins (95% confidence interval [CI], 621-1779 mins) in the nurse group and 3015 mins (95% CI, 2650-3380 mins) in the registrar group (p = .0458). The median time from treatment assignment to the first ventilator change was 60 mins (95% CI, 52-68 mins) in the nurse group and 120 mins (95% CI, 103-137 mins) in the registrar group (p = .35). On average, the nurses made ventilator changes every 4.5 hrs (95% CI, 2.9-6 hrs) and the registrars every 7.2 hrs (95% CI, 5.4-9 hrs; p = .003). The median number (range) of backward steps taken per infant was 0 (0-5 steps) in the nurse group and 1 (0-5 steps) in the registrar group (p = .019).

CONCLUSIONS: The findings of this study suggest that additional domains of neonatal critical care could be reviewed for their potential transfer to appropriately prepared nurses.