980 results for Labeling hierarchical clustering


Relevance:

80.00%

Publisher:

Abstract:

Coastal zooplankton have been investigated since 1984 at a Long Term Ecological Research station MC (LTER-MC) in the inner Gulf of Naples (Tyrrhenian Sea, Western Mediterranean). The sampling site, located between the littoral and the open sea systems, has very active hydrography that affects plankton communities. The present work was aimed at establishing whether, in such a dynamic and variable environment, species associations and homogeneous periods could be identified as characteristic and stable features of the mesozooplankton over the period 1984–2006. Hierarchical clustering was applied to assess species associations based on a matrix of similarities between species (R-mode), and homogeneous periods based on a matrix of similarities between observations (Q-mode). The Indicator Value index [IndVal, Dufrene and Legendre (1997) Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecol. Monogr., 67, 345–366] was calculated to identify species characterizing each period. Five taxonomic groups with well-defined composition and abundance were identified as robust associations that likely reflect different modes of community functioning. The temporal course of these associations was largely shaped by strong seasonal forcing comprising both physical and biological (e.g. trophic) signals. These associations persisted over the long term, thus indicating some stable characters in the Naples zooplankton time-series, providing evidence of resilience in communities in highly variable coastal conditions.
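The Indicator Value calculation mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the index as defined by Dufrene and Legendre, which multiplies a specificity term (share of the species' mean abundance found in a group) by a fidelity term (share of the group's samples in which the species occurs). The abundance values, the group labels and the function name `indval` are hypothetical and are not taken from the LTER-MC data.

```python
from collections import defaultdict


def indval(abundances, groups):
    """Return {group: IndVal in percent} for a single species.

    abundances: abundance of the species in each sample (hypothetical data)
    groups:     group (e.g. period) label of each sample, same length
    """
    by_group = defaultdict(list)
    for value, label in zip(abundances, groups):
        by_group[label].append(value)

    # Mean abundance of the species within each group.
    means = {g: sum(v) / len(v) for g, v in by_group.items()}
    total_of_means = sum(means.values())

    result = {}
    for g, values in by_group.items():
        # Specificity: share of the summed group means found in this group.
        specificity = means[g] / total_of_means if total_of_means else 0.0
        # Fidelity: fraction of samples in this group where the species occurs.
        fidelity = sum(1 for v in values if v > 0) / len(values)
        result[g] = 100.0 * specificity * fidelity
    return result


# Hypothetical species: present in every "winter" sample, rare in "summer".
scores = indval([10, 12, 8, 0, 0, 2], ["winter"] * 3 + ["summer"] * 3)
print(scores)
```

A species scores close to 100% for a group only when it is both concentrated in that group and present in nearly all of its samples, which is why the index is suited to naming the species that characterize a period.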

Relevance:

80.00%

Publisher:

Abstract:

Juvenile idiopathic arthritis (JIA) comprises a poorly understood group of chronic, childhood onset, autoimmune diseases with variable clinical outcomes. We investigated whether profiling of the synovial fluid (SF) proteome by a fluorescent dye based, two-dimensional gel (DIGE) approach could distinguish patients in whom inflammation extends to affect a large number of joints, early in the disease process. SF samples from 22 JIA patients were analyzed: 10 with oligoarticular arthritis, 5 extended oligoarticular and 7 polyarticular disease. SF samples were labeled with Cy dyes and separated by two-dimensional electrophoresis. Multivariate analyses were used to isolate a panel of proteins which distinguish patient subgroups. Proteins were identified using MALDI-TOF mass spectrometry with expression further verified by Western immunoblotting and immunohistochemistry. Hierarchical clustering based on the expression levels of a set of 40 proteins segregated the extended oligoarticular from the oligoarticular patients (p < 0.05). Expression patterns of the isolated protein panel have also been observed over time, as disease spreads to multiple joints. The data indicate that synovial fluid proteome profiles could be used to stratify patients based on risk of disease extension. These protein profiles may also assist in monitoring therapeutic responses over time and help predict joint damage. © 2009 American Chemical Society.

Relevance:

80.00%

Publisher:

Abstract:

The recent emergence of high-throughput arrays for methylation analysis has made the influence of tumor content on the interpretation of methylation levels increasingly pertinent. However, the degree to which tumor content influences the results, and the level of tumor content that makes a specimen acceptable for accurate analysis, remain unclear. Taking a systematic approach, we analyzed 98 unselected formalin-fixed and paraffin-embedded gastric tumors and matched normal tissue samples using the Illumina GoldenGate methylation assay. Unsupervised hierarchical clustering showed 2 separate clusters with a significant difference in average tumor content levels. The probes identified to be significantly differentially methylated between the tumors and normals also differed according to the tumor content of the samples included, with the sensitivity of identifying the

Relevance:

80.00%

Publisher:

Abstract:

Background: Evidence suggests that in prokaryotes sequence-dependent transcriptional pauses affect the dynamics of transcription and translation, as well as of small genetic circuits. So far, a few pause-prone sequences have been identified from in vitro measurements of transcription elongation kinetics.

Results: Using a stochastic model of gene expression at the nucleotide and codon levels with realistic parameter values, we investigate three different but related questions and present statistical methods for their analysis. First, we show that information from in vivo RNA and protein temporal numbers is sufficient to discriminate between models with and without a pause site in their coding sequence. Second, we demonstrate that it is possible to separate a large variety of models from each other with pauses of various durations and locations in the template by means of hierarchical clustering and a random forest classifier. Third, we introduce an approximate likelihood function that allows the location of a pause site to be estimated.

Conclusions: This method can aid in detecting unknown pause-prone sequences from temporal measurements of RNA and protein numbers at a genome-wide scale and thus elucidate possible roles that these sequences play in the dynamics of genetic networks and phenotype.

Relevance:

80.00%

Publisher:

Abstract:

Background: Ineffective risk stratification can delay diagnosis of serious disease in patients with hematuria. We applied a systems biology approach to analyze clinical, demographic and biomarker measurements (n = 29) collected from 157 hematuric patients: 80 urothelial cancer (UC) and 77 controls with confounding pathologies.

Methods: On the basis of biomarkers, we conducted agglomerative hierarchical clustering to identify patient and biomarker clusters. We then explored the relationship between the patient clusters and clinical characteristics using Chi-square analyses. We determined classification errors and areas under the receiver operating characteristic curve of Random Forest Classifiers (RFC) for patient subpopulations using the biomarker clusters to reduce the dimensionality of the data.

Results: Agglomerative clustering identified five patient clusters and seven biomarker clusters. Final diagnosis categories were non-randomly distributed across the five patient clusters. In addition, two of the patient clusters were enriched with patients with ‘low cancer-risk’ characteristics. The biomarkers which contributed to the diagnostic classifiers for these two patient clusters were similar. In contrast, three of the patient clusters were significantly enriched with patients harboring ‘high cancer-risk’ characteristics including proteinuria, aggressive pathological stage and grade, and malignant cytology. Patients in these three clusters included controls, that is, patients with other serious disease and patients with cancers other than UC. Biomarkers which contributed to the diagnostic classifiers for the largest ‘high cancer-risk’ cluster were different from those contributing to the classifiers for the ‘low cancer-risk’ clusters. Biomarkers which contributed to subpopulations that were split according to smoking status, gender and medication were different.

Conclusions: The systems biology approach applied in this study allowed the hematuric patients to cluster naturally on the basis of the heterogeneity within their biomarker data, into five distinct risk subpopulations. Our findings highlight an approach with the promise to unlock the potential of biomarkers. This will be especially valuable in the field of bladder cancer diagnosis, where biomarkers are urgently required. Clinicians could interpret risk classification scores in the context of clinical parameters at the time of triage. This could reduce cystoscopies and enable priority diagnosis of aggressive diseases, leading to improved patient outcomes at reduced costs. © 2013 Emmert-Streib et al; licensee BioMed Central Ltd.
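The two analysis steps named in the Methods above (agglomerative clustering of patients, then a Chi-square comparison of clusters against a clinical category) can be sketched in plain Python. The abstract does not state the linkage rule or the distance measure, so average linkage and Euclidean distance are assumptions here, and the biomarker vectors and diagnoses are invented for illustration only.

```python
from itertools import combinations


def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering; returns one label per point."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        def avg_dist(pair):
            # Mean pairwise distance between the members of two clusters.
            a, b = clusters[pair[0]], clusters[pair[1]]
            total = sum(euclidean(points[i], points[j]) for i in a for j in b)
            return total / (len(a) * len(b))

        # Merge the closest pair of clusters (i < j, so deleting j is safe).
        i, j = min(combinations(range(len(clusters)), 2), key=avg_dist)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


def chi_square(labels, categories):
    """Pearson chi-square statistic for the cluster x category table."""
    rows, cols = sorted(set(labels)), sorted(set(categories))
    n = len(labels)
    observed = {(r, c): 0 for r in rows for c in cols}
    for r, c in zip(labels, categories):
        observed[(r, c)] += 1
    row_tot = {r: sum(observed[(r, c)] for c in cols) for r in rows}
    col_tot = {c: sum(observed[(r, c)] for r in rows) for c in cols}
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_tot[r] * col_tot[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


# Hypothetical two-biomarker profiles for eight patients.
patients = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (1.1, 1.2),
            (5.0, 5.2), (5.1, 4.9), (4.8, 5.0), (5.2, 5.1)]
diagnoses = ["control", "control", "control", "UC",
             "UC", "UC", "UC", "control"]

labels = agglomerate(patients, n_clusters=2)
stat = chi_square(labels, diagnoses)
print(labels, stat)
```

The sketch stops at the test statistic; turning it into a p-value needs the chi-square distribution, for which a statistics library would normally be used.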

Relevance:

80.00%

Publisher:

Abstract:

Background: There is no method routinely used to predict response to anthracycline and cyclophosphamide–based chemotherapy in the clinic; therefore patients often receive treatment for breast cancer with no benefit. Loss of the Fanconi anemia/BRCA (FA/BRCA) DNA damage response (DDR) pathway occurs in approximately 25% of breast cancer patients through several mechanisms and results in sensitization to DNA-damaging agents. The aim of this study was to develop an assay to detect DDR-deficient tumors associated with loss of the FA/BRCA pathway, for the purpose of treatment selection.

Methods: DNA microarray data from 21 FA patients and 11 control subjects were analyzed to identify genetic processes associated with a deficiency in DDR. Unsupervised hierarchical clustering was then performed using 60 BRCA1/2 mutant and 47 sporadic tumor samples, and a molecular subgroup was identified that was defined by the molecular processes represented within FA patients. A 44-gene microarray-based assay (the DDR deficiency assay) was developed to prospectively identify this subgroup from formalin-fixed, paraffin-embedded samples. All statistical tests were two-sided.

Results: In a publicly available independent cohort of 203 patients, the assay predicted complete pathologic response vs residual disease after neoadjuvant DNA-damaging chemotherapy (5-fluorouracil, anthracycline, and cyclophosphamide) with an odds ratio of 3.96 (95% confidence interval [CI] = 1.67 to 9.41; P = .002). In a new independent cohort of 191 breast cancer patients treated with adjuvant 5-fluorouracil, epirubicin, and cyclophosphamide, a positive assay result predicted 5-year relapse-free survival with a hazard ratio of 0.37 (95% CI = 0.15 to 0.88; P = .03) compared with the assay negative population.

Conclusions: A formalin-fixed, paraffin-embedded tissue-based assay has been developed and independently validated as a predictor of response and prognosis after anthracycline/cyclophosphamide–based chemotherapy in the neoadjuvant and adjuvant settings. These findings warrant further validation in a prospective clinical study.
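As a reading aid for the ratios and intervals reported in this abstract, the short sketch below checks that each point estimate sits at the geometric midpoint of its interval. This holds when an interval is built on the log scale as exp(log(estimate) ± z × SE); the abstract does not say how its intervals were computed, so that construction is an assumption, and the helper names are hypothetical.

```python
import math


def log_scale_midpoint(lower, upper):
    """Geometric mean of the interval bounds.

    For an interval of the form exp(log(estimate) +/- z * SE) this equals
    the point estimate.
    """
    return math.sqrt(lower * upper)


def log_standard_error(lower, upper, z=1.96):
    """Standard error of log(estimate) implied by a 95% log-scale interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Interval bounds as reported in the abstract above.
or_mid = log_scale_midpoint(1.67, 9.41)  # reported odds ratio: 3.96
hr_mid = log_scale_midpoint(0.15, 0.88)  # reported hazard ratio: 0.37
or_se = log_standard_error(1.67, 9.41)

print(round(or_mid, 2), round(hr_mid, 2), round(or_se, 2))
```

Both midpoints fall within rounding distance of the reported estimates, which is consistent with log-scale intervals.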

Relevance:

80.00%

Publisher:

Abstract:

Traditional Chinese Medicines (TCMs) derived from animal horns are one of the most important types of Chinese medicine. In the present study, a fast and sensitive analytical method was established for qualitative and quantitative determination of 14 nucleosides and nucleobases in animal horns using hydrophilic interaction ultra-high performance liquid chromatography coupled with triple-quadrupole tandem mass spectrometry (HILIC-UPLC-QQQ-MS/MS) in selective reaction monitoring (SRM) mode. The method was optimized and validated, and showed good linearity, precision, repeatability, and accuracy. The method was successfully used to determine contents of the 14 nucleosides and nucleobases in 25 animal horn samples. Hierarchical clustering analysis (HCA) and principal component analysis (PCA) were performed and the 25 samples were thereby divided into two groups, which agreed with taxonomy. The method may enable a quick and effective search for substitutes for precious horns.

Relevance:

80.00%

Publisher:

Abstract:

Invasive urothelial cell carcinoma (UCC) is characterized by increased chromosomal instability and follows an aggressive clinical course in contrast to non-invasive disease. To identify molecular processes that confer and maintain an aggressive malignant phenotype, we used a high-throughput genome-wide approach to interrogate a cohort of high and low clinical risk UCC tumors. Differential expression analyses highlighted cohesive dysregulation of critical genes involved in the G2/M checkpoint in aggressive UCC. Hierarchical clustering based on DNA Damage Response (DDR) genes separated tumors according to a pre-defined clinical risk phenotype. Using array-comparative genomic hybridization, we confirmed that the DDR was disrupted in tumors displaying high genomic instability. We identified DNA copy number gains at 20q13.2-q13.3 (AURKA locus) and determined that overexpression of AURKA accompanied dysregulation of DDR genes in high risk tumors. We postulated that DDR-deficient UCC tumors are advantaged by a selective pressure for AURKA-associated override of M phase barriers and confirmed this in an independent tissue microarray series. This mechanism that enables cancer cells to maintain an aggressive phenotype forms a rationale for targeting AURKA as a therapeutic strategy in advanced stage UCC.

Relevance:

80.00%

Publisher:

Abstract:

Background/Purpose: Juvenile idiopathic arthritis (JIA) comprises a poorly understood group of chronic, childhood onset, autoimmune diseases with variable clinical outcomes. We investigated whether profiling of the synovial fluid (SF) proteome by a fluorescent dye based, two-dimensional gel (DIGE) approach could distinguish the subset of patients in whom inflammation extends to affect a large number of joints, early in the disease process. The post-translational modifications to candidate protein markers were verified by a novel deglycosylation strategy.

Methods: SF samples from 57 patients were obtained around time of initial diagnosis of JIA. At 1 year from inclusion patients were categorized according to ILAR criteria as oligoarticular arthritis (n=26), extended oligoarticular (n=8) and polyarticular disease (n=18). SF samples were labeled with Cy dyes and separated by two-dimensional electrophoresis. Multivariate analyses were used to isolate a panel of proteins which distinguish patient subgroups. Proteins were identified using MALDI-TOF mass spectrometry with vitamin D binding protein (VDBP) expression and sialylation further verified by immunohistochemistry, ELISA test and immunoprecipitation. Candidate biomarkers were compared to conventional inflammation measure C-reactive protein (CRP). Sialic acid residues were enzymatically cleaved from immunopurified SF VDBP, enriched by hydrophilic interaction liquid chromatography (HILIC) and analysed by mass spectrometry.

Results: Hierarchical clustering based on the expression levels of a set of 23 proteins segregated the extended-to-be oligoarticular from the oligoarticular patients. A cleaved isoform of VDBP, spot 873, is present at significantly reduced levels in the SF of oligoarticular patients at risk of disease extension, relative to other subgroups (p < 0.05). Conversely, total levels of vitamin D binding protein are elevated in plasma and ROC curves indicate an improved diagnostic sensitivity to detect patients at risk of disease extension, over both spot 873 and CRP levels. Sialylated forms of intact immunopurified VDBP were more prevalent in persistent oligoarticular patient synovial fluids.

Conclusion: The data indicate that a subset of the synovial fluid proteome may be used to stratify patients to determine risk of disease extension. Reduced conversion of VDBP to a macrophage activation factor may represent a novel pathway contributing to increased risk of disease extension in JIA patients.

Relevance:

80.00%

Publisher:

Abstract:

Technological and scientific advances in health care have been bringing together fields such as medicine and mathematics, with science tasked with adapting the means of research, diagnosis, monitoring and therapy more effectively. The methods developed and the studies presented in this dissertation stem from the need to find answers and solutions to the various challenges identified in the field of anesthesia. The nature of these problems necessarily leads to the application, adaptation and combination of different methods and models from several areas of mathematics. The ability to induce anesthesia in patients safely and reliably involves a wide variety of situations that must be taken into account and therefore requires intensive study. Prediction methods and models that allow better personalization of the dose administered to the patient, and monitoring of the effect induced by each drug with more reliable signals, are thus fundamental to research and progress in this field. In this context, with the aim of clarifying the use of appropriate statistical treatment in anesthesia studies, I set out to apply different statistical analyses to develop a model predicting the brain response to two drugs during sedation. Data obtained from volunteers are used to study the pharmacodynamic interaction between two anesthetic drugs. In the first phase, linear regression models are explored to model the effect of the drugs on the BIS brain signal (bispectral index of the EEG, an indicator of depth of anesthesia); that is, to estimate the effect of drug concentrations on depression of the electroencephalogram (as assessed by the BIS).
In the second phase of this work, the aim is to identify different interactions using cluster analysis and to validate the resulting model using discriminant analysis, identifying homogeneous groups in the sample through clustering techniques. The number of groups in the sample was obtained, in an exploratory phase, by hierarchical clustering techniques, and the groups identified were characterized using k-means clustering. The reproducibility of the clustering models obtained was tested by discriminant analysis. The main conclusions are that the significance test of the linear regression equation indicated that the model is highly significant. The variables propofol and remifentanil significantly influence the BIS, and the model improves with the inclusion of remifentanil. This work also shows that it is possible to build a model that groups drug concentrations according to their effect on the BIS brain signal, with the support of clustering and discriminant techniques. The results clearly demonstrate the pharmacodynamic interaction of the two drugs when Cluster 1 and Cluster 3 are compared. For similar propofol concentrations, the effect on the BIS differs clearly depending on the magnitude of the remifentanil concentration. In short, the study clearly shows that when remifentanil is administered with propofol (a hypnotic), the effect of the latter is potentiated, driving the BIS signal to quite low values.
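The k-means step described in this abstract, applied once the number of groups is known, might look like the sketch below. It is a plain implementation of Lloyd's algorithm, not the dissertation's own code. Every observation (propofol concentration, remifentanil concentration, BIS) is invented for illustration, the variables are left unstandardized for brevity, and the starting centroids are simply three hand-picked observations.

```python
def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm from given starting centroids.

    Returns (labels, centroids). Distances are squared Euclidean.
    """
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


# Hypothetical observations: (propofol, remifentanil, BIS).
observations = [
    (2.0, 0.0, 75.0), (2.1, 0.5, 72.0),   # propofol alone
    (2.0, 4.0, 45.0), (2.2, 4.5, 42.0),   # similar propofol + remifentanil
    (4.0, 4.0, 30.0), (4.1, 4.2, 28.0),   # more propofol + remifentanil
]
start = [observations[0], observations[2], observations[4]]

groups, centers = kmeans(observations, start)
print(groups)
print(centers)
```

In this toy data the first two groups share nearly the same propofol level but differ in remifentanil and in BIS, which is the kind of contrast between clusters that the abstract reports.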

Relevance:

80.00%

Publisher:

Abstract:

This paper deals with the establishment of a methodology for characterizing the electric power profiles of medium voltage (MV) consumers. The characterization is supported by the knowledge discovery in databases (KDD) process. Data mining techniques are used to obtain typical load profiles of MV customers and specific knowledge of their consumption habits. To form the different customer classes and to find a set of representative consumption patterns, a hierarchical clustering algorithm and a clustering ensemble combination approach (WEACS) are used. Taking into account the typical consumption profile of the class to which each customer belongs, new tariff options were defined and new energy coefficient prices were proposed. Finally, based on the results obtained, the consequences for the interaction between customers and electric power suppliers are analyzed.

Relevance:

80.00%

Publisher:

Abstract:

This paper describes a methodology developed for the classification of medium voltage (MV) electricity customers. Starting from a sample of databases resulting from a monitoring campaign, data mining (DM) techniques are used to discover a set of typical load profiles of MV consumers and thereby to extract knowledge about electric energy consumption patterns. In the first stage, several hierarchical clustering algorithms were applied and their clustering performance was compared using adequacy measures. In the second stage, a classification model was developed to assign new consumers to one of the clusters obtained in the previous stage. Finally, the interpretation of the discovered knowledge is presented and discussed.

Relevance:

80.00%

Publisher:

Abstract:

This paper analyses forest fires from the perspective of dynamical systems. Forest fires exhibit complex correlations in size, space and time, revealing features often present in complex systems, such as the absence of a characteristic length-scale, or the emergence of long range correlations and persistent memory. This study addresses a public domain forest fires catalogue, containing information on events in Portugal during the period from 1980 to 2012. The data are analysed on an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the burnt area. First, we consider mutual information to correlate annual patterns. We use visualization trees, generated by hierarchical clustering algorithms, in order to compare and to extract relationships among the data. Second, we adopt the Multidimensional Scaling (MDS) visualization tool. MDS generates maps where each object corresponds to a point. Objects that are perceived to be similar to each other are placed on the map forming clusters. The results are analysed in order to extract relationships among the data and to identify forest fire patterns.
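The use of mutual information to compare annual patterns can be illustrated with a small estimator for discrete sequences. The abstract does not say how mutual information was estimated, so the thresholding of burnt area into two levels, the threshold value and the series themselves are assumptions made only for this sketch.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, of two equal-length discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        # p(x, y) / (p(x) * p(y)) written with raw counts.
        mi += p_joint * math.log2(count * n / (px[x] * py[y]))
    return mi


def discretize(series, threshold):
    """Map each value to 1 (large fire activity) or 0 (little or none)."""
    return [1 if v > threshold else 0 for v in series]


# Hypothetical burnt area per time bin for three years.
year_a = discretize([0, 0, 5, 40, 120, 30, 0, 0], threshold=20)
year_b = discretize([0, 1, 8, 55, 90, 10, 0, 0], threshold=20)
year_c = discretize([50, 0, 0, 0, 0, 0, 0, 60], threshold=20)

mi_ab = mutual_information(year_a, year_b)
mi_ac = mutual_information(year_a, year_c)
print(mi_ab, mi_ac)
```

A matrix of such pairwise values, computed for every pair of years, is the kind of input that a hierarchical clustering or MDS step could then take.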

Relevance:

80.00%

Publisher:

Abstract:

This paper studies the statistical distributions of worldwide earthquakes from 1963 to 2012. A Cartesian grid, dividing Earth into geographic regions, is considered. Entropy and the Jensen–Shannon divergence are used to analyze and compare real-world data. Hierarchical clustering and multi-dimensional scaling techniques are adopted for data visualization. Entropy-based indices have the advantage of leading to a single parameter expressing the relationships between the seismic data. Classical and generalized (fractional) entropy and Jensen–Shannon divergence are tested. The generalized measures lead to a clear identification of patterns embedded in the data and contribute to a better understanding of earthquake distributions.
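For reference, the classical Jensen–Shannon divergence named in this abstract can be computed from Shannon entropies as sketched below. Only the classical form is shown; the generalized (fractional) variants tested in the paper are not reproduced. The per-region event counts are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability bins contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon(p, q):
    """Classical Jensen-Shannon divergence (bits) of two distributions."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2


def normalize(counts):
    """Turn raw counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical event counts per magnitude bin for three grid regions.
region_a = normalize([120, 60, 15, 5])
region_b = normalize([110, 70, 15, 5])
region_c = normalize([10, 20, 70, 100])

jsd_ab = jensen_shannon(region_a, region_b)
jsd_ac = jensen_shannon(region_a, region_c)
print(jsd_ab, jsd_ac)
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (distributions with no bin in common), so the pairwise values can serve directly as dissimilarities for clustering or multidimensional scaling.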