930 results for Hierarchical cluster analysis
Abstract:
INTRODUCTION: Breast cancer subtyping and prognosis have been studied extensively by gene expression profiling, resulting in disparate signatures with little overlap in their constituent genes. Although a previous study demonstrated a prognostic concordance among gene expression signatures, it was limited to only one dataset and did not fully elucidate how the different genes were related to one another, nor did it examine the contribution of well-known biological processes of breast cancer tumorigenesis to their prognostic performance. METHOD: To address the above issues and to further validate these initial findings, we performed the largest meta-analysis of publicly available breast cancer gene expression and clinical data, comprising 2,833 breast tumors. Gene coexpression modules of three key biological processes in breast cancer (namely, proliferation, estrogen receptor [ER], and HER2 signaling) were used to dissect the role of constituent genes of nine prognostic signatures. RESULTS: Using a meta-analytical approach, we consolidated the signatures associated with ER signaling, ERBB2 amplification, and proliferation. Previously published expression-based nomenclature of breast cancer 'intrinsic' subtypes can be mapped to the three modules, namely, the ER-/HER2- (basal-like), the HER2+ (HER2-like), and the low- and high-proliferation ER+/HER2- subtypes (luminal A and B). We showed that all nine prognostic signatures exhibited a similar prognostic performance in the entire dataset. Their prognostic abilities are due mostly to the detection of proliferation activity. Although ER- status (basal-like) and ERBB2+ expression status correspond to poor outcome, they seem to act through elevated expression of proliferation genes and thus contain only indirect information about prognosis. Clinical variables measuring the extent of tumor progression, such as tumor size and nodal status, still add independent prognostic information to proliferation genes.
CONCLUSION: This meta-analysis unifies various results of previous gene expression studies in breast cancer. It reveals connections between traditional prognostic factors, expression-based subtyping, and prognostic signatures, highlighting the important role of proliferation in breast cancer prognosis.
Abstract:
We investigate the evolutionary history of the greater white-toothed shrew across its distribution in northern Africa and mainland Europe using sex-specific (mtDNA and Y chromosome) and biparental (X chromosome) markers. All three loci confirm a large divergence between eastern (Tunisia and Sardinia) and western (Morocco and mainland Europe) lineages, and application of a molecular clock to mtDNA divergence estimates indicates a more ancient separation (2.25 M yr ago) than described by some previous studies, supporting claims for taxonomic revision. Moroccan ancestry for the mainland European population is inconclusive from phylogenetic trees, but is supported by greater nucleotide diversity and a more ancient population expansion in Morocco than in Europe. Signatures of rapid population expansion in mtDNA, combined with low X and Y chromosome diversity, suggest a single colonization of mainland Europe by a small number of Moroccan shrews >38 K yr ago. This study illustrates that multilocus genetic analyses can facilitate the interpretation of species' evolutionary history but that phylogeographic inference using X and Y chromosomes is restricted by low levels of observed polymorphism.
Abstract:
The present study tested the effect of a school-based physical activity (PA) program on quality of life (QoL) in 540 elementary school children. First and fifth graders were randomly assigned to a PA program or a no-PA control condition during one academic year. QoL was assessed by the Child Health Questionnaire at baseline and postintervention. Based on mixed linear model analyses, physical QoL in first graders and physical and psychosocial QoL in fifth graders were not affected by the intervention. In first graders, the PA intervention had a positive impact on psychosocial QoL (effect size [d], 0.32; p < .05). Subpopulation analyses revealed that this effect was driven by effects in urban (effect size [d], 0.38; p < .05) and overweight first graders (effect size [d], 0.45; p < .05). In conclusion, a school-based PA intervention had little effect on QoL in elementary school children.
Abstract:
The study of transcriptional regulation often requires the integration of diverse yet independent data. In the present work, sequence conservation, prediction of transcription factor binding sites (TFBS) and gene expression analysis have been applied to the detection of putative transcription factor (TF) modules in the regulatory region of the FGFR3 oncogene. Several TFs with conserved binding sites in the FGFR3 regulatory region have shown high positive or negative correlation with FGFR3 expression both in urothelial carcinoma and in benign nevi. By means of conserved TF cluster analysis, two different TF modules have been identified in the promoter and first intron of the FGFR3 gene. These modules contain activating AP2, E2F, E47 and SP1 binding sites plus motifs for EGR with possible repressor function.
Abstract:
Background: Differences in the distribution of genotypes between individuals of the same ethnicity are an important confounding factor that is commonly undervalued in typical association studies conducted in radiogenomics. Objective: To evaluate the genotypic distribution of SNPs in a wide set of Spanish prostate cancer patients in order to determine the homogeneity of the population and to disclose potential bias. Design, Setting, and Participants: A total of 601 prostate cancer patients from Andalusia, the Basque Country, the Canary Islands and Catalonia were genotyped for 10 SNPs located in 6 different genes associated with DNA repair: XRCC1 (rs25487, rs25489, rs1799782), ERCC2 (rs13181), ERCC1 (rs11615), LIG4 (rs1805388, rs1805386), ATM (rs17503908, rs1800057) and P53 (rs1042522). SNP genotyping was performed in a Biotrove OpenArray® NT Cycler. Outcome Measurements and Statistical Analysis: Comparisons of genotypic and allelic frequencies among populations, as well as haplotype analyses, were carried out using the web-based environment SNPator. Principal component analysis was performed using the SnpMatrix and XSnpMatrix classes and methods implemented as an R package. Unsupervised hierarchical clustering of SNPs was performed using MultiExperiment Viewer. Results and Limitations: We observed that the genotype distribution of 4 out of 10 SNPs was statistically different among the studied populations, with the greatest differences between Andalusia and Catalonia. These observations were confirmed by cluster analysis, principal component analysis and the differential distribution of haplotypes among the populations. Because tumor characteristics have not been taken into account, it is possible that some polymorphisms may influence tumor characteristics in the same way that they may pose a risk factor for other disease characteristics.
Conclusion: Differences in the distribution of genotypes within different populations of the same ethnicity could be an important confounding factor responsible for the lack of validation of SNPs associated with radiation-induced toxicity, especially when extensive meta-analyses with subjects from different countries are carried out.
Abstract:
The objective of this work was to propose a way of using Tocher's clustering method to obtain a matrix similar to the cophenetic matrix obtained for hierarchical methods, which would allow the calculation of a cophenetic correlation. To illustrate how the proposed cophenetic matrix is obtained, we used two dissimilarity matrices - one obtained with the generalized squared Mahalanobis distance and the other with the Euclidean distance - between 17 garlic cultivars, based on six morphological characters. In essence, the proposal was to build the cophenetic matrix from the average distances within and between clusters, after performing the clustering. A function in the R language was proposed to compute the cophenetic matrix for Tocher's method. The empirical distribution of this correlation coefficient was briefly studied. For both dissimilarity measures, the cophenetic correlation values obtained for Tocher's method were higher than those obtained with the hierarchical methods (Ward's algorithm and average linkage - UPGMA). Clusterings produced by the agglomerative hierarchical methods and by Tocher's method can therefore be compared using a common criterion: the correlation between the matrices of original and cophenetic distances.
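The idea of deriving a cophenetic-like matrix from a non-hierarchical partition can be sketched as follows. This is only an illustrative Python sketch, not the R function proposed by the authors: the function names, the input format (a list-of-lists distance matrix plus one cluster label per item) and the toy data are assumptions made for this example. Each pair of items receives the mean original distance of its within-cluster or between-cluster group, and the result is correlated with the original distances.

```python
from itertools import combinations
from math import sqrt


def partition_cophenetic(dist, labels):
    """Build a cophenetic-like matrix from a partition (hypothetical helper).

    dist   : symmetric matrix (list of lists) of original dissimilarities
    labels : cluster label of each item
    Each pair (i, j) receives the mean original distance over all pairs
    sharing the same combination of clusters (within or between).
    """
    n = len(dist)
    sums, counts = {}, {}
    for i, j in combinations(range(n), 2):
        key = tuple(sorted((labels[i], labels[j])))
        sums[key] = sums.get(key, 0.0) + dist[i][j]
        counts[key] = counts.get(key, 0) + 1
    coph = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        key = tuple(sorted((labels[i], labels[j])))
        coph[i][j] = coph[j][i] = sums[key] / counts[key]
    return coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between the off-diagonal entries of two matrices."""
    pairs = list(combinations(range(len(dist)), 2))
    x = [dist[i][j] for i, j in pairs]
    y = [coph[i][j] for i, j in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy example: four items in two clusters (made-up distances).
toy_dist = [
    [0.0, 1.0, 4.0, 5.0],
    [1.0, 0.0, 4.0, 5.0],
    [4.0, 4.0, 0.0, 2.0],
    [5.0, 5.0, 2.0, 0.0],
]
toy_labels = [0, 0, 1, 1]
toy_coph = partition_cophenetic(toy_dist, toy_labels)
r = cophenetic_correlation(toy_dist, toy_coph)
```

In the toy data, every between-cluster pair is assigned the between-cluster mean of 4.5, while each within-cluster pair keeps its own cluster's mean distance.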
Abstract:
The objective of this research was to analyse the potential of Normalized Difference Vegetation Index (NDVI) maps from satellite images, yield maps and grapevine fertility and load variables to delineate zones with different wine grape properties for selective harvesting. Two vineyard blocks located in NE Spain (Cabernet Sauvignon and Syrah) were analysed. The NDVI was computed from a Quickbird-2 multi-spectral image at veraison (July 2005). Yield data were acquired by means of a yield monitor during September 2005. Other variables, such as the number of buds, number of shoots, number of wine grape clusters and weight of 100 berries, were sampled in a 10 rows × 5 vines pattern and used as input variables, in combination with the NDVI, to define the clusters as an alternative to yield maps. Two days prior to harvesting, grape samples were taken. The variables analysed were probable alcoholic degree, pH of the juice, total acidity, total phenolics, colour, anthocyanins and tannins. The input variables, alone or in combination, were clustered (2 and 3 clusters) using the ISODATA algorithm, and an analysis of variance and a multiple range test were performed. The results show that the zones derived from the NDVI maps are more effective at differentiating grape maturity and quality variables than the zones derived from the yield maps. The inclusion of other grapevine fertility and load variables did not improve the results.
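For readers unfamiliar with the index, NDVI is conventionally defined per pixel as (NIR - Red) / (NIR + Red). The short Python sketch below shows that calculation only. The reflectance values are made up, and the function name and the choice to return `None` for an empty pixel are assumptions of this example rather than details taken from the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red : reflectance in the near-infrared and red bands
    Returns None when both bands are zero, to avoid dividing by zero
    (an arbitrary choice for this sketch).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Made-up (NIR, red) reflectance pairs: dense canopy, sparse canopy, no signal.
pixels = [(0.50, 0.10), (0.30, 0.20), (0.0, 0.0)]
values = [ndvi(n, r) for n, r in pixels]
```

Higher values indicate more vigorous vegetation, which is what makes the index usable as an input variable for zone delineation.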
Abstract:
A new issue, once again a bouquet of attractive papers. First of all the paper by Droit-Dupré et al. (10.1007/s00428-015-1724-9). The group studied colonic adenocarcinomas, not otherwise specified, by immunohistochemistry for the expression of markers of intestinal epithelial cell differentiation. Hierarchical clustering analysis identified a major cluster of two thirds of the case series, expressing cytokeratin 20, CDX2 and MUC2 and invariably mismatch repair competent, which they called crypt-like. In stage III colon cancer, the crypt-like cluster had a better prognosis. The paper is a relatively simple example of what is happening in cancer classification beyond morphology: multiparameter differentiation and (epi)genomic markers defining new subtypes of cancer with potential clinical significance in clinical decision making.
Abstract:
A key strategic issue for banks is the implementation of internet banking. The ‘click and mortar’ model that complements classical branch banking with online facilities is competing with pure internet banks. The objective of this paper is to compare the performance of these two models across countries, so as to examine the role of differences in the banking system and technological progress. A fuzzy cluster analysis on the performance of banks in Finland, Spain, Italy and the UK shows that internet banks are hard to distinguish from banks that follow a click and mortar strategy; country borders are more important. We therefore explain bank performance by a group of selected bank features, country-specific economic and IT indicators over the period 1995-2004. We find that the strategy of banking groups to incorporate internet banks reflects some competitive edge that these banks have in their business models. Extensive technological innovation boosts internet banking.
Abstract:
Cluster analysis is a multivariate technique that seeks to group elements or variables so as to achieve maximum homogeneity within each group and the greatest difference between groups, by means of a hierarchical structure that makes it possible to decide which hierarchical level is the most appropriate for establishing the classification. The SPSS program offers three types of cluster analysis: hierarchical, two-step and K-means cluster analysis. We apply the hierarchical method as the most suitable for determining the optimal number of clusters present in the data, and their content, for our practical case.
Abstract:
The objective of this master’s thesis was to study how customer relationships should be assessed and categorized in order to support customer relationship management (CRM) in the context of business-to-business (B2B) professional services. This sophisticated and complex market only rarely exploits the possibilities of CRM, and even then the focus is often on technology. The theoretical part first considered CRM from the value chain point of view and then discussed the cyclical nature of relationships. The case study focused on a B2B professional service firm. The data were collected from company databases and comprised a sample of 90 customers. The research was conducted in three phases: first studying the age of the relationships, then their service type, and finally carrying out the cluster analysis. The data were analysed with the statistical analysis program SAS Enterprise Guide. The results indicate that customer relationships develop in very different ways. While some relationships are growing and changing dynamically, most customers remain constant. This implies that the expectations and requirements of customers are similarly divergent and that relationships should be managed accordingly.
Abstract:
Penetration resistance (PR) is a soil attribute that allows the identification of areas with restrictions due to compaction, which results in mechanical impedance to root growth and reduced crop yield. The aim of this study was to characterize the PR of an agricultural soil by geostatistical and multivariate analysis. Sampling was done randomly at 90 points down to 0.60 m depth. Spatial distribution models of PR were determined, and areas with mechanical impedance to root growth were defined. PR showed a random distribution at the 0.55 and 0.60 m depths. At the other depths analyzed, PR showed spatial dependence, with fits to exponential and spherical models. The cluster analysis of the sampling points made it possible to establish areas with compaction problems, which were identified in the maps obtained by kriging interpolation. Principal component analysis identified three soil layers, of which the middle layer showed the highest PR values.
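The exponential and spherical models mentioned above are standard theoretical semivariogram forms. The Python sketch below writes them out under one common parameterization (nugget, partial sill, and range, with a factor of 3 in the exponential so that the range is the "practical" range). The abstract does not state which convention or parameter values the authors used, so the names and numbers here are assumptions for illustration only.

```python
from math import exp


def spherical(h, nugget, partial_sill, rng):
    """Spherical semivariogram: reaches nugget + partial_sill at h = rng."""
    if h == 0:
        return 0.0
    if h >= rng:
        return nugget + partial_sill
    ratio = h / rng
    return nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)


def exponential(h, nugget, partial_sill, rng):
    """Exponential semivariogram with practical range rng.

    The sill is approached asymptotically; with the factor 3 used here,
    about 95% of the partial sill is reached at h = rng.
    """
    if h == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - exp(-3.0 * h / rng))


# Made-up parameters: no nugget, unit partial sill, range of 10 m.
lags = [0.0, 5.0, 10.0, 20.0]
sph = [spherical(h, 0.0, 1.0, 10.0) for h in lags]
expo = [exponential(h, 0.0, 1.0, 10.0) for h in lags]
```

A semivariogram that is flat at all lags (pure nugget) corresponds to the random distribution reported for the deepest layers, where neither model would be fitted.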
Abstract:
This study aimed to develop a methodology based on multivariate statistical analysis (principal component analysis and cluster analysis) in order to identify the most representative variables in studies of minimum streamflow regionalization, and to optimize the identification of hydrologically homogeneous regions for the Doce river basin. Ten variables were used, referring to the climatic and morphometric characteristics of the river basin. These variables were determined individually for each of the 61 gauging stations. They comprised three dependent variables indicative of minimum streamflow (Q7,10, Q90 and Q95) and seven independent variables concerning the climatic and morphometric characteristics of the basin (total annual rainfall – Pa; total semiannual rainfall of the dry and of the rainy season – Pss and Psc; watershed drainage area – Ad; length of the main river – Lp; total length of the rivers – Lt; and average watershed slope – SL). The results of the principal component analysis indicated that the variable SL was the least representative for the study, and so it was discarded. The most representative independent variables were Ad and Psc. The best divisions into hydrologically homogeneous regions for the three studied flow characteristics were obtained using the Mahalanobis similarity matrix and the complete linkage clustering method. The cluster analysis enabled the identification of four hydrologically homogeneous regions in the Doce river basin.
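The complete linkage method named above can be sketched in a few lines of Python. This is a naive illustration, not the software used in the study: it assumes a dissimilarity matrix has already been computed (for example from Mahalanobis distances between gauging stations), and the function name, the stopping rule by number of groups, and the toy matrix are all assumptions of this example.

```python
def complete_linkage(dist, n_clusters):
    """Naive agglomerative clustering with complete linkage.

    dist       : symmetric dissimilarity matrix (list of lists)
    n_clusters : number of groups at which merging stops (must be >= 1)
    Returns a list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: the distance between two clusters is
                # the largest pairwise distance between their members.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        # Merge the closest pair of clusters and drop the absorbed one.
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy dissimilarities between five hypothetical stations.
toy = [
    [0.0, 1.0, 6.0, 7.0, 9.0],
    [1.0, 0.0, 5.0, 6.0, 8.0],
    [6.0, 5.0, 0.0, 1.5, 4.0],
    [7.0, 6.0, 1.5, 0.0, 3.0],
    [9.0, 8.0, 4.0, 3.0, 0.0],
]
groups = complete_linkage(toy, 2)
```

Because complete linkage uses the largest pairwise distance, it tends to produce compact groups, which is one plausible reason it suits the delineation of homogeneous regions.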