980 results for Labeling hierarchical clustering


Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

80.00%

Publisher:

Abstract:

The soybean crop is considered highly important around the world. In plant breeding programs, knowledge of genetic diversity is extremely important, and multivariate analyses are frequently used in this context. Thus, the aim of the present study was to evaluate the genetic divergence between soybean crosses through multivariate techniques. In total, 16 crosses in the F2 generation of inbreeding were evaluated. The evaluated characteristics were plant height at maturity, height of the first pod, number of branches per plant, number of pods per plant, number of nodes per plant, hundred-seed weight, grain yield and oil content. The analyses used Euclidean distance, the UPGMA and Ward hierarchical clustering methods, and principal component analysis. Genetic distances estimated using Euclidean distance ranged from 1.24 to 8.13, with the smallest distance observed between crosses C1 and C4 and the greatest between crosses C2 and C6. The UPGMA and Ward clustering methods assigned the crosses to five different groups. The principal component analysis explained 86.2% of the variance contained in the original eight variables with three principal components. The characters APM, NV, NR, NN, PG% and oil were the main contributors to genetic divergence among the traits. Multivariate techniques were crucial to the analysis of genetic diversity, and the Ward and UPGMA clustering methods and principal component analysis gave consistent results; the simultaneous use of these tools in the genetic analysis of crosses is therefore recommended.
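The combination of Euclidean distance and UPGMA (average-linkage) clustering mentioned in this abstract can be illustrated with a small sketch. This is only a minimal standard-library illustration under stated assumptions: the entries `X1` to `X5`, their trait values, and the function names `euclidean` and `upgma` are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length trait vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(points, n_groups):
    """Agglomerative clustering with average linkage (UPGMA).

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance between their members until n_groups clusters remain.
    Returns a sorted list of sorted label lists.
    """
    labels = list(points)
    # Pairwise distances between individual entries, computed once.
    dist = {frozenset(pair): euclidean(points[pair[0]], points[pair[1]])
            for pair in combinations(labels, 2)}
    clusters = [[name] for name in labels]

    def average_link(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_groups:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: average_link(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical standardized trait values (NOT the study's data).
entries = {
    "X1": [0.1, 0.2, 0.0],
    "X2": [3.0, 3.1, 2.9],
    "X3": [0.2, 0.1, 0.1],
    "X4": [3.1, 2.9, 3.0],
    "X5": [6.0, 6.2, 5.9],
}
groups = upgma(entries, 3)
print(groups)
```

In practice the traits would usually be standardized before computing distances, so that variables measured on large scales (such as grain yield) do not dominate the result.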

Relevance:

80.00%

Publisher:

Abstract:

Graduate Program in Agronomy (Plant Production) - FCAV

Relevance:

80.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

80.00%

Publisher:

Abstract:

Background: Large gene expression studies, such as those conducted using DNA arrays, often provide millions of different pieces of data. To address the problem of analyzing such data, we describe a statistical method, which we have called ‘gene shaving’. The method identifies subsets of genes with coherent expression patterns and large variation across conditions. Gene shaving differs from hierarchical clustering and other widely used methods for analyzing gene expression studies in that genes may belong to more than one cluster, and the clustering may be supervised by an outcome measure. The technique can be ‘unsupervised’, that is, the genes and samples are treated as unlabeled, or partially or fully supervised by using known properties of the genes or samples to assist in finding meaningful groupings. Results: We illustrate the use of the gene shaving method to analyze gene expression measurements made on samples from patients with diffuse large B-cell lymphoma. The method identifies a small cluster of genes whose expression is highly predictive of survival. Conclusions: The gene shaving method is a potentially useful tool for exploration of gene expression data and identification of interesting clusters of genes worth further investigation.

Relevance:

80.00%

Publisher:

Abstract:

Biogeography has been difficult to apply as a methodological approach because organismic biology is incomplete at levels where the process of formulating comparisons and analogies is complex. The study of insect biogeography became necessary because insects possess numerous evolutionary traits and play an important role as pollinators. Among insects, the euglossine bees, or orchid bees, attract interest because the study of their biology allows us to explain important steps in the evolution of social behavior and many other adaptive tradeoffs. We analyzed the distribution of morphological characteristics in Colombian orchid bees from an ecological perspective. The aim of this study was to observe the distribution of these attributes on a regional basis. Data corresponding to Colombian euglossine species were ordered with a correspondence analysis followed by hierarchical clustering. Then, based on community properties, we compared the resulting hierarchical model with the collection localities to identify a biogeographic classification pattern. From this analysis, we derived a model that classifies the territory of Colombia into 11 biogeographic units or natural clusters. Ecological assumptions in concordance with the derived classification levels suggest that species characteristics associated with flight performance, nectar uptake, and social behavior are the factors that produced the current geographical structure.

Relevance:

80.00%

Publisher:

Abstract:

Alzheimer's disease (AD) is the most common cause of dementia in the human population, characterized by a spectrum of neuropathological abnormalities that results in memory impairment and loss of other cognitive processes as well as the presence of non-cognitive symptoms. Transcriptomic analyses provide an important approach to elucidating the pathogenesis of complex diseases like AD, helping to identify both pre-clinical markers of susceptible patients and the early pathogenic mechanisms that could serve as therapeutic targets. This study provides the gene expression profile of postmortem brain tissue from subjects with clinicopathological AD (Braak IV, V, or VI and CERAD B or C; and CDR >= 1), preclinical AD (Braak IV, V, or VI and CERAD B or C; and CDR = 0), and healthy older individuals (Braak <= II and CERAD 0 or A; and CDR = 0) in order to establish genes related to both AD neuropathology and the clinical emergence of dementia. Based on differential gene expression, hierarchical clustering and network analysis, genes involved in energy metabolism, oxidative stress, DNA damage/repair, senescence, and transcriptional regulation were implicated in the neuropathology of AD. A transcriptional profile related to the clinical manifestation of AD could not be reliably detected using differential gene expression analysis, although genes involved in synaptic plasticity and the cell cycle seem to have a role, as revealed by a gene classifier. In conclusion, the present data suggest gene expression profile changes secondary to the development of AD-related pathology, and some genes that appear to be related to the clinical manifestation of dementia in subjects with significant AD pathology. Further investigation is needed to better understand how these transcriptional findings bear on the pathogenesis and clinical emergence of AD.

Relevance:

80.00%

Publisher:

Abstract:

Background: Oral squamous cell carcinoma (OSCC) is a frequent neoplasm, which is usually aggressive and has unpredictable biological behavior and unfavorable prognosis. The comprehension of the molecular basis of this variability should lead to the development of targeted therapies as well as to improvements in specificity and sensitivity of diagnosis. Results: Samples of primary OSCCs and their corresponding surgical margins were obtained from male patients during surgery and their gene expression profiles were screened using whole-genome microarray technology. Hierarchical clustering and Principal Components Analysis were used for data visualization and One-way Analysis of Variance was used to identify differentially expressed genes. Samples clustered mostly according to disease subsite, suggesting molecular heterogeneity within tumor stages. In order to corroborate our results, two publicly available datasets of microarray experiments were assessed. We found significant molecular differences between OSCC anatomic subsites concerning groups of genes presently or potentially important for drug development, including mRNA processing, cytoskeleton organization and biogenesis, metabolic process, cell cycle and apoptosis. Conclusion: Our results corroborate literature data on molecular heterogeneity of OSCCs. Differences between disease subsites and among samples belonging to the same TNM class highlight the importance of gene expression-based classification and challenge the development of targeted therapies.

Relevance:

80.00%

Publisher:

Abstract:

Background: Regardless of the regulatory function of microRNAs (miRNAs), their differential expression pattern has been used to define miRNA signatures and to disclose disease biomarkers. To address the question of whether patients presenting the different types of diabetes mellitus could be distinguished on the basis of their miRNA and mRNA expression profiling, we obtained peripheral blood mononuclear cell (PBMC) RNAs from 7 type 1 (T1D), 7 type 2 (T2D), and 6 gestational diabetes (GDM) patients, which were hybridized to Agilent miRNA and mRNA microarrays. Data quantification and quality control were performed using the Feature Extraction software, and the data distribution was normalized using the quantile function implemented in the aroma.light package. Differentially expressed miRNAs/mRNAs were identified using Rank Products, comparing T1D vs. GDM, T2D vs. GDM and T1D vs. T2D. Hierarchical clustering was performed using the average linkage criterion with uncentered Pearson distance as the metric. Results: The use of the same microarray platform permitted the identification of sets of shared or specific miRNA/mRNA interactions for each type of diabetes. Nine miRNAs (hsa-miR-126, hsa-miR-1307, hsa-miR-142-3p, hsa-miR-142-5p, hsa-miR-144, hsa-miR-199a-5p, hsa-miR-27a, hsa-miR-29b, and hsa-miR-342-3p) were shared among T1D, T2D and GDM, and additional specific miRNAs were identified for T1D (20 miRNAs), T2D (14) and GDM (19) patients. ROC curves allowed the identification of specific and relevant (greater AUC values) miRNAs for each type of diabetes, including: i) hsa-miR-1274a, hsa-miR-1274b and hsa-let-7f for T1D; ii) hsa-miR-222, hsa-miR-30e and hsa-miR-140-3p for T2D; and iii) hsa-miR-181a and hsa-miR-1268 for GDM. Many of these miRNAs targeted mRNAs associated with diabetes pathogenesis. Conclusions: These results indicate that PBMCs can be used as reporter cells to characterize the miRNA expression profiles associated with the different manifestations of diabetes mellitus. Shared miRNAs may characterize diabetes as a metabolic and inflammatory disorder, whereas specific miRNAs may represent biological markers for each type of diabetes, deserving further attention.
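The clustering metric named in this abstract, uncentered Pearson distance with average linkage, can be sketched as follows. The uncentered Pearson correlation is the cosine of the angle between two profiles (no mean subtraction), and the distance is commonly taken as one minus that value. The function names and the example profiles are illustrative assumptions, not the authors' code or data.

```python
import math


def uncentered_pearson_distance(x, y):
    """Return 1 - uncentered Pearson correlation (1 - cosine similarity)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("distance is undefined for an all-zero profile")
    return 1.0 - dot / (norm_x * norm_y)


def average_linkage(cluster_a, cluster_b):
    """Mean pairwise distance between the profiles of two clusters."""
    total = sum(uncentered_pearson_distance(p, q)
                for p in cluster_a for q in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


# Hypothetical expression profiles (not from the study).
proportional = uncentered_pearson_distance([1, 2, 3], [2, 4, 6])  # same direction
orthogonal = uncentered_pearson_distance([1, 0], [0, 1])          # unrelated
opposite = uncentered_pearson_distance([1, 2], [-1, -2])          # anti-correlated
print(proportional, orthogonal, opposite)
```

Because the means are not subtracted, two profiles with the same shape but a constant offset are not treated as identical, which is the main practical difference from the centered Pearson distance.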

Relevance:

80.00%

Publisher:

Abstract:

Breast cancer patients show a wide variation in normal tissue reactions after radiotherapy. The individual sensitivity to x-rays limits the efficiency of the therapy. Prediction of individual sensitivity to radiotherapy could help to select the radiation protocol and to improve treatment results. The aim of this study was to assess the relationship between gene expression profiles of ex vivo un-irradiated and irradiated lymphocytes and the development of toxicity due to high-dose hyperfractionated radiotherapy in patients with locally advanced breast cancer. Raw data from microarray experiments were uploaded to the Gene Expression Omnibus Database http://www.ncbi.nlm.nih.gov/geo/ (GEO accession GSE15341). We obtained a small group of 81 genes significantly regulated by radiotherapy, grouped into 50 relevant pathways. Using ANOVA and t-test statistical tools we found 20 and 26 constitutive genes (0 Gy) that segregate patients with and without acute and late toxicity, respectively. Non-supervised hierarchical clustering was used for the visualization of results. Six and 9 pathways were significantly regulated, respectively. Concerning irradiated lymphocytes (2 Gy), we found 29 genes that separate patients with acute toxicity from those without it. Those genes were grouped into 4 significant pathways. We could not identify a set of genes that segregates patients with and without late toxicity. In conclusion, we have found an association between the constitutive gene expression profile of peripheral blood lymphocytes and the development of acute and late toxicity in consecutive, unselected patients. These observations suggest the possibility of predicting normal tissue response to irradiation in high-dose non-conventional radiation therapy regimens. Prospective studies with a larger number of patients are needed to validate these preliminary results.

Relevance:

80.00%

Publisher:

Abstract:

In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has led to enormous progress in the life sciences. Among the most important innovations is microarray technology. It allows the expression of thousands of genes to be quantified simultaneously by measuring the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousands, and a sample size in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex diseases such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. These samples are hybridized to microarrays in an effort to find a small number of genes that are strongly correlated with the group of individuals. Even though methods to analyse the data are now well developed and close to reaching a standard organization (through the efforts of proposed international projects such as the Microarray Gene Expression Data (MGED) Society [1]), it is not infrequent to come across a clinician's question for which no compelling statistical method is available. The contribution of this dissertation to deciphering disease is the development of new approaches aimed at handling open problems posed by clinicians in specific experimental designs. Chapter 1, starting from a necessary biological introduction, reviews microarray technologies and all the important steps of an experiment, from the production of the array to the quality controls, ending with the preprocessing steps that are used for data analysis in the rest of the dissertation.
Chapter 2 provides a critical review of standard analysis methods, stressing most of the problems that [...]. Chapter 3 introduces a method to address the issue of unbalanced design in microarray experiments. In microarray experiments, experimental design is a crucial starting point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples should be collected for the two classes. However, in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists of a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC). A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each probe is given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared with that of SAM and LIMMA [3] on two data sets simulated via beta and exponential distributions. The results of all three algorithms on low-noise data sets seem acceptable. However, on a real unbalanced two-channel data set regarding chronic lymphocytic leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes. Although standard algorithms perform well on low-noise simulated data sets, MultiSAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. Chapter 4 describes a method to address the evaluation of similarities in a three-class problem by means of the Relevance Vector Machine [4].
Indeed, when microarray data are viewed in a prognostic and diagnostic clinical framework, differences are not the only thing that can play a crucial role. In some cases similarities can give useful, and sometimes even more important, information. The goal, given three classes, could be to establish, with a certain level of confidence, whether the third one is similar to the first or the second one. In this work we show that the Relevance Vector Machine (RVM) [2] could be a possible solution to the limitations of standard supervised classification. In fact, RVM offers many advantages compared, for example, with its well-known precursor, the Support Vector Machine (SVM) [3]. Among these advantages, the estimate of the posterior probability of class membership represents a key feature for addressing the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on a three-class tumor grade problem, with 67 samples of grade 1 (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, and then to evaluate the third class, G2, as a test set to obtain the probability for G2 samples of belonging to class G1 or class G3. The analysis showed that breast cancer samples of grade 2 have a molecular profile more similar to breast cancer samples of grade 1. This result had been conjectured in the literature, but no measure of significance had been given before.
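The resampling scheme described for MultiSAM (repeatedly comparing the less populated class with equally sized random draws from the more populated class, then counting how often each probe is called differentially expressed) can be sketched as below. This is a rough illustration only: a Welch t-statistic with a fixed cutoff stands in for the SAM statistic, which is an assumption made here for brevity, and the function names, threshold and data are hypothetical.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch t-statistic; a simple stand-in for the SAM statistic (assumption)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def recurrence_scores(lpc, mpc, n_iter=1000, threshold=4.0, seed=0):
    """Count, per probe, how often it is called differentially expressed.

    lpc, mpc: lists of samples (each a list of probe values) for the less
    and more populated class. Each iteration draws len(lpc) samples from
    mpc without replacement and compares them with lpc, so every score
    lies between 0 and n_iter.
    """
    rng = random.Random(seed)
    n_probes = len(lpc[0])
    scores = [0] * n_probes
    for _ in range(n_iter):
        subset = rng.sample(mpc, len(lpc))
        for p in range(n_probes):
            t = welch_t([s[p] for s in lpc], [s[p] for s in subset])
            if abs(t) >= threshold:
                scores[p] += 1
    return scores


# Hypothetical data: probe 0 differs strongly between classes, probe 1 does not.
lpc = [[10.0, 1.0], [10.5, 2.0], [9.5, 3.0]]
mpc = [[0.0, 1.0], [0.5, 2.0], [-0.5, 3.0], [0.2, 1.0], [-0.2, 2.0],
       [0.1, 3.0], [-0.1, 1.0], [0.3, 2.0], [-0.3, 3.0]]
scores = recurrence_scores(lpc, mpc, n_iter=200)
print(scores)
```

Probes would then be ranked by score, and a cutoff on the score (the abstract reports probes with score >300 out of 1,000) selects the final list.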

Relevance:

80.00%

Publisher:

Abstract:

Because of its aberrant activation, the PI3K/Akt/mTOR signaling pathway represents a pharmacological target in blast cells from patients with acute myelogenous leukemia (AML). Using Reverse Phase Protein Microarrays (RPMA), we analyzed 20 phosphorylated epitopes of the PI3K/Akt/mTOR signaling pathway in peripheral blood and bone marrow specimens from 84 patients with newly diagnosed AML. Fresh blast cells were grown for 2 h, 4 h or 20 h untreated or treated with a panel of phase I or phase II Akt allosteric inhibitors, either alone or in combination with the mTOR kinase inhibitor Torin1 or the broad RTK inhibitor Sunitinib. By unsupervised hierarchical clustering, strong phosphorylation/activity of most of the sampled members of the PI3K/Akt/mTOR pathway was observed in 70% of samples from AML patients. Remarkably, however, we observed that inhibition of Akt phosphorylation, as well as of its substrates, was transient, and recovered or even increased far above the basal level after 20 h in 60% of samples. We demonstrated that inhibition of Akt induces FOXO-dependent insulin receptor expression and IRS-1 activation, attenuating the effect of drug treatment by reactivation of PI3K/Akt. Consistent with this model, we found that combined inhibition of Akt and RTKs is much more effective than either alone, revealing the adaptive capabilities of signaling networks in blast cells and highlighting the limitations of these drugs if used as monotherapy.

Relevance:

80.00%

Publisher:

Abstract:

Intelligent Transport Systems (ITS) consist of the application of ICT to transport in order to offer new and improved services for the mobility of people and freight. While using ITS, travellers produce large quantities of data that can be collected and analysed to study their behaviour and to provide information to decision makers and planners. The thesis proposes innovative deployments of classification algorithms for Intelligent Transport Systems with the aim of supporting decisions on traffic rerouting, bus transport demand and the behaviour of two-wheeled vehicles. The first part of this work provides an overview and a classification of a selection of clustering algorithms that can be implemented for the analysis of ITS data. The first contribution of this thesis is an innovative use of the agglomerative hierarchical clustering algorithm to classify similar trips in terms of their origin and destination, together with a proposed methodology to analyse drivers' route choice behaviour using GPS coordinates and optimal alternatives. The clusters of repetitive trips made by a sample of drivers are then analysed to compare observed route choices with the modelled alternatives. The results of the analysis show that drivers select routes that are more reliable but more expensive in terms of travel time. Subsequently, different types of users of a service that provides information on real-time bus arrivals at stops are classified using Support Vector Machines. The results show that the classification of different types of bus transport users can be used to update or complement the census on bus transport flows. Finally, the problem of classifying accidents involving two-wheeled vehicles is presented, together with possible future applications of clustering methodologies aimed at identifying and classifying the different types of accidents.
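The idea of grouping similar trips by origin and destination with agglomerative hierarchical clustering can be sketched as follows. The abstract does not state the distance or linkage used, so everything specific here is an assumption: trips are pairs of planar points in metres, the trip distance is the sum of the origin and destination distances, and complete linkage is stopped at a distance cutoff.

```python
import math
from itertools import combinations


def trip_distance(t1, t2):
    """Sum of origin-to-origin and destination-to-destination distances."""
    (o1, d1), (o2, d2) = t1, t2
    return math.dist(o1, o2) + math.dist(d1, d2)


def cluster_trips(trips, cutoff):
    """Complete-linkage agglomerative clustering stopped at a cutoff.

    Merging continues while the closest pair of clusters has a maximum
    pairwise trip distance of at most `cutoff`. Returns sorted lists of
    trip indices.
    """
    clusters = [[i] for i in range(len(trips))]

    def complete_link(c1, c2):
        return max(trip_distance(trips[a], trips[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: complete_link(clusters[ij[0]], clusters[ij[1]]))
        if complete_link(clusters[i], clusters[j]) > cutoff:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical trips as ((origin_x, origin_y), (dest_x, dest_y)) in metres.
trips = [
    ((0, 0), (1000, 1000)),
    ((20, 10), (990, 1010)),
    ((5000, 5000), (0, 0)),
    ((5010, 4990), (15, 5)),
    ((0, 0), (5000, 5000)),
]
clusters = cluster_trips(trips, cutoff=100.0)
print(clusters)
```

With real GPS data the coordinates would first need projecting to a planar system (or the distance replaced by a great-circle distance), and the cutoff would be chosen to reflect what counts as "the same" origin and destination.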

Relevance:

80.00%

Publisher:

Abstract:

The purpose of this study was to determine the role of saliva-derived biomarkers and periodontal pathogens during periodontal disease progression (PDP). One hundred human participants were recruited into a 12-month investigation. They were seen bi-monthly for saliva and clinical measures and bi-annually for subtraction radiography, serum and plaque biofilm assessments. Saliva and serum were analyzed with protein arrays for 14 pro-inflammatory and bone turnover markers, while qPCR was used for detection of biofilm. A hierarchical clustering algorithm was used to group study participants based on clinical, microbiological, salivary/serum biomarkers, and PDP. Eighty-three individuals completed the six-month monitoring phase, with 39 exhibiting PDP, while 44 demonstrated stability. Participants assembled into three clusters based on periodontal pathogens, serum and salivary biomarkers. Cluster 1 members displayed high salivary biomarkers and biofilm; 71% of these individuals were undergoing PDP. Cluster 2 members displayed low biofilm and biomarker levels; 76% of these individuals were stable. Cluster 3 members were not discriminated by PDP status; however, cluster stratification followed groups 1 and 2 based on thresholds of salivary biomarkers and biofilm pathogens. The association of cluster membership to PDP was highly significant (p < 0.0007). The use of salivary and biofilm biomarkers offers potential for the identification of PDP or stability (ClinicalTrials.gov number, CT00277745).

Relevance:

80.00%

Publisher:

Abstract:

Objective: The most difficult thyroid tumors to be diagnosed by cytology and histology are conventional follicular carcinomas (cFTCs) and oncocytic follicular carcinomas (oFTCs). Several microRNAs (miRNAs) have been previously found to be consistently deregulated in papillary thyroid carcinomas; however, very limited information is available for cFTC and oFTC. The aim of this study was to explore miRNA deregulation and find candidate miRNA markers for follicular carcinomas that can be used diagnostically. Design: Thirty-eight follicular thyroid carcinomas (21 cFTCs, 17 oFTCs) and 10 normal thyroid tissue samples were studied for expression of 381 miRNAs using human microarray assays. Expression of deregulated miRNAs was confirmed by individual RT-PCR assays in all samples. In addition, 11 follicular adenomas, two hyperplastic nodules (HNs), and 19 fine-needle aspiration samples were studied for expression of novel miRNA markers detected in this study. Results: The unsupervised hierarchical clustering analysis demonstrated individual clusters for cFTC and oFTC, indicating the difference in miRNA expression between these tumor types. Both cFTCs and oFTCs showed an up-regulation of miR-182/-183/-221/-222/-125a-3p and a down-regulation of miR-542-5p/-574-3p/-455/-199a. A novel miRNA (miR-885-5p) was found to be strongly up-regulated (>40-fold) in oFTCs but not in cFTCs, follicular adenomas, and HNs. The classification and regression tree algorithm applied to fine-needle aspiration samples demonstrated that three dysregulated miRNAs (miR-885-5p/-221/-574-3p) allowed distinguishing follicular thyroid carcinomas from benign HNs with high accuracy. Conclusions: In this study we demonstrate that different histopathological types of follicular thyroid carcinomas have distinct miRNA expression profiles. MiR-885-5p is highly up-regulated in oncocytic follicular carcinomas and may serve as a diagnostic marker for these tumors. A small set of deregulated miRNAs allows for an accurate discrimination between follicular carcinomas and hyperplastic nodules and can be used diagnostically in fine-needle aspiration biopsies.