790 results for Agglomerative Hierarchical Clustering


Relevance:

80.00%

Abstract:

Microarray gene expression profiles of fresh clinical samples of chronic myeloid leukaemia in chronic phase, acute promyelocytic leukaemia and acute monocytic leukaemia were compared with profiles from cell lines representing the corresponding types of leukaemia (K562, NB4, HL60). In a hierarchical clustering analysis, all clinical samples clustered separately from the cell lines, regardless of leukaemic subtype. Gene ontology analysis showed that cell lines chiefly overexpressed genes related to macromolecular metabolism, whereas in clinical samples genes related to the immune response were abundantly expressed. These findings must be taken into consideration when conclusions from cell line-based studies are extrapolated to patients.

Relevance:

80.00%

Abstract:

Alveolar echinococcosis (AE), caused by the cestode Echinococcus multilocularis, is a severe zoonotic disease found in temperate and arctic regions of the northern hemisphere. Even though the transmission patterns observed in different geographical areas are heterogeneous, the nuclear and mitochondrial targets usually used for the genotyping of E. multilocularis have shown only marked genetic homogeneity in this species. We used microsatellite sequences, because of their high typing resolution, to explore the genetic diversity of E. multilocularis. Four microsatellite targets (EmsJ, EmsK, and EmsB, which were designed in our laboratory, and NAK1, selected from the literature) were tested on a panel of 76 E. multilocularis samples (larval and adult stages) obtained from Alaska, Canada, Europe, and Asia. Genetic diversity for each target was assessed by size polymorphism analysis. With the EmsJ and EmsK targets, two alleles were found for each locus, yielding two and three genotypes, respectively, discriminating European isolates from the other groups. With NAK1, five alleles were found, yielding seven genotypes, including those specific to Tibetan and Alaskan isolates. The EmsB target, a tandemly repeated multilocus microsatellite, yielded 17 alleles showing a complex pattern. Hierarchical clustering analyses were performed with the EmsB findings, and 29 genotypes were identified. Due to its higher genetic polymorphism, EmsB exhibited a higher discriminatory power than the other targets. The complex EmsB pattern was able to discriminate isolates on a regional and sectoral level, while avoiding overdistinction. EmsB will be used to assess the putative emergence of E. multilocularis in Europe.

Relevance:

80.00%

Abstract:

Sexual selection theory largely rests on the assumption that populations contain individual variation in mating preferences and that individuals are consistent in their preferences. However, there are few empirical studies of within-population variation and even fewer have examined individual male mating preferences. Here, we studied a color polymorphic population of the Lake Victoria cichlid fish Neochromis omnicaeruleus, a species in which color morphs are associated with different sex-determining factors. Wild-caught males were tested in three-way choice trials with multiple combinations of different females belonging to the three color morphs. Compositional log-ratio techniques were applied to analyze individual male mating preferences. Large individual variation in consistency, strength, and direction of male mating preferences for female color morphs was found and hierarchical clustering of the compositional data revealed the presence of four distinct preference groups corresponding to the three color morphs in addition to a no-preference class. Consistency of individual male mating preferences was higher in males with strongest preferences. We discuss the implications of these findings for our understanding of the mechanisms underlying polymorphism in mating preferences.
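The abstract does not say which log-ratio transform was applied. As a minimal sketch, assuming the centered log-ratio (clr) transform, a male's choice proportions across the three female morphs could be mapped to coordinates that are suitable for distance-based hierarchical clustering. The function name `clr` and the example proportions are hypothetical and are not data from the study.

```python
import math

def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    if any(part <= 0 for part in composition):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log makes the result independent of the total.
    return [value - mean_log for value in logs]

# Hypothetical share of courtship directed at each of three female morphs.
preference = [0.6, 0.3, 0.1]
coords = clr(preference)
```

The clr coordinates always sum to zero and do not change when the composition is rescaled, so a male with no preference (equal shares) maps to the origin.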

Relevance:

80.00%

Abstract:

Skin segmentation is a challenging task due to several influences such as unknown lighting conditions, skin-colored backgrounds, and camera limitations. Many skin segmentation approaches have been proposed in the past, including adaptive (in the sense of updating the skin color online) and non-adaptive approaches. In this paper, we compare three skin segmentation approaches that are promising for hand tracking, which is our main motivation for this work. Hand tracking is widely applicable in VR/AR, e.g. for navigation and object manipulation. The first skin segmentation approach is a well-known non-adaptive approach. It is based on a simple, pre-computed skin color distribution. Methods two and three adaptively estimate the skin color in each frame using clustering algorithms. The second approach uses hierarchical clustering for a simultaneous image and color space segmentation, while the third approach is a pure color space clustering, but with a more sophisticated clustering method. For evaluation, we compared the segmentation results of the approaches against a ground truth dataset. To obtain the ground truth dataset, we labeled about 500 images captured under various conditions.

Relevance:

80.00%

Abstract:

BACKGROUND Follicular variant of papillary thyroid carcinoma (FVPTC) shares features of papillary (PTC) and follicular (FTC) thyroid carcinomas on a clinical, morphological, and genetic level. MicroRNA (miRNA) deregulation was extensively studied in PTCs and FTCs. However, very limited information is available for FVPTC. The aim of this study was to assess miRNA expression in FVPTC with the most comprehensive miRNA array panel and to correlate it with the clinicopathological data. METHODS Forty-four papillary thyroid carcinomas (17 FVPTC, 27 classic PTC) and eight normal thyroid tissue samples were analyzed for expression of 748 miRNAs using Human Microarray Assays on the ABI 7900 platform (Life Technologies, Carlsbad, CA). In addition, an independent set of 61 tumor and normal samples was studied for expression of novel miRNA markers detected in this study. RESULTS Overall, the miRNA expression profile demonstrated similar trends between FVPTC and classic PTC. Fourteen miRNAs were deregulated in FVPTC with a fold change of more than five (up/down), including miRNAs known to be upregulated in PTC (miR-146b-3p, -146-5p, -221, -222 and miR-222-5p) and novel miRNAs (miR-375, -551b, 181-2-3p, 99b-3p). However, the levels of miRNA expression were different between these tumor types and some miRNAs were uniquely dysregulated in FVPTC allowing separation of these tumors on the unsupervised hierarchical clustering analysis. Upregulation of novel miR-375 was confirmed in a large independent set of follicular cell derived neoplasms and benign nodules and demonstrated specific upregulation for PTC. Two miRNAs (miR-181a-2-3p, miR-99b-3p) were associated with an adverse outcome in FVPTC patients by a Kaplan-Meier (p < 0.05) and multivariate Cox regression analysis (p < 0.05). CONCLUSIONS Despite high similarity in miRNA expression between FVPTC and classic PTC, several miRNAs were uniquely expressed in each tumor type, supporting their histopathologic differences. 
The highly upregulated miRNA identified in this study (miR-375) can serve as a novel marker of papillary thyroid carcinoma, and miR-181a-2-3p and miR-99b-3p can predict relapse-free survival in patients with FVPTC, thus potentially providing important diagnostic and predictive value.

Relevance:

80.00%

Abstract:

Sterols are an essential class of lipids in eukaryotes, where they serve as structural components of membranes and play important roles as signaling molecules. Sterols are also of high pharmacological significance: cholesterol-lowering drugs are blockbusters in human health, and inhibitors of ergosterol biosynthesis are widely used as antifungals. Inhibitors of ergosterol synthesis are also being developed for Chagas's disease, caused by Trypanosoma cruzi. Here we develop an in silico pipeline to globally evaluate sterol metabolism and perform comparative genomics. We generate a library of hidden Markov model-based profiles for 42 sterol biosynthetic enzymes, which allows expressing the genomic makeup of a given species as a numerical vector. Hierarchical clustering of these vectors functionally groups eukaryote proteomes and reveals convergent evolution, in particular metabolic reduction in obligate endoparasites. We experimentally explore sterol metabolism by testing a set of sterol biosynthesis inhibitors against trypanosomatids, Plasmodium falciparum, Giardia, and mammalian cells, and by quantifying the expression levels of sterol biosynthetic genes during the different life stages of T. cruzi and Trypanosoma brucei. The phenotypic data correlate with genomic makeup for simvastatin, which showed activity against trypanosomatids. Other findings, such as the activity of terbinafine against Giardia, are not in agreement with the genotypic profile.

Relevance:

80.00%

Abstract:

Radiomics is the high-throughput extraction and analysis of quantitative image features. For non-small cell lung cancer (NSCLC) patients, radiomics can be applied to standard of care computed tomography (CT) images to improve tumor diagnosis, staging, and response assessment. The first objective of this work was to show that CT image features extracted from pre-treatment NSCLC tumors could be used to predict tumor shrinkage in response to therapy. This is important since tumor shrinkage is an important cancer treatment endpoint that is correlated with probability of disease progression and overall survival. Accurate prediction of tumor shrinkage could also lead to individually customized treatment plans. To accomplish this objective, 64 stage NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. Quantitative image features were extracted and principal component regression with simulated annealing subset selection was used to predict shrinkage. Cross validation and permutation tests were used to validate the results. The optimal model gave a strong correlation between the observed and predicted shrinkages with . The second objective of this work was to identify sets of NSCLC CT image features that are reproducible, non-redundant, and informative across multiple machines. Feature sets with these qualities are needed for NSCLC radiomics models to be robust to machine variation and spurious correlation. To accomplish this objective, test-retest CT image pairs were obtained from 56 NSCLC patients imaged on three CT machines from two institutions. For each machine, quantitative image features with concordance correlation coefficient values greater than 0.90 were considered reproducible. Multi-machine reproducible feature sets were created by taking the intersection of individual machine reproducible feature sets. Redundant features were removed through hierarchical clustering. 
The findings showed that image feature reproducibility and redundancy depended on both the CT machine and the CT image type (average cine 4D-CT imaging vs. end-exhale cine 4D-CT imaging vs. helical inspiratory breath-hold 3D CT). For each image type, a set of cross-machine reproducible, non-redundant, and informative image features was identified. Compared to end-exhale 4D-CT and breath-hold 3D-CT, average 4D-CT derived image features showed superior multi-machine reproducibility and are the best candidates for clinical correlation.
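The reproducibility filter described in this abstract can be sketched as follows. This is an illustration only, assuming Lin's concordance correlation coefficient computed with population (1/n) variances; the function names and the dictionary layout (feature name mapped to a list of per-patient values) are hypothetical and are not taken from the thesis.

```python
def concordance_cc(x, y):
    """Lin's concordance correlation coefficient for paired measurements."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("CCC is undefined for identical constant sequences")
    return 2 * cov_xy / denominator

def reproducible_features(test, retest, threshold=0.90):
    """Names of features whose test-retest CCC exceeds the threshold."""
    return {name for name in test
            if concordance_cc(test[name], retest[name]) > threshold}

def multi_machine_reproducible(per_machine_sets):
    """Intersection of the per-machine reproducible feature sets."""
    return set.intersection(*per_machine_sets)

# Hypothetical test-retest values for two features on one machine.
test_scan = {"volume": [1.0, 2.0, 3.0, 4.0], "entropy": [1.0, 2.0, 3.0, 4.0]}
retest_scan = {"volume": [1.1, 2.0, 2.9, 4.1], "entropy": [4.0, 1.0, 3.0, 2.0]}
kept = reproducible_features(test_scan, retest_scan)
```

Unlike the Pearson correlation, the CCC penalizes a systematic offset between test and retest, which is why it suits a reproducibility check. Redundancy removal by hierarchical clustering would be a further step applied to the surviving features.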

Relevance:

80.00%

Abstract:

Hierarchical clustering. Taxonomic assignment of reads was performed using a preexisting database of SSU rDNA sequences from including XXX reference sequences generated by Sanger sequencing. Experimental amplicons (reads), sorted by abundance, were then concatenated with the extracted reference sequences sorted by decreasing length. All sequences, experimental and reference, were then clustered to 85% identity using the global alignment clustering option of the uclust module from the usearch v4.0 software (Edgar, 2010). Each 85% cluster was then reclustered at a higher stringency level (86%) and so on (87%, 88%, ...) in a hierarchical manner up to 100% similarity. Each experimental sequence was then identified by the list of clusters to which it belonged at the 85% to 100% levels. This information can be viewed as a matrix with the rows corresponding to different sequences and the columns corresponding to the cluster membership at each clustering level. Taxonomic assignment for a given read was performed by first checking whether reference sequences clustered with the experimental sequence at the 100% clustering level. If this was the case, the last common taxonomic name of the reference sequence(s) within the cluster was used to assign the environmental read. If not, the same procedure was applied to clusters from 99% down to 85% similarity if necessary, until a cluster was found containing both the experimental read and reference sequence(s), in which case sequences were taxonomically assigned as described above.
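The assignment rule described above can be sketched as a walk from the most stringent clustering level downwards. This is a simplified illustration, not the authors' pipeline: the cluster membership matrix is assumed to be available as dictionaries mapping each identity level to a cluster ID, lineages are assumed to be lists of names from root to tip, and all identifiers and toy data are hypothetical.

```python
def last_common_taxon(lineages):
    """Deepest name shared by all lineages (each a list from root to tip)."""
    common = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        common.append(ranks[0])
    return common[-1] if common else None

def assign_read(read_clusters, reference_clusters, reference_lineages,
                levels=range(100, 84, -1)):
    """Return (level, taxon) for the most stringent level at which the read
    shares a cluster with at least one reference, or None if there is none."""
    for level in levels:
        cluster_id = read_clusters[level]
        hits = [ref for ref, clusters in reference_clusters.items()
                if clusters[level] == cluster_id]
        if hits:
            lineages = [reference_lineages[ref] for ref in hits]
            return level, last_common_taxon(lineages)
    return None

# Toy data: cluster IDs per identity level (85..100), all hypothetical.
all_levels = range(85, 101)
read = {lvl: ("c1" if lvl <= 97 else "c9") for lvl in all_levels}
refs = {
    "refA": {lvl: ("c1" if lvl <= 97 else "c2") for lvl in all_levels},
    "refB": {lvl: ("c1" if lvl <= 95 else "c3") for lvl in all_levels},
}
lineages = {
    "refA": ["Eukaryota", "Alveolata", "Ciliophora"],
    "refB": ["Eukaryota", "Alveolata", "Dinophyceae"],
}
result = assign_read(read, refs, lineages)
```

In this toy case the read first meets a reference (refA) at the 97% level, so it inherits that reference's full name. Had both references shared its cluster, only their last common name would be used.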

Relevance:

80.00%

Abstract:

This thesis aims to develop robust and reliable damage identification methods focused on experimental structural systems, in particular Reinforced Concrete (RC) structures externally strengthened with Fiber Reinforced Polymer (FRP) strips. The failure mode of this type of structural system is critical, since it is usually due to sudden and brittle debonding of the FRP reinforcement originating from intermediate flexural cracks. Detecting the debonding in its initial stage is therefore essential to prevent future failure, which might be catastrophic. Initially, a review of the Electro-Mechanical Impedance (EMI) method is carried out in order to expose its capabilities for local damage detection. Once the appropriate technology is selected, which includes an impedance analyzer as well as novel PZT sensors for smart monitoring, an automated procedure is designed based on the impedance signatures of several lab-scale structures. Given that impedance measurements can be captured thanks to an adequately deployed PZT sensor network, the presence of damage is estimated by analyzing the results of different damage indices obtained from the literature.
To make this process automatic, so that no a priori knowledge of the EMI method is needed to carry out an experimental test, a Graphical User Interface has been designed, turning impedance measurement into an easy and intuitive procedure. Damage is then assessed through the analysis of the corresponding damage indices, trying to estimate not only the damage severity but also its approximate location. Running these tests on any kind of structure generates large amounts of data to be processed, and sometimes the information provided by damage indices is not enough to achieve a complete analysis of the structural health condition. In most cases, some damage patterns can be found in the data, but no a priori knowledge of the health condition of the structure is available. At this point, substantial research on pattern recognition techniques was carried out, particularly on unsupervised learning techniques, finding interesting applications in the field of medicine. From this investigation, a creative and innovative idea arose: to detect and track the evolution of damage in different structures as if it were a cancer propagating through a human body. In that sense, the impedance signatures are used as intrinsic information on the health condition of the structure, so that the same clustering algorithms applied in cancer research can be applied to the problem addressed in this dissertation. Hierarchical clustering is then applied, since it also provides a graphical display of the clustered data, including quantitative and qualitative information about damage. The performance of this approach is first investigated using three lab-scale structures: a simple aluminium beam, a bolt-jointed aluminium beam and an FRP-strengthened concrete specimen.
The first one shows the performance of the method on simple single and multiple damage scenarios, so that the first conclusions can be extracted and applied to the other two experimental tests, which are designed to simulate a debonding condition on different structures. Once the impedance-based hierarchical clustering method is shown to perform well, it is applied to the structural system studied in this dissertation, RC structures externally strengthened with FRP strips, where debonding failure at the interface between the FRP and the concrete is successfully detected and classified, thus proving the feasibility of the method. Finally, as an alternative to the previous approach, a continuous monitoring procedure for the FRP-concrete interface is proposed, based on a network of FBG sensors permanently deployed within that interface. In this way, strain measurements can be obtained under controlled loading conditions and then used to implement a multi-objective model updating method solved by a multi-objective extension of the Particle Swarm Optimization (PSO) method. The feasibility of this last proposal is investigated and successfully proven on both numerical and experimental RC beams strengthened with FRP.

Relevance:

80.00%

Abstract:

We introduce a method of functionally classifying genes by using gene expression data from DNA microarray hybridization experiments. The method is based on the theory of support vector machines (SVMs). SVMs are considered a supervised computer learning method because they exploit prior knowledge of gene function to identify unknown genes of similar function from expression data. SVMs avoid several problems associated with unsupervised clustering methods, such as hierarchical clustering and self-organizing maps. SVMs have many mathematical features that make them attractive for gene expression analysis, including their flexibility in choosing a similarity function, sparseness of solution when dealing with large data sets, the ability to handle large feature spaces, and the ability to identify outliers. We test several SVMs that use different similarity metrics, as well as some other supervised learning methods, and find that the SVMs best identify sets of genes with a common function using expression data. Finally, we use SVMs to predict functional roles for uncharacterized yeast ORFs based on their expression data.

Relevance:

80.00%

Abstract:

The global amino acid compositions as deduced from the complete genomic sequences of six thermophilic archaea, two thermophilic bacteria, 17 mesophilic bacteria and two eukaryotic species were analysed by hierarchical clustering and principal components analysis. Both methods showed an influence of several factors on amino acid composition. Although GC content has a dominant effect, thermophilic species can be identified by their global amino acid compositions alone. This study presents a careful statistical analysis of factors that affect amino acid composition and also yielded specific features of the average amino acid composition of thermophilic species. Moreover, we introduce the first example of a ‘compositional tree’ of species that takes into account not only homologous proteins, but also proteins unique to particular species. We expect this simple yet novel approach to be a useful additional tool for the study of phylogeny at the genome level.
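As a rough sketch of the input such an analysis needs, a global amino acid composition can be computed by pooling residue counts over every protein of a species, whether or not the protein has homologues elsewhere. The function name, the handling of non-standard residues (ignored here), and the two toy sequences are assumptions for illustration, not details from the paper.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def global_composition(proteins):
    """Fraction of each standard amino acid over all proteins of a species."""
    counts = Counter()
    for sequence in proteins:
        # Residues outside the 20 standard letters are skipped (an assumption).
        counts.update(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Two toy "proteins" standing in for a complete proteome.
composition = global_composition(["MKK", "AAK"])
```

Each species then becomes a 20-dimensional vector, and these vectors are what hierarchical clustering or principal components analysis would operate on.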

Relevance:

80.00%

Abstract:

Thesis (Master's)--University of Washington, 2016-06

Relevance:

80.00%

Abstract:

We have used microarray gene expression profiling and machine learning to predict the presence of BRAF mutations in a panel of 61 melanoma cell lines. The BRAF gene was found to be mutated in 42 samples (69%) and intragenic mutations of the NRAS gene were detected in seven samples (11%). No cell line carried mutations of both genes. Using support vector machines, we have built a classifier that differentiates between melanoma cell lines based on BRAF mutation status. As few as 83 genes are able to discriminate between BRAF mutant and BRAF wild-type samples with clear separation observed using hierarchical clustering. Multidimensional scaling was used to visualize the relationship between a BRAF mutation signature and that of a generalized mitogen-activated protein kinase (MAPK) activation (either BRAF or NRAS mutation) in the context of the discriminating gene list. We observed that samples carrying NRAS mutations lie somewhere between those with or without BRAF mutations. These observations suggest that there are gene-specific mutation signals in addition to a common MAPK activation that result from the pleiotropic effects of either BRAF or NRAS on other signaling pathways, leading to measurably different transcriptional changes.

Relevance:

80.00%

Abstract:

Chronic alcohol exposure induces lasting behavioral changes, tolerance, and dependence. This results, at least partially, from neural adaptations at a cellular level. Previous genome-wide gene expression studies using pooled human brain samples showed that alcohol abuse causes widespread changes in the pattern of gene expression in the frontal and motor cortices of human brain. Because these studies used pooled samples, they could not determine variability between different individuals. In the present study, we profiled gene expression levels of 14 postmortem human brains (seven controls and seven alcoholic cases) using cDNA microarrays (46 448 clones per array). Both frontal cortex and motor cortex brain regions were studied. The list of genes differentially expressed confirms and extends previous studies of alcohol responsive genes. Genes identified as differentially expressed in two brain regions fell generally into similar functional groups, including metabolism, immune response, cell survival, cell communication, signal transduction and energy production. Importantly, hierarchical clustering of differentially expressed genes accurately distinguished between control and alcoholic cases, particularly in the frontal cortex.

Relevance:

80.00%

Abstract:

A combination of uni- and multiplex PCR assays targeting 58 virulence genes (VGs) associated with Escherichia coli strains causing intestinal and extraintestinal disease in humans and other mammals was used to analyze the VG repertoire of 23 commensal E. coli isolates from healthy pigs and 52 clinical isolates associated with porcine neonatal diarrhea (ND) and postweaning diarrhea (PWD). The relationship between the presence and absence of VGs was interrogated using three statistical methods. According to the generalized linear model, 17 of 58 VGs were found to be significant (P < 0.05) in distinguishing between commensal and clinical isolates. Nine of the 17 genes (iha, hlyA, aidA, east1, aah, fimH, iroN(E. coli), traT, and saa) have not been previously identified as important VGs in clinical porcine isolates in Australia. The remaining eight VGs code for fimbriae (F4, F5, F18, and F41) and toxins (STa, STh, LT, and Stx2), normally associated with porcine enterotoxigenic E. coli. Agglomerative hierarchical algorithm analysis grouped E. coli strains into subclusters based primarily on their serogroup. Multivariate analyses of clonal relationships based on the 17 VGs were collapsed into two-dimensional space by principal coordinate analysis. PWD clones were distributed in two quadrants, separated from ND and commensal clones, which tended to cluster within one quadrant. Clonal subclusters within quadrants were highly correlated with serogroups. These methods of analysis provide different perspectives in our attempts to understand how commensal and clinical porcine enterotoxigenic E. coli strains have evolved and are engaged in the dynamic process of losing or acquiring VGs within the pig population.
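The abstract does not state which distance or linkage the agglomerative analysis used. As a minimal sketch, assuming Jaccard distance between presence/absence profiles and average linkage, the bottom-up merging could look like the following. The isolate names and their gene profiles are invented for illustration and are not results from the study.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - |intersection| / |union| for two sets of detected genes."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def agglomerate(profiles, n_clusters):
    """Naive average-linkage agglomerative clustering.

    profiles maps an isolate name to the set of virulence genes detected.
    Returns a sorted list of clusters, each a sorted list of isolate names.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[name] for name in sorted(profiles)]

    def linkage(c1, c2):
        # Average pairwise distance between members of the two clusters.
        total = sum(jaccard_distance(profiles[a], profiles[b])
                    for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)

# Hypothetical presence/absence profiles (gene names taken from the abstract).
profiles = {
    "pwd_1": {"F18", "Stx2", "STa"},
    "pwd_2": {"F18", "Stx2"},
    "nd_1": {"F4", "LT", "STa"},
    "nd_2": {"F4", "LT"},
    "commensal_1": {"fimH"},
}
groups = agglomerate(profiles, 3)
```

This naive version recomputes every linkage at each step, which is fine for 75 isolates but scales poorly. Recording the merge distances as well would give the heights needed to draw a dendrogram.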