971 results for Genetic clustering analysis


Relevance:

40.00% 40.00%

Publisher:

Abstract:

Dyslexia is one of the most common childhood disorders, with a prevalence of around 5-10% in school-age children. Although an important genetic component is known to have a role in the aetiology of dyslexia, we are far from understanding the molecular mechanisms leading to the disorder. Several candidate genes have been implicated in dyslexia, including DYX1C1, DCDC2, KIAA0319, and the MRPL19/C2ORF3 locus, each with reports of both positive replications and non-replications. We generated a European cross-linguistic sample of school-age children (the NeuroDys cohort) that includes more than 900 individuals with dyslexia, sampled with homogeneous inclusion criteria across eight European countries, and a comparable number of controls. Here, we describe association analysis of the dyslexia candidate genes/locus in the NeuroDys cohort. We performed both case-control and quantitative association analyses of single markers and haplotypes previously reported to be dyslexia-associated. Although we observed association signals in samples from single countries, we did not find any marker or haplotype that was significantly associated with either case-control status or quantitative measurements of word-reading or spelling in the meta-analysis of all eight countries combined. As in other neurocognitive disorders, our findings underline the need for larger sample sizes to validate possibly weak genetic effects. © 2014 Macmillan Publishers Limited All rights reserved.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

A new distance function for comparing arbitrary partitions is proposed. Clustering of image collections and image segmentation provide the objects to be matched. The proposed metric is intended to combine visual features and metadata analysis in order to bridge the semantic gap between low-level visual features and high-level human concepts.
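
The abstract does not define the proposed distance, so the sketch below is not the authors' metric. It shows one standard way to compare two arbitrary partitions of the same items, the variation of information, which is zero exactly when the partitions agree up to relabeling. The function name and the label-list encoding are illustrative assumptions.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """Variation of information between two partitions of the same items.

    Each partition is given as a list of cluster labels, one per item.
    This is a standard partition distance used here as a stand-in; it is
    NOT the distance function proposed in the abstract.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # VI = H(A|B) + H(B|A), accumulated over the non-empty joint cells.
        vi -= p_ab * (math.log(p_ab / (count_a[a] / n))
                      + math.log(p_ab / (count_b[b] / n)))
    return vi
```

Because only co-membership matters, renaming the clusters of either partition leaves the value unchanged.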

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper proposes a method for clustering complex systems and processes based on a genetic algorithm. Aspects of its implementation and the design of the fitness function are considered. The method is applied to clustering the regions of Ukraine by socio-economic indicators, and the results are compared with those of classical methods.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

To carry out their specific roles in the cell, genes and gene products often work together in groups, forming many relationships among themselves and with other molecules. Such relationships include physical protein-protein interaction relationships, regulatory relationships, metabolic relationships, genetic relationships, and much more. With advances in science and technology, some high throughput technologies have been developed to simultaneously detect tens of thousands of pairwise protein-protein interactions and protein-DNA interactions. However, the data generated by high throughput methods are prone to noise. Furthermore, the technology itself has its limitations, and cannot detect all kinds of relationships between genes and their products. Thus there is a pressing need to investigate all kinds of relationships and their roles in a living system using bioinformatic approaches; this is a central challenge in Computational Biology and Systems Biology. This dissertation focuses on exploring relationships between genes and gene products using bioinformatic approaches. Specifically, we consider problems related to regulatory relationships, protein-protein interactions, and semantic relationships between genes. A regulatory element is an important pattern or "signal", often located in the promoter of a gene, which is used in the process of turning a gene "on" or "off". Predicting regulatory elements is a key step in exploring the regulatory relationships between genes and gene products. In this dissertation, we consider the problem of improving the prediction of regulatory elements by using comparative genomics data. With regard to protein-protein interactions, we have developed bioinformatics techniques to estimate support for the data on these interactions.
While protein-protein interactions and regulatory relationships can be detected by high throughput biological techniques, there is another type of relationship, called a semantic relationship, that cannot be detected by a single technique but can be inferred using multiple sources of biological data. The contributions of this thesis involved the development and application of a set of bioinformatic approaches that address the challenges mentioned above. These included (i) an EM-based algorithm that improves the prediction of regulatory elements using comparative genomics data, (ii) an approach for estimating the support of protein-protein interaction data, with application to functional annotation of genes, (iii) a novel method for inferring functional networks of genes, and (iv) techniques for clustering genes using multi-source data.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

I thank George Pandarakalam for research assistance; Hans-Jörg Rheinberger for hosting my stay at the Max Planck Institute for History of Science, Berlin; and Sahotra Sarkar and referees of this journal for offering detailed comments. Funded by the Wellcome Trust (WT098764MA).

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Mitotic genome instability can occur during the repair of double-strand breaks (DSBs) in DNA, which arise from endogenous and exogenous sources. Studying the mechanisms of DNA repair in the budding yeast, Saccharomyces cerevisiae, has shown that Homologous Recombination (HR) is a vital repair mechanism for DSBs. HR can result in a crossover event, in which the broken molecule reciprocally exchanges information with a homologous repair template. The current model of double-strand break repair (DSBR) also allows for a tract of information to non-reciprocally transfer from the template molecule to the broken molecule. These “gene conversion” events can vary in size and can occur in conjunction with a crossover event or in isolation. The frequency and size of gene conversions in isolation and of gene conversions associated with crossing over have been a source of debate due to the variation in systems used to detect gene conversions and the context in which the gene conversions are measured.

In Chapter 2, I use an unbiased system that measures the frequency and size of gene conversion events, as well as the association of gene conversion events with crossing over between homologs in diploid yeast. We show that mitotic gene conversions occur at a rate of 1.3 × 10⁻⁶ per cell division, are either large (median 54.0 kb) or small (median 6.4 kb), and are associated with crossing over 43% of the time.

DSBs can arise from endogenous cellular processes such as replication and transcription. Two important RNA/DNA hybrids are involved in replication and transcription: R-loops, which form when an RNA transcript base pairs with the DNA template and displaces the non-template DNA strand, and ribonucleotides embedded into DNA (rNMPs), which arise when replicative polymerases erroneously insert ribonucleotide instead of deoxyribonucleotide triphosphates. RNaseH1 (encoded by RNH1) and RNaseH2 (whose catalytic subunit is encoded by RNH201) both recognize and degrade the RNA within R-loops, while RNaseH2 alone recognizes, nicks, and initiates removal of rNMPs embedded into DNA. Due to their redundant abilities to act on RNA:DNA hybrids, aberrant removal of rNMPs from DNA has been thought to lead to genome instability in an rnh201Δ background.

In Chapter 3, I characterize (1) non-selective genome-wide homologous recombination events and (2) crossing over on chromosome IV in mutants defective in RNaseH1, RNaseH2, or RNaseH1 and RNaseH2. Using a mutant DNA polymerase that incorporates 4-fold fewer rNMPs than wild type, I demonstrate that the primary recombinogenic lesion in the RNaseH2-defective genome is not rNMPs, but rather R-loops. This work suggests different in vivo roles for RNaseH1 and RNaseH2 in resolving R-loops in yeast and is consistent with R-loops, not rNMPs, being the likely source of pathology in Aicardi-Goutières Syndrome patients defective in RNaseH2.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The genomes of many strains of baker’s yeast, Saccharomyces cerevisiae, contain multiple repeats of the copper-binding protein Cup1. Cup1 is a member of the metallothionein family, and is found in a tandem array on chromosome VIII. In this thesis, I describe studies that characterized these tandem arrays and their mechanism of formation across diverse strains of yeast. I show that CUP1 arrays are an illuminating model system for observing recombination in eukaryotes, and describe insights derived from these observations.

In our first study, we analyzed 101 natural isolates of S. cerevisiae in order to examine the diversity of CUP1-containing repeats across different strains. We identified five distinct classes of repeats that contain CUP1. We also showed that some strains have only a single copy of CUP1. By comparing the sequences of all the strains, we were able to elucidate the mechanism of formation of the CUP1 tandem arrays, which involved unequal non-homologous recombination events starting from a strain that had only a single CUP1 gene. Our observation of CUP1 repeat formation allows more general insights about the formation of tandem repeats from single-copy genes in eukaryotes, which is one of the most important mechanisms by which organisms evolve.

In our second study, we delved deeper into our mechanistic investigations by measuring the relative rates of inter-homolog and intra-/inter-sister chromatid recombination in CUP1 tandem arrays. We used a diploid strain that is heterozygous both for insertion of a selectable marker (URA3) inside the tandem array, and also for markers at either end of the array. The intra-/inter-sister chromatid recombination rate turned out to be more than ten-fold greater than the inter-homolog rate. Moreover, we found that loss of the proteins Rad51 and Rad52, which are required for most inter-homolog recombination, did not greatly reduce recombination in the CUP1 tandem repeats. Additionally, we investigated the effects of elevated copper levels on the rate of each type of recombination at the CUP1 locus. Both types of recombination are increased at high concentrations of copper (as is known to be the case for CUP1 transcription). Furthermore, the inter-homolog recombination rate at the CUP1 locus is higher than the average over the genome during mitosis, but is lower than the average during meiosis.

The research described in Chapter 2 was published in 2014.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

To define specific pathways important in the multistep transformation process of normal plasma cells (PCs) to monoclonal gammopathy of uncertain significance (MGUS) and multiple myeloma (MM), we have applied microarray analysis to PCs from 5 healthy donors (N), 7 patients with MGUS, and 24 patients with newly diagnosed MM. Unsupervised hierarchical clustering using 125 genes with a large variation across all samples defined 2 groups: N and MGUS/MM. Supervised analysis identified 263 genes differentially expressed between N and MGUS and 380 genes differentially expressed between N and MM, 197 of which were also differentially regulated between N and MGUS. Only 74 genes were differentially expressed between MGUS and MM samples, indicating that the differences between MGUS and MM are smaller than those between N and MM or N and MGUS. Differentially expressed genes included oncogenes/tumor-suppressor genes (LAF4, RB1, and disabled homolog 2), cell-signaling genes (RAS family members, B-cell signaling and NF-kappaB genes), DNA-binding and transcription-factor genes (XBP1, zinc finger proteins, forkhead box, and ring finger proteins), and developmental genes (WNT and SHH pathways). Understanding the molecular pathogenesis of MM by gene expression profiling has demonstrated sequential genetic changes from N to malignant PCs and highlighted important pathways involved in the transformation of MGUS to MM.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper examines the use of trajectory distance measures and clustering techniques to define normal and abnormal trajectories in the context of pedestrian tracking in public spaces. In order to detect abnormal trajectories, what is meant by a normal trajectory in a given scene is first defined. Then every trajectory that deviates from this normality is classified as abnormal. By combining Dynamic Time Warping and a modified K-Means algorithm for arbitrary-length data series, we have developed an algorithm for trajectory clustering and abnormality detection. The final system performs with an overall accuracy of 83% and 75% when tested on two different standard datasets.
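
As a rough illustration of the distance measure named in this abstract, the sketch below computes the classic Dynamic Time Warping distance between two 2-D trajectories of possibly different lengths. The abstract does not describe the modified K-Means step or the decision rule, so `is_abnormal`, its `prototypes` argument, and its `threshold` are hypothetical stand-ins rather than the authors' method.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two trajectories given as lists of (x, y)."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])  # Euclidean point distance
            cost[i][j] = step + min(cost[i - 1][j],      # repeat b[j-1]
                                    cost[i][j - 1],      # repeat a[i-1]
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def is_abnormal(trajectory, prototypes, threshold):
    """Hypothetical rule: abnormal if far (in DTW terms) from every prototype.

    `prototypes` stands in for cluster representatives of normal trajectories;
    `threshold` is an assumed, scene-specific tuning parameter.
    """
    return min(dtw_distance(trajectory, p) for p in prototypes) > threshold
```

DTW suits this setting because pedestrians following the same path at different speeds produce series of different lengths, which a point-by-point Euclidean comparison cannot handle.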

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This paper introduces a new stochastic clustering methodology devised for the analysis of categorized or sorted data. The methodology reveals consumers' common category knowledge as well as individual differences in using this knowledge for classifying brands in a designated product class. A small study involving the categorization of 28 brands of U.S. automobiles is presented where the results of the proposed methodology are compared with those obtained from KMEANS clustering. Finally, directions for future research are discussed.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Cauliflower (Brassica oleracea var. botrytis) is a vernalization-responsive crop. High ambient temperatures delay harvest time. Elucidating the genetic regulation of floral transition is of great interest for precise harvest scheduling and for ensuring a stable market supply. This study aims at genetic dissection of temperature-dependent curd induction in cauliflower by genome-wide association studies and gene expression analysis. To assess temperature-dependent curd induction, two greenhouse trials under distinct temperature regimes were conducted on a diversity panel consisting of 111 cauliflower commercial parent lines, genotyped with 14,385 SNPs. Broad phenotypic variation and high heritability (0.93) were observed for temperature-related curd induction within the cauliflower population. GWA mapping identified a total of 18 QTL localized on chromosomes O1, O2, O3, O4, O6, O8, and O9 for curding time under two distinct temperature regimes. Among those, several QTL are localized within regions of promising candidate flowering genes. Inference of population structure and genetic relatedness among the diversity set identified three main genetic clusters. Linkage disequilibrium (LD) patterns gave an estimated global LD extent of r² = 0.06 and a maximum physical distance of 400 kb for genetic linkage. Transcriptional profiling of flowering genes FLOWERING LOCUS C (BoFLC) and VERNALIZATION 2 (BoVRN2) was performed, showing increased expression levels of BoVRN2 in genotypes with faster curding. However, the functional relevance of BoVRN2 and BoFLC2 could not be consistently supported, which suggests that they may act facultatively and/or points to BoVRN2/BoFLC-independent mechanisms in temperature-regulated floral transition in cauliflower. Genetic insights into temperature-regulated curd induction can underpin genetically informed phenology models and benefit molecular breeding strategies toward the development of thermo-tolerant cultivars.
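
For readers unfamiliar with the r² statistic quoted above, the sketch below computes it for a single pair of biallelic loci from phased haplotypes, using r² = D² / (p_A(1 − p_A) p_B(1 − p_B)) with D = p_AB − p_A p_B. It is a minimal textbook illustration; the 0/1 allele coding and the function name are assumptions, and the abstract does not say how the study estimated LD from its SNP genotypes.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci.

    `haplotypes` is a list of (allele_at_locus_1, allele_at_locus_2) pairs,
    with alleles coded 0/1 (an assumed encoding for this illustration).
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n   # frequency of allele 1 at locus 1
    p_b = sum(h[1] for h in haplotypes) / n   # frequency of allele 1 at locus 2
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")

    d = p_ab - p_a * p_b                      # coefficient of disequilibrium D
    return d * d / denom
```

A value of 1 means the two loci are perfectly correlated in the sample, and 0 means their alleles are statistically independent.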

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by deficits in social communication/interaction and by unusual repetitive and restricted behaviors and interests. ASD often co-occurs in the same families with other neuropsychiatric diseases (NPD), such as intellectual disability, schizophrenia, epilepsy, depression and attention deficit hyperactivity disorder. Genetic factors have an important role in ASD etiology. Multiple copy number variants (CNVs) and single nucleotide variants (SNVs) in candidate genes have been associated with an increased risk of developing ASD. Nevertheless, recent heritability estimates and the high genotypic and phenotypic heterogeneity characteristic of ASD indicate a role of environmental and epigenetic factors, such as long noncoding RNA (lncRNA) and microRNA (miRNA), as modulators of gene expression and subsequent clinical presentation. Both miRNA and lncRNA are functional RNA molecules that are transcribed from DNA but not translated into proteins; instead, they act as powerful regulators of gene expression. While miRNAs are small noncoding RNAs of 22-25 nucleotides in length that act at the post-transcriptional level of gene expression, lncRNAs are larger molecules (>200 nucleotides in length) that are capped, spliced, and polyadenylated, similar to messenger RNA. Although few lncRNAs have been well characterized to date, there is substantial evidence that they are involved at several levels of gene expression (transcription/post-transcription/post-translation, organization of protein complexes, cell-cell signaling as well as recombination) as shown in figure 1.