986 results for Genomic Regions


Relevance:

20.00%

Publisher:

Abstract:

For the ∼1% of the human genome in the ENCODE regions, only about half of the transcriptionally active regions (TARs) identified with tiling microarrays correspond to annotated exons. Here we categorize this large amount of “unannotated transcription.” We use a number of disparate features to classify the 6988 novel TARs—array expression profiles across cell lines and conditions, sequence composition, phylogenetic profiles (presence/absence of syntenic conservation across 17 species), and locations relative to genes. In the classification, we first filter out TARs with unusual sequence composition and those likely resulting from cross-hybridization. We then associate some of those remaining with proximal exons having correlated expression profiles. Finally, we cluster unclassified TARs into putative novel loci, based on similar expression and phylogenetic profiles. To encapsulate our classification, we construct a Database of Active Regions and Tools (DART.gersteinlab.org). DART has special facilities for rapidly handling and comparing many sets of TARs and their heterogeneous features, synchronizing across builds, and interfacing with other resources. Overall, we find that ∼14% of the novel TARs can be associated with known genes, while ∼21% can be clustered into ∼200 novel loci. We observe that TARs associated with genes are enriched in the potential to form structural RNAs and many novel TAR clusters are associated with nearby promoters. To benchmark our classification, we design a set of experiments for testing the connectivity of novel TARs. Overall, we find that 18 of the 46 connections tested validate by RT-PCR and four of five sequenced PCR products confirm connectivity unambiguously.
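
The association step described above (linking a novel TAR to a proximal exon with a correlated expression profile) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the input layout, and both thresholds (`max_dist`, `min_corr`) are assumptions chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def associate_tars(tars, exons, max_dist=5000, min_corr=0.8):
    """Map each TAR id to the best-correlated exon within max_dist bases.

    tars and exons are dicts: id -> (position, expression_profile).
    Both thresholds are hypothetical, not values from the abstract.
    """
    links = {}
    for tar_id, (tar_pos, tar_expr) in tars.items():
        best = None
        for exon_id, (exon_pos, exon_expr) in exons.items():
            if abs(tar_pos - exon_pos) > max_dist:
                continue  # only proximal exons are candidates
            r = pearson(tar_expr, exon_expr)
            if r >= min_corr and (best is None or r > best[1]):
                best = (exon_id, r)
        if best is not None:
            links[tar_id] = best[0]
    return links
```

TARs left unlinked by such a step would be the candidates for the subsequent clustering into putative novel loci.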

Relevance:

20.00%

Publisher:

Abstract:

Descriptors based on Molecular Interaction Fields (MIF) are highly suitable for drug discovery, but their size (thousands of variables) often limits their application in practice. Here we describe a simple and fast computational method that extracts from a MIF a handful of highly informative points (hot spots) which summarize the most relevant information. The method was specifically developed for drug discovery, is fast, and does not require human supervision, making it suitable for application to very large series of compounds. The quality of the results has been tested by running the method on the ligand structure of a large number of ligand-receptor complexes and then comparing the position of the selected hot spots with actual atoms of the receptor. As an additional test, the hot spots obtained with the novel method were used to obtain GRIND-like molecular descriptors which were compared with the original GRIND. In both cases the results show that the novel method is highly suitable for describing ligand-receptor interactions and compares favorably with other state-of-the-art methods.

Relevance:

20.00%

Publisher:

Abstract:

Background: The analysis of the promoter sequence of genes with similar expression patterns is a basic tool to annotate common regulatory elements. Multiple sequence alignments are the basis of most comparative approaches. The characterization of regulatory regions from co-expressed genes at the sequence level, however, does not yield satisfactory results on many occasions, as promoter regions of genes sharing similar expression programs often do not show nucleotide sequence conservation. Results: In a recent approach to circumvent this limitation, we proposed to align the maps of predicted transcription factors (referred to as TF-maps) instead of the nucleotide sequence of two related promoters, taking into account the label of the corresponding factor and the position in the primary sequence. We have now extended the basic algorithm to permit multiple promoter comparisons using the progressive alignment paradigm. In addition, non-collinear conservation blocks might now be identified in the resulting alignments. We have optimized the parameters of the algorithm in a small, but well-characterized collection of human-mouse-chicken-zebrafish orthologous gene promoters. Conclusion: Results in this dataset indicate that TF-map alignments are able to detect high-level regulatory conservation at the promoter and the 3'UTR gene regions, which cannot be detected by the typical sequence alignments. Three particular examples are introduced here to illustrate the power of the multiple TF-map alignments to characterize conserved regulatory elements in the absence of sequence similarity. We consider that this kind of approach can be extremely useful in the future to annotate potential transcription factor binding sites on sets of co-regulated genes from high-throughput expression experiments.

Relevance:

20.00%

Publisher:

Abstract:

The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.

Relevance:

20.00%

Publisher:

Abstract:

Statewide and Regional projected industry employment 2002 - 2012

Relevance:

20.00%

Publisher:

Abstract:

We address the problem of comparing and characterizing the promoter regions of genes with similar expression patterns. This remains a challenging problem in sequence analysis, because often the promoter regions of co-expressed genes do not show discernible sequence conservation. In our approach, thus, we have not directly compared the nucleotide sequence of promoters. Instead, we have obtained predictions of transcription factor binding sites, annotated the predicted sites with the labels of the corresponding binding factors, and aligned the resulting sequences of labels—to which we refer here as transcription factor maps (TF-maps). To obtain the global pairwise alignment of two TF-maps, we have adapted an algorithm initially developed to align restriction enzyme maps. We have optimized the parameters of the algorithm in a small, but well-curated, collection of human–mouse orthologous gene pairs. Results in this dataset, as well as in an independent much larger dataset from the CISRED database, indicate that TF-map alignments are able to uncover conserved regulatory elements, which cannot be detected by the typical sequence alignments.
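
The idea of aligning sequences of factor labels instead of nucleotides can be sketched as follows. Note the substitution: this is a plain Needleman-Wunsch style global alignment over labels only, not the restriction-map-derived algorithm the abstract refers to, and it ignores site positions and scores. The function name and the `match` and `gap` values are assumptions made for illustration.

```python
def align_tf_maps(map_a, map_b, match=2, gap=-1):
    """Globally align two sequences of TF labels.

    Returns (score, pairs), where pairs lists the (index_a, index_b)
    positions whose labels were matched. Only identical labels may pair.
    """
    n, m = len(map_a), len(map_b)
    # score[i][j]: best score aligning map_a[:i] with map_b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(score[i - 1][j], score[i][j - 1]) + gap
            if map_a[i - 1] == map_b[j - 1]:
                best = max(best, score[i - 1][j - 1] + match)
            score[i][j] = best
    # Trace back from the bottom-right corner to recover matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if (map_a[i - 1] == map_b[j - 1]
                and score[i][j] == score[i - 1][j - 1] + match):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs
```

Two promoters with no nucleotide similarity can still score well here if their predicted sites appear in the same order, which is the property the abstract exploits.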

Relevance:

20.00%

Publisher:

Abstract:

Studies of large sets of SNP data have proven to be a powerful tool in the analysis of the genetic structure of human populations. In this work, we analyze genotyping data for 2,841 SNPs in 12 Sub-Saharan African populations, including a previously unsampled region of south-eastern Africa (Mozambique). We show that robust results in a world-wide perspective can be obtained when analyzing only 1,000 SNPs. Our main results both confirm the results of previous studies and show new and interesting features in Sub-Saharan African genetic complexity. There is a strong differentiation of Nilo-Saharans, much beyond what would be expected by geography. Hunter-gatherer populations (Khoisan and Pygmies) show a clear distinctiveness with very intrinsic Pygmy (and not only Khoisan) genetic features. Populations of West Africa show an unexpected similarity to one another, possibly the result of a population expansion. Finally, we find a strong differentiation of the south-eastern Bantu population from Mozambique, which suggests an assimilation of a pre-Bantu substrate by Bantu speakers in the region.

Relevance:

20.00%

Publisher:

Abstract:

Background: Approximately 5–10% of cases of mental retardation in males are due to copy number variations (CNV) on the X chromosome. Novel technologies, such as array comparative genomic hybridization (aCGH), may help to uncover cryptic rearrangements in X-linked mental retardation (XLMR) patients. We have constructed an X-chromosome tiling path array using bacterial artificial chromosomes (BACs) and validated it using samples with cytogenetically defined copy number changes. We have studied 54 patients with idiopathic mental retardation and 20 control subjects. Results: Known genomic aberrations were reliably detected on the array and eight novel submicroscopic imbalances, likely causative for the mental retardation (MR) phenotype, were detected. Putatively pathogenic rearrangements included three deletions and five duplications (ranging from 82 kb to 1 Mb), all but two affecting genes previously known to be responsible for XLMR. Additionally, we describe different CNV regions with significantly different frequencies in XLMR and control subjects (44% vs. 20%). Conclusion: This tiling path array of the human X chromosome has proven successful for the detection and characterization of known rearrangements and novel CNVs in XLMR patients.

Relevance:

20.00%

Publisher:

Abstract:

Background: The understanding of whole genome sequences in higher eukaryotes depends to a large degree on the reliable definition of transcription units including exon/intron structures, translated open reading frames (ORFs) and flanking untranslated regions. The best currently available chicken transcript catalog is the Ensembl build based on the mappings of a relatively small number of full length cDNAs and ESTs to the genome as well as genome sequence derived in silico gene predictions. Results: We use Long Serial Analysis of Gene Expression (LongSAGE) in bursal lymphocytes and the DT40 cell line to verify the quality and completeness of the annotated transcripts. 53.6% of the more than 38,000 unique SAGE tags (unitags) match to full length bursal cDNAs, the Ensembl transcript build or the genome sequence. The majority of all matching unitags show single matches to the genome, but no matches to the genome derived Ensembl transcript build. Nevertheless, most of these tags map close to the 3' boundaries of annotated Ensembl transcripts. Conclusions: These results suggest that rather few genes are missing in the current Ensembl chicken transcript build, but that the 3' ends of many transcripts may not have been accurately predicted. The tags with no match in the transcript sequences can now be used to improve gene predictions, pinpoint the genomic location of entirely missed transcripts and optimize the accuracy of gene finder software.

Relevance:

20.00%

Publisher:

Abstract:

Copy number variants contribute extensively to inter-individual genomic differences, but little is known about their inter-population variability and diversity. In a previous study (Bosch et al., 2007; 16:2572-2582), we reported that the primate-specific gene family FAM90A, which accounts for as many as 25 members in the human reference assembly, has expanded the number of FAM90A clusters across the hominoid lineage. Here we examined the copy number variability of FAM90A genes in 260 HapMap samples of European, African, and Asian ancestry, and showed significant inter-population differences (p<0.0001). Based on the recent study of Stranger et al. (2007; 315:848-853), we also explored the correlation between copy number variability and expression levels of the FAM90A gene family. Despite the high genomic variability, we found a low correlation between FAM90A copy number and expression levels, which could be due to the action of independent trans-acting factors. Our results show that FAM90A is highly variable in copy number between individuals and between populations. However, this variability has little impact on gene expression levels, thus highlighting the importance of genomic variability for genes located in regions containing segmental duplications.

Relevance:

20.00%

Publisher:

Abstract:

Purpose: To evaluate the suitability of an improved version of an automatic segmentation method based on geodesic active regions (GAR) for segmenting cerebral vasculature with aneurysms from 3D X-ray reconstruction angiography (3DRA) and time-of-flight magnetic resonance angiography (TOF-MRA) images available in the clinical routine. Methods: Three aspects of the GAR method have been improved: execution time, robustness to variability in imaging protocols and robustness to variability in image spatial resolutions. The improved GAR was retrospectively evaluated on images from patients containing intracranial aneurysms in the area of the Circle of Willis and imaged with two modalities: 3DRA and TOF-MRA. Images were obtained from two clinical centers, each using different imaging equipment. Evaluation included qualitative and quantitative analyses of the segmentation results on 20 images from 10 patients. The gold standard was built from 660 cross-sections (33 per image) of vessels and aneurysms, manually measured by interventional neuroradiologists. GAR has also been compared to an interactive segmentation method: iso-intensity surface extraction (ISE). In addition, since patients had been imaged with the two modalities, we performed an inter-modality agreement analysis with respect to both the manual measurements and each of the two segmentation methods. Results: Both GAR and ISE differed from the gold standard within acceptable limits compared to the imaging resolution. GAR (ISE, respectively) had an average accuracy of 0.20 (0.24) mm for 3DRA and 0.27 (0.30) mm for TOF-MRA, and had a repeatability of 0.05 (0.20) mm. Compared to ISE, GAR had a lower qualitative error in the vessel region and a lower quantitative error in the aneurysm region. The repeatability of GAR was superior to manual measurements and ISE. The inter-modality agreement was similar between GAR and the manual measurements.
Conclusions: The improved GAR method outperformed ISE qualitatively as well as quantitatively and is suitable for segmenting 3DRA and TOF-MRA images from clinical routine.

Relevance:

20.00%

Publisher:

Abstract:

Retroposed genes (retrogenes) originate via the reverse transcription of mature messenger RNAs from parental source genes and are therefore usually devoid of introns. Here, we characterize a particular set of mammalian retrogenes that acquired introns upon their emergence and thus represent rare cases of intron gain in mammals. We find that although a few retrogenes evolved introns in their coding or 3' untranslated regions (UTRs), most introns originated together with untranslated exons in the 5' flanking regions of the retrogene insertion site. They emerged either de novo or through fusions with 5' UTR exons of host genes into which the retrogenes inserted. Generally, retrogenes with introns display high transcription levels and show broader spatial expression patterns than other retrogenes. Our experimental expression analyses of individual intron-containing retrogenes show that 5' UTR introns may indeed promote higher expression levels, at least in part through encoded regulatory elements. By contrast, 3' UTR introns may lead to downregulation of expression levels via nonsense-mediated decay mechanisms. Notably, the majority of retrogenes with introns in their 5' flanks depend on distant, sometimes bidirectional CpG dinucleotide-enriched promoters for their expression that may be recruited from other genes in the genomic vicinity. We thus propose a scenario where the acquisition of new 5' exon-intron structures was directly linked to the recruitment of distant promoters by these retrogenes, a process potentially facilitated by the presence of proto-splice sites in the genomic vicinity of retrogene insertion sites. Thus, the primary role and selective benefit of new 5' introns (and UTR exons) was probably initially to span the often substantial distances to potent CpG promoters driving retrogene transcription. Later in evolution, these introns then obtained additional regulatory roles in fine tuning retrogene expression levels.
Our study provides novel insights regarding mechanisms underlying the origin of new introns, the evolutionary relevance of intron gain, and the origin of new gene promoters.

Relevance:

20.00%

Publisher:

Abstract:

Splenic marginal zone lymphoma (SMZL) is an indolent B-cell lymphoproliferative disorder characterised by 7q32 deletion, but the target genes of this deletion remain unknown. In order to elucidate the genetic target of this deletion, we performed an integrative analysis of the genetic, epigenetic, transcriptomic and miRNomic data. High resolution array comparative genomic hybridization of 56 cases of SMZL delineated a minimally deleted region (2.8 Mb) at 7q32, but showed no evidence of any cryptic homozygous deletion or recurrent breakpoint in this region. Integrated transcriptomic analysis confirmed significant under-expression of a number of genes in this region in cases of SMZL with deletion, several of which showed hypermethylation. In addition, a cluster of 8 miRNA in this region showed under-expression in cases with the deletion, and three (miR-182/96/183) were also significantly under-expressed (P<0.05) in SMZL relative to other lymphomas. Genomic sequencing of these miRNA and IRF5, a strong candidate gene, did not show any evidence of somatic mutation in SMZL. These observations provide valuable guidance for further characterisation of 7q deletion.

Relevance:

20.00%

Publisher:

Abstract:

PURPOSE: To report a large deletion that encompasses more than 90% of the PRPF31 gene and two other neighboring genes in their entirety in an adRP pedigree that appears to show only the typical clinical features of retinitis pigmentosa. METHODS: To identify the PRPF31 mutation in a dominant RP family (ADRP2) previously linked to the RP11 locus, the 14 exons of PRPF31 were screened for mutations by direct sequencing. To investigate the possibility of a large deletion, microsatellite markers near the PRPF31 gene were analyzed by non-denaturing PAGE. RESULTS: Initial screening of the PRPF31 gene in the ADRP2 family did not reveal an obvious mutation. A large deletion was, however, suspected due to lack of heterozygosity for nearly all PRPF31 intragenic single nucleotide polymorphisms (SNPs). In order to estimate the size of the deletion, SNPs and microsatellite markers spanning and flanking PRPF31 were analyzed in the entire ADRP2 family. Haplotype analysis with the above markers suggested a deletion of approximately 30 kb that included the putative promoter region of a novel gene OSCAR, the entire genomic content of genes NDUFA3, TFPT and more than 90% of the PRPF31 gene. Sequence analysis of the region flanking the potential deletion showed a high presence of Alu elements, implicating Alu-mediated recombination as the mechanism responsible for this event. CONCLUSIONS: This mutation provides evidence that haploinsufficiency rather than aberrant function of mutated proteins is the cause of disease in these adRP patients with mutations in the PRPF31 gene.
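
The reasoning behind the suspected deletion is that a carrier of a heterozygous deletion has only one copy of the region, so every SNP inside it is called as homozygous. A minimal sketch of that screening idea follows; the function name and input layout are assumptions for illustration, not the authors' procedure, and a long homozygous run is only suggestive evidence that would still need confirmation such as the haplotype analysis described above.

```python
def longest_homozygous_run(genotypes):
    """Return (start, end) indices of the longest run of homozygous calls.

    genotypes is a list of two-allele tuples ordered by genomic position,
    e.g. ("A", "A") or ("A", "G"). end is exclusive; (0, 0) means no run.
    """
    best = (0, 0)
    start = None  # index where the current homozygous run began
    for i, (a1, a2) in enumerate(genotypes):
        if a1 == a2:
            if start is None:
                start = i
            if i + 1 - start > best[1] - best[0]:
                best = (start, i + 1)
        else:
            start = None  # a heterozygous call ends the run
    return best
```

Heterozygous markers on either side of the run bound the candidate interval, which is why flanking markers were typed to estimate the deletion size.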

Relevance:

20.00%

Publisher:

Abstract:

The human Me14-D12 antigen is a cell surface glycoprotein regulated by interferon-gamma (IFN-gamma) on tumor cell lines of neuroectodermal origin. It consists of two non-covalently linked subunits with apparent mol. wt sizes of 33,000 and 38,000. Here we describe the molecular cloning of a genomic probe for the Me14-D12 gene using the gene transfer approach. Mouse Ltk- cells were stably cotransfected with human genomic DNA and the Herpes Simplex virus thymidine kinase (TK) gene. Primary and secondary transfectants expressing the Me14-D12 antigen were isolated after selection in HAT medium by repeated sorting on a fluorescence activated cell sorter (FACS). A recombinant phage harboring a 14.3 kb insert of human DNA was isolated from a genomic library made from a positive secondary transfectant cell line. A specific probe derived from the phage DNA insert allowed the identification of two mRNAs of 3.5 kb and 2.2 kb in primary and secondary L cell transfectants, as well as in human melanoma cell lines expressing the Me14-D12 antigen. The regulation of Me14-D12 antigen by IFN-gamma was retained in the L cell transfectants and could be detected both at the level of protein and mRNA expression.