903 results for Human genome - Theses
Abstract:
Olfactory receptor (OR) genes represent ≈1% of genomic coding sequence in mammals, and these genes are clustered on multiple chromosomes in both the mouse and human genomes. We have taken a comparative genomics approach to identify features that may be involved in the dynamic evolution of this gene family and in the transcriptional control that results in a single OR gene expressed per olfactory neuron. We sequenced ≈350 kb of the murine P2 OR cluster and used synteny, gene linkage, and phylogenetic analysis to identify and sequence ≈111 kb of an orthologous cluster in the human genome. In total, 18 mouse and 8 human OR genes were identified, including 7 orthologs that appear to be functional in both species. Noncoding homology is evident between orthologs and generally is confined within the transcriptional unit. We find no evidence for common regulatory features shared among paralogs, and promoter regions generally do not contain strong promoter motifs. We discuss these observations, as well as OR clustering, in the context of evolutionary expansion and transcriptional regulation of OR repertoires.
Abstract:
The nucleotide sequence of the human alpha-albumin gene, including 887 bp of the 5'-flanking region and 1311 bp of the 3'-flanking region (24,454 bp in total), was determined from three overlapping lambda phage clones. The sequence spans 22,256 bp from the cap site to the polyadenylylation site, revealing a gene structure of 15 exons separated by 14 introns. The methionine initiation codon ATG is within exon 1; the termination codon TGA is within exon 14. Exon 15 is entirely untranslated and contains the polyadenylylation signal AATAAA. The deduced polypeptide chain is composed of a 21-amino-acid leader peptide, followed by 578 amino acids of the mature protein. There are seven repetitive DNA elements (Alu and Kpn) in the introns and 3'-flanking region. The sizes of the 15 alpha-albumin exons match closely those of the albumin, alpha-fetoprotein, and vitamin D-binding protein genes. The exons are symmetrically placed within the three domains of the individual proteins, and they share a characteristic codon splitting pattern that is conserved among members of the gene family. The results provide strong evidence that alpha-albumin belongs to, and most likely completes, the serum albumin gene family. Based on structural similarity, alpha-albumin appears to be most closely related to alpha-fetoprotein. The complete structure of this family of four tandemly linked genes provides a well-characterized approximately 200 kb locus in the 4q subcentromeric region of the human genome.
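The lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below only restates figures from the abstract; the variable names are illustrative.

```python
# Figures taken from the abstract; variable names are illustrative only.
five_prime_flank_bp = 887      # 5'-flanking region
transcribed_bp = 22_256        # cap site to polyadenylylation site
three_prime_flank_bp = 1_311   # 3'-flanking region

total_bp = five_prime_flank_bp + transcribed_bp + three_prime_flank_bp

exons = 15
introns = exons - 1            # n exons are separated by n - 1 introns

leader_aa = 21
mature_aa = 578
precursor_aa = leader_aa + mature_aa  # full deduced polypeptide

print(total_bp, introns, precursor_aa)
```

The flanking and transcribed lengths sum to the 24,454 bp reported as the total sequenced length.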
Abstract:
We have constructed a physical map of human chromosome 22q using bacterial artificial chromosome (BAC) clones. The map consists of 613 chromosome 22-specific BAC clones that have been localized and assembled into contigs using 452 landmarks, 346 of which were previously ordered and mapped to specific regions of the q arm of the chromosome by means of chromosome 22-specific yeast artificial chromosome clones. The BAC-based map provides immediate access to clones that are stable and convenient for direct genome analysis. The approach to rapidly developing marker-specific BAC contigs is relatively straightforward and can be extended to generate scaffold BAC contig maps of the rest of the chromosomes. These contigs will provide substrates for sequencing the entire human genome. We discuss how to efficiently close contig gaps using the end sequences of BAC clone inserts.
Abstract:
Human endogenous retroviruses (HERVs) are very likely footprints of ancient germ-cell infections. HERV sequences encompass about 1% of the human genome. HERVs have retained the potential of other retroelements to retrotranspose and thus to change genomic structure and function. The genomes of almost all HERV families are highly defective. Recent progress has allowed the identification of the biologically most active family, HTDV/HERV-K, which codes for viral proteins and particles and is highly expressed in germ-cell tumors. The demonstrable and potential roles of HTDV/HERV-K as well as of other human elements in disease and in maintaining genome plasticity are illustrated.
Abstract:
Plectin, a 500-kDa intermediate filament binding protein, has been proposed to provide mechanical strength to cells and tissues by acting as a cross-linking element of the cytoskeleton. To set the basis for future studies on gene regulation, tissue-specific expression, and pathological conditions involving this protein, we have cloned the human plectin gene, determined its coding sequence, and established its genomic organization. The coding sequence contains 32 exons that extend over 32 kb of the human genome. Most of the introns reside within a region encoding the globular N-terminal domain of the molecule, whereas the entire central rod domain and the entire C-terminal globular domain were found to be encoded by single exons of remarkable length, >3 kb and >6 kb, respectively. Overall, the organization of the human plectin gene was strikingly similar to that of human bullous pemphigoid antigen 1 (BPAG1), confirming that both proteins belong to the same gene family. Comparison of the deduced protein sequences for human and rat plectin revealed that they were 93% identical. By using fluorescence in situ hybridization, we have mapped the plectin gene to the long arm of chromosome 8 within the telomeric region. This gene locus (8q24) has previously been implicated in the human blistering skin disease epidermolysis bullosa simplex Ogna. Detailed knowledge of the structure of the plectin gene and its chromosome localization will aid in the elucidation of whether this or any other pathological conditions are linked to alterations in the plectin gene.
Abstract:
The development of a highly reliable physical map with landmark sites spaced an average of 100 kbp apart has been a central goal of the Human Genome Project. We have approached the physical mapping of human chromosome 11 with this goal as a primary target. We have focused on strategies that would utilize yeast artificial chromosome (YAC) technology, thus permitting long-range coverage of hundreds of kilobases of genomic DNA, yet we sought to minimize the ambiguities inherent in the use of this technology, particularly the occurrence of chimeric genomic DNA clones. This was achieved through the development of a chromosome 11-specific YAC library from a human somatic cell hybrid line that has retained chromosome 11 as its sole human component. To maximize the efficiency of YAC contig assembly and extension, we have employed an Alu-PCR-based hybridization screening system. This system eliminates many of the more costly and time-consuming steps associated with sequence tagged site content mapping, such as sequencing, primer production, and hierarchical screening, resulting in greater efficiency with increased throughput and reduced cost. Using these approaches, we have achieved YAC coverage for >90% of human chromosome 11, with an average intermarker distance of <100 kbp. Cytogenetic localization has been determined for each contig by fluorescent in situ hybridization and/or sequence tagged site content. The YAC contigs that we have generated should provide a robust framework to move forward to sequence-ready templates for the sequencing efforts of the Human Genome Project as well as more focused positional cloning on chromosome 11.
Abstract:
Integration of human immunodeficiency virus (HIV) DNA into the human genome requires the virus-encoded integrase (IN) protein, and therefore the IN protein is a suitable target for antiviral strategies. To find a potent HIV IN inhibitor, we screened a "synthetic peptide combinatorial library." We identified a hexapeptide with the sequence HCKFWW that inhibits IN-mediated 3'-processing and integration with an IC50 of 2 microM. The peptide is active on IN proteins from other retroviruses such as HIV-2, feline immunodeficiency virus, and Moloney murine leukemia virus, supporting the notion that a conserved region of IN is targeted. The hexapeptide was also tested in the disintegration reaction. This phosphoryl-transfer reaction can be carried out by the catalytic core of IN alone, and the peptide HCKFWW was found to inhibit this reaction, suggesting that the hexapeptide acts at or near the catalytic site of IN. Identification of an IN hexapeptide inhibitor provides proof of concept for the approach, and, moreover, this peptide may be useful for structure-function analysis of IN.
Abstract:
Many features of Down syndrome might result from the overdosage of only a few genes located in a critical region of chromosome 21. To search for these genes, cosmids mapping in this region were isolated and used for trapping exons. One of the trapped exons obtained has a sequence very similar to part of the Drosophila single-minded (sim) gene, a master regulator of the early development of the fly central nervous system midline. Mapping data indicated that this exonic sequence is only present in the Down syndrome-critical region in the human genome. Hybridization of this exonic sequence with human fetal kidney poly(A)+ RNA revealed two transcripts of 6 and 4.3 kb. In situ hybridization of a probe derived from this exon with human and rat fetuses showed that the corresponding gene is expressed during early fetal life in the central nervous system and in other tissues, including the facial, skull, palate, and vertebra primordia. The expression pattern of this gene suggests that it might be involved in the pathogenesis of some of the morphological features and brain anomalies observed in Down syndrome.
Abstract:
We report a general mass spectrometric approach for the rapid identification and characterization of proteins isolated by preparative two-dimensional polyacrylamide gel electrophoresis. This method possesses the inherent power to detect and structurally characterize covalent modifications. Absolute sensitivities of matrix-assisted laser desorption ionization and high-energy collision-induced dissociation tandem mass spectrometry are exploited to determine the mass and sequence of subpicomole sample quantities of tryptic peptides. These data permit mass matching and sequence homology searching of computerized peptide mass and protein sequence data bases for known proteins and design of oligonucleotide probes for cloning unknown proteins. We have identified 11 proteins in lysates of human A375 melanoma cells, including: alpha-enolase, cytokeratin, stathmin, protein disulfide isomerase, tropomyosin, Cu/Zn superoxide dismutase, nucleoside diphosphate kinase A, galaptin, and triosephosphate isomerase. We have characterized several posttranslational modifications and chemical modifications that may result from electrophoresis or subsequent sample processing steps. Detection of comigrating and covalently modified proteins illustrates the necessity of peptide sequencing and the advantages of tandem mass spectrometry to reliably and unambiguously establish the identity of each protein. This technology paves the way for studies of cell-type dependent gene expression and studies of large suites of cellular proteins with unprecedented speed and rigor to provide information complementary to the ongoing Human Genome Project.
Abstract:
The human genome contains many repeated DNA sequences that vary in complexity of repeating unit from a single nucleotide to a whole gene. The repeat sequences can be widely dispersed or in simple tandem arrays. Arrays of up to 5 or 6 nt are known as simple tandem repeats, and these are widely dispersed and highly polymorphic. Members of one group of the simple tandem repeats, the trinucleotide repeats, can undergo an increase in copy number by a process of dynamic mutation. Dynamic mutations of the CCG trinucleotide give rise to one group of fragile sites on human chromosomes, the rare folate-sensitive group. One member of this group, the fragile X (FRAXA) is responsible for the most common familial form of mental retardation. Another member of the group FRAXE is responsible for a rarer mild form of mental retardation. Similar mutations of AGC repeats give rise to a number of neurological disorders. The expanded repeats are unstable between generations and somatically. The intergenerational instability gives rise to unusual patterns of inheritance--particularly anticipation, the increasing severity and/or earlier age of onset of the disorder in successive generations. Dynamic mutations have been found only in the human species, and possible reasons for this are considered. The mechanism of dynamic mutation is discussed, and a number of observations of simple tandem repeat mutation that could assist in understanding this phenomenon are commented on.
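The copy number of a simple tandem repeat, the quantity that increases in a dynamic mutation, can be measured in a sequence string as sketched below. This is a minimal illustration, not a method from the abstract: the function name and the example sequence are hypothetical, and the repeat is matched in one fixed frame only (a CCG tract could equally be read as GCC or CGG).

```python
import re


def longest_repeat_run(seq: str, unit: str) -> int:
    """Return the copy number of the longest uninterrupted tandem run of `unit`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    # One or more back-to-back copies of the unit, matched greedily.
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    run_lengths = [m.end() - m.start() for m in pattern.finditer(seq.upper())]
    return max(run_lengths, default=0) // len(unit)


# Hypothetical sequence: a 5-copy CCG tract and a separate 2-copy tract.
example = "AT" + "CCG" * 5 + "TT" + "CCG" * 2
print(longest_repeat_run(example, "CCG"))
```

Comparing this count between parent and offspring sequences is one simple way to express the intergenerational instability described above.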
Abstract:
A CpG island is a GC-rich motif occurring in gene promoter regions, which can play important roles in gene silencing and imprinting. Here, we present a set of discriminant functions that can recognize the structural and compositional features of CpG islands in the putative promoter regions (PPRs) of human and mouse immunoglobulin (Ig) genes. We show that the PPRs of both human and mouse Ig genes, irrespective of chromosomal localization, are apparently CpG island poor, with a low percentage of CpG islands overlapping the transcription start site (TSS). The human Ig genes that have CpG islands in the PPRs show a very narrow range of CpG densities: 47% of the Ig genes fall in the range of 3.5-4 CpGs/100 bp. In contrast, the non-Ig genes examined have a wide range of CpG island densities, with 10.5% having a density of 8.1-15 CpGs/100 bp. In addition, five patterns of CpG distribution within the CpG islands have been classified: Pat A, B, C, D, and E. 21.6% and 10.8% of the Ig genes fall into the Pat B and Pat D groups, respectively, proportions significantly higher than those of the non-Ig genes examined (8.2% and 3.8%). Moreover, CpG islands are shorter in human Ig genes than in non-Ig genes but much longer than in mouse orthologues. These findings provide a clear picture of the non-neutral and nonrandom occurrence of CpG islands in the PPRs of human and mouse Ig genes, which facilitates rational recommendations regarding their nomenclature. (C) 2005 Elsevier B.V. All rights reserved.
Abstract:
Cross-species comparative genomics is a powerful strategy for identifying functional regulatory elements within noncoding DNA. In this paper, comparative analysis of human and mouse intronic sequences in the breast cancer susceptibility gene (BRCA1) revealed two evolutionarily conserved noncoding sequences (CNS) in intron 2, 5 kb downstream of the core BRCA1 promoter. The functionality of these elements was examined using homologous-recombination-based mutagenesis of reporter gene-tagged cosmids incorporating these regions and flanking sequences from the BRCA1 locus. This showed that CNS-1 and CNS-2 have differential transcriptional regulatory activity in epithelial cell lines. Mutation of CNS-1 significantly reduced reporter gene expression to 30% of control levels. Conversely, mutation of CNS-2 increased expression to 200% of control levels. Regulation is at the level of transcription and shows promoter specificity. Both elements also specifically bind nuclear proteins in vitro. These studies demonstrate that the combination of comparative genomics and functional analysis is a successful strategy to identify novel regulatory elements and provide the first direct evidence that conserved noncoding sequences in BRCA1 regulate gene expression. (c) 2005 Elsevier Inc. All rights reserved.
Abstract:
The zebrafish golden mutation is characterized by the production of small and irregular-shaped melanin granules, resulting in a lightening of the pigmented lateral stripes of the animal. The recent positional cloning and localization of the golden gene, combined with genotype-phenotype correlations of alleles of its human orthologue (SLC24A5) in African-American and African-Caribbean populations, provide insights into the genetic and molecular basis of human skin colour. SLC24A5 promotes melanin deposition through maturation of the melanosome, highlighting the importance of ion-exchange in the function of this organelle.
Abstract:
Improvements in genomic technology, both in the increased speed and reduced cost of sequencing, have expanded the appreciation of the abundance of human genetic variation. However, the sheer amount of variation, as well as the varying type and genomic content of variation, poses a challenge in understanding the clinical consequence of a single mutation. This work uses several methodologies to interpret the observed variation in the human genome, and presents novel strategies for the prediction of allele pathogenicity.
Using the zebrafish model system as an in vivo assay of allele function, we identified a novel driver of Bardet-Biedl Syndrome (BBS) in CEP76. A combination of targeted sequencing of 785 cilia-associated genes in a cohort of BBS patients and subsequent in vivo functional assays recapitulating the human phenotype gave strong evidence for the role of CEP76 mutations in the pathology of an affected family. This portion of the work demonstrated the necessity of functional testing in validating disease-associated mutations, and added to the catalogue of known BBS disease genes.
Further study into the role of copy-number variations (CNVs) in a cohort of BBS patients showed the significant contribution of CNVs to disease pathology. Using high-density array comparative genomic hybridization (aCGH), we were able to identify pathogenic CNVs as small as several hundred bp. Dissection of constituent genes and in vivo experiments investigating epistatic interactions between affected genes allowed for an appreciation of several paradigms by which CNVs can contribute to disease. This study revealed that the contribution of CNVs to disease in BBS patients is much higher than previously expected, and demonstrated the necessity of considering CNV contribution in future (and retrospective) investigations of human genetic disease.
Finally, we used a combination of comparative genomics and in vivo complementation assays to identify second-site compensatory modification of pathogenic alleles. These pathogenic alleles, which are found compensated in other species (termed compensated pathogenic deviations [CPDs]), represent a significant fraction (3-10%) of human disease-associated alleles. In silico pathogenicity prediction algorithms, a valuable method of allele prioritization, often misrepresent these alleles as benign, leading to omission of possibly informative variants in studies of human genetic disease. We created a mathematical model that was able to predict CPDs and putative compensatory sites, and functionally showed in vivo that second-site mutation can mitigate the pathogenicity of disease alleles. Additionally, we made publicly available an in silico module for the prediction of CPDs and modifier sites.
These studies have advanced the ability to interpret the pathogenicity of multiple types of human variation, as well as made available tools for others to do so as well.
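The defining property of a compensated pathogenic deviation, a human disease allele that appears as the normal residue in another species, can be screened for with a simple alignment lookup. The sketch below is a deliberately simplified illustration and is not the mathematical model or the module described in the thesis; the function name, the species, and the residues are all hypothetical.

```python
def species_carrying_allele(
    human_reference: str,
    human_pathogenic: str,
    ortholog_residues: dict[str, str],
) -> list[str]:
    """List species whose aligned residue equals the human pathogenic residue.

    A non-empty result marks the variant as a candidate CPD: the residue is
    tolerated elsewhere, presumably because of a compensatory second site.
    """
    if human_reference == human_pathogenic:
        raise ValueError("pathogenic residue must differ from the reference")
    return sorted(
        species
        for species, residue in ortholog_residues.items()
        if residue == human_pathogenic
    )


# Hypothetical aligned column: human reference Q, disease allele R.
column = {"mouse": "R", "zebrafish": "Q", "chicken": "R"}
candidates = species_carrying_allele("Q", "R", column)
print(candidates)
```

A screen of this kind only flags candidates; locating the compensatory site and confirming it still requires the modelling and in vivo complementation the abstract describes.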
Abstract:
Glaucoma is a heterogeneous group of diseases characterized by apoptosis of retinal ganglion cells and progressive degeneration of the optic nerve. It is the leading cause of irreversible blindness, affecting about 60 million people worldwide. Its most common form is open-angle glaucoma (OAG), a polygenic disorder caused mainly by genetic predisposition interacting with other risk factors such as age and elevated intraocular pressure (IOP). OAG is a complex genetic disease, although some severe forms are autosomal dominant. Seventeen loci have been linked to the disease and accepted by the Human Genome Organisation (HUGO), and five genes have been identified at these loci (MYOC, OPTN, WDR36, NTF4, ASB10). Recently, genome-wide association studies have identified more than 20 common risk factors with relatively small effects. For more than 50 years, our team has studied 749 members of the large French-Canadian family CA, in which the MYOCK423E mutation causes an autosomal dominant form of OAG with a highly variable age of onset. First, it was shown that this variability in the age of onset of ocular hypertension has a substantial genetic component caused by at least one modifier gene. This modifier interacts with the primary mutation and alters the severity of glaucoma in MYOCK423E carriers. A candidate modifier gene, WDR36, was genotyped in two large families, CA and BV. Carriers of both nonsynonymous WDR36 variants and MYOCK423E in family CA showed a tendency to develop the disease at a younger age. A data-mining tool was developed to represent known disease-related information and to facilitate the prioritization of candidate genes. This tool was successfully applied to bipolar depression and glaucoma.
The next stage of the project is to complete a genome scan of family CA and to sequence the loci in order to identify the variants that modify glaucoma. Ultimately, these variants will make it possible to identify individuals whose glaucoma is likely to be more aggressive.