74 results for human genome variation
Abstract:
The geographically constrained distribution of Epstein-Barr virus (EBV)-associated nasopharyngeal carcinoma (NPC) in southeast Asian populations suggests that both viral and host genetics may influence disease risk. Although susceptibility loci have been mapped within the human genome, the role of viral genetics in the focal distribution of NPC remains an enigma. Here we report a molecular phylogenetic analysis of an NPC-associated viral oncogene, LMP1, in a large panel of EBV isolates from southeast Asia and from Papua New Guinea, Africa, and Australia, regions of the world where NPC is and is not endemic, respectively. This analysis revealed that LMP1 sequences show a distinct geographic structure, indicating that the southeast Asian isolates have evolved as a lineage distinct from those of Papua New Guinea, African, and Australian isolates. Furthermore, a likelihood ratio test revealed that the C termini of the LMP1 sequences of the southeast Asian lineage are under significant positive selection pressure, particularly at some sites within the C-terminal activator regions. We also present evidence that although the N terminus and transmembrane region of LMP1 have undergone recombination, the C-terminal region of the gene has evolved without any history of recombination. Based on these observations, we speculate that selection pressure may be driving the LMP1 sequences in virus isolates from southeast Asia towards a more malignant phenotype, thereby influencing the endemic distribution of NPC in this region.
Abstract:
The EF-hand superfamily of calcium binding proteins includes the S100, calcium binding protein, and troponin subfamilies. This study represents a genome, structure, and expression analysis of the S100 protein family, in mouse, human, and rat. We confirm the high level of conservation between mammalian sequences but show that four members, including S100A12, are present only in the human genome. We describe three new members of the S100 family in the three species and their locations within the S100 genomic clusters and propose a revised nomenclature and phylogenetic relationship between members of the EF-hand superfamily. Two of the three new genes were induced in bone-marrow-derived macrophages activated with bacterial lipopolysaccharide, suggesting a role in inflammation. Normal human and murine tissue distribution profiles indicate that some members of the family are expressed in a specific manner, whereas others are more ubiquitous. Structure-function analysis of the chemotactic properties of murine S100A8 and human S100A12, particularly within the active hinge domain, suggests that the human protein is the functional homolog of the murine protein. Strong similarities between the promoter regions of human S100A12 and murine S100A8 support this possibility. This study provides insights into the possible processes of evolution of the EF-hand protein superfamily. Evolution of the S100 proteins appears to have occurred in a modular fashion, also seen in other protein families such as the C2H2-type zinc-finger family. (C) 2004 Elsevier Inc. All rights reserved.
Abstract:
Alternative splicing is widespread in mammalian gene expression, and variant splice patterns are often specific to different stages of development, particular tissues or a disease state. There is a need to systematically collect data on alternatively spliced exons, introns and splice isoforms, and to annotate this data. The Alternative Splicing Database consortium has been addressing this need, and is committed to maintaining and developing a value-added database of alternative splice events, and of experimentally verified regulatory mechanisms that mediate splice variants. In this paper we present two of the products from this project: namely, a database of computationally delineated alternative splice events as seen in alignments of EST/cDNA sequences with genome sequences, and a database of alternatively spliced exons collected from literature. The reported splice events are from nine different organisms and are annotated for various biological features including expression states and cross-species conservation. The data are presented on our ASD web pages (http://www.ebi.ac.uk/asd).
Abstract:
Do non-coding RNAs that are derived from the introns and exons of protein-coding and non-protein-coding genes represent a fundamental advance in the genetic operating system of higher organisms? Recent evidence from comparative genomics and molecular genetics indicates that this might be the case. If so, there will be profound consequences for our understanding of the genetics of these organisms, and in particular how the trajectories of differentiation and development and the differences among individuals and species are genomically programmed. But how might this hypothesis be tested?
Abstract:
Modern toxicology investigates a wide array of both old and new health hazards. Priority setting is needed to select agents for research from the plethora of exposure circumstances. Changing societies and a growing fraction of the aged have to be taken into consideration. A precise exposure assessment is important for risk estimation and regulation. Toxicology contributes to the exploration of pathomechanisms to specify the exposure metrics for risk estimation. Combined effects of co-existing agents are not yet sufficiently understood. Animal experiments allow a separate administration of agents which cannot be disentangled by epidemiological means, but their value is limited for low exposure levels in many of today's settings. As an experimental science, toxicology has to keep pace with the rapidly growing knowledge about the language of the genome and the changing paradigms in cancer development. During the pioneer era of assembling a working draft of the human genome, toxicogenomics has been developed. Gene and pathway complexity have to be considered when investigating gene-environment interactions. For the best conduct of studies, modern toxicology needs a close liaison with many other disciplines like epidemiology and bioinformatics. (C) 2004 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Antisense transcription (transcription from the opposite strand to a protein-coding or sense strand) has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level. Global transcriptome analysis provides evidence that a large proportion of the genome can produce transcripts from both strands, and that antisense transcripts commonly link neighboring genes in complex loci into chains of linked transcriptional units. Expression profiling reveals frequent concordant regulation of sense/antisense pairs. We present experimental evidence that perturbation of an antisense RNA can alter the expression of sense messenger RNAs, suggesting that antisense transcription contributes to control of transcriptional outputs in mammals.
Abstract:
In recent years, there have been increasing numbers of transcripts identified that do not encode proteins, many of which are developmentally regulated and appear to have regulatory functions. Here, we describe the construction of a comprehensive mammalian noncoding RNA database (RNAdb) which contains over 800 unique experimentally studied noncoding RNAs (ncRNAs), including many associated with diseases and/or developmental processes. The database is available at http://research.imb.uq.edu.au/RNAdb and is searchable by many criteria. It includes microRNAs and snoRNAs, but not infrastructural RNAs, such as rRNAs and tRNAs, which are catalogued elsewhere. The database also includes over 1100 putative antisense ncRNAs and almost 20000 putative ncRNAs identified in high-quality murine and human cDNA libraries, with more to be added in the near future. Many of these RNAs are large, and many are spliced, some alternatively. The database will be useful as a foundation for the emerging field of RNomics and the characterization of the roles of ncRNAs in mammalian gene expression and regulation.
Abstract:
Hyaluronic acid (HA) is a commercially valuable medical biopolymer increasingly produced through microbial fermentation. Viscosity limits product yield and the focus of research and development has been on improving the key quality parameters, purity and molecular weight. Traditional strain and process optimisation has yielded significant improvements, but appears to have reached a limit. Metabolic engineering is providing new opportunities and HA produced in a heterologous host is about to enter the market. In order to realise the full potential of metabolic engineering, however, greater understanding of the mechanisms underlying chain termination is required.
Abstract:
The aim of the study was to perform a genetic linkage analysis for eye color using comparative data. Similarity in eye color of mono- and dizygotic twins was rated by the twins' mother, their father and/or the twins themselves. For 4748 twin pairs the similarity in eye color was available on a three-point scale (not at all alike, somewhat alike, completely alike); absolute eye color of individuals was not assessed. The probability that twins were alike for eye color was calculated as a weighted average of the different responses of all respondents at several different time points. The mean probability of being alike for eye color was 0.98 for MZ twins (2167 pairs), whereas the mean probability for DZ twins was 0.46 (2537 pairs), suggesting very high heritability for eye color. For 294 DZ twin pairs genome-wide marker data were available. The probability of being alike for eye color was regressed on the average amount of IBD sharing. We found a peak LOD-score of 2.9 at chromosome 15q, overlapping with the region recently implicated for absolute ratings of eye color in Australian twins [Zhu, G., Evans, D. M., Duffy, D. L., Montgomery, G. W., Medland, S. E., Gillespie, N. A., Ewen, K. R., Jewell, M., Liew, Y. W., Hayward, N. K., Sturm, R. A., Trent, J. M., and Martin, N. G. (2004). Twin Res. 7:197-210] and containing the OCA2 gene, which is the major candidate gene for eye color [Sturm, R. A., Teasdale, R. D., and Box, N. F. (2001). Gene 277:49-62]. Our results demonstrate that comparative measures on relatives can be used in genetic linkage analysis.
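To put the reported peak LOD score in more familiar terms, the following minimal sketch converts a LOD score into a likelihood-ratio statistic and an approximate pointwise p-value. It relies only on the definition LOD = log10(L1/L0); the function names are hypothetical, the halving of the tail probability is a common convention for one-sided linkage tests and is an assumption here, and the result is a pointwise value, not the genome-wide significance assessed in the study.

```python
import math

LN10 = math.log(10.0)


def lod_to_chisq(lod: float) -> float:
    """Convert a LOD score to a likelihood-ratio statistic.

    LOD = log10(L1 / L0), so 2 * ln(L1 / L0) = 2 * ln(10) * LOD.
    """
    return 2.0 * LN10 * lod


def chisq1_tail(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def pointwise_p(lod: float, one_sided: bool = True) -> float:
    """Approximate pointwise p-value for a LOD score.

    Assumption: with one_sided=True the tail probability is halved,
    as is often done when the tested parameter lies on a boundary.
    """
    p = chisq1_tail(lod_to_chisq(lod))
    return 0.5 * p if one_sided else p


if __name__ == "__main__":
    lod = 2.9  # peak LOD score quoted in the abstract
    print(f"LRT statistic: {lod_to_chisq(lod):.2f}")
    print(f"pointwise p (one-sided): {pointwise_p(lod):.2e}")
```

The conversion itself is exact; only the choice of reference distribution is a modelling assumption.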
Abstract:
Non-protein-coding RNAs (ncRNAs) are increasingly being recognized as having important regulatory roles. Although much recent attention has focused on tiny 22- to 25-nucleotide microRNAs, several functional ncRNAs are orders of magnitude larger in size. Examples of such macro ncRNAs include Xist and Air, which in mouse are 18 and 108 kilobases (Kb), respectively. We surveyed the 102,801 FANTOM3 mouse cDNA clones and found that Air and Xist were present not as single, full-length transcripts but as a cluster of multiple, shorter cDNAs, which were unspliced, had little coding potential, and were most likely primed from internal adenine-rich regions within longer parental transcripts. We therefore conducted a genome-wide search for regional clusters of such cDNAs to find novel macro ncRNA candidates. Sixty-six regions were identified, each of which mapped outside known protein-coding loci and which had a mean length of 92 Kb. We detected several known long ncRNAs within these regions, supporting the basic rationale of our approach. In silico analysis showed that many regions had evidence of imprinting and/or antisense transcription. These regions were significantly associated with microRNAs and transcripts from the central nervous system. We selected eight novel regions for experimental validation by northern blot and RT-PCR and found that the majority represent previously unrecognized noncoding transcripts that are at least 10 Kb in size and predominantly localized in the nucleus. Taken together, the data not only identify multiple new ncRNAs but also suggest the existence of many more macro ncRNAs like Xist and Air.
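The idea of searching for regional clusters of cDNAs can be illustrated with a simple interval-merging sketch. This is not the authors' pipeline: the function name, the `max_gap` and `min_members` parameters, and the coordinates in the usage example are all hypothetical and chosen only to show the general technique of grouping nearby mapped cDNAs into candidate regions.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) of a mapped cDNA; coordinates are illustrative
Interval = Tuple[str, int, int]


def cluster_cdnas(cdnas: List[Interval], max_gap: int, min_members: int) -> List[Dict]:
    """Group cDNAs on the same chromosome separated by at most max_gap bases.

    Returns only clusters that contain at least min_members cDNAs.
    Both thresholds are assumptions, not values from the abstract.
    """
    clusters: List[Dict] = []
    for chrom, start, end in sorted(cdnas):
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == chrom and start - last["end"] <= max_gap:
            # Extend the current cluster to cover this cDNA.
            last["end"] = max(last["end"], end)
            last["members"] += 1
        else:
            clusters.append({"chrom": chrom, "start": start, "end": end, "members": 1})
    return [c for c in clusters if c["members"] >= min_members]


if __name__ == "__main__":
    toy = [("chr7", 100, 2000), ("chr7", 2500, 4000),
           ("chr7", 50000, 51000), ("chr2", 10, 900)]
    for region in cluster_cdnas(toy, max_gap=1000, min_members=2):
        print(region)
```

A real analysis would additionally filter by strand, splicing status, coding potential and overlap with known protein-coding loci, as the abstract describes.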
Abstract:
Background: Current methods to find significantly under- and over-represented gene ontology (GO) terms in a set of genes consider the genes as equally probable balls in a bag, as may be appropriate for transcripts in micro-array data. However, due to the varying length of genes and intergenic regions, that approach is inappropriate for deciding if any GO terms are correlated with a set of genomic positions. Results: We present an algorithm - GONOME - that can determine which GO terms are significantly associated with a set of genomic positions given a genome annotated with (at least) the starts and ends of genes. We show that certain GO terms may appear to be significantly associated with a set of randomly chosen positions in the human genome if gene lengths are not considered, and that these same terms have been reported as significantly over-represented in a number of recent papers. This apparent over-representation disappears when gene lengths are considered, as GONOME does. For example, we show that, when gene length is taken into account, the term development is not significantly enriched in genes associated with human CpG islands, in contradiction to a previous report. We further demonstrate the efficacy of GONOME by showing that occurrences of the proteasome-associated control element (PACE) upstream activating sequence in the S. cerevisiae genome associate significantly with appropriate GO terms. An extension of this approach yields a whole-genome motif discovery algorithm that allows identification of many other promoter sequences linked to different types of genes, including a large group of previously unknown motifs significantly associated with the terms 'translation' and 'translational elongation'. Conclusion: GONOME is an algorithm that correctly extracts over-represented GO terms from a set of genomic positions. By explicitly considering gene size, GONOME avoids a systematic bias toward GO terms linked to large genes. Inappropriate use of existing algorithms that do not take gene size into account has led to erroneous or suspect conclusions. Reciprocally, GONOME may be used to identify new features in genomes that are significantly associated with particular categories of genes.
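The gene-length bias described above can be made concrete with a toy calculation. The sketch below is not the GONOME algorithm; it only contrasts two null models for where a random genomic position lands: every gene equally likely versus each gene weighted by its length. The function name and the toy annotation are hypothetical, and intergenic positions are ignored for simplicity.

```python
from typing import List, Set, Tuple

# (start, end, GO terms) for one gene; values below are illustrative only
Gene = Tuple[int, int, Set[str]]


def expected_hit_fraction(genes: List[Gene], term: str, by_length: bool) -> float:
    """Expected fraction of random genic hits that fall in genes carrying `term`.

    by_length=False: every gene is equally likely (the "balls in a bag" null).
    by_length=True:  a gene is hit in proportion to its length.
    """
    weights = [(end - start) if by_length else 1 for start, end, _ in genes]
    with_term = sum(w for w, (_, _, terms) in zip(weights, genes) if term in terms)
    return with_term / sum(weights)


if __name__ == "__main__":
    toy_genes = [
        (0, 100_000, {"development"}),        # one long gene
        (200_000, 205_000, {"translation"}),  # three short genes
        (300_000, 305_000, {"translation"}),
        (400_000, 410_000, {"translation"}),
    ]
    print(expected_hit_fraction(toy_genes, "development", by_length=False))
    print(expected_hit_fraction(toy_genes, "development", by_length=True))
```

In this toy annotation the term attached to the long gene is expected in a quarter of hits under the equal-probability null but in most hits under the length-weighted null, so an observed excess relative to the first null need not indicate real enrichment.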
Abstract:
Aim: Colorectal cancer has been described in association with hyperplastic polyposis but the mechanism underlying this observation is unknown. The aim of this study was to characterise foci of dysplasia developing in the polyps of subjects with hyperplastic polyposis on the basis of DNA microsatellite status and expression of the DNA mismatch repair proteins hMLH1, hMSH2, and hMSH6. Materials and methods: The material was derived from four patients with hyperplastic polyposis and between one and six synchronous colorectal cancers. Normal (four), hyperplastic (13), dysplastic (13), and malignant (11) samples were microdissected and a PCR-based approach was used to identify mutations at 10 microsatellite loci, TGF beta IIR, IGF2R, BAX, MSH3, and MSH6. Microsatellite instability-high (MSI-H) was diagnosed when 40% or more of the microsatellite loci showed mutational bandshifts. Serial sections were stained for hMLH1, hMSH2, and hMSH6. Results: DNA microsatellite instability was found in 1/13 (8%) hyperplastic samples, in 7/13 (54%) dysplastic foci, and in 8/11 (73%) cancers. None of the MSI-low (MSI-L) samples (one hyperplastic, three dysplastic, two cancers) showed loss of hMLH1 expression. All four MSI-H dysplastic foci and six MSI-H cancers showed loss of hMLH1 expression. Loss of hMLH1 in MSI-H but not in MSI-L lesions showing dysplasia or cancer was significant (p < 0.001, Fisher's exact test). Loss of hMSH6 occurred in one MSI-H cancer and one MSS focus of dysplasia which also showed loss of hMLH1 staining. Conclusion: Neoplastic changes in hyperplastic polyposis may occur within a hyperplastic polyp. Neoplasia may be driven by DNA instability that is present to a low (MSI-L) or high (MSI-H) degree. MSI-H but not MSI-L dysplastic foci are associated with loss of hMLH1 expression. At least two mutator pathways drive neoplasia in hyperplastic polyposis. The role of the hyperplastic polyp in the histogenesis of sporadic DNA microsatellite unstable colorectal cancer should be examined.
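The significance statement can be checked with a small worked example. The sketch below applies the 40% threshold for MSI-H and runs a two-sided Fisher's exact test on a 2x2 table. The table is one reading of the counts in the abstract and is an assumption: 10 MSI-H dysplastic or malignant lesions all with hMLH1 loss, and 5 MSI-L dysplastic or malignant lesions none with loss. Function names are hypothetical.

```python
from math import comb


def is_msi_high(unstable_loci: int, loci_tested: int, threshold: float = 0.40) -> bool:
    """MSI-H if the fraction of loci with bandshifts reaches the threshold."""
    return unstable_loci / loci_tested >= threshold


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Rows: MSI-H, MSI-L; columns: hMLH1 lost, hMLH1 retained (assumed reading).
    p = fisher_exact_two_sided(10, 0, 0, 5)
    print(f"p = {p:.6f}")
    print(is_msi_high(4, 10), is_msi_high(3, 10))
```

Under this reading the observed table is the most extreme one compatible with its margins, which is why the p-value is so small despite only 15 lesions.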
Abstract:
We analyzed the codon usage bias of eight open reading frames (ORFs) across up to 79 human papillomavirus (HPV) genotypes from three distinct phylogenetic groups. All eight ORFs across HPV genotypes show a strong codon usage bias, amongst degenerately encoded amino acids, toward 18 codons mainly with T at the 3rd position. For all 18 degenerately encoded amino acids, codon preferences amongst human and animal PV ORFs are significantly different from those averaged across mammalian genes. Across the HPV types, the L2 ORFs show the highest codon usage bias (73.2 +/- 1.6%) and the E4 ORFs the lowest (51.1 +/- 0.5%), reflecting a similar bias in codon 3rd position A + T content (L2: 76.1 +/- 4.2%; E4: 58.6 +/- 4.5%). The E4 ORF, uniquely amongst the HPV ORFs, is G + C rich, while the other ORFs are A + T rich. Codon usage bias correlates positively with A + T content at the codon 3rd position in the E2, E6, L1 and L2 ORFs, but negatively in the E4 ORFs. A general conservation of preferred codon usage across human and non-human PV genotypes, whether they originate from the same supergroup or not, together with the observed difference between the preferred codon usage for HPV ORFs and for genes of the cells they infect, suggests that specific codon usage bias and A + T content variation may somehow increase the replicational fitness of HPVs in mammalian epithelial cells, and have practical implications for gene therapy of HPV infection. (C) 2003 Elsevier B.V. All rights reserved.
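The quantity "A + T content at the codon 3rd position" can be computed directly from an in-frame ORF sequence. The following minimal sketch shows one way to do so; the function name is hypothetical, the sequences in the usage example are made up for illustration and are not HPV sequences, and the study's own handling of stop codons or ambiguous bases is not stated in the abstract.

```python
def at3_content(orf: str) -> float:
    """Fraction of codons whose third base is A or T.

    Assumes the sequence starts in frame; trailing bases that do not
    form a complete codon are ignored. RNA input (U) is treated as T.
    """
    seq = orf.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    thirds = seq[2:usable:3]  # every third base, starting at codon position 3
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "AT" for base in thirds) / len(thirds)


if __name__ == "__main__":
    # Illustrative sequences only.
    print(at3_content("ATGGCTAAC"))  # third bases: G, T, C
    print(at3_content("ATTGCAGGT"))  # third bases: T, A, T
```

Applied per ORF and per genotype, values like these would give the kind of percentages the abstract reports for L2 and E4.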
Abstract:
We have rated eye color on a 3-point scale (1=blue/grey, 2=hazel/green, 3=brown) in 502 twin families and carried out a 5-10 cM genome scan (400-757 markers). We analyzed eye color as a threshold trait and performed multipoint sib pair linkage analysis using variance components analysis in Mx. A lod of 19.2 was found at the marker D15S1002, less than 1 cM from OCA2, which has been previously implicated in eye color variation. We estimate that 74% of variance in eye color liability is due to this QTL and a further 18% due to polygenic effects. However, a large shoulder on this peak suggests that other loci affecting eye color may be telomeric of OCA2 and inflating the QTL estimate. No other peaks reached genome-wide significance, although lods >2 were seen on 5p and 14q and lods >1 were additionally seen on chromosomes 2, 3, 6, 7, 8, 9, 17 and 18. Most of these secondary peaks were reduced or eliminated when we repeated the scan as a two locus analysis with the 15q linkage included, although this does not necessarily exclude them as false positives. We also estimated the interaction between the 15q QTL and the other marker locus but there was only minor evidence for additive x additive epistasis. Elaborating the analysis to the full two-locus model including non-additive main effects and interactions did not strengthen the evidence for epistasis. We conclude that most variation in eye color in Europeans is due to polymorphism in OCA2 but that there may be modifiers at several other loci.
Abstract:
Chlamydia pneumoniae is an obligate intracellular respiratory pathogen that causes 10% of community-acquired pneumonia and has been associated with cardiovascular disease. Both whole-genome sequencing and specific gene typing suggest that there is relatively little genetic variation in human isolates of C. pneumoniae. To date, there has been little genomic analysis of strains from human cardiovascular sites. The genotypes of C. pneumoniae present in human atherosclerotic carotid plaque were analysed and several polymorphisms in the variable domain 4 (VD4) region of the outer-membrane protein-A (ompA) gene and the intergenic region between the ygeD and uridine kinase (ygeD-urk) genes were found. While one genotype was identified that was the same as one reported previously in humans (respiratory and cardiovascular), another genotype was found that was identical to a genotype from non-human sources (frog/koala).