903 results for Human genome - Theses
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Background: RNAs transcribed from intronic regions of genes are involved in a number of processes related to post-transcriptional control of gene expression. However, the complement of human genes in which introns are transcribed, the number of intronic transcriptional units, and their tissue expression patterns are not known. Results: A survey of mRNA and EST public databases revealed more than 55,000 totally intronic noncoding (TIN) RNAs transcribed from the introns of 74% of all unique RefSeq genes. Guided by this information, we designed an oligoarray platform containing sense and antisense probes for each of 7,135 randomly selected TIN transcripts plus the corresponding protein-coding genes. We identified exonic and intronic tissue-specific expression signatures for human liver, prostate and kidney. The most highly expressed antisense TIN RNAs were transcribed from introns of protein-coding genes significantly enriched (p = 0.002 to 0.022) in the 'Regulation of transcription' Gene Ontology category. RNA polymerase II inhibition resulted in increased expression of a fraction of intronic RNAs in cell cultures, suggesting that other RNA polymerases may be involved in their biosynthesis. Members of a subset of intronic and protein-coding signatures transcribed from the same genomic loci have correlated expression patterns, suggesting that intronic RNAs regulate the abundance or the pattern of exon usage in protein-coding messages. Conclusion: We have identified diverse intronic RNA expression patterns, pointing to distinct regulatory roles. This gene-oriented approach, using a combined intron-exon oligoarray, should permit further comparative analysis of intronic transcription under various physiological and pathological conditions, thus advancing current knowledge about the biological functions of these noncoding RNAs.
Abstract:
Genome-wide association studies have failed to establish common variant risk for the majority of common human diseases. The underlying reasons for this failure are explained by recent studies of resequencing and comparison of over 1200 human genomes and 10 000 exomes, together with the delineation of DNA methylation patterns (epigenome) and full characterization of the coding and noncoding RNAs being transcribed (transcriptome). These studies have provided the most comprehensive catalogues of functional elements and genetic variants that are now available for global integrative analysis and experimental validation in prospective cohort studies. With these datasets, researchers will have unparalleled opportunities for the alignment, mining, and testing of hypotheses for the roles of specific genetic variants, including copy number variations, single nucleotide polymorphisms, and indels, as the cause of specific phenotypes and diseases. Through the use of next-generation sequencing technologies for genotyping and standardized ontological annotation to systematically analyze the effects of genomic variation on human and model organism phenotypes, we will be able to find candidate genes and new clues for disease etiology and treatment. This article describes essential concepts in genetics and genomic technologies as well as the emerging computational framework to comprehensively search websites and platforms available for the analysis and interpretation of genomic data.
Abstract:
The domestic dog offers a unique opportunity to explore the genetic basis of disease, morphology and behaviour. Humans share many diseases with our canine companions, making dogs an ideal model organism for comparative disease genetics. Using newly developed resources, genome-wide association studies in dog breeds are proving to be exceptionally powerful. Towards this aim, veterinarians and geneticists from 12 European countries are collaborating to collect and analyse the DNA from large cohorts of dogs suffering from a range of carefully defined diseases of relevance to human health. This project, named LUPA, has already delivered considerable results. The consortium has collaborated to develop a new high density single nucleotide polymorphism (SNP) array. Mutations for four monogenic diseases have been identified and the information has been utilised to find mutations in human patients. Several complex diseases have been mapped and fine mapping is underway. These findings should ultimately lead to a better understanding of the molecular mechanisms underlying complex diseases in both humans and their best friend.
Abstract:
BACKGROUND: Staphylococcus aureus, a leading cause of chronic or acute infections, is traditionally considered an extracellular pathogen despite repeated reports of S. aureus internalization by a variety of non-myeloid cells in vitro. This property potentially contributes to bacterial persistence, protection from antibiotics and evasion of immune defenses. Mechanisms contributing to internalization have been partly elucidated, but bacterial processes triggered intracellularly are largely unknown. RESULTS: We have developed an in vitro model using human lung epithelial cells that shows intracellular bacterial persistence for up to 2 weeks. Using an original approach we successfully collected and amplified low amounts of bacterial RNA recovered from infected eukaryotic cells. Transcriptomic analysis using an oligoarray covering the whole S. aureus genome was performed at two post-internalization times and compared to gene expression of non-internalized bacteria. No signs of cellular death were observed after prolonged internalization of Staphylococcus aureus 6850 in epithelial cells. Following internalization, extensive alterations of bacterial gene expression were observed. Whereas major metabolic pathways including cell division, nutrient transport and regulatory processes were drastically down-regulated, numerous genes involved in iron scavenging and virulence were up-regulated. This initial adaptation was followed by a transcriptional increase in several metabolic functions. However, expression of several toxin genes known to affect host cell integrity appeared strictly limited. CONCLUSION: These molecular insights correlated with phenotypic observations and demonstrated that S. aureus modulates gene expression at early times post infection to promote survival. Staphylococcus aureus appears adapted to intracellular survival in non-phagocytic cells.
Abstract:
This dissertation has three separate parts: the first deals with general pedigree association testing incorporating continuous covariates; the second deals with association tests under population stratification using conditional likelihood tests; the third deals with genome-wide association studies based on the real rheumatoid arthritis (RA) data sets from Genetic Analysis Workshop 16 (GAW16) Problem 1. Many statistical tests have been developed to test linkage and association in family data using either case-control status or phenotype covariates, but separately. Such univariate analyses might not use all the information available from family members in practical studies. Moreover, complex human diseases do not have a clear inheritance pattern; the genes involved might interact or act independently. In part I, the newly proposed approach, MPDT, focuses on how to use both the case-control information and the phenotype covariates. This approach can be applied to detect multiple marker effects. Based on two existing popular statistics in family studies, for case-control and quantitative traits respectively, the new approach can be used with simple family structures as well as general pedigree structures. The combined statistic is calculated from the two statistics; a permutation procedure is applied to assess the p-value, with Bonferroni adjustment for multiple markers. We use simulation studies to evaluate the type I error rates and the power of the proposed approach. Our results show that the combined test using both case-control information and phenotype covariates not only has the correct type I error rates but is also more powerful than other existing methods. For multiple marker interactions, our proposed method is also very powerful.
Selective genotyping is an economical strategy for detecting and mapping quantitative trait loci in the genetic dissection of complex disease. When the samples arise from different ethnic groups or an admixed population, all existing selective genotyping methods may produce spurious associations due to different ancestry distributions. The problem can be more serious when the sample size is large, a general requirement for obtaining sufficient power to detect modest genetic effects for most complex traits. In part II, I describe a useful strategy for selective genotyping when population stratification is present. Our procedure uses a principal component based approach to eliminate any effect of population stratification. We evaluate the performance of our procedure using both simulated data from an earlier study and the HapMap data sets in a variety of population admixture models generated from empirical data. The rheumatoid arthritis data set of GAW16 Problem 1 contains one binary trait and two continuous traits: RA status, AntiCCP and IgM. To accommodate multiple traits, we propose a set of SNP-level F statistics, based on the concept of multiple correlation, to measure the genetic association between multiple trait values and SNP-specific genotypic scores, and we obtain their null distributions. We then perform 6 genome-wide association analyses using novel one- and two-stage approaches based on single, double and triple traits. Combining all 6 analyses, we successfully validate the SNPs that have been identified in the literature as responsible for rheumatoid arthritis and detect additional disease susceptibility SNPs for future follow-up studies. Every chromosome except 13 and 18 is found to harbour susceptibility regions for rheumatoid arthritis or related diseases, i.e., lupus erythematosus. This topic is discussed in part III.
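The permutation procedure with Bonferroni adjustment mentioned for part I can be illustrated with a minimal Python sketch. This is not the MPDT implementation, whose statistics the abstract does not specify. The function names `permutation_pvalue` and `bonferroni_adjust`, and the convention that larger statistics are more extreme, are assumptions made for illustration.

```python
def permutation_pvalue(observed, permuted):
    """Empirical p-value: share of permuted statistics at least as large
    as the observed one. The +1 terms count the observed statistic as one
    of the permutations, so the estimate is never exactly zero."""
    exceed = sum(1 for s in permuted if s >= observed)
    return (1 + exceed) / (1 + len(permuted))


def bonferroni_adjust(p_value, n_markers):
    """Bonferroni adjustment for n_markers tests, capped at 1."""
    return min(1.0, p_value * n_markers)


# Hypothetical statistics, for illustration only (not data from the study).
p = permutation_pvalue(3.0, [1.0, 2.0, 3.0, 4.0])
p_adjusted = bonferroni_adjust(p, n_markers=10)
```

In practice the permuted statistics would come from recomputing the combined test statistic on data whose trait labels have been shuffled in a way that respects the family structure.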
Abstract:
HIV-1 sequence diversity is affected by selection pressures arising from host genomic factors. Using paired human and viral data from 1071 individuals, we ran >3000 genome-wide scans, testing for associations between host DNA polymorphisms, HIV-1 sequence variation and plasma viral load (VL), while considering human and viral population structure. We observed significant human SNP associations to a total of 48 HIV-1 amino acid variants (p < 2.4 × 10^-12). All associated SNPs mapped to the HLA class I region. Clinical relevance of host and pathogen variation was assessed using VL results. We identified two critical advantages to the use of viral variation for identifying host factors: (1) association signals are much stronger for HIV-1 sequence variants than VL, reflecting the ‘intermediate phenotype’ nature of viral variation; (2) association testing can be run without any clinical data. The proposed genome-to-genome approach highlights sites of genomic conflict and is a strategy generally applicable to studies of host–pathogen interaction.
Abstract:
High-throughput assays, such as the yeast two-hybrid system, have generated a huge amount of protein-protein interaction (PPI) data in the past decade. This greatly increases the need for reliable methods to systematically and automatically suggest protein functions and the relationships between them. With the available PPI data, it is now possible to study these functions and relationships in the context of a large-scale network. To date, several network-based schemes have been proposed to annotate protein functions effectively on a large scale. However, because of the inherent noise in high-throughput data generation, new methods and algorithms should be developed to increase the reliability of functional annotations. Previous work on a yeast PPI network (Samanta and Liang, 2003) has shown that the local connection topology, particularly two proteins sharing an unusually large number of neighbors, can predict functional associations between proteins, and hence suggest their functions. One advantage of that work is that its algorithm is not sensitive to noise (false positives) in high-throughput PPI data. In this study, we improved their prediction scheme by developing a new algorithm and new methods, which we applied to a human PPI network to make genome-wide functional inferences. We used the new algorithm to measure and reduce the influence of hub proteins on detecting functionally associated proteins. We used the annotations of the Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) as independent and unbiased benchmarks to evaluate our algorithms and methods within the human PPI network. We showed that, compared with the previous work of Samanta and Liang, the algorithm and methods developed in this study improved the overall quality of functional inferences for human proteins. By applying the algorithms to the human PPI network, we obtained 4,233 significant functional associations among 1,754 proteins.
Further comparisons of their KEGG and GO annotations allowed us to assign 466 KEGG pathway annotations to 274 proteins and 123 GO annotations to 114 proteins, with estimated false discovery rates of <21% for KEGG and <30% for GO. We clustered 1,729 proteins by their functional associations and performed pathway analysis to identify several subclusters that are highly enriched in certain signaling pathways. In particular, we performed a detailed analysis of a subcluster enriched in the transforming growth factor β signaling pathway (P < 10^-50), which is important in cell proliferation and tumorigenesis. Analysis of another four subclusters also suggested potential new players in six signaling pathways worthy of further experimental investigation. Our study gives clear insight into the common neighbor-based prediction scheme and provides a reliable method for large-scale functional annotation in this post-genomic era.
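The idea of two proteins sharing "an unusually large number of neighbors" can be made concrete with a small Python sketch. It scores a protein pair by a hypergeometric tail probability, one common way to express how surprising a shared-neighbor count is. This is an assumption for illustration, not necessarily the exact statistic used in the study or in Samanta and Liang (2003). The function name `shared_neighbor_pvalue` and its parameters are hypothetical.

```python
from math import comb


def shared_neighbor_pvalue(n_total, n1, n2, m):
    """Probability that two proteins with n1 and n2 interaction partners,
    drawn at random from n_total candidate proteins, share at least m
    partners (upper tail of a hypergeometric distribution)."""
    upper = min(n1, n2)
    if not 0 <= m <= upper or n1 > n_total or n2 > n_total:
        raise ValueError("require 0 <= m <= min(n1, n2) and n1, n2 <= n_total")
    # Number of ways to choose n2 partners so that exactly k fall among the n1.
    favourable = sum(
        comb(n1, k) * comb(n_total - n1, n2 - k) for k in range(m, upper + 1)
    )
    return favourable / comb(n_total, n2)


# Toy network of 10 candidate partners; both proteins have 3 partners
# and share all 3 of them. Only one of the 120 possible partner sets matches.
p_toy = shared_neighbor_pvalue(n_total=10, n1=3, n2=3, m=3)
```

A small value flags the pair as a candidate functional association. Hub proteins, with very large n1 or n2, share partners with many proteins by chance alone, which is one reason the abstract's emphasis on reducing hub influence matters.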
Abstract:
The lack of a permissive cell culture system hampers the study of human parvovirus B19 (B19V). UT7/Epo is one of the few established cell lines that can be infected with B19V, but it generates few or no infectious progeny. Recently, hypoxic conditions or the use of primary CD36+ erythroid progenitor cells (CD36+ EPCs) have been shown to improve the infection. These novel approaches were evaluated in infection and transfection experiments. Hypoxic conditions or the use of CD36+ EPCs resulted in a significant acceleration of the infection/transfection and a modest increase in the yield of capsid progeny. However, under all tested conditions, genome encapsidation was seriously impaired. Further analysis of the cell culture virus progeny revealed that, unlike in the wild-type virus, the VP1 unique region (VP1u) was partially exposed and was unable to become further externalized upon heat treatment. The fivefold axis pore, which is used for VP1u externalization and genome encapsidation, might be constricted by the atypical VP1u conformation, explaining the packaging failure. Although CD36+ EPCs and hypoxia facilitate B19V infection, large quantities of infectious progeny cannot be generated because of a failure in genome encapsidation, which emerges as a major limiting factor for the in vitro propagation of B19V.
Abstract:
Systemic lupus erythematosus (SLE) is an autoimmune disorder characterized by production of autoantibodies against intracellular antigens including DNA, ribosomal P, Ro (SS-A), La (SS-B), and the spliceosome. Etiology is suspected to involve genetic and environmental factors. Evidence of genetic involvement includes: associations with HLA-DR3, HLA-DR2, Fcγ receptors (FcγR) IIA and IIIA, and hereditary complement component deficiencies, as well as familial aggregation, monozygotic twin concordance >20%, λs > 10, purported linkage at 1q41–42, and inbred mouse strains that consistently develop lupus. We have completed a genome scan in 94 extended multiplex pedigrees by using model-based linkage analysis. Potential [log10 of the odds for linkage (lod) > 2.0] SLE loci have been identified at chromosomes 1q41, 1q23, and 11q14–23 in African-Americans; 14q11, 4p15, 11q25, 2q32, 19q13, 6q26–27, and 12p12–11 in European-Americans; and 1q23, 13q32, 20q13, and 1q31 in all pedigrees combined. An effect for the FcγRIIA candidate polymorphism at 1q23 (lod = 3.37 in African-Americans) is syntenic with linkage in a murine model of lupus. Sib-pair and multipoint nonparametric analyses also support linkage (P < 0.05) at nine loci detected by using two-point lod score analysis (lod > 2.0). Our results are consistent with the presumed complexity of genetic susceptibility to SLE and illustrate that racial origin is likely to influence the specific nature of these genetic effects.
Abstract:
Systemic lupus erythematosus (SLE) is an autoimmune multisystem inflammatory disease characterized by the production of pathogenic autoantibodies. Previous genetic studies have suggested associations with HLA Class II alleles, complement gene deficiencies, and Fc receptor polymorphisms; however, it is likely that other genes contribute to SLE susceptibility and pathogenesis. Here, we report the results of a genome-wide microsatellite marker screen in 105 SLE sib-pair families. By using multipoint nonparametric methods, the strongest evidence for linkage was found near the HLA locus (6p11-p21) [D6S257, logarithm of odds (lod) = 3.90, P = 0.000011] and at three additional regions: 16q13 (D16S415, lod = 3.64, P = 0.000022), 14q21–23 (D14S276, lod = 2.81, P = 0.00016), and 20p12 (D20S186, lod = 2.62, P = 0.00025). Another nine regions (1p36, 1p13, 1q42, 2p15, 2q21–33, 3cent-q11, 4q28, 11p15, and 15q26) were identified with lod scores ≥1.00. These data support the hypothesis that multiple genes, including one in the HLA region, influence susceptibility to human SLE.
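The relationship between a lod score and its P value can be sketched in Python. A lod score is the base-10 logarithm of the odds for linkage. The conversion below assumes the usual one-sided, one-degree-of-freedom chi-square approximation (chi-square = 2 ln(10) × lod). The abstract does not state how its P values were derived, so this is an assumption, and the function name `lod_to_pvalue` is hypothetical.

```python
from math import erfc, log, sqrt


def lod_to_pvalue(lod):
    """Approximate one-sided P value for a lod score, assuming that
    2 * ln(10) * lod follows a chi-square distribution with 1 degree of
    freedom under the null and that only positive evidence counts.
    For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)); halving gives the
    one-sided value."""
    if lod < 0:
        raise ValueError("lod must be non-negative for this approximation")
    return 0.5 * erfc(sqrt(lod * log(10)))


# lod scores reported in the abstract for D6S257, D16S415, D14S276, D20S186.
for reported_lod in (3.90, 3.64, 2.81, 2.62):
    print(f"lod = {reported_lod:.2f}  ->  P ~ {lod_to_pvalue(reported_lod):.2g}")
```

Under this assumption a lod of 3.90 corresponds to a P value of roughly 1.1 × 10^-5, in line with the P = 0.000011 reported for D6S257.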
Abstract:
Nuclear-localized mtDNA pseudogenes might explain a recent report describing a heteroplasmic mtDNA molecule containing five linked missense mutations dispersed over the contiguous mtDNA CO1 and CO2 genes in Alzheimer’s disease (AD) patients. To test this hypothesis, we have used the PCR primers utilized in the original report to amplify CO1 and CO2 sequences from two independent ρ° (mtDNA-less) cell lines. CO1 and CO2 sequences amplified from both of the ρ° cells, demonstrating that these sequences are also present in the human nuclear DNA. The nuclear pseudogene CO1 and CO2 sequences were then tested for each of the five “AD” missense mutations by restriction endonuclease site variant assays. All five mutations were found in the nuclear CO1 and CO2 PCR products from ρ° cells, but none were found in the PCR products obtained from cells with normal mtDNA. Moreover, when the overlapping nuclear CO1 and CO2 PCR products were cloned and sequenced, all five missense mutations were found, as well as a linked synonymous mutation. Unlike the findings in the original report, an additional 32 base substitutions were found, including two in adjacent tRNAs and a two base pair deletion in the CO2 gene. Phylogenetic analysis of the nuclear CO1 and CO2 sequences revealed that they diverged from modern human mtDNAs early in hominid evolution about 770,000 years before present. These data would be consistent with the interpretation that the missense mutations proposed to cause AD may be the product of ancient mtDNA variants preserved as nuclear pseudogenes.
Abstract:
Human gene MAGE-1 encodes tumor-specific antigens that are recognized on melanoma cells by autologous cytolytic T lymphocytes. This gene is expressed in a significant proportion of tumors of various histological types, but not in normal tissues except male germ-line cells. We reported previously that reporter genes driven by the MAGE-1 promoter are active not only in the tumor cell lines that express MAGE-1 but also in those that do not. This suggests that the critical factor causing the activation of MAGE-1 in certain tumors is not the presence of the appropriate transcription factors. The two major MAGE-1 promoter elements have an Ets binding site, which contains a CpG dinucleotide. We report here that these CpG are demethylated in the tumor cell lines that express MAGE-1, and are methylated in those that do not express the gene. Methylation of these CpG inhibits the binding of transcription factors, as seen by mobility shift assay. Treatment with the demethylating agent 5-aza-2'-deoxycytidine activated gene MAGE-1 not only in tumor cell lines but also in primary fibroblasts. Finally, the overall level of CpG methylation was evaluated in 20 different tumor cell lines. It was inversely correlated with the expression of MAGE-1. We conclude that the activation of MAGE-1 in cancer cells is due to the demethylation of the promoter. This appears to be a consequence of a genome-wide demethylation process that occurs in many cancers and is correlated with tumor progression.