959 results for Genome sequencing
Abstract:
Matrix attachment regions are DNA sequences found throughout eukaryotic genomes that are believed to define boundaries interfacing heterochromatin and euchromatin domains, thereby acting as epigenetic regulators. When included in expression vectors, MARs can improve and sustain transgene expression, and a search for more potent novel elements is therefore actively pursued to further improve recombinant protein production. Here we describe the isolation of new MARs from the mouse genome using a modified in silico analysis. One of these MARs was found to be a powerful activator of transgene expression in stable transfections. Interestingly, this MAR also increased GFP and/or immunoglobulin expression from some but not all expression vectors in transient transfections. This effect was attributed to the presence or absence of elements on the vector backbone, providing an explanation for earlier discrepancies as to the ability of this class of elements to affect transgene expression under such conditions.
Abstract:
The genomic loci occupied by RNA polymerase (RNAP) III have been characterized in human culture cells by genome-wide chromatin immunoprecipitations, followed by deep sequencing (ChIP-seq). These studies have shown that only ∼40% of the annotated 622 human tRNA genes and pseudogenes are occupied by RNAP-III, and that these genes are often in open chromatin regions rich in active RNAP-II transcription units. We have used ChIP-seq to characterize RNAP-III-occupied loci in a differentiated tissue, the mouse liver. Our studies define the mouse liver RNAP-III-occupied loci including a conserved mammalian interspersed repeat (MIR) as a potential regulator of an RNAP-III subunit-encoding gene. They reveal that synteny relationships can be established between a number of human and mouse RNAP-III genes, and that the expression levels of these genes are significantly linked. They establish that variations within the A and B promoter boxes, as well as the strength of the terminator sequence, can strongly affect RNAP-III occupancy of tRNA genes. They reveal correlations with various genomic features that explain the observed variation of 81% of tRNA scores. In mouse liver, loci represented in the NCBI37/mm9 genome assembly that are clearly occupied by RNAP-III comprise 50 Rn5s (5S RNA) genes, 14 known non-tRNA RNAP-III genes, nine Rn4.5s (4.5S RNA) genes, and 29 SINEs. Moreover, out of the 433 annotated tRNA genes, half are occupied by RNAP-III. Transfer RNA gene expression levels reflect both an underlying genomic organization conserved in dividing human culture cells and resting mouse liver cells, and the particular promoter and terminator strengths of individual genes.
Abstract:
Neuroticism is a moderately heritable personality trait considered to be a risk factor for developing major depression, anxiety disorders and dementia. We performed a genome-wide association study in 2,235 participants drawn from a population-based study of neuroticism, making this the largest association study for neuroticism to date. Neuroticism was measured by the Eysenck Personality Questionnaire. After quality control, we analysed 430,000 autosomal SNPs together with an additional 1.2 million SNPs imputed with high quality from the HapMap CEU samples. We found a very small effect of population stratification, corrected using one principal component, and some cryptic kinship that required no correction. NKAIN2 showed suggestive evidence of association with neuroticism as a main effect (p < 10^-6) and GPC6 showed suggestive evidence for interaction with age (p ≈ 10^-7). We found support for one previously reported association (PDE4D), but failed to replicate other recent reports. These results suggest common SNP variation does not strongly influence neuroticism. Our study was powered to detect almost all SNPs explaining at least 2% of heritability, and so our results effectively exclude the existence of loci having a major effect on neuroticism.
Abstract:
Background: Network reconstructions at the cell level are a major development in Systems Biology. However, we are far from fully exploiting their potential. Often, the incremental complexity of the pursued systems overrides experimental capabilities, or increasingly sophisticated protocols are underutilized to merely refine confidence levels of already established interactions. For metabolic networks, the currently employed confidence scoring system rates reactions discretely according to nested categories of experimental evidence or model-based likelihood. Results: Here, we propose a complementary network-based scoring system that exploits the statistical regularities of a metabolic network as a bipartite graph. As an illustration, we apply it to the metabolism of Escherichia coli. The model is adjusted to the observations to derive connection probabilities between individual metabolite-reaction pairs and, after validation, to assess the reliability of each reaction in probabilistic terms. This network-based scoring system uncovers very specific reactions that could be functionally or evolutionarily important, identifies prominent experimental targets, and enables further confirmation of modeling results. Conclusions: We foresee a wide range of potential applications at different sub-cellular or supra-cellular levels of biological interactions given the natural bipartivity of many biological networks.
Abstract:
High-fidelity 'proofreading' polymerases are often used in library construction for next-generation sequencing projects, in an effort to minimize errors in the resulting sequence data. The increased template fidelity of these polymerases can come at the cost of reduced template specificity, and library preparation methods based on the AFLP technique may be particularly susceptible. Here, we compare AFLP profiles generated with standard Taq and two versions of a high-fidelity polymerase. We find that Taq produces fewer and brighter peaks than high-fidelity polymerase, suggesting that Taq performs better at selectively amplifying templates that exactly match the primer sequences. Because the higher accuracy of proofreading polymerases remains important for sequencing applications, we suggest that it may be more effective to use alternative library preparation methods.
Abstract:
BACKGROUND: Small RNAs (sRNAs) are widespread among bacteria and have diverse regulatory roles. Most of these sRNAs have been discovered by a combination of computational and experimental methods. In Pseudomonas aeruginosa, a ubiquitous Gram-negative bacterium and opportunistic human pathogen, the GacS/GacA two-component system positively controls the transcription of two sRNAs (RsmY, RsmZ), which are crucial for the expression of genes involved in virulence. In the biocontrol bacterium Pseudomonas fluorescens CHA0, three GacA-controlled sRNAs (RsmX, RsmY, RsmZ) regulate the response to oxidative stress and the expression of extracellular products including biocontrol factors. RsmX, RsmY and RsmZ contain multiple unpaired GGA motifs and control the expression of target mRNAs at the translational level, by sequestration of translational repressor proteins of the RsmA family. RESULTS: A combined computational and experimental approach enabled us to identify 14 intergenic regions encoding sRNAs in P. aeruginosa. Eight of these regions encode newly identified sRNAs. The intergenic region 1698 was found to specify a novel GacA-controlled sRNA termed RgsA. GacA regulation appeared to be indirect. In P. fluorescens CHA0, an RgsA homolog was also expressed under positive GacA control. This 120-nt sRNA contained a single GGA motif and, unlike RsmX, RsmY and RsmZ, was unable to derepress translation of the hcnA gene (involved in the biosynthesis of the biocontrol factor hydrogen cyanide), but contributed to the bacterium's resistance to hydrogen peroxide. In both P. aeruginosa and P. fluorescens the stress sigma factor RpoS was essential for RgsA expression. CONCLUSION: The discovery of an additional sRNA expressed under GacA control in two Pseudomonas species highlights the complexity of this global regulatory system and suggests that the mode of action of GacA control may be more elaborate than previously suspected. 
Our results also confirm that several GGA motifs are required in an sRNA for sequestration of the RsmA protein.
Abstract:
The genotyping of human papillomaviruses (HPV) is essential for the surveillance of HPV vaccines. We describe and validate a low-cost PGMY-based PCR assay (PGMY-CHUV) for the genotyping of 31 HPV by reverse blotting hybridization (RBH). Genotype-specific detection limits were 50 to 500 genome equivalents per reaction. RBH was 100% specific and 98.61% sensitive using DNA sequencing as the gold standard (n = 1,024 samples). PGMY-CHUV was compared to the validated and commercially available Linear Array (Roche) on 200 samples. Both assays identified the same positive (n = 182) and negative samples (n = 18). Seventy-six percent of the positives were fully concordant after restricting the comparison to the 28 genotypes shared by both assays. At the genotypic level, agreement was 83% (285/344 genotype-sample combinations; κ of 0.987 for single infections and 0.853 for multiple infections). Fifty-seven of the 59 discordant cases were associated with multiple infections and with the weakest genotypes within each sample (P < 0.0001). PGMY-CHUV was significantly more sensitive for HPV56 (P = 0.0026) and could unambiguously identify HPV52 in mixed infections. PGMY-CHUV was reproducible on repeat testing (n = 275 samples; 392 genotype-sample combinations; κ of 0.933) involving different reagent lots and different technicians. Discordant results (n = 47) were significantly associated with the weakest genotypes in samples with multiple infections (P < 0.0001). Successful participation in proficiency testing also supported the robustness of this assay. The PGMY-CHUV reagent costs were estimated at $2.40 per sample using the least expensive yet proficient genotyping algorithm that also included quality control. This assay may be used in low-resource laboratories that have sufficient manpower and PCR expertise.
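The agreement figures in this abstract can be checked with a few lines of arithmetic. The sketch below computes raw agreement for the 285/344 genotype-sample combinations and Cohen's kappa for the sample-level comparison (182 concordant positives, 18 concordant negatives). The function name `cohen_kappa` is an illustrative choice, not something from the paper, and the reported genotype-level kappa values (0.987, 0.853, 0.933) cannot be reproduced here because the underlying contingency tables are not given.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    Rows are the calls of one assay, columns the calls of the other.
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    # Observed agreement: fraction of items on the diagonal.
    observed = sum(table[i][i] for i in range(size)) / total
    # Chance agreement from the marginal totals.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Sample-level calls from the abstract: both assays agree on all 200 samples.
sample_level = [[182, 0],
                [0, 18]]
kappa_samples = cohen_kappa(sample_level)

# Genotype-level raw agreement quoted in the abstract.
genotype_agreement = 285 / 344

print(f"sample-level kappa: {kappa_samples:.3f}")
print(f"genotype-level agreement: {genotype_agreement:.1%}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is reported alongside the 83% figure rather than instead of it.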
Abstract:
Adenovirus serotype 5 (Ad5) vectors and specific neutralizing antibodies (NAbs) generate immune complexes (ICs) which are potent inducers of dendritic cell (DC) maturation. Here we show that ICs generated with rare Ad vector serotypes, such as Ad26 and Ad35, which are lead candidates in HIV vaccine development, are poor inducers of DC maturation and that their potency in inducing DC maturation strongly correlated with the number of Toll-like receptor 9 (TLR9)-agonist motifs present in the Ad vector's genome. In addition, we showed that antihexon but not antifiber antibodies are responsible for the induction of Ad IC-mediated DC maturation.
Abstract:
BACKGROUND: Genotypes obtained with commercial SNP arrays have been extensively used in many large case-control or population-based cohorts for SNP-based genome-wide association studies for a multitude of traits. Yet, these genotypes capture only a small fraction of the variance of the studied traits. Genomic structural variants (GSV) such as Copy Number Variation (CNV) may account for part of the missing heritability, but their comprehensive detection requires either next-generation arrays or sequencing. Sophisticated algorithms that infer CNVs by combining the intensities from SNP-probes for the two alleles can already be used to extract a partial view of such GSV from existing data sets. RESULTS: Here we present several advances to facilitate the latter approach. First, we introduce a novel CNV detection method based on a Gaussian Mixture Model. Second, we propose a new algorithm, PCA merge, for combining copy-number profiles from many individuals into consensus regions. We applied both our new methods as well as existing ones to data from 5612 individuals from the CoLaus study who were genotyped on Affymetrix 500K arrays. We developed a number of procedures in order to evaluate the performance of the different methods. This includes comparison with previously published CNVs as well as using a replication sample of 239 individuals, genotyped with Illumina 550K arrays. We also established a new evaluation procedure that employs the fact that related individuals are expected to share their CNVs more frequently than randomly selected individuals. The ability to detect both rare and common CNVs provides a valuable resource that will facilitate association studies exploring potential phenotypic associations with CNVs. 
CONCLUSION: Our new methodologies for CNV detection and their evaluation will help in extracting additional information from the large amount of SNP-genotyping data on various cohorts and use this to explore structural variants and their impact on complex traits.
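To make the idea of mixture-model CNV calling concrete, here is a minimal sketch of fitting a one-dimensional Gaussian mixture by expectation-maximization to probe intensities and assigning each value to a copy-number state. It is a generic illustration and not the authors' method: the abstract does not describe their model, and the function names, the three-state setup, the starting means and the synthetic intensity values are all assumptions made for this example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, means, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given initial means."""
    k = len(means)
    means = list(means)
    variances = [0.05] * k          # assumed starting variance
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            dens = [w * gaussian_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / len(values)
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, values)) / nj
            variances[j] = max(var, 1e-4)  # floor to avoid collapse
    return weights, means, variances


def assign(x, weights, means, variances):
    """Index of the most probable component for a single value."""
    dens = [w * gaussian_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    return max(range(len(dens)), key=dens.__getitem__)


# Synthetic intensities (hypothetical): deletion, normal, duplication.
random.seed(0)
values = ([random.gauss(-0.6, 0.1) for _ in range(30)]
          + [random.gauss(0.0, 0.1) for _ in range(200)]
          + [random.gauss(0.4, 0.1) for _ in range(30)])

weights, means, variances = fit_gmm_1d(values, means=[-0.5, 0.0, 0.5])
states = [assign(x, weights, means, variances) for x in values]
print([round(m, 2) for m in means])
```

A real pipeline would work on per-probe signals along the genome and combine neighbouring probes into segments; this sketch only shows the clustering step on pooled values.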
Abstract:
Isolated gonadotropin-releasing hormone (GnRH) deficiency is a treatable albeit rare form of reproductive failure that has revealed physiological mechanisms controlling human reproduction, but despite substantial progress in discovering pathogenic single-gene defects, most of the genetic basis of GnRH deficiency remains uncharted. Although unbiased genetic investigations of affected families have identified mutations in previously unsuspected genes as causes of this disease in some cases, their application has been severely limited because of the negative effect of GnRH deficiency on fertility; moreover, relatively few of the many candidate genes nominated because of biological plausibility from in vitro or animal model experiments were subsequently validated in patients. With the advent of exciting technological platforms for sequencing, homozygosity mapping, and detection of structural variation at the whole-genome level, human investigations are again assuming the leading role for gene discovery. Using human GnRH deficiency as a paradigm and presenting original data from the screening of numerous candidate genes, we discuss the emerging model of patient-focused clinical genetic research and its complementarities with basic approaches in the near future.
Abstract:
Centrifuge is a user-friendly system to simultaneously access Arabidopsis gene annotations and intra- and inter-organism sequence comparison data. The tool allows rapid retrieval of user-selected data for each annotated Arabidopsis gene providing, in any combination, data on the following features: predicted protein properties such as mass, pI, cellular location and transmembrane domains; SWISS-PROT annotations; InterPro domains; Gene Ontology records; verified transcription; BLAST matches to the proteomes of A. thaliana, Oryza sativa (rice), Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens. The tool lends itself particularly well to the rapid analysis of contigs or of tens or hundreds of genes identified by high-throughput gene expression experiments. In these cases, a summary table of principal predicted protein features for all genes is given followed by more detailed reports for each individual gene. Centrifuge can also be used for single gene analysis or in a word search mode. AVAILABILITY: http://centrifuge.unil.ch/ CONTACT: edward.farmer@unil.ch.
Abstract:
Genome-wide association studies (GWAS) are designed to identify the portion of single-nucleotide polymorphisms (SNPs) in genome sequences associated with a complex trait. Strategies based on the gene list enrichment concept are currently applied for the functional analysis of GWAS, according to which a significant overrepresentation of candidate genes associated with a biological pathway is used as a proxy to infer overrepresentation of candidate SNPs in the pathway. Here we show that such inference is not always valid and introduce the program SNP2GO, which implements a new method to properly test for the overrepresentation of candidate SNPs in biological pathways.
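As a rough illustration of what an overrepresentation test on SNP counts looks like, the sketch below computes a one-sided hypergeometric tail probability: the chance of seeing at least `k` candidate SNPs in a pathway when `n` candidates are drawn from `N` SNPs, of which `K` lie in the pathway. This is a generic textbook test and an assumption on our part. The abstract does not describe the statistic SNP2GO actually uses, and the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for a hypergeometric draw.

    N: total SNPs, K: SNPs in the pathway,
    n: candidate SNPs, k: candidate SNPs observed in the pathway.
    """
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(max(k, 0), upper + 1))
    return hits / comb(N, n)


# Toy counts (hypothetical): 20 SNPs, 5 in the pathway,
# 4 candidates, 3 of which fall in the pathway.
p_value = hypergeom_upper_tail(k=3, N=20, K=5, n=4)
print(f"p = {p_value:.4f}")
```

Running the same test on gene counts instead of SNP counts can give a different answer when genes carry very different numbers of SNPs, which is the kind of discrepancy the abstract warns about.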
Abstract:
Meiosis in triploids faces the seemingly insuperable difficulty of dividing an odd number of chromosome sets by two. Triploid vertebrates usually circumvent this problem through either asexuality or some form of hybridogenesis, including meiotic hybridogenesis, which involves a reproductive community of different ploidy levels and genome compositions. Batura toads (Bufo baturae; 3n = 33 chromosomes), however, present an all-triploid sexual reproduction. This hybrid species has two genome copies carrying a nucleolus-organizing region (NOR+) on chromosome 6, and a third copy without it (NOR-). Males only produce haploid NOR+ sperm, while ova are diploid, containing one NOR+ and one NOR- set. Here, we conduct sibship analyses with co-dominant microsatellite markers so as (i) to confirm the purely clonal and maternal transmission of the NOR- set, and (ii) to demonstrate Mendelian segregation and recombination of the NOR+ sets in both sexes. This new reproductive mode in vertebrates ('pre-equalizing hybrid meiosis') offers an ideal opportunity to study the evolution of non-recombining genomes. Elucidating the mechanisms that allow simultaneous transmission of two genomes, one of Mendelian, the other of clonal inheritance, might shed light on the general processes that regulate meiosis in vertebrates.
Abstract:
BACKGROUND: LDL cholesterol has a causal role in the development of cardiovascular disease. Improved understanding of the biological mechanisms that underlie the metabolism and regulation of LDL cholesterol might help to identify novel therapeutic targets. We therefore did a genome-wide association study of LDL-cholesterol concentrations. METHODS: We used genome-wide association data from up to 11,685 participants with measures of circulating LDL-cholesterol concentrations across five studies, including data for 293,461 autosomal single nucleotide polymorphisms (SNPs) with a minor allele frequency of 5% or more that passed our quality control criteria. We also used data from a second genome-wide array in up to 4,337 participants from three of these five studies, with data for 290,140 SNPs. We did replication studies in two independent populations consisting of up to 4,979 participants. Statistical approaches, including meta-analysis and linkage disequilibrium plots, were used to refine association signals; we analysed pooled data from all seven populations to determine the effect of each SNP on variations in circulating LDL-cholesterol concentrations. FINDINGS: In our initial scan, we found two SNPs (rs599839 [p = 1.7 × 10^-15] and rs4970834 [p = 3.0 × 10^-11]) that showed genome-wide statistical association with LDL cholesterol at chromosomal locus 1p13.3. The second genome screen found a third statistically associated SNP at the same locus (rs646776 [p = 4.3 × 10^-9]). Meta-analysis of data from all studies showed an association of SNPs rs599839 (combined p = 1.2 × 10^-33) and rs646776 (p = 4.8 × 10^-20) with LDL-cholesterol concentrations. SNPs rs599839 and rs646776 both explained around 1% of the variation in circulating LDL-cholesterol concentrations and were associated with about 15% of an SD change in LDL cholesterol per allele, assuming an SD of 1 mmol/L. INTERPRETATION: We found evidence for a novel locus for LDL cholesterol on chromosome 1p13.3.
These results potentially provide insight into the biological mechanisms that underlie the regulation of LDL cholesterol and might help in the discovery of novel therapeutic targets for cardiovascular disease.
Abstract:
We report a new set of nine primer pairs specifically developed for amplification of Brassica plastid SSR markers. The wide utility of these markers is demonstrated for haplotype identification and detection of polymorphism in B. napus, B. nigra, B. oleracea, B. rapa and in related genera Arabidopsis, Camelina, Raphanus and Sinapis. Eleven gene regions (ndhB-rps7 spacer, rbcL-accD spacer, rpl16 intron, rps16 intron, atpB-rbcL spacer, trnE-trnT spacer, trnL intron, trnL-trnF spacer, trnM-atpE spacer, trnR-rpoC2 spacer, ycf3-psaA spacer) were sequenced from a range of Brassica and related genera for SSR detection and primer design. Other sequences were obtained from GenBank/EMBL. Eight out of nine selected SSR loci showed polymorphism when amplified using the new primers and a combined analysis detected variation within and between Brassica species, with the number of alleles detected per locus ranging from 5 (loci MF-6, MF-1) to 11 (locus MF-7). The combined SSR data were used in a neighbour-joining analysis (SMM, D (DM) distances) to group the samples based on the presence and absence of alleles. The analysis was generally able to separate plastid types into taxon-specific groups. Multi-allelic haplotypes were plotted onto the neighbour-joining tree. A total of 28 haplotypes were detected and these differentiated 22 of the 41 accessions screened from all other accessions. None of these haplotypes was shared by more than one species and some were not characteristic of their predicted type. We interpret our results with respect to taxon differentiation, hybridisation and introgression patterns relating to the 'Triangle of U'.