928 results for complete genome
Abstract:
This PhD thesis is the result of my research activity over the last three years. My main research interest centered on the evolution of the mitochondrial genome (mtDNA) and on its usefulness as a phylogeographic and phylogenetic marker at different taxonomic levels in different taxa of Metazoa. From a methodological standpoint, my main effort was dedicated to sequencing complete mitochondrial genomes, using an approach to whole-genome sequencing based on Long-PCR and shotgun sequencing. This research project is part of a larger project sequencing mtDNAs in many different metazoan taxa, and I mostly dedicated myself to sequencing and analyzing mtDNAs in selected taxa of bivalves and hexapods (Insecta). Sequences of bivalve mtDNAs are particularly limited, and my study helped extend the sampling. Moreover, I used the bivalve Musculista senhousia as a model taxon to investigate the molecular mechanisms and the evolutionary significance of the aberrant bivalve mode of mitochondrial inheritance (Doubly Uniparental Inheritance, see below). In insects, I focused on the genus Bacillus (Insecta: Phasmida). A detailed phylogenetic analysis was performed to assess phylogenetic relationships within the genus and to investigate the placement of Phasmida in the phylogenetic tree of Insecta. The main goal of this part of my study was to add to the taxonomic coverage of sequenced mtDNAs in basal insects, which had been only partially analyzed.
Abstract:
Gap junctions are clustered channels between contacting cells through which direct intercellular communication via diffusion of ions and metabolites can occur. Two hemichannels, each built up of six connexin protein subunits in the plasma membrane of adjacent cells, can dock to each other to form conduits between cells. We have recently screened mouse and human genomic databases and have found 19 connexin (Cx) genes in the mouse genome and 20 connexin genes in the human genome. One mouse connexin gene and two human connexin genes do not appear to have orthologs in the other genome. With three exceptions, the characterized connexin genes comprise two exons, with the complete reading frame located on the second exon. Targeted ablation of eleven mouse connexin genes revealed basic insights into the functional diversity of the connexin gene family. In addition, the phenotypes of human genetic disorders caused by mutated connexin genes further complement our understanding of connexin functions in the human organism. In this review we compare currently identified connexin genes in both the mouse and human genome and discuss the functions of gap junctions deduced from targeted mouse mutants and human genetic disorders.
Abstract:
BACKGROUND: The mollicute Mycoplasma conjunctivae is the etiological agent leading to infectious keratoconjunctivitis (IKC) in domestic sheep and wild caprinae. Although this pathogen is relatively benign for domestic animals treated by antibiotics, it can lead wild animals to blindness and death. This is a major cause of death in the protected species in the Alps (e.g., Capra ibex, Rupicapra rupicapra). METHODS: The genome was sequenced using a combined technique of GS-FLX (454) and Sanger sequencing, and annotated by an automatic pipeline that we designed using several tools interconnected via PERL scripts. The resulting annotations are stored in a MySQL database. RESULTS: The annotated sequence is deposited in the EMBL database (FM864216) and uploaded into the mollicutes database MolliGen http://cbi.labri.fr/outils/molligen/ allowing for comparative genomics. CONCLUSION: We show that our automatic pipeline allows for annotating a complete mycoplasma genome and present several examples of analysis in search for biological targets (e.g., pathogenic proteins).
Abstract:
BACKGROUND: Enterococcus faecalis has emerged as a major hospital pathogen. To explore its diversity, we sequenced E. faecalis strain OG1RF, which is commonly used for molecular manipulation and virulence studies. RESULTS: The 2,739,625 base pair chromosome of OG1RF was found to contain approximately 232 kilobases unique to this strain compared to V583, the only publicly available sequenced strain. Almost no mobile genetic elements were found in OG1RF. The 64 areas of divergence were classified into three categories. First, OG1RF carries 39 unique regions, including 2 CRISPR loci and a new WxL locus. Second, we found nine replacements where a sequence specific to V583 was substituted by a sequence specific to OG1RF. For example, the iol operon of OG1RF replaces a possible prophage and the vanB transposon in V583. Finally, we found 16 regions that were present in V583 but missing from OG1RF, including the proposed pathogenicity island, several probable prophages, and the cpsCDEFGHIJK capsular polysaccharide operon. OG1RF was more rapidly but less frequently lethal than V583 in the mouse peritonitis model and considerably outcompeted V583 in a murine model of urinary tract infections. CONCLUSION: E. faecalis OG1RF carries a number of unique loci compared to V583, but the almost complete lack of mobile genetic elements demonstrates that this is not a defining feature of the species. Additionally, OG1RF's effects in experimental models suggest that mediators of virulence may be diverse between different E. faecalis strains and that virulence is not dependent on the presence of mobile genetic elements.
Abstract:
The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor "silent" germline micronuclear genome by a process of "unscrambling" and fragmentation. The tiny macronuclear "nanochromosomes" typically encode single, protein-coding genes (a small portion, 10%, encode 2-8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing studies of rearrangements arising during evolution and disease.
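The "tens of millions of telomeres per cell" figure in the abstract above can be checked with back-of-the-envelope arithmetic from the numbers it reports (∼16,000 nanochromosomes, ∼2,000 copies on average). The sketch below is only an illustration: the function name is hypothetical, and it assumes every nanochromosome copy is linear with exactly two telomeres and that the average copy number applies uniformly, neither of which the abstract states.

```python
def telomeres_per_macronucleus(n_chromosomes: int,
                               mean_copies: int,
                               ends_per_chromosome: int = 2) -> int:
    """Rough telomere count: one telomere per chromosome end per copy.

    Assumes linear chromosomes and a uniform copy number (an
    illustrative simplification, not a claim from the abstract).
    """
    return n_chromosomes * mean_copies * ends_per_chromosome


# Approximate values quoted in the abstract.
estimate = telomeres_per_macronucleus(n_chromosomes=16_000, mean_copies=2_000)
print(f"{estimate:,} telomeres")
```

Under these assumptions the estimate is 64 million, which matches the order of magnitude quoted in the abstract.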
Abstract:
Stylonychia lemnae is a classical model single-celled eukaryote, and a quintessential ciliate typified by dimorphic nuclei: a small, germline micronucleus and a massive, vegetative macronucleus. The genome within Stylonychia's macronucleus has a very unusual architecture, composed of variably and highly amplified "nanochromosomes," each usually encoding a single gene with a minimal amount of surrounding noncoding DNA. As only a tiny fraction of the Stylonychia genes has been sequenced, and to promote research using this organism, we sequenced its macronuclear genome. We report the analysis of the 50.2-Mb draft S. lemnae macronuclear genome assembly, containing in excess of 16,000 complete nanochromosomes, assembled as fewer than 20,000 contigs. We found considerable conservation of fundamental genomic properties between S. lemnae and its close relative, Oxytricha trifallax, including nanochromosomal gene synteny, alternative fragmentation, and copy number. Protein domain searches in Stylonychia revealed two new telomere-binding protein homologs and the presence of linker histones. Among the diverse histone variants of S. lemnae and O. trifallax, we found divergent, coexpressed variants corresponding to four of the five core nucleosomal proteins (H1.2, H2A.6, H2B.4, and H3.7), suggesting that these ciliates may possess specialized nucleosomes involved in genome processing during nuclear differentiation. The assembly of the S. lemnae macronuclear genome demonstrates that largely complete, well-assembled, highly fragmented genomes of similar size and complexity may be produced from one library and lane of Illumina HiSeq 2000 shotgun sequencing. The provision of the S. lemnae macronuclear genome sets the stage for future detailed experimental studies of chromatin-mediated, RNA-guided developmental genome rearrangements.
Abstract:
Hypothyroidism is a complex clinical condition found in both humans and dogs, thought to be caused by a combination of genetic and environmental factors. In this study we present a multi-breed analysis of predisposing genetic risk factors for hypothyroidism in dogs using three high-risk breeds: the Gordon Setter, Hovawart and the Rhodesian Ridgeback. Using a genome-wide association approach and meta-analysis, we identified a major hypothyroidism risk locus shared by these breeds on chromosome 12 (p = 2.1 × 10⁻¹¹). Further characterisation of the candidate region revealed a shared ~167 kb risk haplotype (4,915,018-5,081,823 bp), tagged by two SNPs in almost complete linkage disequilibrium. This breed-shared risk haplotype includes three genes (LHFPL5, SRPK1 and SLC26A8) and does not extend to the dog leukocyte antigen (DLA) class II gene cluster located in the vicinity. These three genes have not been identified as candidate genes for hypothyroid disease previously, but have functions that could potentially contribute to the development of the disease. Our results implicate the potential involvement of novel genes and pathways for the development of canine hypothyroidism, raising new possibilities for screening, breeding programmes and treatments in dogs. This study may also contribute to our understanding of the genetic etiology of human hypothyroid disease, which is one of the most common endocrine disorders in humans.
Abstract:
Ultrastructural analysis of the polydnavirus of the braconid wasp Chelonus inanitus revealed that virions consist of one cylindrical nucleocapsid enveloped by a single unit membrane. Nucleocapsids have a constant diameter of 33.7 +/- 1.4 nm and a variable length of between 8 and 46 nm. Spreading of viral DNA showed that the genome consists of circular dsDNA molecules of variable sizes and measurement of the contour lengths indicated sizes of between 7 and 31 kbp. When virions were exposed to osmotic shock conditions to release the DNA, only one circular molecule was released per particle suggesting that the various DNA molecules are singly encapsidated in this bracovirus. The viral genome was seen to consist of at least 10 different segments and the aggregate genome size is in the order of 200 kbp. By partial digestion of viral DNA with HindIII or EcoRI in the presence of ethidium bromide and subsequent ligation with HindIII-cut pSP65 or EcoRI-cut pSP64 and transfection into Escherichia coli, libraries of 103 HindIII and 23 EcoRI clones were obtained. Southern blots revealed that complete and unrearranged segments were cloned with this approach, and restriction maps for five segments were obtained. Part of a 16.8 kbp segment was sequenced, found to be AT-rich (73%) and to contain six copies of a 17 bp repeated sequence. The development of the female reproductive tract in the course of pupal-adult development of the wasp was investigated and seen to be strictly correlated with the pigmentation pattern. By the use of a semiquantitative PCR, replication of viral DNA was observed to initiate at a specific stage of pupal-adult development.
Abstract:
Classical swine fever virus (CSFV) causes a highly contagious disease in pigs that can range from a severe haemorrhagic fever to a nearly unapparent disease, depending on the virulence of the virus strain. Little is known about the viral molecular determinants of CSFV virulence. The nonstructural protein NS4B is essential for viral replication. However, the roles of CSFV NS4B in viral genome replication and pathogenesis have not yet been elucidated. NS4B of the GPE- vaccine strain and of the highly virulent Eystrup strain differ by a total of seven amino acid residues, two of which are located in the predicted trans-membrane domains of NS4B and were described previously to relate to virulence, and five residues clustering in the N-terminal part. In the present study, we examined the potential role of these five amino acids in modulating genome replication and determining pathogenicity in pigs. A chimeric low virulent GPE- -derived virus carrying the complete Eystrup NS4B showed enhanced pathogenicity in pigs. The in vitro replication efficiency of the NS4B chimeric GPE- replicon was significantly higher than that of the replicon carrying only the two Eystrup-specific amino acids in NS4B. In silico and in vitro data suggest that the N-terminal part of NS4B forms an amphipathic α-helix structure. The N-terminal NS4B with these five amino acid residues is associated with the intracellular membranes. Taken together, this is the first gain-of-function study showing that the N-terminal domain of NS4B can determine CSFV genome replication in cell culture and viral pathogenicity in pigs.
Abstract:
A complete reference genome of the Apis mellifera Filamentous virus (AmFV) was determined using Illumina Hiseq sequencing. The AmFV genome is a double stranded DNA molecule of approximately 498,500 nucleotides with a GC content of 50.8%. It encompasses 247 non-overlapping open reading frames (ORFs), equally distributed on both strands, which cover 65% of the genome. While most of the ORFs lacked threshold sequence alignments to reference protein databases, twenty-eight were found to display significant homologies with proteins present in other large double stranded DNA viruses. Remarkably, 13 ORFs had strong similarity with typical baculovirus domains such as PIFs (per os infectivity factor genes: pif-1, pif-2, pif-3 and p74) and BRO (Baculovirus Repeated Open Reading Frame). The putative AmFV DNA polymerase is of type B, but is only distantly related to those of the baculoviruses. The ORFs encoding proteins involved in nucleotide metabolism had the highest percent identity to viral proteins in GenBank. Other notable features include the presence of several collagen-like, chitin-binding, kinesin and pacifastin domains. Due to the large size of the AmFV genome and the inconsistent affiliation with other large double stranded DNA virus families infecting invertebrates, AmFV may belong to a new virus family.
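The two summary statistics quoted in the abstract above (GC content and the share of the genome covered by ORFs) are simple ratios. The following is a minimal sketch of how such figures could be computed; the function names and the toy inputs are hypothetical and are not taken from the AmFV study or its pipeline.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return gc / (gc + at)


def coding_fraction(orf_intervals, genome_length: int) -> float:
    """Fraction of the genome covered by ORFs.

    orf_intervals holds (start, end) pairs, 1-based and inclusive,
    and is assumed to be non-overlapping, as the abstract describes.
    """
    covered = sum(end - start + 1 for start, end in orf_intervals)
    return covered / genome_length


# Toy data for illustration only.
print(round(gc_content("ATGCGC"), 3))
print(coding_fraction([(1, 30), (51, 85)], genome_length=100))
```

Ambiguous bases such as N are ignored in the GC ratio here; that is one common convention, and other tools may count them in the denominator.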
Abstract:
Ciliates have evolved highly complex and intricately controlled pathways to ensure the precise and complete removal of all genomic sequences not required for vegetative growth. At the same time, they retain a reference copy of all their genetic information for future generations. This chapter describes how different ciliates use RNA-mediated DNA comparison processes to form new somatic nuclei from germline nuclei. While these processes vary in their precise mechanisms, they all use RNA to target genomic DNA sequences—either for retention or elimination. They also all consist of more than one individual pathway acting cooperatively—the two subsets of small RNAs in Paramecium and the guide RNAs and Piwi-interacting RNAs in Oxytricha—to ensure a strong belt-and-braces approach to consistent and precise somatic nucleus development. Nonetheless, this genome comparison approach to somatic nucleus development provides an elegant method for trans-generational environmental adaptation. Conceptually, it is easy to imagine how somatic changes that occur during vegetative growth could be transferred to meiotic offspring, while an unaltered germline genome is retained. Further research in this area will have far-reaching implications for the trans-generational adaptation of more distantly related eukaryotes, such as humans.
Abstract:
In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises a multiple-testing problem and produces false-positive results. Although this problem can be dealt with effectively through approaches such as Bonferroni correction, permutation testing and false discovery rates, the joint effects of several genes, each with a weak effect, may remain undetected. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations.
In this study, we take two steps to achieve the goal. First, we selected 1000 SNPs through an effective filter method, and then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square tests to look at the relationship between each SNP and disease from another point of view.
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that exhaustive search of small subsets of one, two or three SNPs, based on the best 100 composite 2-SNPs, can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter because of overfitting from observing more complex subset states.
Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, in contrast to classification accuracy. Our four studies suggest that the Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome, and that its ability to detect the target status is superior to that of traditional LDA in this study.
From our results we can see that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4.
A further genome-wide association study through chi-square test shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham Heart Study of CVD. In the WTCCC study, only two significant SNPs associated with CAD were detected. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through chi-square test at the cut-off value 1.11E-07.
Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available for our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability, and that SNPs with good discriminant power are not necessarily causal markers for the disease.
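The abstract above relies on two quantities that are easy to illustrate: the harmonic mean of sensitivity and specificity (HMSS), and a genome-wide significance cut-off. The sketch below is an illustration only, not the thesis code. The function names and toy counts are hypothetical, and the Bonferroni-style cut-off is an assumption: the abstract does not say how its cut-offs were derived, so the alpha and number of tests used here are placeholders.

```python
def hmss(tp: int, fn: int, tn: int, fp: int) -> float:
    """Harmonic mean of sensitivity and specificity from confusion counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")
    sensitivity = tp / (tp + fn)   # accuracy among cases
    specificity = tn / (tn + fp)   # accuracy among controls
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)


def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Plain classification accuracy over all observations."""
    return (tp + tn) / (tp + fn + tn + fp)


def bonferroni_cutoff(alpha: float, n_tests: int) -> float:
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


# Imbalanced toy data: 10 cases, 90 controls, and a classifier that
# labels everyone a control. Accuracy looks good; HMSS exposes the failure.
print(accuracy(tp=0, fn=10, tn=90, fp=0))
print(hmss(tp=0, fn=10, tn=90, fp=0))

# Hypothetical alpha and SNP count, chosen only to show the scale.
print(bonferroni_cutoff(alpha=0.05, n_tests=500_000))
```

In the toy example the trivial classifier reaches 0.9 accuracy but an HMSS of 0, which is the property that makes HMSS usable on imbalanced data without resampling.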
Abstract:
The pufferfish Fugu rubripes has a genome ≈7.5 times smaller than that of mammals but with a similar number of genes. Although conserved synteny has been demonstrated between pufferfish and mammals across some regions of the genome, there is some controversy as to what extent Fugu will be a useful model for the human genome, e.g., [Gilley, J., Armes, N. & Fried, M. (1997) Nature (London) 385, 305–306]. We report extensive conservation of synteny between a 1.5-Mb region of human chromosome 11 and <100 kb of the Fugu genome in three overlapping cosmids. Our findings support the idea that the majority of DNA in the region of human chromosome 11p13 is intergenic. Comparative analysis of three unrelated genes with quite different roles, WT1, RCN1, and PAX6, has revealed differences in their structural evolution. Whereas the human WT1 gene can generate 16 protein isoforms via a combination of alternative splicing, RNA editing, and alternative start site usage, our data predict that Fugu WT1 is capable of generating only two isoforms. This raises the question of the extent to which the evolution of WT1 isoforms is related to the evolution of the mammalian genitourinary system. In addition, this region of the Fugu genome shows a much greater overall compaction than usual but with significant noncoding homology observed at the PAX6 locus, implying that comparative genomics has identified regulatory elements associated with this gene.
Abstract:
Multiple-complete-digest mapping is a DNA mapping technique based on complete-restriction-digest fingerprints of a set of clones that provides highly redundant coverage of the mapping target. The maps assembled from these fingerprints order both the clones and the restriction fragments. Maps are coordinated across three enzymes in the examples presented. Starting with yeast artificial chromosome contigs from the 7q31.3 and 7p14 regions of the human genome, we have produced cosmid-based maps spanning more than one million base pairs. Each yeast artificial chromosome is first subcloned into cosmids at a redundancy of ×15–30. Complete-digest fragments are electrophoresed on agarose gels, poststained, and imaged on a fluorescent scanner. Aberrant clones that are not representative of the underlying genome are rejected in the map construction process. Almost every restriction fragment is ordered, allowing selection of minimal tiling paths with clone-to-clone overlaps of only a few thousand base pairs. These maps demonstrate the practicality of applying the experimental and software-based steps in multiple-complete-digest mapping to a target of significant size and complexity. We present evidence that the maps are sufficiently accurate to validate both the clones selected for sequencing and the sequence assemblies obtained once these clones have been sequenced by a “shotgun” method.
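One basic step behind fingerprint-based mapping as described above is deciding whether two clones overlap by counting restriction fragments of matching size. The sketch below is a simplified illustration and not the authors' software: the function name, the 1% size tolerance, and the toy fragment sizes are all assumptions.

```python
def shared_fragments(fp_a, fp_b, tolerance: float = 0.01) -> int:
    """Count fragments of matching size in two complete-digest fingerprints.

    Each fingerprint is a list of fragment sizes in base pairs. Two
    fragments match when their sizes differ by at most tolerance times
    the larger size. Sorted lists are walked in parallel, so each
    fragment is matched at most once.
    """
    a, b = sorted(fp_a), sorted(fp_b)
    i = j = shared = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tolerance * max(a[i], b[j]):
            shared += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return shared


# Toy fingerprints (bp) for two hypothetical cosmid clones.
clone_a = [1200, 3400, 5100, 800]
clone_b = [3410, 5090, 2200, 950]
print(shared_fragments(clone_a, clone_b))
```

A real map builder would also have to weigh chance matches between unrelated fragments and combine evidence across the three enzymes; this sketch covers only the pairwise comparison for one enzyme.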
Abstract:
A strategy for cloning and mutagenesis of an infectious herpesvirus genome is described. The mouse cytomegalovirus genome was cloned and maintained as a 230 kb bacterial artificial chromosome (BAC) in E. coli. Transfection of the BAC plasmid into eukaryotic cells led to a productive virus infection. The feasibility of introducing targeted mutations into the BAC-cloned virus genome was shown by mutation of the immediate-early 1 gene and generation of a mutant virus. Thus, the complete construction of a mutant herpesvirus genome can now be carried out in a controlled manner prior to the reconstitution of infectious progeny. The described approach should be generally applicable to the mutagenesis of genomes of other large DNA viruses.