18 results for Genome sequencing

in Helda - Digital Repository of University of Helsinki


Relevance:

60.00%

Publisher:

Abstract:

The first glycyl radical in an enzyme was described 20 years ago and since then the family of glycyl radical enzymes (GREs) has expanded to include enzymes catalysing five chemically distinct reactions. The type enzymes of the family, anaerobic ribonucleotide reductase (RNRIII) and pyruvate formate lyase (PFL), had been studied long before it was known that they are GREs. Spectroscopic measurements on the radical and an observation that exposure to oxygen irreversibly inactivates the enzymes by cleavage of the protein proved that the radical is located on a particular glycine residue, close to the C-terminus of the protein. Both anaerobic RNRIII and PFL are important for many anaerobic and facultative anaerobic bacteria, as RNRIII is responsible for the synthesis of DNA precursors and PFL catalyses a key metabolic reaction in glycolysis. The crystal structures of both were solved in 1999 and they revealed that, although the enzymes do not share significant sequence identity, they share a similar structure - the radical site and residues necessary for catalysis are buried inside a ten-stranded α/β-barrel. GREs are synthesised in an inactive form and are post-translationally activated by an activating enzyme which uses S-adenosyl methionine and an iron-sulphur cluster to generate the radical. One of the goals of this thesis work was to crystallise the activating enzyme of PFL. This task is challenging as, like GREs, the activating component is inactivated by oxygen. The experiments were therefore carried out in an oxygen-free atmosphere. This is the first report of a crystalline GRE activating enzyme. Recently several new GREs have been characterised, all sharing sequence similarity to PFL but not to RNRIII. Also, the genome sequencing projects have identified many PFL-like GREs of unknown function, usually annotated as PFLs. 
In the present thesis I describe the grouping of these PFL family enzymes based on the sequence similarity and analyse the conservation patterns when compared to the structure of E. coli PFL. Based on this information an activation route is proposed. I also report a crystal structure of one of the PFL-like enzymes with unknown function, PFL2 from Archaeoglobus fulgidus. As A. fulgidus is a hyperthermophilic organism, possible mechanisms stabilising the structure are discussed. The organisation of an active site of PFL2 suggests that the enzyme may be a dehydratase. Keywords: glycyl radical, enzyme, pyruvate formate lyase, x-ray crystallography, bioinformatics

Relevance:

60.00%

Publisher:

Abstract:

Growth is a fundamental aspect of the life cycle of all organisms. Body size varies greatly in most animal groups, such as mammals. Moreover, growth of a multicellular organism is not a uniform enlargement of size; different body parts and organs grow to their characteristic sizes at different times. Currently very little is known about the molecular mechanisms governing this organ-specific growth. The genome sequencing projects have provided complete genomic DNA sequences of several species over the past decade. The amount of genomic sequence information, including sequence variants within species, is constantly increasing. Based on the universal genetic code, we can make sense of this sequence information as far as it codes for proteins. However, less is known about the molecular mechanisms that control expression of genes, and about the variations in gene expression that underlie many pathological states in humans. This is caused in part by a lack of information about the second genetic code, which consists of the binding specificities of transcription factors and the combinatorial code by which transcription factor binding sites are assembled to form tissue-specific and/or ligand-regulated enhancer elements. This thesis presents a high-throughput assay for identification of transcription factor binding specificities, which was then used to measure the DNA binding profiles of transcription factors involved in growth control. We developed 'enhancer element locator', a computational tool that can be used to predict functional enhancer elements. A genome-wide prediction of human and mouse enhancer elements generated a large database of enhancer elements. This database can be used to identify target genes of signaling pathways, and to predict activated transcription factors based on changes in gene expression. 
Predictions validated in transgenic mouse embryos revealed the presence of multiple tissue-specific enhancers in mouse c- and N-Myc genes, which has implications for organ-specific growth control and tumor type specificity of oncogenes. Furthermore, we were able to map a single-nucleotide variation that confers susceptibility to colorectal cancer to an enhancer element, and we propose a mechanism by which this SNP might be involved in the generation of colorectal cancer.

Relevance:

60.00%

Publisher:

Abstract:

Extraintestinal pathogenic Escherichia coli (ExPEC) represent a diverse group of strains of E. coli, which infect extraintestinal sites, such as the urinary tract, the bloodstream, the meninges, the peritoneal cavity, and the lungs. Urinary tract infections (UTIs) caused by uropathogenic E. coli (UPEC), the major subgroup of ExPEC, are among the most prevalent microbial diseases worldwide and a substantial burden for public health care systems. UTIs are responsible for serious morbidity and mortality in the elderly, in young children, and in immune-compromised and hospitalized patients. ExPEC strains are different, both from genetic and clinical perspectives, from commensal E. coli strains belonging to the normal intestinal flora and from intestinal pathogenic E. coli strains causing diarrhea. ExPEC strains are characterized by a broad range of alternate virulence factors, such as adhesins, toxins, and iron accumulation systems. Unlike diarrheagenic E. coli, whose distinctive virulence determinants evoke characteristic diarrheagenic symptoms and signs, ExPEC strains are exceedingly heterogeneous and are known to possess no specific virulence factor or set of factors that is obligatory for the infection of a certain extraintestinal site (e.g. the urinary tract). The ExPEC genomes are highly diverse mosaic structures in permanent flux. These strains have obtained a significant amount of DNA (predictably up to 25% of the genomes) through acquisition of foreign DNA from diverse related or non-related donor species by lateral transfer of mobile genetic elements, including pathogenicity islands (PAIs), plasmids, phages, transposons, and insertion elements. The ability of ExPEC strains to cause disease is mainly derived from this horizontally acquired gene pool; the exogenous DNA facilitates rapid adaptation of the pathogen to changing conditions and hence extends the spectrum of sites that can be infected. 
However, neither the amount of unique DNA in different ExPEC strains (or UPEC strains) nor the mechanisms lying behind the observed genomic mobility are known. Due to this extreme heterogeneity of the UPEC and ExPEC populations in general, the routine surveillance of ExPEC is exceedingly difficult. In this project, we presented a novel virulence gene algorithm (VGA) for the estimation of the extraintestinal virulence potential (VP, pathogenicity risk) of clinically relevant ExPECs and fecal E. coli isolates. The VGA was based on a DNA microarray specific for the ExPEC phenotype (ExPEC pathoarray). This array contained 77 DNA probes homologous with known (e.g. adhesion factors, iron accumulation systems, and toxins) and putative (e.g. genes predictably involved in adhesion, iron uptake, or in metabolic functions) ExPEC virulence determinants. In total, 25 of the DNA probes homologous with known virulence factors and 36 of the DNA probes representing putative extraintestinal virulence determinants were found at significantly higher frequency in virulent ExPEC isolates than in commensal E. coli strains. We showed that the ExPEC pathoarray and the VGA could be readily used to differentiate highly virulent ExPECs both from less virulent ExPEC clones and from commensal E. coli strains. Implementing the VGA in a group of unknown ExPECs (n=53) and fecal E. coli isolates (n=37), 83% of strains were correctly identified as extraintestinal virulent or commensal E. coli. Conversely, 15% of clinical ExPECs and 19% of fecal E. coli strains failed to group into their respective pathogenic and non-pathogenic groups. Clinical data and virulence gene profiles of these strains warranted the estimated VPs; UPEC strains with atypically low risk-ratios were largely isolated from patients with a particular medical history, including diabetes mellitus or catheterization, or from elderly patients. In addition, fecal E. coli strains with VPs characteristic for ExPEC were shown to represent the diagnostically important fraction of resident strains of the gut flora with a high potential of causing extraintestinal infections. Interestingly, a large fraction of DNA probes associated with the ExPEC phenotype corresponded to novel DNA sequences without any known function in UTIs and thus represented new genetic markers for extraintestinal virulence. These DNA probes included unknown DNA sequences originating from the genomic subtractions of four clinical ExPEC isolates as well as from five novel cosmid sequences identified in the UPEC strains HE300 and JS299. The characterized cosmid sequences (pJS332, pJS448, pJS666, pJS700, and pJS706) revealed complex modular DNA structures with known and unknown DNA fragments arranged in a puzzle-like manner and integrated into the common E. coli genomic backbone. Furthermore, cosmid pJS332 of the UPEC strain HE300, which carried a chromosomal virulence gene cluster (iroBCDEN) encoding the salmochelin siderophore system, was shown to be part of a transmissible plasmid of Salmonella enterica. Taken together, the results of this project pointed towards the assumptions that (i) homologous recombination, even within coding genes, contributes to the observed mosaicism of ExPEC genomes and (ii) besides en bloc transfer of large DNA regions (e.g. chromosomal PAIs), rearrangements of small DNA modules also provide a means of genomic plasticity. The data presented in this project supplemented previous whole genome sequencing projects of E. coli and indicated that each E. coli genome displays a unique assemblage of individual mosaic structures, which enable these strains to successfully colonize and infect different anatomical sites.
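
As a rough consistency check on the classification figures quoted in this abstract (not part of the original analysis), the per-group error rates can be converted back into approximate strain counts. The sketch below assumes that the 15% and 19% figures refer to the 53 unknown ExPECs and the 37 fecal isolates respectively; the function name and the rounding to whole strains are illustrative assumptions.

```python
def misclassified_counts(group_sizes, error_rates):
    """Convert per-group error rates into approximate whole-strain counts."""
    return [round(n * rate) for n, rate in zip(group_sizes, error_rates)]

group_sizes = [53, 37]      # unknown ExPECs, fecal E. coli isolates (from the abstract)
error_rates = [0.15, 0.19]  # reported fractions that did not fall into their expected group

wrong = misclassified_counts(group_sizes, error_rates)
total = sum(group_sizes)
accuracy = (total - sum(wrong)) / total  # overall fraction correctly identified
print(wrong, f"{accuracy:.1%}")
```

Under these assumptions the overall fraction of correctly identified strains rounds to the 83% stated in the abstract.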

Relevance:

60.00%

Publisher:

Abstract:

Bipolar disorder (BP) is a complex psychiatric disorder characterized by episodes of mania and depression. BP affects approximately 1% of the world’s population and shows no difference in lifetime prevalence between males and females. BP arises from complex interactions among genetic, developmental and environmental factors, and it is likely that several predisposing genes are involved in BP. The genetic background of BP is still poorly understood, although intensive and long-lasting research has identified several chromosomal regions and genes involved in susceptibility to BP. This thesis work aims to identify the genetic variants that influence bipolar disorder in the Finnish population by candidate gene and genome-wide linkage analyses in families with many BP cases. In addition to diagnosis-based phenotypes, neuropsychological traits that can be seen as potential endophenotypes or intermediate traits for BP were analyzed. In the first part of the thesis, we examined the role of the allelic variants of the TSNAX/DISC1 gene cluster to psychotic and bipolar spectrum disorders and found association of distinct allelic haplotypes with these two groups of disorders. The haplotype at the 5’ end of the Disrupted-in-Schizophrenia-1 gene (DISC1) was over-transmitted to males with psychotic disorder (p = 0.008; for an extended haplotype p = 0.0007 with both genders), whereas haplotypes at the 3’ end of DISC1 associated with bipolar spectrum disorder (p = 0.0002; for an extended haplotype p = 0.0001). The variants of these haplotypes also showed association with different cognitive traits. The haplotypes at the 5’ end associated with perseverations and auditory attention, while the variants at the 3’ end associated with several cognitive traits including verbal fluency and psychomotor processing speed. 
Second, in our complete set of BP families with 723 individuals we studied six functional candidate genes from three distinct signalling systems: serotonin-related genes (SLC6A4 and TPH2), BDNF-related genes (BDNF, CREB1 and NTRK2) and one gene related to the inflammation and cytokine system (P2RX7). We replicated association of the functional variant Val66Met of BDNF with BP and better performance in retention. The variants at the 5’ end of SLC6A4 also showed some evidence of association among males (p = 0.004), but the widely studied functional variants did not yield any significant results. A protective four-variant haplotype on P2RX7 showed evidence of association with BP and executive functions: semantic and phonemic fluency (p = 0.006 and p = 0.0003, respectively). Third, we analyzed 23 bipolar families originating from the North-Eastern region of Finland. A genome-wide scan was performed using the 6K single nucleotide polymorphism (SNP) array. We identified susceptibility loci at chromosome 7q31 with a LOD score of 3.20 and at 9p13.1 with a LOD score of 4.02. We followed up both linkage findings in the complete set of 179 Finnish bipolar families. The finding on chromosome 9p13 was supported (maximum LOD score of 3.02), but the susceptibility gene itself remains to be identified. In the fourth part of the thesis, we wanted to test the role of the allelic variants that have been associated with bipolar disorder in recent genome-wide association studies (GWAS). We could confirm findings for the DFNB31, SORCS2, SLC39A3, and DGKH genes. The best signal in this study comes from DFNB31, which remained significant after multiple testing corrections. Two variants of SORCS2 were allelic replications and presented the same signal as the haplotype analysis. However, no association was detected with the PALB2 gene, which was the most significantly associated region in the previous GWAS. 
Our results indicate that BP is heterogeneous and its genetic background may accordingly vary in different populations. In order to fully understand the allelic heterogeneity that underlies common diseases such as BP, complete genome sequencing for many individuals with and without the disease is required. Identification of the specific risk variants will help us better understand the pathophysiology underlying BP and will lead to the development of treatments with specific biochemical targets. In addition, it will further facilitate the identification of environmental factors that alter risk, which will potentially provide improved occupational, social and psychological advice for individuals with high risk of BP.
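
For readers unfamiliar with the LOD scores quoted in this abstract: a LOD score is the base-10 logarithm of the likelihood ratio for linkage versus no linkage, so it can be converted back into odds by raising 10 to that power. The following is a minimal sketch of that conversion only; the function name is hypothetical and nothing here reproduces the linkage analysis itself.

```python
def lod_to_odds(lod):
    """Return the likelihood ratio (odds in favour of linkage) implied by a LOD score."""
    return 10 ** lod

# LOD scores as quoted in the abstract; the labels are for display only.
for locus, lod in [("7q31", 3.20), ("9p13.1", 4.02), ("9p13 follow-up", 3.02)]:
    print(f"{locus}: LOD {lod:.2f} -> odds of about {lod_to_odds(lod):,.0f} to 1")
```

A LOD score of 3 therefore corresponds to odds of 1000 to 1 in favour of linkage.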

Relevance:

40.00%

Publisher:

Abstract:

The growing interest in higher-throughput sequencing over the last decade has led to the development of new sequencing applications. This thesis concentrates on optimizing DNA library preparation for the Illumina Genome Analyzer II sequencer. The library preparation steps that were optimized include fragmentation, PCR purification and quantification. DNA fragmentation was performed with focused sonication at different concentrations and durations. Two column-based PCR purification methods, a gel matrix method and a magnetic bead-based method were compared. Quantitative PCR and gel electrophoresis in a chip were compared for DNA quantification. The magnetic bead purification was found to be the most efficient and flexible purification method. The fragmentation protocol was changed to produce longer fragments to be compatible with longer sequencing reads. Quantitative PCR correlates better with the cluster number and should thus be considered the default quantification method for sequencing. As a result of this study, more data have been acquired from sequencing at lower cost, and troubleshooting has become easier as qualification steps have been added to the protocol. New sequencing instruments and applications will create a demand for further optimizations in the future.

Relevance:

20.00%

Publisher:

Abstract:

Colorectal cancer (CRC) is the third most common cancer in Finland. Of all CRC tumors, 15% display microsatellite instability (MSI) caused by defective cellular mismatch repair. Cells displaying MSI accumulate a high number of mutations genome-wide, especially in short repeat areas, microsatellites. When targeting genes essential for cell growth or death, MSI can promote tumorigenesis. In non-coding areas, microsatellite mutations are generally considered passenger events. Since the discovery of MSI and its linkage to cancer, more than 200 genes have been investigated for a role in MSI tumorigenesis. Although various criteria have been suggested for MSI target gene identification, the challenge has been to distinguish driver mutations from passenger mutations. This study aimed to clarify these key issues in the research field of MSI cancer. Prior to this, the background mutation rate in MSI cancer had not been studied on a large scale. We investigated the background mutation rate in MSI CRC by analyzing the spectrum of microsatellite mutations in non-coding areas. First, semenogelin I was studied for a possible role in MSI carcinogenesis. The intronic T9 repeat of semenogelin I was frequently mutated but no evidence for selection during tumorigenesis was obtained. Second, a sequencing approach was utilized to evaluate the general background mutation rate in MSI CRC. Both intronic and intergenic repeats harbored extremely high mutation rates of up to 87%, and intergenic repeats were more unstable than the intronic repeats. As mutation rates of presumably neutral microsatellites can be high in MSI CRC in the absence of apparent selection pressure, high mutation frequency alone is not sufficient evidence for identification of driver MSI target genes. Next, an unbiased approach was designed to identify the mutatome of MSI CRC. By combining expression array data and a database search we identified novel genes possibly related to MSI CRC carcinogenesis. 
One of the genes was studied further. In the functional analysis this gene was observed to cause an abnormal cancer-prone cellular phenotype, possibly through altered responses to DNA damage. In our recent study, smooth muscle myosin heavy chain 11 (MYH11) was identified as a novel MSI CRC gene. Additionally, MYH11 has a well-established role in acute myeloid leukemia (AML) through an oncogenic fusion protein CBFB-MYH11. We further investigated the role of MYH11 in AML by sequencing. Three novel missense variants of MYH11 were identified. None of the variants were present in the population-based control material. One of the identified variants, V71A, lies in the N-terminal SH3-like domain of MYH11, whose function is unknown. The other two variants, K1059E and R1792Q, are located in the coiled-coil myosin rod essential for the regulation and filament formation of MYH11. The variant K1059E lies in close proximity to K1044N, which has been functionally assessed in our earlier work on CRC and has been reported to cause total loss of MYH11 protein regulation. As the functional significance of the three novel variants examined in this work remains unknown, future studies should clarify the role of MYH11 in AML leukaemogenesis and in other malignancies.

Relevance:

20.00%

Publisher:

Abstract:

Nemaline myopathy (NM) is a rare muscle disorder characterised by muscle weakness and nemaline bodies in striated muscle tissue. Nemaline bodies are derived from sarcomeric Z discs and may be detected by light microscopy. The disease can be divided into six subclasses varying from very severe, in some cases lethal, forms to milder forms. NM is usually the consequence of a gene mutation and the mode of inheritance varies between NM subclasses and different families. Mutations in six genes are known to cause NM: nebulin (NEB), alpha-actin, alpha-tropomyosin (TPM3), troponin T1, beta-tropomyosin (TPM2) and cofilin 2, of which nebulin and alpha-actin are the most common. One of the main interests of my research is NEB. Nebulin is a giant muscle protein (600-900 kDa) expressed mainly in the thin filaments of striated muscle. Mutations in NEB are the main cause of autosomal recessive NM. The gene consists of 183 exons. Being so large, NEB is very challenging to investigate. NEB was screened for mutations using denaturing High Performance Liquid Chromatography (dHPLC) and sequencing. DNA samples from 44 families were included in this study, and we found and published 45 different mutations in them. To date, we have identified 115 mutations in NEB in a total of 96 families. In addition, we determined the occurrence in a world-wide sample cohort of a 2.5 kb deletion containing NEB exon 55 identified in the Ashkenazi Jewish population. In order to find the seventh putative NM gene, a genome-wide linkage study was performed in a series of Turkish families. In two of these families, we identified a homozygous mutation disrupting the termination signal of the TPM3 gene, a previously known NM-causing gene. This mutation is likely a founder mutation in the Turkish population. In addition, we described a novel recessively inherited distal myopathy, named distal nebulin myopathy, caused by two different homozygous missense mutations in NEB in six Finnish patients. 
Both mutations, when combined in compound heterozygous form with a more disruptive mutation, are known to cause NM. This study consisted of molecular genetic mutation analyses, light and electron microscopic studies of muscle biopsies, muscle imaging and clinical examination of patients. In these patients the distribution of muscle weakness was different from NM. Nemaline bodies were not detectable with routine light microscopy, and they were inconspicuous or absent even using electron microscopy. No genetic cause was known to underlie cap myopathy, a congenital myopathy characterised by cap-like structures in the muscle fibres, until we identified a deletion of one codon of the TPM2 gene, in a 30-year-old cap myopathy patient. This mutation does not change the reading frame of the gene, but a deletion of one amino acid does affect the conformation of the protein produced. In summary, this thesis describes a novel distal myopathy caused by mutations in the nebulin gene, several novel nebulin mutations associated with nemaline myopathy, the first molecular genetic cause of cap myopathy, i.e. a mutation in the beta-tropomyosin gene, and a founder mutation in the alpha-tropomyosin gene underlying autosomal recessive nemaline myopathy in the Turkish population.

Relevance:

20.00%

Publisher:

Abstract:

Positional cloning has enabled hypothesis-free, genome-wide scans for genetic factors contributing to disorders or traits. Traditionally linkage analysis has been used to identify regions of interest, followed by meticulous fine mapping and candidate gene screening using association methods and finally sequencing of regions of interest. More recently, genome-wide association analysis has enabled a more direct approach to identify specific genetic variants explaining a part of the variance of the phenotype of interest. Autism spectrum disorders (ASDs) are a group of childhood onset neuropsychiatric disorders with shared core symptoms but varying severity. Although a strong genetic component has been established in ASDs, genetic susceptibility factors have largely eluded characterization. Here, we have utilized modern molecular genetic methods combined with the advantages provided by the special population structure in Finland to identify genetic risk factors for ASDs. The results of this study show that numerous genetic risk factors exist for ASDs even within a population isolate. Stratification based on clinical phenotype gave encouraging results, as previously identified linkage to 3p14-p24 was replicated in an independent set of families with Asperger syndrome, but no other ASDs. Fine-mapping of the previously identified linkage peak for ASDs at 3q25-q27 revealed association between autism and a subunit of the 5-hydroxytryptamine receptor 3C (HTR3C). We also used dense, genome-wide single nucleotide polymorphism (SNP) data to characterize the population structure of Finns. We observed significant population substructure which correlates with the known history of multiple consecutive bottlenecks experienced by the Finnish population. We used this information to ascertain a genetically homogeneous subset of autism families to identify possible rare, enriched risk variants using genome-wide SNP data. 
No rare enriched genetic risk factors were identified in this dataset, although a subset of families could be genealogically linked to form two extended pedigrees. The lack of founder mutations in this isolated population suggests that the majority of genetic risk factors are rare, de novo mutations unique to individual nuclear families. The results of this study are consistent with others in the field. The underlying genetic architecture for this group of disorders appears highly heterogeneous, with common variants accounting for only a subset of genetic risk. The majority of identified risk factors have turned out to be exceedingly rare, and only explain a subset of the genetic risk in the general population in spite of their high penetrance within individual families. The results of this study, together with other results obtained in this field, indicate that family specific linkage, homozygosity mapping and resequencing efforts are needed to identify these rare genetic risk factors.

Relevance:

20.00%

Publisher:

Abstract:

The studies presented in this thesis contribute to the understanding of the evolutionary ecology of three major viruses threatening cultivated sweetpotato (Ipomoea batatas Lam) in East Africa: Sweet potato feathery mottle virus (SPFMV; genus Potyvirus; Potyviridae), Sweet potato chlorotic stunt virus (SPCSV; genus Crinivirus; Closteroviridae) and Sweet potato mild mottle virus (SPMMV; genus Ipomovirus; Potyviridae). The viruses were serologically detected and the positive results confirmed by RT-PCR and sequencing. SPFMV was detected in 24 wild plant species of the family Convolvulaceae (genera Ipomoea, Lepistemon and Hewittia), of which 19 species were new natural hosts for SPFMV. SPMMV and SPCSV were detected in wild plants belonging to 21 and 12 species (genera Ipomoea, Lepistemon and Hewittia), respectively, all of which were previously unknown to be natural hosts of these viruses. SPFMV was the most abundant virus, being detected in 17% of the plants, while SPMMV and SPCSV were detected in 9.8% and 5.4% of the assessed plants, respectively. Wild plants in Uganda were infected with the East African (EA), common (C), and the ordinary (O) strains, or co-infected with the EA and the C strain of SPFMV. The viruses and virus-like diseases were more frequent in the eastern agro-ecological zone than the western and central zones, which contrasted with known incidences of these viruses in sweetpotato crops, except for the northern zone, where incidences were lowest in wild plants as in sweetpotato. The NIb/CP junction in SPMMV was determined experimentally, which facilitated CP-based phylogenetic and evolutionary analyses of SPMMV. Isolates of all three viruses from wild plants were genetically similar to those found in cultivated sweetpotatoes in East Africa. There was no evidence of host-driven population genetic structures, suggesting frequent transmission of these viruses between their wild and cultivated hosts. 
The p22 RNA silencing suppressor-encoding sequence was absent in a few SPCSV isolates, but regardless of this, SPCSV isolates incited sweet potato virus disease (SPVD) in sweetpotato plants co-infected with SPFMV, indicating that p22 is redundant for synergism between SPCSV and SPFMV. Molecular evolutionary analysis revealed that isolates of strain EA of SPFMV, which is largely restricted geographically to East Africa, experience frequent recombination in comparison to isolates of strain C, which is globally distributed. Moreover, non-homologous recombination events between strains EA and C were rare, despite frequent co-infections of these strains in wild plants, suggesting purifying selection against non-homologous recombinants between these strains or that such recombinants are mostly not infectious. Recombination was also detected in the 5'- and 3'-proximal regions of the SPMMV genome, providing the first evidence of recombination in genus Ipomovirus, but no recombination events were detected in the characterized genomic regions of SPCSV. Strong purifying selection was implicated in the evolution of the majority of amino acids of the proteins encoded by the analyzed genomic regions of SPFMV, SPMMV and SPCSV. However, positive selection was predicted on 17 amino acids distributed over the whole coat protein (CP) in the globally distributed strain C, as compared to only 4 amino acids in the multifunctional CP N-terminus (CP-NT) of strain EA, which is largely restricted geographically to East Africa. A few amino acid sites in the N-terminus of SPMMV P1, the p7 protein and RNA silencing suppressor proteins p22 and RNase3 of SPCSV were also subject to positive selection. Positively selected amino acids may constitute ligand-binding domains that determine interactions with plant host and/or insect vector factors. 
The P1 proteinase of SPMMV (genus Ipomovirus) seems to respond to needs of adaptation, which was not observed with the helper component proteinase (HC-Pro) of SPMMV, although the HC-Pro is responsible for many important molecular interactions in genus Potyvirus. Because the centre of origin of cultivated sweetpotato is in the Americas, from where the crop was dispersed to other continents in recent history (except for the Australasia and South Pacific region), it would be expected that identical viruses and their strains occur worldwide, presuming virus dispersal with the host. Apparently, this is not the case with SPMMV, the strain EA of SPFMV and the strain EA of SPCSV, which are largely geographically confined to East Africa, where they are predominant and occur both in natural and agro-ecosystems. The geographical distribution of plant viruses is constrained more by virus-vector relations than by virus-host interactions, which, together with the wide range of natural host species and the geographical confinement to East Africa, suggests that these viruses existed in East African wild plants before the introduction of sweetpotato. Consequently, these studies provide compelling evidence that East Africa constitutes a cradle of SPFMV strain EA, SPCSV strain EA, and SPMMV. Therefore, sweet potato virus disease (SPVD) in East Africa may be one of the examples of damaging virus diseases resulting from exchange of viruses between introduced crops and indigenous wild plant species. Keywords: Convolvulaceae, East Africa, epidemiology, evolution, genetic variability, Ipomoea, recombination, SPCSV, SPFMV, SPMMV, selection pressure, sweetpotato, wild plant species Author's Address: Arthur K. Tugume, Department of Agricultural Sciences, Faculty of Agriculture and Forestry, University of Helsinki, Latokartanonkaari 7, P.O Box 27, FIN-00014, Helsinki, Finland. Email: tugume.arthur@helsinki.fi Author's Present Address: Arthur K. Tugume, Department of Botany, Faculty of Science, Makerere University, P.O. Box 7062, Kampala, Uganda. Email: aktugume@botany.mak.ac.ug, tugumeka@yahoo.com

Relevance:

20.00%

Publisher:

Abstract:

My work describes two sectors of the human bacterial environment: 1. The sources of exposure to infectious non-tuberculous mycobacteria. 2. Bacteria in dust, reflecting the airborne bacterial exposure in environments protecting from or predisposing to allergic disorders. Non-tuberculous mycobacteria (NTM) transmit to humans and animals from the environment. Infection by NTM in Finland has increased during the past decade beyond that by Mycobacterium tuberculosis. Among the farm animals, porcine mycobacteriosis is the predominant NTM disease in Finland. Symptoms of mycobacteriosis are found in 0.34 % of slaughtered pigs. Soil and drinking water are suspected as sources for humans and bedding materials for pigs. To achieve quantitative data on the sources of human and porcine NTM exposure, methods for quantitation of environmental NTM are needed. We developed a quantitative real-time PCR (qPCR) method, utilizing primers targeted at the 16S rRNA gene of the genus Mycobacterium. With this method, I found high contents of mycobacterial DNA, 10^6 to 10^7 genome equivalents per gram, in Finnish sphagnum peat, sandy soils and mud. A similar result was obtained by a method based on the Mycobacterium-specific hybridization of 16S rRNA. Since rRNA is found mainly in live cells, this result shows that the DNA detected by qPCR mainly represented live mycobacteria. Next, I investigated the occurrence of environmental mycobacteria in the bedding materials obtained from 5 pig farms with high prevalence (>4 %) of mycobacteriosis. When I used the same qPCR methods for quantification as for the soils, I found that piggery samples contained non-mycobacterial DNA that was amplified in spite of several mismatches with the primers. I therefore improved the qPCR assay by designing Mycobacterium-specific detection probes.
Using the probe qPCR assay, I found 10^5 to 10^7 genome equivalents of mycobacterial DNA in unused bedding materials and up to 1000-fold more in the bedding collected after use in the piggery. This result shows that there was a source of mycobacteria in the bedding materials purchased by the piggery and that mycobacteria increased in the bedding materials during use in the piggery. Allergic diseases have reached epidemic proportions in urbanized countries. At the same time, childhood in a rural environment or simple living conditions appears to protect against allergic disorders. Exposure to immunoreactive microbial components in rural environments seems to prevent allergies. I searched for differences in the bacterial communities of two indoor dusts, an urban house dust shown to possess immunoreactivity of the TH2-type and a farm barn dust with TH1-activity. The immunoreactivities of the dusts were revealed by my collaborators, in vitro in human dendritic cells and in vivo in mouse. The dusts accumulated over >10 years in the respiratory zone (>1.5 m above floor), thus reflecting the long-term content of airborne bacteria at the two sites. I investigated these dusts by cloning and sequencing of bacterial 16S rRNA genes from dust-contained DNA. From the TH2-active urban house dust, I isolated 139 16S rRNA gene clones. The most prevalent genera among the clones were Corynebacterium (5 species, 34 clones), Streptococcus (8 species, 33 clones), Staphylococcus (5 species, 9 clones) and Finegoldia (1 species, 9 clones). Almost all of these species are known as colonizers of the human skin and oral cavity. Species of Corynebacterium and Streptococcus have been reported to contain anti-inflammatory lipoarabinomannans and immunoreactive beta-glucans, respectively. Streptococcus mitis, found in the urban house dust, is known as an inducer of TH2 polarized immunity, characteristic of allergic disorders.
I isolated 152 DNA clones from the TH1-active farm barn dust and found species quite different from those found in the urban house dust. Among others, I found DNA clones representing Bacillus licheniformis, Acinetobacter lwoffii and Lactobacillus, each of which was recently reported to possess anti-allergy immunoreactivity. Moreover, the farm barn dust contained dramatically higher bacterial diversity than the urban house dust. Exposure to this dust thus stimulated the human dendritic cells by multiple microbial components. Such stimulation was reported to promote TH1 immunity. The biodiversity in dust may thus be connected to its immunoreactivity. Furthermore, the bacterial biomass in the farm barn dust consisted mainly of live, intact bacteria. In the urban house dust only ~1 % of the biomass appeared as intact bacteria, as judged by microscopy. Fragmented microbes may possess bioactivity different from that of intact cells. This was recently shown for moulds. If this is also valid for bacteria, the different immunoreactivities of the two dusts may be explained by the intactness of dustborne bacteria. Based on these results, we offer three factors potentially contributing to the polarized immunoreactivities of the two dusts: (i) the species-composition, (ii) the biodiversity and (iii) the intactness of the dustborne bacterial biomass. The risk of childhood atopic diseases is 4-fold lower in the Russian compared with the Finnish Karelia. This difference across the country border is not explainable by different geo-climatic factors or genetic susceptibilities of the two populations. Instead, the explanation must be lifestyle-related. It has already been reported that the microbiological quality of drinking water differs on the two sides of the border. In collaboration with allergists, I investigated dusts collected from homes in the Russian Karelia and in the Finnish Karelia.
I found that bacterial 16S rRNA genes cloned from the Russian Karelian dusts (10 homes, 234 clones) predominantly represented Gram-positive taxa (the phyla Actinobacteria and Firmicutes, 67%). The Russian Karelian dusts contained nine-fold more muramic acid (60 to 70 ng mg^-1) than the Finnish Karelian dusts (3 to 11 ng mg^-1). Among the DNA clones isolated from the Finnish side (n=231), Gram-negative taxa (40%) outnumbered the Gram-positives (34%). Out of the 465 DNA clones isolated from the Karelian dusts, 242 were assigned to cultured, validly described bacterial species. In Russian Karelia, animal-associated species, e.g. Staphylococcus and Macrococcus, were numerous (27 clones, 14 unique species). This finding may connect to the difference in the prevalence of allergy, as childhood contacts with pets and farm animals have been connected with low allergy risk. Plant-associated bacteria and plant-borne 16S rRNA genes (chloroplast) were frequent among the DNA clones isolated from the Finnish Karelia, indicating components originating from plants. In conclusion, my work revealed three major differences between the bacterial communities in the Russian and in the Finnish Karelian homes: (i) the high prevalence of Gram-positive bacteria on the Russian side and of Gram-negative bacteria on the Finnish side, (ii) the rich presence of animal-associated bacteria on the Russian side and (iii) the prevalence of plant-associated bacteria on the Finnish side. One or several of these factors may connect to the differences in the prevalence of allergy.
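
The qPCR quantification described in this abstract can be illustrated with a small sketch. It is not the thesis's assay: the standard-curve values, the function names and the assumption of one 16S rRNA gene copy per genome are all hypothetical and chosen only to show how a threshold cycle (Ct) is commonly converted into genome equivalents per gram through a log-linear standard curve.

```python
import math

# HYPOTHETICAL standard curve: known template copies per reaction and the
# threshold cycles (Ct) they might give. Illustrative values, not thesis data.
STANDARD_COPIES = [1e3, 1e4, 1e5, 1e6, 1e7]
STANDARD_CT = [30.0, 26.7, 23.4, 20.1, 16.8]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Ct is modelled as linear in log10(copies).
SLOPE, INTERCEPT = fit_line([math.log10(c) for c in STANDARD_COPIES], STANDARD_CT)


def copies_from_ct(ct):
    """Invert the standard curve: Ct -> template copies in the reaction."""
    return 10 ** ((ct - INTERCEPT) / SLOPE)


def genome_equivalents_per_gram(ct, grams_per_reaction, copies_per_genome=1):
    """Scale copies per reaction to genome equivalents per gram of sample.

    grams_per_reaction: mass of original sample represented by the template
    in one reaction. copies_per_genome=1 is an ASSUMPTION for illustration.
    """
    return copies_from_ct(ct) / copies_per_genome / grams_per_reaction


def amplification_efficiency():
    """Efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1.0 / SLOPE) - 1.0
```

In this sketch, a sample whose Ct falls on the 10^5-copy standard and whose reaction represents 0.01 g of material would be reported by calling `genome_equivalents_per_gram(23.4, 0.01)`.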

Relevance:

20.00%

Publisher:

Abstract:

The present study focuses on the translational strategies of Cocksfoot mottle virus (CfMV, genus Sobemovirus), which infects monocotyledonous plants. CfMV RNA lacks the 5' cap and the 3' poly(A) tail that ensure efficient translation of cellular messenger RNAs (mRNAs). Instead, CfMV RNA is covalently linked to a viral protein VPg (viral protein, genome-linked). This indicates that the viral untranslated regions (UTRs) must functionally compensate for the lack of the cap and poly(A) tail. We examined the efficacy of translation initiation in CfMV by comparing it to well-studied viral translational enhancers. Although insertion of the CfMV 5' UTR (CfMVe) into plant expression vectors improved gene expression in barley more than the other translational enhancers examined, studies at the RNA level showed that CfMVe alone or in combination with the CfMV 3' UTR did not give the RNAs a translational advantage. Mutation analysis revealed that translation initiation from CfMVe involved scanning. Interestingly, CfMVe also promoted translation initiation from an intercistronic position of dicistronic mRNAs in vitro. Furthermore, internal initiation occurred with similar efficacy in translation lysates that had reduced concentrations of eukaryotic initiation factor (eIF) 4E, suggesting that initiation was independent of eIF4E. In contrast, reduced translation in the eIF4G-depleted lysates indicated that translation from internally positioned CfMVe was eIF4G-dependent. After successful translation initiation, leaky scanning brings the ribosomes to the second open reading frame (ORF). The CfMV polyprotein is produced from this and the following overlapping ORF via programmed -1 ribosomal frameshift (-1 PRF). Two signals in the mRNA at the beginning of the overlap program approximately every fifth ribosome to slip one nucleotide backwards and continue translation in the new -1 frame.
This leads to the production of a C-terminally extended polyprotein, which encodes the viral RNA-dependent RNA polymerase (RdRp). The -1 PRF event in CfMV was very efficient, even though it was programmed by a simple stem-loop structure instead of a pseudoknot, which is usually required for high -1 PRF frequencies. Interestingly, regions surrounding the -1 PRF signals improved the -1 PRF frequencies. Viral protein P27 inhibited the -1 PRF event in vivo, putatively by binding to the -1 PRF site. This suggested that P27 could regulate the occurrence of -1 PRF. Initiation of viral replication requires that viral proteins are released from the polyprotein. This is catalyzed by the viral serine protease, which is also encoded by the polyprotein. N-terminal amino acid sequencing of CfMV VPg revealed that the junction of the protease and VPg was cleaved between glutamate (E) and asparagine (N) residues. This suggested that the processing sites used in CfMV differ from the glutamate and serine (S) or threonine (T) sites utilized in other sobemoviruses. However, further analysis revealed that the E/S and E/T sites may be used to cleave out some of the CfMV proteins.
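
The effect of a -1 ribosomal frameshift described in this abstract can be sketched in a few lines of code. The toy sequence, the slip position and the function name below are hypothetical and are not the CfMV signals. The sketch only shows how slipping back one nucleotide lets translation bypass a stop codon in the original frame and produce a C-terminally extended product. The sequence uses the DNA alphabet (T rather than U) and the standard genetic code.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, start=0, slip_after=None):
    """Translate seq from start until a stop codon or the end of seq.

    If slip_after is given, then once the reading position reaches that
    index the ribosome steps back one nucleotide (a -1 frameshift) and
    continues reading in the new frame.
    """
    protein = []
    pos = start
    while pos + 3 <= len(seq):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
        pos += 3
        if slip_after is not None and pos == slip_after:
            pos -= 1  # slip one nucleotide backwards
    return "".join(protein)


# HYPOTHETICAL toy mRNA (as DNA): the zero frame stops after four codons,
# while the -1 frame entered at index 12 continues for three more codons.
TOY_SEQ = "ATGGCCTTTAAATGACCGGTTAA"

zero_frame_product = translate(TOY_SEQ)
frameshift_product = translate(TOY_SEQ, slip_after=12)
```

If roughly every fifth ribosome shifts frame, as the abstract reports, about one in five translation events would yield the longer product and the rest would terminate at the zero-frame stop codon.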

Relevance:

20.00%

Publisher:

Abstract:

Metabolism is the cellular subsystem responsible for generation of energy from nutrients and production of building blocks for larger macromolecules. Computational and statistical modeling of metabolism is vital to many disciplines including bioengineering, the study of diseases, drug target identification, and understanding the evolution of metabolism. In this thesis, we propose efficient computational methods for metabolic modeling. The techniques presented are targeted particularly at the analysis of large metabolic models encompassing the whole metabolism of one or several organisms. We concentrate on three major themes of metabolic modeling: metabolic pathway analysis, metabolic reconstruction and the study of evolution of metabolism. In the first part of this thesis, we study metabolic pathway analysis. We propose a novel modeling framework called gapless modeling to study biochemically viable metabolic networks and pathways. In addition, we investigate the utilization of atom-level information on metabolism to improve the quality of pathway analyses. We describe efficient algorithms for discovering both gapless and atom-level metabolic pathways, and conduct experiments with large-scale metabolic networks. The presented gapless approach offers a compromise in terms of complexity and feasibility between the previous graph-theoretic and stoichiometric approaches to metabolic modeling. Gapless pathway analysis shows that microbial metabolic networks are not as robust to random damage as suggested by previous studies. Furthermore the amino acid biosynthesis pathways of the fungal species Trichoderma reesei discovered from atom-level data are shown to closely correspond to those of Saccharomyces cerevisiae. In the second part, we propose computational methods for metabolic reconstruction in the gapless modeling framework. We study the task of reconstructing a metabolic network that does not suffer from connectivity problems. 
Such problems often limit the usability of reconstructed models, and typically require a significant amount of manual postprocessing. We formulate gapless metabolic reconstruction as an optimization problem and propose an efficient divide-and-conquer strategy to solve real-world instances of it. We also describe computational techniques for solving problems stemming from ambiguities in metabolite naming. These techniques have been implemented in a web-based software tool, ReMatch, intended for reconstruction of models for 13C metabolic flux analysis. In the third part, we extend our scope from single to multiple metabolic networks and propose an algorithm for inferring gapless metabolic networks of ancestral species from phylogenetic data. Experimenting with 16 fungal species, we show that the method is able to generate results that are easily interpretable and that provide hypotheses about the evolution of metabolism.
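
The connectivity problems mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's gapless algorithm. It assumes, for illustration only, that a reaction counts as connected when all of its substrates can be produced from a given set of nutrients, and it flags the reactions that can never fire. The reaction names and metabolite names are hypothetical.

```python
def reachable(reactions, nutrients):
    """Forward propagation from the nutrients.

    reactions maps a reaction name to (substrates, products). A reaction
    fires once all of its substrates are available; its products then
    become available too. Returns (available metabolites, fired reactions).
    """
    available = set(nutrients)
    fired = set()
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in fired and set(substrates) <= available:
                fired.add(name)
                available |= set(products)
                changed = True
    return available, fired


def gapped_reactions(reactions, nutrients):
    """Reactions that cannot fire because some substrate is never produced."""
    _, fired = reachable(reactions, nutrients)
    return sorted(set(reactions) - fired)


# HYPOTHETICAL toy network: r2 needs metabolite C, which nothing produces.
TOY_NETWORK = {
    "r1": (["A"], ["B"]),
    "r2": (["B", "C"], ["D"]),
    "r3": (["D"], ["E"]),
}

gaps_with_a_only = gapped_reactions(TOY_NETWORK, {"A"})
gaps_with_a_and_c = gapped_reactions(TOY_NETWORK, {"A", "C"})
```

With only A supplied, r2 and r3 are flagged. Supplying C as well, or adding a reaction that produces it, closes the gap, which is the kind of repair a reconstruction method would have to find automatically.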

Relevance:

20.00%

Publisher:

Abstract:

Viral genomes are encapsidated within protective protein shells. This encapsidation can be achieved either by a co-condensation reaction of the nucleic acid and coat proteins, or by first forming empty viral particles which are subsequently packaged with nucleic acid, the latter mechanism being typical for many dsDNA bacteriophages. Bacteriophage PRD1 is an icosahedral, non-tailed dsDNA virus that has an internal lipid membrane, the hallmark of the Tectiviridae family. Although PRD1 has been known to assemble empty particles into which the genome is subsequently packaged, the mechanism for this has been unknown, and there has been no evidence for a separate packaging vertex, similar to the portal structures used for packaging in the tailed bacteriophages and herpesviruses. In this study, a unique DNA packaging vertex was identified for PRD1, containing the packaging ATPase P9, packaging factor P6 and two small membrane proteins, P20 and P22, extending the packaging vertex to the internal membrane. Lack of small membrane protein P20 was shown to totally abolish packaging, making it an essential part of the PRD1 packaging mechanism. The minor capsid protein P6 was shown to be an important packaging factor, its absence leading to greatly reduced packaging efficiency. An in vitro DNA packaging system consisting of recombinant packaging ATPase P9, empty procapsids and mutant PRD1 DNA with a LacZ-insert was developed for the analysis of PRD1 packaging, the first such system ever for a virus containing an internal membrane. A new tectiviral sequence, a linear plasmid called pBClin15, was identified in Bacillus cereus, providing material for sequence analysis of the tectiviruses. Analysis of PRD1 P9 and other putative tectiviral ATPase sequences revealed several conserved sequence motifs, among them a new tectiviral packaging ATPase motif. Mutagenesis studies on PRD1 P9 were used to confirm the significance of the motifs.
P9-type putative ATPase sequences carrying a similar sequence motif were identified in several other membrane-containing dsDNA viruses of bacterial, archaeal and eukaryotic hosts, suggesting that these viruses may have similar packaging mechanisms. Interestingly, almost the same set of viruses that were found to have similar putative packaging ATPases had earlier been found to share similar coat protein folds and capsid structures, and a common origin for these viruses had been suggested. The finding in this study of similar packaging proteins further supports the idea that these viruses are descendants of a common ancestor.

Relevance:

20.00%

Publisher:

Abstract:

Filamentous fungi of the subphylum Pezizomycotina are well known as protein and secondary metabolite producers. Various industries take advantage of these capabilities. However, the molecular biology of yeasts, i.e. Saccharomycotina and especially that of Saccharomyces cerevisiae, the baker's yeast, is much better known. In an effort to explain fungal phenotypes through their genotypes, we have compared the protein-coding gene contents of Pezizomycotina and Saccharomycotina. Only biomass degradation and secondary metabolism related protein families seem to have expanded recently in Pezizomycotina. Of the protein families clearly diverged between Pezizomycotina and Saccharomycotina, those related to mitochondrial functions emerge as the most prominent. However, the primary metabolism as described in S. cerevisiae is largely conserved in all fungi. Apart from the known secondary metabolism, Pezizomycotina have pathways that could link secondary metabolism to primary metabolism and a wealth of undescribed enzymes. Previous studies of individual Pezizomycotina genomes have shown that regardless of the difference in production efficiency and diversity of secreted proteins, the content of the known secretion machinery genes in Pezizomycotina and Saccharomycotina appears very similar. Genome-wide analysis of gene products is therefore needed to better understand the efficient secretion of Pezizomycotina. We have developed methods applicable to transcriptome analysis of non-sequenced organisms. TRAC (Transcriptional profiling with the aid of affinity capture) has been previously developed at VTT for fast, focused transcription analysis. We introduce a version of TRAC that allows more powerful signal amplification and multiplexing. We also present computational optimisations of transcriptome analysis of non-sequenced organisms and of TRAC analysis in general. Trichoderma reesei is one of the most commonly used Pezizomycotina in the protein production industry.
In order to understand its secretion system better and find clues for improvement of its industrial performance, we have analysed its transcriptomic response to protein secretion stress conditions. In comparison to S. cerevisiae, the response of T. reesei appears different, but still impacts on the same cellular functions. We also discovered in T. reesei interesting similarities to mammalian protein secretion stress response. Together these findings highlight targets for more detailed studies.

Relevance:

20.00%

Publisher:

Abstract:

A repetitive sequence collection is one where portions of a base sequence of length n are repeated many times with small variations, forming a collection of total length N. Examples of such collections are version control data and genome sequences of individuals, where the differences can be expressed by lists of basic edit operations. Flexible and efficient data analysis on such a typically huge collection is plausible using suffix trees. However, a suffix tree occupies O(N log N) bits, which very soon inhibits in-memory analyses. Recent advances in full-text self-indexing reduce the space of the suffix tree to O(N log σ) bits, where σ is the alphabet size. In practice, the space reduction is more than 10-fold, for example for the suffix tree of the human genome. However, this reduction factor remains constant when more sequences are added to the collection. We develop a new family of self-indexes suited for the repetitive sequence collection setting. Their expected space requirement depends only on the length n of the base sequence and the number s of variations in its repeated copies. That is, the space reduction factor is no longer constant, but depends on N / n. We believe the structures developed in this work will provide a fundamental basis for storage and retrieval of individual genomes as they become available due to rapid progress in sequencing technologies.
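
The quantities n, s and N in this abstract can be made concrete with a small sketch. It shows plain delta encoding, not the self-indexes developed in the thesis, and it supports no pattern search. Each copy is stored as a list of edits against the base sequence, here restricted to substitutions as a simplifying assumption. The sequences and names are hypothetical.

```python
def apply_edits(base, edits):
    """Rebuild one copy from the base sequence and its substitutions.

    edits is a list of (position, new_character) pairs. Only substitutions
    are modelled here; insertions and deletions are left out for brevity.
    """
    seq = list(base)
    for position, new_char in edits:
        seq[position] = new_char
    return "".join(seq)


# HYPOTHETICAL base sequence and per-copy edit lists.
BASE = "ACGTACGTAC"
EDIT_LISTS = [
    [(2, "T")],            # copy 0 differs at one position
    [(5, "A"), (9, "G")],  # copy 1 differs at two positions
    [],                    # copy 2 is identical to the base
]

collection = [apply_edits(BASE, edits) for edits in EDIT_LISTS]

N = sum(len(copy) for copy in collection)       # total collection length
n = len(BASE)                                   # base sequence length
s = sum(len(edits) for edits in EDIT_LISTS)     # total number of variations
```

Storing the full collection costs space proportional to N, which grows with every added copy, whereas the base plus edit lists costs space proportional to n + s. The indexes described in the abstract aim for a comparable dependence on n and s while still supporting search, which this sketch does not attempt.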