916 results for Genome-specific Sequence
Summary:
Background: In tropical countries, losses caused by infestation with the bovine tick Rhipicephalus (Boophilus) microplus have a tremendous economic impact on cattle production systems. Genetic variation in tick resistance between Bos taurus and Bos indicus, combined with molecular biology tools, might allow the identification of molecular markers linked to resistance traits that could be used as an auxiliary tool in selection programs. The objective of this work was to identify QTL associated with tick resistance/susceptibility in a bovine F2 population derived from the Gyr (Bos indicus) × Holstein (Bos taurus) cross. Results: Through a whole-genome scan with microsatellite markers, we mapped six genomic regions associated with bovine tick resistance. For most QTL, we found that different sets of genes could be involved in the resistance mechanism depending on the tick evaluation season (dry or rainy). We identified dry-season-specific QTL on BTA 2 and 10 and rainy-season-specific QTL on BTA 5, 11 and 27. We also found a highly significant genome-wide QTL for both dry and rainy seasons in the central region of BTA 23. Conclusions: The experimental F2 population derived from the Gyr × Holstein cross successfully allowed the identification of six highly significant QTL associated with tick resistance in cattle. The QTL located on BTA 23 might be related to the bovine histocompatibility complex. Further investigation of these QTL will help to isolate candidate genes involved in tick resistance in cattle.
Summary:
Background: A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide when a given probability is sufficient is Bayesian binary classification, in which the probability of the model characterizing the sequence family of interest is compared to that of an alternative probability model. A null model can serve as this alternative. This is the scoring technique used by sequence analysis tools such as HMMER, SAM and INFERNAL. The most prevalent null models are position-independent residue distributions, which include the uniform distribution, the genomic distribution, the family-specific distribution and the target sequence distribution. This paper presents a study evaluating the impact of the choice of null model on the final classification result. In particular, we are interested in minimizing the number of false predictions in a classification, which is crucial for reducing the costs of biological validation. Results: For all the tests, the target null model presented the lowest number of false positives when random sequences were used as a test. The study was performed on DNA sequences using GC content as the measure of content bias, but the results should also be valid for protein sequences. To broaden the application of the results, the study was performed using randomly generated sequences. Previous studies were performed on amino acid sequences, using only one probabilistic model (HMM) and a specific benchmark, and lack more general conclusions about the performance of null models. Finally, a benchmark test with P. falciparum confirmed these results. Conclusions: Of the evaluated models, the best suited for classification are the uniform model and the target model. However, the uniform model presents a GC bias that can cause more false positives for candidate sequences with extreme compositional bias, a characteristic not described in previous studies. 
In these cases the target model is more dependable for biological validation due to its higher specificity.
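The effect of the null model on a log-odds score can be illustrated with a minimal sketch. It assumes a position-independent family model for simplicity (the tools named in the abstract use position-specific models), and the `family` probabilities, the pseudocount and the example sequence are illustrative values, not data from the study.

```python
import math
from collections import Counter

def log_odds(seq, model, null):
    """Sum over positions of log2(P_model(base) / P_null(base))."""
    return sum(math.log2(model[b] / null[b]) for b in seq)

def uniform_null():
    """Uniform null model: every nucleotide has probability 0.25."""
    return {b: 0.25 for b in "ACGT"}

def target_null(seq, pseudocount=1.0):
    """Null model estimated from the composition of the target sequence itself."""
    counts = Counter(seq)
    total = len(seq) + 4 * pseudocount
    return {b: (counts[b] + pseudocount) / total for b in "ACGT"}

# Hypothetical GC-rich family model (illustrative numbers only).
family = {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}

# A GC-rich sequence that is not assumed to belong to the family.
gc_rich = "GCGCGGCCGCGC"

score_uniform = log_odds(gc_rich, family, uniform_null())
score_target = log_odds(gc_rich, family, target_null(gc_rich))
print(f"uniform null: {score_uniform:.2f} bits, target null: {score_target:.2f} bits")
```

In this toy case the uniform null yields a positive score purely because of GC content, while the target null, which absorbs the composition of the candidate sequence, yields a negative one. This matches the direction of the bias described in the abstract, but it is only a single constructed example.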
Summary:
Background: The ongoing efforts to sequence the honey bee genome require additional initiatives to define its transcriptome. Towards this end, we employed the Open Reading Frame ESTs (ORESTES) strategy to generate profiles for the life cycle of Apis mellifera workers. Results: Of the 5,021 ORESTES, 35.2% matched previously deposited Apis ESTs. The analysis of the remaining sequences defined a set of putative orthologs, the majority of which had their best-match hits with Anopheles and Drosophila genes. CAP3 assembly of the Apis ORESTES with the 15,500 existing Apis ESTs generated 3,408 contigs. BLASTX comparison of these contigs with protein sets of organisms representing distinct phylogenetic clades revealed a total of 1,629 contigs that Apis mellifera shares with different taxa. Most (41%) represent genes that are common to all taxa, another 21% are shared between metazoans (Bilateria), and 16% are shared only within the Insecta clade. A set of 23 putative genes presented a best match with human genes, many of which encode factors related to cell signaling/signal transduction. 1,779 contigs (52%) did not match any known sequence. Applying a correction factor deduced from a parallel analysis performed with Drosophila melanogaster ORESTES, we estimate that approximately half of these no-match EST contigs (22%) should represent Apis-specific genes. Conclusions: The versatile and cost-efficient ORESTES approach produced minilibraries for honey bee life cycle stages. Such information on central gene regions contributes to genome annotation and also lends itself to cross-transcriptome comparisons to reveal evolutionary trends in insect genomes.
Summary:
Background: MicroRNAs (miRNAs) are small regulatory RNAs, some of which are conserved in diverse plant genomes. Therefore, computational identification and further experimental validation of miRNAs from non-model organisms is both feasible and instrumental for addressing miRNA-based gene regulation and evolution. Sugarcane (Saccharum spp.) is an important biofuel crop with publicly available expressed sequence tag and genomic survey sequence databases, but little is known about miRNAs and their targets in this highly polyploid species. Results: In this study, we computationally identified 19 distinct sugarcane miRNA precursors, several of which are highly similar to their sorghum homologs at both the nucleotide and secondary structure levels. The accumulation pattern of mature miRNAs varies among organs/tissues of the commercial sugarcane hybrid as well as of its founder species S. officinarum and S. spontaneum. Using sugarcane MIR827 as a query, we found a novel MIR827 precursor in the sorghum genome. Using our computational tool, a total of 46 potential targets were identified for the 19 sugarcane miRNAs. Several targets of highly conserved miRNAs are transcription factors that play important roles in plant development. Conversely, target genes of lineage-specific miRNAs, such as SsCBP1, seem to play roles in diverse physiological processes. SsCBP1 was experimentally confirmed to be a target of the monocot-specific miR528. Our findings support the notion that the regulation of SsCBP1 by miR528 is shared at least within graminaceous monocots, and that this miRNA-based post-transcriptional regulation evolved exclusively within the monocot lineage after its divergence from eudicots. Conclusions: Using publicly available nucleotide databases, 19 sugarcane miRNA precursors and one new sorghum miRNA precursor were identified and classified into 14 families. 
Comparative analyses between sugarcane and sorghum suggest that these two species retain homologous miRNAs and targets in their genomes. Such conservation may help to clarify specific aspects of miRNA regulation and evolution in the polyploid sugarcane. Finally, our dataset provides a framework for future studies on sugarcane RNAi-dependent regulatory mechanisms.
Summary:
Background: Plasmodium vivax is the most widely distributed human malaria parasite, responsible for 70–80 million clinical cases each year and a large socio-economic burden for countries such as Brazil, where it is the most prevalent species. Unfortunately, because this parasite cannot be grown in continuous in vitro culture, research on P. vivax remains largely neglected. Methods: A pilot survey of expressed sequence tags (ESTs) from the asexual blood stages of P. vivax was performed. To do so, 1,184 clones from a cDNA library constructed with parasites obtained from 10 different human patients in the Brazilian Amazon were sequenced. Sequences were automatically processed to remove contaminants and low-quality reads. A total of 806 sequences with an average length of 586 bp met these criteria and their clustering revealed 666 distinct events. The consensus sequence of each cluster and the unique sequences of the singlets were used in similarity searches against different databases that included P. vivax, Plasmodium falciparum, Plasmodium yoelii, Plasmodium knowlesi, Apicomplexa and the GenBank non-redundant database. An E-value of <10⁻³⁰ was used to define a significant database match. ESTs were manually assigned gene ontology (GO) terms. Results: A total of 769 ESTs could be assigned a putative identity based upon sequence similarity to known proteins in GenBank. Moreover, 292 ESTs were annotated and GO terms were assigned to 164 of them. Conclusion: These are the first ESTs reported for P. vivax and, as such, they represent a valuable resource to assist in the annotation of the P. vivax genome currently being sequenced. Moreover, since the GC content of the P. vivax genome is strikingly different from that of P. falciparum, these ESTs will help in the validation of gene predictions for P. vivax and in the creation of a gene index of this malaria parasite.
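The significance threshold described above can be sketched as a simple filter. The abstract does not name the search program or its output format, so this example assumes, purely for illustration, BLAST-style 12-column tab-delimited output with the E-value in the eleventh column. The function name and the example rows are hypothetical.

```python
E_VALUE_CUTOFF = 1e-30  # threshold quoted in the abstract

def best_significant_hits(rows, cutoff=E_VALUE_CUTOFF):
    """Return {query: (subject, evalue)}, keeping the lowest E-value below cutoff.

    Assumes tab-delimited rows with the query in column 1, the subject in
    column 2 and the E-value in column 11 (an assumption, not from the source).
    """
    best = {}
    for row in rows:
        if not row.strip() or row.startswith("#"):
            continue  # skip blank and comment lines
        fields = row.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue >= cutoff:
            continue  # not significant under the chosen threshold
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best

# Made-up rows for illustration only.
example_rows = [
    "est_001\tprotA\t92.1\t180\t14\t0\t1\t540\t10\t189\t3e-95\t350",
    "est_001\tprotB\t60.0\t150\t60\t0\t1\t450\t5\t154\t1e-31\t130",
    "est_002\tprotC\t45.0\t90\t49\t1\t1\t270\t20\t109\t2e-12\t70",
]
hits = best_significant_hits(example_rows)
print(hits)
```

Only `est_001` passes the cutoff here, and of its two significant hits the one with the lower E-value is retained.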
Summary:
Background: From shotgun libraries used for the genomic sequencing of the phytopathogenic bacterium Xanthomonas axonopodis pv. citri (XAC), clones representative of the largest possible number of coding sequences (CDSs) were selected to create a DNA microarray platform on glass slides (XACarray). The XACarray provides a tool capable of supplying data for the analysis of global genome expression in this organism. Findings: The inserts from the selected clones were amplified by PCR with the universal oligonucleotide primers M13R and M13F. The products obtained were purified and fixed in duplicate on glass slides specific for use in DNA microarrays. The number of spots on the microarray totaled 6,144 and included 768 positive controls and 624 negative controls per slide. Validation of the platform was performed through hybridization of total DNA probes from XAC labeled with different fluorophores, Cy3 and Cy5. In this validation assay, 86% of all PCR products fixed on the glass slides were confirmed to present a hybridization signal greater than twice the standard deviation of the global median signal-to-noise ratio. Conclusions: Our validation of the XACarray platform using DNA-DNA hybridization revealed that it can be used to evaluate the expression of 2,365 individual CDSs from all major functional categories, which corresponds to 52.7% of the annotated CDSs of the XAC genome. As a proof of concept, we used this platform in a previous work to verify the absence of genomic regions that could not be detected by sequencing in related strains of Xanthomonas.
Summary:
Background: Even before its genome sequence was published in 2004, Kluyveromyces lactis had long been considered a model organism for studies in genetics and physiology. Research on Kluyveromyces lactis is quite advanced and this yeast species is one of the few with which it is possible to perform formal genetic analysis. Nevertheless, until now, no complete metabolic functional annotation has been performed for the proteins encoded in the Kluyveromyces lactis genome. Results: In this work, a new genome-wide metabolic functional re-annotation of the proteins encoded in the Kluyveromyces lactis genome was performed, resulting in the annotation of 1759 genes with metabolic functions and in the development of a methodology supported by merlin (software developed in-house). The new annotation includes novelties, such as the assignment of transporter superfamily numbers to genes identified as encoding transporter proteins. Thus, the genes annotated with metabolic functions could be exclusively enzymatic (1410 genes), encode transporter proteins (301 genes) or have both metabolic activities (48 genes). The new annotation produced by this work largely surpasses the currently available Kluyveromyces lactis annotations. A comparison with KEGG’s annotation revealed a match with 844 (~90%) of the genes annotated by KEGG, while adding 850 new gene annotations. Moreover, there are 32 genes with annotations different from KEGG’s. Conclusions: The methodology developed throughout this work can be used to re-annotate any yeast or, with a small change of the reference organism, the proteins encoded in any sequenced genome. The new annotation provided by this study offers basic knowledge that might be useful for the scientific community working on this model yeast, because new functions have been identified for the so-called metabolic genes. 
Furthermore, it served as the basis for the reconstruction of a compartmentalized, genome-scale metabolic model of Kluyveromyces lactis, which is currently being finished.
Summary:
Genome-wide association studies have failed to establish common variant risk for the majority of common human diseases. The underlying reasons for this failure are explained by recent studies of resequencing and comparison of over 1200 human genomes and 10 000 exomes, together with the delineation of DNA methylation patterns (epigenome) and full characterization of coding and noncoding RNAs (transcriptome) being transcribed. These studies have provided the most comprehensive catalogues of functional elements and genetic variants that are now available for global integrative analysis and experimental validation in prospective cohort studies. With these datasets, researchers will have unparalleled opportunities for the alignment, mining, and testing of hypotheses for the roles of specific genetic variants, including copy number variations, single nucleotide polymorphisms, and indels as the cause of specific phenotypes and diseases. Through the use of next-generation sequencing technologies for genotyping and standardized ontological annotation to systematically analyze the effects of genomic variation on humans and model organism phenotypes, we will be able to find candidate genes and new clues for disease’s etiology and treatment. This article describes essential concepts in genetics and genomic technologies as well as the emerging computational framework to comprehensively search websites and platforms available for the analysis and interpretation of genomic data.
Summary:
[EN] The first description of the complete embryonic and larval development of the Canarian abalone (Haliotis tuberculata coccinea Reeve) was carried out across 39 stages, from fertilization to the appearance of the third tubule on the cephalic tentacles, and illustrated in a microphotographic sequence. Eggs obtained by induced spawning with hydrogen peroxide from the GIA captive broodstock were stocked at a density of 10 eggs/mL and kept at 23 ± 0.5 °C for 62 h until the formation of the third tubule. Live eggs and larvae were observed continuously, around the clock, at ×400 magnification under transmitted light. At each stage, specific morphological features, illustrated by microscopic photographs, were described, as well as the time required for their appearance. The diameter of fertilized eggs was 205 ± 8 μm (mean ± SD), whereas the length and width of larvae ready to undergo metamorphosis were 216.6 ± 5.3 μm and 172 ± 8.8 μm, respectively. The knowledge of larval morphological development acquired through this study will contribute to the improvement of larval rearing techniques for this abalone species.
Summary:
Motivation: A topical issue of great interest, from both a theoretical and an applied perspective, is the analysis of biological sequences to disclose the information that they encode. The development of new genome sequencing technologies in recent years has opened new fundamental problems, since huge amounts of biological data still await interpretation. Indeed, sequencing is only the first step of the genome annotation process, which consists of assigning biological information to each sequence. Hence, given the large amount of available data, in silico methods have become useful and necessary for extracting relevant information from sequences. The availability of data from genome projects has given rise to new strategies for tackling the basic problems of computational biology, such as the determination of the three-dimensional structures of proteins, their biological function and their reciprocal interactions. Results: The aim of this work has been the implementation of predictive methods that allow the extraction of information on the properties of genomes and proteins starting from nucleotide and amino acid sequences, by taking advantage of the information provided by the comparison of genome sequences from different species. In the first part of the work, a comprehensive large-scale genome comparison of 599 organisms is described. 2.6 million sequences from 551 prokaryotic and 48 eukaryotic genomes were aligned and clustered on the basis of their sequence identity. This procedure led to the identification of classes of proteins that are peculiar to the different groups of organisms. Moreover, the adopted similarity threshold produced clusters that are structurally homogeneous and that can be used for the structural annotation of uncharacterized sequences. 
The second part of the work focuses on the characterization of thermostable proteins and on the development of tools able to predict the thermostability of a protein from its sequence. The codon composition of a non-redundant database comprising 116 prokaryotic genomes was analyzed by Principal Component Analysis, and it was shown that a cross-genomic approach allows the extraction of common determinants of thermostability at the genome level, leading to an overall accuracy of 95% in discriminating thermophilic coding sequences. This result outperforms those obtained in previous studies. Moreover, we investigated the effect of multiple mutations on protein thermostability. This issue is of great importance in the field of protein engineering, since thermostable proteins are generally more suitable than their mesostable counterparts for technological applications. A Support Vector Machine-based method was trained to predict whether a set of mutations can enhance the thermostability of a given protein sequence. The developed predictor achieves 88% accuracy.
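The codon composition analyzed above can be represented as a 64-dimensional frequency vector per coding sequence. The sketch below shows only this feature-extraction step, under the assumption that codons are counted in frame from the first base. The function name is hypothetical, and the PCA and classification steps of the thesis are not reproduced.

```python
from itertools import product

# All 64 codons in a fixed order, so that vectors are comparable across sequences.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]

def codon_composition(cds):
    """Return the relative frequency of each of the 64 codons in a coding sequence.

    Codons are read in frame from position 0; trailing bases that do not form
    a full codon, and codons containing non-ACGT symbols, are ignored.
    """
    cds = cds.upper()
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon in counts:
            counts[codon] += 1
            total += 1
    return [counts[c] / total if total else 0.0 for c in CODONS]

# Toy coding sequence: ATG GCC GCC TAA
vec = codon_composition("ATGGCCGCCTAA")
print(len(vec), vec[CODONS.index("GCC")])
```

Vectors of this kind, one per coding sequence or pooled per genome, would be the rows of the matrix handed to a PCA routine.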
Summary:
The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it became urgent to develop new methods that are able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein folding recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances related to the specific type of amino acid pair that are encoded in the so-called contact map. An interesting new way of analyzing those structures came out when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to which extent the characteristic path length and clustering coefficient of the protein contacts network are values that reveal characteristic features of protein contact maps. 
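The two network measures mentioned above can be computed directly from a list of residue contacts. This is a minimal sketch under stated assumptions: the contact list is a made-up toy graph, nodes with fewer than two neighbours contribute a clustering coefficient of 0 (conventions differ), and the path length is averaged over connected pairs only.

```python
from collections import deque

def build_adjacency(n_residues, contacts):
    """Undirected contact network as a dict of neighbour sets."""
    adj = {i: set() for i in range(n_residues)}
    for i, j in contacts:
        adj[i].add(j)
        adj[j].add(i)
    return adj

def clustering_coefficient(adj):
    """Average over nodes of (links among neighbours) / (possible links)."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # contributes 0 under the convention assumed here
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    dist_sum, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs if pairs else 0.0

# Toy network: a triangle (0, 1, 2) with one pendant residue (3).
adj = build_adjacency(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
C = clustering_coefficient(adj)
L = characteristic_path_length(adj)
print(f"C = {C:.4f}, L = {L:.4f}")
```

A network is usually called small-world when C is much higher than in a random graph of the same size and density while L remains comparably short.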
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments, to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for the coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still deserve an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently, approximately only 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized. 
A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences that have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity arise from multi-domain proteins, from proteins that share common domains but do not necessarily share the same function, and from the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer-through-inheritance procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of annotating multi-domain proteins and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structural templates over the length of protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in current databases of molecular functions and structures.
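The clustering step, which joins sequences only when both an identity and a coverage constraint are met, can be sketched with a union-find structure. The abstract does not give the thresholds or the exact coverage definition, so the values below (40% identity, 90% coverage, coverage measured against the longer sequence), the function name and the example data are all assumptions for illustration.

```python
def cluster_by_identity_and_coverage(hits, lengths,
                                     min_identity=40.0, min_coverage=0.9):
    """Single-linkage clustering of sequences from pairwise alignment hits.

    hits:    iterable of (seq_a, seq_b, percent_identity, alignment_length)
    lengths: dict mapping sequence name to sequence length
    Thresholds are hypothetical placeholders, not the values used in the thesis.
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Find the cluster representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, aln_len in hits:
        # Coverage relative to the longer sequence penalises partial,
        # single-domain matches to long multi-domain proteins.
        coverage = aln_len / max(lengths[a], lengths[b])
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(a)] = find(b)

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())

# Made-up example: p3 is a long protein sharing only one domain with p1.
lengths = {"p1": 100, "p2": 105, "p3": 300, "p4": 98}
hits = [
    ("p1", "p2", 62.0, 99),
    ("p1", "p3", 80.0, 95),
    ("p2", "p4", 45.0, 97),
]
clusters = cluster_by_identity_and_coverage(hits, lengths)
print(clusters)
```

Although p1 and p3 are 80% identical over the aligned region, the alignment covers less than a third of p3, so they stay in separate clusters. This is the kind of length homogeneity the text attributes to the coverage constraint.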
Summary:
The Poxviruses are a family of double stranded DNA (dsDNA) viruses that cause disease in many species, both vertebrate and invertebrate. Their genomes range in size from 135 to 365 kbp and show conservation in both organization and content. In particular, the central genomic regions of the chordopoxvirus subfamily (those capable of infecting vertebrates) contain 88 genes which are present in all the virus species characterised to date and which mostly occur in the same order and orientation. In contrast, however, the terminal regions of the genomes frequently contain genes that are species or genera-specific and that are not essential for the growth of the virus in vitro but instead often encode factors with important roles in vivo including modulation of the host immune response to infection and determination of the host range of the virus. The Parapoxviruses (PPV), of which Orf virus is the prototypic species, represent a genus within the chordopoxvirus subfamily of Poxviridae and are characterised by their ability to infect ruminants and humans. The genus currently contains four recognised species of virus, bovine papular stomatitis virus (BPSV) and pseudocowpox virus (PCPV) both of which infect cattle, orf virus (OV) that infects sheep and goats, and parapoxvirus of red deer in New Zealand (PVNZ). The ORFV genome has been fully sequenced, as has that of BPSV, and is ~138 kb in length encoding ~132 genes. The vast majority of these genes allow the virus to replicate in the cytoplasm of the infected host cell and therefore encode proteins involved in replication, transcription and metabolism of nucleic acids. These genes are well conserved between all known genera of poxviruses. There is however another class of genes, located at either end of the linear dsDNA genome, that encode proteins which are non-essential for replication and generally dictate host range and virulence of the virus. 
The non-essential genes are often the most variable within and between species of virus and therefore are potentially useful for diagnostic purposes. Given their role in subverting the host immune response to infection, they are also targets for novel therapeutics. The function of only a relatively small number of these proteins has been elucidated, and there are several genes whose function still remains obscure, principally because there is little similarity between them and proteins of known function in current sequence databases. It is thought that by selectively removing some of the virulence genes, or at least neutralising the proteins in some way, current vaccines could be improved. The evolution of poxviruses has been proposed to be an adaptive process involving frequent events of gene gain and loss, such that the virus co-evolves with its specific host. Gene capture or horizontal gene transfer from the host to the virus is considered an important source of new viral genes, including those likely to be involved in host range and those enabling the virus to interfere with the host immune response to infection. Given the low rate of nucleotide substitution, recombination can be seen as an essential evolutionary driving force, although it is likely underestimated. Recombination in poxviruses is intimately linked to DNA replication, with both viral and cellular proteins participating in this recombination-dependent replication. It has been shown, in other poxvirus genera, that recombination between isolates and perhaps even between species does occur, thereby providing another mechanism for the acquisition of new genes and for the rapid evolution of viruses. Such events may result in viruses that have a selective advantage over others, for example in re-infections (a characteristic of the PPV), or in viruses that are able to jump the species barrier and infect new hosts. 
Sequence data from viral strains isolated from goats suggest that recombination events may have occurred between OV and PCPV (Ueda et al. 2003). Recombination events are frequent during poxvirus replication, and comparative genomic analysis of several poxvirus species has revealed that recombination occurs frequently in the right terminal region. Intraspecific recombination can occur between strains of the same PPV species, but interspecific recombination can also occur when there is enough sequence similarity to enable recombination between distinct PPV species. The most important prerequisite for successful recombination is the coinfection of an individual host by different virus strains or species. Consequently, the following factors affecting the distribution of different viruses to shared target cells need to be considered: dose of inoculated virus, time interval between inoculation of the first and the second virus, distance between the marker mutations, and genetic homology. At present there are no data available on the replication dynamics of PPV in permissive and non-permissive hosts, and regarding co-infections there is no information on the interference mechanisms occurring during the simultaneous replication of viruses of different species. This work was carried out to set up permissive substrates allowing the replication of different PPV species, in particular keratinocyte monolayers and organotypic skin cultures. Furthermore, a method to isolate and expand ovine skin stem cells was set up to investigate in depth further aspects of viral cellular tropism during natural infection. The study produced important data to elucidate the replication dynamics of OV and PCPV in vitro, as well as the mechanisms of interference that can arise during co-infection with different viral species. 
Moreover, the analysis carried out on the right terminal genomic region of PCPV 1303/05 contributed to a better knowledge of the viral genes involved in host interaction and pathogenesis, and helped to locate recombination breakpoints and genetic homologies between PPV species. Taken together, these data fill several crucial gaps in the study of interspecific recombination of PPVs, which is thought to be important for a better understanding of viral evolution and for improving the biosafety of antiviral therapy and PPV-based vectors.
Summary:
This PhD Thesis is the result of my research activity in the last three years. My main research interest was centered on the evolution of mitochondrial genome (mtDNA), and on its usefulness as a phylogeographic and phylogenetic marker at different taxonomic levels in different taxa of Metazoa. From a methodological standpoint, my main effort was dedicated to the sequencing of complete mitochondrial genomes, and the approach to whole-genome sequencing was based on the application of Long-PCR and shotgun sequences. Moreover, this research project is a part of a bigger sequencing project of mtDNAs in many different Metazoans’ taxa, and I mostly dedicated myself to sequence and analyze mtDNAs in selected taxa of bivalves and hexapods (Insecta). Sequences of bivalve mtDNAs are particularly limited, and my study contributed to extend the sampling. Moreover, I used the bivalve Musculista senhousia as model taxon to investigate the molecular mechanisms and the evolutionary significance of their aberrant mode of mitochondrial inheritance (Doubly Uniparental Inheritance, see below). In Insects, I focused my attention on the Genus Bacillus (Insecta Phasmida). A detailed phylogenetic analysis was performed in order to assess phylogenetic relationships within the genus, and to investigate the placement of Phasmida in the phylogenetic tree of Insecta. The main goal of this part of my study was to add to the taxonomic coverage of sequenced mtDNAs in basal insects, which were only partially analyzed.
Resumo:
The hepatitis C virus (HCV) is the main causative agent of parenterally transmitted non-A, non-B hepatitis. Until now, research into the replication and pathogenesis of HCV has been hampered by the lack of an efficient and reliable cell culture system. Viral RNA was isolated from infected human liver cells and cloned. By comparing several clones, an isolate-specific consensus sequence was determined, on the basis of which a consensus genome was constructed. Using the consensus genome as a basis, subgenomic RNA molecules, so-called "selectable replicons", were generated. After transfection of the replicons into human HuH-7 hepatoma cells, it was shown that the replicons replicated autonomously and at high levels in the host cells. This work defines the structure of HCV replicons that are functional in cell culture. It thereby lays the foundation for a long-sought HCV cell culture system that enables the detailed study of HCV replication and the development of antivirally active substances.
Resumo:
Wolf-Hirschhorn syndrome (WHS) is a complex and variable malformation and retardation syndrome that is caused by deletion in the distal chromosomal region 4p16.3 and whose etiology and pathogenesis are still largely not understood. The aim of the present work was to identify and provisionally characterize new genes that could be involved in the development of the syndrome. At the start of this work, the Wolf-Hirschhorn syndrome critical region (WHSCR) had been narrowed down to a region of approx. 2 Mb between the markers D4S43 and D4S142. To identify new genes, three larger genomic cosmid/PAC contigs (I-III) were first constructed in the region of markers D4S114 to D4S142 and examined for transcribed regions (exons) by exon amplification. A total of 67 putative 'exons' were isolated, some of which correspond to already known genes (ZNF141, PDEB, MYL5, GAK, DAGK4 and FGFR3). In the course of this work, two of these genes were mapped to the distal region 4p16.3 for the first time (DAGK4) or more precisely (GAK). On the basis of homology comparisons and/or EST-cDNA homologies, the remaining exons can presumably be assigned to new genes or to pseudogenes (e.g. YWEE1hu). Because a further narrowing of the WHSCR to a 165 kb region proximal to the FGFR3 gene was published during the course of this work, further investigations concentrated on the detailed analysis of the WHSCR between the marker D4S43 and FGFR3. Using exon amplification and computer-assisted evaluation of available sequence data from this region ('GRAIL', 'GENSCAN' and homology comparisons in the NCBI EST databases), several new genes were identified. In distal-to-proximal order, these are the genes LETM1, 51, 43, 45, 57 and POL4P.
LETM1 encodes a putative transmembrane protein with one leucine zipper motif and two EF-hand motifs and, owing to its possible involvement in Ca2+ homeostasis and/or signal transduction, could contribute to features of WHS (seizures, mental retardation and muscular hypotonia). Gene 51 corresponds to a gene described at about the same time by Stec et al. (1998) and Chesi et al. (1998) as WHSC1 and MMSET, respectively, and was therefore not characterized further. Like gene 43, which was described at the same time by Wright et al. (1999b) as WHSC2 and may play a role in transcription elongation, it is ubiquitously expressed. In contrast, gene 45, identified in the present work, shows a markedly specific expression pattern (in neurons of the brain and in spermatids). Together with the structural similarity of the putative gene product to signaling molecules, this suggests an interesting link to features of WHS (for example cryptorchidism, uterine malformations or neurological defects). Gene 57, by contrast, may be a truncated pseudogene of the eRFS gene on chromosome 6q24 (Wallrapp et al., 1998). Finally, on the basis of its genomic location and its possible function (as a DNA polymerase-like gene) alone, the POL4P gene is not a good candidate gene for specific features of the syndrome and was therefore not characterized in detail. To understand the involvement of these genes in the etiology and pathogenesis of the syndrome, the development of a mouse model (by introducing targeted deletions into the mouse genome) is planned. To make this possible, the orthologous region in the mouse was characterized in the present work. First, the orthologous mouse genes (Letm1, Whsc1, gene 43 (Whsc2h), gene 45 and Pol4p) were identified.
By constructing and precisely mapping a murine genomic P1/PAC clone contig, it was shown that the murine genes Fgfr3, Letm1, Whsc1, gene 43 (Whsc2h), gene 45 and Pol4p, as well as several other of the mouse EST-cDNA clones examined, are contained in a continuous block of synteny between human (POL4P to FGFR3) and mouse (Mmu 5.20) whose genomic extent roughly corresponds to that in humans (between POL4P and FGFR3).