17 results for Genome-specific Sequence
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
Self-incompatibility (SI) systems have evolved in many flowering plants to prevent self-fertilization and thus promote outbreeding. Pear and apple, like many species belonging to the Rosaceae, exhibit RNase-mediated gametophytic self-incompatibility (GSI), a widespread system also found in the Solanaceae and Plantaginaceae. For this reason, pear orchards must contain at least two different cultivars that pollinate each other; to guarantee efficient cross-pollination, they should have overlapping flowering periods and must be genetically compatible. This compatibility is determined by the S-locus, which contains at least two genes encoding a female (pistil) and a male (pollen) determinant. The female determinant in the Rosaceae, Solanaceae and Plantaginaceae system is a stylar glycoprotein with ribonuclease activity (S-RNase) that acts as a specific cytotoxin in incompatible pollen tubes, degrading cellular RNAs. Since its identification, the S-RNase gene has been intensively studied, and the sequences of a large number of alleles are available in online databases. In contrast, the male determinant has only recently been identified as a pollen-expressed protein containing an F-box motif, called S-Locus F-box (abbreviated SLF or SFB). Since F-box proteins are best known for their participation in the SCF (Skp1 - Cullin - F-box) E3 ubiquitin ligase complex, which is involved in protein degradation through the 26S proteasome pathway, the male determinant is thought to mediate the ubiquitination of S-RNases, targeting them for degradation in compatible pollen tubes. Attempts to clone SLF/SFB genes in the Pyrinae produced no results until very recently; in apple, the use of genomic libraries allowed the detection of two F-box genes linked to each S haplotype, called SFBB (S-locus F-Box Brothers). In Japanese pear, three SFBB genes linked to each haplotype were cloned from pollen cDNA.
The SFBB genes exhibit S haplotype-specific sequence divergence and pollen-specific expression; the interpretation of their multiplicity is unclear: it has been hypothesized that all of them participate in the S-specific interaction with the RNase, but it is also possible that only one of them is involved in this function. Moreover, even if the S-locus male and female determinants are solely responsible for the specificity of pollen-pistil recognition, many other factors are thought to play a role in GSI; these are not linked to the S locus and act in an S-haplotype-independent manner. They can function in regulating the expression of S determinants (group 1 factors), in modulating their activity (group 2), or downstream, in carrying out the acceptance or rejection of the pollen tube (group 3). This study aimed to elucidate the molecular mechanism of GSI in European pear (Pyrus communis) as well as in the other Pyrinae; it was divided into two parts, the first focusing on the characterization of male determinants and the second on factors external to the S locus. The search for S-locus F-box genes was aimed primarily at identifying such genes in European pear, for which sequence data are still not available; it also made it possible to investigate the S-locus structure in the Pyrinae. The analysis was carried out on a pool of varieties of the three species Pyrus communis (European pear), Pyrus pyrifolia (Japanese pear), and Malus × domestica (apple); varieties carrying S haplotypes whose RNases are highly similar were chosen in order to check whether the same level of similarity is also maintained between the male determinants. A total of 82 sequences were obtained, 47 of which represent the first S-locus F-box genes sequenced from European pear.
The sequence data strongly support the hypothesis that the S-locus structure is conserved among the three species, and presumably among all the Pyrinae; at least five genes have homologs in the analysed S haplotypes, but the number of F-box genes surrounding the S-RNase could be even greater. The high level of sequence divergence, together with the similarity between alleles linked to highly conserved RNases, suggests a shared ancestral polymorphism for the F-box genes as well. The F-box genes identified in European pear were mapped on a segregating population of 91 individuals from the cross 'Abbé Fétel' × 'Max Red Bartlett'. All the genes were placed on linkage group 17, where the S locus has been placed in both pear and apple maps, and were strongly associated with the S-RNase gene. Linkage with the RNase was perfect for some of the F-box genes, while for others very rare single recombination events were identified. The second part of this study focused on the search for other genes involved in the SI response in pear; it aimed, on the one hand, to identify genes differentially expressed in compatible and incompatible crosses and, on the other, to clone and characterize the transglutaminase (TGase) gene, whose role may be crucial in pollen rejection. To identify differentially expressed genes, controlled pollinations were carried out in four combinations (self-pollination, incompatible, half-compatible and fully compatible cross-pollination), and expression profiles were compared through cDNA-AFLP. Twenty-eight fragments displaying an expression pattern related to compatibility or incompatibility were identified, cloned and sequenced; sequence analysis made it possible to assign a putative annotation to some of them. The identified genes are involved in very different cellular processes or in defense mechanisms, suggesting a very complex change in gene expression following pollen/pistil recognition.
The pool of genes identified with this technique offers a good basis for further study toward a better understanding of how the SI response is carried out. Among the factors involved in the SI response, an important role may also be played by transglutaminase (TGase), an enzyme involved both in post-translational protein modification and in protein cross-linking. The TGase activity detected in pear styles was significantly higher after pollination in incompatible combinations than in compatible ones, suggesting a role for this enzyme in the abnormal cytoskeletal reorganization observed during the pollen rejection reaction. The aim of this part of the work was thus to identify and clone the pear TGase gene; fragments of this gene were amplified by PCR using primers designed on the alignment between the Arabidopsis TGase gene sequence and several apple EST fragments; the full-length coding sequence of the pear TGase gene was then cloned from cDNA, providing a valuable tool for further study of the in vitro and in vivo action of this enzyme.
Abstract:
The Myc oncoproteins belong to a family of transcription factors composed of Myc, N-Myc and L-Myc. The most studied members of this family are Myc and N-Myc, because their expression is frequently deregulated in a wide range of cancers. These oncoproteins can act as either activators or repressors of gene transcription. As activators, they heterodimerize with Max (Myc-associated factor X), and the heterodimer recognizes and binds specific sequence elements (E-boxes) on gene promoters, recruiting histone acetylase and inducing transcriptional activation. Myc-mediated transcriptional repression is a much-debated issue. One of the first mechanisms proposed for Myc-mediated transcriptional repression involves the interaction of the Myc-Max complex with Sp1 and/or Miz1 transcription factors already bound to gene promoters. This interaction may interfere with their activation functions by recruiting co-repressors such as Dnmt3 or HDACs. Moreover, in the absence of , Myc may interfere with the Sp1 activation function by direct interaction and subsequent recruitment of HDACs. More recently, the Myc/Max complex was also shown to mediate transcriptional repression by direct binding to peculiar E-boxes. In this study we analyzed the role of Myc overexpression in Osteosarcoma and Neuroblastoma oncogenesis and the mechanisms underlying Myc function. Myc overexpression is known to correlate with chemoresistance in Osteosarcoma cells. We extended this finding by demonstrating that c-Myc induces transcription of a panel of ABC drug transporter genes. ABCs are a large family of transmembrane transporters deeply involved in multidrug resistance. Furthermore, expression levels of Myc, ABCC1, ABCC4 and ABCF1 proved to be important prognostic tools for predicting conventional therapy failure. N-Myc amplification/overexpression is the most important prognostic factor for Neuroblastoma. Cyclin G2 and Clusterin are two genes often downregulated in neuroblastoma cells.
Cyclin G2 is an atypical member of the Cyclin family, and its expression is associated with terminal differentiation and apoptosis. Moreover, it blocks cell cycle progression and induces cell growth arrest. CLU, instead, is a multifunctional protein involved in many physiological and pathological processes. Several lines of evidence support the view that CLU may act as a tumour suppressor in Neuroblastoma. In this thesis I showed that N-Myc represses CCNG2 and CLU transcription by different mechanisms. • N-Myc represses CCNG2 transcription by directly interacting with Sp1 bound to the CCNG2 promoter and recruiting HDAC2. Importantly, reactivation of CCNG2 expression through epigenetic drugs partially reduces N-Myc and HDAC2 mediated cell proliferation. • The N-Myc/Max complex represses CLU expression by direct binding to a peculiar E-box element on the CLU promoter and by recruitment of HDACs and Polycomb complexes to the CLU promoter. Overall, our findings strongly support the model in which Myc overexpression/amplification may contribute to some aspects of oncogenesis by a dual action: i) transcriptional activation of genes that confer a multidrug-resistant phenotype on cancer cells; ii) transcriptional repression of genes involved in cell cycle inhibition and cellular differentiation.
Abstract:
The continuous increase in genome sequencing projects has produced a huge amount of data in the last 10 years: currently more than 600 prokaryotic and 80 eukaryotic genomes are fully sequenced and publicly available. However, sequencing a genome determines only raw nucleotide sequences. This is just the first step of the genome annotation process, which deals with assigning biological information to each sequence. Annotation is carried out at every level of the biological information processing mechanism, from DNA to protein, and cannot be accomplished by in vitro analysis alone, which is extremely expensive and time-consuming when applied at such a large scale. Thus, in silico methods need to be used to accomplish the task. The aim of this work was the implementation of predictive computational methods to allow a fast, reliable, and automated annotation of genomes and proteins starting from amino acid sequences. The first part of the work focused on the implementation of a new machine learning based method for the prediction of the subcellular localization of soluble eukaryotic proteins. The method is called BaCelLo, and was developed in 2006. Its main peculiarity is that it is independent of biases present in the training dataset, which cause over-prediction of the most represented examples in all the other predictors developed so far. This important result was achieved through a modification, made by myself, to the standard Support Vector Machine (SVM) algorithm, creating the so-called Balanced SVM. BaCelLo is able to predict the most important subcellular localizations in eukaryotic cells, and three kingdom-specific predictors were implemented. In two extensive comparisons, carried out in 2006 and 2008, BaCelLo was reported to outperform all the currently available state-of-the-art methods for this prediction task.
BaCelLo was subsequently used to completely annotate 5 eukaryotic genomes, by integrating it into a pipeline of predictors developed at the Bologna Biocomputing group by Dr. Pier Luigi Martelli and Dr. Piero Fariselli. An online database, called eSLDB, was developed by integrating, for each amino acid sequence extracted from the genome, the predicted subcellular localization with experimental and similarity-based annotations. In the second part of the work, a new machine learning based method was implemented for the prediction of GPI-anchored proteins. The method efficiently predicts, from the raw amino acid sequence, both the presence of the GPI anchor (by means of an SVM) and the position in the sequence of the post-translational modification event, the so-called ω-site (by means of a Hidden Markov Model (HMM)). The method is called GPIPE and was reported to greatly improve the prediction of GPI-anchored proteins over all previously developed methods. GPIPE was able to predict up to 88% of the experimentally annotated GPI-anchored proteins while maintaining a false positive rate as low as 0.1%. GPIPE was used to completely annotate 81 eukaryotic genomes, and more than 15000 putative GPI-anchored proteins were predicted, 561 of which are found in H. sapiens. On average, 1% of a proteome is predicted as GPI-anchored. A statistical analysis of the composition of the regions surrounding the ω-site allowed the definition of specific amino acid abundances in the different regions considered. Furthermore, the hypothesis proposed in the literature that compositional biases are present among the four major eukaryotic kingdoms was tested and rejected. All the developed predictors and databases are freely available at: BaCelLo http://gpcr.biocomp.unibo.it/bacello eSLDB http://gpcr.biocomp.unibo.it/esldb GPIPE http://gpcr.biocomp.unibo.it/gpipe
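The two figures quoted for GPIPE (fraction of annotated GPI-anchored proteins recovered, and false positive rate) are standard confusion-matrix quantities. The sketch below only illustrates how such rates are computed; the function names and the counts are hypothetical and are not taken from the thesis.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real positives that the predictor recovers."""
    return true_pos / (true_pos + false_neg)


def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Fraction of real negatives that are wrongly predicted as positive."""
    return false_pos / (false_pos + true_neg)


# Hypothetical counts, chosen only to reproduce rates of the same
# magnitude as those quoted in the abstract.
print(sensitivity(true_pos=88, false_neg=12))
print(false_positive_rate(false_pos=1, true_neg=999))
```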
Abstract:
In the last few decades, bioinformatics has played a fundamental role in making sense of the huge amount of data produced. Once the complete sequence of a genome has been obtained, the crucial problem is to learn as much as possible about its coding regions. Protein sequence annotation is challenging and, due to the size of the problem, only computational approaches can provide a feasible solution. As recently pointed out by the Critical Assessment of Function Annotations (CAFA), the most accurate methods are those based on the transfer-by-homology approach, and the most incisive contribution is given by cross-genome comparisons. This thesis describes a non-hierarchical sequence clustering method for automatic large-scale protein annotation, called “The Bologna Annotation Resource Plus” (BAR+). The method is based on an all-against-all alignment of more than 13 million protein sequences, characterized by a very stringent metric. BAR+ can safely transfer functional features (Gene Ontology and Pfam terms) within clusters by means of statistical validation, even in the case of multi-domain proteins. Within BAR+ clusters it is also possible to transfer the three-dimensional structure (when a template is available). This is made possible by cluster-specific HMM profiles that can be used to calculate reliable template-to-target alignments even in the case of distantly related proteins (sequence identity < 30%). Other BAR+ based applications were developed during my doctorate, including the prediction of magnesium binding sites in human proteins, the classification of the ABC transporter superfamily, and the functional prediction (GO terms) of the CAFA targets. Remarkably, in the CAFA assessment, BAR+ placed among the ten most accurate methods. At present BAR+ is freely available as a web server for functional and structural protein sequence annotation at http://bar.biocomp.unibo.it/bar2.0.
Abstract:
Motivation: A topical issue of great interest, from both a theoretical and an applied perspective, is the analysis of biological sequences to disclose the information that they encode. The development of new genome sequencing technologies in recent years has opened new fundamental problems, since huge amounts of biological data still await interpretation. Indeed, sequencing is only the first step of the genome annotation process, which consists in assigning biological information to each sequence. Hence, given the large amount of available data, in silico methods have become useful and necessary for extracting relevant information from sequences. The availability of data from genome projects gave rise to new strategies for tackling the basic problems of computational biology, such as the determination of the three-dimensional structures of proteins, their biological function and their reciprocal interactions. Results: The aim of this work was the implementation of predictive methods that allow the extraction of information on the properties of genomes and proteins starting from nucleotide and amino acid sequences, by taking advantage of the information provided by the comparison of genome sequences from different species. The first part of the work describes a comprehensive large-scale genome comparison of 599 organisms. 2.6 million sequences from 551 prokaryotic and 48 eukaryotic genomes were aligned and clustered on the basis of their sequence identity. This procedure led to the identification of classes of proteins that are peculiar to the different groups of organisms. Moreover, the adopted similarity threshold produced clusters that are homogeneous from the structural point of view and that can be used for structural annotation of uncharacterized sequences.
The second part of the work focuses on the characterization of thermostable proteins and on the development of tools able to predict the thermostability of a protein starting from its sequence. The codon composition of a non-redundant database comprising 116 prokaryotic genomes was analyzed by means of Principal Component Analysis, and it was shown that a cross-genomic approach can allow the extraction of common determinants of thermostability at the genome level, leading to an overall accuracy of 95% in discriminating thermophilic coding sequences. This result outperforms those obtained in previous studies. Moreover, we investigated the effect of multiple mutations on protein thermostability. This issue is of great importance in the field of protein engineering, since thermostable proteins are generally more suitable than their mesostable counterparts in technological applications. A Support Vector Machine based method was trained to predict whether a set of mutations can enhance the thermostability of a given protein sequence. The developed predictor achieves 88% accuracy.
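As an illustration of what a codon composition feature can look like, the sketch below turns a coding sequence into a vector of 64 relative codon frequencies, the kind of vector that could then be passed to a Principal Component Analysis. This is an assumption about the general approach: the abstract does not give the exact feature definition used in the thesis, and the function name and example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# The 64 codons in a fixed order, so every sequence maps to the same layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds: str) -> list[float]:
    """Relative frequency of each of the 64 codons in a coding sequence.

    The sequence is read in frame from its first base; a trailing partial
    codon and codons containing ambiguous bases (e.g. N) are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    if total == 0:
        return [0.0] * len(CODONS)
    return [counts[c] / total for c in CODONS]


# Toy example: four codons (ATG, GCC, GCC, TAA).
freqs = codon_frequencies("ATGGCCGCCTAA")
print(freqs[CODONS.index("GCC")])
```

Averaging or concatenating such vectors over all coding sequences of a genome would give one row per genome for a subsequent PCA.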
Abstract:
The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it became urgent to develop new methods that are able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein folding recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances related to the specific type of amino acid pair that are encoded in the so-called contact map. An interesting new way of analyzing those structures came out when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to which extent the characteristic path length and clustering coefficient of the protein contacts network are values that reveal characteristic features of protein contact maps. 
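The two network measures mentioned above can be computed directly from a contact map. The sketch below is a minimal illustration, assuming one coordinate per residue and a distance cutoff of 8 Å; the cutoff, the function names, and the toy coordinates are assumptions and are not taken from the thesis.

```python
from collections import deque
from math import dist


def contact_graph(coords, cutoff=8.0):
    """Adjacency sets linking residues whose coordinates lie within `cutoff`."""
    adj = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if dist(coords[i], coords[j]) <= cutoff:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        nb = list(nbrs)
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nb[b] in adj[nb[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen[v] = seen[u] + 1
                    queue.append(v)
        total += sum(seen.values())
        pairs += len(seen) - 1
    return total / pairs if pairs else 0.0


# Toy example: four points on a line, 5 Å apart, so only neighbours touch.
chain = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
graph = contact_graph(chain)
print(clustering_coefficient(graph), characteristic_path_length(graph))
```

A small-world network combines a high clustering coefficient with a short characteristic path length, so both values are usually compared against those of a random graph with the same number of nodes and edges.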
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments, to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for the coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still deserve an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently, approximately only 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized. 
A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists in assigning sequences to a specific group of functionally related sequences that have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated transfer-through-inheritance procedure for molecular functions and structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of multi-domain protein annotation and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates on the length of protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in the current databases of molecular functions and structures.
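The clustering step described above (pairwise alignments filtered by identity and coverage, then grouped) can be sketched as connected components over the accepted pairs. Everything specific here is an assumption made for illustration: the thresholds are placeholders, coverage is taken as alignment length divided by the longer sequence length, and the tuple layout and function name are hypothetical. The abstract does not state the actual values or definitions.

```python
def cluster_by_identity_and_coverage(sequences, hits,
                                     min_identity=40.0, min_coverage=0.9):
    """Group sequences into connected components of accepted alignments.

    `hits` holds tuples (query, subject, identity_percent, alignment_length,
    query_length, subject_length). Coverage is computed on the longer
    sequence, so a short domain matching part of a long multi-domain
    protein does not pull the two into the same cluster.
    """
    parent = {s: s for s in sequences}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, aln_len, q_len, s_len in hits:
        coverage = aln_len / max(q_len, s_len)
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(query)] = find(subject)

    clusters = {}
    for s in sequences:
        clusters.setdefault(find(s), set()).add(s)
    return sorted(sorted(members) for members in clusters.values())


# Toy example: only the A-B hit passes both constraints.
hits = [
    ("A", "B", 55.0, 95, 100, 100),   # accepted
    ("B", "C", 80.0, 50, 100, 100),   # rejected: coverage 0.5
    ("C", "D", 35.0, 100, 100, 100),  # rejected: identity below threshold
]
print(cluster_by_identity_and_coverage(["A", "B", "C", "D"], hits))
```

Requiring coverage on the longer sequence is one way to keep clusters homogeneous in sequence length, which is the property the abstract highlights for multi-domain proteins.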
Abstract:
The Poxviruses are a family of double stranded DNA (dsDNA) viruses that cause disease in many species, both vertebrate and invertebrate. Their genomes range in size from 135 to 365 kbp and show conservation in both organization and content. In particular, the central genomic regions of the chordopoxvirus subfamily (those capable of infecting vertebrates) contain 88 genes which are present in all the virus species characterised to date and which mostly occur in the same order and orientation. In contrast, however, the terminal regions of the genomes frequently contain genes that are species or genera-specific and that are not essential for the growth of the virus in vitro but instead often encode factors with important roles in vivo including modulation of the host immune response to infection and determination of the host range of the virus. The Parapoxviruses (PPV), of which Orf virus is the prototypic species, represent a genus within the chordopoxvirus subfamily of Poxviridae and are characterised by their ability to infect ruminants and humans. The genus currently contains four recognised species of virus, bovine papular stomatitis virus (BPSV) and pseudocowpox virus (PCPV) both of which infect cattle, orf virus (OV) that infects sheep and goats, and parapoxvirus of red deer in New Zealand (PVNZ). The ORFV genome has been fully sequenced, as has that of BPSV, and is ~138 kb in length encoding ~132 genes. The vast majority of these genes allow the virus to replicate in the cytoplasm of the infected host cell and therefore encode proteins involved in replication, transcription and metabolism of nucleic acids. These genes are well conserved between all known genera of poxviruses. There is however another class of genes, located at either end of the linear dsDNA genome, that encode proteins which are non-essential for replication and generally dictate host range and virulence of the virus. 
The non-essential genes are often the most variable within and between species of virus and are therefore potentially useful for diagnostic purposes. Given their role in subverting the host immune response to infection, they are also targets for novel therapeutics. The function of only a relatively small number of these proteins has been elucidated, and there are several genes whose function still remains obscure, principally because there is little similarity between them and proteins of known function in current sequence databases. It is thought that by selectively removing some of the virulence genes, or at least neutralising the proteins in some way, current vaccines could be improved. The evolution of poxviruses has been proposed to be an adaptive process involving frequent events of gene gain and loss, such that the virus co-evolves with its specific host. Gene capture or horizontal gene transfer from the host to the virus is considered an important source of new viral genes, including those likely to be involved in host range and those enabling the virus to interfere with the host immune response to infection. Given the low rate of nucleotide substitution, recombination can be seen as an essential evolutionary driving force, although it is likely underestimated. Recombination in poxviruses is intimately linked to DNA replication, with both viral and cellular proteins participating in this recombination-dependent replication. It has been shown, in other poxvirus genera, that recombination between isolates and perhaps even between species does occur, thereby providing another mechanism for the acquisition of new genes and for the rapid evolution of viruses. Such events may result in viruses that have a selective advantage over others, for example in re-infections (a characteristic of the PPV), or in viruses that are able to jump the species barrier and infect new hosts.
Sequence data from viral strains isolated from goats suggest that recombination events may have occurred between OV and PCPV (Ueda et al. 2003). Recombination events are frequent during poxvirus replication, and comparative genomic analysis of several poxvirus species has revealed that recombination occurs frequently in the right terminal region. Intraspecific recombination can occur between strains of the same PPV species, but interspecific recombination can also happen, provided there is enough sequence similarity to enable recombination between distinct PPV species. The most important prerequisite for successful recombination is the coinfection of the individual host by different virus strains or species. Consequently, the following factors affecting the distribution of different viruses to shared target cells need to be considered: dose of inoculated virus, time interval between inoculation of the first and the second virus, distance between the marker mutations, and genetic homology. At present there are no available data on the replication dynamics of PPV in permissive and non-permissive hosts and, regarding co-infections, there is no information on the interference mechanisms occurring during the simultaneous replication of viruses of different species. This work was carried out to set up permissive substrates allowing the replication of different PPV species, in particular keratinocyte monolayers and organotypic skin cultures. Furthermore, a method to isolate and expand ovine skin stem cells was set up to investigate further aspects of viral cellular tropism during natural infection. The study produced important data to elucidate the replication dynamics of OV and PCPV in vitro as well as the mechanisms of interference that can arise during co-infection with different viral species.
Moreover, the analysis carried on the genomic right terminal region of PCPV 1303/05 contributed to a better knowledge of the viral genes involved in host interaction and pathogenesis as well as to locate recombination breakpoints and genetic homologies between PPV species. Taken together these data filled several crucial gaps for the study of interspecific recombinations of PPVs which are thought to be important for a better understanding of the viral evolution and to improve the biosafety of antiviral therapy and PPV-based vectors.
Abstract:
This PhD Thesis is the result of my research activity in the last three years. My main research interest was centered on the evolution of mitochondrial genome (mtDNA), and on its usefulness as a phylogeographic and phylogenetic marker at different taxonomic levels in different taxa of Metazoa. From a methodological standpoint, my main effort was dedicated to the sequencing of complete mitochondrial genomes, and the approach to whole-genome sequencing was based on the application of Long-PCR and shotgun sequences. Moreover, this research project is a part of a bigger sequencing project of mtDNAs in many different Metazoans’ taxa, and I mostly dedicated myself to sequence and analyze mtDNAs in selected taxa of bivalves and hexapods (Insecta). Sequences of bivalve mtDNAs are particularly limited, and my study contributed to extend the sampling. Moreover, I used the bivalve Musculista senhousia as model taxon to investigate the molecular mechanisms and the evolutionary significance of their aberrant mode of mitochondrial inheritance (Doubly Uniparental Inheritance, see below). In Insects, I focused my attention on the Genus Bacillus (Insecta Phasmida). A detailed phylogenetic analysis was performed in order to assess phylogenetic relationships within the genus, and to investigate the placement of Phasmida in the phylogenetic tree of Insecta. The main goal of this part of my study was to add to the taxonomic coverage of sequenced mtDNAs in basal insects, which were only partially analyzed.
Resumo:
It was decided to carry out a morphological and molecular characterization of the Italian Alternaria isolates collected from apple, to evaluate their pathogenicity, and subsequently to combine the data collected. The strain collection (174 isolates) was built from material received from extension service personnel between June and August of 2007, 2008, and 2009. Preliminary bioassays were performed on detached plant material (wounded and unwounded fruit and leaves) of the cultivar Golden, with two different kinds of inoculum (conidial suspension and conidial filtrate). Symptoms were monitored daily and a pathogenicity score (P.S.) was assigned on the basis of the diameter of the necrotic area that developed. On the basis of the bioassays, the number of isolates to undergo further molecular analysis was restricted to a representative set of single-spore strains (44 strains). Morphological characteristics of the colony and the sporulation pattern were determined according to previous systematic work on small-spored Alternaria spp. (Pryor and Michaelides, 2002 and Hong et al., 2006). The reference strains used in the study (Alternaria alternata, Alternaria tenuissima, Alternaria arborescens and four Japanese strains of the Alternaria alternata mali pathotype) were kindly provided by Prof. Barry Pryor, who allows open access to his fungal collection. Molecular characterization was performed by combining and comparing data sets obtained from two distinct molecular approaches: 1) investigation of specific loci and 2) fingerprinting based on diverse, randomly selected polymorphic sites of the genome. For the single-locus analysis, the partial EndoPG gene and three anonymous regions (OPA1-3, OPA2- and OPA10-2) were sequenced. These markers have proved to be powerful tools in recent systematic work on small-spored Alternaria spp. 
In fact, as reported in the literature, the taxonomy of small-spored Alternaria is complicated by the inability to resolve evolutionary relationships among the taxa, owing to the lack of variability in the markers commonly used in fungal systematics. The three data sets together provided the variation necessary to establish the phylogenetic relationships among the Italian isolates of Alternaria spp. On the Italian strains these markers showed a variable number of informative sites (ranging from 7 for EndoPG to 85 for OPA1-3), and the parsimony analysis produced different tree topologies, all of which agreed in defining A. arborescens as a monophyletic clade. Fingerprinting analysis (nine ISSR primers and eight AFLP primer combinations) led to the same result: a monophyletic A. arborescens clade and one clade containing both the A. tenuissima and the A. alternata strains. This first attempt to characterize Italian Alternaria species recovered from apple produced results concordant with those already described in similar phylogenetic studies on pistachio (Pryor and Michaelides, 2002), walnut and hazelnut (Hong et al., 2006), apple (Kang et al., 2002) and citrus (Peever et al., 2004). Together with these studies, this research demonstrates that the three morphological groups are widely distributed and occupy similar ecological niches. Furthermore, this research suggests that these Alternaria species exhibit a similar infection pattern despite their taxonomic and pathogenic differences. The molecular characterization of the pathogens is a fundamental step towards understanding the disease that is spreading in the apple orchards of northern Italy. Initially the causal agent was considered to be Alternaria alternata (Marshall and Bertagnoll, 2006). Their preliminary studies proposed a pathogenic system related to the synthesis of toxins. 
Experimental data from our bioassays suggest an analogous hypothesis, given that symptoms could be induced by inoculating plant material with the filtrate of pathogenic strains alone. Moreover, positive PCR reactions using AM-toxin gene-specific primers, designed for the identification of the apple-infecting Alternaria pathotype, led to the hypothesis that one or more host-specific toxins are involved. It remains an intriguing challenge to determine whether the agent of the “Italian disease” is the same as the one previously typified as Alternaria mali, the causal agent of apple blotch disease.
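The abstract above reports the number of informative sites per marker (7 for EndoPG, 85 for OPA1-3) without saying how they were counted. As a minimal sketch, assuming the usual definition of a parsimony-informative site (at least two character states, each present in at least two sequences), the count could be obtained from an alignment as follows. The function names, the toy alignment, and the choice to treat `-`, `?` and `N` as missing data are illustrative assumptions, not details taken from the thesis.

```python
from collections import Counter


def is_parsimony_informative(column, missing="-?N"):
    """Return True if an alignment column is parsimony-informative.

    Assumed definition: at least two different character states occur,
    and at least two of those states occur in two or more sequences.
    Characters listed in `missing` are ignored (an assumption).
    """
    counts = Counter(c.upper() for c in column if c.upper() not in missing)
    states_seen_twice = sum(1 for n in counts.values() if n >= 2)
    return states_seen_twice >= 2


def count_informative_sites(alignment):
    """Count parsimony-informative columns in a list of aligned sequences."""
    if len({len(seq) for seq in alignment}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*alignment) walks the alignment column by column.
    return sum(is_parsimony_informative(col) for col in zip(*alignment))


if __name__ == "__main__":
    # Hypothetical four-taxon alignment, for illustration only.
    toy_alignment = ["ACGT", "ACGA", "AGTA", "AGTT"]
    print(count_informative_sites(toy_alignment))
```

A column in which only one sequence differs from the rest is variable but not informative under this definition, which is why markers with little variation yield so few usable sites.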
Resumo:
Animal neocentromeres are defined as ectopic centromeres that have formed in non-centromeric locations and lack some of the features, such as satellite DNA sequences, that normally characterize canonical centromeres. Despite this, they are stable, functional centromeres inherited through generations. The mere existence of neocentromeres provides convincing evidence that centromere specification is determined by epigenetic rather than sequence-specific mechanisms. For all these reasons, we used them as simplified models to investigate the molecular mechanisms that underlie the formation and maintenance of functional centromeres. We collected human cell lines carrying neocentromeres in different positions. To investigate the region involved in the process at the DNA sequence level, we applied a recent technology that integrates chromatin immunoprecipitation and DNA microarrays (ChIP-on-chip), using rabbit polyclonal antibodies directed against the human centromeric proteins CENP-A or CENP-C. These DNA-binding proteins are required for kinetochore function and are exclusively targeted to functional centromeres. Thus, the immunoprecipitation of DNA bound by these proteins allows the isolation of centromeric sequences, including those of neocentromeres. Neocentromeres can arise even in regions containing protein-coding genes. We further analyzed whether the increased number of scaffold attachment sites and the correspondingly tighter chromatin of the region involved in neocentromerization were still permissive to transcription of the genes encoded within it. Centromere repositioning is a phenomenon in which a neocentromere that has arisen without altering gene order, followed by inactivation of the canonical centromere, becomes fixed in the population. It is a process of chromosome rearrangement that is fundamental in evolution and lies at the basis of speciation. 
The repeat-free region where the neocentromere initially forms progressively acquires extended arrays of satellite tandem repeats that may contribute to its functional stability. With this in mind, we focused on the repositioned horse ECA11 centromere. ChIP-on-chip analysis was used to define the region involved, and SNP studies mapping within the region involved in neocentromerization were carried out. We were able to describe the structural polymorphism of the chromosome 11 centromeric domain in the horse (Equus caballus) population. That polymorphism was seen even between homologous chromosomes of the same cell. This is the first description of such a polymorphism. Genomic plasticity has played a fundamental role in evolution. Centromeres are not static, packaged regions of genomes. The key question that fascinates biologists is how this centromere plasticity can be reconciled with the stability and maintenance of centromeric function. Starting from the epigenetic view of centromere formation, we decided to analyze the RNA content of centromeric chromatin. RNA, like the secondary chemical modifications that affect both histones and DNA, is a good candidate for guiding centromere formation and maintenance. Many observations suggest that transcription of centromeric DNA or of other non-coding RNAs could affect centromere formation. To date there has been no thorough investigation addressing the identity of chromatin-associated RNAs (CARs) on a global scale. This prompted us to develop techniques to identify CARs genome-wide using high-throughput genomic platforms. The future goal of this study is to focus specifically on what happens within centromeric chromatin.
Resumo:
In the last decade, the reverse vaccinology approach shifted the paradigm of vaccine discovery from conventional culture-based methods to high-throughput genome-based approaches for the development of recombinant protein-based vaccines against pathogenic bacteria. Besides reaching its main goal of identifying new vaccine candidates, this new procedure also produced a huge amount of molecular knowledge about them. In the present work, we explored this knowledge in a species-independent way and performed a systematic in silico molecular analysis of more than 100 protective antigens, looking at their sequence similarity, domain composition and protein architecture in order to identify possible common molecular features. This meta-analysis revealed that, despite low sequence similarity, most of the known bacterial protective antigens share structural/functional Pfam domains as well as specific protein architectures. Based on this, we formulated the hypothesis that the occurrence of these molecular signatures can be predictive of possible protective properties of other proteins in different bacterial species. We tested this hypothesis in Streptococcus agalactiae and identified four new protective antigens. Moreover, in order to provide a second proof of concept for our approach, we used Staphylococcus aureus as a second pathogen and identified five new protective antigens. This new knowledge-driven selection process, named MetaVaccinology, represents the first in silico vaccine discovery tool that is based on conserved and predictive molecular and structural features of bacterial protective antigens and does not depend on the prediction of their sub-cellular localization.
Resumo:
The grape berry is considered a non-climacteric fruit, but there is some evidence that ethylene plays a role in the control of berry ripening. This PhD thesis aimed to give insights into the role of ethylene and ethylene-related genes in the regulation of grape berry ripening. During this study a small increase in ethylene concentration one week before véraison was measured in Vitis vinifera L. ‘Pinot Noir’ grapes, confirming previous findings in ‘Cabernet Sauvignon’. In addition, ethylene-related genes were identified in the grapevine genome sequence. As in other species, ethylene biosynthesis and receptor genes are present in grapevine as multi-gene families, and their expression appeared to be tissue-specific or developmentally specific. All the other elements of the ethylene signal transduction cascade were also identified in the grape genome. Among them were ethylene response factors (ERFs), which modulate the transcription of many effector genes in response to ethylene. In this study seven grapevine ERFs were characterized, and they showed expression profiles specific to tissue and to stage of berry development. Two sequences, VvERF045 and VvERF063, seemed likely to be involved in the control of berry ripening because of their expression profiles and their sequence annotation. VvERF045 was induced before véraison and was specific to the ripe berry; by sequence similarity it is likely a transcription activator. VvERF063 displayed high sequence similarity to repressors of transcription, and its expression, very high in green berries, was lowest at véraison and during ripening. To functionally characterize VvERF045 and VvERF063, a stable transformation strategy was chosen. Both sequences were cloned into vectors for over-expression and silencing and transferred into grapevine by Agrobacterium-mediated or biolistic gene transfer. In vitro, transgenic plants over-expressing VvERF045 displayed an epinastic phenotype whose extent correlated with the transgene expression level. 
Four pathogen stress response genes were significantly induced in the transgenic plants, suggesting a putative function of VvERF045 in biotic stress defense during berry ripening. Further molecular analysis of the transgenic plants will help to identify the actual VvERF045 target genes and, together with the phenotypic characterization of the adult transgenic plants, will allow the role of VvERF045 in berry ripening to be defined in detail.
Resumo:
Like other vascular tumors, epithelioid hemangioendothelioma (EHE) is multifocal in approximately 50% of cases, and it is unclear whether the separate lesions represent multifocal disease or metastases. We hypothesized that the identification of an identical WWTR1-CAMTA1 rearrangement in different EHEs from the same patient supports the monoclonal origin of EHE. To test our hypothesis, we undertook a molecular analysis of two multicentric EHEs of the liver, including separate tumor samples from each patient. Materials and Methods: We retrieved two cases of EHE with tissue available for molecular analysis. In both cases, fluorescence in situ hybridization (FISH) was performed to identify the WWTR1-CAMTA1 rearrangement and so confirm the histologic diagnosis of EHE, as previously described. The reverse transcription-polymerase chain reaction (RT-PCR) products were analyzed by electrophoresis, and the RT-PCR–amplified products were sequenced using the Sanger method. Results: FISH analysis revealed signal abnormalities in both WWTR1 and CAMTA1. Combined results confirmed the presence of the t(1;3)(1p36.23;3q25.1) translocation in both cases of EHE. Using RT-PCR analysis, we found that the size of the rearranged bands was identical in the different tumors from each patient. The sequence of the fusion gene confirmed a different WWTR1-CAMTA1 rearrangement in each patient, but an identical WWTR1-CAMTA1 rearrangement in the different lesions from each patient. Discussion: Because of its generally indolent clinical course, EHE is commonly classified as a multifocal, rather than metastatic, disease. In this study, we examined two cases of multifocal liver EHE and found an identical WWTR1-CAMTA1 rearrangement in each lesion from the same patient, but not between the two patients. 
These findings suggest that multifocal EHE arises from metastasis of the same neoplastic clone rather than from the simultaneous formation of multiple neoplastic clones, which supports the monoclonal origin of multifocal EHE.
Resumo:
Oncolytic virotherapy exploits the ability of viruses to infect and kill cells. It is suitable as a treatment for tumors that are not accessible by surgery and/or respond poorly to current therapeutic approaches. HSV is a promising oncolytic agent. Its large genome can accommodate large transgenes, and some attenuated oncolytic HSVs (oHSV) are already in phase I and II clinical trials. The aim of this thesis was the generation of HSV-1 retargeted to tumor-specific receptors and detargeted from the natural HSV receptors, HVEM and Nectin-1. The retargeting was achieved by inserting a single-chain antibody (scFv) specific for the selected tumor receptor into the HSV glycoprotein gD. In this research three tumor receptors were considered: human epidermal growth factor receptor 2 (HER2), overexpressed in 25-30% of breast and ovarian cancers and gliomas; prostate-specific membrane antigen (PSMA), expressed in prostate carcinomas and in the neovasculature of solid tumors; and epidermal growth factor receptor variant III (EGFRvIII). In vivo studies on the HER2-retargeted viruses R-LM113 and R-LM249 have demonstrated their high safety profile. For R-LM249, antitumor efficacy was shown by target-specific inhibition of the growth of human tumors in models of HER2-positive breast and ovarian cancer in nude mice. In a murine model of HER2-positive glioma in nude mice, R-LM113 significantly increased the survival time of treated mice compared with controls. So far, the PSMA and EGFRvIII viruses (R-LM593 and R-LM613) have been characterized only in vitro, where specific retargeting to the selected targets was confirmed. This strategy has proved to be generally applicable to a broad spectrum of receptors for which a single-chain antibody is available.
Resumo:
The objective of this work is to characterize chromosome 1 of the genome of A. thaliana, a small flowering plant used as a model organism in biology and genetics, on the basis of a recent mathematical model of the genetic code. I analyze and compare different portions of the genome: genes, exons, coding sequences (CDS), introns, long introns, intergenes, untranslated regions (UTR) and regulatory sequences. To accomplish this task, I transformed the nucleotide sequences into binary sequences based on the definition of the three different dichotomic classes. The descriptive analysis of the binary strings indicates the presence of regularities in each portion of the genome considered. In particular, there are remarkable differences between coding sequences (CDS and exons) and non-coding sequences, suggesting that the frame is important only for coding sequences and that dichotomic classes can be useful for recognizing them. I then assessed the existence of short-range dependence between the binary sequences computed on the basis of the different dichotomic classes. I used three different measures of dependence: the well-known chi-squared test and two indices derived from the concept of entropy, i.e. Mutual Information (MI) and Sρ, a normalized version of the “Bhattacharyya-Hellinger-Matusita distance”. The results show a significant short-range dependence structure only for the coding sequences, which hints at an underlying error detection and correction mechanism. Further studies are needed to assess how the information carried by dichotomic classes could discriminate between coding and non-coding sequences and, therefore, help to unveil the role of the mathematical structure in error detection and correction mechanisms. Still, I have shown the potential of this approach for understanding the management of genetic information.
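Two of the dependence measures named in the abstract above, the chi-squared statistic and Mutual Information, can be illustrated on a pair of binary sequences. The following is a minimal sketch only: the abstract does not define the dichotomic classes, so the encoding step is not reproduced and the binary sequences are taken as given. The function names and the `lagged_pairs` helper for comparing sequences at a short lag are hypothetical, and Sρ is omitted.

```python
import math
from collections import Counter


def lagged_pairs(x, y, lag):
    """Pair x[i] with y[i + lag]; lag=0 compares the sequences position by position."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    if not 0 <= lag < len(x):
        raise ValueError("lag must satisfy 0 <= lag < len(x)")
    return x[: len(x) - lag], y[lag:]


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two equal-length sequences."""
    n = len(x)
    joint, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        # Only observed pairs contribute; unobserved pairs add zero.
        mi += p_ab * math.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi


def chi_squared_statistic(x, y):
    """Pearson chi-squared statistic of the contingency table of x against y."""
    n = len(x)
    joint, px, py = Counter(zip(x, y)), Counter(x), Counter(y)
    stat = 0.0
    for a in px:
        for b in py:
            expected = px[a] * py[b] / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    # Toy binary strings, not derived from any real genome.
    s1 = [0, 1, 0, 1, 1, 0, 0, 1]
    s2 = [1, 0, 0, 1, 0, 1, 1, 0]
    for lag in range(3):
        a, b = lagged_pairs(s1, s2, lag)
        print(lag, mutual_information(a, b), chi_squared_statistic(a, b))
```

Both quantities are zero when the empirical joint distribution factorizes exactly and grow as the two sequences become more dependent. Turning the chi-squared statistic into a test additionally requires comparing it with the chi-squared distribution for the appropriate degrees of freedom (one, for a 2×2 table).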