3 results for Similarity analysis
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
As distributed collaborative applications and architectures adopt policy-based management for tasks such as access control, network security and data privacy, the management and consolidation of a large number of policies is becoming a crucial component of such policy-based systems. In large-scale distributed collaborative applications such as web services, there is a need to analyze policy interactions and to integrate policies. In this thesis, we propose and implement EXAM-S, a comprehensive environment for policy analysis and management, which can be used to perform a variety of functions such as policy property analysis, policy similarity analysis and policy integration. As part of this environment, we have proposed and implemented new techniques for the analysis of policies that build on a deep study of state-of-the-art techniques. Moreover, we propose an approach for solving the heterogeneity problems that usually arise when analyzing policies belonging to different domains. Our work focuses on the analysis of access control policies written in the dialect of XACML (Extensible Access Control Markup Language). We consider XACML policies because XACML is a rich language that can represent many policies of interest to real-world applications and is gaining widespread adoption in industry.
Abstract:
The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of less than 1% of sequences have been experimentally solved. For this reason, it has become urgent to develop new methods that can computationally extract relevant information from protein sequence and structure. The starting point of my work was the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein fold recognition and de novo design. The prediction of these contacts requires the study of the inter-residue distances in a protein, related to the specific type of amino acid pair, that are encoded in the so-called contact map. An interesting new way of analyzing these structures emerged when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps, and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied the extent to which the characteristic path length and clustering coefficient of the protein contact network reveal characteristic features of protein contact maps.
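The two network measures named above can be illustrated with a minimal sketch. It assumes one representative 3D coordinate per residue and a distance cutoff of 8 Å for defining a contact; neither choice is stated in the abstract, and the function names are hypothetical. It builds a contact map as an adjacency structure and then computes the average clustering coefficient and the characteristic path length with breadth-first search.

```python
import math
from collections import deque
from itertools import combinations


def contact_map(coords, cutoff=8.0):
    """Adjacency sets: residues i and j are in contact if their distance <= cutoff.

    The 8.0 cutoff (in angstroms) is an illustrative assumption.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Average local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for neighbours in adj.values():
        k = len(neighbours)
        if k < 2:
            continue
        # Count edges that exist between pairs of neighbours of this node.
        links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected residue pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")
```

A small-world network is usually characterized by a high clustering coefficient together with a short characteristic path length, so computing both values for a given contact map is the first step towards comparing it with random or regular graphs of the same size.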
Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors, or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards understanding the complex processes involved in biological networks. The rapid growth in the number of available protein sequences and structures poses new fundamental problems that still await interpretation. Nevertheless, these data are the basis for the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently only about 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized.
A commonly adopted procedure for annotating protein sequences relies on "inheritance through homology", based on the notion that similar sequences share similar functions and structures. This procedure consists of assigning sequences to a specific group of functionally related sequences that have been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but do not necessarily share the same function, and to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validated a system that contributes to sequence annotation by taking advantage of a validated procedure for transferring molecular functions and structural templates through inheritance. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicitly addresses the problem of annotating multi-domain proteins and allows a fine-grained division of the whole set of proteomes used, which ensures cluster homogeneity in terms of sequence length. A high level of coverage of structural templates over the length of the protein sequences within clusters ensures that multi-domain proteins, when present, can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences, considering the information available in the present databases of molecular functions and structures.
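The clustering step described above can be sketched as follows. The abstract does not give the identity and coverage thresholds, the exact definition of coverage, or the linkage rule, so everything here is an assumption made for illustration: the thresholds (40% identity, 90% coverage) are placeholders, coverage is taken as the alignment length divided by the length of the longer sequence, and clusters are the connected components of the pairs that pass both constraints. The function name and input layout are hypothetical.

```python
def cluster_sequences(hits, lengths, min_identity=40.0, min_coverage=0.9):
    """Group sequences into clusters from pairwise alignment hits.

    hits:    iterable of (query, subject, percent_identity, alignment_length)
    lengths: dict mapping sequence name -> sequence length
    Thresholds are illustrative placeholders, not values from the thesis.
    """
    parent = {name: name for name in lengths}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, align_len in hits:
        # Assumed coverage definition: the alignment must span most of the
        # LONGER sequence, so a single shared domain cannot link a short
        # protein to a long multi-domain one.
        coverage = align_len / max(lengths[query], lengths[subject])
        if identity >= min_identity and coverage >= min_coverage:
            parent[find(query)] = find(subject)

    clusters = {}
    for name in lengths:
        clusters.setdefault(find(name), set()).add(name)
    return sorted((sorted(members) for members in clusters.values()),
                  key=lambda members: members[0])


lengths = {"A": 100, "B": 105, "C": 300, "D": 100}
hits = [
    ("A", "B", 55.0, 98),   # passes both constraints
    ("A", "C", 90.0, 100),  # high identity, but covers only a third of C
    ("B", "D", 30.0, 100),  # full coverage, but identity too low
]
print(cluster_sequences(hits, lengths))
```

In this toy input, A and C share a highly similar region, but C is three times longer, so the coverage constraint keeps them in separate clusters. This is the sense in which a coverage requirement keeps clusters homogeneous in sequence length.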
Abstract:
Apple consumption is highly recommended for a healthy diet, and apple is the most important fruit produced in temperate climate regions. Unfortunately, it is also one of the fruits that most often provokes allergy in atopic patients, and the only treatment available to date for apple-allergic patients is avoidance. Apple allergy is due to the presence of four major classes of allergens: Mal d 1 (PR-10/Bet v 1-like proteins), Mal d 2 (thaumatin-like proteins), Mal d 3 (lipid transfer protein) and Mal d 4 (profilin). In this work, new advances in the characterization of apple allergen gene families were achieved using a multidisciplinary approach. First, a genomic approach was used to characterize the allergen gene families of Mal d 1 (the task of Chapter 1) and of Mal d 2 and Mal d 4 (the task of Chapter 5). In particular, in Chapter 1 the study of two large contiguous blocks of DNA sequence containing the Mal d 1 gene cluster on LG16 yielded many new findings on the number and orientation of genes in the cluster, their physical distances, their regulatory sequences and the presence of other genes or pseudogenes in this genomic region. Three new members were discovered co-localizing with the other Mal d 1 genes of LG16, suggesting that the complexity of the genetic basis of allergenicity will increase with new advances. Many retrotransposon elements were also found in this cluster. Thanks to the development of molecular markers on the two sequences, the physical and genetic maps of the region were successfully anchored. Moreover, in Chapter 5 the existence of loci for the thaumatin-like protein family in apple (Mal d 2.03 on LG4 and Mal d 2.02 on LG17) other than the one reported so far was demonstrated for the first time. In addition, one new locus for profilins (Mal d 4.04) was mapped on LG2, close to the Mal d 4.02 locus, suggesting a cluster organization for this gene family, as is well documented for the Mal d 1 family.
Secondly, a methodological approach was used to set up a highly specific tool to discriminate and quantify the expression of each Mal d 1 allergen gene (the task of Chapter 2). In particular, a set of 20 Mal d 1 gene-specific primer pairs for quantitative real-time PCR was validated and optimized. As a first application, this tool was used on leaf and fruit tissues of the cultivar Florina in order to identify the Mal d 1 allergen genes that are expressed in different tissues. The differential expression found in this study revealed tissue specificity for some Mal d 1 genes: 10 of the 20 Mal d 1 genes were expressed in fruits and are therefore probably more involved in allergic reactions, while 17 of the 20 Mal d 1 genes were expressed in leaves challenged with the fungus Venturia inaequalis and are therefore probably of interest for the study of the plant defense mechanism. In Chapter 3 the specific expression levels of the 10 Mal d 1 isoallergen genes found to be expressed in fruits were studied for the first time in the skin and flesh of apples of different genotypes. A complex gene expression profile was obtained, owing to the high variability among genes, tissues and genotypes. Despite this, the expression patterns of Mal d 1.06A and Mal d 1.07 were found to be particularly associated with the degree of allergenicity of the different cultivars. They were not the most highly expressed Mal d 1 genes in apple, but it is hypothesized here that both qualitative and quantitative aspects of Mal d 1 gene expression levels are relevant in determining allergenicity. In Chapter 4 a clear modulation of all 17 PR-10 genes tested in young leaves of Florina after challenge with the fungus V. inaequalis was reported, with a distinct expression profile for each gene. Interestingly, all the Mal d 1 genes were up-regulated except Mal d 1.10, which was down-regulated after challenge with the fungus.
The differences in direction, timing and magnitude of induction seem to confirm the hypothesis of subfunctionalization within the gene family despite high sequence and structure similarity. Moreover, the modulation of PR-10 genes observed in both compatible (Gala-V. inaequalis) and incompatible (Florina-V. inaequalis) interactions helps to validate the hypothesis of an indirect role for at least some of these proteins in the induced defense responses. Finally, a certain modulation of PR-10 transcripts, also found in leaves treated with water, confirms their ability to respond to abiotic stress as well. To conclude, the genomic approach used here made it possible to create a comprehensive inventory of all the genes of the allergen families, especially in the case of extended gene families such as Mal d 1. This knowledge can be considered a basic prerequisite for many further studies. On the other hand, the specific transcriptional approach makes it possible to evaluate the behavior of the Mal d 1 genes in different samples and conditions and, therefore, to speculate on their involvement in the apple allergenicity process. Considering the dual nature of Mal d 1 proteins, as apple allergens and as PR-10 proteins, the gene expression analysis upon fungal attack laid the basis for unraveling the biological functions of Mal d 1. In particular, the knowledge acquired in this work about the PR-10 genes putatively more involved in the specific Malus-V. inaequalis interaction will be helpful, in the future, to drive apple breeding towards hypo-allergenic genotypes without compromising the plants' mechanisms of response to stress conditions. In the future, the survey of the differences in allergenicity among cultivars has to be made more thorough by including other genotypes and allergic patients in the tests.
After this, an allelic diversity analysis of the high- and low-allergenicity cultivars on all the allergen genes, in particular on those whose transcription levels correlate with allergenicity, will provide the genetic background of the low-allergenicity ones. This step from genes to alleles will allow the development of molecular markers for them that might be used to effectively direct apple breeding towards hypo-allergenicity. Another important step forward in the study of apple allergens will be the use of a specific proteomic approach, since apple allergy is a multifactorial disease and only an interdisciplinary and integrated approach can be effective for its prevention and treatment.