52 resultados para Analysis Tools
Resumo:
From the late 1980s, the automation of sequencing techniques and the computer spread gave rise to a flourishing number of new molecular structures and sequences and to proliferation of new databases in which to store them. Here are presented three computational approaches able to analyse the massive amount of publicly avalilable data in order to answer to important biological questions. The first strategy studies the incorrect assignment of the first AUG codon in a messenger RNA (mRNA), due to the incomplete determination of its 5' end sequence. An extension of the mRNA 5' coding region was identified in 477 in human loci, out of all human known mRNAs analysed, using an automated expressed sequence tag (EST)-based approach. Proof-of-concept confirmation was obtained by in vitro cloning and sequencing for GNB2L1, QARS and TDP2 and the consequences for the functional studies are discussed. The second approach analyses the codon bias, the phenomenon in which distinct synonymous codons are used with different frequencies, and, following integration with a gene expression profile, estimates the total number of codons present across all the expressed mRNAs (named here "codonome value") in a given biological condition. Systematic analyses across different pathological and normal human tissues and multiple species shows a surprisingly tight correlation between the codon bias and the codonome bias. The third approach is useful to studies the expression of human autism spectrum disorder (ASD) implicated genes. ASD implicated genes sharing microRNA response elements (MREs) for the same microRNA are co-expressed in brain samples from healthy and ASD affected individuals. The different expression of a recently identified long non coding RNA which have four MREs for the same microRNA could disrupt the equilibrium in this network, but further analyses and experiments are needed.
Resumo:
This thesis is focused on Smart Grid applications in medium voltage distribution networks. For the development of new applications it appears useful the availability of simulation tools able to model dynamic behavior of both the power system and the communication network. Such a co-simulation environment would allow the assessment of the feasibility of using a given network technology to support communication-based Smart Grid control schemes on an existing segment of the electrical grid and to determine the range of control schemes that different communications technologies can support. For this reason, is presented a co-simulation platform that has been built by linking the Electromagnetic Transients Program Simulator (EMTP v3.0) with a Telecommunication Network Simulator (OPNET-Riverbed v18.0). The simulator is used to design and analyze a coordinate use of Distributed Energy Resources (DERs) for the voltage/var control (VVC) in distribution network. This thesis is focused control structure based on the use of phase measurement units (PMUs). In order to limit the required reinforcements of the communication infrastructures currently adopted by Distribution Network Operators (DNOs), the study is focused on leader-less MAS schemes that do not assign special coordinating rules to specific agents. Leader-less MAS are expected to produce more uniform communication traffic than centralized approaches that include a moderator agent. Moreover, leader-less MAS are expected to be less affected by limitations and constraint of some communication links. The developed co-simulator has allowed the definition of specific countermeasures against the limitations of the communication network, with particular reference to the latency and loss and information, for both the case of wired and wireless communication networks. Moreover, the co-simulation platform has bee also coupled with a mobility simulator in order to study specific countermeasures against the negative effects on the medium voltage/current distribution network caused by the concurrent connection of electric vehicles.
Resumo:
Changing or creating an organisation means creating a new process. Each process involves many risks that need to be identified and managed. The main risks considered here are procedural and legal risks. The former are related to the risks of errors that may occur during processes, while the latter are related to the compliance of processes with regulations. Managing the risks implies proposing changes to the processes that allow the desired result: an optimised process. In order to manage a company and optimise it in the best possible way, not only should the organisational aspect, risk management and legal compliance be taken into account, but it is important that they are all analysed simultaneously with the aim of finding the right balance that satisfies them all. This is the aim of this thesis, to provide methods and tools to balance these three characteristics, and to enable this type of optimisation, ICT support is used. This work isn’t a thesis in computer science or law, but rather an interdisciplinary thesis. Most of the work done so far is vertical and in a specific domain. The particularity and aim of this thesis is not to carry out an in-depth analysis of a particular aspect, but rather to combine several important aspects, normally analysed separately, which however have an impact and influence each other. In order to carry out this kind of interdisciplinary analysis, the knowledge base of both areas was involved and the combination and collaboration of different experts in the various fields was necessary. Although the methodology described is generic and can be applied to all sectors, the case study considered is a new type of healthcare service that allows patients in acute disease to be hospitalised to their home. This provide the possibility to perform experiments using real hospital database.
Resumo:
Many combinatorial problems coming from the real world may not have a clear and well defined structure, typically being dirtied by side constraints, or being composed of two or more sub-problems, usually not disjoint. Such problems are not suitable to be solved with pure approaches based on a single programming paradigm, because a paradigm that can effectively face a problem characteristic may behave inefficiently when facing other characteristics. In these cases, modelling the problem using different programming techniques, trying to ”take the best” from each technique, can produce solvers that largely dominate pure approaches. We demonstrate the effectiveness of hybridization and we discuss about different hybridization techniques by analyzing two classes of problems with particular structures, exploiting Constraint Programming and Integer Linear Programming solving tools and Algorithm Portfolios and Logic Based Benders Decomposition as integration and hybridization frameworks.
Resumo:
The continuous increase of genome sequencing projects produced a huge amount of data in the last 10 years: currently more than 600 prokaryotic and 80 eukaryotic genomes are fully sequenced and publically available. However the sole sequencing process of a genome is able to determine just raw nucleotide sequences. This is only the first step of the genome annotation process that will deal with the issue of assigning biological information to each sequence. The annotation process is done at each different level of the biological information processing mechanism, from DNA to protein, and cannot be accomplished only by in vitro analysis procedures resulting extremely expensive and time consuming when applied at a this large scale level. Thus, in silico methods need to be used to accomplish the task. The aim of this work was the implementation of predictive computational methods to allow a fast, reliable, and automated annotation of genomes and proteins starting from aminoacidic sequences. The first part of the work was focused on the implementation of a new machine learning based method for the prediction of the subcellular localization of soluble eukaryotic proteins. The method is called BaCelLo, and was developed in 2006. The main peculiarity of the method is to be independent from biases present in the training dataset, which causes the over‐prediction of the most represented examples in all the other available predictors developed so far. This important result was achieved by a modification, made by myself, to the standard Support Vector Machine (SVM) algorithm with the creation of the so called Balanced SVM. BaCelLo is able to predict the most important subcellular localizations in eukaryotic cells and three, kingdom‐specific, predictors were implemented. In two extensive comparisons, carried out in 2006 and 2008, BaCelLo reported to outperform all the currently available state‐of‐the‐art methods for this prediction task. BaCelLo was subsequently used to completely annotate 5 eukaryotic genomes, by integrating it in a pipeline of predictors developed at the Bologna Biocomputing group by Dr. Pier Luigi Martelli and Dr. Piero Fariselli. An online database, called eSLDB, was developed by integrating, for each aminoacidic sequence extracted from the genome, the predicted subcellular localization merged with experimental and similarity‐based annotations. In the second part of the work a new, machine learning based, method was implemented for the prediction of GPI‐anchored proteins. Basically the method is able to efficiently predict from the raw aminoacidic sequence both the presence of the GPI‐anchor (by means of an SVM), and the position in the sequence of the post‐translational modification event, the so called ω‐site (by means of an Hidden Markov Model (HMM)). The method is called GPIPE and reported to greatly enhance the prediction performances of GPI‐anchored proteins over all the previously developed methods. GPIPE was able to predict up to 88% of the experimentally annotated GPI‐anchored proteins by maintaining a rate of false positive prediction as low as 0.1%. GPIPE was used to completely annotate 81 eukaryotic genomes, and more than 15000 putative GPI‐anchored proteins were predicted, 561 of which are found in H. sapiens. In average 1% of a proteome is predicted as GPI‐anchored. A statistical analysis was performed onto the composition of the regions surrounding the ω‐site that allowed the definition of specific aminoacidic abundances in the different considered regions. Furthermore the hypothesis that compositional biases are present among the four major eukaryotic kingdoms, proposed in literature, was tested and rejected. All the developed predictors and databases are freely available at: BaCelLo http://gpcr.biocomp.unibo.it/bacello eSLDB http://gpcr.biocomp.unibo.it/esldb GPIPE http://gpcr.biocomp.unibo.it/gpipe
Resumo:
The vast majority of known proteins have not yet been experimentally characterized and little is known about their function. The design and implementation of computational tools can provide insight into the function of proteins based on their sequence, their structure, their evolutionary history and their association with other proteins. Knowledge of the three-dimensional (3D) structure of a protein can lead to a deep understanding of its mode of action and interaction, but currently the structures of <1% of sequences have been experimentally solved. For this reason, it became urgent to develop new methods that are able to computationally extract relevant information from protein sequence and structure. The starting point of my work has been the study of the properties of contacts between protein residues, since they constrain protein folding and characterize different protein structures. Prediction of residue contacts in proteins is an interesting problem whose solution may be useful in protein folding recognition and de novo design. The prediction of these contacts requires the study of the protein inter-residue distances related to the specific type of amino acid pair that are encoded in the so-called contact map. An interesting new way of analyzing those structures came out when network studies were introduced, with pivotal papers demonstrating that protein contact networks also exhibit small-world behavior. In order to highlight constraints for the prediction of protein contact maps and for applications in the field of protein structure prediction and/or reconstruction from experimentally determined contact maps, I studied to which extent the characteristic path length and clustering coefficient of the protein contacts network are values that reveal characteristic features of protein contact maps. Provided that residue contacts are known for a protein sequence, the major features of its 3D structure could be deduced by combining this knowledge with correctly predicted motifs of secondary structure. In the second part of my work I focused on a particular protein structural motif, the coiled-coil, known to mediate a variety of fundamental biological interactions. Coiled-coils are found in a variety of structural forms and in a wide range of proteins including, for example, small units such as leucine zippers that drive the dimerization of many transcription factors or more complex structures such as the family of viral proteins responsible for virus-host membrane fusion. The coiled-coil structural motif is estimated to account for 5-10% of the protein sequences in the various genomes. Given their biological importance, in my work I introduced a Hidden Markov Model (HMM) that exploits the evolutionary information derived from multiple sequence alignments, to predict coiled-coil regions and to discriminate coiled-coil sequences. The results indicate that the new HMM outperforms all the existing programs and can be adopted for the coiled-coil prediction and for large-scale genome annotation. Genome annotation is a key issue in modern computational biology, being the starting point towards the understanding of the complex processes involved in biological networks. The rapid growth in the number of protein sequences and structures available poses new fundamental problems that still deserve an interpretation. Nevertheless, these data are at the basis of the design of new strategies for tackling problems such as the prediction of protein structure and function. Experimental determination of the functions of all these proteins would be a hugely time-consuming and costly task and, in most instances, has not been carried out. As an example, currently, approximately only 20% of annotated proteins in the Homo sapiens genome have been experimentally characterized. A commonly adopted procedure for annotating protein sequences relies on the "inheritance through homology" based on the notion that similar sequences share similar functions and structures. This procedure consists in the assignment of sequences to a specific group of functionally related sequences which had been grouped through clustering techniques. The clustering procedure is based on suitable similarity rules, since predicting protein structure and function from sequence largely depends on the value of sequence identity. However, additional levels of complexity are due to multi-domain proteins, to proteins that share common domains but that do not necessarily share the same function, to the finding that different combinations of shared domains can lead to different biological roles. In the last part of this study I developed and validate a system that contributes to sequence annotation by taking advantage of a validated transfer through inheritance procedure of the molecular functions and of the structural templates. After a cross-genome comparison with the BLAST program, clusters were built on the basis of two stringent constraints on sequence identity and coverage of the alignment. The adopted measure explicity answers to the problem of multi-domain proteins annotation and allows a fine grain division of the whole set of proteomes used, that ensures cluster homogeneity in terms of sequence length. A high level of coverage of structure templates on the length of protein sequences within clusters ensures that multi-domain proteins when present can be templates for sequences of similar length. This annotation procedure includes the possibility of reliably transferring statistically validated functions and structures to sequences considering information available in the present data bases of molecular functions and structures.
Resumo:
Background. One of the phenomena observed in human aging is the progressive increase of a systemic inflammatory state, a condition referred to as “inflammaging”, negatively correlated with longevity. A prominent mediator of inflammation is the transcription factor NF-kB, that acts as key transcriptional regulator of many genes coding for pro-inflammatory cytokines. Many different signaling pathways activated by very diverse stimuli converge on NF-kB, resulting in a regulatory network characterized by high complexity. NF-kB signaling has been proposed to be responsible of inflammaging. Scope of this analysis is to provide a wider, systemic picture of such intricate signaling and interaction network: the NF-kB pathway interactome. Methods. The study has been carried out following a workflow for gathering information from literature as well as from several pathway and protein interactions databases, and for integrating and analyzing existing data and the relative reconstructed representations by using the available computational tools. Strong manual intervention has been necessarily used to integrate data from multiple sources into mathematically analyzable networks. The reconstruction of the NF-kB interactome pursued with this approach provides a starting point for a general view of the architecture and for a deeper analysis and understanding of this complex regulatory system. Results. A “core” and a “wider” NF-kB pathway interactome, consisting of 140 and 3146 proteins respectively, were reconstructed and analyzed through a mathematical, graph-theoretical approach. Among other interesting features, the topological characterization of the interactomes shows that a relevant number of interacting proteins are in turn products of genes that are controlled and regulated in their expression exactly by NF-kB transcription factors. These “feedback loops”, not always well-known, deserve deeper investigation since they may have a role in tuning the response and the output consequent to NF-kB pathway initiation, in regulating the intensity of the response, or its homeostasis and balance in order to make the functioning of such critical system more robust and reliable. This integrated view allows to shed light on the functional structure and on some of the crucial nodes of thet NF-kB transcription factors interactome. Conclusion. Framing structure and dynamics of the NF-kB interactome into a wider, systemic picture would be a significant step toward a better understanding of how NF-kB globally regulates diverse gene programs and phenotypes. This study represents a step towards a more complete and integrated view of the NF-kB signaling system.
Resumo:
The dolphin (Tursiops truncatus) is a mammal that is adapted to life in a totally aquatic environment. Despite the popularity and even iconic status of the dolphin, our knowledge of its physiology, its unique adaptations and the effects on it of environmental stressors are limited. One approach to improve this limited understanding is the implementation of established cellular and molecular methods to provide sensitive and insightful information for dolphin biology. We initiated our studies with the analysis of wild dolphin peripheral blood leukocytes, which have the potential to be informative of the animal’s global immune status. Transcriptomic profiles from almost 200 individual samples were analyzed using a newly developed species-specific microarray to assess its value as a prognostic and diagnostic tool. Functional genomics analyses were informative of stress-induced gene expression profiles and also of geographical location specific transcriptomic signatures, determined by the interaction of genetic, disease and environmental factors. We have developed quantitative metrics to unambiguously characterize the phenotypic properties of dolphin cells in culture. These quantitative metrics can provide identifiable characteristics and baseline data which will enable identification of changes in the cells due to time in culture. We have also developed a novel protocol to isolate primary cultures from cryopreserved tissue of stranded marine mammals, establishing a tissue (and cell) biorepository, a new approach that can provide a solution to the limited availability of samples. The work presented represents the development and application of tools for the study of the biology, health and physiology of the dolphin, and establishes their relevance for future studies of the impact on the dolphin of environmental infection and stress.
Resumo:
The subject of this thesis is multicolour bioluminescence analysis and how it can provide new tools for drug discovery and development.The mechanism of color tuning in bioluminescent reactions is not fully understood yet but it is object of intense research and several hypothesis have been generated. In the past decade key residues of the active site of the enzyme or in the surface surrounding the active site have been identified as responsible of different color emission. Anyway since bioluminescence reaction is strictly dependent from the interaction between the enzyme and its substrate D-luciferin, modification of the substrate can lead to a different emission spectrum too. In the recent years firefly luciferase and other luciferases underwent mutagenesis in order to obtain mutants with different emission characteristics. Thanks to these new discoveries in the bioluminescence field multicolour luciferases can be nowadays employed in bioanalysis for assay developments and imaging purposes. The use of multicolor bioluminescent enzymes expanded the potential of a range of application in vitro and in vivo. Multiple analysis and more information can be obtained from the same analytical session saving cost and time. This thesis focuses on several application of multicolour bioluminescence for high-throughput screening and in vivo imaging. Multicolor luciferases can be employed as new tools for drug discovery and developments and some examples are provided in the different chapters. New red codon optimized luciferase have been demonstrated to be improved tools for bioluminescence imaging in small animal and the possibility to combine red and green luciferases for BLI has been achieved even if some aspects of the methodology remain challenging and need further improvement. In vivo Bioluminescence imaging has known a rapid progress since its first application no more than 15 years ago. It is becoming an indispensable tool in pharmacological research. At the same time the development of more sensitive and implemented microscopes and low-light imager for a better visualization and quantification of multicolor signals would boost the research and the discoveries in life sciences in general and in drug discovery and development in particular.
Resumo:
The research presented in my PhD thesis is part of a wider European project, FishPopTrace, focused on traceability of fish populations and products. My work was aimed at developing and analyzing novel genetic tools for a widely distributed marine fish species, the European hake (Merluccius merluccius), in order to investigate population genetic structure and explore potential applications to traceability scenarios. A total of 395 SNPs (Single Nucleotide Polymorphisms) were discovered from a massive collection of Expressed Sequence Tags, obtained by high-throughput sequencing, and validated on 19 geographic samples from Atlantic and Mediterranean. Genome-scan approaches were applied to identify polymorphisms on genes potentially under divergent selection (outlier SNPs), showing higher genetic differentiation among populations respect to the average observed across loci. Comparative analysis on population structure were carried out on putative neutral and outlier loci at wide (Atlantic and Mediterranean samples) and regional (samples within each basin) spatial scales, to disentangle the effects of demographic and adaptive evolutionary forces on European hake populations genetic structure. Results demonstrated the potential of outlier loci to unveil fine scale genetic structure, possibly identifying locally adapted populations, despite the weak signal showed from putative neutral SNPs. The application of outlier SNPs within the framework of fishery resources management was also explored. A minimum panel of SNP markers showing maximum discriminatory power was selected and applied to a traceability scenario aiming at identifying the basin (and hence the stock) of origin, Atlantic or Mediterranean, of individual fish. This case study illustrates how molecular analytical technologies have operational potential in real-world contexts, and more specifically, potential to support fisheries control and enforcement and fish and fish product traceability.
Resumo:
The prospect of the continuous multiplication of life styles, the obsolescence of the traditional typological diagrams, the usability of spaces on different territorial scales, imposes on contemporary architecture the search for new models of living. Limited densities in urban development have produced the erosion of territory, the increase of the harmful emissions and energy consumption. High density housing cannot refuse the social emergency to ensure high quality and low cost dwellings, to a new people target: students, temporary workers, key workers, foreign, young couples without children, large families and, in general, people who carry out public services. Social housing strategies have become particularly relevant in regenerating high density urban outskirts. The choice of this research topic derives from the desire to deal with the recent accommodation emergency, according to different perspectives, with a view to give a contribution to the current literature, by proposing some tools for a correct design of the social housing, by ensuring good quality, cost-effective, and eco-sustainable solutions, from the concept phase, through management and maintenance, until the end of the building life cycle. The purpose of the thesis is defining a framework of guidelines that become effective instruments to be used in designing the social housing. They should also integrate the existing regulations and are mainly thought for those who work in this sector. They would aim at supporting students who have to cope with this particular residential theme, and also the users themselves. The scientific evidence of either the recent specialized literature or the solutions adopted in some case studies within the selected metropolitan areas of Milan, London and São Paulo, it is possible to identify the principles of this new design approach, in which the connection between typology, morphology and technology pursues the goal of a high living standard.
Resumo:
Nanotechnologies are rapidly expanding because of the opportunities that the new materials offer in many areas such as the manufacturing industry, food production, processing and preservation, and in the pharmaceutical and cosmetic industry. Size distribution of the nanoparticles determines their properties and is a fundamental parameter that needs to be monitored from the small-scale synthesis up to the bulk production and quality control of nanotech products on the market. A consequence of the increasing number of applications of nanomaterial is that the EU regulatory authorities are introducing the obligation for companies that make use of nanomaterials to acquire analytical platforms for the assessment of the size parameters of the nanomaterials. In this work, Asymmetrical Flow Field-Flow Fractionation (AF4) and Hollow Fiber F4 (HF5), hyphenated with Multiangle Light Scattering (MALS) are presented as tools for a deep functional characterization of nanoparticles. In particular, it is demonstrated the applicability of AF4-MALS for the characterization of liposomes in a wide series of mediums. Afterwards the technique is used to explore the functional features of a liposomal drug vector in terms of its biological and physical interaction with blood serum components: a comprehensive approach to understand the behavior of lipid vesicles in terms of drug release and fusion/interaction with other biological species is described, together with weaknesses and strength of the method. Afterwards the size characterization, size stability, and conjugation of azidothymidine drug molecules with a new generation of metastable drug vectors, the Metal Organic Frameworks, is discussed. Lastly, it is shown the applicability of HF5-ICP-MS for the rapid screening of samples of relevant nanorisk: rather than a deep and comprehensive characterization it this time shown a quick and smart methodology that within few steps provides qualitative information on the content of metallic nanoparticles in tattoo ink samples.
Resumo:
Introgression of domestic cat genes into European wildcat (Felis silvestris silvestris) populations and reduction of wildcats’ range in Europe, leaded by habitat loss and fragmentation, are considered two of the main conservation problems for this endangered feline. This thesis addressed the questions related with the artificial hybridization and populations’ fragmentation, using a conservation genetics perspective. We combined the use of highly polymorphic loci, Bayesian statistical inferences and landscape analyses tools to investigate the origin of the geographic-genetic substructure of European wildcats (Felis silvestris silvestris) in Italy and Europe. The genetic variability of microsatellites evidenced that European wildcat populations currently distributed in Italy differentiated in, and expanded from two distinct glacial refuges during the Last Glacial Maximum. The genetic and geographic substructure detected between the eastern and western sides of the Apennine ridge, resulted by adaptation to specific ecological conditions of the Mediterranean habitats. European wildcat populations in Europe are strongly structured into 5 geographic-genetic macro clusters corresponding to: the Italian peninsular & Sicily; Balkans & north-eastern Italy; Germany eastern; central Europe; and Iberian Peninsula. Central European population might have differentiated in the extra-Mediterranean Würm ice age refuge areas (Northern Alps, Carpathians, and the Bulgarian mountain systems), while the divergence among and within the southern European populations might have resulted by the Pleistocene bio geographical framework of Europe, with three southern refugia localized in the Balkans, Italian Peninsula and Iberia Peninsula. We further combined the use of most informative autosomal SNPs with uniparental markers (mtDNA and Y-linked) for accurately detecting parental genotypes and levels of introgressive hybridization between European wild and domestic cats. A total of 11 hybrids were identified. The presence of domestic mitochondrial haplotypes shared with some wild individuals led us to hypnotize the possibility that ancient introgressive events might have occurred and that further investigation should be recommended.
Resumo:
The instability of river bank can result in considerable human and land losses. The Po river is the most important in Italy, characterized by main banks of significant and constantly increasing height. This study presents multilayer perceptron of artificial neural network (ANN) to construct prediction models for the stability analysis of river banks along the Po River, under various river and groundwater boundary conditions. For this aim, a number of networks of threshold logic unit are tested using different combinations of the input parameters. Factor of safety (FS), as an index of slope stability, is formulated in terms of several influencing geometrical and geotechnical parameters. In order to obtain a comprehensive geotechnical database, several cone penetration tests from the study site have been interpreted. The proposed models are developed upon stability analyses using finite element code over different representative sections of river embankments. For the validity verification, the ANN models are employed to predict the FS values of a part of the database beyond the calibration data domain. The results indicate that the proposed ANN models are effective tools for evaluating the slope stability. The ANN models notably outperform the derived multiple linear regression models.
Resumo:
This PhD thesis discusses the impact of Cloud Computing infrastructures on Digital Forensics in the twofold role of target of investigations and as a helping hand to investigators. The Cloud offers a cheap and almost limitless computing power and storage space for data which can be leveraged to commit either new or old crimes and host related traces. Conversely, the Cloud can help forensic examiners to find clues better and earlier than traditional analysis applications, thanks to its dramatically improved evidence processing capabilities. In both cases, a new arsenal of software tools needs to be made available. The development of this novel weaponry and its technical and legal implications from the point of view of repeatability of technical assessments is discussed throughout the following pages and constitutes the unprecedented contribution of this work