2 results for database design
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
The continuous increase in genome sequencing projects has produced a huge amount of data in the last 10 years: currently more than 600 prokaryotic and 80 eukaryotic genomes are fully sequenced and publicly available. However, sequencing a genome determines only raw nucleotide sequences. This is just the first step of the genome annotation process, which assigns biological information to each sequence. Annotation is carried out at each level of the biological information processing mechanism, from DNA to protein, and cannot be accomplished by in vitro analysis alone, which is extremely expensive and time consuming when applied at such a large scale. Thus, in silico methods are needed to accomplish the task. The aim of this work was the implementation of predictive computational methods that allow fast, reliable, and automated annotation of genomes and proteins starting from amino acid sequences. The first part of the work focused on the implementation of a new machine learning based method for the prediction of the subcellular localization of soluble eukaryotic proteins. The method is called BaCelLo and was developed in 2006. Its main peculiarity is that it is independent of biases present in the training dataset, which cause the over‐prediction of the most represented examples in all the other predictors developed so far. This important result was achieved through a modification, made by myself, of the standard Support Vector Machine (SVM) algorithm, creating the so-called Balanced SVM. BaCelLo is able to predict the most important subcellular localizations in eukaryotic cells, and three kingdom‐specific predictors were implemented. In two extensive comparisons, carried out in 2006 and 2008, BaCelLo was reported to outperform all the currently available state‐of‐the‐art methods for this prediction task. 
BaCelLo was subsequently used to completely annotate 5 eukaryotic genomes by integrating it into a pipeline of predictors developed at the Bologna Biocomputing group by Dr. Pier Luigi Martelli and Dr. Piero Fariselli. An online database, called eSLDB, was developed by integrating, for each amino acid sequence extracted from the genome, the predicted subcellular localization with experimental and similarity‐based annotations. In the second part of the work a new machine learning based method was implemented for the prediction of GPI‐anchored proteins. The method predicts from the raw amino acid sequence both the presence of the GPI‐anchor (by means of an SVM) and the position in the sequence of the post‐translational modification event, the so-called ω‐site (by means of a Hidden Markov Model (HMM)). The method is called GPIPE and was reported to greatly improve the prediction of GPI‐anchored proteins over all previously developed methods. GPIPE was able to predict up to 88% of the experimentally annotated GPI‐anchored proteins while keeping the rate of false positive predictions as low as 0.1%. GPIPE was used to completely annotate 81 eukaryotic genomes, and more than 15000 putative GPI‐anchored proteins were predicted, 561 of which are found in H. sapiens. On average, 1% of a proteome is predicted to be GPI‐anchored. A statistical analysis of the composition of the regions surrounding the ω‐site allowed the definition of specific amino acid abundances in the different regions considered. Furthermore, the hypothesis proposed in the literature that compositional biases are present among the four major eukaryotic kingdoms was tested and rejected. All the developed predictors and databases are freely available at: BaCelLo http://gpcr.biocomp.unibo.it/bacello eSLDB http://gpcr.biocomp.unibo.it/esldb GPIPE http://gpcr.biocomp.unibo.it/gpipe
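The two GPIPE figures quoted above (fraction of annotated GPI‐anchored proteins recovered, and rate of false positive predictions) are the usual sensitivity and false positive rate of a binary classifier. The following is a minimal sketch of how such figures are computed from confusion-matrix counts; the function names and the counts are hypothetical and chosen only to reproduce the quoted percentages, they are not taken from the thesis.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true GPI-anchored proteins that are predicted as such."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Fraction of non-GPI-anchored proteins wrongly predicted as anchored."""
    return fp / (fp + tn)


# Hypothetical counts, for illustration only (not data from the thesis).
tp, fn = 88, 12      # annotated GPI-anchored proteins: found / missed
fp, tn = 10, 9990    # non-anchored proteins: wrongly flagged / correctly rejected

print(f"sensitivity: {sensitivity(tp, fn):.1%}")
print(f"false positive rate: {false_positive_rate(fp, tn):.1%}")
```

A low false positive rate matters at genome scale because non-anchored proteins vastly outnumber anchored ones, so even a small rate can produce many spurious predictions.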
Abstract:
Fibre-reinforced plastics (FRP) are composite materials composed of thin fibres with high mechanical properties, made to work together with a cohesive plastic matrix. The main advantage of fibre-reinforced plastics over traditional materials is their high specific mechanical properties, i.e. high stiffness-to-weight and strength-to-weight ratios. This kind of composite material is the most disruptive innovation in the structural materials field seen in recent years, and there are still many areas of potential application. However, a few aspects limit their growth. First, the information available about their properties and long-term behaviour is still scarce, especially compared with traditional materials, for which an extensive database has been built up through years of use and research. Second, the production technologies are still not as developed as those available for forming plastics, metals and other traditional materials. Third, the new properties these materials present, e.g. their anisotropy, make the design of components difficult. This thesis provides several case studies with advancements regarding the three limitations mentioned. In particular, the long-term mechanical properties have been studied through an experimental analysis of the impact of seawater on GFRP. Regarding production methods, the process of curing pre-impregnated material in an autoclave was considered: a rapid tooling method to produce moulds is presented, together with a study on the production of thick components. Two liquid composite moulding methods are also presented, with a case study on a large component with a sandwich structure produced with the Vacuum-Assisted Resin Infusion method, and a case study on a thick con-rod beam produced with the Resin Transfer Moulding process. 
The final case study analyses the loads acting during the use of a particular sports component made with FRP layers and a sandwich structure, and provides practical design rules.