11 resultados para digital learning tools

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biology is now a “Big Data Science” thanks to technological advancements allowing the characterization of the whole macromolecular content of a cell or a collection of cells. This opens interesting perspectives, but only a small portion of this data may be experimentally characterized. From this derives the demand of accurate and efficient computational tools for automatic annotation of biological molecules. This is even more true when dealing with membrane proteins, on which my research project is focused leading to the development of two machine learning-based methods: BetAware-Deep and SVMyr. BetAware-Deep is a tool for the detection and topology prediction of transmembrane beta-barrel proteins found in Gram-negative bacteria. These proteins are involved in many biological processes and primary candidates as drug targets. BetAware-Deep exploits the combination of a deep learning framework (bidirectional long short-term memory) and a probabilistic graphical model (grammatical-restrained hidden conditional random field). Moreover, it introduced a modified formulation of the hydrophobic moment, designed to include the evolutionary information. BetAware-Deep outperformed all the available methods in topology prediction and reported high scores in the detection task. Glycine myristoylation in Eukaryotes is the binding of a myristic acid on an N-terminal glycine. SVMyr is a fast method based on support vector machines designed to predict this modification in dataset of proteomic scale. It uses as input octapeptides and exploits computational scores derived from experimental examples and mean physicochemical features. SVMyr outperformed all the available methods for co-translational myristoylation prediction. In addition, it allows (as a unique feature) the prediction of post-translational myristoylation. Both the tools here described are designed having in mind best practices for the development of machine learning-based tools outlined by the bioinformatics community. Moreover, they are made available via user-friendly web servers. All this make them valuable tools for filling the gap between sequential and annotated data.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Deep learning methods are extremely promising machine learning tools to analyze neuroimaging data. However, their potential use in clinical settings is limited because of the existing challenges of applying these methods to neuroimaging data. In this study, first a data leakage type caused by slice-level data split that is introduced during training and validation of a 2D CNN is surveyed and a quantitative assessment of the model’s performance overestimation is presented. Second, an interpretable, leakage-fee deep learning software written in a python language with a wide range of options has been developed to conduct both classification and regression analysis. The software was applied to the study of mild cognitive impairment (MCI) in patients with small vessel disease (SVD) using multi-parametric MRI data where the cognitive performance of 58 patients measured by five neuropsychological tests is predicted using a multi-input CNN model taking brain image and demographic data. Each of the cognitive test scores was predicted using different MRI-derived features. As MCI due to SVD has been hypothesized to be the effect of white matter damage, DTI-derived features MD and FA produced the best prediction outcome of the TMT-A score which is consistent with the existing literature. In a second study, an interpretable deep learning system aimed at 1) classifying Alzheimer disease and healthy subjects 2) examining the neural correlates of the disease that causes a cognitive decline in AD patients using CNN visualization tools and 3) highlighting the potential of interpretability techniques to capture a biased deep learning model is developed. Structural magnetic resonance imaging (MRI) data of 200 subjects was used by the proposed CNN model which was trained using a transfer learning-based approach producing a balanced accuracy of 71.6%. Brain regions in the frontal and parietal lobe showing the cerebral cortex atrophy were highlighted by the visualization tools.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Negli ultimi vent’anni sono state proposte al livello internazionale alcune analisi dei problemi per le scienze nella scuola e diverse strategie per l’innovazione didattica. Molte ricerche hanno fatto riferimento a una nuova nozione di literacy scientifica, quale sapere fondamentale dell’educazione, indipendente dalle scelte professionali successive alla scuola. L’ipotesi di partenza di questa ricerca sostiene che alcune di queste analisi e l’idea di una nuova literacy scientifica di tipo non-vocazionale mostrino notevoli limiti quando rapportate al contesto italiano. Le specificità di quest’ultimo sono state affrontate, innanzitutto, da un punto di vista comparativo, discutendo alcuni documenti internazionali sull’insegnamento delle scienze. Questo confronto ha messo in luce la difficoltà di ottenere un insieme di evidenze chiare e definitive sui problemi dell’educazione scientifica discussi da questi documenti, in particolare per quanto riguarda i dati sulla crisi delle vocazioni scientifiche e sull’attitudine degli studenti verso le scienze. Le raccomandazioni educative e alcuni progetti curricolari internazionali trovano degli ostacoli decisivi nella scuola superiore italiana anche a causa di specificità istituzionali, come particolari principi di selezione e l’articolazione dei vari indirizzi formativi. Il presente lavoro si è basato soprattutto su una ricostruzione storico-pedagogica del curricolo di fisica, attraverso l’analisi delle linee guida nazionali, dei programmi di studio e di alcuni rappresentativi manuali degli ultimi decenni. Questo esame del curricolo “programmato” ha messo in luce, primo, il carattere accademico della fisica liceale e la sua debole rielaborazione culturale e didattica, secondo, l’impatto di temi e problemi internazionali sui materiali didattici. Tale impatto ha prodotto dei cambiamenti sul piano delle finalità educative e degli strumenti di apprendimento incorporati nei manuali. Nonostante l’evoluzione di queste caratteristiche del curricolo, tuttavia, l’analisi delle conoscenze storico-filosofiche utilizzate dai manuali ha messo in luce la scarsa contestualizzazione culturale della fisica quale uno degli ostacoli principali per l’insegnamento di una scienza più rilevante e formativa.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Hematological cancers are a heterogeneous family of diseases that can be divided into leukemias, lymphomas, and myelomas, often called “liquid tumors”. Since they cannot be surgically removable, chemotherapy represents the mainstay of their treatment. However, it still faces several challenges like drug resistance and low response rate, and the need for new anticancer agents is compelling. The drug discovery process is long-term, costly, and prone to high failure rates. With the rapid expansion of biological and chemical "big data", some computational techniques such as machine learning tools have been increasingly employed to speed up and economize the whole process. Machine learning algorithms can create complex models with the aim to determine the biological activity of compounds against several targets, based on their chemical properties. These models are defined as multi-target Quantitative Structure-Activity Relationship (mt-QSAR) and can be used to virtually screen small and large chemical libraries for the identification of new molecules with anticancer activity. The aim of my Ph.D. project was to employ machine learning techniques to build an mt-QSAR classification model for the prediction of cytotoxic drugs simultaneously active against 43 hematological cancer cell lines. For this purpose, first, I constructed a large and diversified dataset of molecules extracted from the ChEMBL database. Then, I compared the performance of different ML classification algorithms, until Random Forest was identified as the one returning the best predictions. Finally, I used different approaches to maximize the performance of the model, which achieved an accuracy of 88% by correctly classifying 93% of inactive molecules and 72% of active molecules in a validation set. This model was further applied to the virtual screening of a small dataset of molecules tested in our laboratory, where it showed 100% accuracy in correctly classifying all molecules. This result is confirmed by our previous in vitro experiments.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This thesis proposes a new document model, according to which any document can be segmented in some independent components and transformed in a pattern-based projection, that only uses a very small set of objects and composition rules. The point is that such a normalized document expresses the same fundamental information of the original one, in a simple, clear and unambiguous way. The central part of my work consists of discussing that model, investigating how a digital document can be segmented, and how a segmented version can be used to implement advanced tools of conversion. I present seven patterns which are versatile enough to capture the most relevant documents’ structures, and whose minimality and rigour make that implementation possible. The abstract model is then instantiated into an actual markup language, called IML. IML is a general and extensible language, which basically adopts an XHTML syntax, able to capture a posteriori the only content of a digital document. It is compared with other languages and proposals, in order to clarify its role and objectives. Finally, I present some systems built upon these ideas. These applications are evaluated in terms of users’ advantages, workflow improvements and impact over the overall quality of the output. In particular, they cover heterogeneous content management processes: from web editing to collaboration (IsaWiki and WikiFactory), from e-learning (IsaLearning) to professional printing (IsaPress).

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The rapid progression of biomedical research coupled with the explosion of scientific literature has generated an exigent need for efficient and reliable systems of knowledge extraction. This dissertation contends with this challenge through a concentrated investigation of digital health, Artificial Intelligence, and specifically Machine Learning and Natural Language Processing's (NLP) potential to expedite systematic literature reviews and refine the knowledge extraction process. The surge of COVID-19 complicated the efforts of scientists, policymakers, and medical professionals in identifying pertinent articles and assessing their scientific validity. This thesis presents a substantial solution in the form of the COKE Project, an initiative that interlaces machine reading with the rigorous protocols of Evidence-Based Medicine to streamline knowledge extraction. In the framework of the COKE (“COVID-19 Knowledge Extraction framework for next-generation discovery science”) Project, this thesis aims to underscore the capacity of machine reading to create knowledge graphs from scientific texts. The project is remarkable for its innovative use of NLP techniques such as a BERT + bi-LSTM language model. This combination is employed to detect and categorize elements within medical abstracts, thereby enhancing the systematic literature review process. The COKE project's outcomes show that NLP, when used in a judiciously structured manner, can significantly reduce the time and effort required to produce medical guidelines. These findings are particularly salient during times of medical emergency, like the COVID-19 pandemic, when quick and accurate research results are critical.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The continuous increase of genome sequencing projects produced a huge amount of data in the last 10 years: currently more than 600 prokaryotic and 80 eukaryotic genomes are fully sequenced and publically available. However the sole sequencing process of a genome is able to determine just raw nucleotide sequences. This is only the first step of the genome annotation process that will deal with the issue of assigning biological information to each sequence. The annotation process is done at each different level of the biological information processing mechanism, from DNA to protein, and cannot be accomplished only by in vitro analysis procedures resulting extremely expensive and time consuming when applied at a this large scale level. Thus, in silico methods need to be used to accomplish the task. The aim of this work was the implementation of predictive computational methods to allow a fast, reliable, and automated annotation of genomes and proteins starting from aminoacidic sequences. The first part of the work was focused on the implementation of a new machine learning based method for the prediction of the subcellular localization of soluble eukaryotic proteins. The method is called BaCelLo, and was developed in 2006. The main peculiarity of the method is to be independent from biases present in the training dataset, which causes the over‐prediction of the most represented examples in all the other available predictors developed so far. This important result was achieved by a modification, made by myself, to the standard Support Vector Machine (SVM) algorithm with the creation of the so called Balanced SVM. BaCelLo is able to predict the most important subcellular localizations in eukaryotic cells and three, kingdom‐specific, predictors were implemented. In two extensive comparisons, carried out in 2006 and 2008, BaCelLo reported to outperform all the currently available state‐of‐the‐art methods for this prediction task. BaCelLo was subsequently used to completely annotate 5 eukaryotic genomes, by integrating it in a pipeline of predictors developed at the Bologna Biocomputing group by Dr. Pier Luigi Martelli and Dr. Piero Fariselli. An online database, called eSLDB, was developed by integrating, for each aminoacidic sequence extracted from the genome, the predicted subcellular localization merged with experimental and similarity‐based annotations. In the second part of the work a new, machine learning based, method was implemented for the prediction of GPI‐anchored proteins. Basically the method is able to efficiently predict from the raw aminoacidic sequence both the presence of the GPI‐anchor (by means of an SVM), and the position in the sequence of the post‐translational modification event, the so called ω‐site (by means of an Hidden Markov Model (HMM)). The method is called GPIPE and reported to greatly enhance the prediction performances of GPI‐anchored proteins over all the previously developed methods. GPIPE was able to predict up to 88% of the experimentally annotated GPI‐anchored proteins by maintaining a rate of false positive prediction as low as 0.1%. GPIPE was used to completely annotate 81 eukaryotic genomes, and more than 15000 putative GPI‐anchored proteins were predicted, 561 of which are found in H. sapiens. In average 1% of a proteome is predicted as GPI‐anchored. A statistical analysis was performed onto the composition of the regions surrounding the ω‐site that allowed the definition of specific aminoacidic abundances in the different considered regions. Furthermore the hypothesis that compositional biases are present among the four major eukaryotic kingdoms, proposed in literature, was tested and rejected. All the developed predictors and databases are freely available at: BaCelLo http://gpcr.biocomp.unibo.it/bacello eSLDB http://gpcr.biocomp.unibo.it/esldb GPIPE http://gpcr.biocomp.unibo.it/gpipe

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Different types of proteins exist with diverse functions that are essential for living organisms. An important class of proteins is represented by transmembrane proteins which are specifically designed to be inserted into biological membranes and devised to perform very important functions in the cell such as cell communication and active transport across the membrane. Transmembrane β-barrels (TMBBs) are a sub-class of membrane proteins largely under-represented in structure databases because of the extreme difficulty in experimental structure determination. For this reason, computational tools that are able to predict the structure of TMBBs are needed. In this thesis, two computational problems related to TMBBs were addressed: the detection of TMBBs in large datasets of proteins and the prediction of the topology of TMBB proteins. Firstly, a method for TMBB detection was presented based on a novel neural network framework for variable-length sequence classification. The proposed approach was validated on a non-redundant dataset of proteins. Furthermore, we carried-out genome-wide detection using the entire Escherichia coli proteome. In both experiments, the method significantly outperformed other existing state-of-the-art approaches, reaching very high PPV (92%) and MCC (0.82). Secondly, a method was also introduced for TMBB topology prediction. The proposed approach is based on grammatical modelling and probabilistic discriminative models for sequence data labeling. The method was evaluated using a newly generated dataset of 38 TMBB proteins obtained from high-resolution data in the PDB. Results have shown that the model is able to correctly predict topologies of 25 out of 38 protein chains in the dataset. When tested on previously released datasets, the performances of the proposed approach were measured as comparable or superior to the current state-of-the-art of TMBB topology prediction.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

L’elaborato ha lo scopo di presentare le nuove opportunità di business offerte dal Web. Il rivoluzionario cambiamento che la pervasività della Rete e tutte le attività correlate stanno portando, ha posto le aziende davanti ad un diverso modo di relazionarsi con i propri consumatori, che sono sempre più informati, consapevoli ed esigenti, e con la concorrenza. La sfida da accettare per rimanere competitivi sul mercato è significativa e il mutamento in rapido sviluppo: gli aspetti che contraddistinguono questo nuovo paradigma digitale sono, infatti, velocità, mutevolezza, ma al tempo stesso misurabilità, ponderabilità, previsione. Grazie agli strumenti tecnologici a disposizione e alle dinamiche proprie dei diversi spazi web (siti, social network, blog, forum) è possibile tracciare più facilmente, rispetto al passato, l’impatto di iniziative, lanci di prodotto, promozioni e pubblicità, misurandone il ritorno sull’investimento, oltre che la percezione dell’utente finale. Un approccio datacentrico al marketing, attraverso analisi di monitoraggio della rete, permette quindi al brand investimenti più mirati e ponderati sulla base di stime e previsioni. Tra le più significative strategie di marketing digitale sono citate: social advertising, keyword advertising, digital PR, social media, email marketing e molte altre. Sono riportate anche due case history: una come ottimo esempio di co-creation in cui il brand ha coinvolto direttamente il pubblico nel processo di produzione del prodotto, affidando ai fan della Pagina Facebook ufficiale la scelta dei gusti degli yogurt da mettere in vendita. La seconda, caso internazionale di lead generation, ha permesso al brand di misurare la conversione dei visitatori del sito (previa compilazione di popin) in reali acquirenti, collegando i dati di traffico del sito a quelli delle vendite. Esempio di come online e offline comunichino strettamente.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This PhD thesis discusses the impact of Cloud Computing infrastructures on Digital Forensics in the twofold role of target of investigations and as a helping hand to investigators. The Cloud offers a cheap and almost limitless computing power and storage space for data which can be leveraged to commit either new or old crimes and host related traces. Conversely, the Cloud can help forensic examiners to find clues better and earlier than traditional analysis applications, thanks to its dramatically improved evidence processing capabilities. In both cases, a new arsenal of software tools needs to be made available. The development of this novel weaponry and its technical and legal implications from the point of view of repeatability of technical assessments is discussed throughout the following pages and constitutes the unprecedented contribution of this work

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study concerns teachers’ use of digital technologies in student assessment, and how the learning that is developed through the use of technology in mathematics can be evaluated. Nowadays math teachers use digital technologies in their teaching, but not in student assessment. The activities carried out with technology are seen as ‘extra-curricular’ (by both teachers and students), thus students do not learn what they can do in mathematics with digital technologies. I was interested in knowing the reasons teachers do not use digital technology to assess students’ competencies, and what they would need to be able to design innovative and appropriate tasks to assess students’ learning through digital technology. This dissertation is built on two main components: teachers and task design. I analyze teachers’ practices involving digital technologies with Ruthven’s Structuring Features of Classroom Practice, and what relation these practices have to the types of assessment they use. I study the kinds of assessment tasks teachers design with a DGE (Dynamic Geometry Environment), using Laborde’s categorization of DGE tasks. I consider the competencies teachers aim to assess with these tasks, and how their goals relate to the learning outcomes of the curriculum. This study also develops new directions in finding how to design suitable tasks for student mathematical assessment in a DGE, and it is driven by the desire to know what kinds of questions teachers might be more interested in using. I investigate the kinds of technology-based assessment tasks teachers value, and the type of feedback they give to students. Finally, I point out that the curriculum should include a range of mathematical and technological competencies that involve the use of digital technologies in mathematics, and I evaluate the possibility to take advantage of technology feedback to allow students to continue learning while they are taking a test.