12 results for Frankenstein and constructivist learning

in AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevance: 100.00%

Abstract:

Although the debate of what data science is has a long history and has not reached a complete consensus yet, Data Science can be summarized as the process of learning from data. Guided by the above vision, this thesis presents two independent data science projects developed in the scope of multidisciplinary applied research. The first part analyzes fluorescence microscopy images typically produced in life science experiments, where the objective is to count how many marked neuronal cells are present in each image. Aiming to automate the task for supporting research in the area, we propose a neural network architecture tuned specifically for this use case, cell ResUnet (c-ResUnet), and discuss the impact of alternative training strategies in overcoming particular challenges of our data. The approach provides good results in terms of both detection and counting, showing performance comparable to the interpretation of human operators. As a meaningful addition, we release the pre-trained model and the Fluorescent Neuronal Cells dataset collecting pixel-level annotations of where neuronal cells are located. In this way, we hope to help future research in the area and foster innovative methodologies for tackling similar problems. The second part deals with the problem of distributed data management in the context of LHC experiments, with a focus on supporting ATLAS operations concerning data transfer failures. In particular, we analyze error messages produced by failed transfers and propose a Machine Learning pipeline that leverages the word2vec language model and K-means clustering. This provides groups of similar errors that are presented to human operators as suggestions of potential issues to investigate. The approach is demonstrated on one full day of data, showing promising ability in understanding the message content and providing meaningful groupings, in line with previously reported incidents by human operators.
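The error-grouping idea in the second part of this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: bag-of-words counts stand in for the word2vec embeddings named in the abstract, the K-means loop is a plain Lloyd iteration with hand-picked seeds, and the messages and all function names are hypothetical rather than taken from the thesis.

```python
from collections import Counter


def tokenize(message):
    # Lowercase and split on whitespace; real transfer errors would need
    # more careful cleaning (paths, host names, numeric ids).
    return message.lower().split()


def vectorize(messages):
    # Assumption: bag-of-words counts replace the word2vec embeddings.
    vocab = sorted({tok for m in messages for tok in tokenize(m)})
    vectors = []
    for m in messages:
        counts = Counter(tokenize(m))
        vectors.append([float(counts[t]) for t in vocab])
    return vectors


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, iterations=20):
    # Plain Lloyd iterations starting from the given seed points.
    centroids = [list(vectors[i]) for i in init_indices]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical error messages, invented for illustration.
messages = [
    "connection timed out to storage endpoint",
    "connection timed out to remote endpoint",
    "checksum mismatch for file",
    "checksum mismatch for transferred file",
]
labels = kmeans(vectorize(messages), init_indices=[0, 2])
```

Messages sharing a label would be shown to an operator as one candidate issue; with learned embeddings the grouping could also capture similarity between messages that share no exact tokens.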

Relevance: 100.00%

Abstract:

The Three-Dimensional Single-Bin-Size Bin Packing Problem is one of the most studied problems in the Cutting & Packing category. From a strictly mathematical point of view, it consists of packing a finite set of strongly heterogeneous “small” boxes, called items, into a finite set of identical “large” rectangles, called bins, minimizing the unused volume and requiring that the items are packed without overlapping. The great interest is mainly due to the number of real-world applications in which it arises, such as pallet and container loading, cutting objects out of a piece of material and packaging design. Depending on these real-world applications, additional objective functions and practical constraints may be needed. After a brief discussion of the real-world applications of the problem and an exhaustive literature review, the design of a two-stage algorithm to solve the aforementioned problem is presented. The algorithm must be able to provide the spatial coordinates of the placed boxes' vertices and also the optimal box input sequence, while satisfying geometric, stability, and fragility constraints and keeping computational time low. Due to the NP-hard complexity of this type of combinatorial problem, a fusion of metaheuristic and machine learning techniques is adopted. In particular, a hybrid genetic algorithm coupled with a feedforward neural network is used. In the first stage, a rich dataset is created starting from a set of real input instances provided by an industrial company and the feedforward neural network is trained on it. After its training, given a new input instance, the hybrid genetic algorithm is able to run using the neural network output as input parameter vector, providing as output the optimal solution. The effectiveness of the proposed work is confirmed via several experimental tests.
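The geometric core of the problem statement (no overlap, items inside the bin, unused volume as the objective) can be made concrete with a small feasibility check. This is a sketch under assumptions, not the two-stage algorithm of the thesis: boxes are axis-aligned, rotations and the stability and fragility constraints are ignored, and every name below is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    # Position of the corner closest to the origin, then the three extents.
    x: float
    y: float
    z: float
    w: float
    d: float
    h: float

    @property
    def volume(self):
        return self.w * self.d * self.h


def overlaps(a, b):
    # Axis-aligned boxes overlap only if their open intervals intersect
    # on all three axes; touching faces do not count as overlap.
    return (
        a.x < b.x + b.w and b.x < a.x + a.w
        and a.y < b.y + b.d and b.y < a.y + a.d
        and a.z < b.z + b.h and b.z < a.z + a.h
    )


def inside_bin(box, bin_dims):
    bw, bd, bh = bin_dims
    return (
        box.x >= 0 and box.y >= 0 and box.z >= 0
        and box.x + box.w <= bw
        and box.y + box.d <= bd
        and box.z + box.h <= bh
    )


def is_feasible(placed, bin_dims):
    # Every item must lie inside the bin and no pair may overlap.
    if not all(inside_bin(b, bin_dims) for b in placed):
        return False
    return not any(
        overlaps(placed[i], placed[j])
        for i in range(len(placed))
        for j in range(i + 1, len(placed))
    )


def unused_volume(placed, bin_dims):
    bw, bd, bh = bin_dims
    return bw * bd * bh - sum(b.volume for b in placed)


a = Box(0, 0, 0, 2, 2, 2)
b = Box(2, 0, 0, 1, 1, 1)  # shares a face with a, which is allowed
c = Box(1, 1, 1, 2, 2, 2)  # intersects a
```

A metaheuristic such as the genetic algorithm mentioned above would call checks of this kind many times per candidate solution, which is one reason the abstract stresses computational time.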

Relevance: 100.00%

Abstract:

The design process of any electric vehicle system has to be oriented towards the best energy efficiency, together with the constraint of maintaining comfort in the vehicle cabin. The main aim of this study is to find the best thermal management solution in terms of HVAC efficiency without compromising occupants’ comfort and internal air quality. An Arduino-controlled Low Cost System of Sensors was developed and compared against reference instrumentation (average R-squared of 0.92) and then used to characterise the vehicle cabin in trials under real parking and driving conditions. Data on the energy use of the HVAC was retrieved from the car On-Board Diagnostic port. Energy savings using recirculation can reach 30 %, but pollutant concentration in the cabin builds up in this operating mode. Moreover, the temperature profile appeared strongly nonuniform, with air temperature differences up to 10 °C. Optimisation methods often require a high number of runs to find the optimal configuration of the system. Fast models proved to be beneficial for these tasks, while CFD-1D models are usually slower despite the higher level of detail provided. In this work, the collected dataset was used to train a fast ML model of both cabin and HVAC using linear regression. Average scaled RMSE over all trials is 0.4 %, while computation time is 0.0077 ms for each second of simulated time on a laptop computer. Finally, a reinforcement learning environment was built in OpenAI and Stable-Baselines3 using the built-in Proximal Policy Optimisation algorithm to update the policy and seek the best compromise between comfort, air quality and energy reward terms. The learning curves show an oscillating behaviour overall, with only 2 experiments behaving as expected, albeit too slowly. This result leaves large room for improvement, ranging from the reward function engineering to the expansion of the ML model.
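As a rough illustration of the fast linear-regression model and the scaled RMSE quoted in this abstract, the sketch below fits a one-feature least-squares line and normalises the RMSE by the range of the observed values. Both choices are assumptions: the abstract specifies neither the features nor how the RMSE is scaled, and the data are synthetic.

```python
import math


def fit_line(xs, ys):
    # Closed-form ordinary least squares for a single feature.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def scaled_rmse(y_true, y_pred):
    # Assumption: RMSE divided by the range of the observed values.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / (max(y_true) - min(y_true))


# Synthetic data: a hypothetical heater power level against cabin temperature.
power = [0.0, 1.0, 2.0, 3.0]
temperature = [20.0, 21.0, 22.0, 23.0]

slope, intercept = fit_line(power, temperature)
predicted = [slope * p + intercept for p in power]
error = scaled_rmse(temperature, predicted)
```

A model this small evaluates in microseconds, which is consistent with the motivation given above for preferring fast surrogates inside an optimisation or reinforcement learning loop.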

Relevance: 100.00%

Abstract:

In this thesis, we investigate the role of applied physics in epidemiological surveillance through the application of mathematical models, network science and machine learning. The spread of a communicable disease depends on many biological, social, and health factors. The large masses of data available make it possible, on the one hand, to monitor the evolution and spread of pathogenic organisms; on the other hand, to study the behavior of people, their opinions and habits. Presented here are three lines of research in which an attempt was made to solve real epidemiological problems through data analysis and the use of statistical and mathematical models. In Chapter 1, we applied language-inspired Deep Learning models to transform influenza protein sequences into vectors encoding their information content. We then attempted to reconstruct the antigenic properties of different viral strains using regression models and to identify the mutations responsible for vaccine escape. In Chapter 2, we constructed a compartmental model to describe the spread of a bacterium within a hospital ward. The model was informed and validated on time series of clinical measurements, and a sensitivity analysis was used to assess the impact of different control measures. Finally (Chapter 3) we reconstructed the network of retweets among COVID-19 themed Twitter users in the early months of the SARS-CoV-2 pandemic. By means of community detection algorithms and centrality measures, we characterized users’ attention shifts in the network, showing that scientific communities, initially the most retweeted, lost influence over time to national political communities. In the Conclusion, we highlighted the importance of the work done in light of the main contemporary challenges for epidemiological surveillance. 
In particular, we present reflections on the importance of nowcasting and forecasting, the relationship between data and scientific research, and the need to unite the different scales of epidemiological surveillance.
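To make the compartmental model mentioned for Chapter 2 more tangible, here is a minimal two-compartment sketch (susceptible and colonized patients) integrated with forward Euler. It is an assumption-laden illustration: the actual compartments, parameters and data of the thesis are not given in the abstract, and the names and values below are hypothetical.

```python
def simulate_ward(beta, gamma, susceptible, colonized, days, dt=0.1):
    # Forward-Euler integration of
    #   dS/dt = -beta * S * C / N + gamma * C
    #   dC/dt =  beta * S * C / N - gamma * C
    # where N = S + C is the (constant) ward population.
    total = susceptible + colonized
    for _ in range(int(round(days / dt))):
        new_colonizations = beta * susceptible * colonized / total * dt
        clearances = gamma * colonized * dt
        susceptible += clearances - new_colonizations
        colonized += new_colonizations - clearances
    return susceptible, colonized


# Hypothetical ward of 20 patients, one of them initially colonized.
s_end, c_end = simulate_ward(
    beta=0.5, gamma=0.25, susceptible=19.0, colonized=1.0, days=200
)
```

For this simple model the endemic level is N * (1 - gamma / beta), so a control measure that lowers the transmission rate beta or raises the clearance rate gamma lowers the long-run number of colonized patients; a sensitivity analysis varies such parameters systematically.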

Relevance: 100.00%

Abstract:

The rapid progression of biomedical research coupled with the explosion of scientific literature has generated an exigent need for efficient and reliable systems of knowledge extraction. This dissertation contends with this challenge through a concentrated investigation of digital health, Artificial Intelligence, and specifically Machine Learning and Natural Language Processing's (NLP) potential to expedite systematic literature reviews and refine the knowledge extraction process. The surge of COVID-19 complicated the efforts of scientists, policymakers, and medical professionals in identifying pertinent articles and assessing their scientific validity. This thesis presents a substantial solution in the form of the COKE Project, an initiative that interlaces machine reading with the rigorous protocols of Evidence-Based Medicine to streamline knowledge extraction. In the framework of the COKE (“COVID-19 Knowledge Extraction framework for next-generation discovery science”) Project, this thesis aims to underscore the capacity of machine reading to create knowledge graphs from scientific texts. The project is remarkable for its innovative use of NLP techniques such as a BERT + bi-LSTM language model. This combination is employed to detect and categorize elements within medical abstracts, thereby enhancing the systematic literature review process. The COKE project's outcomes show that NLP, when used in a judiciously structured manner, can significantly reduce the time and effort required to produce medical guidelines. These findings are particularly salient during times of medical emergency, like the COVID-19 pandemic, when quick and accurate research results are critical.

Relevance: 100.00%

Abstract:

In recent decades, two prominent trends have influenced the data modeling field, namely network analysis and machine learning. This thesis explores the practical applications of these techniques within the domain of drug research, unveiling their multifaceted potential for advancing our comprehension of complex biological systems. The research undertaken during this PhD program is situated at the intersection of network theory, computational methods, and drug research. Across six projects presented herein, there is a gradual increase in model complexity. These projects traverse a diverse range of topics, with a specific emphasis on drug repurposing and safety in the context of neurological diseases. The aim of these projects is to leverage existing biomedical knowledge to develop innovative approaches that bolster drug research. The investigations have produced practical solutions, not only providing insights into the intricacies of biological systems, but also allowing the creation of valuable tools for their analysis. In short, the achievements are:
• A novel computational algorithm to identify adverse events specific to fixed-dose drug combinations.
• A web application that tracks the clinical drug research response to SARS-CoV-2.
• A Python package for differential gene expression analysis and the identification of key regulatory "switch genes".
• The identification of pivotal events causing drug-induced impulse control disorders linked to specific medications.
• An automated pipeline for discovering potential drug repurposing opportunities.
• The creation of a comprehensive knowledge graph and development of a graph machine learning model for predictions.
Collectively, these projects illustrate diverse applications of data science and network-based methodologies, highlighting the profound impact they can have in supporting drug research activities.

Relevance: 100.00%

Abstract:

That humans and animals learn from interaction with the environment is a foundational idea underlying nearly all theories of learning and intelligence. Learning that certain outcomes are associated with specific actions or stimuli (both internal and external) is at the very core of the capacity to adapt behaviour to environmental changes. In the present work, appetitive and aversive reinforcement learning paradigms have been used to investigate the fronto-striatal loops and behavioural correlates of adaptive and maladaptive reinforcement learning processes, aiming at a deeper understanding of how cortical and subcortical substrates interact with each other and with other brain systems to support learning. By combining a large variety of neuroscientific approaches, including behavioural and psychophysiological methods, EEG and neuroimaging techniques, these studies aim at clarifying and advancing the knowledge of the neural bases and computational mechanisms of reinforcement learning, in both healthy and neurologically impaired populations.
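The idea of learning that outcomes are associated with stimuli is often formalised with a prediction-error update. The sketch below shows the textbook delta rule (Rescorla-Wagner style); it is an assumption that a model of this family is relevant here, since the abstract does not name the computational models used, and the function name and values are hypothetical.

```python
def delta_rule_updates(rewards, learning_rate, initial_value=0.0):
    # After each outcome, move the expected value toward the observed
    # reward by a fraction of the prediction error.
    value = initial_value
    history = []
    for reward in rewards:
        prediction_error = reward - value
        value += learning_rate * prediction_error
        history.append(value)
    return history


# A stimulus followed by reward on three consecutive trials.
values = delta_rule_updates([1.0, 1.0, 1.0], learning_rate=0.5)
```

With a learning rate of 0.5 the expectation closes half of the remaining gap on every rewarded trial; in appetitive and aversive paradigms the sign of the outcome differs, but the update has the same form.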

Relevance: 100.00%

Abstract:

The development of Next Generation Sequencing has brought Biology into the Big Data era. The ever-increasing gap between proteins with known sequences and those with a complete functional annotation requires computational methods for automatic structural and functional annotation. My research has focused on proteins and has so far led to the development of three novel tools, DeepREx, E-SNPs&GO and ISPRED-SEQ, based on Machine and Deep Learning approaches. DeepREx computes the solvent exposure of residues in a protein chain. This problem is relevant for the definition of structural constraints regarding the possible folding of the protein. DeepREx exploits Long Short-Term Memory layers to capture residue-level interactions between positions distant in the sequence, achieving state-of-the-art performance. With DeepREx, I conducted a large-scale analysis investigating the relationship between the solvent exposure of a residue and its probability of being pathogenic upon mutation. E-SNPs&GO predicts the pathogenicity of a Single Residue Variation. Variations occurring in a protein sequence can have different effects, possibly leading to the onset of diseases. E-SNPs&GO exploits protein embeddings generated by two novel Protein Language Models (PLMs), as well as a new way of representing functional information coming from the Gene Ontology. The method achieves state-of-the-art performance and is extremely time-efficient when compared to traditional approaches. ISPRED-SEQ predicts the presence of Protein-Protein Interaction sites in a protein sequence. Knowing how a protein interacts with other molecules is crucial for accurate functional characterization. ISPRED-SEQ exploits a convolutional layer to parse local context after embedding the protein sequence with two novel PLMs, greatly surpassing the current state-of-the-art. All methods are published in international journals and are available as user-friendly web servers. 
They have been developed keeping in mind standard guidelines for FAIRness (FAIR: Findable, Accessible, Interoperable, Reusable) and are integrated into the public collection of tools provided by ELIXIR, the European infrastructure for Bioinformatics.

Relevance: 100.00%

Abstract:

The integration of distributed and ubiquitous intelligence has emerged over the last years as the mainspring of transformative advancements in mobile radio networks. As we approach the era of “mobile for intelligence”, next-generation wireless networks are poised to undergo significant and profound changes. Notably, the overarching challenge that lies ahead is the development and implementation of integrated communication and learning mechanisms that will enable the realization of autonomous mobile radio networks. The ultimate pursuit of eliminating human-in-the-loop constitutes an ambitious challenge, necessitating a meticulous delineation of the fundamental characteristics that artificial intelligence (AI) should possess to effectively achieve this objective. This challenge represents a paradigm shift in the design, deployment, and operation of wireless networks, where conventional, static configurations give way to dynamic, adaptive, and AI-native systems capable of self-optimization, self-sustainment, and learning. This thesis aims to provide a comprehensive exploration of the fundamental principles and practical approaches required to create autonomous mobile radio networks that seamlessly integrate communication and learning components. The first chapter of this thesis introduces the notion of Predictive Quality of Service (PQoS) and adaptive optimization and expands upon the challenge to achieve adaptable, reliable, and robust network performance in dynamic and ever-changing environments. The subsequent chapter delves into the revolutionary role of generative AI in shaping next-generation autonomous networks. This chapter emphasizes achieving trustworthy uncertainty-aware generation processes with the use of approximate Bayesian methods and aims to show how generative AI can improve generalization while reducing data communication costs. Finally, the thesis embarks on the topic of distributed learning over wireless networks. 
Distributed learning and its variants, including multi-agent reinforcement learning systems and federated learning, have the potential to meet the scalability demands of modern data-driven applications, enabling efficient and collaborative model training across dynamic scenarios while ensuring data privacy and reducing communication overhead.
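The federated learning mentioned here is commonly illustrated by its aggregation step: clients train locally and a server combines their parameters weighted by local sample counts (federated averaging). The sketch below shows only that step, with hypothetical names and toy numbers; it is not taken from the thesis and leaves out local training and the wireless channel.

```python
def federated_average(client_weights, client_sizes):
    # Average each parameter across clients, weighting every client by the
    # number of local samples it trained on. Raw data never leaves a client;
    # only the parameter vectors are shared.
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients with two model parameters each.
global_weights = federated_average(
    client_weights=[[1.0, 2.0], [3.0, 4.0]],
    client_sizes=[1, 3],
)
```

Because only parameter vectors are exchanged, the communication cost scales with model size rather than dataset size, which is the overhead reduction the abstract refers to.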

Relevance: 100.00%

Abstract:

This thesis is based on the assumption that the socio-economic system in which we are living is characterised by three great trends: growing attention to the promotion of human capital; extremely rapid technological progress, based above all on information and communication technologies (ICT); and the establishment of new production and organizational set-ups. These transformation processes pose a concrete challenge to the training sector, which is called to satisfy the demand for new skills that need to be developed and disseminated. Hence the growing interest that the various training sub-systems devote to the issues of lifelong learning and distance learning. In such a context, so-called e-learning acquires a central role. The first chapter proposes a reference theoretical framework for the transformations that are shaping post-industrial society. It analyzes some key issues such as: how work is changing, the evolution of organizational set-ups and the introduction of the learning organization, the advent of the knowledge society and of knowledge companies, the innovation of training processes, and the key role of ICT in the new training and learning systems. The second chapter focuses on the topic of e-learning as an effective training model in response to the need for constant learning that is emerging in the knowledge society. This chapter starts with a reflection on the importance of lifelong learning and introduces the key arguments of this thesis, i.e. distance learning (DL) and the didactic methodology called e-learning. It goes on with an analysis of the various theoretical and technical aspects of e-learning. In particular, it delves into the theme of e-learning as an integrated and constant training environment, characterized by customized programmes and collaborative learning, didactic assistance and constant monitoring of the results. 
Thus, all the aspects of e-learning are examined: the actors and the new professionals, the virtual communities as learning subjects, the organization of contents in learning objects, conformity to international standards, the integrated platforms and so on. The third chapter, which concludes the theoretical-interpretative part, starts with a short presentation of the current state of the international e-learning market, with the aim of understanding its peculiarities and current trends. Finally, we focus on some important regulatory aspects related to the strong impulse given, first by the European Commission and then by Italian governments, to the development and diffusion of e-learning. The second part of the thesis (chapters 4, 5 and 6) focuses on field research, which aims to define the Italian scenario for e-learning. In particular, we have examined some key topics such as: the challenges of training and the instruments to face such challenges; the new didactic methods and technologies for lifelong learning; the level of diffusion of e-learning in Italy; the relation between classroom training and online training; the main factors of success as well as the most critical aspects of the introduction of e-learning in the various learning environments. As far as the methodological aspects are concerned, we have favoured a qualitative and quantitative analysis. A background analysis has been done to collect the statistical data available on this topic, as well as the research previously carried out in this area. The main source of data consists of the results of the Observatory on e-learning of Aitech-Assinform, which covers the 2000s and four areas of implementation (firms, public administration, universities, school): the thesis has reviewed the results of the last three available surveys, offering a comparative interpretation of them. 
We have then carried out an in-depth empirical examination of two case studies, which have been selected by virtue of the excellence they have achieved and can therefore be considered advanced and emblematic experiences (a large firm and a Graduate School).

Relevance: 100.00%

Abstract:

Statistical modelling and statistical learning theory are two powerful analytical frameworks for analyzing signals and developing efficient processing and classification algorithms. In this thesis, these frameworks are applied for modelling and processing biomedical signals in two different contexts: ultrasound medical imaging systems and primate neural activity analysis and modelling. In the context of ultrasound medical imaging, two main applications are explored: deconvolution of signals measured from an ultrasonic transducer and automatic image segmentation and classification of prostate ultrasound scans. In the former application a stochastic model of the radio frequency signal measured from an ultrasonic transducer is derived. This model is then employed for developing, in a statistical framework, a regularized deconvolution procedure for enhancing signal resolution. In the latter application, different statistical models are used to characterize images of prostate tissues, extracting different features. These features are then used to segment the images into regions of interest by means of an automatic procedure based on a statistical model of the extracted features. Finally, machine learning techniques are used for automatic classification of the different regions of interest. In the context of neural activity signals, an example of a bio-inspired dynamical network was developed to help in studies of motor-related processes in the brain of primate monkeys. The presented model aims to mimic the abstract functionality of a cell population in the 7a parietal region of primate monkeys during the execution of learned behavioural tasks.

Relevance: 100.00%

Abstract:

Alzheimer's disease (AD) is probably caused by both genetic and environmental risk factors. The major genetic risk factor is the E4 variant of the apolipoprotein E gene, called apoE4. Several risk factors for developing AD have been identified, including lifestyle factors such as dietary habits. The mechanisms behind AD pathogenesis and the onset of cognitive decline in the AD brain are presently unknown. In this study we wanted to characterize the effects of the interaction between environmental risk factors and apoE genotype on neurodegeneration processes, with particular focus on behavioural studies and neurodegenerative processes at the molecular level. Towards this aim, we used 6-month-old apoE4 and apoE3 Target Replacement (TR) mice fed on different diets (high intake of cholesterol and high intake of carbohydrates). These mice were evaluated for learning and memory deficits in spatial reference (Morris Water Maze (MWM)) and contextual learning (Passive Avoidance) tasks, which involve the hippocampus and the amygdala, respectively. From these behavioural studies we found that the initial cognitive impairments manifested as a retention deficit in apoE4 mice fed on a high-carbohydrate diet. Thus, the genetic risk factor apoE4 genotype associated with a high-carbohydrate diet seems to affect cognitive functions in young mice, corroborating the theory that the combination of genetic and environmental risk factors greatly increases the risk of developing AD and leads to an earlier onset of cognitive deficits. The cellular and molecular bases of the cognitive decline in AD are largely unknown. In order to determine the molecular changes underlying the onset of the early cognitive impairment observed in the behavioural studies, we performed molecular studies, with particular focus on synaptic integrity and Tau phosphorylation. 
The most relevant finding of our molecular studies was a significant decrease of Brain-derived Neurotrophic Factor (BDNF) in apoE4 mice fed on a high-carbohydrate diet. Our results may suggest that the BDNF decrease found in apoE4 HS mice could be involved in the earliest impairment in long-term reference memory observed in behavioural studies. The second aim of this thesis was to study the possible involvement of leptin in AD. There is growing evidence that leptin has neuroprotective properties in the Central Nervous System (CNS). Recent evidence has shown that leptin and its receptors are widespread in the CNS and may provide neuronal survival signals. However, numerous questions regarding the molecular mechanism by which leptin acts remain unanswered. Thus, given the importance of the involvement of leptin in AD, we wanted to clarify the function of leptin in the pathogenesis of AD and to investigate whether apoE genotype affects leptin levels through studies in vitro, in mice and in humans. Our findings suggest that apoE4 TR mice show an increase of leptin in the brain. Leptin levels are also increased in the cerebrospinal fluid of AD patients, and apoE4 carriers with AD have higher levels of leptin than apoE3 carriers. Moreover, leptin seems to be expressed by reactive glial cells in AD brains. In vitro, ApoE4 together with Amyloid beta increases leptin production by microglia and astrocytes. Taken together, all these findings suggest that leptin replacement might not be a good strategy for AD therapy. Our results show that high leptin levels were found in AD brains. These findings suggest that, as high leptin levels do not promote satiety in obese individuals, it might be possible that they do not promote neuroprotection in AD patients. Therefore, we hypothesized that the AD brain could suffer from leptin resistance. 
Further studies will be critical to determine whether or not central leptin resistance in the CNS could affect its potential neuroprotective effects.