41 results for data driven approach
Abstract:
Machine learning is widely adopted to decode multi-variate neural time series, including electroencephalographic (EEG) and single-cell recordings. Recent solutions based on deep learning (DL) outperformed traditional decoders by automatically extracting relevant discriminative features from raw or minimally pre-processed signals. Convolutional Neural Networks (CNNs) have been successfully applied to EEG and are the most common DL-based EEG decoders in the state-of-the-art (SOA). However, the current research is affected by some limitations. SOA CNNs for EEG decoding usually exploit deep and heavy structures with the risk of overfitting small datasets, and architectures are often defined empirically. Furthermore, CNNs are mainly validated by designing within-subject decoders. Crucially, the automatically learned features mainly remain unexplored; conversely, interpreting these features may be of great value to use decoders also as analysis tools, highlighting neural signatures underlying the different decoded brain or behavioral states in a data-driven way. Lastly, SOA DL-based algorithms used to decode single-cell recordings rely on more complex, slower to train and less interpretable networks than CNNs, and the use of CNNs with these signals has not been investigated. This PhD research addresses the previous limitations, with reference to P300 and motor decoding from EEG, and motor decoding from single-neuron activity. CNNs were designed light, compact, and interpretable. Moreover, multiple training strategies were adopted, including transfer learning, which could reduce training times promoting the application of CNNs in practice. Furthermore, CNN-based EEG analyses were proposed to study neural features in the spatial, temporal and frequency domains, and proved to better highlight and enhance relevant neural features related to P300 and motor states than canonical EEG analyses. 
Remarkably, these analyses could be used, in perspective, to design novel EEG biomarkers for neurological or neurodevelopmental disorders. Lastly, CNNs were developed to decode single-neuron activity, providing a better compromise between performance and model complexity.
Abstract:
Long-term monitoring of acoustical environments is gaining popularity thanks to the considerable scientific and engineering insight it provides. The increasing interest is due to the constant growth of storage capacity and of the computational power available to process large amounts of data. In this perspective, machine learning (ML) provides a broad family of data-driven statistical techniques to deal with large databases. Nowadays, the conventional practice of sound level meter measurements limits the global description of a sound scene to an energetic point of view: the equivalent continuous level Leq is the main metric used to define an acoustic environment. Finer analyses involve the use of statistical levels. However, acoustic percentiles are based on temporal assumptions, which are not always reliable. A statistical approach, based on the study of the occurrences of sound pressure levels, would bring a different perspective to the analysis of long-term monitoring. Depicting a sound scene through the most probable sound pressure level, rather than portions of energy, brings more specific information about the activity carried out during the measurements. The statistical mode of the occurrences can capture typical behaviors of specific kinds of sound sources. The present work proposes an ML-based method to identify, separate and measure coexisting sound sources in real-world scenarios. It is based on long-term monitoring and is addressed to acousticians focused on the analysis of environmental noise in manifold contexts. The presented method is based on clustering analysis. Two algorithms, Gaussian Mixture Model and K-means clustering, form the core of a process to investigate different active spaces monitored through sound level meters. The procedure has been applied in two different contexts: university lecture halls and offices.
The proposed method shows robust and reliable results in describing the acoustic scenario and it could represent an important analytical tool for acousticians.
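The contrast drawn above between an energetic descriptor (Leq), the most frequent level (the statistical mode) and a clustering of level occurrences can be illustrated with a minimal sketch. It is not the thesis's pipeline: the function names, the 1 dB binning, the deterministic centroid initialisation and the sample levels are all assumptions made for illustration, and only a plain one-dimensional K-means is shown (the Gaussian Mixture Model step is omitted).

```python
import math
from collections import Counter


def leq(levels_db):
    """Equivalent continuous level: energetic average of levels given in dB."""
    mean_energy = sum(10 ** (l / 10) for l in levels_db) / len(levels_db)
    return 10 * math.log10(mean_energy)


def modal_level(levels_db, bin_width=1.0):
    """Most frequently occurring level after grouping into bins of bin_width dB."""
    bins = Counter(round(l / bin_width) for l in levels_db)
    return bins.most_common(1)[0][0] * bin_width


def kmeans_1d(values, k=2, iterations=50):
    """Plain Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Assumed initialisation: spread the starting centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical short-term levels (dB): quiet background plus a louder source.
levels = [39.0, 40.0, 40.0, 41.0, 40.0, 64.0, 65.0, 66.0, 65.0, 40.0]
centroids, labels = kmeans_1d(levels, k=2)
print("Leq:", round(leq(levels), 1), "mode:", modal_level(levels), "centroids:", centroids)
```

On data like this the Leq is dominated by the louder source, while the mode and the lower centroid describe the background that is present most of the time, which is the kind of distinction the abstract attributes to occurrence-based analysis.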
Abstract:
In this thesis we focus on the analysis and interpretation of time-dependent deformations recorded through different geodetic methods. Firstly, we apply a variational Bayesian Independent Component Analysis (vbICA) technique to GPS daily displacement solutions, to separate the postseismic deformation that followed the mainshocks of the 2016-2017 Central Italy seismic sequence from the other, hydrological, deformation sources. By interpreting the signal associated with the postseismic relaxation, we model an afterslip distribution on the faults involved in the mainshocks, consistent with the co-seismic models available in the literature. We find evidence of aseismic slip on the Paganica fault, responsible for the Mw 6.1 2009 L’Aquila earthquake, highlighting the importance of aseismic slip and static stress transfer to properly model the recurrence of earthquakes on nearby fault segments. We infer a possible viscoelastic relaxation of the lower crust as a contributing mechanism to the postseismic displacements. We highlight the importance of a proper separation of the hydrological signals for an accurate assessment of the tectonic processes, especially in cases of mm-scale deformations. In addition, we provide a physical explanation for the independent components (ICs) associated with the observed hydrological processes. In the second part of the thesis, we focus on strain data from Gladwin Tensor Strainmeters, working on the instruments deployed in Taiwan. We develop a novel, completely data-driven approach to calibrate these strainmeters. We carry out a joint analysis of geodetic (strainmeters, GPS and GRACE products) and hydrological (rain gauges and piezometers) data sets, to characterize the hydrological signals in Southern Taiwan. Lastly, we apply the calibration approach proposed here to the strainmeters recently installed in Central Italy.
We provide, as an example, the detection of a storm that hit the Umbria-Marche regions (Italy), demonstrating the potential of strainmeters in following the dynamics of deformation processes with limited spatio-temporal signature.
Abstract:
The integration of quantitative data from movement analysis technologies is reshaping the analysis of athletes’ performances and injury mitigation, e.g., anterior cruciate ligament (ACL) rupture. Most of the movement assessments are performed in laboratory environments. Recent progress provides the chance to shift the paradigm to a more ecological approach with sport-specific elements and a closer examination of “real” movement patterns associated with performance and (ACL) injury risk. The present PhD thesis aimed at investigating the on-field motion patterns related to performance and injury prevention in young football players. The objectives of the thesis were: (I) in-lab measures of high-dynamics movements were used to validate wearable inertial sensors technology; (II) in-laboratory and on-field agility movement tasks were compared to inspect the effect of football-specific environment; (III) on-field analysis was conducted to challenge wearable sensors technology in the assessment of dangerous movement patterns towards the ACL rupture; (IV) an overview of technologies that could shape present and future assessment of ACL injury risk in daily practice was presented. The validity of wearables in the assessment of high-dynamics movements was confirmed. Relevant differences emerged between the movements performed in a laboratory setting and on the football pitch, supporting the inclusion of an ecological dynamics approach in preventive protocols. The on-field analysis of football-specific movement tasks demonstrated good reliability of wearable sensors and the presence of residual dangerous patterns in the injured players. A tool to inspect at-risk movement patterns on the field through objective measurements was presented. It discussed how potential alternatives to wearable inertial sensors embrace artificial intelligence and closer collaboration between clinical and technical expertise. 
The present thesis was meant to contribute to setting the basis for data-driven prevention protocols. A deeper comprehension of injury-related principles and counteractions will contribute to preserving athletes’ careers and health over time.
Abstract:
Protected crop production is a modern and innovative approach to cultivating plants in a controlled environment to optimize growth, yield, and quality. This method involves using structures such as greenhouses or tunnels to create a sheltered environment. These production systems are characterized by a careful regulation of variables like temperature, humidity, light, and ventilation, which collectively contribute to creating an optimal microclimate for plant growth. Heating, cooling, and ventilation systems are used to maintain optimal conditions for plant growth, regardless of external weather fluctuations. Protected crop production plays a crucial role in addressing challenges posed by climate variability, population growth, and food security. Similarly, animal husbandry involves providing adequate nutrition, housing, medical care and environmental conditions to ensure animal welfare. Sustainability is a critical consideration in all forms of agriculture, including protected crop and animal production. Sustainability in animal production refers to the practice of producing animal products in a way that minimizes negative impacts on the environment, promotes animal welfare, and ensures the long-term viability of the industry. The research activities performed during the PhD therefore fall within the field of Precision Agriculture and Livestock farming. Here the focus is on the computational fluid dynamics (CFD) approach and environmental assessment applied to improve yield, resource efficiency, environmental sustainability, and cost savings. It represents a significant shift from traditional farming methods to a more technology-driven, data-driven, and environmentally conscious approach to crop and animal production.
On one side, CFD is a powerful and precise technique for computer modeling and simulation of airflows and thermo-hygrometric parameters, which has been applied to optimize the growth environment of crops and the efficiency of ventilation in pig barns. On the other side, the sustainability aspect has been investigated through Life Cycle Assessment analyses.
Abstract:
The integration of distributed and ubiquitous intelligence has emerged over the last years as the mainspring of transformative advancements in mobile radio networks. As we approach the era of “mobile for intelligence”, next-generation wireless networks are poised to undergo significant and profound changes. Notably, the overarching challenge that lies ahead is the development and implementation of integrated communication and learning mechanisms that will enable the realization of autonomous mobile radio networks. The ultimate pursuit of eliminating human-in-the-loop constitutes an ambitious challenge, necessitating a meticulous delineation of the fundamental characteristics that artificial intelligence (AI) should possess to effectively achieve this objective. This challenge represents a paradigm shift in the design, deployment, and operation of wireless networks, where conventional, static configurations give way to dynamic, adaptive, and AI-native systems capable of self-optimization, self-sustainment, and learning. This thesis aims to provide a comprehensive exploration of the fundamental principles and practical approaches required to create autonomous mobile radio networks that seamlessly integrate communication and learning components. The first chapter of this thesis introduces the notion of Predictive Quality of Service (PQoS) and adaptive optimization and expands upon the challenge to achieve adaptable, reliable, and robust network performance in dynamic and ever-changing environments. The subsequent chapter delves into the revolutionary role of generative AI in shaping next-generation autonomous networks. This chapter emphasizes achieving trustworthy uncertainty-aware generation processes with the use of approximate Bayesian methods and aims to show how generative AI can improve generalization while reducing data communication costs. Finally, the thesis embarks on the topic of distributed learning over wireless networks. 
Distributed learning and its variants, including multi-agent reinforcement learning systems and federated learning, have the potential to meet the scalability demands of modern data-driven applications, enabling efficient and collaborative model training across dynamic scenarios while ensuring data privacy and reducing communication overhead.
Abstract:
In this thesis, the viability of the Dynamic Mode Decomposition (DMD) as a technique to analyze and model complex dynamic real-world systems is presented. This method derives, directly from data, computationally efficient reduced-order models (ROMs) which can replace too onerous or unavailable high-fidelity physics-based models. Optimizations and extensions to the standard implementation of the methodology are proposed, investigating diverse case studies related to the decoding of complex flow phenomena. The flexibility of this data-driven technique allows its application to high-fidelity fluid dynamics simulations, as well as time series of real systems observations. The resulting ROMs are tested against two tasks: (i) reduction of the storage requirements of high-fidelity simulations or observations; (ii) interpolation and extrapolation of missing data. The capabilities of DMD can also be exploited to alleviate the cost of onerous studies that require many simulations, such as uncertainty quantification analysis, especially when dealing with complex high-dimensional systems. In this context, a novel approach to address parameter variability issues when modeling systems with space and time-variant response is proposed. Specifically, DMD is merged with another model-reduction technique, namely the Polynomial Chaos Expansion, for uncertainty quantification purposes. Useful guidelines for DMD deployment result from the study, together with the demonstration of its potential to ease diagnosis and scenario analysis when complex flow processes are involved.
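For readers unfamiliar with the method, the standard (exact) DMD construction of a reduced-order model from data can be summarised as follows. This is the textbook formulation, given here as a sketch; the abstract does not state which variant or which optimizations the thesis adopts, and the symbols (snapshots $x_k$, rank $r$, time step $\Delta t$) are notation assumed for illustration.

```latex
% Snapshot matrices built from m equally spaced observations x_1, ..., x_m
X  = \begin{bmatrix} x_1 & x_2 & \cdots & x_{m-1} \end{bmatrix}, \qquad
X' = \begin{bmatrix} x_2 & x_3 & \cdots & x_m \end{bmatrix}, \qquad
X' \approx A X .

% Rank-r truncated SVD of X and projection of A onto the leading r modes
X \approx U_r \Sigma_r V_r^{*}, \qquad
\tilde{A} = U_r^{*} X' V_r \Sigma_r^{-1} .

% Eigendecomposition of the reduced operator and the DMD modes
\tilde{A} W = W \Lambda, \qquad
\Phi = X' V_r \Sigma_r^{-1} W .

% Reduced-order reconstruction and prediction
x(t) \approx \sum_{j=1}^{r} \phi_j \, e^{\omega_j t} \, b_j, \qquad
\omega_j = \frac{\ln \lambda_j}{\Delta t}, \qquad
b = \Phi^{\dagger} x_1 .
```

Storing only $\Phi$, $\Lambda$ and $b$ instead of the full snapshot set corresponds to the storage-reduction task, and evaluating $x(t)$ at times between or beyond the snapshots corresponds to the interpolation and extrapolation task mentioned above.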
Abstract:
In the first chapter, we consider the joint estimation of objective and risk-neutral parameters for SV option pricing models. We propose a strategy which exploits the information contained in large heterogeneous panels of options, and we apply it to S&P 500 index and index call options data. Our approach breaks the stochastic singularity between contemporaneous option prices by assuming that every observation is affected by measurement error. We evaluate the likelihood function by using a MC-IS strategy combined with a Particle Filter algorithm. The second chapter examines the impact of different categories of traders on market transactions. We estimate a model which takes into account traders’ identities at the transaction level, and we find that stock prices follow the direction of institutional trading. These results are obtained with data from an anonymous market. To explain our estimates, we examine the informativeness of a wide set of market variables and we find that most of them are unambiguously significant for inferring the identity of traders. The third chapter investigates the relationship between the categories of market traders and three definitions of financial durations. We consider trade, price and volume durations, and we adopt a Log-ACD model in which we include information on traders at the transaction level. As to trade durations, we observe an increase in trading frequency when informed traders and the liquidity provider intensify their presence in the market. For price and volume durations, we find that the same effect depends on the state of market activity. The fourth chapter proposes a strategy to express order aggressiveness in quantitative terms. We consider a simultaneous equation model to examine price and volume aggressiveness at Euronext Paris, and we analyse the impact of a wide set of order book variables on the price-quantity decision.
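As background for the third chapter, a baseline Log-ACD(1,1) recursion can be sketched as below. This assumes the common specification in which the log of the conditional expected duration depends on the log of the previous duration; the abstract does not say which Log-ACD variant is used, and the trader-level covariates of the thesis are not reproduced. The function name, parameter names and starting value are hypothetical.

```python
import math


def log_acd_expected_durations(durations, omega, alpha, beta, psi0=0.0):
    """Baseline Log-ACD(1,1), assumed form:
        psi_i = omega + alpha * ln(x_{i-1}) + beta * psi_{i-1}
    where psi_i is the log of the conditional expected duration.
    Returns exp(psi_i) for each observed duration x_i."""
    psi = psi0  # hypothetical starting value for the recursion
    expected = []
    for x in durations:
        expected.append(math.exp(psi))  # expectation formed before observing x
        psi = omega + alpha * math.log(x) + beta * psi
    return expected


# Hypothetical durations in seconds between consecutive trades.
print(log_acd_expected_durations([2.0, 0.5, 0.4, 3.0], omega=0.05, alpha=0.1, beta=0.8))
```

Because the recursion is written in logs, the expected duration stays positive without sign restrictions on the coefficients, which is one reason additional explanatory variables can enter the equation additively; how the thesis actually includes the trader information is not specified in the abstract.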
Abstract:
The aim of this work is to improve the understanding of European rurality. Given the profound transformations that have taken place, it is no longer possible to analyse rural territories with a merely dichotomous approach that simply distinguishes them from cities. Instead, the work integrates the analysis of socio-economic aspects with that of territorial elements, highlighting the main dimensions that characterise the many types of rurality found in Europe today. Starting from the debate on the classification of rural areas, a composite indicator of rurality is first proposed which, adopting fuzzy logic, jointly considers demographic (density), sectoral (relevance of agricultural activity), territorial and geographical (accessibility and land use) aspects. This technique makes it possible to reconstruct a continuum of degrees of rurality, thus distinguishing, within the European Union (about 1,300 observations), the most central areas from those that are progressively more rural and peripheral. Subsequently, a cluster analysis identifies types of areas that are homogeneous in terms of economic structure, landscape and diversification of agricultural activity. These clusters also reflect the geographical distribution of the areas themselves: groups of central regions are distinguished from groups of more peripheral regions. Above all, this analysis shows that the equation of rurality with backwardness is now outdated: some rural areas have in fact benefited from the transformations that have affected the European Union in recent decades (diffusion of ICT or development of manufacturing). The last part of the work offers analytical tools in support of EU policy action, analysing the differing capacity of European regions to respond to the challenges set by the Europe 2020 Strategy.
A principal component analysis summarises the main dimensions of this regional performance: the results are then reinterpreted in the light of the structural characteristics of European territories. Finally, a more direct spatial analysis of the data shows how geography still profoundly influences the capacity of territories to respond to the new challenges of the decade.
Abstract:
Falls are common and burdensome accidents among the elderly. About one third of the population aged 65 years or more experiences at least one fall each year. Fall risk assessment is believed to be beneficial for fall prevention. This thesis is about prognostic tools for falls for community-dwelling older adults. We provide an overview of the state of the art. We then take different approaches: we propose a theoretical probabilistic model to investigate some properties of prognostic tools for falls; we present a tool whose parameters were derived from data in the literature; we train and test a data-driven prognostic tool. Finally, we present some preliminary results on the prediction of falls through features extracted from wearable inertial sensors. Heterogeneity in validation results is expected from theoretical considerations and is observed in empirical data. Differences in study design hinder comparability and collaborative research. Given the multifactorial etiology of falls, assessment of multiple risk factors is needed in order to achieve good predictive accuracy.
Abstract:
The language connectome was investigated in vivo using multimodal non-invasive quantitative MRI. In PPA patients (n=18) recruited by the IRCCS ISNB, Bologna, cortical thickness measures showed a predominant reduction in the left hemisphere (p<0.005) with respect to matched healthy controls (HC) (n=18), and an accuracy of 86.1% in discrimination from Alzheimer’s disease patients (n=18). The thickness of the left temporal and para-hippocampal gyri correlated significantly (p<0.01) with language fluency. In PPA patients (n=31) recruited by Northwestern University, Chicago, DTI measures were longitudinally evaluated (2-year follow-up) under the supervision of Prof. M. Catani, King’s College London. Significant differences with matched HC (n=27) were found, tract-localized at baseline and widespread at follow-up. Language assessment scores correlated with arcuate (AF) and uncinate (UF) fasciculi DTI measures. In left-ischemic stroke patients (n=16) recruited by the NatBrainLab, King’s College London, language recovery was longitudinally evaluated (6-month follow-up). Using arterial spin labelling imaging, a significant correlation (p<0.01) was found between language recovery and cerebral blood flow asymmetry, towards the right, in the middle cerebral artery perfusion territory. In HC (n=29) recruited by the DIBINEM Functional MR Unit, University of Bologna, an along-tract algorithm suitable for different tractography methods was developed, using the Laplacian operator. A higher left superior temporal gyrus and precentral operculum AF connectivity was found (Talozzi L et al., 2018), as well as lateralized UF projections towards the left dorsal orbital cortex. In HC (n=50) recruited in the Human Connectome Project, a new tractography-driven approach was developed for left association fibres, using a principal component analysis. The first component discriminated cortical areas typically connected by the AF, suggesting a good discrimination of cortical areas sharing a similar connectivity pattern.
The evaluation of morphological, microstructural and metabolic measures could be used as in-vivo biomarkers to monitor language impairment related to neurodegeneration or as surrogate of cognitive rehabilitation/interventional treatment efficacy.
Abstract:
In recent decades, global food supply chains have had to deal with the increasing awareness of stakeholders and consumers about safety, quality, and sustainability. In order to address these new challenges for food supply chain systems, an integrated approach to design, control, and optimize the product life cycle is required. Therefore, it is essential to introduce new models, methods, and decision-support platforms tailored to perishable products. This thesis aims to provide novel practice-ready decision-support models and methods to optimize the logistics of food items with an integrated and interdisciplinary approach. It proposes a comprehensive review of the main peculiarities of perishable products and the environmental stresses accelerating their quality decay. Then, it focuses on top-down strategies to optimize the supply chain system from the strategic to the operational decision level. Based on the criticality of the environmental conditions, the dissertation evaluates the main long-term logistics investment strategies to preserve product quality. Several models and methods are proposed to optimize the logistics decisions to enhance the sustainability of the supply chain system while guaranteeing adequate food preservation. The models and methods proposed in this dissertation promote a climate-driven approach integrating climate conditions and their consequences on the quality decay of products in innovative models supporting the logistics decisions. Given the uncertain nature of the environmental stresses affecting the product life cycle, an original stochastic model and solving method are proposed to support practitioners in controlling and optimizing the supply chain systems when facing uncertain scenarios. The application of the proposed decision-support methods to real case studies proved their effectiveness in increasing the sustainability of the perishable product life cycle.
The dissertation also presents an industry application of a global food supply chain system, further demonstrating how the proposed models and tools can be integrated to provide significant savings and sustainability improvements.
Abstract:
This thesis studies how commercial practice is developing with artificial intelligence (AI) technologies and discusses some normative concepts in EU consumer law. The author analyses the phenomenon of 'algorithmic business', which denotes the increasing use of data-driven AI in marketing organisations for the optimisation of a range of consumer-related tasks. The phenomenon is orienting business-consumer relations towards some general trends that influence the power and behaviour of consumers. These developments are not taking place in a legal vacuum, but against the background of a normative system aimed at maintaining fairness and balance in market transactions. The author assesses current developments in commercial practices in the context of EU consumer law, which is specifically aimed at regulating commercial practices. The analysis is critical by design and, without neglecting concrete practices, tries to look at the big picture. The thesis consists of nine chapters divided into three thematic parts. The first part discusses the deployment of AI in marketing organisations: a brief history, the technical foundations, and the modes of integration in business organisations. In the second part, a selected number of socio-technical developments in commercial practice are analysed. The following are addressed: the monitoring and analysis of consumers’ behaviour based on data; the personalisation of commercial offers and customer experience; the use of information on consumers’ psychology and emotions; and the mediation through marketing conversational applications. The third part assesses these developments in the context of EU consumer law and of the broader policy debate concerning consumer protection in the algorithmic society.
In particular, two normative concepts underlying the EU fairness standard are analysed: manipulation, as a substantive regulatory standard that limits commercial behaviours in order to protect consumers’ informed and free choices; and vulnerability, as a concept of social policy that portrays people who are more exposed to marketing practices.
Abstract:
This dissertation explores the link between hate crimes that occurred in the United Kingdom in June 2017, June 2018 and June 2019 through the posts of a robust sample of Conservative and radical right users on Twitter. In order to avoid the traditional challenges of this kind of research, I adopted a four-stage research protocol that enabled me to merge content produced by a group of randomly selected users to observe the phenomenon from different angles. I collected tweets from thirty Conservative/right-wing accounts for each month of June over the three years with the help of programming languages such as Python and of CygWin tools. I then examined the language of my data, focussing on humorous content, in order to reveal whether, and if so how, radical users online use humour as a tool to spread their views in conditions of heightened disgust and widespread political instability. A reflection on humour as a moral occurrence, expanding on the works of Christie Davies as well as applying recent findings on the behavioural immune system to online data, offers new insights into the overlooked humorous nature of radical political discourse. An unorthodox take on the moral foundations pioneered by Jonathan Haidt enriched my understanding of the analysed material through the addition of a moral-based layer of enquiry to my more traditional content-based one. This convergence of theoretical, data-driven and real-life elements constitutes a viable “collection of strategies” for academia, data scientists, NGOs fighting hate crimes and the wider public alike. Bringing the ideas of Davies, Haidt and others to bear on my data helps us to perceive humorous online content in terms of complex radical narratives that are all too often compressed into a single tweet.
Abstract:
Although the debate of what data science is has a long history and has not reached a complete consensus yet, Data Science can be summarized as the process of learning from data. Guided by the above vision, this thesis presents two independent data science projects developed in the scope of multidisciplinary applied research. The first part analyzes fluorescence microscopy images typically produced in life science experiments, where the objective is to count how many marked neuronal cells are present in each image. Aiming to automate the task for supporting research in the area, we propose a neural network architecture tuned specifically for this use case, cell ResUnet (c-ResUnet), and discuss the impact of alternative training strategies in overcoming particular challenges of our data. The approach provides good results in terms of both detection and counting, showing performance comparable to the interpretation of human operators. As a meaningful addition, we release the pre-trained model and the Fluorescent Neuronal Cells dataset collecting pixel-level annotations of where neuronal cells are located. In this way, we hope to help future research in the area and foster innovative methodologies for tackling similar problems. The second part deals with the problem of distributed data management in the context of LHC experiments, with a focus on supporting ATLAS operations concerning data transfer failures. In particular, we analyze error messages produced by failed transfers and propose a Machine Learning pipeline that leverages the word2vec language model and K-means clustering. This provides groups of similar errors that are presented to human operators as suggestions of potential issues to investigate. The approach is demonstrated on one full day of data, showing promising ability in understanding the message content and providing meaningful groupings, in line with previously reported incidents by human operators.