10 resultados para Graph analytics
em AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Resumo:
Biological data are inherently interconnected: protein sequences are connected to their annotations, the annotations are structured into ontologies, and so on. While protein-protein interactions are already represented by graphs, in this work I am presenting how a graph structure can be used to enrich the annotation of protein sequences thanks to algorithms that analyze the graph topology. We also describe a novel solution to restrict the data generation needed for building such a graph, thanks to constraints on the data and dynamic programming. The proposed algorithm ideally improves the generation time by a factor of 5. The graph representation is then exploited to build a comprehensive database, thanks to the rising technology of graph databases. While graph databases are widely used for other kind of data, from Twitter tweets to recommendation systems, their application to bioinformatics is new. A graph database is proposed, with a structure that can be easily expanded and queried.
Resumo:
Analytics is the technology working with the manipulation of data to produce information able to change the world we live every day. Analytics have been largely used within the last decade to cluster people’s behaviour to predict their preferences of items to buy, music to listen, movies to watch and even electoral preference. The most advanced companies succeded in controlling people’s behaviour using analytics. Despite the evidence of the super-power of analytics, they are rarely applied to the big data collected within supply chain systems (i.e. distribution network, storage systems and production plants). This PhD thesis explores the fourth research paradigm (i.e. the generation of knowledge from data) applied to supply chain system design and operations management. An ontology defining the entities and the metrics of supply chain systems is used to design data structures for data collection in supply chain systems. The consistency of this data is provided by mathematical demonstrations inspired by the factory physics theory. The availability, quantity and quality of the data within these data structures define different decision patterns. Ten decision patterns are identified, and validated on-field, to address ten different class of design and control problems in the field of supply chain systems research.
Resumo:
The convergence between the recent developments in sensing technologies, data science, signal processing and advanced modelling has fostered a new paradigm to the Structural Health Monitoring (SHM) of engineered structures, which is the one based on intelligent sensors, i.e., embedded devices capable of stream processing data and/or performing structural inference in a self-contained and near-sensor manner. To efficiently exploit these intelligent sensor units for full-scale structural assessment, a joint effort is required to deal with instrumental aspects related to signal acquisition, conditioning and digitalization, and those pertaining to data management, data analytics and information sharing. In this framework, the main goal of this Thesis is to tackle the multi-faceted nature of the monitoring process, via a full-scale optimization of the hardware and software resources involved by the {SHM} system. The pursuit of this objective has required the investigation of both: i) transversal aspects common to multiple application domains at different abstraction levels (such as knowledge distillation, networking solutions, microsystem {HW} architectures), and ii) the specificities of the monitoring methodologies (vibrations, guided waves, acoustic emission monitoring). The key tools adopted in the proposed monitoring frameworks belong to the embedded signal processing field: namely, graph signal processing, compressed sensing, ARMA System Identification, digital data communication and TinyML.
Resumo:
The fast development of Information Communication Technologies (ICT) offers new opportunities to realize future smart cities. To understand, manage and forecast the city's behavior, it is necessary the analysis of different kinds of data from the most varied dataset acquisition systems. The aim of this research activity in the framework of Data Science and Complex Systems Physics is to provide stakeholders with new knowledge tools to improve the sustainability of mobility demand in future cities. Under this perspective, the governance of mobility demand generated by large tourist flows is becoming a vital issue for the quality of life in Italian cities' historical centers, which will worsen in the next future due to the continuous globalization process. Another critical theme is sustainable mobility, which aims to reduce private transportation means in the cities and improve multimodal mobility. We analyze the statistical properties of urban mobility of Venice, Rimini, and Bologna by using different datasets provided by companies and local authorities. We develop algorithms and tools for cartography extraction, trips reconstruction, multimodality classification, and mobility simulation. We show the existence of characteristic mobility paths and statistical properties depending on transport means and user's kinds. Finally, we use our results to model and simulate the overall behavior of the cars moving in the Emilia Romagna Region and the pedestrians moving in Venice with software able to replicate in silico the demand for mobility and its dynamic.
Resumo:
The idea behind the project is to develop a methodology for analyzing and developing techniques for the diagnosis and the prediction of the state of charge and health of lithium-ion batteries for automotive applications. For lithium-ion batteries, residual functionality is measured in terms of state of health; however, this value cannot be directly associated with a measurable value, so it must be estimated. The development of the algorithms is based on the identification of the causes of battery degradation, in order to model and predict the trend. Therefore, models have been developed that are able to predict the electrical, thermal and aging behavior. In addition to the model, it was necessary to develop algorithms capable of monitoring the state of the battery, online and offline. This was possible with the use of algorithms based on Kalman filters, which allow the estimation of the system status in real time. Through machine learning algorithms, which allow offline analysis of battery deterioration using a statistical approach, it is possible to analyze information from the entire fleet of vehicles. Both systems work in synergy in order to achieve the best performance. Validation was performed with laboratory tests on different batteries and under different conditions. The development of the model allowed to reduce the time of the experimental tests. Some specific phenomena were tested in the laboratory, and the other cases were artificially generated.
Resumo:
The importance of networks, in their broad sense, is rapidly and massively growing in modern-day society thanks to unprecedented communication capabilities offered by technology. In this context, the radio spectrum will be a primary resource to be preserved and not wasted. Therefore, the need for intelligent and automatic systems for in-depth spectrum analysis and monitoring will pave the way for a new set of opportunities and potential challenges. This thesis proposes a novel framework for automatic spectrum patrolling and the extraction of wireless network analytics. It aims to enhance the physical layer security of next generation wireless networks through the extraction and the analysis of dedicated analytical features. The framework consists of a spectrum sensing phase, carried out by a patrol composed of numerous radio-frequency (RF) sensing devices, followed by the extraction of a set of wireless network analytics. The methodology developed is blind, allowing spectrum sensing and analytics extraction of a network whose key features (i.e., number of nodes, physical layer signals, medium access protocol (MAC) and routing protocols) are unknown. Because of the wireless medium, over-the-air signals captured by the sensors are mixed; therefore, blind source separation (BSS) and measurement association are used to estimate the number of sources and separate the traffic patterns. After the separation, we put together a set of methodologies for extracting useful features of the wireless network, i.e., its logical topology, the application-level traffic patterns generated by the nodes, and their position. The whole framework is validated on an ad-hoc wireless network accounting for MAC protocol, packet collisions, nodes mobility, the spatial density of sensors, and channel impairments, such as path-loss, shadowing, and noise. The numerical results obtained by extensive and exhaustive simulations show that the proposed framework is consistent and can achieve the required performance.
Resumo:
This thesis deals with the analysis and management of emergency healthcare processes through the use of advanced analytics and optimization approaches. Emergency processes are among the most complex within healthcare. This is due to their non-elective nature and their high variability. This thesis is divided into two topics. The first one concerns the core of emergency healthcare processes, the emergency department (ED). In the second chapter, we describe the ED that is the case study. This is a real case study with data derived from a large ED located in northern Italy. In the next two chapters, we introduce two tools for supporting ED activities. The first one is a new type of analytics model. Its aim is to overcome the traditional methods of analyzing the activities provided in the ED by means of an algorithm that analyses the ED pathway (organized as event log) as a whole. The second tool is a decision-support system, which integrates a deep neural network for the prediction of patient pathways, and an online simulator to evaluate the evolution of the ED over time. Its purpose is to provide a set of solutions to prevent and solve the problem of the ED overcrowding. The second part of the thesis focuses on the COVID-19 pandemic emergency. In the fifth chapter, we describe a tool that was used by the Bologna local health authority in the first part of the pandemic. Its purpose is to analyze the clinical pathway of a patient and from this automatically assign them a state. Physicians used the state for routing the patients to the correct clinical pathways. The last chapter is dedicated to the description of a MIP model, which was used for the organization of the COVID-19 vaccination campaign in the city of Bologna, Italy.
Resumo:
Water Distribution Networks (WDNs) play a vital importance rule in communities, ensuring well-being band supporting economic growth and productivity. The need for greater investment requires design choices will impact on the efficiency of management in the coming decades. This thesis proposes an algorithmic approach to address two related problems:(i) identify the fundamental asset of large WDNs in terms of main infrastructure;(ii) sectorize large WDNs into isolated sectors in order to respect the minimum service to be guaranteed to users. Two methodologies have been developed to meet these objectives and subsequently they were integrated to guarantee an overall process which allows to optimize the sectorized configuration of WDN taking into account the needs to integrated in a global vision the two problems (i) and (ii). With regards to the problem (i), the methodology developed introduces the concept of primary network to give an answer with a dual approach, of connecting main nodes of WDN in terms of hydraulic infrastructures (reservoirs, tanks, pumps stations) and identifying hypothetical paths with the minimal energy losses. This primary network thus identified can be used as an initial basis to design the sectors. The sectorization problem (ii) has been faced using optimization techniques by the development of a new dedicated Tabu Search algorithm able to deal with real case studies of WDNs. For this reason, three new large WDNs models have been developed in order to test the capabilities of the algorithm on different and complex real cases. The developed methodology also allows to automatically identify the deficient parts of the primary network and dynamically includes new edges in order to support a sectorized configuration of the WDN. The application of the overall algorithm to the new real case studies and to others from literature has given applicable solutions even in specific complex situations.
Resumo:
The recent widespread use of social media platforms and web services has led to a vast amount of behavioral data that can be used to model socio-technical systems. A significant part of this data can be represented as graphs or networks, which have become the prevalent mathematical framework for studying the structure and the dynamics of complex interacting systems. However, analyzing and understanding these data presents new challenges due to their increasing complexity and diversity. For instance, the characterization of real-world networks includes the need of accounting for their temporal dimension, together with incorporating higher-order interactions beyond the traditional pairwise formalism. The ongoing growth of AI has led to the integration of traditional graph mining techniques with representation learning and low-dimensional embeddings of networks to address current challenges. These methods capture the underlying similarities and geometry of graph-shaped data, generating latent representations that enable the resolution of various tasks, such as link prediction, node classification, and graph clustering. As these techniques gain popularity, there is even a growing concern about their responsible use. In particular, there has been an increased emphasis on addressing the limitations of interpretability in graph representation learning. This thesis contributes to the advancement of knowledge in the field of graph representation learning and has potential applications in a wide range of complex systems domains. We initially focus on forecasting problems related to face-to-face contact networks with time-varying graph embeddings. Then, we study hyperedge prediction and reconstruction with simplicial complex embeddings. Finally, we analyze the problem of interpreting latent dimensions in node embeddings for graphs. The proposed models are extensively evaluated in multiple experimental settings and the results demonstrate their effectiveness and reliability, achieving state-of-the-art performances and providing valuable insights into the properties of the learned representations.
Resumo:
Knowledge graphs and ontologies are closely related concepts in the field of knowledge representation. In recent years, knowledge graphs have gained increasing popularity and are serving as essential components in many knowledge engineering projects that view them as crucial to their success. The conceptual foundation of the knowledge graph is provided by ontologies. Ontology modeling is an iterative engineering process that consists of steps such as the elicitation and formalization of requirements, the development, testing, refactoring, and release of the ontology. The testing of the ontology is a crucial and occasionally overlooked step of the process due to the lack of integrated tools to support it. As a result of this gap in the state-of-the-art, the testing of the ontology is completed manually, which requires a considerable amount of time and effort from the ontology engineers. The lack of tool support is noticed in the requirement elicitation process as well. In this aspect, the rise in the adoption and accessibility of knowledge graphs allows for the development and use of automated tools to assist with the elicitation of requirements from such a complementary source of data. Therefore, this doctoral research is focused on developing methods and tools that support the requirement elicitation and testing steps of an ontology engineering process. To support the testing of the ontology, we have developed XDTesting, a web application that is integrated with the GitHub platform that serves as an ontology testing manager. Concurrently, to support the elicitation and documentation of competency questions, we have defined and implemented RevOnt, a method to extract competency questions from knowledge graphs. Both methods are evaluated through their implementation and the results are promising.