10 results for Source Code Analysis
in AMS Tesi di Dottorato - Alm@DL - Università di Bologna
Abstract:
Embedded systems are increasingly integral to daily life, improving the efficiency of modern Cyber-Physical Systems that provide access to sensor data and actuators. As modern architectures become increasingly complex and heterogeneous, their optimization becomes a challenging task. Additionally, ensuring platform security is important to avoid harm to individuals and assets. This study primarily addresses challenges in contemporary Embedded Systems, focusing on platform optimization and security enforcement. The first part of this study delves into the application of machine learning methods to efficiently determine the optimal number of cores for a parallel RISC-V cluster, minimizing energy consumption using static source code analysis. Results demonstrate that automated platform configuration is viable, with only a moderate performance trade-off when relying solely on static features. The second part addresses the problem of heterogeneous device mapping, which involves assigning tasks to the most suitable computational device in a heterogeneous platform for optimal runtime. The contribution of this part lies in the introduction of novel pre-processing techniques, along with a Siamese Network training framework, that enhance the classification performance of DeepLLVM, an advanced approach for task mapping. Importantly, the proposed approaches are independent of the specific deep-learning model used. Finally, this research addresses issues concerning the binary exploitation of software running on modern Embedded Systems. It proposes an architecture to implement Control-Flow Integrity in embedded platforms with a Root-of-Trust, aiming to enhance security guarantees with limited hardware modifications. The approach enhances the architecture of a modern RISC-V platform for autonomous vehicles with a side-channel communication mechanism that relays control-flow changes executed by the process running on the host core to the Root-of-Trust. This approach has limited impact on performance and is effective in enhancing the security of embedded platforms.
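The abstract does not detail the feature set or the learning model used in the first part; purely as an illustrative sketch, the snippet below assumes hypothetical static features (loop, branch, memory and arithmetic operation counts) extracted from each kernel and a generic scikit-learn classifier that predicts the energy-optimal core count. None of these names or numbers come from the thesis.

# Minimal sketch (illustrative assumptions only): predict the energy-optimal
# number of cores of a parallel cluster from static source-code features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical static features per kernel: loop count, branch count,
# memory ops, arithmetic ops. Labels are the core counts that minimised
# measured energy for each kernel (training data would come from profiling).
X = np.array([
    [2, 4, 120, 300],
    [8, 1, 900, 150],
    [1, 9,  60, 500],
    [5, 3, 400, 420],
    [7, 2, 850, 180],
    [1, 8,  70, 480],
])
y = np.array([2, 8, 1, 4, 8, 1])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(clf.predict([[3, 2, 200, 350]]))  # predicted core count for a new kernel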
Abstract:
Since the 1970s, the world has undergone a progressive geopolitical reorganization. Thanks also to technological evolution and its mass diffusion, time and space have contracted in the process of globalization that has characterized contemporary societies, where information and communication now play a central role in the dynamics of knowledge. This study first aims to shed light on the discipline of intelligence as formulated in the military and "civilian" domains, particularly in the US, NATO and UN contexts, in order to highlight the peculiarities of a new intelligence discipline, so-called Open Source Intelligence, whose innovative element is the use of unclassified information. After addressing the problem of the conceptualization and evolution of the terrorist phenomenon, the focus turns to the criminal expression that is currently of greatest concern, international terrorism of Islamic inspiration, viewed from a multidimensional perspective through the adoption of interdisciplinary criminological concepts. On the experimental side, it was therefore decided to propose, design and develop the architecture of the Open Source Intelligence Analysis Platform, an operational support tool for the Open Source Intelligence analyst, intended as a resource for criminological analysis and able to provide a valuable contribution, through the merging of practice and research, to the application of this informational approach to the terrorist phenomenon.
Abstract:
The availability of a huge amount of source code from code archives and open-source projects opens up the possibility to merge machine learning, programming languages, and software engineering research fields. This area is often referred to as Big Code, where programming languages are treated in place of natural languages and different features and patterns of code can be exploited to perform many useful tasks and build supportive tools. Among all the possible applications that can be developed within the area of Big Code, the work presented in this research thesis mainly focuses on two particular tasks: Programming Language Identification (PLI) and Software Defect Prediction (SDP) for source code. Programming language identification is commonly needed in program comprehension and is usually performed directly by developers. However, at large scale, such as in widely used archives (GitHub, Software Heritage), automation of this task is desirable. To accomplish this aim, the problem is analyzed from different points of view (text- and image-based learning approaches) and different models are created, paying particular attention to their scalability. Software defect prediction is a fundamental step in software development for improving quality and assuring the reliability of software products. In the past, defects were searched for by manual inspection or using automatic static and dynamic analyzers. Now, the automation of this task can be tackled using learning approaches that can speed up and improve the related procedures. Here, two models have been built and analyzed to detect some of the most common bugs and errors at different code granularity levels (file and method levels). The data used and the models' architectures are analyzed and described in detail. Quantitative and qualitative results are reported for both the PLI and SDP tasks, and differences and similarities with respect to other related works are discussed.
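The thesis' own PLI models are not reproduced in the abstract; as a minimal sketch of the text-based flavour of the task, the snippet below trains a character n-gram TF-IDF classifier on a handful of toy snippets. The snippets, labels and model choice are illustrative assumptions, not the thesis' pipeline.

# Minimal sketch of text-based Programming Language Identification (PLI):
# character n-gram TF-IDF features feeding a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

snippets = [
    "def add(a, b):\n    return a + b",
    "int add(int a, int b) { return a + b; }",
    "function add(a, b) { return a + b; }",
    "public static int add(int a, int b) { return a + b; }",
]
labels = ["Python", "C", "JavaScript", "Java"]

pli = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
)
pli.fit(snippets, labels)
print(pli.predict(["printf(\"hi\\n\");", "print('hi')"]))  # predicted languages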
Abstract:
Agent Communication Languages (ACLs) have been developed to provide a way for agents to communicate with each other, supporting cooperation in Multi-Agent Systems. In the past few years many ACLs have been proposed for Multi-Agent Systems, such as KQML and FIPA-ACL. The goal of these languages is to support high-level, human-like communication among agents, exploiting Knowledge Level features rather than symbol-level ones. Adopting these ACLs, and mainly the FIPA-ACL specifications, many agent platforms and prototypes have been developed. Despite these efforts, an important issue in the research on ACLs is still open: how these languages should deal (at the Knowledge Level) with possible failures of agents. Indeed, the notion of Knowledge Level cannot be straightforwardly extended to a distributed framework such as MASs, because problems concerning communication and concurrency may arise when several Knowledge Level agents interact (for example, deadlock or starvation). The main contribution of this Thesis is the design and implementation of NOWHERE, a platform to support Knowledge Level Agents on the Web. NOWHERE exploits an advanced Agent Communication Language, FT-ACL, which provides high-level fault-tolerant communication primitives and satisfies a set of well-defined Knowledge Level programming requirements. NOWHERE is well integrated with current technologies, for example providing full integration with Web services. Supporting different middleware for sending messages, it can be adapted to various scenarios. In this Thesis we present the design and implementation of the architecture, together with a discussion of the most interesting details and a comparison with other emerging agent platforms. We also present several case studies in which we discuss the benefits of programming agents using the NOWHERE architecture, comparing the results with other solutions. Finally, the complete source code of the basic examples can be found in the appendix.
Abstract:
Salmonella and Campylobacter are common causes of human gastroenteritis. Their epidemiology is complex, and a multi-tiered approach to control is needed, taking into account the different reservoirs, pathways and risk factors. In this thesis, trends in human gastroenteritis and food-borne outbreak notifications in Italy were explored. Moreover, the improved sensitivity of two recently implemented regional surveillance systems in Lombardy and Piedmont was demonstrated, providing a basis for improving notification at the national level. Trends in human Salmonella serovars were explored: serovars Enteritidis and Infantis decreased, Typhimurium remained stable, and 4,[5],12:i:-, Derby and Napoli increased, suggesting that sources of infection have changed over time. Attribution analysis identified pigs as the main source of human salmonellosis in Italy, accounting for 43–60% of infections, followed by Gallus gallus (18–34%). Attributions to pigs and Gallus gallus showed increasing and decreasing trends, respectively. Potential bias and sampling issues related to the use of non-local/non-recent multilocus sequence typing (MLST) data in Campylobacter jejuni/coli source attribution using the Asymmetric Island (AI) model were investigated. As MLST data become increasingly dissimilar with increasing geographical/temporal distance, attributions to sources not sampled close to human cases can be underestimated. A combined case-control and source attribution analysis was developed to investigate risk factors for human Campylobacter jejuni/coli infection of chicken, ruminant, environmental, pet and exotic origin in the Netherlands. Most infections (~87%) were attributed to chicken and cattle. Individuals infected from different reservoirs had different associated risk factors: chicken consumption increased the risk for chicken-attributed infections; animal contact, barbecuing, tripe consumption, and never/seldom eating chicken increased that for ruminant-attributed infections; game consumption and attending swimming pools increased that for environment-attributed infections; and dog ownership increased that for environment- and pet-attributed infections. Person-to-person contacts around holiday periods were risk factors for infections with exotic strains, putatively introduced by returning travellers.
Abstract:
This thesis deals with the study of optimal control problems for the incompressible Magnetohydrodynamics (MHD) equations. Particular attention to these problems arises from several applications in science and engineering, such as fission nuclear reactors with liquid metal coolant and aluminum casting in metallurgy. In such applications it is of great interest to achieve control of the fluid state variables through the action of the magnetic Lorentz force. In this thesis we investigate a class of boundary optimal control problems, in which the flow is controlled through the boundary conditions of the magnetic field. Due to their complexity, these problems present various challenges in the definition of an adequate solution approach, both from a theoretical and from a computational point of view. In this thesis we propose a new boundary control approach, based on lifting functions of the boundary conditions, which yields both theoretical and numerical advantages. With the introduction of lifting functions, boundary control problems can be formulated as extended distributed problems. We consider a systematic mathematical formulation of these problems in terms of the minimization of a cost functional constrained by the MHD equations. The existence of a solution to the flow equations and to the optimal control problem is shown. The Lagrange multiplier technique is used to derive an optimality system from which candidate solutions for the control problem can be obtained. In order to achieve the numerical solution of this system, a finite element approximation is considered for the discretization, together with an appropriate gradient-type algorithm. A finite element object-oriented library has been developed to obtain a parallel and multigrid computational implementation of the optimality system based on a multiphysics approach. Numerical results of two- and three-dimensional computations show that a possible minimum for the control problem can be computed in a robust and accurate manner.
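The abstract does not state the cost functional explicitly; as an illustration only (the thesis' exact functional, coefficients and boundary lifting are not reproduced), a typical tracking-type formulation constrained by a schematic steady incompressible MHD system reads

\min_{\mathbf{u}}\; J(\mathbf{v},\mathbf{B},\mathbf{u}) =
  \frac{1}{2}\int_{\Omega}\lvert \mathbf{v}-\mathbf{v}_d\rvert^{2}\,d\Omega
  + \frac{\alpha}{2}\int_{\Omega}\lvert \mathbf{u}\rvert^{2}\,d\Omega ,

subject to

\begin{aligned}
(\mathbf{v}\cdot\nabla)\mathbf{v} - \nu\,\Delta\mathbf{v} + \nabla p
  - S\,(\nabla\times\mathbf{B})\times\mathbf{B} &= \mathbf{f}, \\
\eta\,\nabla\times(\nabla\times\mathbf{B}) - \nabla\times(\mathbf{v}\times\mathbf{B}) &= \mathbf{0}, \\
\nabla\cdot\mathbf{v} = 0, \qquad \nabla\cdot\mathbf{B} &= 0,
\end{aligned}

where \mathbf{v}_d is a target velocity field, \mathbf{u} is the control (here, a lifting of the magnetic boundary data entering the state equations), \alpha > 0 is a regularization weight, and \nu, \eta, S are schematic viscosity, resistivity and coupling coefficients. The Lagrange multiplier technique mentioned above then yields adjoint equations and an optimality condition that together form the optimality system.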
Abstract:
The objective of this work is to characterize the genome of chromosome 1 of A. thaliana, a small flowering plant used as a model organism in studies of biology and genetics, on the basis of a recent mathematical model of the genetic code. I analyze and compare different portions of the genome: genes, exons, coding sequences (CDS), introns, long introns, intergenes, untranslated regions (UTR) and regulatory sequences. In order to accomplish the task, I transformed nucleotide sequences into binary sequences based on the definition of the three different dichotomic classes. The descriptive analysis of the binary strings indicates the presence of regularities in each portion of the genome considered. In particular, there are remarkable differences between coding sequences (CDS and exons) and non-coding sequences, suggesting that the frame is important only for coding sequences and that dichotomic classes can be useful to recognize them. Then, I assessed the existence of short-range dependence between binary sequences computed on the basis of the different dichotomic classes. I used three different measures of dependence: the well-known chi-squared test and two indices derived from the concept of entropy, i.e., Mutual Information (MI) and Sρ, a normalized version of the Bhattacharyya-Hellinger-Matusita distance. The results show that there is a significant short-range dependence structure only for the coding sequences, whose existence is a clue to an underlying error detection and correction mechanism. No doubt, further studies are needed in order to assess how the information carried by dichotomic classes could discriminate between coding and non-coding sequences and, therefore, contribute to unveiling the role of the mathematical structure in error detection and correction mechanisms. Still, I have shown the potential of the presented approach for understanding the management of genetic information.
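The dichotomic-class encoding itself is not reproduced here; as a minimal sketch of one of the dependence measures mentioned, the snippet below estimates the mutual information between two binary sequences from their joint empirical frequencies. The sequences are random placeholders, not A. thaliana data.

# Minimal sketch: estimate mutual information (in bits) between two binary
# sequences from their joint empirical frequencies. The sequences here are
# random placeholders, not dichotomic-class encodings of real genome data.
import numpy as np

def mutual_information(x, y):
    x, y = np.asarray(x), np.asarray(y)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = np.mean((x == a) & (y == b))   # joint frequency
            p_a, p_b = np.mean(x == a), np.mean(y == b)
            if p_ab > 0:
                mi += p_ab * np.log2(p_ab / (p_a * p_b))
    return mi

rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=10_000)
y = np.where(rng.random(10_000) < 0.2, 1 - x, x)  # x with 20% of bits flipped
print(mutual_information(x, y))                   # > 0: the sequences are dependent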
Abstract:
Atmospheric aerosol particles directly impact air quality and participate in controlling the climate system. Organic Aerosol (OA) in general accounts for a large fraction (10–90%) of the global submicron (PM1) particulate mass. Chemometric methods for source identification are used in many disciplines, but methods relying on the analysis of NMR datasets are rarely used in atmospheric sciences. This thesis provides an original application of NMR-based chemometric methods to atmospheric OA source apportionment. The method was tested on chemical composition databases obtained from samples collected in different environments in Europe, hence exploring the impact of a great diversity of natural and anthropogenic sources. We focused on sources of water-soluble OA (WSOA), for which NMR analysis provides substantial advantages compared to alternative methods. Different factor analysis techniques were applied independently to NMR datasets from nine field campaigns of the EUCAARI project and allowed the identification of recurrent source contributions to WSOA in the European background troposphere: 1) Marine SOA; 2) Aliphatic amines from ground sources (agricultural activities, etc.); 3) Biomass burning POA; 4) Biogenic SOA from terpene oxidation; 5) "Aged" SOA, including humic-like substances (HULIS); 6) Other factors, possibly including contributions from Primary Biological Aerosol Particles and products of cooking activities. Biomass burning POA accounted for more than 50% of WSOC in winter months. Aged SOA associated with HULIS was predominant (> 75%) during spring and summer, suggesting that secondary sources and transboundary transport become more important in those seasons. The comprehensive aerosol measurements carried out, involving several foreign research groups, provided the opportunity to compare source apportionment results obtained by NMR analysis with those provided by the more widespread Aerodyne aerosol mass spectrometer (AMS) techniques, which now provide OA categorization schemes that are becoming a standard for atmospheric chemists. Results emerging from this thesis partly confirm the AMS classification and partly challenge it.
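The specific factor analysis techniques applied to the NMR datasets are not detailed in the abstract; purely as a generic illustration of the idea, the sketch below decomposes a samples-by-spectral-bins matrix into non-negative factor contributions and factor profiles. The random data and the choice of NMF are assumptions, not the thesis' method.

# Generic illustration of factor analysis for source apportionment:
# decompose a (samples x spectral bins) matrix X into non-negative factor
# contributions W and factor profiles H, so that X ≈ W H. Random placeholders
# stand in for NMR spectra; NMF is only one of several possible techniques.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
true_profiles = rng.random((3, 200))           # 3 hypothetical source profiles
contributions = rng.random((50, 3))            # their strengths in 50 samples
X = contributions @ true_profiles + 0.01 * rng.random((50, 200))

model = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(X)                     # factor contributions per sample
H = model.components_                          # factor spectral profiles
print(W.shape, H.shape)                        # (50, 3) (3, 200)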
Abstract:
The discovery of the Cosmic Microwave Background (CMB) radiation in 1965 is one of the fundamental milestones supporting the Big Bang theory. The CMB is one of the most important sources of information in cosmology. The excellent accuracy of the recent CMB data from the WMAP and Planck satellites confirmed the validity of the standard cosmological model and set a new challenge for the data analysis processes and their interpretation. In this thesis we deal with several aspects and useful tools of the data analysis, focusing on their optimization in order to fully exploit the Planck data and contribute to the final published results. The issues investigated are: the change of coordinates of CMB maps using the HEALPix package, the problem of the aliasing effect in the generation of low-resolution maps, and the comparison of the Angular Power Spectrum (APS) extraction performance of the optimal QML method, implemented in the code called BolPol, and the pseudo-Cl method, implemented in Cromaster. The QML method has then been applied to the Planck data at large angular scales to extract the CMB APS. The same method has also been applied to analyze the TT parity and Low Variance anomalies in the Planck maps, showing a consistent deviation from the standard cosmological model; the possible origins of these results are discussed. The Cromaster code has instead been applied to the 408 MHz and 1.42 GHz surveys, focusing on the analysis of the APS of selected regions of the synchrotron emission. The new generation of CMB experiments will be dedicated to polarization measurements, for which high-accuracy devices for separating the polarizations are necessary. Here a new technology, called Photonic Crystals, is exploited to develop a new polarization splitter device, and its performance is compared to that of the devices used nowadays.
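As a minimal illustration of two of the map-level operations mentioned (coordinate rotation and pseudo-Cl spectrum estimation), the sketch below assumes the healpy Python bindings to HEALPix and a synthetic map in place of Planck data; the QML estimator in BolPol and the Cromaster pipeline are not reproduced.

# Minimal illustration with healpy (HEALPix Python bindings): rotate a map
# between coordinate frames, degrade it to low resolution, and extract a
# pseudo-Cl angular power spectrum with anafast. A synthetic map stands in
# for Planck data.
import healpy as hp
import numpy as np

nside = 64
cl_in = 1.0 / (np.arange(1, 3 * nside) ** 2)            # toy input spectrum
m = hp.synfast(np.concatenate(([0.0], cl_in)), nside)   # synthetic map

rot = hp.Rotator(coord=["G", "E"])                       # Galactic -> Ecliptic
m_ecl = rot.rotate_map_pixel(m)

# Smooth before degrading to limit aliasing of small-scale power,
# then produce a low-resolution map.
m_low = hp.ud_grade(hp.smoothing(m_ecl, fwhm=np.radians(3.0)), nside_out=16)

cl = hp.anafast(m, lmax=2 * nside)                       # pseudo-Cl estimate
print(cl[:5])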
Abstract:
The present study has been carried out with the following objectives: i) to investigate the attributes of the source parameters of local and regional earthquakes; ii) to estimate, as accurately as possible, M0, fc, Δσ and their standard errors, to infer their relationship with source size; iii) to quantify high-frequency earthquake ground motion and to study source scaling. This work is based on observational data of micro-, small and moderate earthquakes for three selected seismic sequences, namely Parkfield (CA, USA), Maule (Chile) and Ferrara (Italy). For the Parkfield seismic sequence (CA), a data set of 757 repeating micro-earthquakes (0 ≤ MW ≤ 2) in 42 clusters, collected using the borehole High Resolution Seismic Network (HRSN), has been analyzed and interpreted. We used the coda methodology to compute spectral ratios and obtain accurate values of fc, Δσ, and M0 for three target clusters (San Francisco, Los Angeles, and Hawaii) of our data. We also performed a general regression on peak ground velocities to obtain reliable seismic spectra of all earthquakes. For the Maule seismic sequence, a data set of 172 aftershocks of the 2010 MW 8.8 earthquake (3.7 ≤ MW ≤ 6.2), recorded by more than 100 temporary broadband stations, has been analyzed and interpreted to quantify high-frequency earthquake ground motion in this subduction zone. We completely calibrated the excitation and attenuation of the ground motion in Central Chile. For the Ferrara sequence, we calculated moment tensor solutions for 20 events, from MW 5.63 (the largest main event, which occurred on May 20, 2012) down to MW 3.2, using a 1-D velocity model for the crust beneath the Pianura Padana and all the geophysical and geological information available for the area. The PADANIA model allowed a numerical study of the characteristics of the ground motion in the thick sediments of the flood plain.
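The coda spectral-ratio and moment-tensor procedures themselves are not reproduced here; as a minimal illustration of how source parameters such as fc and Δσ relate to a displacement spectrum, the sketch below fits a standard Brune-type omega-square model to a synthetic spectrum and derives a stress-drop estimate. All numerical values (including the seismic moment) are placeholders, not results from the thesis.

# Minimal illustration (not the thesis' coda spectral-ratio method): fit a
# Brune omega-square source model to a synthetic displacement spectrum and
# derive a stress-drop estimate from standard relations.
import numpy as np
from scipy.optimize import curve_fit

def brune(f, omega0, fc):
    """Brune (1970) omega-square displacement spectrum."""
    return omega0 / (1.0 + (f / fc) ** 2)

f = np.logspace(-1, 2, 200)                       # frequency, Hz
rng = np.random.default_rng(0)
obs = brune(f, 1e-4, 5.0) * (1 + 0.05 * rng.standard_normal(f.size))

(omega0, fc), _ = curve_fit(brune, f, obs, p0=(1e-5, 1.0))

beta = 3500.0                                      # shear-wave speed, m/s
m0 = 1e13                                          # seismic moment, N*m (placeholder)
r = 0.37 * beta / fc                               # Brune source radius, m
stress_drop = 7.0 / 16.0 * m0 / r ** 3             # stress drop, Pa
print(f"fc = {fc:.2f} Hz, radius = {r:.0f} m, stress drop = {stress_drop / 1e6:.2f} MPa")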