6 results for Distributed multimedia systems
in AMS Tesi di Laurea - Alm@DL - Università di Bologna
Abstract:
This thesis presents a study of Grid data access patterns in distributed analysis in the CMS experiment at the LHC accelerator. The study ranges from an in-depth analysis of the historical access patterns for the most relevant data types in CMS to the use of a supervised Machine Learning classification system to set up machinery able to predict future data access patterns (i.e. the so-called “popularity” of the CMS datasets on the Grid), with a focus on specific data types. All CMS workflows run on the computing centers (Tiers) of the Worldwide LHC Computing Grid (WLCG), and in particular the distributed analysis system sustains hundreds of users and applications submitted every day. These applications (or “jobs”) access different data types hosted on disk storage systems at a large set of WLCG Tiers. A detailed study of how this data is accessed, in terms of data types, hosting Tiers, and time periods, makes it possible to gain valuable insight into storage occupancy over time and into the different access patterns, and ultimately to derive suggested actions from this information (e.g. targeted disk clean-up and/or data replication). In this sense, applying Machine Learning techniques makes it possible to learn from past data and to gain predictive power for future CMS data access patterns. Chapter 1 provides an introduction to High Energy Physics at the LHC. Chapter 2 describes the CMS Computing Model, with special focus on the data management sector, and also discusses the concept of dataset popularity. Chapter 3 describes the study of CMS data access patterns at different levels of depth. Chapter 4 offers a brief introduction to basic machine learning concepts, introduces their application in CMS, and discusses the results obtained with this approach in the context of this thesis.
Abstract:
Decentralized systems have allowed users to share information without a centralized intermediary that holds sovereignty over the exchanged data, and without the associated security risks and potential bottlenecks. However, practical systems for retrieving the information stored on them that do not include a centralized component are rare. This thesis presents the development of an application whose purpose is to let users upload images to a fully decentralized architecture, by means of Decentralized File Storage, and then search for and retrieve those objects through a Distributed Hash Table (DHT) in which the necessary Content IDentifiers (CIDs) are stored. The main objective was to find a better allocation of the images within the DHT through the use of the International Standard Content Code (ISCC), an ISO standard that, through content-driven, locality-sensitive and similarity-preserving hash functions, assigns the IPFS CIDs of the images to the DHT nodes efficiently, in order to minimize the hops between nodes and to retrieve images consistent with the query performed. The results obtained from allocating the image CIDs to the nodes are then analyzed, comparing ISCC with the SHA-256 cryptographic hash, to verify whether ISCC better represents the similarity between images by allocating similar images to nodes close to one another.
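The contrast between a similarity-preserving hash and a cryptographic hash can be illustrated with a small sketch. This is not the ISCC algorithm and not the thesis implementation: it uses a toy SimHash as a stand-in for a similarity-preserving code, hypothetical string "features" in place of real image descriptors, and Hamming distance between 64-bit identifiers as an assumed proxy for closeness in the DHT key space.

```python
import hashlib

BITS = 64  # identifier width used in this sketch (an assumption)


def sha256_id(data: bytes) -> int:
    """Cryptographic identifier: the first 64 bits of SHA-256."""
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def simhash_id(features: list[str]) -> int:
    """Toy SimHash: every feature votes on each bit of the identifier."""
    votes = [0] * BITS
    for feature in features:
        h = sha256_id(feature.encode())
        for bit in range(BITS):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    # A bit is set when the majority of features voted for it.
    return sum(1 << bit for bit in range(BITS) if votes[bit] > 0)


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two identifiers."""
    return bin(a ^ b).count("1")


# Hypothetical feature descriptors standing in for image content.
img_a = [f"patch{i}" for i in range(100)]
img_b = img_a[:95] + [f"other{i}" for i in range(5)]  # near-duplicate of img_a
img_c = [f"unrelated{i}" for i in range(100)]         # unrelated image

sim_near = hamming(simhash_id(img_a), simhash_id(img_b))
sim_far = hamming(simhash_id(img_a), simhash_id(img_c))
sha_near = hamming(sha256_id(" ".join(img_a).encode()),
                   sha256_id(" ".join(img_b).encode()))

print("SimHash distance, similar images:  ", sim_near)
print("SimHash distance, unrelated images:", sim_far)
print("SHA-256 distance, similar images:  ", sha_near)
```

Under these assumptions, the similar pair ends up much closer under the toy SimHash than the unrelated pair, whereas the SHA-256 identifiers of the similar pair carry no information about their similarity.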
Abstract:
In this thesis, a tube-based Distributed Economic Predictive Control (DEPC) scheme is presented for a group of dynamically coupled linear subsystems. These subsystems are components of a large-scale system, and their control inputs are computed by optimizing a local economic objective. Each subsystem interacts with its neighbors by sending its future reference trajectory at each sampling time. It then solves a local optimization problem in parallel with the others, based on the future reference trajectories received from the other subsystems. To ensure recursive feasibility and a performance bound, each subsystem is constrained not to deviate too much from its communicated reference trajectory. The difference between the planned trajectory and the communicated one is interpreted as a disturbance at the local level. Then, to ensure that both state and input constraints are satisfied, these constraints are tightened by explicitly accounting for the effect of the local disturbances. The proposed approach averages over all possible disturbances and handles tightened state and input constraints, while satisfying the compatibility constraints that guarantee that the actual trajectory lies within a certain bound in the neighborhood of the reference one. Each subsystem optimizes an arbitrary local economic objective function in parallel, subject to a local terminal constraint that guarantees recursive feasibility. In this framework, economic performance guarantees for a tube-based distributed predictive control (DPC) scheme are developed rigorously. It is shown that the closed-loop nominal subsystem has a local robust average performance bound that is no worse than that of a local robust steady state. Since a robust algorithm is applied to the states of the real (disturbed) subsystems, this bound can be interpreted as an average performance result for the real closed-loop system. Finally, we present our results on local and global performance, illustrated by a numerical example.
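One common way to write the constraint tightening and the compatibility constraint described above is sketched below. The notation is assumed for illustration and is not taken from the thesis: $\bar{x}_i$ and $\bar{u}_i$ are the nominal state and input of subsystem $i$, $\mathbb{X}_i$ and $\mathbb{U}_i$ the original constraint sets, $\hat{x}_i$ the communicated reference trajectory, $\Omega_i$ a robust positively invariant set for the local disturbance, $K_i$ a local feedback gain, $\epsilon_i$ the allowed deviation, $N$ the prediction horizon, and $\ominus$ the Pontryagin set difference.

```latex
\begin{align}
  \bar{x}_i(k \mid t) &\in \mathbb{X}_i \ominus \Omega_i, \\
  \bar{u}_i(k \mid t) &\in \mathbb{U}_i \ominus K_i \Omega_i, \\
  \lVert \bar{x}_i(k \mid t) - \hat{x}_i(k \mid t) \rVert &\le \epsilon_i,
  \qquad k = 0, \dots, N-1.
\end{align}
```

The first two conditions shrink the admissible sets so that the real, disturbed trajectory still satisfies the original constraints; the third bounds how far the planned trajectory may drift from the communicated one.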
Abstract:
In this thesis, we state the collision avoidance problem as a vertex covering problem, and then consider a distributed framework in which a team of cooperating Unmanned Vehicles (UVs) aims to solve this optimization problem cooperatively, so as to guarantee collision avoidance between group members. For this purpose, we implement a distributed control scheme based on a robust Set-Theoretic Model Predictive Control (ST-MPC) strategy, where the problem involves vehicles with independent dynamics but coupled constraints, in order to capture the required cooperative behavior.
Abstract:
Cloud computing enables independent end users and applications to share data and pooled resources, possibly located in geographically distributed Data Centers, in a fully transparent way. This need is particularly felt by scientific applications, which must exploit distributed resources in an efficient and scalable way to process large amounts of data. This paper proposes an open solution to deploy a Platform as a Service (PaaS) over a set of multi-site data centers by applying open source virtualization tools to facilitate operation among virtual machines while optimizing the usage of distributed resources. An experimental testbed is set up in an OpenStack environment and evaluated with different types of sample TCP connections, to demonstrate the functionality of the proposed solution and to obtain throughput measurements in relation to relevant design parameters.
Abstract:
LHC experiments produce an enormous amount of data, estimated to be of the order of a few petabytes per year. Data management relies on the Worldwide LHC Computing Grid (WLCG) infrastructure, both for storage and for processing operations. In recent years, however, many more resources have become available on High Performance Computing (HPC) farms, which generally have many computing nodes with a high number of processors. Large collaborations are working to use these resources in the most efficient way, compatibly with the constraints imposed by computing models (data distributed on the Grid, authentication, software dependencies, etc.). The aim of this thesis project is to develop a software framework that allows users to process a typical data analysis workflow of the ATLAS experiment on HPC systems. The developed analysis framework shall be deployed on the computing resources of the Open Physics Hub project and on the CINECA Marconi100 cluster, in view of the switch-on of the Leonardo supercomputer, foreseen for 2023.