1 result for Transition P System

at Duke University


Relevance:

80.00%

Publisher:

Abstract:

Distributed computing frameworks belong to a class of programming models that allow developers to launch workloads on large clusters of machines. Due to the dramatic increase in the volume of data gathered by ubiquitous computing devices, data analytic workloads have become a common case among distributed computing applications, making Data Science an entire field of Computer Science. We argue that a Data Scientist's concerns lie in three main components: a dataset, a sequence of operations they wish to apply to this dataset, and some constraints they may have related to their work (performance, QoS, budget, etc.). However, it is actually extremely difficult, without domain expertise, to perform data science. One needs to select the right amount and type of resources, pick a framework, and configure it. Also, users often run their applications in shared environments, ruled by schedulers that expect them to specify their resource needs precisely. Inherent to the distributed and concurrent nature of the cited frameworks, monitoring and profiling are hard, high-dimensional problems that keep users from making the right configuration choices and from determining the right amount of resources they need. Paradoxically, the system gathers a large amount of monitoring data at runtime, which remains unused. In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit monitoring data to learn about workloads, and to turn user requests into a tailored execution context. In this work, we study different techniques that have been used to take steps toward such system awareness, and explore a new way to do so by applying machine learning techniques to recommend a specific subset of system configurations for Apache Spark applications. Furthermore, we present an in-depth study of Apache Spark executor configuration, which highlights the complexity of choosing the best one for a given workload.
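
To make the executor configuration space the abstract refers to concrete, the sketch below sets a few standard Spark executor properties (spark.executor.instances, spark.executor.cores, spark.executor.memory) through PySpark. The application name and the numeric values are illustrative assumptions, not settings taken from the thesis; the right combination depends on the workload and cluster, which is exactly the difficulty the abstract points out.

    from pyspark.sql import SparkSession

    # Illustrative values only; good settings depend on the workload and cluster.
    spark = (
        SparkSession.builder
        .appName("executor-config-sketch")          # hypothetical application name
        .config("spark.executor.instances", "4")    # number of executors requested
        .config("spark.executor.cores", "2")        # CPU cores per executor
        .config("spark.executor.memory", "4g")      # JVM heap per executor
        .getOrCreate()
    )

    # A trivial job to exercise the chosen configuration.
    df = spark.range(1_000_000)
    print(df.selectExpr("sum(id)").collect())

    spark.stop()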