1 result for Distributed Virtual Environments
at Duke University
Abstract:
Distributed computing frameworks belong to a class of programming models that allow developers to
launch workloads on large clusters of machines. Due to the dramatic increase in the volume of
data gathered by ubiquitous computing devices, data analytics workloads have become a common
case among distributed computing applications, making data science a field of computer science
in its own right. We argue that a data scientist's concerns lie in three main components: a dataset,
a sequence of operations they wish to apply to this dataset, and any constraints
related to their work (performance, QoS, budget, etc.). However, it is extremely
difficult to perform data science without domain expertise. One needs to select the right amount
and type of resources, pick a framework, and configure it. Moreover, users often run their
applications in shared environments governed by schedulers that expect them to specify their resource
needs precisely. Because these frameworks are distributed and concurrent by nature, monitoring and
profiling are hard, high-dimensional problems that prevent users from making the right
configuration choices and determining the right amount of resources they need. Paradoxically, the
system gathers a large amount of monitoring data at runtime, which remains unused.
In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit
monitoring data to learn about workloads and to turn user requests into a tailored execution
context. In this work, we study different techniques that have been used to make steps toward
such system awareness, and explore a new way to do so by implementing machine learning
techniques to recommend a specific subset of system configurations for Apache Spark applications.
Furthermore, we present an in-depth study of Apache Spark executor configuration, which highlights
the complexity of choosing the best configuration for a given workload.
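
To give a rough sense of why executor configuration is a non-trivial choice, the sketch below enumerates the executor layouts that fit on a small homogeneous cluster. It is an illustration only, not the method studied in this work: the function name, the cluster dimensions, and the simple "divide each node evenly" rule are assumptions, and it ignores memory overhead and the resources reserved for the driver. The property names `spark.executor.cores`, `spark.executor.memory`, and `spark.executor.instances` are standard Spark settings.

```python
from typing import Dict, List


def candidate_executor_configs(
    nodes: int, cores_per_node: int, mem_gb_per_node: int
) -> List[Dict[str, str]]:
    """Enumerate executor layouts that fit on a homogeneous cluster.

    Hypothetical helper for illustration: it splits each node evenly among
    executors and ignores overhead memory and driver resources.
    """
    configs = []
    for cores in range(1, cores_per_node + 1):
        # How many executors of this size fit on one node.
        executors_per_node = cores_per_node // cores
        # Share the node's memory evenly among those executors.
        mem_gb = mem_gb_per_node // executors_per_node
        if mem_gb < 1:
            continue  # skip layouts that leave less than 1 GB per executor
        configs.append({
            "spark.executor.cores": str(cores),
            "spark.executor.memory": f"{mem_gb}g",
            "spark.executor.instances": str(nodes * executors_per_node),
        })
    return configs


if __name__ == "__main__":
    # Assumed cluster: 4 nodes, 8 cores and 32 GB of memory per node.
    for conf in candidate_executor_configs(nodes=4, cores_per_node=8, mem_gb_per_node=32):
        print(conf)
```

Even this toy cluster yields eight distinct layouts, ranging from many small executors to a few large ones, and each can behave differently depending on the workload. Any of the resulting dictionaries could be passed to an application as `--conf key=value` pairs on `spark-submit`.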