2 resultados para Natural Language Processing,Recommender Systems,Android,Applicazione mobile
em Duke University
Resumo:
Distributed Computing frameworks belong to a class of programming models that allow developers to
launch workloads on large clusters of machines. Due to the dramatic increase in the volume of
data gathered by ubiquitous computing devices, data analytic workloads have become a common
case among distributed computing applications, making Data Science an entire field of
Computer Science. We argue that Data Scientist's concern lays in three main components: a dataset,
a sequence of operations they wish to apply on this dataset, and some constraint they may have
related to their work (performances, QoS, budget, etc). However, it is actually extremely
difficult, without domain expertise, to perform data science. One need to select the right amount
and type of resources, pick up a framework, and configure it. Also, users are often running their
application in shared environments, ruled by schedulers expecting them to specify precisely their resource
needs. Inherent to the distributed and concurrent nature of the cited frameworks, monitoring and
profiling are hard, high dimensional problems that block users from making the right
configuration choices and determining the right amount of resources they need. Paradoxically, the
system is gathering a large amount of monitoring data at runtime, which remains unused.
In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit
monitoring data to learn about workloads, and process user requests into a tailored execution
context. In this work, we study different techniques that have been used to make steps toward
such system awareness, and explore a new way to do so by implementing machine learning
techniques to recommend a specific subset of system configurations for Apache Spark applications.
Furthermore, we present an in depth study of Apache Spark executors configuration, which highlight
the complexity in choosing the best one for a given workload.
Resumo:
Secure Access For Everyone (SAFE), is an integrated system for managing trust
using a logic-based declarative language. Logical trust systems authorize each
request by constructing a proof from a context---a set of authenticated logic
statements representing credentials and policies issued by various principals
in a networked system. A key barrier to practical use of logical trust systems
is the problem of managing proof contexts: identifying, validating, and
assembling the credentials and policies that are relevant to each trust
decision.
SAFE addresses this challenge by (i) proposing a distributed authenticated data
repository for storing the credentials and policies; (ii) introducing a
programmable credential discovery and assembly layer that generates the
appropriate tailored context for a given request. The authenticated data
repository is built upon a scalable key-value store with its contents named by
secure identifiers and certified by the issuing principal. The SAFE language
provides scripting primitives to generate and organize logic sets representing
credentials and policies, materialize the logic sets as certificates, and link
them to reflect delegation patterns in the application. The authorizer fetches
the logic sets on demand, then validates and caches them locally for further
use. Upon each request, the authorizer constructs the tailored proof context
and provides it to the SAFE inference for certified validation.
Delegation-driven credential linking with certified data distribution provides
flexible and dynamic policy control enabling security and trust infrastructure
to be agile, while addressing the perennial problems related to today's
certificate infrastructure: automated credential discovery, scalable
revocation, and issuing credentials without relying on centralized authority.
We envision SAFE as a new foundation for building secure network systems. We
used SAFE to build secure services based on case studies drawn from practice:
(i) a secure name service resolver similar to DNS that resolves a name across
multi-domain federated systems; (ii) a secure proxy shim to delegate access
control decisions in a key-value store; (iii) an authorization module for a
networked infrastructure-as-a-service system with a federated trust structure
(NSF GENI initiative); and (iv) a secure cooperative data analytics service
that adheres to individual secrecy constraints while disclosing the data. We
present empirical evaluation based on these case studies and demonstrate that
SAFE supports a wide range of applications with low overhead.