2 resultados para 080304 Concurrent Programming
em Duke University
                                
Resumo:
Nucleic Acid hairpins have been a subject of study for the last four decades. They are composed of single strand that is
hybridized to itself, and the central section forming an unhybridized loop. In nature, they stabilize single stranded RNA, serve as nucleation
sites for RNA folding, protein recognition signals, mRNA localization and regulation of mRNA degradation. On the other hand,
DNA hairpins in biological contexts have been studied with respect to forming cruciform structures that can regulate gene expression.
The use of DNA hairpins as fuel for synthetic molecular devices, including locomotion, was proposed and experimental demonstrated in 2003. They
were interesting because they bring to the table an on-demand energy/information supply mechanism.
The energy/information is hidden (from hybridization) in the hairpin’s loop, until required.
The energy/information is harnessed by opening the stem region, and exposing the single stranded loop section.
The loop region is now free for possible hybridization and help move the system into a thermodynamically favourable state.
The hidden energy and information coupled with
programmability provides another functionality, of selectively choosing what reactions to hide and
what reactions to allow to proceed, that helps develop a topological sequence of events.
Hairpins have been utilized as a source of fuel for many different DNA devices. In this thesis, we program four different
molecular devices using DNA hairpins, and experimentally validate them in the
laboratory. 1) The first device: A
novel enzyme-free autocatalytic self-replicating system composed entirely of DNA that operates isothermally. 2) The second
device: Time-Responsive Circuits using DNA have two properties: a) asynchronous: the final output is always correct
regardless of differences in the arrival time of different inputs.
b) renewable circuits which can be used multiple times without major degradation of the gate motifs
(so if the inputs change over time, the DNA-based circuit can re-compute the output correctly based on the new inputs).
3) The third device: Activatable tiles are a theoretical extension to the Tile assembly model that enhances
its robustness by protecting the sticky sides of tiles until a tile is partially incorporated into a growing assembly.
4) The fourth device: Controlled Amplification of DNA catalytic system: a device such that the amplification
of the system does not run uncontrollably until the system runs out of fuel, but instead achieves a finite
amount of gain.
Nucleic acid circuits with the ability
to perform complex logic operations have many potential practical applications, for example the ability to achieve point of care diagnostics.
We discuss the designs of our DNA Hairpin molecular devices, the results we have obtained, and the challenges we have overcome
to make these truly functional.
                                
Resumo:
Distributed Computing frameworks belong to a class of programming models that allow developers to
launch workloads on large clusters of machines. Due to the dramatic increase in the volume of
data gathered by ubiquitous computing devices, data analytic workloads have become a common
case among distributed computing applications, making Data Science an entire field of
Computer Science. We argue that Data Scientist's concern lays in three main components: a dataset,
a sequence of operations they wish to apply on this dataset, and some constraint they may have
related to their work (performances, QoS, budget, etc). However, it is actually extremely
difficult, without domain expertise, to perform data science. One need to select the right amount
and type of resources, pick up a framework, and configure it. Also, users are often running their
application in shared environments, ruled by schedulers expecting them to specify precisely their resource
needs. Inherent to the distributed and concurrent nature of the cited frameworks, monitoring and
profiling are hard, high dimensional problems that block users from making the right
configuration choices and determining the right amount of resources they need. Paradoxically, the
system is gathering a large amount of monitoring data at runtime, which remains unused.
In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit
monitoring data to learn about workloads, and process user requests into a tailored execution
context. In this work, we study different techniques that have been used to make steps toward
such system awareness, and explore a new way to do so by implementing machine learning
techniques to recommend a specific subset of system configurations for Apache Spark applications.
Furthermore, we present an in depth study of Apache Spark executors configuration, which highlight
the complexity in choosing the best one for a given workload.
 
                    