398 resultados para Runtime
Resumo:
Next generation electronic devices have to guarantee high performance while being less power-consuming and highly reliable for several application domains ranging from the entertainment to the business. In this context, multicore platforms have proven the most efficient design choice but new challenges have to be faced. The ever-increasing miniaturization of the components produces unexpected variations on technological parameters and wear-out characterized by soft and hard errors. Even though hardware techniques, which lend themselves to be applied at design time, have been studied with the objective to mitigate these effects, they are not sufficient; thus software adaptive techniques are necessary. In this thesis we focus on multicore task allocation strategies to minimize the energy consumption while meeting performance constraints. We firstly devise a technique based on an Integer Linear Problem formulation which provides the optimal solution but cannot be applied on-line since the algorithm it needs is time-demanding; then we propose a sub-optimal technique based on two steps which can be applied on-line. We demonstrate the effectiveness of the latter solution through an exhaustive comparison against the optimal solution, state-of-the-art policies, and variability-agnostic task allocations by running multimedia applications on the virtual prototype of a next generation industrial multicore platform. We also face the problem of the performance and lifetime degradation. We firstly focus on embedded multicore platforms and propose an idleness distribution policy that increases core expected lifetimes by duty cycling their activity; then, we investigate the use of micro thermoelectrical coolers in general-purpose multicore processors to control the temperature of the cores at runtime with the objective of meeting lifetime constraints without performance loss.
Resumo:
The continuous advancements and enhancements of wireless systems are enabling new compelling scenarios where mobile services can adapt according to the current execution context, represented by the computational resources available at the local device, current physical location, people in physical proximity, and so forth. Such services called context-aware require the timely delivery of all relevant information describing the current context, and that introduces several unsolved complexities, spanning from low-level context data transmission up to context data storage and replication into the mobile system. In addition, to ensure correct and scalable context provisioning, it is crucial to integrate and interoperate with different wireless technologies (WiFi, Bluetooth, etc.) and modes (infrastructure-based and ad-hoc), and to use decentralized solutions to store and replicate context data on mobile devices. These challenges call for novel middleware solutions, here called Context Data Distribution Infrastructures (CDDIs), capable of delivering relevant context data to mobile devices, while hiding all the issues introduced by data distribution in heterogeneous and large-scale mobile settings. This dissertation thoroughly analyzes CDDIs for mobile systems, with the main goal of achieving a holistic approach to the design of such type of middleware solutions. We discuss the main functions needed by context data distribution in large mobile systems, and we claim the precise definition and clean respect of quality-based contracts between context consumers and CDDI to reconfigure main middleware components at runtime. We present the design and the implementation of our proposals, both in simulation-based and in real-world scenarios, along with an extensive evaluation that confirms the technical soundness of proposed CDDI solutions. Finally, we consider three highly heterogeneous scenarios, namely disaster areas, smart campuses, and smart cities, to better remark the wide technical validity of our analysis and solutions under different network deployments and quality constraints.
Resumo:
We present new algorithms to approximate the discrete volume of a polyhedral geometry using boxes defined by the US standard SAE J1100. This problem is NP-hard and has its main application in the car design process. The algorithms produce maximum weighted independent sets on a so-called conflict graph for a discretisation of the geometry. We present a framework to eliminate a large portion of the vertices of a graph without affecting the quality of the optimal solution. Using this framework we are also able to define the conflict graph without the use of a discretisation. For the solution of the maximum weighted independent set problem we designed an enumeration scheme which uses the restrictions of the SAE J1100 standard for an efficient upper bound computation. We evaluate the packing algorithms according to the solution quality compared to manually derived results. Finally, we compare our enumeration scheme to several other exact algorithms in terms of their runtime. Grid-based packings either tend to be not tight or have intersections between boxes. We therefore present an algorithm which can compute box packings with arbitrary placements and fixed orientations. In this algorithm we make use of approximate Minkowski Sums, computed by uniting many axis-oriented equal boxes. We developed an algorithm which computes the union of equal axis-oriented boxes efficiently. This algorithm also maintains the Minkowski Sums throughout the packing process. We also extend these algorithms for packing arbitrary objects in fixed orientations.
Resumo:
Constructing ontology networks typically occurs at design time at the hands of knowledge engineers who assemble their components statically. There are, however, use cases where ontology networks need to be assembled upon request and processed at runtime, without altering the stored ontologies and without tampering with one another. These are what we call "virtual [ontology] networks", and keeping track of how an ontology changes in each virtual network is called "multiplexing". Issues may arise from the connectivity of ontology networks. In many cases, simple flat import schemes will not work, because many ontology managers can cause property assertions to be erroneously interpreted as annotations and ignored by reasoners. Also, multiple virtual networks should optimize their cumulative memory footprint, and where they cannot, this should occur for very limited periods of time. We claim that these problems should be handled by the software that serves these ontology networks, rather than by ontology engineering methodologies. We propose a method that spreads multiple virtual networks across a 3-tier structure, and can reduce the amount of erroneously interpreted axioms, under certain raw statement distributions across the ontologies. We assumed OWL as the core language handled by semantic applications in the framework at hand, due to the greater availability of reasoners and rule engines. We also verified that, in common OWL ontology management software, OWL axiom interpretation occurs in the worst case scenario of pre-order visit. To measure the effectiveness and space-efficiency of our solution, a Java and RESTful implementation was produced within an Apache project. We verified that a 3-tier structure can accommodate reasonably complex ontology networks better, in terms of the expressivity OWL axiom interpretation, than flat-tree import schemes can. We measured both the memory overhead of the additional components we put on top of traditional ontology networks, and the framework's caching capabilities.
Resumo:
Modern embedded systems embrace many-core shared-memory designs. Due to constrained power and area budgets, most of them feature software-managed scratchpad memories instead of data caches to increase the data locality. It is therefore programmers’ responsibility to explicitly manage the memory transfers, and this make programming these platform cumbersome. Moreover, complex modern applications must be adequately parallelized before they can the parallel potential of the platform into actual performance. To support this, programming languages were proposed, which work at a high level of abstraction, and rely on a runtime whose cost hinders performance, especially in embedded systems, where resources and power budget are constrained. This dissertation explores the applicability of the shared-memory paradigm on modern many-core systems, focusing on the ease-of-programming. It focuses on OpenMP, the de-facto standard for shared memory programming. In a first part, the cost of algorithms for synchronization and data partitioning are analyzed, and they are adapted to modern embedded many-cores. Then, the original design of an OpenMP runtime library is presented, which supports complex forms of parallelism such as multi-level and irregular parallelism. In the second part of the thesis, the focus is on heterogeneous systems, where hardware accelerators are coupled to (many-)cores to implement key functional kernels with orders-of-magnitude of speedup and energy efficiency compared to the “pure software” version. However, three main issues rise, namely i) platform design complexity, ii) architectural scalability and iii) programmability. To tackle them, a template for a generic hardware processing unit (HWPU) is proposed, which share the memory banks with cores, and the template for a scalable architecture is shown, which integrates them through the shared-memory system. Then, a full software stack and toolchain are developed to support platform design and to let programmers exploiting the accelerators of the platform. The OpenMP frontend is extended to interact with it.
Resumo:
Despite the several issues faced in the past, the evolutionary trend of silicon has kept its constant pace. Today an ever increasing number of cores is integrated onto the same die. Unfortunately, the extraordinary performance achievable by the many-core paradigm is limited by several factors. Memory bandwidth limitation, combined with inefficient synchronization mechanisms, can severely overcome the potential computation capabilities. Moreover, the huge HW/SW design space requires accurate and flexible tools to perform architectural explorations and validation of design choices. In this thesis we focus on the aforementioned aspects: a flexible and accurate Virtual Platform has been developed, targeting a reference many-core architecture. Such tool has been used to perform architectural explorations, focusing on instruction caching architecture and hybrid HW/SW synchronization mechanism. Beside architectural implications, another issue of embedded systems is considered: energy efficiency. Near Threshold Computing is a key research area in the Ultra-Low-Power domain, as it promises a tenfold improvement in energy efficiency compared to super-threshold operation and it mitigates thermal bottlenecks. The physical implications of modern deep sub-micron technology are severely limiting performance and reliability of modern designs. Reliability becomes a major obstacle when operating in NTC, especially memory operation becomes unreliable and can compromise system correctness. In the present work a novel hybrid memory architecture is devised to overcome reliability issues and at the same time improve energy efficiency by means of aggressive voltage scaling when allowed by workload requirements. Variability is another great drawback of near-threshold operation. The greatly increased sensitivity to threshold voltage variations in today a major concern for electronic devices. We introduce a variation-tolerant extension of the baseline many-core architecture. By means of micro-architectural knobs and a lightweight runtime control unit, the baseline architecture becomes dynamically tolerant to variations.
Resumo:
Modern software systems, in particular distributed ones, are everywhere around us and are at the basis of our everyday activities. Hence, guaranteeing their cor- rectness, consistency and safety is of paramount importance. Their complexity makes the verification of such properties a very challenging task. It is natural to expect that these systems are reliable and above all usable. i) In order to be reliable, compositional models of software systems need to account for consistent dynamic reconfiguration, i.e., changing at runtime the communication patterns of a program. ii) In order to be useful, compositional models of software systems need to account for interaction, which can be seen as communication patterns among components which collaborate together to achieve a common task. The aim of the Ph.D. was to develop powerful techniques based on formal methods for the verification of correctness, consistency and safety properties related to dynamic reconfiguration and communication in complex distributed systems. In particular, static analysis techniques based on types and type systems appeared to be an adequate methodology, considering their success in guaranteeing not only basic safety properties, but also more sophisticated ones like, deadlock or livelock freedom in a concurrent setting. The main contributions of this dissertation are twofold. i) On the components side: we design types and a type system for a concurrent object-oriented calculus to statically ensure consistency of dynamic reconfigurations related to modifications of communication patterns in a program during execution time. ii) On the communication side: we study advanced safety properties related to communication in complex distributed systems like deadlock-freedom, livelock- freedom and progress. Most importantly, we exploit an encoding of types and terms of a typical distributed language, session π-calculus, into the standard typed π- calculus, in order to understand their expressive power.
Resumo:
La tesi tratta il progetto di una piattaforma di simulazione per modulazioni back-scatter UWB su un sistema a microcontrollore PIC. Il sistema utilizza uno switch a UWB per eseguire la modulazione, modificando le condizioni di carico d'antenna; il software implementa la modulazione attraverso la variazione del segnale di controllo dello switch e si interfaccia con l'utente attraverso l'uso di una periferica USB, permettendo la modifica runtime della configurazione. Si interfaccia inoltre con strumenti esterni attraverso segnali di sincronizzazione.
Resumo:
Si continua il lavoro riguardo il porting di UMView su android studiando i problemi affrontati nelle tesi precedenti e suggerendo soluzioni per superarli. Sono stati effettuati anche esperimenti sulla nuova runtime di Android, ART, per studiare casi di possibili incompatibilità ed eventuali differenze di gestione rispetto a Dalvik.
Resumo:
Geometric packing problems may be formulated mathematically as constrained optimization problems. But finding a good solution is a challenging task. The more complicated the geometry of the container or the objects to be packed, the more complex the non-penetration constraints become. In this work we propose the use of a physics engine that simulates a system of colliding rigid bodies. It is a tool to resolve interpenetration conflicts and to optimize configurations locally. We develop an efficient and easy-to-implement physics engine that is specialized for collision detection and contact handling. In succession of the development of this engine a number of novel algorithms for distance calculation and intersection volume were designed and imple- mented, which are presented in this work. They are highly specialized to pro- vide fast responses for cuboids and triangles as input geometry whereas the concepts they are based on can easily be extended to other convex shapes. Especially noteworthy in this context is our ε-distance algorithm - a novel application that is not only very robust and fast but also compact in its im- plementation. Several state-of-the-art third party implementations are being presented and we show that our implementations beat them in runtime and robustness. The packing algorithm that lies on top of the physics engine is a Monte Carlo based approach implemented for packing cuboids into a container described by a triangle soup. We give an implementation for the SAE J1100 variant of the trunk packing problem. We compare this implementation to several established approaches and we show that it gives better results in faster time than these existing implementations.
Resumo:
Data sets describing the state of the earth's atmosphere are of great importance in the atmospheric sciences. Over the last decades, the quality and sheer amount of the available data increased significantly, resulting in a rising demand for new tools capable of handling and analysing these large, multidimensional sets of atmospheric data. The interdisciplinary work presented in this thesis covers the development and the application of practical software tools and efficient algorithms from the field of computer science, aiming at the goal of enabling atmospheric scientists to analyse and to gain new insights from these large data sets. For this purpose, our tools combine novel techniques with well-established methods from different areas such as scientific visualization and data segmentation. In this thesis, three practical tools are presented. Two of these tools are software systems (Insight and IWAL) for different types of processing and interactive visualization of data, the third tool is an efficient algorithm for data segmentation implemented as part of Insight.Insight is a toolkit for the interactive, three-dimensional visualization and processing of large sets of atmospheric data, originally developed as a testing environment for the novel segmentation algorithm. It provides a dynamic system for combining at runtime data from different sources, a variety of different data processing algorithms, and several visualization techniques. Its modular architecture and flexible scripting support led to additional applications of the software, from which two examples are presented: the usage of Insight as a WMS (web map service) server, and the automatic production of a sequence of images for the visualization of cyclone simulations. The core application of Insight is the provision of the novel segmentation algorithm for the efficient detection and tracking of 3D features in large sets of atmospheric data, as well as for the precise localization of the occurring genesis, lysis, merging and splitting events. Data segmentation usually leads to a significant reduction of the size of the considered data. This enables a practical visualization of the data, statistical analyses of the features and their events, and the manual or automatic detection of interesting situations for subsequent detailed investigation. The concepts of the novel algorithm, its technical realization, and several extensions for avoiding under- and over-segmentation are discussed. As example applications, this thesis covers the setup and the results of the segmentation of upper-tropospheric jet streams and cyclones as full 3D objects. Finally, IWAL is presented, which is a web application for providing an easy interactive access to meteorological data visualizations, primarily aimed at students. As a web application, the needs to retrieve all input data sets and to install and handle complex visualization tools on a local machine are avoided. The main challenge in the provision of customizable visualizations to large numbers of simultaneous users was to find an acceptable trade-off between the available visualization options and the performance of the application. Besides the implementational details, benchmarks and the results of a user survey are presented.
Resumo:
In questa tesi sono stati apportati due importanti contributi nel campo degli acceleratori embedded many-core. Abbiamo implementato un runtime OpenMP ottimizzato per la gestione del tasking model per sistemi a processori strettamente accoppiati in cluster e poi interconnessi attraverso una network on chip. Ci siamo focalizzati sulla loro scalabilità e sul supporto di task di granularità fine, come è tipico nelle applicazioni embedded. Il secondo contributo di questa tesi è stata proporre una estensione del runtime di OpenMP che cerca di prevedere la manifestazione di errori dati da fenomeni di variability tramite una schedulazione efficiente del carico di lavoro.
Resumo:
In vielen Industriezweigen, zum Beispiel in der Automobilindustrie, werden Digitale Versuchsmodelle (Digital MockUps) eingesetzt, um die Konstruktion und die Funktion eines Produkts am virtuellen Prototypen zu überprüfen. Ein Anwendungsfall ist dabei die Überprüfung von Sicherheitsabständen einzelner Bauteile, die sogenannte Abstandsanalyse. Ingenieure ermitteln dabei für bestimmte Bauteile, ob diese in ihrer Ruhelage sowie während einer Bewegung einen vorgegeben Sicherheitsabstand zu den umgebenden Bauteilen einhalten. Unterschreiten Bauteile den Sicherheitsabstand, so muss deren Form oder Lage verändert werden. Dazu ist es wichtig, die Bereiche der Bauteile, welche den Sicherhabstand verletzen, genau zu kennen. rnrnIn dieser Arbeit präsentieren wir eine Lösung zur Echtzeitberechnung aller den Sicherheitsabstand unterschreitenden Bereiche zwischen zwei geometrischen Objekten. Die Objekte sind dabei jeweils als Menge von Primitiven (z.B. Dreiecken) gegeben. Für jeden Zeitpunkt, in dem eine Transformation auf eines der Objekte angewendet wird, berechnen wir die Menge aller den Sicherheitsabstand unterschreitenden Primitive und bezeichnen diese als die Menge aller toleranzverletzenden Primitive. Wir präsentieren in dieser Arbeit eine ganzheitliche Lösung, welche sich in die folgenden drei großen Themengebiete unterteilen lässt.rnrnIm ersten Teil dieser Arbeit untersuchen wir Algorithmen, die für zwei Dreiecke überprüfen, ob diese toleranzverletzend sind. Hierfür präsentieren wir verschiedene Ansätze für Dreiecks-Dreiecks Toleranztests und zeigen, dass spezielle Toleranztests deutlich performanter sind als bisher verwendete Abstandsberechnungen. Im Fokus unserer Arbeit steht dabei die Entwicklung eines neuartigen Toleranztests, welcher im Dualraum arbeitet. In all unseren Benchmarks zur Berechnung aller toleranzverletzenden Primitive beweist sich unser Ansatz im dualen Raum immer als der Performanteste.rnrnDer zweite Teil dieser Arbeit befasst sich mit Datenstrukturen und Algorithmen zur Echtzeitberechnung aller toleranzverletzenden Primitive zwischen zwei geometrischen Objekten. Wir entwickeln eine kombinierte Datenstruktur, die sich aus einer flachen hierarchischen Datenstruktur und mehreren Uniform Grids zusammensetzt. Um effiziente Laufzeiten zu gewährleisten ist es vor allem wichtig, den geforderten Sicherheitsabstand sinnvoll im Design der Datenstrukturen und der Anfragealgorithmen zu beachten. Wir präsentieren hierzu Lösungen, die die Menge der zu testenden Paare von Primitiven schnell bestimmen. Darüber hinaus entwickeln wir Strategien, wie Primitive als toleranzverletzend erkannt werden können, ohne einen aufwändigen Primitiv-Primitiv Toleranztest zu berechnen. In unseren Benchmarks zeigen wir, dass wir mit unseren Lösungen in der Lage sind, in Echtzeit alle toleranzverletzenden Primitive zwischen zwei komplexen geometrischen Objekten, bestehend aus jeweils vielen hunderttausend Primitiven, zu berechnen. rnrnIm dritten Teil präsentieren wir eine neuartige, speicheroptimierte Datenstruktur zur Verwaltung der Zellinhalte der zuvor verwendeten Uniform Grids. Wir bezeichnen diese Datenstruktur als Shrubs. Bisherige Ansätze zur Speicheroptimierung von Uniform Grids beziehen sich vor allem auf Hashing Methoden. Diese reduzieren aber nicht den Speicherverbrauch der Zellinhalte. In unserem Anwendungsfall haben benachbarte Zellen oft ähnliche Inhalte. Unser Ansatz ist in der Lage, den Speicherbedarf der Zellinhalte eines Uniform Grids, basierend auf den redundanten Zellinhalten, verlustlos auf ein fünftel der bisherigen Größe zu komprimieren und zur Laufzeit zu dekomprimieren.rnrnAbschießend zeigen wir, wie unsere Lösung zur Berechnung aller toleranzverletzenden Primitive Anwendung in der Praxis finden kann. Neben der reinen Abstandsanalyse zeigen wir Anwendungen für verschiedene Problemstellungen der Pfadplanung.
Resumo:
To support development tools like debuggers, runtime systems need to provide a meta-programming interface to alter their semantics and access internal data. Reflective capabilities are typically fixed by the Virtual Machine (VM). Unanticipated reflective features must either be simulated by complex program transformations, or they require the development of a specially tailored VM. We propose a novel approach to behavioral reflection that eliminates the barrier between applications and the VM by manipulating an explicit tower of first-class interpreters. Pinocchio is a proof-of-concept implementation of our approach which enables radical changes to the interpretation of programs by explicitly instantiating subclasses of the base interpreter. We illustrate the design of Pinocchio through non-trivial examples that extend runtime semantics to support debugging, parallel debugging, and back-in-time object-flow debugging. Although performance is not yet addressed, we also discuss numerous opportunities for optimization, which we believe will lead to a practical approach to behavioral reflection.
Resumo:
Features encapsulate the domain knowledge of a software system and thus are valuable sources of information for a reverse engineer. When analyzing the evolution of a system, we need to know how and which features were modified to recover both the change intention and its extent, namely which source artifacts are affected. Typically, the implementation of a feature crosscuts a number of source artifacts. To obtain a mapping between features to the source artifacts, we exercise the features and capture their execution traces. However this results in large traces that are difficult to interpret. To tackle this issue we compact the traces into simple sets of source artifacts that participate in a feature's runtime behavior. We refer to these compacted traces as feature views. Within a feature view, we partition the source artifacts into disjoint sets of characterized software entities. The characterization defines the level of participation of a source entity in the features. We then analyze the features over several versions of a system and we plot their evolution to reveal how and hich features were affected by changes in the code. We show the usefulness of our approach by applying it to a case study where we address the problem of merging parallel development tracks of the same system.