780 resultados para embedded computing
Resumo:
Graphics processors were originally developed for rendering graphics but have recently evolved towards being an architecture for general-purpose computations. They are also expected to become important parts of embedded systems hardware -- not just for graphics. However, this necessitates the development of appropriate timing analysis techniques which would be required because techniques developed for CPU scheduling are not applicable. The reason is that we are not interested in how long it takes for any given GPU thread to complete, but rather how long it takes for all of them to complete. We therefore develop a simple method for finding an upper bound on the makespan of a group of GPU threads executing the same program and competing for the resources of a single streaming multiprocessor (whose architecture is based on NVIDIA Fermi, with some simplifying assunptions). We then build upon this method to formulate the derivation of the exact worst-case makespan (and corresponding schedule) as an optimization problem. Addressing the issue of tractability, we also present a technique for efficiently computing a safe estimate of the worstcase makespan with minimal pessimism, which may be used when finding an exact value would take too long.
Resumo:
Embedded real-time applications increasingly present high computation requirements, which need to be completed within specific deadlines, but that present highly variable patterns, depending on the set of data available in a determined instant. The current trend to provide parallel processing in the embedded domain allows providing higher processing power; however, it does not address the variability in the processing pattern. Dimensioning each device for its worst-case scenario implies lower average utilization, and increased available, but unusable, processing in the overall system. A solution for this problem is to extend the parallel execution of the applications, allowing networked nodes to distribute the workload, on peak situations, to neighbour nodes. In this context, this report proposes a framework to develop parallel and distributed real-time embedded applications, transparently using OpenMP and Message Passing Interface (MPI), within a programming model based on OpenMP. The technical report also devises an integrated timing model, which enables the structured reasoning on the timing behaviour of these hybrid architectures.
Resumo:
Despite the steady increase in experimental deployments, most of research work on WSNs has focused only on communication protocols and algorithms, with a clear lack of effective, feasible and usable system architectures, integrated in a modular platform able to address both functional and non–functional requirements. In this paper, we outline EMMON [1], a full WSN-based system architecture for large–scale, dense and real–time embedded monitoring [3] applications. EMMON provides a hierarchical communication architecture together with integrated middleware and command and control software. Then, EM-Set, the EMMON engineering toolset will be presented. EM-Set includes a network deployment planning, worst–case analysis and dimensioning, protocol simulation and automatic remote programming and hardware testing tools. This toolset was crucial for the development of EMMON which was designed to use standard commercially available technologies, while maintaining as much flexibility as possible to meet specific applications requirements. Finally, the EMMON architecture has been validated through extensive simulation and experimental evaluation, including a 300+ nodes testbed.
Resumo:
The recent trends of chip architectures with higher number of heterogeneous cores, and non-uniform memory/non-coherent caches, brings renewed attention to the use of Software Transactional Memory (STM) as a fundamental building block for developing parallel applications. Nevertheless, although STM promises to ease concurrent and parallel software development, it relies on the possibility of aborting conflicting transactions to maintain data consistency, which impacts on the responsiveness and timing guarantees required by embedded real-time systems. In these systems, contention delays must be (efficiently) limited so that the response times of tasks executing transactions are upper-bounded and task sets can be feasibly scheduled. In this paper we assess the use of STM in the development of embedded real-time software, defending that the amount of contention can be reduced if read-only transactions access recent consistent data snapshots, progressing in a wait-free manner. We show how the required number of versions of a shared object can be calculated for a set of tasks. We also outline an algorithm to manage conflicts between update transactions that prevents starvation.
Resumo:
Real-time systems demand guaranteed and predictable run-time behaviour in order to ensure that no task has missed its deadline. Over the years we are witnessing an ever increasing demand for functionality enhancements in the embedded real-time systems. Along with the functionalities, the design itself grows more complex. Posed constraints, such as energy consumption, time, and space bounds, also require attention and proper handling. Additionally, efficient scheduling algorithms, as proven through analyses and simulations, often impose requirements that have significant run-time cost, specially in the context of multi-core systems. In order to further investigate the behaviour of such systems to quantify and compare these overheads involved, we have developed the SPARTS, a simulator of a generic embedded real- time device. The tasks in the simulator are described by externally visible parameters (e.g. minimum inter-arrival, sporadicity, WCET, BCET, etc.), rather than the code of the tasks. While our current implementation is primarily focused on our immediate needs in the area of power-aware scheduling, it is designed to be extensible to accommodate different task properties, scheduling algorithms and/or hardware models for the application in wide variety of simulations. The source code of the SPARTS is available for download at [1].
Resumo:
The current industry trend is towards using Commercially available Off-The-Shelf (COTS) based multicores for developing real time embedded systems, as opposed to the usage of custom-made hardware. In typical implementation of such COTS-based multicores, multiple cores access the main memory via a shared bus. This often leads to contention on this shared channel, which results in an increase of the response time of the tasks. Analyzing this increased response time, considering the contention on the shared bus, is challenging on COTS-based systems mainly because bus arbitration protocols are often undocumented and the exact instants at which the shared bus is accessed by tasks are not explicitly controlled by the operating system scheduler; they are instead a result of cache misses. This paper makes three contributions towards analyzing tasks scheduled on COTS-based multicores. Firstly, we describe a method to model the memory access patterns of a task. Secondly, we apply this model to analyze the worst case response time for a set of tasks. Although the required parameters to obtain the request profile can be obtained by static analysis, we provide an alternative method to experimentally obtain them by using performance monitoring counters (PMCs). We also compare our work against an existing approach and show that our approach outperforms it by providing tighter upper-bound on the number of bus requests generated by a task.
Resumo:
Preemptions account for a non-negligible overhead during system execution. There has been substantial amount of research on estimating the delay incurred due to the loss of working sets in the processor state (caches, registers, TLBs) and some on avoiding preemptions, or limiting the preemption cost. We present an algorithm to reduce preemptions by further delaying the start of execution of high priority tasks in fixed priority scheduling. Our approaches take advantage of the floating non-preemptive regions model and exploit the fact that, during the schedule, the relative task phasing will differ from the worst-case scenario in terms of admissible preemption deferral. Furthermore, approximations to reduce the complexity of the proposed approach are presented. Substantial set of experiments demonstrate that the approach and approximations improve over existing work, in particular for the case of high utilisation systems, where savings of up to 22% on the number of preemption are attained.
Resumo:
Consider the problem of scheduling a set of sporadic tasks on a multiprocessor system to meet deadlines using a tasksplitting scheduling algorithm. Task-splitting (also called semipartitioning) scheduling algorithms assign most tasks to just one processor but a few tasks are assigned to two or more processors, and they are dispatched in a way that ensures that a task never executes on two or more processors simultaneously. A certain type of task-splitting algorithms, called slot-based task-splitting, is of particular interest because of its ability to schedule tasks at high processor utilizations. We present a new schedulability analysis for slot-based task-splitting scheduling algorithms that takes the overhead into account and also a new task assignment algorithm.
Resumo:
Several projects in the recent past have aimed at promoting Wireless Sensor Networks as an infrastructure technology, where several independent users can submit applications that execute concurrently across the network. Concurrent multiple applications cause significant energy-usage overhead on sensor nodes, that cannot be eliminated by traditional schemes optimized for single-application scenarios. In this paper, we outline two main optimization techniques for reducing power consumption across applications. First, we describe a compiler based approach that identifies redundant sensing requests across applications and eliminates those. Second, we cluster the radio transmissions together by concatenating packets from independent applications based on Rate-Harmonized Scheduling.
Resumo:
Replication is a proven concept for increasing the availability of distributed systems. However, actively replicating every software component in distributed embedded systems may not be a feasible approach. Not only the available resources are often limited, but also the imposed overhead could significantly degrade the system's performance. The paper proposes heuristics to dynamically determine which components to replicate based on their significance to the system as a whole, its consequent number of passive replicas, and where to place those replicas in the network. The results show that the proposed heuristics achieve a reasonably higher system's availability than static offline decisions when lower replication ratios are imposed due to resource or cost limitations. The paper introduces a novel approach to coordinate the activation of passive replicas in interdependent distributed environments. The proposed distributed coordination model reduces the complexity of the needed interactions among nodes and is faster to converge to a globally acceptable solution than a traditional centralised approach.
Resumo:
When the Internet was born, the purpose was to interconnect computers to share digital data at large-scale. On the other hand, when embedded systems were born, the objective was to control system components under real-time constraints through sensing devices, typically at small to medium scales. With the great evolution of the Information and Communication Technology (ICT), the tendency is to enable ubiquitous and pervasive computing to control everything (physical processes and physical objects) anytime and at a large-scale. This new vision gave recently rise to the paradigm of Cyber-Physical Systems (CPS). In this position paper, we provide a realistic vision to the concept of the Cyber-Physical Internet (CPI), discuss its design requirements and present the limitations of the current networking abstractions to fulfill these requirements. We also debate whether it is more productive to adopt a system integration approach or a radical design approach for building large-scale CPS. Finally, we present a sample of realtime challenges that must be considered in the design of the Cyber-Physical Internet.
Resumo:
Componentised systems, in particular those with fault confinement through address spaces, are currently emerging as a hot topic in embedded systems research. This paper extends the unified rate-based scheduling framework RBED in several dimensions to fit the requirements of such systems: we have removed the requirement that the deadline of a task is equal to its period. The introduction of inter-process communication reflects the need to communicate. Additionally we also discuss server tasks, budget replenishment and the low level details needed to deal with the physical reality of systems. While a number of these issues have been studied in previous work in isolation, we focus on the problems discovered and lessons learned when integrating solutions. We report on our experiences implementing the proposed mechanisms in a commercial grade OKL4 microkernel as well as an application with soft real-time and best-effort tasks on top of it.
Resumo:
Wireless sensor networks (WSNs) emerge as underlying infrastructures for new classes of large-scale networked embedded systems. However, WSNs system designers must fulfill the quality-of-service (QoS) requirements imposed by the applications (and users). Very harsh and dynamic physical environments and extremely limited energy/computing/memory/communication node resources are major obstacles for satisfying QoS metrics such as reliability, timeliness, and system lifetime. The limited communication range of WSN nodes, link asymmetry, and the characteristics of the physical environment lead to a major source of QoS degradation in WSNs-the ldquohidden node problem.rdquo In wireless contention-based medium access control (MAC) protocols, when two nodes that are not visible to each other transmit to a third node that is visible to the former, there will be a collision-called hidden-node or blind collision. This problem greatly impacts network throughput, energy-efficiency and message transfer delays, and the problem dramatically increases with the number of nodes. This paper proposes H-NAMe, a very simple yet extremely efficient hidden-node avoidance mechanism for WSNs. H-NAMe relies on a grouping strategy that splits each cluster of a WSN into disjoint groups of non-hidden nodes that scales to multiple clusters via a cluster grouping strategy that guarantees no interference between overlapping clusters. Importantly, H-NAMe is instantiated in IEEE 802.15.4/ZigBee, which currently are the most widespread communication technologies for WSNs, with only minor add-ons and ensuring backward compatibility with their protocols standards. H-NAMe was implemented and exhaustively tested using an experimental test-bed based on ldquooff-the-shelfrdquo technology, showing that it increases network throughput and transmission success probability up to twice the values obtained without H-NAMe. H-NAMe effectiveness was also demonstrated in a target tracking application with mobile robots - over a WSN deployment.
Resumo:
ARINC specification 653-2 describes the interface between application software and underlying middleware in a distributed real-time avionics system. The real-time workload in this system comprises of partitions, where each partition consists of one or more processes. Processes incur blocking and preemption overheads and can communicate with other processes in the system. In this work we develop compositional techniques for automated scheduling of such partitions and processes. At present, system designers manually schedule partitions based on interactions they have with the partition vendors. This approach is not only time consuming, but can also result in under utilization of resources. In contrast, the technique proposed in this paper is a principled approach for scheduling ARINC-653 partitions and therefore should facilitate system integration.