995 resultados para embedded computing


Relevância:

100.00% 100.00%

Publicador:

Resumo:

"Preprint for a workshop sponsored by NASA Langley Research Center, Hampton, Virginia, and the American Institute of Aeronautics and Astronautics, New York, and held in Hampton, Virginia, November 7-8, 1978."

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Single processor architectures are unable to provide the required performance of high performance embedded systems. Parallel processing based on general-purpose processors can achieve these performances with a considerable increase of required resources. However, in many cases, simplified optimized parallel cores can be used instead of general-purpose processors achieving better performance at lower resource utilization. In this paper, we propose a configurable many-core architecture to serve as a co-processor for high-performance embedded computing on Field-Programmable Gate Arrays. The architecture consists of an array of configurable simple cores with support for floating-point operations interconnected with a configurable interconnection network. For each core it is possible to configure the size of the internal memory, the supported operations and number of interfacing ports. The architecture was tested in a ZYNQ-7020 FPGA in the execution of several parallel algorithms. The results show that the proposed many-core architecture achieves better performance than that achieved with a parallel generalpurpose processor and that up to 32 floating-point cores can be implemented in a ZYNQ-7020 SoC FPGA.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Energy efficient embedded computing enables new application scenarios in mobile devices like software-defined radio and video processing. The hierarchical multiprocessor considered in this work may contain dozens or hundreds of resource efficient VLIW CPUs. Programming this number of CPU cores is a complex task requiring compiler support. The stream programming paradigm provides beneficial properties that help to support automatic partitioning. This work describes a compiler for streaming applications targeting the self-build hierarchical CoreVA-MPSoC multiprocessor platform. The compiler is supported by a programming model that is tailored to fit the streaming programming paradigm. We present a novel simulated-annealing (SA) based partitioning algorithm, called Smart SA. The overall speedup of Smart SA is 12.84 for an MPSoC with 16 CPU cores compared to a single CPU implementation. Comparison with a state of the art partitioning algorithm shows an average performance improvement of 34.07%.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Today's SoCs are complex designs with multiple embedded processors, memory subsystems, and application specific peripherals. The memory architecture of embedded SoCs strongly influences the power and performance of the entire system. Further, the memory subsystem constitutes a major part (typically up to 70%) of the silicon area for the current day SoC. In this article, we address the on-chip memory architecture exploration for DSP processors which are organized as multiple memory banks, where banks can be single/dual ported with non-uniform bank sizes. In this paper we propose two different methods for physical memory architecture exploration and identify the strengths and applicability of these methods in a systematic way. Both methods address the memory architecture exploration for a given target application by considering the application's data access characteristics and generates a set of Pareto-optimal design points that are interesting from a power, performance and VLSI area perspective. To the best of our knowledge, this is the first comprehensive work on memory space exploration at physical memory level that integrates data layout and memory exploration to address the system objectives from both hardware design and application software development perspective. Further we propose an automatic framework that explores the design space identifying 100's of Pareto-optimal design points within a few hours of running on a standard desktop configuration.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Emerging embedded applications are based on evolving standards (e.g., MPEG2/4, H.264/265, IEEE802.11a/b/g/n). Since most of these applications run on handheld devices, there is an increasing need for a single chip solution that can dynamically interoperate between different standards and their derivatives. In order to achieve high resource utilization and low power dissipation, we propose REDEFINE, a polymorphic ASIC in which specialized hardware units are replaced with basic hardware units that can create the same functionality by runtime re-composition. It is a ``future-proof'' custom hardware solution for multiple applications and their derivatives in a domain. In this article, we describe a compiler framework and supporting hardware comprising compute, storage, and communication resources. Applications described in high-level language (e.g., C) are compiled into application substructures. For each application substructure, a set of compute elements on the hardware are interconnected during runtime to form a pattern that closely matches the communication pattern of that particular application. The advantage is that the bounded CEs are neither processor cores nor logic elements as in FPGAs. Hence, REDEFINE offers the power and performance advantage of an ASIC and the hardware reconfigurability and programmability of that of an FPGA/instruction set processor. In addition, the hardware supports custom instruction pipelining. Existing instruction-set extensible processors determine a sequence of instructions that repeatedly occur within the application to create custom instructions at design time to speed up the execution of this sequence. We extend this scheme further, where a kernel is compiled into custom instructions that bear strong producer-consumer relationship (and not limited to frequently occurring sequences of instructions). Custom instructions, realized as hardware compositions effected at runtime, allow several instances of the same to be active in parallel. A key distinguishing factor in majority of the emerging embedded applications is stream processing. To reduce the overheads of data transfer between custom instructions, direct communication paths are employed among custom instructions. In this article, we present the overview of the hardware-aware compiler framework, which determines the NoC-aware schedule of transports of the data exchanged between the custom instructions on the interconnect. The results for the FFT kernel indicate a 25% reduction in the number of loads/stores, and throughput improves by log(n) for n-point FFT when compared to sequential implementation. Overall, REDEFINE offers flexibility and a runtime reconfigurability at the expense of 1.16x in power and 8x in area when compared to an ASIC. REDEFINE implementation consumes 0.1x the power of an FPGA implementation. In addition, the configuration overhead of the FPGA implementation is 1,000x more than that of REDEFINE.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Among the intelligent safety technologies for road vehicles, active suspensions controlled by embedded computing elements for preventing rollover have received a lot of attention. The existing models for synthesizing and allocating forces in such suspensions are conservatively based on the constraints that are valid until no wheels lift off the ground. However, the fault tolerance of the rollover-preventive systems can be enhanced if the smart/active suspensions can intervene in the more severe situation in which the wheels have just lifted off the ground. The difficulty in computing control in the last situation is that the vehicle dynamics then passes into the regime that yields a model involving disjunctive constraints on the dynamics. Simulation of dynamics with disjunctive constraints in this context becomes necessary to estimate, synthesize, and allocate the intended hardware realizable forces in an active suspension. In this paper, we give an algorithm for the previously mentioned problem by solving it as a disjunctive dynamic optimization problem. Based on this, we synthesize and allocate the roll-stabilizing time-dependent active suspension forces in terms of sensor output data. We show that the forces obtained from disjunctive dynamics are comparable with existing force allocations and, hence, are possibly realizable in the existing hardware framework toward enhancing the safety and fault tolerance.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This article introduces Periodically Controlled Hybrid Automata (PCHA) for modular specification of embedded control systems. In a PCHA, control actions that change the control input to the plant occur roughly periodically, while other actions that update the state of the controller may occur in the interim. Such actions could model, for example, sensor updates and information received from higher-level planning modules that change the set point of the controller. Based on periodicity and subtangential conditions, a new sufficient condition for verifying invariant properties of PCHAs is presented. For PCHAs with polynomial continuous vector fields, it is possible to check these conditions automatically using, for example, quantifier elimination or sum of squares decomposition. We examine the feasibility of this automatic approach on a small example. The proposed technique is also used to manually verify safety and progress properties of a fairly complex planner-controller subsystem of an autonomous ground vehicle. Geometric properties of planner-generated paths are derived which guarantee that such paths can be safely followed by the controller. © 2012 ACM.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Software-based control of life-critical embedded systems has become increasingly complex, and to a large extent has come to determine the safety of the human being. For example, implantable cardiac pacemakers have over 80,000 lines of code which are responsible for maintaining the heart within safe operating limits. As firmware-related recalls accounted for over 41% of the 600,000 devices recalled in the last decade, there is a need for rigorous model-driven design tools to generate verified code from verified software models. To this effect, we have developed the UPP2SF model-translation tool, which facilitates automatic conversion of verified models (in UPPAAL) to models that may be simulated and tested (in Simulink/Stateflow). We describe the translation rules that ensure correct model conversion, applicable to a large class of models. We demonstrate how UPP2SF is used in themodel-driven design of a pacemaker whosemodel is (a) designed and verified in UPPAAL (using timed automata), (b) automatically translated to Stateflow for simulation-based testing, and then (c) automatically generated into modular code for hardware-level integration testing of timing-related errors. In addition, we show how UPP2SF may be used for worst-case execution time estimation early in the design stage. Using UPP2SF, we demonstrate the value of integrated end-to-end modeling, verification, code-generation and testing process for complex software-controlled embedded systems. © 2014 ACM.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The design cycle for complex special-purpose computing systems is extremely costly and time-consuming. It involves a multiparametric design space exploration for optimization, followed by design verification. Designers of special purpose VLSI implementations often need to explore parameters, such as optimal bitwidth and data representation, through time-consuming Monte Carlo simulations. A prominent example of this simulation-based exploration process is the design of decoders for error correcting systems, such as the Low-Density Parity-Check (LDPC) codes adopted by modern communication standards, which involves thousands of Monte Carlo runs for each design point. Currently, high-performance computing offers a wide set of acceleration options that range from multicore CPUs to Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs). The exploitation of diverse target architectures is typically associated with developing multiple code versions, often using distinct programming paradigms. In this context, we evaluate the concept of retargeting a single OpenCL program to multiple platforms, thereby significantly reducing design time. A single OpenCL-based parallel kernel is used without modifications or code tuning on multicore CPUs, GPUs, and FPGAs. We use SOpenCL (Silicon to OpenCL), a tool that automatically converts OpenCL kernels to RTL in order to introduce FPGAs as a potential platform to efficiently execute simulations coded in OpenCL. We use LDPC decoding simulations as a case study. Experimental results were obtained by testing a variety of regular and irregular LDPC codes that range from short/medium (e.g., 8,000 bit) to long length (e.g., 64,800 bit) DVB-S2 codes. We observe that, depending on the design parameters to be simulated, on the dimension and phase of the design, the GPU or FPGA may suit different purposes more conveniently, thus providing different acceleration factors over conventional multicore CPUs.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Digital signatures are an important primitive for building secure systems and are used in most real-world security protocols. However, almost all popular signature schemes are either based on the factoring assumption (RSA) or the hardness of the discrete logarithm problem (DSA/ECDSA). In the case of classical cryptanalytic advances or progress on the development of quantum computers, the hardness of these closely related problems might be seriously weakened. A potential alternative approach is the construction of signature schemes based on the hardness of certain lattice problems that are assumed to be intractable by quantum computers. Due to significant research advancements in recent years, lattice-based schemes have now become practical and appear to be a very viable alternative to number-theoretic cryptography. In this article, we focus on recent developments and the current state of the art in lattice-based digital signatures and provide a comprehensive survey discussing signature schemes with respect to practicality. Additionally, we discuss future research areas that are essential for the continued development of lattice-based cryptography.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, we address the problem of sharing a wireless channel among a set of sporadic message streams where a message stream issues transmission requests with real-time deadlines. We propose a collision-free wireless medium access control (MAC) protocol which implements static-priority scheduling, supports a large number of priority levels and is fully distributed. It is an adaptation to a wireless channel of the dominance protocol used in the CAN bus. But, unlike that protocol, our protocol does not require a node having the ability to receive an incoming bit from the channel while transmitting to the channel. The evaluation of the protocol with real embedded computing platforms is presented to show that the proposed protocol is in fact collision-free and prioritized. We measure the response times of our implementation and show that the response-time analysis developed for the protocol offers an upper bound on the response times.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper studies static-priority preemptive scheduling on a multiprocessor using partitioned scheduling. We propose a new scheduling algorithm and prove that if the proposed algorithm is used and if less than 50% of the capacity is requested then all deadlines are met. It is known that for every static-priority multiprocessor scheduling algorithm, there is a task set that misses a deadline although the requested capacity is arbitrary close to 50%.