957 resultados para On-Chip Multiprocessor (OCM)
Resumo:
Today, modern System-on-a-Chip (SoC) systems have grown rapidly due to the increased processing power, while maintaining the size of the hardware circuit. The number of transistors on a chip continues to increase, but current SoC designs may not be able to exploit the potential performance, especially with energy consumption and chip area becoming two major concerns. Traditional SoC designs usually separate software and hardware. Thus, the process of improving the system performance is a complicated task for both software and hardware designers. The aim of this research is to develop hardware acceleration workflow for software applications. Thus, system performance can be improved with constraints of energy consumption and on-chip resource costs. The characteristics of software applications can be identified by using profiling tools. Hardware acceleration can have significant performance improvement for highly mathematical calculations or repeated functions. The performance of SoC systems can then be improved, if the hardware acceleration method is used to accelerate the element that incurs performance overheads. The concepts mentioned in this study can be easily applied to a variety of sophisticated software applications. The contributions of SoC-based hardware acceleration in the hardware-software co-design platform include the following: (1) Software profiling methods are applied to H.264 Coder-Decoder (CODEC) core. The hotspot function of aimed application is identified by using critical attributes such as cycles per loop, loop rounds, etc. (2) Hardware acceleration method based on Field-Programmable Gate Array (FPGA) is used to resolve system bottlenecks and improve system performance. The identified hotspot function is then converted to a hardware accelerator and mapped onto the hardware platform. Two types of hardware acceleration methods – central bus design and co-processor design, are implemented for comparison in the proposed architecture. (3) System specifications, such as performance, energy consumption, and resource costs, are measured and analyzed. The trade-off of these three factors is compared and balanced. Different hardware accelerators are implemented and evaluated based on system requirements. 4) The system verification platform is designed based on Integrated Circuit (IC) workflow. Hardware optimization techniques are used for higher performance and less resource costs. Experimental results show that the proposed hardware acceleration workflow for software applications is an efficient technique. The system can reach 2.8X performance improvements and save 31.84% energy consumption by applying the Bus-IP design. The Co-processor design can have 7.9X performance and save 75.85% energy consumption.
Resumo:
Modern Integrated Circuit (IC) design is characterized by a strong trend of Intellectual Property (IP) core integration into complex system-on-chip (SOC) architectures. These cores require thorough verification of their functionality to avoid erroneous behavior in the final device. Formal verification methods are capable of detecting any design bug. However, due to state explosion, their use remains limited to small circuits. Alternatively, simulation-based verification can explore hardware descriptions of any size, although the corresponding stimulus generation, as well as functional coverage definition, must be carefully planned to guarantee its efficacy. In general, static input space optimization methodologies have shown better efficiency and results than, for instance, Coverage Directed Verification (CDV) techniques, although they act on different facets of the monitored system and are not exclusive. This work presents a constrained-random simulation-based functional verification methodology where, on the basis of the Parameter Domains (PD) formalism, irrelevant and invalid test case scenarios are removed from the input space. To this purpose, a tool to automatically generate PD-based stimuli sources was developed. Additionally, we have developed a second tool to generate functional coverage models that fit exactly to the PD-based input space. Both the input stimuli and coverage model enhancements, resulted in a notable testbench efficiency increase, if compared to testbenches with traditional stimulation and coverage scenarios: 22% simulation time reduction when generating stimuli with our PD-based stimuli sources (still with a conventional coverage model), and 56% simulation time reduction when combining our stimuli sources with their corresponding, automatically generated, coverage models.
Resumo:
Consider the problem of scheduling a set of sporadic tasks on a multiprocessor system to meet deadlines using a task-splitting scheduling algorithm. Task-splitting (also called semi-partitioning) scheduling algorithms assign most tasks to just one processor but a few tasks are assigned to two or more processors, and they are dispatched in a way that ensures that a task never executes on two or more processors simultaneously. A particular type of task-splitting algorithms, called slot-based task-splitting dispatching, is of particular interest because of its ability to schedule tasks with high processor utilizations. Unfortunately, no slot-based task-splitting algorithm has been implemented in a real operating system so far. In this paper we discuss and propose some modifications to the slot-based task-splitting algorithm driven by implementation concerns, and we report the first implementation of this family of algorithms in a real operating system running Linux kernel version 2.6.34. We have also conducted an extensive range of experiments on a 4-core multicore desktop PC running task-sets with utilizations of up to 88%. The results show that the behavior of our implementation is in line with the theoretical framework behind it.
Resumo:
Consider the problem of scheduling sporadic tasks on a multiprocessor platform under mutual exclusion constraints. We present an approach which appears promising for allowing large amounts of parallel task executions and still ensures low amounts of blocking.
Resumo:
Scheduling of constrained deadline sporadic task systems on multiprocessor platforms is an area which has received much attention in the recent past. It is widely believed that finding an optimal scheduler is hard, and therefore most studies have focused on developing algorithms with good processor utilization bounds. These algorithms can be broadly classified into two categories: partitioned scheduling in which tasks are statically assigned to individual processors, and global scheduling in which each task is allowed to execute on any processor in the platform. In this paper we consider a third, more general, approach called cluster-based scheduling. In this approach each task is statically assigned to a processor cluster, tasks in each cluster are globally scheduled among themselves, and clusters in turn are scheduled on the multiprocessor platform. We develop techniques to support such cluster-based scheduling algorithms, and also consider properties that minimize total processor utilization of individual clusters. In the last part of this paper, we develop new virtual cluster-based scheduling algorithms. For implicit deadline sporadic task systems, we develop an optimal scheduling algorithm that is neither Pfair nor ERfair. We also show that the processor utilization bound of us-edf{m/(2m−1)} can be improved by using virtual clustering. Since neither partitioned nor global strategies dominate over the other, cluster-based scheduling is a natural direction for research towards achieving improved processor utilization bounds.
Resumo:
Consider the problem of scheduling real-time tasks on a multiprocessor with the goal of meeting deadlines. Tasks arrive sporadically and have implicit deadlines, that is, the deadline of a task is equal to its minimum inter-arrival time. Consider this problem to be solved with global static-priority scheduling. We present a priority-assignment scheme with the property that if at most 38% of the processing capacity is requested then all deadlines are met.
Resumo:
This paper studies static-priority preemptive scheduling on a multiprocessor using partitioned scheduling. We propose a new scheduling algorithm and prove that if the proposed algorithm is used and if less than 50% of the capacity is requested then all deadlines are met. It is known that for every static-priority multiprocessor scheduling algorithm, there is a task set that misses a deadline although the requested capacity is arbitrary close to 50%.
Resumo:
Consider the problem of scheduling a set of periodically arriving tasks on a multiprocessor with the goal of meeting deadlines. Processors are identical and have the same speed. Tasks can be preempted and they can migrate between processors. We propose an algorithm with a utilization bound of 66% and with few preemptions. It can trade a higher utilization bound for more preemption and in doing so it has a utilization bound of 100%.
Resumo:
As electronic devices get smaller and more complex, dependability assurance is becoming fundamental for many mission critical computer based systems. This paper presents a case study on the possibility of using the on-chip debug infrastructures present in most current microprocessors to execute real time fault injection campaigns. The proposed methodology is based on a debugger customized for fault injection and designed for maximum flexibility, and consists of injecting bit-flip type faults on memory elements without modifying or halting the target application. The debugger design is easily portable and applicable to different architectures, providing a flexible and efficient mechanism for verifying and validating fault tolerant components.
Resumo:
This thesis is one of the first reports of digital microfluidics on paper and the first in which the chip’s circuit was screen printed unto the paper. The use of the screen printing technique, being a low cost and fast method for electrodes deposition, makes the all chip processing much more aligned with the low cost choice of paper as a substrate. Functioning chips were developed that were capable of working at as low as 50 V, performing all the digital microfluidics operations: movement, dispensing, merging and splitting of the droplets. Silver ink electrodes were screen printed unto paper substrates, covered by Parylene-C (through vapor deposition) as dielectric and Teflon AF 1600 (through spin coating) as hydrophobic layer. The morphology of different paper substrates, silver inks (with different annealing conditions) and Parylene deposition conditions were studied by optical microscopy, AFM, SEM and 3D profilometry. Resolution tests for the printing process and electrical characterization of the silver electrodes were also made. As a showcase of the applications potential of these chips as a biosensing device, a colorimetric peroxidase detection test was successfully done on chip, using 200 nL to 350 nL droplets dispensed from 1 μL drops.
Resumo:
The motivation for this research initiated from the abrupt rise and fall of minicomputers which were initially used both for industrial automation and business applications due to their significantly lower cost than their predecessors, the mainframes. Later industrial automation developed its own vertically integrated hardware and software to address the application needs of uninterrupted operations, real-time control and resilience to harsh environmental conditions. This has led to the creation of an independent industry, namely industrial automation used in PLC, DCS, SCADA and robot control systems. This industry employs today over 200'000 people in a profitable slow clockspeed context in contrast to the two mainstream computing industries of information technology (IT) focused on business applications and telecommunications focused on communications networks and hand-held devices. Already in 1990s it was foreseen that IT and communication would merge into one Information and communication industry (ICT). The fundamental question of the thesis is: Could industrial automation leverage a common technology platform with the newly formed ICT industry? Computer systems dominated by complex instruction set computers (CISC) were challenged during 1990s with higher performance reduced instruction set computers (RISC). RISC started to evolve parallel to the constant advancement of Moore's law. These developments created the high performance and low energy consumption System-on-Chip architecture (SoC). Unlike to the CISC processors RISC processor architecture is a separate industry from the RISC chip manufacturing industry. It also has several hardware independent software platforms consisting of integrated operating system, development environment, user interface and application market which enables customers to have more choices due to hardware independent real time capable software applications. An architecture disruption merged and the smartphone and tablet market were formed with new rules and new key players in the ICT industry. Today there are more RISC computer systems running Linux (or other Unix variants) than any other computer system. The astonishing rise of SoC based technologies and related software platforms in smartphones created in unit terms the largest installed base ever seen in the history of computers and is now being further extended by tablets. An underlying additional element of this transition is the increasing role of open source technologies both in software and hardware. This has driven the microprocessor based personal computer industry with few dominating closed operating system platforms into a steep decline. A significant factor in this process has been the separation of processor architecture and processor chip production and operating systems and application development platforms merger into integrated software platforms with proprietary application markets. Furthermore the pay-by-click marketing has changed the way applications development is compensated: Three essays on major trends in a slow clockspeed industry: The case of industrial automation 2014 freeware, ad based or licensed - all at a lower price and used by a wider customer base than ever before. Moreover, the concept of software maintenance contract is very remote in the app world. However, as a slow clockspeed industry, industrial automation has remained intact during the disruptions based on SoC and related software platforms in the ICT industries. Industrial automation incumbents continue to supply systems based on vertically integrated systems consisting of proprietary software and proprietary mainly microprocessor based hardware. They enjoy admirable profitability levels on a very narrow customer base due to strong technology-enabled customer lock-in and customers' high risk leverage as their production is dependent on fault-free operation of the industrial automation systems. When will this balance of power be disrupted? The thesis suggests how industrial automation could join the mainstream ICT industry and create an information, communication and automation (ICAT) industry. Lately the Internet of Things (loT) and weightless networks, a new standard leveraging frequency channels earlier occupied by TV broadcasting, have gradually started to change the rigid world of Machine to Machine (M2M) interaction. It is foreseeable that enough momentum will be created that the industrial automation market will in due course face an architecture disruption empowered by these new trends. This thesis examines the current state of industrial automation subject to the competition between the incumbents firstly through a research on cost competitiveness efforts in captive outsourcing of engineering, research and development and secondly researching process re- engineering in the case of complex system global software support. Thirdly we investigate the industry actors', namely customers, incumbents and newcomers, views on the future direction of industrial automation and conclude with our assessments of the possible routes industrial automation could advance taking into account the looming rise of the Internet of Things (loT) and weightless networks. Industrial automation is an industry dominated by a handful of global players each of them focusing on maintaining their own proprietary solutions. The rise of de facto standards like IBM PC, Unix and Linux and SoC leveraged by IBM, Compaq, Dell, HP, ARM, Apple, Google, Samsung and others have created new markets of personal computers, smartphone and tablets and will eventually also impact industrial automation through game changing commoditization and related control point and business model changes. This trend will inevitably continue, but the transition to a commoditized industrial automation will not happen in the near future.
Resumo:
In accordance with the Moore's law, the increasing number of on-chip integrated transistors has enabled modern computing platforms with not only higher processing power but also more affordable prices. As a result, these platforms, including portable devices, work stations and data centres, are becoming an inevitable part of the human society. However, with the demand for portability and raising cost of power, energy efficiency has emerged to be a major concern for modern computing platforms. As the complexity of on-chip systems increases, Network-on-Chip (NoC) has been proved as an efficient communication architecture which can further improve system performances and scalability while reducing the design cost. Therefore, in this thesis, we study and propose energy optimization approaches based on NoC architecture, with special focuses on the following aspects. As the architectural trend of future computing platforms, 3D systems have many bene ts including higher integration density, smaller footprint, heterogeneous integration, etc. Moreover, 3D technology can signi cantly improve the network communication and effectively avoid long wirings, and therefore, provide higher system performance and energy efficiency. With the dynamic nature of on-chip communication in large scale NoC based systems, run-time system optimization is of crucial importance in order to achieve higher system reliability and essentially energy efficiency. In this thesis, we propose an agent based system design approach where agents are on-chip components which monitor and control system parameters such as supply voltage, operating frequency, etc. With this approach, we have analysed the implementation alternatives for dynamic voltage and frequency scaling and power gating techniques at different granularity, which reduce both dynamic and leakage energy consumption. Topologies, being one of the key factors for NoCs, are also explored for energy saving purpose. A Honeycomb NoC architecture is proposed in this thesis with turn-model based deadlock-free routing algorithms. Our analysis and simulation based evaluation show that Honeycomb NoCs outperform their Mesh based counterparts in terms of network cost, system performance as well as energy efficiency.
Resumo:
Multiprocessor system-on-chip (MPSoC) designs utilize the available technology and communication architectures to meet the requirements of the upcoming applications. In MPSoC, the communication platform is both the key enabler, as well as the key differentiator for realizing efficient MPSoCs. It provides product differentiation to meet a diverse, multi-dimensional set of design constraints, including performance, power, energy, reconfigurability, scalability, cost, reliability and time-to-market. The communication resources of a single interconnection platform cannot be fully utilized by all kind of applications, such as the availability of higher communication bandwidth for computation but not data intensive applications is often unfeasible in the practical implementation. This thesis aims to perform the architecture-level design space exploration towards efficient and scalable resource utilization for MPSoC communication architecture. In order to meet the performance requirements within the design constraints, careful selection of MPSoC communication platform, resource aware partitioning and mapping of the application play important role. To enhance the utilization of communication resources, variety of techniques such as resource sharing, multicast to avoid re-transmission of identical data, and adaptive routing can be used. For implementation, these techniques should be customized according to the platform architecture. To address the resource utilization of MPSoC communication platforms, variety of architectures with different design parameters and performance levels, namely Segmented bus (SegBus), Network-on-Chip (NoC) and Three-Dimensional NoC (3D-NoC), are selected. Average packet latency and power consumption are the evaluation parameters for the proposed techniques. In conventional computing architectures, fault on a component makes the connected fault-free components inoperative. Resource sharing approach can utilize the fault-free components to retain the system performance by reducing the impact of faults. Design space exploration also guides to narrow down the selection of MPSoC architecture, which can meet the performance requirements with design constraints.
Resumo:
This paper discusses the architectural design, implementation and associated simulated peformance results of a possible receiver solution fir a multiband Ultra-Wideband (UWB) receiver. The paper concentrates on the tradeoff between the soft-bit width and numerical precision requirements for the receiver versus performance. The required numerical precision results obtained in this paper can be used by baseband designers of cost effective UWB systems using Systein-on-Chip (SoC), FPGA and ASIC technology solutions biased toward the competitive consumer electronics market(1).
Resumo:
This paper discusses the requirements on the numerical precision for a practical Multiband Ultra-Wideband (UWB) consumer electronic solution. To this end we first present the possibilities that UWB has to offer to the consumer electronics market and the possible range of devices. We then show the performance of a model of the UWB baseband system implemented using floating point precision. Then, by simulation we find the minimal numerical precision required to maintain floating-point performance for each of the specific data types and signals present in the UWB baseband. Finally, we present a full description of the numerical requirements for both the transmit and receive components of the UWB baseband. The numerical precision results obtained in this paper can then be used by baseband designers to implement cost effective UWB systems using System-on-Chip (SoC), FPGA and ASIC technology solutions biased toward the competitive consumer electronics market(1).