6 resultados para Matériel reconfigurable

em Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland


Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this work, the feasibility of the floating-gate technology in analog computing platforms in a scaled down general-purpose CMOS technology is considered. When the technology is scaled down the performance of analog circuits tends to get worse because the process parameters are optimized for digital transistors and the scaling involves the reduction of supply voltages. Generally, the challenge in analog circuit design is that all salient design metrics such as power, area, bandwidth and accuracy are interrelated. Furthermore, poor flexibility, i.e. lack of reconfigurability, the reuse of IP etc., can be considered the most severe weakness of analog hardware. On this account, digital calibration schemes are often required for improved performance or yield enhancement, whereas high flexibility/reconfigurability can not be easily achieved. Here, it is discussed whether it is possible to work around these obstacles by using floating-gate transistors (FGTs), and analyze problems associated with the practical implementation. FGT technology is attractive because it is electrically programmable and also features a charge-based built-in non-volatile memory. Apart from being ideal for canceling the circuit non-idealities due to process variations, the FGTs can also be used as computational or adaptive elements in analog circuits. The nominal gate oxide thickness in the deep sub-micron (DSM) processes is too thin to support robust charge retention and consequently the FGT becomes leaky. In principle, non-leaky FGTs can be implemented in a scaled down process without any special masks by using “double”-oxide transistors intended for providing devices that operate with higher supply voltages than general purpose devices. However, in practice the technology scaling poses several challenges which are addressed in this thesis. To provide a sufficiently wide-ranging survey, six prototype chips with varying complexity were implemented in four different DSM process nodes and investigated from this perspective. The focus is on non-leaky FGTs, but the presented autozeroing floating-gate amplifier (AFGA) demonstrates that leaky FGTs may also find a use. The simplest test structures contain only a few transistors, whereas the most complex experimental chip is an implementation of a spiking neural network (SNN) which comprises thousands of active and passive devices. More precisely, it is a fully connected (256 FGT synapses) two-layer spiking neural network (SNN), where the adaptive properties of FGT are taken advantage of. A compact realization of Spike Timing Dependent Plasticity (STDP) within the SNN is one of the key contributions of this thesis. Finally, the considerations in this thesis extend beyond CMOS to emerging nanodevices. To this end, one promising emerging nanoscale circuit element - memristor - is reviewed and its applicability for analog processing is considered. Furthermore, it is discussed how the FGT technology can be used to prototype computation paradigms compatible with these emerging two-terminal nanoscale devices in a mature and widely available CMOS technology.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The purpose of the work was to realize a high-speed digital data transfer system for RPC muon chambers in the CMS experiment on CERN’s new LHC accelerator. This large scale system took many years and many stages of prototyping to develop, and required the participation of tens of people. The system interfaces to Frontend Boards (FEB) at the 200,000-channel detector and to the trigger and readout electronics in the control room of the experiment. The distance between these two is about 80 metres and the speed required for the optic links was pushing the limits of available technology when the project was started. Here, as in many other aspects of the design, it was assumed that the features of readily available commercial components would develop in the course of the design work, just as they did. By choosing a high speed it was possible to multiplex the data from some the chambers into the same fibres to reduce the number of links needed. Further reduction was achieved by employing zero suppression and data compression, and a total of only 660 optical links were needed. Another requirement, which conflicted somewhat with choosing the components a late as possible was that the design needed to be radiation tolerant to an ionizing dose of 100 Gy and to a have a moderate tolerance to Single Event Effects (SEEs). This required some radiation test campaigns, and eventually led to ASICs being chosen for some of the critical parts. The system was made to be as reconfigurable as possible. The reconfiguration needs to be done from a distance as the electronics is not accessible except for some short and rare service breaks once the accelerator starts running. Therefore reconfigurable logic is extensively used, and the firmware development for the FPGAs constituted a sizable part of the work. Some special techniques needed to be used there too, to achieve the required radiation tolerance. The system has been demonstrated to work in several laboratory and beam tests, and now we are waiting to see it in action when the LHC will start running in the autumn 2008.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Multiprocessing is a promising solution to meet the requirements of near future applications. To get full benefit from parallel processing, a manycore system needs efficient, on-chip communication architecture. Networkon- Chip (NoC) is a general purpose communication concept that offers highthroughput, reduced power consumption, and keeps complexity in check by a regular composition of basic building blocks. This thesis presents power efficient communication approaches for networked many-core systems. We address a range of issues being important for designing power-efficient manycore systems at two different levels: the network-level and the router-level. From the network-level point of view, exploiting state-of-the-art concepts such as Globally Asynchronous Locally Synchronous (GALS), Voltage/ Frequency Island (VFI), and 3D Networks-on-Chip approaches may be a solution to the excessive power consumption demanded by today’s and future many-core systems. To this end, a low-cost 3D NoC architecture, based on high-speed GALS-based vertical channels, is proposed to mitigate high peak temperatures, power densities, and area footprints of vertical interconnects in 3D ICs. To further exploit the beneficial feature of a negligible inter-layer distance of 3D ICs, we propose a novel hybridization scheme for inter-layer communication. In addition, an efficient adaptive routing algorithm is presented which enables congestion-aware and reliable communication for the hybridized NoC architecture. An integrated monitoring and management platform on top of this architecture is also developed in order to implement more scalable power optimization techniques. From the router-level perspective, four design styles for implementing power-efficient reconfigurable interfaces in VFI-based NoC systems are proposed. To enhance the utilization of virtual channel buffers and to manage their power consumption, a partial virtual channel sharing method for NoC routers is devised and implemented. Extensive experiments with synthetic and real benchmarks show significant power savings and mitigated hotspots with similar performance compared to latest NoC architectures. The thesis concludes that careful codesigned elements from different network levels enable considerable power savings for many-core systems.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

With the shift towards many-core computer architectures, dataflow programming has been proposed as one potential solution for producing software that scales to a varying number of processor cores. Programming for parallel architectures is considered difficult as the current popular programming languages are inherently sequential and introducing parallelism is typically up to the programmer. Dataflow, however, is inherently parallel, describing an application as a directed graph, where nodes represent calculations and edges represent a data dependency in form of a queue. These queues are the only allowed communication between the nodes, making the dependencies between the nodes explicit and thereby also the parallelism. Once a node have the su cient inputs available, the node can, independently of any other node, perform calculations, consume inputs, and produce outputs. Data ow models have existed for several decades and have become popular for describing signal processing applications as the graph representation is a very natural representation within this eld. Digital lters are typically described with boxes and arrows also in textbooks. Data ow is also becoming more interesting in other domains, and in principle, any application working on an information stream ts the dataflow paradigm. Such applications are, among others, network protocols, cryptography, and multimedia applications. As an example, the MPEG group standardized a dataflow language called RVC-CAL to be use within reconfigurable video coding. Describing a video coder as a data ow network instead of with conventional programming languages, makes the coder more readable as it describes how the video dataflows through the different coding tools. While dataflow provides an intuitive representation for many applications, it also introduces some new problems that need to be solved in order for data ow to be more widely used. The explicit parallelism of a dataflow program is descriptive and enables an improved utilization of available processing units, however, the independent nodes also implies that some kind of scheduling is required. The need for efficient scheduling becomes even more evident when the number of nodes is larger than the number of processing units and several nodes are running concurrently on one processor core. There exist several data ow models of computation, with different trade-offs between expressiveness and analyzability. These vary from rather restricted but statically schedulable, with minimal scheduling overhead, to dynamic where each ring requires a ring rule to evaluated. The model used in this work, namely RVC-CAL, is a very expressive language, and in the general case it requires dynamic scheduling, however, the strong encapsulation of dataflow nodes enables analysis and the scheduling overhead can be reduced by using quasi-static, or piecewise static, scheduling techniques. The scheduling problem is concerned with nding the few scheduling decisions that must be run-time, while most decisions are pre-calculated. The result is then an, as small as possible, set of static schedules that are dynamically scheduled. To identify these dynamic decisions and to find the concrete schedules, this thesis shows how quasi-static scheduling can be represented as a model checking problem. This involves identifying the relevant information to generate a minimal but complete model to be used for model checking. The model must describe everything that may affect scheduling of the application while omitting everything else in order to avoid state space explosion. This kind of simplification is necessary to make the state space analysis feasible. For the model checker to nd the actual schedules, a set of scheduling strategies are de ned which are able to produce quasi-static schedulers for a wide range of applications. The results of this work show that actor composition with quasi-static scheduling can be used to transform data ow programs to t many different computer architecture with different type and number of cores. This in turn, enables dataflow to provide a more platform independent representation as one application can be fitted to a specific processor architecture without changing the actual program representation. Instead, the program representation is in the context of design space exploration optimized by the development tools to fit the target platform. This work focuses on representing the dataflow scheduling problem as a model checking problem and is implemented as part of a compiler infrastructure. The thesis also presents experimental results as evidence of the usefulness of the approach.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Tässä työssä on tutkittu modulaarisen aktiivimagneettilaakeroidun koelaitteen mekaanista suunnittelua ja analysointia. Suurnopeusroottorin suunnittelun teoria on esitelty. Lisäksi monia analyyttisiä mallinnusmenetelmiä mekaanisten kuormitusten mallintamiseksi on esitelty. Koska kyseessä on suurnopeussähkökone, roottoridynamiikka ja sen soveltuvuus suunnittelussa on esitelty. Magneettilaakerien rakenteeseen ja toimintaan on tutustuttu osana tätä työtä. Kirjallisuuskatsaus nykyisistä koelaitteista esimerkiksi komponenttien ominaisuuksien tunnistamiseen ja roottoridynamiikan tutkimuksiin on esitelty. Työn rajauksena on konseptisuunnittelu muunneltavalle magneettilaakeroidulle (AMB) koelaitteelle ja suunnitteluprosessin dokumentointi. Muunneltavuuteen päädyttiin, koska se mahdollistaa erilaisten komponenttiasetteluiden testaamisen erilaisille magneettilaakerikokoonpanoille ja roottoreille. Pääpaino tässä työssä on suurnopeus induktiokoneen roottorin suunnittelussa ja mallintamisessa. Modulaaristen toimilaitteiden kuten magneettilaakerien ja induktiosähkömoottorin rakenne on esitelty ja modulaarisen rakenteen käytettävyyden hyödyistä koelaitekäytössä on dokumentoitu. Analyyttisiä ja elementtimenetelmään perustuvia tutkimusmenetelmiä on käytetty tutkittaessa suunniteltua suurnopeusroottoria. Suunnittelun ja analysoinnin tulokset on esitelty ja verrattu keskenään eri mallinnusmenetelmien välillä. Lisäksi johtopäätökset sähkömagneettisten osien liittämisen monimutkaisuudesta ja vaatimuksista roottoriin ja toimilaitteisiin sekä mekaanisten että sähkömagneettisten ominaisuuksien optimoimiseksi on dokumentoitu.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis presents a novel design paradigm, called Virtual Runtime Application Partitions (VRAP), to judiciously utilize the on-chip resources. As the dark silicon era approaches, where the power considerations will allow only a fraction chip to be powered on, judicious resource management will become a key consideration in future designs. Most of the works on resource management treat only the physical components (i.e. computation, communication, and memory blocks) as resources and manipulate the component to application mapping to optimize various parameters (e.g. energy efficiency). To further enhance the optimization potential, in addition to the physical resources we propose to manipulate abstract resources (i.e. voltage/frequency operating point, the fault-tolerance strength, the degree of parallelism, and the configuration architecture). The proposed framework (i.e. VRAP) encapsulates methods, algorithms, and hardware blocks to provide each application with the abstract resources tailored to its needs. To test the efficacy of this concept, we have developed three distinct self adaptive environments: (i) Private Operating Environment (POE), (ii) Private Reliability Environment (PRE), and (iii) Private Configuration Environment (PCE) that collectively ensure that each application meets its deadlines using minimal platform resources. In this work several novel architectural enhancements, algorithms and policies are presented to realize the virtual runtime application partitions efficiently. Considering the future design trends, we have chosen Coarse Grained Reconfigurable Architectures (CGRAs) and Network on Chips (NoCs) to test the feasibility of our approach. Specifically, we have chosen Dynamically Reconfigurable Resource Array (DRRA) and McNoC as the representative CGRA and NoC platforms. The proposed techniques are compared and evaluated using a variety of quantitative experiments. Synthesis and simulation results demonstrate VRAP significantly enhances the energy and power efficiency compared to state of the art.