26 resultados para Software-reconfigurable array processing architectures

em AMS Tesi di Dottorato - Alm@DL - Università di Bologna


Relevância:

50.00% 50.00%

Publicador:

Resumo:

During the last few decades an unprecedented technological growth has been at the center of the embedded systems design paramount, with Moore’s Law being the leading factor of this trend. Today in fact an ever increasing number of cores can be integrated on the same die, marking the transition from state-of-the-art multi-core chips to the new many-core design paradigm. Despite the extraordinarily high computing power, the complexity of many-core chips opens the door to several challenges. As a result of the increased silicon density of modern Systems-on-a-Chip (SoC), the design space exploration needed to find the best design has exploded and hardware designers are in fact facing the problem of a huge design space. Virtual Platforms have always been used to enable hardware-software co-design, but today they are facing with the huge complexity of both hardware and software systems. In this thesis two different research works on Virtual Platforms are presented: the first one is intended for the hardware developer, to easily allow complex cycle accurate simulations of many-core SoCs. The second work exploits the parallel computing power of off-the-shelf General Purpose Graphics Processing Units (GPGPUs), with the goal of an increased simulation speed. The term Virtualization can be used in the context of many-core systems not only to refer to the aforementioned hardware emulation tools (Virtual Platforms), but also for two other main purposes: 1) to help the programmer to achieve the maximum possible performance of an application, by hiding the complexity of the underlying hardware. 2) to efficiently exploit the high parallel hardware of many-core chips in environments with multiple active Virtual Machines. This thesis is focused on virtualization techniques with the goal to mitigate, and overtake when possible, some of the challenges introduced by the many-core design paradigm.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This thesis deals with Context Aware Services, Smart Environments, Context Management and solutions for Devices and Service Interoperability. Multi-vendor devices offer an increasing number of services and end-user applications that base their value on the ability to exploit the information originating from the surrounding environment by means of an increasing number of embedded sensors, e.g. GPS, compass, RFID readers, cameras and so on. However, usually such devices are not able to exchange information because of the lack of a shared data storage and common information exchange methods. A large number of standards and domain specific building blocks are available and are heavily used in today's products. However, the use of these solutions based on ready-to-use modules is not without problems. The integration and cooperation of different kinds of modules can be daunting because of growing complexity and dependency. In this scenarios it might be interesting to have an infrastructure that makes the coexistence of multi-vendor devices easy, while enabling low cost development and smooth access to services. This sort of technologies glue should reduce both software and hardware integration costs by removing the trouble of interoperability. The result should also lead to faster and simplified design, development and, deployment of cross-domain applications. This thesis is mainly focused on SW architectures supporting context aware service providers especially on the following subjects: - user preferences service adaptation - context management - content management - information interoperability - multivendor device interoperability - communication and connectivity interoperability Experimental activities were carried out in several domains including Cultural Heritage, indoor and personal smart spaces – all of which are considered significant test-beds in Context Aware Computing. The work evolved within european and national projects: on the europen side, I carried out my research activity within EPOCH, the FP6 Network of Excellence on “Processing Open Cultural Heritage” and within SOFIA, a project of the ARTEMIS JU on embedded systems. I worked in cooperation with several international establishments, including the University of Kent, VTT (the Technical Reserarch Center of Finland) and Eurotech. On the national side I contributed to a one-to-one research contract between ARCES and Telecom Italia. The first part of the thesis is focused on problem statement and related work and addresses interoperability issues and related architecture components. The second part is focused on specific architectures and frameworks: - MobiComp: a context management framework that I used in cultural heritage applications - CAB: a context, preference and profile based application broker which I designed within EPOCH Network of Excellence - M3: "Semantic Web based" information sharing infrastructure for smart spaces designed by Nokia within the European project SOFIA - NoTa: a service and transport independent connectivity framework - OSGi: the well known Java based service support framework The final section is dedicated to the middleware, the tools and, the SW agents developed during my Doctorate time to support context-aware services in smart environments.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Recently in most of the industrial automation process an ever increasing degree of automation has been observed. This increasing is motivated by the higher requirement of systems with great performance in terms of quality of products/services generated, productivity, efficiency and low costs in the design, realization and maintenance. This trend in the growth of complex automation systems is rapidly spreading over automated manufacturing systems (AMS), where the integration of the mechanical and electronic technology, typical of the Mechatronics, is merging with other technologies such as Informatics and the communication networks. An AMS is a very complex system that can be thought constituted by a set of flexible working stations, one or more transportation systems. To understand how this machine are important in our society let considerate that every day most of us use bottles of water or soda, buy product in box like food or cigarets and so on. Another important consideration from its complexity derive from the fact that the the consortium of machine producers has estimated around 350 types of manufacturing machine. A large number of manufacturing machine industry are presented in Italy and notably packaging machine industry,in particular a great concentration of this kind of industry is located in Bologna area; for this reason the Bologna area is called “packaging valley”. Usually, the various parts of the AMS interact among them in a concurrent and asynchronous way, and coordinate the parts of the machine to obtain a desiderated overall behaviour is an hard task. Often, this is the case in large scale systems, organized in a modular and distributed manner. Even if the success of a modern AMS from a functional and behavioural point of view is still to attribute to the design choices operated in the definition of the mechanical structure and electrical electronic architecture, the system that governs the control of the plant is becoming crucial, because of the large number of duties associated to it. Apart from the activity inherent to the automation of themachine cycles, the supervisory system is called to perform other main functions such as: emulating the behaviour of traditional mechanical members thus allowing a drastic constructive simplification of the machine and a crucial functional flexibility; dynamically adapting the control strategies according to the different productive needs and to the different operational scenarios; obtaining a high quality of the final product through the verification of the correctness of the processing; addressing the operator devoted to themachine to promptly and carefully take the actions devoted to establish or restore the optimal operating conditions; managing in real time information on diagnostics, as a support of the maintenance operations of the machine. The kind of facilities that designers can directly find on themarket, in terms of software component libraries provides in fact an adequate support as regard the implementation of either top-level or bottom-level functionalities, typically pertaining to the domains of user-friendly HMIs, closed-loop regulation and motion control, fieldbus-based interconnection of remote smart devices. What is still lacking is a reference framework comprising a comprehensive set of highly reusable logic control components that, focussing on the cross-cutting functionalities characterizing the automation domain, may help the designers in the process of modelling and structuring their applications according to the specific needs. Historically, the design and verification process for complex automated industrial systems is performed in empirical way, without a clear distinction between functional and technological-implementation concepts and without a systematic method to organically deal with the complete system. Traditionally, in the field of analog and digital control design and verification through formal and simulation tools have been adopted since a long time ago, at least for multivariable and/or nonlinear controllers for complex time-driven dynamics as in the fields of vehicles, aircrafts, robots, electric drives and complex power electronics equipments. Moving to the field of logic control, typical for industrial manufacturing automation, the design and verification process is approached in a completely different way, usually very “unstructured”. No clear distinction between functions and implementations, between functional architectures and technological architectures and platforms is considered. Probably this difference is due to the different “dynamical framework”of logic control with respect to analog/digital control. As a matter of facts, in logic control discrete-events dynamics replace time-driven dynamics; hence most of the formal and mathematical tools of analog/digital control cannot be directly migrated to logic control to enlighten the distinction between functions and implementations. In addition, in the common view of application technicians, logic control design is strictly connected to the adopted implementation technology (relays in the past, software nowadays), leading again to a deep confusion among functional view and technological view. In Industrial automation software engineering, concepts as modularity, encapsulation, composability and reusability are strongly emphasized and profitably realized in the so-calledobject-oriented methodologies. Industrial automation is receiving lately this approach, as testified by some IEC standards IEC 611313, IEC 61499 which have been considered in commercial products only recently. On the other hand, in the scientific and technical literature many contributions have been already proposed to establish a suitable modelling framework for industrial automation. During last years it was possible to note a considerable growth in the exploitation of innovative concepts and technologies from ICT world in industrial automation systems. For what concerns the logic control design, Model Based Design (MBD) is being imported in industrial automation from software engineering field. Another key-point in industrial automated systems is the growth of requirements in terms of availability, reliability and safety for technological systems. In other words, the control system should not only deal with the nominal behaviour, but should also deal with other important duties, such as diagnosis and faults isolations, recovery and safety management. Indeed, together with high performance, in complex systems fault occurrences increase. This is a consequence of the fact that, as it typically occurs in reliable mechatronic systems, in complex systems such as AMS, together with reliable mechanical elements, an increasing number of electronic devices are also present, that are more vulnerable by their own nature. The diagnosis problem and the faults isolation in a generic dynamical system consists in the design of an elaboration unit that, appropriately processing the inputs and outputs of the dynamical system, is also capable of detecting incipient faults on the plant devices, reconfiguring the control system so as to guarantee satisfactory performance. The designer should be able to formally verify the product, certifying that, in its final implementation, it will perform itsrequired function guarantying the desired level of reliability and safety; the next step is that of preventing faults and eventually reconfiguring the control system so that faults are tolerated. On this topic an important improvement to formal verification of logic control, fault diagnosis and fault tolerant control results derive from Discrete Event Systems theory. The aimof this work is to define a design pattern and a control architecture to help the designer of control logic in industrial automated systems. The work starts with a brief discussion on main characteristics and description of industrial automated systems on Chapter 1. In Chapter 2 a survey on the state of the software engineering paradigm applied to industrial automation is discussed. Chapter 3 presentes a architecture for industrial automated systems based on the new concept of Generalized Actuator showing its benefits, while in Chapter 4 this architecture is refined using a novel entity, the Generalized Device in order to have a better reusability and modularity of the control logic. In Chapter 5 a new approach will be present based on Discrete Event Systems for the problemof software formal verification and an active fault tolerant control architecture using online diagnostic. Finally conclusive remarks and some ideas on new directions to explore are given. In Appendix A are briefly reported some concepts and results about Discrete Event Systems which should help the reader in understanding some crucial points in chapter 5; while in Appendix B an overview on the experimental testbed of the Laboratory of Automation of University of Bologna, is reported to validated the approach presented in chapter 3, chapter 4 and chapter 5. In Appendix C some components model used in chapter 5 for formal verification are reported.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

I moderni sistemi embedded sono equipaggiati con risorse hardware che consentono l’esecuzione di applicazioni molto complesse come il decoding audio e video. La progettazione di simili sistemi deve soddisfare due esigenze opposte. Da un lato è necessario fornire un elevato potenziale computazionale, dall’altro bisogna rispettare dei vincoli stringenti riguardo il consumo di energia. Uno dei trend più diffusi per rispondere a queste esigenze opposte è quello di integrare su uno stesso chip un numero elevato di processori caratterizzati da un design semplificato e da bassi consumi. Tuttavia, per sfruttare effettivamente il potenziale computazionale offerto da una batteria di processoriè necessario rivisitare pesantemente le metodologie di sviluppo delle applicazioni. Con l’avvento dei sistemi multi-processore su singolo chip (MPSoC) il parallel programming si è diffuso largamente anche in ambito embedded. Tuttavia, i progressi nel campo della programmazione parallela non hanno mantenuto il passo con la capacità di integrare hardware parallelo su un singolo chip. Oltre all’introduzione di multipli processori, la necessità di ridurre i consumi degli MPSoC comporta altre soluzioni architetturali che hanno l’effetto diretto di complicare lo sviluppo delle applicazioni. Il design del sottosistema di memoria, in particolare, è un problema critico. Integrare sul chip dei banchi di memoria consente dei tempi d’accesso molto brevi e dei consumi molto contenuti. Sfortunatamente, la quantità di memoria on-chip che può essere integrata in un MPSoC è molto limitata. Per questo motivo è necessario aggiungere dei banchi di memoria off-chip, che hanno una capacità molto maggiore, come maggiori sono i consumi e i tempi d’accesso. La maggior parte degli MPSoC attualmente in commercio destina una parte del budget di area all’implementazione di memorie cache e/o scratchpad. Le scratchpad (SPM) sono spesso preferite alle cache nei sistemi MPSoC embedded, per motivi di maggiore predicibilità, minore occupazione d’area e – soprattutto – minori consumi. Per contro, mentre l’uso delle cache è completamente trasparente al programmatore, le SPM devono essere esplicitamente gestite dall’applicazione. Esporre l’organizzazione della gerarchia di memoria ll’applicazione consente di sfruttarne in maniera efficiente i vantaggi (ridotti tempi d’accesso e consumi). Per contro, per ottenere questi benefici è necessario scrivere le applicazioni in maniera tale che i dati vengano partizionati e allocati sulle varie memorie in maniera opportuna. L’onere di questo compito complesso ricade ovviamente sul programmatore. Questo scenario descrive bene l’esigenza di modelli di programmazione e strumenti di supporto che semplifichino lo sviluppo di applicazioni parallele. In questa tesi viene presentato un framework per lo sviluppo di software per MPSoC embedded basato su OpenMP. OpenMP è uno standard di fatto per la programmazione di multiprocessori con memoria shared, caratterizzato da un semplice approccio alla parallelizzazione tramite annotazioni (direttive per il compilatore). La sua interfaccia di programmazione consente di esprimere in maniera naturale e molto efficiente il parallelismo a livello di loop, molto diffuso tra le applicazioni embedded di tipo signal processing e multimedia. OpenMP costituisce un ottimo punto di partenza per la definizione di un modello di programmazione per MPSoC, soprattutto per la sua semplicità d’uso. D’altra parte, per sfruttare in maniera efficiente il potenziale computazionale di un MPSoC è necessario rivisitare profondamente l’implementazione del supporto OpenMP sia nel compilatore che nell’ambiente di supporto a runtime. Tutti i costrutti per gestire il parallelismo, la suddivisione del lavoro e la sincronizzazione inter-processore comportano un costo in termini di overhead che deve essere minimizzato per non comprometterre i vantaggi della parallelizzazione. Questo può essere ottenuto soltanto tramite una accurata analisi delle caratteristiche hardware e l’individuazione dei potenziali colli di bottiglia nell’architettura. Una implementazione del task management, della sincronizzazione a barriera e della condivisione dei dati che sfrutti efficientemente le risorse hardware consente di ottenere elevate performance e scalabilità. La condivisione dei dati, nel modello OpenMP, merita particolare attenzione. In un modello a memoria condivisa le strutture dati (array, matrici) accedute dal programma sono fisicamente allocate su una unica risorsa di memoria raggiungibile da tutti i processori. Al crescere del numero di processori in un sistema, l’accesso concorrente ad una singola risorsa di memoria costituisce un evidente collo di bottiglia. Per alleviare la pressione sulle memorie e sul sistema di connessione vengono da noi studiate e proposte delle tecniche di partizionamento delle strutture dati. Queste tecniche richiedono che una singola entità di tipo array venga trattata nel programma come l’insieme di tanti sotto-array, ciascuno dei quali può essere fisicamente allocato su una risorsa di memoria differente. Dal punto di vista del programma, indirizzare un array partizionato richiede che ad ogni accesso vengano eseguite delle istruzioni per ri-calcolare l’indirizzo fisico di destinazione. Questo è chiaramente un compito lungo, complesso e soggetto ad errori. Per questo motivo, le nostre tecniche di partizionamento sono state integrate nella l’interfaccia di programmazione di OpenMP, che è stata significativamente estesa. Specificamente, delle nuove direttive e clausole consentono al programmatore di annotare i dati di tipo array che si vuole partizionare e allocare in maniera distribuita sulla gerarchia di memoria. Sono stati inoltre sviluppati degli strumenti di supporto che consentono di raccogliere informazioni di profiling sul pattern di accesso agli array. Queste informazioni vengono sfruttate dal nostro compilatore per allocare le partizioni sulle varie risorse di memoria rispettando una relazione di affinità tra il task e i dati. Più precisamente, i passi di allocazione nel nostro compilatore assegnano una determinata partizione alla memoria scratchpad locale al processore che ospita il task che effettua il numero maggiore di accessi alla stessa.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This thesis deals with heterogeneous architectures in standard workstations. Heterogeneous architectures represent an appealing alternative to traditional supercomputers because they are based on commodity components fabricated in large quantities. Hence their price-performance ratio is unparalleled in the world of high performance computing (HPC). In particular, different aspects related to the performance and consumption of heterogeneous architectures have been explored. The thesis initially focuses on an efficient implementation of a parallel application, where the execution time is dominated by an high number of floating point instructions. Then the thesis touches the central problem of efficient management of power peaks in heterogeneous computing systems. Finally it discusses a memory-bounded problem, where the execution time is dominated by the memory latency. Specifically, the following main contributions have been carried out: A novel framework for the design and analysis of solar field for Central Receiver Systems (CRS) has been developed. The implementation based on desktop workstation equipped with multiple Graphics Processing Units (GPUs) is motivated by the need to have an accurate and fast simulation environment for studying mirror imperfection and non-planar geometries. Secondly, a power-aware scheduling algorithm on heterogeneous CPU-GPU architectures, based on an efficient distribution of the computing workload to the resources, has been realized. The scheduler manages the resources of several computing nodes with a view to reducing the peak power. The two main contributions of this work follow: the approach reduces the supply cost due to high peak power whilst having negligible impact on the parallelism of computational nodes. from another point of view the developed model allows designer to increase the number of cores without increasing the capacity of the power supply unit. Finally, an implementation for efficient graph exploration on reconfigurable architectures is presented. The purpose is to accelerate graph exploration, reducing the number of random memory accesses.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This Thesis wants to highlight the importance of ad-hoc designed and developed embedded systems in the implementation of intelligent sensor networks. As evidence four areas of application are presented: Precision Agriculture, Bioengineering, Automotive and Structural Health Monitoring. For each field is reported one, or more, smart device design and developing, in addition to on-board elaborations, experimental validation and in field tests. In particular, it is presented the design and development of a fruit meter. In the bioengineering field, three different projects are reported, detailing the architectures implemented and the validation tests conducted. Two prototype realizations of an inner temperature measurement system in electric motors for an automotive application are then discussed. Lastly, the HW/SW design of a Smart Sensor Network is analyzed: the network features on-board data management and processing, integration in an IoT toolchain, Wireless Sensor Network developments and an AI framework for vibration-based structural assessment.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Providing support for multimedia applications on low-power mobile devices remains a significant research challenge. This is primarily due to two reasons: • Portable mobile devices have modest sizes and weights, and therefore inadequate resources, low CPU processing power, reduced display capabilities, limited memory and battery lifetimes as compared to desktop and laptop systems. • On the other hand, multimedia applications tend to have distinctive QoS and processing requirementswhichmake themextremely resource-demanding. This innate conflict introduces key research challenges in the design of multimedia applications and device-level power optimization. Energy efficiency in this kind of platforms can be achieved only via a synergistic hardware and software approach. In fact, while System-on-Chips are more and more programmable thus providing functional flexibility, hardwareonly power reduction techniques cannot maintain consumption under acceptable bounds. It is well understood both in research and industry that system configuration andmanagement cannot be controlled efficiently only relying on low-level firmware and hardware drivers. In fact, at this level there is lack of information about user application activity and consequently about the impact of power management decision on QoS. Even though operating system support and integration is a requirement for effective performance and energy management, more effective and QoSsensitive power management is possible if power awareness and hardware configuration control strategies are tightly integratedwith domain-specificmiddleware services. The main objective of this PhD research has been the exploration and the integration of amiddleware-centric energymanagement with applications and operating-system. We choose to focus on the CPU-memory and the video subsystems, since they are the most power-hungry components of an embedded system. A second main objective has been the definition and implementation of software facilities (like toolkits, API, and run-time engines) in order to improve programmability and performance efficiency of such platforms. Enhancing energy efficiency and programmability ofmodernMulti-Processor System-on-Chips (MPSoCs) Consumer applications are characterized by tight time-to-market constraints and extreme cost sensitivity. The software that runs on modern embedded systems must be high performance, real time, and even more important low power. Although much progress has been made on these problems, much remains to be done. Multi-processor System-on-Chip (MPSoC) are increasingly popular platforms for high performance embedded applications. This leads to interesting challenges in software development since efficient software development is a major issue for MPSoc designers. An important step in deploying applications on multiprocessors is to allocate and schedule concurrent tasks to the processing and communication resources of the platform. The problem of allocating and scheduling precedenceconstrained tasks on processors in a distributed real-time system is NP-hard. There is a clear need for deployment technology that addresses thesemulti processing issues. This problem can be tackled by means of specific middleware which takes care of allocating and scheduling tasks on the different processing elements and which tries also to optimize the power consumption of the entire multiprocessor platform. This dissertation is an attempt to develop insight into efficient, flexible and optimalmethods for allocating and scheduling concurrent applications tomultiprocessor architectures. It is a well-known problem in literature: this kind of optimization problems are very complex even in much simplified variants, therefore most authors propose simplified models and heuristic approaches to solve it in reasonable time. Model simplification is often achieved by abstracting away platform implementation ”details”. As a result, optimization problems become more tractable, even reaching polynomial time complexity. Unfortunately, this approach creates an abstraction gap between the optimization model and the real HW-SW platform. The main issue with heuristic or, more in general, with incomplete search is that they introduce an optimality gap of unknown size. They provide very limited or no information on the distance between the best computed solution and the optimal one. The goal of this work is to address both abstraction and optimality gaps, formulating accurate models which accounts for a number of ”non-idealities” in real-life hardware platforms, developing novel mapping algorithms that deterministically find optimal solutions, and implementing software infrastructures required by developers to deploy applications for the targetMPSoC platforms. Energy Efficient LCDBacklightAutoregulation on Real-LifeMultimediaAp- plication Processor Despite the ever increasing advances in Liquid Crystal Display’s (LCD) technology, their power consumption is still one of the major limitations to the battery life of mobile appliances such as smart phones, portable media players, gaming and navigation devices. There is a clear trend towards the increase of LCD size to exploit the multimedia capabilities of portable devices that can receive and render high definition video and pictures. Multimedia applications running on these devices require LCD screen sizes of 2.2 to 3.5 inches andmore to display video sequences and pictures with the required quality. LCD power consumption is dependent on the backlight and pixel matrix driving circuits and is typically proportional to the panel area. As a result, the contribution is also likely to be considerable in future mobile appliances. To address this issue, companies are proposing low power technologies suitable for mobile applications supporting low power states and image control techniques. On the research side, several power saving schemes and algorithms can be found in literature. Some of them exploit software-only techniques to change the image content to reduce the power associated with the crystal polarization, some others are aimed at decreasing the backlight level while compensating the luminance reduction by compensating the user perceived quality degradation using pixel-by-pixel image processing algorithms. The major limitation of these techniques is that they rely on the CPU to perform pixel-based manipulations and their impact on CPU utilization and power consumption has not been assessed. This PhDdissertation shows an alternative approach that exploits in a smart and efficient way the hardware image processing unit almost integrated in every current multimedia application processors to implement a hardware assisted image compensation that allows dynamic scaling of the backlight with a negligible impact on QoS. The proposed approach overcomes CPU-intensive techniques by saving system power without requiring either a dedicated display technology or hardware modification. Thesis Overview The remainder of the thesis is organized as follows. The first part is focused on enhancing energy efficiency and programmability of modern Multi-Processor System-on-Chips (MPSoCs). Chapter 2 gives an overview about architectural trends in embedded systems, illustrating the principal features of new technologies and the key challenges still open. Chapter 3 presents a QoS-driven methodology for optimal allocation and frequency selection for MPSoCs. The methodology is based on functional simulation and full system power estimation. Chapter 4 targets allocation and scheduling of pipelined stream-oriented applications on top of distributed memory architectures with messaging support. We tackled the complexity of the problem by means of decomposition and no-good generation, and prove the increased computational efficiency of this approach with respect to traditional ones. Chapter 5 presents a cooperative framework to solve the allocation, scheduling and voltage/frequency selection problem to optimality for energyefficient MPSoCs, while in Chapter 6 applications with conditional task graph are taken into account. Finally Chapter 7 proposes a complete framework, called Cellflow, to help programmers in efficient software implementation on a real architecture, the Cell Broadband Engine processor. The second part is focused on energy efficient software techniques for LCD displays. Chapter 8 gives an overview about portable device display technologies, illustrating the principal features of LCD video systems and the key challenges still open. Chapter 9 shows several energy efficient software techniques present in literature, while Chapter 10 illustrates in details our method for saving significant power in an LCD panel. Finally, conclusions are drawn, reporting the main research contributions that have been discussed throughout this dissertation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The need for high bandwidth, due to the explosion of new multi\-media-oriented IP-based services, as well as increasing broadband access requirements is leading to the need of flexible and highly reconfigurable optical networks. While transmission bandwidth does not represent a limit due to the huge bandwidth provided by optical fibers and Dense Wavelength Division Multiplexing (DWDM) technology, the electronic switching nodes in the core of the network represent the bottleneck in terms of speed and capacity for the overall network. For this reason DWDM technology must be exploited not only for data transport but also for switching operations. In this Ph.D. thesis solutions for photonic packet switches, a flexible alternative with respect to circuit-switched optical networks are proposed. In particular solutions based on devices and components that are expected to mature in the near future are proposed, with the aim to limit the employment of complex components. The work presented here is the result of part of the research activities performed by the Networks Research Group at the Department of Electronics, Computer Science and Systems (DEIS) of the University of Bologna, Italy. In particular, the work on optical packet switching has been carried on within three relevant research projects: the e-Photon/ONe and e-Photon/ONe+ projects, funded by the European Union in the Sixth Framework Programme, and the national project OSATE funded by the Italian Ministry of Education, University and Scientific Research. The rest of the work is organized as follows. Chapter 1 gives a brief introduction to network context and contention resolution in photonic packet switches. Chapter 2 presents different strategies for contention resolution in wavelength domain. Chapter 3 illustrates a possible implementation of one of the schemes proposed in chapter 2. Then, chapter 4 presents multi-fiber switches, which employ jointly wavelength and space domains to solve contention. Chapter 5 shows buffered switches, to solve contention in time domain besides wavelength domain. Finally chapter 6 presents a cost model to compare different switch architectures in terms of cost.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The miniaturization race in the hardware industry aiming at continuous increasing of transistor density on a die does not bring respective application performance improvements any more. One of the most promising alternatives is to exploit a heterogeneous nature of common applications in hardware. Supported by reconfigurable computation, which has already proved its efficiency in accelerating data intensive applications, this concept promises a breakthrough in contemporary technology development. Memory organization in such heterogeneous reconfigurable architectures becomes very critical. Two primary aspects introduce a sophisticated trade-off. On the one hand, a memory subsystem should provide well organized distributed data structure and guarantee the required data bandwidth. On the other hand, it should hide the heterogeneous hardware structure from the end-user, in order to support feasible high-level programmability of the system. This thesis work explores the heterogeneous reconfigurable hardware architectures and presents possible solutions to cope the problem of memory organization and data structure. By the example of the MORPHEUS heterogeneous platform, the discussion follows the complete design cycle, starting from decision making and justification, until hardware realization. Particular emphasis is made on the methods to support high system performance, meet application requirements, and provide a user-friendly programmer interface. As a result, the research introduces a complete heterogeneous platform enhanced with a hierarchical memory organization, which copes with its task by means of separating computation from communication, providing reconfigurable engines with computation and configuration data, and unification of heterogeneous computational devices using local storage buffers. It is distinguished from the related solutions by distributed data-flow organization, specifically engineered mechanisms to operate with data on local domains, particular communication infrastructure based on Network-on-Chip, and thorough methods to prevent computation and communication stalls. In addition, a novel advanced technique to accelerate memory access was developed and implemented.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This work describes the development of a simulation tool which allows the simulation of the Internal Combustion Engine (ICE), the transmission and the vehicle dynamics. It is a control oriented simulation tool, designed in order to perform both off-line (Software In the Loop) and on-line (Hardware In the Loop) simulation. In the first case the simulation tool can be used in order to optimize Engine Control Unit strategies (as far as regard, for example, the fuel consumption or the performance of the engine), while in the second case it can be used in order to test the control system. In recent years the use of HIL simulations has proved to be very useful in developing and testing of control systems. Hardware In the Loop simulation is a technology where the actual vehicles, engines or other components are replaced by a real time simulation, based on a mathematical model and running in a real time processor. The processor reads ECU (Engine Control Unit) output signals which would normally feed the actuators and, by using mathematical models, provides the signals which would be produced by the actual sensors. The simulation tool, fully designed within Simulink, includes the possibility to simulate the only engine, the transmission and vehicle dynamics and the engine along with the vehicle and transmission dynamics, allowing in this case to evaluate the performance and the operating conditions of the Internal Combustion Engine, once it is installed on a given vehicle. Furthermore the simulation tool includes different level of complexity, since it is possible to use, for example, either a zero-dimensional or a one-dimensional model of the intake system (in this case only for off-line application, because of the higher computational effort). Given these preliminary remarks, an important goal of this work is the development of a simulation environment that can be easily adapted to different engine types (single- or multi-cylinder, four-stroke or two-stroke, diesel or gasoline) and transmission architecture without reprogramming. Also, the same simulation tool can be rapidly configured both for off-line and real-time application. The Matlab-Simulink environment has been adopted to achieve such objectives, since its graphical programming interface allows building flexible and reconfigurable models, and real-time simulation is possible with standard, off-the-shelf software and hardware platforms (such as dSPACE systems).

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The term Ambient Intelligence (AmI) refers to a vision on the future of the information society where smart, electronic environment are sensitive and responsive to the presence of people and their activities (Context awareness). In an ambient intelligence world, devices work in concert to support people in carrying out their everyday life activities, tasks and rituals in an easy, natural way using information and intelligence that is hidden in the network connecting these devices. This promotes the creation of pervasive environments improving the quality of life of the occupants and enhancing the human experience. AmI stems from the convergence of three key technologies: ubiquitous computing, ubiquitous communication and natural interfaces. Ambient intelligent systems are heterogeneous and require an excellent cooperation between several hardware/software technologies and disciplines, including signal processing, networking and protocols, embedded systems, information management, and distributed algorithms. Since a large amount of fixed and mobile sensors embedded is deployed into the environment, the Wireless Sensor Networks is one of the most relevant enabling technologies for AmI. WSN are complex systems made up of a number of sensor nodes which can be deployed in a target area to sense physical phenomena and communicate with other nodes and base stations. These simple devices typically embed a low power computational unit (microcontrollers, FPGAs etc.), a wireless communication unit, one or more sensors and a some form of energy supply (either batteries or energy scavenger modules). WNS promises of revolutionizing the interactions between the real physical worlds and human beings. Low-cost, low-computational power, low energy consumption and small size are characteristics that must be taken into consideration when designing and dealing with WSNs. To fully exploit the potential of distributed sensing approaches, a set of challengesmust be addressed. Sensor nodes are inherently resource-constrained systems with very low power consumption and small size requirements which enables than to reduce the interference on the physical phenomena sensed and to allow easy and low-cost deployment. They have limited processing speed,storage capacity and communication bandwidth that must be efficiently used to increase the degree of local ”understanding” of the observed phenomena. A particular case of sensor nodes are video sensors. This topic holds strong interest for a wide range of contexts such as military, security, robotics and most recently consumer applications. Vision sensors are extremely effective for medium to long-range sensing because vision provides rich information to human operators. However, image sensors generate a huge amount of data, whichmust be heavily processed before it is transmitted due to the scarce bandwidth capability of radio interfaces. In particular, in video-surveillance, it has been shown that source-side compression is mandatory due to limited bandwidth and delay constraints. Moreover, there is an ample opportunity for performing higher-level processing functions, such as object recognition that has the potential to drastically reduce the required bandwidth (e.g. by transmitting compressed images only when something ‘interesting‘ is detected). The energy cost of image processing must however be carefully minimized. Imaging could play and plays an important role in sensing devices for ambient intelligence. Computer vision can for instance be used for recognising persons and objects and recognising behaviour such as illness and rioting. Having a wireless camera as a camera mote opens the way for distributed scene analysis. More eyes see more than one and a camera system that can observe a scene from multiple directions would be able to overcome occlusion problems and could describe objects in their true 3D appearance. In real-time, these approaches are a recently opened field of research. In this thesis we pay attention to the realities of hardware/software technologies and the design needed to realize systems for distributed monitoring, attempting to propose solutions on open issues and filling the gap between AmI scenarios and hardware reality. The physical implementation of an individual wireless node is constrained by three important metrics which are outlined below. Despite that the design of the sensor network and its sensor nodes is strictly application dependent, a number of constraints should almost always be considered. Among them: • Small form factor to reduce nodes intrusiveness. • Low power consumption to reduce battery size and to extend nodes lifetime. • Low cost for a widespread diffusion. These limitations typically result in the adoption of low power, low cost devices such as low powermicrocontrollers with few kilobytes of RAMand tenth of kilobytes of program memory with whomonly simple data processing algorithms can be implemented. However the overall computational power of the WNS can be very large since the network presents a high degree of parallelism that can be exploited through the adoption of ad-hoc techniques. Furthermore through the fusion of information from the dense mesh of sensors even complex phenomena can be monitored. In this dissertation we present our results in building several AmI applications suitable for a WSN implementation. The work can be divided into two main areas:Low Power Video Sensor Node and Video Processing Alghoritm and Multimodal Surveillance . Low Power Video Sensor Nodes and Video Processing Alghoritms In comparison to scalar sensors, such as temperature, pressure, humidity, velocity, and acceleration sensors, vision sensors generate much higher bandwidth data due to the two-dimensional nature of their pixel array. We have tackled all the constraints listed above and have proposed solutions to overcome the current WSNlimits for Video sensor node. We have designed and developed wireless video sensor nodes focusing on the small size and the flexibility of reuse in different applications. The video nodes target a different design point: the portability (on-board power supply, wireless communication), a scanty power budget (500mW),while still providing a prominent level of intelligence, namely sophisticated classification algorithmand high level of reconfigurability. We developed two different video sensor node: The device architecture of the first one is based on a low-cost low-power FPGA+microcontroller system-on-chip. The second one is based on ARM9 processor. Both systems designed within the above mentioned power envelope could operate in a continuous fashion with Li-Polymer battery pack and solar panel. Novel low power low cost video sensor nodes which, in contrast to sensors that just watch the world, are capable of comprehending the perceived information in order to interpret it locally, are presented. Featuring such intelligence, these nodes would be able to cope with such tasks as recognition of unattended bags in airports, persons carrying potentially dangerous objects, etc.,which normally require a human operator. Vision algorithms for object detection, acquisition like human detection with Support Vector Machine (SVM) classification and abandoned/removed object detection are implemented, described and illustrated on real world data. Multimodal surveillance: In several setup the use of wired video cameras may not be possible. For this reason building an energy efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network community. Energy efficiency for wireless smart camera networks is one of the major efforts in distributed monitoring and surveillance community. For this reason, building an energy efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network community. The Pyroelectric Infra-Red (PIR) sensors have been used to extend the lifetime of a solar-powered video sensor node by providing an energy level dependent trigger to the video camera and the wireless module. Such approach has shown to be able to extend node lifetime and possibly result in continuous operation of the node.Being low-cost, passive (thus low-power) and presenting a limited form factor, PIR sensors are well suited for WSN applications. Moreover techniques to have aggressive power management policies are essential for achieving long-termoperating on standalone distributed cameras needed to improve the power consumption. We have used an adaptive controller like Model Predictive Control (MPC) to help the system to improve the performances outperforming naive power management policies.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis explores the capabilities of heterogeneous multi-core systems, based on multiple Graphics Processing Units (GPUs) in a standard desktop framework. Multi-GPU accelerated desk side computers are an appealing alternative to other high performance computing (HPC) systems: being composed of commodity hardware components fabricated in large quantities, their price-performance ratio is unparalleled in the world of high performance computing. Essentially bringing “supercomputing to the masses”, this opens up new possibilities for application fields where investing in HPC resources had been considered unfeasible before. One of these is the field of bioelectrical imaging, a class of medical imaging technologies that occupy a low-cost niche next to million-dollar systems like functional Magnetic Resonance Imaging (fMRI). In the scope of this work, several computational challenges encountered in bioelectrical imaging are tackled with this new kind of computing resource, striving to help these methods approach their true potential. Specifically, the following main contributions were made: Firstly, a novel dual-GPU implementation of parallel triangular matrix inversion (TMI) is presented, addressing an crucial kernel in computation of multi-mesh head models of encephalographic (EEG) source localization. This includes not only a highly efficient implementation of the routine itself achieving excellent speedups versus an optimized CPU implementation, but also a novel GPU-friendly compressed storage scheme for triangular matrices. Secondly, a scalable multi-GPU solver for non-hermitian linear systems was implemented. It is integrated into a simulation environment for electrical impedance tomography (EIT) that requires frequent solution of complex systems with millions of unknowns, a task that this solution can perform within seconds. In terms of computational throughput, it outperforms not only an highly optimized multi-CPU reference, but related GPU-based work as well. Finally, a GPU-accelerated graphical EEG real-time source localization software was implemented. Thanks to acceleration, it can meet real-time requirements in unpreceeded anatomical detail running more complex localization algorithms. Additionally, a novel implementation to extract anatomical priors from static Magnetic Resonance (MR) scansions has been included.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Many industries and academic institutions share the vision that an appropriate use of information originated from the environment may add value to services in multiple domains and may help humans in dealing with the growing information overload which often seems to jeopardize our life. It is also clear that information sharing and mutual understanding between software agents may impact complex processes where many actors (humans and machines) are involved, leading to relevant socioeconomic benefits. Starting from these two input, architectural and technological solutions to enable “environment-related cooperative digital services” are here explored. The proposed analysis starts from the consideration that our environment is physical space and here diversity is a major value. On the other side diversity is detrimental to common technological solutions, and it is an obstacle to mutual understanding. An appropriate environment abstraction and a shared information model are needed to provide the required levels of interoperability in our heterogeneous habitat. This thesis reviews several approaches to support environment related applications and intends to demonstrate that smart-space-based, ontology-driven, information-sharing platforms may become a flexible and powerful solution to support interoperable services in virtually any domain and even in cross-domain scenarios. It also shows that semantic technologies can be fruitfully applied not only to represent application domain knowledge. For example semantic modeling of Human-Computer Interaction may support interaction interoperability and transformation of interaction primitives into actions, and the thesis shows how smart-space-based platforms driven by an interaction ontology may enable natural ad flexible ways of accessing resources and services, e.g, with gestures. An ontology for computational flow execution has also been built to represent abstract computation, with the goal of exploring new ways of scheduling computation flows with smart-space-based semantic platforms.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Most electronic systems can be described in a very simplified way as an assemblage of analog and digital components put all together in order to perform a certain function. Nowadays, there is an increasing tendency to reduce the analog components, and to replace them by operations performed in the digital domain. This tendency has led to the emergence of new electronic systems that are more flexible, cheaper and robust. However, no matter the amount of digital process implemented, there will be always an analog part to be sorted out and thus, the step of converting digital signals into analog signals and vice versa cannot be avoided. This conversion can be more or less complex depending on the characteristics of the signals. Thus, even if it is desirable to replace functions carried out by analog components by digital processes, it is equally important to do so in a way that simplifies the conversion from digital to analog signals and vice versa. In the present thesis, we have study strategies based on increasing the amount of processing in the digital domain in such a way that the implementation of analog hardware stages can be simplified. To this aim, we have proposed the use of very low quantized signals, i.e. 1-bit, for the acquisition and for the generation of particular classes of signals.