42 resultados para fault-tolerant techniques
em Instituto Politécnico do Porto, Portugal
Resumo:
To increase the amount of logic available to the users in SRAM-based FPGAs, manufacturers are using nanometric technologies to boost logic density and reduce costs, making its use more attractive. However, these technological improvements also make FPGAs particularly vulnerable to configuration memory bit-flips caused by power fluctuations, strong electromagnetic fields and radiation. This issue is particularly sensitive because of the increasing amount of configuration memory cells needed to define their functionality. A short survey of the most recent publications is presented to support the options assumed during the definition of a framework for implementing circuits immune to bit-flips induction mechanisms in memory cells, based on a customized redundant infrastructure and on a detection-and-fix controller.
Resumo:
This paper presents an architecture (Multi-μ) being implemented to study and develop software based fault tolerant mechanisms for Real-Time Systems, using the Ada language (Ada 95) and Commercial Off-The-Shelf (COTS) components. Several issues regarding fault tolerance are presented and mechanisms to achieve fault tolerance by software active replication in Ada 95 are discussed. The Multi-μ architecture, based on a specifically proposed Fault Tolerance Manager (FTManager), is then described. Finally, some considerations are made about the work being done and essential future developments.
Resumo:
On-chip debug (OCD) features are frequently available in modern microprocessors. Their contribution to shorten the time-to-market justifies the industry investment in this area, where a number of competing or complementary proposals are available or under development, e.g. NEXUS, CJTAG, IJTAG. The controllability and observability features provided by OCD infrastructures provide a valuable toolbox that can be used well beyond the debugging arena, improving the return on investment rate by diluting its cost across a wider spectrum of application areas. This paper discusses the use of OCD features for validating fault tolerant architectures, and in particular the efficiency of various fault injection methods provided by enhanced OCD infrastructures. The reference data for our comparative study was captured on a workbench comprising the 32-bit Freescale MPC-565 microprocessor, an iSYSTEM IC3000 debugger (iTracePro version) and the Winidea 2005 debugging package. All enhanced OCD infrastructures were implemented in VHDL and the results were obtained by simulation within the same fault injection environment. The focus of this paper is on the comparative analysis of the experimental results obtained for various OCD configurations and debugging scenarios.
Resumo:
Dependability is a critical factor in computer systems, requiring high quality validation & verification procedures in the development stage. At the same time, digital devices are getting smaller and access to their internal signals and registers is increasingly complex, requiring innovative debugging methodologies. To address this issue, most recent microprocessors include an on-chip debug (OCD) infrastructure to facilitate common debugging operations. This paper proposes an enhanced OCD infrastructure with the objective of supporting the verification of fault-tolerant mechanisms through fault injection campaigns. This upgraded on-chip debug and fault injection (OCD-FI) infrastructure provides an efficient fault injection mechanism with improved capabilities and dynamic behavior. Preliminary results show that this solution provides flexibility in terms of fault triggering and allows high speed real-time fault injection in memory elements
Resumo:
As electronic devices get smaller and more complex, dependability assurance is becoming fundamental for many mission critical computer based systems. This paper presents a case study on the possibility of using the on-chip debug infrastructures present in most current microprocessors to execute real time fault injection campaigns. The proposed methodology is based on a debugger customized for fault injection and designed for maximum flexibility, and consists of injecting bit-flip type faults on memory elements without modifying or halting the target application. The debugger design is easily portable and applicable to different architectures, providing a flexible and efficient mechanism for verifying and validating fault tolerant components.
Resumo:
The rapid increase in the use of microprocessor-based systems in critical areas, where failures imply risks to human lives, to the environment or to expensive equipment, significantly increased the need for dependable systems, able to detect, tolerate and eventually correct faults. The verification and validation of such systems is frequently performed via fault injection, using various forms and techniques. However, as electronic devices get smaller and more complex, controllability and observability issues, and sometimes real time constraints, make it harder to apply most conventional fault injection techniques. This paper proposes a fault injection environment and a scalable methodology to assist the execution of real-time fault injection campaigns, providing enhanced performance and capabilities. Our proposed solutions are based on the use of common and customized on-chip debug (OCD) mechanisms, present in many modern electronic devices, with the main objective of enabling the insertion of faults in microprocessor memory elements with minimum delay and intrusiveness. Different configurations were implemented starting from basic Components Off-The-Shelf (COTS) microprocessors, equipped with real-time OCD infrastructures, to improved solutions based on modified interfaces, and dedicated OCD circuitry that enhance fault injection capabilities and performance. All methodologies and configurations were evaluated and compared concerning performance gain and silicon overhead.
Resumo:
To increase the amount of logic available in SRAM-based FPGAs manufacturers are using nanometric technologies to boost logic density and reduce prices. However, nanometric scales are highly vulnerable to radiation-induced faults that affect values stored in memory cells. Since the functional definition of FPGAs relies on memory cells, they become highly prone to this type of faults. Fault tolerant implementations, based on triple modular redundancy (TMR) infrastructures, help to keep the correct operation of the circuit. However, TMR is not sufficient to guarantee the safe operation of a circuit. Other issues like the effects of multi-bit upsets (MBU) or fault accumulation, have also to be addressed. Furthermore, in case of a fault occurrence the correct operation of the affected module must be restored and the current state of the circuit coherently re-established. A solution that enables the autonomous correct restoration of the functional definition of the affected module, avoiding fault accumulation, re-establishing the correct circuit state in realtime, while keeping the normal operation of the circuit, is presented in this paper.
Resumo:
Fault injection is frequently used for the verification and validation of the fault tolerant features of microprocessors. This paper proposes the modification of a common on-chip debugging (OCD) infrastructure to add fault injection capabilities and improve performance. The proposed solution imposes a very low logic overhead and provides a flexible and efficient mechanism for the execution of fault injection campaigns, being applicable to different target system architectures.
Resumo:
Este documento descreve um modelo de tolerância a falhas para sistemas de tempo-real distribuídos. A sugestão deste modelo tem como propósito a apresentação de uma solu-ção fiável, flexível e adaptável às necessidades dos sistemas de tempo-real distribuídos. A tolerância a falhas é um aspeto extremamente importante na construção de sistemas de tempo-real e a sua aplicação traz inúmeros benefícios. Um design orientado para a to-lerância a falhas contribui para um melhor desempenho do sistema através do melhora-mento de aspetos chave como a segurança, a confiabilidade e a disponibilidade dos sis-temas. O trabalho desenvolvido centra-se na prevenção, deteção e tolerância a falhas de tipo ló-gicas (software) e físicas (hardware) e assenta numa arquitetura maioritariamente basea-da no tempo, conjugada com técnicas de redundância. O modelo preocupa-se com a efi-ciência e os custos de execução. Para isso utilizam-se também técnicas tradicionais de to-lerância a falhas, como a redundância e a migração, no sentido de não prejudicar o tempo de execução do serviço, ou seja, diminuindo o tempo de recuperação das réplicas, em ca-so de ocorrência de falhas. Neste trabalho são propostas heurísticas de baixa complexida-de para tempo-de-execução, a fim de se determinar para onde replicar os componentes que constituem o software de tempo-real e de negociá-los num mecanismo de coordena-ção por licitações. Este trabalho adapta e estende alguns algoritmos que fornecem solu-ções ainda que interrompidos. Estes algoritmos são referidos em trabalhos de investiga-ção relacionados, e são utilizados para formação de coligações entre nós coadjuvantes. O modelo proposto colmata as falhas através de técnicas de replicação ativa, tanto virtual como física, com blocos de execução concorrentes. Tenta-se melhorar ou manter a sua qualidade produzida, praticamente sem introduzir overhead de informação significativo no sistema. O modelo certifica-se que as máquinas escolhidas, para as quais os agentes migrarão, melhoram iterativamente os níveis de qualidade de serviço fornecida aos com-ponentes, em função das disponibilidades das respetivas máquinas. Caso a nova configu-ração de qualidade seja rentável para a qualidade geral do serviço, é feito um esforço no sentido de receber novos componentes em detrimento da qualidade dos já hospedados localmente. Os nós que cooperam na coligação maximizam o número de execuções para-lelas entre componentes paralelos que compõem o serviço, com o intuito de reduzir atra-sos de execução. O desenvolvimento desta tese conduziu ao modelo proposto e aos resultados apresenta-dos e foi genuinamente suportado por levantamentos bibliográficos de trabalhos de in-vestigação e desenvolvimento, literaturas e preliminares matemáticos. O trabalho tem também como base uma lista de referências bibliográficas.
Resumo:
OCEANS, 2001. MTS/IEEE Conference and Exhibition (Volume:2 )
Resumo:
Cloud computing is increasingly being adopted in different scenarios, like social networking, business applications, scientific experiments, etc. Relying in virtualization technology, the construction of these computing environments targets improvements in the infrastructure, such as power-efficiency and fulfillment of users’ SLA specifications. The methodology usually applied is packing all the virtual machines on the proper physical servers. However, failure occurrences in these networked computing systems can induce substantial negative impact on system performance, deviating the system from ours initial objectives. In this work, we propose adapted algorithms to dynamically map virtual machines to physical hosts, in order to improve cloud infrastructure power-efficiency, with low impact on users’ required performance. Our decision making algorithms leverage proactive fault-tolerance techniques to deal with systems failures, allied with virtual machine technology to share nodes resources in an accurately and controlled manner. The results indicate that our algorithms perform better targeting power-efficiency and SLA fulfillment, in face of cloud infrastructure failures.
Resumo:
Classical lock-based concurrency control does not scale with current and foreseen multi-core architectures, opening space for alternative concurrency control mechanisms. The concept of transactions executing concurrently in isolation with an underlying mechanism maintaining a consistent system state was already explored in fault-tolerant and distributed systems, and is currently being explored by transactional memory, this time being used to manage concurrent memory access. In this paper we discuss the use of Software Transactional Memory (STM), and how Ada can provide support for it. Furthermore, we draft a general programming interface to transactional memory, supporting future implementations of STM oriented to real-time systems.
Resumo:
To boost logic density and reduce per unit power consumption SRAM-based FPGAs manufacturers adopted nanometric technologies. However, this technology is highly vulnerable to radiation-induced faults, which affect values stored in memory cells, and to manufacturing imperfections. Fault tolerant implementations, based on Triple Modular Redundancy (TMR) infrastructures, help to keep the correct operation of the circuit. However, TMR is not sufficient to guarantee the safe operation of a circuit. Other issues like module placement, the effects of multi- bit upsets (MBU) or fault accumulation, have also to be addressed. In case of a fault occurrence the correct operation of the affected module must be restored and/or the current state of the circuit coherently re-established. A solution that enables the autonomous restoration of the functional definition of the affected module, avoiding fault accumulation, re-establishing the correct circuit state in real-time, while keeping the normal operation of the circuit, is presented in this paper.
Resumo:
It is widely accepted that organizations and individuals must be innovative and continually create new knowledge and ideas to deal with rapid change. Innovation plays an important role in not only the development of new business, process and products, but also in competitiveness and success of any organization. Technology for Creativity and Innovation: Tools, Techniques and Applications provides empirical research findings and best practices on creativity and innovation in business, organizational, and social environments. It is written for educators, academics and professionals who want to improve their understanding of creativity and innovation as well as the role technology has in shaping this discipline.
Resumo:
The introduction of Electric Vehicles (EVs) together with the implementation of smart grids will raise new challenges to power system operators. This paper proposes a demand response program for electric vehicle users which provides the network operator with another useful resource that consists in reducing vehicles charging necessities. This demand response program enables vehicle users to get some profit by agreeing to reduce their travel necessities and minimum battery level requirements on a given period. To support network operator actions, the amount of demand response usage can be estimated using data mining techniques applied to a database containing a large set of operation scenarios. The paper includes a case study based on simulated operation scenarios that consider different operation conditions, e.g. available renewable generation, and considering a diversity of distributed resources and electric vehicles with vehicle-to-grid capacity and demand response capacity in a 33 bus distribution network.