825 resultados para Fault Tolerance
Resumo:
This paper introduces an architecture for identifying and modelling in real-time at a copper mine using new technologies as M2M and cloud computing with a server in the cloud and an Android client inside the mine. The proposed design brings up pervasive mining, a system with wider coverage, higher communication efficiency, better fault-tolerance, and anytime anywhere availability. This solution was designed for a plant inside the mine which cannot tolerate interruption and for which their identification in situ, in real time, is an essential part of the system to control aspects such as instability by adjusting their corresponding parameters without stopping the process.
Resumo:
Architectures based on Coordinated Atomic action (CA action) concepts have been used to build concurrent fault-tolerant systems. This conceptual model combines concurrent exception handling with action nesting to provide a general mechanism for both enclosing interactions among system components and coordinating forward error recovery measures. This article presents an architectural model to guide the formal specification of concurrent fault-tolerant systems. This architecture provides built-in Communicating Sequential Processes (CSPs) and predefined channels to coordinate exception handling of the user-defined components. Hence some safety properties concerning action scoping and concurrent exception handling can be proved by using the FDR (Failure Divergence Refinement) verification tool. As a result, a formal and general architecture supporting software fault tolerance is ready to be used and proved as users define components with normal and exceptional behaviors. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
The InteGrade project is a multi-university effort to build a novel grid computing middleware based on the opportunistic use of resources belonging to user workstations. The InteGrade middleware currently enables the execution of sequential, bag-of-tasks, and parallel applications that follow the BSP or the MPI programming models. This article presents the lessons learned over the last five years of the InteGrade development and describes the solutions achieved concerning the support for robust application execution. The contributions cover the related fields of application scheduling, execution management, and fault tolerance. We present our solutions, describing their implementation principles and evaluation through the analysis of several experimental results. (C) 2010 Elsevier Inc. All rights reserved.
Resumo:
A área de Detecção de Intrusão, apesar de muito pesquisada, não responde a alguns problemas reais como níveis de ataques, dim ensão e complexidade de redes, tolerância a falhas, autenticação e privacidade, interoperabilidade e padronização. Uma pesquisa no Instituto de Informática da UFRGS, mais especificamente no Grupo de Segurança (GSEG), visa desenvolver um Sistema de Detecção de Intrusão Distribuído e com características de tolerância a falhas. Este projeto, denominado Asgaard, é a idealização de um sistema cujo objetivo não se restringe apenas a ser mais uma ferramenta de Detecção de Intrusão, mas uma plataforma que possibilite agregar novos módulos e técnicas, sendo um avanço em relação a outros Sistemas de Detecção atualmente em desenvolvimento. Um tópico ainda não abordado neste projeto seria a detecção de sniffers na rede, vindo a ser uma forma de prevenir que um ataque prossiga em outras estações ou redes interconectadas, desde que um intruso normalmente instala um sniffer após um ataque bem sucedido. Este trabalho discute as técnicas de detecção de sniffers, seus cenários, bem como avalia o uso destas técnicas em uma rede local. As técnicas conhecidas são testadas em um ambiente com diferentes sistemas operacionais, como linux e windows, mapeando os resultados sobre a eficiência das mesmas em condições diversas.
Resumo:
This thesis presents the study and development of fault-tolerant techniques for programmable architectures, the well-known Field Programmable Gate Arrays (FPGAs), customizable by SRAM. FPGAs are becoming more valuable for space applications because of the high density, high performance, reduced development cost and re-programmability. In particular, SRAM-based FPGAs are very valuable for remote missions because of the possibility of being reprogrammed by the user as many times as necessary in a very short period. SRAM-based FPGA and micro-controllers represent a wide range of components in space applications, and as a result will be the focus of this work, more specifically the Virtex® family from Xilinx and the architecture of the 8051 micro-controller from Intel. The Triple Modular Redundancy (TMR) with voters is a common high-level technique to protect ASICs against single event upset (SEU) and it can also be applied to FPGAs. The TMR technique was first tested in the Virtex® FPGA architecture by using a small design based on counters. Faults were injected in all sensitive parts of the FPGA and a detailed analysis of the effect of a fault in a TMR design synthesized in the Virtex® platform was performed. Results from fault injection and from a radiation ground test facility showed the efficiency of the TMR for the related case study circuit. Although TMR has showed a high reliability, this technique presents some limitations, such as area overhead, three times more input and output pins and, consequently, a significant increase in power dissipation. Aiming to reduce TMR costs and improve reliability, an innovative high-level technique for designing fault-tolerant systems in SRAM-based FPGAs was developed, without modification in the FPGA architecture. This technique combines time and hardware redundancy to reduce overhead and to ensure reliability. It is based on duplication with comparison and concurrent error detection. The new technique proposed in this work was specifically developed for FPGAs to cope with transient faults in the user combinational and sequential logic, while also reducing pin count, area and power dissipation. The methodology was validated by fault injection experiments in an emulation board. The thesis presents comparison results in fault coverage, area and performance between the discussed techniques.
Resumo:
Internet applications such as media streaming, collaborative computing and massive multiplayer are on the rise,. This leads to the need for multicast communication, but unfortunately group communications support based on IP multicast has not been widely adopted due to a combination of technical and non-technical problems. Therefore, a number of different application-layer multicast schemes have been proposed in recent literature to overcome the drawbacks. In addition, these applications often behave as both providers and clients of services, being called peer-topeer applications, and where participants come and go very dynamically. Thus, servercentric architectures for membership management have well-known problems related to scalability and fault-tolerance, and even peer-to-peer traditional solutions need to have some mechanism that takes into account member's volatility. The idea of location awareness distributes the participants in the overlay network according to their proximity in the underlying network allowing a better performance. Given this context, this thesis proposes an application layer multicast protocol, called LAALM, which takes into account the actual network topology in the assembly process of the overlay network. The membership algorithm uses a new metric, IPXY, to provide location awareness through the processing of local information, and it was implemented using a distributed shared and bi-directional tree. The algorithm also has a sub-optimal heuristic to minimize the cost of membership process. The protocol has been evaluated in two ways. First, through an own simulator developed in this work, where we evaluated the quality of distribution tree by metrics such as outdegree and path length. Second, reallife scenarios were built in the ns-3 network simulator where we evaluated the network protocol performance by metrics such as stress, stretch, time to first packet and reconfiguration group time
Resumo:
There are some approaches that take advantage of unused computational resources in the Internet nodes - users´ machines. In the last years , the peer-to-peer networks (P2P) have gaining a momentum mainly due to its support for scalability and fault tolerance. However, current P2P architectures present some problems such as nodes overhead due to messages routing, a great amount of nodes reconfigurations when the network topology changes, routing traffic inside a specific network even when the traffic is not directed to a machine of this network, and the lack of a proximity relationship among the P2P nodes and the proximity of these nodes in the IP network. Although some architectures use the information about the nodes distance in the IP network, they use methods that require dynamic information. In this work we propose a P2P architecture to fix the problems afore mentioned. It is composed of three parts. The first part consists of a basic P2P architecture, called SGrid, which maintains a relationship of nodes in the P2P network with their position in the IP network. Its assigns adjacent key regions to nodes of a same organization. The second part is a protocol called NATal (Routing and NAT application layer) that extends the basic architecture in order to remove from the nodes the responsibility of routing messages. The third part consists of a special kind of node, called LSP (Lightware Super-Peer), which is responsible for maintaining the P2P routing table. In addition, this work also presents a simulator that validates the architecture and a module of the Natal protocol to be used in Linux routers
Resumo:
The greater part of monitoring onshore Oil and Gas environment currently are based on wireless solutions. However, these solutions have a technological configuration that are out-of-date, mainly because analog radios and inefficient communication topologies are used. On the other hand, solutions based in digital radios can provide more efficient solutions related to energy consumption, security and fault tolerance. Thus, this paper evaluated if the Wireless Sensor Network, communication technology based on digital radios, are adequate to monitoring Oil and Gas onshore wells. Percent of packets transmitted with successful, energy consumption, communication delay and routing techniques applied to a mesh topology will be used as metrics to validate the proposal in the different routing techniques through network simulation tool NS-2
Resumo:
Previous works have studied the characteristics and peculiarities of P2P networks, especially security information aspects. Most works, in some way, deal with the sharing of resources and, in particular, the storage of files. This work complements previous studies and adds new definitions relating to this kind of systems. A system for safe storage of files (SAS-P2P) was specified and built, based on P2P technology, using the JXTA platform. This system uses standard X.509 and PKCS # 12 digital certificates, issued and managed by a public key infrastructure, which was also specified and developed based on P2P technology (PKIX-P2P). The information is stored in a special file with XML format which is especially prepared, facilitating handling and interoperability among applications. The intention of developing the SAS-P2P system was to offer a complementary service for Giga Natal network users, through which the participants in this network can collaboratively build a shared storage area, with important security features such as availability, confidentiality, authenticity and fault tolerance. Besides the specification, development of prototypes and testing of the SAS-P2P system, tests of the PKIX-P2P Manager module were also performed, in order to determine its fault tolerance and the effective calculation of the reputation of the certifying authorities participating in the system
Resumo:
Complex network analysis is a powerful tool into research of complex systems like brain networks. This work aims to describe the topological changes in neural functional connectivity networks of neocortex and hippocampus during slow-wave sleep (SWS) in animals submited to a novel experience exposure. Slow-wave sleep is an important sleep stage where occurs reverberations of electrical activities patterns of wakeness, playing a fundamental role in memory consolidation. Although its importance there s a lack of studies that characterize the topological dynamical of functional connectivity networks during that sleep stage. There s no studies that describe the topological modifications that novel exposure leads to this networks. We have observed that several topological properties have been modified after novel exposure and this modification remains for a long time. Major part of this changes in topological properties by novel exposure are related to fault tolerance
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Implementação de duas arquiteturas microcontroladas tolerantes a falhas para controle da temperatura
Resumo:
Pós-graduação em Física - IGCE
Resumo:
Pós-graduação em Ciência da Computação - IBILCE
Resumo:
Building facilities have become important infrastructures for modern productive plants dedicated to services. In this context, the control systems of intelligent buildings have evolved while their reliability has evidently improved. However, the occurrence of faults is inevitable in systems conceived, constructed and operated by humans. Thus, a practical alternative approach is found to be very useful to reduce the consequences of faults. Yet, only few publications address intelligent building modeling processes that take into consideration the occurrence of faults and how to manage their consequences. In the light of the foregoing, a procedure is proposed for the modeling of intelligent building control systems, considersing their functional specifications in normal operation and in the of the event of faults. The proposed procedure adopts the concepts of discrete event systems and holons, and explores Petri nets and their extensions so as to represent the structure and operation of control systems for intelligent buildings under normal and abnormal situations. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new algorithms for failure detectors which are appropriate as components of a fault tolerance system that can be deployed in situations of adverse network conditions (such as loosely connected and administered computing grids). It packs redundancy into heartbeat messages, thereby improving on the robustness of the traditional protocols. Results from experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.