962 results for Fault tolerance


Relevance: 60.00%

Abstract:

Multi-Robot Systems offer advantages over a single robot when a task must be performed with greater speed, precision, and fault tolerance. Studies of social behaviours in nature have enabled the development of bio-inspired algorithms useful in swarm robotics. By following simple, repetitive instructions, groups of physically limited robots can solve complex problems. When there are two or more tasks to be performed and the set of robots is heterogeneous, the robots can be grouped according to the functionalities available to them. When the set of robots is homogeneous, grouping can be done by considering each robot's position relative to a task or by adding some distinguishing characteristic. This dissertation proposes a spatial clustering technique based solely on local communication among robots. By exchanging messages between neighbouring robots, this technique forms groups of spatially close robots without requiring the robots to move. Building on token-based clustering methods, the proposed technique employs the notion of virtual tokens, called loads; a load may be static or dynamic. A static load determines the class to which a robot belongs. Depending on the number and weight of the loads available in the system, the robots exchange information until a homogeneous distribution of loads is reached. Once loads become stationary, a density is computed that guides the loads still in motion. During the experiments, it was observed visually that heavier loads cluster first, while lighter loads keep moving through the swarm until the loads form bands of distinct densities for each class, thus achieving the final objective: the clustering of the robots.
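The load-exchange idea can be sketched as a toy diffusion process (our illustration, not the dissertation's actual algorithm): robots on a ring repeatedly average their virtual loads with their immediate neighbours, so the total load is conserved and the distribution flattens without any robot moving.

```python
# Toy sketch of message-passing load exchange between neighbouring robots.
# The ring topology, round count, and averaging rule are our assumptions.

def diffuse_loads(loads, rounds=200):
    """Each round, every robot replaces its load by the average of its own
    load and its two ring neighbours' loads (a conservative diffusion)."""
    n = len(loads)
    for _ in range(rounds):
        loads = [(loads[(i - 1) % n] + loads[i] + loads[(i + 1) % n]) / 3
                 for i in range(n)]
    return loads

# Two heavy loads injected into an otherwise empty 8-robot swarm.
final = diffuse_loads([10.0, 0.0, 0.0, 0.0, 6.0, 0.0, 0.0, 0.0])
```

After enough rounds every robot holds (almost exactly) the mean load, which is the "homogeneous disposition" the abstract describes.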

Relevance: 60.00%

Abstract:

Reliable messaging is a key component of mobile agent systems. Current research focuses on reliable one-to-one message delivery to mobile agents, but how to implement a group communication system for mobile agents remains an open issue, even though such a system is a powerful building block for developing fault-tolerant mobile agent systems. In this paper, we propose a group communication system for mobile agents (GCS-MA) that includes totally ordered multicast and membership management functions. We divide a group of mobile agents into several agent clusters; each cluster consists of all mobile agents residing in the same sub-network and is managed by a special module called a coordinator. All coordinators then form a ring-based overlay for exchanging messages between clusters. We present a token-based algorithm, an intra-cluster messaging algorithm, and an inter-cluster migration algorithm that achieve the atomicity and total-ordering properties of multicast messages by building a membership protocol on top of the clustering and failure-detection mechanisms. Performance of the proposed system has been analysed through simulations. We also describe an application of the proposed system in the context of the service cooperation middleware (SCM) project.
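The role of a circulating token in total ordering can be sketched as follows (the class and function names are our invention, not GCS-MA's API): the coordinator currently holding the token stamps its pending multicasts with globally unique, increasing sequence numbers, so all clusters deliver messages in one agreed order.

```python
# Illustrative token-ring sequencing sketch; not the paper's protocol.

class Coordinator:
    """One coordinator per sub-network cluster (naming is ours)."""
    def __init__(self, name):
        self.name = name
        self.pending = []  # multicasts waiting for the token

def circulate(coordinators, rounds=1):
    """Pass the token around the ring; the holder stamps its pending
    multicasts with the token's sequence counter."""
    token_seq = 0
    log = []
    for _ in range(rounds):
        for c in coordinators:          # ring order
            while c.pending:
                log.append((token_seq, c.name, c.pending.pop(0)))
                token_seq += 1
    return log                          # the total order seen by every cluster

a, b = Coordinator("A"), Coordinator("B")
a.pending = ["m1", "m3"]
b.pending = ["m2"]
order = circulate([a, b])
```

Because only the token holder assigns numbers, no two multicasts can receive the same stamp, which is the essence of token-based total ordering.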

Relevance: 60.00%

Abstract:

As computer chip speeds keep increasing, device threshold voltages keep dropping, so transient faults caused by single-event upsets occur ever more easily. This is especially true for computer systems operating in space, where cosmic rays make transient faults even more frequent and system reliability faces a more severe test. To improve reliability, there are generally two approaches: hardware redundancy and software redundancy. Compared with hardware fault tolerance, software fault tolerance is cheaper, more cost-effective, and more flexible to deploy, but it incurs extra time and space overhead and burdens programmers with writing additional fault-tolerance code. Recently, compiler-based software fault-tolerance methods have appeared that automatically insert redundant fault-tolerance logic during compilation, but these methods still incur significant time and space overhead. How to reduce this overhead while preserving fault-tolerance capability remains an open research problem. This thesis advances compiler-based fault tolerance by using variable information from the source code to prune the redundant fault-tolerance logic, reducing time and space overhead while preserving fault-tolerance capability and protecting data in both memory and registers. The main contributions are:
1. A design blueprint for a fault-tolerant compilation environment, SCC, establishing the long-term vision for a fault-tolerant compilation tool.
2. An instruction-level compiler-based fault-detection method, VarBIFT, which detects transient faults. With only 0.0069x time overhead and 0.3620x space overhead on average, it raises the combined probability of correct execution or fault detection under transient faults from 39.1% to 76.9% on average.
3. An instruction-level compiler-based fault-recovery method, VarRIFT, which recovers correct data after transient faults. With only 0.043x time overhead and 0.69x space overhead on average, it raises the probability of correct execution under transient faults from 44.8% to 78.7% on average.
4. Implementations of both VarBIFT and VarRIFT on the open-source compiler LCC. Only the CPU-instruction-independent intermediate logic was modified, so both implementations can easily be ported to other CPU architectures such as SPARC and MIPS.
5. A fault-injection tool, which was used to test the fault-tolerance capability of VarBIFT and VarRIFT.
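The detection idea behind redundant-execution schemes such as VarBIFT can be illustrated with a deliberately simplified sketch (ours, not the actual compiler pass): each protected computation is executed twice and the two copies are compared before the result is used, so a single-event upset in either copy is caught as a mismatch.

```python
# Toy illustration of software-only transient-fault detection via
# duplicated execution. The `flip` parameter injects a simulated bit flip.

class FaultDetected(Exception):
    """Raised when the redundant copies disagree."""

def protected_add(a, b, flip=None):
    r1 = a + b            # copy 1 of the protected computation
    r2 = a + b            # copy 2 of the protected computation
    if flip == 1:
        r1 ^= 4           # simulated single-event upset in copy 1
    if r1 != r2:
        raise FaultDetected("mismatch between redundant copies")
    return r1
```

A real compiler pass duplicates instructions and registers rather than source expressions, but the check-before-use pattern is the same.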

Relevance: 60.00%

Abstract:

To address the increasingly serious impact of space radiation on the correctness of embedded computer systems, a directed-acyclic-graph-based compiler-level fault-detection method, VarBIFT, was implemented on the LCC compiler, building on typical compiler-level fault-tolerance techniques. The method effectively protects against transient hardware faults caused by particle effects and can automatically generate fault-tolerant code for different target machines. Experimental results show that VarBIFT reduces the average segmentation-fault rate of source programs from 32.3% to 13.9% and the average incorrect-output rate from 28.6% to 9.2%, at a time overhead of only 0.7% and a space overhead of 36%.

Relevance: 60.00%

Abstract:

This paper proposes an automatic compiler-based fault-tolerance recovery method against transient faults. It uses variable information from the source code to prune the redundant error-checking flow at the instruction level, and it was implemented on LCC with good fault-tolerance performance. Experimental results show that the method adds only 0.043x time overhead and 0.69x space overhead, outperforming existing methods in both.
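Recovery, as opposed to mere detection, needs a way to decide which copy of a value is correct. A classic approach is triple redundancy with majority voting; the sketch below is our toy illustration of that principle, not the implementation described in the abstract.

```python
# Toy triple-modular-redundancy (TMR) recovery: run the computation three
# times and majority-vote the results. `inject` simulates a bit flip in
# one copy.

def tmr(compute, inject=None):
    results = [compute() for _ in range(3)]
    if inject is not None:
        results[0] ^= inject          # corrupt one copy
    # Majority vote recovers the correct value despite one faulty copy.
    for r in results:
        if results.count(r) >= 2:
            return r
    raise RuntimeError("no majority: more than one copy corrupted")
```

With at most one corrupted copy, the vote always restores the correct result; two simultaneous faults exceed the scheme's recovery capability.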

Relevance: 60.00%

Abstract:

This paper presents a development method for dual-machine fault tolerance based on cluster technology. Through an in-depth study of how cluster technology operates, a layered technical scheme for a dual-machine fault-tolerant system is proposed and implemented on ordinary PC servers, and the availability of the system is analysed. Targeting the structural characteristics of integrated power automation systems, functions such as heartbeat detection were improved, and the system was experimentally validated on a small substation automation system, showing that it can satisfy the requirements of small and medium-sized integrated power automation systems.
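The heartbeat-detection function mentioned above can be sketched as a minimal monitor (the class, parameter names, and threshold are our assumptions): the standby node counts consecutive missed heartbeats from the primary and takes over once the count reaches a limit.

```python
# Minimal heartbeat monitor for a two-node hot-standby pair.

class HeartbeatMonitor:
    def __init__(self, max_missed=3):
        self.max_missed = max_missed  # misses tolerated before failover
        self.missed = 0
        self.role = "standby"

    def tick(self, heartbeat_received):
        """Called once per heartbeat interval; returns the current role."""
        if heartbeat_received:
            self.missed = 0           # primary is alive, reset the counter
        else:
            self.missed += 1
            if self.missed >= self.max_missed and self.role == "standby":
                self.role = "primary"  # failover: standby takes over
        return self.role
```

Requiring several consecutive misses, rather than one, avoids spurious failover on a single delayed heartbeat.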

Relevance: 60.00%

Abstract:

An improved RS (Reed-Solomon) error-correction coding algorithm based on channel estimation is proposed. The algorithm adaptively adjusts the data redundancy of the coding system in real time according to how external conditions and the environment affect interference on the transmission channel. Simulation and a complete analysis confirm that the improved algorithm effectively increases the transmission efficiency of RS coding, and practical application shows that its good performance and high fault tolerance suit the various transmission channels of the communication system, giving it strong practical value.
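The adaptive idea can be sketched as a parameter-selection step (the thresholds and code parameters below are invented for illustration, not taken from the paper): since an RS(n, k) code corrects up to (n - k) // 2 symbol errors, a noisier channel estimate simply buys more parity symbols at the cost of payload rate.

```python
# Illustrative redundancy selection driven by an estimated symbol error
# rate. The (n, k) pairs and thresholds are assumptions for the sketch.

def choose_rs_parity(symbol_error_rate, n=255):
    """Return (n, k, correctable) for the estimated channel quality."""
    if symbol_error_rate < 1e-4:
        k = 251      # clean channel: corrects 2 symbol errors
    elif symbol_error_rate < 1e-3:
        k = 239      # moderate noise: corrects 8 symbol errors
    else:
        k = 223      # noisy channel: corrects 16 symbol errors
    return n, k, (n - k) // 2
```

In a real system the estimate would come from the channel estimator and the encoder would be reconfigured on the fly; here only the selection logic is shown.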

Relevance: 60.00%

Abstract:

An internal communication bus system for an AUV (autonomous underwater vehicle) based on the CAN protocol is designed. Through the modular, configurable design of the protocol converter, the system satisfies the AUV's requirement for an open internal communication bus. The fault-tolerance capability inside the protocol converter and the design of an emergency-event handling node strengthen the reliability and fault tolerance of the AUV system, providing an additional safeguard against losing the AUV in deep-sea operating environments.

Relevance: 60.00%

Abstract:

This report addresses the problem of fault tolerance to system failures for database systems that are to run on highly concurrent computers. It assumes that, in general, an application may have a wide distribution in the lifetimes of its transactions. Logging remains the method of choice for ensuring fault tolerance. Generational garbage collection techniques manage the limited disk space reserved for log information; this approach does not require periodic checkpoints and is well suited to applications with a broad range of transaction lifetimes. An arbitrarily large collection of parallel log streams provides the necessary disk bandwidth.
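The parallel-log-streams idea can be sketched as follows (our simplification of the report's design, with invented names): each record is stamped with a global sequence number and appended round-robin across several streams, so writes are spread over many disks yet recovery can merge the streams back into one total order.

```python
# Toy sketch of striping a log across parallel streams.

class ParallelLog:
    def __init__(self, n_streams=4):
        self.streams = [[] for _ in range(n_streams)]
        self.seq = 0                     # global sequence number

    def append(self, record):
        """Stamp the record and append it to the next stream round-robin."""
        self.streams[self.seq % len(self.streams)].append((self.seq, record))
        self.seq += 1

    def recover(self):
        """Merge all streams back into sequence order."""
        merged = sorted(rec for s in self.streams for rec in s)
        return [r for _, r in merged]
```

The global stamp is what makes the streams independent for writing but still totally ordered for recovery.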

Relevance: 60.00%

Abstract:

As multiprocessor system size scales upward, two important aspects of multiprocessor systems will generally get worse rather than better: (1) interprocessor communication latency will increase, and (2) the probability that some component in the system will fail will increase. These problems can prevent us from realizing the potential benefits of large-scale multiprocessing. In this report we consider the problem of designing networks that simultaneously minimize communication latency while maximizing fault tolerance. Using a synergy of techniques including connection topologies, routing protocols, signalling techniques, and packaging technologies, we assemble integrated, system-level solutions to this network design problem.

Relevance: 60.00%

Abstract:

This thesis presents methods for implementing robust hexapod locomotion on an autonomous robot with many sensors and actuators. The controller is based on the Subsumption Architecture and is fully distributed over approximately 1500 simple, concurrent processes. The robot, Hannibal, weighs approximately 6 pounds and is equipped with over 100 physical sensors, 19 degrees of freedom, and 8 onboard computers. We investigate the following topics in depth: distributed control of a complex robot, insect-inspired locomotion control for gait generation and rough-terrain mobility, and fault tolerance. The controller was implemented, debugged, and tested on Hannibal. Through a series of experiments, we examined Hannibal's gait generation, rough-terrain locomotion, and fault-tolerance performance. The results demonstrate that Hannibal exhibits robust, flexible, real-time locomotion over a variety of terrain and tolerates a multitude of hardware failures.
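Subsumption-style arbitration can be sketched in a few lines (our simplification, not Hannibal's actual controller; the layer names and sensor keys are invented): behaviours are ordered by priority, and the highest active layer suppresses those below it.

```python
# Toy subsumption arbitration: highest-priority active layer wins.

def arbitrate(layers, sensors):
    """layers: list of (predicate, command) pairs, highest priority first."""
    for active, command in layers:
        if active(sensors):
            return command
    return "stand"  # default when no layer is active

layers = [
    (lambda s: s["leg_failed"], "adapt-gait"),  # fault-tolerance layer
    (lambda s: s["obstacle"],   "step-over"),   # rough-terrain layer
    (lambda s: True,            "walk"),        # basic gait layer
]
```

In the real architecture each layer is a network of concurrent processes and suppression happens on the wires between them, but the priority ordering shown here is the core idea.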

Relevance: 60.00%

Abstract:

This technical report describes a new protocol for reliable message communication, the Unique Token Protocol. The protocol eliminates the need for end-to-end acknowledgments and minimizes the communication effort when no dynamic errors occur. Various properties of end-to-end protocols are presented, and the Unique Token Protocol is shown to solve the associated problems. It eliminates source buffering by maintaining at least two copies of a message in the network, and a token is used to decide whether a message was delivered to the destination exactly once. The report also presents a possible implementation of the protocol in a wormhole-routed, 3-D mesh network.
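The protocol's core invariant can be sketched as follows (a heavy simplification with invented names, not the report's implementation): at least two copies of each message exist in the network until delivery, and a per-message uniqueness check stands in for the token that guarantees exactly-once delivery.

```python
# Toy model of the two-copies-in-flight / exactly-once-delivery invariant.

class Network:
    def __init__(self):
        self.copies = {}        # msg_id -> number of in-network copies
        self.delivered = set()  # msg_ids already accepted at the destination

    def send(self, msg_id):
        self.copies[msg_id] = 2     # invariant: >= 2 copies, no source buffer

    def deliver(self, msg_id):
        """Accept the message exactly once; discard duplicate copies."""
        if msg_id not in self.delivered:
            self.delivered.add(msg_id)
            self.copies[msg_id] = 0  # copies reclaimed after delivery
            return True
        return False                 # duplicate copy discarded
```

Because a second copy always survives a single loss, the source never needs to buffer the message for retransmission.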

Relevance: 60.00%

Abstract:

M. H. Lee, D. P. Barnes, and N. W. Hardy. Knowledge-based error recovery in industrial robots. In Proc. 8th Int. Joint Conf. on Artificial Intelligence, pages 824-826, Karlsruhe, FRG, 1983.

Relevance: 60.00%

Abstract:

The exploding demand for services like the World Wide Web reflects the potential presented by globally distributed information systems. The number of WWW servers world-wide has doubled every 3 to 5 months since 1993, outstripping even the growth of the Internet. At each of these self-managed sites, the Common Gateway Interface (CGI) and Hypertext Transfer Protocol (HTTP) already constitute a rudimentary basis for contributing local resources to remote collaborations. However, the Web has serious deficiencies that make it unsuited for use as a true medium for metacomputing, the process of bringing hardware, software, and expertise from many geographically dispersed sources to bear on large-scale problems. These deficiencies are, paradoxically, the direct result of the very simple design principles that enabled its exponential growth. The Web exhibits many symptoms of these problems: disk and network resources are consumed extravagantly; information search and discovery are difficult; protocols are aimed at data movement rather than task migration and ignore the potential for distributing computation. All of these, however, can be seen as aspects of a single problem: as a distributed system for metacomputing, the Web offers unpredictable performance and unreliable results. The goal of our project is to use the Web as a medium (within either the global Internet or an enterprise intranet) for metacomputing in a reliable way, with performance guarantees. We attack this problem on four levels:
(1) Resource Management Services. Globally distributed computing allows novel approaches to the old problems of performance guarantees and reliability. Our first set of ideas involves setting up a family of real-time resource management models organized by the Web Computing Framework, with a standard Resource Management Interface (RMI), a Resource Registry, a Task Registry, and resource management protocols that allow resource needs and availability information to be collected and disseminated, so that a family of algorithms with varying computational precision and accuracy of representation can be chosen to meet real-time and reliability constraints.
(2) Middleware Services. Complementary to techniques for allocating and scheduling available resources to serve application needs under real-time and reliability constraints, the second set of ideas aims at reducing communication latency, traffic congestion, server workload, and so on. We develop customizable middleware services that exploit application characteristics in traffic analysis to drive new server/browser design strategies (e.g., exploiting the self-similarity of Web traffic), derive document access patterns via multi-server cooperation, and use them in speculative prefetching, document caching, and aggressive replication to reduce server load and bandwidth requirements.
(3) Communication Infrastructure. To achieve any guarantee of quality of service or performance, one must reach the network layer, which provides the basic guarantees of bandwidth, latency, and reliability. The third area is therefore a set of new techniques in network service and protocol design.
(4) Object-Oriented Web Computing Framework. A useful resource management system must deal with job priority, fault tolerance, quality of service, complex resources such as ATM channels, probabilistic models, and so on, and models must be tailored to represent the best trade-off for a particular setting. This requires a family of models organized within an object-oriented framework, because no one-size-fits-all approach is appropriate. This presents a software engineering challenge requiring the integration of solutions at all levels: algorithms, models, protocols, and profiling and monitoring tools. The framework captures the abstract class interfaces of the collection of cooperating components, but allows the concretization of each component to be driven by the requirements of a specific approach and environment.

Relevance: 60.00%

Abstract:

One- and two-dimensional cellular automata that are known to be fault-tolerant are very complex. On the other hand, only very simple cellular automata have actually been proven to lack fault tolerance, i.e., to be mixing. The latter either have large noise probability ε or belong to the small family of two-state nearest-neighbor monotonic rules that includes local majority voting. For a certain simple automaton L called the soldiers rule, this problem has intrigued researchers for the last two decades, since L is clearly more robust than local voting: in the absence of noise, L eliminates any finite island of perturbation from an initial configuration of all 0's or all 1's. The same holds for K, a 4-state monotonic variant of L called two-line voting. We prove that the probabilistic cellular automata K_ε and L_ε asymptotically lose all information about their initial state when subject to small, strongly biased noise. The mixing property trivially implies that the systems are ergodic. The finite-time information-retaining quality of a mixing system can be represented by its relaxation time Relax(·), which measures the time before the onset of significant information loss. This is known to grow as (1/ε)^c for noisy local voting. The impressive error-correction ability of L has prompted some researchers to conjecture that Relax(L_ε) = 2^(c/ε). We prove the tight bound 2^(c_1 log^2(1/ε)) < Relax(L_ε) < 2^(c_2 log^2(1/ε)) for a biased error model. The same holds for K_ε. Moreover, the lower bound is independent of the bias assumption. The strong-bias assumption makes it possible to apply sparsity/renormalization techniques, the main tools of our investigation, used earlier in the opposite context of proving fault tolerance.
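The limited error correction of plain local voting can be seen in a toy simulation (ours, not the paper's construction): under noiseless nearest-neighbor majority voting on a ring, a single perturbed cell is erased in one step, but a two-cell island is stable and persists, which is exactly why the soldiers rule L, which erodes any finite island, is considered more robust.

```python
import random

def step(cells, eps, rng):
    """One synchronous update of a noisy 1-D majority-vote CA on a ring:
    3-cell majority vote, then each cell flips with probability eps."""
    n = len(cells)
    out = []
    for i in range(n):
        votes = cells[(i - 1) % n] + cells[i] + cells[(i + 1) % n]
        bit = 1 if votes >= 2 else 0
        if rng.random() < eps:
            bit ^= 1                   # noise: transient fault in this cell
        out.append(bit)
    return out

rng = random.Random(0)
single = [0] * 11
single[5] = 1                          # one perturbed cell
pair = [0] * 11
pair[5] = pair[6] = 1                  # a 2-cell island
```

With eps > 0 the same dynamics gradually destroy all memory of the initial configuration, which is the mixing behaviour whose relaxation time the abstract bounds.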