955 resultados para high tolerance
Resumo:
Ordinary desktop computers continue to obtain ever more resources – in-creased processing power, memory, network speed and bandwidth – yet these resources spend much of their time underutilised. Cycle stealing frameworks harness these resources so they can be used for high-performance computing. Traditionally cycle stealing systems have used client-server based architectures which place significant limits on their ability to scale and the range of applica-tions they can support. By applying a fully decentralised network model to cycle stealing the limits of centralised models can be overcome. Using decentralised networks in this manner presents some difficulties which have not been encountered in their previous uses. Generally decentralised ap-plications do not require any significant fault tolerance guarantees. High-performance computing on the other hand requires very stringent guarantees to ensure correct results are obtained. Unfortunately mechanisms developed for traditional high-performance computing cannot be simply translated because of their reliance on a reliable storage mechanism. In the highly dynamic world of P2P computing this reliable storage is not available. As part of this research a fault tolerance system has been created which provides considerable reliability without the need for a persistent storage. As well as increased scalability, fully decentralised networks offer the ability for volunteers to communicate directly. This ability provides the possibility of supporting applications whose tasks require direct, message passing style communication. Previous cycle stealing systems have only supported embarrassingly parallel applications and applications with limited forms of communication so a new programming model has been developed which can support this style of communication within a cycle stealing context. In this thesis I present a fully decentralised cycle stealing framework. The framework addresses the problems of providing a reliable fault tolerance sys-tem and supporting direct communication between parallel tasks. The thesis includes a programming model for developing cycle stealing applications with direct inter-process communication and methods for optimising object locality on decentralised networks.