953 resultados para Distributed applications


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The move towards IT outsourcing is the first step towards an environment where compute infrastructure is treated as a service. In utility computing this IT service has to honor Service Level Agreements (SLA) in order to meet the desired Quality of Service (QoS) guarantees. Such an environment requires reliable services in order to maximize the utilization of the resources and to decrease the Total Cost of Ownership (TCO). Such reliability cannot come at the cost of resource duplication, since it increases the TCO of the data center and hence the cost per compute unit. We, in this paper, look into aspects of projecting impact of hardware failures on the SLAs and techniques required to take proactive recovery steps in case of a predicted failure. By maintaining health vectors of all hardware and system resources, we predict the failure probability of resources based on observed hardware errors/failure events, at runtime. This inturn influences an availability aware middleware to take proactive action (even before the application is affected in case the system and the application have low recoverability). The proposed framework has been prototyped on a system running HP-UX. Our offline analysis of the prediction system on hardware error logs indicate no more than 10% false positives. This work to the best of our knowledge is the first of its kind to perform an end-to-end analysis of the impact of a hardware fault on application SLAs, in a live system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The steady state throughput performance of distributed applications deployed in switched networks in presence of end-system bottlenecks is studied in this paper. The effect of various limitations at an end-system is modelled as an equivalent transmission capacity limitation. A class of distributed applications is characterised by a static traffic distribution matrix that determines the communication between various components of the application. It is found that uniqueness of steady state throughputs depends only on the traffic distribution matrix and that some applications (e.g., broadcast applications) can yield non-unique values for the steady state component throughputs. For a given switch capacity, with traffic distribution that yield fair unique throughputs, the trade-off between the end-system capacity and the number of application components is brought out. With a proposed distributed rate control, it has been illustrated that it is possible to have unique solution for certain traffic distributions which is otherwise impossible. Also, by proper selection of rate control parameters, various throughput performance objectives can be realised.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Emerging configurable infrastructures such as large-scale overlays and grids, distributed testbeds, and sensor networks comprise diverse sets of available computing resources (e.g., CPU and OS capabilities and memory constraints) and network conditions (e.g., link delay, bandwidth, loss rate, and jitter) whose characteristics are both complex and time-varying. At the same time, distributed applications to be deployed on these infrastructures exhibit increasingly complex constraints and requirements on resources they wish to utilize. Examples include selecting nodes and links to schedule an overlay multicast file transfer across the Grid, or embedding a network experiment with specific resource constraints in a distributed testbed such as PlanetLab. Thus, a common problem facing the efficient deployment of distributed applications on these infrastructures is that of "mapping" application-level requirements onto the network in such a manner that the requirements of the application are realized, assuming that the underlying characteristics of the network are known. We refer to this problem as the network embedding problem. In this paper, we propose a new approach to tackle this combinatorially-hard problem. Thanks to a number of heuristics, our approach greatly improves performance and scalability over previously existing techniques. It does so by pruning large portions of the search space without overlooking any valid embedding. We present a construction that allows a compact representation of candidate embeddings, which is maintained by carefully controlling the order via which candidate mappings are inserted and invalid mappings are removed. We present an implementation of our proposed technique, which we call NETEMBED – a service that identify feasible mappings of a virtual network configuration (the query network) to an existing real infrastructure or testbed (the hosting network). We present results of extensive performance evaluation experiments of NETEMBED using several combinations of real and synthetic network topologies. Our results show that our NETEMBED service is quite effective in identifying one (or all) possible embeddings for quite sizable queries and hosting networks – much larger than what any of the existing techniques or services are able to handle.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Natural distributed systems are adaptive, scalable and fault-tolerant. Emergence science describes how higher-level self-regulatory behaviour arises in natural systems from many participants following simple rulesets. Emergence advocates simple communication models, autonomy and independence, enhancing robustness and self-stabilization. High-quality distributed applications such as autonomic systems must satisfy the appropriate nonfunctional requirements which include scalability, efficiency, robustness, low-latency and stability. However the traditional design of distributed applications, especially in terms of the communication strategies employed, can introduce compromises between these characteristics. This paper discusses ways in which emergence science can be applied to distributed computing, avoiding some of the compromises associated with traditionally-designed applications. To demonstrate the effectiveness of this paradigm, an emergent election algorithm is described and its performance evaluated. The design incorporates nondeterministic behaviour. The resulting algorithm has very low communication complexity, and is simultaneously very stable, scalable and robust.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tycho was conceived in 2003 in response to a need by the GridRM [1] resource-monitoring project for a ldquolight-weightrdquo, scalable and easy to use wide-area distributed registry and messaging system. Since Tycho's first release in 2006 a number of modifications have been made to the system to make it easier to use and more flexible. Since its inception, Tycho has been utilised across a number of application domains including widearea resource monitoring, distributed queries across archival databases, providing services for the nodes of a Cray supercomputer, and as a system for transferring multi-terabyte scientific datasets across the Internet. This paper provides an overview of the initial Tycho system, describes a number of applications that utilise Tycho, discusses a number of new utilities, and how the Tycho infrastructure has evolved in response to experience of building applications with it.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Most fault-tolerant application programs cannot cope with constant changes in their environments and user requirements because they embed policies and mechanisms together so that if the policies or mechanisms are  changed the whole programs have to be changed as well. This paper presents a reactive system approach to overcoming this limitation. The reactive system concepts are an attractive paradigm for system design, development and maintenance because they separate policies from mechanisms. In the paper we propose a generic reactive system architecture and use group communication primitives to model it. We then implement it as a generic package which can be applied in any distributed applications. The system performance shows that it can be used in a distributed environment effectively.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A common characteristic among parallel/distributed programming languages is that the one language is used to specify not only the overall organisation of the distributed application, but also the functionality of the application. That is, the connectivity and functionality of processes are specified within a single program. Connectivity and functionality are independent aspects of a distributed application. This thesis shows that these two aspects can be specified separately, therefore allowing application designers to freely concentrate on either aspect in a modular fashion. Two new programming languages have been developed for specifying each aspect. These languages are for loosely coupled distributed applications based on message passing, and have been designed to simplify distributed programming by completely removing all low level interprocess communication. A suite of languages and tools has been designed and developed. It includes the two new languages, parsers, a compilation system to generate intermediate C code that is compiled to binary object modules, a run-time system to create, manage and terminate several distributed applications, and a shell to communicate with the run-tune system. DAL (Distributed Application Language) and DAPL (Distributed Application Process Language) are the new programming languages for the specification and development of process oriented, asynchronous message passing, distributed applications. These two languages have been designed and developed as part of this doctorate in order to specify such distributed applications that execute on a cluster of computers. Both languages are used to specify orthogonal components of an application, on the one hand the organisation of processes that constitute an application, and on the other the interface and functionality of each process. Consequently, these components can be created in a modular fashion, individually and concurrently. The DAL language is used to specify not only the connectivity of all processes within an application, but also a cluster of computers for which the application executes. Furthermore, sub-clusters can be specified for individual processes of an application to constrain a process to a particular group of computers. The second language, DAPL, is used to specify the interface, functionality and data structures of application processes. In addition to these languages, a DAL parser, a DAPL parser, and a compilation system have been designed and developed (in this project). This compilation system takes DAL and DAPL programs to generate object modules based on machine code, one module for each application process. These object modules are used by the Distributed Application System (DAS) to instantiate and manage distributed applications. The DAS system is another new component of this project. The purpose of the DAS system is to create, manage, and terminate many distributed applications of similar and different configurations. The creation procedure incorporates the automatic allocation of processes to remote machines. Application management includes several operations such as deletion, addition, replacement, and movement of processes, and also detection and reaction to faults such as a processor crash. A DAS operator communicates with the DAS system via a textual shell called DASH (Distributed Application SHell). This suite of languages and tools allowed distributed applications of varying connectivity and functionality to be specified quickly and simply at a high level of abstraction. DAL and DAPL programs of several processes may require a few dozen lines to specify as compared to several hundred lines of equivalent C code that is generated by the compilation system. Furthermore, the DAL and DAPL compilation system is successful at generating binary object modules, and the DAS system succeeds in instantiating and managing several distributed applications on a cluster.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The development of fault-tolerant computing systems is a very difficult task. Two reasons contributed to this difficulty can be described as follows. The First is that, in normal practice, fault-tolerant computing policies and mechanisms are deeply embedded into most application programs, so that these application programs cannot cope with changes in environments, policies and mechanisms. These factors may change frequently in a distributed environment, especially in a heterogeneous environment. Therefore, in order to develop better fault-tolerant systems that can cope with constant changes in environments and user requirements, it is essential to separate the fault tolerant computing policies and mechanisms in application programs. The second is, on the other hand, a number of techniques have been proposed for the construction of reliable and fault-tolerant computing systems. Many computer systems are being developed to tolerant various hardware and software failures. However, most of these systems are to be used in specific application areas, since it is extremely difficult to develop systems that can be used in general-purpose fault-tolerant computing. The motivation of this thesis is based on these two aspects. The focus of the thesis is on developing a model based on the reactive system concepts for building better fault-tolerant computing applications. The reactive system concepts are an attractive paradigm for system design, development and maintenance because it separates policies from mechanisms. The stress of the model is to provide flexible system architecture for the general-purpose fault-tolerant application development, and the model can be applied in many specific applications. With this reactive system model, we can separate fault-tolerant computing polices and mechanisms in the applications, so that the development and maintenance of fault-tolerant computing systems can be made easier.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cloud Computing has evolved to become an enabler for delivering access to large scale distributed applications running on managed network-connected computing systems. This makes possible hosting Distributed Enterprise Information Systems (dEISs) in cloud environments, while enforcing strict performance and quality of service requirements, defined using Service Level Agreements (SLAs). {SLAs} define the performance boundaries of distributed applications, and are enforced by a cloud management system (CMS) dynamically allocating the available computing resources to the cloud services. We present two novel VM-scaling algorithms focused on dEIS systems, which optimally detect most appropriate scaling conditions using performance-models of distributed applications derived from constant-workload benchmarks, together with SLA-specified performance constraints. We simulate the VM-scaling algorithms in a cloud simulator and compare against trace-based performance models of dEISs. We compare a total of three SLA-based VM-scaling algorithms (one using prediction mechanisms) based on a real-world application scenario involving a large variable number of users. Our results show that it is beneficial to use autoregressive predictive SLA-driven scaling algorithms in cloud management systems for guaranteeing performance invariants of distributed cloud applications, as opposed to using only reactive SLA-based VM-scaling algorithms.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cloud Computing enables provisioning and distribution of highly scalable services in a reliable, on-demand and sustainable manner. However, objectives of managing enterprise distributed applications in cloud environments under Service Level Agreement (SLA) constraints lead to challenges for maintaining optimal resource control. Furthermore, conflicting objectives in management of cloud infrastructure and distributed applications might lead to violations of SLAs and inefficient use of hardware and software resources. This dissertation focusses on how SLAs can be used as an input to the cloud management system, increasing the efficiency of allocating resources, as well as that of infrastructure scaling. First, we present an extended SLA semantic model for modelling complex service-dependencies in distributed applications, and for enabling automated cloud infrastructure management operations. Second, we describe a multi-objective VM allocation algorithm for optimised resource allocation in infrastructure clouds. Third, we describe a method of discovering relations between the performance indicators of services belonging to distributed applications and then using these relations for building scaling rules that a CMS can use for automated management of VMs. Fourth, we introduce two novel VM-scaling algorithms, which optimally scale systems composed of VMs, based on given SLA performance constraints. All presented research works were implemented and tested using enterprise distributed applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Advancements in cloud computing have enabled the proliferation of distributed applications, which require management and control of multiple services. However, without an efficient mechanism for scaling services in response to changing workload conditions, such as number of connected users, application performance might suffer, leading to violations of Service Level Agreements (SLA) and possible inefficient use of hardware resources. Combining dynamic application requirements with the increased use of virtualised computing resources creates a challenging resource Management context for application and cloud-infrastructure owners. In such complex environments, business entities use SLAs as a means for specifying quantitative and qualitative requirements of services. There are several challenges in running distributed enterprise applications in cloud environments, ranging from the instantiation of service VMs in the correct order using an adequate quantity of computing resources, to adapting the number of running services in response to varying external loads, such as number of users. The application owner is interested in finding the optimum amount of computing and network resources to use for ensuring that the performance requirements of all her/his applications are met. She/he is also interested in appropriately scaling the distributed services so that application performance guarantees are maintained even under dynamic workload conditions. Similarly, the infrastructure Providers are interested in optimally provisioning the virtual resources onto the available physical infrastructure so that her/his operational costs are minimized, while maximizing the performance of tenants’ applications. Motivated by the complexities associated with the management and scaling of distributed applications, while satisfying multiple objectives (related to both consumers and providers of cloud resources), this thesis proposes a cloud resource management platform able to dynamically provision and coordinate the various lifecycle actions on both virtual and physical cloud resources using semantically enriched SLAs. The system focuses on dynamic sizing (scaling) of virtual infrastructures composed of virtual machines (VM) bounded application services. We describe several algorithms for adapting the number of VMs allocated to the distributed application in response to changing workload conditions, based on SLA-defined performance guarantees. We also present a framework for dynamic composition of scaling rules for distributed service, which used benchmark-generated application Monitoring traces. We show how these scaling rules can be combined and included into semantic SLAs for controlling allocation of services. We also provide a detailed description of the multi-objective infrastructure resource allocation problem and various approaches to satisfying this problem. We present a resource management system based on a genetic algorithm, which performs allocation of virtual resources, while considering the optimization of multiple criteria. We prove that our approach significantly outperforms reactive VM-scaling algorithms as well as heuristic-based VM-allocation approaches.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The software engineering community has paid little attention to non-functional requirements, or quality attributes, compared with studies performed on capture, analysis and validation of functional requirements. This circumstance becomes more intense in the case of distributed applications. In these applications we have to take into account, besides the quality attributes such as correctness, robustness, extendibility, reusability, compatibility, efficiency, portability and ease of use, others like reliability, scalability, transparency, security, interoperability, concurrency, etc. In this work we will show how these last attributes are related to different abstractions that coexist in the problem domain. To achieve this goal, we have established a taxonomy of quality attributes of distributed applications and have determined the set of necessary services to support such attributes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The questions of distributed systems development based on Java RMI, EJB and J2EE technologies and tools are rated. Here is brought the comparative analysis, which determines the domain of an expedient demand of the considered information technologies as applied to the concrete distributed applications requirements.