52 resultados para Distributed replication system


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Opportunistic selection selects the node that improves the overall system performance the most. Selecting the best node is challenging as the nodes are geographically distributed and have only local knowledge. Yet, selection must be fast to allow more time to be spent on data transmission, which exploits the selected node's services. We analyze the impact of imperfect power control on a fast, distributed, splitting based selection scheme that exploits the capture effect by allowing the transmitting nodes to have different target receive powers and uses information about the total received power to speed up selection. Imperfect power control makes the received power deviate from the target and, hence, affects performance. Our analysis quantifies how it changes the selection probability, reduces the selection speed, and leads to the selection of no node or a wrong node. We show that the effect of imperfect power control is primarily driven by the ratio of target receive powers. Furthermore, we quantify its effect on the net system throughput.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Hepatitis C virus (HCV) is the causative agent of end-stage liver disease. Recent advances in the last decade in anti HCV treatment strategies have dramatically increased the viral clearance rate. However, several limitations are still associated, which warrant a great need of novel, safe and selective drugs against HCV infection. Towards this objective, we explored highly potent and selective small molecule inhibitors, the ellagitannins, from the crude extract of Pomegranate (Punica granatum) fruit peel. The pure compounds, punicalagin, punicalin, and ellagic acid isolated from the extract specifically blocked the HCV NS3/4A protease activity in vitro. Structural analysis using computational approach also showed that ligand molecules interact with the catalytic and substrate binding residues of NS3/4A protease, leading to inhibition of the enzyme activity. Further, punicalagin and punicalin significantly reduced the HCV replication in cell culture system. More importantly, these compounds are well tolerated ex vivo and `no observed adverse effect level' (NOAEL) was established upto an acute dose of 5000 mg/kg in BALB/c mice. Additionally, pharmacokinetics study showed that the compounds are bioavailable. Taken together, our study provides a proof-of-concept approach for the potential use of antiviral and non-toxic principle ellagitannins from pomegranate in prevention and control of HCV induced complications.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Programming for parallel architectures that do not have a shared address space is extremely difficult due to the need for explicit communication between memories of different compute devices. A heterogeneous system with CPUs and multiple GPUs, or a distributed-memory cluster are examples of such systems. Past works that try to automate data movement for distributed-memory architectures can lead to excessive redundant communication. In this paper, we propose an automatic data movement scheme that minimizes the volume of communication between compute devices in heterogeneous and distributed-memory systems. We show that by partitioning data dependences in a particular non-trivial way, one can generate data movement code that results in the minimum volume for a vast majority of cases. The techniques are applicable to any sequence of affine loop nests and works on top of any choice of loop transformations, parallelization, and computation placement. The data movement code generated minimizes the volume of communication for a particular configuration of these. We use a combination of powerful static analyses relying on the polyhedral compiler framework and lightweight runtime routines they generate, to build a source-to-source transformation tool that automatically generates communication code. We demonstrate that the tool is scalable and leads to substantial gains in efficiency. On a heterogeneous system, the communication volume is reduced by a factor of 11X to 83X over state-of-the-art, translating into a mean execution time speedup of 1.53X. On a distributed-memory cluster, our scheme reduces the communication volume by a factor of 1.4X to 63.5X over state-of-the-art, resulting in a mean speedup of 1.55X. In addition, our scheme yields a mean speedup of 2.19X over hand-optimized UPC codes.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Opportunistic selection in multi-node wireless systems improves system performance by selecting the ``best'' node and by using it for data transmission. In these systems, each node has a real-valued local metric, which is a measure of its ability to improve system performance. Our goal is to identify the best node, which has the largest metric. We propose, analyze, and optimize a new distributed, yet simple, node selection scheme that combines the timer scheme with power control. In it, each node sets a timer and transmit power level as a function of its metric. The power control is designed such that the best node is captured even if. other nodes simultaneously transmit with it. We develop several structural properties about the optimal metric-to-timer-and-power mapping, which maximizes the probability of selecting the best node. These significantly reduce the computational complexity of finding the optimal mapping and yield valuable insights about it. We show that the proposed scheme is scalable and significantly outperforms the conventional timer scheme. We investigate the effect of. and the number of receive power levels. Furthermore, we find that the practical peak power constraint has a negligible impact on the performance of the scheme.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper studies a pilot-assisted physical layer data fusion technique known as Distributed Co-Phasing (DCP). In this two-phase scheme, the sensors first estimate the channel to the fusion center (FC) using pilots sent by the latter; and then they simultaneously transmit their common data by pre-rotating them by the estimated channel phase, thereby achieving physical layer data fusion. First, by analyzing the symmetric mutual information of the system, it is shown that the use of higher order constellations (HOC) can improve the throughput of DCP compared to the binary signaling considered heretofore. Using an HOC in the DCP setting requires the estimation of the composite DCP channel at the FC for data decoding. To this end, two blind algorithms are proposed: 1) power method, and 2) modified K-means algorithm. The latter algorithm is shown to be computationally efficient and converges significantly faster than the conventional K-means algorithm. Analytical expressions for the probability of error are derived, and it is found that even at moderate to low SNRs, the modified K-means algorithm achieves a probability of error comparable to that achievable with a perfect channel estimate at the FC, while requiring no pilot symbols to be transmitted from the sensor nodes. Also, the problem of signal corruption due to imperfect DCP is investigated, and constellation shaping to minimize the probability of signal corruption is proposed and analyzed. The analysis is validated, and the promising performance of DCP for energy-efficient physical layer data fusion is illustrated, using Monte Carlo simulations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Distributed system has quite a lot of servers to attain increased availability of service and for fault tolerance. Balancing the load among these servers is an important task to achieve better performance. There are various hardware and software based load balancing solutions available. However there is always an overhead on Servers and the Load Balancer while communicating with each other and sharing their availability and the current load status information. Load balancer is always busy in listening to clients' request and redirecting them. It also needs to collect the servers' availability status frequently, to keep itself up-to-date. Servers are busy in not only providing service to clients but also sharing their current load information with load balancing algorithms. In this paper we have proposed and discussed the concept and system model for software based load balancer along with Availability-Checker and Load Reporters (LB-ACLRs) which reduces the overhead on server and the load balancer. We have also described the architectural components with their roles and responsibilities. We have presented a detailed analysis to show how our proposed Availability Checker significantly increases the performance of the system.