850 results for managed execution
Abstract:
This paper describes the design of a power-efficient microarchitecture for transient fault detection in chip multiprocessors (CMPs). We introduce a new per-core dynamic voltage and frequency scaling (DVFS) algorithm for our architecture that significantly reduces power dissipation for redundant execution with minimal performance overhead. Using cycle-accurate simulation combined with a simple first-order power model, we estimate that our architecture reduces dynamic power dissipation in the redundant core by a mean of 79% and a maximum of 85%, with an associated mean performance overhead of only 1.2%.
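The abstract does not spell out its first-order power model. As a rough illustration only, the sketch below uses the textbook CMOS relation P = C_eff * V^2 * f to show why lowering voltage and frequency together cuts dynamic power so sharply. The function names and the 60% operating point are hypothetical and are not taken from the paper.

```python
def dynamic_power(c_eff: float, voltage: float, frequency: float) -> float:
    """Textbook first-order CMOS dynamic power: P = C_eff * V^2 * f."""
    return c_eff * voltage ** 2 * frequency


def relative_saving(v_scale: float, f_scale: float) -> float:
    """Fraction of dynamic power saved when V and f are scaled by the given factors.

    The baseline is normalised to C_eff = V = f = 1, so only the ratios matter.
    """
    baseline = dynamic_power(1.0, 1.0, 1.0)
    scaled = dynamic_power(1.0, v_scale, f_scale)
    return 1.0 - scaled / baseline


if __name__ == "__main__":
    # Hypothetical operating point (an assumption, not a figure from the paper):
    # the redundant core runs at 60% of nominal voltage and 60% of nominal frequency.
    print(f"saving: {relative_saving(0.6, 0.6):.1%}")
```

Under this simplified model the saving grows with the cube of the scaling factor when voltage and frequency are scaled together, which is consistent in spirit with the large reductions the abstract reports.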
Abstract:
Workstation clusters equipped with high-performance interconnects that have programmable network processors offer interesting opportunities to enhance the performance of parallel applications run on them. In this paper, we propose schemes where certain application-level processing in parallel database query execution is performed on the network processor. We evaluate the performance of TPC-H queries executing on a high-end cluster where all tuple processing is done on the host processor, using a timed Petri net model, and find that tuple processing costs on the host processor dominate the execution time. These results are validated using a small cluster. We therefore propose 4 schemes where certain tuple processing activity is offloaded to the network processor. The first 2 schemes offload the tuple splitting activity (the computation to identify the node on which to process the tuples), resulting in an execution time speedup of 1.09 relative to the base scheme, but with the I/O bus becoming the bottleneck resource. In the 3rd scheme, in addition to offloading tuple processing activity, the disk and network interface are combined to avoid the I/O bus bottleneck, which results in speedups of up to 1.16, but with high host processor utilization. Our 4th scheme, where the network processor also performs a part of the join operation along with the host processor, gives a speedup of 1.47 along with balanced system resource utilization. Further, we observe that the proposed schemes perform equally well even in a scaled architecture, i.e., when the number of processors is increased from 2 to 64.
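The tuple splitting step mentioned above (identifying the node on which each tuple should be processed) can be pictured with a small sketch. It assumes hash partitioning on a join key, which is a common approach in parallel databases but is not confirmed by the abstract. The names `target_node` and `split_tuples` are hypothetical.

```python
import zlib
from collections import defaultdict


def target_node(join_key: str, num_nodes: int) -> int:
    """Map a join key to a node index with a deterministic hash (CRC32 here)."""
    return zlib.crc32(join_key.encode("utf-8")) % num_nodes


def split_tuples(tuples, key_index: int, num_nodes: int) -> dict:
    """Group tuples by the node that should process them.

    Assumption: the key at `key_index` decides placement, so tuples with equal
    keys always land on the same node and can be joined locally.
    """
    buckets = defaultdict(list)
    for row in tuples:
        node = target_node(str(row[key_index]), num_nodes)
        buckets[node].append(row)
    return dict(buckets)


if __name__ == "__main__":
    # Illustrative rows only: (order_key, description).
    rows = [(1, "a"), (2, "b"), (1, "c"), (7, "d")]
    for node, part in sorted(split_tuples(rows, key_index=0, num_nodes=4).items()):
        print(node, part)
```

In the schemes described, this per-tuple computation is what moves from the host processor to the network processor; the sketch only shows the logic, not where it runs.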
Abstract:
Relentless CMOS scaling coupled with lower design tolerances is making ICs increasingly susceptible to wear-out related permanent faults and transient faults, necessitating on-chip fault tolerance in future chip multiprocessors (CMPs). In this paper, we describe a power-efficient architecture for redundant execution on CMPs which, when coupled with our per-core dynamic voltage and frequency scaling (DVFS) algorithm, significantly reduces the energy overhead of redundant execution without sacrificing performance. Our evaluation shows that this architecture has a performance overhead of only 0.3% and consumes only 1.48 times the energy of a non-fault-tolerant baseline.
Abstract:
MATLAB is an array language that was initially popular for rapid prototyping but is now increasingly used to develop production code for numerical and scientific applications. Typical MATLAB programs have abundant data parallelism. These programs also have control-flow-dominated scalar regions that have an impact on the program's execution time. Today's computer systems have tremendous computing power in the form of traditional CPU cores and throughput-oriented accelerators such as graphics processing units (GPUs). Thus, an approach that maps the control-flow-dominated regions to the CPU and the data-parallel regions to the GPU can significantly improve program performance. In this paper, we present the design and implementation of MEGHA, a compiler that automatically compiles MATLAB programs to enable synergistic execution on heterogeneous processors. Our solution is fully automated and does not require programmer input for identifying data-parallel regions. We propose a set of compiler optimizations tailored for MATLAB. Our compiler identifies data-parallel regions of the program and composes them into kernels. The problem of combining statements into kernels is formulated as a constrained graph clustering problem. Heuristics are presented to map identified kernels to either the CPU or the GPU so that kernel execution on the CPU and the GPU happens synergistically and the amount of data transfer needed is minimized. To ensure the required data movement for dependencies across basic blocks, we propose a data flow analysis and edge splitting strategy. Thus our compiler automatically handles composition of kernels, mapping of kernels to CPU and GPU, scheduling, and insertion of required data transfers. The proposed compiler was implemented, and experimental evaluation using a set of MATLAB benchmarks shows that our approach achieves a geometric mean speedup of 19.8X for data-parallel benchmarks over native execution of MATLAB.
Abstract:
GPUs have been used for parallel execution of DOALL loops. However, loops with indirect array references can potentially cause cross-iteration dependences which are hard to detect using existing compilation techniques. Applications with such loops cannot easily use the GPU and hence do not benefit from the tremendous compute capabilities of GPUs. In this paper, we present an algorithm to compute at runtime the cross-iteration dependences in such loops. The algorithm uses both the CPU and the GPU to compute the dependences. Specifically, it uses the compute capabilities of the GPU to quickly collect the memory accesses performed by the iterations by executing the slice functions generated for the indirect array accesses. Using the dependence information, the loop iterations are levelized such that each level contains independent iterations which can be executed in parallel. Another interesting aspect of the proposed solution is that it pipelines the dependence computation of the future level with the actual computation of the current level to effectively utilize the resources available in the GPU. We evaluate our implementation on an NVIDIA Tesla C2070 using benchmarks from the Polybench suite and some synthetic benchmarks. Our experiments show that the proposed technique can achieve an average speedup of 6.4x on loops with a reasonable number of cross-iteration dependences.
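The levelization idea can be illustrated with a small sequential sketch. It is not the authors' GPU algorithm: it assumes the read and write sets of each iteration have already been collected, and it assigns each iteration the smallest level that respects read-after-write, write-after-read and write-after-write dependences on earlier iterations. The names `levelize` and `group_by_level` are hypothetical.

```python
from collections import defaultdict


def levelize(accesses):
    """Assign a level to each loop iteration.

    `accesses` is a list of (reads, writes) pairs in iteration order, where
    each element is a set of memory locations. Iterations that share a level
    have no dependences among themselves and could run in parallel.
    """
    last_write_level = {}  # location -> level of the latest iteration that wrote it
    last_read_level = {}   # location -> highest level of any iteration that read it
    levels = []
    for reads, writes in accesses:
        level = 0
        for loc in reads:
            # Read-after-write: must run after the previous writer.
            if loc in last_write_level:
                level = max(level, last_write_level[loc] + 1)
        for loc in writes:
            # Write-after-write and write-after-read.
            if loc in last_write_level:
                level = max(level, last_write_level[loc] + 1)
            if loc in last_read_level:
                level = max(level, last_read_level[loc] + 1)
        levels.append(level)
        for loc in reads:
            last_read_level[loc] = max(last_read_level.get(loc, -1), level)
        for loc in writes:
            last_write_level[loc] = max(last_write_level.get(loc, -1), level)
    return levels


def group_by_level(levels):
    """Return the iteration indices of each level, in level order."""
    groups = defaultdict(list)
    for iteration, level in enumerate(levels):
        groups[level].append(iteration)
    return [groups[level] for level in sorted(groups)]


if __name__ == "__main__":
    # Illustrative loop: a[idx[i]] += b[i], with a made-up index array.
    idx = [0, 1, 0, 2, 1]
    accesses = [({("a", j)}, {("a", j)}) for j in idx]
    print(group_by_level(levelize(accesses)))  # [[0, 1, 3], [2, 4]]
```

Here iterations 2 and 4 touch elements already updated by iterations 0 and 1, so they fall into a second level; the remaining iterations are independent and share the first level.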
Abstract:
An exciting application of crowdsourcing is to use social networks in complex task execution. In this paper, we address the problem of a planner who needs to incentivize agents within a network in order to seek their help in executing an atomic task as well as in recruiting other agents to execute the task. We study this mechanism design problem under two natural resource optimization settings: (1) cost-critical tasks, where the planner's goal is to minimize the total cost, and (2) time-critical tasks, where the goal is to minimize the total time elapsed before the task is executed. We identify a set of desirable properties that should ideally be satisfied by a crowdsourcing mechanism. In particular, sybil-proofness and collapse-proofness are two complementary properties in our desiderata. We prove that no mechanism can satisfy all the desirable properties simultaneously. This leads us naturally to explore approximate versions of the critical properties. We focus our attention on approximate sybil-proofness, and our exploration leads to a parametrized family of payment mechanisms which satisfy collapse-proofness. We characterize the approximate versions of the desirable properties in the cost-critical and time-critical domains.
Abstract:
Rapid advancements in multi-core processor architectures coupled with low-cost, low-latency, high-bandwidth interconnects have made clusters of multi-core machines a common computing resource. Unfortunately, writing good parallel programs that efficiently utilize all the resources in such a cluster is still a major challenge. Various programming languages have been proposed as a solution to this problem, but are yet to be adopted widely to run performance-critical code mainly due to the relatively immature software framework and the effort involved in re-writing existing code in the new language. In this paper, we motivate and describe our initial study in exploring CUDA as a programming language for a cluster of multi-cores. We develop CUDA-For-Clusters (CFC), a framework that transparently orchestrates execution of CUDA kernels on a cluster of multi-core machines. The well-structured nature of a CUDA kernel, the growing popularity, support and stability of the CUDA software stack collectively make CUDA a good candidate to be considered as a programming language for a cluster. CFC uses a mixture of source-to-source compiler transformations, a work distribution runtime and a light-weight software distributed shared memory to manage parallel executions. Initial results on running several standard CUDA benchmark programs achieve impressive speedups of up to 7.5X on a cluster with 8 nodes, thereby opening up an interesting direction of research for further investigation.
Abstract:
Global conservation policy is increasingly debating the feasibility of reconciling wildlife conservation and human resource requirements in land uses outside protected areas (PAs). However, there are few quantitative assessments of whether or to what extent these 'wildlife-friendly' land uses fulfill a fundamental function of PAs: to separate biodiversity from anthropogenic threats. We distinguish the role of wildlife-friendly land uses as being (a) subsidiary, whereby they augment PAs with secondary habitat, or (b) substitutive, wherein they provide comparable habitat to PAs. We tested our hypotheses by investigating the influence of land use and human presence on space-use intensity of the endangered Asian elephant (Elephas maximus) in a fragmented landscape comprising PAs and wildlife-friendly land uses. We applied multistate occupancy models to spatial data on elephant occurrence to estimate and model the overall probability of elephants using a site, and the conditional probability of high-intensity use given that elephants use a site. The probability of elephants using a site regardless of intensity did not vary between PAs and wildlife-friendly land uses. However, high-intensity use declined with distance to PAs, and this effect was accentuated by an increase in village density. Therefore, while wildlife-friendly land uses did play a subsidiary conservation role, their potential to substitute for PAs was offset by a strong human presence. Our findings demonstrate the need to evaluate the role of wildlife-friendly land uses in landscape-scale conservation; for species that have conflicting resource requirements with people, PAs are likely to provide crucial refuge from growing anthropogenic threats. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
Why are SRS important? The answer is to be found in this well-structured survey under: SRS as food source; SRS as additional source of cash income; Role of SRS in social capital. An analysis of the threats to SRS and the potential management options for farmer managed aquatic systems are also available in this survey along with the following definition of SRS: SRS are defined as aquatic animals that can be harvested from farmer managed aquatic systems without regular stocking. (PDF contains 4 pages)
Abstract:
The aims of this paper are twofold: first, to characterise rural poverty and give a broad overview of the agro-ecological, climatic and socio-economic conditions in Sri Lanka which shape poverty; second, to present the methodology employed to screen suitable field research areas and the techniques subsequently used to carry out Rapid Rural Appraisal in two upper-watershed villages. Also presented are details of a concurrent stakeholder analysis that aimed to investigate the capacity of secondary stakeholders to promote sustainable aquatic resource development and to invite their participation in the formulation of a participatory research agenda. [PDF contains 58 pages]
Abstract:
RRAs were carried out in two Small Tank Cascade systems (STCs) of North West Province, Sri Lanka (less than 1000 ha total watershed area). A total of 21 tanks and 7 villages were investigated with primary emphasis on two upper watershed communities. The two systems differ primarily in their resource base; namely rainfall, natural forests and proximity to large scale perennial irrigation resources. [PDF contains 86 pages]
Abstract:
The importance of fish meal production as a means of reducing the fish waste currently being experienced in the fisheries subsector is discussed. A cost estimate for establishing a fish meal manufacturing plant in Nigeria and suggestions on rational execution of the project are presented. If properly located and well managed, the project will serve to convert fish waste to cash in the industrial fishery.
Abstract:
[ENG] If we look around us, we can see that in every field of activity someone is the best. We might think that such people are exceptional individuals. This Final Project aims to increase knowledge of the effects of Deliberate Practice in the domains of music and sport. It first defines the concept of Deliberate Practice and then examines the variety of situations in which it appears in real life. From a questionnaire designed for this study and distributed to music students, I expected to obtain results that would allow me to conclude that there is a relationship between hours of practice and expertise in execution. These findings were then related to the corresponding situation in sport, using information provided by the coordinators of the different sports. Given the limited number of references available, this work focuses on a qualitative analysis of the data, interpreted from my own point of view and personal experience, which was confirmed by the results obtained. The statistics handled allow me to conclude that, although the argument is not definitive, guided effort through deliberate practice is essential to achieve excellence.
Abstract:
The management of African freshwater fisheries in Southern African Development Coordination (SADC) countries is discussed. Changes in catch and fishing effort in the SADC freshwater fisheries in the past 50 years, the main causes behind the patterns of change in fishing effort, the effects of fishing effort and environment on the regeneration of fish stocks, as well as existing and proposed fisheries management regulations are investigated.