5 results for Animales útiles

at Indian Institute of Science - Bangalore - India


Relevance: 10.00%

Abstract:

Mufflers with at least one acoustically absorptive duct are generally called dissipative mufflers. For want of a systems approach, these mufflers are usually characterized by the transmission loss of the lined duct, with overriding corrections for the terminations, mean flow, etc. This article proposes that the dissipative duct be integrated with the other muffler elements, the source impedance, and the radiation impedance by means of a transfer matrix approach. Towards this end, the transfer matrix for a rectangular duct with mean flow is derived here for the least attenuated mode. Mean flow introduces a coupling between the transverse wave numbers and the axial wave number, so evaluating them calls for the simultaneous solution of two or three transcendental equations. This is done by means of a Newton-Raphson iteration scheme, which is illustrated here for square ducts lined with porous ceramic tiles.
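The simultaneous Newton-Raphson solution mentioned in this abstract can be sketched as follows. This is a minimal illustration only: the abstract does not give the duct equations, so the two-equation system `toy_residuals` is a made-up stand-in, and the names `newton_system` and `solve_linear` are hypothetical. The real wave numbers would be complex-valued; the sketch uses real numbers and a finite-difference Jacobian for brevity.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def newton_system(f, x0, tol=1e-10, max_iter=50, h=1e-7):
    """Newton-Raphson for f(x) = 0, with a forward-difference Jacobian."""
    x = list(x0)
    n = len(x)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            return x
        # Build the Jacobian one column at a time.
        jac = [[0.0] * n for _ in range(n)]
        for j in range(n):
            xp = list(x)
            xp[j] += h
            fp = f(xp)
            for i in range(n):
                jac[i][j] = (fp[i] - fx[i]) / h
        dx = solve_linear(jac, [-v for v in fx])
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("Newton-Raphson did not converge")


def toy_residuals(v):
    """Hypothetical coupled transcendental system (NOT the duct equations)."""
    x, y = v
    return [x * math.tan(x) - y, x * x + y * y - 4.0]


solution = newton_system(toy_residuals, [1.0, 1.5])
print(solution, toy_residuals(solution))
```

The same loop extends to three unknowns by returning three residuals; convergence still depends on a reasonable starting guess.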

Relevance: 10.00%

Abstract:

Video decoders used in emerging applications need to be flexible enough to handle a large variety of video formats and must deliver scalable performance to handle wide variations in workloads. In this paper we propose a unified software and hardware architecture for video decoding that achieves scalable performance with flexibility. The lightweight processor tiles and the reconfigurable hardware tiles in our architecture enable software and hardware implementations to coexist, while a programmable interconnect enables dynamic interconnection of the tiles. Our process-network-oriented compilation flow achieves realization-agnostic application partitioning and enables seamless migration across uniprocessor, multiprocessor, semi-hardware, and full-hardware implementations of a video decoder. An application quality-of-service-aware scheduler monitors and controls the operation of the entire system. We prove the concept through a prototype of the architecture on an off-the-shelf FPGA. The FPGA prototype scales in performance from QCIF to 1080p resolutions in four discrete steps. We also demonstrate that the reconfiguration time is short enough to allow migration from one configuration to another without any frame loss.

Relevance: 10.00%

Abstract:

Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of the iteration space and a set of tiling hyperplanes such that all tiles along that face can be started concurrently. This provides load balance and maximizes parallelism. However, existing automatic tiling frameworks often choose hyperplanes that lead to pipelined start-up and load imbalance. We address this issue with a new tiling technique that ensures concurrent start-up as well as perfect load balance whenever possible. We first provide necessary and sufficient conditions on tiling hyperplanes to enable concurrent start for programs with affine data accesses. We then provide an approach to find such hyperplanes. Experimental evaluation on a 12-core Intel Westmere shows that our code outperforms a tuned domain-specific stencil code generator by 4% to 27%, and previous compiler techniques by a factor of 2 to 10.14.
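The difference between concurrent and pipelined start can be illustrated on a 1-D three-point stencil iterated over time, with iteration space (t, i). The sketch below rests on an assumption: the abstract does not state its conditions, so the test used here (the face normal must be a strictly positive combination of the two tiling hyperplanes) is only one way such a condition can be formulated. The names `DEPS`, `is_valid_tiling`, `cone_coefficients`, and `allows_concurrent_start` are hypothetical.

```python
from fractions import Fraction

# Dependence vectors (dt, di) of a 3-point stencil: A[t+1][i] reads A[t][i-1..i+1].
DEPS = [(1, -1), (1, 0), (1, 1)]


def is_valid_tiling(hyperplanes, deps=DEPS):
    """A hyperplane h is a legal tiling hyperplane if h . d >= 0 for every dependence d."""
    return all(h[0] * d[0] + h[1] * d[1] >= 0 for h in hyperplanes for d in deps)


def cone_coefficients(face, h1, h2):
    """Solve face = a*h1 + b*h2 exactly (Cramer's rule); None if h1 and h2 are parallel."""
    det = h1[0] * h2[1] - h1[1] * h2[0]
    if det == 0:
        return None
    a = Fraction(face[0] * h2[1] - face[1] * h2[0], det)
    b = Fraction(h1[0] * face[1] - h1[1] * face[0], det)
    return a, b


def allows_concurrent_start(face, h1, h2):
    """Assumed criterion: the face normal lies strictly inside the cone of h1 and h2."""
    coeffs = cone_coefficients(face, h1, h2)
    return coeffs is not None and coeffs[0] > 0 and coeffs[1] > 0


time_face = (1, 0)  # the face t = 0, where the computation begins

diamond = [(1, 1), (1, -1)]  # diamond-shaped tiles
skewed = [(1, 0), (1, 1)]    # time-skewed tiles

for name, hs in [("diamond", diamond), ("skewed", skewed)]:
    print(name, is_valid_tiling(hs), allows_concurrent_start(time_face, *hs))
```

Both hyperplane pairs are legal for these dependences, but under the assumed criterion only the diamond pair puts the time face strictly inside the cone, so only it lets every tile along t = 0 start at once.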

Relevance: 10.00%

Abstract:

Effective air flow distribution through perforated tiles is required to cool servers efficiently in a raised-floor data center. We present detailed computational fluid dynamics (CFD) modeling of air flow through a perforated tile and its entrance to the adjacent server rack. The realistic geometrical details of the perforated tile, as well as of the rack, are included in the model. Generally, models for air flow through perforated tiles specify a step pressure loss across the tile surface based on the tile porosity (the porous jump model). An improvement on this adds a momentum source above the tile to simulate the acceleration of the air flow through the pores (the body force model). Neither model includes geometrical details of the tile, such as pore locations and shapes. Including more details increases the grid size as well as the computational time. However, the grid refinement can be controlled to balance accuracy against computational time. We compared the CFD results obtained with full geometrical resolution against the porous jump and body force model solutions, as well as against the flow field measured in particle image velocimetry (PIV) experiments. We observe that including the tile's geometrical details gives better results than omitting them and specifying physical models across and above the tile surface. A modification to the body force model is also suggested, and it yields improved results.
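The two simplified tile models described here can be put into numbers with a short sketch. It is illustrative only. The quadratic loss form (a loss coefficient times the dynamic pressure) is a common convention for a step pressure loss and is assumed here, not taken from the paper. `loss_coefficient` is a hypothetical input that would in practice come from a porosity correlation or from measurement. The pore velocity follows from incompressible mass conservation, averaged over the open area.

```python
def porous_jump_pressure_drop(approach_velocity, loss_coefficient, air_density=1.2):
    """Step pressure loss across the tile, in Pa: K * (1/2) * rho * v^2.

    approach_velocity: area-averaged velocity upstream of the tile, in m/s.
    loss_coefficient:  dimensionless K (hypothetical input, see lead-in).
    air_density:       in kg/m^3; 1.2 is a typical room-temperature value.
    """
    return loss_coefficient * 0.5 * air_density * approach_velocity ** 2


def pore_velocity(approach_velocity, porosity):
    """Mean velocity inside the pores for an open-area fraction `porosity`.

    The same volume flow squeezed through a smaller open area speeds up
    by a factor of 1/porosity. A body-force-style correction tries to
    restore this acceleration, which a pure pressure jump does not capture.
    """
    if not 0.0 < porosity <= 1.0:
        raise ValueError("porosity must be in (0, 1]")
    return approach_velocity / porosity


v = 2.0  # m/s, hypothetical approach velocity
print(porous_jump_pressure_drop(v, loss_coefficient=10.0))
print(pore_velocity(v, porosity=0.25))
```

Neither function knows where the pores are or what shape they have, which is the limitation the abstract attributes to both simplified models.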

Relevance: 10.00%

Abstract:

Multi-GPU machines are increasingly used in high-performance computing. Each GPU in such a machine has its own memory and does not share an address space with either the host CPU or the other GPUs. Hence, applications that use multiple GPUs have to allocate and manage data on each GPU manually. Existing work on automating data allocation for GPUs has limitations and inefficiencies in terms of allocation sizes, exploiting reuse, transfer costs, and scalability. We propose a scalable and fully automatic data allocation and buffer management scheme for affine loop nests on multi-GPU machines. We call it the Bounding-Box-based Memory Manager (BBMM). At runtime, BBMM can perform standard set operations such as union, intersection, and difference, and can find subset and superset relations, on hyperrectangular regions of array data (bounding boxes). It uses these operations, along with some compiler assistance, to identify, allocate, and manage the data required by applications in terms of disjoint bounding boxes. This allows it to (1) allocate exactly or nearly as much data as is required by the computations running on each GPU, (2) efficiently track buffer allocations and hence maximize data reuse across tiles and minimize data transfer overhead, and (3) as a result, maximize utilization of the combined memory on multi-GPU machines. BBMM can work with any choice of parallelizing transformations, computation placement, and scheduling schemes, whether static or dynamic. Experiments run on a four-GPU machine with various scientific programs showed that BBMM reduces data allocations on each GPU by up to 75% compared to current allocation schemes, yields performance of at least 88% of manually written code, and allows excellent weak scaling.
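The set operations on bounding boxes named in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not BBMM's actual data structures or API: a box is modeled as one inclusive integer `(lo, hi)` range per array dimension, and the names `Box`, `intersect`, `is_subset`, `difference`, and `volume` are hypothetical.

```python
from typing import List, Optional, Tuple

# One inclusive (lo, hi) index range per array dimension (assumed representation).
Box = Tuple[Tuple[int, int], ...]


def intersect(a: Box, b: Box) -> Optional[Box]:
    """Overlap of two boxes, or None if they are disjoint."""
    out = []
    for (alo, ahi), (blo, bhi) in zip(a, b):
        lo, hi = max(alo, blo), min(ahi, bhi)
        if lo > hi:
            return None
        out.append((lo, hi))
    return tuple(out)


def is_subset(a: Box, b: Box) -> bool:
    """True if box a lies entirely inside box b."""
    return all(blo <= alo and ahi <= bhi for (alo, ahi), (blo, bhi) in zip(a, b))


def difference(a: Box, b: Box) -> List[Box]:
    """Split (a minus b) into disjoint boxes by peeling off slabs dimension by dimension."""
    common = intersect(a, b)
    if common is None:
        return [a]
    pieces = []
    remaining = list(a)
    for dim, ((lo, hi), (clo, chi)) in enumerate(zip(a, common)):
        if lo < clo:  # slab below the overlap in this dimension
            piece = list(remaining)
            piece[dim] = (lo, clo - 1)
            pieces.append(tuple(piece))
        if chi < hi:  # slab above the overlap in this dimension
            piece = list(remaining)
            piece[dim] = (chi + 1, hi)
            pieces.append(tuple(piece))
        remaining[dim] = (clo, chi)  # shrink to the overlap before the next dimension
    return pieces


def volume(box: Box) -> int:
    """Number of array elements covered by the box."""
    n = 1
    for lo, hi in box:
        n *= hi - lo + 1
    return n


resident = ((5, 14), (5, 14))  # region already held in a GPU buffer
needed = ((0, 9), (0, 9))      # region the next tile will access
missing = difference(needed, resident)
print(missing, sum(volume(p) for p in missing))
```

In this example only the elements in `missing` would have to be allocated and transferred, because the overlap with `resident` can be reused. This is the kind of reuse across tiles the abstract refers to.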