47 results for Library architecture

at Indian Institute of Science - Bangalore - India


Relevance:

30.00% 30.00%

Publisher:

Abstract:

The H.264 video standard achieves high video quality along with high data compression when compared to other existing video standards. H.264 uses context-based adaptive variable length coding (CAVLC) to code residual data in the Baseline profile. In this paper we describe a novel architecture for a CAVLC decoder, including a coeff-token decoder, level decoder, total-zeros decoder and run-before decoder. The UMC library in 0.13 μm CMOS technology is used to synthesize the proposed design. The proposed design reduces chip area and improves critical path performance of the CAVLC decoder in comparison with [1]. Macroblock-level (including luma and chroma) pipeline processing for CAVLC is implemented with an average of 141 cycles (including pipeline buffering) per macroblock at a 250 MHz clock frequency. To compare our results with [1], the clock frequency is constrained to 125 MHz. The area required for the proposed architecture is 17586 gates, which is a 22.1% improvement in comparison to [1]. We obtain a throughput of 1.73 × 10^6 macroblocks/second, which is 28% higher than that reported in [1]. The proposed design meets the processing requirement of 1080HD [5] video at 30 frames/second.
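The claim that the design meets the 1080HD requirement can be sanity-checked with a short calculation. The sketch below is not from the paper; it assumes 16×16 macroblocks and a 1920×1080 frame, with the picture height rounded up to a whole number of macroblock rows. The function names are illustrative only.

```python
import math

def macroblocks_per_second(width, height, fps, mb_size=16):
    """Macroblock rate a decoder must sustain for a given video format."""
    mbs_per_row = math.ceil(width / mb_size)   # 1920 / 16 = 120
    mb_rows = math.ceil(height / mb_size)      # 1080 / 16 rounds up to 68
    return mbs_per_row * mb_rows * fps

def cycle_budget(clock_hz, mb_rate):
    """Clock cycles available per macroblock at the given clock frequency."""
    return clock_hz / mb_rate

required = macroblocks_per_second(1920, 1080, 30)  # macroblocks per second
budget = cycle_budget(250e6, required)             # cycles per macroblock at 250 MHz

print(required)       # 244800
print(budget > 141)   # True: the 141-cycle average fits within the budget
```

Under these assumptions the decoder has roughly 1021 cycles per macroblock at 250 MHz, so an average of 141 cycles leaves considerable headroom.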

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Cardiac arrhythmias, such as ventricular tachycardia (VT) and ventricular fibrillation (VF), are among the leading causes of death in the industrialized world. These are associated with the formation of spiral and scroll waves of electrical activation in cardiac tissue; single spiral and scroll waves are believed to be associated with VT whereas their turbulent analogs are associated with VF. Thus, the study of these waves is an important biophysical problem. We present a systematic study of the combined effects of muscle-fiber rotation and inhomogeneities on scroll-wave dynamics in the TNNP (ten Tusscher Noble Noble Panfilov) model for human cardiac tissue. In particular, we use the three-dimensional TNNP model with fiber rotation and consider both conduction and ionic inhomogeneities. We find that, in addition to displaying a sensitive dependence on the positions, sizes, and types of inhomogeneities, scroll-wave dynamics also depends delicately upon the degree of fiber rotation. We find that the tendency of scroll waves to anchor to cylindrical conduction inhomogeneities increases with the radius of the inhomogeneity. Furthermore, the filament of the scroll wave can exhibit drift or meandering, transmural bending, twisting, and break-up. If the scroll-wave filament exhibits weak meandering, then there is a fine balance between the anchoring of this wave at the inhomogeneity and a disruption of wave-pinning by fiber rotation. If this filament displays strong meandering, then again the anchoring is suppressed by fiber rotation; also, the scroll wave can be eliminated from most of the layers only to be regenerated by a seed wave. Ionic inhomogeneities can also lead to an anchoring of the scroll wave; scroll waves can now enter the region inside an ionic inhomogeneity and can display a coexistence of spatiotemporal chaos and quasi-periodic behavior in different parts of the simulation domain. We discuss the experimental implications of our study.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Today's feature-rich multimedia products require embedded system solutions with complex Systems-on-Chip (SoC) to meet market expectations of high performance at low cost and low energy consumption. The memory architecture of the embedded system strongly influences these parameters. Hence the embedded system designer performs a complete memory architecture exploration. This is a multi-objective optimization problem and can be tackled as a two-level optimization problem. The outer level explores various memory architectures, while the inner level explores the placement of data sections (the data layout problem) to minimize memory stalls. Further, the designer would be interested in multiple optimal design points to address various market segments. However, tight time-to-market constraints enforce a short design cycle time. In this paper we address the multi-level multi-objective memory architecture exploration problem through a combination of a Multi-objective Genetic Algorithm (memory architecture exploration) and an efficient heuristic data placement algorithm. At the outer level, the memory architecture exploration is done by picking memory modules directly from an ASIC memory library. This helps in performing the memory architecture exploration in an integrated framework, where memory allocation, memory exploration and data layout work in a tightly coupled way to yield optimal design points with respect to area, power and performance. We evaluated our approach on 3 embedded applications; it explores several thousand memory architectures for each application, yielding a few hundred optimal design points in a few hours of computation time on a standard desktop.
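The "optimal design points" of a multi-objective exploration are usually the non-dominated (Pareto-optimal) ones. The sketch below shows only that filtering step, not the genetic algorithm or the data placement heuristic. The tuple layout (area, power, stall cycles) and the sample values are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical design point: (area, power, stall_cycles), all to be minimised.
DesignPoint = Tuple[float, float, float]

def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points: List[DesignPoint]) -> List[DesignPoint]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Made-up candidates standing in for evaluated memory architectures.
candidates = [
    (10.0, 5.0, 900.0),
    (12.0, 4.0, 800.0),
    (12.0, 6.0, 950.0),   # worse than the first candidate in every objective
    (8.0, 7.0, 1200.0),
]
front = pareto_front(candidates)
print(front)
```

Each surviving point represents a different trade-off, which is why a designer targeting several market segments would want the whole front and not a single optimum.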

Relevance:

20.00% 20.00%

Publisher:

Abstract:

DNA obtained from a human sputum isolate of Mycobacterium tuberculosis, NTI-64719, which showed extensive dissemination in the guinea pig model, resulting in a high score for virulence, was used to construct an expression library in the lambda ZAP vector. The size of DNA inserts in the library ranged from 1 to 3 kb, and recombinants represented 60% of the total plaques obtained. When probed with pooled serum from chronically infected tuberculosis patients, the library yielded 176 recombinants with a range of signal intensities. Among these, 93 recombinants were classified into 12 groups on the basis of DNA hybridization experiments. The polypeptides synthesized by the recombinants were predominantly LacZ fusion proteins. Serum obtained from patients who were clinically diagnosed to be in the early phase of M. tuberculosis infection was used to probe the 176 recombinants obtained. Interestingly, some recombinants that gave very strong signals in the original screen did not react with early-phase serum; conversely, others whose signals were extremely weak in the original screen gave very intense signals with serum from recently infected patients. This indicates the differential nature of either the expression of these antigens or the immune response elicited by them as a function of disease progression.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Simultaneous consideration of both performance and reliability issues is important in the choice of computer architectures for real-time aerospace applications. One of the requirements for such a fault-tolerant computer system is the characteristic of graceful degradation. A shared and replicated resources computing system represents such an architecture. In this paper, a combinatorial model is used for the evaluation of the instruction execution rate of a degradable, replicated resources computing system such as a modular multiprocessor system. Next, a method is presented to evaluate the computation reliability of such a system utilizing a reliability graph model and the instruction execution rate. Finally, this computation reliability measure, which simultaneously describes both performance and reliability, is applied as a constraint in an architecture optimization model for such computing systems. Index Terms-Architecture optimization, computation

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents the architecture and the VHDL design of an integer 2-D DCT used in H.264/AVC. The 2-D DCT computation is performed by exploiting its orthogonality and separability properties. The symmetry of the forward and inverse transforms is used in this implementation. To reduce the computation overhead for the addition, subtraction and multiplication operations, we analyze the suitability of the carry-free, position-independent residue number system (RNS) for the implementation of the 2-D DCT. The implementation has been carried out in VHDL for an Altera FPGA. We used the negative number representation in RNS, bit-width analysis of the transforms and the dedicated registers present in the Logic Element of the FPGA to optimize the area. The complexity and efficiency analysis shows that the proposed architecture could provide higher throughput.
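Separability means the 2-D transform can be computed as two passes of a 1-D transform, first over columns and then over rows. The sketch below illustrates this with the 4×4 forward integer core transform matrix of H.264/AVC. It uses ordinary integer arithmetic, not the RNS arithmetic of the paper, and omits the scaling and quantisation stage. The function names are illustrative.

```python
# Forward 4x4 integer core transform matrix of H.264/AVC.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def transform_1d(vec):
    """Multiply the matrix CF by a length-4 vector."""
    return [sum(c * v for c, v in zip(row, vec)) for row in CF]

def forward_transform_2d(block):
    """Compute CF * X * CF^T as a column pass followed by a row pass."""
    # Pass 1: transform every column of X, giving the columns of CF * X.
    cols = [transform_1d([block[r][c] for r in range(4)]) for c in range(4)]
    tmp = [[cols[c][i] for c in range(4)] for i in range(4)]
    # Pass 2: transform every row of (CF * X), giving (CF * X) * CF^T.
    return [transform_1d(row) for row in tmp]

flat = [[1] * 4 for _ in range(4)]
coeffs = forward_transform_2d(flat)
print(coeffs[0][0])  # 16: a flat block has only a DC coefficient
```

Because every entry of the matrix is ±1 or ±2, each 1-D pass needs only additions, subtractions and shifts, which is what makes the transform attractive for hardware.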

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper presents the architecture of a fault-tolerant, special-purpose multi-microprocessor system for solving Partial Differential Equations (PDEs). The modular nature of the architecture allows the use of hundreds of Processing Elements (PEs) for high throughput. Its performance is evaluated by both analytical and simulation methods. The results indicate that the system can achieve high operation rates and is not sensitive to inter-processor communication delay.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. With the requirement to process packets at line rates, high-performance routers need to forward millions of packets every second, with each packet needing up to seven memory accesses. Earlier work shows that a single cache for the nodes of a trie can reduce the number of external memory accesses. It is observed that the locality characteristics of the level-one nodes of a trie are significantly different from those of lower-level nodes. Hence, we propose a heterogeneously segmented cache architecture (HSCA) which uses separate caches for level-one and lower-level nodes, each with carefully chosen sizes. Besides reducing misses, segmenting the cache allows us to focus on optimizing the more frequently accessed level-one node segment. We find that, due to the nonuniform distribution of nodes among cache sets, the level-one nodes cache is susceptible to high conflict misses. We reduce conflict misses by introducing a novel two-level mapping-based cache placement framework. We also propose an elegant way to fit the modified placement function into the cache organization with minimal increase in access time. Further, we propose an attribute-preserving trace generation methodology which emulates real traces and can generate traces with varying locality. Performance results reveal that our HSCA scheme results in a 32 percent speedup in average memory access time over a unified nodes cache. Also, HSCA outperforms IHARC, a cache for lookup results, with as high as a 10-fold speedup in average memory access time. Two-level mapping further enhances the performance of the base HSCA by up to 13 percent, leading to an overall improvement of up to 40 percent over the unified scheme.
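The segmentation idea can be sketched as two independent caches selected by trie level, so that lower-level nodes can never evict level-one nodes. This is a simplified illustration only: it uses fully associative LRU caches in place of the set-associative organization and two-level mapping described in the abstract, and the class names, capacities and trace are hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Fully associative cache with least-recently-used replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def access(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)       # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        return False

class SegmentedNodeCache:
    """Separate segments for level-one and lower-level trie nodes."""
    def __init__(self, level_one_capacity, lower_capacity):
        self.level_one = LRUCache(level_one_capacity)
        self.lower = LRUCache(lower_capacity)

    def access(self, level, node_id):
        segment = self.level_one if level == 1 else self.lower
        return segment.access((level, node_id))

# Made-up trace of (trie level, node id) accesses.
cache = SegmentedNodeCache(level_one_capacity=2, lower_capacity=2)
trace = [(1, "a"), (2, "x"), (2, "y"), (2, "z"), (1, "a"), (1, "b"), (1, "a")]
for level, node in trace:
    cache.access(level, node)

print(cache.level_one.hits, cache.level_one.misses)  # 2 2
print(cache.lower.hits, cache.lower.misses)          # 0 3
```

Sizing the two segments separately is the design lever the abstract refers to: the level-one segment can be tuned on its own because traffic to lower-level nodes does not disturb it.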

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, three parallel polygon scan conversion algorithms have been proposed, and their performance when executed on a shared bus architecture has been compared. It has been shown that the parallel algorithm that does not use edge coherence performs better than those that use edge coherence. Further, a multiprocessing architecture has been proposed to execute the parallel polygon scan conversion algorithms more efficiently than a single shared bus architecture.
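To make the distinction concrete: edge coherence normally means updating edge intersections incrementally from one scanline to the next, which creates a dependency between neighbouring scanlines. The sketch below shows the alternative, where each scanline recomputes its intersections from scratch and can therefore be assigned to any processor independently. It is a generic illustration and not one of the three algorithms of the paper; the function names and the pixel-centre sampling rule are assumptions.

```python
def scanline_spans(polygon, y):
    """Spans covered on the scanline sampled at y + 0.5, computed from scratch."""
    ys = y + 0.5
    xs = []
    n = len(polygon)
    for i in range(n):
        (x0, y0), (x1, y1) = polygon[i], polygon[(i + 1) % n]
        # Half-open test: skips horizontal edges and counts each vertex once.
        if (y0 <= ys < y1) or (y1 <= ys < y0):
            xs.append(x0 + (ys - y0) * (x1 - x0) / (y1 - y0))
    xs.sort()
    # Consecutive pairs of intersections bound the interior spans.
    return list(zip(xs[0::2], xs[1::2]))

def scan_convert(polygon, y_min, y_max):
    """Every scanline is independent, so this loop could be split across processors."""
    return {y: scanline_spans(polygon, y) for y in range(y_min, y_max)}

triangle = [(0, 0), (4, 0), (0, 4)]
spans = scan_convert(triangle, 0, 4)
print(spans[0])  # [(0.0, 3.5)]
```

The price of independence is redundant work, since every scanline tests every edge; whether that trade pays off depends on the architecture, which is the comparison the abstract reports.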

Relevance:

20.00% 20.00%

Publisher:

Abstract:

With the intent of probing the feasibility of employing annulation as a tactic to engender axial-rich conformations in nucleoside analogues, two adenine-derived, "conformationally restricted" nucleocyclitols, 9 and 10, have been conceptualized as representatives of a hitherto unexplored class of nucleic acid base-cyclitol hybrids. A general synthetic strategy, with an inherent scope for diversification, allowed rapid functionalization of indane and tetralin to furnish 9 and 10, respectively, in fair yield. Single-crystal X-ray diffraction analysis revealed that the two nucleocyclitols under study, though homologous, present completely dissimilar modes of molecular packing, marked, in particular, by the nature of involvement of the adenynyl NH2 group in the supramolecular assembly. In addition, the crystal structures of 9 and 10 also exhibit two different conformations of the functionalized cyclohexane ring. Thus, while the six-membered carbocycle in cyclopenta-annulated 9 exists in the expected chair (C) conformation, that in cyclohexa-annulated 10, which crystallizes as a dihydrate, shows an unusual twist-boat (TB) conformation. From a close analysis of the ¹H NMR spectroscopic data recorded for 9 and 10 in CD3OD, it was possible to put forth a putative explanation for the uncanny conformational preferences of crystalline 9 and 10.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Run-time interoperability between different applications based on H.264/AVC is an emerging need in networked infotainment, where media delivery must match the desired resolution and quality of the end terminals. In this paper, we describe the architecture and design of a polymorphic ASIC to support this. The H.264 decoding flow is partitioned into modules, such that the polymorphic ASIC meets the design goals of low power, low area, high flexibility, high throughput and fast interoperability between different profiles and levels of H.264. We demonstrate the idea with a multi-mode decoder that can decode baseline, main and high profile H.264 streams and can interoperate at run-time across these profiles. The decoder is capable of processing frame sizes of up to 1024 × 768 at 30 fps. The design, synthesized with UMC 0.13 μm technology, occupies 250 k gates and runs at 100 MHz.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Copper(I) complexes with {Cu(μ2-S)N}4 and {Cu(μ3-S)N}12 core portions of butterfly-shaped or double-wheel architectures have been isolated in the reaction of Cu(I) with the Schiff base ligand C6H4(CHNC6H4S)2, "iso-abt", under different conditions. [formula not available] containing the tetranuclear electroneutral complex [formula not available] is formed by the reaction of CuI in acetonitrile solution and recrystallization from DMF, whereas [formula not available] containing dodecanuclear [formula not available] wheels is accessible starting from CuBF4. Complexes 2 and 4 represent the first examples of cyclic complexes with the same overall stoichiometry but different ring sizes. The ligand induces two different coordination environments around copper(I) by switching between μ2- and μ3-sulfur bridging modes.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In modern wireline and wireless communication systems, the Viterbi decoder is one of the most compute-intensive and essential elements. Each standard requires a different configuration of the Viterbi decoder. Hence there is a need to design a flexible, reconfigurable Viterbi decoder to support different configurations on a single platform. In this paper we present a reconfigurable Viterbi decoder which can be reconfigured for standards such as WCDMA, CDMA2000, IEEE 802.11, DAB, DVB, and GSM. Different parameters such as code rate, constraint length, polynomials and truncation length can be configured to map any of the above-mentioned standards. Our design provides higher throughput and scalable power consumption in the various configurations of the reconfigurable Viterbi decoder. The power and throughput can also be optimized for different standards.
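The configurable parameters listed above fully describe the convolutional code that a Viterbi decoder must track. The sketch below groups them in a hypothetical `ViterbiConfig` record and uses them to drive a matching encoder; it does not implement the decoder. The example values use the widely deployed constraint-length-7 code with generators 133 and 171 (octal), the mother code of IEEE 802.11a/g. The truncation length of 35 (five times the constraint length) is a common rule of thumb assumed here, and the bit ordering of the shift register is one convention among several.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass(frozen=True)
class ViterbiConfig:
    """Hypothetical parameter set for one standard."""
    constraint_length: int
    polynomials: Tuple[int, ...]   # generator polynomials, MSB = newest input bit
    truncation_length: int         # traceback depth used by the decoder

    @property
    def code_rate(self) -> float:
        # Rate of the unpunctured mother code: one input bit per n output bits.
        return 1 / len(self.polynomials)

def encode(bits: List[int], cfg: ViterbiConfig) -> List[int]:
    """Convolutional encoder defined by the same parameters the decoder needs."""
    reg = 0
    out = []
    for b in bits:
        # Shift the new bit into the most significant position of the register.
        reg = (reg >> 1) | (b << (cfg.constraint_length - 1))
        for g in cfg.polynomials:
            out.append(bin(reg & g).count("1") % 2)   # parity of the tapped bits
    return out

WLAN = ViterbiConfig(constraint_length=7, polynomials=(0o133, 0o171), truncation_length=35)

# The response to a single 1 followed by zeros reads out both generators bit by bit.
impulse = encode([1, 0, 0, 0, 0, 0, 0], WLAN)
print(impulse)
```

Switching standards then amounts to loading a different record, which is the kind of reconfiguration the abstract describes at the hardware level.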

Relevance:

20.00% 20.00%

Publisher:

Abstract:

REDEFINE is a reconfigurable SoC architecture that provides a unique platform for high-performance and low-power computing by exploiting the synergistic interaction between a coarse-grain dynamic dataflow model of computation (to expose abundant parallelism in applications) and runtime composition of efficient compute structures (on the reconfigurable computation resources). We propose and study the throttling of execution in REDEFINE to maximize the architecture efficiency. A feature-specific, fast hybrid (mixed-level) simulation framework for study early in the design phase is developed and implemented to make the huge design space exploration practical. We do performance modeling in terms of the selection of important performance criteria and the ranking of the explored throttling schemes, and investigate the effectiveness of the design space exploration using statistical hypothesis testing. We find throttling schemes which give an appreciable (24.8%) overall performance gain in the architecture and a 37% resource usage gain in the throttling unit simultaneously.