37 results for Clock
Abstract:
A bit level systolic array for computing the convolution operation is described. The circuit in question is highly regular and ideally suited to VLSI chip design. It is also optimized in the sense that all the cells contribute to the computation on each clock cycle. This makes the array almost four times more efficient than one which was previously described.
Abstract:
A systolic array is an array of individual processing cells, each of which has some local memory and is connected only to its nearest neighbours in the form of a regular lattice. On each cycle of a simple clock, every cell receives data from its neighbouring cells and performs a specific processing operation on it. The resulting data is stored within the cell and passed on to neighbouring cells on the next clock cycle. This paper gives an overview of work to date and illustrates the application of bit-level systolic arrays by means of two examples: (1) a pipelined bit-slice circuit for computing matrix × vector transforms; and (2) a bit-serial structure for multi-bit convolution.
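The clocked, nearest-neighbour data flow described above can be illustrated with a small simulation. The sketch below is a word-level (not bit-level) linear array for convolution and is not the circuit from the paper: the function names, the choice of stationary weights, and the two-register delay on the sample path are all assumptions made for illustration.

```python
def systolic_convolve(weights, samples):
    """Simulate a linear systolic array, one clock tick per loop iteration.

    Assumed design: weight i stays in cell i, partial sums move one cell
    per tick, and samples move one cell every two ticks.
    """
    k = len(weights)
    x1 = [0] * k  # first sample register in each cell
    x2 = [0] * k  # second sample register (the cell's sample output)
    y = [0] * k   # partial-sum register (the cell's sum output)
    n_out = len(samples) + k - 1
    # Pad with zeros so the last partial sums can drain out of the array.
    stream = list(samples) + [0] * (2 * k - 2)
    outputs = []
    for x_ext in stream:
        new_x1, new_x2, new_y = [0] * k, [0] * k, [0] * k
        for i in range(k):
            # Each cell reads only from its left-hand neighbour.
            x_in = x_ext if i == 0 else x2[i - 1]
            y_in = 0 if i == 0 else y[i - 1]
            new_y[i] = y_in + weights[i] * x_in
            new_x1[i] = x_in
            new_x2[i] = x1[i]
        # All registers update together, as on a shared clock edge.
        x1, x2, y = new_x1, new_x2, new_y
        outputs.append(y[-1])
    # The first valid result leaves the last cell after k ticks.
    return outputs[k - 1:k - 1 + n_out]


def direct_convolve(weights, samples):
    """Reference convolution used only to check the simulation."""
    out = [0] * (len(samples) + len(weights) - 1)
    for i, w in enumerate(weights):
        for j, x in enumerate(samples):
            out[i + j] += w * x
    return out


print(systolic_convolve([1, 2], [1, 1, 1]))
```

Comparing `systolic_convolve` with `direct_convolve` on a few inputs is a quick way to confirm that the register timing is right; the array adds a latency of one tick per cell but accepts a new sample on every tick.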
Abstract:
A 64-point Fourier transform chip is described that performs a forward or inverse 64-point Fourier transform on complex two's complement data supplied at a rate of 13.5 MHz and can operate at clock rates of up to 40 MHz under worst-case conditions. It uses a 0.6 µm double-level metal CMOS technology, contains 535k transistors and uses an internal 3.3 V power supply. It has an area of 7.8 × 8 mm, dissipates 0.9 W, has 48 pins and is housed in an 84-pin PLCC plastic package. The chip is based on an FFT architecture developed from first principles through a detailed investigation of the structure of the relevant DFT matrix and through mapping repetitive blocks within this matrix onto a regular silicon structure.
Abstract:
A high-performance VLSI architecture to perform multiply-accumulate, division and square root operations is proposed. The circuit is highly regular, requires only minimal control and can be pipelined right down to the bit level. The system can also be reconfigured on every cycle to perform any one of these operations. The gate count per row has been estimated at (27n+70) gate equivalents where n is the divisor wordlength. The throughput rate, which equals the clock speed, is the same for each operation and is independent of the wordlength. This is achieved through the combination of pipelining and redundant arithmetic. With a 1.0 µm CMOS technology and extensive pipelining, throughput rates in excess of 70 million operations per second are expected.
Abstract:
Several novel systolic architectures for implementing densely pipelined bit parallel IIR filter sections are presented. The fundamental problem of latency in the feedback loop is overcome by employing redundant arithmetic in combination with bit-level feedback, allowing a basic first-order section to achieve a wordlength-independent latency of only two clock cycles. This is extended to produce a building block from which higher order sections can be constructed. The architecture is then refined by combining the use of both conventional and redundant arithmetic, resulting in two new structures offering substantial hardware savings over the original design. In contrast to alternative techniques, bit-level pipelinability is achieved with no net cost in hardware. © 1989 Kluwer Academic Publishers.
Abstract:
A novel bit-level systolic array architecture for implementing first-order IIR filter sections is presented. A latency of only two clock cycles is achieved by using a radix-4 redundant number representation, performing the recursive computation most-significant-digit first, and feeding back each digit of the result as soon as it is available.
Abstract:
A novel hardware architecture for elliptic curve cryptography (ECC) over GF(p) is introduced. This can perform the main prime field arithmetic functions needed in these cryptosystems, including modular inversion and multiplication. It is based on a new unified modular inversion algorithm that offers considerable improvement over previous ECC techniques that use Fermat's Little Theorem for this operation. The processor described uses a full-word multiplier which requires far fewer clock cycles than previous methods, while still maintaining a competitive critical path delay. The benefits of the approach have been demonstrated by utilizing these techniques to create a field-programmable gate array (FPGA) design. This can perform a 256-bit prime field scalar point multiplication in 3.86 ms, the fastest FPGA time reported to date. The ECC architecture described can also perform four different types of modular inversion, making it suitable for use in many different ECC applications. © 2006 IEEE.
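For context, the two general routes to a modular inverse over GF(p) can be sketched in a few lines. This is not the unified inversion algorithm from the paper, which the abstract does not specify: the function names are hypothetical, the prime 2^255 - 19 is an illustrative choice, and the second routine is the textbook extended Euclidean algorithm.

```python
def inverse_fermat(a, p):
    """Inverse via Fermat's Little Theorem: a^(p-2) mod p, for prime p."""
    return pow(a, p - 2, p)


def inverse_euclid(a, p):
    """Inverse via the extended Euclidean algorithm.

    Invariant: t0 * a == r0 (mod p) and t1 * a == r1 (mod p).
    """
    r0, r1 = p, a % p
    t0, t1 = 0, 1
    while r1:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if r0 != 1:
        raise ValueError("a has no inverse modulo p")
    return t0 % p


P = 2**255 - 19  # illustrative prime, not taken from the paper
A = 123456789

inv = inverse_euclid(A, P)
print(inv == inverse_fermat(A, P), (A * inv) % P)
```

The Fermat route costs a full modular exponentiation with an exponent as long as the field size, whereas the Euclidean route needs only divisions and subtractions, which is one common reason hardware designs look for alternatives to exponentiation-based inversion.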
Abstract:
A novel bit-level systolic array architecture for implementing IIR (infinite-impulse response) filter sections is presented. A first-order section achieves a latency of only two clock cycles by using a radix-2 redundant number representation, performing the recursive computation most significant digit first, and feeding back each digit of the result as soon as it is available. The design is extended to produce a building block from which second- and higher-order sections can be connected.
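One reason most-significant-digit-first computation pairs well with a redundant representation can be shown numerically. The sketch below is only an illustration, not the array architecture: it assumes a radix-2 signed-digit fraction with digits in {-1, 0, 1}, where digit i carries weight 2^-(i+1), and the function names are hypothetical.

```python
from fractions import Fraction


def sd_value(digits):
    """Value of a radix-2 signed-digit fraction, most significant digit first."""
    return sum(Fraction(d, 2 ** (i + 1)) for i, d in enumerate(digits))


def msd_first_prefixes(digits):
    """Running value after each digit arrives, most significant digit first."""
    total = Fraction(0)
    prefixes = []
    for i, d in enumerate(digits):
        if d not in (-1, 0, 1):
            raise ValueError("digits must be -1, 0 or 1")
        total += Fraction(d, 2 ** (i + 1))
        prefixes.append(total)
    return prefixes


# Redundancy: two different digit strings encode the same value.
print(sd_value([1, -1]), sd_value([0, 1]))

# After k digits the final value is pinned down to within 2^-k,
# so later stages can start work before the low-order digits exist.
digits = [1, 0, -1, 1, -1, 0, 1]
final = sd_value(digits)
for k, prefix in enumerate(msd_first_prefixes(digits), start=1):
    assert abs(final - prefix) < Fraction(1, 2 ** k)
```

Because each leading digit already bounds the result, a digit can be fed back as soon as it is produced, which is the property the abstract relies on to keep the feedback-loop latency independent of wordlength.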
Abstract:
The paper presents a state-of-the-art commercial demonstrator chip for infinite impulse response (IIR) filtering. The programmable IIR filter chip contains eight multiplier/accumulators that can be configured in one of five different modes to implement up to a 16th-order IIR filter. The multiply-accumulate block is based on a highly regular systolic array architecture and uses a redundant number system to overcome problems of pipelining in the feedback loop. The chip has been designed using the GEC Plessey Semiconductors CLA 78000 series gate array, operates on 16-bit two's complement data and has a clock speed of 30 MHz. Issues such as overflow detection and design for testability have also been addressed and are described.
Abstract:
We propose a dynamic verification approach for large-scale message passing programs to locate correctness bugs caused by unforeseen nondeterministic interactions. This approach hinges on an efficient protocol to track the causality between nondeterministic message receive operations and potentially matching send operations. We show that causality tracking protocols that rely solely on logical clocks fail to capture all nuances of MPI program behavior, including the variety of ways in which nonblocking calls can complete. Our approach rests on formally defining the matches-before relation underlying the MPI standard, and on devising lazy update logical clock based algorithms that can correctly discover all potential outcomes of nondeterministic receives in practice. LLCP can achieve the same coverage as a vector clock based algorithm while maintaining good scalability. LLCP allows us to analyze realistic MPI programs involving a thousand MPI processes, incurring only modest overheads in terms of communication bandwidth, latency, and memory consumption. © 2011 IEEE.
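The difference between scalar logical clocks and vector clocks, which the abstract contrasts, can be seen in a toy model. The sketch below is not LLCP and does not use MPI: the `Process` class, its methods, and the three-process scenario are hypothetical, and the clock rules are the textbook Lamport and vector clock updates.

```python
class Process:
    """Toy process that maintains a Lamport clock and a vector clock together."""

    def __init__(self, rank, size):
        self.rank = rank
        self.lamport = 0
        self.vector = [0] * size

    def _tick(self):
        self.lamport += 1
        self.vector[self.rank] += 1

    def stamp(self):
        return (self.lamport, tuple(self.vector))

    def local_event(self):
        self._tick()
        return self.stamp()

    def send(self):
        self._tick()
        return self.stamp()  # the timestamp travels with the message

    def receive(self, stamp):
        lam, vec = stamp
        self.lamport = max(self.lamport, lam)
        self.vector = [max(a, b) for a, b in zip(self.vector, vec)]
        self._tick()
        return self.stamp()


def happened_before(a, b):
    """Vector clock order: a precedes b if a <= b componentwise and a != b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def concurrent(a, b):
    return not happened_before(a, b) and not happened_before(b, a)


# Two senders target one receiver that would accept a message from either.
p0, p1, p2 = (Process(r, 3) for r in range(3))
send_a = p0.send()
p1.local_event()
send_b = p1.send()
recv = p2.receive(send_a)

print(send_a[0] < send_b[0])           # Lamport values look ordered
print(concurrent(send_a[1], send_b[1]))  # vector clocks show no causal order
```

In this scenario the scalar timestamps suggest that the first send precedes the second, while the vector timestamps show the two sends are causally unrelated, so in the toy model either could have matched the receive. Vector clocks grow with the number of processes, which is the scalability cost the abstract refers to.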
Abstract:
Dual-rail encoding, return-to-spacer protocol, and hazard-free logic can be used to resist power analysis attacks by making energy consumed per clock cycle independent of processed data. Standard dual-rail logic uses a protocol with a single spacer, e.g., all-zeros, which gives rise to energy balancing problems. We address these problems by incorporating two spacers; the spacers alternate between adjacent clock cycles. This guarantees that all gates switch in every clock cycle regardless of the transmitted data values. To generate these dual-rail circuits, an automated tool has been developed. It is capable of converting synchronous netlists into dual-rail circuits and it is interfaced to industry CAD tools. Dual-rail and single-rail benchmarks based upon the advanced encryption standard (AES) have been simulated and compared in order to evaluate the method and the tool.
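The balancing effect of alternating spacers can be checked by counting rail transitions. The sketch below is a wire-level toy model, not the automated tool or its output: it assumes the second spacer is all-ones, that a bit is encoded as the rail pair (bit, not bit), and that each clock cycle runs spacer, code word, next spacer. The function names are hypothetical.

```python
ALL_ZEROS = (0, 0)
ALL_ONES = (1, 1)  # assumed second spacer


def encode(bit):
    """Dual-rail code word: exactly one of the two rails is high."""
    return (1, 0) if bit else (0, 1)


def toggles_per_cycle(bits, alternate):
    """Return, per cycle, how many times each rail switches.

    A cycle is modelled as spacer -> code word -> next spacer.
    """
    spacer = ALL_ZEROS
    result = []
    for bit in bits:
        if alternate:
            next_spacer = ALL_ONES if spacer == ALL_ZEROS else ALL_ZEROS
        else:
            next_spacer = ALL_ZEROS
        code = encode(bit)
        counts = tuple(
            int(spacer[r] != code[r]) + int(code[r] != next_spacer[r])
            for r in range(2)
        )
        result.append(counts)
        spacer = next_spacer
    return result


data = [1, 0, 0, 1, 1]
print(toggles_per_cycle(data, alternate=False))
print(toggles_per_cycle(data, alternate=True))
```

With a single all-zeros spacer, only the rail selected by the data switches in a given cycle, so the activity pattern depends on the data. With alternating spacers, both rails switch exactly once in every cycle in this model, whatever the data values are.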
Abstract:
Architects typically interpret Heidegger to mean that dwelling in the Black Forest was more authentic than living in an industrialised society. However, we cannot turn back the clock, so we are confronted with the reality of modernisation. Since the Second World War, production has shifted from material to immaterial assets. Increasingly, place is believed to offer resistance to this fluidity, but this belief can conversely be viewed as expressing a sublimated anxiety about our role in the world – the need to create buildings that are self-consciously contextual suggests that we may no longer be rooted in material places, but in immaterial relations.
This issue has been pondered by David Harvey in his paper "From Place to Space and Back Again", where he argues that the role of place in legitimising identity is ultimately a political process, as the interpretation of its meaning is dependent on whose interpretation it is. Doreen Massey has found that different classes of people are more or less mobile and that mobility is related to class and education rather than to nationality or geography. These thinkers point to a different set of questions than the usual space/place divide: how can we begin to address the economic mediation of spatial production to develop an ethical production of place? Part of the answer is provided by the French architectural practice Lacaton Vassal in their book Plus. They ask themselves how to produce more space for the same cost so that people can enjoy a better quality of life. Another French practitioner, Patrick Bouchain, has argued that architects' fees should be inversely proportional to the amount of material resources that they consume. These approaches use economics as a starting point for generating architectural form and point to more ethical possibilities for architectural practice.
Abstract:
Molluscs are a diverse animal phylum with a formidable fossil record. Although there is little doubt about the monophyly of the eight extant classes, relationships between these groups are controversial. We analysed a comprehensive multilocus molecular data set for molluscs, the first to include multiple species from all classes, including five monoplacophorans in both extant families. Our analyses of five markers resolve two major clades: the first includes gastropods and bivalves sister to Serialia (monoplacophorans and chitons), and the second comprises scaphopods sister to aplacophorans and cephalopods. Traditional groupings such as Testaria, Aculifera, and Conchifera are rejected by our data with significant Approximately Unbiased (AU) test values. A new molecular clock indicates that molluscs had a terminal Precambrian origin with rapid divergence of all eight extant classes in the Cambrian. The recovery of Serialia as a derived, Late Cambrian clade is potentially in line with the stratigraphic chronology of morphologically heterogeneous early mollusc fossils. Serialia is in conflict with traditional molluscan classifications and recent phylogenomic data. Yet our hypothesis, as others from molecular data, implies frequent molluscan shell and body transformations by heterochronic shifts in development and multiple convergent adaptations, leading to the variable shells and body plans in extant lineages.
Abstract:
A giant retinal tear (GRT) is a full-thickness neurosensory retinal break that extends circumferentially around the retina for three or more clock hours in the presence of a posteriorly detached vitreous. Its incidence in large population-based studies has been estimated as 1.5% of rhegmatogenous retinal detachments, with a significant male preponderance, and bilaterality in 12.8%. Most GRTs are idiopathic, with trauma, hereditary vitreoretinopathies and high myopia each being causative in decreasing frequency. The vast majority of GRTs are currently managed with a pars plana vitrectomy; the use of adjunctive circumferential scleral buckling is debated, but no studies have shown a clear anatomical or visual advantage with its use. Similarly, silicone oil tamponade does not influence long-term outcomes when compared with gas. Primary and final retinal reattachment are achieved in 88% and 95% of patients, respectively. Even when the retina remains attached, however, visual recovery may be limited. Furthermore, fellow eyes of patients with a GRT are at higher risk of developing retinal tears and retinal detachment. Prophylactic treatment under these circumstances may be considered but there is no firm evidence of its efficacy at the present time.
Abstract:
Molecular communication is set to play an important role in the design of complex biological and chemical systems. An important class of molecular communication systems is based on the timing channel, where information is encoded in the delay of the transmitted molecule - a synchronous approach. At present, a widely used modeling assumption is the perfect synchronization between the transmitter and the receiver. Unfortunately, this assumption is unlikely to hold in most practical molecular systems. To remedy this, we introduce a clock into the model - leading to the molecular timing channel with synchronization error. To quantify the behavior of this new system, we derive upper and lower bounds on the variance-constrained capacity, which we view as the step between the mean-delay and the peak-delay constrained capacity. By numerically evaluating our bounds, we obtain a key practical insight: the drift velocity of the clock links does not need to be significantly larger than the drift velocity of the information link, in order to achieve the variance-constrained capacity with perfect synchronization.