38 results for Bi-Multiplication


Relevance:

20.00%

Publisher:

Abstract:

We examine the computational aspects of propagating a global R-matrix, R, across sub-regions in a 2-D plane. This problem originates in the large-scale simulation of electron collisions with atoms and ions at intermediate energies. The propagation is dominated by matrix multiplications, which are complicated by the dynamic nature of R: it changes the designations of its rows and columns and grows in size as the propagation proceeds. The use of PBLAS to solve this problem on distributed-memory HPC machines is the main focus of the paper.

Relevance:

20.00%

Publisher:

Abstract:

New FPGA architectures for the ordinary Montgomery multiplication algorithm and the FIOS modular multiplication algorithm are presented. The embedded 18×18-bit multipliers and fast carry look-ahead logic located on the Xilinx Virtex2 Pro family of FPGAs are used to perform the ordinary multiplications and additions/subtractions required by these two algorithms. The architectures are developed for use in Elliptic Curve Cryptosystems over GF(p), which require modular field multiplication to perform elliptic curve point addition and doubling. Field sizes of 128 bits and 256 bits are chosen, but other field sizes can easily be accommodated by rapidly reprogramming the FPGA. Overall, the larger the word size of the multiplier, the more efficiently it performs in terms of area/time product. Also, the FIOS algorithm is flexible in that one can tailor the multiplier architecture to be area efficient, time efficient, or a mixture of both by choosing a particular word size. It is estimated that the computation of a 256-bit scalar point multiplication over GF(p) would take about 4.8 ms.
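As an illustration of the modular multiplication this abstract refers to, the following is a minimal software sketch of the ordinary (non-interleaved) Montgomery multiplication over GF(p). It is not the FPGA architecture or the FIOS variant from the paper; the function names, the choice of R = 2^k, and the example prime are assumptions made for the sketch.

```python
# Minimal sketch of ordinary Montgomery multiplication over GF(p).
# Assumptions (not from the source): R = 2**k with R > p, p odd,
# and all function names below are hypothetical.

def montgomery_setup(p, k):
    """Return p' = -p^{-1} mod 2**k, the constant used in the reduction."""
    r = 1 << k
    return (-pow(p, -1, r)) % r  # modular inverse via pow needs Python 3.8+

def mont_mul(a_bar, b_bar, p, k, p_neg_inv):
    """Return a_bar * b_bar * R^{-1} mod p, with R = 2**k."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * p_neg_inv) & mask   # m = (t mod R) * p' mod R
    u = (t + m * p) >> k                  # t + m*p is divisible by R
    return u - p if u >= p else u         # single conditional subtraction

def to_mont(a, p, k):
    """Map a to the Montgomery domain: a * R mod p."""
    return (a << k) % p

def from_mont(a_bar, p, k, p_neg_inv):
    """Map back to the ordinary domain by multiplying with 1."""
    return mont_mul(a_bar, 1, p, k, p_neg_inv)

def modmul(a, b, p, k):
    """Multiply a and b modulo p by way of the Montgomery domain."""
    p_neg_inv = montgomery_setup(p, k)
    prod_bar = mont_mul(to_mont(a, p, k), to_mont(b, p, k), p, k, p_neg_inv)
    return from_mont(prod_bar, p, k, p_neg_inv)
```

The reduction replaces division by p with a shift and a mask, which is why the algorithm maps well onto hardware multipliers and adders. In a point multiplication, operands would normally stay in the Montgomery domain across many multiplications, so the conversion cost is paid only once at each end.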

Relevance:

20.00%

Publisher:

Abstract:

A bit-level systolic array for computing matrix × vector products is described. The operation is carried out on bit-parallel input data words, and the basic circuit takes the form of a 1-bit slice. Several bit-slice components must be connected together to form the final result, and the authors outline two different ways in which this can be done. The basic array also has considerable potential as a stand-alone device, and its use in computing the Walsh-Hadamard transform and discrete Fourier transform operations is briefly discussed.
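The idea of assembling a result from 1-bit slices can be sketched in software as follows. This is only an arithmetic illustration, not the systolic array described in the abstract: it assumes that each slice handles one bit plane of non-negative matrix coefficients and that the partial results are combined by shift-and-add. All function names are hypothetical.

```python
# Minimal sketch of a bit-sliced matrix x vector product.
# Assumptions (not from the source): matrix coefficients are non-negative
# integers of at most `width` bits, each slice processes one bit plane,
# and slices are combined by shift-and-add.

def bit_plane(matrix, bit):
    """Extract bit number `bit` of every coefficient as a 0/1 matrix."""
    return [[(a >> bit) & 1 for a in row] for row in matrix]

def slice_product(plane, x):
    """One 1-bit slice: add up the full input words selected by the bits."""
    return [sum(xj for b, xj in zip(row, x) if b) for row in plane]

def matvec_bit_sliced(matrix, x, width):
    """Combine `width` slices, weighting slice i by 2**i."""
    result = [0] * len(matrix)
    for bit in range(width):
        partial = slice_product(bit_plane(matrix, bit), x)
        result = [r + (p << bit) for r, p in zip(result, partial)]
    return result
```

For example, `matvec_bit_sliced([[3, 5], [7, 2]], [4, 6], width=3)` agrees with the ordinary product of the same matrix and vector. Each slice needs only additions of full-width input words, which is what makes a 1-bit slice a simple, replicable building block.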

Relevance:

20.00%

Publisher:

Abstract:

A high-performance VLSI architecture to perform combined multiply-accumulate, divide, and square root operations is proposed. The circuit is highly regular, requires only minimal control, and can be reconfigured on every cycle. The execution time for each operation is the same. The combination of redundancy and pipelining results in a throughput independent of the word size of the array. With current CMOS technology, throughput rates in excess of 80 million operations per second are achievable.

Relevance:

20.00%

Publisher:

Abstract:

The current study evaluated the emergence of untrained verbal relations as a function of three different foreign-language teaching strategies. Two Spanish-speaking adults received foreign-language (English) tact training as well as native-to-foreign and foreign-to-native intraverbal training. The results indicated that tact training and native-to-foreign intraverbal training are more likely to result in the emergence of untrained relations, and may thus be more efficient than foreign-to-native intraverbal training.