806 resultados para structured parallel computations


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dans l'apprentissage machine, la classification est le processus d’assigner une nouvelle observation à une certaine catégorie. Les classifieurs qui mettent en œuvre des algorithmes de classification ont été largement étudié au cours des dernières décennies. Les classifieurs traditionnels sont basés sur des algorithmes tels que le SVM et les réseaux de neurones, et sont généralement exécutés par des logiciels sur CPUs qui fait que le système souffre d’un manque de performance et d’une forte consommation d'énergie. Bien que les GPUs puissent être utilisés pour accélérer le calcul de certains classifieurs, leur grande consommation de puissance empêche la technologie d'être mise en œuvre sur des appareils portables tels que les systèmes embarqués. Pour rendre le système de classification plus léger, les classifieurs devraient être capable de fonctionner sur un système matériel plus compact au lieu d'un groupe de CPUs ou GPUs, et les classifieurs eux-mêmes devraient être optimisés pour ce matériel. Dans ce mémoire, nous explorons la mise en œuvre d'un classifieur novateur sur une plate-forme matérielle à base de FPGA. Le classifieur, conçu par Alain Tapp (Université de Montréal), est basé sur une grande quantité de tables de recherche qui forment des circuits arborescents qui effectuent les tâches de classification. Le FPGA semble être un élément fait sur mesure pour mettre en œuvre ce classifieur avec ses riches ressources de tables de recherche et l'architecture à parallélisme élevé. Notre travail montre que les FPGAs peuvent implémenter plusieurs classifieurs et faire les classification sur des images haute définition à une vitesse très élevée.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Thèse réalisée en cotutelle entre Aix-Marseille Université et l'Université de Montréal

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Parallel legal systems can and do exist within a single sovereign nation, and rural Guatemala offers one example. Such parallel systems are generally viewed as failures of legal penetration which compromise the rule of law. The question addressed in this paper is whether the de facto existence of parallel systems in Guatemala benefits the indigenous population, or whether the ultimate goal of attaining access to justice requires a complete overhaul of the official legal system. Ultimately, the author concludes that while the official justice system needs a lot of work in order to expand access to justice, especially for the rural poor, the existence of a parallel legal system can be a vehicle for, rather than a hindrance to, expanding such access.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

During 1990's the Wavelet Transform emerged as an important signal processing tool with potential applications in time-frequency analysis and non-stationary signal processing.Wavelets have gained popularity in broad range of disciplines like signal/image compression, medical diagnostics, boundary value problems, geophysical signal processing, statistical signal processing,pattern recognition,underwater acoustics etc.In 1993, G. Evangelista introduced the Pitch- synchronous Wavelet Transform, which is particularly suited for pseudo-periodic signal processing.The work presented in this thesis mainly concentrates on two interrelated topics in signal processing,viz. the Wavelet Transform based signal compression and the computation of Discrete Wavelet Transform. A new compression scheme is described in which the Pitch-Synchronous Wavelet Transform technique is combined with the popular linear Predictive Coding method for pseudo-periodic signal processing. Subsequently,A novel Parallel Multiple Subsequence structure is presented for the efficient computation of Wavelet Transform. Case studies also presented to highlight the potential applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Dynamics of Nd:YAG laser with intracavity KTP crystal operating in two parallel polarized modes is investigated analytically and numerically. System equilibrium points were found out and the stability of each of them was checked using Routh–Hurwitz criteria and also by calculating the eigen values of the Jacobian. It is found that the system possesses three equilibrium points for (Ij, Gj), where j = 1, 2. One of these equilibrium points undergoes Hopf bifurcation in output dynamics as the control parameter is increased. The other two remain unstable throughout the entire region of the parameter space. Our numerical analysis of the Hopf bifurcation phenomena is found to be in good agreement with the analytical results. Nature of energy transfer between the two modes is also studied numerically.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Most of the commercial and financial data are stored in decimal fonn. Recently, support for decimal arithmetic has received increased attention due to the growing importance in financial analysis, banking, tax calculation, currency conversion, insurance, telephone billing and accounting. Performing decimal arithmetic with systems that do not support decimal computations may give a result with representation error, conversion error, and/or rounding error. In this world of precision, such errors are no more tolerable. The errors can be eliminated and better accuracy can be achieved if decimal computations are done using Decimal Floating Point (DFP) units. But the floating-point arithmetic units in today's general-purpose microprocessors are based on the binary number system, and the decimal computations are done using binary arithmetic. Only few common decimal numbers can be exactly represented in Binary Floating Point (BF P). ln many; cases, the law requires that results generated from financial calculations performed on a computer should exactly match with manual calculations. Currently many applications involving fractional decimal data perform decimal computations either in software or with a combination of software and hardware. The performance can be dramatically improved by complete hardware DFP units and this leads to the design of processors that include DF P hardware.VLSI implementations using same modular building blocks can decrease system design and manufacturing cost. A multiplexer realization is a natural choice from the viewpoint of cost and speed.This thesis focuses on the design and synthesis of efficient decimal MAC (Multiply ACeumulate) architecture for high speed decimal processors based on IEEE Standard for Floating-point Arithmetic (IEEE 754-2008). The research goal is to design and synthesize deeimal'MAC architectures to achieve higher performance.Efficient design methods and architectures are developed for a high performance DFP MAC unit as part of this research.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The results of an investigation on the limits of the random errors contained in the basic data of Physical Oceanography and their propagation through the computational procedures are presented in this thesis. It also suggest a method which increases the reliability of the derived results. The thesis is presented in eight chapters including the introductory chapter. Chapter 2 discusses the general theory of errors that are relevant in the context of the propagation of errors in Physical Oceanographic computations. The error components contained in the independent oceanographic variables namely, temperature, salinity and depth are deliniated and quantified in chapter 3. Chapter 4 discusses and derives the magnitude of errors in the computation of the dependent oceanographic variables, density in situ, gt, specific volume and specific volume anomaly, due to the propagation of errors contained in the independent oceanographic variables. The errors propagated into the computed values of the derived quantities namely, dynamic depth and relative currents, have been estimated and presented chapter 5. Chapter 6 reviews the existing methods for the identification of level of no motion and suggests a method for the identification of a reliable zero reference level. Chapter 7 discusses the available methods for the extension of the zero reference level into shallow regions of the oceans and suggests a new method which is more reliable. A procedure of graphical smoothening of dynamic topographies between the error limits to provide more reliable results is also suggested in this chapter. Chapter 8 deals with the computation of the geostrophic current from these smoothened values of dynamic heights, with reference to the selected zero reference level. The summary and conclusion are also presented in this chapter.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Decimal multiplication is an integral part offinancial, commercial, and internet-based computations. The basic building block of a decimal multiplier is a single digit multiplier. It accepts two Binary Coded Decimal (BCD) inputs and gives a product in the range [0, 81] represented by two BCD digits. A novel design for single digit decimal multiplication that reduces the critical path delay and area is proposed in this research. Out of the possible 256 combinations for the 8-bit input, only hundred combinations are valid BCD inputs. In the hundred valid combinations only four combinations require 4 x 4 multiplication, combinations need x multiplication, and the remaining combinations use either x or x 3 multiplication. The proposed design makes use of this property. This design leads to more regular VLSI implementation, and does not require special registers for storing easy multiples. This is a fully parallel multiplier utilizing only combinational logic, and is extended to a Hex/Decimal multiplier that gives either a decimal output or a binary output. The accumulation ofpartial products generated using single digit multipliers is done by an array of multi-operand BCD adders for an (n-digit x n-digit) multiplication.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Decimal multiplication is an integral part of financial, commercial, and internet-based computations. A novel design for single digit decimal multiplication that reduces the critical path delay and area for an iterative multiplier is proposed in this research. The partial products are generated using single digit multipliers, and are accumulated based on a novel RPS algorithm. This design uses n single digit multipliers for an n × n multiplication. The latency for the multiplication of two n-digit Binary Coded Decimal (BCD) operands is (n + 1) cycles and a new multiplication can begin every n cycle. The accumulation of final partial products and the first iteration of partial product generation for next set of inputs are done simultaneously. This iterative decimal multiplier offers low latency and high throughput, and can be extended for decimal floating-point multiplication.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Animportant step in the residue number system(RNS) based signal processing is the conversion of signal into residue domain. Many implementations of this conversion have been proposed for various goals, and one of the implementations is by a direct conversion from an analogue input. A novel approach for analogue-to-residue conversion is proposed in this research using the most popular Sigma–Delta analogue-to-digital converter (SD-ADC). In this approach, the front end is the same as in traditional SD-ADC that uses Sigma–Delta (SD) modulator with appropriate dynamic range, but the filtering is doneby a filter implemented usingRNSarithmetic. Hence, the natural output of the filter is an RNS representation of the input signal. The resolution, conversion speed, hardware complexity and cost of implementation of the proposed SD based analogue-to-residue converter are compared with the existing analogue-to-residue converters based on Nyquist rate ADCs

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes JERIM-320, a new 320-bit hash function used for ensuring message integrity and details a comparison with popular hash functions of similar design. JERIM-320 and FORK -256 operate on four parallel lines of message processing while RIPEMD-320 operates on two parallel lines. Popular hash functions like MD5 and SHA-1 use serial successive iteration for designing compression functions and hence are less secure. The parallel branches help JERIM-320 to achieve higher level of security using multiple iterations and processing on the message blocks. The focus of this work is to prove the ability of JERIM 320 in ensuring the integrity of messages to a higher degree to suit the fast growing internet applications

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work proposes a parallel genetic algorithm for compressing scanned document images. A fitness function is designed with Hausdorff distance which determines the terminating condition. The algorithm helps to locate the text lines. A greater compression ratio has achieved with lesser distortion

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper describes our plans to evaluate the present state of affairs concerning parallel programming and its systems. Three subprojects are proposed: a survey among programmers and scientists, a comparison of parallel programming systems using a standard set of test programs, and a wiki resource for the parallel programming community - the Parawiki. We would like to invite you to participate and turn these subprojects into true community efforts.