864 results for IEEE Standard for Floating-Point Arithmetic (IEEE 754)
Abstract:
The IEEE 754 standard for floating-point arithmetic is widely used in computing. It is based on real arithmetic and is made total by adding both a positive and a negative infinity, a negative zero, and many Not-a-Number (NaN) states. The IEEE infinities are said to have the behaviour of limits. Transreal arithmetic is total. It also has a positive and a negative infinity but no negative zero, and it has a single, unordered number, nullity. We elucidate the transreal tangent and extend real limits to transreal limits. Arguing from this firm foundation, we maintain that there are three category errors in the IEEE 754 standard. Firstly, the claim that IEEE infinities are limits of real arithmetic confuses limiting processes with arithmetic. Secondly, a defence of IEEE negative zero confuses the limit of a function with the value of a function. Thirdly, the definition of IEEE NaNs confuses undefined with unordered. Furthermore, we prove that the tangent function, with the infinities given by geometrical construction, has a period of an entire rotation, not half a rotation as is commonly understood. This illustrates a category error, confusing the limit with the value of a function, in an important area of applied mathematics: trigonometry. We briefly consider the wider implications of this category error. Another paper proposes transreal arithmetic as a basis for floating-point arithmetic; here we take the profound step of proposing transreal arithmetic as a replacement for real arithmetic to remove the possibility of certain category errors in mathematics. Thus we propose both theoretical and practical advantages of transmathematics. In particular we argue that implementing transreal analysis in trans-floating-point arithmetic would extend the coverage, accuracy and reliability of almost all computer programs that exploit real analysis: essentially all programs in science and engineering and many in finance, medicine and other socially beneficial applications.
Abstract:
IEEE 754 floating-point arithmetic is widely used in modern, general-purpose computers. It is based on real arithmetic and is made total by adding both a positive and a negative infinity, a negative zero, and many Not-a-Number (NaN) states. Transreal arithmetic is total. It also has a positive and a negative infinity but no negative zero, and it has a single, unordered number, nullity. Modifying the IEEE arithmetic so that it uses transreal arithmetic has a number of advantages. It removes one redundant binade from IEEE floating-point objects, doubling the numerical precision of the arithmetic. It removes eight redundant, relational, floating-point operations and removes the redundant total order operation. It replaces the non-reflexive, floating-point equality operator with a reflexive equality operator, and it indicates that some of the exceptions may be removed as redundant, subject to issues of backward compatibility and transient future compatibility as programmers migrate to the transreal paradigm.
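The IEEE 754 behaviours that the two abstracts above refer to (non-reflexive equality, negative zero, NaN results) can be observed directly in any language that uses binary64 floats. The following is a minimal illustrative sketch in Python, not taken from either paper; the variable names are chosen here for the example only.

```python
import math

nan = float("nan")
inf = float("inf")

# NaN is unordered: x == x is false when x is NaN, so IEEE equality
# is not reflexive.
nan_equals_itself = (nan == nan)

# Negative zero compares equal to positive zero but keeps its sign bit.
zeros_equal = (-0.0 == 0.0)
neg_zero_sign = math.copysign(1.0, -0.0)

# Infinity minus infinity is an invalid operation and returns NaN.
inf_minus_inf_is_nan = math.isnan(inf - inf)

print(nan_equals_itself, zeros_equal, neg_zero_sign, inf_minus_inf_is_nan)
```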
Abstract:
Scientific applications rely heavily on floating point data types. Floating point operations are complex and require complicated hardware that is both area and power intensive. The emergence of massively parallel architectures like Rigel creates new challenges and poses new questions with respect to floating point support. The massively parallel aspect of Rigel places great emphasis on area efficient, low power designs. At the same time, Rigel is a general purpose accelerator and must provide high performance for a wide class of applications. This thesis presents an analysis of various floating point unit (FPU) components with respect to Rigel, and attempts to present a candidate design of an FPU that balances performance, area, and power and is suitable for massively parallel architectures like Rigel.
Abstract:
Power has become a key constraint in current nanoscale integrated circuit design due to the increasing demands for mobile computing and a low carbon economy. As an emerging technology, inexact circuit design offers a promising approach to significantly reduce both dynamic and static power dissipation for error-tolerant applications. Although fixed-point arithmetic circuits have been studied in terms of inexact computing, floating-point arithmetic circuits have not been fully considered, although they require more power. In this paper, the first inexact floating-point adder is designed and applied to high dynamic range (HDR) image processing. Inexact floating-point adders are proposed by approximately designing an exponent subtractor and mantissa adder. Related logic operations including normalization and rounding modules are also considered in terms of inexact computing. Two HDR images are processed using the proposed inexact floating-point adders to show the validity of the inexact design. HDR-VDP is used as a metric to measure the subjective results of the image addition. Significant improvements have been achieved in terms of area, delay and power consumption. Comparison results show that the proposed inexact floating-point adders can improve power consumption and the power-delay product by 29.98% and 39.60%, respectively.
Abstract:
Most commercial and financial data are stored in decimal form. Recently, support for decimal arithmetic has received increased attention due to its growing importance in financial analysis, banking, tax calculation, currency conversion, insurance, telephone billing and accounting. Performing decimal arithmetic with systems that do not support decimal computations may give a result with representation error, conversion error, and/or rounding error. In this world of precision, such errors are no longer tolerable. The errors can be eliminated and better accuracy can be achieved if decimal computations are done using Decimal Floating Point (DFP) units. But the floating-point arithmetic units in today's general-purpose microprocessors are based on the binary number system, and the decimal computations are done using binary arithmetic. Only a few common decimal numbers can be exactly represented in Binary Floating Point (BFP). In many cases, the law requires that results generated from financial calculations performed on a computer should exactly match manual calculations. Currently many applications involving fractional decimal data perform decimal computations either in software or with a combination of software and hardware. The performance can be dramatically improved by complete hardware DFP units, and this leads to the design of processors that include DFP hardware. VLSI implementations using the same modular building blocks can decrease system design and manufacturing cost. A multiplexer realization is a natural choice from the viewpoint of cost and speed. This thesis focuses on the design and synthesis of an efficient decimal MAC (Multiply-Accumulate) architecture for high-speed decimal processors based on the IEEE Standard for Floating-Point Arithmetic (IEEE 754-2008).
The research goal is to design and synthesize decimal MAC architectures to achieve higher performance. Efficient design methods and architectures are developed for a high-performance DFP MAC unit as part of this research.
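The representation error described in the abstract above, and the multiply-accumulate operation it targets, can be illustrated in software. The sketch below uses Python's standard `decimal` module as a stand-in for a decimal floating-point unit; it is not the thesis's hardware design, and `decimal_mac` is a hypothetical helper name introduced for this example.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2 or 0.3 exactly,
# so the binary sum does not compare equal to 0.3.
binary_matches = (0.1 + 0.2 == 0.3)

# Decimal floating point represents these values exactly.
decimal_matches = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))


def decimal_mac(pairs):
    """Accumulate sum(a * b) using fused multiply-add, so each product
    is added to the accumulator with a single rounding."""
    acc = Decimal("0")
    for a, b in pairs:
        acc = a.fma(b, acc)  # a * b + acc, rounded once
    return acc


total = decimal_mac([(Decimal("1.10"), Decimal("3")),
                     (Decimal("0.05"), Decimal("2"))])
print(binary_matches, decimal_matches, total)
```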
Abstract:
Thesis (M. S.)--University of Illinois at Urbana-Champaign.
Abstract:
The high integration density of current nanometer technologies allows the implementation of complex floating-point applications in a single FPGA. In this work the intrinsic complexity of floating-point operators is addressed, targeting configurable devices and making design decisions that provide the most suitable trade-offs between performance and standard compliance. A set of floating-point libraries composed of adder/subtracter, multiplier, divider, square root, exponential, logarithm and power function is presented. Each library has been designed taking into account special characteristics of current FPGAs, and for this purpose we have adapted the IEEE floating-point standard (software-oriented) to a custom FPGA-oriented format. Extended experimental results validate the design decisions made and prove the usefulness of reducing the format complexity.
Abstract:
Basic concepts for an interval arithmetic standard are discussed in the paper. Interval arithmetic deals with closed and connected sets of real numbers. Unlike floating-point arithmetic it is free of exceptions. A complete set of formulas to approximate real interval arithmetic on the computer is displayed in section 3 of the paper. The essential comparison relations and lattice operations are discussed in section 6. Evaluation of functions for interval arguments is studied in section 7. The desirability of variable length interval arithmetic is also discussed in the paper. The requirement to adapt the digital computer to the needs of interval arithmetic is as old as interval arithmetic. An obvious, simple possible solution is shown in section 8.
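As a rough illustration of approximating real interval arithmetic on a computer, the sketch below encloses each result by widening the round-to-nearest bounds by one floating-point step. This is an assumption made for the example, not the formulas from the paper: a real implementation would use directed rounding, which Python does not expose. The `Interval` class and helper names are hypothetical, and `math.nextafter` requires Python 3.9 or later.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


def _down(x):
    # Next representable float below x (stands in for rounding downward).
    return math.nextafter(x, -math.inf)


def _up(x):
    # Next representable float above x (stands in for rounding upward).
    return math.nextafter(x, math.inf)


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(_down(self.lo + other.lo), _up(self.hi + other.hi))

    def __mul__(self, other):
        # The exact product range is spanned by the four endpoint products.
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval(_down(min(p)), _up(max(p)))


# The enclosure contains the exact sum of the two stored binary values.
s = Interval(0.1, 0.1) + Interval(0.2, 0.2)
exact = Fraction(0.1) + Fraction(0.2)
p = Interval(1.5, 2.0) * Interval(-3.0, 0.1)
```

Widening by one step is slightly pessimistic compared with true directed rounding, but it keeps the enclosure property for finite bounds.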
Abstract:
This work presents JFloat, a software implementation of the IEEE-754 standard for binary floating-point arithmetic. JFloat was built to provide some features not implemented in Java, specifically directed rounding support. That feature is important for Java-XSC, a project developed in this Department. Also, Java programs should have the same portability when using floating-point operations, mainly because IEEE-754 specifies that programs should have exactly the same behavior on every configuration. However, it was noted that programs using Java native floating-point types may be machine and operating system dependent. JFloat is also a possible solution to that problem.
Abstract:
The high performance and capacity of current FPGAs make them suitable as acceleration co-processors. This article studies the implementation, for such accelerators, of the floating-point power function x^y as defined by the C99 and IEEE 754-2008 standards, generalized here to arbitrary exponent and mantissa sizes. Last-bit accuracy at the smallest possible cost is obtained thanks to a careful study of the various subcomponents: a floating-point logarithm, a modified floating-point exponential, and a truncated floating-point multiplier. A parameterized architecture generator in the open-source FloPoCo project is presented in detail and evaluated.
Abstract:
The growth of high-performance applications in computer graphics, signal processing and scientific computing is a key driver for high-performance, fixed-latency, pipelined floating-point dividers. Solutions available in the literature use large lookup tables for double-precision floating-point operations. In this paper, we propose a cost-effective, fixed-latency pipelined divider using a modified Taylor-series expansion for double-precision floating-point operations. We reduce chip area by using a smaller lookup table. We show that the latency of the proposed divider is 49.4 times the latency of a full-adder. The proposed divider reduces chip area by about 81% compared with the pipelined divider in [9], which is based on a modified Taylor series.
Abstract:
Power has become a key constraint in nanoscale integrated circuit design due to the increasing demands for mobile computing and higher integration density. As an emerging computational paradigm, an inexact circuit offers a promising approach to significantly reduce both dynamic and static power dissipation for error-tolerant applications. In this paper, an inexact floating-point adder is proposed by approximately designing an exponent subtractor and mantissa adder. Related operations such as normalization and rounding are also dealt with in terms of inexact computing. An upper bound error analysis for the average case is presented to guide the inexact design; it shows that the inexact floating-point adder design is dependent on the application data range. High dynamic range images are then processed using the proposed inexact floating-point adders to show the validity of the inexact design; comparison results show that the proposed inexact floating-point adders can improve the power consumption and power-delay product by 29.98% and 39.60%, respectively.
Abstract:
An SVD processor system is presented in which each processing element is implemented using a simple CORDIC unit. The internal recursive loop within the CORDIC module is exploited, with pipelining being used to multiplex the two independent micro-rotations onto a single CORDIC processor. This leads to a high performance and efficient hardware architecture. In addition, a novel method for scale factor correction is presented which only need be applied once at the end of the computation. This also reduces the computation time. The net result is an SVD architecture based on a conventional CORDIC approach, which combines high performance with high silicon area efficiency.
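For readers unfamiliar with CORDIC, the sketch below shows the conventional rotation-mode iteration in Python: a sequence of shift-and-add micro-rotations, with the accumulated scale factor corrected once at the end. It is a textbook software illustration only, not the paper's pipelined architecture or its novel scale-factor method; the function name and the iteration count are assumptions made for this example.

```python
import math


def cordic_rotate(angle, iterations=32):
    """Return (cos(angle), sin(angle)) for |angle| <= pi/2 using
    conventional rotation-mode CORDIC."""
    # Elementary rotation angles atan(2^-i) and the accumulated gain.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    x, y, z = 1.0, 0.0, angle
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0      # rotate towards zero residual angle
        shift = 2.0 ** -i                # a right shift by i bits in hardware
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * angles[i]

    # Scale factor correction applied once, after all micro-rotations.
    return x / gain, y / gain


c, s = cordic_rotate(math.pi / 6)
print(c, s)
```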
Abstract:
This paper proposes a set of well-defined steps to design functional verification monitors intended to verify Floating Point Units (FPU) described in HDL. The first step consists of defining the input and output domain coverage. Next, the corner cases are defined. Finally, an already verified reference model is used in order to test the correctness of the Device Under Verification (DUV). As a case study, a monitor for an IEEE 754-2008 compliant design is implemented. This monitor is built to be easily instantiated into verification frameworks such as OVM. Two different designs were verified, reaching complete input coverage and successful compliance results.
Abstract:
The perspex machine arose from the unification of projective geometry with the Turing machine. It uses a total arithmetic, called transreal arithmetic, that contains real arithmetic and allows division by zero. Transreal arithmetic is redefined here. The new arithmetic has both a positive and a negative infinity which lie at the extremes of the number line, and a number nullity that lies off the number line. We prove that nullity, 0/0, is a number. Hence a number may have one of four signs: negative, zero, positive, or nullity. It is, therefore, impossible to encode the sign of a number in one bit, as floating-point arithmetic attempts to do, resulting in the difficulty of having both positive and negative zeros and NaNs. Transrational arithmetic is consistent with Cantor arithmetic. In an extension to real arithmetic, the product of zero, an infinity, or nullity with its reciprocal is nullity, not unity. This avoids the usual contradictions that follow from allowing division by zero. Transreal arithmetic has a fixed algebraic structure and does not admit options as IEEE floating-point arithmetic does. Most significantly, nullity has a simple semantics that is related to zero. Zero means "no value" and nullity means "no information." We argue that nullity is as useful to a manufactured computer as zero is to a human computer. The perspex machine is intended to offer one solution to the mind-body problem by showing how the computable aspects of mind and, perhaps, the whole of mind relates to the geometrical aspects of body and, perhaps, the whole of body. We review some of Turing's writings and show that he held the view that his machine has spatial properties. In particular, that it has the property of being a 7D lattice of compact spaces. Thus, we read Turing as believing that his machine relates computation to geometrical bodies. We simplify the perspex machine by substituting an augmented Euclidean geometry for projective geometry.
This leads to a general-linear perspex-machine which is very much easier to program than the original perspex-machine. We then show how to map the whole of perspex space into a unit cube. This allows us to construct a fractal of perspex machines with the cardinality of a real-numbered line or space. This fractal is the universal perspex machine. It can solve, in unit time, the halting problem for itself and for all perspex machines instantiated in real-numbered space, including all Turing machines. We cite an experiment that has been proposed to test the physical reality of the perspex machine's model of time, but we make no claim that the physical universe works this way or that it has the cardinality of the perspex machine. We leave it that the perspex machine provides an upper bound on the computational properties of physical things, including manufactured computers and biological organisms, that have a cardinality no greater than the real-number line.
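The division rules that the abstract above attributes to transreal arithmetic (division by zero is allowed, 0/0 is nullity, and there is no negative zero) can be sketched in a few lines. This is a hedged illustration based only on the properties stated in the abstract and the commonly published transreal rules; the `NULLITY` sentinel and the `transreal_div` function are hypothetical names, and inputs are assumed to be finite floats, infinities, or `NULLITY`.

```python
import math

# Hypothetical sentinel for nullity; unlike an IEEE NaN it equals itself.
NULLITY = "nullity"


def transreal_div(a, b):
    """Divide a by b under transreal rules (illustrative sketch)."""
    if a == NULLITY or b == NULLITY:
        return NULLITY                   # nullity absorbs every operation
    if b == 0:
        if a == 0:
            return NULLITY               # 0/0 is nullity
        return math.inf if a > 0 else -math.inf
    if math.isinf(a) and math.isinf(b):
        return NULLITY                   # infinity times 1/infinity (zero)
    if math.isinf(b):
        return 0.0                       # a single, unsigned zero
    return a / b


print(transreal_div(1.0, 0.0), transreal_div(-1.0, 0.0), transreal_div(0.0, 0.0))
```

Because the sentinel compares equal to itself, `transreal_div(0.0, 0.0) == NULLITY` is true, which mirrors the reflexive equality the abstracts contrast with IEEE NaNs.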