From f215e02bf85f68d3a6106c2a1f4f7f063f819064 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Thu, 11 Apr 2024 10:17:27 +0200 Subject: Adding upstream version 7.0.14-dfsg. Signed-off-by: Daniel Baumann --- .../testfloat/doc/TestFloat-general.html | 1148 ++++++++++++++++++++ 1 file changed, 1148 insertions(+) create mode 100644 src/libs/softfloat-3e/testfloat/doc/TestFloat-general.html (limited to 'src/libs/softfloat-3e/testfloat/doc/TestFloat-general.html') diff --git a/src/libs/softfloat-3e/testfloat/doc/TestFloat-general.html b/src/libs/softfloat-3e/testfloat/doc/TestFloat-general.html new file mode 100644 index 00000000..55f67d71 --- /dev/null +++ b/src/libs/softfloat-3e/testfloat/doc/TestFloat-general.html @@ -0,0 +1,1148 @@ + + + + +Berkeley TestFloat General Documentation + + + + +

Berkeley TestFloat Release 3e: General Documentation

+ +

+John R. Hauser
+2018 January 20
+

+ + +

Contents

+ +
+ +++ + + + + + + + + + + + + + + + + + + + +
1. Introduction
2. Limitations
3. Acknowledgments and License
4. What TestFloat Does
5. Executing TestFloat
6. Operations Tested by TestFloat
6.1. Conversion Operations
6.2. Basic Arithmetic Operations
6.3. Fused Multiply-Add Operations
6.4. Remainder Operations
6.5. Round-to-Integer Operations
6.6. Comparison Operations
7. Interpreting TestFloat Output
8. Variations Allowed by the IEEE Floating-Point Standard
8.1. Underflow
8.2. NaNs
8.3. Conversions to Integer
9. Contact Information
+
+ + +

1. Introduction

+ +

+Berkeley TestFloat is a small collection of programs for testing that an +implementation of binary floating-point conforms to the IEEE Standard for +Floating-Point Arithmetic. +All operations required by the original 1985 version of the IEEE Floating-Point +Standard can be tested, except for conversions to and from decimal. +With the current release, the following binary formats can be tested: +16-bit half-precision, 32-bit single-precision, +64-bit double-precision, 80-bit +double-extended-precision, and/or 128-bit quadruple-precision. +TestFloat cannot test decimal floating-point. +

+ +

+Included in the TestFloat package are the testsoftfloat and +timesoftfloat programs for testing the Berkeley SoftFloat software +implementation of floating-point and for measuring its speed. +Information about SoftFloat can be found at the SoftFloat Web page, +http://www.jhauser.us/arithmetic/SoftFloat.html. +The testsoftfloat and timesoftfloat programs are +expected to be of interest only to people compiling the SoftFloat sources. +

+ +

+This document explains how to use the TestFloat programs. +It does not attempt to define or explain much of the IEEE Floating-Point +Standard. +Details about the standard are available elsewhere. +

+ +

+The current version of TestFloat is Release 3e. +This version differs from earlier releases 3b through 3d in only minor ways. +Compared to the original Release 3: +

+This release adds a few more small improvements, including modifying the +expected behavior of rounding mode odd and fixing a minor bug in +the all-in-one testfloat program. +

+ +

+Compared to Release 2c and earlier, the set of TestFloat programs, as well as +the programs’ arguments and behavior, changed some with +Release 3. +For more about the evolution of TestFloat releases, see +TestFloat-history.html. +

+ + +

2. Limitations

+ +

+TestFloat output is not always easily interpreted. +Detailed knowledge of the IEEE Floating-Point Standard and its vagaries is +needed to use TestFloat responsibly. +

+ +

+TestFloat performs relatively simple tests designed to check the fundamental +soundness of the floating-point under test. +TestFloat may also at times manage to find rarer and more subtle bugs, but it +will probably only find such bugs by chance. +Software that purposefully seeks out various kinds of subtle floating-point +bugs can be found through links posted on the TestFloat Web page, +http://www.jhauser.us/arithmetic/TestFloat.html. +

+ + +

3. Acknowledgments and License

+ +

+The TestFloat package was written by me, John R. Hauser. +Release 3 of TestFloat was a completely new implementation +supplanting earlier releases. +The project to create Release 3 (now through 3e) was +done in the employ of the University of California, Berkeley, within the +Department of Electrical Engineering and Computer Sciences, first for the +Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab. +The work was officially overseen by Prof. Krste Asanovic, with funding provided +by these sources: +

+ ++++ + + + + + + + + + +
Par Lab: +Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery +(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia, +NVIDIA, Oracle, and Samsung. +
ASPIRE Lab: +DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from +ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA, +Oracle, and Samsung. +
+
+

+ +

+The following applies to the whole of TestFloat Release 3e as well +as to each source file individually. +

+ +

+Copyright 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The Regents of the +University of California. +All rights reserved. +

+ +

+Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: +

    + +
  1. +

    +Redistributions of source code must retain the above copyright notice, this +list of conditions, and the following disclaimer. +

    + +
  2. +

    +Redistributions in binary form must reproduce the above copyright notice, this +list of conditions, and the following disclaimer in the documentation and/or +other materials provided with the distribution. +

    + +
  3. +

    +Neither the name of the University nor the names of its contributors may be +used to endorse or promote products derived from this software without specific +prior written permission. +

    + +
+

+ +

+THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS “AS IS”, +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE +DISCLAIMED. +IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +

+ + +

4. What TestFloat Does

+ +

+TestFloat is designed to test a floating-point implementation by comparing its +behavior with that of TestFloat’s own internal floating-point implemented +in software. +For each operation to be tested, the TestFloat programs can generate a large +number of test cases, made up of simple pattern tests intermixed with weighted +random inputs. +The cases generated should be adequate for testing carry chain propagations, +and the rounding of addition, subtraction, multiplication, and simple +operations like conversions. +TestFloat makes a point of checking all boundary cases of the arithmetic, +including underflows, overflows, invalid operations, subnormal inputs, zeros +(positive and negative), infinities, and NaNs. +For the interesting operations like addition and multiplication, millions of +test cases may be checked. +

+ +

+TestFloat is not remarkably good at testing difficult rounding cases for +division and square root. +It also makes no attempt to find bugs specific to SRT division and the like +(such as the infamous Pentium division bug). +Software that tests for such failures can be found through links on the +TestFloat Web page, +http://www.jhauser.us/arithmetic/TestFloat.html. +

+ +

+NOTE!
+It is the responsibility of the user to verify that the discrepancies TestFloat +finds actually represent faults in the implementation being tested. +Advice to help with this task is provided later in this document. +Furthermore, even if TestFloat finds no fault with a floating-point +implementation, that in no way guarantees that the implementation is bug-free. +

+ +

+For each operation, TestFloat can test all five rounding modes defined by the +IEEE Floating-Point Standard, plus possibly a sixth mode, round to odd +(depending on the options selected when TestFloat was built). +TestFloat verifies not only that the numeric results of an operation are +correct, but also that the proper floating-point exception flags are raised. +All five exception flags are tested, including the inexact flag. +TestFloat does not attempt to verify that the floating-point exception flags +are actually implemented as sticky flags. +

+ +

+For the 80-bit double-extended-precision format, TestFloat can +test the addition, subtraction, multiplication, division, and square root +operations at all three of the standard rounding precisions. +The rounding precision can be set to 32 bits, equivalent to +single-precision, to 64 bits, equivalent to double-precision, or +to the full 80 bits of the double-extended-precision. +Rounding precision control can be applied only to the double-extended-precision +format and only for the five basic arithmetic operations: addition, +subtraction, multiplication, division, and square root. +Other operations can be tested only at full precision. +

+ +

+As a rule, TestFloat is not particular about the bit patterns of NaNs that +appear as operation results. +Any NaN is considered as good a result as another. +This laxness can be overridden so that TestFloat checks for particular bit +patterns within NaN results. +See section 8 below, Variations Allowed by the IEEE +Floating-Point Standard, plus the -checkNaNs and +-checkInvInts options documented for programs +testfloat_ver and testfloat. +

+ +

+TestFloat normally compares an implementation of floating-point against the +Berkeley SoftFloat software implementation of floating-point, also created by +me. +The SoftFloat functions are linked into each TestFloat program’s +executable. +Information about SoftFloat can be found at the Web page +http://www.jhauser.us/arithmetic/SoftFloat.html. +

+ +

+For testing SoftFloat itself, the TestFloat package includes a +testsoftfloat program that compares SoftFloat’s +floating-point against another software floating-point implementation. +The second software floating-point is simpler and slower than SoftFloat, and is +completely independent of SoftFloat. +Although the second software floating-point cannot be guaranteed to be +bug-free, the chance that it would mimic any of SoftFloat’s bugs is low. +Consequently, an error in one or the other floating-point version should appear +as an unexpected difference between the two implementations. +Note that testing SoftFloat should be necessary only when compiling a new +TestFloat executable or when compiling SoftFloat for some other reason. +

+ + +

5. Executing TestFloat

+ +

+The TestFloat package consists of five programs, all intended to be executed +from a command-line interpreter: +

+ + + + + + + + + + + + + + + + + + + + + +
+testfloat_gen    + +Generates test cases for a specific floating-point operation. +
+testfloat_ver + +Verifies whether the results from executing a floating-point operation are as +expected. +
+testfloat + +An all-in-one program that generates test cases, executes floating-point +operations, and verifies whether the results match expectations. +
+testsoftfloat    + +Like testfloat, but for testing SoftFloat. +
+timesoftfloat    + +A program for measuring the speed of SoftFloat (included in the TestFloat +package for convenience). +
+
+Each program has its own page of documentation that can be opened through the +links in the table above. +

+ +

+To test a floating-point implementation other than SoftFloat, one of three +different methods can be used. +The first method pipes output from testfloat_gen to a program +that: +(a) reads the incoming test cases, (b) invokes the +floating-point operation being tested, and (c) writes the +operation results to output. +These results can then be piped to testfloat_ver to be checked for +correctness. +Assuming a vertical bar (|) indicates a pipe between programs, the +complete process could be written as a single command like so: +

+
+testfloat_gen ... <type> | <program-that-invokes-op> | testfloat_ver ... <function>
+
+
+The program in the middle is not supplied by TestFloat but must be created +independently. +If for some reason this program cannot take command-line arguments, the +-prefix option of testfloat_gen can communicate +parameters through the pipe. +

+ +

+A second method for running TestFloat is similar but has +testfloat_gen supply not only the test inputs but also the +expected results for each case. +With this additional information, the job done by testfloat_ver +can be folded into the invoking program to give the following command: +

+
+testfloat_gen ... <function> | <program-that-invokes-op-and-compares-results>
+
+
+Again, the program that actually invokes the floating-point operation is not +supplied by TestFloat but must be created independently. +Depending on circumstance, it may be preferable either to let +testfloat_ver check and report suspected errors (first method) or +to include this step in the invoking program (second method). +

+ +

+The third way to use TestFloat is the all-in-one testfloat +program. +This program can perform all the steps of creating test cases, invoking the +floating-point operation, checking the results, and reporting suspected errors. +However, for this to be possible, testfloat must be compiled to +contain the method for invoking the floating-point operations to test. +Each build of testfloat is therefore capable of testing +only the floating-point implementation it was built to invoke. +To test a new implementation of floating-point, a new testfloat +must be created, linked to that specific implementation. +By comparison, the testfloat_gen and testfloat_ver +programs are entirely generic; +one instance is usable for testing any floating-point implementation, because +implementation-specific details are segregated in the custom program that +follows testfloat_gen. +

+ +

+Program testsoftfloat is another all-in-one program specifically +for testing SoftFloat. +

+ +

+Programs testfloat_ver, testfloat, and +testsoftfloat all report status and error information in a common +way. +As it executes, each of these programs writes status information to the +standard error output, which should be the screen by default. +In order for this status to be displayed properly, the standard error stream +should not be redirected to a file. +Any discrepancies that are found are written to the standard output stream, +which is easily redirected to a file if desired. +Unless redirected, reported errors will appear intermixed with the ongoing +status information in the output. +

+ + +

6. Operations Tested by TestFloat

+ +

+TestFloat can test all operations required by the original 1985 IEEE +Floating-Point Standard except for conversions to and from decimal. +These operations are: +

+In addition, TestFloat can also test + +

+ +

+More information about all these operations is given below. +In the operation names used by TestFloat, 16-bit half-precision is +called f16, 32-bit single-precision is +f32, 64-bit double-precision is f64, +80-bit double-extended-precision is extF80, and +128-bit quadruple-precision is f128. +TestFloat generally uses the same names for operations as Berkeley SoftFloat, +except that TestFloat’s names never include the M that +SoftFloat uses to indicate that values are passed through pointers. +

+ +

6.1. Conversion Operations

+ +

+All conversions among the floating-point formats and all conversions between a +floating-point format and 32-bit and 64-bit integers +can be tested. +The conversion operations are: +

+
+ui32_to_f16      ui64_to_f16      i32_to_f16       i64_to_f16
+ui32_to_f32      ui64_to_f32      i32_to_f32       i64_to_f32
+ui32_to_f64      ui64_to_f64      i32_to_f64       i64_to_f64
+ui32_to_extF80   ui64_to_extF80   i32_to_extF80    i64_to_extF80
+ui32_to_f128     ui64_to_f128     i32_to_f128      i64_to_f128
+
+f16_to_ui32      f32_to_ui32      f64_to_ui32      extF80_to_ui32    f128_to_ui32
+f16_to_ui64      f32_to_ui64      f64_to_ui64      extF80_to_ui64    f128_to_ui64
+f16_to_i32       f32_to_i32       f64_to_i32       extF80_to_i32     f128_to_i32
+f16_to_i64       f32_to_i64       f64_to_i64       extF80_to_i64     f128_to_i64
+
+f16_to_f32       f32_to_f16       f64_to_f16       extF80_to_f16     f128_to_f16
+f16_to_f64       f32_to_f64       f64_to_f32       extF80_to_f32     f128_to_f32
+f16_to_extF80    f32_to_extF80    f64_to_extF80    extF80_to_f64     f128_to_f64
+f16_to_f128      f32_to_f128      f64_to_f128      extF80_to_f128    f128_to_extF80
+
+
+Abbreviations ui32 and ui64 indicate +32-bit and 64-bit unsigned integer types, while +i32 and i64 indicate their signed counterparts. +These conversions all round according to the current rounding mode as relevant. +Conversions from a smaller to a larger floating-point format are always exact +and so require no rounding. +Likewise, conversions from 32-bit integers to 64-bit +double-precision or to any larger floating-point format are also exact, as are +conversions from 64-bit integers to 80-bit +double-extended-precision and 128-bit quadruple-precision. +

+ +

+For the all-in-one testfloat program, this list of conversion +operations requires amendment. +For testfloat only, conversions to an integer type have names that +explicitly specify the rounding mode and treatment of inexactness. +Thus, instead of +

+
+<float>_to_<int>
+
+
+as listed above, operations converting to integer type have names of these +forms: +
+
+<float>_to_<int>_r_<round>
+<float>_to_<int>_rx_<round>
+
+
+The <round> component is one of +‘near_even’, ‘near_maxMag’, +‘minMag’, ‘min’, or +‘max’, choosing the rounding mode. +Any other indication of rounding mode is ignored. +The operations with ‘_r_’ in their names never raise +the inexact exception, while those with ‘_rx_’ +raise the inexact exception whenever the result is not exact. +

+ +

+TestFloat assumes that conversions from floating-point to an integer type +should raise the invalid exception if the input cannot be rounded to an +integer representable in the result format. +In such a circumstance: +

+Conversions to integer types are expected never to raise the overflow +exception. +

+ +

6.2. Basic Arithmetic Operations

+ +

+The following standard arithmetic operations can be tested: +

+
+f16_add      f16_sub      f16_mul      f16_div      f16_sqrt
+f32_add      f32_sub      f32_mul      f32_div      f32_sqrt
+f64_add      f64_sub      f64_mul      f64_div      f64_sqrt
+extF80_add   extF80_sub   extF80_mul   extF80_div   extF80_sqrt
+f128_add     f128_sub     f128_mul     f128_div     f128_sqrt
+
+
+The double-extended-precision (extF80) operations can be rounded +to reduced precision under rounding precision control. +

+ +

6.3. Fused Multiply-Add Operations

+ +

+For all floating-point formats except 80-bit +double-extended-precision, TestFloat can test the fused multiply-add operation +defined by the 2008 IEEE Floating-Point Standard. +The fused multiply-add operations are: +

+
+f16_mulAdd
+f32_mulAdd
+f64_mulAdd
+f128_mulAdd
+
+
+

+ +

+If one of the multiplication operands is infinite and the other is zero, +TestFloat expects the fused multiply-add operation to raise the invalid +exception even if the third operand is a quiet NaN. +

+ +

6.4. Remainder Operations

+ +

+For each format, TestFloat can test the IEEE Standard’s remainder +operation. +These operations are: +

+
+f16_rem
+f32_rem
+f64_rem
+extF80_rem
+f128_rem
+
+
+The remainder operations are always exact and so require no rounding. +

+ +

6.5. Round-to-Integer Operations

+ +

+For each format, TestFloat can test the IEEE Standard’s round-to-integer +operation. +For most TestFloat programs, these operations are: +

+
+f16_roundToInt
+f32_roundToInt
+f64_roundToInt
+extF80_roundToInt
+f128_roundToInt
+
+
+

+ +

+Just as for conversions to integer types (section 6.1 above), the +all-in-one testfloat program is again an exception. +For testfloat only, the round-to-integer operations have names of +these forms: +

+
+<float>_roundToInt_r_<round>
+<float>_roundToInt_x
+
+
+For the ‘_r_’ versions, the inexact exception +is never raised, and the <round> component specifies +the rounding mode as one of ‘near_even’, +‘near_maxMag’, ‘minMag’, +‘min’, or ‘max’. +The usual indication of rounding mode is ignored. +In contrast, the ‘_x’ versions accept the usual +indication of rounding mode and raise the inexact exception whenever the +result is not exact. +This irregular system follows the IEEE Standard’s particular +specification for the round-to-integer operations. +

+ +

6.6. Comparison Operations

+ +

+The following floating-point comparison operations can be tested: +

+
+f16_eq      f16_le      f16_lt
+f32_eq      f32_le      f32_lt
+f64_eq      f64_le      f64_lt
+extF80_eq   extF80_le   extF80_lt
+f128_eq     f128_le     f128_lt
+
+
+The abbreviation eq stands for “equal” (=), +le stands for “less than or equal” (≤), and +lt stands for “less than” (<). +

+ +

+The IEEE Standard specifies that, by default, the less-than-or-equal and +less-than comparisons raise the invalid exception if either input is any +kind of NaN. +The equality comparisons, on the other hand, are defined by default to raise +the invalid exception only for signaling NaNs, not for quiet NaNs. +For completeness, the following additional operations can be tested if +supported: +

+
+f16_eq_signaling      f16_le_quiet      f16_lt_quiet
+f32_eq_signaling      f32_le_quiet      f32_lt_quiet
+f64_eq_signaling      f64_le_quiet      f64_lt_quiet
+extF80_eq_signaling   extF80_le_quiet   extF80_lt_quiet
+f128_eq_signaling     f128_le_quiet     f128_lt_quiet
+
+
+The signaling equality comparisons are identical to the standard +operations except that the invalid exception should be raised for any +NaN input. +Similarly, the quiet comparison operations should be identical to +their counterparts except that the invalid exception is not raised for +quiet NaNs. +

+ +

+Obviously, no comparison operations ever require rounding. +Any rounding mode is ignored. +

+ + +

7. Interpreting TestFloat Output

+ +

+The “errors” reported by TestFloat programs may or may not really +represent errors in the system being tested. +For each test case tried, the results from the floating-point implementation +being tested could differ from the expected results for several reasons: +

+It is the responsibility of the user to determine the causes for the +discrepancies that are reported. +Making this determination can require detailed knowledge about the IEEE +Standard. +Assuming TestFloat is working properly, any differences found will be due to +either the first or last of the reasons above. +Variations in the IEEE Standard that could lead to false error reports are +discussed in section 8, Variations Allowed by the IEEE +Floating-Point Standard. +

+ +

+For each reported error (or apparent error), a line of text is written to the +default output. +If a line would be longer than 79 characters, it is divided. +The first part of each error line begins in the leftmost column, and any +subsequent “continuation” lines are indented with a tab. +

+ +

+Each error reported is of the form: +

+
+<inputs>  => <observed-output>  expected: <expected-output>
+
+
+The <inputs> are the inputs to the operation. +Each output (observed or expected) is shown as a pair: the result value first, +followed by the exception flags. +

+ +

+For example, two typical error lines could be +

+
+-00.7FFF00  -7F.000100  => +01.000000 ...ux  expected: +01.000000 ....x
++81.000004  +00.1FFFFF  => +01.000000 ...ux  expected: +01.000000 ....x
+
+
+In the first line, the inputs are -00.7FFF00 and +-7F.000100, and the observed result is +01.000000 +with flags ...ux. +The trusted emulation result is the same but with different flags, +....x. +Items such as -00.7FFF00 composed of a sign character +(+/-), hexadecimal digits, and a single +period represent floating-point values (here 32-bit +single-precision). +The two instances above were reported as errors because the exception flag +results differ. +

+ +

+Aside from the exception flags, there are ten data types that may be +represented. +Five are floating-point types: 16-bit half-precision, +32-bit single-precision, 64-bit double-precision, +80-bit double-extended-precision, and 128-bit +quadruple-precision. +The remaining five types are 32-bit and 64-bit +unsigned integers, 32-bit and 64-bit +two’s-complement signed integers, and Boolean values (the results of +comparison operations). +Boolean values are represented as a single character, either a 0 +(false) or a 1 (true). +A 32-bit integer is represented as 8 hexadecimal digits. +Thus, for a signed 32-bit integer, FFFFFFFF is +−1, and 7FFFFFFF is the largest positive value. +64-bit integers are the same except with 16 hexadecimal digits. +

+ +

+Floating-point values are written decomposed into their sign, encoded exponent, +and encoded significand. +First is the sign character (+ or -), +followed by the encoded exponent in hexadecimal, then a period +(.), and lastly the encoded significand in hexadecimal. +

+ +

+For 16-bit half-precision, notable values include: +

+ + + + + + + + + + + + + + + +
+00.000    +0
+0F.000 1
+10.000 2
+1E.3FFmaximum finite value
+1F.000+infinity
 
-00.000−0
-0F.000−1
-10.000−2
-1E.3FFminimum finite value (largest magnitude, but negative)
-1F.000−infinity
+
+Certain categories are easily distinguished (assuming the xs are +not all 0): +
+ + + + + + + + +
+00.xxx    positive subnormal numbers
+1F.xxxpositive NaNs
-00.xxxnegative subnormal numbers
-1F.xxxnegative NaNs
+
+

+ +

+Likewise for other formats: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
32-bit single64-bit double128-bit quadruple
 
+00.000000    +000.0000000000000    +0000.0000000000000000000000000000    +0
+7F.000000+3FF.0000000000000+3FFF.0000000000000000000000000000 1
+80.000000+400.0000000000000+4000.0000000000000000000000000000 2
+FE.7FFFFF+7FE.FFFFFFFFFFFFF+7FFE.FFFFFFFFFFFFFFFFFFFFFFFFFFFFmaximum finite value
+FF.000000+7FF.0000000000000+7FFF.0000000000000000000000000000+infinity
 
-00.000000    -000.0000000000000    -0000.0000000000000000000000000000    −0
-7F.000000-3FF.0000000000000-3FFF.0000000000000000000000000000−1
-80.000000-400.0000000000000-4000.0000000000000000000000000000−2
-FE.7FFFFF-7FE.FFFFFFFFFFFFF-7FFE.FFFFFFFFFFFFFFFFFFFFFFFFFFFFminimum finite value
-FF.000000-7FF.0000000000000-7FFF.0000000000000000000000000000−infinity
 
+00.xxxxxx+000.xxxxxxxxxxxxx+0000.xxxxxxxxxxxxxxxxxxxxxxxxxxxxpositive subnormals
+FF.xxxxxx+7FF.xxxxxxxxxxxxx+7FFF.xxxxxxxxxxxxxxxxxxxxxxxxxxxxpositive NaNs
-00.xxxxxx-000.xxxxxxxxxxxxx-0000.xxxxxxxxxxxxxxxxxxxxxxxxxxxxnegative subnormals
-FF.xxxxxx-7FF.xxxxxxxxxxxxx-7FFF.xxxxxxxxxxxxxxxxxxxxxxxxxxxxnegative NaNs
+
+

+ +

+The 80-bit double-extended-precision values are a little unusual +in that the leading bit of precision is not hidden as with other formats. +When canonically encoded, the leading significand bit of an 80-bit +double-extended-precision value will be 0 if the value is zero or subnormal, +and will be 1 otherwise. +Hence, the same values listed above appear in 80-bit +double-extended-precision as follows (note the leading 8 digit in +the significands): +

+ + + + + + + + + + + + + + + + + + + + + +
+0000.0000000000000000    +0
+3FFF.8000000000000000 1
+4000.8000000000000000 2
+7FFE.FFFFFFFFFFFFFFFFmaximum finite value
+7FFF.8000000000000000+infinity
 
-0000.0000000000000000−0
-3FFF.8000000000000000−1
-4000.8000000000000000−2
-7FFE.FFFFFFFFFFFFFFFFminimum finite value
-7FFF.8000000000000000−infinity
+
+

+ +

+Lastly, exception flag values are represented by five characters, one character +per flag. +Each flag is written as either a letter or a period (.) according +to whether the flag was set or not by the operation. +A period indicates the flag was not set. +The letter used to indicate a set flag depends on the flag: +

+ + + + + + + + + + + + +
v    invalid exception
iinfinite exception (“divide by zero”)
ooverflow exception
uunderflow exception
xinexact exception
+
+For example, the notation ...ux indicates that the +underflow and inexact exception flags were set and that the other +three flags (invalid, infinite, and overflow) were not +set. +The exception flags are always written following the value returned as the +result of the operation. +

+ + +

8. Variations Allowed by the IEEE Floating-Point Standard

+ +

+The IEEE Floating-Point Standard admits some variation among conforming +implementations. +Because TestFloat expects the two implementations being compared to deliver +bit-for-bit identical results under most circumstances, this leeway in the +standard can result in false errors being reported if the two implementations +do not make the same choices everywhere the standard provides an option. +

+ +

8.1. Underflow

+ +

+The standard specifies that the underflow exception flag is to be raised +when two conditions are met simultaneously: +(1) tininess and (2) loss of accuracy. +

+ +

+A result is tiny when its magnitude is nonzero yet smaller than any normalized +floating-point number. +The standard allows tininess to be determined either before or after a result +is rounded to the destination precision. +If tininess is detected before rounding, some borderline cases will be flagged +as underflows even though the result after rounding actually lies within the +normal floating-point range. +By detecting tininess after rounding, a system can avoid some unnecessary +signaling of underflow. +All the TestFloat programs support options -tininessbefore and +-tininessafter to control whether TestFloat expects tininess on +underflow to be detected before or after rounding. +One or the other is selected as the default when TestFloat is compiled, but +these command options allow the default to be overridden. +

+ +

+Loss of accuracy occurs when the subnormal format is not sufficient to +represent an underflowed result accurately. +The original 1985 version of the IEEE Standard allowed loss of accuracy to be +detected either as an inexact result or as a +denormalization loss; +however, few if any systems ever chose the latter. +The latest standard requires that loss of accuracy be detected as an inexact +result, and TestFloat can test only for this case. +

+ +

8.2. NaNs

+ +

+The IEEE Standard gives the floating-point formats a large number of NaN +encodings and specifies that NaNs are to be returned as results under certain +conditions. +However, the standard allows an implementation almost complete freedom over +which NaN to return in each situation. +

+ +

+By default, TestFloat does not check the bit patterns of NaN results. +When the result of an operation should be a NaN, any NaN is considered as good +as another. +This laxness can be overridden with the -checkNaNs option of +programs testfloat_ver and testfloat. +In order for this option to be sensible, TestFloat must have been compiled so +that its internal floating-point implementation (SoftFloat) generates the +proper NaN results for the system being tested. +

+ +

8.3. Conversions to Integer

+ +

+Conversion of a floating-point value to an integer format will fail if the +source value is a NaN or if it is too large. +The IEEE Standard does not specify what value should be returned as the integer +result in these cases. +Moreover, according to the standard, the invalid exception can be raised +or an unspecified alternative mechanism may be used to signal such cases. +

+ +

+TestFloat assumes that conversions to integer will raise the invalid +exception if the source value cannot be rounded to a representable integer. +In such cases, TestFloat expects the result value to be the largest-magnitude +positive or negative integer or zero, as detailed earlier in +section 6.1, Conversion Operations. +If option -checkInvInts is selected with programs +testfloat_ver and testfloat, integer results of +invalid operations are checked for an exact match. +In order for this option to be sensible, TestFloat must have been compiled so +that its internal floating-point implementation (SoftFloat) generates the +proper integer results for the system being tested. +

+ + +

9. Contact Information

+ +

+At the time of this writing, the most up-to-date information about TestFloat +and the latest release can be found at the Web page +http://www.jhauser.us/arithmetic/TestFloat.html. +

+ + + + -- cgit v1.2.3