From 16f504a9dca3fe3b70568f67b7d41241ae485288 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
+John R. Hauser
+Berkeley SoftFloat is a software implementation of binary floating-point that
+conforms to the IEEE Standard for Floating-Point Arithmetic.
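+The current release supports five binary formats: 16-bit half-precision,
+32-bit single-precision, 64-bit double-precision, 80-bit
+double-extended-precision, and 128-bit quadruple-precision.
+
+Berkeley SoftFloat Release 3e: Library Interface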
+
+
+2018 January 20
+Contents
+
+1. Introduction
+2. Limitations
+3. Acknowledgments and License
+4. Types and Functions
+4.1. Boolean and Integer Types
+4.2. Floating-Point Types
+4.3. Supported Floating-Point Functions
+4.4. Non-canonical Representations in extFloat80_t
+4.5. Conventions for Passing Arguments and Results
+5. Reserved Names
+6. Mode Variables
+6.1. Rounding Mode
+6.2. Underflow Detection
+6.3. Rounding Precision for the 80-Bit Extended Format
+7. Exceptions and Exception Flags
+8. Function Details
+8.1. Conversions from Integer to Floating-Point
+8.2. Conversions from Floating-Point to Integer
+8.3. Conversions Among Floating-Point Types
+8.4. Basic Arithmetic Functions
+8.5. Fused Multiply-Add Functions
+8.6. Remainder Functions
+8.7. Round-to-Integer Functions
+8.8. Comparison Functions
+8.9. Signaling NaN Test Functions
+8.10. Raise-Exception Function
+9. Changes from SoftFloat Release 2
+9.1. Name Changes
+9.2. Changes to Function Arguments
+9.3. Added Capabilities
+9.4. Better Compatibility with the C Language
+9.5. New Organization as a Library
+9.6. Optimization Gains (and Losses)
+10. Future Directions
+11. Contact Information
+
+1. Introduction
+
+
+
+All operations required by the original 1985 version of the IEEE Floating-Point
+Standard are implemented, except for conversions to and from decimal.
+
+This document gives information about the types defined and the routines
+implemented by SoftFloat.
+It does not attempt to define or explain the IEEE Floating-Point Standard.
+Information about the standard is available elsewhere.
+
+The current version of SoftFloat is Release 3e.
+The previous
+Among earlier releases, 3b was notable for adding support for the 16-bit
+half-precision format.
+For more about the evolution of SoftFloat releases, see SoftFloat-history.html.
+The functional interface of SoftFloat
+
+2. Limitations
+
+SoftFloat assumes the computer has an addressable byte size of 8 or 16 bits.
+
+SoftFloat is written in C and is designed to work with other C code.
+The C compiler used must conform at a minimum to the 1989 ANSI standard for the
+C language (same as the 1990 ISO standard) and must in addition support basic
+arithmetic on 64-bit integers.
+Most operations not required by the original 1985 version of the IEEE
+Floating-Point Standard but added in the 2008 version are not yet supported in
+SoftFloat.
+
+3. Acknowledgments and License
+
+The SoftFloat package was written by me, John R. Hauser.
+Funding for the work was provided by these sources:
+
+Par Lab:
+    Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
+    (Award #DIG07-10227), with additional support from Par Lab affiliates
+    Nokia, NVIDIA, Oracle, and Samsung.
+ASPIRE Lab:
+    DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support
+    from ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia,
+    NVIDIA, Oracle, and Samsung.
+
+The following applies to the whole of SoftFloat
+
+Copyright 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The Regents of the
+University of California.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+ 1. Redistributions of source code must retain the above copyright notice,
+    this list of conditions, and the following disclaimer.
+
+ 2. Redistributions in binary form must reproduce the above copyright notice,
+    this list of conditions, and the following disclaimer in the documentation
+    and/or other materials provided with the distribution.
+
+ 3. Neither the name of the University nor the names of its contributors may
+    be used to endorse or promote products derived from this software without
+    specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS “AS IS”,
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
+DISCLAIMED.
+IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+4. Types and Functions
+
+The types and functions of SoftFloat are declared in header file softfloat.h.
+
+4.1. Boolean and Integer Types
+
+Header file softfloat.h depends on standard headers <stdbool.h> and
+<stdint.h> to define type bool and several integer types.
+These standard headers have been part of the ISO C Standard Library since 1999.
+With any recent compiler, they are likely to be supported, even if the compiler
+does not claim complete conformance to the latest ISO C Standard.
+For older or nonstandard compilers, a port of SoftFloat may have substitutes
+for these headers.
+Header softfloat.h depends only on the name bool from <stdbool.h> and on
+these type names from <stdint.h>:
+
+uint16_t
+uint32_t
+uint64_t
+int32_t
+int64_t
+uint_fast8_t
+uint_fast32_t
+uint_fast64_t
+int_fast32_t
+int_fast64_t
+
+4.2. Floating-Point Types
+
+The softfloat.h header defines five floating-point types:
+
+float16_t       16-bit half-precision binary format
+float32_t       32-bit single-precision binary format
+float64_t       64-bit double-precision binary format
+extFloat80_t    80-bit double-extended-precision binary format (old Intel or
+                Motorola format)
+float128_t      128-bit quadruple-precision binary format
+
+The non-extended types are each exactly the size specified: 16 bits for
+float16_t, 32 bits for float32_t, 64 bits for float64_t, and 128 bits for
+float128_t.
+Aside from these size requirements, the definitions of all these types may
+differ for different ports of SoftFloat to specific systems.
+A given port of SoftFloat may or may not define some of the floating-point
+types as aliases for the C standard types float, double, and long double.
+
+
+
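+Header file softfloat.h also defines a structure, struct extFloat80M, for the
+representation of 80-bit double-extended-precision floating-point values in
+memory.
+This structure is the same size as type extFloat80_t and contains at least
+these two fields (not necessarily in this order):
+
+uint16_t signExp;
+uint64_t signif;
+
+Field signExp contains the sign and exponent of the floating-point value, with
+the sign in the most significant bit (bit 15) and the encoded exponent in the
+other 15 bits.
+Field signif is the complete 64-bit significand of the floating-point value.
+
+4.3. Supported Floating-Point Functions
+
+SoftFloat implements these arithmetic operations for its floating-point types:
+
+for all but extFloat80_t, the fused multiply-add operation defined by the
+IEEE Standard;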
+
+The following operations required by the 2008 IEEE Floating-Point Standard are
+not supported in SoftFloat
+
+4.4. Non-canonical Representations in extFloat80_t
+
+Because the 80-bit double-extended-precision format, extFloat80_t, stores an
+explicit leading significand bit, many finite floating-point numbers are
+encodable in this type in multiple equivalent forms.
+Of these multiple encodings, there is always a unique one with the least
+encoded exponent value, and this encoding is considered the canonical
+representation of the floating-point number.
+Any other equivalent representations (having a higher encoded exponent value)
+are non-canonical.
+For a value in the subnormal range (including zero), the canonical
+representation always has an encoded exponent of zero and a leading significand
+bit of 0.
+For finite values outside the subnormal range, the canonical representation
+always has a nonzero encoded exponent and a leading significand bit of 1.
+
+For an infinity or NaN, the leading significand bit is similarly expected to
+be 1.
+Hence, to be canonical, a value of type extFloat80_t must have a leading
+significand bit of 1, unless the value is subnormal or zero, in which case the
+leading significand bit is 0 and the encoded exponent is also zero.
+
+SoftFloat’s functions are not guaranteed to operate as expected when inputs of
+type extFloat80_t are non-canonical.
+Assuming all of a function’s extFloat80_t inputs (if any) are canonical,
+function outputs of type extFloat80_t will always be canonical.
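The canonical-form rule can be illustrated with a small check. This is only a sketch: `struct demo_ext80` and `demo_ext80_is_canonical` are hypothetical names, not part of SoftFloat. It assumes the usual 80-bit layout, with the sign in bit 15 of `signExp`, a 15-bit encoded exponent, and the explicit leading significand bit in bit 63 of `signif`. It also assumes that a canonical value has a leading bit of 1 unless it is zero or subnormal.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct extFloat80M: the same two fields,
   in an arbitrary order. */
struct demo_ext80 {
    uint16_t signExp;   /* sign in bit 15, encoded exponent in bits 14..0 */
    uint64_t signif;    /* full 64-bit significand, leading bit is bit 63 */
};

/* Returns true if the encoding satisfies the assumed canonical rule. */
static bool demo_ext80_is_canonical(const struct demo_ext80 *x)
{
    uint16_t exp = (uint16_t)(x->signExp & 0x7FFF);
    bool leadingBit = (x->signif >> 63) != 0;

    if (exp == 0) {
        /* Zero or subnormal: the leading significand bit must be 0. */
        return !leadingBit;
    }
    /* Normal numbers, infinities, and NaNs: the leading bit must be 1. */
    return leadingBit;
}
```

A caller that builds extFloat80_t values by hand could run a check of this kind before passing them to SoftFloat, since the library makes no guarantee for non-canonical inputs.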
+
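+4.5. Conventions for Passing Arguments and Results
+
+Values that are at most 64 bits in size (that is, not the 80-bit or 128-bit
+floating-point formats) are in all cases passed as function arguments by value.
+Likewise, when an output of a function is no more than 64 bits, it is always
+returned directly as the function result.
+Thus, for example, the SoftFloat function for adding two 64-bit floating-point
+values has this simple signature:
+
+float64_t f64_add( float64_t, float64_t );
+
+The story is more complex when function inputs and outputs are 80-bit and
+128-bit floating-point.
+For these types, SoftFloat always provides a function that passes these larger
+values into or out of the function indirectly, via pointers.
+For example, for adding two 128-bit floating-point values, SoftFloat supplies
+this function:
+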
+void f128M_add( const float128_t *, const float128_t *, float128_t * );
+
+The first two arguments point to the values to be added, and the last argument
+points to the location where the sum will be stored.
+The M in the name f128M_add is mnemonic for the fact that the 128-bit inputs
+and outputs are “in memory”, pointed to by pointer arguments.
+
+All ports of SoftFloat implement these pass-by-pointer functions for types
+extFloat80_t and float128_t.
+At the same time, SoftFloat ports may also implement alternate versions of
+these same functions that pass extFloat80_t and float128_t by value, like the
+smaller formats.
+Thus, besides the function with name f128M_add shown above, a SoftFloat port
+may also supply an equivalent function with this signature:
+
+float128_t f128_add( float128_t, float128_t );
+
+
+
+
+As a general rule, on computers where the machine word size is 32 bits or
+smaller, only the pointer-based functions (f128M_add) are provided for types
+extFloat80_t and float128_t, because passing such large types directly can have
+significant extra cost.
+On computers where the word size is 64 bits or larger, both function versions
+(f128M_add and f128_add) are provided, because the cost of passing by value is
+then more reasonable.
+Applications that must be portable across both classes of computers must use
+the pointer-based functions, as these are always implemented.
+However, if it is known that SoftFloat includes the by-value functions for all
+platforms of interest, programmers can use whichever version they prefer.
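The two calling styles can be compared in a self-contained sketch. Everything here is hypothetical: `demo_wide_t` is a 128-bit stand-in type (not SoftFloat's float128_t), and the arithmetic is plain 128-bit integer addition, used only so the functions have something to compute. The point is the shape of the interfaces, with the by-value form written as a thin wrapper over the pointer-based form.

```c
#include <stdint.h>

/* Hypothetical 128-bit stand-in type; it is NOT SoftFloat's float128_t. */
typedef struct { uint64_t lo, hi; } demo_wide_t;

/* Pointer-based style: operands and result live in memory.
   The body is a placeholder (128-bit integer addition). */
static void demo_wideM_add(const demo_wide_t *aPtr, const demo_wide_t *bPtr,
                           demo_wide_t *destPtr)
{
    uint64_t lo = aPtr->lo + bPtr->lo;
    uint64_t carry = (lo < aPtr->lo);       /* 1 if the low half wrapped */
    uint64_t hi = aPtr->hi + bPtr->hi + carry;
    destPtr->lo = lo;                       /* written last, so destPtr may */
    destPtr->hi = hi;                       /* alias aPtr or bPtr safely    */
}

/* By-value style, layered on the pointer-based function. */
static demo_wide_t demo_wide_add(demo_wide_t a, demo_wide_t b)
{
    demo_wide_t z;
    demo_wideM_add(&a, &b, &z);
    return z;
}
```

Code meant to be portable across both classes of machines would call only the pointer-based form, mirroring the advice above.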
+
+5. Reserved Names
+
+In addition to the variables and functions documented here, SoftFloat defines
+some symbol names for its own private use.
+These private names always begin with the prefix ‘softfloat_’.
+When a program includes header softfloat.h or links with the SoftFloat library,
+all names with prefix ‘softfloat_’ are reserved for possible use by SoftFloat.
+Applications that use SoftFloat should not define their own names with this
+prefix, and should reference only such names as are documented.
+
+6. Mode Variables
+
+The following global variables control rounding mode, underflow detection, and
+the 80-bit extended format’s rounding precision:
+
+softfloat_roundingMode
+softfloat_detectTininess
+extF80_roundingPrecision
+
+These mode variables are covered in the next several subsections.
+For some SoftFloat ports, these variables may be per-thread (declared
+thread_local), meaning that different execution threads have their own separate
+copies of the variables.
+
+
+6.1. Rounding Mode
+
+All five rounding modes defined by the 2008 IEEE Floating-Point Standard are
+implemented for all operations that require rounding.
+Some ports of SoftFloat may also implement the round-to-odd mode.
+
+The rounding mode is selected by the global variable
+
+uint_fast8_t softfloat_roundingMode;
+
+This variable may be set to one of the values
+
+softfloat_round_near_even      round to nearest, with ties to even
+softfloat_round_near_maxMag    round to nearest, with ties to maximum
+                               magnitude (away from zero)
+softfloat_round_minMag         round to minimum magnitude (toward zero)
+softfloat_round_min            round to minimum (down)
+softfloat_round_max            round to maximum (up)
+softfloat_round_odd            round to odd (jamming), if supported by the
+                               SoftFloat port
+
+Variable softfloat_roundingMode is initialized to softfloat_round_near_even.
+
+
+
+When softfloat_round_odd is the rounding mode for a function that rounds to an
+integer value (either conversion to an integer format or a ‘roundToInt’
+function), if the input is not already an integer, the rounded result is the
+closest odd integer.
+For other operations, this rounding mode acts as though the floating-point
+result is first rounded to minimum magnitude, the same as
+softfloat_round_minMag, and then, if the result is inexact, the
+least-significant bit of the result is set to 1.
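Round-to-odd ("jamming") on a bare significand can be sketched in a few lines. The helper below is hypothetical and is not claimed to match any SoftFloat internal routine. It assumes the value is an unsigned magnitude held in a 64-bit integer, so rounding to minimum magnitude is plain truncation.

```c
#include <stdint.h>

/* Drops the low `shift` bits of `sig` with round-to-odd: truncate, then
   force the least-significant kept bit to 1 if any dropped bit was
   nonzero (that is, if the truncation was inexact). */
static uint64_t demo_shift_right_round_odd(uint64_t sig, unsigned shift)
{
    if (shift == 0) return sig;             /* nothing is dropped */
    if (shift >= 64) return sig != 0;       /* everything is dropped */
    uint64_t kept = sig >> shift;
    uint64_t dropped = sig & ((UINT64_C(1) << shift) - 1);
    return kept | (dropped != 0);
}
```

For instance, dropping two bits from binary 10010 truncates to 100 and then sets the low bit, giving 101, while binary 10000 is exact and stays 100.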
+
+6.2. Underflow Detection
+
+In the terminology of the IEEE Standard, SoftFloat can detect tininess for
+underflow either before or after rounding.
+The choice is made by the global variable
+
+uint_fast8_t softfloat_detectTininess;
+
+which can be set to either
+
+softfloat_tininess_beforeRounding
+softfloat_tininess_afterRounding
+
+Detecting tininess after rounding is usually better because it results in fewer
+spurious underflow signals.
+The other option is provided for compatibility with some systems.
+Like most systems (and as required by the newer 2008 IEEE Standard), SoftFloat
+always detects loss of accuracy for underflow as an inexact result.
+
+6.3. Rounding Precision for the 80-Bit Extended Format
+
+For extFloat80_t only, the rounding precision of the basic arithmetic
+operations is controlled by the global variable
+
+uint_fast8_t extF80_roundingPrecision;
+
+The operations affected are:
+
+extF80_add
+extF80_sub
+extF80_mul
+extF80_div
+extF80_sqrt
+
+When extF80_roundingPrecision is set to its default value of 80, these
+operations are rounded to the full precision of the 80-bit
+double-extended-precision format, the same as for other formats.
+Setting extF80_roundingPrecision to 32 or to 64 causes the operations listed
+to be rounded to 32-bit precision (equivalent to float32_t) or to 64-bit
+precision (equivalent to float64_t), respectively.
+When rounding to reduced precision, additional bits in the result significand
+beyond the rounding point are set to zero.
+The consequences of setting extF80_roundingPrecision to a value other than 32,
+64, or 80 are not specified.
+Operations other than the ones listed above are not affected by
+extF80_roundingPrecision.
+
+
+
+7. Exceptions and Exception Flags
+
+All five exception flags required by the IEEE Floating-Point Standard are
+implemented.
+Each flag is stored as a separate bit in the global variable
+
+uint_fast8_t softfloat_exceptionFlags;
+
+The positions of the exception flag bits within this variable are determined by
+the bit masks
+
+softfloat_flag_inexact
+softfloat_flag_underflow
+softfloat_flag_overflow
+softfloat_flag_infinite
+softfloat_flag_invalid
+
+Variable softfloat_exceptionFlags is initialized to all zeros, meaning no
+exceptions.
+
+
+
+For some SoftFloat ports, softfloat_exceptionFlags may be per-thread (declared
+thread_local), meaning that different execution threads have their own separate
+instances of it.
+
+An individual exception flag can be cleared with the statement
+
+softfloat_exceptionFlags &= ~softfloat_flag_<exception>;
+
+where <exception> is the appropriate name.
+To raise a floating-point exception, function softfloat_raiseFlags should
+normally be used.
+
+
+When SoftFloat detects an exception other than inexact, it calls
+softfloat_raiseFlags.
+The default version of this function simply raises the corresponding exception
+flags.
+Particular ports of SoftFloat may support alternate behavior, such as exception
+traps, by modifying the default softfloat_raiseFlags.
+A program may also supply its own softfloat_raiseFlags function to override the
+one from the SoftFloat library.
+
+Because inexact results occur frequently under most circumstances (and thus are
+hardly exceptional), SoftFloat does not ordinarily call softfloat_raiseFlags
+for inexact exceptions.
+It does always raise the inexact exception flag as required.
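The flag handling described in this section can be mimicked without the library. In the sketch below, the mask values, `demo_exceptionFlags`, and both functions are hypothetical stand-ins. A real port defines its own mask values and the actual softfloat_exceptionFlags variable in softfloat.h, and its softfloat_raiseFlags may do more than set bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask values; a real port chooses its own. */
enum {
    demo_flag_inexact   = 1,
    demo_flag_underflow = 2,
    demo_flag_overflow  = 4,
    demo_flag_infinite  = 8,
    demo_flag_invalid   = 16
};

/* Hypothetical stand-in for softfloat_exceptionFlags. */
static uint_fast8_t demo_exceptionFlags = 0;

/* Mirrors the default behavior described above: just set the flags. */
static void demo_raiseFlags(uint_fast8_t exceptions)
{
    demo_exceptionFlags |= exceptions;
}

/* Reports whether one flag was set, then clears only that flag. */
static bool demo_test_and_clear(uint_fast8_t flag)
{
    bool wasSet = (demo_exceptionFlags & flag) != 0;
    demo_exceptionFlags &= (uint_fast8_t)~flag;
    return wasSet;
}
```

Because the flags accumulate until cleared, a typical pattern is to clear them, run a group of operations, and then test the flags of interest once at the end.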
+
+8. Function Details
+
+In this section, <float> appears in function names as a substitute for one of
+these abbreviations:
+
+f16        indicates float16_t, passed by value
+f32        indicates float32_t, passed by value
+f64        indicates float64_t, passed by value
+extF80M    indicates extFloat80_t, passed indirectly via pointers
+extF80     indicates extFloat80_t, passed by value
+f128M      indicates float128_t, passed indirectly via pointers
+f128       indicates float128_t, passed by value
+
+The circumstances under which values of floating-point types extFloat80_t and
+float128_t may be passed either by value or indirectly via pointers were
+discussed earlier in section 4.5, Conventions for Passing Arguments and
+Results.
+
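+8.1. Conversions from Integer to Floating-Point
+
+All conversions from a 32-bit or 64-bit integer, signed or unsigned, to a
+floating-point format are supported.
+Functions performing these conversions have these names:
+
+ui32_to_<float>
+ui64_to_<float>
+i32_to_<float>
+i64_to_<float>
+
+Conversions from 32-bit integers to 64-bit double-precision and larger formats
+are always exact, and likewise conversions from 64-bit integers to 80-bit
+double-extended-precision and 128-bit quadruple-precision are also always
+exact.
+
+Each conversion function takes one input of the appropriate type and generates
+one output.
+The following illustrates the signatures of these functions in cases when the
+floating-point result is passed either by value or via pointers:
+
+float64_t i32_to_f64( int32_t a );
+
+void i32_to_f128M( int32_t a, float128_t *destPtr );
+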
+8.2. Conversions from Floating-Point to Integer
+
+Conversions from a floating-point format to a 32-bit or 64-bit integer, signed
+or unsigned, are supported with these functions:
+
+<float>_to_ui32
+<float>_to_ui64
+<float>_to_i32
+<float>_to_i64
+
+The functions have signatures as follows, depending on whether the
+floating-point input is passed by value or via pointers:
+
+int_fast32_t f64_to_i32( float64_t a, uint_fast8_t roundingMode, bool exact );
+
+int_fast32_t
+ f128M_to_i32( const float128_t *aPtr, uint_fast8_t roundingMode, bool exact );
+
+The roundingMode argument specifies the rounding mode for the conversion.
+The variable that usually indicates rounding mode, softfloat_roundingMode, is
+ignored.
+Argument exact determines whether the inexact exception flag is raised if the
+conversion is not exact.
+If exact is true, the inexact flag may be raised; otherwise, it will not be,
+even if the conversion is inexact.
+
+A conversion from floating-point to integer format raises the invalid
+exception if the source value cannot be rounded to a representable integer of
+the desired size (32 or 64 bits).
+In such circumstances, the integer result returned is determined by the
+particular port of SoftFloat, although typically this value will be either the
+maximum or minimum value of the integer format.
+The functions that convert to integer types never raise the floating-point
+overflow exception.
+
+Because languages such as C require that conversions to integers be rounded
+toward zero, the following functions are provided for improved speed and
+convenience:
+
+<float>_to_ui32_r_minMag
+<float>_to_ui64_r_minMag
+<float>_to_i32_r_minMag
+<float>_to_i64_r_minMag
+
+These functions round only toward zero (to minimum magnitude).
+The signatures for these functions are the same as above without the redundant
+roundingMode argument:
+
+int_fast32_t f64_to_i32_r_minMag( float64_t a, bool exact );
+
+int_fast32_t f128M_to_i32_r_minMag( const float128_t *aPtr, bool exact );
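The behavior described for the toward-zero conversions can be approximated with native C types. This is a sketch under stated assumptions: `double` stands in for float64_t, the flag names and the `sketch_flags` variable are hypothetical, and the value returned for an invalid conversion (INT32_MAX or INT32_MIN) is a choice made here, since the text leaves it to each port.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag masks and flag variable for this sketch only. */
enum { sketch_inexact = 1, sketch_invalid = 16 };
static unsigned sketch_flags = 0;

/* Converts `a` to a 32-bit signed integer, rounding toward zero.
   NaNs and out-of-range inputs raise "invalid"; overflow is never
   raised. The inexact flag is raised only when `exact` is true. */
static int32_t sketch_f64_to_i32_r_minMag(double a, bool exact)
{
    /* Written so that a NaN fails the test and lands in this branch. */
    if (!(a > -2147483649.0 && a < 2147483648.0)) {
        sketch_flags |= sketch_invalid;
        return (a < 0) ? INT32_MIN : INT32_MAX;
    }
    int32_t z = (int32_t)a;         /* C casts truncate toward zero */
    if (exact && (double)z != a) sketch_flags |= sketch_inexact;
    return z;
}
```

Passing `false` for `exact` gives the same integer result but leaves the inexact flag alone, which suits callers that only care about the converted value.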
+
+8.3. Conversions Among Floating-Point Types
+
+Conversions between floating-point formats are done by functions with these
+names:
+
+<float>_to_<float>
+
+All combinations of source and result type are supported where the source and
+result are different formats.
+There are four different styles of signature for these functions, depending on
+whether the input and the output floating-point values are passed by value or
+via pointers:
+
+float32_t f64_to_f32( float64_t a );
+
+float32_t f128M_to_f32( const float128_t *aPtr );
+
+void f32_to_f128M( float32_t a, float128_t *destPtr );
+
+void extF80M_to_f128M( const extFloat80_t *aPtr, float128_t *destPtr );
+
+Conversions from a smaller to a larger floating-point format are always exact
+and so require no rounding.
+
+8.4. Basic Arithmetic Functions
+
+The following basic arithmetic functions are provided:
+
+<float>_add
+<float>_sub
+<float>_mul
+<float>_div
+<float>_sqrt
+
+Each floating-point operation takes two operands, except for sqrt (square
+root) which takes only one.
+The operands and result are all of the same floating-point format.
+Signatures for these functions take the following forms:
+
+float64_t f64_add( float64_t a, float64_t b );
+
+void
+ f128M_add(
+     const float128_t *aPtr, const float128_t *bPtr, float128_t *destPtr );
+
+float64_t f64_sqrt( float64_t a );
+
+void f128M_sqrt( const float128_t *aPtr, float128_t *destPtr );
+
+When floating-point values are passed indirectly through pointers, arguments
+aPtr and bPtr point to the input operands, and the last argument, destPtr,
+points to the location where the result is stored.
+
+Rounding of the 80-bit double-extended-precision (extFloat80_t) functions is
+affected by variable extF80_roundingPrecision, as explained earlier in
+section 6.3, Rounding Precision for the 80-Bit Extended Format.
+
+8.5. Fused Multiply-Add Functions
+
+The 2008 version of the IEEE Floating-Point Standard defines a fused
+multiply-add operation that does a combined multiplication and addition
+with only a single rounding.
+SoftFloat implements fused multiply-add with functions
+
+<float>_mulAdd
+
+Unlike other operations, fused multiply-add is not supported for the 80-bit
+double-extended-precision format, extFloat80_t.
+
+Depending on whether floating-point values are passed by value or via pointers,
+the fused multiply-add functions have signatures of these forms:
+
+float64_t f64_mulAdd( float64_t a, float64_t b, float64_t c );
+
+void
+ f128M_mulAdd(
+     const float128_t *aPtr,
+     const float128_t *bPtr,
+     const float128_t *cPtr,
+     float128_t *destPtr
+ );
+
+The functions compute (a × b) + c with a single rounding.
+When floating-point values are passed indirectly through pointers, arguments
+aPtr, bPtr, and cPtr point to operands a, b, and c respectively, and destPtr
+points to the location where the result is stored.
+
+If one of the multiplication operands a and b is infinite and the other is
+zero, these functions raise the invalid exception even if operand c is a quiet
+NaN.
+
+8.6. Remainder Functions
+
+For each format, SoftFloat implements the remainder operation defined by the
+IEEE Floating-Point Standard.
+The remainder functions have names
+
+<float>_rem
+
+Each remainder operation takes two floating-point operands of the same format
+and returns a result in the same format.
+Depending on whether floating-point values are passed by value or via pointers,
+the remainder functions have signatures of these forms:
+
+float64_t f64_rem( float64_t a, float64_t b );
+
+void
+ f128M_rem(
+     const float128_t *aPtr, const float128_t *bPtr, float128_t *destPtr );
+
+When floating-point values are passed indirectly through pointers, arguments
+aPtr and bPtr point to operands a and b respectively, and destPtr points to the
+location where the result is stored.
+
+The IEEE Standard remainder operation computes the value a − n × b, where n
+is the integer closest to a ÷ b.
+If a ÷ b is exactly halfway between two integers, n is the even integer
+closest to a ÷ b.
+The IEEE Standard’s remainder operation is always exact and so requires no
+rounding.
+
+Depending on the relative magnitudes of the operands, the remainder
+functions can take considerably longer to execute than the other SoftFloat
+functions.
+This is an inherent characteristic of the remainder operation itself and is not
+a flaw in the SoftFloat implementation.
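The IEEE remainder is a − n × b, with n the integer nearest to a ÷ b and ties going to the even n. That rule can be tried out on plain integers. The function below is a hypothetical illustration restricted to integer operands; it is not how SoftFloat computes floating-point remainders. It assumes b is nonzero and that the operands are small enough for 2 × |r| not to overflow.

```c
/* IEEE-style remainder restricted to integer operands.
   Returns a - n*b, with n the integer nearest a/b, ties going to even n. */
static long long demo_int_remainder(long long a, long long b)
{
    long long q = a / b;        /* quotient truncated toward zero */
    long long r = a % b;        /* a - q*b; zero or the same sign as a */
    long long absR = r < 0 ? -r : r;
    long long absB = b < 0 ? -b : b;

    /* Step n one unit away from zero when that is nearer to a/b, or
       when it is a tie and the truncated quotient is odd. */
    if (2 * absR > absB || (2 * absR == absB && (q % 2) != 0)) {
        if ((r < 0) == (b < 0)) {
            r -= b;             /* a/b is positive: n = q + 1 */
        } else {
            r += b;             /* a/b is negative: n = q - 1 */
        }
    }
    return r;
}
```

Unlike the C `%` operator, the result can have either sign: 7 and 2 give −1, because 3.5 rounds to the even quotient 4.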
+
+8.7. Round-to-Integer Functions
+
+For each format, SoftFloat implements the round-to-integer operation specified
+by the IEEE Floating-Point Standard.
+These functions are named
+
+<float>_roundToInt
+
+Each round-to-integer operation takes a single floating-point operand.
+This operand is rounded to an integer according to a specified rounding mode,
+and the resulting integer value is returned in the same floating-point format.
+(Note that the result is not an integer type.)
+
+The signatures of the round-to-integer functions are similar to those for
+conversions to an integer type:
+
+float64_t f64_roundToInt( float64_t a, uint_fast8_t roundingMode, bool exact );
+
+void
+ f128M_roundToInt(
+     const float128_t *aPtr,
+     uint_fast8_t roundingMode,
+     bool exact,
+     float128_t *destPtr
+ );
+
+When floating-point values are passed indirectly through pointers, aPtr points
+to the input operand and destPtr points to the location where the result is
+stored.
+
+The roundingMode argument specifies the rounding mode to apply.
+The variable that usually indicates rounding mode, softfloat_roundingMode, is
+ignored.
+Argument exact determines whether the inexact exception flag is raised if the
+conversion is not exact.
+If exact is true, the inexact flag may be raised; otherwise, it will not be,
+even if the conversion is inexact.
+
+8.8. Comparison Functions
+
+For each format, the following floating-point comparison functions are
+provided:
+
+<float>_eq
+<float>_le
+<float>_lt
+
+Each comparison takes two operands of the same type and returns a Boolean.
+The abbreviation eq stands for “equal” (=); le stands for “less than or equal”
+(≤); and lt stands for “less than” (<).
+Depending on whether the floating-point operands are passed by value or via
+pointers, the comparison functions have signatures of these forms:
+
+bool f64_eq( float64_t a, float64_t b );
+
+bool f128M_eq( const float128_t *aPtr, const float128_t *bPtr );
+
+The usual greater-than (>), greater-than-or-equal (≥), and not-equal
+(≠) comparisons are easily obtained from the functions provided.
+The not-equal function is just the logical complement of the equal function.
+The greater-than-or-equal function is identical to the less-than-or-equal
+function with the arguments in reverse order, and likewise the greater-than
+function is identical to the less-than function with the arguments reversed.
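The derivations just described can be written out directly. In this sketch the three base comparisons are hypothetical stand-ins built on the native `double` type. With SoftFloat they would be the library's `<float>_eq`, `<float>_le`, and `<float>_lt` functions instead.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for <float>_eq, <float>_le, and <float>_lt. */
static bool demo_eq(double a, double b) { return a == b; }
static bool demo_le(double a, double b) { return a <= b; }
static bool demo_lt(double a, double b) { return a < b; }

/* not-equal: the logical complement of equal */
static bool demo_ne(double a, double b) { return !demo_eq(a, b); }

/* greater-than-or-equal: less-than-or-equal with arguments reversed */
static bool demo_ge(double a, double b) { return demo_le(b, a); }

/* greater-than: less-than with arguments reversed */
static bool demo_gt(double a, double b) { return demo_lt(b, a); }
```

Reversing the arguments, rather than negating the opposite comparison, keeps the results right for NaN operands: greater-than must be false when either operand is a NaN, whereas the complement of less-than-or-equal would be true.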
+
+The IEEE Floating-Point Standard specifies that the less-than-or-equal and
+less-than comparisons by default raise the invalid exception if either
+operand is any kind of NaN.
+Equality comparisons, on the other hand, are defined by default to raise the
+invalid exception only for signaling NaNs, not quiet NaNs.
+For completeness, SoftFloat provides these complementary functions:
+
+<float>_eq_signaling
+<float>_le_quiet
+<float>_lt_quiet
+
+The signaling equality comparisons are identical to the default equality
+comparisons except that the invalid exception is raised for any NaN input, not
+just for signaling NaNs.
+Similarly, the quiet comparison functions are identical to their default
+counterparts except that the invalid exception is not raised for quiet NaNs.
+
+8.9. Signaling NaN Test Functions
+
+Functions for testing whether a floating-point value is a signaling NaN are
+provided with these names:
+
+<float>_isSignalingNaN
+
+The functions take one floating-point operand and return a Boolean indicating
+whether the operand is a signaling NaN.
+Accordingly, the functions have the forms
+
+bool f64_isSignalingNaN( float64_t a );
+
+bool f128M_isSignalingNaN( const float128_t *aPtr );
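A test of this kind can be sketched on a raw 64-bit pattern. The helper name is hypothetical. The sketch assumes the widely used convention in which the top fraction bit (bit 51 of a binary64 value) is 1 for quiet NaNs and 0 for signaling NaNs. A particular SoftFloat port may be specialized for a different convention.

```c
#include <stdbool.h>
#include <stdint.h>

/* Tests a raw binary64 bit pattern for a signaling NaN, assuming the
   quiet bit is bit 51 and is 0 for signaling NaNs. */
static bool demo_bits64_is_signaling_nan(uint64_t bits)
{
    uint64_t expField = (bits >> 52) & 0x7FF;
    uint64_t fraction = bits & UINT64_C(0x000FFFFFFFFFFFFF);
    bool quietBit = ((bits >> 51) & 1) != 0;

    /* NaN: exponent all ones and a nonzero fraction.
       Signaling: additionally, the quiet bit is clear. */
    return expField == 0x7FF && fraction != 0 && !quietBit;
}
```

The nonzero-fraction test matters: an all-ones exponent with a zero fraction is an infinity, not a NaN.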
+
+8.10. Raise-Exception Function
+
+SoftFloat provides a single function for raising floating-point exceptions:
+
+void softfloat_raiseFlags( uint_fast8_t exceptions );
+
+The exceptions argument is a mask indicating the set of exceptions to raise.
+(See earlier section 7, Exceptions and Exception Flags.)
+In addition to setting the specified exception flags in variable
+softfloat_exceptionFlags, the softfloat_raiseFlags function may cause a trap or
+abort appropriate for the current system.
+
+
+
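+9. Changes from SoftFloat Release 2
+
+Apart from a change in the legal use license, Release 3 of SoftFloat
+introduced numerous technical differences compared to earlier releases.
+
+9.1. Name Changes
+
+The most obvious and pervasive difference compared to Release 2 is that the
+names of most functions and variables have changed, even when the behavior has
+not:
+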
+old name, Release 2:              new name, Release 3:
+float32                           float32_t
+float64                           float64_t
+floatx80                          extFloat80_t
+float128                          float128_t
+float_rounding_mode               softfloat_roundingMode
+float_round_nearest_even          softfloat_round_near_even
+float_round_to_zero               softfloat_round_minMag
+float_round_down                  softfloat_round_min
+float_round_up                    softfloat_round_max
+float_detect_tininess             softfloat_detectTininess
+float_tininess_before_rounding    softfloat_tininess_beforeRounding
+float_tininess_after_rounding     softfloat_tininess_afterRounding
+floatx80_rounding_precision       extF80_roundingPrecision
+float_exception_flags             softfloat_exceptionFlags
+float_flag_inexact                softfloat_flag_inexact
+float_flag_underflow              softfloat_flag_underflow
+float_flag_overflow               softfloat_flag_overflow
+float_flag_divbyzero              softfloat_flag_infinite
+float_flag_invalid                softfloat_flag_invalid
+float_raise                       softfloat_raiseFlags
+
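+Furthermore, Release 3 adopted the following new abbreviations for function
+names:
+
+used in names in Release 2:       used in names in Release 3:
+int32                             i32
+int64                             i64
+float32                           f32
+float64                           f64
+floatx80                          extF80
+float128                          f128
+
+Thus, for example, the function to add two 32-bit floating-point numbers,
+previously called float32_add in Release 2, is now f32_add.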
+Lastly, there have been a few other changes to function names:
+
+used in names       used in names
+in Release 2:       in Release 3:     relevant functions:
+_round_to_zero      _r_minMag         conversions from floating-point to
+                                      integer (section 8.2)
+round_to_int        roundToInt        round-to-integer functions (section 8.7)
+is_signaling_nan    isSignalingNaN    signaling NaN test functions
+                                      (section 8.9)
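+
+9.2. Changes to Function Arguments
+
+Besides simple name changes, some operations were given a different interface
+in Release 3 than they had in Release 2.
+
+Since Release 3, integer arguments and results of functions have standard types
+from <stdint.h>, such as uint32_t, whereas previously their types could be
+defined differently for each port of SoftFloat, usually using traditional C
+types such as unsigned int.
+Likewise, functions in Release 3 and later pass Boolean arguments and results
+with standard type bool from <stdbool.h>, whereas previously these were again
+passed as a port-specific type (usually int).
+
+As explained earlier in section 4.5, Conventions for Passing Arguments and
+Results, SoftFloat functions in Release 3 and later may pass 80-bit and
+128-bit floating-point values through pointers.
+
+Functions that round to an integer have additional roundingMode and exact
+arguments that they did not have in Release 2.
+For Release 2, the rounding mode, when needed, was taken from the same global
+variable that affects the basic arithmetic operations (now called
+softfloat_roundingMode but previously known as float_rounding_mode).
+Also, for Release 2, the inexact exception was always raised when the input was
+not an exact integer value and the invalid exception was not raised.
+Release 2’s behavior can be emulated in Release 3 by passing
+softfloat_roundingMode for argument roundingMode and true for argument exact.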
+
+9.3. Added Capabilities
+
+With Release 3, some new features have been added that were not present in
+Release 2:
+
+A port of SoftFloat can now define any of the floating-point types float32_t,
+float64_t, extFloat80_t, and float128_t as aliases for C’s standard
+floating-point types float, double, and long double, using either #define or
+typedef.
+This potential convenience was not supported under Release 2.
+(Note, however, that there may be a performance cost to defining
+SoftFloat’s floating-point types this way, depending on the platform and
+the applications using SoftFloat.
+Ports of SoftFloat may choose to forgo the convenience in favor of better
+speed.)
+
+The 16-bit half-precision format, float16_t, is supported.
+
+Fused multiply-add functions have been added for all floating-point formats
+except 80-bit double-extended-precision, extFloat80_t.
+
+New rounding modes are supported: softfloat_round_near_maxMag (round to
+nearest, with ties to maximum magnitude, away from zero), and, as of a later
+Release 3 update, optional softfloat_round_odd (round to odd, also known as
+jamming).
+
+
+
+
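+9.5. New Organization as a Library
+
+Starting with Release 3, SoftFloat now builds as a library.
+
+9.6. Optimization Gains (and Losses)
+
+Individual SoftFloat functions have been variously improved in Release 3
+compared to earlier releases.
+For functions operating on the larger 80-bit and 128-bit formats, extFloat80_t
+and float128_t, code size has also generally been reduced.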
+
+However, because
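+
+10. Future Directions
+
+The following improvements are anticipated for future releases of SoftFloat:
+
+consistent, defined behavior for non-canonical representations of extended
+format extFloat80_t (discussed in section 4.4, Non-canonical Representations in
+extFloat80_t).
+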
+
+11. Contact Information
+
+At the time of this writing, the most up-to-date information about SoftFloat
+and the latest release can be found at the Web page
+http://www.jhauser.us/arithmetic/SoftFloat.html.