From f215e02bf85f68d3a6106c2a1f4f7f063f819064 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
+John R. Hauser
+This document gives information needed for compiling and/or porting Berkeley
+SoftFloat, a library of C functions implementing binary floating-point
+conforming to the IEEE Standard for Floating-Point Arithmetic.
+For basic documentation about SoftFloat refer to
+
+The source code for SoftFloat is intended to be relatively machine-independent
+and should be compilable with any ISO-Standard C compiler that also supports
+
+
+SoftFloat assumes the computer has an addressable byte size of either 8 or
+
+SoftFloat is written in C and is designed to work with other C code.
+The C compiler used must conform at a minimum to the 1989 ANSI standard for the
+C language (same as the 1990 ISO standard) and must in addition support basic
+arithmetic on
+
+The SoftFloat package was written by me,
+
+History of Berkeley SoftFloat, to Release 3e
+
+
+2018 January 20
+Release 3e (2018 January)
+
+
+
+
+
+
+odd
+(round to odd, also known as jamming) from 5 to 6.
+
+odd
when rounding to an
+integer value (either conversion to an integer format or a
+‘roundToInt
’ function).
+Previously, for those cases only, rounding mode odd
acted the same
+as rounding to minimum magnitude.
+Now all operations are rounded consistently.
+
+f16_to_ui64
might return a
+different integer than expected in the case that the floating-point operand is
+negative.
+
+Release 3d (2017 August)
+
+
+
+
+
+
+f64_sqrt
), the result
+could sometimes be off by f128_sqrt
was first reported by Alexei Sibidanov.)
+
+Release 3c (2017 February)
+
+
+
+
+
+
+odd
(round to odd, also known as
+jamming).
+
+Release 3b (2016 July)
+
+
+
+
+
+
+float16_t
).
+
+THREAD_LOCAL
to allow the floating-point
+state (modes and exception flags) to be made per-thread.
+
+make
command.
+
+Release 3a (2015 October)
+
+
+
+
+
+
+Release 3 (2015 February)
+
+
+
+
+
+
+uint32_t
and
+uint64_t
).
+
+near_maxMag
(round to
+nearest, with ties to maximum magnitude, away from zero).
+
+timesoftfloat
program (now part of the Berkeley
+TestFloat package).
+
+Release 2c (2015 January)
+
+
+
+
+
+
+Release 2b (2002 May)
+
+
+
+
+
+
+Release 2a (1998 December)
+
+
+
+
+
+
+int64
) and all supported floating-point formats.
+
+float32_sqrt
that caused the result sometimes to be off by
+Release 2 (1997 June)
+
+
+
+
+
+
+bits64
) version, adding the
+floatx80
and float128
formats.
+
+bits32
and a bits64
version.
+Renamed environment.h
to milieu.h
to avoid confusion
+with environment variables.
+
+float64_round_to_int
often to
+round the wrong way in nearest/even mode when the operand was between
+2^20 and 2^21 and halfway between two integers.
+
+Release 1a (1996 July)
+
+
+
+
+
+
+float_detect_tininess
variable to control whether
+tininess is detected before or after rounding.
+
+Release 1 (1996 July)
+
+
+
+
+
+
+
+
diff --git a/src/libs/softfloat-3e/doc/SoftFloat-source.html b/src/libs/softfloat-3e/doc/SoftFloat-source.html
new file mode 100644
index 00000000..4ff9d4c4
--- /dev/null
+++ b/src/libs/softfloat-3e/doc/SoftFloat-source.html
@@ -0,0 +1,686 @@
+
+
+
+
+Berkeley SoftFloat Release 3e: Source Documentation
+
+
+2018 January 20
+Contents
+
+
+
+
+
+
+
+
+1. Introduction
+2. Limitations
+3. Acknowledgments and License
+4. SoftFloat Package Directory Structure
+5. Issues for Porting SoftFloat to a New Target
+
+
+5.1. Standard Headers <stdbool.h> and <stdint.h>
+5.2. Specializing Floating-Point Behavior
+5.3. Macros for Build Options
+5.4. Adapting a Template Target Directory
+
+5.5. Target-Specific Optimization of Primitive Functions
+
+6. Testing SoftFloat
+
+7. Providing SoftFloat as a Common Library for Applications
+
+8. Contact Information
+
+1. Introduction
+
+SoftFloat.html
gcc
) for several platforms.
+SoftFloat.html
2. Limitations
+
+<stdbool.h>
and
+<stdint.h>
are required for defining standard Boolean and
+integer types.
+If these headers are not supplied with the C compiler, minimal substitutes must
+be provided.
+SoftFloat’s dependence on these headers is detailed later in
+<stdbool.h>
+and <stdint.h>
.
+3. Acknowledgments and License
+
+
+
+
+
+
+
+
+
+
+Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
+(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia,
+NVIDIA, Oracle, and Samsung.
+
+
+
+
+
+
+DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from
+ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA,
+Oracle, and Samsung.
+
+
+The following applies to the whole of SoftFloat
+Copyright 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The Regents of the
+University of California.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+Redistributions of source code must retain the above copyright notice, this
+list of conditions, and the following disclaimer.
+
+Redistributions in binary form must reproduce the above copyright notice, this
+list of conditions, and the following disclaimer in the documentation and/or
+other materials provided with the distribution.
+
+Neither the name of the University nor the names of its contributors may be
+used to endorse or promote products derived from this software without specific
+prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS “AS IS”,
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
+DISCLAIMED.
+IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+Because SoftFloat is targeted to multiple platforms, its source code is
+slightly scattered between target-specific and target-independent directories
+and files.
+The supplied directory structure is as follows:
+
+    doc
+    source
+        include
+        8086
+        8086-SSE
+        ARM-VFPv2
+        ARM-VFPv2-defaultNaN
+    build
+        template-FAST_INT64
+        template-not-FAST_INT64
+        Linux-386-GCC
+        Linux-386-SSE2-GCC
+        Linux-x86_64-GCC
+        Linux-ARM-VFPv2-GCC
+        Win32-MinGW
+        Win32-SSE2-MinGW
+        Win64-MinGW-w64
+
+The majority of the SoftFloat sources are provided in the source
+directory.
+The include
subdirectory contains several header files
+(unsurprisingly), while the other subdirectories of source
contain
+source files that specialize the floating-point behavior to match particular
+processor families:
+
+    8086
+        Intel’s older, 8087-derived floating-point, extended to all supported
+        floating-point types
+    8086-SSE
+        Intel’s x86 processors with Streaming SIMD Extensions (SSE) and later
+        compatible extensions, having 8087 behavior for 80-bit
+        double-extended-precision (extFloat80_t) and SSE behavior for
+        other floating-point types
+    ARM-VFPv2
+        ARM’s VFPv2 or later floating-point, with NaN payload propagation
+    ARM-VFPv2-defaultNaN
+        ARM’s VFPv2 or later floating-point, with the “default NaN”
+        option
+
+If other specializations are attempted, these would be expected to be other
+subdirectories of source alongside the ones listed above.
+Specialization is covered later, in
+The build
directory is intended to contain a subdirectory for each
+target platform for which a build of the SoftFloat library may be created.
+For each build target, the target’s subdirectory is where all derived
+object files and the completed SoftFloat library (typically
+softfloat.a
or libsoftfloat.a
) are created.
+The two template
subdirectories are not actual build targets but
+contain sample files for creating new target directories.
+(The meaning of FAST_INT64
will be explained later.)
+
+Ignoring the template
directories, the supplied target directories
+are intended to follow a naming system of
+<execution-environment>-<compiler>
<execution-environment>
Linux-386
Linux-386-SSE2
Linux-x86_64
Linux-ARM-VFPv2
Win32
,
+Win32-SSE2
Win64
, and
+<compiler>
GCC
,
+MinGW
, or MinGW-w64
+
+All of the supplied target directories are merely examples that may or may not
+be correct for compiling on any particular system.
+Despite requests, there are currently no plans to include and maintain in the
+SoftFloat package the build files needed for a great many users’
+compilation environments, which can span a huge range of operating systems,
+compilers, and other tools.
+
+As supplied, each target directory contains two files:
+
+    Makefile
+    platform.h
+
+The provided Makefile is written for GNU make.
+A build of SoftFloat for the specific target is begun by executing the
+make
command with the target directory as the current directory.
+A completely different build tool can be used if an appropriate
+Makefile
equivalent is created.
+
+
+
+The platform.h
header file exists to provide a location for
+additional C declarations specific to the build target.
+Every C source file of SoftFloat contains a #include
for
+platform.h
.
+In many cases, the contents of platform.h
can be as simple as one
+or two lines of code.
+At the other extreme, to get maximal performance from SoftFloat, it may be
+desirable to include in header platform.h
(directly or via
+#include
) declarations for numerous target-specific optimizations.
+Such possibilities are discussed in the next section, Issues for Porting
+SoftFloat to a New Target.
+If the target’s compiler or library has bugs or other shortcomings,
+workarounds for these issues may also be possible with target-specific
+declarations in platform.h
, avoiding the need to modify the main
+SoftFloat sources.
+
<stdbool.h>
and <stdint.h>
+The SoftFloat sources make use of standard headers
+<stdbool.h>
and <stdint.h>
, which have
+been part of the ISO C Standard Library since 1999.
+With any recent compiler, these standard headers are likely to be supported,
+even if the compiler does not claim complete conformance to the latest ISO C
+Standard.
+For older or nonstandard compilers, substitutes for
+<stdbool.h>
and <stdint.h>
may need to be
+created.
+SoftFloat depends on these names from <stdbool.h>
:
+
+    bool
+    true
+    false
+
+and on these names from <stdint.h>:
+
+    uint16_t
+    uint32_t
+    uint64_t
+    int32_t
+    int64_t
+    UINT64_C
+    INT64_C
+    uint_least8_t
+    uint_fast8_t
+    uint_fast16_t
+    uint_fast32_t
+    uint_fast64_t
+    int_fast8_t
+    int_fast16_t
+    int_fast32_t
+    int_fast64_t
+
+The IEEE Floating-Point Standard allows for some flexibility in a conforming
+implementation, particularly concerning NaNs.
+The SoftFloat source
directory is supplied with some
+specialization subdirectories containing possible definitions for this
+implementation-specific behavior.
+For example, the 8086
and 8086-SSE
+As provided, the build process for a target expects to involve exactly
+one specialization directory that defines all of these
+implementation-specific details for the target.
+A specialization directory such as 8086
is expected to contain a
+header file called specialize.h
, together with whatever other
+source files are needed to complete the specialization.
+
+A new build target may use an existing specialization, such as the ones
+provided by the 8086
and 8086-SSE
specialize.h
header file from any of the provided
+specialization subdirectories can be used as a model for what definitions are
+needed.
+
+The SoftFloat source files adapt the floating-point implementation according to
+several C preprocessor macros:
+
+    LITTLEENDIAN
+        Must be defined for little-endian machines; must not be defined for
+        big-endian machines.
+    INLINE
+        Specifies the sequence of tokens used to indicate that a C function
+        should be inlined.
+        If macro INLINE_LEVEL is defined with a value of 1 or higher, this
+        macro must be defined; otherwise, this macro is ignored and need not be
+        defined.
+        For compilers that conform to the C Standard’s rules for inline
+        functions, this macro can be defined as the single keyword inline.
+        For other compilers that follow a convention pre-dating the
+        standardization of inline, this macro may need to be defined to
+        extern inline.
+    THREAD_LOCAL
+        Can be defined to a sequence of tokens that, when appearing at the start
+        of a variable declaration, indicates to the C compiler that the variable
+        is per-thread, meaning that each execution thread gets its own separate
+        instance of the variable.
+        This macro is used in header softfloat.h in the declarations of
+        variables softfloat_roundingMode, softfloat_detectTininess,
+        extF80_roundingPrecision, and softfloat_exceptionFlags.
+        If macro THREAD_LOCAL is left undefined, these variables will
+        default to being ordinary global variables.
+        Depending on the compiler, possible valid definitions of this macro
+        include _Thread_local and __thread.
+    SOFTFLOAT_ROUND_ODD
+        Can be defined to enable support for optional rounding mode
+        softfloat_round_odd.
+    INLINE_LEVEL
+        Can be defined to an integer to determine the degree of inlining
+        requested of the compiler.
+        Larger numbers request that more inlining be done.
+        If this macro is not defined or is defined to a value less than 1
+        (zero or negative), no inlining is requested.
+        The maximum effective value is no higher than 5.
+        Defining this macro to a value greater than 5 is the same as defining it
+        to 5.
+    SOFTFLOAT_FAST_INT64
+        Can be defined to indicate that the build target’s implementation of
+        64-bit arithmetic is efficient.
+        For newer 64-bit processors, this macro should usually be defined.
+        For very small microprocessors whose buses and registers are 8-bit
+        or 16-bit in size, this macro should usually not be defined.
+        Whether this macro should be defined for a 32-bit processor may
+        depend on the target machine and the applications that will use
+        SoftFloat.
+    SOFTFLOAT_FAST_DIV32TO16
+        Can be defined to indicate that the target’s division operator
+        in C (written as /) is reasonably efficient for
+        dividing a 32-bit unsigned integer by a 16-bit
+        unsigned integer.
+        Setting this macro may affect the performance of function f16_div.
+    SOFTFLOAT_FAST_DIV64TO32
+        Can be defined to indicate that the target’s division operator
+        in C (written as /) is reasonably efficient for
+        dividing a 64-bit unsigned integer by a 32-bit
+        unsigned integer.
+        Setting this macro may affect the performance of division, remainder,
+        and square root operations other than f16_div.
+
+Following the usual custom, for all of these macros except INLINE, THREAD_LOCAL, and
+INLINE_LEVEL, the content of any definition is irrelevant;
+what matters is a macro’s effect on #ifdef directives.
+
+It is recommended that any definitions of macros LITTLEENDIAN
,
+INLINE
, and THREAD_LOCAL
be made in a build
+target’s platform.h
header file, because these macros are
+expected to be determined inflexibly by the target machine and compiler.
+The other five macros select options and control optimization, and thus might
+be better located in the target’s Makefile (or its equivalent).
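As a rough illustration of the recommendation above, the following sketch shows what a minimal platform.h might contain for a little-endian target whose compiler accepts the C11 keywords inline and _Thread_local. The three macro names come from the list above. The functions with the demo_ prefix are hypothetical, are not part of SoftFloat, and exist only to check at run time that the LITTLEENDIAN setting agrees with the machine.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Settings a little-endian, C11-capable target might place in platform.h.
   These values are assumptions for illustration, not a supplied file. */
#define LITTLEENDIAN 1
#define INLINE inline
#define THREAD_LOCAL _Thread_local

/* Hypothetical helper (not part of SoftFloat): returns 1 when the least
   significant byte of a multi-byte integer is stored at the lowest address. */
static int demo_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical startup check: aborts if the LITTLEENDIAN setting above
   does not match the machine the program is running on. */
static void demo_check_platform_settings(void)
{
#ifdef LITTLEENDIAN
    assert(demo_is_little_endian());
#else
    assert(!demo_is_little_endian());
#endif
}
```

A check of this kind can catch a platform.h that was copied from a template without adjusting LITTLEENDIAN for the new target.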
+
+In the build
directory, two template
subdirectories
+provide models for new target directories.
+Two different templates exist because different functions are needed in the
+SoftFloat library depending on whether macro SOFTFLOAT_FAST_INT64
+is defined.
+If macro SOFTFLOAT_FAST_INT64
will be defined,
+template-FAST_INT64
template-not-FAST_INT64
+Header file primitives.h
(in directory
+source/include
) declares macros and functions for numerous
+underlying arithmetic operations upon which many of SoftFloat’s
+floating-point functions are ultimately built.
+The SoftFloat sources include implementations of all of these functions/macros,
+written as standard C code, so a complete and correct SoftFloat library can be
+created using only the supplied code for all functions.
+However, for many targets, SoftFloat’s performance can be improved by
+substituting target-specific implementations of some of the functions/macros
+declared in primitives.h
.
+
+For example, primitives.h
declares a function called
+softfloat_countLeadingZeros32
that takes an unsigned
+
+A build target can replace the supplied version of any function or macro of
+primitives.h
by defining a macro with the same name in the
+target’s platform.h
header file.
+For this purpose, it may be helpful for platform.h
to
+#include
header file primitiveTypes.h
, which defines
+types used for arguments and results of functions declared in
+primitives.h
.
+When a desired replacement implementation is a function, not a macro, it is
+sufficient for platform.h
to include the line
+
+
+    #define <function-name> <function-name>
+
+where <function-name> is the name of the function being replaced.
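The replacement mechanism described above can be sketched as follows. To stay self-contained, the example uses a hypothetical stand-in named demo_countLeadingZeros32 rather than a real declaration from primitives.h, and it assumes a GCC-compatible compiler with the __builtin_clz intrinsic and a 32-bit unsigned int. The first half plays the role of platform.h, the second half the role of the generic library source.

```c
#include <stdint.h>

/* --- Role of platform.h: supply a target-specific version. --- */
#if defined(__GNUC__)
static inline uint_fast8_t demo_countLeadingZeros32(uint32_t a)
{
    /* __builtin_clz is undefined for 0, so that case is handled here. */
    return a ? (uint_fast8_t)__builtin_clz(a) : 32;
}
/* Defining the name as a macro that expands to itself marks the
   function as already provided. */
#define demo_countLeadingZeros32 demo_countLeadingZeros32
#endif

/* --- Role of the generic source: fall back to portable C. --- */
#ifndef demo_countLeadingZeros32
static uint_fast8_t demo_countLeadingZeros32(uint32_t a)
{
    uint_fast8_t count = 0;
    if (!a) return 32;
    while (!(a & UINT32_C(0x80000000))) {
        a <<= 1;
        ++count;
    }
    return count;
}
#endif
```

Either version gives the same results, so callers need not know which one was selected for the target.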
+The supplied header file opts-GCC.h
(in directory
+source/include
) provides an example of target-specific
+optimization for the GCC compiler.
+Each GCC target example in the build
directory has
+
+#include "opts-GCC.h"
+
+in its platform.h
header file.
+Before opts-GCC.h
is included, the following macros must be
+defined (or not) to control which features are invoked:
+
+    SOFTFLOAT_BUILTIN_CLZ
+        If defined, SoftFloat’s internal ‘countLeadingZeros’ functions use
+        intrinsics __builtin_clz and __builtin_clzll.
+    SOFTFLOAT_INTRINSIC_INT128
+        If defined, SoftFloat makes use of GCC’s nonstandard 128-bit
+        integer type __int128.
+
+On some machines, these improvements are observed to increase the speeds of
+f64_mul and f128_mul by around 20 to 25%, although
+other functions receive less dramatic boosts, or none at all.
+Results can vary greatly across different platforms.
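To give a sense of why a 128-bit integer type can help, here is a sketch of a 64-by-64-bit multiplication producing a 128-bit product, written once in portable C and once with unsigned __int128. The names demo_uint128, demo_mul64To128_portable, and demo_mul64To128_int128 are hypothetical and do not reflect SoftFloat's internal code. The second version is compiled only where the compiler predefines __SIZEOF_INT128__, as GCC and Clang do on most 64-bit targets.

```c
#include <stdint.h>

/* Hypothetical 128-bit result type for this sketch. */
struct demo_uint128 { uint64_t hi, lo; };

/* Portable version: four 32x32-bit partial products, combined with carries. */
static struct demo_uint128 demo_mul64To128_portable(uint64_t a, uint64_t b)
{
    uint64_t aLo = (uint32_t)a, aHi = a >> 32;
    uint64_t bLo = (uint32_t)b, bHi = b >> 32;
    uint64_t p0 = aLo * bLo;
    uint64_t p1 = aLo * bHi;
    uint64_t p2 = aHi * bLo;
    uint64_t p3 = aHi * bHi;
    /* Sum of everything that lands in bits 32..63, including carries. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;
    struct demo_uint128 z;
    z.lo = (mid << 32) | (uint32_t)p0;
    z.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return z;
}

#ifdef __SIZEOF_INT128__
/* Same operation using the compiler's nonstandard 128-bit integer type. */
static struct demo_uint128 demo_mul64To128_int128(uint64_t a, uint64_t b)
{
    unsigned __int128 p = (unsigned __int128)a * b;
    struct demo_uint128 z;
    z.hi = (uint64_t)(p >> 64);
    z.lo = (uint64_t)p;
    return z;
}
#endif
```

On targets with a native widening multiply, the second form typically compiles to far fewer instructions, which is the kind of gain the macros above are meant to unlock.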
+
+
+
+
+SoftFloat can be tested using the testsoftfloat
program by the
+same author.
+This program is part of the Berkeley TestFloat package available at the Web
+page
+http://www.jhauser.us/arithmetic/TestFloat.html
timesoftfloat
that
+measures the speed of SoftFloat’s floating-point functions.
+
+Header file softfloat.h
defines the SoftFloat interface as seen by
+clients.
+If the SoftFloat library will be made a common library for programs on a
+system, the supplied softfloat.h
has a couple of deficiencies for
+this purpose:
+
softfloat.h
depends on another header,
+softfloat_types.h
, that is not intended for public use but which
+must also be visible to the programmer’s compiler.
+softfloat.h
is included in a C source
+file, macros SOFTFLOAT_FAST_INT64
and THREAD_LOCAL
+must be defined, or not defined, consistent with how these macros were defined
+when the SoftFloat library was built.
+#include
header
+file softfloat.h
, it is recommended that a custom, self-contained
+version of this header file be created that eliminates these issues.
+
+
+
+
+At the time of this writing, the most up-to-date information about SoftFloat
+and the latest release can be found at the Web page
+http://www.jhauser.us/arithmetic/SoftFloat.html
+John R. Hauser
+2018 January 20
+
+
+1. Introduction
+2. Limitations
+3. Acknowledgments and License
+4. Types and Functions
+4.1. Boolean and Integer Types
+4.2. Floating-Point Types
+4.3. Supported Floating-Point Functions
+4.4. Non-canonical Representations in extFloat80_t
+4.5. Conventions for Passing Arguments and Results
+5. Reserved Names
+6. Mode Variables
+6.1. Rounding Mode
+6.2. Underflow Detection
+6.3. Rounding Precision for the 80-Bit Extended Format
+7. Exceptions and Exception Flags
+8. Function Details
+8.1. Conversions from Integer to Floating-Point
+8.2. Conversions from Floating-Point to Integer
+8.3. Conversions Among Floating-Point Types
+8.4. Basic Arithmetic Functions
+8.5. Fused Multiply-Add Functions
+8.6. Remainder Functions
+8.7. Round-to-Integer Functions
+8.8. Comparison Functions
+8.9. Signaling NaN Test Functions
+8.10. Raise-Exception Function
+9. Changes from SoftFloat Release 2
+9.1. Name Changes
+9.2. Changes to Function Arguments
+9.3. Added Capabilities
+9.4. Better Compatibility with the C Language
+9.5. New Organization as a Library
+9.6. Optimization Gains (and Losses)
+10. Future Directions
+11. Contact Information
+
+Berkeley SoftFloat is a software implementation of binary floating-point that
+conforms to the IEEE Standard for Floating-Point Arithmetic.
+The current release supports five binary formats:
+This document gives information about the types defined and the routines
+implemented by SoftFloat.
+It does not attempt to define or explain the IEEE Floating-Point Standard.
+Information about the standard is available elsewhere.
+ +
+The current version of SoftFloat is
+The previous
+Among earlier releases, 3b was notable for adding support for the
+SoftFloat-history.html
+The functional interface of SoftFloat
+SoftFloat assumes the computer has an addressable byte size of 8 or
+
+SoftFloat is written in C and is designed to work with other C code.
+The C compiler used must conform at a minimum to the 1989 ANSI standard for the
+C language (same as the 1990 ISO standard) and must in addition support basic
+arithmetic on
+Most operations not required by the original 1985 version of the IEEE
+Floating-Point Standard but added in the 2008 version are not yet supported in
+SoftFloat
+The SoftFloat package was written by me,
+
+    Par Lab:
+        Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery
+        (Award #DIG07-10227), with additional support from Par Lab affiliates
+        Nokia, NVIDIA, Oracle, and Samsung.
+    ASPIRE Lab:
+        DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support
+        from ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia,
+        NVIDIA, Oracle, and Samsung.
+
+The following applies to the whole of SoftFloat
+Copyright 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The Regents of the
+University of California.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+Redistributions of source code must retain the above copyright notice, this
+list of conditions, and the following disclaimer.
+
+Redistributions in binary form must reproduce the above copyright notice, this
+list of conditions, and the following disclaimer in the documentation and/or
+other materials provided with the distribution.
+
+Neither the name of the University nor the names of its contributors may be
+used to endorse or promote products derived from this software without specific
+prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS “AS IS”,
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE
+DISCLAIMED.
+IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
+LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
+ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+ + +
+The types and functions of SoftFloat are declared in header file
+softfloat.h
.
+
+Header file softfloat.h
depends on standard headers
+<stdbool.h>
and <stdint.h>
to define type
+bool
and several integer types.
+These standard headers have been part of the ISO C Standard Library since 1999.
+With any recent compiler, they are likely to be supported, even if the compiler
+does not claim complete conformance to the latest ISO C Standard.
+For older or nonstandard compilers, a port of SoftFloat may have substitutes
+for these headers.
+Header softfloat.h
depends only on the name bool
from
+<stdbool.h>
and on these type names from
+<stdint.h>
:
+
+
+    uint16_t
+    uint32_t
+    uint64_t
+    int32_t
+    int64_t
+    uint_fast8_t
+    uint_fast32_t
+    uint_fast64_t
+    int_fast32_t
+    int_fast64_t
+
+The softfloat.h
header defines five floating-point types:
+
+
+    float16_t      16-bit half-precision binary format
+    float32_t      32-bit single-precision binary format
+    float64_t      64-bit double-precision binary format
+    extFloat80_t   80-bit double-extended-precision binary format (old Intel or
+                   Motorola format)
+    float128_t     128-bit quadruple-precision binary format
+
+The non-extended types are each exactly the size specified:
+16 bits for float16_t, 32 bits for float32_t, 64 bits for float64_t, and
+128 bits for float128_t.
+Aside from these size requirements, the definitions of all these types may
+differ for different ports of SoftFloat to specific systems.
+A given port of SoftFloat may or may not define some of the floating-point
+types as aliases for the C standard types float
,
+double
, and long
double
.
+
+
+
+Header file softfloat.h
also defines a structure,
+struct
extFloat80M
, for the representation of
+extFloat80_t
and contains
+at least these two fields (not necessarily in this order):
+
+
+    uint16_t signExp;
+    uint64_t signif;
+
+Field signExp
contains the sign and exponent of the floating-point
+value, with the sign in the most significant bit (signif
is the complete +SoftFloat implements these arithmetic operations for its floating-point types: +
extFloat80_t
, the fused multiply-add
+operation defined by the IEEE Standard;
+
+The following operations required by the 2008 IEEE Floating-Point Standard are
+not supported in SoftFloat
extFloat80_t
+Because the extFloat80_t
, stores an explicit leading significand bit, many
+finite floating-point numbers are encodable in this type in multiple equivalent
+forms.
+Of these multiple encodings, there is always a unique one with the least
+encoded exponent value, and this encoding is considered the canonical
+representation of the floating-point number.
+Any other equivalent representations (having a higher encoded exponent value)
+are non-canonical.
+For a value in the subnormal range (including zero), the canonical
+representation always has an encoded exponent of zero and a leading significand
+bit
+For an infinity or NaN, the leading significand bit is similarly expected to
+extFloat80_t
+must have a leading significand bit
+SoftFloat’s functions are not guaranteed to operate as expected when
+inputs of type extFloat80_t
are non-canonical.
+Assuming all of a function’s extFloat80_t
inputs (if any)
+are canonical, function outputs of type extFloat80_t
will always
+be canonical.
+
+Values that are at most
+float64_t f64_add( float64_t, float64_t );
+
+
+
+
+The story is more complex when function inputs and outputs are
+
+void f128M_add( const float128_t *, const float128_t *, float128_t * );
+
+The first two arguments point to the values to be added, and the last argument
+points to the location where the sum will be stored.
+The M
in the name f128M_add
is mnemonic for the fact
+that the
+All ports of SoftFloat implement these pass-by-pointer functions for
+types extFloat80_t
and float128_t
.
+At the same time, SoftFloat ports may also implement alternate versions of
+these same functions that pass extFloat80_t
and
+float128_t
by value, like the smaller formats.
+Thus, besides the function with name f128M_add
shown above, a
+SoftFloat port may also supply an equivalent function with this signature:
+
+float128_t f128_add( float128_t, float128_t );
+
+
+
+
+As a general rule, on computers where the machine word size is
+f128M_add
) are provided for types extFloat80_t
+and float128_t
, because passing such large types directly can have
+significant extra cost.
+On computers where the word size is f128M_add
and f128_add
) are
+provided, because the cost of passing by value is then more reasonable.
+Applications that must be portable across both classes of computers must use
+the pointer-based functions, as these are always implemented.
+However, if it is known that SoftFloat includes the by-value functions for all
+platforms of interest, programmers can use whichever version they prefer.
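The relationship between the two calling conventions can be sketched with a stand-in type. The example below does not use SoftFloat itself: demo_wide_t, demo_wideM_add, and demo_wide_add are hypothetical names, and the operation is plain 128-bit integer addition rather than floating-point addition. It is meant only to show how a by-value function can be layered over a pointer-based one that is always available.

```c
#include <stdint.h>

/* Hypothetical 128-bit value held as two 64-bit halves. */
typedef struct { uint64_t lo, hi; } demo_wide_t;

/* Pointer-based form, analogous in shape to f128M_add: the first two
   arguments point to the operands, the last to where the sum is stored. */
static void demo_wideM_add(const demo_wide_t *aPtr, const demo_wide_t *bPtr,
                           demo_wide_t *zPtr)
{
    uint64_t lo = aPtr->lo + bPtr->lo;
    uint64_t carry = lo < aPtr->lo;       /* 1 if the low halves wrapped */
    zPtr->hi = aPtr->hi + bPtr->hi + carry;
    zPtr->lo = lo;
}

/* By-value form, analogous in shape to f128_add, built on the pointer form. */
static demo_wide_t demo_wide_add(demo_wide_t a, demo_wide_t b)
{
    demo_wide_t z;
    demo_wideM_add(&a, &b, &z);
    return z;
}
```

Code written against the pointer-based form works regardless of whether a by-value form is also supplied, which matches the portability advice given above.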
+
+In addition to the variables and functions documented here, SoftFloat defines
+some symbol names for its own private use.
+These private names always begin with the prefix
+‘softfloat_
’.
+When a program includes header softfloat.h
or links with the
+SoftFloat library, all names with prefix ‘softfloat_
’
+are reserved for possible use by SoftFloat.
+Applications that use SoftFloat should not define their own names with this
+prefix, and should reference only such names as are documented.
+
+The following global variables control rounding mode, underflow detection, and
+the
+
+    softfloat_roundingMode
+    softfloat_detectTininess
+    extF80_roundingPrecision
+
+These mode variables are covered in the next several subsections.
+For some SoftFloat ports, these variables may be per-thread (declared
+thread_local), meaning that different execution threads have their
+own separate copies of the variables.
+
+
+
+All five rounding modes defined by the 2008 IEEE Floating-Point Standard are
+implemented for all operations that require rounding.
+Some ports of SoftFloat may also implement the round-to-odd mode.
+
+The rounding mode is selected by the global variable
+uint_fast8_t softfloat_roundingMode;
+
+This variable may be set to one of the values
+
+    softfloat_round_near_even     round to nearest, with ties to even
+    softfloat_round_near_maxMag   round to nearest, with ties to maximum
+                                  magnitude (away from zero)
+    softfloat_round_minMag        round to minimum magnitude (toward zero)
+    softfloat_round_min           round to minimum (down)
+    softfloat_round_max           round to maximum (up)
+    softfloat_round_odd           round to odd (jamming), if supported by the
+                                  SoftFloat port
+
+Variable softfloat_roundingMode is initialized to
+softfloat_round_near_even.
+
+
+
+When softfloat_round_odd
is the rounding mode for a function that
+rounds to an integer value (either conversion to an integer format or a
+‘roundToInt
’ function), if the input is not already an
+integer, the rounded result is the closest odd integer.
+For other operations, this rounding mode acts as though the floating-point
+result is first rounded to minimum magnitude, the same as
+softfloat_round_minMag
, and then, if the result is inexact, the
+least-significant bit of the result is set to 1.
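For non-integer results, the round-to-odd rule just described (truncate toward zero, then force the least-significant bit on if anything was lost) can be sketched on a bare unsigned significand. The function name demo_shiftRightJam64 is hypothetical and the code is an illustration under that assumption, not SoftFloat's own implementation.

```c
#include <stdint.h>

/* Hypothetical helper: shifts a right by dist bits, rounding to odd.
   The shifted-out bits are discarded (round to minimum magnitude), and if
   any of them was nonzero the result's least-significant bit is set. */
static uint64_t demo_shiftRightJam64(uint64_t a, unsigned dist)
{
    if (dist == 0) return a;              /* nothing lost, result exact */
    if (dist >= 64) return a != 0;        /* everything shifted out */
    uint64_t lost = a & ((UINT64_C(1) << dist) - 1);
    return (a >> dist) | (lost != 0);
}
```

Because the low bit records whether the result was inexact, a value rounded this way can later be rounded again to a narrower precision without introducing a double-rounding error, provided the intermediate keeps at least two extra bits.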
+In the terminology of the IEEE Standard, SoftFloat can detect tininess for
+underflow either before or after rounding.
+The choice is made by the global variable
+uint_fast8_t softfloat_detectTininess;
+
+which can be set to either
+
+    softfloat_tininess_beforeRounding
+    softfloat_tininess_afterRounding
+
+Detecting tininess after rounding is usually better because it results in fewer
+spurious underflow signals.
+The other option is provided for compatibility with some systems.
+Like most systems (and as required by the newer 2008 IEEE Standard), SoftFloat
+always detects loss of accuracy for underflow as an inexact result.
+
+For extFloat80_t
only, the rounding precision of the basic
+arithmetic operations is controlled by the global variable
+
+uint_fast8_t extF80_roundingPrecision;
+
+The operations affected are:
+
+    extF80_add
+    extF80_sub
+    extF80_mul
+    extF80_div
+    extF80_sqrt
+
+When extF80_roundingPrecision is set to its default value of 80,
+these operations are rounded to the full precision of the extF80_roundingPrecision
to 32 or to 64 causes the
+operations listed to be rounded to float32_t
) or to float64_t
), respectively.
+When rounding to reduced precision, additional bits in the result significand
+beyond the rounding point are set to zero.
+The consequences of setting extF80_roundingPrecision
to a value
+other than 32, 64, or 80 are not specified.
+Operations other than the ones listed above are not affected by
+extF80_roundingPrecision
.
+
+
+
+All five exception flags required by the IEEE Floating-Point Standard are
+implemented.
+Each flag is stored as a separate bit in the global variable
+
+uint_fast8_t softfloat_exceptionFlags;
+
+The positions of the exception flag bits within this variable are determined by
+the bit masks
+
+softfloat_flag_inexact
+softfloat_flag_underflow
+softfloat_flag_overflow
+softfloat_flag_infinite
+softfloat_flag_invalid
+
+Variable softfloat_exceptionFlags is initialized to all zeros,
+meaning no exceptions.
+
+
+
+For some SoftFloat ports, softfloat_exceptionFlags may be
+per-thread (declared thread_local), meaning that different
+execution threads have their own separate instances of it.
+
+An individual exception flag can be cleared with the statement
+
+softfloat_exceptionFlags &= ~softfloat_flag_<exception>;
+
+where <exception> is the appropriate name.
+To raise a floating-point exception, function softfloat_raiseFlags
+should normally be used.
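The bit-mask handling described here can be sketched with local stand-ins. Every `demo_` name and the two mask values below are assumptions made so that the fragment compiles without SoftFloat; a real program would include softfloat.h and use softfloat_exceptionFlags, the softfloat_flag_ masks, and softfloat_raiseFlags as the library defines them.

```c
#include <stdint.h>

/* Stand-ins for illustration only; the real masks and their values come
   from the SoftFloat header, not from this sketch. */
enum {
    demo_flag_inexact = 1,
    demo_flag_invalid = 16
};
static uint_fast8_t demo_exceptionFlags = 0;

/* Record one or more exceptions by OR-ing their masks into the flags. */
static void demo_raiseFlags(uint_fast8_t exceptions)
{
    demo_exceptionFlags |= exceptions;
}

/* Report whether a single flag is set, then clear only that flag. */
static int demo_test_and_clear(uint_fast8_t mask)
{
    int was_set = (demo_exceptionFlags & mask) != 0;
    demo_exceptionFlags &= (uint_fast8_t) ~mask;
    return was_set;
}
```

Flags are sticky: once raised, a flag stays set until the program clears it, so a caller typically clears the flags of interest before an operation and tests them afterward.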
+
+
+
+When SoftFloat detects an exception other than inexact, it calls
+softfloat_raiseFlags.
+The default version of this function simply raises the corresponding exception
+flags.
+Particular ports of SoftFloat may support alternate behavior, such as exception
+traps, by modifying the default softfloat_raiseFlags.
+A program may also supply its own softfloat_raiseFlags function to
+override the one from the SoftFloat library.
+
+Because inexact results occur frequently under most circumstances (and thus are
+hardly exceptional), SoftFloat does not ordinarily call
+softfloat_raiseFlags for inexact exceptions.
+It does always raise the inexact exception flag as required.
+
+In this section, <float> appears in function names as
+a substitute for one of these abbreviations:
+
+f16        indicates float16_t, passed by value
+f32        indicates float32_t, passed by value
+f64        indicates float64_t, passed by value
+extF80M    indicates extFloat80_t, passed indirectly via pointers
+extF80     indicates extFloat80_t, passed by value
+f128M      indicates float128_t, passed indirectly via pointers
+f128       indicates float128_t, passed by value
+
+The circumstances under which values of floating-point types
+extFloat80_t and float128_t may be passed either by
+value or indirectly via pointers were discussed earlier.
+
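+All conversions from a 32-bit or 64-bit integer, signed or unsigned, to a
+floating-point format are supported, using functions with these names:
+
+ui32_to_<float>
+ui64_to_<float>
+i32_to_<float>
+i64_to_<float>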
+
+Each conversion function takes one input of the appropriate type and generates
+one output.
+The following illustrates the signatures of these functions in cases when the
+floating-point result is passed either by value or via pointers:
+
+float64_t i32_to_f64( int32_t a );
+
+void i32_to_f128M( int32_t a, float128_t *destPtr );
+
+Conversions from a floating-point format to a 32-bit or 64-bit integer,
+signed or unsigned, are supported with these functions:
+
+<float>_to_ui32
+<float>_to_ui64
+<float>_to_i32
+<float>_to_i64
+
+The functions have signatures as follows, depending on whether the
+floating-point input is passed by value or via pointers:
+
+int_fast32_t f64_to_i32( float64_t a, uint_fast8_t roundingMode, bool exact );
+
+int_fast32_t
+ f128M_to_i32( const float128_t *aPtr, uint_fast8_t roundingMode, bool exact );
+
+The roundingMode argument specifies the rounding mode for
+the conversion.
+The variable that usually indicates rounding mode,
+softfloat_roundingMode, is ignored.
+Argument exact determines whether the inexact
+exception flag is raised if the conversion is not exact.
+If exact is true, the inexact flag may
+be raised;
+otherwise, it will not be, even if the conversion is inexact.
+
+A conversion from floating-point to integer format raises the invalid
+exception if the source value cannot be rounded to a representable integer of
+the desired size (32 or 64 bits).
+In such circumstances, the integer result returned is determined by the
+particular port of SoftFloat, although typically this value will be either the
+maximum or minimum value of the integer format.
+The functions that convert to integer types never raise the floating-point
+overflow exception.
+
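The out-of-range behavior can be illustrated with a small sketch that converts the platform's double to a 32-bit integer, rounding toward zero. The function `demo_f64_to_i32_minMag` and its `invalid` out-parameter are hypothetical stand-ins, not SoftFloat functions, and the values returned for NaN and out-of-range inputs are only one plausible port-specific choice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: convert `a` to int32_t, rounding toward zero.
   On failure, set *invalid and return the maximum or minimum integer
   (an assumed port-specific choice); overflow is never signaled. */
static int32_t demo_f64_to_i32_minMag(double a, bool *invalid)
{
    *invalid = false;
    if (a != a) {                       /* NaN compares unequal to itself */
        *invalid = true;
        return INT32_MAX;
    }
    if (a >= 2147483648.0) {            /* truncates to more than INT32_MAX */
        *invalid = true;
        return INT32_MAX;
    }
    if (a <= -2147483649.0) {           /* truncates to less than INT32_MIN */
        *invalid = true;
        return INT32_MIN;
    }
    return (int32_t) a;                 /* in range: the cast truncates */
}
```

The range tests are made on the floating-point value before the cast, because casting an out-of-range double to an integer type is undefined behavior in C.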
+ +
+Because languages such as C require that conversions to integers be rounded
+toward zero, the following functions are also provided:
+
+<float>_to_ui32_r_minMag
+<float>_to_ui64_r_minMag
+<float>_to_i32_r_minMag
+<float>_to_i64_r_minMag
+
+These functions round only toward zero (to minimum magnitude).
+The signatures for these functions are the same as above without the redundant
+roundingMode argument:
+
+int_fast32_t f64_to_i32_r_minMag( float64_t a, bool exact );
+
+int_fast32_t f128M_to_i32_r_minMag( const float128_t *aPtr, bool exact );
+
+Conversions between floating-point formats are done by functions with these
+names:
+
+<float>_to_<float>
+
+All combinations of source and result type are supported where the source and
+result are different formats.
+There are four different styles of signature for these functions, depending on
+whether the input and the output floating-point values are passed by value or
+via pointers:
+
+float32_t f64_to_f32( float64_t a );
+
+float32_t f128M_to_f32( const float128_t *aPtr );
+
+void f32_to_f128M( float32_t a, float128_t *destPtr );
+
+void extF80M_to_f128M( const extFloat80_t *aPtr, float128_t *destPtr );
+
+Conversions from a smaller to a larger floating-point format are always exact
+and so require no rounding.
+
+The following basic arithmetic functions are provided:
+
+<float>_add
+<float>_sub
+<float>_mul
+<float>_div
+<float>_sqrt
+
+Each floating-point operation takes two operands, except for sqrt
+(square root) which takes only one.
+The operands and result are all of the same floating-point format.
+Signatures for these functions take the following forms:
+
+float64_t f64_add( float64_t a, float64_t b );
+
+void
+ f128M_add(
+     const float128_t *aPtr, const float128_t *bPtr, float128_t *destPtr );
+
+float64_t f64_sqrt( float64_t a );
+
+void f128M_sqrt( const float128_t *aPtr, float128_t *destPtr );
+
+When floating-point values are passed indirectly through pointers, arguments
+aPtr and bPtr point to the input
+operands, and the last argument, destPtr, points to the
+location where the result is stored.
+
+
+
+Rounding of the extFloat80_t functions is affected by variable
+extF80_roundingPrecision, as explained earlier.
+
+The 2008 version of the IEEE Floating-Point Standard defines a fused
+multiply-add operation that does a combined multiplication and addition
+with only a single rounding.
+SoftFloat implements fused multiply-add with functions
+
+<float>_mulAdd
+
+Unlike other operations, fused multiply-add is not supported for the
+extFloat80_t format.
+
+
+Depending on whether floating-point values are passed by value or via pointers,
+the fused multiply-add functions have signatures of these forms:
+
+float64_t f64_mulAdd( float64_t a, float64_t b, float64_t c );
+
+void
+ f128M_mulAdd(
+     const float128_t *aPtr,
+     const float128_t *bPtr,
+     const float128_t *cPtr,
+     float128_t *destPtr
+ );
+
+The functions compute (a × b) + c with a single rounding.
+When floating-point values are passed indirectly through pointers, arguments
+aPtr, bPtr, and cPtr point to operands a, b, and c respectively, and
+destPtr points to the location where the result is stored.
+
+
+
+If one of the multiplication operands a and
+b is infinite and the other is zero, these functions raise
+the invalid exception even if operand c is a quiet NaN.
+
+For each format, SoftFloat implements the remainder operation defined by the
+IEEE Floating-Point Standard.
+The remainder functions have names
+
+<float>_rem
+
+Each remainder operation takes two floating-point operands of the same format
+and returns a result in the same format.
+Depending on whether floating-point values are passed by value or via pointers,
+the remainder functions have signatures of these forms:
+
+float64_t f64_rem( float64_t a, float64_t b );
+
+void
+ f128M_rem(
+     const float128_t *aPtr, const float128_t *bPtr, float128_t *destPtr );
+
+When floating-point values are passed indirectly through pointers, arguments
+aPtr and bPtr point to operands
+a and b respectively, and
+destPtr points to the location where the result is stored.
+
+
+
+The IEEE Standard remainder operation computes the value
+a − n × b, where n is the integer closest to a ÷ b.
+If a ÷ b is exactly halfway between two integers, n is the even integer
+closest to a ÷ b.
+
+Depending on the relative magnitudes of the operands, the remainder
+functions can take considerably longer to execute than the other SoftFloat
+functions.
+This is an inherent characteristic of the remainder operation itself and is not
+a flaw in the SoftFloat implementation.
+
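How the IEEE remainder differs from C's `%` operator is easiest to see on integers. The sketch below is an integer analogue only, with a hypothetical helper `demo_ieee_rem`; it is not how SoftFloat computes floating-point remainders. It picks the quotient nearest to a ÷ b, with ties going to the even quotient, and assumes b is nonzero and the operands are small enough that doubling them does not overflow.

```c
#include <stdlib.h>

/* Hypothetical integer analogue of the IEEE remainder: returns a - n*b,
   where n is the integer nearest a/b and ties go to the even n.
   Assumes b != 0 and operands small enough that 2*|a % b| cannot overflow. */
static long demo_ieee_rem(long a, long b)
{
    long q = a / b;                 /* quotient truncated toward zero */
    long r = a % b;                 /* remainder with the sign of a */
    long twice_r = 2 * labs(r);
    long abs_b = labs(b);
    if (twice_r > abs_b || (twice_r == abs_b && q % 2 != 0)) {
        /* The nearest (or even) quotient lies one step further from zero. */
        r -= ((a < 0) == (b < 0)) ? b : -b;
    }
    return r;
}
```

For example, with a = 7 and b = 2 the quotient 3.5 is a tie, so n is the even integer 4 and the remainder is −1, whereas `7 % 2` in C gives 1.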
+For each format, SoftFloat implements the round-to-integer operation specified
+by the IEEE Floating-Point Standard.
+These functions are named
+
+<float>_roundToInt
+
+Each round-to-integer operation takes a single floating-point operand.
+This operand is rounded to an integer according to a specified rounding mode,
+and the resulting integer value is returned in the same floating-point format.
+(Note that the result is not an integer type.)
+
+
+The signatures of the round-to-integer functions are similar to those for
+conversions to an integer type:
+
+float64_t f64_roundToInt( float64_t a, uint_fast8_t roundingMode, bool exact );
+
+void
+ f128M_roundToInt(
+     const float128_t *aPtr,
+     uint_fast8_t roundingMode,
+     bool exact,
+     float128_t *destPtr
+ );
+
+When floating-point values are passed indirectly through pointers,
+aPtr points to the input operand and
+destPtr points to the location where the result is stored.
+
+
+
+The roundingMode argument specifies the rounding mode to
+apply.
+The variable that usually indicates rounding mode,
+softfloat_roundingMode, is ignored.
+Argument exact determines whether the inexact
+exception flag is raised if the conversion is not exact.
+If exact is true, the inexact flag may
+be raised;
+otherwise, it will not be, even if the conversion is inexact.
+
+For each format, the following floating-point comparison functions are
+provided:
+
+<float>_eq
+<float>_le
+<float>_lt
+
+Each comparison takes two operands of the same type and returns a Boolean.
+The abbreviation eq stands for “equal” (=);
+le stands for “less than or equal” (≤);
+and lt stands for “less than” (<).
+Depending on whether the floating-point operands are passed by value or via
+pointers, the comparison functions have signatures of these forms:
+
+bool f64_eq( float64_t a, float64_t b );
+
+bool f128M_eq( const float128_t *aPtr, const float128_t *bPtr );
+
+The usual greater-than (>), greater-than-or-equal (≥), and not-equal
+(≠) comparisons are easily obtained from the functions provided.
+The not-equal function is just the logical complement of the equal function.
+The greater-than-or-equal function is identical to the less-than-or-equal
+function with the arguments in reverse order, and likewise the greater-than
+function is identical to the less-than function with the arguments reversed.
+
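The derivations just described can be written as thin wrappers. In this sketch `demo_eq`, `demo_le`, and `demo_lt` are stand-ins built on the platform's double so that it compiles without SoftFloat; with the library, the same pattern would wrap functions such as f64_eq, f64_le, and f64_lt. All `demo_` names are hypothetical.

```c
#include <stdbool.h>

/* Stand-ins for <float>_eq, <float>_le, and <float>_lt. */
static bool demo_eq(double a, double b) { return a == b; }
static bool demo_le(double a, double b) { return a <= b; }
static bool demo_lt(double a, double b) { return a < b; }

/* Not-equal is the logical complement of equal. */
static bool demo_ne(double a, double b) { return !demo_eq(a, b); }

/* Greater-than-or-equal and greater-than swap the operands. */
static bool demo_ge(double a, double b) { return demo_le(b, a); }
static bool demo_gt(double a, double b) { return demo_lt(b, a); }
```

Swapping operands, rather than negating the opposite comparison, keeps the result correct for NaN operands: every ordered comparison involving a NaN is false, while not-equal is true.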
+The IEEE Floating-Point Standard specifies that the less-than-or-equal and
+less-than comparisons by default raise the invalid exception if either
+operand is any kind of NaN.
+Equality comparisons, on the other hand, are defined by default to raise the
+invalid exception only for signaling NaNs, not quiet NaNs.
+For completeness, SoftFloat provides these complementary functions:
+
+<float>_eq_signaling
+<float>_le_quiet
+<float>_lt_quiet
+
+The signaling equality comparisons are identical to the default
+equality comparisons except that the invalid exception is raised for any
+NaN input, not just for signaling NaNs.
+Similarly, the quiet comparison functions are identical to their
+default counterparts except that the invalid exception is not raised for
+quiet NaNs.
+
+
+Functions for testing whether a floating-point value is a signaling NaN are
+provided with these names:
+
+<float>_isSignalingNaN
+
+The functions take one floating-point operand and return a Boolean indicating
+whether the operand is a signaling NaN.
+Accordingly, the functions have the forms
+
+bool f64_isSignalingNaN( float64_t a );
+
+bool f128M_isSignalingNaN( const float128_t *aPtr );
+
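What such a test examines can be sketched on the raw 64-bit encoding. The helper `demo_f64_is_signaling_nan` is hypothetical, and it assumes the common convention in which a NaN is signaling when the most-significant fraction bit is 0; which NaNs count as signaling is a target-specific matter, so a given port of SoftFloat may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check on a binary64 bit pattern, assuming the convention
   that the top fraction bit is 1 for quiet NaNs and 0 for signaling NaNs. */
static bool demo_f64_is_signaling_nan(uint64_t bits)
{
    const uint64_t exp_mask  = UINT64_C(0x7FF0000000000000);
    const uint64_t frac_mask = UINT64_C(0x000FFFFFFFFFFFFF);
    const uint64_t quiet_bit = UINT64_C(0x0008000000000000);

    bool is_nan = (bits & exp_mask) == exp_mask && (bits & frac_mask) != 0;
    return is_nan && (bits & quiet_bit) == 0;
}
```

An infinity has an all-ones exponent but a zero fraction, so it fails the NaN test and is never reported as signaling.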
+SoftFloat provides a single function for raising floating-point exceptions:
+
+void softfloat_raiseFlags( uint_fast8_t exceptions );
+
+The exceptions argument is a mask indicating the set of
+exceptions to raise.
+(See earlier section 7, Exceptions and Exception Flags.)
+In addition to setting the specified exception flags in variable
+softfloat_exceptionFlags, the softfloat_raiseFlags
+function may cause a trap or abort appropriate for the current system.
+
+
+
+
+Apart from a change in the legal use license,
+The most obvious and pervasive difference compared to
+
+old name, Release 2:              new name, Release 3:
+float32                           float32_t
+float64                           float64_t
+floatx80                          extFloat80_t
+float128                          float128_t
+float_rounding_mode               softfloat_roundingMode
+float_round_nearest_even          softfloat_round_near_even
+float_round_to_zero               softfloat_round_minMag
+float_round_down                  softfloat_round_min
+float_round_up                    softfloat_round_max
+float_detect_tininess             softfloat_detectTininess
+float_tininess_before_rounding    softfloat_tininess_beforeRounding
+float_tininess_after_rounding     softfloat_tininess_afterRounding
+floatx80_rounding_precision       extF80_roundingPrecision
+float_exception_flags             softfloat_exceptionFlags
+float_flag_inexact                softfloat_flag_inexact
+float_flag_underflow              softfloat_flag_underflow
+float_flag_overflow               softfloat_flag_overflow
+float_flag_divbyzero              softfloat_flag_infinite
+float_flag_invalid                softfloat_flag_invalid
+float_raise                       softfloat_raiseFlags
+
+Furthermore,
+
+used in names in Release 2:    used in names in Release 3:
+int32                          i32
+int64                          i64
+float32                        f32
+float64                        f64
+floatx80                       extF80
+float128                       f128
+
+Thus, for example, the function to add two float32 values, float32_add in
+Release 2, is named f32_add in Release 3.
+Lastly, there have been a few other changes to function names:
+
+used in names in Release 2:    used in names in Release 3:    relevant functions:
+_round_to_zero                 _r_minMag                      conversions from floating-point to integer (section 8.2)
+round_to_int                   roundToInt                     round-to-integer functions (section 8.7)
+is_signaling_nan               isSignalingNaN                 signaling NaN test functions (section 8.9)
+
+Besides simple name changes, some operations were given a different interface
+in
+Since <stdint.h>
, such as
+uint32_t
, whereas previously their types could be defined
+differently for each port of SoftFloat, usually using traditional C types such
+as unsigned
int
.
+Likewise, functions in bool
from <stdbool.h>
, whereas
+previously these were again passed as a port-specific type (usually
+int
).
+
+As explained earlier in
+Functions that round to an integer have additional
+roundingMode
and exact
arguments that
+they did not have in softfloat_roundingMode
but previously known as
+float_rounding_mode
).
+Also, for softfloat_roundingMode
for argument
+roundingMode
and true
for argument
+exact
.
+
+With
+A port of SoftFloat can now define any of the floating-point types
+float32_t, float64_t, extFloat80_t, and
+float128_t as aliases for C’s standard floating-point types
+float, double, and long double, using either #define or typedef.
+This potential convenience was not supported under
+(Note, however, that there may be a performance cost to defining
+SoftFloat’s floating-point types this way, depending on the platform and
+the applications using SoftFloat.
+Ports of SoftFloat may choose to forgo the convenience in favor of better
+speed.)
+
+ ++
float16_t
, is supported.
+
+
++
+
extFloat80_t
.
+
+
++
softfloat_round_near_maxMag
(round to nearest, with ties to
+maximum magnitude, away from zero), and, as of softfloat_round_odd
(round to odd, also known as
+jamming).
+
+
+
+
+Starting with
+Individual SoftFloat functions have been variously improved in
+extFloat80_t
and
+float128_t
, code size has also generally been reduced.
+
+However, because
+The following improvements are anticipated for future releases of SoftFloat: +
extFloat80_t
(discussed in extFloat80_t
).
+
+
+At the time of this writing, the most up-to-date information about SoftFloat
+and the latest release can be found at the Web page
+http://www.jhauser.us/arithmetic/SoftFloat.html