From f215e02bf85f68d3a6106c2a1f4f7f063f819064 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Thu, 11 Apr 2024 10:17:27 +0200 Subject: Adding upstream version 7.0.14-dfsg. Signed-off-by: Daniel Baumann --- src/libs/softfloat-3e/doc/SoftFloat-history.html | 258 ++++ src/libs/softfloat-3e/doc/SoftFloat-source.html | 686 ++++++++++ src/libs/softfloat-3e/doc/SoftFloat.html | 1527 ++++++++++++++++++++++ 3 files changed, 2471 insertions(+) create mode 100644 src/libs/softfloat-3e/doc/SoftFloat-history.html create mode 100644 src/libs/softfloat-3e/doc/SoftFloat-source.html create mode 100644 src/libs/softfloat-3e/doc/SoftFloat.html (limited to 'src/libs/softfloat-3e/doc') diff --git a/src/libs/softfloat-3e/doc/SoftFloat-history.html b/src/libs/softfloat-3e/doc/SoftFloat-history.html new file mode 100644 index 00000000..d81c6bc5 --- /dev/null +++ b/src/libs/softfloat-3e/doc/SoftFloat-history.html @@ -0,0 +1,258 @@ + + + + +Berkeley SoftFloat History + + + + +

History of Berkeley SoftFloat, to Release 3e

+ +

+John R. Hauser
+2018 January 20
+

+ + +

Release 3e (2018 January)

+ + + + +

Release 3d (2017 August)

+ + + + +

Release 3c (2017 February)

+ + + + +

Release 3b (2016 July)

+ + + + +

Release 3a (2015 October)

+ + + + +

Release 3 (2015 February)

+ + + + +

Release 2c (2015 January)

+ + + + +

Release 2b (2002 May)

+ + + + +

Release 2a (1998 December)

+ + + + +

Release 2 (1997 June)

+ + + + +

Release 1a (1996 July)

+ + + + +

Release 1 (1996 July)

+ + + + + + diff --git a/src/libs/softfloat-3e/doc/SoftFloat-source.html b/src/libs/softfloat-3e/doc/SoftFloat-source.html new file mode 100644 index 00000000..4ff9d4c4 --- /dev/null +++ b/src/libs/softfloat-3e/doc/SoftFloat-source.html @@ -0,0 +1,686 @@ + + + + +Berkeley SoftFloat Source Documentation + + + + +

Berkeley SoftFloat Release 3e: Source Documentation

+ +

+John R. Hauser
+2018 January 20
+

+ + +

Contents

+ +
+ +++ + + + + + + + + + + + + + + + + + + + +
1. Introduction
2. Limitations
3. Acknowledgments and License
4. SoftFloat Package Directory Structure
5. Issues for Porting SoftFloat to a New Target
5.1. Standard Headers <stdbool.h> and + <stdint.h>
5.2. Specializing Floating-Point Behavior
5.3. Macros for Build Options
5.4. Adapting a Template Target Directory
5.5. Target-Specific Optimization of Primitive Functions
6. Testing SoftFloat
7. Providing SoftFloat as a Common Library for Applications
8. Contact Information
+
+ + +

1. Introduction

+ +

+This document gives information needed for compiling and/or porting Berkeley +SoftFloat, a library of C functions implementing binary floating-point +conforming to the IEEE Standard for Floating-Point Arithmetic. +For basic documentation about SoftFloat refer to +SoftFloat.html. +

+ +

+The source code for SoftFloat is intended to be relatively machine-independent +and should be compilable with any ISO-Standard C compiler that also supports +64-bit integers. +SoftFloat has been successfully compiled with the GNU C Compiler +(gcc) for several platforms. +

+ +

+Release 3 of SoftFloat was a complete rewrite relative to +Release 2 or earlier. +Changes to the interface of SoftFloat functions are documented in +SoftFloat.html. +The current version of SoftFloat is Release 3e. +

+ + +

2. Limitations

+ +

+SoftFloat assumes the computer has an addressable byte size of either 8 or +16 bits. +(Nearly all computers in use today have 8-bit bytes.) +

+ +

+SoftFloat is written in C and is designed to work with other C code. +The C compiler used must conform at a minimum to the 1989 ANSI standard for the +C language (same as the 1990 ISO standard) and must in addition support basic +arithmetic on 64-bit integers. +Earlier releases of SoftFloat included implementations of 32-bit +single-precision and 64-bit double-precision floating-point that +did not require 64-bit integers, but this option is not supported +starting with Release 3. +Since 1999, ISO standards for C have mandated compiler support for +64-bit integers. +A compiler conforming to the 1999 C Standard or later is recommended but not +strictly required. +

+ +

+C Standard header files <stdbool.h> and +<stdint.h> are required for defining standard Boolean and +integer types. +If these headers are not supplied with the C compiler, minimal substitutes must +be provided. +SoftFloat’s dependence on these headers is detailed later in +section 5.1, Standard Headers <stdbool.h> +and <stdint.h>. +

+ + +

3. Acknowledgments and License

+ +

+The SoftFloat package was written by me, John R. Hauser. +Release 3 of SoftFloat was a completely new implementation +supplanting earlier releases. +The project to create Release 3 (now through 3e) was +done in the employ of the University of California, Berkeley, within the +Department of Electrical Engineering and Computer Sciences, first for the +Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab. +The work was officially overseen by Prof. Krste Asanovic, with funding provided +by these sources: +

+ ++++ + + + + + + + + + +
Par Lab: +Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery +(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia, +NVIDIA, Oracle, and Samsung. +
ASPIRE Lab: +DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from +ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA, +Oracle, and Samsung. +
+
+

+ +

+The following applies to the whole of SoftFloat Release 3e as well +as to each source file individually. +

+ +

+Copyright 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The Regents of the +University of California. +All rights reserved. +

+ +

+Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: +

    + +
  1. +

    +Redistributions of source code must retain the above copyright notice, this +list of conditions, and the following disclaimer. +

    + +
  2. +

    +Redistributions in binary form must reproduce the above copyright notice, this +list of conditions, and the following disclaimer in the documentation and/or +other materials provided with the distribution. +

    + +
  3. +

    +Neither the name of the University nor the names of its contributors may be +used to endorse or promote products derived from this software without specific +prior written permission. +

    + +
+

+ +

+THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS “AS IS”, +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE +DISCLAIMED. +IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +

+ + +

4. SoftFloat Package Directory Structure

+ +

+Because SoftFloat is targeted to multiple platforms, its source code is +slightly scattered between target-specific and target-independent directories +and files. +The supplied directory structure is as follows: +

+
+doc
+source
+    include
+    8086
+    8086-SSE
+    ARM-VFPv2
+    ARM-VFPv2-defaultNaN
+build
+    template-FAST_INT64
+    template-not-FAST_INT64
+    Linux-386-GCC
+    Linux-386-SSE2-GCC
+    Linux-x86_64-GCC
+    Linux-ARM-VFPv2-GCC
+    Win32-MinGW
+    Win32-SSE2-MinGW
+    Win64-MinGW-w64
+
+
+The majority of the SoftFloat sources are provided in the source +directory. +The include subdirectory contains several header files +(unsurprisingly), while the other subdirectories of source contain +source files that specialize the floating-point behavior to match particular +processor families: +
+
+
8086
+
+Intel’s older, 8087-derived floating-point, extended to all supported +floating-point types +
+
8086-SSE
+
+Intel’s x86 processors with Streaming SIMD Extensions (SSE) and later +compatible extensions, having 8087 behavior for 80-bit +double-extended-precision (extFloat80_t) and SSE behavior for +other floating-point types +
+
ARM-VFPv2
+
+ARM’s VFPv2 or later floating-point, with NaN payload propagation +
+
ARM-VFPv2-defaultNaN
+
+ARM’s VFPv2 or later floating-point, with the “default NaN” +option +
+
+
+If other specializations are attempted, these would be expected to be other +subdirectories of source alongside the ones listed above. +Specialization is covered later, in section 5.2, Specializing +Floating-Point Behavior. +

+ +

+The build directory is intended to contain a subdirectory for each +target platform for which a build of the SoftFloat library may be created. +For each build target, the target’s subdirectory is where all derived +object files and the completed SoftFloat library (typically +softfloat.a or libsoftfloat.a) are created. +The two template subdirectories are not actual build targets but +contain sample files for creating new target directories. +(The meaning of FAST_INT64 will be explained later.) +

+ +

+Ignoring the template directories, the supplied target directories +are intended to follow a naming system of +<execution-environment>-<compiler>. +For the example targets, +<execution-environment> is +Linux-386, Linux-386-SSE2, +Linux-x86_64, +Linux-ARM-VFPv2, Win32, +Win32-SSE2, or Win64, and +<compiler> is GCC, +MinGW, or MinGW-w64. +

+ +

+All of the supplied target directories are merely examples that may or may not +be correct for compiling on any particular system. +Despite requests, there are currently no plans to include and maintain in the +SoftFloat package the build files needed for a great many users’ +compilation environments, which can span a huge range of operating systems, +compilers, and other tools. +

+ +

+As supplied, each target directory contains two files: +

+
+Makefile
+platform.h
+
+
+The provided Makefile is written for GNU make. +A build of SoftFloat for the specific target is begun by executing the +make command with the target directory as the current directory. +A completely different build tool can be used if an appropriate +Makefile equivalent is created. +

+ +

+The platform.h header file exists to provide a location for +additional C declarations specific to the build target. +Every C source file of SoftFloat contains a #include for +platform.h. +In many cases, the contents of platform.h can be as simple as one +or two lines of code. +At the other extreme, to get maximal performance from SoftFloat, it may be +desirable to include in header platform.h (directly or via +#include) declarations for numerous target-specific optimizations. +Such possibilities are discussed in the next section, Issues for Porting +SoftFloat to a New Target. +If the target’s compiler or library has bugs or other shortcomings, +workarounds for these issues may also be possible with target-specific +declarations in platform.h, avoiding the need to modify the main +SoftFloat sources. +

+ + +

5. Issues for Porting SoftFloat to a New Target

+ +

5.1. Standard Headers <stdbool.h> and <stdint.h>

+ +

+The SoftFloat sources make use of standard headers +<stdbool.h> and <stdint.h>, which have +been part of the ISO C Standard Library since 1999. +With any recent compiler, these standard headers are likely to be supported, +even if the compiler does not claim complete conformance to the latest ISO C +Standard. +For older or nonstandard compilers, substitutes for +<stdbool.h> and <stdint.h> may need to be +created. +SoftFloat depends on these names from <stdbool.h>: +

+
+bool
+true
+false
+
+
+and on these names from <stdint.h>: +
+
+uint16_t
+uint32_t
+uint64_t
+int32_t
+int64_t
+UINT64_C
+INT64_C
+uint_least8_t
+uint_fast8_t
+uint_fast16_t
+uint_fast32_t
+uint_fast64_t
+int_fast8_t
+int_fast16_t
+int_fast32_t
+int_fast64_t
+
+
+

+ + +

5.2. Specializing Floating-Point Behavior

+ +

+The IEEE Floating-Point Standard allows for some flexibility in a conforming +implementation, particularly concerning NaNs. +The SoftFloat source directory is supplied with some +specialization subdirectories containing possible definitions for this +implementation-specific behavior. +For example, the 8086 and 8086-SSE +subdirectories have source files that specialize SoftFloat’s behavior to +match that of Intel’s x86 line of processors. +The files in a specialization subdirectory must determine: +

+

+ +

+As provided, the build process for a target expects to involve exactly +one specialization directory that defines all of these +implementation-specific details for the target. +A specialization directory such as 8086 is expected to contain a +header file called specialize.h, together with whatever other +source files are needed to complete the specialization. +

+ +

+A new build target may use an existing specialization, such as the ones +provided by the 8086 and 8086-SSE +subdirectories. +If a build target needs a new specialization, different from any existing ones, +it is recommended that a new specialization directory be created for this +purpose. +The specialize.h header file from any of the provided +specialization subdirectories can be used as a model for what definitions are +needed. +

+ + +

5.3. Macros for Build Options

+ +

+The SoftFloat source files adapt the floating-point implementation according to +several C preprocessor macros: +

+
+
LITTLEENDIAN +
+Must be defined for little-endian machines; must not be defined for big-endian +machines. +
INLINE +
+Specifies the sequence of tokens used to indicate that a C function should be +inlined. +If macro INLINE_LEVEL is defined with a value of 1 or higher, this +macro must be defined; otherwise, this macro is ignored and need not be +defined. +For compilers that conform to the C Standard’s rules for inline +functions, this macro can be defined as the single keyword inline. +For other compilers that follow a convention pre-dating the standardization of +inline, this macro may need to be defined to extern +inline. +
THREAD_LOCAL +
+Can be defined to a sequence of tokens that, when appearing at the start of a +variable declaration, indicates to the C compiler that the variable is +per-thread, meaning that each execution thread gets its own separate +instance of the variable. +This macro is used in header softfloat.h in the declarations of +variables softfloat_roundingMode, +softfloat_detectTininess, extF80_roundingPrecision, +and softfloat_exceptionFlags. +If macro THREAD_LOCAL is left undefined, these variables will +default to being ordinary global variables. +Depending on the compiler, possible valid definitions of this macro include +_Thread_local and __thread. +
+
+
SOFTFLOAT_ROUND_ODD +
+Can be defined to enable support for optional rounding mode +softfloat_round_odd. +
+
+
INLINE_LEVEL +
+Can be defined to an integer to determine the degree of inlining requested of +the compiler. +Larger numbers request that more inlining be done. +If this macro is not defined or is defined to a value less than 1 +(zero or negative), no inlining is requested. +The maximum effective value is no higher than 5. +Defining this macro to a value greater than 5 is the same as defining it +to 5. +
SOFTFLOAT_FAST_INT64 +
+Can be defined to indicate that the build target’s implementation of +64-bit arithmetic is efficient. +For newer 64-bit processors, this macro should usually be defined. +For very small microprocessors whose buses and registers are 8-bit +or 16-bit in size, this macro should usually not be defined. +Whether this macro should be defined for a 32-bit processor may +depend on the target machine and the applications that will use SoftFloat. +
SOFTFLOAT_FAST_DIV32TO16 +
+Can be defined to indicate that the target’s division operator +in C (written as /) is reasonably efficient for +dividing a 32-bit unsigned integer by a 16-bit +unsigned integer. +Setting this macro may affect the performance of function f16_div. +
SOFTFLOAT_FAST_DIV64TO32 +
+Can be defined to indicate that the target’s division operator +in C (written as /) is reasonably efficient for +dividing a 64-bit unsigned integer by a 32-bit +unsigned integer. +Setting this macro may affect the performance of division, remainder, and +square root operations other than f16_div. +
+
+

+ +

+Following the usual custom for C, for most of these macros (all +except INLINE, THREAD_LOCAL, and +INLINE_LEVEL), the content of any definition is irrelevant; +what matters is a macro’s effect on #ifdef directives. +

+ +

+It is recommended that any definitions of macros LITTLEENDIAN, +INLINE, and THREAD_LOCAL be made in a build +target’s platform.h header file, because these macros are +expected to be determined inflexibly by the target machine and compiler. +The other five macros select options and control optimization, and thus might +be better located in the target’s Makefile (or its equivalent). +

+ + +

5.4. Adapting a Template Target Directory

+ +

+In the build directory, two template subdirectories +provide models for new target directories. +Two different templates exist because different functions are needed in the +SoftFloat library depending on whether macro SOFTFLOAT_FAST_INT64 +is defined. +If macro SOFTFLOAT_FAST_INT64 will be defined, +template-FAST_INT64 is the template to use; +otherwise, template-not-FAST_INT64 is the appropriate +template. +A new target directory can be created by copying the correct template directory +and editing the files inside. +To avoid confusion, it would be wise to refrain from editing the files within a +template directory directly. +

+ + +

5.5. Target-Specific Optimization of Primitive Functions

+ +

+Header file primitives.h (in directory +source/include) declares macros and functions for numerous +underlying arithmetic operations upon which many of SoftFloat’s +floating-point functions are ultimately built. +The SoftFloat sources include implementations of all of these functions/macros, +written as standard C code, so a complete and correct SoftFloat library can be +created using only the supplied code for all functions. +However, for many targets, SoftFloat’s performance can be improved by +substituting target-specific implementations of some of the functions/macros +declared in primitives.h. +

+ +

+For example, primitives.h declares a function called +softfloat_countLeadingZeros32 that takes an unsigned +32-bit integer as an argument and returns the number of the +integer’s most-significant bits that are zeros. +While the SoftFloat sources include an implementation of this function written +in standard C, many processors can perform this same function +directly in only one or two machine instructions. +An alternative, target-specific implementation that maps to those instructions +is likely to be more efficient than the generic C code from the SoftFloat +package. +

+ +

+A build target can replace the supplied version of any function or macro of +primitives.h by defining a macro with the same name in the +target’s platform.h header file. +For this purpose, it may be helpful for platform.h to +#include header file primitiveTypes.h, which defines +types used for arguments and results of functions declared in +primitives.h. +When a desired replacement implementation is a function, not a macro, it is +sufficient for platform.h to include the line +

+
+#define <function-name> <function-name>
+
+
+where <function-name> is the name of the +function. +This technically defines <function-name> +as a macro, but one that resolves to the same name, which may then be a +function. +(A preprocessor that conforms to the C Standard is required to limit recursive +macro expansion from being applied more than once.) +

+ +

+The supplied header file opts-GCC.h (in directory +source/include) provides an example of target-specific +optimization for the GCC compiler. +Each GCC target example in the build directory has +

+#include "opts-GCC.h" +
+in its platform.h header file. +Before opts-GCC.h is included, the following macros must be +defined (or not) to control which features are invoked: +
+
+
SOFTFLOAT_BUILTIN_CLZ
+
+If defined, SoftFloat’s internal +‘countLeadingZeros’ functions use intrinsics +__builtin_clz and __builtin_clzll. +
+
SOFTFLOAT_INTRINSIC_INT128
+
+If defined, SoftFloat makes use of GCC’s nonstandard 128-bit +integer type __int128. +
+
+
+On some machines, these improvements are observed to increase the speeds of +f64_mul and f128_mul by around 20 to 25%, although +other functions receive less dramatic boosts, or none at all. +Results can vary greatly across different platforms. +

+ + +

6. Testing SoftFloat

+ +

+SoftFloat can be tested using the testsoftfloat program by the +same author. +This program is part of the Berkeley TestFloat package available at the Web +page +http://www.jhauser.us/arithmetic/TestFloat.html. +The TestFloat package also has a program called timesoftfloat that +measures the speed of SoftFloat’s floating-point functions. +

+ + +

7. Providing SoftFloat as a Common Library for Applications

+ +

+Header file softfloat.h defines the SoftFloat interface as seen by +clients. +If the SoftFloat library will be made a common library for programs on a +system, the supplied softfloat.h has a couple of deficiencies for +this purpose: +

+In the situation that new programs may regularly #include header +file softfloat.h, it is recommended that a custom, self-contained +version of this header file be created that eliminates these issues. +

+ + +

8. Contact Information

+ +

+At the time of this writing, the most up-to-date information about SoftFloat +and the latest release can be found at the Web page +http://www.jhauser.us/arithmetic/SoftFloat.html. +

+ + + + diff --git a/src/libs/softfloat-3e/doc/SoftFloat.html b/src/libs/softfloat-3e/doc/SoftFloat.html new file mode 100644 index 00000000..b72b407f --- /dev/null +++ b/src/libs/softfloat-3e/doc/SoftFloat.html @@ -0,0 +1,1527 @@ + + + + +Berkeley SoftFloat Library Interface + + + + +

Berkeley SoftFloat Release 3e: Library Interface

+ +

+John R. Hauser
+2018 January 20
+

+ + +

Contents

+ +
+ +++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
1. Introduction
2. Limitations
3. Acknowledgments and License
4. Types and Functions
4.1. Boolean and Integer Types
4.2. Floating-Point Types
4.3. Supported Floating-Point Functions
4.4. Non-canonical Representations in extFloat80_t
4.5. Conventions for Passing Arguments and Results
5. Reserved Names
6. Mode Variables
6.1. Rounding Mode
6.2. Underflow Detection
6.3. Rounding Precision for the 80-Bit Extended Format
7. Exceptions and Exception Flags
8. Function Details
8.1. Conversions from Integer to Floating-Point
8.2. Conversions from Floating-Point to Integer
8.3. Conversions Among Floating-Point Types
8.4. Basic Arithmetic Functions
8.5. Fused Multiply-Add Functions
8.6. Remainder Functions
8.7. Round-to-Integer Functions
8.8. Comparison Functions
8.9. Signaling NaN Test Functions
8.10. Raise-Exception Function
9. Changes from SoftFloat Release 2
9.1. Name Changes
9.2. Changes to Function Arguments
9.3. Added Capabilities
9.4. Better Compatibility with the C Language
9.5. New Organization as a Library
9.6. Optimization Gains (and Losses)
10. Future Directions
11. Contact Information
+
+ + +

1. Introduction

+ +

+Berkeley SoftFloat is a software implementation of binary floating-point that +conforms to the IEEE Standard for Floating-Point Arithmetic. +The current release supports five binary formats: 16-bit +half-precision, 32-bit single-precision, 64-bit +double-precision, 80-bit double-extended-precision, and +128-bit quadruple-precision. +The following functions are supported for each format: +

+All operations required by the original 1985 version of the IEEE Floating-Point +Standard are implemented, except for conversions to and from decimal. +

+ +

+This document gives information about the types defined and the routines +implemented by SoftFloat. +It does not attempt to define or explain the IEEE Floating-Point Standard. +Information about the standard is available elsewhere. +

+ +

+The current version of SoftFloat is Release 3e. +This release modifies the behavior of the rarely used odd rounding mode +(round to odd, also known as jamming), and also adds some new +specialization and optimization examples for those compiling SoftFloat. +

+ +

+The previous Release 3d fixed bugs that were found in the square +root functions for the 64-bit, 80-bit, and +128-bit floating-point formats. +(Thanks to Alexei Sibidanov at the University of Victoria for reporting an +incorrect result.) +The bugs affected all prior Release-3 versions of SoftFloat +through 3c. +The flaw in the 64-bit floating-point square root function was of +very minor impact, causing a 1-ulp error (1 unit in +the last place) a few times out of a billion. +The bugs in the 80-bit and 128-bit square root +functions were more serious. +Although incorrect results again occurred only a few times out of a billion, +when they did occur a large portion of the less-significant bits could be +wrong. +

+ +

+Among earlier releases, 3b was notable for adding support for the +16-bit half-precision format. +For more about the evolution of SoftFloat releases, see +SoftFloat-history.html. +

+ +

+The functional interface of SoftFloat Release 3 and later differs +in many details from the releases that came before. +For specifics of these differences, see section 9 below, +Changes from SoftFloat Release 2. +

+ + +

2. Limitations

+ +

+SoftFloat assumes the computer has an addressable byte size of 8 or +16 bits. +(Nearly all computers in use today have 8-bit bytes.) +

+ +

+SoftFloat is written in C and is designed to work with other C code. +The C compiler used must conform at a minimum to the 1989 ANSI standard for the +C language (same as the 1990 ISO standard) and must in addition support basic +arithmetic on 64-bit integers. +Earlier releases of SoftFloat included implementations of 32-bit +single-precision and 64-bit double-precision floating-point that +did not require 64-bit integers, but this option is not supported +starting with Release 3. +Since 1999, ISO standards for C have mandated compiler support for +64-bit integers. +A compiler conforming to the 1999 C Standard or later is recommended but not +strictly required. +

+ +

+Most operations not required by the original 1985 version of the IEEE +Floating-Point Standard but added in the 2008 version are not yet supported in +SoftFloat Release 3e. +

+ + +

3. Acknowledgments and License

+ +

+The SoftFloat package was written by me, John R. Hauser. +Release 3 of SoftFloat was a completely new implementation +supplanting earlier releases. +The project to create Release 3 (now through 3e) was +done in the employ of the University of California, Berkeley, within the +Department of Electrical Engineering and Computer Sciences, first for the +Parallel Computing Laboratory (Par Lab) and then for the ASPIRE Lab. +The work was officially overseen by Prof. Krste Asanovic, with funding provided +by these sources: +

+ ++++ + + + + + + + + + +
Par Lab: +Microsoft (Award #024263), Intel (Award #024894), and U.C. Discovery +(Award #DIG07-10227), with additional support from Par Lab affiliates Nokia, +NVIDIA, Oracle, and Samsung. +
ASPIRE Lab: +DARPA PERFECT program (Award #HR0011-12-2-0016), with additional support from +ASPIRE industrial sponsor Intel and ASPIRE affiliates Google, Nokia, NVIDIA, +Oracle, and Samsung. +
+
+

+ +

+The following applies to the whole of SoftFloat Release 3e as well +as to each source file individually. +

+ +

+Copyright 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018 The Regents of the +University of California. +All rights reserved. +

+ +

+Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are met: +

    + +
  1. +

    +Redistributions of source code must retain the above copyright notice, this +list of conditions, and the following disclaimer. +

    + +
  2. +

    +Redistributions in binary form must reproduce the above copyright notice, this +list of conditions, and the following disclaimer in the documentation and/or +other materials provided with the distribution. +

    + +
  3. +

    +Neither the name of the University nor the names of its contributors may be +used to endorse or promote products derived from this software without specific +prior written permission. +

    + +
+

+ +

+THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS “AS IS”, +AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE +IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE +DISCLAIMED. +IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. +

+ + +

4. Types and Functions

+ +

+The types and functions of SoftFloat are declared in header file +softfloat.h. +

+ +

4.1. Boolean and Integer Types

+ +

+Header file softfloat.h depends on standard headers +<stdbool.h> and <stdint.h> to define type +bool and several integer types. +These standard headers have been part of the ISO C Standard Library since 1999. +With any recent compiler, they are likely to be supported, even if the compiler +does not claim complete conformance to the latest ISO C Standard. +For older or nonstandard compilers, a port of SoftFloat may have substitutes +for these headers. +Header softfloat.h depends only on the name bool from +<stdbool.h> and on these type names from +<stdint.h>: +

+
+uint16_t
+uint32_t
+uint64_t
+int32_t
+int64_t
+uint_fast8_t
+uint_fast32_t
+uint_fast64_t
+int_fast32_t
+int_fast64_t
+
+
+

+ + +

4.2. Floating-Point Types

+ +

+The softfloat.h header defines five floating-point types: +

+ + + + + + + + + + + + + + + + + + + + + +
float16_t16-bit half-precision binary format
float32_t32-bit single-precision binary format
float64_t64-bit double-precision binary format
extFloat80_t   80-bit double-extended-precision binary format (old Intel or +Motorola format)
float128_t128-bit quadruple-precision binary format
+
+The non-extended types are each exactly the size specified: +16 bits for float16_t, 32 bits for +float32_t, 64 bits for float64_t, and +128 bits for float128_t. +Aside from these size requirements, the definitions of all these types may +differ for different ports of SoftFloat to specific systems. +A given port of SoftFloat may or may not define some of the floating-point +types as aliases for the C standard types float, +double, and long double. +

+ +

+Header file softfloat.h also defines a structure, +struct extFloat80M, for the representation of +80-bit double-extended-precision floating-point values in memory. +This structure is the same size as type extFloat80_t and contains +at least these two fields (not necessarily in this order): +

+
+uint16_t signExp;
+uint64_t signif;
+
+
+Field signExp contains the sign and exponent of the floating-point +value, with the sign in the most significant bit (bit 15) and the +encoded exponent in the other 15 bits. +Field signif is the complete 64-bit significand of +the floating-point value. +(In the usual encoding for 80-bit extended floating-point, the +leading 1 bit of normalized numbers is not implicit but is stored +in the most significant bit of the significand.) +

+ +

4.3. Supported Floating-Point Functions

+ +

+SoftFloat implements these arithmetic operations for its floating-point types: +

+

+ +

+The following operations required by the 2008 IEEE Floating-Point Standard are +not supported in SoftFloat Release 3e: +

+

+ +

4.4. Non-canonical Representations in extFloat80_t

+ +

+Because the 80-bit double-extended-precision format, +extFloat80_t, stores an explicit leading significand bit, many +finite floating-point numbers are encodable in this type in multiple equivalent +forms. +Of these multiple encodings, there is always a unique one with the least +encoded exponent value, and this encoding is considered the canonical +representation of the floating-point number. +Any other equivalent representations (having a higher encoded exponent value) +are non-canonical. +For a value in the subnormal range (including zero), the canonical +representation always has an encoded exponent of zero and a leading significand +bit of 0. +For finite values outside the subnormal range, the canonical representation +always has an encoded exponent that is nonzero and a leading significand bit +of 1. +

+ +

+For an infinity or NaN, the leading significand bit is similarly expected to +be 1. +An infinity or NaN with a leading significand bit of 0 is again +considered non-canonical. +Hence, altogether, to be canonical, a value of type extFloat80_t +must have a leading significand bit of 1, unless the value is +subnormal or zero, in which case the leading significand bit and the encoded +exponent must both be zero. +

+ +

+SoftFloat’s functions are not guaranteed to operate as expected when +inputs of type extFloat80_t are non-canonical. +Assuming all of a function’s extFloat80_t inputs (if any) +are canonical, function outputs of type extFloat80_t will always +be canonical. +

+ +

4.5. Conventions for Passing Arguments and Results

+ +

+Values that are at most 64 bits in size (i.e., not the +80-bit or 128-bit floating-point formats) are in all +cases passed as function arguments by value. +Likewise, when an output of a function is no more than 64 bits, it +is always returned directly as the function result. +Thus, for example, the SoftFloat function for adding two 64-bit +floating-point values has this simple signature: +

+float64_t f64_add( float64_t, float64_t ); +
+

+ +

+The story is more complex when function inputs and outputs are +80-bit and 128-bit floating-point. +For these types, SoftFloat always provides a function that passes these larger +values into or out of the function indirectly, via pointers. +For example, for adding two 128-bit floating-point values, +SoftFloat supplies this function: +

+void f128M_add( const float128_t *, const float128_t *, float128_t * ); +
+The first two arguments point to the values to be added, and the last argument +points to the location where the sum will be stored. +The M in the name f128M_add is mnemonic for the fact +that the 128-bit inputs and outputs are “in memory”, +pointed to by pointer arguments. +

+ +

+All ports of SoftFloat implement these pass-by-pointer functions for +types extFloat80_t and float128_t. +At the same time, SoftFloat ports may also implement alternate versions of +these same functions that pass extFloat80_t and +float128_t by value, like the smaller formats. +Thus, besides the function with name f128M_add shown above, a +SoftFloat port may also supply an equivalent function with this signature: +

+float128_t f128_add( float128_t, float128_t ); +
+

+ +

+As a general rule, on computers where the machine word size is +32 bits or smaller, only the pass-by-pointer versions of functions +(e.g., f128M_add) are provided for types extFloat80_t +and float128_t, because passing such large types directly can have +significant extra cost. +On computers where the word size is 64 bits or larger, both +function versions (f128M_add and f128_add) are +provided, because the cost of passing by value is then more reasonable. +Applications that must be portable accross both classes of computers must use +the pointer-based functions, as these are always implemented. +However, if it is known that SoftFloat includes the by-value functions for all +platforms of interest, programmers can use whichever version they prefer. +

+ + +

5. Reserved Names

+ +

+In addition to the variables and functions documented here, SoftFloat defines +some symbol names for its own private use. +These private names always begin with the prefix +‘softfloat_’. +When a program includes header softfloat.h or links with the +SoftFloat library, all names with prefix ‘softfloat_’ +are reserved for possible use by SoftFloat. +Applications that use SoftFloat should not define their own names with this +prefix, and should reference only such names as are documented. +

+ + +

6. Mode Variables

+ +

+The following global variables control rounding mode, underflow detection, and +the 80-bit extended format’s rounding precision: +

+softfloat_roundingMode
+softfloat_detectTininess
+extF80_roundingPrecision +
+These mode variables are covered in the next several subsections. +For some SoftFloat ports, these variables may be per-thread (declared +thread_local), meaning that different execution threads have their +own separate copies of the variables. +

+ +

6.1. Rounding Mode

+ +

+All five rounding modes defined by the 2008 IEEE Floating-Point Standard are +implemented for all operations that require rounding. +Some ports of SoftFloat may also implement the round-to-odd mode. +

+ +

+The rounding mode is selected by the global variable +

+uint_fast8_t softfloat_roundingMode; +
+This variable may be set to one of the values +
+ + + + + + + + + + + + + + + + + + + + + + + + + +
softfloat_round_near_evenround to nearest, with ties to even
softfloat_round_near_maxMag  round to nearest, with ties to maximum magnitude (away from zero)
softfloat_round_minMaground to minimum magnitude (toward zero)
softfloat_round_minround to minimum (down)
softfloat_round_maxround to maximum (up)
softfloat_round_oddround to odd (jamming), if supported by the SoftFloat port
+
+Variable softfloat_roundingMode is initialized to +softfloat_round_near_even. +

+ +

+When softfloat_round_odd is the rounding mode for a function that +rounds to an integer value (either conversion to an integer format or a +‘roundToInt’ function), if the input is not already an +integer, the rounded result is the closest odd integer. +For other operations, this rounding mode acts as though the floating-point +result is first rounded to minimum magnitude, the same as +softfloat_round_minMag, and then, if the result is inexact, the +least-significant bit of the result is set to 1. +Rounding to odd is also known as jamming. +

+ +

6.2. Underflow Detection

+ +

+In the terminology of the IEEE Standard, SoftFloat can detect tininess for +underflow either before or after rounding. +The choice is made by the global variable +

+uint_fast8_t softfloat_detectTininess; +
+which can be set to either +
+softfloat_tininess_beforeRounding
+softfloat_tininess_afterRounding +
+Detecting tininess after rounding is usually better because it results in fewer +spurious underflow signals. +The other option is provided for compatibility with some systems. +Like most systems (and as required by the newer 2008 IEEE Standard), SoftFloat +always detects loss of accuracy for underflow as an inexact result. +

+ +

6.3. Rounding Precision for the 80-Bit Extended Format

+ +

+For extFloat80_t only, the rounding precision of the basic +arithmetic operations is controlled by the global variable +

+uint_fast8_t extF80_roundingPrecision; +
+The operations affected are: +
+extF80_add
+extF80_sub
+extF80_mul
+extF80_div
+extF80_sqrt +
+When extF80_roundingPrecision is set to its default value of 80, +these operations are rounded to the full precision of the 80-bit +double-extended-precision format, like occurs for other formats. +Setting extF80_roundingPrecision to 32 or to 64 causes the +operations listed to be rounded to 32-bit precision (equivalent to +float32_t) or to 64-bit precision (equivalent to +float64_t), respectively. +When rounding to reduced precision, additional bits in the result significand +beyond the rounding point are set to zero. +The consequences of setting extF80_roundingPrecision to a value +other than 32, 64, or 80 is not specified. +Operations other than the ones listed above are not affected by +extF80_roundingPrecision. +

+ + +

7. Exceptions and Exception Flags

+ +

+All five exception flags required by the IEEE Floating-Point Standard are +implemented. +Each flag is stored as a separate bit in the global variable +

+uint_fast8_t softfloat_exceptionFlags; +
+The positions of the exception flag bits within this variable are determined by +the bit masks +
+softfloat_flag_inexact
+softfloat_flag_underflow
+softfloat_flag_overflow
+softfloat_flag_infinite
+softfloat_flag_invalid +
+Variable softfloat_exceptionFlags is initialized to all zeros, +meaning no exceptions. +

+ +

+For some SoftFloat ports, softfloat_exceptionFlags may be +per-thread (declared thread_local), meaning that different +execution threads have their own separate instances of it. +

+ +

+An individual exception flag can be cleared with the statement +

+softfloat_exceptionFlags &= ~softfloat_flag_<exception>; +
+where <exception> is the appropriate name. +To raise a floating-point exception, function softfloat_raiseFlags +should normally be used. +

+ +

+When SoftFloat detects an exception other than inexact, it calls +softfloat_raiseFlags. +The default version of this function simply raises the corresponding exception +flags. +Particular ports of SoftFloat may support alternate behavior, such as exception +traps, by modifying the default softfloat_raiseFlags. +A program may also supply its own softfloat_raiseFlags function to +override the one from the SoftFloat library. +

+ +

+Because inexact results occur frequently under most circumstances (and thus are +hardly exceptional), SoftFloat does not ordinarily call +softfloat_raiseFlags for inexact exceptions. +It does always raise the inexact exception flag as required. +

+ + +

8. Function Details

+ +

+In this section, <float> appears in function names as +a substitute for one of these abbreviations: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
f16indicates float16_t, passed by value
f32indicates float32_t, passed by value
f64indicates float64_t, passed by value
extF80M   indicates extFloat80_t, passed indirectly via pointers
extF80indicates extFloat80_t, passed by value
f128Mindicates float128_t, passed indirectly via pointers
f128indicates float128_t, passed by value
+
+The circumstances under which values of floating-point types +extFloat80_t and float128_t may be passed either by +value or indirectly via pointers was discussed earlier in +section 4.5, Conventions for Passing Arguments and Results. +

+ +

8.1. Conversions from Integer to Floating-Point

+ +

+All conversions from a 32-bit or 64-bit integer, +signed or unsigned, to a floating-point format are supported. +Functions performing these conversions have these names: +

+ui32_to_<float>
+ui64_to_<float>
+i32_to_<float>
+i64_to_<float> +
+Conversions from 32-bit integers to 64-bit +double-precision and larger formats are always exact, and likewise conversions +from 64-bit integers to 80-bit +double-extended-precision and 128-bit quadruple-precision are also +always exact. +

+ +

+Each conversion function takes one input of the appropriate type and generates +one output. +The following illustrates the signatures of these functions in cases when the +floating-point result is passed either by value or via pointers: +

+
+float64_t i32_to_f64( int32_t a );
+
+
+void i32_to_f128M( int32_t a, float128_t *destPtr );
+
+
+

+ +

8.2. Conversions from Floating-Point to Integer

+ +

+Conversions from a floating-point format to a 32-bit or +64-bit integer, signed or unsigned, are supported with these +functions: +

+<float>_to_ui32
+<float>_to_ui64
+<float>_to_i32
+<float>_to_i64 +
+The functions have signatures as follows, depending on whether the +floating-point input is passed by value or via pointers: +
+
+int_fast32_t f64_to_i32( float64_t a, uint_fast8_t roundingMode, bool exact );
+
+
+int_fast32_t
+ f128M_to_i32( const float128_t *aPtr, uint_fast8_t roundingMode, bool exact );
+
+
+

+ +

+The roundingMode argument specifies the rounding mode for +the conversion. +The variable that usually indicates rounding mode, +softfloat_roundingMode, is ignored. +Argument exact determines whether the inexact +exception flag is raised if the conversion is not exact. +If exact is true, the inexact flag may +be raised; +otherwise, it will not be, even if the conversion is inexact. +

+ +

+A conversion from floating-point to integer format raises the invalid +exception if the source value cannot be rounded to a representable integer of +the desired size (32 or 64 bits). +In such circumstances, the integer result returned is determined by the +particular port of SoftFloat, although typically this value will be either the +maximum or minimum value of the integer format. +The functions that convert to integer types never raise the floating-point +overflow exception. +

+ +

+Because languages such as C require that conversions to integers +be rounded toward zero, the following functions are provided for improved speed +and convenience: +

+<float>_to_ui32_r_minMag
+<float>_to_ui64_r_minMag
+<float>_to_i32_r_minMag
+<float>_to_i64_r_minMag +
+These functions round only toward zero (to minimum magnitude). +The signatures for these functions are the same as above without the redundant +roundingMode argument: +
+
+int_fast32_t f64_to_i32_r_minMag( float64_t a, bool exact );
+
+
+int_fast32_t f128M_to_i32_r_minMag( const float128_t *aPtr, bool exact );
+
+
+

+ +

8.3. Conversions Among Floating-Point Types

+ +

+Conversions between floating-point formats are done by functions with these +names: +

+<float>_to_<float> +
+All combinations of source and result type are supported where the source and +result are different formats. +There are four different styles of signature for these functions, depending on +whether the input and the output floating-point values are passed by value or +via pointers: +
+
+float32_t f64_to_f32( float64_t a );
+
+
+float32_t f128M_to_f32( const float128_t *aPtr );
+
+
+void f32_to_f128M( float32_t a, float128_t *destPtr );
+
+
+void extF80M_to_f128M( const extFloat80_t *aPtr, float128_t *destPtr );
+
+
+

+ +

+Conversions from a smaller to a larger floating-point format are always exact +and so require no rounding. +

+ +

8.4. Basic Arithmetic Functions

+ +

+The following basic arithmetic functions are provided: +

+<float>_add
+<float>_sub
+<float>_mul
+<float>_div
+<float>_sqrt +
+Each floating-point operation takes two operands, except for sqrt +(square root) which takes only one. +The operands and result are all of the same floating-point format. +Signatures for these functions take the following forms: +
+
+float64_t f64_add( float64_t a, float64_t b );
+
+
+void
+ f128M_add(
+     const float128_t *aPtr, const float128_t *bPtr, float128_t *destPtr );
+
+
+float64_t f64_sqrt( float64_t a );
+
+
+void f128M_sqrt( const float128_t *aPtr, float128_t *destPtr );
+
+
+When floating-point values are passed indirectly through pointers, arguments +aPtr and bPtr point to the input +operands, and the last argument, destPtr, points to the +location where the result is stored. +

+ +

+Rounding of the 80-bit double-extended-precision +(extFloat80_t) functions is affected by variable +extF80_roundingPrecision, as explained earlier in +section 6.3, +Rounding Precision for the 80-Bit Extended Format. +

+ +

8.5. Fused Multiply-Add Functions

+ +

+The 2008 version of the IEEE Floating-Point Standard defines a fused +multiply-add operation that does a combined multiplication and addition +with only a single rounding. +SoftFloat implements fused multiply-add with functions +

+<float>_mulAdd +
+Unlike other operations, fused multiple-add is not supported for the +80-bit double-extended-precision format, +extFloat80_t. +

+ +

+Depending on whether floating-point values are passed by value or via pointers, +the fused multiply-add functions have signatures of these forms: +

+
+float64_t f64_mulAdd( float64_t a, float64_t b, float64_t c );
+
+
+void
+ f128M_mulAdd(
+     const float128_t *aPtr,
+     const float128_t *bPtr,
+     const float128_t *cPtr,
+     float128_t *destPtr
+ );
+
+
+The functions compute +(a × b) + + c +with a single rounding. +When floating-point values are passed indirectly through pointers, arguments +aPtr, bPtr, and +cPtr point to operands a, +b, and c respectively, and +destPtr points to the location where the result is stored. +

+ +

+If one of the multiplication operands a and +b is infinite and the other is zero, these functions raise +the invalid exception even if operand c is a quiet NaN. +

+ +

8.6. Remainder Functions

+ +

+For each format, SoftFloat implements the remainder operation defined by the +IEEE Floating-Point Standard. +The remainder functions have names +

+<float>_rem +
+Each remainder operation takes two floating-point operands of the same format +and returns a result in the same format. +Depending on whether floating-point values are passed by value or via pointers, +the remainder functions have signatures of these forms: +
+
+float64_t f64_rem( float64_t a, float64_t b );
+
+
+void
+ f128M_rem(
+     const float128_t *aPtr, const float128_t *bPtr, float128_t *destPtr );
+
+
+When floating-point values are passed indirectly through pointers, arguments +aPtr and bPtr point to operands +a and b respectively, and +destPtr points to the location where the result is stored. +

+ +

+The IEEE Standard remainder operation computes the value +a + − n × b, +where n is the integer closest to +a ÷ b. +If a ÷ b is exactly +halfway between two integers, n is the even integer closest to +a ÷ b. +The IEEE Standard’s remainder operation is always exact and so requires +no rounding. +

+ +

+Depending on the relative magnitudes of the operands, the remainder +functions can take considerably longer to execute than the other SoftFloat +functions. +This is an inherent characteristic of the remainder operation itself and is not +a flaw in the SoftFloat implementation. +

+ +

8.7. Round-to-Integer Functions

+ +

+For each format, SoftFloat implements the round-to-integer operation specified +by the IEEE Floating-Point Standard. +These functions are named +

+<float>_roundToInt +
+Each round-to-integer operation takes a single floating-point operand. +This operand is rounded to an integer according to a specified rounding mode, +and the resulting integer value is returned in the same floating-point format. +(Note that the result is not an integer type.) +

+ +

+The signatures of the round-to-integer functions are similar to those for +conversions to an integer type: +

+
+float64_t f64_roundToInt( float64_t a, uint_fast8_t roundingMode, bool exact );
+
+
+void
+ f128M_roundToInt(
+     const float128_t *aPtr,
+     uint_fast8_t roundingMode,
+     bool exact,
+     float128_t *destPtr
+ );
+
+
+When floating-point values are passed indirectly through pointers, +aPtr points to the input operand and +destPtr points to the location where the result is stored. +

+ +

+The roundingMode argument specifies the rounding mode to +apply. +The variable that usually indicates rounding mode, +softfloat_roundingMode, is ignored. +Argument exact determines whether the inexact +exception flag is raised if the conversion is not exact. +If exact is true, the inexact flag may +be raised; +otherwise, it will not be, even if the conversion is inexact. +

+ +

8.8. Comparison Functions

+ +

+For each format, the following floating-point comparison functions are +provided: +

+<float>_eq
+<float>_le
+<float>_lt +
+Each comparison takes two operands of the same type and returns a Boolean. +The abbreviation eq stands for “equal” (=); +le stands for “less than or equal” (≤); +and lt stands for “less than” (<). +Depending on whether the floating-point operands are passed by value or via +pointers, the comparison functions have signatures of these forms: +
+
+bool f64_eq( float64_t a, float64_t b );
+
+
+bool f128M_eq( const float128_t *aPtr, const float128_t *bPtr );
+
+
+

+ +

+The usual greater-than (>), greater-than-or-equal (≥), and not-equal +(≠) comparisons are easily obtained from the functions provided. +The not-equal function is just the logical complement of the equal function. +The greater-than-or-equal function is identical to the less-than-or-equal +function with the arguments in reverse order, and likewise the greater-than +function is identical to the less-than function with the arguments reversed. +

+ +

+The IEEE Floating-Point Standard specifies that the less-than-or-equal and +less-than comparisons by default raise the invalid exception if either +operand is any kind of NaN. +Equality comparisons, on the other hand, are defined by default to raise the +invalid exception only for signaling NaNs, not quiet NaNs. +For completeness, SoftFloat provides these complementary functions: +

+<float>_eq_signaling
+<float>_le_quiet
+<float>_lt_quiet +
+The signaling equality comparisons are identical to the default +equality comparisons except that the invalid exception is raised for any +NaN input, not just for signaling NaNs. +Similarly, the quiet comparison functions are identical to their +default counterparts except that the invalid exception is not raised for +quiet NaNs. +

+ +

8.9. Signaling NaN Test Functions

+ +

+Functions for testing whether a floating-point value is a signaling NaN are +provided with these names: +

+<float>_isSignalingNaN +
+The functions take one floating-point operand and return a Boolean indicating +whether the operand is a signaling NaN. +Accordingly, the functions have the forms +
+
+bool f64_isSignalingNaN( float64_t a );
+
+
+bool f128M_isSignalingNaN( const float128_t *aPtr );
+
+
+

+ +

8.10. Raise-Exception Function

+ +

+SoftFloat provides a single function for raising floating-point exceptions: +

+
+void softfloat_raiseFlags( uint_fast8_t exceptions );
+
+
+The exceptions argument is a mask indicating the set of +exceptions to raise. +(See earlier section 7, Exceptions and Exception Flags.) +In addition to setting the specified exception flags in variable +softfloat_exceptionFlags, the softfloat_raiseFlags +function may cause a trap or abort appropriate for the current system. +

+ + +

9. Changes from SoftFloat Release 2

+ +

+Apart from a change in the legal use license, Release 3 of +SoftFloat introduced numerous technical differences compared to earlier +releases. +

+ +

9.1. Name Changes

+ +

+The most obvious and pervasive difference compared to Release 2 +is that the names of most functions and variables have changed, even when the +behavior has not. +First, the floating-point types, the mode variables, the exception flags +variable, the function to raise exceptions, and various associated constants +have been renamed as follows: +

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
old name, Release 2:new name, Release 3:
float32float32_t
float64float64_t
floatx80extFloat80_t
float128float128_t
float_rounding_modesoftfloat_roundingMode
float_round_nearest_evensoftfloat_round_near_even
float_round_to_zerosoftfloat_round_minMag
float_round_downsoftfloat_round_min
float_round_upsoftfloat_round_max
float_detect_tininesssoftfloat_detectTininess
float_tininess_before_rounding    softfloat_tininess_beforeRounding
float_tininess_after_roundingsoftfloat_tininess_afterRounding
floatx80_rounding_precisionextF80_roundingPrecision
float_exception_flagssoftfloat_exceptionFlags
float_flag_inexactsoftfloat_flag_inexact
float_flag_underflowsoftfloat_flag_underflow
float_flag_overflowsoftfloat_flag_overflow
float_flag_divbyzerosoftfloat_flag_infinite
float_flag_invalidsoftfloat_flag_invalid
float_raisesoftfloat_raiseFlags
+
+

+ +

+Furthermore, Release 3 adopted the following new abbreviations for +function names: +

+ + + + + + + + + + + +
used in names in Release 2:    used in names in Release 3:
int32 i32
int64 i64
float32 f32
float64 f64
floatx80 extF80
float128 f128
+
+Thus, for example, the function to add two 32-bit floating-point +numbers, previously called float32_add in Release 2, +is now f32_add. +Lastly, there have been a few other changes to function names: +
+ + + + + + + + + + + + + + + + + + + + + +
used in names in Release 2:   used in names in Release 3:   relevant functions:
_round_to_zero_r_minMagconversions from floating-point to integer (section 8.2)
round_to_introundToIntround-to-integer functions (section 8.7)
is_signaling_nan    isSignalingNaNsignaling NaN test functions (section 8.9)
+
+

+ +

9.2. Changes to Function Arguments

+ +

+Besides simple name changes, some operations were given a different interface +in Release 3 than they had in Release 2: +

+

+ +

9.3. Added Capabilities

+ +

+With Release 3, some new features have been added that were not +present in Release 2: +

+

+ +

9.4. Better Compatibility with the C Language

+ +

+Release 3 of SoftFloat was written to conform better to the ISO C +Standard’s rules for portability. +For example, older releases of SoftFloat employed type conversions in ways +that, while commonly practiced, are not fully defined by the C Standard. +Such problematic type conversions have generally been replaced by the use of +unions, the behavior around which is more strictly regulated these days. +

+ +

9.5. New Organization as a Library

+ +

+Starting with Release 3, SoftFloat now builds as a library. +Previously, SoftFloat compiled into a single, monolithic object file containing +all the SoftFloat functions, with the consequence that a program linking with +SoftFloat would get every SoftFloat function in its binary file even if only a +few functions were actually used. +With SoftFloat in the form of a library, a program that is linked by a standard +linker will include only those functions of SoftFloat that it needs and no +others. +

+ +

9.6. Optimization Gains (and Losses)

+ +

+Individual SoftFloat functions have been variously improved in +Release 3 compared to earlier releases. +In particular, better, faster algorithms have been deployed for the operations +of division, square root, and remainder. +For functions operating on the larger 80-bit and +128-bit formats, extFloat80_t and +float128_t, code size has also generally been reduced. +

+ +

+However, because Release 2 compiled all of SoftFloat together as a +single object file, compilers could make optimizations across function calls +when one SoftFloat function calls another. +Now that the functions of SoftFloat are compiled separately and only afterward +linked together into a program, there is not usually the same opportunity to +optimize across function calls. +Some loss of speed has been observed due to this change. +

+ + +

10. Future Directions

+ +

+The following improvements are anticipated for future releases of SoftFloat: +

+

+ + +

11. Contact Information

+ +

+At the time of this writing, the most up-to-date information about SoftFloat +and the latest release can be found at the Web page +http://www.jhauser.us/arithmetic/SoftFloat.html. +

+ + + + -- cgit v1.2.3