v2.30 Intel Intelligent Storage Acceleration Library Release Notes
==================================================================

RELEASE NOTE CONTENTS
1. KNOWN ISSUES
2. FIXED ISSUES
3. CHANGE LOG & FEATURES ADDED

1. KNOWN ISSUES
----------------

* Perf tests do not run in Windows environment.

* 32-bit lib is not supported in Windows.

2. FIXED ISSUES
---------------
v2.30

* Intel CET support.
* Windows nasm support fix.

v2.28

* Fix documentation on gf_vect_mad(). Min length listed as 32 instead of
  required min 64 bytes.

v2.27

* Fix lack of install for pkg-config files

v2.26

* Fixes for sanitizer warnings.

v2.25

* Fix for nasm on Mac OS X/darwin.

v2.24

* Fix for crc32_iscsi(). Potential read-over for small buffer. For an input
  buffer length of less than 8 bytes and aligned to an 8 byte boundary,
  function could read past length. Previously had the possibility to cause a
  seg fault only for length 0 and invalid buffer passed. Calculated CRC is
  unchanged.

* Fix for compression/decompression of > 4GB files. For streaming compression
  of extremely large files, the total_out parameter would wrap and could
  potentially flag an otherwise valid lookback distance as being invalid.
  Total_out is still 32bit for zlib compatibility. No inconsistent compressed
  buffers were generated by the issue.

v2.23

* Fix for histogram generation base function.
* Fix library build warnings on macOS.
* Fix igzip to use bsf instruction when tzcnt is not available.

v2.22

* Fix ISA-L builds for other architectures. Base function and examples
  sanitized for non-IA builds.

* Fix fuzz test script to work with llvm 6.0 builtin libFuzz.

v2.20

* Inflate total_out behavior corrected for in-progress decompression.
  Previously total_out represented the total bytes decompressed into the output
  buffer or temp internal buffer. This is changed to be only the bytes put into
  the output buffer.

* Fixed issue with isal_create_hufftables_subset.
  Affects semi-dynamic compression use case when explicitly creating
  hufftables from histogram. The _hufftables_subset function could fail to
  generate length symbols for any lengths that were never seen.

v2.19

* Fix erasure code test that violates rs matrix bounds.
* Fix 0 length file and looping errors in igzip_inflate_test.

v2.18

* Mac OS X/darwin systems no longer require the --target=darwin config option.
  The autoconf canonical build should detect it.

v2.17

* Fix igzip using 32K window and a shared object.
* Fix igzip undefined instruction error on Nehalem.
* Fixed issue in crc performance tests where OS optimizations turned cold cache
  tests into warm tests.

v2.15

* Fix for windows register save in gf_6vect_mad_avx2.asm. Only affects windows
  versions of ec_encode_data_update() running with AVX2. A GP register was not
  properly restored, resulting in corruption on return.

v2.14

* Building in unit directories is no longer supported, removing the issue of
  leftover object files causing the top-level make build to fail.

v2.10

* Fix for windows register save overlap in gf_{3-6}vect_dot_prod_sse.asm. Only
  affects windows versions of erasure code. GP register saves/restores were
  pushed to the same stack area as XMM.

3. CHANGE LOG & FEATURES ADDED
------------------------------
v2.30

* Igzip compression enhancements.
  - New functions for dictionary acceleration. Split dictionary processing and
    resetting can greatly accelerate the performance of compressing many small
    files with a dictionary.
  - New static level 0 header decode tables. Accelerates decompressing small
    files that are level 0 compressed by skipping the known header parsing.
  - New feature for igzip cli tool: support for concatenated .gz files. On
    decompression, igzip will process a series of independent, concatenated .gz
    files into one output stream.

* CRC Improvements
  - New vclmul version of crc32_iscsi().
  - Updates for aarch64.

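  The concatenated .gz behavior noted for v2.30 can be illustrated with a short
  sketch. It uses Python's standard gzip module rather than igzip itself, so it
  only demonstrates the format property (independent gzip members joined
  end-to-end decode to a single output stream), not igzip's implementation. The
  variable names are illustrative.

```python
import gzip

# Two independent gzip streams, as if produced by compressing two separate files.
part_a = gzip.compress(b"first member\n")
part_b = gzip.compress(b"second member\n")

# Equivalent to `cat a.gz b.gz > combined.gz`: plain byte concatenation.
combined = part_a + part_b

# A multi-member-aware decoder returns both payloads as one output stream.
restored = gzip.decompress(combined)
print(restored)
```
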
v2.29

* CRC Improvements
  - New AVX512 vclmul versions of crc16_t10dif(), crc32_ieee(),
    crc32_gzip_refl().

* Erasure code improvements
  - Added AVX512 ec functions with 5 and 6 outputs. Can improve performance for
    codes with 5 or more parity by running in batches of up to 6 at a time.

v2.28

* New next-arch versions of 64-bit CRC. All norm and reflected 64-bit
  polynomials are expanded to utilize vpclmulqdq.

v2.27

* New multi-threaded compression option for igzip cli tool.

v2.26

* Adler32 added to external API.
* Multi-arch improvements.
* Performance test improvements.

v2.25

* Igzip performance improvements and features.
  - Performance improvements for incompressible files. Random or
    incompressible files can be up to 3x faster in level 1 or 2 compression.
  - Additional small file performance improvements.
  - New options in igzip cli: use name from header or not, test compressed file.

* Multi-arch autoconf script.
  - Autoconf should detect architecture and run base functions at minimum.

v2.24

* Igzip small file performance improvements and new features.
  - Better performance on small files.
  - New gzip/zlib header and trailer handling.
  - New gzip/zlib header parsing helper functions.
  - New user-space compression/decompression tool igzip.

* New mem unit added with first function isal_zero_detect().

v2.23

* Igzip inflate (decompression) performance improvements.
  - Implemented multi-byte decode for inflate. Decode can pack up to three
    symbols into the decode table making some compressed streams decompress much
    faster depending on the prevalence of short codes.

v2.22

* Igzip: AVX2 version of level 3 compression added.

* Erasure code examples
  - New examples for standard EC encode and decode.
  - Example of piggyback EC encode and decode.

v2.21

* Igzip improvements
  - New compression levels added. ISA-L fast deflate now has more levels to
    balance speed vs. target compression level. Level 0, 1 are as in previous
    generations.
    New levels 2 & 3 target higher compression roughly comparable to zlib
    levels 2-3. Level 3 is currently only optimized for processors with AVX512
    instructions.

* New T10dif & copy function - crc16_t10dif_copy()
  - CRC and copy was added to emulate T10dif operations such as DIF insert and
    strip. This function stitches together CRC and memcpy operations,
    eliminating an extra data read.

* CRC32 iscsi performance improvements
  - Fixes issue under some distributions where warm cache performance was
    reduced.

v2.20

* Igzip improvements
  - Optimized deflate_hash in compression functions. Improves performance of
    using preset dictionary.
  - Removed alignment restrictions on input structure.

v2.19

* Igzip improvements
  - Add optimized Adler-32 checksum.
  - Implement zlib compression format.
  - Add stateful dictionary support.
  - Add struct reset functions for both deflate and inflate.

* Reflected IEEE format CRC32 is released. The function is named
  crc32_gzip_refl().

* The exact working conditions of the erasure code Reed-Solomon matrix are
  determined by the newly added program gen_rs_matrix_limits.

v2.18

* New 2-pass fully-dynamic deflate compression (level 1). ISA-L fast deflate
  now has two levels. Level 0 (default) is the same as previous generations.
  Setting to level 1 will switch to the fully-dynamic compression that will
  typically reach higher compression ratios.

* RAID AVX512 functions.

v2.17

* New fast decompression (inflate)

* Compression improvements (deflate)
  - Speed and compression ratio improvements.
  - Fast custom Huffman code generation.
  - New features:
    * Run-time option of gzip crc calculation and headers/trailer.
    * Choice of static header (BTYPE 01) blocks.
    * LARGE_WINDOW, 32K history, now default.
    * Stateless full flush mode.

* CRC64
  - Six new 64-bit polynomials supported. Normal and reflected versions of ECMA,
    ISO and Jones polynomials.

v2.16

* Units added: crc, raid, igzip (deflate compression).

v2.15

* Erasure code updates. New AVX512 versions.

* Nasm support.
  ISA-L ported to build with nasm or yasm assembler.

* Windows DLL support. Windows builds DLL by default.

v2.14

* Autoconf and autotools build allows easier porting to additional systems.
  Previous make system still available to embedded users with Makefile.unx.

* Includes update for building on Mac OS X/darwin systems. Add --target=darwin
  to ./configure step.

v2.13

* Erasure code improvements
  - 32-bit port of optimized gf_vect_dot_prod() functions. This makes
    ec_encode_data() functions much faster on 32-bit processors.
  - Avoton performance improvements. Performance on Avoton for
    gf_vect_dot_prod() and ec_encode_data() can improve by as much as 20%.

v2.11

* Incremental erasure code. New functions added to erasure code to handle
  single source update of code blocks. The function ec_encode_data_update()
  works with parameters similar to ec_encode_data() but is called incrementally
  with each source block. These functions are useful when source blocks are not
  all available at once.

v2.10

* Erasure code updates
  - New AVX and AVX2 support functions.
  - Changes min len requirement on gf_vect_dot_prod() to 32 from 16.
  - Tests include both source and parity recovery with ec_encode_data().
  - New encoding examples with Vandermonde or Cauchy matrix.

v2.8

* First open release of erasure code unit that is part of ISA-L.
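
The incremental erasure code note under v2.11 can be illustrated with a small
pure-Python model. This is a sketch of the arithmetic only, not ISA-L code: the
helper names gf_mul, encode_all and encode_update are hypothetical, the
coefficient matrix is an arbitrary example, and the GF(2^8) reduction polynomial
0x11D is an assumption. It shows why folding in source blocks one at a time, in
any order, reaches the same parity as encoding all blocks at once.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply two bytes in GF(2^8) by shift-and-add (polynomial is an assumption)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly  # reduce back into 8 bits
        b >>= 1
    return result


def encode_all(coeffs, sources):
    """All-at-once model: parity[r][i] = XOR over j of coeffs[r][j] * sources[j][i]."""
    length = len(sources[0])
    parity = [bytearray(length) for _ in coeffs]
    for r, row in enumerate(coeffs):
        for j, src in enumerate(sources):
            for i in range(length):
                parity[r][i] ^= gf_mul(row[j], src[i])
    return parity


def encode_update(coeffs, vec_i, source, parity):
    """Incremental model: fold one source block (index vec_i) into existing parity."""
    for r, row in enumerate(coeffs):
        for i, byte in enumerate(source):
            parity[r][i] ^= gf_mul(row[vec_i], byte)


# Hypothetical 2-parity, 3-source example.
coeffs = [[1, 1, 1], [1, 2, 4]]
sources = [b"abcd", b"efgh", b"ijkl"]

full = encode_all(coeffs, sources)

# Parity buffers start zeroed; blocks arrive out of order.
running = [bytearray(4) for _ in coeffs]
for vec_i in (2, 0, 1):
    encode_update(coeffs, vec_i, sources[vec_i], running)

matches = running == full
print(matches)
```

Because addition in GF(2^8) is XOR, each source block's contribution is
independent of the others, which is what allows the per-block update.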