diff options
Diffstat (limited to 'TODO')
-rw-r--r-- | TODO | 109 |
1 files changed, 109 insertions, 0 deletions
@@ -0,0 +1,109 @@ + +XZ Utils To-Do List +=================== + +Known bugs +---------- + + The test suite is too incomplete. + + If the memory usage limit is less than about 13 MiB, xz is unable to + automatically scale down the compression settings enough even though + it would be possible by switching from BT2/BT3/BT4 match finder to + HC3/HC4. + + XZ Utils compress some files significantly worse than LZMA Utils. + This is due to faster compression presets used by XZ Utils, and + can often be worked around by using "xz --extreme". With some files + --extreme isn't enough though: it's most likely with files that + compress extremely well, so going from compression ratio of 0.003 + to 0.004 means big relative increase in the compressed file size. + + xz doesn't quote unprintable characters when it displays file names + given on the command line. + + tuklib_exit() doesn't block signals => EINTR is possible. + + SIGTSTP is not handled. If xz is stopped, the estimated remaining + time and calculated (de)compression speed won't make sense in the + progress indicator (xz --verbose). + + If liblzma has created threads and fork() gets called, liblzma + code will break in the child process unless it calls exec() and + doesn't touch liblzma. + + +Missing features +---------------- + + Add support for storing metadata in .xz files. A preliminary + idea is to create a new Stream type for metadata. When both + metadata and data are wanted in the same .xz file, two or more + Streams would be concatenated. + + The state stored in lzma_stream should be cloneable, which would + be mostly useful when using a preset dictionary in LZMA2, but + it may have other uses too. Compare to deflateCopy() in zlib. + + Support LZMA_FINISH in raw decoder to indicate end of LZMA1 and + other streams that don't have an end of payload marker. + + Adjust dictionary size when the input file size is known. + Maybe do this only if an option is given. + + xz doesn't support copying extended attributes, access control + lists etc. from source to target file. + + Multithreaded compression: + - Reduce memory usage of the current method. + - Implement threaded match finders. + - Implement pigz-style threading in LZMA2. + + Buffer-to-buffer coding could use less RAM (especially when + decompressing LZMA1 or LZMA2). + + I/O library is not implemented (similar to gzopen() in zlib). + It will be a separate library that supports uncompressed, .gz, + .bz2, .lzma, and .xz files. + + Support changing lzma_options_lzma.mode with lzma_filters_update(). + + Support LZMA_FULL_FLUSH for lzma_stream_decoder() to stop at + Block and Stream boundaries. + + lzma_strerror() to convert lzma_ret to human readable form? + This is tricky, because the same error codes are used with + slightly different meanings, and this cannot be fixed anymore. + + Make it possible to adjust LZMA2 options in the middle of a Block + so that the encoding speed vs. compression ratio can be optimized + when the compressed data is streamed over network. + + Improved BCJ filters. The current filters are small but they aren't + so great when compressing binary packages that contain various file + types. Specifically, they make things worse if there are static + libraries or Linux kernel modules. The filtering could also be + more effective (without getting overly complex), for example, + streamable variant BCJ2 from 7-Zip could be implemented. + + Filter that autodetects specific data types in the input stream + and applies appropriate filters for the corrects parts of the input. + Perhaps combine this with the BCJ filter improvement point above. + + Long-range LZ77 method as a separate filter or as a new LZMA2 + match finder. + + +Documentation +------------- + + More tutorial programs are needed for liblzma. + + Document the LZMA1 and LZMA2 algorithms. + + +Miscellaneous +------------ + + Try to get the media type for .xz registered at IANA. + |