Diffstat (limited to 'src/grep/ChangeLog')
-rw-r--r-- src/grep/ChangeLog 12542
1 files changed, 12542 insertions, 0 deletions
diff --git a/src/grep/ChangeLog b/src/grep/ChangeLog
new file mode 100644
index 0000000..3151cc2
--- /dev/null
+++ b/src/grep/ChangeLog
@@ -0,0 +1,12542 @@
+2021-08-14 Jim Meyering <meyering@fb.com>
+
+ version 3.7
+ * NEWS: Record release date.
+
+2021-08-09 Jim Meyering <meyering@fb.com>
+
+ tests: provide an awk-based seq replacement
+ ...so we can continue to use seq, via the wrapper when needed.
+ * tests/init.cfg (seq): Some systems lack seq.
+ Provide a replacement.
+ * tests/hash-collision-perf: Use seq once again.
+ * tests/long-pattern-perf: Likewise. And remove a comment about seq.
+
+2021-08-09 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: simplify EGexecute
+ * src/dfasearch.c (EGexecute): Remove a label and goto.
+ This also makes the machine code a bit shorter, on x86-64 gcc.
+
+ grep: simplify data movement slightly
+ * src/grep.c (fillbuf): Simplify movement of saved data.
+
+ grep: pointer-integer cast nit
+ * src/grep.c (ALIGN_TO): When converting pointers to unsigned
+ integers, convert to uintptr_t not size_t, as size_t in theory
+ might be too narrow.
+
+ tests: use awk, not seq
+ Portability problem reported by Dagobert Michelsen in:
+ https://lists.gnu.org/r/grep-devel/2021-08/msg00004.html
+ * tests/hash-collision-perf, tests/long-pattern-perf:
+ Don’t assume seq is installed; use awk instead.
+
+2021-08-08 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+
+ build: update gnulib to latest
+
+2021-08-06 Kevin Locke <kevin@kevinlocke.name>
+
+ doc: usage: --group-separator/--no-group-separator
+ * src/grep.c (usage): Document --group-separator
+ and --no-group-separator.
+
+ doc: man: add --group-separator/--no-group-separator
+ * doc/grep.in.1:
+ Add copy of docs for --group-separator from doc/grep.texi.
+ Add copy of docs for --no-group-separator from doc/grep.texi.
+
+2021-08-06 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+
+2021-06-19 Mateusz Okulus <mmokulus@gmail.com>
+
+ doc: note that -H is a GNU extension in man page, too
+ * doc/grep.in.1 (-H): Mention that this is a GNU extension.
+
+2021-06-13 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2021-06-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2021-06-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: improve examples and wording
+ * doc/grep.texi (The Backslash Character and Special Expressions)
+ (Usage): Improve doc (Bug#48948).
+
+2021-01-31 Jim Meyering <meyering@fb.com>
+
+ doc: man: fix -L description and improve -l's
+ * doc/grep.texi (-L): Remove erroneous sentence about stopping early.
+ With -L, grep cannot stop scanning early.
+ (-l): Tweak existing wording.
+ * doc/grep.in.1: Remove the -L sentence here, too.
+ (-l): Copy the sentence from grep.texi, to clarify: it's only per-file
+ scanning that stops upon match. Reported by Robert Bruntz
+ in http://debbugs.gnu.org/46179
+
+2021-01-05 Jim Meyering <meyering@fb.com>
+
+ build: avoid long-string warnings in gnulib tests
+ * configure.ac (GNULIB_TEST_WARN_CFLAGS): Add
+ -Woverlength-strings to avoid clang warnings.
+
+2021-01-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: further clarify regexp structure
+ * doc/grep.texi (Fundamental Structure)
+ (Back-references and Subexpressions, Basic vs Extended):
+ Further clarifications.
+
+ maint: copy bootstrap, tests/init.sh from Gnulib
+
+ doc: update grep.texi cite to 2021
+
+ maint: run "make update-copyright"
+
+ build: update gnulib submodule to latest
+
+2020-12-30 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+ * gnulib: update for clang-10 warning-avoidance
+ fixes in hash and regex-tests.
+
+ maint: add parentheses to avoid new clang-10 warning
+ * src/dfasearch.c (regex_compile): Parenthesize arith-OR vs
+ ternary, to placate clang-10.
+
+2020-12-29 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: clarify special chars and }
+ * doc/grep.texi (Fundamental Structure)
+ (Character Classes and Bracket Expressions)
+ (The Backslash Character and Special Expressions, Anchoring)
+ (Basic vs Extended): Clarify which characters are special,
+ and why \ is needed before } in grep even though } is not special.
+ Use Posix terminology for ordinary and special characters and for
+ interval expressions.
+
+2020-12-29 Marek Suppa <mr@shu.io>
+
+ doc: fix missing right curly brace
+ * doc/grep.texi (Basic vs Extended Regular Expressions): Mention that
+ the right curly brace (}) meta-character must be backslash-escaped.
+ It had been omitted from the list.
+
+2020-12-25 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+
+ grep: use of --unix-byte-offsets (-u) now elicits a warning
+ * NEWS (Change in behavior): Mention this.
+ * src/grep.c (main): Warn about each use of obsolete
+ --unix-byte-offsets (-u).
+ * doc/grep.in.1 (-u): Remove its documentation.
+
+2020-12-23 Helge Kreutzmann <debian@helgefjell.de>
+
+ doc: adjust man page syntax
+ * doc/grep.in.1: Mark some manual names with B<...>.
+ Mark PATTERNS with I<...>.
+ Drop final period in SEE ALSO.
+ With suggestions from several members of the manpage-l10n
+ translation community. This resolves https://bugs.gnu.org/45353
+
+2020-11-26 Jim Meyering <meyering@fb.com>
+
+ grep: avoid performance regression with many patterns
+ * src/grep.c (hash_pattern): Switch from PJW to DJB2, to avoid an
+ O(N) to O(N^2) performance regression due to hash collisions with
+ patterns from e.g., seq 500000|tr 0-9 A-J
+ Reported by Frank Heckenbach in https://bugs.gnu.org/44754
+ * NEWS (Bug fixes): Mention it.
+ * tests/hash-collision-perf: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+ build: update gnulib to latest for warning fixes
+ * gnulib: Update submodule to latest.
+ * src/grep.c (printf_errno): Reflect gnulib's renaming: change
+ _GL_ATTRIBUTE_FORMAT_PRINTF to
+ _GL_ATTRIBUTE_FORMAT_PRINTF_STANDARD
+
+ tests: enable warnings for the gnulib-tests subdir
+ * gnulib-tests/Makefile.am (AM_CFLAGS): Enable gnulib
+ warning options for these tests.
+ * configure.ac (GNULIB_TEST_WARN_CFLAGS): Disable the same three
+ warning options that coreutils does, and a few more for GCC11.
+
+2020-11-08 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 3.6
+ * NEWS: Record release date.
+
+2020-11-05 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest for test improvements
+
+2020-11-03 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest for C++-ready dfa.h and test-verify.c fix
+
+2020-11-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: remove GREP_OPTIONS
+ * NEWS: Mention this.
+ * doc/grep.in.1:
+ Remove GREP_OPTIONS documentation.
+ * doc/grep.texi (Environment Variables):
+ Move GREP_OPTIONS stuff into a “no longer implemented” paragraph.
+ * src/grep.c (prepend_args, prepend_default_options): Remove.
+ (main): Do not look at GREP_OPTIONS.
+ * tests/Makefile.am (TESTS_ENVIRONMENTS):
+ * tests/init.cfg (vars_): Remove GREP_OPTIONS.
+
+2020-11-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: use RE_NO_SUB when calling regex solely to check syntax
+ * src/dfasearch.c (regex_compile): New parameter. All callers changed.
+ (GEAcompile): Move setting syntax for regex into regex_compile()
+ function. This addresses a performance problem exposed by extreme
+ regular expressions, as described in https://bugs.gnu.org/43862 .
+
+ tests: add the test for bugfix in gnulib's dfa
+ * tests/ere.tests: Add new test.
+
+2020-11-01 Jim Meyering <meyering@fb.com>
+
+ grep: avoid erroneous matches for e.g., a+a+a+
+ * gnulib: Update to latest, for dfa's invalid-merge fix.
+ * NEWS (Bug fixes): Mention this.
+
+2020-10-11 Jim Meyering <meyering@fb.com>
+
+ grep: -P: report input filename upon PCRE execution failure
+ Without this, it could be tedious to determine which input
+ file evokes a PCRE-execution-time failure.
+ * src/pcresearch.c (Pexecute): When failing, include the
+ error-provoking file name in the diagnostic.
+ * src/grep.c (input_filename): Make extern, since used above.
+ * src/search.h (input_filename): Declare.
+ * tests/filename-lineno.pl: Test for this.
+ ($no_pcre): Factor out.
+ * NEWS (Bug fixes): Mention this.
+
+2020-10-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor kwset cleanups
+ * src/kwsearch.c (Fexecute):
+ Assume C99 to put declarations nearer uses.
+ * src/kwset.c (bmexec): Omit unnecessary test.
+ * src/kwset.h (struct kwsmatch): Make OFFSET and SIZE individual
+ elements, not arrays of size 1 (a revenant of an earlier API).
+ All uses changed.
+
+2020-10-11 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: remove unused code
+ * src/kwsearch.c (Fcompile, Fexecute): Remove unused code. These are
+ no longer used after commit 016e590a8198009bce0e1078f6d4c7e037e2df3c.
+
+2020-10-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2020-10-05 Jim Meyering <meyering@fb.com>
+
+ tests: correct filename-lineno.pl
+ * tests/filename-lineno.pl: Remove a stray envvar
+ that somehow slipped into expected output string.
+
+2020-10-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: fix tests when PCRE is not used
+ * tests/Makefile.am (TESTS_ENVIRONMENT):
+ Set PATH before setting PCRE_WORKS, so that the latter test
+ uses the just-built grep.
+ * tests/filename-lineno.pl (invalid-re-P-paren)
+ (invalid-re-P-star-paren): Adjust non-PCRE case to match
+ recently-changed behavior.
+
+ build: update gnulib submodule to latest
+
+2020-10-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: document --include/--exclude better
+ Problem reported by John Ruckstuhl (Bug#43782).
+ * doc/grep.texi (File and Directory Selection):
+ Document what happens if contradictory options are given,
+ or if no option matches a file name.
+ * doc/grep.in.1:
+
+2020-10-01 Jim Meyering <meyering@fb.com>
+
+ maint: add technically-required quotes
+ * configure.ac: Quote args of AC_CONFIG_AUX_DIR, AC_CONFIG_SRCDIR
+ and AC_CHECK_FUNCS_ONCE.
+
+2020-09-28 Jim Meyering <meyering@fb.com>
+
+ tests: restore deleted -P tests
+ v3.4-almost-45-g8577dda deleted these two -P-using tests because a
+ grep built without PCRE support would fail those tests. This sets
+ an envvar with the equivalent of the result from the require_pcre_
+ function and restores the now-guarded tests. Tested by running this:
+ ./configure --disable-perl-regexp && make check
+ * tests/Makefile.am (PCRE_WORKS): Set this envvar.
+ * tests/filename-lineno.pl: Restore invalid-re-P-paren and
+ invalid-re-P-star-paren, now each with a guard.
+
+2020-09-27 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 3.5
+ * NEWS: Record release date.
+
+ maint: avoid autoconf warnings
+ * configure.ac (AC_HEADER_STDC): Remove. It's been assumed for ages.
+ * m4/pcre.m4 (gl_FUNC_PCRE): Use AS_HELP_STRING, not AC_HELP_STRING.
+
+ build: update gnulib to latest
+
+2020-09-26 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+
+ tests: skip stack-overflow test when built with ASAN
+ * tests/stack-overflow: Skip this test when the binary was built
+ with ASAN, to avoid spurious failures.
+
+2020-09-25 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+ build: update gnulib submodule to latest
+
+2020-09-24 Jim Meyering <meyering@fb.com>
+
+ tests: fix surrogate-pair test to work on 16-bit wchar_t systems
+ * tests/surrogate-pair: Avoid new failure on systems with
+ 16-bit wchar_t. Detect the condition and exit before the
+ otherwise-failing tests. Remove the now-incorrect in-loop
+ test for that alternate failure mode. This was exposed by
+ testing on gcc119.fsffrance.org, a power8 AIX 7.2 system.
+
+2020-09-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: don't assume PCRE in tests
+ * tests/filename-lineno.pl: Remove invalid-re-P-paren and
+ invalid-re-P-star-paren as they assume PCRE support, which
+ causes a false alarm "grep: Perl matching not supported in a
+ --disable-perl-regexp build" on platforms without PCRE.
+
+ grep: pacify Sun C 5.15
+ This suppresses a false alarm '"grep.c", line 720: warning:
+ initializer will be sign-extended: -1'.
+ * src/grep.c (uword_max): New static constant.
+ (initialize_unibyte_mask): Use it.
+
+2020-09-23 Paul Eggert <eggert@cs.ucla.edu>
+ Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: fix more Turkish-eyes bugs
+ Fix more bugs recently uncovered by Norihiro Tanaka (Bug#43577).
+ * NEWS: Mention new bug report.
+ * src/grep.c (ok_fold): New static var.
+ (setup_ok_fold): New function.
+ (fgrep_icase_charlen): Reject single-byte characters
+ if they match some multibyte characters when ignoring case.
+ This part of the patch is partly derived from
+ <https://bugs.gnu.org/43577#14>, which means it is:
+ (main): Call setup_ok_fold if ok_fold might be needed.
+ * src/searchutils.c (kwsinit): With the grep.c changes,
+ this code can now revert to classic 7th Edition Unix style;
+ aborting would be wrong.
+ * tests/turkish-eyes: Add tests for these bugs.
+
+2020-09-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+ * NEWS: Mention Bug#43577, which this fixes.
+
+ grep: fix recently-introduced performance glitch
+ * src/grep.c (main): Do not double-increment n_patterns.
+ update_patterns increments n_patterns now; do not increment it
+ again, as the incorrect count would hurt performance heuristics later.
+
+2020-09-22 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: improve --line-buffered doc
+ * doc/grep.texi (Other Options): Document --line-buffered more
+ carefully, and say what happens when it is not used. Problem
+ reported by Dan Jacobson (Bug#35339).
+
+ tests: port timeout test to Alpine
+ Problem reported by Bruno Haible in:
+ https://lists.gnu.org/r/grep-devel/2020-09/msg00080.html
+ * tests/init.cfg (require_timeout_): Check that ‘timeout 0.01
+ sleep 0.02’ works as expected, to avoid spurious test failure
+ on Alpine.
+
+2020-09-22 Jim Meyering <meyering@fb.com>
+
+ tests: test for many-regexp N^2 RSS regression
+ * tests/many-regex-performance: New test for this performance
+ regression.
+ * tests/Makefile.am: Add it.
+ * NEWS (Bug fixes): Describe it.
+
+2020-09-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: avoid unnecessary regex compilation
+ Grep resorts to using the regex engine when the precision of either
+ -o or --color is required, or when the pattern is not supported by
+ our DFA engine (e.g., backref). Otherwise, grep would perform regex
+ compilation solely to check the syntax. This change makes grep skip
+ that compilation in the common case for which it is unnecessary.
+
+ The compilation we are avoiding is quite costly, consuming O(N^2)
+ RSS for N regular expressions.
+
+ * src/dfasearch.c (GEAcompile): Add new argument, and avoid unneeded
+ compilation of regex.
+ * src/grep.c (compile_fp_t): Update prototype.
+ (main): Update caller.
+ * src/kwsearch.c (Fcompile): Update caller and add new argument.
+ * src/pcresearch.c (Pcompile): Add new argument.
+ * src/search.h (GEAcompile, Fcompile, Pcompile): Update prototype.
+
+2020-09-22 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+
+ tests: skip stack-overflow test on midnightbsd*
+ * tests/stack-overflow: skip_ when run on this OS. See details
+ in https://lists.gnu.org/r/grep-devel/2020-09/msg00062.html
+ * tests/Makefile.am (host_triplet): Export.
+
+2020-09-21 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: say how to match chars by code
+ From a suggestion in Bug#41004.
+ * doc/grep.texi (Character Encoding, Matching Non-ASCII):
+ New sections. Move some material from Environment Variables
+ into these sections.
+
+2020-09-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ * src/dfasearch.c (struct dfa_comp): Fix out-of-date comment.
+
+ grep: "grep '\)'" reports an error again
+ * src/grep.c (try_fgrep_pattern): With -G, pass \) through to
+ GEAcompile so that it can complain. This fixes an unexpected
+ change in behavior from grep 3.4 and earlier.
+ * tests/filename-lineno.pl: Add tests for this sort of thing.
+
+ grep: tweak by using mempcpy
+ * src/grep.c (try_fgrep_pattern): Tweak previous change
+ by using mempcpy.
+
+2020-09-18 Jim Meyering <meyering@fb.com>
+
+ grep: make echo .|grep '\.' match once again
+ The same applied for many other backslash-escaped bytes, not just
+ metacharacters. The switch to rawmemchr in v3.4-almost-10-g9393b97
+ made some parts of the code require the usually-guaranteed newline
+ sentinel at the end of each pattern. Before, some consumers used a
+ (correct) pattern length and did not care that try_fgrep_pattern could
+ transform a pattern (with sentinel) like "\\.\n" to "..\n", thus
+ violating that assumption.
+ * src/grep.c (try_fgrep_pattern): Preserve the invariant
+ that each regexp is newline-terminated.
+ * tests/backslash-dot: New file. Test for this.
+ * tests/Makefile.am (TESTS): Add it.
+
+ tests: triple-backref: print a reference to glibc bug
+ * tests/triple-backref (MALLOC_CHECK_): And tell glibc not to
+ bother with a core dump. Suggested by Pádraig Brady.
+
+2020-09-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: be more consistent about diagnostic format
+ * NEWS: Mention this.
+ * bootstrap.conf (gnulib_modules): Remove 'quote'.
+ * src/grep.c: Do not include quote.h.
+ (grep, grepdirent, grepdesc): Put the three unusual diagnostics
+ into the same "grep: FOO: message" form that grep uses elsewhere.
+ * tests/binary-file-matches, tests/in-eq-out-infloop:
+ Adjust tests to match new diagnostic format.
+
+2020-09-17 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+
+2020-09-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ * tests/triple-backref: Add comment.
+
+2020-09-17 Jim Meyering <meyering@fb.com>
+
+ tests: make new test executable, to placate distcheck
+ * tests/binary-file-matches: Make this executable.
+
+ tests: add coverage for code that emits the new diagnostic
+ * tests/binary-file-matches: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+ maint: avoid syntax-check failure
+ * src/grep.c (grep): Lower-case the "B" in "Binary file... matches"
+ diagnostic that we now emit to stderr. This avoids the following
+ when running "make syntax-check":
+ maint.mk: found capitalized error message
+ make: *** [maint.mk:469: sc_error_message_uppercase] Error 1
+
+2020-09-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ Send "Binary file FOO matches" to stderr
+ * NEWS, doc/grep.texi: Mention this change (Bug#29668).
+ * src/grep.c (grep): Send "Binary file FOO matches" to stderr
+ instead of stdout.
+ * tests/encoding-error, tests/invalid-multibyte-infloop:
+ * tests/null-byte, tests/pcre-count, tests/surrogate-pair:
+ * tests/symlink, tests/unibyte-binary:
+ Adjust tests to match new behavior. In all cases this
+ simplifies the tests, which is a good sign.
+
+ Suppress "Binary file FOO matches" if -I
+ Problem reported by Jason Franklin (Bug#33552).
+ * NEWS: Mention this.
+ * src/grep.c (grep): Do not output "Binary file FOO matches" if -I.
+ * tests/encoding-error: Add test for this bug.
+
+2020-09-15 Jim Meyering <meyering@fb.com>
+
+ maint: keep two blank lines before each old Noteworthy line.
+ * NEWS: Insert a blank line.
+
+2020-09-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2020-09-13 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2020-09-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2020-09-11 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+
+2020-09-09 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix logic for growing PCRE JIT stack
+ * src/pcresearch.c (jit_exec) [PCRE_EXTRA_MATCH_LIMIT_RECURSION]:
+ When growing the match_limit_recursion limit, do not use the old
+ value if ! (flags & PCRE_EXTRA_MATCH_LIMIT_RECURSION), as it is
+ uninitialized in that case.
+
+ grep: fix PCRE JIT test when JIT not available
+ Problem reported by Thomas Deutschmann (Bug#29446#23).
+ * src/pcresearch.c (Pexecute): Diagnose PCRE_ERROR_RECURSIONLIMIT.
+ * tests/pcre-jitstack: Treat recursion limit overflow like stack
+ overflow.
+
+ grep: fix -w bug in UTF-8 locales
+ Problem reported by Mayo Fark (Bug#43225).
+ * src/searchutils.c (wordchar_prev): In a UTF-8 locale, do not
+ assume that an encoding-error byte cannot be part of a word
+ constituent, as this assumption is incorrect for the last byte
+ of a multibyte word constituent.
+ * tests/word-delim-multibyte: Add a test for the bug.
+
+ Distribute a gzip tarball again
+ Requested by Issam E. Maghni in:
+ https://lists.gnu.org/r/grep-devel/2020-09/msg00000.html
+ * configure.ac (AM_INIT_AUTOMAKE): Remove no-dist-gzip.
+
+ * README-prereq: Also mention xz.
+
+2020-09-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ Prefer rawmemchr to memchr when it’s easy
+ * bootstrap.conf (gnulib_modules): Add rawmemchr.
+ * src/dfasearch.c (GEAcompile, EGexecute):
+ * src/grep.c (update_patterns, prpending, prtext):
+ * src/kwsearch.c (Fcompile, Fexecute):
+ * src/pcresearch.c (Pcompile, Pexecute):
+ Simplify (and presumably speed up a little) by using rawmemchr
+ with a sentinel, instead of using memchr.
+
+ Simplify pattern_file_name
+ * src/grep.c (pattern_file_name): Make first argument
+ origin-0, not origin-1, as this simplifies both caller and
+ callee. All uses changed.
+
+ Simplify regex_compile
+ * src/dfasearch.c (regex_compile): "" suffices; we don’t need "\0".
+ No need to initialize pat_lineno.
+
+ Omit duplicate regexps
+ Do not pass two copies of the same regexp to the
+ regular-expression engine. Although the engines should
+ perform nearly as well even with the copies, in practice they do not.
+ Problem reported by Luca Borzacchiello (Bug#43040).
+ * bootstrap.conf (gnulib_modules): Add hash.
+ * src/grep.c: Include stdint.h, for SIZE_WIDTH.
+ Include hash.h.
+ (struct patloc, patloc, patlocs_allocated, patlocs_used):
+ Rename from struct FL_pair, fl_pair, n_fl_pair_slots, n_pattern_files,
+ respectively, since the data type is no longer a pair.
+ All uses changed.
+ (struct patloc): New member FILELINE. The lineno member is now
+ ptrdiff_t since nowadays we prefer signed types.
+ (pattern_array, patterns_table): New static vars.
+ (count_nl_bytes, fl_add): Remove; no longer used.
+ (hash_pattern, compare_patterns, update_patterns): New functions.
+ update_patterns does what fl_add used to do, plus remove dups.
+ (pattern_file_name): Adjust to change from fl_pair to patloc.
+ (main): Move some variables to inner blocks for clarity.
+ Maintain the pattern_table hash of all patterns.
+ Update pattern_array to match keys, and use update_patterns
+ instead of fl_add to remove duplicate keys.
+ * tests/filename-lineno.pl (invalid-re-2-files)
+ (invalid-re-2-files2, invalid-re-2e): Ensure regexps are unique in
+ tests so that dups aren’t removed in diagnostics.
+ (invalid-re-line-numbers): New test.
+
+2020-08-23 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+ * gnulib: Update submodule to latest.
+ * bootstrap.conf (gnulib_modules): Add explicit dependency on dirname-lgpl.
+ Before, we pulled this in via a dependency.
+ * bootstrap: Update from gnulib.
+
+ build: require autoconf-2.64
+ * configure.ac: Require autoconf-2.64, up from 2.63, to align with gnulib.
+
+2020-08-22 Paul Eggert <eggert@cs.ucla.edu>
+
+ Revert -L exit status change introduced in grep 3.2
+ Problems reported by Antonio Diaz Diaz in:
+ https://bugs.gnu.org/28105#29
+ * NEWS, doc/grep.texi (Exit Status), src/grep.c (usage):
+ Adjust documentation accordingly.
+ * src/grep.c (grepdesc, main): Go back to old behavior.
+ * tests/skip-read: Adjust tests accordingly.
+
+2020-01-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: fix permission issue in previous change
+
+ tests: work around GCC -fprofile-generate bug
+ * tests/triple-backref: Add a 10 s timeout to work around
+ what appears to be a GCC bug with -fprofile-generate.
+ Problem reported by Martin Liška, with diagnosis by
+ Andreas Schwab (Bug#21513).
+
+2020-01-02 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 3.4
+ * NEWS: Record release date.
+
+ build: update gnulib to latest, for mbrtowc-vs-Irix build fix
+
+2020-01-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: mention glibc bug 24269
+ * doc/grep.texi (Known Bugs): Mention glibc bug 24269.
+ Merge formatting/URL changes from Gnulib regex.texi.
+
+ doc: fix --exclude description in man page
+ Problem reported by Duncan Moore (Bug#37212).
+ * src/grep.c (usage): Fix incorrect statement about --exclude
+ and directories. Standardize on “that match GLOB” instead
+ of “matching GLOB”.
+
+ doc: fix missing “more” in man page
+ Problem reported by Philippe Schnoebelen (Bug#34078).
+ * doc/grep.in.1: Add missing “more”.
+
+2020-01-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: add [:blank:] to man page
+ * doc/grep.in.1: Mention [:blank:] (Bug#33291).
+
+2020-01-01 Jim Meyering <meyering@fb.com>
+
+ maint: update all copyright year number ranges
+ Run "make update-copyright" and then...
+ * gnulib: Update to latest with copyright year adjusted.
+ * tests/init.sh: Sync with gnulib to pick up copyright year.
+ * bootstrap: Likewise.
+ * doc/grep.in.1: Use "-" in copyright year ranges, not \en.
+
+2019-12-31 Jim Meyering <meyering@fb.com>
+
+ tests: avoid unwarranted failure in a netbsd 8.1 VM
+ * tests/mb-non-UTF8-perf-Fw: Run twice, to avoid first-read penalty.
+ Reported by Nelson H.F. Beebe.
+
+2019-12-30 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest (for localeinfo perf fix)
+
+ maint: add syntax-check rule to prohibit "backreference" spelling
+ * cfg.mk (sc_prohibit_backref): New rule.
+
+2019-12-30 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: remove too-long line from AUTHORS
+ * AUTHORS: Remove URL that’s too long.
+
+ maint: update AUTHORS
+ * AUTHORS: Update to better reflect current authorship.
+
+2019-12-30 Jim Meyering <meyering@fb.com>
+
+ avoid new syntax-check failures
+ * cfg.mk (old_NEWS_hash): When updating old news, we must also update this.
+
+2019-12-30 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: don’t encourage back-references
+ * doc/grep.texi (Usage): Remove palindrome question. Bondioni’s
+ RE makes grep issue a ‘grep: stack overflow’ diagnostic, and we
+ shouldn’t be encouraging fancy back-references anyway, due to all
+ the bugs in this area (Bug#26864). Plus, the allusion to
+ “GNU extensions” doesn't seem to be correct here.
+
+ doc: robustify some examples
+ Prompted by suggestions by Stephane Chazelas (Bug#38792#20).
+ * doc/grep.texi (Usage): Make examples more robust.
+
+ doc: fix bug# typo
+
+ doc: spell "back-reference" more consistently
+
+ doc: mention back-reference bugs
+ Inspired by Bug#26864.
+ * doc/grep.texi (Known Bugs): New section.
+ Mention back-reference issues.
+
+2019-12-29 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: Add -- to more-complex example
+ Suggested by Stephane Chazelas (Bug#38792).
+ * doc/grep.in.1, doc/grep.texi: Add ‘--’ to recently-added example.
+
+ doc: improve subsection title (Bug#26132)
+ * doc/grep.in.1: Rename "Matcher Selection" to "Pattern Syntax".
+
+ doc: fix typo in previous patch
+
+ doc: document quoting better
+ Problem reported by Martin Simons (Bug#38792).
+ * doc/grep.texi: Fix quoting used in examples. Say that patterns
+ should be quoted, use quoting more consistently in examples, and
+ give an example illustrating the difference between patterns and
+ globbing. Don’t assume zgrep expertise in example.
+ * doc/grep.in.1: Likewise. Also, reorder sections
+ to match GNU/Linux man-pages style.
+
+2019-12-26 Jim Meyering <meyering@fb.com>
+
+ maint: tweak NEWS wording
+ * NEWS: Minor wording change.
+
+ build: update gnulib to latest; and sync tests/init.sh
+ * gnulib: update
+ * tests/init.sh: Sync from gnulib (this removes the LC_ALL=C setting).
+
+ tests: avoid spurious failure due to 1-second timeout
+ * tests/grep-dev-null-out: Use a 10-second timeout, rather than
+ a 1-second one. This avoids false failure on slow systems.
+ Reported by Assaf Gordon in
+ https://lists.gnu.org/r/grep-devel/2019-12/msg00018.html
+
+2019-12-26 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+ maint: adjust surrogate-pair for 16-bit wchar_t
+ * tests/surrogate-pair: Adjust to match fixed behavior
+ on AIX 7.2, where wchar_t is 16 bits and cannot represent
+ the test case data.
+
+2019-12-25 Jim Meyering <meyering@fb.com>
+
+ tests: fix typo in name of test file
+ * tests/backslash-s-vs-invalid-multitype: Rename to...
+ * tests/backslash-s-vs-invalid-multibyte: ...this.
+ * tests/Makefile.am (TESTS): Reflect renaming.
+
+ tests: ensure we use require_timeout_ when needed
+ * cfg.mk (sc_timeout_prereq): New syntax-check rule.
+
+ tests: require timeout
+ * tests/mb-non-UTF8-perf-Fw: This test uses "timeout",
+ so must first call require_timeout_.
+ This avoids a spurious test failure when running with
+ no timeout program. Reported by Bruno Haible in
+ https://lists.gnu.org/r/grep-devel/2019-12/msg00008.html
+
+2019-12-25 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: work around AIX 7.2 sh printf bug
+ AIX 7.2 /bin/sh’s printf command mishandles octal escapes
+ in multibyte locales: it treats them as characters, not bytes.
+ * tests/backslash-s-vs-invalid-multitype, tests/encoding-error:
+ Use the C locale when employing the printf command with an octal
+ escape that AIX 7.2 sh might mishandle.
+ * tests/init.sh (setup_): Use the C locale for tests.
+ This has the side benefit of making them more reproducible.
+
+2019-12-22 Jim Meyering <meyering@fb.com>
+
+ maint: adjust new comments
+ * src/dfasearch.c (possible_backrefs_in_pattern): Remove a
+ duplicate "a", insert a "be" and a comma, and reformat.
+
+ build: update gnulib to latest
+ * gnulib: Update submodule to latest.
+ * bootstrap: Copy from gnulib.
+ * tests/init.sh: Likewise.
+
+2019-12-22 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix some bugs in pattern-grouping speedup
+ This fixes some bugs in the previous commit,
+ and should finish the fix for Bug#33249.
+ * NEWS: Mention fix for Bug#33249.
+ * src/dfasearch.c (possible_backrefs_in_pattern, regex_compile)
+ (GEAcompile): In new code, prefer ptrdiff_t to size_t when either
+ will do, since ptrdiff_t has better error checking. At some point
+ we should adjust the old code too.
+ (possible_backrefs_in_pattern): Rename from
+ find_backref_in_pattern. New arg BS_SAFE. All uses changed.
+ Fix false negative if a multibyte character ends in a single
+ '\\' byte, followed by the two bytes '\\', '1'.
+ (regex_compile): Simplify.
+ (GEAcompile): Avoid quadratic behavior when reallocating growing
+ buffers. Fix a couple of bugs in copying pattern data involving
+ backreferences. Fix another bug in copying pattern metadata
+ involving backreferences, by removing the need to copy it.
+
+2019-12-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: grouping of a pattern with multiple lines
+ When grep uses regex, it splits a pattern with multiple lines by
+ newline character into fragments. Compilation and execution run for
+ each fragment. That causes slowdown. By this change, each fragment is
+ divided into groups by whether the fragment includes back references.
+ A fragment with back references constitutes a group, and all fragments
+ that lack back references also constitute a group.
+
+ This change greatly speeds up the following case.
+
+ $ seq -f '%040g' 0 9999 | sed '1s/$/\\(0\\)\\1/' >pat
+ $ yes 00000000000000000000000000000000000000000x | head -10000 >in
+ $ time -p env LC_ALL=C src/grep -f pat in
+
+ * src/dfasearch.c (find_backref_in_pattern, regex_compile):
+ New functions.
+ (GEAcompile): Use the new functions to group fragments
+ as mentioned above.
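
The grouping described in this entry depends on deciding whether a fragment might contain a back reference. A minimal sketch of such a check follows. The name `may_have_backref` is hypothetical, and the sketch assumes a single-byte encoding, so it ignores the multibyte cases that the real code has to handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if PAT[0..LEN-1] might contain a back reference, i.e.
   an unescaped backslash followed by a digit 1 through 9.
   Assumption: single-byte encoding, so no byte of a multibyte
   character can be mistaken for a backslash.  */
static bool
may_have_backref (char const *pat, ptrdiff_t len)
{
  assert (0 <= len);
  for (ptrdiff_t i = 0; i < len - 1; i++)
    if (pat[i] == '\\')
      {
        if ('1' <= pat[i + 1] && pat[i + 1] <= '9')
          return true;
        i++;  /* Skip the escaped byte, e.g. the second backslash of "\\".  */
      }
  return false;
}
```

Fragments for which such a check returns false can be compiled together, while the others are handled one at a time.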
+
+2019-12-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: add NEWS for Bug#34951 fix
+ * NEWS: Mention Bug#34951.
+
+2019-12-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: separate parse and compile phase
+ DFAMUST() must be called after parsing and before the token reordering
+ that was introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98,
+ but both steps are executed in the compilation phase.
+
+ * lib/dfa.c (dfaparse): Change it to global function.
+ (dfacomp): If first argument is NULL, skip parse.
+ * lib/dfa.h: (dfaparse): Add a prototype.
+
+2019-12-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2019-12-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: speed up multiple word matching
+ grep uses its KWset matcher for multiple word matching, but that is
+ very slow when most of the strings that match a pattern are not words.
+ So, if the first match for a pattern is not a word, use the grep matcher
+ to match its line.
+
+ Note that when START_PTR is set, the grep matcher uses the regex matcher,
+ which is very slow at matching words. Therefore, we use the grep matcher
+ only when START_PTR is NULL.
+
+ * src/kwsearch.c (Fexecute): If an initial match is incomplete because
+ not on a word boundary, use the grep matcher to find a matching line.
+
+2019-12-18 Jim Meyering <meyering@fb.com>
+
+ maint: sort test names
+ * tests/Makefile.am (TESTS): Alphabetize the new addition,
+ mb-non-UTF8-perf-Fw, to placate syntax-check's sc_sorted_tests.
+
+2019-12-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: adjust to recent Gnulib change
+ * po/POTFILES.in: Remove lib/xstrtol-error.c.
+
+2019-12-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: do not match invalid UTF-8
+ Update Gnulib to latest. Also:
+ * src/dfasearch.c (EGexecute): Use ptrdiff_t, not size_t,
+ to match new Gnulib API.
+ * tests/Makefile.am (TESTS): Add dfa-invalid-utf8.
+ * tests/dfa-invalid-utf8: New file.
+
+2019-11-30 Jim Meyering <meyering@fb.com>
+
+ tests: add test that would have detected -Fw perf regression
+ * tests/mb-non-UTF8-perf-Fw: New file. Detect v3.3-22-g090a4db's
+ performance regression.
+ * tests/Makefile.am (TESTS): Add it.
+
+2019-11-29 Jim Meyering <meyering@fb.com>
+
+ maint: fix test comment
+ * tests/mb-non-UTF8-word-boundary: Also correct "introduced-in"
+ version number in a comment here.
+
+2019-11-25 Jim Meyering <meyering@fb.com>
+
+ maint: correct NEWS blurb
+ * NEWS (Bug fixes): Correction: the -Fw bug was introduced
+ in 2.28, not in 3.0. Reported by Paul Eggert.
+
+2019-11-17 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: improve grep -Fw performance in non-UTF8 multibyte locales
+ * src/searchutils.c (mb_goback): New parameter. All callers changed.
+ * src/search.h (mb_goback): Update prototype.
+ * src/kwsearch.c (Fexecute): Use mb_goback's MBCLEN to detect a
+ word-boundary even more efficiently.
+
+ grep: fix performance regression with previous patch
+ * src/kwsearch.c (Fexecute): Avoid unnecessary back-up in non-UTF8
+ multibyte locales.
+
+2019-11-16 Jim Meyering <meyering@fb.com>
+
+ maint: rename a variable: bol -> nl
+ * src/kwsearch.c (Fexecute): Change misleading name: s/bol/nl/
+
+ build: update gnulib to latest
+
+ maint: correct and clarify a comment
+ * src/kwsearch.c (Fexecute): Logic was reversed.
+
+ grep: avoid false -Fw match in non-UTF8 multibyte locales
+ For example, this command would erroneously print its input line:
+ echo ab | LC_CTYPE=ja_JP.eucjp grep -Fw b
+ This arose when the "memrchr" search for a preceding newline failed:
+ in that case, MB_START was not adjusted and was initially the same
+ as BEG, so wordchar_prev mistakenly returned 0.
+ * src/kwsearch.c (Fexecute): Set MB_START also when there is no
+ preceding newline.
+ * NEWS (Bug fixes): Mention it.
+ * tests/mb-non-UTF8-word-boundary: New file. Test for the bug.
+ * tests/Makefile.am (TESTS): Add it.
+ Reported by NIDE, Naoyuki in https://bugs.gnu.org/38223.
+
+2019-11-08 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib to latest
+ * po/POTFILES.in: Add lib/argmatch.h.
+
+2019-11-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: new --no-ignore-case option
+ Suggested by Karl Berry and mostly implemented by Arnold Robbins
+ (Bug#37907).
+ * NEWS:
+ * doc/grep.in.1:
+ * doc/grep.texi (Matching Control):
+ * src/grep.c (usage):
+ Document the new option.
+ * src/grep.c (NO_IGNORE_CASE_OPTION): New constant.
+ (long_options, main): Support new option.
+
+ grep: simplify previous patch
+ * src/grep.c (main): For a local var, use an int rather than an
+ enum, which is overkill here.
+
+ grep: further simplify out_file handling
+ * src/grep.c (print_filenames): Make this a local variable instead
+ of static. Rename it to filename_option, to avoid confusion with
+ the print_filename function, and rename the enum values for the
+ same reason. All uses changed.
+ (out_file): Now -1, 0, 1 to represent unknown, false, true.
+ All uses changed.
+ (single_command_line_arg): Remove. This static variable’s
+ function is now accomplished by a local variable ‘num_operands’.
+ (grepdesc): Simplify adjustment of out_file accordingly.
+ (main): Initialize out_file to -1 if not known yet.
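
The tristate described above (-1, 0, 1 for unknown, false, true) can be sketched as follows. The function names and the encoding of the option argument are hypothetical and chosen for illustration; grep's own code may encode these values differently.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether output lines should be prefixed with file names.
   FILENAME_OPTION: 1 for -H, 0 for -h, -1 when neither was given
   (a hypothetical encoding for this sketch).
   NUM_OPERANDS: number of file operands on the command line.
   Return 1 (true), 0 (false), or -1 when the answer is not known
   until we learn whether a sole operand is a directory.  */
static int
initial_out_file (int filename_option, int num_operands, bool recursive)
{
  assert (-1 <= filename_option && filename_option <= 1);
  if (0 <= filename_option)
    return filename_option;
  if (1 < num_operands)
    return 1;
  return recursive ? -1 : 0;
}

/* Resolve an unknown value once the operand's type is known.  */
static int
settle_out_file (int out_file, bool operand_is_directory)
{
  assert (-1 <= out_file && out_file <= 1);
  return out_file < 0 ? operand_is_directory : out_file;
}
```

Deferring the decision this way avoids toggling the flag back and forth during recursion.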
+
+2019-11-05 Zev Weiss <zev@bewilderbeest.net>
+
+ grep: simplify out_file handling
+ * src/grep.c (print_filenames): New tristate enum (-H, -h, or
+ neither); supplants with_filenames and no_filenames.
+ (single_command_line_arg): New variable indicating if grep was run
+ with a single command-line argument.
+ (no_filenames): Remove variable.
+ (grepdirent): Don't twiddle out_file back and forth during recursion.
+ (grepdesc): Turn off out_file on 'grep -r foo nondirectory'.
+ (main): Replace with_filenames and no_filenames with print_filenames.
+ Enable out_file when both -r/-R and multiple arguments are given.
+
+2019-10-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix ‘grep -L ... >/dev/null’ bug
+ Problem reported by Adam Sampson (Bug#37716).
+ * NEWS: Mention this.
+ * src/grep.c (grepdesc): Don’t assume that stdout being /dev/null
+ means list_files == LISTFILES_NONE.
+ (main): Do not change list_files merely because stdout is /dev/null.
+ * tests/skip-read: Test for this bug.
+
+2019-10-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: tighten -i doc
+ * doc/grep.in.1:
+ * doc/grep.texi (Matching Control):
+ * src/grep.c (usage):
+ Make it clearer that -i affects patterns and data, but not
+ file names (Bug#37604).
+
+2019-03-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: fix “/src/grep: No such file or directory”
+ Problem reported by Jim Meyering in:
+ https://lists.gnu.org/r/grep-devel/2019-02/msg00000.html
+ * NEWS: Mention the change.
+ * configure.ac (fn_grep): Remove. This old attempt to fix
+ <https://savannah.gnu.org/bugs/?31646> wasn’t working anyway,
+ since subprograms didn’t grok fn_grep. People building on Solaris
+ will need a working grep, which is reasonably standard nowadays.
+ (GREP, EGREP): Do not override. This way, we test the
+ newly-built grep only when running ‘make test’ and suchlike.
+ Instead, output a hopefully-helpful diagnostic if the
+ system 'grep' does not work.
+
+2019-02-18 Jim Meyering <meyering@fb.com>
+
+ tests: avoid false positive upon stack overflow
+ * tests/pcre-jitstack: Don't let a stack overflow evoke a false
+ failure. This test is to ensure there is no internal PCRE error.
+ Reported by Andreas Schwab in http://bugs.gnu.org/34370
+
+2019-02-16 Jim Meyering <meyering@fb.com>
+
+ build: avoid build failure with --enable-gcc-warnings
+ * src/kwset.c (bmexec_trans): Define with _GL_ATTRIBUTE_PURE,
+ per suggestion from recent gcc snapshot.
+
+2019-02-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: clarify --exclude globbing
+ Problem reported by Paul Jackson.
+ * doc/grep.in.1:
+ * doc/grep.texi (File and Directory Selection):
+ Clarify how --exclude globbing works.
+
+ grep: parse --color arg independent of locale
+ This is a better fix for Bug#34285.
+ * bootstrap.conf (gnulib_modules): Add c-strcase.
+ * src/grep.c: Include c-strcase.h, not strings.h.
+ (main): Use c_strcasecmp, not strcasecmp.
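
To illustrate what "independent of locale" means here, the following is a minimal sketch of an ASCII-only, case-insensitive comparison. The names `ascii_tolower` and `ascii_strcasecmp` are hypothetical stand-ins, not Gnulib's implementation, and the sketch assumes an ASCII-compatible execution character set.

```c
#include <assert.h>

/* Lowercase C using ASCII rules only, regardless of the locale.
   Assumption: 'A'..'Z' are contiguous, as in ASCII.  */
static int
ascii_tolower (int c)
{
  return 'A' <= c && c <= 'Z' ? c - 'A' + 'a' : c;
}

/* Compare A and B, ignoring ASCII case.  Return a negative, zero, or
   positive value, like strcmp.  Unlike a locale-aware comparison,
   the result does not change with LC_CTYPE.  */
static int
ascii_strcasecmp (char const *a, char const *b)
{
  assert (a && b);
  unsigned char const *p = (unsigned char const *) a;
  unsigned char const *q = (unsigned char const *) b;
  for (;; p++, q++)
    {
      int x = ascii_tolower (*p);
      int y = ascii_tolower (*q);
      if (x != y)
        return x - y;
      if (x == 0)
        return 0;
    }
}
```

With a comparison of this kind, an argument such as "ALWAYS" matches "always" even in locales whose case mapping for some letters differs from ASCII.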
+
+2019-02-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix grep.c includes
+ * src/grep.c: Include strings.h; problem reported by David
+ Monniaux (Bug#34285). Do not include fcntl.h, as system.h does
+ that for us.
+
+ build: update gnulib submodule to latest
+
+2019-01-20 Jim Meyering <meyering@fb.com>
+
+ build: ensure no VLA is used
+ Cause developer builds to fail for any use of a VLA.
+ VLAs (variable length arrays) limit portability.
+ * configure.ac (nw): Remove -Wvla from the list of disabled warnings,
+ thus enabling the warning when configured with --enable-gcc-warnings.
+ (GNULIB_NO_VLA): Define, disabling use of VLAs in gnulib. This commit
+ is functionally equivalent to coreutils' v8.30-44-gd26dece5d.
+
+ build: update gnulib to latest
+
+2019-01-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: --binary-files update in man page
+ * doc/grep.in.1: Adjust --binary-files description to match that
+ in doc/grep.texi. When I updated the documentation in
+ 2016-09-09T01:33:14!eggert@cs.ucla.edu I forgot to update the man
+ page accordingly (Bug#33898).
+
+ grep: simplify pcresearch.c ifdefs
+ This fixes a warning if PCRE is not used (Bug#34054).
+ * configure.ac (USE_PCRE): New conditional.
+ * src/Makefile.am (grep_SOURCES) [!USE_PCRE]: Omit pcresearch.c.
+ * src/grep.c (matchers) [!HAVE_LIBPCRE]: Omit perl matcher.
+ (setmatcher) [!HAVE_LIBPCRE]: If helpful, mention
+ --disable-perl-regexp in diagnostic.
+ * src/pcresearch.c: Simplify by assuming HAVE_LIBPCRE.
+
+2019-01-01 Jim Meyering <meyering@fb.com>
+
+ maint: update all copyright dates via "make update-copyright"
+ * gnulib: Also update submodule for its copyright updates.
+
+2018-12-20 Jim Meyering <meyering@fb.com>
+
+ doc: fix the bug-introduced version in 3.3's announcement
+ * NEWS: Correct bug-introduced version (s/2.3/3.2/).
+ * cfg.mk (old_NEWS_hash): When updating old news, we must also update this.
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 3.3
+ * NEWS: Record release date.
+
+ grep: fix \b DFA-bug in C locale
+ Under some conditions, \b would mistakenly fail to match, e.g.
+ echo 123-x|LC_ALL=C grep '.\bx'
+ * NEWS (Bug fixes): Mention it.
+ * gnulib: Update to latest, for DFA regression fix.
+ * tests/word-delim-multibyte: Add a test for the dfa.c regression.
+
+2018-12-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fit --version authorship into 80
+ * src/grep.c (AUTHORS): Remove.
+ (main): Output the authorship info ourselves instead of having
+ version_etc do it. This is better for i18n anyway.
+
+ build: update gnulib submodule to latest
+
+2018-12-20 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 3.2
+ * NEWS: Record release date.
+
+2018-12-18 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib for c-stack fix
+
+2018-12-17 Bruno Haible <bruno@clisp.org>
+
+ tests: stack-overflow: avoid unwarranted test failure on some hosts
+ * tests/stack-overflow: Use ulimit to limit stack size. Otherwise,
+ at least on gcc113, grep would fail to overflow its stack, so this
+ test would fail to find the required diagnostic and would fail.
+
+2018-12-16 Jim Meyering <meyering@fb.com>
+
+ tests: reenable the surrogate-pair test
+ This reverts commit bdb98cec2e7bf255e1d00eaf8be16299f7bf571e,
+ but adds the comment changes suggested by Bruno Haible in
+ https://lists.gnu.org/r/grep-devel/2018-12/msg00037.html
+ * tests/surrogate-pair: New file.
+ * tests/Makefile.am (TESTS): List it.
+
+2018-12-16 Bruno Haible <bruno@clisp.org>
+
+ tests: stack-overflow: fix test failure on HardenedBSD 11
+ * tests/stack-overflow: Try up to 10 million opening parentheses.
+
+2018-12-16 Jim Meyering <meyering@fb.com>
+
+ tests: remove stale surrogate-pair test
+ The cygwin-specific code for surrogate pairs was first disconnected
+ via v2.21-62-g936c904 and later removed as part of a then-unused
+ function via v2.24-12-g704de87. So now I'm removing the test, too.
+ If someone thinks it important and would like to revive it, please do.
+ * tests/surrogate-pair: Remove file.
+ * tests/Makefile.am (TESTS): Remove it.
+
+2018-12-16 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2018-12-15 Jim Meyering <meyering@fb.com>
+
+ tests: stack-overflow: handle the case of success without the diagnostic
+ * tests/stack-overflow: Do not always require a stack
+ overflow diagnostic.
+
+ build: update gnulib to latest
+ * gnulib: Update to latest, to pull in code that now compensates for
+ a bug in glibc-2.27 and prior.
+
+ build: make the autoconf-2.63 requirement explicit
+ * configure.ac: AC_PREREQ: Require 2.63, not 2.59. And quote properly.
+ Autoconf-2.63 has been required for some time via gnulib.
+ This merely makes it explicit.
+
+2018-12-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: fix diagnostic typo
+ Fix by Bruno Haible in:
+ https://lists.gnu.org/r/grep-devel/2018-12/msg00003.html
+ * tests/init.cfg (envvar_check_fail): Fix diagnostic.
+
+2018-11-24 Jim Meyering <meyering@fb.com>
+
+ tests: stack-overflow: avoid false failure
+ * tests/stack-overflow: This test would fail to elicit a stack overflow
+ diagnostic on some OS X systems. Rewrite to iterate, gradually increasing
+ the size of the input regex, stopping when grep emits the desired diagnostic
+ or the size reaches a reasonable limit.
+
+2018-10-16 Jim Meyering <meyering@fb.com>
+
+ tests: reduce the sole failing test
+ * tests/backref-alt: Significantly reduce abort-inducing input.
+
+ build: update gnulib to latest; also update bootstrap and init.sh
+
+2018-10-13 Jim Meyering <meyering@fb.com>
+
+ doc: NEWS: mention performance improvements
+ * NEWS (Improvements): Mention them.
+
+2018-10-13 Jim Meyering <meyering@fb.com>
+
+ grep: triple initial buffer size: 32k->96k
+ Changing 32k to 96k gives a 3-23% performance improvement.
+ All timings ran with this diff on top of commit v3.1-39-g7179b21:
+
+ for n in 32 64 96 128; do
+ echo n=$n
+ perl -pi -e 's/(INITIAL_BUFSIZE =) \d+/$1 '$n/ src/grep.c &&
+ make AM_CFLAGS=-O3 WERROR_CFLAGS= >& makerr-$n &&
+ for needle in 1f2 1f298lkjskjhahjklkj34; do
+ echo " needle=$needle"
+ for i in $(seq 10); do
+ env MALLOC_PERTURB_= time -qf%e src/grep $needle w2000
+ done 2>&1 |sort -g | tee >(head -1|sed 's/^/ /') > .time-${n}KB-$needle
+ done
+ done
+
+ Tested searches: search for a short literal pattern that is not
+ present in a 9.3GB file containing 2000 copies of /usr/dict/words
+ created via this:
+ ln -s /usr/share/dict/words k && cat $(yes k|head -2000) > w2000
+ I ran this command:
+ env MALLOC_PERTURB_= time src/grep 1f2 w2000
+ old(32k) vs new elapsed time, best of 10 trials (gcc-9.0.0 20180831, -O3):
+ 32k 64k 96k(%incr) 128k CPU
+ 1.25 1.18 1.16( 7.2) 1.20 i7-4770S@3.10GHz cache=8MB
+ 1.21 1.16 1.17( 3.3) 1.19 Xeon(R) E3-1505M v5 @ 2.80GHz cache=8MB
+ 2.36 2.29 2.29( 3.0) 2.36 Xeon(R) E5-2680 v4 @ 2.40GHz cache=32MB
+ 1.40 1.32 1.31( 6.4) 1.33 i5-6260U @ 1.80GHz cache=4MB
+ 1.31 1.26 1.24( 5.3) 1.23 AMD FX(tm)-4100 cache=2MB (with only 1000 copies)
+
+ Searching for a longer string: 1f298lkjskjhahjklkj34
+ 2.03 1.76 1.61(20.7) 1.53 i7-4770S@3.10GHz cache=8MB
+ 1.95 1.70 1.56(20.0) 1.51 Xeon(R) E3-1505M v5 @ 2.80GHz
+ 3.27 2.98 2.84(13.1) 3.02 Xeon(R) E5-2680 v4 @ 2.40GHz
+ 2.48 2.12 1.91(23.0) 1.80 i5-6260U @ 1.80GHz cache=4MB
+ 1.72 1.54 1.46(15.1) 1.41 AMD FX(tm)-4100 cache=2MB
+
+ * src/grep.c (INITIAL_BUFSIZE): Triple it: 32kB -> 96kB
+
+2018-09-28 Barret Rhoden <brho@cs.berkeley.edu> (tiny change)
+
+ maint: fix cross-compiling problem
+ * cfg.mk (PATH): Omit if cross-compiling (Bug#32866).
+
+2018-09-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+ grep: fix usage 80-column glitch
+ * src/grep.c (usage): Do not go over 80 columns in the source
+ code, to pacify "make dist".
+
+2018-09-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: update bootstrap
+ * bootstrap: Copy from Gnulib.
+
+ maint: fix build failure
+ Problem found by OpenCSW buildbot; the bug also occurs on GNU/Linux
+ build platforms. The symptom is “system.h:26:24: fatal error:
+ configmake.h: No such file or directory”. See:
+ https://buildfarm.opencsw.org/buildbot/builders/ggrep-solaris10-sparc/builds/107
+ * bootstrap.conf: Add configmake, a dependency that was formerly brought
+ in only by accident.
+
+2018-09-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2018-08-09 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: fix comment
+
+ tests: backref-alt works with glibc 2.28
+ Problem reported by Jaroslav Skarvada (Bug#32409).
+ * tests/Makefile.am (XFAIL_TESTS) [!USE_INCLUDED_REGEX]:
+ Don’t add backref-alt, since this bug is fixed in glibc 2.28.
+
+2018-05-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: “pattern” vs “patterns”
+ * doc/grep.in.1, doc/grep.texi, src/grep.c (usage): Be more
+ careful about saying that an argument or option specifies one or
+ more patterns, not just a single pattern. Problem reported by Kaz
+ Kylheku (Bug#31400).
+
+ build: update gnulib submodule to latest
+
+2018-04-21 Jim Meyering <meyering@fb.com>
+
+ maint: fix new syntax-check (sc_long_lines) failure
+ * HACKING: Shorten line by one byte to fit in 80 columns.
+
+ build: update gnulib to latest
+
+2018-04-21 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: fix font typo
+
+ maint: update URLs
+ Mostly this is just changing http: to https:.
+ In one or two places it removes no-longer-useful URLs.
+
+ doc: man-page format fixes
+ * doc/grep.in.1: Fix minor formatting glitches, e.g., extra
+ space after [...] because groff thought it was a sentence end.
+ Problem reported by Ingo Schwarze (Bug#31228#11).
+
+2018-04-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: mention encoding errors
+ This attempts to document the encoding-error problem more
+ precisely (Bug#30326).
+ * doc/grep.in.1, doc/grep.texi: Mention that the behavior of
+ patterns like ‘.’ is not specified on encoding errors.
+
+ doc: port better to mandoc
+ * doc/grep.in.1: Check for groff and its macro packages
+ independently, as groff can be used with non-groff macro packages.
+ Use an-ext style macros rather than www.tmac style, as this should
+ be more portable to mandoc. Problem reported by Laura Morales and
+ Ingo Schwarze (Bug#31228).
+
+2018-02-16 Jim Meyering <meyering@fb.com>
+
+ maint: avoid new syntax-check failure
+ * cfg.mk (old_NEWS_hash): Update, to accommodate v3.1-20-g63d4174's
+ typo fix.
+
+ doc: clarify that PCRE support is here to stay
+ * doc/grep.texi (grep Programs): Clarify: it's not PCRE support
+ that is experimental, but its combination with --null-data (-z).
+
+2018-02-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: fix typo
+
+2018-01-06 Jim Meyering <meyering@fb.com>
+
+ maint: update gnulib and copyright dates for 2018
+ * gnulib: Update to latest.
+ * all files: Run "make update-copyright".
+ * bootstrap: Update from gnulib.
+
+2017-12-17 Jim Meyering <meyering@fb.com>
+
+ build: link with -lsigsegv, when c-stack module requires it
+ * src/Makefile.am (grep_LDADD): Add $(LIBCSTACK).
+ Otherwise, on at least Debian and Arch-based systems, linking would
+ fail with diagnostics like these:
+ c-stack.c:207: undefined reference to `stackoverflow_install_handler'
+ c-stack.c:216: undefined reference to `sigsegv_install_handler'
+ Reported by Jeremy Feusi.
+
+ build: suppress sig-handler.h's -Wcast-function-type warning
+ * configure.ac (WERROR_CFLAGS): Add -Wno-cast-function-type
+ to suppress warning about sig-handler.h's sa_handler_t cast:
+ sig-handler.h: In function 'get_handler':
+ sig-handler.h:47:12: error: cast between incompatible function\
+ types from 'void (* const)(int, siginfo_t *, void *)'\
+ {aka 'void (* const)(int, struct <anonymous> *, void *)'}\
+ to 'void (*)(int)' [-Werror=cast-function-type]
+ return (sa_handler_t) a->sa_sigaction;
+
+2017-12-16 Jim Meyering <meyering@fb.com>
+
+ grep: diagnose stack overflow rather than segfaulting
+ * bootstrap.conf (gnulib_modules): Add c-stack.
+ * src/grep.c: Include "c-stack.h".
+ (main): Call c_stack_action (NULL);
+ * tests/stack-overflow: New file.
+ * tests/Makefile.am (TESTS): Add name of new file.
+ * NEWS (Improvements): Mention it.
+ Interestingly, this bug does not afflict grep-2.5.4 or prior,
+ so it appeared to have been introduced with grep-2.6. However,
+ the origin is in glibc's regexp compiler, and I tracked it to
+ stack-aware parsing that was removed from glibc's regexp in 2002.
+ Although grep-2.5.4 was released in 2009, that version worked
+ (and still works, now) because it included and (by default) used
+ an old copy of glibc's regexp code.
+ Jeremy Feusi reported the grep segfault in https://bugs.gnu.org/29666.
+ I reported the glibc regexp bug in
+ https://sourceware.org/bugzilla/show_bug.cgi?id=22620
+
+2017-11-26 Stephan T. Lavavej <stl@nuwen.net>
+
+ grep: fix directory recursion on MS-Windows
+ gnulib recently gained a module, windows-stat-inodes, that fixes
+ directory recursion on MS-Windows. No changes to grep's C sources are
+ required; grep simply needs to request the module during configuration.
+
+ When grep requests this module, its configure script will gain the
+ behavior that was implemented in windows-stat-inodes.m4. This detects
+ mingw and sets WINDOWS_STAT_INODES=1. All other platforms are
+ unaffected, setting WINDOWS_STAT_INODES=0 (which is what's happening
+ in the absence of this patch).
+
+ * bootstrap.conf (gnulib_modules): Add windows-stat-inodes.
+ * NEWS (Bug fixes): Mention it.
+ Thanks to Pär Björklund who diagnosed the problem as involving inodes,
+ and thanks to Václav Haisman who provided the bootstrap.conf patch.
+
+2017-11-25 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: port better to Adélie GNU/Linux 64-bit ppc
+ Problem reported by A. Wilcox (Bug#29446).
+ * src/pcresearch.c (PCRE_EXTRA_MATCH_LIMIT_RECURSION)
+ (PCRE_STUDY_EXTRA_NEEDED): Default to 0.
+ (jit_exec): If we run up against the recursion limit,
+ double it (if possible) and try again.
+ (Pcompile): Also specify PCRE_STUDY_EXTRA_NEEDED so that
+ pc->extra is not null.
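
The "double it (if possible) and try again" strategy for jit_exec can be sketched generically. Everything below is hypothetical: the callback type, the status codes, and the function names are illustrative and are not PCRE's API or grep's code.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical status codes for this sketch.  */
enum { SKETCH_OK = 0, SKETCH_LIMIT_HIT = -1 };

/* Hypothetical matcher callback: run one match attempt with the
   given recursion limit.  */
typedef int (*exec_fn) (unsigned long limit, void *ctx);

/* Call EXEC repeatedly, doubling *LIMIT each time the recursion
   limit is hit, until it succeeds, fails for another reason, or the
   limit can no longer be doubled without overflow.  */
static int
exec_with_growing_limit (exec_fn exec, void *ctx, unsigned long *limit)
{
  assert (exec && limit);
  for (;;)
    {
      int e = exec (*limit, ctx);
      if (e != SKETCH_LIMIT_HIT)
        return e;
      if (*limit == 0 || ULONG_MAX / 2 < *limit)
        return e;  /* Doubling is not possible; report the failure.  */
      *limit *= 2;
    }
}

/* Example callback: succeeds once the limit reaches the depth that
   CTX points to.  */
static int
needs_depth (unsigned long limit, void *ctx)
{
  unsigned long const *depth = ctx;
  return limit < *depth ? SKETCH_LIMIT_HIT : SKETCH_OK;
}
```

Doubling keeps the number of retries logarithmic in the depth that is finally required.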
+
+2017-11-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: omit a dup 'const'
+ * src/grep.c (matchers): Omit duplicate 'const'.
+
+2017-10-13 Bernhard Voelker <mail@bernhard-voelker.de>
+
+ doc: document the option delimiter '--'
+ * doc/grep.texi (Other options): Do the above.
+ Reported in https://lists.opensuse.org/opensuse/2017-03/msg00411.html
+ This addresses http://bugs.gnu.org/26139
+
+2017-08-21 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+ Pacify GCC 5.4
+ * src/grep.c (grepdesc): Rework to pacify GCC 5.4 warning
+ about logical not.
+
+2017-08-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2017-08-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -L exits with status 0 if a file is selected
+ Problem reported by Anthony Sottile (Bug#28105).
+ * NEWS, doc/grep.texi (Exit Status), src/grep.c (usage): Document this.
+ * src/grep.c (grepdesc): Implement it.
+ * tests/skip-read: Test it.
+
+ build: update gnulib submodule to latest
+
+2017-08-13 Jim Meyering <meyering@fb.com>
+
+ maint: avoid newly-introduced syntax-check failure
+ * src/grep.c (usage): Shorten --help line to 80, so
+ "make syntax-check" passes once again.
+
+2017-08-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: improve -o help
+ * src/grep.c (usage): Document that -o outputs only nonempty
+ matches (Bug#27931).
+
+2017-07-26 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: add Bug#27838 test case
+ * tests/backref-alt: New test case from a fuzzer.
+
+2017-07-25 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: distinguish -w from \<...\>
+ * doc/grep.texi (Matching Control):
+ Give example of why -w differs from \<...\> (Bug#27813).
+
+2017-07-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: define Dt string in man page
+ Problem reported by Bjarni I. Gislason via Santiago R.R. (Bug#27651).
+ * doc/grep.in.1 (dT): New macro.
+ (Dt): Define this string.
+
+2017-07-02 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 3.1
+ * NEWS: Record release date.
+
+2017-07-01 Jim Meyering <meyering@fb.com>
+
+ tests: avoid false failures when run in qemu user mode
+ * tests/filename-lineno.pl: Derive the program name that grep
+ will use in diagnostics, based on a suggestion from Assaf Gordon.
+ * tests/in-eq-out-infloop: Similar: accept an arbitrary "command_name: "
+ prefix on checked diagnostics, rather than requiring "grep: ".
+ * tests/reversed-range-endpoints: Likewise.
+ * tests/write-error-msg: Likewise.
+ Reported by Bruno Haible in http://bugs.gnu.org/27532
+
+2017-06-25 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+ * gnulib: Update to latest for these portability fixes:
+ - stat: port to xlc 12.01
+ - xalloc-oversized: port to icc
+
+ doc: fix another typo
+ * doc/grep.texi (File and Directory Selection): Fix typo: s/afer/after/
+
+2017-06-24 Jim Meyering <meyering@fb.com>
+
+ doc: stop calling --perl-regexp (-P) "highly" experimental
+ Use wording that is less likely to make readers think that
+ support for -P may be removed.
+ * doc/grep.in.1: s/highly experimental/experimental/
+ * doc/grep.texi: Likewise.
+ Suggested by Evan Sheahan.
+
+2017-06-21 Jim Meyering <meyering@fb.com>
+
+ doc: correct typo
+ * doc/grep.texi (Performance): s/suprisingly/surprisingly/
+
+ gnulib: update to latest
+
+2017-06-21 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -m no longer cuts off trailing context
+ Problem reported by Markus Jochim (Bug#26254).
+ * NEWS, doc/grep.texi (General Output Control): Document this.
+ * src/grep.c (prpending): Selected lines no longer cut off context.
+ (usage): Say "selected" instead of "matching", where appropriate.
+ * tests/foad1, tests/max-count-vs-context, tests/yesno:
+ Adjust to match new behavior.
+
+2017-05-31 Paul Eggert <eggert@cs.ucla.edu>
+
+ Document grep performance
+ * doc/grep.texi (Performance): New section.
+
+ build: update gnulib submodule to latest
+
+2017-05-21 Jim Meyering <meyering@fb.com>
+
+ maint: make the announcement template Cc the devel- list
+ * cfg.mk (announcement_Cc_): Define.
+
+ gnulib: update to latest; and update tests/init.sh
+
+ maint: accommodate GCC7's -Werror=duplicated-branches
+ * src/system.h (IGNORE_DUPLICATE_BRANCH_WARNING): Define.
+ * src/grep.c (grepfile): Use it.
+ * src/kwset.c (bmexec, acexec): Use it.
+
+ maint: update to work with GCC7's -Werror=implicit-fallthrough=
+ * src/system.h (FALLTHROUGH): Define.
+ * src/grep.c (context_length_arg): Use new FALLTHROUGH macro in place
+ of comments.
+ (fgrep_to_grep_pattern, try_fgrep_pattern, main): Likewise.
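
A macro of this kind might look like the sketch below. The name `FALLTHROUGH_SKETCH` and the `scale` function are hypothetical; the sketch assumes that GCC 7 or later supports the `__fallthrough__` statement attribute and falls back to a no-op elsewhere.

```c
#include <assert.h>

/* Hypothetical macro for this sketch: mark an intentional fall
   through so that -Wimplicit-fallthrough stays quiet.  */
#if defined __GNUC__ && 7 <= __GNUC__
# define FALLTHROUGH_SKETCH __attribute__ ((__fallthrough__))
#else
# define FALLTHROUGH_SKETCH ((void) 0)
#endif

/* Scale N by a multiplier suffix: 'm' means 1024 * 1024 and 'k'
   means 1024.  Any other suffix leaves N unchanged.  */
static long
scale (long n, char suffix)
{
  assert (0 <= n && n <= 1000);  /* Keep the product within 'long'.  */
  switch (suffix)
    {
    case 'm':
      n *= 1024;
      FALLTHROUGH_SKETCH;
    case 'k':
      n *= 1024;
      break;
    default:
      break;
    }
  return n;
}
```

Unlike a comment, the macro is checked by the compiler and survives preprocessing.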
+
+2017-05-13 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest and adapt src/kwset.c
+ * gnulib: Update to latest.
+ * src/kwset.c: Include "verify.h" for use of assume.
+
+2017-03-22 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest for dfa [0-9] performance improvement
+ This pulls in the following change that is very relevant to grep:
+
+ commit 6afba02d7869d39ed7f61981045ddbdcb2814101
+ Author: Paul Eggert <eggert@cs.ucla.edu>
+ dfa: make [0-9] faster in non-C locales
+
+ * gnulib: Update to latest.
+ * NEWS (Improvements): Describe the effect on grep.
+
+2017-03-05 Jim Meyering <meyering@fb.com>
+
+ build: use $(builddir), not $(srcdir)
+ * cfg.mk (PATH): Use $(builddir), so this also takes effect
+ in a non-srcdir build. Also, switch ${PATH} syntax to $(PATH).
+
+2017-03-05 Juan Manuel Guerrero <juan.guerrero@gmx.de>
+
+ build: use $(PATH_SEPARATOR), not ":" to augment PATH
+ * cfg.mk (PATH): Use $(PATH_SEPARATOR), for those systems that
+ use something other than ":".
+ * THANKS.in: Remove name, to avoid syntax-check failure due to
+ the duplicate, now that there is this commit.
+
+2017-02-17 Jim Meyering <meyering@fb.com>
+
+ maint: fix distcheck failure: remove stale dosbuf.c reference
+ * src/Makefile.am (EXTRA_DIST): Do not attempt to distribute
+ the recently deleted file, dosbuf.c.
+
+ maint: fix new syntax-check errors
+ * po/POTFILES.in: Add lib/xbinary-io.c.
+ * cfg.mk (FILTER_LONG_LINES): Add TODO to the list of exempt files.
+
+2017-02-16 Paul Eggert <eggert@cs.ucla.edu>
+
+ Fix up recent -U patches
+ Inspired by a suggestion by Eric Blake (Bug#25707#17).
+ * bootstrap.conf (gnulib_modules): Add xbinary-io,
+ and remove binary-io and xfreopen.
+ * doc/grep.texi (Other Options):
+ Fix typo and reword to be a bit more general.
+ * src/grep.c: Include xbinary-io.h instead of xfreopen.h.
+ (grepfile): Open with O_BINARY if binary.
+ (grepdesc): No need for set_binary_mode now.
+ (grep_command_line_arg, main): Set stdin to binary mode if binary.
+ (main): Avoid unnecessary test of stdin == NULL.
+ Use xsetmode instead of xfreopen.
+ * src/system.h: Do not include binary-io.h.
+
+ build: update gnulib submodule to latest
+
+ Simplify -U on MS-Windows by removing guesswork
+ Suggested by Eric Blake (Bug#25707#11).
+ * NEWS, doc/grep.texi: Document this.
+ * src/dosbuf.c: Remove.
+ * bootstrap.conf (gnulib_modules): Add xfreopen.
+ * src/grep.c: Include xfreopen.h, not dosbuf.c.
+ (fillbuf, print_line_head): Do not undossify input.
+ (binary): New static var.
+ (grepdesc): Apply BINARY to input file.
+ (usage): Remove -u help.
+ (main): Set BINARY if -U, and apply it to stdout. Do nothing if -u.
+ With -f, apply BINARY to input file.
+
+2017-02-16 Eric Blake <eblake@redhat.com>
+
+ grep: don't forcefully strip carriage returns
+ Commit 5c92a54 made the mistaken assumption that using fopen("rt")
+ on platforms where O_TEXT is non-zero makes sense. However, POSIX
+ already requires fopen("r") to open a file in text mode, vs.
+ fopen("rb") when binary mode is wanted, and at least on Cygwin,
+ where it is possible to control whether a mount point is binary
+ or text by default (using just "r"), the use of fopen("rt") actively
+ breaks assumptions on a binary mount by silently corrupting any
+ carriage returns that are supposed to be preserved.
+
+ * src/grep.c (main): Never use fopen("rt") (Bug#25707).
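
As a small illustration of the point above, a program can restrict itself to the standard mode strings and let the platform decide what text mode means. The helper names `read_mode` and `open_for_reading` are hypothetical and are not grep's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Return a standard fopen mode string for reading.  "r" requests
   text mode and "rb" requests binary mode; the nonstandard "rt" is
   deliberately never used.  */
static char const *
read_mode (bool binary)
{
  return binary ? "rb" : "r";
}

/* Open NAME for reading, in binary mode if BINARY, else in the
   platform's default text mode.  Return NULL on failure.  */
static FILE *
open_for_reading (char const *name, bool binary)
{
  assert (name);
  return fopen (name, read_mode (binary));
}
```

On POSIX systems the "b" has no effect, so both modes behave identically there.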
+
+2017-02-13 Paul Eggert <eggert@cs.ucla.edu>
+
+ Update TODO and doc
+ * TODO: Bring up-to-date and fix formatting glitches.
+ * doc/grep.in.1, doc/grep.texi: Fix minor glitches.
+ The above patches should address the same problems that recent
+ Debian doc patches address, albeit in a different way.
+
+2017-02-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: clarify default input (Bug#25651)
+ * doc/grep.in.1:
+ * src/grep.c (usage): Clarify default input when -r.
+ * src/grep.c (usage): Do not bother documenting egrep and fgrep;
+ the manual is enough.
+
+2017-02-09 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 3.0
+ * NEWS: Record release date.
+
+2017-02-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: do not mishandle \. in multiple patterns
+ Problem reported by Lars Wendler (Bug#25655).
+ * NEWS: Document this.
+ * src/grep.c (try_fgrep_pattern): Fix typo that prevented
+ keys from being properly updated.
+ * tests/foad1: Test for the bug.
+
+2017-02-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ Do not assume PCRE 8.20 or later
+ Problem reported by Zube (Bug#25647).
+ * NEWS: Document this.
+ * src/pcresearch.c (struct pcre_comp.jit_stack):
+ Declare only if PCRE_STUDY_JIT_COMPILE.
+
+2017-02-06 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.28
+ * NEWS: Record release date.
+
+2017-02-02 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+2017-02-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: tune to avoid memchr2 sometimes
+ Problem noted by Norihiro Tanaka in:
+ http://lists.gnu.org/archive/html/grep-devel/2017-01/msg00027.html
+ Although not enough to restore all the previous performance in the
+ case he noted, it helps significantly.
+ * src/kwset.c (memchr_kwset): Bring back small_heuristic,
+ in a somewhat different form.
+
+2017-01-29 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+2017-01-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: simplify recent kwset change
+ * src/kwset.c (acexec_trans): Simplify.
+
+2017-01-23 Jim Meyering <meyering@fb.com>
+
+ tests: really add the new test name
+ * tests/Makefile.am (TESTS): Add fgrep-longest.
+
+2017-01-21 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep -Fo could report a match that is not the longest
+ * src/kwset.c (acexec): Fix it.
+ * tests/fgrep-longest: New test.
+ * tests/Makefile.am: Add the test.
+ * NEWS: Mention it.
+
+2017-01-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: speed up Aho-Corasick when at most 2 bytes
+ When using Aho-Corasick and all matched strings either begin with
+ the same byte, or begin with one of at most two bytes, use memchr2
+ to search for these matching bytes and apply the Aho-Corasick
+ algorithm only when a memchr2 match is found. On my platform,
+ this speeds up 'grep -F -e aa -e ba in' by a factor of 7, where
+ the file 'in' was created by 'seq -f %040.0f 10000000 >in'.
+ * src/kwset.c (struct kwset.gc1): Now int, not char.
+ If negative, there is no single terminal byte. All uses changed.
+ (struct kwset.gc1help): Now int, not char.
+ If negative, memchr2 cannot be used.
+ (kwsprep): Set up gc1 and gc1help from kwset->next, with
+ the new (slightly changed) interpretation.
+ (memchr_kwset): Use memchr2 if possible.
+ Adjust to match new meaning of gc1, gc1help.
+ (memoff2_kwset): Remove; no longer needed.
+ (acexec_trans): Use memchr_kwset when possible, for speed.
+ It now supersedes memoff2_kwset.
+
+ grep: remove Commentz-Walter code
+ This code was not being used, and complicated maintenance.
+ We can bring it back from the repository if it turns out
+ to be useful later.
+ * src/kwset.c (struct kwset.reverse): Remove. All uses of
+ FOO->reverse replaced by (FOO->kwsexec == bmexec).
+ (kwsalloc): Remove 'reverse' arg, as callers outside this
+ module do not care about algorithm choice. All callers changed.
+ (kwsprep): When deciding whether to use Boyer-Moore, do not worry
+ about being called twice on the same kwset, as that is not allowed.
+ (cwexec): Remove; it was never called. All uses removed.
+
+2017-01-17 Jim Meyering <meyering@fb.com>
+
+ maint: avoid new syntax-check failures
+ * src/kwset.c (struct kwset): Split a line longer than 80.
+ * bootstrap: Update from gnulib. This fixes a new syntax-check
+ failure due to its use of "time stamp".
+
+2017-01-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ * NEWS: Fix typo.
+
+ * src/kwset.c: Fix comment typo.
+
+ Improve -i performance in typical UTF-8 searches
+ Currently ‘grep -i i’ is slow in a UTF-8 locale, because ‘i’ in
+ the pattern matches the two-byte character 'ı' (U+0131, LATIN
+ SMALL LETTER DOTLESS I) in data, and kwset handles only
+ single-byte character translations, so grep falls back on a slower
+ DFA-based search for all searches. Improve -i performance in the
+ typical case by using kwset when data are free of troublesome
+ characters like 'ı', falling back on the DFA only when data
+ contain troublesome characters.
+ * src/dfasearch.c (GEAcompile):
+ * src/grep.c (compile_fp_t):
+ * src/kwsearch.c (Fcompile):
+ * src/pcresearch.c (Pcompile):
+ Pattern arg is now char *, not char const *, since Fcompile
+ now reallocates it sometimes.
+ * src/grep.c (all_single_byte_after_folding): Remove.
+ All callers removed.
+ (fgrep_icase_charlen): New function.
+ (fgrep_icase_available, try_fgrep_pattern):
+ Use it, for more-generous semantics.
+ (fgrep_to_grep_pattern): Now extern.
+ (main): Do not free keys, since Fexecute may use them.
+ * src/kwsearch.c (struct kwsearch): New struct.
+ (Fcompile): Return it. If -i, be more generous about patterns.
+ (Fexecute): Use it. Fall back on DFA when the data contain
+ troublesome characters; this should be rare in practice.
+ * src/kwset.c, src/kwset.h (kwswords): New function.
+
+ build: update gnulib submodule to latest
+
+2017-01-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: prefer ptrdiff_t to size_t
+ The code already cannot handle objects with size greater than
+ SIZE_MAX / 2, so be more honest about it and use ptrdiff_t instead
+ of size_t. ptrdiff_t arithmetic is signed, which allows for more
+ checking via -fsanitize=undefined. It also makes the code a tad
+ smaller on x86-64, since it can test for < 0 rather than for ==
+ SIZE_MAX.
+ * src/dfasearch.c (struct dfa_comp.kwset_exact_matches):
+ (kwsmusts, EGexecute):
+ * src/kwsearch.c (Fcompile, Fexecute):
+ * src/kwset.c (struct kwset.kwsexec, kwsincr, memchr_kwset)
+ (memoff2_kwset, bmexec_trans, bmexec, cwexec, acexec_trans)
+ (acexec, kwsexec):
+ * src/kwset.h (struct kwsmatch.index, .offset, .size):
+ Prefer ptrdiff_t to size_t where either will do.
+
+2017-01-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: improve comments, mostly in kwset
+ Remove kwset.h comments that are obsolete and seemingly not
+ maintained anyway; people can look in kwset.c instead.
+ Update comments to reflect current behavior better.
+ Cite Faro & Lecroq 2013. Use GNU style for end-of-sentence.
+
+2017-01-01 Jim Meyering <meyering@fb.com>
+
+ maint: update gnulib and copyright dates for 2017
+ * gnulib: Update to latest.
+ * all files: Run "make update-copyright".
+
+2016-12-31 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: speed up -x with many patterns
+ * src/kwsearch.c (Fcompile): Improve buffer allocation overhead
+ with -x and multiple patterns. In the common case where '\n' is
+ the end-of-line byte, avoid copying other than the first and last
+ patterns.
+
+2016-12-31 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest, fixing a parallel getopt test failure
+
+2016-12-29 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: space before paren
+
+ grep: int cleanup in kwset.c
+ This should affect only theoretical bugs with very large inputs.
+ On my platform, this patch shrinks the grep text by 136 bytes.
+ * src/kwset.c: Include intprops.h, for INT_MULTIPLY_WRAPV.
+ (struct trie, struct kwset, kwsalloc, kwsincr, treedelta, kwsprep)
+ (bm_delta2_search, bmexec_trans, cwexec): Prefer ptrdiff_t to int
+ when counts can exceed INT_MAX in large inputs, at least in theory.
+ (hasevery): Use bool for booleans.
+ (bmexec_trans): Avoid undefined behavior on integer overflow.
+
+2016-12-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: improve performance with multiple patterns
+ * src/grep.c (main): Avoid fgrep-to-grep conversion for word matching
+ with multiple patterns in single byte locales.
+
+2016-12-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ * NEWS: Fix typo.
+
+ grep: fix bug with '... | grep pat >> /dev/null'
+ Problem reported by Benno Fünfstück (Bug#25283).
+ * NEWS: Document this.
+ * src/grep.c (drain_input) [SPLICE_F_MOVE]:
+ Don't assume /dev/null is always acceptable output to splice.
+ * tests/grep-dev-null-out: Test for the bug.
+
+2016-12-26 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor performance tweak for pure functions
+ * src/search.h (wordchars_size, wordchar_next, wordchar_prev):
+ Declare to be pure.
+
+2016-12-25 Zev Weiss <zev@bewilderbeest.net>
+
+ grep: move localeinfo to grep.c
+ It's not really dfasearch-specific, and grep.c initializes it, so it
+ seems like the most appropriate "owner".
+
+ * src/dfasearch.c (localeinfo): Remove.
+ * src/grep.c (localeinfo): Add.
+ * src/search.h (localeinfo): Move to new commented section.
+
+2016-12-25 Zev Weiss <zev@bewilderbeest.net>
+
+ pcresearch: thread safety
+ * src/pcresearch.c (pcre_comp): New struct to hold previously-global
+ state.
+ (jit_exec): Operate on a pcre_comp parameter instead of global state.
+ (Pcompile): Allocate and return a pcre_comp instead of setting global
+ variables.
+ (Pexecute): Operate on a pcre_comp parameter instead of global state.
+
+ kwsearch: thread safety
+ * src/kwsearch.c (Fcompile): Return a kwset_t instead of setting a
+ global variable.
+ (Fexecute): Use a passed-in kwset_t instead of a global variable.
+ (kwset): Remove global variable.
+
+ dfasearch: thread safety
+ * src/dfasearch.c (struct dfa_comp): New struct to hold
+ previously-global variables.
+ (dfawarn): Remove static variable.
+ (kwsmusts): Operate on a dfa_comp parameter instead of global
+ variables.
+ (GEAcompile): Allocate and return a dfa_comp struct instead of setting
+ global variables.
+ (EGexecute): Operate on a dfa_comp parameter instead of global
+ variables.
+ * src/searchutils.c (kwsinit): Replace a static array with a
+ dynamically-allocated one.
+
+2016-12-25 Zev Weiss <zev@bewilderbeest.net>
+
+ grep: prepare search backends for thread-safety
+ To facilitate removing mutable global state from search backends,
+ compile() functions will return an opaque pointer to backend-specific
+ data, which must then be passed back into the corresponding execute()
+ function. This is merely a preparatory step changing function
+ signatures and call sites, so the pointers passed & returned are
+ dummies for now and not (yet) actually used.
+
+ * src/grep.c (compile_fp_t): Now returns an opaque pointer (the
+ compiled pattern).
+ (execute_fp_t): Now passed the pointer returned by a compile_fp_t.
+ All call sites updated accordingly.
+ (compiled_pattern): New static variable.
+ * src/dfasearch.c (GEAcompile): Return a void pointer (dummy NULL).
+ (EGexecute): Receive a void pointer argument (unused).
+ * src/kwsearch.c (Fcompile): Return a void pointer (dummy NULL).
+ (Fexecute): Receive a void pointer argument (unused).
+ * src/pcresearch.c (Pcompile): Return a void pointer (dummy NULL).
+ (Pexecute): Receive a void pointer argument (unused).
+ * src/search.h: Update compile/execute function prototypes.
+
+2016-12-24 Jim Meyering <meyering@fb.com>
+
+ maint: fix "syntax-check" failure
+ * src/grep.c (SEP_STR_GROUP): Declare "static".
+
+2016-12-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix comment in searchutils.c
+
+ grep: improve word checking with UTF-8
+ * src/searchutils.c: Do not include <verify.h>.
+ (word_start): Remove, replacing with ...
+ (sbwordchar): New static var. All uses changed.
+ (wordchar_prev): Return size_t, not bool, as this generates
+ slightly better code. Go back faster if UTF-8.
+
+ grep: standardize on localeinfo.multibyte
+ * src/dfasearch.c (EGexecute):
+ * src/grep.c (main):
+ * src/kwsearch.c (Fexecute):
+ * src/pcresearch.c (Pcompile):
+ Prefer localeinfo.multibyte to (MB_CUR_MAX > 1).
+
+ grep: speed up -wf in C locale
+ Problem reported by Norihiro Tanaka (Bug#22357#100).
+ This patch improves the performance on that benchmark on my
+ platform so that grep is now only about 2x slower than grep 2.26,
+ which means it is considerably faster than grep 2.25 and earlier.
+ * src/kwsearch.c (Fexecute):
+ Use wordchars_size to boost performance for this case.
+ * src/search.h, src/searchutils.c (wordchars_size): New function.
+
+ grep: specialize word-finding functions
+ This improves performance a bit.
+ * src/dfasearch.c, src/kwsearch.c (wordchar):
+ Remove; now in searchutils.c.
+ * src/grep.c (main): Call wordinit if -w.
+ * src/search.h: Adjust.
+ * src/searchutils.c: Include verify.h.
+ (word_start): New static var.
+ (wordchar): Move here from dfasearch.c and kwsearch.c.
+ (wordinit, wordchars_count, wordchar_next, wordchar_prev):
+ New functions.
+ (mb_prev_wc, mb_next_wc): Remove.
+ All callers changed to use the new functions instead.
+
+ grep: simplify Fexecute
+ * src/kwsearch.c (Fexecute): Avoid the need for a 'try' local or
+ for a 'goto success'. Update mb_start to reflect newline found.
+
+ grep: remove C label
+ * src/kwsearch.c (Fexecute): Remove label.
+
+ maint: rewrite to avoid some macros
+ These days, the dangerous powers of C macros are not needed if
+ constants or functions will do just as well.
+ * src/grep.c (SEP_CHAR_SELECTED, SEP_CHAR_REJECTED, SEP_STR_GROUP)
+ (INITIAL_BUFSIZE):
+ * src/kwset.c (DEPTH_SIZE):
+ Now constants, not macros.
+ * src/kwset.c (link): Remove macro. Instead, rename local vars
+ from 'link' to 'cur'.
+ (malloc) [GREP]: Remove macro. All uses of malloc changed to xmalloc.
+ Omit double-inclusion of xalloc.h. Do not depend on 'GREP'.
+ (U): Now a function, not a macro.
+ * src/kwset.c, src/searchutils.c (NCHAR): Move this macro to ...
+ * src/system.h: ... here, and make it a constant.
+
+2016-12-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix performance with multiple patterns
+ Problem reported by Jaroslav Skarvada (Bug#22357).
+ * NEWS: Document this and other recent performance fixes.
+ * src/grep.c (E_MATCHER_INDEX): New constant.
+ (all_single_byte_after_folding):
+ New function, split out from fgrep_icase_available.
+ (fgrep_icase_available): Use it.
+ (try_fgrep_pattern): New function, which also uses it.
+ (main): With two or more patterns, use try_fgrep_pattern to fix
+ performance regression. The number "two" here is just a heuristic.
+
+ grep: simplify matcher configuration
+ * src/grep.c (matcher, compile): Remove static vars.
+ (compile_fp_t): Now takes a 3rd syntax argument.
+ (Gcompile, Ecompile, Acompile, GAcompile, PAcompile): Remove.
+ (struct matcher): Now nameless, since it is used only once.
+ Make 'name' a bit shorter. New member 'syntax'.
+ (matchers): Initialize it, and change removed functions to GEAcompile.
+ (F_MATCHER_INDEX, G_MATCHER_INDEX): New constants.
+ (setmatcher): New arg MATCHER, and return new matcher index.
+ Avoid unnecessary call to strcmp.
+ (main): Keep matcher as a local int, not a global pointer.
+ * src/kwsearch.c (Fcompile):
+ * src/pcresearch.c (Pcompile): Ignore the 3rd syntax argument.
+
+ grep: simplify line counting in patterns
+ * src/grep.c (n_patterns): Rename from patfile_lineno,
+ as it is now origin-zero. Now size_t, not uintmax_t.
+ (count_nl_bytes, fl_add): Simplify to just buffer and size.
+ All callers changed.
+
+2016-12-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2016-12-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+ build: update gnulib submodule to latest
+
+2016-12-13 Jim Meyering <meyering@fb.com>
+
+ tests: use just-built grep in more places
+ * cfg.mk (PATH): Prepend $(srcdir)/src, so that we use the just-
+ built grep also when running commands like those of "make distcheck".
+ This would have avoided the recently-luckily-noticed infloop bug.
+ Tested by running this in a just-built directory:
+ f=src/grep; printf '%s\n' '#!/bin/sh' 'sleep 9h' > $f; chmod a+x $f
+ and then verifying that nearly every "make syntax-check" rule hangs.
+
+ maint: tell "syntax-check" not to worry about the NEWS update
+ Whenever we change "old" NEWS, we have to update this checksum.
+ Otherwise, a "make syntax-check" test that guards against a class
+ of logical merge conflicts will fail.
+ * cfg.mk (old_NEWS_hash): Update this hash to accommodate the
+ recent clarification of a 2.27 NEWS entry.
+
+2016-12-13 Arnold D. Robbins <arnold@skeeve.com>
+
+ build: update gnulib submodule to latest
+ * src/dfasearch.c (GEAcompile): Remove use of flag, RE_ICASE covers it.
+
+2016-12-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: work around proc lseek glitch
+ Problem reported by Andreas Schwab (Bug#25180).
+ * NEWS: Document this.
+ * src/grep.c (finalize_input): Ignore EINVAL lseek failures.
+ * tests/Makefile.am (TESTS): Add proc.
+ * tests/proc: New file.
+
+2016-12-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: simplify finalize_input
+ * src/grep.c (finalize_input): Simplify without changing behavior.
+ It's still a bit of a rat's-nest, but it's a cozier rat's-nest.
+
+ maint: clarify early-exit news for 2.27
+ * NEWS: Mention early-exit options to avoid confusion. See:
+ http://lists.gnu.org/archive/html/grep-devel/2016-12/msg00007.html
+
+2016-12-06 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.27
+ * NEWS: Record release date.
+
+2016-11-29 Jim Meyering <meyering@fb.com>
+
+ grep: fix DFA-induced infloop
+ * gnulib: Update to latest, for the DFA infloop fix.
+ * tests/dfa-infloop: New test, to trigger an infinite loop
+ in the DFA matcher.
+ * tests/Makefile.am (TESTS): Add it.
+
+2016-11-28 Jim Meyering <meyering@fb.com>
+
+ tests: use "returns_ N env VAR=val ..."
+ rather than "VAR=val returns_ N ..."
+ Some shells do not propagate envvar settings through our use
+ of the "returns_" function, so set any envvar via use of "env".
+ This was an issue at least on Ubuntu and Debian-based systems,
+ presumably due to their common use of "dash" as /bin/sh.
+ Reported by Assaf Gordon.
+ * tests/char-class-multibyte: As above.
+ * tests/euc-mb: Likewise.
+ * tests/false-match-mb-non-utf8: Likewise.
+ * tests/pcre-infloop: Likewise.
+ * tests/pcre-jitstack: Likewise.
+ * tests/sjis-mb: Likewise.
+ * tests/warn-char-classes: Likewise.
+
+2016-11-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: revert check for unibyte French range bug
+ The test wasn't portable, as it assumed that rational ranges
+ were not in effect. Problem reported by Eric Blake (Bug#25048#8).
+ There doesn't seem to be a portable way to do the test, so omit it.
+ * tests/init.cfg, tests/unibyte-bracket-expr:
+ Revert previous change.
+
+ build: update gnulib submodule to latest
+
+2016-11-27 Jim Meyering <meyering@fb.com>
+
+ grep: avoid false matches in non-UTF8 multibyte locales
+ * gnulib: Update to latest, for the dfa.c fix.
+ * NEWS (Bug fixes): Mention it.
+ * tests/false-match-mb-non-utf8: New file, with tests for this.
+ Based on tests from Stephane Chazelas.
+ * tests/Makefile.am (TESTS): Add it.
+ Introduced by commit v2.18-54-g3ef4c8e, a change that made grep use
+ its DFA matcher more aggressively. The malfunction arises only with
+ the DFA matcher, not with regex.
+ Reported by Stephane Chazelas in https://bugs.gnu.org/24975
+
+2016-11-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: check for unibyte French range bug
+ Problem reported by Stephane Chazelas (Bug#24973).
+ This bug was fixed in Gnulib.
+ * NEWS: Document the fix.
+ * tests/init.cfg (require_ru_RU_koi8_r): Remove.
+ * tests/unibyte-bracket-expr: Add a test for the bug.
+ Call get-mb-cur-max directly instead of bothering with
+ require_ru_RU_koi8_r.
+
+ build: update gnulib submodule to latest
+
+2016-11-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: further -P performance fix
+ Problem reported by Stephane Chazelas in:
+ http://bugs.gnu.org/22655#103
+ * src/pcresearch.c (Pexecute): Set the subject to the start of
+ each line as it is found.
+
+ grep: -P no longer uses PCRE_MULTILINE
+ This reverts commit f6603c4e1e04dbb87a7232c4b44acc6afdf65fef,
+ as the extra performance is not worth the trouble for PCRE users.
+ Problem reported by Stephane Chazelas in:
+ http://bugs.gnu.org/22655#103
+ * NEWS: Document this and the next patch.
+ * src/dfasearch.c (EGexecute):
+ * src/grep.c (execute_fp_t):
+ * src/kwsearch.c (Fexecute):
+ * src/pcresearch.c (Pexecute):
+ First arg is now a const pointer again.
+ * src/grep.c (buf_has_encoding_errors): Now static.
+ * src/grep.h (buf_has_encoding_errors): Remove decl.
+ * src/search.h: Adjust decls.
+ * src/pcresearch.c (reflags): Remove. All uses removed.
+ (Pcompile, Pexecute): Do not use PCRE_MULTILINE.
+
+2016-11-19 Jim Meyering <meyering@fb.com>
+
+ doc: fix a doubled "the"
+ * doc/grep.texi (--perl-regexp): s/the\nthe/the/
+
+2016-11-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix -zxP bug
+ * NEWS: Document this.
+ * src/pcresearch.c (Pcompile): Search a line at a time if -x is
+ used, since -x uses ^ and $.
+ * tests/pcre: Test this.
+
+ grep: simplify by using PRIuMAX
+ * configure.ac (HAVE_PRINTF_C99_SIZES): Remove; no longer needed.
+ * src/grep.c (print_offset): Simplify (Bug#24451).
+
+ grep: -T now adjusts number widths for worst case
+ * NEWS, doc/grep.texi (Output Line Prefix Control):
+ Document this (Bug#24451).
+ * src/grep.c (offset_width): New static var.
+ (print_offset): Use it instead of arg. All callers changed.
+ (grep): Set it.
+ * tests/initial-tab: Test this.
+
+ grep: -T no longer outputs BS
+ * NEWS: Document this (Bug#24451).
+ * src/grep.c (print_line_head): Do not attempt to backspace output.
+ * tests/initial-tab: New test.
+ * tests/Makefile.am (TESTS): Add it.
+
+ grep: document -oz better
+ * doc/grep.texi (General Output Control, Usage): Tweak (Bug#24961).
+
+ grep: fix performance typo with -P
+ Reported by Zev Weiss in: http://bugs.gnu.org/22655#88
+ * src/pcresearch.c (Pcompile): Initialize reflags.
+
+ tests: use "returns_" rather than "$?"
+ * tests/grep-dev-null-out: Use "returns_ 124" rather than testing
+ $? = 124.
+
+ grep -f /dev/null -L PAT FILE outputs FILE
+ * NEWS: Document this.
+ * src/grep.c (main): Do not exit right away with -L.
+ * tests/skip-read: Test for the fix.
+
+ grep: tune -f /dev/null
+ * src/grep.c (main): Do the -f /dev/null early-exit checks before
+ more-expensive tests that involve syscalls.
+
+ grep: treat -f /dev/null like -m0
+ * NEWS: Document this.
+ * src/grep.c (main): With -f /dev/null, don't bother to read the
+ input. This is what FreeBSD grep does.
+ * tests/Makefile.am (TESTS): Add skip-read.
+ * tests/skip-read: New file.
+
+ grep: avoid O(N**2) buffer reallocation
+ * src/grep.c (main): Use x2realloc to avoid O(N**2) performance as
+ pattern buffers grow.
+
+ grep: avoid unnecessary gettext call
+ Translate "(standard input)" lazily.
+ * src/grep.c (input_filename): New function.
+ (suppressible_error): Remove 1st arg, since it is always
+ input_filename (). All callers changed.
+ (suppressible_error, print_filename, grep, grepdesc): Use it.
+ (grep_command_line_arg): Set filename to NULL if standard
+ input has no label. Often, this avoids all calls to gettext,
+ which can be a win as the first call can be expensive.
+
+ grep: drain the input pipe faster
+ * src/grep.c (dev_null_output): Now static.
+ (drain_input): New function, using 'splice' if that makes sense.
+ (finalize_input): Use it.
+ (main): Omit now-unnecessary initialization.
+
+ grep: scale back /dev/null speedup
+ The performance improvement when output is /dev/null (commit
+ af6af288eac28951b5eee1eaaf373e22b2193b7b dated 2016-05-01)
+ breaks scripts that run "PROGRAM | grep PATTERN >/dev/null"
+ where PROGRAM dies when writing into a broken pipe.
+ Suppress the improvement if standard input is not seekable.
+ Problem reported by Gary Johnson (Bug#24941).
+ * NEWS: Document this.
+ * src/grep.c (seek_failed): New static var.
+ (seek_data_failed): Move decl earlier, to be next to seek_failed.
+ (file_must_have_nulls): Skip useless syscalls if seek_failed.
+ Lessen source-code nesting.
+ (reset): Set seek_failed and seek_data_failed.
+ Try lseek even on non-regular files.
+ (grep): New arg INEOF. All callers changed.
+ Do not clear seek_data_failed here, since 'reset' now does this.
+ (finalize_input): New static function.
+ (grepdesc): Use it.
+ (main): Do not exit on first match merely because output is
+ /dev/null.
+ * tests/grep-dev-null-out: Adjust to new behavior.
+
+ grep: improve diagnostic on lseek failure
+ * src/grep.c (reset): Mention the file name in the (unlikely)
+ chance of an lseek failure.
+
+ grep: avoid unnecessary isatty calls
+ This fixes an inefficiency that was mistakenly introduced a while
+ back, when the macro SET_BINARY became defined on all platforms.
+ * src/grep.c (grepdesc, main): Do not unnecessarily call isatty on
+ POSIXish platforms.
+
+ grep: -Pz no longer rejects ^, $
+ Problem reported by Stephane Chazelas (Bug#22655).
+ * NEWS: Document this.
+ * doc/grep.texi (grep Programs): Warn about -Pz.
+ * src/pcresearch.c (reflags): New static var.
+ (multibyte_locale): Remove static var; now local to Pcompile.
+ (Pcompile): Check for (? and (* too. Set reflags instead of
+ dying when problematic operators are found.
+ (Pexecute): Use reflags to decide whether searches should
+ be multiline.
+ * tests/pcre: Test new behavior.
+
+2016-11-14 Jim Meyering <meyering@fb.com>
+
+ tests: use "returns_" rather than explicit comparison with "$?"
+ * tests/sjis-mb (encode): Rearrange to emit desired input into
+ a file, rather than piping directly into grep. That permits
+ the use of returns_ 1 to verify timeout's exit status.
+ * tests/euc-mb: Use "returns_ 1" rather than testing $? = 1
+ * tests/char-class-multibyte: Likewise.
+ * tests/dfa-heap-overrun: Likewise.
+ * tests/encoding-error: Likewise.
+ * tests/fedora: Likewise.
+ * tests/grep-dev-null: Likewise.
+ * tests/init.cfg (envvar_check_fail): Likewise.
+ * tests/kwset-abuse: Likewise.
+ * tests/mb-non-UTF8-overrun: Likewise.
+ * tests/multibyte-white-space: Likewise.
+ * tests/pcre-infloop: Likewise.
+ * tests/surrogate-pair: Likewise.
+ * tests/warn-char-classes: Likewise.
+ Do the same for other values:
+ * tests/backref-multibyte-slow: Likewise.
+ * tests/euc-mb: Likewise.
+ * tests/pcre-abort: Likewise.
+ * tests/pcre-jitstack: Likewise.
+ * tests/repetition-overflow: Likewise.
+ * tests/reversed-range-endpoints: Likewise.
+ * tests/warn-char-classes: Likewise.
+
+2016-10-26 Jim Meyering <meyering@fb.com>
+
+ doc: grep builds on HP-UX once again
+ * NEWS (Bug fixes): Mention the HP-UX fix.
+
+ gnulib: update to latest, for getprogname HPUX port
+
+2016-10-22 Mark Veltzer <mark.veltzer@gmail.com>
+
+ ignore coverage generated files
+
+ ignore ar-lib in build-aux
+
+2016-10-20 Zev Weiss <zev@bewilderbeest.net>
+
+ grep: use 'j' intmax_t printf length modifier if supported
+ * configure.ac: Use gl_PRINTF_SIZES_C99 to test printf and
+ (conditionally) define HAVE_PRINTF_C99_SIZES.
+ * src/grep.c (print_offset): Use printf("%j...") for printing
+ [u]intmax_t if HAVE_PRINTF_C99_SIZES is defined; otherwise continue
+ using the existing hand-rolled loop.
+
+2016-10-15 Jim Meyering <meyering@fb.com>
+
+ build: distribute new file, die.h, so "make distcheck" passes
+ * src/Makefile.am (grep_SOURCES): Add die.h.
+ Also, sort these file names.
+
+2016-10-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2016-10-09 Jim Meyering <meyering@fb.com>
+
+ maint: die.h: add the "#define ..." part of double inclusion guard
+ * src/die.h (DIE_H): Define to 1.
+
+2016-10-04 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: don't assume stdbool.h before die call
+ * src/die.h: Include stdbool.h, since 'die' uses 'false'.
+
+ grep: die more systematically
+ * src/die.h: New file.
+ * src/dfasearch.c, src/grep.c, src/pcresearch.c: Include die.h.
+ * src/dfasearch.c (dfaerror):
+ * src/grep.c (context_length_arg, add_count, prline, setmatcher, main):
+ * src/pcresearch.c (jit_exec, Pcompile, Pexecute):
+ Use 'die' instead of 'error' when exiting.
+ * src/pcresearch.c: Do not include verify.h.
+ (die): Remove; now in die.h.
+ * src/search.h: Do not include error.h here, since this file does
+ not use anything defined in error.h. Instead, dfasearch.c, which
+ uses error.h's symbols, now includes error.h directly.
+
+2016-10-02 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.26
+ * NEWS: Record release date.
+
+2016-10-01 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest; for getprogname fix
+
+2016-10-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests/grep-dir: port to Solaris 10
+ * tests/grep-dir: Port to Solaris 10 'cat', which
+ exits with status 0 even after 'read' fails from a directory.
+
+2016-09-28 Jim Meyering <meyering@fb.com>
+
+ build: placate GCC 7's -Wimplicit-fallthrough
+ * src/pcresearch.c (die): New macro.
+ (Pexecute): Use it in place of offending uses of error,
+ to placate GCC 7's -Wimplicit-fallthrough.
+ Include verify.h. Since this is grep's first explicit use of this
+ gnulib module, ...
+ * bootstrap.conf (gnulib_modules): Add verify.
+
+ gnulib: update to latest; for ...
+ This includes the following:
+ - a getprogname-vs-openbsd-5.1 portability fix
+ - "fallthru" comment-adding changes for dfa and unistr/u8-uctomb-aux.c
+ - another getprogname fix to avoid breaking newer glibc
+
+2016-09-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: reword .git old-GCC warning
+ * configure.ac (gl_gcc_warnings): Reword diagnostic.
+ Suggested by Assaf Gordon in:
+ http://lists.gnu.org/archive/html/grep-devel/2016-09/msg00024.html
+
+ build: port .git builds to newer GCC
+ * configure.ac (gl_gcc_warnings): Omit duplicate copy of 'main'.
+ Problem reported by Assaf Gordon in:
+ http://lists.gnu.org/archive/html/grep-devel/2016-09/msg00024.html
+
+ build: port .git builds to older GCC
+ Problem reported by Dagobert Michelsen in:
+ http://lists.gnu.org/archive/html/grep-devel/2016-09/msg00018.html
+ * configure.ac (gl_gcc_warnings): Default to false if .git
+ exists but GCC is too old.
+
+2016-09-27 Jim Meyering <meyering@fb.com>
+
+ tests/long-pattern-perf: avoid false-failure due to cache speed
+ * tests/long-pattern-perf: This test would fail semi-consistently
+ on some systems, probably because the smaller regexp fit well
+ within cache, yet the larger one did not. In that case, there
+ was a relative speed difference greater than 20x and the test
+ would fail. Quadruple the sizes, to make that less likely.
+ Also, construct the 10x larger regexp directly from the smaller,
+ rather than relying on seq with endpoints to induce that
+ approximate size ratio. Reported by Bruce Dubbs in
+ https://lists.gnu.org/archive/html/grep-devel/2016-09/msg00013.html
+
+2016-09-24 Jim Meyering <meyering@fb.com>
+
+ build: avoid "./configure && make dist" missing-dep. failure
+ * Makefile.am (run-syntax-check): Depend on "all", to avoid a
+ parallel build failure due to a missing dependency. Reported by
+ Paul Eggert in https://bugs.gnu.org/24256#50
+
+2016-09-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2016-09-24 Jim Meyering <meyering@fb.com>
+
+ tests/fmbtest: avoid false-failure due to reliance on MB-correct sed
+ * tests/fmbtest: Several of these tests would mistakenly fail due to
+ postprocessing with a combination of sed and locale support that failed
+ to handle some multibyte characters in the cs_CZ.UTF-8 locale. Instead
+ of relying on sed's multibyte support or anything locale-related to
+ perform this simple filtering, just use this: tr -cs '0-9' '[ *]'
+ Also, rather than exporting LC_ALL, just set it for each command.
+ Reported by Nelson H. F. Beebe.
+ https://bugs.gnu.org/24534
+
+ tests: revamp multibyte-white-space test to be more permissive
+ This test elicits too many failures. Whether a system has accurate
+ Unicode "whitespace" attributes should not influence whether grep's
+ test suite passes. In many cases, now you will see a warning that
+ some multibyte characters do not pass whitespace-related tests, but
+ this test no longer fails. However, if you run this test on a modern
+ enough system, it does require that \s and \S do work properly with
+ most of the listed characters.
+ * tests/multibyte-white-space: Confirm that Fedora 24's locale
+ tables still declare those four Unicode code points *not* whitespace.
+ Honor a new column telling how to handle failure. Provide more
+ information in each diagnostic.
+ Reported by Nelson H. F. Beebe.
+ https://bugs.gnu.org/24530
+
+ tests: avoid erroneous failure of pcre-jitstack test
+ On some systems (*BSD), 'ulimit -s unlimited' would fail, yet the
+ test for that mistakenly masked the failure, so the following grep
+ command ended up failing with a segfault.
+ * tests/pcre-jitstack: Don't mask the ulimit failure.
+ Reported privately by Nelson H. F. Beebe.
+ https://bugs.gnu.org/24524
+
+2016-09-23 Jim Meyering <meyering@fb.com>
+
+ grep: avoid unwarranted "input file 'F' is also the output" on *BSD
+ On *BSD systems, any command like "echo y | grep x", where grep reads
+ from a pipe and writes to standard output, would mistakenly emit this:
+ grep: input file '(standard input)' is also the output
+ * src/grep.c (grepdesc): Ensure that the file descriptor we're
+ reading is a regular one before using SAME_INODE to test whether
+ it is the same as the descriptor open on standard output.
+ Nelson Beebe reported privately that the foad1 tests failed on many
+ BSD systems. Exposed by commit v2.25-2-gaf6af28.
+ https://bugs.gnu.org/24522
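Illustrative sketch of the check described in this entry, not grep's actual code. The function name input_is_output is hypothetical, and the device/inode comparison is written out by hand where grep uses gnulib's SAME_INODE macro.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: report whether IN_FD and OUT_FD refer to the
   same regular file.  Pipes, sockets and terminals are rejected before
   any inode comparison, so "echo y | grep x" can never be diagnosed
   as reading its own output.  */
bool
input_is_output (int in_fd, int out_fd)
{
  struct stat in_st, out_st;

  if (fstat (in_fd, &in_st) != 0 || fstat (out_fd, &out_st) != 0)
    return false;
  if (!S_ISREG (in_st.st_mode))
    return false;
  return (in_st.st_dev == out_st.st_dev
          && in_st.st_ino == out_st.st_ino);
}
```

A caller would pass the descriptor being read and STDOUT_FILENO, and emit the "is also the output" diagnostic only when the helper returns true.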
+
+ tests: avoid backref-multibyte-slow false failure
+ * tests/backref-multibyte-slow (max_seconds): If we calculate
+ a max duration of 1 second, use 5. Otherwise, on high-latency
+ systems, it would be way too easy for the duration of the final
+ test run to exceed that limit. Reported by Nelson H. F. Beebe.
+ http://bugs.gnu.org/24516
+
+2016-09-22 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest; for getprogname-vs-AIX fix
+
+2016-09-18 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: add news entry for fix to bug#24233
+ * NEWS (Bug fixes): Add an entry describing bug#24233.
+ The bug was fixed by commit v2.25-77-gad468bb, by chance.
+
+2016-09-15 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+2016-09-10 Jim Meyering <meyering@fb.com>
+
+ dfa: reflect move of grep's DFA code to gnulib
+ Now that the core DFA code and tests reside in gnulib,
+ remove the copies here and use what gnulib provides.
+ * bootstrap.conf: Use the dfa module.
+ * cfg.mk: Remove settings involving files that have moved.
+ (_gl_TS_unmarked_extern_functions): Add dfaerror and dfawarn.
+ It is wrong/ugly to have to define these global symbols to use
+ the dfa module, but we'll adjust that separately.
+ * po/POTFILES.in: Apply s/src/lib/ to src/dfa.c.
+ * src/Makefile.am: Remove mention of dfa.[ch] and localeinfo.[ch].
+ * tests/Makefile.am: Remove mention of the tests that we have
+ moved to the gnulib module.
+ * src/dfa.c: Remove file.
+ * src/dfa.h: Likewise.
+ * src/localeinfo.c: Likewise.
+ * src/localeinfo.h: Likewise.
+ * tests/dfa-match: Likewise.
+ * tests/dfa-match-aux.c: Likewise.
+ * tests/invalid-char-class: Likewise.
+
+ gnulib: update to latest, for new dfa module
+
+2016-09-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: encoding errors suppress just their line
+ From a suggestion by Marcello Perathoner (Bug#22838).
+ * NEWS, doc/grep.texi (File and Directory Selection): Document this.
+ * src/grep.c (print_line_head): Do not suppress later output lines
+ merely because an earlier output line would have had an encoding error.
+ * tests/encoding-error: Test for the new behavior.
+
+2016-09-08 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest, for getprogname fixes
+
+2016-09-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: additional change new option for anchored searches
+ * src/dfa.c (dfaexec_main): Do it.
+
+2016-09-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: define "context lines"
+ Reported by Igor Bogomazov via Santiago Ruano Rincón (Bug#24024).
+ * doc/grep.texi (Context Line Control): Define "context lines".
+
+ build: update gnulib submodule to latest
+
+2016-09-05 Jim Meyering <meyering@fb.com>
+
+ maint: switch from gnulib's progname to getprogname module
+ * gnulib: Update to latest, for its new getprogname module.
+ * bootstrap.conf (avoided_gnulib_modules): Include the getprogname
+ module rather than the now-obsolescent progname.
+ * src/grep.c: Include "getprogname.h" rather than "progname.h"
+ and remove any use of set_program_name.
+ * tests/dfa-match-aux.c (main): Likewise.
+ * tests/get-mb-cur-max.c (main): Likewise.
+ * src/grep.c (usage, main): Use getprogname() in place of program_name.
+
+2016-09-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: minor cleanup of previous change
+ * src/dfa.c (dfaexec_main): Omit redundant code and reindent.
+
+2016-09-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: additional change new option for anchored searches
+ * src/dfa.c (dfaexec_main): Do it.
+
+ dfa: use single-byte algorithm even in non-UTF-8
+ * src/dfa.c (dfaexec_main): Do it. (This was inadvertently
+ omitted in a recent patch.)
+
+2016-09-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: merge xalloc.h changes from Gawk
+ * src/dfa.h (_GL_ATTRIBUTE_MALLOC): Define here, as other
+ Gnulib .h files do. This is more consistent with Gawk.
+ * src/dfa.c: Include xalloc.h, since dfa.h no longer does so.
+ Include localeinfo.h later; we don't care about order, but Gawk does.
+
+2016-09-02 Arnold Robbins <arnold@skeeve.com>
+
+ dfa: port to C90
+ * src/dfa.c (dfamust): Avoid declarations after statement (Bug#21486).
+
+2016-09-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: new option for anchored searches
+ This follows up on a suggestion by Norihiro Tanaka (Bug#24262).
+ * src/dfa.c (struct regex_syntax): New member 'anchor'.
+ (char_context): Use it.
+ (dfasyntax): Change signature to specify it, along with the old
+ FOLD and EOL args, as a single DFAOPTS arg. All uses changed.
+ * src/dfa.h (DFA_ANCHOR, DFA_CASE_FOLD, DFA_EOL_NUL): New constants
+ for dfasyntax new last arg.
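Illustrative sketch of folding separate boolean arguments into one options word, as this entry does for dfasyntax. The EX_ names, their values, and make_dfaopts are assumptions made for the example; they are not copied from dfa.h.

```c
#include <stdbool.h>

/* Illustrative option bits, modeled on the DFA_ANCHOR, DFA_CASE_FOLD
   and DFA_EOL_NUL constants named in the entry.  The values here are
   assumed for the sketch.  */
enum
{
  EX_ANCHOR = 1 << 0,
  EX_CASE_FOLD = 1 << 1,
  EX_EOL_NUL = 1 << 2
};

/* Hypothetical caller-side helper: combine three booleans into the
   single options word that replaces the separate FOLD and EOL
   arguments.  */
int
make_dfaopts (bool anchor, bool fold, bool eol_nul)
{
  return ((anchor ? EX_ANCHOR : 0)
          | (fold ? EX_CASE_FOLD : 0)
          | (eol_nul ? EX_EOL_NUL : 0));
}
```

A single bit mask lets later options be added without changing the function signature again.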
+
+2016-09-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: simplify and optimize at initial state in execution
+ * src/dfa.c (skip_remains_mb): Remove argument *pwc. Update caller.
+ (dfaexec_main): Simplify and optimize at initial state (Bug#24261).
+
+ dfa: simplify to find state index for state 0
+ * src/dfa.c (dfastate): Simplify to find state index for state 0.
+
+2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ tests: add a new test for SJIS locale
+ * tests/sjis-mb: Add a new test. It fails in grep-2.25 or prior.
+
+2016-09-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: update NEWS
+ * NEWS: Describe previous change.
+
+ grep: use regex fastmap unless -i
+ This builds on a suggestion by Norihiro Tanaka (Bug#24009).
+ * src/dfasearch.c (GEAcompile): Use a fastmap unless -i.
+ This improves performance 20x for me using the first benchmark
+ given in Bug#24009.
+
+ grep: improve dfasearch storage management
+ This patch is mostly refactoring, with a bit of performance tweaking.
+ It is done in preparation for a fix for Bug#24009.
+ * src/dfasearch.c (patterns): Now of type struct re_pattern_buffer *
+ instead of an anonymous struct pointer, since there is no longer
+ any need to keep regs here. All uses changed.
+ (GEAcompile): Use patlim instead of a hard-to-follow "total".
+ Use x2nrealloc to avoid potential O(N**2) reallocation algorithm.
+ Initialize just the pattern members that need clearing.
+ (EGexecute): Put regs into a static variable, as this code did
+ before 2001-02-18, as there is no need to have a separate set of
+ regs for each pattern. Explain the "Q@#%!#" comment better.
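Illustrative sketch of why geometric growth avoids the O(N**2) reallocation cost mentioned for GEAcompile. The helper grow_array is hypothetical and simply doubles the capacity. It returns NULL on failure, whereas gnulib's xalloc functions exit instead, and it makes no claim about gnulib's exact growth factor.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: grow the array P, which currently has room for
   *PN objects of size S, to at least twice that capacity (or to one
   object if *PN is zero).  On success, update *PN and return the new
   block.  On overflow or allocation failure, return NULL and leave P
   and *PN untouched.  */
void *
grow_array (void *p, size_t *pn, size_t s)
{
  assert (s != 0);

  size_t n = *pn;
  if (n == 0)
    n = 1;
  else if (n > SIZE_MAX / 2)
    return NULL;
  else
    n *= 2;

  if (n > SIZE_MAX / s)
    return NULL;

  void *q = realloc (p, n * s);
  if (q)
    *pn = n;
  return q;
}
```

Appending N items this way triggers only O(log N) reallocations, so the total copying cost stays linear in N rather than quadratic.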
+
+2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: remove separation by context in transition in non-UTF8 multibyte locales
+ * src/dfa.c (struct dfa): Remove member curr_dependent. All uses
+ removed.
+
+2016-09-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: document previous change
+ * NEWS: Adjust to match previous change.
+
+2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: avoid invalid character matching period
+ * dfa.c (transit_state): Avoid invalid character matching period.
+
+ dfa: use single-byte algorithm even in non-UTF-8
+ Even in non-UTF8 locales, if the current input character
+ is single byte, we can use CSET to match ANYCHAR.
+ * src/dfa.c (struct dfa): New member canychar.
+ Cache index of CSET for ANYCHAR.
+ (lex): Make CSET for ANYCHAR.
+ (state_index): Simplify.
+ (dfastate): Consider CSET for ANYCHAR.
+ (transit_state_singlebyte, transit_state): Remove handling for eolbyte,
+ as we assume that eolbyte does not appear at current position.
+ (dfaexec_main): Always use the single-byte algorithm for any
+ single-byte character in the input text.
+ (dfasyntax): Initialize canychar.
+
+2016-09-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: avoid code duplication with -iF
+ This follows up on the -iF performance improvement (Bug#23752).
+ * NEWS: Simplify description of -iF improvement.
+ * src/dfa.c: Do not include wctype.h.
+ (lonesome_lower, case_folded_counterparts): Move to localeinfo.c.
+ (CASE_FOLDED_BUFSIZE): Move to localeinfo.h.
+ * src/grep.c: Do not include wctype.h.
+ (lonesome_lower): Remove.
+ (fgrep_icase_available): Use case_folded_counterparts instead.
+ Do not call it for the same character twice.
+ Return false on wcrtomb failures (which should never happen).
+ (fgrep_to_grep_pattern, main): Simplify. Let fgrep_to_grep’s
+ caller fiddle with the global variables.
+ * src/localeinfo.c: Include <wctype.h>
+ (lonesome_lower, case_folded_counterparts):
+ Move here from src/dfa.c. Return int, not unsigned int.
+ Verify that CASE_FOLDED_BUFSIZE is big enough.
+ * src/localeinfo.h (CASE_FOLDED_BUFSIZE): Now 32, so that
+ we don’t expose lonesome_lower’s size.
+ * src/searchutils.c (kwsinit): Return new kwset instead of
+ storing it via a pointer. All callers changed. Simplify a bit.
+
+2016-09-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: speed up -iF in multibyte locales
+ In a multibyte locale, if a pattern is composed of only single byte
+ characters and all their counterparts are also single byte characters
+ and the pattern does not have invalid sequences, grep -iF uses the
+ fgrep matcher, the same as in a single byte locale (Bug#23752).
+ * NEWS: Mention it.
+ * src/grep.c (lonesome_lower): New constant.
+ (fgrep_icase_available): New function.
+ (fgrep_to_grep_pattern): Simplify it.
+ (main): Use them.
+ * src/searchutils.c (kwsinit): New arg MB_TRANS; all uses changed.
+ Try fgrep matcher for case insensitive matching by grep -F in multibyte
+ locale.
+
+2016-08-31 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2016-08-31 Jim Meyering <meyering@fb.com>
+
+ maint: avoid new 'make syntax-check' failure
+ * src/dfa.c (using_simple_locale): Prefer STREQ(a,b) over
+ strcmp(a,b) == 0.
+
+ gnulib: update to latest
+
+2016-08-31 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: make dfa.c fully thread-safe
+ This follows up on Zev Weiss’s recent patches to make the DFA code
+ thread-safe (Bug#24249). It removes the remaining static
+ variables used by dfa.c. These variables are locale-dependent, so
+ they would cause problems in multithreaded code where different
+ threads are in different locales (e.g., via uselocale). I
+ abstracted most of the variables into a new localeinfo module.
+ * src/Makefile.am (grep_SOURCES): Add localeinfo.c.
+ (noinst_HEADERS): Add localeinfo.h.
+ * src/dfa.c: Include localeinfo.h.
+ (struct dfa): Remove multibyte member, as it is now part of
+ localeinfo. New members simple_locale and localeinfo.
+ Put locale-related members at the end.
+ (mbrtowc_cache): Remove; now part of dfa->localeinfo.
+ (charclass_index): Rename back from dfa_charclass_index,
+ since it's private.
+ (unibyte_word_constituent): New arg DFA; use its sbctowc member.
+ (using_utf8, dfa_using_utf8, init_mbrtowc_cache, check_utf8):
+ Remove; now done by localeinfo members. All uses changed.
+ (dfasyntax): New localeinfo arg. Move to end to avoid forward decls.
+ Initialize the entire DFA.
+ (unibyte_c, check_unibyte_c): Remove; now in simple_locale member.
+ (using_simple_locale): Now takes bool instead of DFA.
+ Do the locale check here, rather than in the caller,
+ as the result is now cached in dfa->simple_locale.
+ (dfaalloc): Just allocate the DFA. dfasyntax now initializes it.
+ * src/dfa.h: Add forward decl of struct localeinfo.
+ Adjust to new dfa.c API.
+ * src/dfasearch.c (localeinfo): New var, replacing former static
+ vars like mbrtowc_cache.
+ * src/localeinfo.c, src/localeinfo.h: New files.
+ * src/search.h: Include localeinfo.h.
+ (localeinfo): New decl.
+ * src/searchutils.c (mbclen_cache, build_mbclen_cache):
+ Remove. All uses changed to localeinfo.
+ * tests/Makefile.am (dfa_match_aux_LDADD): Add localeinfo.o.
+ * tests/dfa-match-aux.c: Include localeinfo.h.
+ (main): Adjust to changes in DFA API.
+
+2016-08-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+ This should fix Bug#24323 reported by Dennis Clarke, where grep
+ does not build on Solaris 10 when compiled with Solaris Studio 12.4.
+
+2016-08-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: minor thread-safety cleanups
+ * src/dfa.c (struct lexer_state): Rename lexptr to ptr and lexleft
+ to left, for brevity. All uses changed.
+ (struct dfa): Rename lexstate to lex and parsestate to parse,
+ for brevity. All uses changed.
+ (using_simple_locale): Simplify boolean expression.
+ (FETCH_WC): Parenthesize uses of dfa macro arg.
+ (FETCH_WC, parse_bracket_exp, addtok_mb): Prefer suffix operators
+ on structure members when possible, for clarity.
+ (parse_bracket_exp): Check for buffer exhaustion before
+ dereferencing buffer pointer.
+ (struct lexptr): New type.
+ (push_lex_state, pop_lex_state): Use it. Change from macros
+ PUSH_LEX_STATE and POP_LEX_STATE to static functions, and add
+ parameters to make them proper C functions. All uses changed.
+ (lex): Simplify tests for \) and \|. Avoid some string
+ duplication by using &"^..."[boolean].
+ (dfaalloc): Use xzalloc, not xcalloc with 1.
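Illustrative sketch of the &"^..."[boolean] idiom mentioned for lex. The function name and the literal "^x" are made up for the example.

```c
#include <stdbool.h>

/* Return "^x" when ANCHORED is true, else "x".  Indexing the literal
   with a boolean (0 or 1) and taking the address skips the leading
   '^' when the index is 1, so both results share one string literal
   instead of duplicating its tail.  */
const char *
maybe_anchored (bool anchored)
{
  return &"^x"[!anchored];
}
```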
+
+2016-08-21 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor tweaks of initial buffer alloc
+ * src/grep.c (main): Allocate input buffer only when about
+ to do I/O. Avoid int overflow on systems with 2 GiB pages.
+ Fix size_t overflow check.
+
+2016-08-20 Zev Weiss <zev@bewilderbeest.net>
+
+ dfa: constify some function parameters
+ * src/dfa.c (char_context): Mark dfa parameter const.
+ (charclass_context): Likewise.
+
+ dfa: thread-safety: initialize mbrtowc_cache in dfa_init
+ * src/dfa.c (dfasyntax): Remove initialization of mbrtowc_cache.
+ (init_mbrtowc_cache): New function.
+ (dfa_init): Call it.
+ http://bugs.gnu.org/24259
+
+ dfa: thread-safety: eliminate static local variables
+ * src/dfa.c: Replace utf8 and unibyte_c static local variables with
+ static globals initialized by a new function dfa_init() which must be
+ called before any other dfa*() functions.
+ (dfa_using_utf8): Rename using_utf8() to dfa_using_utf8() for
+ consistency with other exported functions.
+ * src/dfa.h (dfa_using_utf8): Rename using_utf8() to dfa_using_utf8();
+ also add _GL_ATTRIBUTE_PURE.
+ (dfa_init): New function.
+ * src/grep.c (main), tests/dfa-match-aux.c (main): Call dfa_init().
+ * src/dfasearch.c (EGexecute): Replace using_utf8 with dfa_using_utf8.
+ * src/kwsearch.c (Fexecute): Likewise.
+ * src/pcresearch.c (Pcompile): Likewise.
+ http://bugs.gnu.org/24259
+
+ dfa: thread-safety: move regex syntax configuration into struct dfa
+ * src/dfa.c: move global variables holding regex syntax configuration
+ into a new struct (`struct regex_syntax') and add an instance of it to
+ struct dfa. All references to the globals are replaced with
+ references to the dfa struct's new member. As a side effect, a
+ `struct dfa' must be allocated with dfaalloc() and passed to
+ dfasyntax().
+ * src/dfa.h (dfasyntax): Add new struct dfa* parameter.
+ * src/dfasearch.c (GEAcompile): Allocate `dfa' earlier and pass it to
+ dfasyntax().
+ * tests/dfa-match-aux.c (main): Pass `dfa' to dfasyntax().
+ http://bugs.gnu.org/24259
+
+ dfa: thread-safety: move parser state into struct dfa
+ * src/dfa.c: move global variables holding parser state (`tok' and
+ `depth') into a new struct (`struct parser_state') and add an instance
+ of it to struct dfa. All references to the globals are replaced by
+ references to the dfa struct's new member.
+ http://bugs.gnu.org/24259
+
+ dfa: thread-safety: move lexer state into struct dfa
+ * src/dfa.c: move global variables holding lexer state into a new
+ struct (`struct lexer_state') and add an instance of this struct to
+ struct dfa. All references to the globals are replaced with
+ references to the dfa struct's new member.
+ http://bugs.gnu.org/24259
+
+2016-08-19 Zev Weiss <zev@bewilderbeest.net>
+
+ dfa: thread-safety: remove dfa.c's "dfa" global
+ Remove the global dfa struct. Instead, add a struct dfa pointer
+ parameter to each function that had been using the global.
+ * src/dfa.c (dfa): Remove file-scoped global.
+ (charclass_index): Remove now-unnecessary function.
+ (using_simple_locale): Add a dfa parameter and update all callers.
+ (FETCH_WC, parse_bracket_exp, lex, addtok_mb, addtok): Likewise.
+ (addtok_wc, add_utf8_anychar, atom, nsubtoks, copytoks): Likewise.
+ (closure, branch, regexp): Likewise.
+ (dfaparse): No longer set the global.
+ http://bugs.gnu.org/24260
+
+2016-08-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: tune list_files conversion to enum
+ * src/grep.c (grepdesc): Use a slightly more-efficient way to test
+ list_files.
+
+ grep: prefer bitwise to short-circuit when shorter
+ * src/grep.c (skip_devices, initialize_unibyte_mask, fillbuf, main)
+ * src/kwsearch.c (Fexecute): Prefer bitwise to short-circuit ops
+ when they are logically equivalent and the bitwise ops generate
+ shorter code on GCC 6.1 x86-64.
+ * src/grep.c (get_nondigit_option, parse_grep_colors):
+ Use c_isdigit instead of spelling it out with a short-circuit op.
+
+2016-08-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: use 64-bit when ulong is at least that wide
+ * src/dfa.c (charclass_word): Now unsigned long instead of unsigned.
+ (CHARCLASS_WORD_BITS): Now 64 on 64-bit platforms.
+ (CHARCLASS_PAIR, CHARCLASS_INIT): New macros.
+ (CHARCLASS_WORD_MASK): Now a static const, since it no longer
+ needs to be a macro.
+ (equal): Open-code rather than calling memcmp.
+ (add_utf8_anychar): Use CHARCLASS_INIT.
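Illustrative sketch of a 256-bit character set stored in unsigned long words, the layout this entry moves toward. All ex_ names are hypothetical. The word width is derived from sizeof, which assumes unsigned long has no padding bits, and this may differ from how dfa.c computes CHARCLASS_WORD_BITS.

```c
#include <limits.h>
#include <stdbool.h>

typedef unsigned long ex_word;

/* Bits per word, and the number of words needed to hold 256 bits.  */
enum { EX_WORD_BITS = sizeof (ex_word) * CHAR_BIT };
enum { EX_NWORDS = (256 + EX_WORD_BITS - 1) / EX_WORD_BITS };

typedef struct { ex_word w[EX_NWORDS]; } ex_charclass;

/* Add byte value B (0..255) to the set C.  */
void
ex_setbit (unsigned int b, ex_charclass *c)
{
  c->w[b / EX_WORD_BITS] |= (ex_word) 1 << (b % EX_WORD_BITS);
}

/* Return true if byte value B (0..255) is in the set C.  */
bool
ex_tstbit (unsigned int b, const ex_charclass *c)
{
  return (c->w[b / EX_WORD_BITS] >> (b % EX_WORD_BITS)) & 1;
}
```

With 64-bit words the set occupies four words instead of eight, so whole-set operations such as comparison or union touch half as many words.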
+
+ dfa: avoid uninitialized constants
+ Some compilers warn about 'static int const x;' on the grounds
+ that X should have an initializer. Instead of worrying about
+ this, rewrite to avoid this sort of thing.
+ * src/dfa.c (emptyset): New function.
+ (parse_bracket_exp): Use it instead of 'equal' and a zero constant.
+ * src/dfasearch.c (struct patterns): Remove tag 'patterns'.
+ (patterns0): Remove zero constant.
+ (GEAcompile): Use memset instead of the zero constant.
+
+2016-08-17 Jim Meyering <meyering@fb.com>
+
+ maint: avoid new "make syntax-check" failure
+ * src/dfa.c: Adjust comment not to go past column 80.
+
+ tests: pcre-jitstack: avoid false failure without base64 -d support
+ * tests/pcre-jitstack: Try harder to find a base64 decoder:
+ try 'base64 -d', 'base64 -D', 'openssl base64 -d' and perl's
+ MIME::Base64 decode_base64. The old code would fail at least on
+ OS X, for which base64 expects -D or --decode.
+ Reported by Jack Howarth in http://bugs.gnu.org/24243.
+
+2016-08-16 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: minor refactoring and doc fixes
+ * NEWS: Improve description of recent change.
+ * src/dfa.c: Improve commentary. Indent new code (and some
+ long-existing howlers) more in GNU style.
+ (dfa_state): Reorder members to make struct smaller on x86.
+ mb_trindex member is now state_num, not size_t, so that -1 is more
+ natural; all uses changed.
+ (struct dfa): Similarly for mb_trcount member.
+ (state_index): Compute values for new state components before
+ allocating the state, to make the code easier to understand.
+ (state_index, dfastate): Prefer A & ~B to other forms like (A & B)
+ != A.
+ (dfastate, build_state, transit_state): In new code, prefer i++ to
+ ++i in for-loop control.
+ (build_state, transit_state): In new code, prefer < to >.
+ (transit_state): Add to *PP in one assignment, rather than in a
+ loop. Prefer !x to x == NULL. Use xmalloc instead of xnmalloc,
+ since the size is a constant. Do the size calculation as a signed
+ integer constant expression, so that the compiler diagnoses any
+ overflow.
+ (transit_state, free_mbdata): Tune by looping from -1 to N - 1,
+ rather than from 0 to N - 1 with a separate instance for -1.
+ (dfaexec_main): Rewrite to avoid side effects in if-part.
+ (free_mbdata): Simplify.
+
+ dfa: port to C90
+ * src/dfa.c (transit_state, dfa_supported, dfamust):
+ Don't use declarations after statements.
+ If I recall correctly, gawk still wants to port to C90.
+
+ dfa: fix context newline confusion
+ * src/dfa.c (transit_state): Fix "... & ~0" that was evidently
+ intended to be "... & ~1". Do index calculation in a simpler way,
+ that uses just addition (Bug#21486).
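Illustrative sketch of the difference between the two masks named in this entry. The function names are hypothetical, and the example shows only the arithmetic, not the surrounding transit_state logic.

```c
/* X & ~0 leaves X unchanged, because ~0 has every bit set.  */
int
mask_all (int x)
{
  return x & ~0;
}

/* X & ~1 clears the low-order bit and keeps the rest.  */
int
clear_low_bit (int x)
{
  return x & ~1;
}
```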
+
+2016-08-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: improve leading "." with non-UTF8 multibyte
+ In non-UTF8 multibyte locales, matching the dot expression is very
+ slow, as the next state is calculated on demand. This change caches
+ the result for the typical case (Bug#21486).
+
+ Compare the run times of this command before and after this change,
+ on an i5-4570 CPU @ 3.20GHz using rawhide (~fedora 22) and compiled
+ with gcc 5.1.1 20150618:
+ yes "$(printf 'a%38db\n' 0)" | head -1000000 >in
+ env LC_ALL=ja_JP.eucJP time -p \
+ src/grep .......................................... in
+ Before: 19.10
+ After : 0.55
+
+ * NEWS: Document this.
+ * src/dfa.c (struct dfa_state): New members curr_dependent, mb_trindex.
+ (MAX_TRCOUNT): New constant.
+ (struct dfa): New members mb_trans, mb_trcount.
+ (state_index): Initialize new members of struct dfa_state and calculate
+ dependency on context of next character for positions for dot.
+ (dfastate): Calculate follows positions for dot if enabled.
+ (realloc_trans_if_necessary): Allocate transition tables.
+ (build_state): Use new constant and reset transition tables.
+ (transit_state): Use cache for transition from a state with the dot
+ expression.
+ (free_mbdata): Deallocate transition tables.
+
+2016-08-06 Jim Meyering <meyering@fb.com>
+
+ tests: standardize on 10-second timeouts to avoid rare false failure
+ In a parallel test run, it is not unusual to exceed a timeout of
+ 1-3 seconds. Increase several from 3 or fewer to 10 seconds.
+ * tests/skip-device: Increase timeout from 2 to 10 seconds.
+ * tests/grep-dev-null-out: Likewise, but s/1/10/.
+ * tests/pcre-invalid-utf8-input: Likewise, but s/3/10/.
+ * tests/dfa-match: Likewise.
+ * tests/pcre-invalid-utf8-infloop: Likewise.
+ * tests/pcre-infloop: Likewise.
+ * tests/max-count-overread: Likewise.
+ * tests/invalid-multibyte-infloop: Likewise.
+ Prompted by http://bugs.gnu.org/24159.
+
+ tests/backref-multibyte-slow: avoid false positive
+ * tests/backref-multibyte-slow: When redirecting the "fast" LC_ALL=C
+ run's output to /dev/null, we got an artificially low timing (of 0),
+ due to grep's own stdout-vs-/dev/null optimization. With an initial
+ timing of 0 on that first run, the derived timeout for the UTF-8 run
+ (which redirects to a file) would be a mere 1 second. The fix: also
+ redirect that first run's output to a file, not to /dev/null.
+
+2016-08-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: minor fix for whether dfa is "fast"
+ * src/dfa.c (dfaoptimize): When a UTF-8 optimization succeeds for
+ a DFA (it can use single-byte code paths), record that by setting
+ its ->fast flag.
+
+2016-07-25 Jim Meyering <meyering@fb.com>
+
+ grep: print "filename:lineno:" in invalid-regex diagnostic
+ Determining the file name and line number is a little tricky because
+ of the way the regular expressions are all concatenated onto a newline-
+ separated list. By the time grep would compile regular expressions,
+ the <filename,lineno> origin of each regexp was no longer available.
+ This patch adds a list of filename,first_lineno pairs, one per input
+ source, by which we can then map the ordinal regexp number to a
+ filename,lineno pair for the diagnostic.
+
+ * src/dfasearch.c (GEAcompile): When diagnosing an invalid regexp
+ specified via -f FILE, include the "FILENAME:LINENO: " prefix.
+ Also, when there are two or more lines with compilation failures,
+ diagnose all of them, rather than stopping after the first.
+ * src/grep.h (pattern_file_name): Declare it.
+ * src/grep.c (struct FL_pair): Define type.
+ (fl_pair, n_fl_pair_slots, n_pattern_files, patfile_lineno):
+ Define globals.
+ (fl_add, pattern_file_name): Define functions.
+ (main): Call fl_add for each type of the following: -e argument,
+ -f argument, command-line-specified (without -e) regexp.
+ * tests/filename-lineno.pl: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Improvements): Mention this.
+ Initially reported by Gunnar Wolf in https://bugs.debian.org/525214
+ Forwarded to grep's bug list by Santiago Ruano Rincón as
+ http://debbugs.gnu.org/23965
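Illustrative sketch of mapping an ordinal regexp number back to a filename,lineno pair, as this entry describes. The struct, the function name, and the linear scan are assumptions for the example and are not taken from src/grep.c.

```c
#include <stddef.h>

/* Hypothetical record, one per pattern source (-e argument, -f FILE,
   or bare command-line regexp).  FIRST_LINENO is the 1-based index,
   within the concatenated pattern list, of this source's first line.  */
struct ex_fl_pair
{
  const char *filename;
  size_t first_lineno;
};

/* Map the 1-based ordinal LINENO within the concatenated list to its
   source.  PAIRS must be sorted by first_lineno and N must be > 0.
   Store the line number relative to that source in *REL_LINENO and
   return the source's file name.  */
const char *
ex_pattern_file_name (const struct ex_fl_pair *pairs, size_t n,
                      size_t lineno, size_t *rel_lineno)
{
  size_t i = 0;
  while (i + 1 < n && pairs[i + 1].first_lineno <= lineno)
    i++;
  *rel_lineno = lineno - pairs[i].first_lineno + 1;
  return pairs[i].filename;
}
```

For example, if file "a" contributes pattern lines 1 to 3 and file "b" starts at line 4, then ordinal 5 maps to line 2 of "b".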
+
+2016-07-24 Jim Meyering <meyering@fb.com>
+
+ tests: add coreutils' perl-driven test framework
+ * configure.ac: Set the AM_CONDITIONAL variable, HAVE_PERL.
+ * tests/Coreutils.pm: New file.
+ * tests/CuSkip.pm: New file.
+ * tests/CuTmpdir.pm: New file.
+ * tests/no-perl: New file.
+ * tests/Makefile.am: Set up to use .pl tests:
+ (TEST_EXTENSIONS, TESTSUITE_PERL, TESTSUITE_PERL_OPTIONS): Define.
+ (SH_LOG_COMPILER, PL_LOG_COMPILER): Define.
+ (EXTRA_DIST): Add the four new file names.
+
+ doc: omit an excess word in HACKING
+
+2016-07-21 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: always match single line only with DFA superset
+ \n cannot occur inside a multibyte character, so with the DFA
+ superset a match always lies within a single line.
+
+ * src/dfasearch.c (EGexecute): Simplify it with above.
+
+2016-07-15 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: fix whitespace problems
+ * src/dfa.c: Use GNU style for pointer decls.
+
+2016-07-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: modernize HACKING a bit
+ * HACKING: Remove some ancient history to simplify maintenance.
+
+2016-07-14 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor style changes for -F crash fix
+ * src/kwset.c (memoff2_kwset): Use ?: instead of if-else.
+
+2016-07-14 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: fix -F crash when alternating duplicates
+ grep -F crashes with a pattern like 0\n0.
+ This bug was introduced in 966f6586fbce3081ce6e5e2f9b55301b0ec3d2b4.
+
+ * src/kwset.c (memoff2_kwset): If two characters are the same,
+ use memchr instead of memchr2.
+ * tests/two-chars: New test.
+ * tests/Makefile.am (TESTS): Add it.
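Illustrative sketch of dispatching to memchr when the two bytes to search for are identical. The function ex_memoff2 is hypothetical: it uses a plain loop as a stand-in for gnulib's memchr2, and it does not reproduce the crash this entry fixes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the offset of the first occurrence of byte A or byte B in
   the N bytes starting at S, or SIZE_MAX if neither occurs.  When A
   and B are equal, a single-byte search suffices.  */
size_t
ex_memoff2 (const char *s, size_t n, unsigned char a, unsigned char b)
{
  if (a == b)
    {
      const char *p = memchr (s, a, n);
      return p ? (size_t) (p - s) : SIZE_MAX;
    }

  for (size_t i = 0; i < n; i++)
    {
      unsigned char c = s[i];
      if (c == a || c == b)
        return i;
    }
  return SIZE_MAX;
}
```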
+
+2016-07-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: fix comments to match code better
+ * src/dfa.c: Fix comments.
+
+2016-07-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: don't treat null bytes specially
+ * src/dfa.c (transit_state): Do not treat null byte specially
+ when eolbyte == '\n'.
+
+2016-07-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: don't distinguish letters in non-POSIX locales
+ For non-POSIX locales, dfa does not support word delimiters,
+ so remove the distinction between letters and non-letters.
+ * src/dfa.c (struct dfa): Remove members initstate_letter,
+ initstate_others. All uses removed. New member initstate_notbol.
+ (dfaanalyze, dfaexec_main): Replace old members with new member.
+ (wchar_context): Remove. Update callers.
+
+2016-07-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: minor cleanups for non-POSIX simplification
+ * src/dfa.c (transit_state_singlebyte): Remove unnecessary 'const'
+ from arg; we usually don't bother with 'const' on locals.
+ (transit_state_singlebyte): Omit '!= NULL' in boolean context.
+ Use assert rather than abort.
+
+2016-07-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: simplify for non-POSIX locales
+ Simplify the dfa code, since it no longer supports ranges,
+ collating elements, and equivalence classes in non-POSIX locales.
+ * src/dfa.c (struct dfa): Remove mb_match_lens.
+ (enum status_transit_state, match_anychar)
+ (check_matching_with_multibyte_ops, transit_state_consume_1char):
+ (State_transition): Remove.
+ (transit_state_singlebyte): Accept a pointer-to-pointer position
+ instead of a pointer, and no longer accept a pointer to the next state.
+ Return next state instead of status_transit_state. All callers
+ changed.
+ (transit_state_singlebyte, transit_state): Simplify.
+ (dfaexec_main): Now transit_state is called only when next character
+ matches with ANYCHAR.
+
+2016-06-14 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: propagate more changes from grep.texi
+ Problem reported by Björn Voigt in: http://bugs.gnu.org/23763#27
+ * doc/grep.in.1: Fix more inconsistencies with grep.texi.
+
+2016-06-13 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: remove obsolete MS-DOS mention
+ * doc/grep.in.1: Remove obsolete discussion of MS-DOS heuristics.
+ Problem reported by Björn Voigt in: http://bugs.gnu.org/23763
+
+2016-06-09 Zev Weiss <zev@bewilderbeest.net>
+
+ grep: do pagesize initialization and buffer allocation earlier
+ * src/grep.c (reset, main): We're going to need pagesize and buffer
+ initialized anyway, so we might as well do so unconditionally early on
+ rather than checking on every call to reset().
+ http://bugs.gnu.org/23717
+
+ grep: remove unnecessary dirdesc variable.
+ * src/grep.c (grepdirent): Remove dirdesc variable and just use
+ fts_cwd_fd directly, since the fts_options test was guaranteed to
+ succeed (and fts_cwd_fd was already being used directly in fstatat()
+ anyway). http://bugs.gnu.org/23716
+
+ grep: convert list_files to an enum
+ * src/grep.c: Make list_files a tristate enum instead of an int.
+ http://bugs.gnu.org/23715
+
+ grep: correct a stale comment and remove dead code
+ * src/grep.c (grepdesc): The `grep()' function no longer has
+ special-case negative return values, since it no longer handles
+ directories, so don't bother checking for them.
+ http://bugs.gnu.org/23714
+
+ maint: replace bitwise with logical OR
+ * src/grep.c (main): replace bitwise ORs with logical ORs where it
+ makes sense (when dealing with boolean conditions as opposed to
+ bitmasks). http://bugs.gnu.org/23713
+
+ maint: mark a couple of static variables const
+ * src/dfa.c (parse_bracket_exp): mark zeroclass const.
+ * src/dfasearch.c: mark patterns0 const.
+ http://bugs.gnu.org/23712
+
+2016-06-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: fix similar bug in exit status test
+ * tests/grep-dir (status_range): New shell function.
+ Use it to fix bug where $? was not saved properly.
+
+2016-06-03 Zev Weiss <zev@bewilderbeest.net>
+
+ tests: fix bug in exit status test
+ When checking $? against multiple values, save its value in another
+ variable and check that so as to avoid tests beyond the first seeing a
+ $? clobbered by earlier ones.
+
+ * tests/status: save $? in a temporary variable before testing it.
+
+2016-06-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: more simplification of dfaexec_main
+ * src/dfa.c (dfaexec_main): Just after a newline, a failure at an
+ accepting position or a demand to build a state is unlikely, so go to
+ the next loop iteration without checking for them. This commit induces
+ no semantic change.
+
+2016-06-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: correct attribution
+ * build-aux/git-log-fix: Fix attribution of primary Aho-Corasick patch.
+
+2016-06-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: simplify -F Aho-Corasick a bit
+ This removes some tuning that complicates the code without providing
+ performance benefits that I could measure (GCC 6.1, x86-64).
+ (acexec_trans): Do not hand-unroll. Unduplicate the code for a
+ transition step.
+
+ * src/kwset.c (struct kwset.kwsexec, bmexec, acexec_trans, acexec)
+
+2016-06-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor cleanups for -F Aho-Corasick
+ * NEWS: Don't claim 7x, as the value seems to be system-dependent.
+ * src/kwset.c (struct kwset.kwsexec, bmexec, acexec_trans, acexec):
+ * src/kwset.c, src/kwset.h (kwsalloc, kwsexec):
+ Don't put 'const' into the declaration when that is irrelevant to
+ the API. More generally, don't bother with 'const' when it's only
+ a local so it is reasonably obvious to a reader that it is 'const'
+ anyway. It would be overkill to add 'const' to all locals that
+ never change.
+ * src/kwset.c (U): Avoid unnecessary parens.
+ (treefails, memoff2_kwset, bmexec_trans, bmexec, cwexec, acexec_trans):
+ Prefer SIZE_MAX to (size_t) -1.
+ (bmexec_trans, cwexec, acexec_trans):
+ Remove attributes for static functions that no longer seem needed.
+ (memoff2_kwset): Rename from memchr2_kwset, since it returns
+ an offset, not a pointer. All uses changed.
+ (cwexec, acexec_trans) [lint]: Remove initialization that is no
+ longer needed; at least, GCC 6.1 x86-64 does not need it.
+ (acexec_trans): Clarify code by using nesting rather than 'continue'.
+
+2016-06-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: use memchr2 for two patterns of a character
+ * src/kwset.c (memchr2_kwset): New function. grep uses memchr2 to
+ search for just two letters.
+ (cwexec, acexec_trans): Use it.
+
+ grep: -F multiword longest match not always needed
+ When searching for multiple fixed words, grep now returns immediately,
+ without seeking the longest match, when that is not needed. Previously,
+ grep sought the longest match for multiple words even when not needed.
+ * src/kwset.c (kwsexec, acexec, cwexec, bmexec): Add a bool argument
+ for whether longest match is needed. All callers changed.
+ * src/kwset.h (kwsexec): Update prototype.
+
+2016-06-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: use Aho-Corasick algorithm to search multiple fixed words
+ When searching for multiple fixed words, grep used the Commentz-Walter
+ algorithm, but this was O(m*n) and was very slow in the worst case.
+ For example:
+
+ - input: yes `printf %040d` | head -10000000
+ - word1: x0000000000000000000
+ - word2: x
+
+ This change instead uses the Aho-Corasick algorithm to search multiple
+ fixed words. It uses a high-quality trie-building function that is
+ already defined for Commentz-Walter in kwset.c.
+
+ With this change I see a 7x speed-up even for a typical case on Fedora 21
+ with a 3.2GHz i5. Using best-of-5 trials for the benchmark:
+
+ find /usr/share/doc/ -type f |
+ LC_ALL=C time -p xargs.sh src/grep -Ff /usr/share/dict/linux.words >/dev/null
+
+ The results were:
+
+ real 11.37 user 11.03 sys 0.24 [without the change]
+ real 1.49 user 1.31 sys 0.15 [with the change]
+
+ * src/kwset.c (struct kwset): Add a new member 'mode'.
+ (kwsalloc): Use it.
+ All callers are changed.
+ (kwsincr): Using Aho-Corasick algorithm, build tries in normal order.
+ (acexec_trans, acexec): New functions.
+ (kwsexec): Use it.
+ * src/kwset.h (kwsalloc): Update a prototype.
+ * NEWS (Improvements): Mention it.
+
+2016-05-13 Jim Meyering <meyering@fb.com>
+
+ maint: do not let a LANGUAGE envvar setting perturb tests
+ E.g., running "LANGUAGE=eo make check" would provoke a failure
+ of the encoding-error test, on systems that mistakenly let that
+ envvar trump the setting of LC_ALL.
+ * tests/envvar-check: New file, copied from coreutils.
+ * tests/Makefile.am (EXTRA_DIST): Add it.
+ (TESTS_ENVIRONMENT): Source it.
+ Also select TMPDIR as we do for coreutils tests.
+ Reported by Benno Schulenberg in http://bugs.gnu.org/23527.
+
+2016-05-02 Jim Meyering <meyering@fb.com>
+
+ maint: avoid NEWS syntax-check failure
+ * NEWS: Move the mention of the /dev/null speed-up from the
+ block for 2.25 into the current, in-preparation block.
+
+2016-05-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: prefer bool for boolean
+ * src/dfa.c (syntax_bits_set, dfasyntax, using_utf8, FETCH_WC)
+ (POP_LEX_STATE, State_transition):
+ * src/dfa.h (using_utf8):
+ Use bool for boolean.
+
+2016-05-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: stop exporting internal functions
+ * src/dfa.c, src/dfa.h (dfaparse, dfaanalyze, dfastate, dfainit):
+ Now static.
+
+ dfa: prefer bool at DFA interfaces
+ * src/dfa.c (struct dfa, dfasyntax, dfaanalyze, dfaexec_main)
+ (dfaexec_mb, dfaexec_sb, dfaexec_noop, dfaexec, dfacomp):
+ * src/dfa.h (dfasyntax, dfacomp, dfaexec, dfaanalyze):
+ * src/dfasearch.c (EGexecute):
+ Use bool for boolean.
+
+2016-05-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: speed up checking for character boundary
+ This should help performance with gawk; not so much with grep.
+ Suggested by Norihiro Tanaka in: http://bugs.gnu.org/18777
+ * src/dfa.c (never_trail): New static var.
+ (dfasyntax): Initialize it.
+ (skip_remains_mb): Use it to speed up a common case in Gawk.
+
+ grep: /dev/null output speedup
+ This sped up 'seq 10000000000 | grep . >/dev/null' by a factor of
+ 380,000 on my platform (Fedora 23, x86-64, AMD Phenom II X4 910e,
+ en_US.UTF-8 locale).
+ * NEWS: Document this.
+ * src/grep.c (grepbuf): exit_on_match no longer implies that -q
+ was specified, so when a match is found, exit with exit_failure if
+ an error was also found.
+ (grepdesc): Omit unnecessary S_ISREG and st_ino checks.
+ out_stat.st_ino is zero if stdout is not a regular file,
+ and this cannot possibly equal st->st_ino.
+ (main): Omit duplicate initialization of exit_failure. Do not
+ bother with isatty unless -q is not used and stdout is a character
+ special file and --color=auto and TERM says colorization is
+ possible. Most importantly, set exit_on_match if the output is
+ /dev/null.
+ * tests/grep-dev-null-out: New test.
+ * tests/Makefile.am (TESTS): Add it.
+ * tests/status: Do not require grep to actually read all the input
+ files when the output is /dev/null and a matching line has been
+ found.
+
+2016-04-21 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.25
+ * NEWS: Record release date.
+
+2016-04-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: remove dependency on btowc
+ MirOS BSD btowc is a macro that (when GCC is being used) hardcodes
+ btowc (0x80) == WEOF regardless of locale, which contradicts
+ future POSIX in the C locale. Instead of bothering to develop a
+ Gnulib workaround for the btowc incompatibility, use mbrtowc,
+ which we are using elsewhere and fixing anyway, and are caching so
+ it is fast here. Problem reported by Nelson H. F. Beebe via Jim
+ Meyering in: http://bugs.gnu.org/23269#14
+ * bootstrap.conf (gnulib_modules): Remove btowc.
+ * src/dfa.c (struct dfa): Remove mbrtowc_cache member, replacing with ...
+ (mbrtowc_cache): ... this new static var. All uses changed.
+ (dfambcache): Remove; now done by setsyntax. Call removed.
+ (is_valid_unibyte_character): Remove.
+ (IS_WORD_CONSTITUENT): Remove this macro, replacing it with ...
+ (unibyte_word_constituent): ... this new function. It uses
+ mbrtowc_cache rather than btowc.
+ (dfasyntax): Initialize mbrtowc_cache before using it.
+
+2016-04-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor doc tweaks inspired by Debian
+ Problem reported by Santiago Ruano Rincón in: http://bugs.gnu.org/22911
+ * doc/grep.in.1:
+ * doc/grep.texi (Matching Control, grep Programs)
+ (Regular Expressions):
+ Document -e, -f, and PCRE more carefully.
+
+2016-04-10 Jim Meyering <meyering@fb.com>
+
+ maint: remove unused mbtoupper function
+ * src/searchutils.c (mbtoupper): Remove now-unused function.
+ Also remove inclusion of <assert.h>, since this change removed
+ the final use of assert.
+ * src/search.h (mbtoupper): Remove declaration.
+
+2016-04-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: in C locale, all bytes are valid characters
+ This works around glibc bug 19932:
+ https://sourceware.org/bugzilla/show_bug.cgi?id=19932
+ The actual bug fix was the update to the current version of Gnulib.
+ grep problem reported by Björn Jacke in: http://bugs.gnu.org/23234
+ * NEWS: Mention this.
+ * doc/grep.texi (File and Directory Selection): Crossref to LC_*
+ section. Suggest why -a or LC_ALL=C might be useful.
+ (Environment Variables): Mention 'locale -a'.
+ Say that LC_CTYPE also specifies encoding, and that every
+ byte is a valid character in the C or POSIX locale.
+ * tests/c-locale: New test.
+ * tests/Makefile.am (TESTS): Add it.
+
+ build: update gnulib submodule to latest
+
+2016-04-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ Give another example of binary file processing
+ Problem reported by Shlomi Fish
+ * doc/grep.texi (File and Directory Selection):
+ Document that 'q$' might match 'q' followed by a NUL
+ if --binary-files=binary is in effect.
+
+2016-04-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: test egrep/fgrep help only if our grep
+ Problem reported by Christian Weisgerber in: http://bugs.gnu.org/23146
+ * tests/Makefile.am (TESTS_ENVIRONMENT):
+ Test egrep and fgrep only if they use our grep.
+
+2016-03-29 Jim Meyering <meyering@fb.com>
+
+ tests: remove spurious test of egrep
+ * tests/reversed-range-endpoints: Do not test egrep here.
+ There is already a test of grep -E.
+ Prompted by http://bugs.gnu.org/23146
+
+2016-03-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -Pz no longer misdiagnoses [^a]
+ Problem reported by Michael Jess.
+ * NEWS: Document this.
+ * src/pcresearch.c (Pcompile): Do not diagnose [^ when [ is unescaped.
+ * tests/pcre: Test for the bug.
+
+2016-03-22 Jim Meyering <meyering@fb.com>
+
+ maint: move new 'Improvements' blurb into proper section
+ * NEWS (Improvements): Move this new section from within the block
+ for the already-released 2.24 into the proper "next-release" block.
+ Also, retain the 2-blank-line separator between blocks.
+
+2016-03-18 Jim Meyering <meyering@fb.com>
+
+ maint: avoid spurious "binary file ... matches" in generated THANKS
+ * Makefile.am (THANKS): Don't apply grep to a stream containing
+ NUL bytes. Sync this rule from the one in coreutils: it was missing
+ some improvements.
+ Reported by Bailes Magio in http://bugs.gnu.org/22899
+
+2016-03-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -oz now outputs null bytes, not newlines
+ * NEWS: Document this.
+ * doc/grep.texi (Other Options): Clarify that -z affects output
+ as well as input data.
+ * src/grep.c (print_line_middle): Output eolbyte, not newline, if -o.
+ * tests/null-byte: Test -o too.
+ * tests/pcre-context: Adjust test to match new behavior.
+
+2016-03-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: use errno consistently in write diagnostics
+ Feature request and initial version reported by Assaf Gordon in:
+ http://bugs.gnu.org/23031
+ * NEWS: Document this.
+ * src/grep.c: Include <stdarg.h>.
+ (stdout_errno): New static var.
+ (write_error_seen): Remove; superseded by stdout_errno.
+ All uses changed.
+ (putchar_errno, fputs_errno, printf_errno, fwrite_errno)
+ (fflush_errno): New static functions.
+ (print_filename, print_sep, print_offset, print_line_head)
+ (print_line_middle, print_line_tail, prline, prtext, grep)
+ (grepdesc): Use them.
+ * tests/write-error-msg: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+2016-03-10 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.24
+ * NEWS: Record release date.
+
+2016-02-28 Jim Meyering <meyering@fb.com>
+
+ maint: add dist-check.mk
+ This file augments "make distcheck" rules.
+ * dist-check.mk: New file, from coreutils via gzip.
+ * Makefile.am (EXTRA_DIST): Add it.
+ * cfg.mk: Include it.
+
+2016-02-21 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -Pz is incompatible with ^ and $
+ Problem reported by Sergei Trofimovich in: http://bugs.gnu.org/22655
+ * NEWS: Document this.
+ * src/pcresearch.c (Pcompile): Warn with -Pz and anchors.
+ * tests/pcre: Test new behavior.
+
+2016-02-21 Jim Meyering <meyering@fb.com>
+
+ tests: test cleanup
+ * tests/z-anchor-newline: Remove test artifact that would write
+ to /t/x.
+
+2016-02-20 Jim Meyering <meyering@fb.com>
+
+ grep -z: avoid erroneous match with regexp anchor and \n in text
+ * src/dfasearch.c (EGexecute): Clear the newline_anchor bit when
+ eolbyte is not '\n'.
+ * tests/z-anchor-newline: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Describe it.
+ Originally reported by Ulrich Mueller in
+ https://bugs.gentoo.org/show_bug.cgi?id=574662
+ Reported to us by Sergei Trofimovich as http://debbugs.gnu.org/22655
+
+ tests: convert "cmd && fail=1" to "returns_ 1 cmd || fail=1"
+ The latter is robust, while the former can silently ignore
+ failure due to signals.
+ * cfg.mk (sc_prohibit_and_fail_1): New rule, copied from coreutils.
+ * tests/long-pattern-perf: Perform the above substitution.
+ * tests/mb-non-UTF8-performance: Likewise.
+ * tests/help-version: Merge from coreutils.
+
+2016-02-09 Jim Meyering <meyering@fb.com>
+
+ maint: add a check-very-expensive target
+ * Makefile.am (check-very-expensive): New convenience rule,
+ currently merely equivalent to check-expensive.
+
+2016-02-04 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.23
+ * NEWS: Record release date.
+
+2016-02-02 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+ Update for this "make distcheck"-fixing change:
+ > verify-tests: also remove stray test-verify.Tpo
+
+2016-02-01 Jim Meyering <meyering@fb.com>
+
+ tests/null-byte: test another code path
+ * tests/null-byte: Also exercise the case in which there is
+ a match in the block along with the NUL byte.
+
+2016-01-31 Paul Eggert <eggert@cs.ucla.edu>
+
+ Omit excess "Binary file ... matches"
+ Problem reported in: http://bugs.gnu.org/22461
+ * src/grep.c (grep): Don't report "Binary file ... matches"
+ merely because the file contained both matches and binary data.
+ Insist that the binary data contained a match.
+ * tests/null-byte: Add a test for this.
+
+2016-01-28 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+2016-01-23 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+ maint: fix typo in NEWS: s/a/an/
+
+2016-01-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -x now supersedes -w more consistently
+ * NEWS, doc/grep.texi (Matching Control): Mention this.
+ * src/dfasearch.c (EGexecute):
+ * src/pcresearch.c (Pcompile):
+ Don't get confused by -w if -x is also present.
+ * src/pcresearch.c (Pcompile): Remove misleading comment about
+ non-UTF-8 multibyte locales, as PCRE doesn't support them.
+ Calculate buffer sizes more carefully; the old method
+ allocated a buffer slightly too big, seemingly due to luck.
+ * tests/backref-word, tests/pcre: Add tests for this bug.
+
+ tests: omit update-copyright-tests
+ This test does not check how 'grep' itself operates, so it is
+ out of place for grep's 'make check'. Problem reported by Sam Razavi in:
+ http://bugs.gnu.org/22376
+ * bootstrap.conf (avoided_gnulib_modules): Add update-copyright-tests.
+
+2016-01-11 Jim Meyering <meyering@fb.com>
+
+ tests: do use "yes" but via an AWK replacement
+ Also, use sed Nq in place of head -N
+ * tests/init.cfg (yes): Define.
+ Thanks to Paul Eggert for this definition.
+ * tests/max-count-overread: Revert to using "yes".
+ * tests/mb-non-UTF8-performance: Likewise, and use
+ "sed Nq" in place of head -N.
+
+2016-01-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ * tests/pcre-count: Don't assume the page size is 32kB.
+
+2016-01-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: port to other POSIXish platforms
+ I tested this on Solaris 10 and AIX 7.1.
+ * tests/max-count-overread:
+ * tests/mb-non-UTF8-performance:
+ Don't assume 'yes' exists, as 'yes' is not in POSIX.
+ * tests/mb-non-UTF8-performance:
+ Don't rely on 'head -1000', as that option syntax is not POSIX.
+ * tests/pcre-count: Don't rely on "printf '\x0'".
+ * tests/unibyte-binary: Don't assume \200 is an encoding error
+ in every unibyte locale.
+
+2016-01-08 Jim Meyering <meyering@fb.com>
+
+ tests: fix encoding-error test failure due to use of printf '\xHH'
+ * tests/encoding-error: Don't rely on printf having support for \xHH
+ hexadecimal. That is not portable. Use \OOO octal, instead.
+
+ maint: fix typo in NEWS: s/a/an/
+
+2016-01-07 Jim Meyering <meyering@fb.com>
+
+ mb-non-UTF8-performance: avoid FP test failure on fast hardware
+ * tests/mb-non-UTF8-performance: Don't use a fixed size.
+ Otherwise, on a fast system, the fixed-size unibyte test
+ would complete in a nominal 0 ms, which might well be
+ smaller than 1/30 of the multibyte duration, provoking
+ a false positive test failure. Instead, increase the
+ size of the input until we obtain a unibyte duration of
+ at least 10ms.
+
+2016-01-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: mention unibyte encoding fix
+ * NEWS: Document recent fix for encoding errors in unibyte locales.
+
+ grep: improve unibyte -P performance
+ This is a follow-on to the recent changes prompted by Bug#20526.
+ In <http://bugs.gnu.org/bug=20526#86> Norihiro Tanaka pointed out
+ that grep mistakenly assumed that unibyte locales cannot have
+ encoding errors. Here, the mistake hurt performance significantly.
+ On Fedora 23 x86-64 in the C locale, this patch improved grep's
+ performance by a factor of 7 when run as "grep -P 'z.*a'" on the
+ output of "yes $(printf '\200\n') | head -n 1000000000".
+ * src/pcresearch.c (multibyte_locale) [HAVE_LIBPCRE]: New static var.
+ (Pcompile): Set it.
+ (Pexecute): Use it to avoid the need to call
+ buf_has_encoding_errors in unibyte locales.
+
+2016-01-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ Improve on fix for Bug#22181
+ * src/pcresearch.c (Pexecute): Update subject when skipping past
+ easily-determined encoding errors, as this is faster than letting
+ pcre_exec skip them. On my platform this improves performance
+ 4.7x on a benchmark created via "yes $(printf '\200\200\200\200
+ \200\200\200\200\200\200\200\200\200\200\200\200\200\200\200\200x\n')
+ | head -n 1000000 >j; grep -oP y j" in a UTF-8 locale. Rework
+ code that deals with PCRE_ERROR_BADUTF8 return, to avoid an
+ incorrect (albeit currently harmless) 'bol = false' assignment.
+
+ grep: restore -P optimization (followup fix)
+ * src/search.h (EGexecute, Fexecute, Pexecute):
+ Change decls to match new implementations.
+ I forgot to add this file to the previous commit.
+
+ grep: restore -P PCRE_NO_UTF8_CHECK optimization
+ On my platform in the en_US.utf8 locale, this makes 'grep -P "z.*a" k'
+ 220x faster, where k is created by the shell command:
+ yes 'abcdefg hijklmn opqrstu vwxyz' | head -n 10000000 >k
+ * src/dfasearch.c (EGexecute):
+ * src/grep.c (execute_fp_t):
+ * src/kwsearch.c (Fexecute):
+ * src/pcresearch.c (Pexecute):
+ First arg is now char *, not char const *, since Pexecute now
+ temporarily modifies this argument.
+ * src/grep.c, src/grep.h (buf_has_encoding_errors): Now extern.
+ * src/pcresearch.c (Pexecute): Use it. If the input is free of
+ encoding errors, use a multiline search and the PCRE_NO_UTF8_CHECK
+ option, as this is typically way faster. This restores an
+ optimization that was removed with the recent changes for binary
+ file detection.
+
+2016-01-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ Fix calculation of unibyte_mask
+ * src/grep.c (initialize_unibyte_mask): The old method worked for
+ UTF-8 and other typical encodings, but did not work for weird
+ encodings, e.g., one where all bytes other than 0x7f and 0x80 are
+ unibyte characters.
+
+2016-01-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix bug with invalid unibyte sequence
+ This was introduced by the recent binary-data-detection changes.
+ Problem reported by Norihiro Tanaka in: http://bugs.gnu.org/20526#86
+ * src/grep.c (HIBYTE, easy_encoding, init_easy_encoding): Remove,
+ replacing with ...
+ (uword_max, unibyte_mask, initialize_unibyte_mask): ... this new
+ constant, static var, and function. All uses changed. The
+ unibyte_mask var generalizes the old local var hibyte_mask, which
+ worked only for encodings where every byte with 0x80 turned off is
+ a single-byte character.
+ (buf_has_encoding_errors): Return false immediately if
+ unibyte_mask is zero, not whether the current encoding is unibyte.
+ The old test was incorrect in unibyte locales in which some bytes
+ were encoding errors.
+ * tests/pcre-z: Require UTF-8 locale, since the grep -z . test now
+ needs this. Use printf \0 rather than tr. Port the 'grep -z .'
+ test to platforms where the C locale says '\200' is an encoding
+ error. Use cmp rather than compare, as the file is binary and
+ so non-GNU diff might not work.
+ * tests/unibyte-binary: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+2016-01-01 Jim Meyering <meyering@fb.com>
+
+ maint: update copyright year, bootstrap, init.sh
+ Run "make update-copyright" and then...
+
+ * gnulib: Update to latest.
+ * tests/init.sh: Update from gnulib.
+ * bootstrap: Likewise.
+
+2015-12-31 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: clarify text vs binary match output
+ * NEWS:
+ * doc/grep.texi (File and Directory Selection):
+ Make it clearer that grep can now output matching text before
+ reporting a binary match. Problem reported by Norihiro Tanaka in:
+ http://bugs.gnu.org/20526#83
+
+ doc: minor clarifications
+ * doc/grep.in.1, doc/grep.texi: Minor clarifications suggested by
+ Debian documentation patches. Problem reported by Santiago Ruano
+ Rincón in: http://bugs.gnu.org/18651
+
+ grep: fix -l --line-buffered bug
+ Problem reported by Louis Sautier in: http://bugs.gnu.org/18750
+ * NEWS: Document this.
+ * src/grep.c (grep, grepdesc): If --line-buffered, flush
+ stdout after outputting newline (or null byte, if applicable).
+
+2015-12-30 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: remove duplicate init
+ * src/grep.c (print_line_middle): Remove duplicate initialization.
+
+ grep: report line-buffered write error right away
+ * src/grep.c (prline): When line buffered, if there is a write
+ error, report it immediately rather than waiting until the next
+ line of output.
+
+ grep: -c should keep counting after binary data
+ Problem and fix reported by Jaroslav Škarvada, and test case
+ reported by Norihiro Tanaka, in: http://bugs.gnu.org/22028
+ * NEWS: Document this.
+ * src/grep.c (grep): Don't stop counting merely because nulls seen.
+ * tests/pcre-count: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+ dfa: port to tinycc
+ * src/dfa.c (add_utf8_anychar): Put 'const' after type.
+ Problem reported by Aharon Robbins in:
+ http://bugs.gnu.org/22260
+
+ grep: be less picky about encoding errors
+ This fixes a longstanding problem introduced in grep 2.21,
+ which is overly picky about binary files.
+ * NEWS:
+ * doc/grep.texi (File and Directory Selection): Document this.
+ * src/grep.c (input_textbin, textbin_is_binary, buffer_textbin)
+ (file_textbin):
+ Remove. All uses removed.
+ (encoding_error_output): New static var.
+ (buf_has_encoding_errors, buf_has_nulls, file_must_have_nulls):
+ New functions, which reuse bits
+ and pieces of the removed functions.
+ (lastout, print_line_head, print_line_middle, print_line_tail, prline)
+ (prpending, prtext, grepbuf):
+ Avoid use of const, now that we have
+ functions that require modifying a sentinel.
+ (print_line_head): New arg LEN. All uses changed.
+ (print_line_head, print_line_tail):
+ Return indicator whether the output line was printed.
+ All uses changed.
+ (print_line_middle): Exit early on encoding error.
+ (grep): Use new method for determining whether file is binary.
+ * src/grep.h (enum textbin, TEXTBIN_BINARY, TEXTBIN_UNKNOWN)
+ (TEXTBIN_TEXT, input_textbin): Remove decls. All uses removed.
+ * src/pcresearch.c (Pexecute): Remove multiline optimization,
+ since the main program no longer checks for encoding errors on input.
+ * tests/encoding-error: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+2015-12-29 Jim Meyering <meyering@fb.com>
+
+ maint: correct (make sorted) order of test file names
+ * tests/Makefile.am (TESTS): Insert new test name in sorted order.
+
+2015-12-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: --exclude matches trailing parts of args
+ Problem reported by Vincent Lefevre in:
+ http://bugs.gnu.org/22144
+ * NEWS:
+ * doc/grep.texi (File and Directory Selection): Document this.
+ * src/grep.c (excluded_patterns, excluded_directory_patterns):
+ Now 2-element arrays, with one element for subfiles and another
+ for command-line args. All uses changed. This implements the change.
+ (exclude_options): New function.
+ * tests/include-exclude: Test the change.
+
+2015-12-18 Jim Meyering <meyering@fb.com>
+
+ grep -oP: don't infloop when processing invalid UTF8 preceding a match
+ * src/pcresearch.c (Pexecute): When advancing SUBJECT past an
+ encoding error, don't blindly set P to that new value, since we
+ will soon compute SEARCH_OFFSET = P - SUBJECT, and mistakenly
+ making that difference too small would allow us to match some
+ previously-processed text, resulting in an infinite loop.
+ * NEWS (Bug fixes): Mention it.
+ * THANKS.in: Add Christian's name and email address.
+ * tests/pcre-invalid-utf8-infloop: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ Reported by Christian Boltz in http://debbugs.gnu.org/22181
+ Introduced by commit v2.21-37-g14f8e48.
+
+2015-11-04 Jim Meyering <meyering@fb.com>
+
+ tests: mark performance-related tests as expensive
+ These performance-related tests are slightly failure prone due to
+ varying system load during the two runs.
+ Marking these tests as "expensive" makes it so they are no longer run
+ via "make check". You can still run them via "make check-expensive".
+ This makes them less likely to be run by regular users.
+ * tests/long-pattern-perf: Use expensive_.
+ * tests/mb-non-UTF8-performance: Likewise.
+ Reported by Jaroslav Skarvada in http://debbugs.gnu.org/21826
+ and by Andreas Schwab in http://debbugs.gnu.org/21812.
+
+2015-11-01 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.22
+ * NEWS: Record release date.
+
+ tests: pcre-jitstack: upon failure, retry with no stack size limit
+ * tests/pcre-jitstack: Don't let an example that provokes inordinate
+ stack space use cause a test failure. Thanks to reports from and
+ analysis by Bruce Dubbs; see http://debbugs.gnu.org/21755
+
+2015-10-27 Jim Meyering <meyering@fb.com>
+
+ maint: update THANKS.in
+ * THANKS.in: Add name+email of those who found and reported
+ the bug that made grep -E '^x|x$' match any "x".
+
+2015-10-25 Zev Weiss <zev@bewilderbeest.net>
+
+ dfa: plug a memory leak in dfamust
+ * src/dfa.c (dfamust): Ensure MP is freed, by refraining
+ from returning early when, at "done:" *RESULT is NULL.
+
+2015-10-25 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+ * gnulib: Pull in one more portability fix:
+ stdalign: port to Sun C 5.9
+
+2015-10-24 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest, for portability fixes
+ * gnulib: Pull in changes like these:
+ fts: port to C11 alignof
+ stdalign: work around pre-4.9 GCC x86 bug
+
+ maint: NEWS: correct/amend
+ * NEWS: Move the long-regexp-performance-improvement from
+ "Bug fixes" to "Improvements." Say more and include an example.
+ The -Fw degradation was introduced in commit v2.18-125-g94555dd
+
+ tests: avoid spurious failure on OpenBSD 5.8
+ * tests/fedora: Don't rely on "diff - FILE" reading from stdin.
+ Reported privately by Nelson Beebe.
+
+2015-10-17 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest; also bootstrap and tests/init.sh
+ * bootstrap: Update from gnulib.
+ * tests/init.sh: Likewise.
+ * gnulib: Update submodule to latest.
+
+ build: avoid spurious bootstrap failure involving pkg.m4
+ Running ./bootstrap could fail mistakenly at the very end in
+ its attempt to obtain a copy of pkg.m4. It would search only
+ $(aclocal --print-ac-dir) and some other directories, but not
+ those listed in $(aclocal --print-ac-dir)/dirlist.
+ * bootstrap.conf (bootstrap_post_import_hook): Also search the
+ directories named in $(aclocal --print-ac-dir)/dirlist when that
+ file exists with nonzero size.
+
+2015-10-16 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: add news item
+ * NEWS: Document grep -Fw speedup.
+
+ grep: simplify previous change
+ * src/grep.c (main): Simplify recently-changed grep -Fw test.
+
+2015-10-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: use grep matcher for grep -Fw when unibyte
+ In single byte locales with grep -Fw, prefer the grep matcher to the
+ kwset matcher, as the former uses KWset and a DFA, whereas the latter
+ calls kwsexec many times until it matches a word.
+ * src/grep.c (main): Change pattern for fgrep into grep for grep -Fw in
+ single byte locales.
+
+2015-10-16 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: use memchr/memrchr
+ * src/kwsearch.c (Fexecute): Prefer memchr and memrchr to doing it
+ by hand.
+
+2015-10-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: improve performance of grep -Fw
+ * src/kwsearch.c (Fexecute): Previously, grep -Fw checked whether the
+ previous character was a word character after matching from the head of
+ the buffer, which was extremely slow. Now, when grep finds a potential
+ match, it looks for the previous newline and examines from there.
+
+2015-10-13 Jim Meyering <meyering@fb.com>
+
+ maint: use single quote rather than UTF-8 multi-byte version
+ * tests/backref-alt: Translate unnecessary non-ASCII in comment.
+
+2015-10-13 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: make the executable a bit smaller
+ * src/dfa.c (dfamust): Hoist MB_CUR_MAX calculation out of loops.
+
+2015-10-13 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: fix bug in alternate of sub-patterns that differ only in constraints
+ Fix a bug where a line incorrectly matches alternates of sub-patterns
+ that differ only in the constraints, e.g., the ERE '^a|a$'.
+ Reported by Greg Boyd in: http://debbugs.gnu.org/21670
+ * src/dfa.c (dfamust): For a pattern with constraints, check that it is
+ matched including the constraints, to judge whether it is exact.
+
+ dfa: fix off-by-one error
+ * src/dfa.c (dfamust): Fix off-by-one error in computing 'must' length,
+ which caused the 'must' to be too short. See:
+ http://bugs.gnu.org/21670#28
+
+2015-10-12 Jim Meyering <meyering@fb.com>
+
+ doc: NEWS: mention a bug fix
+ * NEWS (Bug fixes): Describe it.
+ This bug was introduced by commit v2.18-85-g2c94326
+ and fixed by commit v2.21-51-g256a4b4.
+
+2015-10-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: add test case for Bug#21670
+ * tests/options: Add test #4 to catch Bug#21670.
+ Also, do not overescape # in shell strings.
+
+2015-09-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ Add test for pop_fail_stack bug
+ Problem reported by Hanno Böck in: http://bugs.gnu.org/21513
+ If you use --with-included-regex the bug fix is in gnulib, here:
+ http://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=5513b40999149090987a0341c018d05d3eea1272
+ If you use glibc, the bug fix has not been installed yet.
+ * tests/Makefile.am (XFAIL_TESTS): Add backref-alt if system matcher.
+ (TESTS): Add backref-alt.
+ * tests/backref-alt: New file.
+ * tests/triple-backref: Remove unused var.
+ Don't skip if tested with glibc, as Makefile.am now handles this.
+
+ build: update gnulib submodule to latest
+
+2015-08-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: avoid use of uninitialized variable
+ EGexecute would use "backref" uninitialized.
+ While that could have no bearing on correctness, it could
+ impact performance, via an unnecessary use of regexp.
+ * src/dfasearch.c (EGexecute): Initialize backref.
+ Reported as http://debbugs.gnu.org/21273
+ Introduced by commit v2.21-55-gea0ebaa.
+
+2015-08-12 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: remove fgrep code for case insensitive match
+ The fgrep matcher is no longer called in case insensitive matching,
+ so remove the code to support it.
+ * src/kwsearch.c (mb_case_map_apply): Remove function.
+ (Fexecute): Remove now-unused code.
+
+2015-08-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: optimize [x-x]
+ * src/dfa.c (parse_bracket_exp): Treat [x-x] as if it were [x].
+ This also pacifies GCC, which otherwise complains about wc2
+ being set but not used.
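+
+ The equivalence this relies on can be checked from outside dfa.c.
+ The following is a minimal sketch using the C library's POSIX
+ regcomp/regexec interface, not dfa.c's parse_bracket_exp; the helper
+ name bracket_matches is hypothetical.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: return true if the POSIX basic regular
   expression PATTERN matches somewhere in TEXT.  This exercises the
   C library's matcher, not dfa.c.  */
bool
bracket_matches (char const *pattern, char const *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return false;               /* Treat an invalid pattern as no match.  */
  bool matched = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

+ In the C locale, bracket_matches ("[b-b]", s) and
+ bracket_matches ("[b]", s) should agree for any string s.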
+
+2015-08-12 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: remove unused multibyte support
+ Regex is now used for ranges, collating elements and equivalence classes
+ in non-POSIX locales, so remove the code that supported these features.
+ * src/dfa.c (struct mb_char_classes): Remove members ch_classes,
+ nch_classes, ranges, nranges, equivs, nequivs, coll_elems, ncoll_elems.
+ All uses removed.
+ (match_mb_charset): Remove function.
+
+2015-08-01 Jim Meyering <meyering@fb.com>
+
+ tests: mb-non-UTF8-performance: use new function
+ * tests/mb-non-UTF8-performance: Rewrite to use
+ the user-time measuring function in init.cfg.
+
+ tests: long-pattern-perf: measure user time, not elapsed
+ Measuring user time makes this test less prone to false
+ positive failure, and also lets us use a tighter bound.
+ * tests/long-pattern-perf: Measure elapsed user time rather than
+ wall-clock time, to permit a tighter bound on the ratio of
+ N-to-10N timings. Suggested by Giuseppe Ottaviano.
+ Also, use regexps built from mostly 5-digit numbers, so that the 10:1
+ ratio applies to lines of "seq" output as well as to total bytes.
+
+ tests: new function to measure elapsed user time
+ * tests/init.cfg (user_time_): New function.
+
+2015-07-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: remove word delimiter support for multibyte locales
+ DFA supports word delimiter expressions, but it does not behave
+ correctly for multibyte locales. Even if it were to be fixed,
+ the DFA matcher's performance would be no better than that of regex.
+ Thus, this change removes DFA support for word delimiter expressions
+ in multibyte locales.
+
+ * src/dfa.c (dfa_supported): Return false also when a pattern uses any
+ word delimiter expression in a multibyte locale.
+
+2015-07-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: avoid execution for a pattern including an unsupported expression
+ If a pattern includes a construct unsupported by the DFA matcher,
+ the DFA search would fail in most cases. Make dfaexec immediately
+ return for any such pattern.
+
+ * src/dfa.c (struct dfa_state) [has_backref, has_mbcset]: Remove members
+ and all uses.
+ (dfaexec_main): Remove 'backref' parameter. Update callers.
+ (dfaexec_noop): New function.
+ (dfa_supported): New function.
+ (dfassbuild): Remove now-unused code.
+ (dfacomp): When a pattern uses a DFA-unsupported construct, do not
+ waste time performing any further analysis.
+
+2015-07-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: DEBUG: print detail of DFA states
+ When compiled with -DDEBUG, grep outputs tokens etc.
+ With this change, also print DFA states and transitions.
+ This change is very useful when debugging those.
+
+ * src/dfa.c (prtok) [DEBUG]: Change `%c' to `%02x' in printf format.
+ (state_index) [DEBUG]: Print detail of new state.
+ (dfastate) [DEBUG]: Print detail of DFA states.
+ Reported as http://debbugs.gnu.org/18707
+
+2015-07-18 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ tests: sjis-mb: accept two more locales
+ * tests/sjis-mb: Accept the ja_JP.SJIS and ja_JP.PCK locales
+ as well as ja_JP.SHIFT_JIS, so this test is less likely to
+ be skipped unnecessarily. Reported as http://bugs.gnu.org/18983
+
+2015-07-18 Jim Meyering <meyering@fb.com>
+
+ tests: add a test for the performance fix
+ * tests/long-pattern-perf: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+2015-07-18 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: speed up handling of long pattern
+ DFA tries to find a long sequence of characters that must appear
+ in any matching line. However, when a pattern is long (length N),
+ it is very slow, because it makes O(N^2) strstr calls.
+ This change reduces that to O(N) by processing each sequence of
+ adjacent "regular" characters as a group.
+
+ Compare the run times of this command before and after this change:
+ (on an i7-4770S CPU @ 3.10GHz using rawhide (~fedora 22) and compiled
+ with gcc 6.0.0 20150627)
+ : | env time -f %e grep -f <(seq -s '' 9999)
+ Before: 0.85
+ After: 0.02
+
+ * src/dfa.c (dfamust): Process each string of concatenated normal
+ characters as a unit.
+ * NEWS (Improvement): Mention it.
+ Prompted by a bug report and patch by Ivan Yanikov
+ in http://bugs.gnu.org/15191#5
+
+2015-07-17 Jim Meyering <meyering@fb.com>
+
+ tests: fix mis-applied patch.
+ * tests/include-exclude: I applied "|sort" to the wrong creation
+ of "out", and didn't push the same patch that I'd tested.
+
+ tests: avoid FS-dependent false-positive failure
+ * tests/include-exclude: Sort file name list, so that this test
+ is not sensitive to the order in which those names are returned
+ via readdir. I noticed the failure on a Fedora 21 system using ext4.
+ Also fix a typo: s/framework_failure+/framework_failure_/
+
+2015-07-13 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix bug with --exclude-dir and command line
+ Reported by Aron Griffis in: http://bugs.gnu.org/21027
+ * NEWS: Document this.
+ * src/grep.c (grepdirent): Don't check whether the file is skipped
+ when on the command line, as that's the caller's responsibility.
+ (main): Anchor the exclude patterns.
+ * tests/include-exclude: Adjust test case to match fixed behavior.
+ Add some more test cases.
+
+ tests: fix $? typo in null-byte
+ * tests/null-byte: Don't assume $? survives an invocation of 'test'.
+
+2015-07-05 Jim Meyering <meyering@fb.com>
+
+ maint: dfa: use unsigned types where appropriate
+ * src/dfa.c (case_folded_counterparts): Return unsigned int, not int.
+ Change type of two locals to unsigned int, to reflect that their
+ values are never negative.
+ (parse_bracket_exp): Adjust type of result at each use, as well
+ as that of related index variables.
+
+2015-07-04 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: build struct dfamust on demand
+ If we won't use KWset, do not build a "struct dfamust".
+ Now it is built only when needed.
+ * src/dfa.c (struct dfa) [musts]: Remove member.
+ (dfacomp): Don't build dfamust here.
+ (dfamustfree): New function to free a struct dfamust.
+ (dfamust): Make it a global function, and make it return a pointer
+ to a malloc'd struct dfamust.
+ (dfamusts): Remove it.
+ * src/dfa.h (struct dfamust) [next]: Remove member.
+ In the implementation preceding this patch, there was
+ never more than one of these in a given "struct dfa".
+ (dfamustfree, dfamust): Add prototypes.
+ (dfamusts): Remove prototype.
+ (dfaalloc): Declare with _GL_ATTRIBUTE_MALLOC.
+ To make that symbol usable there, move the inclusion
+ of "xalloc.h" from dfa.c to this file, dfa.h.
+ * src/dfasearch.c (kwsmusts): Adapt to use the new interface.
+ Update the comments to reflect reality.
+ This addresses http://bugs.gnu.org/17715
+
+2015-07-04 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: use recent gnulib syntax bits
+ * src/grep.c (Gcompile, Ecompile): Use plain RE_SYNTAX_GREP
+ and RE_SYNTAX_EGREP, now that we assume a recent-enough gnulib.
+
+ maint: ignore gendocs_template_min
+ * doc/.gitignore: Add '/gendocs_template_min'.
+
+ build: update gnulib submodule to latest
+
+ dfa: '.' and '[^x]' now consistently match newline
+ * src/dfa.c (parse_bracket_exp, lex, add_utf8_anychar)
+ (match_anychar): RE_DOT_NEWLINE and RE_HAT_LISTS_NOT_NEWLINE
+ are about LF, not about eolbyte. This patch does not affect
+ 'grep', but may affect other users of dfa.c.
+
+ grep: -z '[^x]' now consistently matches newline
+ Problem reported by Norihiro Tanaka in: http://bugs.gnu.org/20974#19
+ * NEWS: Document this.
+ * src/grep.c (Gcompile, Ecompile): Clear RE_HAT_LISTS_NOT_NEWLINE.
+ * tests/utf8-bracket: Test this.
+
+2015-07-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -z '.' now consistently matches newline
+ Problem reported by Balazs Kezes in: http://bugs.gnu.org/20974
+ * NEWS: Document this.
+ * tests/utf8-bracket: New file, to test for this bug.
+ * src/grep.c (Gcompile, Ecompile): Also specify RE_DOT_NEWLINE.
+ * tests/Makefile.am (TESTS): Add it.
+
+ grep: simplify print_line_middle slightly
+ * src/grep.c (print_line_middle): Simplify.
+
+ grep: don't mishandle left context in -P
+ http://bugs.gnu.org/20957
+ * src/pcresearch.c (jit_exec): New arg SEARCH_OFFSET.
+ Caller changed.
+ (Pexecute): Pass the left context to pcre_exec, so that PCRE
+ regular-expression matching can see it.
+ * tests/pcre-context: New file, to test for this bug.
+ * tests/Makefile.am (TESTS): Add it.
+
+2015-06-28 Jim Meyering <meyering@fb.com>
+
+ tests/case-fold-backref: factor test
+
+2015-06-26 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: don't hang on command-line fifo if -D skip
+ * NEWS: Document this.
+ * src/grep.c (skip_devices):
+ New function, with code taken from grepdirent.
+ (grepdirent): Use it. Avoid an unnecessary initialization.
+ (grepfile): If skipping devices, open files with O_NONBLOCK.
+ Throw in O_NOCTTY while we're at it.
+ (grepdesc): Skip devices here, too. Not only does this fix the
+ bug, it fixes an unlikely race condition if some other process
+ renames a device between fstatat and openat.
+ * tests/skip-device: Add a test for this bug.
+
+ grep: minor tweaks
+ * src/grep.c (main): Change recently-added static vars to be
+ constants, which makes them sharable. Prefer 'return' to 'exit'
+ when returning/exiting from 'main'. Move decl closer to first use
+ and rename local from 'ok' (which was confusing) to 'status'.
+ Prefer named constant STDOUT_FILENO to unnamed constant 1.
+
+2015-06-26 Jim Meyering <meyering@fb.com>
+
+ maint: unify three argv-processing calls
+ * src/grep.c (main): Unify three calls to grep_commandline_arg.
+
+ maint: alphabetize anonymous enum member names
+
+2015-05-30 Paul Eggert <eggert@cs.ucla.edu>
+
+ test: tighten tests for bracket exprs
+ * tests/posix-bracket: Test '[a-a[.-.]--]'.
+ Also, test that failures are with status 1
+ (nonmatching data), not status 2 (invalid expressions).
+
+2015-04-26 Jim Meyering <meyering@fb.com>
+
+ maint: update bootstrap from gnulib
+ * bootstrap: Update from gnulib.
+
+ maint: reword a diagnostic not to trigger leading capital check
+ * src/pcresearch.c: Reword diagnostic to avoid "make syntax-check"
+ failure.
+
+ maint: sort test names in tests/Makefile.am and add syntax-check rule
+ * cfg.mk (sc_sorted_tests): New rule.
+ * tests/Makefile.am (TESTS): Alphabetize.
+
+2015-04-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: make find_pred return NULL for an invalid predicate
+ This could never happen when invoked via grep, but could have triggered
+ a bug if dfa.c's find_pred function were invoked by some other program.
+ * src/dfa.c (find_pred): Return NULL for an invalid predicate.
+ * tests/invalid-char-class: New file to test for this.
+ * tests/Makefile.am (TESTS): Add that new file name to the list.
+ This addresses http://debbugs.gnu.org/18631
+
+2015-04-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: improve pkg-config doc and error handling
+ Error-handling improvement suggested by Mike Frysinger in:
+ http://bugs.gnu.org/16757#29
+ * NEWS: Document pkg-config changes.
+ * README-prereq: pkg-config is now a prereq when building from
+ repository.
+ * m4/pcre.m4 (gl_FUNC_PCRE): Report an error if pcre is explicitly
+ requested but not available. Defer to user-supplied PCRE_CFLAGS
+ and PCRE_LIBS.
+
+ build: remove typo and don't bother with /usr/include/pcre
+ Problem reported by Holger Bruenjes.
+ * m4/pcre.m4: Remove test for /usr/include/libpng (a typo).
+ Come to think of it, don't bother worrying about
+ /usr/include/pcre, as hosts with that problem can use pkg-config
+ or configure with CFLAGS by hand.
+
+ build: use pkg-config (if available) to configure libpcre
+ Problem reported by Mike Frysinger in: http://bugs.gnu.org/16757
+ * bootstrap.conf (bootstrap_post_import_hook):
+ Copy pkg-config's pkg.m4.
+ * configure.ac: Invoke PKG_PROG_PKG_CONFIG.
+ * m4/pcre.m4 (gl_FUNC_PCRE): Rewrite to use pkg-config if
+ available, and to test that pcre_compile can be linked to.
+ * src/Makefile.am (AM_CFLAGS): Add PCRE_CFLAGS.
+ (grep_LDADD): Add PCRE_LIBS.
+ * src/pcresearch.c: Simply include <pcre.h> if HAVE_LIBPCRE,
+ since 'configure' arranges for the appropriate -I option now.
+
+2015-03-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: output "." file name in diagnostic
+ This is bug C as reported by David Grayson in:
+ http://bugs.gnu.org/16444#18
+ This bug occurs only in obscure circumstances, and I didn't see
+ how to write a reasonable test case for it.
+ * src/grep.c (filename_prefix_len): Remove, replacing with ...
+ (omit_dot_slash): New static var. All uses of the former replaced
+ with uses of the latter.
+ (grepdirent): Don't add 2 if the filename is just ".".
+
+ egrep, fgrep: just use what's in PATH
+ * src/egrep.sh: Don't monkey with PATH; just use whatever 'grep'
+ is in the path. This is simpler, and lets the user specify
+ default options with a script for only grep, with no need for
+ egrep and fgrep scripts.
+ Fixes: bug#19998
+
+ doc: give a script wrapper example
+ * doc/grep.texi (Environment Variables): Give an example of a
+ wrapper script, as an alternative to using GREP_OPTIONS.
+ Fixes: bug#19998
+
+ doc: clarify how -a matches
+ * doc/grep.in.1, doc/grep.texi (File and Directory Selection):
+ Give an example of how non-text bytes affect pattern matching in
+ binary files.
+ Fixes: bug#20080
+
+2015-02-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ Cover the non-INSTALL case
+ * README: Mention what to do if there is no INSTALL file.
+ Fixes: bug#19928
+
+2015-02-11 Jim Meyering <meyering@fb.com>
+
+ maint: use ASAN-poisoning more carefully
+ The ASAN-poisoning instituted by commit v2.21-14-g1555185 was
+ incomplete, since the poisoned tail of the read buffer could well
+ be the target of a legitimate follow-on read. To accommodate that,
+ we must unpoison each such region just before beginning fillbuf's
+ read loop.
+ * src/grep.c [HAVE_ASAN] (asan_poison): Define.
+ (clear_asan_poison): Define.
+ (fillbuf): Clear before reading, since we are likely to read
+ into memory that was poisoned on the preceding iteration.
+ * tests/two-files: New file, to test for this.
+ * tests/Makefile.am (TESTS): Add it.
+
+2015-02-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ Grow the JIT stack if it becomes exhausted
+ Problem reported by Oliver Freyermuth in: http://bugs.gnu.org/19833
+ * NEWS: Document the fix.
+ * tests/Makefile.am (TESTS): Add pcre-jitstack.
+ * tests/pcre-jitstack: New file.
+ * src/pcresearch.c (NSUB): Move decl earlier, since it's needed
+ earlier now.
+ (jit_stack_size) [PCRE_STUDY_JIT_COMPILE]: New static var.
+ (jit_exec): New function.
+ (Pcompile): Initialize jit_stack_size.
+ (Pexecute): Use new jit_exec function. Report a useful diagnostic
+ if the error is PCRE_ERROR_JIT_STACKLIMIT.
+
+2015-02-01 Jim Meyering <meyering@fb.com>
+
+ maint: reference CVE-2015-1345 from NEWS
+ * NEWS: Mention the CVE that was addressed by v2.21-13-g83a95bd,
+ "grep -F: fix a heap buffer (read) overrun".
+
+2015-01-18 Jim Meyering <meyering@fb.com>
+
+ maint: convert "goto" to "continue" and remove now-spurious label
+ * src/kwset.c (bmexec_trans): Using "goto big_advance" here is
+ equivalent to using "continue". Make that change and remove
+ the now-unused label.
+
+2015-01-10 Jim Meyering <meyering@fb.com>
+
+ tests: add support for ASAN memory poisoning
+ This lets us reliably detect with ASAN some UMR bugs
+ that would otherwise be detectable only some of the time
+ with MSAN. Use __asan_poison_memory_region to mark the unused
+ portion of a read buffer as inaccessible. Then, with ASAN,
+ any attempt to access those bytes results in an ASAN abort.
+ * src/system.h: Include "ignore-value.h".
+ (__has_feature): Define.
+ (HAVE_ASAN): Define when address sanitizer is enabled.
+ [HAVE_ASAN]: Declare these two __asan_* symbols.
+ [!HAVE_ASAN] (__asan_poison_memory_region): Define stub.
+ [!HAVE_ASAN] (__asan_unpoison_memory_region): Likewise.
+ * src/grep.c: Use __asan_poison_memory_region.
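+
+ A minimal sketch of the poisoning scheme, not the code in
+ src/system.h or src/grep.c. The wrapper names poison_region and
+ unpoison_region and the helper fill_and_poison are hypothetical, and
+ memcpy stands in for the read call. The buffer is unpoisoned before
+ each fill, since its tail may still be poisoned from the previous one.

```c
#include <stddef.h>
#include <string.h>

/* Detect AddressSanitizer: clang provides __has_feature, and GCC
   defines __SANITIZE_ADDRESS__.  */
#ifndef __has_feature
# define __has_feature(a) 0
#endif
#if defined __SANITIZE_ADDRESS__ || __has_feature (address_sanitizer)
# define HAVE_ASAN 1
#else
# define HAVE_ASAN 0
#endif

#if HAVE_ASAN
/* Provided by the ASAN runtime.  */
void __asan_poison_memory_region (void const volatile *, size_t);
void __asan_unpoison_memory_region (void const volatile *, size_t);
# define poison_region(p, n) __asan_poison_memory_region (p, n)
# define unpoison_region(p, n) __asan_unpoison_memory_region (p, n)
#else
/* No-op stubs, so that callers need no #if.  */
# define poison_region(p, n) ((void) (p), (void) (n))
# define unpoison_region(p, n) ((void) (p), (void) (n))
#endif

/* Hypothetical helper: copy up to CAP bytes from SRC into BUF, then
   mark the unused tail of BUF as inaccessible.  Return the number of
   bytes copied.  */
size_t
fill_and_poison (char *buf, size_t cap, char const *src, size_t n)
{
  unpoison_region (buf, cap);   /* Undo poisoning from the last fill.  */
  if (n > cap)
    n = cap;
  memcpy (buf, src, n);
  poison_region (buf + n, cap - n);
  return n;
}
```

+ When built with -fsanitize=address, a later read of the poisoned
+ tail is reported by ASAN; in other builds the wrappers do nothing.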
+
+2015-01-09 Yuliy Pisetsky <ypisetsky@fb.com>
+
+ grep -F: fix a heap buffer (read) overrun
+ grep's read buffer is often filled to its full size, except when
+ reading the final buffer of a file. In that case, the number of
+ bytes read may be far less than the size of the buffer. However, for
+ certain unusual pattern/text combinations, grep -F would mistakenly
+ examine bytes in that uninitialized region of memory when searching
+ for a match. With carefully chosen inputs, one can cause grep -F to
+ read beyond the end of that buffer altogether. This problem arose via
+ commit v2.18-90-g73893ff with the introduction of a more efficient
+ heuristic using what is now the memchr_kwset function. The use of
+ that function in bmexec_trans could leave TP much larger than EP,
+ and the subsequent call to bm_delta2_search would mistakenly access
+ beyond the end of the main input read buffer.
+
+ * src/kwset.c (bmexec_trans): When TP reaches or exceeds EP,
+ do not call bm_delta2_search.
+ * tests/kwset-abuse: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * THANKS.in: Update.
+ * NEWS (Bug fixes): Mention it.
+
+ Prior to this patch, this command would trigger a UMR:
+
+ printf %0360db 0 | valgrind src/grep -F $(printf %019dXb 0)
+
+ Use of uninitialised value of size 8
+ at 0x4142BE: bmexec_trans (kwset.c:657)
+ by 0x4143CA: bmexec (kwset.c:678)
+ by 0x414973: kwsexec (kwset.c:848)
+ by 0x414DC4: Fexecute (kwsearch.c:128)
+ by 0x404E2E: grepbuf (grep.c:1238)
+ by 0x4054BF: grep (grep.c:1417)
+ by 0x405CEB: grepdesc (grep.c:1645)
+ by 0x405EC1: grep_command_line_arg (grep.c:1692)
+ by 0x4077D4: main (grep.c:2570)
+
+ See the accompanying test for how to trigger the heap buffer overrun.
+
+ Thanks to Nima Aghdaii for testing and finding numerous
+ ways to break early iterations of this patch.
+
+2015-01-08 Jim Meyering <meyering@fb.com>
+
+ grep: avoid false-positive UMR
+ For some inputs, valgrind would report an uninitialized
+ memory read error, but it was harmless.
+ * src/grep.c (fillbuf): Initialize those trailing bytes.
+
+2015-01-01 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+ maint: update copyright year ranges to include 2015
+ Run "make update-copyright". Also, ...
+ * grep.texi: Update manually, converting each "--" to "-".
+
+2014-12-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: document binary-data heuristic better
+ Problem reported by Martin Hoch in: http://bugs.gnu.org/19388
+ * doc/grep.texi (File and Directory Selection):
+ Document what non-text bytes are.
+ (Usage): Fix cross reference.
+
+2014-12-12 Jim Meyering <meyering@fb.com>
+
+ maint: fix a new "make syntax-check" failure
+ * tests/dfa-match-aux.c: s/can not/cannot/
+
+2014-12-12 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ build: avoid build failure with --enable-gcc-warnings and no PCRE
+ * src/pcresearch.c [HAVE_LIBPCRE] (empty_match): Guard the declaration
+ of this PCRE-only variable.
+
+2014-12-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: port fmbtest to CentOS 6 and earlier
+ * tests/fmbtest: Port to platforms where the 'sed' pattern
+ '[^0-9]' does not match every non-digit character. Problem
+ reported by Norihiro Tanaka in: http://bugs.gnu.org/19293
+
+2014-12-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: simplify dfaexec
+ * src/dfa.c (dfaexec): Simplify by rearrangement of IF conditions.
+ This commit induces no semantic change, and reverts part of commit
+ v2.5.4-144-gbafa134.
+
+2014-12-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: avoid invalid match or infinite loop in unused matching mode
+ Neither grep nor gawk uses this DFA code in its matching mode,
+ since each always calls dfacomp with a nonzero final argument.
+ However, when used in that mode, it had a bug:
+ After failing to match in matching mode, it should return NULL,
+ but instead would either report a false match or enter an
+ infinite loop.
+
+ * src/dfa.c (dfaexec_main): After failing to match in matching mode
+ return NULL, rather than transitioning to the next state.
+ * tests/dfa-match: Add a new test.
+ * tests/dfa-match-aux.c: Add a new program to exercise this
+ otherwise-unused part of dfa.c.
+ * tests/Makefile.am: Add a rule to build new test.
+ (check_PROGRAMS): Add dfa-match-aux.
+ (AM_CPPFLAGS): Add -I$(top_srcdir)/src.
+ (TESTS): Add dfa-match.
+ * cfg.mk (exclude_file_name_regexp--sc_bindtextdomain):
+ (exclude_file_name_regexp--sc_prohibit_atoi_atof):
+ Exempt the new test file from some syntax-check rules.
+
+2014-12-04 Santiago Ruano Rincón <santiago@debian.org>
+
+ doc: document grep-2.11 change in behavior of -r, --recursive
+ * doc/grep.texi (--recursive, -r): Mention the new behavior
+ of recursively searching "." when there is no FILE argument.
+ * doc/grep.in.1: Likewise.
+ That change first appeared in grep-2.11, released on 2012-03-02.
+
+2014-11-24 Jim Meyering <meyering@fb.com>
+
+ maint: correct for four Author: name misspellings
+ * .mailmap: Correct for misspelling in Norihiro Tanaka's last name
+ as listed in four commit Author: fields: s/Norihirio/Norihiro/
+
+2014-11-23 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.21
+ * NEWS: Record release date.
+
+2014-11-21 Jim Meyering <meyering@fb.com>
+
+ tests: sjis-mb: remove now-obsolete and failing sub-tests
+ * tests/sjis-mb: Commit v2.18-123-geb3292b changed how grep
+ handles patterns with encoding errors. These SJIS tests are
+ skipped so often that we didn't notice until now that there were
+ two tests of that changed behavior, and that on any system with
+ the ja_JP.SHIFT_JIS locale, they would always fail. Remove those
+ two tests, since this functionality is well tested separately,
+ via tests/prefix-of-multibyte.
+
+2014-11-20 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep -F could erroneously fail to match in non-UTF8 multibyte locales
+ This fixes a bug that can strike only when using a non-UTF8 multibyte
+ locale like ja_JP.SHIFT_JIS.
+
+ Consider this example: it would mistakenly fail to match before
+ this patch:
+
+ printf '\203AA\n'|LC_ALL=ja_JP.SHIFT_JIS src/grep -F A
+
+ When searching for a single byte that happens to be the latter
+ byte of a multibyte character, and the target byte also follows
+ that multibyte character, grep -F would advance an internal pointer
+ by one byte too many, thus missing the target byte. A test case
+ for this bug is already included in tests/sjis-mb.
+
+ * src/kwsearch.c (Fexecute): Skip one byte less after matching in the
+ middle of a multibyte character. Introduced by commit v2.18-119-gfb7d538.
+
+2014-11-17 Jim Meyering <meyering@fb.com>
+
+ tests: big-match: disable OOM-provoking subtest
+ * tests/big-match: Our application of this regexp '^.*x\(\)\1'
+ to a file containing a single matching line of length 2GiB+2
+ would cause inordinate memory consumption (over 100GB) via
+ regexec.c, but no leak. That would cause disruption on most
+ systems, so remove this subtest. Reported by Assaf Gordon.
+
+2014-11-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: avoid undefined behavior
+ * src/dfa.c (dfassbuild): Don't call memcpy with a second
+ argument of NULL, even when the size (3rd argument) is 0.
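+
+ A minimal sketch of the kind of guard involved, not the actual
+ dfassbuild change; the helper name copy_if_nonempty is hypothetical.
+ C11 requires valid pointer arguments to memcpy even when the size
+ is zero, so the call is skipped entirely in that case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy N bytes from SRC to DST and return DST.
   SRC may be a null pointer when N is 0, because memcpy is then not
   called at all.  */
void *
copy_if_nonempty (void *dst, void const *src, size_t n)
{
  if (n != 0)
    memcpy (dst, src, n);
  return dst;
}
```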
+
+2014-11-14 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+2014-11-14 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep -F -x -o PAT would print an extra newline for each match
+ * src/kwsearch.c (Fexecute): Correctly compute the length of a match
+ by subtracting 2 (not 1) when match_lines is set. With -x, we augment
+ the "line" by both prepending and appending an EOLBYTE to the search
+ pattern. Here, we must correct for that. However, to compensate,
+ when we are using -x (--line-regexp) and start_ptr is NULL, we have
+ to add 1 to the length so that we still print the trailing EOLBYTE.
+ Introduced by commit v2.18-85-g2c94326.
+ * tests/match-lines: Add a new test.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+
+2014-11-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: port to Darwin
+ The 'sed' command 's/.//' does not delete all bytes in the C locale.
+ Problem reported by Nelson H. F. Beebe.
+ * tests/fmbtest: Don't assume that sed treats bytes with the
+ top bit set as valid characters in the C locale, as this is not
+ true for Darwin. Use the cs_CZ.UTF-8 locale instead, and
+ simplify the sed script.
+
+ tests: fix recently-introduced stray output
+ * tests/init.cfg (require_pcre_): Remove stray debugging output.
+
+ build: port to GCC 4.6.4 + glibc 2.5
+ On platforms this old, building with _FORTIFY_SOURCE equal to 2
+ results in duplicate definitions of standard library functions.
+ Problem reported by Nelson H. F. Beebe.
+ * configure.ac (_FORTIFY_SOURCE): Sort after GNULIB_PORTCHECK.
+ By default, do not enable this unless GNULIB_PORTCHECK is defined.
+ This better matches the original intent, which as I recall was to
+ enable these extra checks only with --enable-gcc-warnings.
+
+ tests: port to libpcre sans UTF-8 support
+ Problem reported by Nelson H. F. Beebe.
+ * tests/pcre-infloop, tests/pcre-invalid-utf8-input, tests/pcre-utf8:
+ Skip the test unless PCRE works in an en_US.UTF-8 locale.
+
+2014-11-09 Jim Meyering <meyering@fb.com>
+
+ tests: do not fail when the zh_CN.UTF-8 locale is not installed
+ * tests/word-multibyte: This test would fail on a system with
+ no zh_CN.UTF-8 locale. Use it only if it is installed.
+
+ tests: avoid hex_printf_ portability problems
+ * tests/init.cfg (hex_printf_): Spell out a-f and A-F for
+ non-C locales, ensure that the input to sed is newline-terminated,
+ and quote the final octal format string.
+ Suggestions from Paul Eggert.
+
+2014-11-08 Jim Meyering <meyering@fb.com>
+
+ tests: avoid a multibyte tr portability problem
+ * tests/init.cfg (tr): New wrapper function.
+ See comments for details. Reported by Norihiro Tanaka
+ in http://debbugs.gnu.org/18991
+
+ maint: remove spurious LC_ALL setting from one test
+ * tests/word-multibyte: Remove unnecessary setting of LC_ALL.
+
+ tests: fix typo in previous change
+ * tests/init.cfg (hex_printf_): Fix typo s/A-f/A-F/.
+ For the record, I introduced that error, not Norihiro.
+
+2014-11-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ tests: avoid awk+printf+\xHH portability trap
+ * tests/init.cfg (hex_printf_): Rewrite in terms of printf and sed.
+ Using awk's printf with \xHH in the format string was not portable
+ to the awk of Solaris 10, AIX 7 or HP-UX 11.23, as reported in
+ http://debbugs.gnu.org/18987.
+ * tests/word-multibyte: Use printf rather than hex_printf_,
+ and give the character we're printing a name: e_acute (rather
+ than A-grave), since that is used in other tests.
+ Where there was a trailing \n in the format string, remove it and
+ invoke echo instead.
+ * tests/multibyte-white-space: Simply remove each trailing \n.
+ They were not needed.
+
+2014-11-07 Jim Meyering <meyering@fb.com>
+
+ tests: avoid printf+\xHH portability trap
+ * tests/word-multibyte: Using the Bourne shell's printf function
+ with strings like "\xHH\xHH" happens to work for most interactive
+ shells, but not for dash. That is not portable. Use our hex_printf_
+ awk wrapper instead. Without this change, this test would fail on
+ a Debian system for which /bin/sh is configured to be "dash".
+
+ maint: move helper function, hex_printf to init.cfg
+ * tests/init.cfg (hex_printf_): New function, from ...
+ * tests/multibyte-white-space: ... here. Reflect the
+ s/hex_print/hex_printf_/ renaming.
+
+2014-11-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: port O_NOFOLLOW errno checking to NetBSD
+ Problem reported by Assaf Gordon in: http://bugs.gnu.org/18892
+ * NEWS: Document it.
+ * src/grep.c (open_symlink_nofollow_error):
+ New function, which does the right thing on NetBSD.
+ (grepfile): Use it.
+
+2014-10-31 Jim Meyering <meyering@fb.com>
+
+ build: generate man pages even when existing targets are read-only
+ * doc/Makefile.am (grep.1): Use mv -f to move temporary to target,
+ in case the target is read-only. Also, always make the generated
+ files read-only.
+ (egrep.1 fgrep.1): Likewise.
+ This avoids a build failure reported by Eric Blake in
+ http://lists.gnu.org/archive/html/bug-grep/2014-10/msg00112.html
+
+2014-10-30 Jim Meyering <meyering@fb.com>
+
+ tests: avoid false-positive failure due to some zh_CN.* locales
+ On some systems, and for some zh_CN.* locales (e.g., OpenBSD5.5) the
+ E-acute pair of bytes do not qualify as a word-constituent character.
+ * tests/word-multibyte: Use zh_CN.UTF-8, rather than "zh_CN".
+ Reported by Assaf Gordon and Bruce Dubbs in
+ http://debbugs.gnu.org/18892
+
+2014-10-29 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest; bootstrap, too
+ * gnulib: Update to latest.
+ * bootstrap: Copy latest from gnulib.
+
+2014-10-28 Jim Meyering <meyering@fb.com>
+
+ tests: make new test script executable
+ * tests/word-multibyte: Make this file executable.
+
+2014-10-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: make \w and \W work in multibyte locales
+ Reported by Jaroslav Skarvada in: http://bugs.gnu.org/18817
+ Now \w and \W are supported not only in single-byte locales but also
+ in multibyte locales.
+
+ * src/dfa.c (PUSH_LEX_STATE, POP_LEX_STATE): Move definitions "up",
+ so they are not within the function.
+ (lex): Make \w and \W work in a multibyte locale, the same way
+ we made \s and \S work.
+ * tests/word-multibyte: New test for this change.
+ * tests/Makefile.am: Add a rule to build new test.
+ * NEWS (Bug fixes): Mention it.
+
+2014-10-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: avoid false match in a non-UTF8 multibyte locale
+ This command should print nothing:
+
+ printf '\263\244\263\244\n' \
+ | LC_ALL=ja_JP.eucJP grep -E "$(printf '^x|\244\263')"
+
+ Before this patch, it would print its sole input line.
+ * src/dfa.c (struct dfa): Add new members: min_trcount,
+ initstate_letter, initstate_others.
+ (dfaanalyze): Build states not only with a newline context but also with others.
+ (build_state): Don't release initial states.
+ (skip_remains_mb): Add a parameter.
+ Add a comment describing all parameters.
+ (dfaexec_main): When there are multiple start states, we are about
+ to transition from one state to another and the current byte is not
+ the first byte of a multibyte character, first advance past the
+ current multibyte character.
+ * tests/euc-mb: Add a new test.
+ * NEWS (Bug fixes): Mention it.
+ This addresses http://debbugs.gnu.org/18685
+
+2014-10-25 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: work around older libpcre bugs when testing -P and UTF-8
+ * tests/pcre-invalid-utf8-input: Add require_timeout_ and
+ require_compiled_in_MB_support. Put a timeout of 3 seconds on
+ grep, to avoid having this test case loop forever with older
+ versions of libpcre, such as those found on RHEL 6.5.
+ Reported by Jim Meyering in: http://bugs.gnu.org/18806#34
+
+2014-10-24 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ tests: add test for grep -P fix
+ * tests/pcre-o: New test for this change.
+ * tests/Makefile.am (TESTS): Add it.
+
+2014-10-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix grep -P crash
+ Reported by Shlomi Fish in: http://bugs.gnu.org/18806
+ Commit 9fa500407137f49f6edc3c6b4ee6c7096f0190c5 (2014-09-16) is a
+ hack that I put in to speed up 'grep -P'. Unfortunately, not only
+ is it a violation of modularity, it's also a bug magnet, as we have
+ found out with Bug#18738 and Bug#18806. Remove the optimization
+ instead of applying more bandaids. Perhaps we can think of a
+ better way of doing the optimization, or perhaps we can just live
+ with a slower grep -P (as -P is inherently slower anyway...).
+ * src/grep.c, src/grep.h (validated_boundary):
+ Remove. All uses removed.
+ * src/pcresearch.c (Pexecute): Do not worry about validated_boundary.
+
+2014-10-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: remove two erroneous clauses from a now-unused function
+ RE_DOT_NEWLINE and RE_DOT_NOT_NULL apply only to a dot that
+ matches any character. Do not consider them when matching
+ with a bracket expression.
+
+ * src/dfa.c (match_mb_charset): Remove tests for RE_DOT_NEWLINE
+ and RE_DOT_NOT_NULL.
+
+2014-10-19 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: process all MBCSET constructs via glibc's matcher
+ The DFA matcher does not support collating symbols or equivalence
+ classes, so ensure that any MBCSET reference is handled by the glibc
+ matcher. dfa.c already handled this in one case, but not the other,
+ so that a command like "printf '\0' |src/grep -aE '^\s?$'" would
+ mistakenly end up using dfa.c's match_mb_charset function rather
+ than glibc's matcher.
+
+ * src/dfa.c (dfaexec_main): Move that code into the
+ State_transition macro. This renders the match_mb_charset
+ unused by grep.
+ * tests/multibyte-white-space: Add a test to exercise the
+ just-rendered-inaccessible code path.
+
+2014-10-15 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: initialize validated_boundary properly before use
+ * src/grep.c (main): Initialize validated_boundary before pre-searching
+ for an empty line.
+
+2014-10-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix off-by-one bug in -P optimization
+ Reported by Norihiro Tanaka in: http://bugs.gnu.org/18738
+ * src/pcresearch.c (Pexecute): Fix off-by-one bug with
+ validated_boundary.
+ * tests/init.cfg (envvar_check_fail): Catch off-by-one bug.
+
+2014-10-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: fix a theoretical bug
+ * src/dfa.c (dfaexec_main): After searching for a match from
+ the initial state, set the previous state, S1, to 0.
+ So far, we have found no case in which this fix makes a difference.
+ See http://debbugs.gnu.org/18645
+
+2014-10-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: modernize and simplify man page
+ * doc/grep.in.1 (Tx, Id): Remove. All uses removed.
+ (MTO, URL): New macros, used for email and URL.
+ Use them when appropriate.
+ In main text, omit chatty discussions of other implementations;
+ the full manual suffices for this sort of thing.
+
+ doc: clarify exit status
+ Reported by Santiago Ruano Rincón in: http://bugs.gnu.org/18651
+ * doc/grep.in.1 (EXIT STATUS):
+ * doc/grep.texi (Exit Status): Clarify.
+
+2014-10-07 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: test for just-fixed bug
+ * tests/mb-dot-newline: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+ Bisection suggests that the bug was introduced by
+ commit v2.18-123-geb3292b. Also see
+ http://debbugs.gnu.org/cgi/bugreport.cgi?msg=17;bug=18580
+
+2014-10-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: factor out a new nontrivial block of duplicated code
+ * src/dfa.c (State_transition): New macro.
+ (dfaexec_main): Use it twice.
+
+ dfa: check end of input buffer after transition in non-UTF8 multibyte locale
+ * src/dfa.c (dfaexec_main): Check for end of input buffer after each
+ transition in a non-UTF8 multibyte locale.
+ * tests/mb-non-UTF8-overrun: New test.
+ * tests/Makefile.am (TESTS): Add it.
+ * src/grep.c (main): With this fix, we no longer need the fourth
+ byte of "eolbytes".
+
+2014-10-04 Jim Meyering <meyering@fb.com>
+
+ grep: avoid stack buffer read-underrun and overrun
+ Testing binaries built with -fsanitize=address caused aborts due
+ to stack underrun and overrun.
+ * src/grep.c (main): Allocate a larger buffer for eolbytes:
+ one byte before the beginning and one more after the end.
+ For details, see http://debbugs.gnu.org/18580#44.
+
+2014-10-04 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: fix subscript error when testing whether empty lines match
+ * src/grep.c (grep): When testing whether an empty line matches,
+ make the input buffer one byte longer, as dfaexec uses that
+ for a sentinel.
+
+2014-09-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: minor tweaks, mostly to remove __attribute__ ((noinline))
+ That attribute isn't portable, and I found a way to get similar
+ performance with standard C features.
+ * NEWS: Document the recently-installed performance improvement.
+ * src/dfa.c (struct dfa): New member dfaexec.
+ (dfaexec_main): Remove unnecessary 'const'.
+ (dfaexec_mb, dfaexec_sb): Remove __attribute__ ((noinline));
+ no longer needed.
+ (dfaexec): Use new dfaexec member.
+ (dfainit, dfaoptimize, dfassbuild): Initialize it.
+
+2014-09-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: separate dfaexec function to help optimization by compiler
+ * src/dfa.c (dfaexec_main): Rename from dfaexec, add inline attribute.
+ (dfaexec_mb): New function. Run it when d->multibyte is true. This
+ function must not be inlined.
+ (dfaexec_sb): New function. Run it when d->multibyte is false. This
+ function must not be inlined.
+ (dfaexec): Call dfaexec_mb or dfaexec_sb according to d->multibyte.
+
+2014-09-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: speed-up at initial state
+ The DFA state is always 0 until a potential match is found, so speed up
+ matching there by continuing to use the transition table.
+
+ * src/dfa.c (skip_remains_mb): New function.
+ (dfaexec): Speed-up at initial state.
+
+2014-09-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: generalize the -Wcast-align fix
+ * src/grep.c (CAST_ALIGNED): New macro.
+ (skip_easy_bytes): Use it.
+
+2014-09-27 Jim Meyering <meyering@fb.com>
+
+ maint: suppress a false-positive -Wcast-align warning
+ Building with --enable-gcc-warnings and gcc-4.9.1 would provoke this:
+ grep.c:499:12: error: cast from 'const char *' to 'const uword *'\
+ (aka 'const unsigned long *') increases required alignment from\
+ 1 to 8 [-Werror,-Wcast-align]
+ for (s = (uword const *) p; ! (*s & hibyte_mask); s++)
+ ^~~~~~~~~~~~~~~~~
+ * src/grep.c (skip_easy_bytes): Use a pragma to suppress
+ gcc's false-positive cast-alignment warning.
+
+2014-09-26 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: don't check extensively for invalid prefix bytes unless -P
+ Problem reported by Jim Meyering in: http://bugs.gnu.org/18454#56
+ * src/grep.c (grep): After the first buffer is checked, leave the
+ file-type checker in TEXTBIN_UNKNOWN state only when -P is used.
+ Only the -P matcher has performance problems with checking binary
+ data that make it worthwhile to check every prefix input byte so
+ the -P matcher's TEXTBIN_UNKNOWN optimizations can come into play.
+ Other matchers can simply check the data directly, and using
+ TEXTBIN_UNKNOWN with them slows 'grep' down for no benefit.
+
+ grep: scan for valid multibyte strings more quickly
+ Scan valid multibyte strings more quickly in the common case of
+ encodings that are upward compatible with ASCII, such as UTF-8.
+ You'd think there'd be a fast standard way to do this nowadays,
+ but nooooo....
+ Problem reported by Jim Meyering in: http://bugs.gnu.org/18454#56
+ * src/grep.c (HIBYTE): New constant.
+ (easy_encoding): New static var.
+ (init_easy_encoding, skip_easy_bytes): New functions.
+ (uword): New type.
+ (buffer_textbin): Skip easy bytes quickly.
+ Don't bother with mb_clen here, since skip_easy_bytes typically
+ captures the easy cases; just use mbrlen directly.
+ (buffer_textbin, file_textbin): First arg is no longer a const
+ pointer, since the byte past the end is now an overwritten sentinel.
+ (fillbuf): Make room for a uword after the buffer, for skip_easy_bytes.
+ (main): Call init_easy_encoding.
+
+2014-09-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: speed up processing of holes before EOF on Solaris
+ * src/grep.c (fillbuf): If SEEK_DATA fails with errno == ENXIO,
+ skip over the hole at EOF.
+
+ grep: port to platforms lacking SEEK_DATA
+ Reported by Norihiro Tanaka in: http://bugs.gnu.org/18454#38
+ * src/grep.c (SEEK_DATA): Default to SEEK_SET if not defined.
+ (SEEK_HOLE): Move to top level, and default it to SEEK_SET.
+ (file_textbin): Adjust to new default.
+ (fillbuf): Don't bother with SEEK_DATA if it defaults to SEEK_SET.
+
+ grep: skip past holes efficiently
+ Take advantage of the relaxed rules for treating non-text bytes in
+ binary data, by efficiently skipping past holes on platforms
+ supporting lseek's SEEK_DATA flag.
+ On one test on a circa-2008 Sun Fire V40z running Solaris 11.2,
+ 'grep x' took 0.009 real-time seconds to scan a holey file of size
+ 9,223,372,036,854,775,802 bytes, for a nominal scan rate of 1 ZB/s.
+ grep 2.20's scan rate on this platform was 843 MB/s, so this is a
+ speedup by a factor of 1.2 trillion. The speedup factor is not
+ as great on GNU/Linux hosts, due to what appear to be SEEK_DATA
+ inefficiencies, but presumably this will be cleared up in time.
+ * NEWS: Document this.
+ * src/grep.c, src/grep.h (eolbyte): Now char, not unsigned char.
+ This is for compatibility with the rest of the code.
+ The old (performance?) reasons for 'unsigned char' are now moot.
+ * src/grep.c (skip_nuls, skip_empty_lines, seek_data_failed):
+ New static vars.
+ (totalnl): Move up, since it's about input, not output, and
+ fillbuf now uses it.
+ (add_count): Move up, since fillbuf now uses it.
+ (all_zeros): New function.
+ (fillbuf): Use SEEK_DATA to skip past holes efficiently,
+ on systems that support this.
+ (grep, main): Set the new static vars.
+
+ grep: improve -P performance in typical cases
+ * src/grep.c, src/grep.h (enum textbin): Move to grep.h.
+ (input_textbin, validated_boundary): New vars.
+ * src/grep.c (grepbuf, grep): Initialize them.
+ * src/pcresearch.c (Pexecute): Do a multiline search
+ when the input is known to be free of encoding errors.
+ Quickly discard bytes that are obviously encoding errors.
+ Quickly match empty strings.
+
+ grep: minor -P speedup with jit_stack
+ * src/pcresearch.c (jit_stack): No longer static.
+
+ grep: non-text bytes in binary data may be treated as line ends
+ * NEWS, doc/grep.texi (File and Directory Selection):
+ Document this change.
+ * src/grep.c (zap_nuls): New function.
+ (grep): Use it.
+ * tests/null-byte: Relax to allow new behavior.
+
+ grep: -z no longer considers '\200' to be binary data
+ This avoids a problem when using grep -z in a Windows-1252 locale.
+ Plus, it lets 'grep -z' run a bit faster.
+ * NEWS: Document this.
+ * src/grep.c (buffer_textbin): Don't look for '\200' if -z.
+ * tests/pcre-z: Test for new behavior.
+
+ grep: refactor binary-vs-unknown-vs-text flags for clarity
+ * src/grep.c (enum textbin): New enum.
+ (textbin_is_binary): New function.
+ (buffer_textbin, file_textbin, grep): Use them, for clarity.
+
+2014-09-16 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix -P speedup bug with empty match
+ * src/pcresearch.c (NSUB): New top-level constant, replacing
+ 'nsub' within Pexecute.
+ (Pcompile, Pexecute): Use it.
+ (Pexecute): Don't assume sub[1] is zero after a PCRE_ERROR_BADUTF8
+ match failure.
+ * tests/pcre-invalid-utf8-input: Test for this bug.
+
+ grep: port -P speedup to hosts lacking PCRE_STUDY_JIT_COMPILE
+ * src/pcresearch.c (Pcompile): Do not assume that
+ PCRE_STUDY_JIT_COMPILE is defined.
+ (empty_match): Define on all platforms.
+
+ grep: use mbclen cache in one more place
+ * src/grep.c (fgrep_to_grep_pattern): Use mb_clen here, too.
+
+ grep: avoid false alarms for mb_clen and to_uchar
+ * cfg.mk (_gl_TS_unmarked_extern_functions): New var,
+ to bypass the tight_scope false alarms on mb_clen and to_uchar.
+
+ grep: use mbclen cache more effectively
+ * src/grep.c (buffer_textbin, contains_encoding_error):
+ Use mb_clen for speed.
+ (buffer_textbin): Bypass mb_clen in unibyte locales.
+ (main): Always initialize the cache, since it's sometimes used in
+ unibyte locales now. Initialize it before contains_encoding_error
+ might be called.
+ * src/search.h (SEARCH_INLINE): New macro.
+ (mbclen_cache): Now extern decl.
+ (mb_clen): New inline function.
+ * src/searchutils.c (SEARCH_INLINE, SYSTEM_INLINE): Define.
+ (mbclen_cache): Now extern.
+ (build_mbclen_cache): Put 1 into the cache when mbrlen returns 0.
+ (mb_goback): Use mb_clen for speed, and rely on it returning nonzero.
+ * src/system.h (SYSTEM_INLINE): New macro.
+ (to_uchar): Use it.
+
+ grep: improve performance for older glibc
+ glibc has a bug where mbrlen and mbrtowc mishandle length-0 inputs.
+ Working around it in gnulib slows grep down, so disable the tests for it
+ and make sure grep works even if the bug is present.
+ * bootstrap.conf (avoided_gnulib_modules): Add mbrtowc-tests.
+ * configure.ac (gl_cv_func_mbrtowc_empty_input): Assume yes.
+ * src/searchutils.c (mb_next_wc): Don't invoke mbrtowc on empty input.
+
+ grep: treat a file as binary if its prefix contains encoding errors
+ * NEWS:
+ * doc/grep.texi (File and Directory Selection):
+ Document this.
+ * src/grep.c (buffer_encoding, buffer_textbin): New functions.
+ (file_textbin): Rename from file_is_binary. Now returns 3-way value.
+ All callers changed.
+ (file_textbin, grep): Check the input more carefully for text vs
+ binary data.
+ (contains_encoding_error): Remove; use replaced by buffer_encoding.
+ * tests/backref-multibyte-slow:
+ * tests/high-bit-range:
+ * tests/invalid-multibyte-infloop:
+ Use -a, since the input is now considered to be binary.
+ * tests/invalid-multibyte-infloop: Add a check for new behavior.
+
+ grep: use bool for boolean in grep.c
+ * src/grep.c (show_version, suppress_errors, only_matching)
+ (align_tabs, match_icase, match_words, match_lines, errseen)
+ (write_error_seen, is_device_mode, usable_st_size)
+ (file_is_binary, skipped_file, reset, fillbuf, out_quiet)
+ (out_line, out_byte, count_matches, no_filenames, line_buffered)
+ (done_on_match, exit_on_match, print_line_head, prline, grep)
+ (grepdirent, grepfile, grepdesc, grep_command_line_arg)
+ (get_nondigit_option, main): Use bool for boolean.
+ (print_line_head, prline): Use char for byte.
+ * src/grep.h: Include <stdbool.h>, and adjust decls to match
+ changes in grep.c.
+
+ grep: speed up -P on files containing many multibyte errors
+ * src/pcresearch.c (empty_match): New var.
+ (Pcompile): Set it.
+ (Pexecute): Use it.
+
+ grep: remove/refactor unnecessary code about line splitting
+ * src/grep.c (do_execute): Remove. Caller now uses 'execute'.
+ * src/pcresearch.c (Pexecute): Improve comment about this.
+
+2014-09-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: diagnose -P in non-UTF-8 multibyte locale
+ * src/pcresearch.c (Pcompile):
+ libpcre supports only unibyte and UTF-8 locales,
+ so report an error and exit if used in other locales.
+ * NEWS: Mention this.
+ * tests/euc-mb: Test this.
+
+2014-09-12 Jim Meyering <meyering@fb.com>
+
+ doc: move NEWS note about GREP_OPTIONS into proper section
+ * NEWS (Changes in behavior): Move the note about GREP_OPTIONS
+ from the 2.20 section into the section for the upcoming release.
+
+2014-09-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: make GREP_OPTIONS obsolescent
+ * NEWS:
+ * doc/grep.in.1 (ENVIRONMENT_VARIABLES):
+ * doc/grep.texi (Environment Variables):
+ Document that GREP_OPTIONS is obsolescent now.
+ * src/grep.c (main): Warn if GREP_OPTIONS is used.
+ * tests/r-dot, tests/skip-device: Don't use GREP_OPTIONS.
+
+2014-09-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: bug tracker has moved to debbugs.gnu.org
+ * README (KNOWN BUGS):
+ * doc/grep.in.1:
+ * doc/grep.texi (Reporting Bugs): Document this.
+
+ grep: fix false matches with -P '...$' and invalid UTF-8
+ * tests/pcre-invalid-utf8-input: Add a test for that.
+
+ grep: fix false matches with -P '...$' and invalid UTF-8
+ * src/pcresearch.c (Pexecute): Use PCRE_NOTEOL when matching
+ initial substrings of a line.
+
+2014-09-10 Jim Meyering <meyering@fb.com>
+
+ tests: add expect-to-fail test for a glibc regexp bug
+ * tests/triple-backref: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ (XFAIL_TESTS): List it as a known, always-failing test.
+ Based on the bug report from Paul Eggert:
+ https://sourceware.org/bugzilla/show_bug.cgi?id=17356
+
+ maint: avoid distcheck failure
+ * Makefile.am (EXTRA_DIST): Add .mailmap.
+
+2014-09-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: port recent fix to older pcre version
+ * src/pcresearch.c (Pexecute): Don't assume that a pcre_exec
+ that returns PCRE_ERROR_NOMATCH leaves its sub argument alone.
+ This assumption is false for libpcre-3 version 8.31-2ubuntu2.
+
+2014-09-09 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -P now treats invalid UTF-8 input as non-matching
+ Problem reported by Santiago Vila in: http://bugs.gnu.org/18266
+ * NEWS: Mention this.
+ * src/pcresearch.c (Pexecute): Treat UTF-8 encoding errors
+ as non-matching data, instead of exiting 'grep'.
+ * tests/pcre-infloop: grep now exits with status 1, not 2.
+ * tests/pcre-invalid-utf8-input: grep now exits with status 0, not 2.
+
+2014-08-14 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix integer-width bugs in undossify_input etc.
+ undossify_input bug reported by Vincent Lefevre in:
+ http://bugs.gnu.org/18269
+ * src/dosbuf.c (undossify_input): Return size_t, not int.
+ * src/grep.c (fillbuf): Work portably even if safe_read returns a
+ value greater than SSIZE_MAX, e.g., if there's an I/O error.
+
+2014-08-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: document LANGUAGE
+ Reported by Benno Schulenberg in: http://bugs.gnu.org/18185
+ * doc/grep.texi (Environment Variables): Document LANGUAGE.
+
+ doc: prefer @env to @code
+ Reported by Benno Schulenberg in: http://bugs.gnu.org/18184
+ * doc/grep.texi: Avoid @code in favor of @env, or of nothing at all.
+
+2014-07-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: Document -r vs --exclude more carefully.
+ Problem reported by Hugues Andreux in: http://bugs.gnu.org/17763
+ * doc/grep.texi (File and Directory Selection): Be more careful
+ about documenting the interaction between recursive searching,
+ --include, --exclude, and --exclude-dir.
+
+2014-06-27 Jim Meyering <meyering@fb.com>
+
+ maint: split long lines, and enforce the 80-column limit
+ * cfg.mk (sc_long_lines): New rule, from coreutils; exempt tests/*
+ * src/grep.c (usage): Tweak -F wording to shorten a line.
+ Correct grammar in a comment.
+ Split the --exclude-file=... description to fit within 80 columns.
+ Use emit_bug_reporting_address, eliminating another long line.
+ * src/dfa.c: Split long lines. No semantic change.
+ * doc/grep.texi: Likewise.
+ * tests/include-exclude: Split a long line.
+ * tests/backref: Split long lines.
+ * tests/empty: Likewise.
+ * tests/fmbtest: Likewise.
+
+ doc: update HACKING
+ * HACKING: Update from coreutils.
+
+ maint: generate distributed THANKS from VC'd THANKS.in
+ * Makefile.am (THANKS): New rule.
+ * THANKS.in: New file.
+ * THANKS: Remove. Now it's generated from the combination of
+ THANKS.in and git logs.
+ * .mailmap: New file.
+ * cfg.mk (sc_THANKS_in_duplicates): New syntax-check rule, from
+ coreutils.
+ * .gitignore: Add THANKS.
+ * thanks-gen: New file, from coreutils.
+
+2014-06-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: with -E, unmatched ')' matches itself
+ Problem reported by Nathan Weeks in: http://bugs.gnu.org/17856
+ * src/grep.c (Ecompile): Also specify RE_UNMATCHED_RIGHT_PAREN_ORD.
+ * doc/grep.texi (Fundamental Structure), NEWS: Document this.
+ * tests/ere.tests: Add a couple of tests for this.
+ * tests/spencer1.tests: Fix exit status.
+
+2014-06-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: avoid -Wstack-protector
+ This allows the use of --enable-gcc-warnings on Gentoo and Ubuntu.
+ See: http://bugs.gnu.org/17793
+ * configure.ac (WERROR_CFLAGS): Avoid -Wstack-protector.
+
+ This can be worked around, but the cure is worse than the disease.
+
+2014-06-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: don't make output files read-only
+ This led to problems, such as the prompt "mv: try to overwrite
+ 'egrep', overriding mode 0555 (r-xr-xr-x)? " during a build.
+ It can be worked around, but the cure is worse than the disease;
+ making output files read-only is more trouble than it's worth.
+ * doc/Makefile.am (grep.1, egrep.1, fgrep.1):
+ * lib/Makefile.am (colorize.c):
+ * src/Makefile.am (egrep fgrep):
+ Don't make output files read-only. Prefer separate commands to
+ '&&' when either will do.
+
+2014-06-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: remove grep.spec
+ * grep.spec: Remove; obsolete and evidently not used.
+
+2014-06-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: use gnulib fdl module
+ * bootstrap.conf (gnulib_modules): Add fdl.
+ * doc/fdl.texi: Remove, as this now comes from gnulib.
+ * doc/.gitignore: Update to match current sources.
+
+2014-06-06 Jim Meyering <meyering@fb.com>
+
+ build: improve rule to generate egrep+fgrep scripts
+ * src/Makefile.am (egrep fgrep): chmod a=rx generated files,
+ and remove $@-t before attempting to redirect to it, in case it
+ is read-only.
+
+ build: don't redirect directly to $@
+ * lib/Makefile.am (colorize.c): Don't redirect directly to target, $@.
+ Otherwise, we could create a corrupt colorize.c file with a
+ timestamp that indicates it is up to date.
+ Also, make the generated file read-only.
+
+2014-06-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: undo part of previous change
+ * src/dfa.c (enlist): Undo part of previous change that doesn't
+ look correct and doesn't help performance much anyway.
+
+ grep: use system strstr if available and fast
+ Problem reported by Norihiro Tanaka in: http://bugs.gnu.org/17700
+ * NEWS: Document this.
+ * bootstrap.conf (gnulib_modules): Add strstr.
+ * src/dfa.c (istrstr): Remove.
+ (enlist): Use strstr instead. Wait until we need memory before
+ allocating it; this can save an unnecessary allocate and free.
+
+ build: update gnulib submodule to latest
+
+2014-06-03 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.20
+ * NEWS: Record release date.
+
+2014-05-30 Jim Meyering <meyering@fb.com>
+
+ grep: fix --max-count=N (-m N) to stop reading after Nth match
+ With --max-count=N (-m N), grep is supposed to stop reading input
+ after it has found the Nth match. However, a recent context-
+ related change made it so grep would always read to end of file.
+ * src/grep.c (prtext): Don't let a negative "out_after" value
+ make "pending" line count negative.
+ * tests/max-count-overread: New test, for this.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+ * THANKS: Add names of two recent bug reporters.
+ This bug was introduced by commit v2.18-139-g5122195.
+ Reported by Marc Aldorasi in http://bugs.gnu.org/17640.
+
+2014-05-29 Jim Meyering <meyering@fb.com>
+
+ dfa: fix off-by-one under-allocation from recent change
+ Commit v2.19-10-gc32ff67 mistakenly made this change:
+ -realloc_trans_if_necessary (d, 1);
+ +realloc_trans_if_necessary (d, 0);
+ which led to a heap buffer overflow.
+ * src/dfa.c (dfaexec): Allocate space for one state, as before.
+
+2014-05-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: fix bug with regex containing multiple begin/end-line constraints
+ grep -E 'a(b$|c$)' would mistakenly match "aa".
+ * src/dfa.c (dfamust): When resetting 'is' in OR, also reset
+ 'begline' and 'endline' of 'must'.
+ * NEWS (Bug fixes): Mention it.
+ This bug was introduced via commit v2.18-85-g2c94326.
+ Reported by Péter Radics in <http://bugs.gnu.org/17617>.
+
+2014-05-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: simplify building initial state
+ build_state_zero doesn't need the struct dfa to be initialized,
+ so remove the initialization and simplify.
+ * src/dfa.c (build_state_zero): Remove.
+ (dfaexec): Call realloc_trans_if_necessary and build_state directly.
+
+ dfa: revert "grep: do not count newline before the start of buffer"
+ This reverts commit 5dc3af2806d21455b818be3f9da26c372e4a7f8d.
+ The previous change renders that commit unnecessary.
+
+ dfa: do not clear the first state of a transition table
+ If the number of DFA states reaches 1024, build_state clears transition
+ tables to save memory. However, the initial state is always used,
+ so clearing it just wastes time.
+ * src/dfa.c (build_state): Do not clear the initial state's
+ transition and failure tables.
+
+ grep: remove unnecessary argument
+ * src/grep.c (do_execute): Remove argument 'start_ptr'. It's always null.
+ All uses changed.
+
+2014-05-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: --exclude-dir=FOO/ now ignores the trailing slash
+ Problem reported by Khaled Ziyaeen; see: http://bugs.gnu.org/17481
+ * NEWS, doc/grep.texi (File and Directory Selection): Document this.
+ * src/grep.c (main): Implement this.
+ * tests/include-exclude: Test this.
+
+ dist: don't distribute lib/colorize.c
+ 'configure' creates this file, so it shouldn't be distributed; see:
+ http://bugs.gnu.org/17480
+ * configure.ac (COLORIZE_SOURCE): New macro.
+ Don't use AC_CONFIG_LINKS for lib/colorize.c.
+ * lib/Makefile.am (nodist_libgreputils_a_SOURCES): New macro.
+ (libgreputils_a_SOURCES): Remove colorize.c.
+ (CLEANFILES): Add colorize.c
+ (colorize.c): New rule.
+
+2014-05-23 behoffski <behoffski@grouse.com.au>
+
+ maint: uncapitalize first letter of two dfaerror message strings
+ * dfa.c (lex): Make two message strings consistent with all of
+ the others: do not capitalize the first letter of the first word.
+
+2014-05-23 Jim Meyering <meyering@fb.com>
+
+ maint: revert "grep: port mb_next_wc to RHEL 6.5 x86-64"
+ This reverts commit v2.18-148-ga6ae68d.
+ Now that we have gnulib change v0.1-131-g2a045bc, "mbrlen, mbrtowc:
+ fix bug with empty input", this work-around is no longer needed.
+
+ gnulib: update, for mbrlen/mbrtowc empty input bug fix
+
+2014-05-22 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.19
+ * NEWS: Record release date.
+
+2014-05-21 Jim Meyering <meyering@fb.com>
+
+ maint: avoid new false-positive syntax-check failure
+ * cfg.mk (exclude_file_name_regexp--sc_prohibit_doubled_word):
+ Exempt new test file that contains legitimate use of "in in".
+
+2014-05-17 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ tests: add test case for newline-count fix
+ * tests/count-newline: New test.
+ * tests/Makefile.am (TESTS): Add it.
+
+2014-05-16 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: do not count newline before the start of buffer
+ * src/dfa.c (build_state): When checking whether the previous
+ character was a newline, do not count any newline before the
+ start of the buffer.
+
+2014-05-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: port mb_next_wc to RHEL 6.5 x86-64
+ * src/searchutils.c (mb_next_wc): Work around glibc bug 16950; see:
+ https://sourceware.org/bugzilla/show_bug.cgi?id=16950
+ This bug was masked in the other GNU/Linux tests I made. It was
+ exposed on RHEL 6.5 x86-64, where the compiler (GCC Red Hat 4.4.7-4)
+ happened to use temporaries in a different way.
+ Also see recent changes to the Gnulib documentation in this area:
+ http://lists.gnu.org/archive/html/bug-gnulib/2014-05/msg00013.html
+
+ tests: port mb-non-UTF8-performance to RHEL 6.5
+ * tests/mb-non-UTF8-performance (timeout): Use an integer,
+ as 'timeout 1.234' doesn't work in EUC locales.
+
+2014-05-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ egrep, fgrep: port to Solaris 10 /bin/sh
+ This old shell doesn't grok ${0%/*}; see: http://bugs.gnu.org/17471
+ * src/Makefile.am (egrep fgrep): Don't assume the shell does substrings.
+ * src/egrep.sh (dir): New var, so that the substring calculation is
+ done only once (which is a small win even with newer shells),
+ and so that the calculation is easier to edit on older shells.
+
+2014-05-10 Jim Meyering <meyering@fb.com>
+
+ maint: NEWS: adjust wording to reflect move
+ * NEWS (Improvements): Correct direction-relative wording,
+ now that the referent is below, not above.
+
+ maint: NEWS: move "Improvements" to the top
+ * NEWS: Move the small "Improvements" section to precede
+ the longer "Bug fixes" one.
+
+ gnulib: update submodule to latest, and bootstrap
+ * gnulib: Update submodule.
+ * bootstrap: Update from gnulib.
+
+2014-05-10 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: omit double includes
+ * src/dfa.c: Don't include stddef.h or stdbool.h, as dfa.h includes
+ them already, and it's the same module as we are.
+ Suggested by Aharon Robbins in: http://bugs.gnu.org/17458
+
+ dfa: fix bug with \< etc in multibyte locales
+ Problem reported by Stephane Chazelas in: http://bugs.gnu.org/16867
+ * NEWS: Document the fix.
+ * src/dfa.c (dfaoptimize): Remove any superset if changing from
+ UTF-8 to unibyte, and if the pattern has no backreferences.
+ (dfassbuild): In multibyte locales, treat \< \> \b \B as
+ backreferences in the DFA, since the DFA relies on unibyte
+ tests to check them.
+ (dfacomp): Optimize after building the superset, so that
+ dfassbuild can depend on d->multibyte. A downside is that
+ dfaoptimize must remove supersets that are likely slower than the
+ DFA after optimization, but that's been done in the
+ above-described change.
+ * tests/Makefile.am (XFAIL_TESTS): Remove word-delim-multibyte,
+ since the test works now.
+
+ tests: add test case for -C 0 change
+ * tests/context-0: New test.
+ * tests/Makefile.am (TESTS): Add it.
+
+ grep: -A 0, -B 0, -C 0 now output a separator
+ Problem reported by Dan Jacobson in: http://bugs.gnu.org/17380
+ * NEWS:
+ * doc/grep.texi (Context Line Control): Document this.
+ * src/grep.c (prtext): Output a separator even if context is zero.
+ (main): Default context is now -1, not 0.
+
+2014-05-09 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor improvements to retry-DFA-superset patch
+ * src/dfasearch.c (EGexecute): Avoid unnecessary test in a context
+ where memrchr cannot return a null pointer.
+
+2014-05-09 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: retry DFA superset after matching multiple lines
+ * src/dfasearch.c (EGexecute): Without this patch, the code reverts
+ to KWset when the DFA superset matches multiple lines.
+ However, if the DFA superset matches multiple lines, it most likely
+ also matches a single line, and reverting to KWset means dfafast
+ won't work effectively. Change the code so that it retries the DFA
+ superset immediately after it matches multiple lines. On my platform
+ this improves the performance of "LC_ALL=C grep '\(ab\)cd\1d' k" from
+ 3.48 to 2.14 seconds realtime, where k contains the output of
+ "yes abcdabc | head -50000000".
+
+ dfa: fix inconsistency in multibyte locales
+ * src/dfa.c (dfaexec): Use the same exit condition in multibyte
+ locales as in unibyte.
+
+2014-05-08 Jim Meyering <meyering@fb.com>
+
+ maint: mark some breakless cases with /* fallthrough */ comment
+ * src/dfa.c (addtok_mb, dfaanalyze): Add comment so that it is
+ clear that the "break" statement is deliberately omitted.
+
+2014-05-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: assume C89 for CHAR_BIT
+ * src/dfa.c (CHARBITS): Remove. All uses replaced by CHAR_BIT.
+ (NOTCHAR): Now an enum, since it need not be a macro.
+
+ dfa: don't assume unsigned int is exactly 32 bits wide
+ Sun C 5.12 (sparc) warns of the potential unportability.
+ * src/dfa.c (charclass_word): New type, for clarity.
+ All relevant uses of 'unsigned' changed.
+ (CHARCLASS_WORD_BITS): Rename from INTBITS. All uses changed.
+ Now an enum, since it needn't be a macro.
+ (CHARCLASS_WORD_MASK): New macro.
+ (CHARCLASS_WORDS): Rename from CHARCLASS_INTS. All uses changed.
+ (setbit, clrbit): Cast 1 to charclass_word, for clarity.
+ (notset, add_utf8_anychar, dfastats):
+ Don't assume unsigned int is exactly 32 bits wide.
+ (dfastate): Don't rely on implementation-defined conversion of
+ greater-than-INT_MAX unsigned to int. Change bit test to resemble
+ tstbit more.
+
+ maint: fix indenting to pacify 'prohibit_tab_based_indentation'
+ * src/dfa.c: Use spaces and not tabs to indent some lines.
+
+ grep: simplify and clarify invert-related code
+ * src/grep.c (out_invert, prtext): Use bool for booleans.
+ (prline): Remove unnecessary '!!' on a value that is always 0 or 1.
+ (prtext): Remove last arg NLINESP; use !out_invert instead. All uses
+ changed. Move decls to nearer uses, since we can assume C99 here.
+ Update 'outleft' and 'after_last_match' here; it's simpler.
+ (grepbuf): Compute return value by subtracting new from old 'outleft',
+ rather than by keeping a separate running total. Avoid code duplication
+ by arranging for prtext to be called from one place, not three.
+
+2014-05-08 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: improve performance of -v when combined with -L, -l or -q
+ Problem reported by Jörn Hees in: http://bugs.gnu.org/17427
+ * src/grep.c (grepbuf, grep): When -v is combined with -L, -l, or -q,
+ don't read data unnecessarily after a non-match is found.
+
+2014-05-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: mention performance changes
+ * NEWS: Discuss recent performance improvements and downgrades.
+
+ dfa: clarify use of "if"
+ The phrase "Y is true if X" is logically equivalent to "X implies Y",
+ but often "X if and only if Y" was intended.
+ * src/dfa.c, src/dfa.h: Reword to avoid the incorrect use of "if".
+
+ dfa: minor performance improvement for previous change
+ * src/dfa.c (struct dfa): New member 'fast'. Remove 'has_backref'.
+ All uses changed.
+
+2014-05-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: speed up 'dfaisfast'
+ * src/dfa.c (struct dfa): New member 'has_backref'.
+ (addtok_mb): Set it.
+ (dfaisfast): Use it.
+
+2014-05-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix -w match next to a multibyte letter
+ * NEWS: Document this.
+ * src/dfasearch.c, src/kwsearch.c (WCHAR): Remove.
+ (wordchar): New static function.
+ * src/dfasearch.c (EGexecute):
+ * src/kwsearch.c (Fexecute): Use the new functions, so that the
+ code works correctly if a multibyte character adjacent to the
+ match has two or more bytes.
+ * src/search.h, src/searchutils.c (mb_prev_wc, mb_next_wc):
+ New functions.
+ * tests/word-delim-multibyte: Add a test for grep -w (which now
+ passes), and a test for \> (which still fails). The \< test also
+ still fails.
+
+ grep: improve internal API for multibyte boundary
+ * src/search.h, src/searchutils.c (mb_goback): Rename from
+ is_mb_middle. Omit last arg. Return number of bytes to go back,
+ not just a boolean. All uses changed.
+ * src/dfasearch.c (EGexecute):
+ * src/kwsearch.c (Fexecute): Adjust to API change.
+ * src/kwsearch.c (Fexecute): Eliminate common subexpression.
+
+ grep: fix encoding-error incompatibilities among regex, DFA, KWset
+ This follows up to http://bugs.gnu.org/17376 and fixes a different
+ set of incompatibilities, namely between the regex matcher and the
+ other matchers, when the pattern contains encoding errors.
+ The GNU regex matcher is not consistent in this area: sometimes
+ an encoding error matches only itself, and sometimes it
+ matches part of a multibyte character. There is no documentation
+ for grep's behavior in this area and users don't seem to care,
+ and it's simpler to defer to the regex matcher for problematic
+ cases like these.
+ * NEWS: Document this.
+ * src/dfa.c (ctok): Remove. All uses removed.
+ (parse_bracket_exp, atom): Use BACKREF if a pattern contains
+ an encoding error, so that the matcher will revert to regex.
+ * src/dfasearch.c, src/grep.c, src/pcresearch.c, src/searchutils.c:
+ Don't include dfa.h, since search.h now does that for us.
+ * src/dfasearch.c (EGexecute):
+ * src/kwsearch.c (Fexecute): In a UTF-8 locale, there's no need to
+ worry about matching part of a multibyte character.
+ * src/grep.c (contains_encoding_error): New static function.
+ (main): Use it, so that grep -F is consistent with plain fgrep
+ when the pattern contains an encoding error.
+ * src/search.h: Include dfa.h, so that kwsearch.c can call using_utf8.
+ * src/searchutils.c (is_mb_middle): Remove UTF-8-specific code.
+ Callers now ensure that we are in a non-UTF-8 locale.
+ The code was clearly wrong, anyway.
+ * tests/fgrep-infloop, tests/invalid-multibyte-infloop:
+ * tests/prefix-of-multibyte:
+ Do not require that grep have a particular behavior for this test.
+ It's OK to match (exit status 0), not match (exit status 1), or
+ report an error (exit status 2), since the pattern contains an
+ encoding error and grep's behavior is not specified for such
+ patterns. Test only that KWset, DFA, and regex agree.
+ * tests/prefix-of-multibyte: Add tests for ABCABC and __..._ABCABC___.
+
+2014-05-04 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: minor simplification
+ * src/dfa.c (parse_bracket_exp): Use enum, not macro, and move var
+ to just the scope it's needed.
+
+ grep: simplify and fix problems with KWset-DFA agreement patch
+ * src/dfa.c (dfambcache, parse_bracket_exp): Simplify.
+ (mbs_to_wchar, wctok, FETCH_WC, match_anychar, match_mb_charset)
+ (check_matching_with_multibyte_ops, transit_state_consume_1char)
+ (transit_state, dfaexec): Use wint_t, not wchar_t, so that
+ WEOF is treated correctly on platforms where WEOF is not a valid
+ wchar_t value.
+ (ctok, lex): Use int, not unsigned int, for characters,
+ so that EOF is treated more naturally.
+ (parse_bracket_exp): Use NOTCHAR to mark uninitialized char, since
+ FETCH_WC can now set the char to EOF.
+ (lex): Remove unnecessary test for EOF.
+ (parse_bracket_exp, atom): Swap then and else parts, to put
+ the small one first; this is more readable here.
+ * src/searchutils.c (is_mb_middle): Simplify.
+
+ tests: improve coverage for prefix-of-multibyte
+ * tests/prefix-of-multibyte: Also test the regex version.
+
+2014-05-04 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: make KWset and DFA agree about invalid sequences in patterns
+ See: http://bugs.gnu.org/17376
+ * src/dfa.c (dfambcache): Don't cache invalid sequences, because they can't be
+ represented by wide characters.
+ (dfambcache, mbs_to_wchar): Return WEOF for invalid sequences.
+ (ctok): New global variable.
+ (parse_bracket_exp, atom, match_anychar, match_mb_charset): Don't allow WEOF.
+ (lex): Set 'ctok'.
+ * src/kwsearch.c (Fexecute):
+ * src/searchutils.c (is_mb_middle): Don't check here.
+ * tests/invalid-multibyte-infloop: Adjust to fixed behavior.
+ * tests/prefix-of-multibyte: Add test cases for this bug.
+
+2014-05-03 Jim Meyering <meyering@fb.com>
+
+ maint: make ChangeLog generation more robust
+ * Makefile.am (gen-ChangeLog): Sync changes from GNU coreutils,
+ to ensure exit status is propagated, and to support an optional
+ git-log-fix file.
+
+2014-05-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: clarify EGexecute slightly
+ * src/dfasearch.c (EGexecute): Change if-then-else to !if-else-then.
+
+2014-05-03 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: fix the bug in previous patch.
+ * src/dfasearch.c (EGexecute): Do it.
+
+2014-04-30 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: simplify EGexecute further
+ * src/dfa.c, src/dfa.h (dfasuperset): Arg is now const pointer.
+ Now pure.
+ * src/dfasearch.c (EGexecute): Coalesce some duplicate code.
+ Don't worry about memrchr returning NULL when that's impossible.
+
+2014-04-30 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: adjust timing back to kwset when dfaisfast is true
+ * src/dfasearch.c (EGexecute): If DFA fails after kwset succeeds,
+ the code doesn't return to kwset until it reaches the end of the buffer
+ or finds a match. Because of this, although some cases speed up,
+ others slow down.
+
+ Adjust the heuristic for switching to the DFA, so that it
+ is more likely to switch at the right times.
+
+2014-04-30 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: simplify superset
+ * src/dfa.h (dfahint): Remove decl.
+ (dfasuperset): New decl.
+ * src/dfa.c (dfahint): Remove.
+ (dfassbuild): Rename from dfasuperset.
+ (dfasuperset): New function. It returns the superset of D.
+ * src/dfasearch.c: Use dfasuperset instead of dfahint, and simplify.
+
+ dfa: optimize memory allocation
+ * src/dfa.c (epsclosure): Get the value of 'visited' from the argument.
+ (dfaanalyze): Define and allocate variable 'visited'.
+ (dfastate): Use 'merge', not 'insert', to insert positions for
+ state 0 of DFA.
+
+2014-04-29 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ kwset: improve performance by inlining tr
+ Without this change, older versions of GCC won't inline 'tr', and this
+ can hurt performance significantly. See: http://bugs.gnu.org/17229#64
+ * src/kwset.c (tr): Make it inline.
+
+2014-04-27 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+ * gnulib: This fixes a bug whereby running bootstrap
+ would remove our build-aux/git-log-fix file.
+
+2014-04-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ kwset: improve performance by inlining more
+ Problem reported by Norihiro Tanaka in <http://bugs.gnu.org/17229#55>.
+ * src/kwset.c (bmexec_trans): Rename from bmexec, and make it inline.
+ (bmexec): New implementation, which calls bmexec_trans. This helps
+ GCC inline more aggressively with the default optimization, and
+ improves performance 25% with the reported benchmark on my host.
+
+2014-04-26 Paul Eggert <eggert@cs.ucla.edu>
+
+ kwset: speed up by using memchr2
+ Idea suggested by Eric Blake in: http://bugs.gnu.org/17229#43
+ * bootstrap.conf (gnulib_modules): Add memchr2.
+ * src/kwset.c: Include stdint.h, for uintptr_t. Include memchr2.h.
+ (struct kwset): New members gc1, gc2, gc1help.
+ (tr): Move earlier, so it can be used earlier.
+ (kwsprep): Initialize struct kwset's new members.
+ (memchr_kwset): Rename from memchr_trans. Combine C and TRANS args into
+ new arg KWSET. All uses changed. Use memchr2 when appropriate.
+ (bmexec): Use new members instead of recomputing their values.
+ Increase advance_heuristic; it's just a guess, but memchr2 probably
+ makes it reasonable to increase it.
+
+ kwset: improve performance when large Boyer-Moore key doesn't match
+ * src/kwset.c (bmexec): As a heuristic, prefer memchr to seeking
+ by delta1 only when the latter doesn't advance much.
+
+ dfa: fix index bug in previous patch, and simplify
+ * src/dfa.c, src/dfa.h (dfaisfast): Arg is const pointer.
+ * src/dfa.c (dfaisfast): Simplify, since supersets never contain BACKREF.
+ * src/dfa.h (dfaisfast): Declare to be pure.
+ * src/dfasearch.c (EGexecute): Fix typo that could cause buffer
+ read overrun when !dfafast. Hoist duplicate computation out
+ of an if's then and else parts.
+
+2014-04-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: speed up the case where DFA repeatedly fails after kwset succeeds
+ A DFA is typically much faster if it is unibyte and does not set BACKREF.
+ Skip kwset if the DFA is fast. For example:
+
+ yes abcdabc | head -50000000 >k
+ env LC_ALL=C time -p src/grep -i 'abcd.bd' k
+
+ This improved real-time from 4.86 to 1.34 s.
+
+ * src/dfa.c, src/dfa.h (dfaisfast): New function.
+ * src/dfasearch.c (EGexecute): Use it.
+
+2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: fix recently-introduced memory leak
+ Problem reported by Aharon Robbins in: http://bugs.gnu.org/17341
+ * src/dfa.c (dfasuperset): free after dfafree.
+
+ misc: fix doc and test bugs re grep -z
+ Problem reported by Stephane Chazelas in: http://bugs.gnu.org/16871
+ * doc/grep.texi (Usage): Remove incorrect example with -P.
+ * tests/pcre: Improve test so that it actually tests whether \s
+ matches a newline.
+
+ dfa: minor simplification of dfaexec
+ * src/dfa.c (dfaexec): Streamline updating of returned values.
+ Don't bother to check d->multibyte before updating mbp.
+ Avoid duplicate p > end test.
+
+2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: simplify and be more consistent about MB_CUR_MAX
+ * src/dfa.c (struct dfa): New member 'multibyte',
+ replacing 'mb_cur_max'. All uses changed. Use this new member
+ consistently, instead of sometimes referring to MB_CUR_MAX directly.
+
+ dfa: fix comment
+ * src/dfa.c (maybe_realloc): Fix comment to match behavior better.
+
+2014-04-24 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: skip multibyte character boundary check on reaching eolbyte
+ * src/dfa.c (dfaexec): Skip the multibyte character boundary check
+ on reaching eolbyte.
+
+2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: fix incorrect comment that led to heap overrun
+ * dfa.c (maybe_realloc): Fix comment to match behavior.
+
+ dfa: minor tuneup of dfamust memory savings patch
+ * src/dfa.c (allocmust): Use xmalloc, not xzalloc.
+ Initialize the must completely, so that the caller need not
+ invoke resetmust. All callers changed.
+ (dfamust): Omit asserts that aren't needed on typical machines
+ where dereferencing NULL dumps core. Don't leak memory if the
+ pattern contains a NUL byte.
+
+2014-04-24 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: avoid wasting memory for large patterns in dfamust
+ * src/dfa.c (struct must): New member 'prev'. It points to the
+ previous must.
+ (allocmust): New function.
+ (freemust): New function.
+ (dfamust): Use it.
+
+2014-04-24 Jim Meyering <meyering@fb.com>
+
+ grep: fix new heap write buffer overrun
+ * src/dfa.c (parse_bracket_exp): Fix off-by-one allocation error.
+ Exposed by running the tests with an ASAN-enabled binary (i.e.,
+ created using gcc's -fsanitize=address option). Introduced by
+ commit v2.18-70-gd3d9612, "dfa: simplify range char allocation".
+
+2014-04-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: suppress unsafe-loop-optimizations warnings
+ I ran into one of these while trying out GCC 4.9.0's new
+ -fsanitize=undefined option. The warning told me that GCC didn't
+ do an unsafe optimization, but in 'grep' this is not typically a
+ symptom of a programming error.
+ * configure.ac (WERROR_CFLAGS): Suppress -Wunsafe-loop-optimizations.
+
+2014-04-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: fix memory leak reintroduced by previous patch
+ Reported by Norihiro Tanaka in <http://bugs.gnu.org/17328#16>.
+ * src/dfa.c (dfaexec): Allocate mb_match_lens and mb_follows only
+ if not already allocated.
+ (free_mbdata): Null out mb_match_lens to mark it as being freed.
+
+2014-04-23 Jim Meyering <meyering@fb.com>
+
+ tests: use consistent spelling for locale name, en_US.UTF-8
+ * tests/pcre-infloop: Spell locale name, en_US.UTF-8, consistently,
+ converting this one use from "en_US.utf8", which would provoke a
+ test failure on OS/X.
+
+2014-04-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: omit static variables that limited dfaexec to one struct dfa
+ Problem reported by Aharon Robbins in: http://bugs.gnu.org/17328
+ * src/dfa.c (struct dfa): New member mbs.
+ mb_follows is now a position_set, not a pointer to one;
+ this simplifies memory allocation. All uses changed.
+ (mbs_to_wchar): Put DFA arg at the end, in place of the mbstate_t *arg,
+ since the DFA now contains an mbstate_t. All uses changed.
+ (mbs): Remove static variable.
+ (dfaexec): Remove static bool that attempted to optimize memory
+ allocation, as this wasn't correct for Gawk. Perhaps we can think
+ of a better way to optimize memory.
+
+2014-04-22 Paul Eggert <eggert@cs.ucla.edu>
+
+ kwset: simplify and speed up Boyer-Moore unibyte -i in some cases
+ This improves the performance of, for example,
+ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 | grep -i jk
+ in a unibyte locale.
+ * src/kwset.c (memchr_trans): New function.
+ (bmexec): Use it. Simplify the code and remove some of the
+ confusing gotos and breaks and labels. Do not treat glibc memchr
+ as a special case; if non-glibc memchr is slow, that is lower
+ priority and I suppose we can try to work around the problem in
+ gnulib.
+
+2014-04-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: speed-up by using memchr() in Boyer-Moore searching
+ memchr() of glibc is faster than seeking by delta1 on some platforms.
+ On those platforms, use it when there has been no chance of a match
+ for a while.
+ * src/kwset.c (bmexec): Use memchr() in Boyer-Moore searching.
+
+2014-04-22 Paul Eggert <eggert@cs.ucla.edu>
+
+ kwset: simplify Boyer-Moore with unibyte -i
+ This change doesn't significantly affect performance on my platform,
+ and should make the code easier to maintain.
+ * src/kwset.c (BM_DELTA2_SEARCH, LAST_SHIFT, TRANS):
+ Remove these macros, in favor of ...
+ (tr, bm_delta2_search): New functions. All uses changed.
+ The latter function is inline because this improves code size and
+ runtime CPU slightly on x86-64 with gcc -O2 (GCC 4.9.0).
+ (bmexec): Prefer tr when that's simpler.
+
+2014-04-22 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: may also use Boyer-Moore algorithm for case-insensitive matching
+ * src/kwset.c (BM_DELTA2_SEARCH, LAST_SHIFT, TRANS): New macros.
+ (bmexec): Use character translation table.
+ (kwsexec): Call bmexec for case-insensitive matching.
+ (kwsprep): Change the `if' condition.
+
+2014-04-21 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -P now rejects invalid input sequences in UTF-8 locales
+ See <http://bugs.gnu.org/17245> and <http://bugs.exim.org/1468>.
+ * NEWS: Document this.
+ * src/pcresearch.c (Pexecute): Do not use PCRE_NO_UTF8_CHECK,
+ as this leads to undefined behavior when the input is not UTF-8.
+ * tests/pcre-infloop, tests/pcre-invalid-utf8-input:
+ Exit status is now 2, not 1, when grep -P is given invalid UTF-8
+ data in a UTF-8 locale.
+
+ dfa: minor improvements to previous patch
+ * src/dfa.c (dfamust): Use &=, not if-then.
+ * src/dfa.h (struct dfamust):
+ * src/dfasearch.c (begline, kwsmusts): Use bool for boolean.
+ * src/dfasearch.c (kwsmusts):
+ * src/kwsearch.c (Fcompile): Prefer decls after statements.
+ * src/dfasearch.c (kwsmusts): Avoid conditional branch.
+ * src/kwsearch.c (Fcompile): Unify the two calls to kwsincr.
+
+2014-04-21 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: speed-up for exact matching with begline and endline constraints.
+ dfamust turns on the flag when a state exactly matches the proposed one.
+ However, it turns the flag off when the state has begline and/or
+ endline constraints.
+
+ This patch makes it possible to match a state exactly even if the
+ state has begline and/or endline constraints. If an exact string has
+ one of these constraints, the string with eolbyte added to its head
+ and/or foot is passed to kwsincr(). In addition, if it has a begline
+ constraint, start searching from just before the position of the text.
+
+ * src/dfa.c (variable must): New members `begline' and `endline'.
+ (dfamust): Take begline and endline constraints into account.
+ * src/dfa.h (struct dfamust): New members `begline' and `endline'.
+ * src/dfasearch.c (kwsmusts): If an exact string has a begline constraint,
+ start searching from just before the position of the text.
+ (EGexecute): Same as above.
+ * src/kwsearch.c (Fexecute): Same as above.
+
+2014-04-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: fix bug that caused NUL to be mishandled in patterns
+ This bug was introduced in the early-2012 patches that fixed some
+ context-handling bugs. Bisecting found commit
+ d8951d3f4e1bbd564809aa8e713d8333bda2f802 (2012-02-05 18:00:43 +0100),
+ but it appears the underlying problem was introduced in commit
+ 8b47c4cf6556933f59226c234b0fe984f6c77dc7 (2012-01-03 11:22:09 +0100).
+ * NEWS: Mention bug fix.
+ * src/dfa.c (char_context): Consider NUL to be a newline only if -z.
+ * tests/Makefile.am (TESTS): Add null-byte.
+ * tests/null-byte: New file.
+
+2014-04-19 Jim Meyering <meyering@fb.com>
+
+ build: reenable some compiler warning options
+
+2014-04-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: fix pointer type conversion bug
+ The code converted between size_t * and ptrdiff_t *, which wasn't
+ diagnosed by modern x86-64 GCC but isn't portable. Problem
+ reported by Norihiro Tanaka in <http://bugs.gnu.org/17136#31>.
+ * configure.ac (WERROR_CFLAGS): Don't add -Wno-pointer-sign.
+ We want GCC to diagnose pointer signedness problems, as they
+ violate the C standard and other compilers no doubt complain too.
+ * src/dfa.c (struct dfa): Change type of salloc to size_t.
+ (realloc_trans_if_necessary): Convert signed value to size_t before
+ passing its address to x2nrealloc. Changing the type of tralloc
+ to size_t might have led to problems elsewhere.
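The fix described above, converting the signed value rather than casting the pointer, can be sketched as follows. This is an illustration under assumptions, not the dfa.c code: grow_array is a hypothetical stand-in for an allocator helper such as gnulib's x2nrealloc, and struct table with its members is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a growth helper like x2nrealloc: grow the
   array P of *PN items, each of size S, store the new item count in
   *PN, and return the new block.  Abort on overflow or failure.  */
void *
grow_array (void *p, size_t *pn, size_t s)
{
  size_t n = *pn;
  assert (s != 0);
  if (n == 0)
    n = 16;                     /* arbitrary initial capacity */
  else if (p != NULL)
    {
      if (SIZE_MAX / 2 < n)
        abort ();               /* doubling would overflow */
      n *= 2;
    }
  if (SIZE_MAX / s < n)
    abort ();                   /* n * s would overflow */
  p = realloc (p, n * s);
  if (p == NULL)
    abort ();
  *pn = n;
  return p;
}

/* A table whose allocation count is kept as a signed value.  */
struct table
{
  int *items;
  ptrdiff_t alloc;
};

void
table_grow (struct table *t)
{
  /* Not portable: grow_array (t->items, (size_t *) &t->alloc, ...),
     since size_t and ptrdiff_t are distinct types.  */
  assert (0 <= t->alloc);
  size_t alloc = t->alloc;      /* convert the value, not the pointer */
  t->items = grow_array (t->items, &alloc, sizeof *t->items);
  t->alloc = alloc;
}
```

The temporary costs one extra copy in each direction but keeps every pointer argument at the type the callee expects, so a compiler run without -Wno-pointer-sign has nothing to diagnose.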
+
+2014-04-18 Jim Meyering <meyering@fb.com>
+
+ maint: Revert "dfa: avoid new NULL dereference"
+ This reverts commit 5190041fe515743ef4545abf287d243bc025c701.
+ It was only a bug if one neglected to update to the latest gnulib.
+ With the newer x2nrealloc, there is no problem.
+
+ dfa: avoid new NULL dereference
+ * src/dfa.c (dfa_charclass_index): Restore a "+ 1" mistakenly omitted
+ during recent improvements. Introduced in v2.18-66-g6a60fd5.
+
+2014-04-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: minor cleanup
+ * src/dfa.c (MAX): Remove; no longer used.
+
+2014-04-17 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: speed up by checking multibyte characters on demand
+ If dfaexec() runs in non-UTF8 locales, the length and wide character
+ representation are checked for all characters of a line in an input
+ string. However, if a match occurs early in the line, the results for
+ the remaining characters are wasted.
+
+ This patch checks multibyte characters on demand. It should work
+ faster for early matches, and reduces memory requirements.
+
+ * src/dfa.c (struct dfa): Remove members mblen_buf, nmblen_buf,
+ inputwcs, ninputwcs. All uses removed.
+ (buf_begin, buf_end, prepare_wc_buf): Remove. All uses removed.
+ (SKIP_REMAINS_MB_IF_INITIAL_STATE): Remove. This is now expanded
+ when used.
+ (match_anychar, match_mb_charset, check_matching_with_multibyte_ops):
+ New arg wc, mbclen. Remove arg idx. All uses changed.
+ (transit_state_consume_1char): New arg wc. All uses changed.
+ (transit_state): New arg 'end'. All uses changed.
+
+2014-04-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: trans reallocation microoptimization
+ * src/dfa.c (realloc_trans_if_necessary):
+ Help the compiler avoid unnecessary reloads.
+
+ dfa: simplify dfamust initialization
+ * src/dfa.c (dfamust): Don't initialize musts twice.
+ Use zcalloc, not xmalloc followed by zeroing.
+ Make result a const pointer.
+
+ dfa: simplify freelist
+ * src/dfa.c (freelist): Don't null out array while freeing its
+ pointers; the caller can do that if needed.
+ (resetmust): Null out zeroth entry of array.
+
+ dfa: avoid duplicate strlen when allocating memory
+ * src/dfa.c (dfamust): Use xstrdup, not strlen (twice) + xmemdup.
+
+ dfa: simplify memory allocation
+ * src/dfa.c (icatalloc, freelist, enlist, comsubs, addlists, inboth)
+ (dfamust): Don't worry about null arguments or results,
+ as memory allocators no longer can return null pointers.
+ (dfamust): Invoke malloc just once when building a concatenated string.
+
+ dfa: simplify position set and element count allocation
+ * src/dfa.c (dfaanalyze): Allocate position set info all in one go,
+ and similarly for element count info.
+
+ dfa: simplify multibyte_prop allocation
+ * src/dfa.c (struct dfa): Simplify by removing nmultibyte_prop;
+ it should always be the same as talloc. All uses changed.
+
+ dfa: simplify range char allocation
+ * src/dfa.c (struct dfa): Simplify by allocating one array of ranges
+ rather than one for range starts and another for range ends.
+ All uses changed.
+
+ dfa: simplify transition table allocation
+ * src/dfa.c (struct dfa): Remove member 'realtrans', as it can
+ be computed from 'trans'. All uses changed.
+ (realloc_trans_if_necessary): Move earlier, to avoid a forward decl.
+ Use x2nrealloc to compute new size, rather than doing it by hand,
+ which omits a check for unlikely overflow.
+ (realloc_trans_if_necessary, dfafree): Adjust to the fact that
+ d->trans now might be either NULL, or 1 + the pointer to free.
+ (build_state, build_state_zero): Use realloc_trans_if_necessary
+ instead of duplicating its code.
+
+ dfa: better size-overflow check
+ * src/dfa.c (dfasuperset): Let xnmalloc do the multiplication,
+ to check for size arithmetic overflow better.
+
+ dfa: avoid unnecessary work and other initialization
+ * src/dfa.c (dfaanalyze, dfainit):
+ Don't bother allocating when x2nrealloc will do it for us.
+ (dfastate): Allocate grps and labels on the stack, as their
+ size is known at compile time.
+ (build_state): Use xmalloc, not xnmalloc, since the multiplication
+ can be done at compile-time.
+
+ dfa: clarify memory allocation and port to IRIX
+ This change was prompted by a porting problem:
+ IRIX defines its own MALLOC macro, which clashes with ours.
+ More generally, the MALLOC etc. macros are confusing, as they
+ look like functions but do not have C-function semantics.
+ A functional style makes the code easier to read, and though
+ it lengthens the code a bit here it'll make other
+ simplifications easier.
+ * src/dfa.c (XNMALLOC, XCALLOC, CALLOC, MALLOC, REALLOC): Remove.
+ All uses replaced by xnmalloc etc.
+ (REALLOC_IF_NECESSARY): Remove; all uses replaced by ....
+ (maybe_realloc): New function.
+ (copy, merge): Free and allocate rather than realloc, as we
+ needn't save the contents.
+
+2014-04-14 Jim Meyering <meyering@fb.com>
+
+ tests: detect an infloop-inducing bug in grep -P (pcre-8.35)
+ * tests/pcre-infloop: New test.
+ * tests/Makefile.am (TESTS): Add it.
+
+2014-04-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2014-04-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: improvements for the open-CSET patch
+ * src/dfa.c (dfamust): Simplify by removing some duplicate code.
+ Optimize patterns like [aaa] even when not case-folding.
+ Avoid an unnecessary copy of the charclass.
+
+2014-04-11 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: open CSET and transform into uppercase when MB_CUR_MAX == 1
+ In unibyte locales with -i, kwset matching isn't helpful, because
+ dfamust doesn't extract the CSET entries. Fix dfamust so that it
+ does that, which makes it possible to extract a longer fixed string
+ from the tokens.
+ * src/dfa.c (dfamust): Open CSET and transform into uppercase
+ when MB_CUR_MAX == 1.
+
+2014-04-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: cleanup for HAS_DOS_FILE_CONTENTS issue
+ While cleaning up the empty-string fix, I noticed that one part of
+ the code worried about CRLF in pattern files whereas another part
+ did not. Fix this by using the same approach in both places,
+ and make the CRLF code more modular in the process.
+ * src/dosbuf.c (dos_binary, dos_unix_byte_offsets): New functions.
+ (undossify_input, dossified_pos): Do nothing if ! O_BINARY.
+ * src/grep.c: Always include dosbuf.c so that the code is
+ checked statically even on non-DOS hosts.
+ (dos_binary, dos_unix_byte_offsets): New decls.
+ (undossify_input): Declare unconditionally.
+ * src/grep.c (fillbuf, print_line_head, main):
+ * src/kwsearch.c (Fcompile):
+ Simplify by not worrying about HAVE_DOS_FILE_CONTENTS.
+ * src/grep.c (main): fopen with "rt" if O_TEXT; this is simpler
+ than worrying about HAVE_DOS_FILE_CONTENTS elsewhere.
+ * src/system.h (HAVE_DOS_FILE_CONTENTS): Remove.
+
+ grep: cleanup for empty-string fix
+ * NEWS: Document it.
+ * src/dfasearch.c (GEAcompile):
+ * src/kwsearch.c (Fcompile):
+ Use C99-style decls to simplify. Avoid duplicate code.
+ * tests/empty-line: Add some more tests like this.
+
+2014-04-11 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: no match for the empty string included in multiple patterns
+ * src/dfasearch.c (GEAcompile): Fix it.
+ * src/kwsearch.c (Fcompile): Fix it.
+
+2014-04-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: remove bool_bf
+ The extra complexity of this microoptimization wasn't ever much help,
+ and currently it generates bigger code with gcc -O2 (x86-64).
+ * src/dfa.c (bool_bf): Remove. All uses replaced by plain 'bool',
+ without a bitfield.
+
+2014-04-08 Jim Meyering <meyering@fb.com>
+
+ maint: avoid sc_po_check syntax-check failure (kwset.c)
+ * po/POTFILES.in: Remove kwset.c from this list, since it
+ no longer contains a translatable diagnostic.
+
+2014-04-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: port better to hosts with nonstandard nl_langinfo
+ On some hosts, nl_langinfo returns strings other than "UTF-8" when
+ UTF-8 is used, and (worse) returns "UTF-8" even if the encoding is
+ single-byte. Work around these problems by trying a sample
+ character instead.
+ * src/dfa.c, src/pcresearch.c, src/searchutils.c:
+ Don't include <langinfo.h>.
+ * src/dfa.c (using_utf8): Test for UTF-8 by trying a character
+ rather than by invoking nl_langinfo (CODESET); this is more
+ portable in practice, and removes a dependency on
+ HAVE_LANGINFO_CODESET.
+ * src/pcresearch.c: Include dfa.h, for using_utf8.
+ (Pcompile): Use using_utf8 rather than nl_langinfo.
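The "try a sample character" approach can be sketched as below. This is an assumption-laden illustration rather than the text of using_utf8: the function name locale_decodes_utf8 and the choice of U+0100 as the sample are made up for the example, and the comparison with 0x100 assumes that wchar_t values are Unicode code points on the host.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Return true if the current locale decodes the two bytes C4 80 as the
   single character U+0100, as a UTF-8 locale does.  No call to
   nl_langinfo (CODESET) is needed.  */
bool
locale_decodes_utf8 (void)
{
  wchar_t wc;
  mbstate_t mbs;
  memset (&mbs, 0, sizeof mbs);         /* initial conversion state */
  assert (mbsinit (&mbs));
  return mbrtowc (&wc, "\xc4\x80", 2, &mbs) == 2 && wc == 0x100;
}
```

A program would normally call setlocale (LC_ALL, "") before the probe; without that call it runs in the "C" locale, where the two bytes are not decoded as one character and the probe reports false.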
+
+2014-04-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: prefer bool in DFA internals
+ * src/dfa.c (bool_bf): New type.
+ (dfa_state): Use it, as this seems to generate slightly better
+ code with GCC.
+ (struct mb_char_classes, struct dfa, equal, case_fold, dfasyntax)
+ (laststart, parse_bracket_exp, lex, dfaparse, dfaanalyze, dfastate)
+ (match_mb_charset, dfamust):
+ Use bool for boolean.
+ (using_utf8) [!HAVE_LANGINFO_CODESET]: Tune.
+ (dfaanalyze): Prefer & to && and | to || on booleans; it's simpler here.
+ (dfastate): Simplify charclass nonzero testing. Redo has_mbcset
+ test so that the compiler's more likely to optimize it.
+
+2014-04-07 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: prefer regex to DFA for ANYCHAR in multibyte locales
+ * src/dfa.c (dfa_state): New member has_mbcset.
+ Rename backref to has_backref, and make it of type bool too.
+ All uses changed.
+ (state_index, dfastate): Initialize new member.
+ (dfaexec): Prefer regex to DFA for ANYCHAR in multibyte locales.
+
+2014-04-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: remove trivial_case_ignore
+ This optimization is no longer needed, given the other
+ optimizations recently installed. Derived from a patch by
+ Norihiro Tanaka; see <http://bugs.gnu.org/17019>.
+ * bootstrap.conf (gnulib_modules): Remove assert-h.
+ * src/dfa.c (CASE_FOLDED_BUFSIZE): Move here from dfa.h.
+ Remove now-unnecessary static assert.
+ (case_folded_counterparts): Now static.
+ * src/dfa.h (CASE_FOLDED_BUFSIZE, case_folded_counterparts):
+ Remove decls; no longer public.
+ * src/dfasearch.c (kwsmusts): Use kwset even if MB_CUR_MAX > 1
+ and case-insensitive.
+ * src/grep.c (MBRTOWC, WCRTOMB): Remove.
+ (fgrep_to_grep_pattern): Use mbrtowc, not MBRTOWC.
+ (trivial_case_ignore): Remove; this optimization is no longer needed.
+ All uses removed.
+
+ grep: simplify memory allocation in kwset
+ * src/kwset.c: Include kwset.h first, to check its prereqs.
+ Include xalloc.h, for xmalloc.
+ (kwsalloc): Use xmalloc, not malloc, so that the caller need not
+ worry about memory allocation failure.
+ (kwsalloc, kwsincr, kwsprep): Do not worry about obstack_alloc
+ returning NULL, as that's not possible.
+ (kwsalloc, kwsincr, kwsprep, bmexec, cwexec, kwsexec, kwsfree):
+ Omit unnecessary conversion between struct kwset * and kwset_t.
+ (kwsincr, kwsprep): Return void since memory-allocation failure is
+ not possible now. All uses changed.
+ * src/kwset.h: Include <stddef.h>, for size_t, so that this
+ include file doesn't require other files to be included first.
+
+ grep: minor cleanups for Galil speedups
+ * src/kwset.c: Update citations.
+ Include stdbool.h.
+ (kwsincr, kwsprep): Clarify by using C99 decls after statements.
+ (kwsprep): Clarify by using MIN. Avoid a couple of buffer copies
+ when !TRANS.
+ (bmexec): Use bool for boolean. Prefer "continue;" to ";".
+
+2014-04-07 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: use the Galil rule for Boyer-Moore algorithm in KWSet
+ The Boyer-Moore algorithm is O(m*n), which means it may be much
+ slower than the DFA. Its Galil rule variant is O(n) and increases
+ efficiency in the typical case; it skips sections that are known
+ to match and does not compare more than once for a position in the text.
+ To use the Galil rule, look for the delta2 shift at each position
+ from the trie instead of the 'mind2' value.
+ * src/kwset.c (struct kwset): Replace member 'mind2' with 'shift'.
+ (kwsprep): Look for the delta2 shift.
+ (bmexec): Use it.
+
+2014-04-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: cleanup DFA superset optimization
+ * src/dfa.c (dfa_charclass_index): New function, with body of
+ old dfa_charclass but with an extra parameter D.
+ (charclass_index): Reimplement in terms of dfa_charclass_index.
+ (dfahint): Clarify.
+ (dfasuperset): Do not assign to 'dfa' static variable. Instead,
+ use a local, and use the new dfa_charclass_index function. This
+ doesn't fix any bugs, but it's clearer. Initialize a few more
+ members, to simplify dfafree. Copy the charclasses with
+ just one memcpy call. Don't assign nonnull to D->superset until
+ it's known to be valid; that's simpler.
+ (dfafree, dfaalloc): Simplify based on dfasuperset initializations.
+ * src/dfa.h (dfahint): Add comment.
+ * src/dfasearch.c (EGexecute): Simplify use of memchr.
+ Simplify by using memrchr. Fix typo that could cause a buffer
+ read overrun.
+
+2014-04-06 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: optimization with the superset of DFA
+ The superset of a DFA is like the DFA, except that for speed
+ ANYCHAR, MBCSET and BACKREF are replaced by (CSET full bits) STAR,
+ and mb_cur_max is 1. For example, for 'a\(b\)c\1':
+ original: a b CAT c CAT BACKREF CAT
+ superset: a b CAT c CAT CSET STAR CAT (The CSET has all bits set.)
+ If a string matches a DFA, it matches the DFA's superset.
+ Using the superset to filter can dramatically improve performance,
+ over 200x in some cases. See <http://bugs.gnu.org/16966>.
+ * src/dfa.c (struct dfa): New member 'superset'.
+ (dfahint, dfasuperset): New functions.
+ (dfacomp): Create and analyze the superset.
+ (dfafree): Free only non-NULL items.
+ (dfaalloc): Initialize superset member.
+ (dfaoptimize): If optimization succeeds in a UTF-8 locale, don't use
+ the superset.
+ * src/dfa.h (dfahint): New decl.
+ * src/dfasearch.c (EGexecute): Use dfahint.
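
The token substitution described in this entry can be sketched as follows. This is an illustration only, not grep's C code: the token names are taken from the text above, and the helper name `superset` and the list-of-strings representation are assumptions made for the example.

```python
# Token names come from the ChangeLog text; dfa.c represents tokens
# differently, so this list-of-strings form is purely illustrative.
WIDENED = {"ANYCHAR", "MBCSET", "BACKREF"}


def superset(tokens):
    """Return the superset form of a postfix token list.

    Each ANYCHAR, MBCSET or BACKREF token is replaced by the pair
    CSET STAR, where the CSET is understood to have all bits set.
    """
    out = []
    for tok in tokens:
        if tok in WIDENED:
            out.extend(["CSET", "STAR"])
        else:
            out.append(tok)
    return out


original = "a b CAT c CAT BACKREF CAT".split()
print(" ".join(superset(original)))
```

Because the superset accepts every string the original DFA accepts, a line rejected by the superset can be skipped without running the slower matcher.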
+
+2014-04-06 Jim Meyering <meyering@fb.com>
+
+ build: avoid OS X 10.8.5 build failure due to lack of static_assert
+ * bootstrap.conf (gnulib_modules): Add assert-h, to accommodate the
+ new use of static_assert on systems lacking support for that construct.
+ Without this change, compilation of dfa.c failed on OS X 10.8.5 with
+ gcc-4.9.0 20140324. We should be using gnulib's assert-h module,
+ regardless, for its nominal improved portability, since grep includes
+ assert.h and uses assert.
+
+2014-04-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: fix performance bug with regex in line-by-line mode
+ * src/dfasearch.c (EGexecute): Match line-by-line with regex.
+
+2014-04-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor improvements to previous patch
+ * src/dfa.c (MAX): New macro.
+ (match_anychar, match_mb_charset, transit_state_consume_1char):
+ Use it to simplify assignments.
+ (SKIP_REMAINS_MB_IF_INITIAL_STATE): Prefer != 0 for unsigned.
+ (free_mbdata): Omit an unnecessary 'free'.
+
+2014-04-05 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: reuse multibyte DFA buffers in non-UTF8 locales
+ * src/dfa.c (struct dfa): New members 'mblen_buf', 'nmblen_buf',
+ 'inputwcs', 'ninputwcs', 'mb_follows' and 'mb_match_lens'.
+ (mblen_buf, inputwcs): Remove static vars.
+ (SKIP_REMAINS_MB_IF_INITIAL_STATE, match_anychar, match_mb_charset)
+ (transit_state_consume_1char, transit_state, prepare_wc_buf):
+ Use new members instead of global variables.
+ (check_matching_with_multibyte_ops): Use new members
+ instead of new allocation.
+ (dfaexec): Initialize new members.
+ (free_mbdata): Free new members.
+
+2014-04-05 Paul Eggert <eggert@penguin.cs.ucla.edu>
+
+ grep: simplify dfa.c by having it not include mbsupport.h directly
+ * src/mbsupport.h: Remove.
+ * src/Makefile.am (noinst_HEADERS): Remove mbsupport.h.
+ * src/dfa.c, src/grep.c, src/search.h: Don't include mbsupport.h.
+ * src/dfa.c: Include wchar.h and wctype.h unconditionally, as
+ this simplifies the use of dfa.c in grep, and it does no harm
+ in gawk.
+ (setlocale, static_assert): Remove gawk-specific hacks, as
+ gawk now does these itself.
+ (struct dfa, dfambcache, mbs_to_wchar)
+ (is_valid_unibyte_character, setbit_wc, using_utf8, FETCH_WC)
+ (addtok_wc, add_utf8_anychar, atom, state_index, epsclosure)
+ (dfaanalyze, dfastate, prepare_wc_buf, dfaoptimize, dfafree, dfamust):
+ * src/dfasearch.c (EGexecute):
+ * src/grep.c (main):
+ * src/searchutils.c (mbtoupper):
+ Assume MBS_SUPPORT.
+
+2014-04-01 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: avoid re-building a state built previously
+ * src/dfa.c (dfaexec): Avoid rebuilding a previously built state.
+
+2014-03-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: improve port to freestanding DJGPP
+ Suggested by Aharon Robbins (Bug#17056).
+ * src/dfa.c (setlocale) [!LC_ALL]: Return NULL, not "C",
+ reverting part of a recent change.
+ (using_simple_locale): Return true if setlocale returns null.
+
+2014-03-28 Jim Meyering <meyering@fb.com>
+
+ tests: placate "make syntax-check" re compare arg ordering
+ * tests/euc-mb: Reverse order of arguments to compare.
+ Be consistent in ordering compare arguments: expected followed
+ by actual.
+
+2014-03-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: avoid an indirection and port wint_t usage
+ * src/dfa.c (struct dfa): Put mbrtowc_cache directly into struct dfa
+ rather than having a pointer; this saves a malloc and an indirection.
+ All uses changed.
+ (dfambcache): Port to hosts where wint_t * can't be cast to wchar_t *.
+
+2014-03-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: move mbrtowc_cache into new member of struct dfa
+ When more than one struct dfa is in use at the same time, their mbrtowc
+ caches may conflict. So, move mbrtowc_cache into a new member of
+ struct dfa, so that each struct dfa has its own mbrtowc cache.
+
+ * src/dfa.c (struct dfa): New member `mbrtowc_cache'.
+ (dfambcache): Rename from build_mbrtowc_cache. Add dependency on struct dfa.
+ (mbs_to_wchar): Add dependency on struct dfa.
+ (FETCH_WC): Use it.
+ (prepare_wc_buf): Use it. Add dependency on struct dfa.
+ (dfacomp): Call it.
+ (dfafree): Release it.
+
+2014-03-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: cache results of mbrtowc for speed
+ Idea suggested by Norihiro Tanaka in Bug#16842.
+ * src/dfa.c (mbrtowc_cache): New static var.
+ (build_mbrtowc_cache, mbs_to_wchar): New functions.
+ (FETCH_WC) [MBS_SUPPORT]: Speed up by using mbs_to_wchar
+ instead of mbrtowc and wctob.
+ (FETCH_WC) [!MBS_SUPPORT]: Rewrite in terms of old FETCH macro.
+ (FETCH): Remove; no longer used.
+ (lex): Simplify by avoiding the need for FETCH.
+ (prepare_wc_buf) [MBS_SUPPORT]: Speed up by using mbs_to_wchar.
+ Simplify the loop.
+ (dfacomp): Initialize the cache.
+
+2014-03-27 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: perform the kwset-helping DFA match in narrower range
+ When kwsexec gives us the offset of a potential match, we compute
+ line begin/end and then run the DFA matcher to see if there really
+ is a match on that line. When the beginning of the line, BEG, is
+ not on a multibyte character boundary, advance BEG until it is on such
+ a boundary, before running the DFA search.
+ * src/dfasearch.c (EGexecute): As above. Add a comment.
+ * tests/euc-mb: Add a test case that exercises this code.
+ This addresses http://debbugs.gnu.org/17095.
+
+2014-03-26 Jim Meyering <meyering@fb.com>
+
+ maint: fix "make dist"
+ * src/Makefile.am (egrep fgrep): Specify egrep.sh via
+ $(srcdir)/egrep.sh, so non-srcdir builds work once again.
+
+2014-03-26 Paul Eggert <eggert@penguin.cs.ucla.edu>
+
+ dfa: improve port to freestanding DJGPP
+ * src/dfa.c (setlocale) [!LC_ALL]: Return "C", not NULL (Bug#17056).
+ (using_simple_locale): Store setlocale result in a ptr-to-const.
+
+ egrep, fgrep: improve diagnostics from shell scripts
+ This should fix Bug#17098.
+ * src/Makefile.am (EXTRA_DIST): Add egrep.sh.
+ (egrep fgrep): Depend on egrep.sh and Makefile.
+ Build from new file egrep.sh, as this makes the build process
+ easier to follow. Arrange for $0 to look nicer in subgrep.
+ * src/egrep.sh: New file.
+
+2014-03-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: avoid undefined behavior
+ * src/dfa.c (FETCH_WC, addtok_wc): Don't rely on undefined behavior
+ when converting an out-of-range value to 'int'.
+ (FETCH_WC, prepare_wc_buf): Don't rely on conversion state after
+ mbrtowc returns a special value, as it's undefined for (size_t) -1.
+ (prepare_wc_buf): Simplify test for valid character.
+
+ grep: fix and simplify grep -iF optimization
+ * src/grep.c (check_any_alphabets): Remove.
+ (fgrep_to_grep_pattern): Fix problems when mbrtowc returns -1 or -2.
+ Simplify a bit.
+ (main): Don't bother optimizing 'grep -iF PAT' when PAT contains no
+ alphabetics; it's so rare it's not worth the complexity.
+
+2014-03-23 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: optimize fgrep by switching its matcher to the grep matcher
+ The fgrep matcher uses only the kwset engine. However, that engine is
+ very slow for case-insensitive matching in multibyte locales.
+
+ So, if the matcher is fgrep, matching is case-insensitive, and the keys
+ include any alphabetic characters, switch to the grep matcher by escaping
+ the keys. OTOH, if the keys include no alphabetic characters, turn the
+ match_icase flag off.
+
+ I prepared the following input to measure the performance.
+
+ yes $(printf '%078dm' 0)| head -1000000 | tr 0 a > in
+ A=`printf '\xef\xbc\xa1'` # FULLWIDTH LATIN CAPITAL LETTER A
+
+ I run three tests with this patch (best-of-5 trials):
+
+ env LC_ALL=en_US.UTF-8 time -p src/fgrep -i "$A" in
+ real 8.54 user 7.13 sys 1.16
+
+ Back out that commit (temporarily), recompile, and rerun the experiment:
+
+ env LC_ALL=en_US.UTF-8 time -p src/fgrep -i "$A" in
+ real 0.07 user 0.02 sys 0.05
+
+ * src/fgrep.c (Gcompile): New function.
+ * src/main.c (check_any_alphabets): New function.
+ (fgrep_to_grep_pattern): New function.
+ (main): Use them.
+
+2014-03-23 Paul Eggert <eggert@cs.ucla.edu>
+
+ egrep, fgrep: go back to shell scripts
+ Although egrep's and fgrep's switch from shell scripts to
+ executables may have made sense in 2005, it complicated
+ maintenance and recently has caused subtle performance bugs.
+ Go back to the old way of doing things, as it's simpler and more
+ easily separated from the mainstream implementation. This should
+ be good enough nowadays, as POSIX has withdrawn egrep/fgrep and
+ portable applications should be using -E/-F anyway.
+ * po/POTFILES.in: Remove src/egrep.c, src/fgrep.c, src/main.c.
+ * src/Makefile.am (bin_PROGRAMS): Remove egrep, fgrep.
+ (bin_SCRIPTS): New macro.
+ (grep_SOURCES): Move searchutils.c, dfa.c, dfasearch.c, kwset.c,
+ kwsearch.c, pcresearch.c here from libgrep_a_SOURCES.
+ (egrep_SOURCES, fgrep_SOURCES, noinst_LIBRARIES, libgrep_a_SOURCES):
+ Remove.
+ (LDADD): Remove libgrep.a.
+ (egrep, fgrep): New rules.
+ (CLEANFILES): New macro.
+ * src/grep.c: Rename from src/main.c.
+ (usage, setmatcher, main):
+ Simplify, since there's now just one executable.
+ (Gcompile, Ecompile, Acompile, GAcompile, PAcompile, matchers):
+ Move here from the (removed) src/grep.c.
+ (compile_fp_t, execute_fp_t, struct matcher, matchers):
+ Move here from src/grep.h, as they no longer need to be public.
+ (struct matcher.name): Avoid one level of indirection/relocation.
+ (do_execute, main): Fix a performance bug when it was compiled
+ as 'fgrep', due to confusion about which matcher was which.
+ (main): Fix a performance bug with -P, likewise.
+ * src/grep.h (before_options, after_options): Remove.
+ * src/egrep.c, src/fgrep.c, src/grep.c: Remove.
+
+ dfa: port to freestanding DJGPP (Bug#17056)
+ * src/dfa.c (setlocale) [!LC_ALL]: Define a dummy.
+
+2014-03-16 Jim Meyering <meyering@fb.com>
+
+ tests: avoid false-positive failure on some AMD CPUs
+ * tests/mb-non-UTF8-performance: Avoid false-positive failure
+ when run on certain AMD processors.
+
+2014-03-10 Jim Meyering <meyering@fb.com>
+
+ tests: make a performance-measuring test less system-sensitive
+ Andreas Schwab reported in http://debbugs.gnu.org/16941
+ that this test would time out and fail on m68k-suse-linux.
+ Rather than testing absolute duration with a limit tuned
+ to today's hardware, compare performance of grep with LC_ALL=C
+ against that same command using LC_ALL=ja_JP.eucJP.
+ * tests/init.cfg (require_hi_res_time_): New function.
+ * tests/mb-non-UTF8-performance: Rewrite to use it:
+ record absolute duration D of the first (normally much faster)
+ command, and set a timeout of 8*D for the command running in
+ an affected locale.
+
+2014-03-09 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: pacify 'make dist'
+ * src/dfa.c (parse_bracket_exp): Reindent with spaces.
+ * src/dfa.h (case_folded_counterparts): Prefix decl with 'extern'.
+ * src/main.c: Don't include assert.h.
+
+2014-03-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ fgrep: fix case-fold incompatibility with plain 'grep'
+ fgrep converted to lowercase, whereas the regex code converted
+ to uppercase. The resulting behaviors don't agree in offbeat
+ cases like Greek sigmas and Turkish Is. Fix this by changing
+ fgrep to agree with the regex code.
+ * src/kwsearch.c (Fcompile, Fexecute):
+ * src/searchutils.c (kwsinit, mbtoupper):
+ Convert to uppercase, not to lowercase, for compatibility with
+ plain 'grep'.
+ * src/search.h, src/searchutils.c (mbtoupper):
+ Rename from mbtolower, since it now converts to uppercase.
+ All uses changed.
+ * tests/case-fold-titlecase: Add tests for this.
+
+ grep: fix case-fold mismatches between DFA and regex
+ The DFA code and the regex code didn't use the same semantics for
+ case-folding. The regex code says that the data char d matches
+ the pattern char p if uc (d) == uc (p). POSIX is unclear in this
+ area; the simplest fix for now is to change the DFA code to agree
+ with the regex code. See <http://bugs.gnu.org/16919>.
+ * src/dfa.c (static_assert): New macro, if not already defined.
+ (setbit_case_fold_c): Assume MB_CUR_MAX is 1 and that case_fold
+ is nonzero; all callers changed.
+ (setbit_case_fold_c, parse_bracket_exp, lex, atom):
+ Case-fold like the regex code does.
+ (lonesome_lower): New constant.
+ (case_folded_counterparts): New function.
+ (parse_bracket_exp): Prefer plain setbit when case-folding is
+ not needed.
+ * src/dfa.h (CASE_FOLDED_BUFSIZE): New constant.
+ (case_folded_counterparts): New function decl.
+ * src/main.c (trivial_case_ignore): Case-fold like the regex code does.
+ (main): Try to improve comment re trivial_case_ignore.
+ * tests/case-fold-titlecase: Add lots more test cases.
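
The rule stated above (data char d matches pattern char p if uc (d) == uc (p)) can be sketched as follows, together with the lowercase-based alternative that fgrep used before the companion fix. This is an illustration only: the helper names are hypothetical, and Python's locale-independent `str.upper`/`str.lower` stand in for the C library's locale-dependent towupper/towlower, so results in a real locale may differ.

```python
def matches_uc(d, p):
    """Case-fold by uppercasing both sides, as the regex code is described to do."""
    return d.upper() == p.upper()


def matches_lc(d, p):
    """Case-fold by lowercasing both sides, as fgrep formerly did."""
    return d.lower() == p.lower()


FINAL_SIGMA = "\u03c2"  # GREEK SMALL LETTER FINAL SIGMA
SIGMA = "\u03c3"        # GREEK SMALL LETTER SIGMA
DOTLESS_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I

for d, p in [(FINAL_SIGMA, SIGMA), (DOTLESS_I, "i"), ("a", "A")]:
    print(repr(d), repr(p), matches_uc(d, p), matches_lc(d, p))
```

The two Greek sigmas share one uppercase form but are distinct lowercase letters, so the two folding directions disagree on them; the same holds for dotless i and plain i under Python's Unicode tables.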
+
+2014-03-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+ doc: do not overpromise --ignore-case's behavior
+ * NEWS: Omit vague statement about titlecase that could be
+ misinterpreted, and is more trouble than it's worth.
+ * doc/grep.texi: Add @documentencoding. Fix copyright range to
+ use endash not hyphen.
+ (Matching Control): Do not overpromise what --ignore-case will do.
+ Give examples of corner cases where the documentation does not
+ specify behavior.
+
+2014-03-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: remove differences from gnulib regex code
+ These don't seem to be needed with GCC 4.8.2, and are making
+ maintenance harder. If we need to disable warnings with older
+ compilers, we can add pragmas to the gnulib versions. See
+ <http://bugs.gnu.org/16911#24>.
+ * gl/lib/regcomp.c.diff, gl/lib/regex_internal.c.diff:
+ * gl/lib/regex_internal.h.diff, gl/lib/regexec.c.diff:
+ Remove.
+ * cfg.mk (exclude_file_name_regexp--sc_prohibit_tab_based_indentation):
+ Don't mention gl/* files.
+
+2014-03-03 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix comment
+ * src/main.c (trivial_case_ignore): Fix comment typo.
+
+2014-03-03 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: avoid adding the same character to a bracket expression
+ * src/main.c (trivial_case_ignore): Add the uppercase and/or lowercase
+ counterpart to the new pattern only when it differs from the original
+ character.
+
+2014-03-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix some unlikely bugs in trivial_case_ignore
+ * src/main.c (MBRTOWC, WCRTOMB): Reformat as per usual GNU style.
+ (trivial_case_ignore): Don't overrun buffer in the unusual case
+ when a character has both lowercase and uppercase counterparts.
+ Don't rely on undefined behavior when assigning out-of-range value
+ to an 'int'. Simplify by avoiding unnecessary buffer copies.
+ Work even with shift encodings, by using mbsinit to
+ disable the optimization if we are not in the initial state
+ when we replace B by [BCD].
+
+2014-03-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: revert removal of trivial_case_ignore
+ Revive trivial_case_ignore function in order to be able to use kwset.
+
+ * src/main.c (MBRTOWC, WCRTOMB): New macros.
+ (trivial_case_ignore): New function.
+ (main): Use it.
+
+2014-03-02 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: optimization of bracket expression for non-UTF8 locales
+ * src/dfa.c (addtok): Replace an MBCSET with a CSET even in
+ non-UTF8 locales, and even when it has individual characters.
+
+2014-03-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: describe titlecase fix better
+ * NEWS: Document behavior on lowercase text too.
+ Suggested by Eric Blake in <http://bugs.gnu.org/16911#10>.
+ * doc/grep.texi (Matching Control): Specify behavior of -i
+ more precisely.
+
+2014-02-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: minor tuning for mb_case_map_apply
+ * src/kwsearch.c (mb_case_map_apply): Avoid unnecessary widening of
+ size_t to intmax_t. Avoid unnecessary reinitialization of k.
+
+ grep: avoid 'inline' when it doesn't matter
+ These days, compilers generally do just fine without advice from
+ users about 'inline', and there's little need for 'static inline',
+ just as there's little need for 'register'.
+ * src/dfa.c (to_uchar):
+ * src/dosbuf.c (guess_type, undossify_input, dossified_pos):
+ * src/main.c (undossify_input):
+ No longer inline.
+ * src/search.h (mb_case_map_apply): Move from here ...
+ * src/kwsearch.c (mb_case_map_apply): ... to here, and
+ make it no longer 'inline'.
+
+ grep: fix bugs with -i and titlecase
+ * NEWS: Document this.
+ * src/dfa.c (setbit_wc): Simplify.
+ (setbit_c): Remove; no longer used.
+ (setbit_case_fold_c, parse_bracket_exp, atom):
+ Don't mishandle titlecase. For 'atom', this removes the need for
+ the refactoring of Bug#16729.
+ (lex): Use the slower approach only for letters that have a
+ differing case.
+ * tests/case-fold-titlecase: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+ grep: remove lint
+ * src/main.c (MBRTOWC, WCRTOMB): Remove no-longer-used macros.
+
+2014-02-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: remove trivial_case_ignore
+ * src/main.c (trivial_case_ignore): Remove.
+ (main): Remove its use; this optimization is no longer needed.
+
+ grep: don't match line-by-line for case-insensitive with grep and awk
+ * src/main.c (matcher): Move decl up.
+ (do_execute): With the grep or awk matchers,
+ no need to match line by line.
+
+2014-02-27 Jim Meyering <meyering@fb.com>
+
+ maint: dfa: pass NULL, not 0, as 2nd arg to setlocale
+ * src/dfa.c (using_simple_locale): Use NULL, not 0.
+
+2014-02-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ * src/dfa.c (prednames): POSIX allows [[:xdigit:]] to match multibyte chars.
+
+ * src/dfa.c (parse_bracket_exp): Parenthesize.
+
+ grep: fix multiple bugs with bracket expressions
+ * NEWS: Document this.
+ * src/dfa.c (using_simple_locale): New function.
+ (parse_bracket_exp): Handle bracket expressions like [a-[.z.]]
+ correctly. Don't assume that dfaexec handles expressions like
+ [^a-z] correctly, as they can match multiple characters in some
+ locales.
+ * tests/posix-bracket: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+2014-02-25 Stephane Chazelas <stephane.chazelas@gmail.com>
+
+ align grep -Pw with grep -w
+ For the -w option, with -P, we used to look for the pattern surrounded by
+ word boundaries. That's different from what grep -w does and what the
+ documentation describes. Now align with grep -w and the documentation by
+ using PCRE look-behind and look-ahead operators to match the pattern if
+ it is not surrounded by word constituents.
+ * src/pcresearch.c (Pcompile): Use (?<!\w)(?:...)(?!\w) rather than
+ \b(?:...)\b.
+ * NEWS (Bug fixes): Mention it.
+ * tests/pcre-w: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ This complements the fix for http://debbugs.gnu.org/16865
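
The difference between the two wrappers can be seen with a pattern that starts with a non-word character. The sketch below uses Python's `re` module as a stand-in for PCRE (an assumption: both support these particular operators, but they are different engines), and the helper names are hypothetical.

```python
import re


def wrap_old(pat):
    """Former -Pw wrapper: require word boundaries around the pattern."""
    return r"\b(?:" + pat + r")\b"


def wrap_new(pat):
    """Current -Pw wrapper: require no word constituent on either side."""
    return r"(?<!\w)(?:" + pat + r")(?!\w)"


for text in ["a @foo", "x@foo"]:
    old = re.search(wrap_old("@foo"), text) is not None
    new = re.search(wrap_new("@foo"), text) is not None
    print(repr(text), "old:", old, "new:", new)
```

With `\b`, "a @foo" is rejected because there is no word boundary between the space and "@", while "x@foo" is accepted even though the match is preceded by a word constituent. The look-around form reverses both results.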
+
+2014-02-24 Stephane Chazelas <stephane.chazelas@gmail.com>
+
+ grep -P: fix it so backreferences now work with -w and -x
+ To implement -w and -x, we bracket the search term with parentheses.
+ However, that set of parentheses had the default semantics of
+ "capturing", i.e., creating a backreferenceable matched quantity.
+ Instead, use (?:...), to create a non-capturing group.
+ * src/pcresearch.c (Pcompile): Use (?:...) rather than (...).
+ * NEWS (Bug fixes): Mention it.
+ * tests/pcre-wx-backref: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ This addresses http://debbugs.gnu.org/16865
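
The effect of the extra capturing group on group numbering can be sketched as follows. Python's `re` module stands in for PCRE here (an assumption; the engines differ in other respects), and the helper names are hypothetical.

```python
import re


def wrap_capturing(pat):
    """Former wrapper: the added parentheses become group 1."""
    return "(" + pat + ")"


def wrap_noncapturing(pat):
    """Current wrapper: the added group does not shift group numbers."""
    return "(?:" + pat + ")"


user_pattern = r"(a)b"
shifted = re.fullmatch(wrap_capturing(user_pattern), "ab")
intact = re.fullmatch(wrap_noncapturing(user_pattern), "ab")
print(shifted.group(1))  # the whole wrapped match
print(intact.group(1))   # the user's own first group

# With the non-capturing wrapper, \1 refers to the user's group again.
print(re.fullmatch(wrap_noncapturing(r"(a)\1"), "aa") is not None)
```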
+
+2014-02-20 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.18
+ * NEWS: Record release date.
+
+ tests: test for the non-UTF8 multi-byte performance regression
+ Test for the just-fixed performance regression.
+ With a 100-200x differential, it is reasonable to expect that
+ a very slow system will be able to complete the designated
+ task in a few seconds, while with the bug, even a very fast
+ system would exceed the timeout.
+ * tests/mb-non-UTF8-performance: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * tests/init.cfg (require_JP_EUC_locale_): New function.
+
+ grep -i: avoid a performance regression in multibyte non-UTF8 locales
+ * src/main.c: Include dfa.h.
+ (trivial_case_ignore): Perform this optimization only for UTF8 locales.
+ This rectifies a 100-200x performance regression in non-UTF8 multi-byte
+ locales like ja_JP.eucJP. The regression was introduced by the 10x
+ UTF8/grep-i speedup, commit v2.16-4-g97318f5.
+ * NEWS (Bug fixes): Mention it.
+ Reported by Norihiro Tanaka in http://debbugs.gnu.org/16232#50
+
+ maint: give dfa.c's using_utf8 function external scope
+ * src/dfa.c (using_utf8): Remove "static inline".
+ * src/dfa.h (using_utf8): Declare it.
+ * src/searchutils.c (is_mb_middle): Use using_utf8 rather than
+ rolling our own.
+
+2014-02-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: test [^^-^] in unibyte locales
+ This is a bug in the current dfa.c, which was reintroduced by the
+ recent reversion from RRI.
+ * tests/unibyte-negated-circumflex: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * tests/init.cfg (require_unibyte_locale): New function.
+
+ grep: fix bug with patterns like [^^-~] in unibyte locales
+ * NEWS: Document this.
+ * src/dfa.c (parse_bracket_exp): Escape patterns like [^^-~], or
+ Awk patterns like [\^-\]], so that they are not misinterpreted by
+ the system regex library. Check for system regex failure due to
+ memory exhaustion.
+
+2014-02-17 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.17
+ * NEWS: Record release date.
+
+2014-02-17 Paolo Bonzini <bonzini@gnu.org>
+
+ revert "grep: DFA now uses rational ranges in unibyte locales"
+ The correct course of action for grep is to defer range interpretation
+ to regex, because otherwise you can get mismatches between regexes with
+ backreferences and those without.
+
+ For example, [A-Z]. will use RRI but ([A-Z])\1 won't, with the confusing
+ result that the first regex won't match a superset of the language
+ described by the second regex.
+
+ The source of the confusion is that, even though grep's dfa.c was changed
+ to use range checking instead of strcoll, that code is only invoked if
+ dfaexec is called with backref = NULL, and that never happens for grep!
+
+ In the end, all that's needed for RRI is compiling --with-included-regex,
+ and in that case the patch is almost a no-op. Almost, because there
+ are corner cases that aren't handled correctly (e.g. [a-[.e.]], or
+ regular expressions that include a NUL character), but this can be
+ handled separately.
+
+ * NEWS: Revert paragraph introduced by commit v2.16-7-g1078b64.
+ * src/dfa.c (parse_bracket_exp): Revert back to regcomp/regexec.
+
+2014-02-16 Mike Frysinger <vapier@gentoo.org>
+
+ maint: ignore configure.lineno
+ * .gitignore: Add configure.lineno.
+
+2014-02-11 Benno Schulenberg <bensberg@justemail.net>
+
+ help: remove surplus newline
+ * src/main.c (usage): Remove inconsistent \n introduced by previous
+ patch.
+
+2014-02-10 Benno Schulenberg <bensberg@justemail.net>
+
+ help: fix a line ending, and use the same word for similar things
+ * src/main.c (usage): Change a stray 'n' to a newline, and use
+ the word "display" for showing version info as for help text.
+
+2014-02-09 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ speed up mb-boundary-detection after each preliminary match
+ After each kwsexec or dfaexec match, we must determine whether
+ the tentative match falls in the middle of a multi-byte character.
+ That is what our is_mb_middle function does, but it was expensive,
+ even when most input consisted of single-byte characters. The main
+ cost was for each call to mbrlen. This change constructs and uses
+ a cache of the lengths returned by mbrlen for unibyte values.
+ The largest speed-up (3x to 7x, CPU-dependent) is when most
+ lines contain a match, yet few are printed, e.g., when using
+ grep -v common-pattern ... to filter out all but a few lines.
+
+ * src/search.h (build_mbclen_cache): Declare it.
+ * src/main.c: Include "search.h".
+ [MBS_SUPPORT] (main): Call build_mbclen_cache in a multibyte locale.
+ * src/searchutils.c [HAVE_LANGINFO_CODESET]: Include <langinfo.h>.
+ (mbclen_cache): New global.
+ (build_mbclen_cache): New function.
+ (is_mb_middle) [HAVE_LANGINFO_CODESET]: Use it.
+ * NEWS (Improvements): Mention it.
+
+2014-02-01 Jim Meyering <meyering@fb.com>
+
+ maint: use to_uchar function rather than explicit casts
+ * src/system.h (to_uchar): Define function.
+ * src/kwsearch.c (Fexecute): Use to_uchar twice in place of casts.
+ * src/dfasearch.c (EGexecute): Likewise.
+ * src/main.c (prepend_args): Likewise.
+ * src/kwset.c (U): Define in terms of to_uchar.
+ * src/dfa.c (match_mb_charset): Use to_uchar, not an explicit cast.
+
+2014-01-27 Jim Meyering <meyering@fb.com>
+
+ maint: remove vestiges of support for long-disabled --mmap option
+ This option was disabled in March of 2010, and began to elicit a
+ warning in January of 2012. Its time has come.
+ * doc/grep.in.1: Remove mention.
+ * doc/grep.texi: Likewise.
+ * src/main.c (GROUP_SEPARATOR_OPTION, usage, MMAP_OPTION)
+ (long_options, main): Remove all traces.
+ * tests/Makefile.am (check_PROGRAMS): Remove mention of ignore-mmap.
+ * tests/ignore-mmap: Remove file.
+ * NEWS (Maintenance): Mention it.
+
+2014-01-26 Jim Meyering <meyering@fb.com>
+
+ maint: move two local variable declarations
+ * src/dfasearch.c (kwsmusts): Move one declaration down to the point
+ of definition. Move another into the sole scope where it is used.
+
+2014-01-26 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfasearch: skip kwset optimization when multi-byte+case-insensitive
+ Now that DFA searching works with multi-byte locales, the only remaining
+ reason to case-convert the searched input is the kwset optimization.
+ But multi-byte case-conversion is so expensive that it's not
+ worthwhile even to attempt that optimization.
+
+ * src/dfasearch.c (kwsmusts): Skip this function in ignore-case mode
+ when the locale is multi-byte.
+ (EGexecute): Now that this code need not handle multi-byte case-ignoring
+ matches, remove the expensive copy/case-conversion code.
+ With no case-converted buffer, there is no longer any need to call
+ mb_case_map_apply, so remove it and associated code.
+ (kwsincr_case): Remove function. Now, every use of this function
+ is equivalent to a use of kwsincr. Replace all uses.
+ * tests/turkish-eyes: Test all of -E, -F and -G.
+
+2014-01-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa: remove GREP-ifdef'd code in favor of code used by gawk
+ For many years, gawk and grep have used different #ifdef'd bits of
+ code relating to how the DFA matcher matches multibyte characters.
+ Remove the GREP-specific code in favor of the code gawk uses. This
+ permits us to avoid still more cases in which grep must resort to
+ the expensive process of copying/case-converting each input line
+ before matching against a case-converted regexp.
+ * src/dfa.c (parse_bracket_exp, atom): As above.
+
+2014-01-25 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+2014-01-17 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: DFA now uses rational ranges in unibyte locales
+ Problem reported by Aharon Robbins in <http://bugs.gnu.org/16481>.
+ * NEWS:
+ * doc/grep.texi (Environment Variables)
+ (Character Classes and Bracket Expressions):
+ Document this.
+ * src/dfa.c (parse_bracket_exp): Treat unibyte locales like multibyte.
+
+2014-01-17 Aharon Robbins <arnold@skeeve.com>
+
+ grep: add undocumented '-X gawk' and '-X posixawk' options
+ See <http://bugs.gnu.org/16481>.
+ * src/grep.c (GAcompile, PAcompile): New functions.
+ (const): Use them.
+
+2014-01-10 Pádraig Brady <P@draigBrady.com>
+
+ tests: remove superfluous uses of printf
+ * tests/turkish-eyes: Remove unnecessary uses of printf.
+
+2014-01-09 Jim Meyering <meyering@fb.com>
+
+ grep: make --ignore-case (-i) faster (sometimes 10x) in multibyte locales
+ These days, nearly everyone uses a multibyte locale, and grep is often
+ used with the --ignore-case (-i) option, but that option imposes a very
+ high cost in order to handle some unusual cases in just a few multibyte
+ locales. This change gets most of the performance of using LC_ALL=C
+ without eliminating the ability to search for multibyte strings.
+
+ With the following example, I see an 11x speed-up with a 2.3GHz i7:
+ Generate a 10M-line file, with each line consisting of 40 'j's:
+
+ yes jjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjjj | head -10000000 > k
+
+ Time searching it for the simple/nonexistent string "foobar",
+ first with this patch (best-of-5 trials):
+
+ LC_ALL=en_US.UTF-8 env time src/grep -i foobar k
+ 1.10 real 1.03 user 0.07 sys
+
+ Back out that commit (temporarily), recompile, and rerun the experiment:
+
+ git log -1 -p|patch -R -p1; make
+ LC_ALL=en_US.UTF-8 env time src/grep -i foobar k
+ 12.50 real 12.41 user 0.08 sys
+
+ The trick is to realize that for some search strings, it is easy
+ to convert to an equivalent one that is handled much more efficiently.
+ E.g., convert this command:
+
+ grep -i foobar k
+
+ to this:
+
+ grep '[fF][oO][oO][bB][aA][rR]' k
+
+ That allows the matcher to search in buffer mode, rather than having to
+ extract/case-convert/search each line separately. Currently, we perform
+ this conversion only when search strings contain neither '\' nor '['.
+ See the comments for more detail.
+
+ * src/main.c (trivial_case_ignore): New function.
+ (main): When possible, transform the regexp so we can drop the -i.
+ * tests/turkish-eyes: New file.
+ * tests/Makefile.am (TESTS): Use it.
+ * NEWS (Improvements): Mention it.
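
The pattern conversion described in this entry can be sketched as follows. This is not the C function from src/main.c: the Python helper reuses the name `trivial_case_ignore` only for readability, relies on Python's `str.lower`/`str.upper` instead of the locale's wide-character functions, and ignores multibyte encoding details (all assumptions of the example).

```python
def trivial_case_ignore(pattern):
    """Rewrite PATTERN so it can be matched without -i.

    Returns None when the pattern contains a backslash or '[', the cases
    the entry says are not converted. Otherwise each character that has
    a distinct single-character lower- or uppercase counterpart becomes
    a bracket expression listing its variants.
    """
    if "\\" in pattern or "[" in pattern:
        return None
    out = []
    for ch in pattern:
        variants = [ch]
        for alt in (ch.lower(), ch.upper()):
            # Skip multi-character case mappings and duplicates.
            if len(alt) == 1 and alt not in variants:
                variants.append(alt)
        out.append(ch if len(variants) == 1 else "[" + "".join(variants) + "]")
    return "".join(out)


print(trivial_case_ignore("foobar"))
```

Characters without a case counterpart, such as digits, pass through unchanged, so the rewritten pattern stays as short as possible.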
+
+2014-01-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: port Solaris 10 /bin/sh patch back to GNU/Linux
+ Problem reported by Jim Meyering.
+ * tests/bre, tests/ere, tests/spencer1-locale:
+ Prefer re_shell, not re_shell_.
+ * tests/init.sh (re_shell): New var, which is exported instead of
+ re_shell_.
+
+ Port to Solaris 10 /bin/sh.
+ Problem reported by Dagobert Michelsen in <http://bugs.gnu.org/16380>.
+ * tests/bre, tests/ere, tests/spencer1-locale:
+ Prefer re_shell_ to SHELL, if re_shell_ is set.
+ * tests/init.sh (re_shell_): Export if it's used.
+
+2014-01-01 Jim Meyering <meyering@fb.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.16
+ * NEWS: Record release date.
+
+ gnulib: update to latest, for maint.mk fix
+
+ maint: update copyright dates for 2014
+ Do that by running "make update-copyright".
+
+ gnulib: update to latest
+
+2013-12-31 Jim Meyering <meyering@fb.com>
+
+ pcre: use PCRE_NO_UTF8_CHECK properly
+ In order to obtain the behavior we want, i.e., to disable
+ error-on-invalid-UTF-in-input, apply this PCRE option in
+ pcre_exec, not when compiling.
+ * src/pcresearch.c (Pexecute): Use PCRE_NO_UTF8_CHECK here, ...
+ (Pcompile): ...rather than here.
+ * tests/pcre-invalid-utf8-input: Adjust test case to test for this.
+
+2013-12-26 Jim Meyering <meyering@fb.com>
+
+ maint: fix inconsistent spacing in expression
+ * src/main.c (prline): Fix inconsistent spacing in expression:
+ s/ / /.
+
+2013-12-26 behoffski <behoffski@grouse.com.au>
+
+ maint: fix a garbled comment
+ * src/dfa.c (XNMALLOC, etc.): Fix garbled comment wording.
+
+2013-12-23 Jim Meyering <meyering@fb.com>
+
+ maint: fix/improve a comment
+ * src/main.c (prline): Replace untrue FIXME comment with one
+ telling how the hard-to-reach code can be exercised.
+
+2013-12-21 Santiago Ruano Rincón <santiago@debian.org>
+
+ pcre: tell grep -P to relax its stance on invalid multibyte chars
+ Do not exit-2 for invalid UTF-8 characters. Just prior to this
+ change, this command would match no lines and fail like this:
+ $ printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?
+ grep: invalid UTF-8 byte sequence in input
+ 2
+ After this change, the same command matches both lines, and succeeds:
+ jM-^B$
+ j$
+ 0
+ * src/pcresearch.c (Pcompile): Use PCRE_NO_UTF8_CHECK, too, and
+ add a comment.
+ * tests/pcre-utf8: Add a test and a comment.
+ This change did not work with Debian unstable pcre-8.31-2
+ or with some 8.33 and 8.34-based versions, but does work with
+ Fedora 20's 8.33 and with a built-from-latest source library.
+ Based on a patch by Santiago Ruano Rincón.
+ See http://bugs.gnu.org/15758/
+
+2013-12-21 Jim Meyering <meyering@fb.com>
+
+ tests: avoid FP failure due to exhausted memory
+ * tests/long-line-vs-2GiB-read: Don't declare the test "failed"
+ when running out of memory. In that case, skip it.
+
+2013-12-18 Jim Meyering <meyering@fb.com>
+
+ maint: add comments and split some long lines
+ * src/main.c (do_execute): Add a comment.
+ Split some lines longer than 80 bytes.
+
+ pcre: avoid a nominal leak
+ * src/pcresearch.c (Pcompile)[HAVE_LIBPCRE && !PCRE_STUDY_JIT_COMPILE]:
+ We would leak "re" if built with HAVE_LIBPCRE but without
+ PCRE_STUDY_JIT_COMPILE. Move the free out one level.
+
+ maint: indent cpp directives to reflect nesting
+ * src/pcresearch.c: Insert spaces after a few "#", to indent
+ cpp directives to reflect their nesting.
+
+ grep: handle lines longer than INT_MAX on more systems
+ When trying to exercise some long-line-handling code, I ran these
+ commands:
+ $ dd bs=1 seek=2G of=big < /dev/null; grep -l x big; echo $?
+ grep: big: Invalid argument
+ 2
+ grep should not have issued that diagnostic, and it should
+ have exited with status 1, not 2. What happened?
+ grep read the 2GiB of NULs, doubled its buffer size,
+ copied the 2GiB into the new 4GiB buffer, and proceeded
+ to call "read" with a byte-count argument of 2^32.
+ On at least Darwin 12.5.0, that makes read fail with EINVAL.
+ The solution is to use gnulib's safe_read wrapper.
+ * src/main.c: Include "safe-read.h"
+ (fillbuf): Use safe_read, rather than bare read. The latter
+ cannot handle a read size of 2^32 on some systems.
+ * bootstrap.conf (gnulib_modules): Add safe-read.
+ * tests/long-line-vs-2GiB-read: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
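The idea behind wrapping read can be sketched as below. This is not gnulib's safe_read: the name capped_read and the READ_CAP value are assumptions chosen for illustration. The sketch only shows the two ingredients the entry relies on, namely never passing an oversized byte count to read and retrying when the call is interrupted.

```c
#include <errno.h>
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical upper bound on a single read request.  Any value that
   every supported kernel accepts would do; this one is an assumption.  */
enum { READ_CAP = INT_MAX >> 20 << 20 };

/* Read up to COUNT bytes from FD into BUF, like read, but clamp the
   request to READ_CAP and retry if the call is interrupted by a signal.
   Return the number of bytes read, 0 at EOF, or -1 with errno set.  */
static ssize_t
capped_read (int fd, void *buf, size_t count)
{
  if (count > (size_t) READ_CAP)
    count = READ_CAP;

  for (;;)
    {
      ssize_t n = read (fd, buf, count);
      if (n >= 0 || errno != EINTR)
        return n;
    }
}
```

A caller such as fillbuf already loops until it has enough data, so a short read caused by the clamp costs only an extra iteration.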
+
+2013-11-25 Jim Meyering <meyering@fb.com>
+
+ tests: port to non-GNU sed
+ * tests/multibyte-white-space (utf8_space_characters): The generation
+ of test inputs relied on GNU sed's interpretation of \<, but that is
+ not portable, and caused spurious test failures. Adjust the sed regexp
+ to work on all versions.
+ Reported by Karl Dubost in http://bugs.gnu.org/15953.
+
+2013-11-22 Jim Meyering <meyering@fb.com>
+
+ maint: minor cleanup: xmalloc+strcpy -> xmemdup
+ * src/main.c (main): Replace an xmalloc+strcpy combination
+ with an equivalent use of xmemdup.
+
+2013-11-21 Jim Meyering <meyering@fb.com>
+ Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: avoid undefined behavior of "1 << 31"
+ * src/dfa.c (charclass): Change type from "int" to "unsigned int".
+ (tstbit): Rather than shifting "1" left to form a mask, shift the
+ LHS bits to the right and use "1" as the mask. Also, return bool, rather
+ than "int".
+ (setbit, clrbit, dfastate): Don't shift "1" (aka (int)1) left by 31 bits.
+ Instead, use "1U" as the operand, to avoid undefined behavior.
+ Spotted by gcc's new -fsanitize=undefined.
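A minimal sketch of the bit operations after this change, assuming a character class stored as an array of unsigned int words. The function names follow the entry, but the bodies are an illustration and not copied from src/dfa.c.

```c
#include <limits.h>
#include <stdbool.h>

/* Number of bits in one word of the set.  */
enum { INTBITS = CHAR_BIT * sizeof (int) };

/* Test bit B of set C by shifting the word right and masking with 1,
   so no left shift is needed at all.  */
static bool
tstbit (unsigned int b, unsigned int const *c)
{
  return c[b / INTBITS] >> b % INTBITS & 1;
}

/* Set bit B of set C.  "1U << 31" is well defined, whereas "1 << 31"
   overflows a 32-bit signed int and is undefined behavior.  */
static void
setbit (unsigned int b, unsigned int *c)
{
  c[b / INTBITS] |= 1U << b % INTBITS;
}

/* Clear bit B of set C, again using an unsigned mask.  */
static void
clrbit (unsigned int b, unsigned int *c)
{
  c[b / INTBITS] &= ~(1U << b % INTBITS);
}
```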
+
+2013-11-02 Jim Meyering <meyering@fb.com>
+
+ grep: fix regression with -P vs. invalid UTF-8 input
+ * src/pcresearch.c (Pexecute): Don't abort upon unexpected
+ PCRE-specific error code. Explicitly handle PCRE_ERROR_BADUTF8,
+ and change the default to print a diagnostic including the unhandled
+ integer PCRE error code and exit with status 2.
+ * tests/pcre-invalid-utf8-input: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+ * THANKS: Update.
+ Reported by Dave Reisner in http://bugs.gnu.org/15758.
+
+ grep: fix regression involving \s and \S
+ Commit v2.14-40-g01ec90b made \s and \S work with multi-byte
+ characters, but it made it so any use like \s*, \s+, \s?, \s{3}
+ would malfunction in a multi-byte locale.
+ * src/dfa.c (lex): Also reset laststart.
+ * tests/backslash-s-and-repetition-operators: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+ * THANKS: Update.
+ Reported by Mirraz Mirraz in http://bugs.gnu.org/15773.
+
+2013-11-01 Jim Meyering <meyering@fb.com>
+
+ maint: NEWS: document a release-related bug fix
+ * NEWS (Bug fixes): Add an entry for a fix pulled from gnulib.
+
+2013-10-26 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib submodule to latest
+ This pulls in a gnulib fix for maint.mk that ensures the procedure
+ described in README-release actually does what we want. Before this
+ change, that procedure resulted in a grep-2.15 tarball that would
+ lead to a grep binary whose --version-reported version number was
+ 2.14.51... rather than the expected 2.15.
+
+ maint: avoid automake deprecation warning re ACLOCAL_AMFLAGS
+ * Makefile.am (ACLOCAL_AMFLAGS): Don't use this deprecated variable.
+ * configure.ac (AC_CONFIG_MACRO_DIRS): Use this instead.
+ (AUTOMAKE_OPTIONS): Require automake-1.12.
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.15
+ * NEWS: Record release date.
+
+2013-10-25 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: port to AIX
+ Problem reported by Pavel Kharitonov in <http://bugs.gnu.org/15690#68>.
+ * src/Makefile.am (LDADD): Add $(LIBTHREAD).
+
+ build: avoid duplicate -funit-at-a-time etc. options
+ * configure.ac (WERROR_CFLAGS): Don't add -fdiagnostics-show-option
+ and -funit-at-a-time, as Gnulib does that for us now, and we're
+ merely piling on duplicates.
+
+2013-10-24 Jim Meyering <meyering@fb.com>
+
+ tests: port more tests to bourne shells with hex-challenged printf
+ * tests/pcre-utf8: Convert the hex \xHH literals for the euro symbol
+ to octal \OOO.
+ * tests/turkish-I: Likewise for "I with dot".
+ * tests/turkish-I-without-dot: Likewise for another Turkish I: U+0131.
+
+ maint: clean up an ugly 'while' condition
+ * src/main.c (get_nondigit_option): Separate a slightly baroque
+ "while" expression into two separate statements, both inside the loop.
+
+2013-10-23 Jim Meyering <meyering@fb.com>
+
+ tests: port to bourne shells whose printf doesn't grok hex
+ Use octal escapes, not hex, in printf(1) format strings,
+ and in one case, use $AWK's printf so we can continue
+ to use the table of hex values.
+ * tests/char-class-multibyte: Use printf octal escapes, not hex,
+ for portability to shells like dash and Solaris 10's /bin/sh.
+ * tests/backslash-s-vs-invalid-multitype: Likewise.
+ * tests/surrogate-pair: Likewise.
+ * tests/unibyte-bracket-expr: Count in decimal and convert to octal.
+ * tests/multibyte-white-space (hex_printf): New function.
+ Use it in place of printf so we can retain the table of hex digits
+ without hitting the limitation of some bourne shells.
+ Reported by Paul Eggert in http://bugs.gnu.org/15690#11
+
+2013-10-21 Jim Meyering <meyering@fb.com>
+
+ gnulib: update to latest
+
+ maint: remove now-unused wcscoll module
+ * bootstrap.conf (gnulib_modules): Remove wcscoll; no longer used.
+
+2013-10-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: avoid chatter from Automake 1.14
+ * configure.ac (AM_INIT_AUTOMAKE): Add subdir-objects.
+
+ build: port shell pattern to Solaris 10
+ * configure.ac: Don't use unquoted '^' in a pattern, as this
+ breaks 'configure' on Solaris 10, whose /bin/sh complains about it,
+ which causes 'configure' to exit even before it finds a decent shell.
+ Unix 7th edition shell accepted '^' as an alias for '|'.
+
+ build: port to platforms that predefine _FORTIFY_SOURCE
+ Problem reported by Brenton Hoff (Bug#15663).
+ * configure.ac (_FORTIFY_SOURCE): Don't define if already defined.
+ This is what Emacs does.
+
+2013-10-20 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib submodule to latest
+
+2013-10-19 Jim Meyering <meyering@fb.com>
+
+ tests: extend the multibyte-white-space test
+ * tests/multibyte-white-space (utf8_space_characters): Add more
+ single-byte whitespace characters. Align RHS hex values and
+ make the sed substitution less rigid, to accommodate.
+ Also, ensure that grep '\S' exits with status 1.
+
+ maint: update bootstrap to latest from gnulib
+ * bootstrap: Update from gnulib.
+
+ maint: fix typo in NEWS
+ * NEWS: Fix/improve example commands in most recent entry.
+ The LC_ALL envvar setting goes before grep, not before printf.
+ Don't reference src/ in the second example command, and do specify
+ the locale.
+
+2013-10-09 Jim Meyering <meyering@fb.com>
+
+ tests: add a test for better coverage of some tricky code
+ * tests/spencer1.tests: Add a non-range bracket expression representing the
+ same regexp, to cover the alternate code path, the one that does not require
+ a regcomp/exec call to interpret the regexp.
+
+2013-10-01 Jim Meyering <meyering@fb.com>
+
+ tests: ensure neither \s nor \S matches an invalid multibyte character
+ * tests/backslash-S-vs-invalid-multitype: New file.
+ Prompted by the bug report from Roman at
+ http://savannah.gnu.org/bugs/?40009
+ * tests/Makefile.am (TESTS): Add it.
+
+ dfa: fix \s and \S to work for multibyte
+ * src/dfa.c (lex): In multibyte mode, we can't treat \s and \S as we do
+ in single-byte mode. Map them to [[:space:]] and [^[:space:]] respectively,
+ to make the DFA matcher use the regex-matcher for this term.
+ * tests/multibyte-white-space: New file. Test for the bug.
+ * tests/Makefile.am (TESTS): Add it.
+ This bug was introduced with the addition of DFA support
+ for \s and \S in commit v2.5.4-112-gf979ca0.
+
+2013-09-30 Jim Meyering <meyering@fb.com>
+
+ maint: change all references: s/POSIX\.2/POSIX/
+ There is no longer any point in referring to POSIX.N.
+ POSIX is sufficient.
+ * doc/grep.in.1: As above.
+ * src/main.c (main): Likewise.
+ * tests/file: Likewise.
+ * tests/options: Likewise.
+ * ChangeLog: Likewise.
+ * NEWS: Likewise.
+ * cfg.mk: Update, to match changed NEWS.
+ Inspired by Glenn Golden's suggestion in http://bugs.gnu.org/15486
+
+2013-09-22 Jim Meyering <meyering@fb.com>
+
+ dfa: remove dead disjunct
+ * src/dfa.c (parse_bracket_exp): Remove dead disjunct.
+ At that point, we know MB_CUR_MAX <= 1, so the test,
+ MB_CUR_MAX > 1 && ... is always false. Remove the disjunct.
+
+ maint: dfa: improve comments and formatting
+ * src/dfa.c (add_utf8_anychar): Correct wording/alignment of a comment.
+ (dfaexec): Add curly braces around multi-line while statement within
+ a "then" block.
+ (ANYCHAR): Clarify comment: "." does not match an invalid UTF8 character.
+ (parse_bracket_exp): Improve comment.
+
+2013-09-08 Jim Meyering <meyering@fb.com>
+
+ dfa: appease a static analyzer, and save 95 stack bytes
+ * src/dfa.c (MAX_BRACKET_STRING_LEN): Rename from BRACKET_BUFFER_SIZE
+ and decrease from 128 to 32.
+ (parse_bracket_exp): Add one byte more than MAX_BRACKET_STRING_LEN
+ to the length of "str" buffer, to avoid appearance that we may store
+ the trailing NUL beyond the end of buffer. A string of length 32
+ or greater is rejected by earlier processing, so would never reach
+ this code. Addresses http://bugs.gnu.org/15307
+
+2013-09-01 Corinna Vinschen <vinschen@redhat.com>
+
+ fix Cygwin UTF-16 surrogate-pair handling with -i
+ grep -i would segfault on systems using UTF-16-based wchar_t (Cygwin)
+ when converting an input string containing certain 4-byte UTF-8
+ sequences to lower case. The conversions to wchar_t and back to
+ a UTF-8 multibyte string did not take surrogate pairs into account.
+ * src/searchutils.c (mbtolower) [__CYGWIN__]: Detect and handle
+ surrogate pairs when converting.
+ * NEWS (Bug fixes): Mention it.
+ * tests/surrogate-pair: New test.
+ * tests/Makefile.am (TESTS): Add it.
+ Reported by: Jim Burwell
+
+2013-08-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: mention how to use the latest gnulib
+ * README-hacking: Steal some text from coreutils/README-hacking.
+
+2013-08-10 Jim Meyering <meyering@fb.com>
+
+ build: update gnulib-related code
+ * gnulib: Update submodule to latest.
+ * bootstrap: Update from gnulib.
+ * gl/lib/regex_internal.h.diff: Update to reflect gnulib changes.
+ * bootstrap.conf: Partial sync from coreutils.
+
+2013-08-09 Jim Meyering <meyering@fb.com>
+
+ tests: simplify and factor newest test
+ * tests/char-class-multibyte2: Simplify file names.
+ Factor out $e_acute, so that the grep argument representation
+ is ascii (though the value is still UTF8).
+
+ doc: NEWS: mention the DFA segfault fix
+ * NEWS (Bug fixes): List the DFA segfault fix.
+
+2013-07-05 Paul Eggert <eggert@cs.ucla.edu>
+
+ Redo comments and white space to better approach GNU style.
+
+2013-07-05 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: add testcase for previous change
+ * tests/Makefile.am (TESTS): add char-class-multibyte2.
+ * tests/char-class-multibyte2: New file.
+
+2013-07-05 Mike Haertel <mike@ducky.net>
+
+ dfa: fix multibyte character in brackets with repetition
+ Let FOO stand for any multibyte character (e.g., a CJK character) in the regexp.
+ It turns out the following much simpler regexp:
+ ([^.]*[FOO]){1,2}
+ is sufficient to cause the crash.
+
+ In the first step of its parsing, DFA transforms regexp from human
+ readable syntax into reverse-polish form. For regexps with repeat counts
+ of the form a{m,n}, it simply builds repeated copies of the representation
+ of a, with appropriate inserted CAT and QMARK operators. For the above
+ example with a regexp of the form a{1,2} it would build:
+
+ <RPN representation for a>
+ <RPN representation for a>
+ QMARK
+ CAT
+
+ When building repeated copies of RPN representations, additional
+ copies of the RPN representations are made by calling a function
+ copytoks() with arguments consisting of the start position and
+ length of the original copy.
+
+ The problem is that the current code for copytoks() is simply
+ incorrect. It operates by calling addtok() for each individual
+ token in the source range being copied. But, in the particular
+ case that the token being added is MBCSET, addtok():
+
+ (1) incorrectly assumes that the character set being added is the
+ most recent one (addtok has no argument to indicate which cset is
+ being added, so it just uses the latest one)
+
+ (2) attempts to do some token sequence expansion into more primitive
+ operators so things like [FOO] are matched efficiently.
+
+ Both of these assumptions are incorrect in the case that addtok()
+ is being called from copytoks(): (1) is simply not true, and
+ (2) is redundant--the expansion has already been done in the token sequence
+ being copied, so there is no need to do the expansion again.
+
+ The correct function to add exactly one token, without further expansion,
+ is addtok_mb(). So here is my proposed fix, which is that copytoks()
+ should never call addtok(), but instead directly call addtok_mb()
+ (which is what addtok() eventually calls).
+
+ * src/dfa.c (copytoks): Rewrite using addtok_mb directly.
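The repeat-count expansion described above can be sketched with a toy token buffer. Everything here is hypothetical (struct tokbuf, push, copy_range, repeat, and the TOK_* codes); it is not the dfa.c implementation. It only illustrates the point of the fix: copies of an already-expanded token range are appended verbatim, one token at a time, with no further expansion.

```c
#include <stddef.h>

/* Hypothetical operator codes; ordinary tokens are non-negative.  */
enum { TOK_CAT = -1, TOK_QMARK = -2 };

/* Toy fixed-size token buffer holding a regexp in reverse-polish form.  */
struct tokbuf
{
  int tok[64];
  size_t n;
};

/* Append exactly one token, with no expansion of any kind.  */
static void
push (struct tokbuf *b, int t)
{
  if (b->n < sizeof b->tok / sizeof b->tok[0])
    b->tok[b->n++] = t;
}

/* Append a verbatim copy of the LEN tokens starting at START.  */
static void
copy_range (struct tokbuf *b, size_t start, size_t len)
{
  for (size_t i = 0; i < len; i++)
    push (b, b->tok[start + i]);
}

/* The LEN tokens at START are the RPN form of "a".  Extend the buffer
   so that it represents a{MINREP,MAXREP}, for 1 <= MINREP <= MAXREP.  */
static void
repeat (struct tokbuf *b, size_t start, size_t len, int minrep, int maxrep)
{
  int i;
  /* Mandatory copies: a a CAT ...  */
  for (i = 1; i < minrep; i++)
    {
      copy_range (b, start, len);
      push (b, TOK_CAT);
    }
  /* Optional copies: ... a QMARK CAT  */
  for (; i < maxrep; i++)
    {
      copy_range (b, start, len);
      push (b, TOK_QMARK);
      push (b, TOK_CAT);
    }
}
```

For a single-token "a" and the counts {1,2}, this yields the sequence a, a, QMARK, CAT shown in the entry.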
+
+2013-05-28 Jim Meyering <meyering@fb.com>
+
+ maint: align backslashes consistently
+ * tests/Makefile.am: Most backslashes were aligned with TABs,
+ so adjust the few that used spaces to conform.
+
+ grep -F: avoid an infinite loop with invalid multi-byte search string
+ * src/kwsearch.c (Fexecute): Avoid an infinite loop when processing
+ a fixed (-F) multibyte search string that is an invalid byte sequence
+ in the current locale and that matches the bytes of the input twice
+ on a line. Reported by Daisuke GOTO in
+ http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4773
+ * tests/invalid-multibyte-infloop: New test.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+
+2013-04-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ * cfg.mk (old_NEWS_hash): Update.
+
+ doc: document EREs like a{,10}
+ Problem reported by Eric Blake in
+ <http://lists.gnu.org/archive/html/bug-grep/2013-04/msg00005.html>.
+ * NEWS: Document the bug fix.
+ * doc/grep.in.1: Restore documentation for this feature, but mention
+ that it is a GNU extension.
+ * doc/grep.texi (Fundamental Structure): Mention that this feature
+ is a GNU extension.
+
+2013-04-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: make dfa.c closer to Gawk's
+ * src/dfa.c: Include <stddef.h>, not <sys/types.h>.
+ stddef.h is smaller and is all we need and is portable nowadays.
+ Include <wchar.h> and <wctype.h> only if MBS_SUPPORT.
+
+2013-01-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: make dfa.h standalone
+ Problem reported by Aharon Robbins in
+ <http://lists.gnu.org/archive/html/bug-grep/2013-01/msg00007.html>.
+ * src/dfa.c: Include dfa.h first, so that it's tested standalone.
+ No need to include <regex.h>, since we are in charge of dfa.h and
+ know that it includes <regex.h>.
+ * src/dfa.h: Include <regex.h> and <stddef.h>, so that it's standalone.
+
+2013-01-11 Stefano Lattarini <stefano.lattarini@gmail.com>
+
+ build: update gettext version to 0.18.2
+ * configure.ac (AM_GNU_GETTEXT_VERSION): Update to 0.18.2.
+ This is necessary to have the gettext-provided m4 files to use
+ AC_PROG_MKDIR_P rather than AM_PROG_MKDIR_P. This latter macro,
+ planned to disappear in Automake 1.14, has already been removed
+ in the development version of Automake, so that, without this
+ change, grep fails to bootstrap with bleeding-edge Automake.
+
+2013-01-11 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: update gnulib submodule to latest
+
+2013-01-11 Stefano Lattarini <stefano.lattarini@gmail.com>
+
+ build: remove redundant use of $(INCLUDES)
+ * lib/Makefile.am (INCLUDES): Remove. Automake automatically adds
+ $(srcdir) and $(top_builddir) to the C preprocessor search path.
+ INCLUDES is deprecated in Automake 1.13 (causing a runtime
+ warning), and will be removed in Automake 1.14.
+
+2013-01-04 Jim Meyering <jim@meyering.net>
+
+ build: update gnulib submodule to latest
+
+ maint: update all copyright year number ranges
+ Run "make update-copyright".
+
+2012-11-20 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: normalize diagnostics
+ * src/pcresearch.c (Pcompile): Use diagnostics in a format similar
+ to those used elsewhere, and translate them.
+
+2012-11-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: diagnose read errors from -f dir, porting to Solaris
+ Problem reported by Dennis Clarke for Solaris 10 in
+ <http://lists.gnu.org/archive/html/bug-grep/2012-11/msg00009.html>.
+ * src/main.c (main): For -f F, diagnose any read errors
+ encountered when reading F.
+ * tests/Makefile.am (XFAIL_TESTS): Remove grep-dir.
+ * tests/grep-dir: Don't assume that directories cannot be read
+ via fread, as POSIX allows this and it can happen on Solaris.
+
+2012-11-09 Paolo Bonzini <bonzini@gnu.org>
+
+ pcre: add PCRE-JIT support for grep
+ * NEWS: Document new feature.
+ * src/pcresearch.c [PCRE_STUDY_JIT_COMPILE] (jit_stack): New.
+ [PCRE_STUDY_JIT_COMPILE] (Pcompile): JIT-compile the regular expression
+ and allocate a stack for it. Based on a patch from Zoltan Herczeg.
+ * THANKS: Add Zoltan to the list.
+
+2012-10-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ build: go back to AC_PROG_CC
+ * configure.ac: Go back to using AC_PROG_CC rather than AC_PROG_CC_STDC,
+ as the latter is obsolescent and the Autoconf bug involving the former
+ has been fixed.
+
+2012-10-24 Jim Meyering <jim@meyering.net>
+
+ build: use AC_PROG_CC_STDC rather than AC_PROG_CC
+ * configure.ac: Use AC_PROG_CC_STDC rather than AC_PROG_CC,
+ to accommodate autoconf-2.69-37+.
+
+ build: update gnulib submodule to latest
+
+2012-10-23 Eric Blake <eblake@redhat.com>
+
+ build: default to --enable-gcc-warnings in a git tree
+ Anyone building from cloned sources can be assumed to have a new
+ enough environment, such that enabling gcc warnings by default will
+ be useful. Tarballs still default to no warnings, and the default
+ can still be overridden with --disable-gcc-warnings.
+ * configure.ac (gl_gcc_warnings): Set default based on environment.
+
+2012-10-03 Jim Meyering <meyering@redhat.com>
+
+ maint: factor out STREQ definition
+ * src/main.c (STREQ): Remove definition.
+ * src/pcresearch.c (STREQ): Likewise.
+ * src/system.h (STREQ): Define it here instead.
+
+ maint: correct syntax-check failures; adjust NEWS
+ * tests/pcre-utf8: Reverse order of compare arguments.
+ Remove all copyright year numbers except 2012.
+ Use skip_ "diagnostic...", rather than a bare "exit 77".
+ * NEWS: Start with a concise description of the bug.
+ * src/pcresearch.c (STREQ): Define, so that we can...
+ (Pcompile): use STREQ, not strcmp.
+
+2012-10-03 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: include UTF-8 testcases for grep -P
+ * tests/Makefile.am (TESTS): Add pcre-utf8.
+ * tests/pcre-utf8: New file.
+
+2012-10-03 Petr Pisar <ppisar@redhat.com>
+
+ pcresearch: set UTF-8 flag correctly for UTF-8 locales
+ Otherwise, Unicode properties (\p{XXX}) do not work with characters
+ outside the 7-bit ASCII character set.
+
+ * src/pcresearch.c (Pcompile): Look for UTF-8 locales and set PCRE_UTF8
+ if one is found.
+
+2012-10-03 Jaroslav Škarvada <jskarvad@redhat.com>
+
+ doc: fix a formatting bug in grep.1 template
+ * doc/grep.in.1: Insert .TP before the paragraph describing
+ --dereference-recursive (-R).
+
+2012-10-03 Jim Meyering <meyering@redhat.com>
+
+ maint: placate gcc's -Wjump-misses-init warning
+ * src/kwsearch.c (Fexecute): Replace a "goto" and "return" with
+ a simple return statement, eliminating the label, since that was
+ the sole use.
+ * src/dfasearch.c (EGexecute): Likewise.
+
+2012-09-01 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+2012-09-01 Eric Blake <eblake@redhat.com>
+
+ build: work with new glibc when not optimizing
+ Starting with glibc 2.15, the system headers refuse to compile
+ unconditional use of FORTIFY_SOURCE if optimization is disabled
+ but -Werror is in effect.
+
+ * configure.ac (FORTIFY_SOURCE): Make conditional.
+
+2012-08-19 Jim Meyering <meyering@redhat.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.14
+ * NEWS: Record release date.
+
+2012-08-07 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib and bootstrap
+
+ tests: test for bug with -i and ^$ in a multi-byte locale
+ * tests/empty-line-mb: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+ grep -i '^$' in a multi-byte locale could report a false match
+ * src/dfasearch.c (EGexecute): Do not match the sentinel "newline"
+ that is appended to each buffer.
+ This bug may sound like a big deal (it certainly surprised me), but
+ realize that only the empty-line-matching regular expression '^$'
+ can trigger it, and then only when you add the unnecessary (and
+ arguably superfluous) -i, *and* run the command in a multi-byte
+ locale. Using a multi-byte locale for such a regular expression
+ is also pointless, and hurts performance.
+ * NEWS (Bug fixes): Mention it.
+ Reported by Alexander Katassonov <katasso@gmx.de>
+
+2012-08-06 Jim Meyering <meyering@redhat.com>
+
+ tests: fix a skip diagnostic that mentioned the wrong locale
+ * tests/init.cfg (require_tr_utf8_locale_): s/en_US/tr_TR/
+
+2012-08-02 Jim Meyering <meyering@redhat.com>
+
+ tests: skip failing test on FS/systems that lack SEEK_HOLE support
+ * tests/big-hole: Test for SEEK_HOLE support. If not available,
+ skip this test. Hence, this test is now skipped on linux-3.5.0 with
+ ext4 or tmpfs. The test runs (and passes) with at least btrfs, xfs,
+ or ocfs2.
+ * bootstrap.conf (gnulib_modules): Use the perl module.
+
+2012-07-30 Jim Meyering <meyering@redhat.com>
+
+ maint: optimize long-line processing
+ * src/main.c (grep): Use memrchr rather than an open-coded loop,
+ reducing the cost of the replaced code by 50% when processing very
+ long lines. If there were a rawmemrchr function (analogous to glibc's
+ rawmemchr), then the performance improvement would be even greater.
+
+2012-07-27 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: remove stat-size
+ * bootstrap.conf (gnulib_modules): Remove stat-size.
+ * src/main.c: Don't include stat-size.h; no longer needed.
+
+ grep: don't falsely report compressed text files as binary
+ * NEWS: Document this.
+ * src/main.c (file_is_binary): Remove the heuristic based on
+ st_blocks, as it does not work for compressed file systems.
+ On Solaris, it'd be cheap to test whether the file system is known
+ to be uncompressed, which would allow the heuristic, but Solaris has
+ SEEK_HOLE so there's little point.
+
+ grep: don't falsely report tiny text files as binary
+ * NEWS: Document this.
+ * src/main.c (file_is_binary): When we are already at apparent
+ EOF, skip the file-size check, as some servers use zero blocks
+ to store binary files. Reported by Martin Carroll in
+ <http://lists.gnu.org/archive/html/bug-grep/2012-07/msg00016.html>.
+
+2012-07-26 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: document -r/-R in man page
+ * doc/grep.in.1: Document -r vs. -R.
+
+2012-07-21 Jim Meyering <meyering@redhat.com>
+
+ tests: avoid false positive upon kernel OOM-kill
+ * tests/big-match (skip_diagnostic): Handle case of 139 (SIGKILL)
+ with no diagnostic.
+
+ build: update gnulib and bootstrap
+
+ maint: fix misspellings in old ChangeLog
+ * ChangeLog-2009: Fix typos.
+
+2012-07-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: fix ptrdiff/size_t clash
+ Reported by Jaroslav Škarvada in <http://savannah.gnu.org/bugs/?36883>.
+ * src/dfasearch.c (EGexecute): Use size_t, not ptrdiff_t, for lengths.
+ Use regoff_t to store re_match's output, and test it before converting
+ it to size_t.
+
+2012-07-06 Jim Meyering <meyering@redhat.com>
+
+ maint: correct log typo, to reflect in generated ChangeLog
+ * Makefile.am (gen-ChangeLog): Use --amend, now that we must
+ make our first log correction.
+ * build-aux/git-log-fix: New file.
+
+2012-07-04 Jim Meyering <meyering@redhat.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.13
+ * NEWS: Record release date.
+
+ build: update gnulib submodule, bootstrap, init.sh
+
+2012-06-17 Jim Meyering <meyering@redhat.com>
+
+ tests: add another turkish-I-related test case
+ * tests/turkish-I-without-dot: Also exercise the case in which
+ the original string and the lower-case buffer have precisely
+ the same length (22 bytes here), yet internal offsets do differ.
+
+2012-06-16 Jim Meyering <meyering@redhat.com>
+
+ grep -i: work also when converting to lower-case inflates byte count
+ Commit v2.12-16-g7aa698d addressed the case in which the lower-case
+ representation of an input byte occupies fewer bytes than the original.
+ However, even with commit v2.12-20-g074842d, grep -i would still
+ misbehave when converting a character to lower-case increased its
+ byte count. The map-manipulation code assumed that the case conversion
+ could only shrink the byte count. With the consideration that it may
+ also inflate it, the deltas recorded in the map array must be signed,
+ and we must account for the one-to-two-or-more mapping when the
+ original-to-lower-case conversion causes the byte count to increase.
+ * src/searchutils.c (mbtolower): When a lower-case character occupies
+ more than one byte, set its remaining map slots to zero. Change the
+ type of the map to be signed, and compute the change in character
+ byte count as new_length - old_length.
+ * src/search.h: Include <stdint.h>, for decl of intmax_t.
+ (mb_case_map_apply): Adjust for signed increments:
+ each map entry is now signed.
+ (mb_len_map_t): Define type. Thanks to Paul Eggert for noticing
+ in review that using a bare "char" as the base type would be wrong on
+ systems for which it is an unsigned type (as with gcc's -funsigned-char).
+ * src/kwsearch.c (Fcompile, Fexecute): Likewise.
+ * src/dfasearch.c (kwsincr_case, EGexecute): Likewise.
+ * tests/turkish-I-without-dot: New test. Thanks to Paolo Bonzini
+ for the tip that in the tr_TR.utf8 locale, mapping "I" to lower case
+ increases the character's byte count.
+ * tests/Makefile.am (TESTS): Add it.
+ * tests/init.cfg (require_tr_utf8_locale_): New function.
+ * NEWS (Bug fixes): Expand the existing entry.
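How a signed per-byte map can translate a match found in the lower-cased buffer back to the original buffer is sketched below. The names len_delta_t and map_to_original are hypothetical, and the sign convention is an assumption made for this example: the entry for the first byte of each lower-case character holds (original byte count - lower-case byte count), and the entries for its remaining bytes are zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Explicitly signed, since a bare "char" may be unsigned.  */
typedef signed char len_delta_t;

/* MAP has one entry per byte of the lower-cased buffer.  On entry,
   *OFF and *LEN describe a match in the lower-cased buffer; on return
   they describe the same match in the original buffer.  */
static void
map_to_original (len_delta_t const *map, size_t *off, size_t *len)
{
  intmax_t off_incr = 0;
  intmax_t len_incr = 0;

  /* Deltas of all characters before the match shift its offset.  */
  for (size_t k = 0; k < *off; k++)
    off_incr += map[k];

  /* Deltas of the characters inside the match change its length.  */
  for (size_t k = *off; k < *off + *len; k++)
    len_incr += map[k];

  *off += off_incr;
  *len += len_incr;
}
```

For example, if a one-byte "I" becomes a two-byte dotless i, the lower-cased buffer for "Ix" has three bytes and the map is {-1, 0, 0}; a match on "x" at lower-case offset 2 maps back to offset 1 in the original.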
+
+2012-06-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: handle -i when chars differ in length but line does not
+ * src/searchutils.c (mbtolower): Return the map back to the caller
+ if any input character's length differs from the corresponding output
+ character's, not merely if the total string length differs.
+ Problem reported by Johannes Meixner in
+ <http://lists.gnu.org/archive/html/bug-grep/2012-06/msg00029.html>.
+
+2012-06-07 Jim Meyering <meyering@redhat.com>
+
+ tests: extend coverage of dfa.c's match_mb_charset
+ Add a test case to increase test coverage of part of dfa.c (the DFA
+ matcher used by grep and gawk). While thinking about removing the few
+ remaining uses of strncpy in dfa.c, I found that none of the existing
+ tests covered the 40+ lines of code at the end of match_mb_charset,
+ so I constructed this test case to demonstrate that it's not dead code.
+ * tests/dfa-coverage: New test, for improved coverage.
+ * tests/Makefile.am (TESTS): Add it.
+
+2012-06-05 Jim Meyering <meyering@redhat.com>
+
+ build: fix a subtly twisted "make distcheck" failure
+ "make distcheck" would fail when, during a test build,
+ an attempt to overwrite the deliberately-write-protected
+ $(srcdir)/grep.pot file would fail.
+ * bootstrap.conf (bootstrap_epilogue): Don't let the existence of
+ a large sparse file in the build directory induce "make distcheck"
+ failure. The existence of a large sparse test file named 8T-or-so
+ would make po/Makefile.in.in's use of grep (to search for "GNU grep"
+ as an indication that this is a GNU package) exit 2 without generating
+ any output, which made the first xgettext use --package-name=grep,
+ while that same search for "GNU grep" would succeed when run
+ from a pristine from-tarball build, thus making the second
+ xgettext invocation use --package-name='GNU grep'.
+ That mismatch:
+ -"Project-Id-Version: grep 2.12.18-1080\n"
+ +"Project-Id-Version: GNU grep 2.12.18-1080\n"
+ led to the attempt by Makefile.in.in's grep.pot-update rule to
+ overwrite ../../grep.pot in the read-only po/ source directory.
+
+2012-06-03 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule, bootstrap and init.sh
+ cfg.mk: Exempt dfa.c from the new no-strncpy test, for now.
+
+2012-06-02 Jim Meyering <meyering@redhat.com>
+
+ grep: fix how -i works with a match containing the Turkish I-with-dot
+ Fix a long-standing problem in the way grep's -i interacts with
+ data whose byte count changes when we convert it to lower case.
+ For example, the UTF-8 Turkish I-with-dot (İ) occupies two bytes,
+ but its lower case analog, i, occupies just one byte. The code
+ converts both search string and the haystack data to lower case,
+ and then searches for the modified string in the modified buffer.
+ The trouble arose when using a lowercase buffer <offset,length>
+ pair to manipulate the original (longer) buffer.
+
+ The solution is to change mbtolower to return additional information:
+ a malloc'd mapping vector. With that, the caller maps the lowercase-
+ relative <offset,length> to numbers that refer to the original buffer.
+ This mapping is used only when lengths actually differ, so the cost
+ in general should be small.
+
+ * src/searchutils.c (mbtolower): Add the new map parameter.
+ * src/search.h (mb_case_map_apply): New function.
+ * src/kwsearch.c (Fexecute): Update mbtolower caller, and upon
+ success, apply the new map.
+ * src/dfasearch.c (EGexecute): Likewise.
+ * tests/Makefile.am (XFAIL_TESTS): Remove turkish-I from this list;
+ that test is no longer expected to fail.
+ * NEWS (Bug fixes): Mention it.
+ Reported by Ilya Basin in
+ http://thread.gmane.org/gmane.comp.gnu.grep.bugs/3413 and later
+ by Strahinja Kustudic in http://savannah.gnu.org/bugs/?36567
+
+2012-06-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: remove unnecessary "what-if-signal?" code
+ * src/main.c (fillbuf): Don't worry about EINTR when closing --
+ not possible, since we're not catching signals.
+
+2012-05-16 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: avoid nominal integer overflow
+ * src/dfa.c (add_utf8_anychar): Avoid signed integer overflow.
+ Although this works on all platforms we know about, strictly
+ speaking the behavior is undefined, and Sun C 5.8 warns about it.
+
+2012-05-15 Jim Meyering <meyering@redhat.com>
+
+ maint: avoid nit-picky syntax-check test failure; tweak big-hole test
+ * NEWS: Restore deleted newline in "old" NEWS, to fix a syntax-check
+ test failure.
+ * tests/big-hole: Use awk, rather than a shell loop: saves 3000 lines
+ of verbose shell output in the .log file.
+
+2012-05-15 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: sparse files are now considered binary
+ * NEWS: Document this.
+ * doc/grep.texi (File and Directory Selection): Likewise.
+ * bootstrap.conf (gnulib_modules): Add stat-size.
+ * src/main.c: Include stat-size.h.
+ (usable_st_size): New function, mostly stolen from coreutils.
+ (fillbuf): Use it.
+ (file_is_binary): New function, which looks for holes too.
+ (grep): Use it.
+ * tests/Makefile.am (TESTS): Add big-hole.
+ * tests/big-hole: New file.
+
+2012-05-06 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: quote 'like this' or "like this", not `like this'
+ See <http://lists.gnu.org/archive/html/bug-grep/2012-01/msg00125.html>.
+ * ChangeLog-2009, HACKING, NEWS, README-hacking, cfg.mk, configure.ac:
+ * lib/colorize-w32.c, m4/pcre.m4:
+ * src/Makefile.am, src/dfa.c, src/dosbuf.c, src/main.c:
+ * tests/backref, tests/help-version, tests/tests:
+ In commentary, quote 'like this' or "like this" rather than
+ `like this' or ``like this''.
+ * cfg.mk (old_NEWS_hash): Update due to changed old NEWS.
+ * doc/grep.texi (General Output Control): Quote sample text
+ with @samp, not with `...'.
+ * src/main.c (usage):
+ * tests/help-version: Quote 'like this' rather than `like this'
+ in diagnostics.
+
+ exclude: process exclude and include directives in order
+ Also, change exclude and include directives so that they apply to
+ command-line arguments too. This restores the pre-2.6 behavior,
+ and fixes a bug reported by Quentin Arce in
+ <http://lists.gnu.org/archive/html/bug-grep/2012-04/msg00056.html>.
+ * NEWS: Document this.
+ * src/main.c (included_patterns): Remove. All uses removed.
+ (skipped_file): New function.
+ (grepdirent): New arg command_line; all callers changed. This is
+ needed because non-command-line files can invoke fts_open, and
+ their directory entries need to be distinguished from top-level
+ directory entries. Move code into the new skipped_file function.
+ (grepdesc): Check whether a command-line argument should be skipped.
+ (main): --include and --exclude options now share excluded_patterns
+ rather than having separate variables included_patterns and
+ excluded_patterns.
+ * tests/include-exclude: Add a test to detect the fixed bug.
+
+ build: update gnulib submodule to latest
+
+2012-04-30 Jim Meyering <meyering@redhat.com>
+
+ cosmetic: binary operator goes *after* the newline, when split
+ * src/dfa.c (match_mb_charset): Join split lines.
+ (parse_bracket_exp): Move "||" from end of first split line
+ to the beginning of the continued line.
+ * src/dosbuf.c (dossified_pos): Likewise, but for "&&".
+
+ grep: -K is not an option: remove it from list
+ The presence of "K" in the short-option string meant that
+ an erroneous "grep -K ..." would fail with a bare Usage/Try...
+ message, without the usual "invalid option -- 'K'". With this
+ removal, now grep prints the expected invalid option diagnostic.
+ * src/main.c (short_options): Remove "K".
+ Reported by Петр Досычев in
+ http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4488
+
+2012-04-29 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: small fixes to single-byte range computation
+ * src/dfa.c (parse_bracket_exp): Do not call regexec with an invalid
+ subject. Move declarations before all statements.
+
+2012-04-27 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: do not use hard-locale
+ * bootstrap.conf (gnulib_modules): Remove hard-locale.
+ * src/dfa.c (hard_LC_COLLATE): Remove.
+ (dfaparse): Do not initialize it.
+ (parse_bracket_exp): Always go through system regex matcher to find
+ single byte characters matching a range.
+
+ drop support for Makefile.boot
+ * Makefile.am: Do not distribute README-boot and Makefile.boot.
+ * NEWS: Mention this change.
+ * README-alpha: Do not mention README-boot and Makefile.boot.
+ * Makefile.boot: Remove.
+ * README-boot: Remove.
+
+2012-04-27 Aharon Robbins <arnold@skeeve.com>
+
+ dfa: do not use strcoll to match multibyte characters in ranges
+ This does not affect the behavior of grep, which always defers
+ to glibc or gnulib when matching ranges.
+ * src/dfa.c (match_mb_charset): Compare wc directly to the range
+ endpoints.
+
+ dfa: include stdbool.h explicitly
+ * src/dfa.c: Include stdbool.h explicitly.
+
+2012-04-23 Jim Meyering <meyering@redhat.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.12
+ * NEWS: Record release date.
+
+ build: update gnulib submodule to latest
+
+ tests: skip annoyingly long gnulib lock tests
+ * bootstrap.conf (avoided_gnulib_modules): Define.
+ (gnulib_tool_option_extras): Use it.
+
+2012-04-22 Jim Meyering <meyering@redhat.com>
+
+ tests: avoid spurious quote-mismatch failure on OS/X
+ * tests/in-eq-out-infloop: Simplify expected error output, eliminating
+ expected quotes altogether, thus avoiding spurious OS/X-specific
+ failure due to mismatch of multi-byte vs. single-byte quotes.
+
+2012-04-17 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+ * bootstrap: Also update this file.
+
+2012-04-17 Jim Meyering <meyering@redhat.com>
+
+ grep: fix --devices=ACTION (-D) so stdin is once again exempt
+ An oversight in the 2.11 changes made it so "echo x|grep x" would
+ fail for those who set GREP_OPTIONS=--devices=skip.
+
+ * src/main.c (grepdesc): Ignore skip-related options when reading
+ from standard input.
+ * tests/skip-device: New file. Test for the above.
+ * tests/Makefile.am (TESTS): Add it.
+ * doc/grep.texi (File and Directory Selection): Clarify this point,
+ documenting the stdin exemption.
+ * NEWS (Bug fixes): Mention it, and add a few "[fixed in ...]" notes.
+ Reported by Tino Keitel in http://bugs.debian.org/669084,
+ and forwarded to bug-grep by Aníbal Monsalve Salazar.
+
+2012-04-13 Jim Meyering <meyering@redhat.com>
+
+ maint: dfa: correct bogus formatting
+ * src/dfa.c (transit_state, dfaexec): s/++ * VAR/++*VAR/
+
+ maint: dfa: add/improve comments
+ * src/dfa.c (transit_state_consume_1char): Note always-ignored
+ return value.
+ Fix typos: s/equivalent class/equivalence class/.
+
+ maint: dfa: avoid unnecessary uses of strcpy/strncpy
+ * src/dfa.c (icatalloc): Use memcpy, not strcpy, given the length.
+ (dfamust): Combine MALLOC+strcpy into cleaner xmemdup.
+ (parse_bracket_exp): Likewise, but replace a use of strncpy.
+
+ grep: handle symlinked directory loops as usual
+ * src/main.c (grepfile): Treat EMLINK just like ELOOP, for
+ systems like FreeBSD 9.0 on which we would otherwise report
+ "Too many links" rather than ignoring that type of failure.
+ E.g., "mkdir d; cd d; ln -s . a; grep -r ^" would print
+ grep: a: Too many links and would exit with status 2.
+ Now, it prints nothing and exits with status 1, as before.
+ Reported by Nelson H. F. Beebe.
+
+ tests: avoid spurious failure of the symlink test
+ * tests/symlink: Ignore spurious "Binary file d matches" on
+ systems for which reading from a directory actually succeeds.
+ Reported by Bruno Haible and Nelson Beebe.
+
+2012-04-09 Jim Meyering <meyering@redhat.com>
+
+ tests: avoid syntax-check failure: reverse compare arguments
+ * tests/repetition-overflow: Fix reversed compare arguments.
+
+ build: update gnulib submodule to latest
+
+2012-03-18 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: report overflow for ERE a{1000000000}
+ * NEWS: Document this.
+ * src/dfa.c (MIN): New macro.
+ (lex): Lexically analyze the repeat-count operator once, not
+ twice; the double-scan complicated the code and made it harder to
+ understand and fix. Adjust the repeat-count parsing so that it
+ better matches the behavior of the regex code, in three ways:
+ 1. Diagnose too-large repeat counts rather than treating them as
+ literal characters. 2. Use RE_INVALID_INTERVAL_ORD, not
+ RE_NO_BK_BRACES, to decide whether to treat invalid-syntax {...}s
+ as literals. 3. Use the same wording for {...}-related
+ diagnostics that the regex code uses.
+ * tests/bre.tests, tests/ere.tests, tests/repetition-overflow:
+ Adjust to match new behavior, and add a few tests.
+ * cfg.mk (exclude_file_name_regexp--sc_error_message_uppercase):
+ New macro, since the diagnostics start with uppercase letters.
+
+2012-03-14 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -r no longer follows symlinks; use fts
+ Change -r to follow only command-line symlinks, and by default to
+ read only devices named on the command line. This is a simple
+ way to get a more-useful behavior when searching random
+ directories; the idea is to use 'find' if you want something fancy.
+ -R acts as before and gets a new alias --dereference-recursive.
+ The code now uses fts internally, so it is more robust and
+ faster with large hierarchies.
+ * .gitignore: Remove lib/savedir.c, lib/savedir.h.
+ * tests/symlink: New file.
+ * Makefile.boot (LIB_OBJS_core): Remove isdir.o, savedir.o.
+ Perhaps other changes are needed too, but I'm not sure what
+ this makefile is for.
+ * NEWS: Document changes.
+ * doc/grep.texi (File and Directory Selection): Likewise.
+ * bootstrap.conf (gnulib_modules): Remove dirent, dirname, isdir, open.
+ Add fstatat, fts, openat-safer.
+ * lib/Makefile.am (libgreputils_a_SOURCES): Remove savedir.c, savedir.h.
+ * lib/savedir.c, lib/savedir.h: Remove.
+ * po/POTFILES.in: Add lib/openat-die.c.
+ * src/main.c: Include fcntl-safer.h, fts_.h. Don't include
+ isdir.h, savedir.h.
+ (struct stats, stats_base): Remove.
+ (long_options, usage, main): Add --dereference-recursive and
+ implement -r vs -R.
+ (filename_prefix_len, fts_options): New static vars.
+ (basic_fts_options, READ_COMMAND_LINE_DEVICES): New constants.
+ (devices): Now defaults to READ_COMMAND_LINE_DEVICES.
+ (reset, grep): Now takes just struct stat rather than file name and
+ struct stats. All callers changed.
+ (fillbuf): Now takes struct stat rather than struct stats.
+ All callers changed.
+ (grep): Don't worry about recursing too deeply; fts and grepdesc
+ handle this now.
+ (is_device_mode, grepdirent, grepdesc, grep_command_line_args):
+ New functions.
+ (grepfile): New args DIRDESC, FOLLOW, COMMAND_LINE. Remove struct stats
+ arg. All callers changed. Use openat_safer rather than open.
+ Use desc == STDIN_FILENO to tell whether we're reading "-".
+ Don't worry about EINTR when closing -- not possible, since we're
+ not catching signals.
+ * tests/Makefile.am (TESTS): Add symlink.
+ * tests/symlink: New file.
+
+2012-03-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: port big-match to non-GNU dd
+ * tests/big-match: Don't assume GNU dd extension "bs=1M".
+
+ tests: test for bug with -r --exclude-dir and no file operand
+ * tests/include-exclude: Test for the bug and fix.
+
+2012-03-12 Allan McRae <allan@archlinux.org>
+
+ grep: fix segfault with -r --exclude-dir and no file operand
+ * src/main.c (grepdir): Don't invoke excluded_file_name on NULL.
+ * NEWS (Bug fixes): Mention it.
+
+2012-03-09 Jim Meyering <meyering@redhat.com>
+
+ tests: exercise two recently-fixed bugs
+ * tests/repetition-overflow: New test for bugs fixed by commit
+ v2.10-82-gcbbc1a4.
+ * tests/Makefile.am (TESTS): Add it.
+
+2012-03-03 Jim Meyering <meyering@redhat.com>
+
+ maint: use an optimal-for-grep xz compression setting
+ * cfg.mk (XZ_OPT): Use -6e (determined empirically, see comments).
+ This sacrifices a meager 60 bytes of compressed tarball size for a
+ 55-MiB decrease in the memory required during decompression. I.e.,
+ using -9e would shave off only 60 bytes from the tar.xz file, yet
+ would force every decompression process to use 55 MiB more memory.
+
+ build: update gnulib submodule to latest
+
+2012-03-02 Jim Meyering <meyering@redhat.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.11
+ * NEWS: Record release date.
+
+ tests: avoid failure when using Solaris 10's sed
+ * tests/reversed-range-endpoints: Use a simpler sed expression to
+ sanitize actual output, so it also works with Solaris 10's /bin/sed.
+
+2012-03-01 Jim Meyering <meyering@redhat.com>
+
+ maint: manually correct formatting in dfa.c's cpp definitions
+ * src/dfa.c: Adjust formatting in cpp definitions.
+
+ maint: indent dfa.c
+ * src/dfa.c: Filter through indent like this:
+ HOME=. indent -Tsize_t -l79 --leave-preprocessor-space \
+ --dont-format-comments --no-tabs < dfa.c > k && mv k dfa.c
+
+ doc: correct grep.1's descriptions of \w and \W (they omitted "_")
+ * doc/grep.in.1: Fix descriptions of \w and \W.
+ They did not mention "_".
+ * doc/grep.texi (The Backslash Character and Special Expressions):
+ [\w, \W]: List the "_" before the char class, not after: [_[:alnum:]],
+ for readability and to be consistent with the man page.
+
+2012-03-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: spelling fixes
+
+ grep: fix integer-overflow issues in main program
+ * NEWS: Document this.
+ * bootstrap.conf (gnulib_modules): Add inttypes, xstrtoimax.
+ Remove xstrtoumax.
+ * src/main.c: Include <inttypes.h>, for INTMAX_MAX, PRIdMAX.
+ (context_length_arg, prtext, grepbuf, grep, grepfile)
+ (get_nondigit_option, main):
+ Use intmax_t, not int, for line counts.
+ (context_length_arg, main): Silently cap line counts
+ at the maximum value, since there's no practical difference between
+ doing that and using infinite-precision arithmetic.
+ (out_before, out_after, pending): Now intmax_t, not int.
+ (max_count, outleft): Now intmax_t, not off_t.
+ (prepend_args, prepend_default_options, main):
+ Use size_t, not int, for sizes.
+ (prepend_default_options): Check for int and size_t overflow.
+
+ grep: avoid mishandling of long lines
+ * src/pcresearch.c (Pexecute): Do not pass a line longer than
+ INT_MAX to pcre_exec, since its API does not permit that.
+
+ grep: remove no-longer-used setrlimit code
+ This code has been unused and obsolescent ever since the regex
+ code stopped using the stack for large regular expressions.
+ * src/main.c [HAVE_SETRLIMIT]: Do not include <sys/time.h> or
+ <sys/resource.h>; no longer needed.
+ (set_rlimits): Remove. All callers changed.
+
+ grep: fix some core dumps with long lines etc.
+ These problems mostly occur because the code attempts to stuff
+ sizes into int or into unsigned int; this doesn't work on most
+ 64-bit hosts and the errors can lead to core dumps.
+ * NEWS: Document this.
+ * src/dfa.c (token): Typedef to ptrdiff_t, since the enum's
+ range could be as small as -128 .. 127 on practical hosts.
+ (position.index): Now size_t, not unsigned int.
+ (leaf_set.elems): Now size_t *, not unsigned int *.
+ (dfa_state.hash, struct mb_char_classes.nchars, .nch_classes)
+ (.nranges, .nequivs, .ncoll_elems, struct dfa.cindex, .calloc, .tindex)
+ (.talloc, .depth, .nleaves, .nregexps, .nmultibyte_prop, .nmbcsets)
+ (.mbcsets_alloc): Now size_t, not int.
+ (dfa_state.first_end): Now token, not int.
+ (state_num): New type.
+ (struct mb_char_classes.cset): Now ptrdiff_t, not int.
+ (struct dfa.utf8_anychar_classes): Now token[5], not int[5].
+ (struct dfa.sindex, .salloc, .tralloc): Now state_num, not int.
+ (struct dfa.trans, .realtrans, .fails): Now state_num **, not int **.
+ (struct dfa.newlines): Now state_num *, not int *.
+ (prtok): Don't assume 'token' is no wider than int.
+ (lexleft, parens, depth): Now size_t, not int.
+ (charclass_index, nsubtoks)
+ (parse_bracket_exp, addtok, copytoks, closure, insert, merge, delete)
+ (state_index, epsclosure, state_separate_contexts)
+ (dfaanalyze, dfastate, build_state, realloc_trans_if_necessary)
+ (transit_state_singlebyte, match_anychar, match_mb_charset)
+ (check_matching_with_multibyte_ops, transit_state_consume_1char)
+ (transit_state, dfaexec, free_mbdata, dfaoptimize, dfafree)
+ (freelist, enlist, addlists, inboth, dfamust):
+ Don't assume indexes fit in 'int'.
+ (lex): Avoid overflow in string-to-{hi,lo} conversions.
+ (dfaanalyze): Redo indexing so that it works with size_t values,
+ which cannot go negative.
+ * src/dfa.h (dfaexec): Count argument is now size_t *, not int *.
+ (dfastate): State numbers are now ptrdiff_t, not int.
+ * src/dfasearch.c: Include "intprops.h", for TYPE_MAXIMUM.
+ (kwset_exact_matches): Now size_t, not int.
+ (EGexecute): Don't assume indexes fit in 'int'.
+ Check for overflow before converting a ptrdiff_t to a regoff_t,
+ as regoff_t is narrower than ptrdiff_t in 64-bit glibc (contra POSIX).
+ Check for memory exhaustion in re_search rather than treating
+ it merely as failure to match; use xalloc_die () to report any error.
+ * src/kwset.c (struct trie.accepting): Now size_t, not unsigned int.
+ (struct kwset.words): Now ptrdiff_t, not int.
+ * src/kwset.h (struct kwsmatch.index): Now size_t, not int.
+
+ tests: test for problems with long matches
+ The new test is expensive, so add a category of expensive tests,
+ which are normally not run, and put the new test in this new
+ category. The idea of having expensive tests is taken from coreutils.
+ * HACKING: Mention RUN_EXPENSIVE_TESTS and similar env vars.
+ * Makefile.am (check-expensive): New rule.
+ * tests/Makefile.am (TESTS): Add big-match.
+ * tests/init.cfg (expensive_): New function, from coreutils.
+ * tests/big-match: New file.
+
+2012-02-29 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: use gnulib _Noreturn rather than __attribute__ ((noreturn))
+ * src/grep.h (__attribute__): Remove.
+ * src/dfa.h (__attribute__): Likewise.
+ (dfaerror): Use noreturn rather than __attribute__ ((noreturn)).
+ * src/main.c (usage): Likewise.
+
+2012-02-26 Jim Meyering <meyering@redhat.com>
+
+ build: update submodule, bootstrap, tests/init.sh from gnulib
+ * gl/lib/regcomp.c.diff: Adjust.
+ * bootstrap: Update from gnulib.
+ * tests/init.sh: Update from gnulib.
+
+2012-02-26 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: merge calls to SUCCEEDS_IN_CONTEXT
+ * src/dfa.c (state_index): Use a single call to SUCCEEDS_IN_CONTEXT.
+
+ dfa: fix a subtle constraint encoding bug
+ * src/dfa.c (SUCCEEDS_IN_CONTEXT, PREV_NEWLINE_DEPENDENT,
+ PREV_LETTER_DEPENDENT): Rewrite to handle all 3*3=9 possible
+ combinations of previous and next character contexts.
+ (MATCHES_NEWLINE_CONTEXT, MATCHES_LETTER_CONTEXT): Remove.
+ (NO_CONSTRAINT, BEGLINE_CONSTRAINT, ENDLINE_CONSTRAINT,
+ BEGWORD_CONSTRAINT, ENDWORD_CONSTRAINT, LIMWORD_CONSTRAINT,
+ NOTLIMWORD_CONSTRAINT): Switch to new encoding.
+ * NEWS: Document resulting bugfix.
+ * tests/spencer1.tests: Add regression test.
+
+ dfa: do not use MATCHES_*_CONTEXT directly
+ * src/dfa.c (dfastate): Use SUCCEEDS_IN_CONTEXT.
+
+ dfa: change meaning of a state context
+ * src/dfa.c (MATCHES_NEWLINE_CONTEXT, MATCHES_LETTER_CONTEXT): New.
+ (state_separate_contexts): Remove second argument.
+ (state_index): Do not mask away CTX_NONE.
+ (dfaanalyze): Adjust call to state_index and state_separate_contexts.
+ (dfastate): Adjust calls to state_index and state_separate_contexts.
+
+2012-02-13 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: fix loop in epipe test
+ * tests/epipe: Don't loop forever if the bug is present.
+ Problem reported by Jaroslav Skarvada.
+
+2012-02-08 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: work portably even if SIGPIPE is ignored
+ * tests/epipe: Don't rely on "trap - PIPE"; that's not portable.
+ Problem reported by Eric Blake in
+ <http://lists.gnu.org/archive/html/bug-grep/2012-02/msg00017.html>.
+ Also, use "ls -al" rather than "echo", in case "echo" is done by a
+ buggy shell that ignores write errors. And close grep's fd 3, as
+ a sanity check.
+
+2012-02-07 Paul Eggert <eggert@cs.ucla.edu>
+
+ tests: work even if SIGPIPE is ignored
+ * tests/epipe: Do not infinite-loop if SIGPIPE is already ignored.
+ It could be that the invoker of 'make check' ignores SIGPIPE,
+ for example.
+
+2012-02-05 Jim Meyering <meyering@redhat.com>
+
+ build: accommodate -Wshadow and -Werror=suggest-attribute=pure
+ * src/dfa.c (state_separate_contexts): Add _GL_ATTRIBUTE_PURE.
+ (dfaexec): Rename parameter, s/newline/allow_nl/, to avoid
+ shadowing the global.
+
+2012-02-05 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: refactor common context computations
+ * src/dfa.c (CTX_ANY, charclass_context, state_separate_contexts): New.
+ (dfaanalyze): Use state_separate_contexts.
+ (dfastate): Use charclass_context and state_separate_contexts. Rename
+ prev_context to separate_contexts.
+
+ dfa: change newline/letter to a single context value
+ * src/dfa.c (MATCHES_NEWLINE_CONTEXT, MATCHES_LETTER_CONTEXT,
+ SUCCEEDS_IN_CONTEXT, ACCEPTS_IN_CONTEXT): Take a single context value
+ for prev and curr.
+ (struct dfa_state): Replace newline and letter with context.
+ (wchar_context): New.
+ (state_index): Replace newline and letter with context. Compare
+ context values in the state struct. Adjust calls to pass contexts.
+ (wants_newline): Replace with wanted_context. Adjust calls to pass
+ contexts.
+ (dfastate): Replace wants_newline and wants_letter with wanted_context.
+ Adjust calls to pass contexts.
+ (build_state): Adjust calls to pass contexts.
+ (match_anychar, match_mb_charset, transit_state): Use wchar_context.
+ Adjust calls to pass contexts.
+
+2012-02-05 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: introduce contexts for the values in d->success
+ Also initialize all tables in a single place in dfasyntax.
+
+ * src/dfa.c (CTX_NONE, CTX_LETTER, CTX_NEWLINE, char_context): New.
+ (sbit, letters, newline): New.
+ (dfasyntax): Fill them.
+ (dfastate): Remove letters, newline, initialized.
+ (build_state): Use CTX_* constants.
+ (dfaexec): Remove sbit and sbit_init.
+
+2012-02-05 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: remove useless check
+ * src/dfa.c (state_index): There is nothing that is a newline *and*
+ a letter. Remove redundant call to SUCCEEDS_IN_CONTEXT.
+
+2012-01-22 Jim Meyering <meyering@redhat.com>
+
+ build: update bootstrap from gnulib and adapt
+ * bootstrap: Update from gnulib.
+ * tests/init.sh: Update from gnulib.
+ * bootstrap.conf (bootstrap_epilogue): Remove now-unnecessary
+ snippet that edited gnulib-tests/gnulib.mk.
+ (gnulib_tool_option_extras): Add both --symlink and
+ --makefile-name=gnulib.mk. Remove use of $bt.
+ * lib/Makefile.am: Initialize numerous automake variables so that
+ generated code in gnulib.mk may use += to append to them.
+
+ maint: convert `this' to 'this' quoting style in diagnostics
+ Now that gnulib's quote and quotearg modules use 'this' style,
+ change the few explicit uses in diagnostics to conform.
+ * src/egrep.c (after_options): Use 'this' style of quotes.
+ * src/fgrep.c (after_options): Likewise.
+ * src/grep.c (after_options): Likewise.
+ * src/main.c (usage): Likewise.
+
+ build: update gnulib to latest; adjust quoting in tests
+ * gnulib: Update.
+ * tests/in-eq-out-infloop: Convert expected diagnostics to match
+ new quoting.
+
+2012-01-22 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: document recent diagnostics-related changes
+ * NEWS: Document changes re diagnostics related to GREP_COLORS,
+ directory loops, -s, "write error".
+
+ grep: be quiet about GREP_COLORS syntax
+ * src/main.c (struct color_cap): fct now returns void,
+ since there's no longer need to use what it returns.
+ (color_cap_mt_fct, color_cap_rv_fct, color_cap_ne_fct): Return void.
+ (parse_grep_colors): Do not output diagnostics and then exit with
+ status 0. Instead, ignore errors in GREP_COLORS. This is more
+ consistent with programs that (e.g.) ignore errors in termcap entries,
+ and it's more internally-consistent as some GREP_COLORS errors
+ were ignored but not others.
+
+ grep: exit with nonzero status if directory loop
+ * src/main.c (grepdir): Exit with status 2 if a directory loop is
+ found, since the output might not be "right" (i.e., infinite...).
+
+ grep: suppress read errors if -s
+ * src/main.c (reset, grep, grepfile): Do not report an input error
+ if -s is given.
+
+ grep: don't say "write error" over and over
+ Problem reported by Travis Gummels in
+ <https://bugzilla.redhat.com/show_bug.cgi?id=741452>.
+ * src/main.c (write_error_seen): New static var.
+ (clean_up_stdout): New function.
+ (prline): Do not output 'write error' more than once; exit
+ after the first one. Use the same wording for the diagnostic
+ that close_stdout uses.
+ (main): Clean up with clean_up_stdout, not close_stdout, so that
+ grep doesn't output multiple "write error" diagnostics.
+ * tests/Makefile.am (TESTS): Add epipe.
+ * tests/epipe: New file.
+
+2012-01-12 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: non-glibc word-constituent unibyte fix
+ * src/dfa.c (is_valid_unibyte_character): Fix typo that caused
+ this to incorrectly return 0 on unibyte non-glibc systems.
+ Problem reported by Aharon Robbins in
+ <http://lists.gnu.org/archive/html/bug-grep/2012-01/msg00084.html>.
+
+2012-01-04 Paul Eggert <eggert@cs.ucla.edu>
+
+ doc: document empty pattern better
+ * doc/grep.texi (Top, Fundamental Structure, Usage):
+ Explain how grep deals with the empty pattern.
+ Problem spotted by Bernhard Voelker in
+ <http://lists.gnu.org/archive/html/bug-grep/2012-01/msg00050.html>.
+
+ grep: with no args, search "." only if command-line -r
+ * NEWS: Document this.
+ * doc/grep.texi (Environment Variables, grep Programs): Likewise.
+ * src/main.c (usage): Likewise.
+ (main): Implement this.
+ (prepend_default_options): Return a count of prepended options.
+ * tests/r-dot: Test the above.
+
+2012-01-03 Jim Meyering <meyering@redhat.com>
+
+ tests: adjust test to match code, now that --mmap writes to stderr
+ * tests/ignore-mmap: Separate stdout and stderr; test both.
+
+ deprecate the --mmap option
+ * src/main.c (main): Deprecate the --mmap option: issue a warning
+ when it is used.
+ (usage): Change description.
+ * doc/grep.texi (Other Options): Document the new behavior.
+ * NEWS (Changes in behavior): Mention it.
+
+2012-01-03 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: fix incorrect comment
+ * src/dfa.c (dfastate): Fix comment for newline.
+
+ dfa: fix rebase conflict
+ * src/dfa.c (dfaanalyze): Fix reference to nalloc.
+
+ dfa: automatically resize position_sets
+ * src/dfa.c (insert, copy, merge): Resize arrays here.
+ (dfaanalyze): Do not track number of allocated elements here.
+ (dfastate): Allocate mbps with only one element.
+
+ dfa: change position_set nelem to size_t
+ * src/dfa.c (REALLOC_IF_NECESSARY): Disable assertion, to avoid
+ warnings from -Wtype-limits.
+ (position_set): Change nelem to a size_t.
+
+ dfa: move nalloc to position_set structure
+ * src/dfa.c (position_set): Add alloc.
+ (alloc_position_set): Initialize it.
+ (dfaanalyze): Use it instead of the nalloc array or nelem.
+
+ dfa: remove dead assignment
+ * src/dfa.c (transit_state): transit_state_consume_1char will clear
+ follows, so do not do this ourselves.
+
+ dfa: introduce alloc_position_set
+ * src/dfa.c (alloc_position_set): New function, use it throughout.
+
+ dfa: use a more compact data type for grps
+ * src/dfa.c (leaf_set): New.
+ (dfastate): Use the smaller type, leaf_set, for grps. Its prior type
+ contained an unused constraint field.
+
+ dfa: use MALLOC/REALLOC always
+ * src/dfa.c (dfastate, enlist, dfamust): Use MALLOC and REALLOC.
+
+ dfa: remove unnecessary braces
+ * src/dfa.c (dfastate): Remove unnecessary braces.
+
+ dfa: x2nrealloc starting from a NULL pointer works
+ * src/dfa.c (parse_bracket_exp): Do not MALLOC mbcset parts the first time
+ they are encountered. Initialize chars_al correctly.
+
+2012-01-03 Jim Meyering <meyering@redhat.com>
+
+ build: avoid build failure with --enable-gcc-warnings and recent gcc
+ * lib/colorize-posix.c: Disable -Wsuggest-attribute=const, to avoid
+ warning about this empty init_colorize function.
+
+2012-01-03 Paolo Bonzini <bonzini@gnu.org>
+
+ remove lib/ms/
+ * configure.ac: Create lib/colorize.c as a symbolic link.
+ * lib/colorize-posix.c: New name of lib/colorize-impl.c.
+ * lib/colorize-w32.c: New name of lib/ms/colorize-impl.c.
+ * lib/colorize.c: Delete.
+ * lib/Makefile.am (EXTRA_DIST): Adjust.
+ * .gitignore: Adjust.
+ * cfg.mk: Adjust syntax-check exclusions.
+
+ unify colorize.h headers
+ * lib/Makefile.am (EXTRA_DIST): Adjust.
+ * lib/colorize.h: Remove inline functions.
+ * lib/colorize-impl.c: Move them here as functions.
+ * lib/ms/colorize.h: Remove.
+ * src/Makefile.am (DEFAULT_HEADERS): Remove.
+
+2012-01-02 Paolo Bonzini <bonzini@gnu.org>
+
+ colorize: use isatty module
+ * bootstrap.conf: Add isatty module.
+ * gnulib: Update to latest.
+ * lib/colorize.h: Remove argument from should_colorize.
+ * lib/ms/colorize.h: Likewise.
+ * lib/colorize-impl.c: Factor isatty call out of here...
+ * lib/ms/colorize-impl.c: ... and here...
+ * src/main.c: ... into here.
+
+2012-01-02 Jim Meyering <meyering@redhat.com>
+
+ tests: avoid minor "make check" failure
+ * tests/r-dot: Make executable, to avoid triggering a failed
+ consistency test in "make check".
+
+2012-01-02 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: -r with no args now searches "."
+ This is a patch I've been meaning to put in for years.
+ When I added support for "grep -r", I forgot to have "grep -r PAT"
+ search the working directory by default, instead of searching
+ standard input (which makes no sense, even if stdin is a directory).
+ This is not an upward compatible change, since "grep -r PAT <file"
+ will no longer search standard input, but that's OK; nobody should
+ be using "grep -r" that way anyway.
+ * NEWS: Document this.
+ * doc/grep.texi (File and Directory Selection, grep Programs, Usage):
+ Likewise.
+ * src/main.c (usage): Likewise.
+ (grepdir): If DIR is null, search the working directory, but do
+ not prepend "./" to the file names.
+ (main): If recursing and no operands are given, search ".".
+ * tests/Makefile.am (TESTS): Add r-dot.
+ * tests/r-dot: New file.
+
+ grep: prefer fputs to printf, _ to gettext
+ * lib/colorize.h (print_end_colorize):
+ * lib/ms/colorize-impl.c (print_end_colorize):
+ Use fputs instead of printf.
+ * src/main.c (usage): Likewise. Use _ instead of gettext.
+
+2012-01-01 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: check stdin like other files
+ * NEWS: Document this.
+ * src/main.c (grepfile): Revamp tests for input files so that
+ standard input is tested like other files. For example, report
+ an error if standard input equals standard output.
+ Prefer open+fstat to stat+open if possible, as open+fstat is
+ usually a bit faster and avoids a race condition.
+ * tests/in-eq-out-infloop: Add tests for cases like
+ 'grep pat <file >>file'.
+
+2012-01-01 Jim Meyering <meyering@redhat.com>
+
+ maint: update all copyright year number ranges
+ Run "make update-copyright".
+
+2011-12-31 Paul Eggert <eggert@cs.ucla.edu>
+
+ grep: lower-case function names
+ These names used to be macros, but they're functions now.
+ All callers changed.
+ * src/main.c (pr_sgr_start): Rename from PR_SGR_START.
+ (pr_sgr_end): Rename from PR_SGR_END.
+ (pr_sgr_start_if): Rename from PR_SGR_START_IF.
+ (pr_sgr_end_if): Rename from PR_SGR_END_IF.
+
+ ms: move Microsoft-specific stuff to lib/ms
+ * cfg.mk (exclude_file_name_regexp--sc_prohibit_strcmp)
+ (exclude_file_name_regexp--sc_require_config_h)
+ (exclude_file_name_regexp--sc_require_config_h_first):
+ New rules.
+ * lib/colorize.c, lib/colorize.h, lib/colorize-impl.c:
+ * lib/ms/colorize.h, lib/ms/colorize-impl.c: New files.
+ * configure.ac (GREP_SRC_INCLUDES): New macro.
+ * lib/Makefile.am (libgreputils_a_SOURCES): Add colorize.[ch].
+ (EXTRA_DIST): New macro.
+ * src/Makefile.am (DEFAULT_INCLUDES): New macro.
+ * src/main.c: Include colorize.h.
+ (PR_SGR_START, PR_SGR_END, PR_SGR_START_IF, PR_SGR_END_IF):
+ Now static functions, not macros.
+ (hstdout, norm_attr, w32_console_init, w32_sgr2attr)
+ (w32_clreol) [__MINGW32__]: Move to lib/ms/colorize-impl.c.
+ (pr_sgr_start, pr_sgr_end): Remove; callers changed to use new
+ print_start_colorize, print_end_colorize from colorize.h.
+ (init_colorize): Rename from w32_console_init and move to
+ colorize module; caller changed.
+ (should_colorize): Move to colorize module.
+
+ grep: do input==output check more like dir loop check
+ * src/main.c (grepfile): Just use SAME_INODE; don't bother
+ with SAME_REGULAR_FILE. This works better on properly-working
+ POSIX hosts, since it handles the case where the file is changing
+ as we grep it. It works worse on hosts that don't support st_ino
+ properly, but in practice this isn't that much of a problem here.
+ * src/system.h (same_file_attributes, SAME_REGULAR_FILE):
+ Remove; no longer needed.
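The input==output check described in the entry above can be sketched as follows. This is a minimal illustration, not grep's code: the entry says grep uses the SAME_INODE macro, whereas `same_inode` and `input_is_output` here are hypothetical helpers that compare only the device and inode numbers reported by fstat.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: nonzero when A and B describe the same file,
   judged only by device number and inode number.  */
int
same_inode (const struct stat *a, const struct stat *b)
{
  return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Hypothetical helper: nonzero when descriptors IN_FD and OUT_FD
   refer to the same file.  Returns 0 if either fstat call fails.  */
int
input_is_output (int in_fd, int out_fd)
{
  struct stat in_st, out_st;
  if (fstat (in_fd, &in_st) != 0 || fstat (out_fd, &out_st) != 0)
    return 0;
  return same_inode (&in_st, &out_st);
}
```

Because the comparison ignores size and timestamps, it still matches when the file grows while it is being read, which is the case the entry mentions.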
+
+ build: update gnulib submodule to latest
+
+2011-12-28 Paul Eggert <eggert@cs.ucla.edu>
+
+ maint: remove now-unused/obsolete files
+ * README.DOS: Remove file.
+ * m4/djgpp.m4: Likewise.
+ * .gitignore: Remove reference to m4/djgpp.m4.
+
+2011-12-28 Jim Meyering <meyering@redhat.com>
+
+ maint: distribute ChangeLog-2009
+ * Makefile.am (EXTRA_DIST): Add ChangeLog-2009.
+ Spotted by Eli Zaretskii.
+
+2011-12-28 Jim Meyering <meyering@redhat.com>
+
+ main.c: add some 'const' directives
+ * src/main.c (color_dict, fg_color, bg_color, cap): Declare const.
+
+ No semantic change.
+
+2011-12-28 Jim Meyering <meyering@redhat.com>
+
+ main.c: correct indentation and formatting style
+ * src/main.c: Correct many formatting inconsistencies.
+ No semantic change.
+
+ avoid new syntax-check failures
+ * cfg.mk (old_NEWS_hash): Update, to accommodate old NEWS modification.
+ * src/main.c: Indent solely with spaces, never with TABs.
+ (should_colorize): Remove useless parens in #if directive.
+
+2011-12-28 Eli Zaretskii <eliz@gnu.org>
+
+ Fix whitespace, indentation and documentation
+ * src/main.c (parse_grep_colors): Fix indentation.
+ (usage): Mention MS-Windows in help text for -U and -u options.
+
+ update NEWS for MS-Windows changes
+ * NEWS: Mention MS-Windows related bugfixes and enhancements.
+
+ Fix the test suite for MS-Windows.
+ * tests/include-exclude: Use --directories=skip, to avoid
+ gratuitous failures on systems that cannot grep directories.
+ * tests/reversed-range-endpoints: Don't reject program names with
+ leading directories and drive letters.
+ * tests/warn-char-classes: Likewise.
+
+ Support color highlighting on MS-Windows
+ * src/main.c (SGR_START, SGR_END, PR_SGR_FMT, PR_SGR_FMT_IF): Remove.
+ (PR_SGR_START, PR_SGR_START_IF): Replace with pr_sgr_start.
+ (PR_SGR_END, PR_SGR_END_IF): Replace with pr_sgr_end.
+ (pr_sgr_start, pr_sgr_end, should_colorize): New functions.
+ (w32_console_init, w32_sgr2attr, w32_clreol) [__MINGW32__]: New functions.
+ (main): Use should_colorize. Invoke w32_console_init.
+
+2011-12-24 Paul Eggert <eggert@cs.ucla.edu>
+
+ don't ignore errors when reading a directory
+ grep no longer silently suppresses errors when reading a directory
+ as if it were a text file. For example, "grep x ." now reports a
+ read error on most systems; formerly, it ignored the error.
+ Problem reported as an aside by Bob Proulx (Bug#10355).
+ * NEWS: Document this.
+ * src/main.c (grep, grepfile): Implement this. Simplify the code
+ considerably.
+ * src/system.h (is_EISDIR): Remove; no longer needed.
+
+ --include etc. now work on command-line args more consistently
+ --include and --exclude apply only to non-directories and
+ --exclude-dir applies only to directories. "-" (standard input)
+ is never excluded, since it is not a file name.
+ This bug was discovered while fixing a read-directory bug (Bug#10355).
+ * NEWS: Document this.
+ * src/main.c (main): Implement this.
+ * tests/include-exclude: Test for it.
+
+2011-12-24 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+2011-12-12 Arnold D. Robbins <arnold@skeeve.com>
+
+ doc: improve grep.texi
+ * doc/grep.texi: General editing for improved aesthetics.
+ Also fix a few problems.
+
+2011-12-12 Jim Meyering <meyering@redhat.com>
+
+ build: use gnulib's iswctype wcscoll
+ * bootstrap.conf (gnulib_modules): Add iswctype and wcscoll.
+ * configure.ac: Remove explicit checks for those functions.
+ * src/mbsupport.h (MBS_SUPPORT): Define to 1 if not already defined.
+ Remove the conditional, now that we're guaranteed by gnulib to have
+ wcscoll and iswctype.
+ Suggested by Alan Hourihane in http://savannah.gnu.org/bugs/?34930
+
+ disable the new input==output guard for additional options
+ * src/main.c (grepfile): Do not reject input == output also
+ when using a few other options.
+ * tests/in-eq-out-infloop: Test these new cases.
+ * NEWS (Bug fixes): Mention it.
+
+2011-12-11 Nicolas Vigier <boklm@mars-attacks.org>
+
+ do not reject "grep -qr . > out"
+ The recent fix to avoid an infinite disk-filling loop, commit 5e20a38a,
+ introduced a minor regression. If you use grep with -q and -r, and
+ redirect output to a file that will be traversed, then grep would
+ reject the command, even though it will generate no output.
+ In that case, there is no risk of an infinite loop.
+ * src/main.c (grepfile): Do not reject input == output when
+ using --quiet/--silent (-q).
+ Reported by J H Wilson in http://bugs.mageia.org/show_bug.cgi?id=3501
+ forwarded by Nicolas Vigier to https://savannah.gnu.org/bugs/?34917
+
+2011-11-29 Arnold Robbins <arnold@skeeve.com>
+
+ dfa: do not call nl_langinfo in !MBS_SUPPORT mode
+ * src/dfa.c (using_utf8) [!MBS_SUPPORT]: Remove erroneous "defined"
+ in cpp test for MBS_SUPPORT. Since commit a163349d, MBS_SUPPORT is 0/1.
+ This error caused trouble only in the !MBS_SUPPORT case.
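The difference between testing a 0/1 macro with "defined" and testing its value can be sketched as below. DEMO_FEATURE is a hypothetical stand-in for a macro that is always defined, to either 0 or 1; it is not a name from grep.

```c
/* DEMO_FEATURE stands in for a feature macro that is always defined,
   to either 0 or 1.  Here it is 0, meaning "feature disabled".  */
#define DEMO_FEATURE 0

/* Tests whether the macro exists, which is always true here.  */
int
tested_with_defined (void)
{
#if defined DEMO_FEATURE
  return 1;   /* taken: the macro is defined, even though it is 0 */
#else
  return 0;
#endif
}

/* Tests the macro's value, which is what a 0/1 macro calls for.  */
int
tested_by_value (void)
{
#if DEMO_FEATURE
  return 1;
#else
  return 0;   /* taken: the macro's value is 0 */
#endif
}
```

With the feature set to 0, only the "defined" form selects the enabled branch, which matches the entry's note that the error mattered only in the disabled case.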
+
+ dfa: avoid warning from deficient compiler in !MBS_SUPPORT mode
+ * src/dfa.c (setbit_wc) [!MBS_SUPPORT]: Add explicit "return false;"
+ after "abort ();", to avoid a warning from deficient compilers.
+
+2011-11-29 Jim Meyering <meyering@redhat.com>
+
+ tests: use "compare exp out", not "compare out exp"
+ Likewise, when an empty file is expected, use "compare /dev/null out",
+ not "compare out /dev/null". I.e., specify the expected/desired contents
+ via the first file name. Prompted by a suggestion from Bruno Haible
+ in http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4020/focus=29154
+
+ Run these commands:
+
+ git grep -l -E 'compare [^ ]+ exp' \
+ |xargs perl -pi -e 's/(compare) (\S+) (exp\S*)/$1 $3 $2/'
+ git grep -l -E 'compare [^ ]+ /dev/null' \
+ |xargs perl -pi -e 's/(compare) (\S+) (\/dev\/null)/$1 $3 $2/'
+
+2011-11-29 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+2011-11-28 Jim Meyering <meyering@redhat.com>
+
+ build: accommodate -Werror=suggest-attribute=pure
+ Now that we're using the latest manywarnings module from gnulib,
+ accommodate gcc's -Werror=suggest-attribute=pure option by marking
+ suggested functions with gnulib-defined _GL_ATTRIBUTE_PURE.
+ * src/kwset.c (hasevery): Mark function with pure attribute.
+ (bmexec): Likewise.
+ * src/dfa.c (nsubtoks, istrstr, find_pred, dfamusts): Likewise.
+ * configure.ac: Disable (for lib/) options that seem not to be worth
+ the trouble: -Wunsuffixed-float-constants and -Wformat-nonliteral.
+
+2011-11-21 Bruno Haible <bruno@clisp.org>
+
+ build: fix "make check" error on OSF/1
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Test the value of the variable
+ BASH_VERSION, not the literal ASH_VERSION.
+
+2011-11-21 Jim Meyering <meyering@redhat.com>
+
+ portability: work consistently on *BSD systems
+ * src/dfa.c (is_valid_unibyte_character): Define.
+ (IS_WORD_CONSTITUENT): Use it here, to make grep work consistently
+ even on *BSD systems, which use different tables for ctype macros
+ like isalpha. http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4022
+ With help from Bruno Haible.
+
+2011-11-20 Jim Meyering <meyering@redhat.com>
+
+ maint: consistently use NULL, not 0, when comparing pointers
+ * src/dfa.c (dfaanalyze): Compare trans[s] with NULL, not 0.
+
+ maint: remove an avoidable #ifdef/#endif pair
+ * src/dfa.c (dfaanalyze): Remove avoidable #ifdef around "{".
+
+ tests: fix typo in last change
+ * tests/word-delim-multibyte: Use double quotes around $e_acute,
+ not single quotes. Spotted by Bruno Haible.
+ This and the preceding change do not resolve the XPASS failure
+ on OpenBSD 4.9 after all. See the explanation at
+ http://thread.gmane.org/gmane.comp.gnu.grep.bugs/4022
+
+ tests: avoid unwarranted test failure on *BSD-based systems
+ * tests/word-delim-multibyte (e_acute): Use a more portable
+ representation of e-acute. Reported by Bruno Haible.
+
+2011-11-19 Jim Meyering <meyering@redhat.com>
+
+ maint: accommodate -Wdeclaration-after-statement, but only in dfa.c,
+ and because doing so does not impact readability/maintainability.
+ This is solely to accommodate gawk users who are stuck with ancient gcc.
+ This is no excuse to change any other code in grep.
+ * src/dfa.c (dfaoptimize, parse_bracket_exp): Move declaration
+ to precede first statement in block.
+
+2011-11-16 Jim Meyering <meyering@redhat.com>
+
+ maint: post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.10
+ * NEWS: Record release date.
+
+ build: update gnulib submodule to latest
+
+2011-11-13 Jim Meyering <meyering@redhat.com>
+
+ maint: update bootstrap and init.sh from gnulib
+ * tests/init.sh: Update from gnulib.
+ * bootstrap: Likewise.
+
+2011-11-12 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib for exclude-test fixes
+
+ tests: make our "export" replacement efficient with modern shells
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Use a trivial and efficient
+ implementation with a shell that supports "export var=val".
+ Use the sed-invoking replacement only when necessary.
+ Improved by Stefano Lattarini.
+
+ tests: make the replacement export function more robust
+ * tests/Makefile.am (sed_quote_value): Also quote single quotes.
+ Remove sed's -e options. Not needed.
+
+2011-11-12 Bruno Haible <bruno@clisp.org>
+
+ tests: fix test suite execution failure on OSF/1 5.1
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Use a shell function to
+ ensure that we use only the portable form of the 'export' shell
+ built-in.
+
+ tests: don't assume that /bin/bash exists
+ * tests/fedora: Run using /bin/sh, not /bin/bash.
+
+ tests: avoid unwarranted failures due to SATAN's timeout
+ * tests/init.cfg (require_timeout_): Also ensure that
+ timeout exits with its child's exit status.
+
+ build: fix compilation error on MSVC 9 due to Pexecute() declaration
+ * src/pcresearch.c (WITHOUT_PCRE_NORETURN): Remove macro.
+ (Pexecute): Replace abort() call with code that does not trigger GCC
+ warnings.
+
+ tests: fix high-bit-range test failure on OSF/1 5.1
+ * tests/high-bit-range: Use octal escape instead of hexadecimal escape
+ sequence.
+
+2011-11-11 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib for solaris test fix
+
+2011-11-10 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+ maint: adjust the URL that will appear in the generated announcement
+ * cfg.mk (url_dir_list): Use this http://ftp.gnu.org/gnu/$(PACKAGE)
+ for the first link listed in the generated announcement.
+ announce-gen now provides the faster mirror link automatically.
+
+2011-11-06 Jim Meyering <meyering@redhat.com>
+
+ build: stop distributing gzip'd releases; xz is enough
+ * configure.ac (AM_INIT_AUTOMAKE): Add no-dist-gzip.
+ * NEWS (Build-related): Mention that we're dropping .tar.gz.
+
+ build: update gnulib submodule to latest
+
+2011-10-14 Stefano Lattarini <stefano.lattarini@gmail.com>
+
+ distcheck: ensure dist-hook fails if syntax-check fails
+ * Makefile.am (run-syntax-check): Fix logic, to ensure that
+ the recipe of this target returns a non-zero exit status if
+ "make syntax-check" fails.
+
+2011-10-12 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+ This should fix a few portability problems, including one on HP-UX
+ and a test-float failure on PPC, reported by Andreas Metzler.
+
+2011-10-10 Stefano Lattarini <stefano.lattarini@gmail.com>
+
+ gitignore: merge top-level and tests/ .gitignore files
+ * tests/.gitignore: Remove; what little remained of its
+ contents has been moved ...
+ * .gitignore: ... here.
+
+ tests: tiny simplification in TESTS_ENVIRONMENT definition
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Remove redundant use of
+ `export'.
+
+2011-10-10 Stefano Lattarini <stefano.lattarini@gmail.com>
+
+ tests: support development version of automake too
+ This change implements a more correct and idiomatic use of the
+ features of the Automake-provided 'parallel-tests' harness.
+ Moreover, this change is required in order for the testsuite to
+ continue to work with the new testsuite harness that is planned
+ to be introduced in Automake 1.12 (which, as of the writing date,
+ is still under development and in late alpha state).
+
+ * tests/Makefile.am (TESTS_ENVIRONMENT): The development version of
+ automake does not support setting the interpreter delegated to run
+ the tests scripts in this variable; instead, use ...
+ (LOG_COMPILER): ... this variable.
+ * .gitignore: Ignore `.trs' files in directory `tests/'.
+ * build-aux/.gitignore: Ignore `test-driver' script.
+
+2011-10-03 Eli Zaretskii <eliz@gnu.org>
+
+ dfa: don't mishandle high-bit bytes in a regexp with signed-char
+ This appears to arise only on systems for which "char" is signed.
+ * src/dfa.c (FETCH_WC, FETCH): Produce an unsigned value, rather
+ than a sign-extended one. Fixes a bug on MS-Windows with compiling
+ patterns that include characters with the 8-th bit set.
+ (to_uchar): Define. From coreutils.
+ Reported by David Millis <tvtronix@yahoo.com>.
+ See http://thread.gmane.org/gmane.comp.gnu.grep.bugs/3893
+ * NEWS (Bug fixes): Mention it.
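The sign-extension problem behind the entry above can be sketched as follows. The entry names a to_uchar helper taken from coreutils; the one-line definition here is an assumption about what such a helper looks like, and `fetch_byte` is a hypothetical caller.

```c
/* Convert a char to unsigned char, so that a byte with the high bit
   set yields a value in 128..255 even where plain char is signed.  */
unsigned char
to_uchar (char ch)
{
  return ch;
}

/* Hypothetical fetch routine: return the byte at P as a non-negative
   int.  Writing "return *p;" instead would sign-extend high-bit bytes
   to negative values on hosts where char is signed.  */
int
fetch_byte (const char *p)
{
  return to_uchar (*p);
}
```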
+
+2011-09-16 Jim Meyering <meyering@redhat.com>
+
+ maint: dfa: simplify multi-byte-related conditionals
+ * src/dfa.c (setbit_case_fold_c, parse_bracket_exp, lex):
+ (addtok_mb, dfaparse): Change each "MBS_SUPPORT && MB_CUR_MAX > 1"
+ test to just "MB_CUR_MAX > 1".
+ * src/dfasearch.c (kwsincr_case, EGexecute): Likewise.
+ * src/kwsearch.c (Fcompile, Fexecute): Likewise.
+ * src/searchutils.c (kwsinit): Likewise.
+ * src/dfa.c (parse_bracket_exp): Convert
+ "if (!MBS_SUPPORT || MB_CUR_MAX == 1)" to
+ "if (MB_CUR_MAX == 1)" and do this:
+ - assert(!MBS_SUPPORT || MB_CUR_MAX == 1);
+ + assert(MB_CUR_MAX == 1);
+
+ maint: dfa: simplify several expressions
+ * src/dfa.c (dfainit): Set d->mb_cur_max unconditionally, now
+ that MB_CUR_MAX is always usable. With that, simplify all
+ "MBS_SUPPORT && d->mb_cur_max > 1" to simply "d->mb_cur_max > 1".
+ (dfastate, dfaexec, dfainit, dfafree): Simplify, removing each
+ now-unnecessary "MBS_SUPPORT &&".
+
+ maint: dfa: avoid in-function "#if MBS_SUPPORT" tests
+ * src/dfa.c (setbit_case_fold_c): Remove "#if MBS_SUPPORT" in favor
+ of simple "if (MBS_SUPPORT ...".
+ (dfaexec, addtok): Likewise.
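The conversion from "#if MBS_SUPPORT" to a plain "if" can be sketched as below, assuming a macro that is always defined to 0 or 1. DEMO_MBS_SUPPORT and `use_multibyte_path` are hypothetical names for this illustration.

```c
#include <stddef.h>

/* Stand-in for a configuration macro that is always 0 or 1.  */
#ifndef DEMO_MBS_SUPPORT
# define DEMO_MBS_SUPPORT 1
#endif

/* Nonzero when the multibyte code path should run.  MB_CUR_MAX_VALUE
   is passed in so that the sketch does not depend on the locale.  */
int
use_multibyte_path (size_t mb_cur_max_value)
{
  /* An ordinary "if" on a constant macro: when the macro is 0 the
     compiler can drop the branch, but it still parses and type-checks
     it, unlike code hidden behind "#if".  */
  if (DEMO_MBS_SUPPORT && mb_cur_max_value > 1)
    return 1;
  return 0;
}
```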
+
+ maint: ensure that MB_CUR_MAX is defined even when !MBS_SUPPORT
+ * src/mbsupport.h [!MBS_SUPPORT] (MB_CUR_MAX): Define to 1.
+
+ build: fix compilation failure when MBS_SUPPORT is 0
+ * src/dfa.c (add_utf8_anychar): Always compile this function,
+ but when MBS_SUPPORT is 0, give it an empty body.
+ (prepare_wc_buf): Likewise.
+ [! MBS_SUPPORT] (setbit_wc): Define to always abort.
+
+ maint: dfa: simplify dfaoptimize
+ * src/dfa.c (dfaoptimize): Simplify.
+ (dfacomp): Remove now-redundant "if (MBS_SUPPORT)" guard,
+ since dfaoptimize does nothing if !MBS_SUPPORT.
+
+ maint: dfa: remove some #if MBS_SUPPORT guards
+ * src/dfa.c: Replace a few "#if MBS_SUPPORT" directives with
+ "if (MBS_SUPPORT)". Remove some altogether.
+
+ maint: dfa: convert #if-MBS_SUPPORT (dfastate)
+ * src/dfa.c (dfastate): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (dfastate)
+ * src/dfa.c (dfastate): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (state_index)
+ * src/dfa.c (state_index): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (dfaparse)
+ * src/dfa.c (dfaparse): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (copytoks)
+ * src/dfa.c (copytoks): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (lex)
+ * src/dfa.c (lex): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp)
+ * src/dfa.c (parse_bracket_exp): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp)
+ * src/dfa.c (parse_bracket_exp): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp)
+ * src/dfa.c (parse_bracket_exp): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (dfaexec)
+ * src/dfa.c (dfaexec): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (dfaexec)
+ * src/dfa.c (dfaexec): Use regular "if", not #if MBS_SUPPORT.
+ Also add curly braces around multi-line if/else blocks.
+
+ maint: dfa: remove #if-MBS_SUPPORT (free_mbdata)
+ * src/dfa.c (free_mbdata): Remove the #if guard altogether.
+
+ maint: dfa: convert #if-MBS_SUPPORT (dfaoptimize, dfacomp)
+ * src/dfa.c (dfaoptimize, dfacomp): Use regular "if",
+ not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (dfafree)
+ * src/dfa.c (dfafree): Use regular "if", not #if MBS_SUPPORT.
+
+ maint: dfa: convert #if-MBS_SUPPORT (parse_bracket_exp, part1)
+ * src/dfa.c (parse_bracket_exp): Remove in-function #if MBS_SUPPORT.
+
+ maint: remove #if-MBS_SUPPORT declaration guards
+ * src/search.h: Don't bother to #if-out declarations.
+
+ maint: convert #if-MBS_SUPPORT (EGexecute)
+ * src/dfasearch.c (EGexecute): Remove in-function #if MBS_SUPPORT.
+
+ maint: convert #if-MBS_SUPPORT (kwsincr_case)
+ * src/dfasearch.c (kwsincr_case): Remove in-function #if MBS_SUPPORT.
+ Move decl's down.
+
+ maint: convert #if-MBS_SUPPORT (Fcompile, etc.)
+ * src/kwsearch.c (Fcompile, Fexecute): Remove in-function #if MBS_SUPPORT.
+ (Fcompile): Rearrange some declarations. No semantic change.
+
+ maint: convert #if-MBS_SUPPORT (kwsinit)
+ * src/searchutils.c (kwsinit): Remove in-function #if MBS_SUPPORT.
+
+ maint: dfa: remove case-guarding #if-MBS_SUPPORT
+ * src/dfa.c [DEBUG] (prtok): Remove now-useless #if-MBS_SUPPORT.
+
+2011-09-15 Jim Meyering <meyering@redhat.com>
+
+ maint: remove #if MBS_SUPPORT around member declaration
+ * src/dfa.c (dfastate): Don't #ifdef-out "mbps" position_set member.
+
+ maint: dfa: remove #if MBS_SUPPORT around struct definition
+ * src/dfa.c (struct mb_char_classes): Don't #ifdef-out declarations.
+
+ build: avoid compilation failure when building without PCRE support
+ * src/pcresearch.c [!HAVE_LIBPCRE] (WITHOUT_PCRE_NORETURN): Define
+ to _Noreturn, not obsoleted-by-gnulib _GL_ATTRIBUTE_NORETURN.
+ Reported by Eric Blake.
+
+ tests: stop using skip_test_; use skip_ instead
+ * tests/init.cfg (skip_test_): Remove definition. Use the improved
+ skip_ function from init.sh, now that it has the same feature.
+ * tests/euc-mb: s/skip_test_/skip_/
+ * tests/sjis-mb: Likewise.
+ * tests/fmbtest: Likewise.
+
+ tests: skip tests that require MBS support
+ * tests/init.cfg (require_compiled_in_MB_support): New function.
+ * tests/char-class-multibyte: Use it here, since this test cannot
+ succeed without MBS support.
+ * tests/equiv-classes: Likewise.
+ * tests/euc-mb: Likewise.
+ * tests/fgrep-infloop: Likewise.
+ * tests/init.cfg: Likewise.
+ * tests/prefix-of-multibyte: Likewise.
+ * tests/turkish-I: Likewise.
+ * tests/sjis-mb: Likewise.
+
+ tests: make fmbtest explain (to stderr, not log) why it is skipped
+ * tests/fmbtest: Use skip_ and fail_ to give better diagnostics.
+
+ maint: dfa: improve comments
+ * src/dfa.c (match_mb_charset, match_anychar): Improve comments.
+
+2011-09-14 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to newer
+
+ maint: correct indentation
+ * src/dfa.c (dfaexec): Reposition curly braces to match indentation style.
+ Remove useless comment.
+
+ maint: move declaration "down" to inner scope where it is used
+ * src/dfa.c (dfaexec): Move decl of local down into scope where used.
+
+2011-09-07 Jim Meyering <meyering@redhat.com>
+
+ doc: use "file name" consistently in grep's --help output
+ * src/main.c (usage): Use "file name", not "filename" in descriptions
+ of --with-filename (-H), --no-filename (-h) and --label=LABEL.
+ Suggested by Sequoia McDowell.
+
+2011-08-31 Matthew Burgess <matthew@linuxfromscratch.org>
+
+ tests: remove debug code that would cp to /t
+ * tests/unibyte-bracket-expr: Remove debug artifact introduced
+ by 2011-06-02 commit de5f7000, "tests: exercise a uni-byte [...]
+ bug: requires ru_RU.KOI8-R". [bug introduced in grep-2.9]
+
+2011-08-20 Jim Meyering <meyering@redhat.com>
+
+ build: use largefile module and update to latest gnulib
+ * configure.ac: Remove AC_SYS_LARGEFILE, subsumed by ...
+ * bootstrap.conf (gnulib_modules): ...this. Use largefile module.
+ * gnulib: Update to latest.
+
+ maint: clean up and plug a leak-on-OOM
+ * src/dfa.c (icatalloc): Clean up; use xrealloc in place of malloc
+ and realloc; remove conditionals that are unnecessary, now that
+ failed allocation results in exit.
+ (enlist): Use xrealloc in place of realloc; remove conditional.
+ (comsubs): Avoid leak upon failed enlist call.
+ (dfamust): Use xmalloc in place of malloc.
+ Remove conditionals, now that icpyalloc and icatalloc never return NULL.
+
+ maint: use x2nrealloc, not xrealloc
+ * src/main.c (main): Use x2nrealloc, not xrealloc.
+
+2011-07-24 Jim Meyering <meyering@redhat.com>
+
+ tests: add a test to trigger the bug
+ * tests/Makefile.am (TESTS): Add it.
+ * tests/in-eq-out-infloop: Exercise the bug/fix.
+
+ exit 2 (rather than infloop) when an input file is also on stdout
+ This avoids a potential "infinite" disk-filling loop.
+ Reported in http://savannah.gnu.org/patch/?5316
+ and http://savannah.gnu.org/bugs/?17457.
+ * src/main.c: Include "quote.h".
+ (out_stat): New global.
+ (grepfile): Compare each regular file's dev/ino/etc.
+ with those from the file on stdout (if it too is regular).
+ (main): Set out_stat, if stdout is a regular file.
+ * src/system.h: Include "same-inode.h".
+ (same_file_attributes): Define. From diffutils.
+ (SAME_REGULAR_FILE): Define.
+ * bootstrap.conf (gnulib_modules): Use quote, not quotearg.
+ Use same-inode.
+ * NEWS (Bug fixes): Mention it.
+
+2011-07-15 Reuben Thomas <rrt@sc3d.org>
+
+ doc: improve documentation of character classes in the man page
+ * doc/grep.in.1: Reword documentation of character classes.
+
+2011-07-12 Jim Meyering <meyering@redhat.com>
+
+ dfa: remove unnecessary inclusion of verify.h
+ * src/dfa.c: Don't include "verify.h".
+
+ dfa: simplify use of *ALLOC macros
+ * src/dfa.c (XNMALLOC, XCALLOC): Redefine without outer cast-to-(t *).
+ (CALLOC, MALLOC, REALLOC): Remove type "t" parameter and adjust callers.
+
+ dfa: change semantics of REALLOC_IF_NECESSARY's 3rd parameter
+ * src/dfa.c (REALLOC_IF_NECESSARY): Change meaning of 3rd param,
+ from "maximum index" to 1 greater than that: the required number
+ of *P-sized elements. Note that only some of the uses of
+ REALLOC_IF_NECESSARY needed to be adjusted, the others had already
+ required an extra element.
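The "required number of elements" convention described above can be sketched as a function. This is a hypothetical illustration and not the REALLOC_IF_NECESSARY macro itself: `ensure_room` and its doubling policy are assumptions, and it reports failure by return value rather than exiting.

```c
#include <stdint.h>
#include <stdlib.h>

/* Ensure that *P has room for at least N_REQUIRED ints, growing the
   allocation by doubling.  *N_ALLOC is the current capacity in
   elements.  Returns 0 on success, -1 on overflow or allocation
   failure, in which case *P and *N_ALLOC are left unchanged.  */
int
ensure_room (int **p, size_t *n_alloc, size_t n_required)
{
  if (n_required <= *n_alloc)
    return 0;

  size_t new_n_alloc = *n_alloc ? *n_alloc : 1;
  while (new_n_alloc < n_required)
    {
      /* Refuse to double if the byte count would overflow size_t.  */
      if (new_n_alloc > SIZE_MAX / 2 / sizeof **p)
        return -1;
      new_n_alloc *= 2;
    }

  int *q = realloc (*p, new_n_alloc * sizeof **p);
  if (!q)
    return -1;
  *p = q;
  *n_alloc = new_n_alloc;
  return 0;
}
```

Under this convention a caller that wants to store at index i asks for i + 1 elements, rather than passing i as a "maximum index".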
+
+ dfa: rename REALLOC_IF_NECESSARY param/local for clarity
+ * src/dfa.c (REALLOC_IF_NECESSARY): Rename nalloc and new_nalloc
+ to n_alloc and new_n_alloc.
+
+ dfa: prepare for a semantic change in REALLOC_IF_NECESSARY
+ * src/dfa.c (REALLOC_IF_NECESSARY): Remove "t" (type) parameter.
+ Use (*p) instead. Adjust all callers.
+
+ dfa: add braces to REALLOC_IF_NECESSARY definition
+ * src/dfa.c (REALLOC_IF_NECESSARY): Add curly braces; use TABs
+ to right-indent.
+
+2011-06-28 Paolo Bonzini <bonzini@gnu.org>
+
+ doc: improve documentation of character classes
+ * doc/grep.texi (Character classes): Mention explicitly when
+ examples refer to the C locale, explain better the general
+ meaning of character classes.
+
+2011-06-28 Jim Meyering <meyering@redhat.com>
+
+ dfa: fix the root cause of the heap overrun
+ dfa's "insert" function was supposed to be maintaining the position
+ list sorted on *decreasing* index, but since the 2009-12-09 "Speed
+ up insert" commit, 62458291, it was using code that assumed the data
+ were sorted on *increasing* index. As such, sometimes it would no
+ longer merge constraints (not finding a match) and would append
+ entries that normally would have matched and been merged. Those
+ erroneous append operations resulted in the heap overrun fixed by
+ 2011-06-17 commit 0b91d692 by doubling the array size.
+ * src/dfa.c (insert): Fix the comparison.
+ (dfaanalyze): Now that that's fixed, revert commit 0b91d692,
+ allocating space for only d->nleaves entries, not double that.
+ As far as I can tell, this change has no effect other than
+ decreased memory usage, although it may improve performance
+ slightly, since the resulting list of positions is half as long
+ as it used to be.
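An insertion that keeps a position list sorted on decreasing index, and merges an entry whose index is already present, might look like the sketch below. The types, the fixed capacity, and the use of bitwise OR to merge constraints are assumptions made for illustration; this is not the code from src/dfa.c.

```c
#include <stddef.h>

typedef struct
{
  size_t index;             /* position in the parsed pattern */
  unsigned int constraint;  /* context constraint bits */
} position;

enum { MAX_POSITIONS = 16 };

typedef struct
{
  position elems[MAX_POSITIONS];
  size_t nelem;
} position_set;

/* Insert P into S, keeping S sorted by DECREASING index.  If an
   element with the same index exists, merge the constraints instead
   of appending a duplicate.  Returns 0 on success, -1 if S is full.  */
int
insert_position (position p, position_set *s)
{
  size_t lo = 0, hi = s->nelem;

  /* Binary search for the first element whose index is <= p.index.
     The comparison direction must match the sort order; searching as
     if the data were increasing would miss existing entries.  */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (s->elems[mid].index > p.index)
        lo = mid + 1;
      else
        hi = mid;
    }

  if (lo < s->nelem && s->elems[lo].index == p.index)
    {
      s->elems[lo].constraint |= p.constraint;
      return 0;
    }

  if (s->nelem == MAX_POSITIONS)
    return -1;

  /* Shift the tail right by one slot and store P at LO.  */
  for (size_t i = s->nelem; i > lo; i--)
    s->elems[i] = s->elems[i - 1];
  s->elems[lo] = p;
  s->nelem++;
  return 0;
}
```

With a correct comparison, repeated indices are merged, so the list never holds more than one entry per index.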
+
+2011-06-28 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: use memcpy to copy position_sets
+ * src/dfa.c (copy): Use memcpy.
+
+ dfa: use copyset to copy charclasses
+ * src/dfa.c (add_utf8_anychar): Change memcpy to copyset.
+
+ gnulib: Update
+ Fixes mmap-anon.m4 conflict with fn_grep, reported by Rainer Orth.
+
+2011-06-21 Jim Meyering <meyering@redhat.com>
+
+ maint: update bootstrap from gnulib
+ * bootstrap: Update to latest, so it no longer inserts empty lines
+ in .gitignore files.
+ * .gitignore: Let bootstrap move "!..." lines to end of file.
+
+ post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.9
+ * NEWS: Record release date.
+
+ build: avoid a warning when building with --disable-perl-regexp...
+ and --enable-gcc-warnings.
+ * src/pcresearch.c (WITHOUT_PCRE_NORETURN): Define.
+ Remove the unreachable return statement.
+ Reported by Eric Blake.
+
+ tests: ensure that each test script is executable
+ This adds a rule run at "make check" time to ensure that
+ test scripts are consistently executable.
+ This change is not required for "make check", but it makes it easier
+ for people to run scripts manually. Doing so is discouraged, though,
+ because it makes it easy to omit important variable settings that
+ are normally provided via TESTS_ENVIRONMENT.
+ This change also makes each of the existing TESTS executable.
+ * tests/Makefile.am (check_executable_TESTS): New rule.
+ (check): Depend on it.
+ * tests/{all_scripts}: chmod 755.
+ Prompted by a report from Eric Blake.
+
+ maint: update bootstrap from gnulib
+ * bootstrap: Update from gnulib.
+
+ maint: update po/POTFILES.in
+ * po/POTFILES.in: Remove dfasearch.c, now that it no longer
+ contains a translatable diagnostic.
+
+ tests: include-exclude: avoid false positive failure on FreeBSD
+ * tests/include-exclude: Avoid false-positive failure due to
+ matching "a" in a directory on FreeBSD, when searching a directory
+ without "-r". Search for '^aaa$' rather than just 'a'.
+ Adjust test inputs and expected output files accordingly.
+
+ dfa: remove some useless casts
+ * src/dfa.c (icatalloc): Change type of "old" parameter
+ from "char const *" to "char *".
+ Don't cast-away const on realloc argument.
+ Remove now-unnecessary const-discarding cast.
+ Don't (void)-cast strcpy result.
+ * src/dosbuf.c (undossify_input): Remove anachronistic
+ cast-to-"char *" of realloc argument.
+
+ dfa: more heap-allocation-related overflow protection
+ * src/dfa.c (enlist): Use xnrealloc, not realloc.
+ Also, remove unnecessary cast-to-(char *).
+ (dfamust): Use xnmalloc, not malloc. Before, this code would
+ return upon malloc failure (xnmalloc exits upon failure), but
+ later, via the *ALLOC macros, it could already exit, so this
+ new potential exit point is nothing new. The same applies
+ to enlist, since it is called only through dfamust.
+
+ tests: update init.sh; simplify TESTS_ENVIRONMENT
+ * tests/init.sh: Update from coreutils.
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Remove shell_or_perl_
+ function. Instead, just use $(SHELL), since grep has no test
+ that starts with #!/usr/bin/perl.
+
+2011-06-20 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+ build: avoid configure/gnulib-related errors
+ * bootstrap.conf: Remove now-unnecessary code to exclude
+ gettext/intl-related m4 tests.
+
+2011-06-19 Jim Meyering <meyering@redhat.com>
+
+ maint: tighten up superfluous code
+ * src/main.c (parse_grep_colors): Use xstrdup in place of xmalloc,
+ a useless test, strlen, and strcpy.
+
+2011-06-19 Paul Eggert <eggert@cs.ucla.edu>
+
+ dfa: avoid possibility of overflow
+ * src/dfa.c (REALLOC_IF_NECESSARY, CALLOC, MALLOC, REALLOC):
+ Use functions from xalloc.h to avoid overflow.
+ * src/dfasearch.c (GEAcompile): Use xnrealloc rather than realloc.
+ * src/pcresearch.c (Pcompile): Use xnmalloc, not xmalloc.
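The overflow these allocators guard against is the multiplication of an element count by an element size. A minimal sketch of that check follows. `checked_nmalloc` is a hypothetical name, and unlike xnmalloc, which an entry above describes as exiting upon failure, this sketch returns NULL so that the behavior is easy to observe.

```c
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for N objects of S bytes each.  Return NULL if N * S
   would overflow size_t or if malloc fails.  */
void *
checked_nmalloc (size_t n, size_t s)
{
  /* Test by division, since the product itself may already have
     wrapped around by the time it could be inspected.  */
  if (s != 0 && n > SIZE_MAX / s)
    return NULL;
  return malloc (n * s);
}
```

A bare "malloc (n * s)" with a wrapped product would return a buffer much smaller than the caller expects, which is how an arithmetic overflow turns into a heap overrun.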
+
+2011-06-17 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+ dfa: correct two uses of btowc
+ * src/dfa.c (setbit_c, setbit_case_fold_c): Compare the btowc
+ return value against WEOF, not EOF. Suggested by Eli Zaretskii.
+ On a system like MinGW with unsigned wint_t, comparing a btowc
+ return value against EOF (-1) would always be false.
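The corrected comparison can be sketched as below. `byte_is_character` is a hypothetical helper, not a function from src/dfa.c; it shows only that the result of btowc is compared against WEOF, the failure value that btowc is specified to return.

```c
#include <stdio.h>
#include <wchar.h>

/* Nonzero when byte C, taken alone, is a complete character in the
   current locale.  btowc returns a wint_t, so its failure value is
   WEOF; comparing against EOF (an int, typically -1) can never match
   when wint_t is an unsigned type narrower than int.  */
int
byte_is_character (int c)
{
  return btowc (c) != WEOF;
}
```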
+
+ dfa: don't overrun a malloc'd buffer for certain regexps
+ * src/dfa.c (dfaanalyze): Allocate space for twice as many
+ positions as there are leaves. Before this change, for some
+ regular expressions, DFA analysis would have inserted far more
+ "positions" than dfa->nleaves (up to double).
+ Reported by Raymond Russell in http://savannah.gnu.org/bugs/?33547
+ * tests/dfa-heap-overrun: Trigger the overrun.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+
+2011-06-08 Jim Meyering <meyering@redhat.com>
+
+ tests: don't ignore sjis-mb test failure
+ I made changes that caused grep to segfault during "make check" --
+ as seen in dmesg output -- yet no test failed(!), and there was no
+ trace of the segfault in the logs.
+ * tests/sjis-mb (test_grep_reject): Ensure that output is empty.
+ Don't ignore test failure.
+
+2011-06-07 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: optimize wide characters in a bracket expression
+ * src/dfa.c (addtok): Compile characters to an alternation. Handle the
+ case when nothing else remains in the MBCSET.
+
+ dfa: refactor to prepare for upcoming optimizations
+ * src/dfa.c (parse_bracket_exp): Move optimization of MBCSET from here...
+ (addtok): ... to here.
+
+2011-06-07 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: correct handling of single-byte character ranges
+ This provides a better fix for the unibyte-bracket-expr and high-bit-range
+ testcases, and fixes the latent bug tested by bogus-wctob.
+
+ * src/dfa.c (setbit_case_fold): Remove, replace with...
+ (setbit_wc, setbit_c, setbit_case_fold_c): ... these.
+ (parse_bracket_exp): Use setbit_case_fold_c when iterating over
+ single-byte sequences. Use setbit_wc for multi-byte character sets,
+ and setbit_case_fold_c for single-byte character sets.
+ (lex): Use setbit_case_fold_c for single-byte character sets.
+
+2011-06-07 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: exercise latent bug in character ranges
+ * tests/bogus-wctob: New.
+ * Makefile.am (TESTS): Add it.
+
+2011-06-07 Jim Meyering <meyering@redhat.com>
+
+ tests: exercise a uni-byte [...] bug: requires ru_RU.KOI8-R
+ * tests/unibyte-bracket-expr: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * init.cfg (require_ru_RU_koi8_r): New function.
+
+ fix the [...] bug also for relatively unusual uni-byte encodings
+ * src/dfa.c (setbit_case_fold): Also handle uni-byte locales
+ like the one mentioned in the original report: see 2011-05-07
+ commit d98338eb. Re-reported by Santiago Ruano Rincón.
+ Note that most uni-byte locales are not affected.
+ * NEWS (Bug fixes): Mention it.
+
+ tests: use skip_test_, not skip_
+ Use skip_test_, not skip_. The former prints its message both to
+ the log file and to FD 9 (redirected to tty via tests/Makefile.am),
+ while skip_ prints only to stderr, which goes to the log file.
+ * tests/init.cfg (skip_test_): New function.
+ Use skip_test_ in place of skip_ everywhere.
+ * tests/fmbtest: s/skip_/skip_test_/
+ * tests/sjis-mb: Likewise.
+ * tests/euc-mb: Likewise.
+
+ tests: fmbtest: factor
+ * tests/fmbtest: Factor out locale-name duplication.
+
+ tests: fix skip-inducing typo in fmbtest
+ * tests/fmbtest: Fix locale name typo (s/cz_CZ/cs_CZ/)
+ that would cause this test to be skipped every time.
+
+2011-06-07 Paolo Bonzini <bonzini@gnu.org>
+
+ gnulib: adjust included modules
+ * bootstrap.conf (gnulib_modules): Drop strtoul, rename wctype to
+ wctype-h.
+
+2011-05-21 Jim Meyering <meyering@redhat.com>
+
+ grep -P: don't abort upon exceeding PCRE's backtracking limit
+ * src/pcresearch.c (Pexecute): Handle PCRE_ERROR_MATCHLIMIT.
+ * tests/Makefile.am (XFAIL_TESTS): Remove pcre-abort.
+ * tests/pcre-abort: Expect failure, no output, and increase
+ the length of the input string, in case the backtracking limit
+ is ever raised. Adjust comment.
+ * NEWS (Bug fixes): Mention it.
+
+ tests: show how to make grep -P abort
+ * tests/pcre-abort: New file.
+ Minimal testcase by Paolo Bonzini, derived from a report
+ by www.beaver@list.ru.
+ * tests/Makefile.am (TESTS): Add it.
+ (XFAIL_TESTS): Add it here, too, since this test always fails, for now.
+
+ tests: fix oddities in pcre-z
+ * tests/pcre-z: Redirect stderr inside $(), not outside.
+ Remove double quotes around $REGEX (which is just 'a') within
+ double-quoted "$(...)". Split a long line.
+
+ tests: factor out a new require_pcre_ function
+ * tests/init.cfg (require_pcre_): New function, factored out of...
+ * tests/pcre-z: ...here. Use the function.
+ * tests/pcre: Likewise.
+
+ tests: clean up pcre
+ * tests/pcre: Skip (don't pass) the test when PCRE support is disabled.
+ Don't redirect so much to /dev/null, now that all test output goes to
+ pcre.log. Remove unnecessary braces and diagnostic about failing test.
+
+2011-05-13 Jim Meyering <meyering@redhat.com>
+
+ post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.8
+ * NEWS: Record release date.
+
+ build: update gnulib, for fixed getcwd test
+
+ build: update gnulib submodule to latest
+
+ maint: remove syntax-checking sc_tight_scope rule
+ * src/Makefile.am (sc_tight_scope): Remove rule.
+ Now it's provided via gnulib's maint.mk.
+ * cfg.mk (sc_tight_scope): Likewise.
+
+2011-05-08 Jim Meyering <meyering@redhat.com>
+
+ maint: use consistent declaration syntax
+ * src/grep.h (matchers): Declare consistently, so the sc_tight_scope
+ rule detects this as an extern-marked variable.
+
+2011-05-07 Jim Meyering <meyering@redhat.com>
+
+ maint: use gnulib's new readme-release module
+ * bootstrap.conf (gnulib_modules): Add readme-release.
+ (bootstrap_epilogue): Add the recommended perl one-liner.
+ * README-release: Remove file; it is now generated from gnulib.
+ * .gitignore: Add it.
+ * gnulib: Update submodule to latest.
+
+ tests: exercise bug with 0x80..0xff in [...]
+ * tests/high-bit-range: New test, inspired by an example in the
+ report by Igor O. Ladygin: http://bugs.debian.org/624387,
+ via Santiago Ruano Rincón's http://savannah.gnu.org/bugs/?33198
+ * tests/Makefile.am (TESTS): Add it.
+
+ fix a bug whereby echo c|grep '[c]' would fail for any c in 0x80..0xff
+ * src/dfa.c (setbit_case_fold) [MBS_SUPPORT]: Set the bit also
+ when wctob returns EOF.
+ * NEWS (Bug fixes): Mention it.
+
+2011-05-02 Reuben Thomas <rrt@sc3d.org>
+
+ doc: correct comment about mmap
+ * doc/grep.texi (Other Options) [--mmap]: This option is now
+ ignored, so using it can have no effect on performance.
+
+2011-05-02 Arnold D. Robbins <arnold@skeeve.com>
+
+ build: move add_utf8_anychar into MBS ifdef
+
+2011-05-01 Arnold D. Robbins <arnold@skeeve.com>
+
+ maint: remove GAWK ifndef; no longer needed
+
+2011-05-01 Jim Meyering <meyering@redhat.com>
+
+ maint: remove now-unnecessary use of gnulib's strtol module
+ * bootstrap.conf (gnulib_modules): Remove now-obsolete "strtol".
+
+2011-04-29 Jim Meyering <meyering@redhat.com>
+
+ maint: tweak README-release
+ * README-release: Add note to check the NixOS/Hydra autobuilder results.
+
+2011-04-28 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+ maint: add the tight_scope syntax-checking rule
+ This ensures that the only externally scoped symbols are ones
+ that are explicitly marked as "extern" or white-listed like "main".
+ * src/Makefile.am (sc_tight_scope): New rule, copied from coreutils.
+ * cfg.mk (sc_tight_scope): Define, to hook to it from the top level.
+
+ maint: mark some function declarations as extern
+ * src/search.h: Add "extern" keyword to each function declaration.
+
+2011-04-23 Jim Meyering <meyering@redhat.com>
+
+ maint: fix doubled-word typos in comments
+ * src/dfa.c (SUCCEEDS_IN_CONTEXT): Remove doubled "a".
+ * src/dfa.c (BACKREF): s/it it/it is/
+
+2011-04-09 Jim Meyering <meyering@redhat.com>
+
+ maint: fix typos in comments: s/can not/cannot/
+ * src/dfa.c (check_matching_with_multibyte_ops, dfastate): As above.
+
+2011-03-19 Jim Meyering <meyering@redhat.com>
+
+ maint: stop using .x-sc_* files to list syntax-check exemptions
+ Instead, use the new mechanism: in cfg.mk, merely set a variable
+ (its name derived from the rule name) to an ERE
+ matching the exempted file names.
+ * gnulib: Update to latest, to get maint.mk that implements this.
+ * .x-sc_bindtextdomain: Remove file.
+ * .x-sc_prohibit_tab_based_indentation: Likewise.
+ * .x-sc_prohibit_xalloc_without_use: Likewise.
+ * .x-sc_space_tab: Likewise.
+ * cfg.mk: Define variables to exempt the same files.
+
+ build: correct my change of 2011-01-28
+ Do not override original dist-hook rule.
+ * Makefile.am (run-syntax-check): Rename from overriding dist-hook.
+ (dist-hook): Depend on run-syntax-check.
+
+2011-02-27 Jim Meyering <meyering@redhat.com>
+
+ maint: update from gnulib
+ * bootstrap: Update from gnulib.
+ * tests/init.sh: Likewise.
+ * gnulib: Update to latest.
+
+2011-01-27 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+ build: run syntax-check rules as part of "make dist"
+ * Makefile.am (dist-hook): Depend on syntax-check.
+ Suggested by Reuben Thomas.
+
+2011-01-26 Jim Meyering <meyering@redhat.com>
+
+ maint: remove unneeded #include directives
+ * lib/savedir.c: Don't include <stddef.h>. Not needed.
+ * src/dfa.c: Likewise.
+
+2011-01-22 Jim Meyering <meyering@redhat.com>
+
+ build: avoid new syntax-check failures
+ * .x-sc_bindtextdomain: New file, used to avoid a spurious
+ failure from the new syntax-check rule.
+ * NEWS: Remove a trailing space.
+
+2011-01-19 Jim Meyering <meyering@redhat.com>
+
+ tests: add a known-to-fail test
+ * tests/turkish-I: New test.
+ * tests/Makefile.am (TESTS): Add it.
+ (XFAIL_TESTS): Add here, too.
+ Reported by Ilya Basin.
+
+ maint: sort test names in Makefile.am
+ * tests/Makefile.am (TESTS): Sort test names.
+
+2011-01-05 Jim Meyering <meyering@redhat.com>
+
+ doc: remove erroneous "{,m}" item from grep man page
+ * doc/grep.in.1: Remove item describing bogus {,m} regex notation.
+ Reported by Fernando Basso.
+
+2011-01-03 Jim Meyering <meyering@redhat.com>
+
+ maint: update copyright year ranges to include 2011
+ Run "make update-copyright", so "make syntax-check" works in 2011.
+
+ build: update gnulib submodule to latest
+
+2010-12-20 Paolo Bonzini <bonzini@gnu.org>
+
+ main: fix exit status on xmalloc failures
+ * NEWS: Update.
+ * src/main.c (main): Set exit_failure. Reported by Guy Shaw.
+
+ add comment above fn_grep
+ * configure.ac (fn_grep): Add comment suggested by Bruno Haible.
+
+2010-11-14 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: add include guards
+ * src/system.h: Add multiple inclusion guards.
+ * src/grep.h: Likewise.
+
+ configure: fix M4 quotation
+ * configure.ac: Add extra brackets around [...] patterns.
+
+ configure: remove dependency on grep that supports long lines and -e
+ * configure.ac (fn_grep): New. Set GREP and EGREP to it, replace
+ with newly-built grep before AC_OUTPUT. Reported by Florin Iucha
+ <http://savannah.gnu.org/bugs/?31646>.
+
+2010-11-04 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib to latest
+
+ tests: don't hard-code a 5-second timeout; that's not always enough
+ Instead, time the command in the C locale and use 10 times that
+ duration -- rounded up to whole seconds -- as the timeout when running
+ it in the UTF-8 locale.
+ * tests/backref-multibyte-slow: Compute a performance-relative timeout.
+ Reported by Gilles Espinasse, regarding an imac 400. For more details,
+ see http://thread.gmane.org/gmane.comp.gnu.grep.bugs/3360
+
+2010-10-09 Jim Meyering <meyering@redhat.com>
+
+ maint: describe policy on copyright year number ranges
+ * README: Mention coreutils' long-standing policy on use of M-N
+ ranges in copyright year lists. Requested by Richard Stallman.
+
+2010-10-04 Dmitry V. Levin <ldv@altlinux.org>
+
+ build: compile gnulib without -Wcast-align to avoid warnings on ARM
+ * configure.ac (GNULIB_WARN_CFLAGS): Remove -Wcast-align.
+
+2010-09-30 Jim Meyering <meyering@redhat.com>
+
+ maint: don't define gpg_key_ID; now it's obtained automatically
+ * cfg.mk (gpg_key_ID): Remove definition. No longer needed.
+
+2010-09-23 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: add testcase for previous fix
+ * tests/inconsistent-ranges: New.
+ * tests/Makefile.am (TESTS): Add it.
+
+2010-09-23 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: process range expressions consistently with system regex
+ The actual meaning of range expressions in glibc is not exactly strcoll,
+ which makes the behavior of grep hard to predict when compiled with the
+ system regex. Leave to the system regex matcher the decision of which
+ single-byte characters are matched by a range expression.
+
+ This partially reverts a change made in commit 0d38a8bb (which made
+ sense at the time, but not now that src/dfa.c is not doing multibyte
+ character set matching anymore).
+
+ * src/dfa.c (in_coll_range): Remove.
+ (parse_bracket_exp): Use system regex to find which single-char
+ bytes match a range expression.
+
+2010-09-23 Bruno Haible <bruno@clisp.org>
+
+ build: fix link error on systems that have libiconv but not libintl
+ * src/Makefile.am (LDADD): Add $(LIBICONV).
+
+2010-09-21 Jim Meyering <meyering@redhat.com>
+
+ build: avoid compilation failure on the Hurd
+ * src/dfasearch.c (dfawarn): Rename enum symbols to use DW_ prefix,
+ so as not to collide with "GNU", which is defined by the Hurd.
+ Reported by Matthias Lanzinger in http://savannah.gnu.org/bugs/?31096
+
+2010-09-20 Jim Meyering <meyering@redhat.com>
+
+ maint: avoid obsolete gnulib modules
+ * bootstrap.conf (gnulib_modules): Don't use obsolete atexit module.
+ Use malloc-gnu and realloc-gnu -- malloc and realloc are obsolete.
+
+ maint: update README-release
+ * README-release: Reflect changes in coreutils' version of this file.
+
+2010-09-20 Aharon Robbins <arnold@skeeve.com>
+
+ dfa: fix compilation when not using MBS
+ * src/dfa.c (prepare_wc_buf) [!MBS_SUPPORT]: Do not compile this
+ function.
+
+2010-09-16 Jim Meyering <meyering@redhat.com>
+
+ post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.7
+ * NEWS: Record release date.
+
+2010-09-13 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: add equiv-classes
+ * configure.ac (USE_INCLUDED_REGEX): Add Automake conditional.
+ * tests/equiv-classes: New test.
+ * tests/Makefile.am (TESTS): Add it.
+ (XFAIL_TESTS) [USE_INCLUDED_REGEX]: Mark it as expected failure.
+
+2010-09-13 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: fall back to glibc matcher if a MBCSET is found
+ This patch enables full support of equivalence classes and multicharacter
+ collation symbols. It can also alleviate performance problems in some
+ cases for multibyte grep. Both of these changes, however, depend on the
+ glibc version installed in the system.
+
+ For UTF-8 it will trigger only in the presence of MBCSET, e.g. [a-z].
+ For other character sets, all bracket expressions, as well as `.`, will trigger it.
+
+ * NEWS: Document this.
+ * src/dfa.c (dfaexec): Fall back to glibc for multibyte matches,
+ if possible.
+
+2010-09-13 Paolo Bonzini <bonzini@gnu.org>
+
+ build: update gnulib submodule to latest
+ This is done to include commit "regex: Pass the system regex if its only
+ problem is 32-bit regoff_t".
+
+ * gnulib: Update to e2b0e1a.
+
+2010-09-12 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+ tests: update init.sh from gnulib
+ * tests/init.sh: Update from gnulib.
+
+2010-09-08 Patrick Boyd <pboyd04@gmail.com>
+
+ dfa: reduce stack usage
+ * src/dfa.c (dfaanalyze): Allocate GRPS and LABELS arrays from heap,
+ not on the stack. With this change, grep can now run in these UEFI
+ simulators:
+ http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=EDK
+ http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=EDK2
+
+2010-09-08 Jim Meyering <meyering@redhat.com>
+
+ tests/portability: avoid spurious failure with OpenBSD's /bin/sh
+ * tests/warn-char-classes: Don't use "set -x" here. It causes
+ a spurious test failure on OpenBSD 4.7 when using its /bin/sh,
+ since the command, /bin/sh -xc 'P=1 : 2> err' emits "P=1" into err.
+ To enable set -x, run the test with "VERBOSE=yes", e.g.,
+ make check -C tests TESTS=warn-char-classes VERBOSE=yes
+
+2010-09-07 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+2010-09-03 Jim Meyering <meyering@redhat.com>
+
+ tests: remove .sh suffix from remaining test scripts.
+ * tests/backref: Rename from backref.sh.
+ * tests/bre: Rename from bre.sh.
+ * tests/ere: Rename from ere.sh.
+ * tests/file: Rename from file.sh.
+ * tests/khadafy: Rename from khadafy.sh.
+ * tests/options: Rename from options.sh.
+ * tests/pcre: Rename from pcre.sh.
+ * tests/spencer1: Rename from spencer1.sh.
+ * tests/spencer2: Rename from spencer2.sh.
+ * tests/status: Rename from status.sh.
+ * tests/yesno: Rename from yesno.sh.
+ * tests/Makefile.am: Reflect renamings.
+
+ tests: convert remaining tests to use init.sh
+ * tests/file.sh: Use init.sh. Use Exit, not exit. Use grep, not ${GREP}.
+ * tests/khadafy.sh: Likewise.
+ * tests/options.sh: Likewise.
+ * tests/spencer1.sh: Likewise.
+ * tests/spencer2.sh: Likewise.
+ * tests/status.sh: Likewise.
+ * tests/spencer1.awk: Use grep, not ${GREP}.
+ Don't ignore failure to generate intermediate shell script.
+ * tests/Makefile.am (CLEANFILES): Remove altogether, now that
+ all tests use init.sh.
+ (TESTS_ENVIRONMENT): Don't set GREP. It's no longer used.
+
+ tests: remove warning.sh
+ * tests/warning.sh: Remove file. All it did was print a warning.
+ * tests/Makefile.am (TESTS): Remove warning.sh.
+
+ tests: convert pcre.sh to use init.sh
+ * tests/pcre.sh: Use init.sh. Use Exit, not exit. Use grep, not ${GREP}.
+
+ tests: convert bre.sh to use init.sh
+ * tests/bre.sh: Use init.sh.
+ Use Exit, not exit.
+ Use "$abs_top_srcdir/tests/", not "$srcdir/" to specify inputs.
+ Source generated bre.script, rather than invoking $SHELL.
+ * tests/ere.sh: Likewise.
+ * tests/bre.awk: Use grep, not ${GREP}.
+ * tests/ere.awk: Likewise.
+ * tests/Makefile.am (CLEANFILES): Remove bre.script and ere.script.
+
+ tests: convert to use init.sh
+ * tests/yesno.sh: Use init.sh.
+ Use Exit, not exit.
+ Use grep, not $GREP.
+ * tests/backref.sh: Likewise.
+ * tests/Makefile.am (CLEANFILES): Remove yesno.txt.
+
+ build: update gnulib submodule to latest
+
+ build: update build/test tools from gnulib
+ * bootstrap: Update from gnulib.
+ * tests/init.sh: Likewise.
+
+2010-09-01 Jim Meyering <meyering@redhat.com>
+
+ maint: add lib/version-etc.c to the list in POTFILES.in
+ * po/POTFILES.in: Add lib/version-etc.c.
+
+2010-09-01 Jim Meyering <meyering@redhat.com>
+
+ grep: diagnose and exit-2 for bogus REs like [:space:], [:digit:], etc.
+ When I make a mistake like this:
+ grep '[:lower:]' ...
+ be it in a script or on the command line, I want to know about
+ it as soon as possible. I don't want grep to print a mere warning
+ that it is interpreting this suspicious and almost guaranteed-wrong
+ regular expression as a set of just 6 bytes. And I certainly don't
+ want grep to silently do the wrong thing, even if that would be
+ officially standards-conforming. It's obvious that I intended
+ [[:lower:]], and I want my error to be diagnosed in a way that is
+ most likely to get my attention. Thus, with this change, grep now
+ prints a diagnostic and exits with status 2 the moment it
+ encounters an offending [:char_class:] construct.
+
+ This changes the way grep works by default, rather than
+ putting this new behavior on an option. A new option
+ would seldom be used in scripts (not portable), and would
+ probably be used only rarely by those who need it the most.
+ This new functionality provides a valuable safety measure
+ and incurs truly negligible risk.
+
+ For strict POSIX compliance, set POSIXLY_CORRECT in
+ your environment. That disables this new feature.
+
+ Revert the changes from commit 2cd3bcea, "grep: add
+ --warnings={always,never,auto}.", and then do the following:
+
+ * src/dfasearch.c (dfawarn): Call getenv("POSIXLY_CORRECT") here.
+ Remove "warning: " from the diagnostic, now that it's more than
+ a warning, and exit with status 2.
+ * NEWS (New features): Describe the new semantics.
+ * tests/warn-char-classes: Adjust one test to accommodate this change.
+ * doc/grep.texi (Character Classes and Bracket Expressions): Document.
+ (Environment Variables): Cross-reference it.
+ Remove reference to obsolete getopt illegal vs. invalid difference.
+ Thanks to Paul Eggert for suggestions and an initial prod.
+
+2010-08-30 Jim Meyering <meyering@redhat.com>
+
+ maint: use gnulib's standard --version-printing code
+ This includes author names and keeps the copyright year up to date.
+ * bootstrap.conf (gnulib_modules): Add propername and version-etc-fsf.
+ * src/main.c (AUTHORS): Define.
+ (main): Use version_etc, rather than hard-coding the copyright text.
+ Prompted by a patch from Paolo Bonzini.
+
+2010-08-27 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: warn on [:space:] and similar
+ * src/dfa.c (parse_bracket_exp): Warn on regular expressions such as
+ [:space:].
+ * src/dfa.h (dfawarn): New prototype.
+ * src/dfasearch.c (dfawarn): New.
+ * NEWS: Document.
+
+ tests: add test for warnings
+ * tests/Makefile.am (TESTS): Add warn-char-class.
+ * tests/warn-char-class: New.
+
+ grep: add --warnings={always,never,auto}.
+ * src/grep.h (no_warnings): New declaration.
+ * src/main.c (no_warnings): New.
+ (WARNINGS_OPTION): Add to enum.
+ (main): Add --warnings. Handle color_option == 2 together with it.
+
+ tests: add failing test for grep from a directory
+ * tests/Makefile.am (TESTS, XFAIL_TESTS): Add grep-dir.
+ * tests/grep-dir: New.
+
+ tests: add test for previous commit
+ * tests/Makefile.am (TESTS): Add grep-dev-null.
+ * tests/grep-dev-null: New.
+
+ search: fix "grep -Fif /dev/null"
+ * bootstrap.conf: Include gnulib module minmax.
+ * src/searchutils.c (mbtolower): Handle *N == 0 case.
+ * src/system.h: Include minmax.h from gnulib.
+
+2010-08-27 Adam Katz <savannah@kopis.com>
+
+ Remove declaration after statement in dfa.c
+ * dfa.c (dfaexec): Declare saved_end at the beginning of the function.
+
+2010-08-13 Jim Meyering <meyering@redhat.com>
+
+ make --include=FILE work once again
+ The semantics of excluded_file_name changed (when operating on
+ an "included" file name list).
+ * src/main.c (main): Adjust for changed semantics of excluded_file_name
+ simply by removing a negation.
+ * NEWS (Bug fixes): Mention this fix.
+ * tests/include-exclude: Add a test for this.
+ Reported by Joe Perches in http://savannah.gnu.org/bugs/?29876.
+
+2010-07-16 Paolo Bonzini <bonzini@gnu.org>
+
+ doc: document \s and \S
+ * doc/grep.texi (The Backslash Character and Special Expressions):
+ Document \s and \S escapes.
+
+2010-05-29 Karl Berry <karl@gnu.org>
+
+ doc: discuss matches that span two or more lines
+ * doc/grep.texi (Usage): Discuss matching across lines.
+ (Character Classes and Bracket Expressions) <[:space:]>: Refer to it.
+
+2010-05-25 Jim Meyering <meyering@redhat.com>
+
+ build: use latest gettext: 0.18
+ * configure.ac: Use gettext-0.18.
+ * bootstrap.conf (gnulib_modules): Use gettext-h, not gettext,
+ since the latter drags in a dependency on gettext 0.18.
+ Suggested by Bruno Haible.
+
+ maint: update helper scripts from gnulib
+ * tests/init.sh: Update from gnulib.
+ * bootstrap: Likewise.
+
+ build: update gnulib submodule to latest
+
+ maint: don't emit an extra newline in each of two diagnostics
+ * src/main.c (context_length_arg, grepdir): Remove a stray \n in
+ each of two diagnostics.
+
+2010-05-24 Bruno Haible <bruno@clisp.org>
+
+ search: Avoid out-of-bounds access.
+ * src/dfasearch.c (EGexecute): Avoid access beyond end of buffer
+ that could happen if start != beg - buf.
+
+2010-05-23 Aharon Robbins <arnold@skeeve.com>
+
+ dfa: fix signedness warnings
+ * src/dfa.c (dfaexec): Cast p when passing it to prepare_wc_buf.
+
+2010-05-09 Jim Meyering <meyering@redhat.com>
+
+ tests: update init.sh
+ * tests/init.sh: Update from gnulib.
+
+ tests: normalize init.sh-sourcing code
+ * tests/backref-multibyte-slow: Use one-line idiom.
+ * tests/backref-word: Likewise.
+ * tests/case-fold-backref: Likewise.
+ * tests/case-fold-backslash-w: Likewise.
+ * tests/case-fold-char-class: Likewise.
+ * tests/case-fold-char-range: Likewise.
+ * tests/case-fold-char-type: Likewise.
+ * tests/char-class-multibyte: Likewise.
+ * tests/dfaexec-multibyte: Likewise.
+ * tests/empty: Likewise.
+ * tests/euc-mb: Likewise.
+ * tests/fedora: Likewise.
+ * tests/fgrep-infloop: Likewise.
+ * tests/fmbtest: Likewise.
+ * tests/foad1: Likewise.
+ * tests/ignore-mmap: Likewise.
+ * tests/include-exclude: Likewise.
+ * tests/max-count-vs-context: Likewise.
+ * tests/pcre-z: Likewise.
+ * tests/prefix-of-multibyte: Likewise.
+ * tests/reversed-range-endpoints: Likewise.
+ * tests/sjis-mb: Likewise.
+ * tests/spencer1-locale: Likewise.
+ * tests/word-delim-multibyte: Likewise.
+ * tests/word-multi-file: Likewise.
+
+ tests: update help-version
+ * tests/help-version: Update from coreutils.
+
+2010-05-06 Jim Meyering <meyering@redhat.com>
+
+ tests: enable glibc's malloc-perturbing option
+ * tests/Makefile.am (MALLOC_PERTURB_): Define, in case it's not already
+ set in your environment.
+ (TESTS_ENVIRONMENT): Propagate MALLOC_PERTURB_ setting to test scripts.
+
+2010-05-06 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: speed up [[:digit:]] and [[:xdigit:]]
+ There's no "multibyte pain" in these two classes, since POSIX
+ and ISO C99 mandate their contents.
+
+ Time for "./grep -x '[[:digit:]]' /usr/share/dict/linux.words"
+ Before: 1.5s, after: 0.07s. (sed manages only 0.5s).
+
+ * src/dfa.c (predicates): Declare struct dfa_ctype separately
+ from definition. Add sb_only.
+ (find_pred): Return const struct dfa_ctype *.
+ (parse_bracket_exp): Return const struct dfa_ctype *. Do
+ not fill MBCSET for sb_only character types.
+
+2010-05-05 Jim Meyering <meyering@redhat.com>
+
+ tests: readability: use awk rather than obfuscated sed
+ * tests/backref-multibyte-slow: Generate input using an awk for-loop
+ rather than expensive and harder-to-read sed pipes.
+ Remove stray "set -x" and "wc -l in".
+
+ dfa: avoid segfault when processing an invalid multi-byte sequence
+ * src/dfa.c (dfaexec): Handle the cases in which mbrtowc returns
+ (size_t)-1 or (size_t)-2, rather than setting mblen_buf[i] to an
+ outrageously large value.
+
+2010-05-05 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: remove redundant syntax bit
+ * grep.c (Gcompile): Remove RE_HAT_LISTS_NOT_NEWLINE.
+
+ tests: add test for newly-fixed performance problem
+ * tests/backref-multibyte-slow: New.
+ * tests/Makefile.am: Add it.
+
+2010-05-05 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: convert to wide character line-by-line
+ This provides a nice speedup for -m in general, and in particular
+ it avoids quadratic complexity when we have to fall back to glibc.
+
+ * NEWS: Document change.
+ * src/dfa.c (prepare_wc_buf): Extract out of dfaexec. Convert
+ only up to the next newline.
+ (dfaexec): Exit multibyte processing loop if past buf_end.
+ Call prepare_wc_buf again after processing a newline.
+
+2010-05-01 Jim Meyering <meyering@redhat.com>
+
+ maint: remove useless #if HAVE_STDLIB_H
+ * src/mbsupport.h: Don't test HAVE_STDLIB_H.
+
+2010-04-20 Jim Meyering <meyering@redhat.com>
+
+ dfa: don't #ifdef-out member declarations
+ * src/dfa.c (struct dfa): Remove "#if MBS_SUPPORT" guard that made
+ several member declarations conditional on this cpp definition.
+ (token): Likewise.
+ Reported by Anders Wallin.
+
+ tests: ensure that the --mmap option is ignored
+ * tests/ignore-mmap: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ Reported by Jaroslav Škarvada in <http://savannah.gnu.org/bugs/?29614>
+
+2010-04-20 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: honor RE_DOT_NEWLINE and RE_DOT_NOT_NULL in UTF-8 period optimization
+ * src/dfa.c (add_utf8_anychar): Check for RE_DOT_NEWLINE and
+ RE_DOT_NOT_NULL.
+
+ grep: fix --mmap not being ignored
+ * NEWS: Document bugfix.
+ * main.c (main): Ignore MMAP_OPTION.
+
+2010-04-19 Jim Meyering <meyering@redhat.com>
+
+ maint: avoid syntax-check failure due to indentation via TABs
+ * src/dfa.c (atom): Expand TABs in indentation.
+
+ build: update gnulib submodule to latest
+
+ maint: restrict scope of two globals to dfasearch.c
+ * src/dfasearch.c (patterns, pcount): Declare these file-scoped
+ globals to be static.
+
+2010-04-19 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: optimize UTF-8 period
+ * NEWS: Document improvement.
+ * src/dfa.c (struct dfa): Add utf8_anychar_classes.
+ (add_utf8_anychar): New.
+ (atom): Simplify if/else nesting. Call add_utf8_anychar for ANYCHAR
+ in UTF-8 locales.
+ (dfaoptimize): Abort on ANYCHAR.
+
+ dfa: drop ORTOP
+ * src/dfa.c (token, prtok, addtok_mb, nsubtoks, dfaanalyze, dfamust):
+ Remove ORTOP.
+ (regexp): Remove parameter, always add OR at the end, adjust callers.
+ (atom): Adjust caller.
+ (dfaparse): Adjust caller. Always add OR at the end.
+
+ dfa: fix {0,0}
+ * NEWS: Document change.
+ * src/dfa.c (struct dfa): Remove "broken" field.
+ (lex): Do not set it.
+ (closure): On {0,0}, back up and lex another closure without
+ adding a CAT.
+ (dfabroken): Remove.
+ * src/dfa.h (dfabroken): Remove.
+ * tests/spencer1.tests: Add testcases for {m,n}.
+
+ dfa: simplify dfainit
+ * src/dfa.c (dfainit): Use memset.
+
+2010-04-17 Jim Meyering <meyering@redhat.com>
+
+ doc: fix a nit in HACKING
+ * HACKING: Correct size of .git/ dir: 9MB, not 30MB.
+
+ tests: add an expected-to-fail test using \< in a multi-byte locale
+ * tests/word-delim-multibyte: New test. Currently failing.
+ * tests/Makefile.am (TESTS): Add it.
+ (XFAIL_TESTS): Define, temporarily.
+ Reported by Jaroslav Škarvada in http://savannah.gnu.org/bugs/?29537.
+
+2010-04-16 Paolo Bonzini <bonzini@gnu.org>
+
+ test: cover just-fixed bug
+ * tests/empty: Test -Fw too.
+
+ grep: fix matching the empty string with grep -Fw
+ * NEWS: Document fix.
+ * src/kwsearch.c (Fexecute): The empty string is a valid match if it is
+ a whole word.
+
+2010-04-15 Jim Meyering <meyering@redhat.com>
+
+ maint: update init.sh and HACKING
+ * HACKING: Sync from coreutils.
+ * tests/init.sh: Update from gnulib.
+
+2010-04-13 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest; adapt
+ * COPYING: Remove empty line.
+ * README: Likewise.
+ * doc/fdl.texi: Likewise.
+ * tests/backref-word: Likewise.
+
+2010-04-11 Stefano Lattarini <stefano.lattarini@gmail.com>
+
+ tests: accept the Debian timeout program
+ * tests/init.cfg: Test timeout with `timeout 10s true'.
+
+2010-04-08 Jim Meyering <meyering@redhat.com>
+
+ dfa: convert "cannot happen" code/comment to use assert
+ * src/dfa.c (dfamust): There were numerous "cannot happen" comments,
+ some associated with "if (expr) goto done;". Replace each with an
+ equivalent "assert (!expr);".
+
+ build: use gnulib's isblank module
+ * bootstrap.conf (gnulib_modules): Use gnulib's isblank module,
+ now that we rely on the function by that name.
+
+ maint: undo TAB-conversion change to gl/lib/*.c.diff
+ This fixes a bootstrap failure due to the patches not applying.
+ * .x-sc_prohibit_tab_based_indentation: Add ^gl/lib/.*\.c\.diff$
+ * gl/lib/regcomp.c.diff: Revert today's TAB->space change.
+ * gl/lib/regex_internal.c.diff: Likewise.
+ * gl/lib/regexec.c.diff: Likewise.
+
+2010-04-08 Arnold D. Robbins <arnold@skeeve.com>
+
+ dfa: fix declaration of dfabroken in dfa.h
+ * dfa.h (dfabroken) [GAWK]: Fix declaration to match that in dfa.c.
+
+2010-04-08 Jim Meyering <meyering@redhat.com>
+
+ maint: add syntax-check rule to enforce the new no-leading-TABs policy
+ * cfg.mk (sc_prohibit_tab_based_indentation): New rule, from coreutils.
+ (sc_prohibit_emacs__indent_tabs_mode__setting): Likewise.
+ (old_NEWS_hash): Update.
+ * .x-sc_prohibit_tab_based_indentation: List exempt files.
+
+2010-04-08 Jim Meyering <meyering@redhat.com>
+
+ convert all TABs to equivalent spaces in indentation
+ Using this file,
+
+ cat > leading-blank.exempt <<\EOF
+ (?:^|\/)ChangeLog[^/]*$
+ (?:^|\/)(?:GNU)?[Mm]akefile[^/]*$
+ \.(?:am|mk)$
+ EOF
+
+ run this command to convert all non-conforming leading white
+ space to be all spaces:
+
+ git ls-files \
+ | pcregrep -vf leading-blank.exempt \
+ | xargs pcregrep -l '^ *\t' \
+ | xargs perl -MText::Tabs -ni -le \
+ '$m=/^( *\t[ \t]*)(.*)/; print $m ? expand($1) . $2 : $_'
+
+2010-04-08 Jim Meyering <meyering@redhat.com>
+
+ build: include cfg.mk in the distribution tarball
+ * Makefile.am (EXTRA_DIST): Add cfg.mk.
+
+2010-04-08 Jim Meyering <meyering@redhat.com>
+
+ maint: Makefile.am tweak (no semantic change)
+ * Makefile.am (EXTRA_DIST): List one per line. Sort.
+
+ build: include cfg.mk in the distribution tarball
+ * Makefile.am (EXTRA_DIST): Add cfg.mk.
+
+2010-04-08 Jim Meyering <meyering@redhat.com>
+
+ dfa: move definition of __attribute__ back into dfa.h
+ * src/dfa.c (__attribute__): Move definition back to...
+ * src/dfa.h: ... this file. It is essential for non-gcc compilers.
+ Reported by Arnold Robbins.
+
+2010-04-07 Arnold D. Robbins <arnold@skeeve.com>
+
+ dfa: move internals from dfa.h to dfa.c
+ * src/dfa.h: Move internals into dfa.c.
+ * src/dfa.c: The dfa internals are now totally local to this file.
+ (dfaalloc, dfamusts, dfabroken): New functions to access features.
+ * src/dfasearch.c (dfa): Change this global variable from struct to pointer.
+ Adapt to that change, and use new functions, dfamusts and dfaalloc.
+
+2010-04-07 Jim Meyering <meyering@redhat.com>
+
+ mbtolower: avoid potential NULL-dereference
+ * src/searchutils.c: Include <assert.h>.
+ (mbtolower): Assert that 0 < *n, to avoid possibility of NULL-deref.
+ Remove dead increment.
+
+ maint: tell git to ignore more build products
+ * .gitignore: Also ignore results of "make ID" and "make tags".
+
+ build: update gnulib submodule to latest
+
+ tests: use init.sh consistently
+ * tests/euc-mb: Call "path_prepend_ ." on a line by itself,
+ and with a comment. This makes it so all of the srcdir/init.sh
+ lines are consistent, project-wide, and so that the addition of "."
+ to PATH for this test is properly documented.
+ * tests/sjis-mb: Likewise.
+
+ maint: avoid new syntax-check failure, ...
+ ...now that the sole use of xmalloc no longer matches the
+ regular expression used by the syntax-check rule.
+ * .x-sc_prohibit_xalloc_without_use: Exempt src/kwset.c.
+
+ grep: make kwset's obstack use xmalloc, not malloc
+ This insidious bug could make grep fail to diagnose a failed malloc,
+ and then proceed to dereference the resulting NULL pointer.
+ Note that this bug was unlikely ever to cause real trouble; without
+ the fix, grep would segfault upon OOM; now it exits with a diagnostic.
+ * src/kwset.c (malloc) [GREP]: Define without the "(s)" macro
+ parameter, so that unadorned uses of malloc are also mapped to xmalloc.
+ One such use is in the expansion of obstack_init.
+ Report and patch by Nelson H. F. Beebe, in
+ http://thread.gmane.org/gmane.comp.gnu.grep.bugs/2995
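The macro pitfall described in the entry above can be sketched as follows. This is a minimal illustration, not the kwset.c code: `checked_malloc` is a hypothetical stand-in for gnulib's xmalloc, and the two function pointers stand in for any place where the name malloc appears without an argument list (as the entry says happens in the expansion of obstack_init).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for xmalloc: counts calls so the effect of
   each macro form can be observed, and aborts on allocation failure.  */
static int checked_calls;

static void *
checked_malloc (size_t n)
{
  checked_calls++;
  void *p = malloc (n);         /* no macro is defined yet at this point */
  if (!p)
    abort ();
  return p;
}

/* Function-like form: only "malloc (...)" call syntax is replaced, so
   the bare name below still refers to the C library's malloc.  */
#define malloc(s) checked_malloc (s)
static void *(*const via_function_like) (size_t) = malloc;
#undef malloc

/* Object-like form: every use of the name is replaced, including the
   bare name below, which now refers to checked_malloc.  */
#define malloc checked_malloc
static void *(*const via_object_like) (size_t) = malloc;
#undef malloc
```

Calling through `via_function_like` bypasses the checking wrapper, while calling through `via_object_like` goes through it. That difference is why the entry defines the macro without the "(s)" parameter.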
+
+ tests: improve help-version (sync from gzip's version)
+ * tests/help-version: Cross-check $VERSION and --version output.
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Export VERSION=$(VERSION).
+
+2010-04-06 Jim Meyering <meyering@redhat.com>
+
+ doc: update THANKS
+ * THANKS: Update.
+
+2010-04-06 Aharon Robbins <arnold@skeeve.com>
+
+ build: avoid conflict with WCHAR definition from Cygwin's <windows.h>
+ * src/dfa.h (enum token): Remove the definition from this file.
+ Replace with a declaration and typedef. Moved to ...
+ * src/dfa.c (enum token): ... here.
+ Reported by Corinna Vinschen.
+
+2010-04-06 Jim Meyering <meyering@redhat.com>
+
+ doc: add HACKING
+ * HACKING: New file. Copied from coreutils, with s/coreutils/grep/
+ and a few minor edits.
+
+2010-04-05 Jim Meyering <meyering@redhat.com>
+
+ tests: pull fixed init.sh from gnulib
+ * tests/init.sh: Update from gnulib.
+
+ maint: fix new argmatch-related syntax-check failures
+ * configure.ac (ARGMATCH_DIE): Use usage(EXIT_FAILURE), not exit(1).
+ * po/POTFILES.in: Add lib/argmatch.c.
+
+ maint: update cfg.mk to work with gnulib's newer "make syntax-check"
+ * cfg.mk: Update to use new _sc_search_regexp interface. Run this:
+ perl -pi -e 's/\b_prohibit_regexp\b/_sc_search_regexp/;' \
+ -e 's/\bmsg=/halt=/; s/\bre=/prohibit=/;' cfg.mk
+ and then adjust backslashes so they still line up.
+
+ maint: update tests/init.sh from gnulib
+ This ensures that the explanation for any skipped or failed test
+ is printed on stderr, not buried in each .log file.
+ * tests/init.sh: Update from gnulib.
+ * tests/init.cfg (stderr_fileno_): Define to 9, to match the
+ literal 2>&9 in tests/Makefile.am
+
+ build: update gnulib submodule to latest
+
+2010-04-04 Jim Meyering <meyering@redhat.com>
+
+ maint: use argmatch, for better --directories=INVAL diagnostics
+ Before, you'd see this:
+ grep: unknown directories method
+
+ Now, you'll see this:
+ grep: invalid argument `INVAL' for `--directories'
+ Valid arguments are:
+ - `read'
+ - `recurse'
+ - `skip'
+ Usage: src/grep [OPTION]... PATTERN [FILE]...
+ Try `src/grep --help' for more information.
+
+ * bootstrap.conf: Add argmatch.
+ * configure.ac: Define ARGMATCH_DIE and ARGMATCH_DIE_DECL.
+ * src/main.c (directories_type): Define.
+ (directories_args, directories_types): Define.
+ All of the above so we can...
+ (main): Use XARGMATCH.
+ (usage): Declare extern, now that argmatch calls it via ARGMATCH_DIE.
+
+2010-04-04 Jim Meyering <meyering@redhat.com>
+
+ dfa.c: const correctness; and remove useless casts of realloc and malloc
+ * src/dfa.c (icatalloc, icpyalloc, istrstr, enlist): As above.
+ (inboth, dfamust, comsubs): Likewise.
+
+ dfa.c: use a better (unsigned) type for an index: int->unsigned int
+ * src/dfa.c (dfaexec): Use "unsigned int" for a logically unsigned index.
+
+ maint: style: use sizeof VAR, rather than sizeof TYPE, where possible
+ * src/dfa.c (copyset, zeroset): Prefer sizeof EXPR, over sizeof TYPE,
+ for improved readability/maintainability.
+ (equal, parse_bracket_exp, addtok_wc, dfaparse, dfaexec): Likewise.
+
+2010-04-02 Jim Meyering <meyering@redhat.com>
+
+ dfa.c: use a better (unsigned) type for an index: int->size_t
+ * src/dfa.c (parse_bracket_exp): Use size_t as type of index, not int.
+
+ maint: const-correctness
+ * src/dfa.c (tstbit, copyset, equal, charclass_index): Declare read-only
+ "charclass" parameters to be "const". No semantic change.
+
+ maint: include <wchar.h> and <wctype.h> unconditionally
+ * src/main.c: Include <wchar.h> and <wctype.h> unconditionally.
+ Their presence/usefulness are assured by gnulib.
+ * src/dfa.c: Likewise.
+ * src/search.h: Likewise.
+
+ maint: MBS_SUPPORT: define to 0/1, not undef/1
+ Prepare to remove many of these #ifdefs.
+ * src/mbsupport.h (MBS_SUPPORT): Define to 0/1, not undef/1.
+ Change each "#ifdef MBS_SUPPORT" to "#if MBS_SUPPORT". Use this:
+ perl -pi -e 's/ifdef (MBS_SUPPORT)/if $1/' $(g grep -l ifdef.MBS_SUPPO)
+ * src/dfa.c: s/#ifdef MBS_SUPPORT/#if MBS_SUPPORT/
+ * src/dfa.h: Likewise.
+ * src/dfasearch.c: Likewise.
+ * src/kwsearch.c: Likewise.
+ * src/main.c: Likewise.
+ * src/search.h: Likewise.
+ * src/searchutils.c: Likewise.
+
+2010-04-02 Jim Meyering <meyering@redhat.com>
+
+ maint: use STREQ in place of strcmp
+ perl -pi -e 's/\bstrcmp *\((.*?)\) == 0/STREQ ($1)/' src/main.c
+ perl -pi -e 's/\bstrcmp *\((.*?)\) != 0/!STREQ ($1)/' src/main.c
+
+ * src/dfa.c (STREQ): Define.
+ Use it instead of strcmp.
+ * src/main.c (STREQ): Likewise.
+ * cfg.mk (local-checks-to-skip): Remove sc_prohibit_strcmp,
+ to enable the strcmp-prohibition.
+
+2010-04-02 Jim Meyering <meyering@redhat.com>
+
+ maint: enable the useless_cpp_parens syntax check
+ * cfg.mk (local-checks-to-skip): Remove sc_useless_cpp_parens.
+ * src/main.c (devices, fillbuf, exit_on_match): Remove useless parens.
+ (print_line_head, grepfile, set_limits, main): Likewise.
+ * src/vms_fab.h: Likewise.
+ * vms/config_vms.h: Likewise.
+ * src/mbsupport.h: Likewise.
+
+ cleanup and improvement: parse command line arguments consistently
+ * src/main.c: Include c-ctype.h, for this:
+ (prepend_args): Use c_isspace, not ISSPACE.
+ This is important so that we parse arguments consistently,
+ and independently of the current locale.
+ * bootstrap.conf (gnulib_modules): Add c-ctype.
+ * src/system.h: Remove IS* definitions here, too.
+ * src/dfasearch.c (WCHAR): Use isalnum, not ISALNUM.
+ * src/kwsearch.c (WCHAR): Likewise.
+ * src/searchutils.c (kwsinit): Use tolower, not TOLOWER.
+
+ cleanup: rely on gnulib's ctype.h functions; remove IS* macros and is_*
+ * src/dfa.c (setbit_case_fold, prednames): Use official names.
+ (IS_WORD_CONSTITUENT, lex): Likewise.
+ (ISALNUM, ISALPHA, ISCNTRL, ISDIGIT, ISGRAPH): Remove definitions.
+ (ISLOWER, ISPRINT, ISPUNCT, ISSPACE, ISUPPER, ISXDIGIT): Likewise.
+ (is_alnum, is_alpha, is_blank, is_cntrl, is_digit, is_graph): Likewise.
+ (is_lower, is_print, is_punct, is_space, is_upper, is_xdigit): Likewise.
+ (isgraph): Likewise.
+
+ build: update gnulib submodule to latest, and adjust
+ * src/main.c (parse_grep_colors): Adjust diagnostics not to trigger
+ the sc_error_message_period and sc_error_message_uppercase
+ syntax-check rules.
+
+ maint: remove all VMS-related code
+ * configure.ac (AC_CONFIG_FILES): Remove vms/Makefile.
+ * Makefile.am (SUBDIRS): Remove vms.
+ * src/Makefile.am (EXTRA_DIST): Remove vms_fab.c and vms_fab.h.
+ * src/vms_fab.c, src/vms_fab.h, vms/make.com: Remove files.
+ * vms/Makefile.am, vms/README, vms/config_vms.h: Likewise.
+
+ post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.6.3
+ * NEWS: Record release date.
+
+2010-04-02 Jim Meyering <meyering@redhat.com>
+
+ grep: avoid used-undefined error with truncated multibyte input
+ * src/dfa.c (addtok_wc): Don't use buf[0] (it's undefined) when
+ wcrtomb returns <= 0.
+
+ MBS_SUPPORT-removal: * src/dfa.c (dfastate):
+
+2010-04-01 Jim Meyering <meyering@redhat.com>
+
+ maint: avoid unnecessary 2nd getenv("TERM")
+ * src/main.c (main): Don't call getenv("TERM") twice -- in the same
+ expression, even.
+
+ tests: remove all unportable uses of echo
+ * src/main.c: Use printf rather than echo -ne in a comment.
+ * tests/fedora: Use printf (not echo) also in ok/fail functions.
+ * cfg.mk (sc_prohibit_echo_minus_en): New rule, to prohibit
+ any future introduction.
+
+ tests: add explicit requirement for en_US.UTF-8
+ * tests/char-class-multibyte: Use require_en_utf8_locale_,
+ rather than open-coding it.
+ * tests/prefix-of-multibyte: Require the locale explicitly.
+ * tests/fgrep-infloop: Likewise.
+ This fixes test failures that would arise on systems without
+ that particular locale. Reported by Ludovic Courtès.
+
+ tests: new function, to require an en_US UTF8 locale
+ * tests/init.cfg (require_en_utf8_locale_): New function.
+
+ tests: use printf, not echo -n, echo -e, or any combination
+ * tests/fedora: Using printf is more portable.
+
+ grep: remove unnecessary code
+ * src/main.c (print_line_middle): Now that we use RE_ICASE
+ (enabled in commit 70e23616, "dfa: rewrite handling of multibyte
+ case_fold lexing"), this case-conversion code is useless and wasteful.
+ Remove it.
+
+ doc: fix typo: s/AM_V_AT/AM_V_at/
+ * doc/Makefile.am (egrep.1 fgrep.1): The former has case consistent
+ with its sister variable, AM_V_GEN, but the latter is the one that
+ actually works.
+
+ doc: generated files are best made read-only, ...
+ ...to minimize risk of accidentally modifying the generated file
+ rather than its template. These are tiny, so no risk, but it's
+ good to be consistent, so generated files are easier to spot.
+ * doc/Makefile.am (egrep.1 fgrep.1): When generating these files,
+ ensure that they too are created read-only.
+
+ doc: generate grep.1 from template
+ * doc/Makefile.am (grep.1): New rule.
+ (CLEANFILES): Add grep.1 to the list.
+ * .gitignore: Add /doc/grep.1
+ * doc/grep.in.1: Replace hard-coded "2.5.1-cvs" with @VERSION@.
+ Update copyright year list.
+ Omit the line-splitting \(co directive so that update-copyright
+ will perform future updates automatically.
+ Egmont Koblinger reported the outdated version string
+ and copyright year list in the man page:
+ http://savannah.gnu.org/bugs/?29390
+
+ doc: prepare to generate grep.1
+ * doc/grep.1: Rename to...
+ * doc/grep.in.1: ...this.
+
+2010-03-31 Eric Blake <eblake@redhat.com>
+
+ build: avoid another warning
+ Noticed on cygwin:
+ get-mb-cur-max.c: In function 'main':
+ get-mb-cur-max.c:27: error: unused parameter 'argc' [-Wunused-parameter]
+
+ * tests/get-mb-cur-max.c (main): Use argc.
+
+2010-03-31 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: fix on systems with broken sh
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Adjust coreutils remnants.
+ * tests/bre.sh: Invoke script with $SHELL if defined.
+ * tests/ere.sh: Likewise.
+ * tests/spencer1-locale: Likewise.
+ * tests/spencer1.sh: Likewise.
+
+ tests: improve empty test
+ * tests/empty: Add more tests, note expected failure.
+
+ tests: improve empty test with respect to locales
+ * tests/empty: Add tests for multiple locales.
+
+ grep: fix grep -F against empty string
+ * src/searchutils.c (is_mb_middle): Do not return true for empty matches
+ when p == buf.
+
+ tests: rename empty.sh to empty
+ * tests/empty.sh: Rename to...
+ * tests/empty: ... this.
+ * tests/Makefile.am (TESTS): Adjust.
+
+ tests: convert empty.sh to new style
+ * tests/empty.sh: Convert to init.sh, add 10-second timeout.
+
+ tests: use get-mb-cur-max in char-class-multibyte
+ * tests/char-class-multibyte: Use get-mb-cur-max to detect UTF-8 support.
+ Rewrite previous locale detection code as a grep test.
+
+ tests: fix -Wformat failure
+ * tests/get-mb-cur-max.c (main): Cast MB_CUR_MAX to int.
+
+2010-03-30 Jim Meyering <meyering@redhat.com>
+
+ doc: add a "Reply-To" to the suggested announcement mail header
+ * README-release: Add "Reply-To" with the list address,
+ to minimize risk of replies to the other announcement recipients.
+ Suggestion from Eric Blake.
+
+2010-03-29 Jim Meyering <meyering@redhat.com>
+
+ build: avoid compiler warning when building test program
+ * tests/Makefile.am (AM_CPPFLAGS, AM_CFLAGS, AM_LDFLAGS): Define,
+ so that all the usual C compile-and-link machinery comes into play.
+ * tests/get-mb-cur-max.c: Include "progname.h".
+ Remove unnecessary inclusion of <ctype.h>.
+ Mike Frysinger reported the "implicit decl of set_program_name" warning.
+
+ build: detect PCRE support also when <pcre/pcre.h> is the header
+ * m4/pcre.m4: Also check for <pcre/pcre.h>.
+ * src/pcresearch.c: Include <pcre/pcre.h>, if needed.
+ Guard inclusions with HAVE_PCRE_H and HAVE_PCRE_PCRE_H, not HAVE_LIBPCRE.
+ * NEWS (Bug fixes): Mention it.
+ Dmitry V. Levin reported that PCRE support was not detected
+ on systems with <pcre.h> not in the default include path.
+
+ post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.6.2
+ * NEWS: Record release date.
+
+2010-03-29 Eric Blake <eblake@redhat.com>
+
+ build: avoid warnings on cygwin
+ * lib/savedir.c (isdir): Avoid shadowing a declaration.
+ * src/main.c (get_nondigit_option): Cast away const to avoid
+ compiler warning.
+
+ maint: ignore new test executable
+ * .gitignore: Enhance.
+
+2010-03-29 Jim Meyering <meyering@redhat.com>
+
+ doc: consolidate redundant-looking entries
+ * NEWS: Consolidate the two --include/exclude-related entries.
+ Suggested by Eric Blake.
+
+2010-03-29 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: use $(...) consistently
+ * tests/backref.sh: Use `...' instead of ``...'' in comments.
+ * tests/bre.awk: Use $(...) instead of `...`.
+ * tests/ere.awk: Use $(...) instead of `...`.
+ * tests/euc-mb: Use $(...) instead of `...`.
+ * tests/fmbtest: Use $(...) instead of `...`.
+ * tests/foad1: Use $(...) instead of `...`.
+ * tests/pcre-z: Use $(...) instead of `...`. Quote output of grep.
+ * tests/spencer1-locale.awk: Use $(...) instead of `...`.
+ * tests/spencer1.awk: Use $(...) instead of `...`.
+ * tests/yesno.sh: Use $(...) instead of `...`.
+
+2010-03-29 Jim Meyering <meyering@redhat.com>
+
+ build: make doc/Makefile.am cleaner and more robust
+ * doc/Makefile.am (egrep.1 fgrep.1): Generate robustly, i.e.,
+ do not redirect directly to $@.
+ Use $(AM_V_GEN).
+ Do not distribute intermediate files like fgrep.man and egrep.man.
+ Likewise, do not use them to generate their %.1 images.
+ Instead, generate the .1 files directly.
+
+2010-03-29 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: add program to detect locales
+ * tests/Makefile.am (check_PROGRAMS): Add get-mb-cur-max.
+ * tests/get-mb-cur-max.c: New.
+ * tests/euc-mb: Use it. Fail if the former detection test fails.
+ * tests/sjis-mb: Use it. Fail if the former detection test fails. Expand
+ comments.
+
+2010-03-29 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: add tests for SJIS character sets
+ The attached test will be skipped unless (on a glibc system) you run
+ something like
+
+ mkdir /usr/lib/locale/ja_JP.SHIFT_JIS
+ zcat /usr/share/i18n/charmaps/SHIFT_JIS.gz | \
+ localedef \
+ -f - \
+ -i /usr/share/i18n/locales/ja_JP \
+ /usr/lib/locale/ja_JP.SHIFT_JIS
+
+ * tests/Makefile.am: Add sjis-mb.
+ * tests/sjis-mb: New.
+
+2010-03-29 Paolo Bonzini <bonzini@gnu.org>
+
+ grep -F: fix a bug with SJIS character sets
+ Commit db9d6 would erroneously skip matches in SJIS character sets. In
+ this character set low bytes (i.e. ASCII bytes) are also valid second
+ bytes in a double-byte character, so you have to continue looking for
+ a match, even if you match in the middle of a double-byte character.
+
+ * src/kwsearch.c: Ensure that beg is advanced by at least one byte,
+ but do not fail immediately after matching in the middle of a double-byte
+ character.
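The search strategy described in the entry above can be sketched as follows, under stated assumptions. The helper names are hypothetical and this is not the kwsearch.c code. The lead-byte ranges are the ones commonly documented for Shift_JIS and are treated here as an assumption, since vendor variants differ. A raw byte search that lands on the second byte of a double-byte character does not end the search; the start position advances by one byte and the search continues.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed Shift_JIS lead-byte ranges (0x81-0x9F and 0xE0-0xFC).  */
static bool
sjis_is_lead (unsigned char b)
{
  return (0x81 <= b && b <= 0x9F) || (0xE0 <= b && b <= 0xFC);
}

/* Return true if offset POS is the second byte of a double-byte
   character, determined by walking the buffer from its start.  */
static bool
sjis_is_trail_position (const unsigned char *p, size_t size, size_t pos)
{
  size_t i = 0;
  while (i < pos)
    i += (sjis_is_lead (p[i]) && i + 1 < size) ? 2 : 1;
  return i != pos;
}

/* Return the offset of the first occurrence of byte C in BUF (SIZE
   bytes) that is a character on its own, or SIZE if there is none.  */
static size_t
sjis_find_byte (const char *buf, size_t size, char c)
{
  const unsigned char *p = (const unsigned char *) buf;
  size_t beg = 0;
  while (beg < size)
    {
      const unsigned char *hit = memchr (p + beg, (unsigned char) c,
                                         size - beg);
      if (!hit)
        break;
      size_t pos = (size_t) (hit - p);
      if (!sjis_is_trail_position (p, size, pos))
        return pos;
      beg = pos + 1;    /* mid-character hit: advance and keep looking */
    }
  return size;
}
```

With the three-byte input 0x83 0x41 0x41, the first 0x41 is the second byte of a double-byte character and the second is a standalone 'A'. A search for 'A' skips offset 1 and reports offset 2.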
+
+2010-03-28 Bruno Haible <bruno@clisp.org>
+
+ build: update after change in gnulib's lib-ignore module
+ * src/Makefile.am (AM_LDFLAGS): Define. Use gnulib's new
+ $(IGNORE_UNUSED_LIBRARIES_CFLAGS).
+
+2010-03-28 Jim Meyering <meyering@redhat.com>
+
+ tests: disable new texinfo-acronym syntax-check from gnulib
+ * cfg.mk (local-checks-to-skip): Add new sc_texinfo_acronym, to skip it.
+
+2010-03-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ tests: exercise fix for improper match of incomplete MB char prefix
+ * tests/prefix-of-multibyte: New file.
+ * tests/Makefile.am (TESTS): Add it.
+
+2010-03-28 Jim Meyering <meyering@redhat.com>
+
+ grep -F: fix a multi-byte erroneous-match-in-middle bug
+ Just as Perl prints nothing in this case,
+ printf '\357\274\241\n' | perl -CIO -lne '/\357/ and print'
+
+ grep should also print nothing when used as follows.
+ However, these would mistakenly match with grep prior to 2.6.2:
+ printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\357'
+ printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\357\274'
+
+ * src/searchutils.c (is_mb_middle): New parameter: the length of the
+ match, in bytes, as determined by kwsexec. Use this to detect when
+ the nominal match found by kwsexec must be skipped because it is for
+ an incomplete multi-byte character that is a prefix of a character
+ in the input.
+ * src/dfasearch.c (EGexecute): Update caller.
+ * src/kwsearch.c (Fexecute): Likewise.
+ * src/search.h: Update prototype.
+ * NEWS (Bug fixes): Mention it.
+ Report and analysis by Norihiro Tanaka.
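The check described in the entry above can be illustrated with a minimal sketch, assuming well-formed UTF-8 input. `match_splits_character` is a hypothetical helper, not grep's is_mb_middle, which works with the locale's multibyte functions rather than hard-coded UTF-8 rules. It shows why the match length matters: a byte-level match is rejected when it begins or ends inside a multibyte character.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if B is a UTF-8 continuation byte (bit pattern 10xxxxxx).  */
static bool
is_continuation (unsigned char b)
{
  return (b & 0xC0) == 0x80;
}

/* Hypothetical helper: return true if the byte range [OFF, OFF + LEN)
   of the well-formed UTF-8 buffer BUF (SIZE bytes) starts or ends
   inside a multibyte character.  */
static bool
match_splits_character (const char *buf, size_t size, size_t off, size_t len)
{
  const unsigned char *p = (const unsigned char *) buf;
  size_t end = off + len;
  if (off < size && is_continuation (p[off]))
    return true;        /* starts in the middle of a character */
  if (end < size && is_continuation (p[end]))
    return true;        /* stops before the character is complete */
  return false;
}
```

For the input bytes \357\274\241\n used in the entry, a one-byte or two-byte match at offset 0 is a proper prefix of the three-byte character and is rejected. A three-byte match at offset 0 covers the whole character and is accepted.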
+
+2010-03-28 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ tests: add tests for the fgrep-infloop bug
+ * tests/init.cfg (require_timeout_): New function.
+ * tests/fgrep-infloop: New file. Test for the above fix.
+ * tests/Makefile.am (TESTS): Add it.
+
+2010-03-28 Jim Meyering <meyering@redhat.com>
+
+ grep -F: avoid infinite loop when searching for incomplete MB character
+ Searching for an incomplete non-prefix of a multi-byte character
+ should find no match.
+
+ Just as these print nothing,
+ printf '\357\274\241\357\274\241\n' \
+ | perl -CIO -ne '/\241\357/ and print'
+ printf '\357\274\241\n' | perl -CIO -ne '/\274\241/ and print'
+ printf '\357\274\241\n' | perl -CIO -ne '/\241/ and print'
+ printf '\357\274\241\n' | perl -CIO -ne '/\274/ and print'
+
+ These should also print nothing, but with grep-2.6 and grep-2.6.1,
+ they would infloop:
+ printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\241'
+ printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\274'
+ printf '\357\274\241\n' | LC_ALL=en_US.UTF-8 src/grep -F $'\274\241'
+
+ * src/kwsearch.c (Fexecute): Don't infloop when searching for
+ an incomplete non-prefix part of a multi-byte character.
+ * NEWS (Bug fixes): Mention it.
+ Reported and diagnosed by Norihiro Tanaka.
+
+2010-03-28 Jim Meyering <meyering@redhat.com>
+
+ tests: rename: fmbtest.sh -> fmbtest
+ * tests/fmbtest.sh: Rename to ...
+ * tests/fmbtest: ...this, dropping the .sh suffix.
+ * tests/Makefile.am (TESTS): Reflect renaming.
+
+ tests: convert fmbtest.sh to use init.sh
+ * tests/fmbtest.sh: Use init.sh and adapt accordingly:
+ Use "grep", not ${GREP}. Use Exit, not exit.
+
+ tests: also exercise the --include + glob path
+ * tests/include-exclude: Exercise Javier's fix.
+
+2010-03-28 Javier Villavicencio <the_paya@gentoo.org>
+
+ grep -r: fix --include with globs, too
+ The previous fix addressed only the non-glob case.
+ * src/main.c (main): Use add_exclude's EXCLUDE_WILDCARDS option,
+ to enable the use of fnmatch with --include=GLOB.
+ gnulib: Update to latest, for the fixed exclude.c.
+
+2010-03-28 Jim Meyering <meyering@redhat.com>
+
+ grep -r: fix --include with non-globs
+ * lib/savedir.c (savedir): Fix logic error. Introduced by commit
+ bf3bd92c, "build: adapt to the newer exclude API we now get from gnulib"
+ * tests/include-exclude: Test for this bug by exercising --include, too.
+ * NEWS (Bug fixes): Mention it.
+ Reported by Philipp Kohlbecher in http://savannah.gnu.org/bugs/?29358
+
+2010-03-27 Jim Meyering <meyering@redhat.com>
+
+ kwset: correct comments; require non-NULL kwsmatch argument
+ * src/kwset.c (kwsexec): Correct comments. This function has been
+ returning an offset, not a pointer, for 9 years.
+ Do not test for kwsmatch == NULL. All callers pass non-NULL.
+ (cwexec): Likewise.
+ * src/kwset.h (kwsexec): Mark the 4th parameter, kwsmatch, as non-NULL.
+ Include "arg-nonnull.h".
+
+ build: add -I$(top_builddir)/lib so we also find generated .h files
+ * src/Makefile.am (AM_CPPFLAGS): Rename from INCLUDES to avoid
+ warning from automake -Wall.
+ Add -I$(top_builddir)/lib, so we find generated .h files like
+ getopt.h in a non-srcdir build.
+
+ build: remove superfluous LOCALEDIR definition
+ * src/Makefile.am (INCLUDES): Remove unnecessary definition of
+ LOCALEDIR here. Now, it's defined via gnulib's configmake.h.
+ * src/system.h: Include "configmake.h" for its LOCALEDIR definition.
+
+ grep: don't segfault upon use of --include or --exclude* options
+ * lib/savedir.c (isdir1): Fix fatal typo: deref "dir" argument,
+ not the global (initially-NULL) "path". Reported by Standish Parsley.
+ * tests/include-exclude: New file.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+
+2010-03-26 Jim Meyering <meyering@redhat.com>
+
+ tests: rename: foad1.sh -> foad1
+ * tests/foad1.sh: Rename to ...
+ * tests/foad1: ...this, dropping the .sh suffix.
+ * tests/Makefile.am (TESTS): Reflect renaming.
+
+ tests: convert foad1.sh to use init.sh
+ This fixes a spurious test failure when "make check" is run with
+ certain envvars set, e.g., "make check GREP_COLOR=always"
+ * tests/foad1.sh: Use init.sh and adapt accordingly:
+ Use "grep", not ${GREP}. Test VERBOSE against "yes", not "1",
+ to be consistent with init.sh.
+ Use Exit, not exit.
+ Reported by Nelson H. F. Beebe.
+
+ tests: insulate tests from envvar settings
+ * tests/init.cfg (vars_): Unset each envvar that can affect how
+ grep works. This protects only those tests that have been
+ converted to use init.sh.
+
+2010-03-25 Eric Blake <eblake@redhat.com>
+
+ maint: ignore 'make dist pdf' droppings
+ * .gitignore: Add more exemptions.
+
+2010-03-25 Jim Meyering <meyering@redhat.com>
+
+ tests: avoid spurious test failure due to lack of a French UTF8 locale
+ * tests/init.cfg: New file. If either $LOCALE_FR or $LOCALE_FR_UTF8
+ is set to "none", reset it to the empty string.
+ Reported by Mike Frysinger and Sven Joachim.
+ * tests/Makefile.am (EXTRA_DIST): Add init.cfg.
+
+ build: do not use pkg-config to test for PCRE support
+ * configure.ac: Do not use PKG_PROG_PKG_CONFIG or PKG_CHECK_MODULES.
+ Do not modify CPPFLAGS; that belongs to those who invoke make.
+ Instead, use autoconf's AC_CHECK_HEADERS and AC_SEARCH_LIBS via the
+ new macro, gl_FUNC_PCRE, defined in...
+ * m4/pcre.m4 (gl_FUNC_PCRE): New macro, to handle pcre-related
+ configure-time tests.
+ * src/Makefile.am (grep_LDADD): Use LIB_PCRE, not PCRE_LIBS.
+ * src/pcresearch.c: Test HAVE_LIBPCRE via "#if", not "#ifdef".
+ All other cpp tests of this symbol used "#if".
+ Prompted by a suggestion from Bruno Haible.
+ * NEWS (Build-related): Mention this.
+
+ doc: correct and amend NEWS entries for 2.6.1
+ * NEWS (Bug fixes): Correct character ranges bug description.
+ Add an example from Dmitry V. Levin.
+ Add that the word-with-backref bug was introduced in 2.5.1.
+ * cfg.mk (old_NEWS_hash): Update to match.
+
+ post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.6.1
+ * NEWS: Record release date.
+
+2010-03-25 Tony Abou-Assaleh <taa@acm.org>
+
+ tests: use awk's -v option more portably
+ * tests/spencer1-locale: Add a space between awk's "-v" option and
+ the following VAR=value string, to avoid test failure on Mac OS X.
+
+2010-03-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ dfa/grep: fix compilation with MBS_SUPPORT
+ * src/dfa.c (cur_mb_len): Initialize to 1 and always make it available.
+ (setbit_case_fold): Do not use wint_t in prototype if !MBS_SUPPORT.
+ (parse_bracket_exp): Fix compilation with !MBS_SUPPORT.
+ * src/kwsearch.c (kwsinit): Do not use mbtolower and MB_CUR_MAX
+ if !MBS_SUPPORT.
+ * src/searchutils.c (kwsinit): Do not refer to MB_CUR_MAX if !MBS_SUPPORT.
+
+ * tests/char-class-multibyte: Skip if UTF-8 matching does not work.
+ * tests/fmbtest.sh: Likewise.
+
+2010-03-25 Jim Meyering <meyering@redhat.com>
+
+ build: avoid warnings about unnecessary use of "return"
+ * src/grep.c (Gcompile, Ecompile, Acompile): Do not "return X"
+ from a function returning void, not even when X itself is a
+ function returning void. This avoids warnings from Sun Studio 11
+ reported by Dagobert Michelsen.
+ * src/egrep.c (Ecompile): Likewise.
+
+2010-03-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: fix printing when -w is used and regex is needed for matching
+ * NEWS: Document bugfix.
+ * src/dfasearch.c (EGexecute): After assess_pattern_match, len is either
+ invalid or end-beg; jump to success.
+ * tests/Makefile.am (TESTS): Add new test.
+ * tests/backref-word: New.
+
+2010-03-25 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: fix single byte character ranges
+ * src/dfa.c (in_coll_range): Fix ordering for second strcoll. Reported
+ by Dmitry V. Levin.
+ * tests/spencer1-locale.awk: Also test single-byte character sets.
+ * NEWS: Add a note about this bugfix.
+ * THANKS: Add Dmitry.
+
+2010-03-25 Norihiro Tanaka <noritnk@kcn.ne.jp>
+
+ grep: reset state after truncated or invalid multibyte sequences
+ * src/searchutils.c (is_mb_middle): When treating an invalid sequence
+ or a truncated multibyte character as a single byte character, reset
+ mbstate.
+
+ grep: do lowercase conversion in print_line_middle only for single-byte case
+ * src/main.c (print_line_middle): Restrict match_icase code
+ to MB_CUR_MAX == 1. Adjust comments.
+
+2010-03-25 Jim Meyering <meyering@redhat.com>
+
+ tests: provide framework_failure_ function
+ The shell function "framework_failure" was called in the unusual
+ event that some fundamental test set-up operation would fail.
+ However it was not defined. Define it, but with a trailing underscore
+ to impinge less on the test writer's name space. Adjust all uses.
+ * tests/init.sh (framework_failure_): New function.
+ * tests/case-fold-backref: s/framework_failure/framework_failure_/
+ * tests/case-fold-char-class: Likewise.
+ * tests/case-fold-char-range: Likewise.
+ * tests/case-fold-char-type: Likewise.
+ * tests/char-class-multibyte: Likewise.
+ * tests/dfaexec-multibyte: Likewise.
+ * tests/max-count-vs-context: Likewise.
+ * tests/word-multi-file: Likewise.
+
+2010-03-24 Jim Meyering <meyering@redhat.com>
+
+ doc: tweak THANKS
+ * THANKS: Update Arnold's name and address, per request.
+
+ portability: use gnulib's lseek wrapper
+ * bootstrap.conf (gnulib_modules): Use gnulib's lseek wrapper,
+ for improved portability. lseek does not fail with ESPIPE on
+ pipes on some systems.
+
+ build: avoid link failure on Solaris 8
+ * bootstrap.conf (gnulib_modules): Add wctob.
+ * NEWS (Portability): Mention this.
+ Reported by Dagobert Michelsen in <http://sv.gnu.org/bugs/?29325>.
+
+2010-03-24 Petr Písař <petr.pisar@atlas.cz>
+
+ doc: translate new --help message
+ * src/main.c: Translate "after_options".
+
+2010-03-24 Jim Meyering <meyering@redhat.com>
+
+ doc: NEWS make it clear that the bug was introduced in 2.6
+ * NEWS: Clarify.
+
+2010-03-24 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: fix char-class-multibyte
+ * tests/char-class-multibyte: Make it pass.
+
+2010-03-23 Jim Meyering <meyering@redhat.com>
+
+ build: avoid compilation failure when MBS_SUPPORT not defined
+ * src/dfa.c (setbit_case_fold) [!MBS_SUPPORT]: Fix curly brace mismatch.
+
+2010-03-23 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: fix sigsegv on multibyte character classes
+ Reported by Jaroslav Škarvada <jskarvad@redhat.com>. This is
+ unfortunate. grep needs an automatic testcase generator.
+
+ * NEWS: Document bug.
+ * THANKS: Mention reporter.
+ * src/dfa.c (setbit_case_fold): Change type of first argument for
+ self-documentation.
+ (parse_bracket_exp): Fix call.
+ * tests/Makefile.am: Add new testcase.
+ * tests/char-class-multibyte: New testcase.
+
+2010-03-23 Jim Meyering <meyering@redhat.com>
+
+ post-release administrivia
+ * NEWS: Add header line for next release.
+ * .prev-version: Record previous version.
+ * cfg.mk (old_NEWS_hash): Auto-update.
+
+ version 2.6
+ * NEWS: Record release date.
+
+ build: avoid warnings: tell gcc and clang that dfaerror never returns
+ * src/dfa.h (__attribute__): Define.
+ (dfaerror): Declare with the "noreturn" attribute.
+ * src/dfasearch.c (dfaerror): Add an unreachable use of abort.
+
+2010-03-22 Eric Blake <eblake@redhat.com>
+
+ build: fix cygwin build
+ Portions of gnulib depend on -lintl, and cygwin does not allow
+ lazy linking.
+
+ * src/Makefile.am (LDADD): Include libraries in correct order.
+
+2010-03-22 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: remove --mmap
+ mmap is a bad idea for sequentially accessed files because it will cause
+ a page fault for every read page. Just consider it a failed experiment,
+ and ignore --mmap while accepting it for backwards compatibility.
+
+ * configure.ac (AC_FUNC_MMAP): Remove.
+ * doc/grep.texi (Other options): Say --mmap is ignored.
+ * src/grep.c (mmap_option): Remove.
+ (long_options): Do not reference it.
+ (bufmapped, initial_bufoffset): Remove.
+ (reset, fillbuf): Remove HAVE_MMAP code.
+ (grepfile): Remove bufmapped reference.
+ (usage): Say --mmap is ignored.
+
+2010-03-22 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: rename files for intuitiveness
+ * Makefile.am (libgrep_a_SOURCES, grep_SOURCES, egrep_SOURCES,
+ fgrep_SOURCES): Adjust.
+ * grep.c: Rename to main.c.
+ * esearch.c: Rename to egrep.c.
+ * fsearch.c: Rename to fgrep.c.
+ * gsearch.c: Rename to grep.c.
+
+ grep: kill GREP_PROGRAM/EGREP_PROGRAM/FGREP_PROGRAM
+ * NEWS: Document slight semantic change.
+ * TODO: #ifdefs are gone.
+ * po/POTFILES.in: Update.
+ * src/Makefile.am (grep_SOURCES, egrep_SOURCES, fgrep_SOURCES): Remove
+ grep.c/egrep.c/fgrep.c.
+ (noinst_LIBRARIES): Change libsearch.a to libgrep.a.
+ (libsearch_a_SOURCES): Rename to libgrep_a_SOURCES, add grep.c
+ (LDADD): Change libsearch.a to libgrep.a.
+ * src/esearch.c: Add before_options and after_options.
+ * src/fsearch.c: Likewise.
+ * src/gsearch.c: Likewise.
+ * src/grep.c (short_options, long_options): Remove GREP_PROGRAM
+ special-casing.
+ (usage): Use before_options and after_options, look at matchers.
+ (setmatcher): Merge with install_matcher.
+ (main): Call setmatcher (NULL) instead of install_matcher.
+ * src/grep.h (GREP_PROGRAM): Remove.
+ (before_options, after_options): Add.
+
+ thank Eric Blake
+ * THANKS: Add Eric Blake, who reported the warning fixed by 774d0ee.
+
+ grep: libify *search.c
+ * src/Makefile.am (libsearch_a_SOURCES): Add dfasearch.c, kwsearch.c,
+ pcresearch.c.
+ * src/esearch.c, src/fsearch.c, src/gsearch.c: Only include search.h.
+ * src/dfasearch.c (GEAcompile, EGexecute): Export.
+ * src/kwsearch.c (Fcompile, Fexecute): Export.
+ * src/pcresearch.c (Pcompile, Pexecute): Export.
+ * src/search.h: Add new exported functions.
+
+ grep: prepare for libification of *search.c
+ * src/dfasearch.c (Ecompile): Remove.
+ * src/esearch.c: Place it here...
+ * src/gsearch.c: ... and here.
+
+ grep: split search.c
+ * po/POTFILES.in: Update.
+ * src/Makefile.am (grep_SOURCES, egrep_SOURCES, fgrep_SOURCES): Move
+ kwset.c and dfa.c to libsearch.a. Add searchutils.c there too.
+ * src/search.h, src/dfasearch.c, src/pcresearch.c, src/kwsearch.c,
+ src/searchutils.c: New files, split out of src/search.c.
+ * src/esearch.c, src/fsearch.c: Include the new files instead of search.c.
+ * src/gsearch.c: Likewise, plus move Gcompile/Acompile here.
+
+ grep: remove one #ifdef
+ * search.c (GEAcompile) [EGREP_PROGRAM]: Use common code. Inline IF_BK.
+
+2010-03-22 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: eliminate {COMPILE,EXECUTE}_{RET,ARGS,FCT}
+ Modern compilers warn about type mismatches.
+
+ * src/grep.c (do_execute): Write full declaration.
+ * src/grep.h (COMPILE_RET, COMPILE_ARGS, COMPILE_FCT, EXECUTE_RET,
+ EXECUTE_ARGS, EXECUTE_FCT): Remove.
+ (compile_fp_t, execute_fp_t): Write full declaration.
+ * src/search.c (GEAcompile, Gcompile, Acompile, Ecompile, EGexecute,
+ Fcompile, Fexecute, Pcompile, Pexecute): Write full declaration.
+
+2010-03-22 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: make egrep/fgrep use struct matcher
+ * Makefile.am (grep_SOURCES): Add gsearch.c.
+ (EXTRA_DIST): Add search.c.
+ * esearch.c (matchers): New.
+ * fsearch.c (matchers): New.
+ * gsearch.c: New.
+ * search.c (matchers): Remove.
+ * grep.c: Always compile most !GREP_PROGRAM sections.
+ (main): Use first matcher if none is explicitly provided. Remove
+ "default" matcher.
+ * grep.h (struct matcher): Adjust comments.
+
+ grep: change struct matcher termination
+ * src/grep.c (setmatcher): Look for NULL matchers[i].name.
+ * src/grep.h (struct matcher): Change name to pointer. Adjust comments.
+ * src/search.c (matchers): Terminate with three NULLs.
+
+ grep: remove one #ifdef
+ * search.c (Ecompile): Always go through GEAcompile to use same code path
+ for both grep and egrep.
+
+ grep: remove getpagesize.h
+ * src/getpagesize.h: Remove.
+ * src/Makefile.am (noinst_HEADERS): Remove getpagesize.h.
+
+2010-03-21 Jim Meyering <meyering@redhat.com>
+
+ build: use the fcntl-h module, not "fcntl"
+ * bootstrap.conf (gnulib_modules): We might need fcntl.h somewhere,
+ but don't use the fcntl function. Reported by Bruno Haible.
+
+ build: avoid link failure on systems using gnulib's fcntl but not open
+ * bootstrap.conf (gnulib_modules): Using gnulib's fcntl module
+ and including <fcntl.h>, but not also using gnulib's "open" module
+ would result in link failure due to references to rpl_open
+ on systems requiring the replacement (e.g., Cygwin and Darwin).
+
+ build: avoid compilation failure on systems using rpl_open
+ This new build failure has arisen as a result of using gnulib's
+ "fcntl" module. Now that an inadequate "open" syscall is replaced
+ by gnulib's wrapper, it is essential to include <fcntl.h>.
+ * src/grep.c: Include <fcntl.h>.
+ This is required, for grepfile's use of open, at least on
+ Cygwin and Darwin.
+
+ maint: use gnulib's fcntl module, just in case
+ * bootstrap.conf (gnulib_modules): Add fcntl.
+ Grep uses at least O_BINARY, which may be defined therein.
+
+ maint: remove TYPE_* definitions from src/system.h
+ * src/system.h (TYPE_MAXIMUM, TYPE_MINIMUM, TYPE_SIGNED): Remove
+ definitions. They are provided by intprops.h.
+ * src/grep.c: Include "intprops.h"
+ * bootstrap.conf (gnulib_modules): Add intprops.
+
+ maint: alphabetize #include directives
+ * src/grep.c: Alphabetize #include directives.
+
+2010-03-20 Jim Meyering <meyering@redhat.com>
+
+ build: stop using gnulib's memmove module
+ * bootstrap.conf (gnulib_modules): Remove obsolete module: memmove
+
+ build: reinstate gnulib's fcntl-h-tests
+ * bootstrap.conf (gnulib_tool_option_extras): Do not avoid
+ the fcntl-h-tests. I cannot reproduce the failure.
+
+2010-03-20 Eric Blake <eblake@redhat.com>
+
+ build: allow compilation on cygwin
+ Gnulib is incompatible with -Wunused-macros. Additionally,
+ cygwin 1.7.1 coupled with --enable-gcc-warnings tripped on:
+
+ grep.c: In function 'print_line_middle':
+ grep.c:805: error: array subscript has type 'char' [-Wchar-subscripts]
+ grep.c: In function 'main':
+ grep.c:1833: error: 'optarg' redeclared without dllimport attribute: previous dllimport ignored [-Wattributes]
+ grep.c:1834: error: 'optind' redeclared without dllimport attribute after being referenced with dll linkage
+
+ * configure.ac (GNULIB_WARN_FLAGS): Disable -Wunused-macros.
+ * src/grep.c (print_line_middle): Use correct type to tolower.
+ (main): Drop useless redeclarations.
+ * .gitignore: Ignore more built files.
+
+2010-03-20 Jim Meyering <meyering@redhat.com>
+
+ tests: ensure that all programs handle [b-a] consistently
+ * tests/reversed-range-endpoints: New test.
+ * tests/Makefile.am (TESTS): Add it.
+
+2010-03-20 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+ This pulls in the latest regex module from gnulib, including a fix
+ to make it honor the RE_NO_EMPTY_RANGES syntax bit.
+
+ tests: temporarily disable irrelevant-to-grep failing C++ fcntl-h-tests
+ * bootstrap.conf (gnulib_tool_option_extras): Temporarily add
+ --avoid=fcntl-h-tests, until the C++ part of that test is fixed.
+
+2010-03-20 Jim Meyering <meyering@redhat.com>
+
+ reject reversed-endpoint ranges, with all regex variants
+ * src/search.c: Add RE_NO_EMPTY_RANGES to the syntax bits
+ in three places, so that all of grep, egrep, and grep -E reject
+ a range with reversed endpoints like '[b-a]'. This is required,
+ when using the latest version of gnulib's regex module, since it
+ now honors the RE_NO_EMPTY_RANGES flag, rather than acting as if
+ it were always set.
+ Based on a change by Matthew Burgess.
+
+2010-03-19 Jim Meyering <meyering@redhat.com>
+
+ maint: correct macro parameter parentheses
+ * src/dfa.c (FETCH_WC, FETCH): Parenthesize macro parameters.
+
+2010-03-19 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: change help-version to per-program functions
+ * help-version: Change each *_args variable to a *_setup function.
+
+ dfa: fix wchar_t/wint_t type mismatch
+ * src/dfa.c (FETCH_WC): Pass a local wchar_t variable to mbrtowc.
+ (FETCH): Rename temporary second argument to FETCH_WC.
+ (parse_bracket_exp): Always use FETCH_WC.
+
+2010-03-19 Jim Meyering <meyering@redhat.com>
+
+ doc: add README-prereq, referenced from README-hacking
+ * README-prereq: New file. Cloned from coreutils, s/coreutils/grep/
+ Reported by Tony Abou-Assaleh.
+
+2010-03-19 Arnold Robbins <arnold@skeeve.com>
+
+ maint: sync dfa comments from gawk
+ * src/dfa.h (struct dfa) [newlines]: Amend comment.
+ * src/dfa.c: Update copyright year list to include gawk's.
+
+2010-03-17 Jim Meyering <meyering@redhat.com>
+
+ maint: remove obsolete "cvs-clean" make target
+ * Makefile.am (cvs-clean): Remove obsolete target.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: initialize struct mbcset using memset
+ * src/dfa.c (parse_bracket_exp): Use memset to initialize workmbc.
+
+ dfa: spell out "unsigned int"
+ * dfa.c (setbit, tstbit, clrbit, setbit_case_fold, lex, dfaoptimize,
+ free_mbdata): Put "int" after unsigned.
+ * dfa.h (struct position, struct dfa): Likewise.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: optimize simple character sets under UTF-8 charsets
+ Only use a bitset when possible without involving MBCSET. Testcase:
+ yes 'the quick brown fox jumps over the lazy dog' | sed 100000q | \
+ time grep -c [ABCDEFGHIJKLMNOPQRSTUVWXYZ,]
+
+ Before: 51ms (best of three runs); after: 16ms (best of three runs).
+
+ * src/dfa.c (parse_bracket_exp): For simple bracket expressions
+ under UTF-8, use a CSET.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: speed up handling of brackets
+ This patch has two sides. One is to fold the parsing of brackets in the
+ single- and multi-byte cases. The second is to leverage this change,
+ and use a bitset to test for single-byte characters in the charset.
+ Splitting the two would be very hard.
+
+ Testcase:
+ yes 'the quick brown fox jumps over the lazy dog' | sed 100000q | \
+ time grep -c [ABCDEFGHIJKLMNOPQRSTUVWXYZ,]
+
+ Before: 59ms (best of three runs); after: 51ms (best of three runs).
+ Nice, but mostly providing infrastructure for the next patch.
+
+ * src/dfa.c (setbit_case_fold): Try applying towlower/towupper.
+ (looking_at): Remove.
+ (FETCH_WC): New.
+ (fetch_wc): Merge into FETCH_WC [MBS_SUPPORT].
+ (FETCH) [MBS_SUPPORT]: Call FETCH_WC.
+ (prednames, find_pred, is_blank and other predicates): Move above,
+ remove K&R syntax support.
+ (parse_bracket_exp): New name of parse_bracket_exp_mb, rewritten to
+ include single-byte character set parsing of brackets.
+ (lex): Adjust for fetch_wc->FETCH_WC change, remove single-byte
+ character set parsing of brackets.
+ (match_mb_charset): Test against work_mbc->cset.
+ * src/dfa.h (struct mb_char_classes): Add cset.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ syntax-check: remove space-tab exception
+ * .x-sc_space_tab: Remove.
+ * src/dfa.c: Fix space-tab occurrence.
+
+ THANKS: fix Jim Meyering's email address
+ * THANKS: Jim is now with Red Hat.
+
+ dfa: add missing function
+ * src/dfa.c (using_utf8): New.
+ (addtok_wc, free_mbdata, dfaoptimize) [!MBS_SUPPORT]: Do not define.
+ (dfacomp) [!MBS_SUPPORT]: Do not call dfaoptimize.
+
+ tests: fix typo
+ * fedora: Fix typo.
+
+ tests: use Exit
+ * euc-mb: exit with "Exit 0".
+
+ grep: remove more register keywords
+ * dosbuf.c: Remove register keywords.
+ * grep.c: Remove register keywords.
+ * kwset.c: Remove register keywords.
+ * search.c: Remove register keywords.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: run simple UTF-8 regexps as a single-byte character set
+ This provides a speedup whenever fgrep is "almost" sufficient but
+ not quite (e.g. grep ^abc). This affects test cases such as
+ https://savannah.gnu.org/bugs/?29117, which are already worked around
+ by the line-by-line matching patch c32c04; without that patch the
+ speedup can reach 1000x even on non-contrived testcases.
+
+ * src/dfa.c (dfaoptimize): New.
+ (dfacomp): Call it.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ tests: fix syntax-check failures
+ * tests/case-fold-backref: Use "foo" instead of "the".
+ * tests/dfaexec-multibyte: Remove trailing blanks.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: remove check_multibyte_string, fix non-UTF8 missed match
+ Avoid computing ahead something that can be computed lazily as efficiently
+ (or more efficiently in the case of UTF-8, though this is left as TODO).
+ At the same time, "soften" the rejection condition for matching in the
+ middle of a multibyte sequence to fix bug 23814.
+
+ Multibyte "grep -i" would still be very slow if it wasn't for the workaround
+ patch c32c042 (grep: match multibyte charsets line-by-line when using -i,
+ 2010-03-08).
+
+ * NEWS: Document bugfix.
+ * src/search.c (check_multibyte_string): Rewrite as...
+ (is_mb_middle): ... this.
+ (EGexecute, Fexecute): Adjust.
+ * tests/Makefile.am (TESTS): Add euc-mb.
+ * tests/euc-mb: New testcase.
+
+2010-03-17 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: cache MB_CUR_MAX for dfaexec
+ * src/dfa.c (state_index, dfaexec): Use d->mb_cur_max.
+ (dfainit): Initialize it.
+ (free_mbdata): New, extracted out of dfafree.
+ (dfafree): Use it.
+
+ dfa: improve documentation of struct dfa
+ * src/dfa.h (struct dfa): Reword some comments.
+
+ tests: factor name of output files into a variable
+ * tests/case-fold-backref, tests/case-fold-char-class,
+ tests/case-fold-char-range, tests/case-fold-char-type,
+ tests/dfaexec-multibyte: Use a variable for the output filename,
+ as it is common to the grep and compare invocations.
+
+ tests: use different output files to simplify reading failed .log files
+ * tests/case-fold-backref, tests/case-fold-char-class,
+ tests/case-fold-char-range, tests/case-fold-char-type: Use a different
+ name for each output file from grep.
+ * tests/dfaexec-multibyte: Likewise, and merge some grep invocations.
+
+ tests: add another grep -i testcase, from bug 16179
+ * tests/case-fold-backref: New.
+ * tests/Makefile.am (TESTS): Add it.
+
+2010-03-16 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: rewrite handling of multibyte case_fold lexing
+ Let dfacomp do the folding to lowercase of multibyte input strings,
+ and remove it from grep.c. Input strings to kwset.c are still folded
+ outside kwset.c, so we still need to do mbtolower in search.c.
+
+ * NEWS: Document bugfixes.
+ * .x-sc_cast_of_argument_to_free: Remove.
+ * src/dfa.c (wctok, addtok_wc): New.
+ (cur_mb_index, update_mb_len_index): Remove.
+ (FETCH): Do not call it.
+ (parse_bracket_exp_mb) [GREP]: Disable case-folding of ranges and
+ characters.
+ (addtok): Extract part to...
+ (addtok_mb): ... this new function.
+ (lex): Call fetch_wc in the main loop for MB_CUR_MAX > 1. Return WCHAR
+ for normal characters if MB_CUR_MAX > 1.
+ (atom): Handle WCHAR instead of treating multibyte characters specially.
+ Do case folding of multibyte characters here.
+ (dfacomp): Remove case_fold special casing.
+ * src/dfa.h (WCHAR): New.
+ * src/grep.c (mb_icase_keys): Remove.
+ (main): Do not call it.
+ * src/search.c (kwsinit): Init transition table only for MB_CUR_MAX == 1.
+ (mbtolower): New.
+ (kwsincr_case): New.
+ (kwsmusts): Call it instead of kwsincr.
+ (check_multibyte_string): Remove.
+ (check_multibyte_string_no_icase): Rename to check_multibyte_string.
+ (GEAcompile, EGexecute, Fcompile): Use mbtolower instead of the old
+ check_multibyte_string.
+ * tests/Makefile.am (TESTS): Add case-fold-backslash-w.
+ * tests/foad1.sh: Enable fixed tests.
+ * tests/case-fold-backslash-w: New.
+
+2010-03-16 Paolo Bonzini <bonzini@gnu.org>
+
+ grep: match multibyte charsets line-by-line when using -i
+ The turtle combination -i + MB_CUR_MAX>1 requires case conversion ahead
+ of time. Avoid doing this repeatedly when many matches succeed. Together
+ with the previous changes, this fixes https://savannah.gnu.org/bugs/?29117
+ and https://savannah.gnu.org/bugs/?14472.
+
+ * NEWS: Document new speedup.
+ * src/grep.c (do_execute): New.
+ (grepbuf): Use it.
+
+2010-03-15 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: fix handling of ranges in multibyte character sets
+ * src/dfa.c (parse_bracket_exp_mb): Add separate ranges for
+ lowercase and uppercase endpoints if folding case.
+ * tests/Makefile.am (TESTS): Add case-fold-char-range.
+ * tests/case-fold-char-range: New.
+
+ tests: add more UTF-8 test cases
+ * tests/Makefile.am (TESTS): Add spencer1-locale.
+ (EXTRA_DIST): Add spencer1-locale.awk.
+ * tests/spencer1-locale.awk: New.
+ * tests/spencer1-locale: New.
+
+2010-03-15 Jim Meyering <meyering@redhat.com>
+
+ tests: complete the renaming fedora.sh -> fedora
+ * tests/Makefile.am (TESTS): Rename fedora.sh -> fedora here, too.
+
+2010-03-15 Jim Meyering <meyering@redhat.com>
+
+ * tests/fedora.sh: Rename to...
+ * tests/fedora: ...this, to reflect new convention:
+ Use the lack of a suffix to indicate we've converted to the new
+ init.sh-using test framework.
+
+ tests: adjust fedora.sh to handle traps more portably
+
+2010-03-15 Jim Meyering <meyering@redhat.com>
+
+ tests: adjust fedora.sh to handle traps more portably
+ * tests/fedora.sh: Use "Exit", not "exit".
+
+ tests: for each test, set an envvar to its name
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Set GREP_TEST_NAME for
+ each test. This is used to help make the output of hundreds of
+ independent, often-parallel valgrind runs more manageable.
+
+2010-03-14 Jim Meyering <meyering@redhat.com>
+
+ tests: clean up fedora.sh
+ * tests/fedora.sh: Use "grep", not ${GREP}.
+ Use init.sh.
+ Use timeout 10, not sleep 1 (three times).
+ The latter would always sleep for 3 seconds, and the test would
+ fail with a false positive on a slow system or with a heavily
+ instrumented (valgrind) executable.
+
+2010-03-12 Jim Meyering <meyering@redhat.com>
+
+ build: avoid build failure with --enable-gcc-warnings
+ * src/dfa.c: Don't include <assert.h>, now that it is not used.
+ [DEBUG]: Remove #ifdef block.
+
+2010-03-12 Paolo Bonzini <bonzini@gnu.org>
+
+ syntax-check: enable space-tab
+ * cfg.mk (local-checks-to-skip): Enable space-tab.
+ * .x-sc_space_tab: Add exceptions.
+ * tests/status.sh: Fix occurrence.
+
+ syntax-check: enable m4-quote-check
+ * cfg.mk (local-checks-to-skip): Enable m4-quote-check.
+ * configure.ac: Fix occurrence.
+
+ syntax-check: enable makefile-TAB-only-indentation
+ * cfg.mk (local-checks-to-skip): Enable makefile-TAB-only-indentation.
+ * Makefile.am: Fix only occurrence.
+
+ grep: fix error-message-uppercase
+ * cfg.mk (local-checks-to-skip): Enable error-message-uppercase.
+ * src/dfa.c (parse_bracket_exp_mb, lex, dfaparse): Fix occurrences.
+ * src/search.c (Pcompile, Pexecute): Fix occurrences.
+
+ dfa, grep: cleanup if-before-free and cast-of-argument-to-free
+ * .x-sc_avoid_if_before_free: Remove.
+ * .x-sc_cast_of_alloca_return_value: Remove.
+ * .x-sc_cast_of_x_alloc_return_value: Remove.
+ * .x-sc_cast_of_argument_to_free: Temporarily add src/search.c.
+ * cfg.mk (local-checks-to-skip): Remove sc_cast_of_argument_to_free.
+ * src/dfa.c (ifree): Remove.
+ (dfamust, build_state, transit_state, dfafree): Do not do if-before-free,
+ do not cast free argument to ptr_t or char *.
+ (freelist): Call free instead of ifree.
+ * src/dfa.h (ptr_t): Remove.
+
+2010-03-12 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: remove CRANGE dead code
+ The only use of CRANGE was removed by commit 193830d. In theory it is
+ more correct to do what CRANGE did, but in practice it seems like it did
+ not work.
+
+ * src/dfa.h (token): Remove CRANGE.
+ * src/dfa.c (atom): Do not handle CRANGE.
+ (prtok): Likewise.
+
+2010-03-12 Paolo Bonzini <bonzini@gnu.org>
+
+ dfa: get rid of x*alloc
+ * src/dfa.c: Include xalloc.h.
+ (xmalloc, xrealloc, xcalloc): Remove.
+
+ grep: cleanup one const cast
+ * src/search.c (GEAcompile): Do not reuse motif when operating on the
+ (const) pattern, so we can make it non-const. Remove cast from free.
+
+ kwset/system: remove ptr_t
+ * src/kwset.h: Declare kwset using an incomplete struct type.
+ * src/system.h (ptr_t): Remove.
+
+2010-03-12 Jim Meyering <meyering@redhat.com>
+
+ tests: add test cases for dfaexec bug
+ * tests/dfaexec-multibyte: New test.
+ * tests/Makefile.am (TESTS): Add it.
+ Reported by Paolo Bonzini in http://bugzilla.redhat.com/544407
+ and http://bugzilla.redhat.com/544406 .
+
+2010-03-12 Jim Meyering <meyering@redhat.com>
+
+ dfa: manually merge gawk's dfaexec
+ * src/dfa.c (dfaexec): Adjust API: return pointer, not offset, and
+ take an "end" pointer parameter, rather than integral "size".
+ Adjust comment accordingly.
+ (build_state): Maintain d->newlines.
+ (copytoks): Update multibyte_prop indices.
+ (SKIP_REMAINS_MB_IF_INITIAL_STATE): Update a cast.
+ Return NULL, rather than (size_t) -1.
+ (realloc_trans_if_necessary): Realloc d->newlines.
+ * src/dfa.h (struct dfa): New member, "newlines".
+ (struct dfa) [GAWK]: New member, "broken".
+ (dfaexec): Update prototype and copy the new comment from dfa.c.
+
+ dfa: make search.c use the new dfaexec API
+
+ * src/search.c: Adjust to new dfaexec API.
+ Now, dfaexec returns a pointer, not an integer,
+ and the third parameter is END, not buffer size.
+ * src/dfa.c (dfaexec): Rewrite the function's comment.
+ Don't just clobber *END. While doing that happens to be
+ fine for gawk's usage, in grep, *END usually points to the
+ first byte of the next buffer. Save the initial value,
+ and restore it just before returning.
+ * src/dfa.h (dfaexec): Update comment; include parameter names.
+
+2010-03-12 Jim Meyering <meyering@redhat.com>
+
+ dfa: appease static analyzers
+ * src/dfa.c (transit_state_singlebyte): Call abort rather
+ than returning in a "can't happen" scenario.
+ This stops clang from emitting a false-positive report (I think it
+ was used-uninitialized) about a caller.
+
+2010-03-11 Jim Meyering <meyering@redhat.com>
+
+ dfa: do not accept [[:UPPER:]] or [[:LOWER:]] internally
+ * src/dfa.c (parse_bracket_exp_mb): Those class names are not
+ valid, and rejected elsewhere, so there is no point in allowing
+ upper or mixed-case versions here.
+
+2010-03-11 Jim Meyering <meyering@redhat.com>
+
+ maint: remove a trailing space
+ * src/search.c (EXECUTE_FCT): Remove trailing space.
+
+ maint: remove all uses of PARAMS
+ Remove most with this:
+ git grep -lw PARAMS |xargs perl -pi -e 's/\bPARAMS *\((.*)\);/$1;/'
+ Remove the remainder manually.
+
+2010-03-11 Jim Meyering <meyering@redhat.com>
+
+ maint: remove all uses of PARAMS
+ * lib/savedir.h (PARAMS): Remove definitions manually.
+ Remove the remaining ones via this command:
+ git grep -l define.PARAMS |xargs perl -ni -e '/define PARAMS/ or print'
+ * src/dfa.h (PARAMS): Remove definitions.
+ * src/system.h (PARAMS): Likewise.
+ Remove most uses with this:
+ git grep -lw PARAMS |xargs perl -pi -e 's/\bPARAMS *\((.*)\);/$1;/'
+ Remove the remainder manually.
+
+ maint: remove now-useless prototypes
+ * src/dfa.c: Remove the prototype of each static, non-recursive
+ function whose definition precedes first use.
+
+ grep: plug an inconsequential leak
+ * src/grep.c (main): Plug a leak: free "keys".
+
+ grep: avoid useless allocations for empty GREP_OPTIONS
+ * src/grep.c (prepend_default_options): Ignore GREP_OPTIONS
+ when it's empty, not just when it's undefined.
+ There are still relatively harmless leaks when GREP_OPTIONS
+ is set and non-empty. We'll address those, eventually.
+
+2010-03-09 Jim Meyering <meyering@redhat.com>
+
+ build: record build-from-clone tool requirements
+ * bootstrap.conf (buildreq): This makes bootstrap fail with
+ a clear explanation of the problem. Otherwise, you'd get into
+ the build process and fail with something far more cryptic.
+
+ dfa: remove a trailing blank
+ * src/dfa.c (dfaexec): No trailing blanks allowed.
+
+ dfa: sync a tiny change from gawk
+ * src/dfa.c (state_index) [MBS_SUPPORT]: Initialize .mbps.nelem member
+ unconditionally. Also initialize .mbps.elems.
+
+ dfa: avoid a leak (work_mbc->chars)
+ * src/dfa.c (parse_bracket_exp_mb): Remove useless (and leaked) MALLOC.
+
+ doc+bootstrap: document build-from-git-clone process
+ * bootstrap: Update from coreutils/gnulib.
+ * README-hacking: New file, nearly identical to the one in coreutils.
+
+2010-03-08 Paolo Bonzini <bonzini@gnu.org>
+
+ more work on TODO
+ * TODO: More work on the first section. Use clearer section headers.
+
+2010-03-08 Reuben Thomas <rrt@sc3d.org>
+
+ bring TODO up-to-date
+ * TODO: merge with TODO section of http://www.gnu.org/software/grep/devel.html
+ and remove done items. Some small bits of tidying also.
+
+2010-03-07 Paolo Bonzini <bonzini@gnu.org>
+
+ simplify parsing of [a-z]
+ * src/dfa.c (in_coll_range): New.
+ (lex): Use it instead of regcomp/regexec.
+
+ Small refactoring in src/dfa.c
+ * src/dfa.c (parse_bracket_exp_mb): Return MBCSET.
+ (lex): Assign return value of parse_bracket_exp_mb to lasttok, return it.
+
+ use do...while(0) idiom
+ * dfa.c (FETCH): Wrap with do...while(0).
+
+2010-03-06 Paolo Bonzini <bonzini@gnu.org>
+
+ extract common code from if/else
+ * dfa.c (dfaexec): Simplify logic for MB_CUR_MAX > 1 case.
+
+ remove register variable hacks
+ * dfa.c (dfaexec): We can take the address of a variable without fearing
+ performance problems; modern compilers know better.
+
+ remove register keywords
+ * dfa.c (dfaexec): Modern compilers just ignore it.
+
+ allow grep -Pz
+ * NEWS: Document grep -P improvements.
+ * src/search.c (Pcompile): Remove restriction on grep -Pz.
+ * tests/pcre-z: New.
+ * tests/Makefile.am (TESTS): Add pcre-z.
+
+ fix cross-line matching in PCRE backend
+ * search.c (Pexecute): Split the buffer in lines and match each line
+ separately.
+ * tests/fedora.sh: Add regression testsuite.
+
+ fix formatting of NEWS
+ * NEWS: fix formatting of 2.6 entries.
+
+ fix a bug in handling of -i and character type
+ * dfa.c (parse_bracket_exp_mb): Convert [[:lower:]] and [[:upper:]] to
+ [[:alpha:]] when folding case.
+ * tests/case-fold-char-type: New file. Test for the bug.
+ * tests/Makefile.am (TESTS): Add it.
+ * NEWS (Bug fixes): Mention it.
+
+ fix previous test case change
+ * tests/case-fold-char-class: Do not reset fail to 0 after first test.
+
+2010-03-06 Mike Frysinger <vapier@gentoo.org>
+
+ grep(1) man page: touchup --label option
+ * doc/grep.1 (--label): Don't italicize ending period. Point to -H
+ option.
+
+2010-03-06 Paolo Bonzini <bonzini@gnu.org>
+
+ augment case-fold-char-class test case
+ * tests/case-fold-char-class: Test matching lowercase against uppercase
+ as well as vice versa.
+
+2010-03-05 Reuben Thomas <rrt@sc3d.org>
+
+ doc: improve the discussion of PCRE
+ * doc/grep.1: Add a sentence about Perl regular expressions,
+ and point to pcresyntax(3) and pcrepattern(3).
+ * doc/grep.texi: Likewise.
+
+2010-03-05 Jim Meyering <meyering@redhat.com>
+
+ maint: dfa-sync: comment and dead-to-grep code: no semantic change
+ * src/dfa.c: Sync a comment and some #ifdef GAWK code.
+
+ maint: dfa-sync: don't malloc zero
+ * src/dfa.c (dfacomp): Skip case_fold logic when length is zero.
+ This is probably "no semantic change", but it does improve efficiency in
+ a degenerate case.
+
+ maint: dfa-sync: use CALLOC rather than equiv. MALLOC+initialize-loop
+ * src/dfa.c (dfaanalyze): Sync from gawk. No semantic change.
+
+ dfa.c: add support for \s and \S
+ * src/dfa.c (lex): Sync from gawk's dfa.c.
+
+ maint: dfa-sync: add omitted array initializer
+ * src/dfa.c (prednames): Add a "0" to final initializer.
+ No semantic change.
+
+ fix a bug in handling of -i and character classes
+ * dfa.c (parse_bracket_exp_mb): Sync one part of this function
+ from gawk's dfa.c, which was patched by Arnold D. Robbins.
+ * tests/case-fold-char-class: New file. Test for the bug.
+ * tests/Makefile.am (TESTS): Add it.
+ (TESTS_ENVIRONMENT): Propagate LOCALE_FR and LOCALE_FR_UTF8
+ definitions into tests.
+ * NEWS (Bug fixes): Mention it.
+
+2010-03-05 Paolo Bonzini <pbonzini@redhat.com>
+
+ Fedora Grep regression test suite
+ * tests/Makefile.am (TESTS): Add fedora.sh.
+ (CLEANFILES): Add several new files.
+ * tests/fedora.sh: New file, originally by Lubomir Rintel but somewhat
+ rewritten to avoid bashisms.
+
+2010-03-05 Paolo Bonzini <bonzini@gnu.org>
+
+ convert AUTHORS file to UTF-8
+ * AUTHORS: Convert to UTF-8.
+
+ eliminate invalid "ptr += (ptr2 - ptr1)"
+ * lib/savedir.c (savedir): new_name_space and name_space do not point into
+ the same object, so computing their difference is invalid. Similarly,
+ summing the difference to namep is invalid because namep and the result
+ point into different objects. Avoid this.
+
+ fix for bug 21276
+ * lib/savedir.c (isdir1): Use realloc instead of calloc. Remove
+ dead code.
+ (savedir): Do not leak name_space if allocation of new_name_space fails.
+
+2010-03-04 Jim Meyering <meyering@redhat.com>
+
+ tests: add a test based on an example from Paolo Bonzini
+ * tests/word-multi-file: New test.
+ * tests/Makefile.am (TESTS): Add it.
+
+ doc: document release procedure
+ * README-release: New file.
+
+ build: update gnulib submodule to latest
+
+2010-02-22 Paolo Bonzini <bonzini@gnu.org>
+
+ add --group-separator=FOO and --no-group-separator
+ * src/grep.c (group_separator): New.
+ (long_options): Add --group-separator=FOO and --no-group-separator.
+ (prtext): Print group_separator instead of SEP_STR_GROUP. Optionally
+ suppress the separator altogether.
+ (main): Handle GROUP_SEPARATOR_OPTION.
+ * doc/grep.texi (Context control): Document it.
+ * NEWS: Mention it.
+ * tests/yesno.sh: Add testcases.
+
+2010-02-21 Jim Meyering <meyering@redhat.com>
+
+ tests: don't use "echo -n"
+ * tests/foad1.sh: Use printf, not echo -n. The latter is not portable.
+ Reported by Daniel Richman.
+
+2010-02-08 Jim Meyering <meyering@redhat.com>
+
+ remove useless DJGPP-specific code
+ * src/grep.c (grepfile): Remove now-useless DJGPP-specific code.
+ Now, all S_IS* macros are guaranteed to be defined via gnulib.
+
+2010-02-07 Jim Meyering <meyering@redhat.com>
+
+ tests: add help-version sanity tests from coreutils
+ * tests/help-version: New test, from coreutils.
+ * tests/Makefile.am (TESTS): Add it.
+ (TESTS_ENVIRONMENT) [built_programs]: Define it.
+
+ tests: correct TESTS_ENVIRONMENT's PATH setting
+ * tests/Makefile.am (TESTS_ENVIRONMENT): Set PATH to start with
+ $(abs_top_builddir)/src, so that we test the programs we've just built.
+
+ grep: use the correct exit status (2) upon write failure, not 1
+ * src/grep.c (main): Initialize exit_failure to EXIT_TROUBLE.
+ * NEWS (Bug fixes): Mention this fix.
+
+ maint: enable the prohibit_magic_number_exit syntax check
+ * cfg.mk (local-checks-to-skip): Remove sc_prohibit_magic_number_exit,
+ to enable that check.
+ * src/system.h (EXIT_TROUBLE): Define.
+ * src/grep.c: Use symbolic names, EXIT_SUCCESS, EXIT_FAILURE, and
+ EXIT_TROUBLE, not 0, 1, 2.
+ * src/search.c: Likewise.
+ * src/vms_fab.c (string): Likewise.
+
+2010-02-04 Jim Meyering <meyering@redhat.com>
+
+ doc: adjust NEWS item
+ * NEWS: Correct a description.
+
+2010-02-03 Jim Meyering <meyering@redhat.com>
+
+ tests: exercise surprising -m1 vs. --context behavior
+ * tests/max-count-vs-context: New test. Exercise the surprising,
+ but documented, behavior reported by Markus Jochim in
+ http://savannah.gnu.org/bugs/?28588.
+ * tests/Makefile.am (TESTS): Add it.
+
+ tests: use init.sh from gnulib
+ * tests/init.sh: New file, from gnulib.
+ * tests/Makefile.am (EXTRA_DIST): Add it.
+ (TESTS_ENVIRONMENT): Add variables and features.
+ (VERBOSE): Define.
+
+ maint: remove unused Makefile rule
+ * tests/Makefile.am (dist-hook): Remove rule. No longer needed.
+
+ maint: adjust formatting in tests/Makefile.am
+ * tests/Makefile.am (TESTS, CLEANFILES): Align and sort.
+
+ build: avoid warnings in gnulib-supplied regex files
+ Now that we enable more warnings in lib/, we choose
+ to avoid some via patches applied by bootstrap, using
+ files in the gl/ hierarchy. Other, less-important
+ warnings are avoided simply by turning off the
+ -Wold-style-definition option and using a slightly
+ relaxed set of warnings $(GNULIB_WARN_CFLAGS) in lib/.
+ * gl/lib/regcomp.c.diff: Avoid warnings.
+ * gl/lib/regex_internal.c.diff: Likewise.
+ * gl/lib/regex_internal.h.diff: Likewise.
+ * gl/lib/regexec.c.diff: Likewise.
+ * configure.ac (GNULIB_PORTCHECK): Disable only -Wold-style-definition.
+ * lib/Makefile.am (AM_CFLAGS): Use $(GNULIB_WARN_CFLAGS) rather
+ than the slightly more strict $(WARN_CFLAGS).
+
+ tests: adjust spencer #37 to pass with gnulib's regex code
+ * tests/spencer1.tests: Change #37 to expect an exit status of 2, not 1.
+ grep 'a[b-a]' reports "Invalid range end".
+
+ maint: use regex from gnulib, rather than our bit-rotting one
+ * bootstrap.conf (gnulib_modules): Add regex.
+ * configure.ac: Don't use jm_INCLUDED_REGEX.
+ Update use of cache variable.
+ * lib/regex.c: Remove file.
+ * lib/regex.h: Likewise.
+ * m4/regex.m4: Likewise.
+ * POTFILES.in: Update to match.
+
+ build: update gnulib submodule to latest
+
+2010-01-28 Jim Meyering <meyering@redhat.com>
+
+ maint: update to latest gnulib; adjust cfg.mk
+ * gnulib: Update submodule to latest.
+ * cfg.mk (old_NEWS_hash): Update to reflect NEWS Copyright line change.
+
+2010-01-06 Jim Meyering <meyering@redhat.com>
+
+ maint: avoid old jm_* macros
+ There were jm_* macros here, until very recently.
+ * cfg.mk (sc_prohibit_jm_in_m4): New rule, from coreutils.
+
+ maint: remove decl.m4
+ * m4/decl.m4: Remove unused file.
+
+ maint: rely on gnulib's new isdir.h
+ * src/grep.c: Include "isdir.h".
+ * src/system.h: Remove declaration of isdir.
+
+ build: rename local to avoid shadowing global, dfa
+ * src/dfa.c (dfamust): Rename parameter: s/dfa/d/.
+
+ build: avoid warning from -Wmissing-prototypes
+ * src/dfa.c (match_mb_charset): Declare to be static.
+
+ build: avoid shadowing warning for "link"
+ * src/kwset.c (link): Define to kwset_link, to avoid shadowing
+ the function.
+
+ build: avoid shadowing warning for unused "rs"
+ * src/dfa.c (transit_state): Remove dead stores;
+ move a declaration "down".
+ Ignore transit_state_consume_1char return value.
+
+ build: avoid shadowing warnings
+ * src/dfa.c (match_mb_charset): Rename parameter: s/index/idx/.
+ (check_matching_with_multibyte_ops, match_anychar): Likewise.
+
+ build: avoid warning about unused definition of N_
+ * src/dfa.c (N_): Remove unused definition.
+
+ build: avoid format-string warnings
+ * src/search.c (dfaerror): Use literal "%s" as format string.
+ (kwsmusts, GEAcompile): Likewise.
+ (Pcompile): Likewise.
+
+ build: add configure-time --enable-gcc-warnings option; avoid warnings
+ * bootstrap.conf (gnulib_modules): Add "manywarnings" module.
+ * configure.ac: Add --enable-gcc-warnings, derived from code in bison.
+ * src/Makefile.am (AM_CFLAGS): Set to $(WARN_CFLAGS) $(WERROR_CFLAGS).
+ * lib/Makefile.am (AM_CFLAGS): Likewise, but append.
+
+ build: remove now-useless -I../intl option
+ * src/Makefile.am (INCLUDES): Remove -I../intl, now that intl is gone.
+
+ maint: avoid more warnings
+ * src/grep.c (MAX): Remove definition of unused macro.
+ (usage): Declare with __attribute__ ((noreturn)).
+ Split long strings into chunks of length < 509.
+
+ fix a possible bug: remove errant semicolon
+ * src/grep.c (prline): Remove erroneous semicolon-after-if-expr.
+
+ maint: avoid compilation warnings
+ * bootstrap.conf (gnulib_modules): Add ignore-value.
+ * src/search.c (check_multibyte_string_no_icase): A variant of
+ check_multibyte_string that does *not* convert case, and hence
+ does not modify its BUF parameter.
+ (check_multibyte_string): Use xcalloc in place of xmalloc+memset.
+ Use ignore_value to ignore the return value from wcrtomb. This is
+ ok, since we know the input is a valid upper case wide character.
+ (Fexecute, EGexecute): Update callers of check_multibyte_string
+ to use both it and check_multibyte_string_no_icase.
+
+ maint: avoid warnings about unused fwrite return value
+ * bootstrap.conf (gnulib_modules): Add unlocked-io.
+ * src/system.h: Include "unlocked-io.h".
+
+ maint: remove {m4,lib}/.gitignore; they were undergoing too much churn
+ * .gitignore: Ignore all of m4/* except m4/djgpp.m4
+ and all of lib/* except Makefile.am, savedir.c and savedir.h.
+ * m4/.gitignore: Remove file.
+ * lib/.gitignore: Remove file.
+
+2010-01-05 Jim Meyering <meyering@redhat.com>
+
+ build: run gnulib's tests, too
+ * Makefile.am (SUBDIRS): Add gnulib-tests.
+ * gnulib-tests/Makefile.am: New file.
+ * bootstrap.conf (bootstrap_epilogue): New function, from coreutils.
+ (gnulib_tool_option_extras): Define.
+ * configure.ac: Add gnulib-tests/Makefile.
+
+2010-01-03 Jim Meyering <meyering@redhat.com>
+
+ maint: record update-copyright options for this package
+ * cfg.mk: Next time, just run "make update-copyright".
+
+2010-01-01 Jim Meyering <meyering@redhat.com>
+
+ maint: update all FSF copyright year lists to include 2010
+ Use this command:
+ git ls-files |grep -vE '^(\..*|COPYING|gnulib)$' |xargs \
+ env UPDATE_COPYRIGHT_USE_INTERVALS=1 build-aux/update-copyright
+
+2009-12-23 Jim Meyering <meyering@redhat.com>
+
+ fix multi-byte-locale read-beyond-end-of-buffer error
+ Avoid read-beyond-end-of-buffer errors, evoked by running this:
+ LC_ALL=en_US.UTF-8 valgrind src/grep -f <(printf 'a\nb\n') <(echo c)
+
+ Conditional jump or move depends on uninitialised value(s)
+ at 0x78136D: __gconv_transform_utf8_internal (in /lib/libc-2.11.so)
+ by 0x7E7232: mbrtowc (in /lib/libc-2.11.so)
+ by 0x8055773: dfaexec (dfa.c:2816)
+ by 0x804D7B0: EGexecute (search.c:353)
+ by 0x804ACD8: grepbuf (grep.c:1036)
+ by 0x804B023: grep (grep.c:1156)
+ by 0x804B460: grepfile (grep.c:1287)
+ by 0x804CF0D: main (grep.c:2282)
+
+ Conditional jump or move depends on uninitialised value(s)
+ at 0x7E7248: mbrtowc (in /lib/libc-2.11.so)
+ by 0x8055773: dfaexec (dfa.c:2816)
+ by 0x804D7B0: EGexecute (search.c:353)
+ by 0x804ACD8: grepbuf (grep.c:1036)
+ by 0x804B023: grep (grep.c:1156)
+ by 0x804B460: grepfile (grep.c:1287)
+ by 0x804CF0D: main (grep.c:2282)
+
+ * src/dfa.c (dfaexec) [MBS_SUPPORT]: Do not access one byte beyond
+ end of buffer.
+
+2009-12-23 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+2009-12-23 Paolo Bonzini <bonzini@gnu.org>
+
+ Speed up insert.
+ Suggested by Johan Walles <johan.walles@gmail.com> (bug 23354).
+
+ * src/dfa.c (insert): Use binary search.
+
+2009-12-23 Johan Walles <johan.walles@gmail.com>
+
+ Decrease epsclosure memory usage
+ Fixes bug 23321.
+
+ * src/dfa.c (epsclosure): Make visited an array of char.
+
+2009-12-22 Paolo Bonzini <bonzini@gnu.org>
+
+ Make 'grep -1 -2' and 'grep -1v2' equivalent to grep -2
+ Fixes bug 12128.
+
+ * src/grep.c (get_nondigit_option): Reset the buffer every time
+ a non-digit option is found or a new argument is started.
+
+2009-12-22 Paolo Bonzini <bonzini@gnu.org>
+
+ Improve description of --label
+ Fixes bug 22681.
+
+ * doc/grep.1 (--label): Use -H in the example, improve wording.
+ * doc/grep.texi (Output Line Prefix Control): Likewise.
+
+2009-12-22 Paolo Bonzini <bonzini@gnu.org>
+
+ Avoid using an invalid memchr result.
+ Related to bug 13161. I cannot find a testcase, but it is better to be
+ defensive considering that these bugs were found in the past.
+
+ * src/search.c (EGexecute, Fexecute): Check for memchr return values.
+
+2009-12-11 Jim Meyering <meyering@redhat.com>
+
+ build: update gnulib submodule to latest
+
+2009-12-04 Jim Meyering <meyering@redhat.com>
+
+ maint: enable prohibit_have_config_h check
+ * cfg.mk (local-checks-to-skip): Enable sc_prohibit_have_config_h.
+ * lib/regex.c: Remove useless cpp test of HAVE_CONFIG_H.
+ * lib/savedir.c: Likewise.
+ * src/grep.c: Likewise.
+ * src/kwset.c: Likewise.
+ * src/search.c: Likewise.
+
+ maint: enable cast_of_x_alloc_return_value check
+ * cfg.mk (local-checks-to-skip): Enable sc_cast_of_x_alloc_return_value.
+ * .x-sc_cast_of_x_alloc_return_value:
+ * src/dfa.c (CALLOC, MALLOC, REALLOC): Remove casts.
+ * src/dosbuf.c (undossify_input): Likewise.
+ * src/grep.c (print_line_middle, prepend_default_options): Likewise.
+
+ maint: enable cast_of_alloca_return_value check
+ * cfg.mk (local-checks-to-skip): Enable sc_cast_of_alloca_return_value.
+ * .x-sc_cast_of_alloca_return_value: New file.
+
+2009-12-04 Paolo Bonzini <bonzini@gnu.org>
+
+ fix "grep -Ff" on CRLF-terminated files
+ * src/search.c (Fcompile) [HAVE_DOS_FILE_CONTENTS]: Recognize \r\n as
+ a line terminator.
+
+ fix compilation with included regex
+ * Makefile.am (libgreputils_a_DEPENDENCIES): New.
+
+ switch to pkg-config for PCRE detection
+ * configure.ac: Use pkg-config to detect PCRE.
+ * src/Makefile.am (grep_LDADD): Link grep with PCRE_LIBS.
+
+2009-12-04 Jim Meyering <meyering@redhat.com>
+
+ maint: remove "missing" script
+ * missing: Remove now-unused file.
+
+ maint: make .gitignore ignore more
+ * .gitignore: Ignore more.
+
+ maint: enable useless-if-before-free check
+ * cfg.mk (local-checks-to-skip): Enable sc_avoid_if_before_free.
+ * .x-sc_avoid_if_before_free: New file. Exempt regex.c and dfa.c,
+ in case anyone ever tries to merge their contents with other versions.
+ * src/grep.c (print_line_middle, grepdir): Remove useless if-before-free.
+ * src/search.c (IF_BK, EXECUTE_FCT): Likewise.
+
+ maint: enable po-check
+ * cfg.mk (local-checks-to-skip): Enable sc_po_check.
+ * po/POTFILES.in: Sort and update.
+
+2009-12-03 Paolo Bonzini <bonzini@gnu.org>
+
+ update gnulib, fixing missing inclusion of stdbool.h
+ * gnulib: Update.
+
+2009-11-30 Jim Meyering <meyering@redhat.com>
+
+ maint: enable two checks
+ * cfg.mk (local-checks-to-skip): Enable two:
+ sc_prohibit_xalloc_without_use sc_two_space_separator_in_usage
+ * src/grep.c (usage): Conform: use two spaces, not 1.
+ * src/kwset.c (malloc): Define as a function-macro so that the
+ syntax-check rule sees that we are indeed using xmalloc here.
+
+ maint: enable makefile_path_separator check
+ * cfg.mk (local-checks-to-skip): Enable sc_makefile_path_separator_check,
+ now that the sole offender, an old po/Makefile.in.in, is gone.
+
+ maint: remove now-generated file: po/Makefile.in.in
+ * po/Makefile.in.in: Remove file, now generated via bootstrap.
+
+ maint: enable makefile @...@ check
+ * cfg.mk (local-checks-to-skip): Enable sc_makefile_check.
+ * lib/Makefile.am (libgreputils_a_LIBADD): Use $(...), rather than
+ anachronistic @...@ notation.
+ * src/Makefile.am (LDADD): Likewise.
+ * tests/Makefile.am (AWK): Remove definition.
+
+ maint: enable trailing_blank check
+ * cfg.mk (local-checks-to-skip): Enable sc_trailing_blank.
+ * AUTHORS: Remove trailing blanks.
+ * COPYING: Likewise.
+ * README: Likewise.
+ * README-alpha: Likewise.
+ * README-boot: Likewise.
+ * THANKS: Likewise.
+ * TODO: Likewise.
+ * src/dfa.c: Likewise.
+ * src/mbsupport.h: Likewise.
+ * tests/backref.sh: Likewise.
+ * tests/file.sh: Likewise.
+ * tests/options.sh: Likewise.
+ * tests/tests: Likewise.
+ * vms/README: Likewise.
+ * vms/make.com: Likewise.
+
+ maint: enable unmarked_diagnostics check
+ * cfg.mk (local-checks-to-skip): Enable sc_unmarked_diagnostics.
+ * src/grep.c (fillbuf): Mark a diagnostic for translation.
+ (reset): Likewise.
+
+ maint: enable require_config_h checks
+ * cfg.mk (local-checks-to-skip): Enable sc_require_config_h
+ and sc_require_config_h_first.
+ * src/dosbuf.c: Include <config.h>.
+ * src/vms_fab.c: Likewise.
+ * .x-sc_require_config_h: New file: list the exceptions.
+ * .x-sc_require_config_h_first: Likewise.
+
+ maint: use gnulib's progname module; enable set_program_name check
+ * bootstrap.conf (gnulib_modules): Add progname.
+ * src/grep.c: Include "progname.h".
+ (program_name): Remove declaration.
+ (main): Call set_program_name.
+ * cfg.mk (local-checks-to-skip): Enable sc_program_name.
+
+ maint: enable "file system" check
+ * cfg.mk (local-checks-to-skip): Enable sc_file_system.
+ * lib/savedir.c (savedir): Tweak spelling. Remove trailing blanks.
+
+ maint: enable immutable_NEWS check
+ * NEWS: Move copyright to the bottom.
+ Use the format required by release-related tools.
+ * .prev-version: New file.
+ * cfg.mk (old_NEWS_hash): Define.
+ (local-checks-to-skip): Enable check: sc_immutable_NEWS.
+
+ maint: disable the many failing syntax-checks
+ * cfg.mk: New file.
+ (local-checks-to-skip): Define to the list of disabled rules.
+ Subsequent change-sets will enable them, one by one.
+
+ build: require automake-1.11, enable silent-rules, parallel tests, xz
+ * configure.ac (AM_INIT_AUTOMAKE): Create xz-compressed tarballs,
+ not bzip2-compressed ones. Enable automake's silent-rules,
+ parallel tests, and test PASS/FAIL coloring options.
+ Use AC_CONFIG_HEADERS, not AM_CONFIG_HEADER. Quote the argument.
+
+ build: use git-version-gen for inter-release version strings
+ * configure.ac (AC_INIT): Use git-version-gen.
+
+ build: add several build- and release-related gnulib modules
+ * bootstrap.conf (gnulib_modules): Add announce-gen update-copyright
+ do-release-commit-and-tag git-version-gen gnu-web-doc-update
+ gnupload maintainer-makefile useless-if-before-free
+
+ build: adapt to the newer closeout module from gnulib
+ * src/grep.c: Include "exitfail.h".
+ (main) [-q]: Set the global variable, exit_failure, rather than
+ calling the now-removed close_stdout_set_file_name function.
+
+ build: adapt to the newer exclude API we now get from gnulib
+ * src/grep.c (main): Adapt to newer exclude.c: add EXCLUDE_WILDCARDS as
+ the new "option" argument in calls to add_exclude and add_exclude_file.
+
+ build: get more lib/* files from gnulib, adjust savedir
+ * bootstrap.conf (gnulib_modules): Add the following:
+ closeout exclude hard-locale isdir strtoumax.
+ * lib/.gitignore, m4/.gitignore: Update.
+ * lib/closeout.c, lib/closeout.h: Remove.
+ * lib/exclude.c, lib/exclude.h: Remove.
+ * lib/hard-locale.c, lib/hard-locale.h: Remove.
+ * lib/strtoumax.c: Remove.
+ * lib/isdir.c: Remove.
+ * lib/Makefile.am: Remove here, too.
+ * lib/savedir.c: Adapt to new exclude module:
+ s/excluded_filename/excluded_file_name/ and remove 3rd argument.
+
+ build: update gnulib submodule to latest
+
+ maint: generate ChangeLog from git logs
+ * Makefile.am (dist-hook, gen-ChangeLog): New rules.
+ * bootstrap.conf (gnulib_modules): Add gitlog-to-changelog.
+ Ensure that ChangeLog exists.
+ * ChangeLog-2009: Rename from ChangeLog.
+ * ChangeLog: Remove file.
+ * .gitignore: Add ChangeLog.
+
+ maint: list gnulib modules one per line
+ * bootstrap.conf (gnulib_modules): List them one per line.
+
+2009-11-29 Tony Abou-Assaleh <taa@acm.org>
+
+ Acknowledge new maintainers, update README-alpha
+ * AUTHORS: Add new maintainers.
+ * THANKS: Likewise.
+ * README-alpha: Change CVS references to Git.