summaryrefslogtreecommitdiffstats
path: root/src/sed/NEWS
blob: 233ffef7f7e755972a99db8e5f9c1afd3d9ffc7a (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
Sed 4.1.5

* fix parsing of a negative character class not including a closed bracket,
  like [^]] or [^]a-z].

* fix parsing of [ inside an y command, like y/[/A/.

* output the result of commands a, r, R when a q command is found.

----------------------------------------------------------------------------
Sed 4.1.4

* \B correctly means "not on a word boundary" rather than "inside a word"

* bugfixes for platform without internationalization

* more thorough testing framework for tarballs (`make full-distcheck')

----------------------------------------------------------------------------
Sed 4.1.3

* regex addresses do not use leftmost-longest matching.  In other words,
  /.\+/ only looks for a single character, and does not try to find as
  many of them as possible like it used to do.

* added a note to BUGS and the manual about changed interpretation
  of `s|abc\|def||', and about localization issues.

* fixed --disable-nls build problems on Solaris.

* fixed `make check' in non-English locales.

* `make check' tests the regex library by default if the included regex
  is used (regex tests had to be enabled separately up to now).

----------------------------------------------------------------------------
Sed 4.1.2

* fix bug in 'y' command in multi-byte character sets

* fix severe bug in parsing of ranges with an embedded open bracket

* fix off-by-one error when printing a "bad command" error

----------------------------------------------------------------------------
Sed 4.1.1

* preserve permissions of in-place edited files

* yield an error when running -i on terminals or other non regular files

* do not interpret - as stdin when running in in-place editing mode

* fix bug that prevented 's' command modifiers from working

----------------------------------------------------------------------------
Sed 4.1

* // matches the last regular expression even in POSIXLY_CORRECT mode.

* change the way we treat lines which are not terminated by a newline.
Such lines are printed without the terminating newline (as before)
but as soon as more text is sent to the same output stream, the
missing newline is printed, so that the two lines don't concatenate.
The behavior is now independent from POSIXLY_CORRECT because POSIX
actually has undefined behavior in this case, and the new implementation
arguably gives the ``least expected surprise''.  Thanks to Stepan
Kasal for the implementation.

* documentation improvements, with updated references to the POSIX.2
specification

* error messages on I/O errors are better, and -i does not leave temporary
files around (e.g. when running ``sed -i'' on a directory).

* escapes are accepted in the y command (for example: y/o/\n/ transforms
o's into newlines)

* -i option tries to set the owner and group to the same as the input file

* `L' command is deprecated and will be removed in sed 4.2.

* line number addresses are processed differently -- this is supposedly
conformant to POSIX and surely more idiot-proof.  Line number addresses
are not affected by jumping around them: they are activated and
deactivated exactly where the script says, while previously
    5,8b
    1,5d
would actually delete lines 1,2,3,4 and 9 (!).

* multibyte characters are taken in consideration to compute the
operands of s and y, provided you set LC_CTYPE correctly.  They are
also considered by \l, \L, \u, \U, \E.

* [\n] matches either backslash or 'n' when POSIXLY_CORRECT.

* new option --posix, disables all GNU extensions.  POSIXLY_CORRECT only
disables GNU extensions that violate the POSIX standard.

* options -h and -V are not supported anymore, use --help and --version.

* removed documentation for \s and \S which worked incorrectly

* restored correct behavior for \w and \W: match [[:alnum:]_] and
[^[:alnum:]_] (they used to match [[:alpha:]_] and [^[:alpha:]_]

* the special address 0 can only be used in 0,/RE/ or 0~STEP addresses;
other cases give an error (you are hindering portability for no reason
if specifying 0,N and you are giving a dead command if specifying 0
alone).

* when a \ is used to escape the character that would terminate an operand
of the s or y commands, the backslash is removed before the regex is
compiled.  This is left undefined by POSIX; this behavior makes `s+x\+++g'
remove occurrences of `x+', consistently with `s/x\///g'.  (However, if
you enjoy yourself trying `s*x\***g', sed will use the `x*' regex, and you
won't be able to pass down `x\*' while using * as the delimiter; ideas on
how to simplify the parser in this respect, and/or gain more coherent
semantics, are welcome).


----------------------------------------------------------------------------
Sed 4.0.9

* 0 address behaves correctly in single-file (-i and -s) mode.

* documentation improvements.

* tested with many hosts and compilers.

* updated regex matcher from upstream, with many bugfixes and speedups.

* the `N' command's feature that is detailed in the BUGS file was disabled
by the first change below in sed 4.0.8.  The behavior has now been
restored, and is only enabled if POSIXLY_CORRECT behavior is not
requested.

----------------------------------------------------------------------------
Sed 4.0.8

* fix `sed n' printing the last line twice.

* fix incorrect error message for invalid character classes.

* fix segmentation violation with repeated empty subexpressions.

* fix incorrect parsing of ^ after escaped (.

* more comprehensive test suite (and with many expected failures...)

----------------------------------------------------------------------------
Sed 4.0.7

* VPATH builds working on non-glibc machines

* fixed bug in s///Np: was printing even if less than N matches were
found.

* fixed infinite loop on s///N when LHS matched a null string and
there were not enough matches in pattern space

* behavior of s///N is consistent with s///g when the LHS can match
a null string (and the infinite loop did not happen :-)

* updated some translations

----------------------------------------------------------------------------
Sed 4.0.6

* added parameter to `v' for the version of sed that is expected.

* configure switch --without-included-regex to use the system regex matcher

* fix for -i option under Cygwin

----------------------------------------------------------------------------
Sed 4.0.5

* portability fixes

* improvements to some error messages (e.g. y/abc/defg/ incorrectly said
`excess characters after command' instead of `y arguments have different
lengths')

* `a', `i', `l', `L', `r' accept two addresses except in POSIXLY_CORRECT
mode.  Only `q' and `Q' do not accept two addresses in standard (GNU) mode.

----------------------------------------------------------------------------
Sed 4.0.4

* documentation fixes

* update regex matcher

----------------------------------------------------------------------------
Sed 4.0.3

* fix packaging problem (two missing translation catalogs)

----------------------------------------------------------------------------
Sed 4.0.2

* more translations

* fix build problems (vpath builds and bootstrap builds)

----------------------------------------------------------------------------
Sed 4.0.1

* Remove last vestiges of super-sed

* man page automatically built

* more translations provided

* portability improvements

----------------------------------------------------------------------------
Sed 4.0

* Update regex matcher

----------------------------------------------------------------------------
Sed 3.96

* `y' command supports multibyte character sets

* Update regex matcher

----------------------------------------------------------------------------
Sed 3.95

* `R' command reads a single line from a file.

* CR-LF pairs are always ignored under Windows, even if (under Cygwin)
a disk is mounted as binary.

* More attention to errors on stdout

* New `W' command to write first line of pattern space to a file

* Can customize line wrap width on single `l' commands

* `L' command formats and reflows paragraphs like `fmt' does.

* The test suite makefiles are better organized (this change is
transparent however).

* Compiles and bootstraps out-of-the-box under MinGW32 and Cygwin.

* Optimizes cases when pattern space is truncated at its start or at
its end by `D' or by a substitution command with an empty RHS.
For example scripts like this,

    seq 1 10000 | tr \\n \  | ./sed ':a; s/^[0-9][0-9]* //; ta'

whose behavior was quadratic with previous versions of sed, have
now linear behavior.

* New command `e' to pipe the output of a command into the output
of sed.

* New option `e' to pass the output of the `s' command through the
Bourne shell and get the result into pattern space.

* Switched to obstacks in the parser -- less memory-related bugs
(there were none AFAIK but you never know) and less memory usage.

* New option -i, to support in-place editing a la Perl.  Usually one
had to use ed or, for more complex tasks, resort to Perl; this is
not necessary anymore.

* Dumped buffering code.  The performance loss is 10%, but it caused
bugs in systems with CRLF termination.  The current solution is
not definitive, though.

* Bug fix: Made the behavior of s/A*/x/g (i.e. `s' command with a
possibly empty LHS) more consistent:

       pattern               GNU sed 3.x       GNU sed 4.x
        B                      xBx               xBx
        BC                     xBxCx             xBxCx
        BAC                    xBxxCx            xBxCx
        BAAC                   xBxxCx            xBxCx

* Bug fix: the // empty regular expressions now refers to the last
regular expression that was matched, rather than to the last
regular expression that was compiled.  This richer behavior seems
to be the correct one (albeit neither one is POSIXLY_CORRECT).

* Check for invalid backreferences in the RHS of the `s' command
(e.g. s/1234/\1/)

* Support for \[lLuUE] in the RHS of the `s' command like in Perl.

* New regular expression matcher

* Bug fix: if a file was redirected to be stdin, sed did not consume
it.  So
      (sed d; sed G) < TESTFILE

double-spaced TESTFILE, while the equivalent `useless use of cat'
      cat TESTFILE | (sed d; sed G)

printed nothing (which is the correct behavior).  A test for this
bug was added to the test suite.

* The documentation is now much better, with a few examples provided,
and a thorough description of regular expressions.  The manual often
refers to "GNU extensions", but if they are described here they are
specific to this version.

* Documented command-line option:
  -r, --regexp-extended
    Use extended regexps -- e.g. (abc+) instead of \(abc\+\)

* Added feature to the `w' command and to the `w' option of the `s'
command: if the file name is /dev/stderr, it means the standard
error (inspired by awk); and similarly for /dev/stdout.  This is
disabled if POSIXLY_CORRECT is set.

* Added `m' and `M' modifiers to `s' command for multi-line
matching (Perl-style); in addresses, only `M' works.

* Added `Q' command for `silent quit'; added ability to pass
an exit code from a sed script to the caller.

* Added `T' command for `branch if failed'.

* Added `v' command, which is a do-nothing intended to fail on
seds that do not support GNU sed 4.0's extensions.

----------------------------------------------------------------------------
Sed 3.02.80

* Started new version nomenclature for pre-3.03 releases.  (I'm being
pessimistic in assuming that .90 won't give me enough breathing room.)

* Bug fixes: the regncomp()/regnexec() interfaces proved to be inadequate to
properly handle expressions such as "s/\</#/g".  Re-abstracted the regex
code in the sed/ tree, and now use the re_search_2() interface to the GNU
regex routines.  This change also fixed a bug where /./ did not match the
NUL character.  Had the glibc folk fix a bug in lib/regex.c where
's/0*\([0-9][0-9]\)/X\1X/' failed to match on input "002".

* Added new command-line options:
  -u, --unbuffered
    Do not attempt to read-ahead more than required; do not buffer stdout.
  -l N, --line-length=N
    Specify the desired line-wrap length for the `l' command.
    A length of "0" means "never wrap".

* New internationalization translations added: fr ru de it el sk pt_BR sv
(plus nl from 3.02a).

* The s/// command now understands the following escapes
(in both halves):
	\a	an "alert" (BEL)
	\f	a form-feed
	\n	a newline
	\r	a carriage-return
	\t	a horizontal tab
	\v	a vertical tab
	\oNNN	a character with the octal value NNN
	\dNNN	a character with the decimal value NNN
	\xNN	a character with the hexadecimal value NN
This behavior is disabled if POSIXLY_CORRECT is set, at least for the
time being (until I can be convinced that this behavior does not violate
the POSIX standard).  (Incidentally, \b (backspace) was omitted because
of the conflict with the existing "word boundary" meaning. \ooo octal
format was omitted because of the conflict with backreference syntax.)

* If POSIXLY_CORRECT is set, the empty RE // now is the null match
instead of "repeat the last REmatch".  As far as I can tell
this behavior is mandated by POSIX, but it would break too many
legacy sed scripts to blithely change GNU sed's default behavior.

----------------------------------------------------------------------------
Sed 3.02a

* Added internationalization support, and an initial (already out of date)
set of Dutch message translations (both provided by Erick Branderhorst).

* Added support for scripts like:
 sed -e 1ifoo -e '$abar'
(note no need for \ <newline> after a, i, and c commands).
Also, conditionally (on NO_INPUT_INDENT) added
experimental support for skipping leading whitespace on
each {a,i,c} input line.

* Added addressing of the form:
 /foo/,+5 p (print from foo to 5th line following)
 /foo/,~5 p (print from foo to next line whose line number is a multiple of 5)
The first address of these can be any of the previously existing
addressing types; the +N and ~N forms are only allowed as the
second address of a range.

* Added support for pseudo-address "0" as the first address in an
address-range, simplifying scripts which happen to match the end
address on the first line of input.  For example, a script
which deletes all lines from the beginning of the file to the
first line which contains "foo" is now simply "sed 0,/foo/d",
whereas before one had to go through contortions to deal with
the possibility that "foo" might appear on the first line of
the input.

* Made NUL characters in regexps work "correctly" --- i.e., a NUL
in a RE matches a NUL; it does not prematurely terminate the RE.
(This only works in -f scripts, as the POSIX.1 exec*() interface
only passes NUL-terminated strings, and so sed will only be able
to see up to the first NUL in any -e scriptlet.)

* Wherever a `;' is accepted as a command terminator, also allow a `}'
or a `#' to appear.  (This allows for less cluttered-looking scripts.)

* Lots of internal changes that are only relevant to source junkies
and development testing.  Some of which might cause imperceptible
performance improvements.

----------------------------------------------------------------------------
Sed 3.02

* Fixed a bug in the parsing of character classes (e.g., /[[:space:]]/).
Corrected an omission in djgpp/Makefile.am and an improper dependency
in testsuite/Makefile.am.

----------------------------------------------------------------------------
Sed 3.01

* This version of sed mainly contains bug fixes and portability
enhancements, plus performance enhancements related to sed's handling
of input files.  Due to excess performance penalties, I have reverted
(relative to 3.00) to using regex.c instead of the rx package for
regular expression handling, at the expense of losing true POSIX.2
BRE compatibility.  However, performance related to regular expression
handling *still* needs a fair bit of work.

* One new feature has been added: regular expressions may be followed
with an "I" directive ("i" was taken [the "i"nsert command]) to
indicate that the regexp should be matched in a case-insensitive
manner.  Also of note are a new organization to the source code,
new documentation, and a new maintainer.

----------------------------------------------------------------------------
Sed 3.0

* This version of sed passes the new test-suite donated by
Jason Molenda.

* Overall performance has been improved in the following sense: Sed 3.0
is often slightly slower than sed 2.05.  On a few scripts, though, sed
2.05 was so slow as to be nearly useless or to use up unreasonable
amounts of memory.  These problems have been fixed and in such cases,
sed 3.0 should have acceptable performance.