This is ../../doc/sed.info, produced by makeinfo version 4.5 from
../../doc/sed.texi.

INFO-DIR-SECTION Text creation and manipulation
START-INFO-DIR-ENTRY
* sed: (sed).                   Stream EDitor.

END-INFO-DIR-ENTRY

This file documents version 4.1.5 of GNU `sed', a stream editor.

   Copyright (C) 1998, 1999, 2001, 2002, 2003, 2004 Free Software
Foundation, Inc.

   This document is released under the terms of the GNU Free
Documentation License as published by the Free Software Foundation;
either version 1.1, or (at your option) any later version.

   You should have received a copy of the GNU Free Documentation
License along with GNU `sed'; see the file `COPYING.DOC'.  If not,
write to the Free Software Foundation, 59 Temple Place - Suite 330,
Boston, MA 02110-1301, USA.

   There are no Cover Texts and no Invariant Sections; this text, along
with its equivalent in the printed manual, constitutes the Title Page.

File: sed.info,  Node: Print bash environment,  Next: Reverse chars of lines,  Prev: Rename files to lower case,  Up: Examples

Print `bash' Environment
========================

   This script strips the definition of the shell functions from the
output of the `set' Bourne-shell command.

     #!/bin/sh
     
     set | sed -n '
     :x
     
     # if no occurrence of "=()" print and load next line
     /=()/! { p; b; }
     / () $/! { p; b; }
     
     # possible start of functions section
     # save the line in case this is a var like FOO="() "
     h
     
     # if the next line has a brace, we quit because
     # nothing comes after functions
     n
     /^{/ q
     
     # print the old line
     x; p
     
     # work on the new line now
     x; bx
     '


File: sed.info,  Node: Reverse chars of lines,  Next: tac,  Prev: Print bash environment,  Up: Examples

Reverse Characters of Lines
===========================

   This script can be used to reverse the position of characters in
lines.  The technique moves two characters at a time, hence it is
faster than more intuitive implementations.

   Note the `tx' command before the definition of the label.  This is
often needed to reset the flag that is tested by the `t' command.

   Imaginative readers will find uses for this script.  An example is
reversing the output of `banner'.(1)

     #!/usr/bin/sed -f
     
     /../! b
     
     # Reverse a line.  Begin embedding the line between two newlines
     s/^.*$/\
     &\
     /
     
     # Move first character to the end.  The regexp matches until
     # there are zero or one characters between the markers
     tx
     :x
     s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
     tx
     
     # Remove the newline markers
     s/\n//g
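
   To try the script without saving it to a file, the same commands can
be wrapped in a shell function.  This is only a sketch: the name
`reverse_line' is invented for the illustration, and the replacement
text is kept at the left margin so that no stray blanks end up in
pattern space.

```shell
# reverse_line is a hypothetical helper, not part of sed.
reverse_line() {
  printf '%s\n' "$1" | sed '
    /../! b
    s/^.*$/\
&\
/
    tx
    :x
    s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
    tx
    s/\n//g
  '
}

reverse_line hello    # prints: olleh
```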

   ---------- Footnotes ----------

   (1) This requires another script to pad the output of banner; for
example

     #! /bin/sh
     
     banner -w $1 $2 $3 $4 |
       sed -e :a -e '/^.\{0,'$1'\}$/ { s/$/ /; ba; }' |
       ~/sedscripts/reverseline.sed


File: sed.info,  Node: tac,  Next: cat -n,  Prev: Reverse chars of lines,  Up: Examples

Reverse Lines of Files
======================

   This one begins a series of totally useless (yet interesting)
scripts emulating various Unix commands.  This, in particular, is a
`tac' workalike.

   Note that on implementations other than GNU `sed' this script might
easily overflow internal buffers.

     #!/usr/bin/sed -nf
     
     # reverse all lines of input, i.e. first line becomes last, ...
     
     # from the second line, the buffer (which contains all previous lines)
     # is *appended* to current line, so, the order will be reversed
     1! G
     
     # on the last line we're done -- print everything
     $ p
     
     # store everything on the buffer again
     h
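
   The three commands also fit on one line, which is handy for a quick
test from the shell.  A minimal sketch, with `reverse_lines' being a
name made up for the example:

```shell
# reverse_lines is a hypothetical wrapper around the script above.
reverse_lines() {
  sed -n '1!G;$p;h'
}

printf 'one\ntwo\nthree\n' | reverse_lines
# prints:
#   three
#   two
#   one
```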


File: sed.info,  Node: cat -n,  Next: cat -b,  Prev: tac,  Up: Examples

Numbering Lines
===============

   This script replaces `cat -n'; in fact it formats its output exactly
like GNU `cat' does.

   Of course this is completely useless, for two reasons: first,
because somebody else did it in C; second, because the following
Bourne-shell script could be used for the same purpose and would be
much faster:

     #! /bin/sh
     sed -e "=" "$@" | sed -e '
       s/^/      /
       N
       s/^ *\(......\)\n/\1  /
     '

   It uses `sed' to print the line number, then groups lines two by two
using `N'.  Of course, this script does not teach as much as the one
presented below.

   The algorithm used for incrementing uses both buffers, so the line
is printed as soon as possible and then discarded.  The number is split
so that changing digits go in a buffer and unchanged ones go in the
other; the changed digits are modified in a single step (using a `y'
command).  The line number for the next line is then composed and
stored in the hold space, to be used in the next iteration.

     #!/usr/bin/sed -nf
     
     # Prime the pump on the first line
     x
     /^$/ s/^.*$/1/
     
     # Add the correct line number before the pattern
     G
     h
     
     # Format it and print it
     s/^/      /
     s/^ *\(......\)\n/\1  /p
     
     # Get the line number from hold space; add a zero
     # if we're going to add a digit on the next line
     g
     s/\n.*$//
     /^9*$/ s/^/0/
     
     # separate changing/unchanged digits with an x
     s/.9*$/x&/
     
     # keep changing digits in hold space
     h
     s/^.*x//
     y/0123456789/1234567890/
     x
     
     # keep unchanged digits in pattern space
     s/x.*$//
     
     # compose the new number, remove the newline implicitly added by G
     G
     s/\n//
     h
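
   The increment step can be studied in isolation.  The sketch below
applies the same commands to a single number instead of a running
counter; the function name `next_number' is an assumption made for the
example, and the comments describe what each buffer holds.

```shell
# next_number is a hypothetical helper: it prints its argument plus one.
next_number() {
  printf '%s\n' "$1" | sed '
    # only nines: prepend a zero so there is a digit to carry into
    /^9*$/ s/^/0/
    # put an x before the changing digits (one digit plus trailing nines)
    s/.9*$/x&/
    # save the marked number, keep the changing digits and bump them
    h
    s/^.*x//
    y/0123456789/1234567890/
    # changed digits go to hold space, the marked number comes back
    x
    # keep the unchanged digits, then append the changed ones
    s/x.*$//
    G
    s/\n//
  '
}

next_number 129    # prints: 130
next_number 99     # prints: 100
```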


File: sed.info,  Node: cat -b,  Next: wc -c,  Prev: cat -n,  Up: Examples

Numbering Non-blank Lines
=========================

   Emulating `cat -b' is almost the same as `cat -n'--we only have to
select which lines are to be numbered and which are not.

   The part that is common to this script and the previous one is not
commented to show how important it is to comment `sed' scripts
properly...

     #!/usr/bin/sed -nf
     
     /^$/ {
       p
       b
     }
     
     # Same as cat -n from now on
     x
     /^$/ s/^.*$/1/
     G
     h
     s/^/      /
     s/^ *\(......\)\n/\1  /p
     x
     s/\n.*$//
     /^9*$/ s/^/0/
     s/.9*$/x&/
     h
     s/^.*x//
     y/0123456789/1234567890/
     x
     s/x.*$//
     G
     s/\n//
     h


File: sed.info,  Node: wc -c,  Next: wc -w,  Prev: cat -b,  Up: Examples

Counting Characters
===================

   This script shows another way to do arithmetic with `sed'.  In this
case we have to add possibly large numbers, so implementing this by
successive increments would not be feasible (and possibly even more
complicated to contrive than this script).

   The approach is to map numbers to letters, kind of an abacus
implemented with `sed'.  `a's are units, `b's are tens and so on: we
simply add the number of characters on the current line as units, and
then propagate the carry to tens, hundreds, and so on.

   As usual, running totals are kept in hold space.

   On the last line, we convert the abacus form back to decimal.  For
the sake of variety, this is done with a loop rather than with some 80
`s' commands(1): first we convert units, removing `a's from the number;
then we rotate letters so that tens become `a's, and so on until no
more letters remain.

     #!/usr/bin/sed -nf
     
     # Add n+1 a's to hold space (+1 is for the newline)
     s/./a/g
     H
     x
     s/\n/a/
     
     # Do the carry.  The t's and b's are not necessary,
     # but they do speed up the thing
     t a
     : a;  s/aaaaaaaaaa/b/g; t b; b done
     : b;  s/bbbbbbbbbb/c/g; t c; b done
     : c;  s/cccccccccc/d/g; t d; b done
     : d;  s/dddddddddd/e/g; t e; b done
     : e;  s/eeeeeeeeee/f/g; t f; b done
     : f;  s/ffffffffff/g/g; t g; b done
     : g;  s/gggggggggg/h/g; t h; b done
     : h;  s/hhhhhhhhhh//g
     
     : done
     $! {
       h
       b
     }
     
     # On the last line, convert back to decimal
     
     : loop
     /a/! s/[b-h]*/&0/
     s/aaaaaaaaa/9/
     s/aaaaaaaa/8/
     s/aaaaaaa/7/
     s/aaaaaa/6/
     s/aaaaa/5/
     s/aaaa/4/
     s/aaa/3/
     s/aa/2/
     s/a/1/
     
     : next
     y/bcdefgh/abcdefg/
     /[a-h]/ b loop
     p
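
   The conversion loop is easier to follow when fed an abacus string
directly.  In this sketch `cbbaaa' stands for one hundred, two tens and
three units; the function name `abacus_to_decimal' and the sample
strings are assumptions made for the illustration.

```shell
# abacus_to_decimal is a hypothetical helper that runs only the final loop.
abacus_to_decimal() {
  printf '%s\n' "$1" | sed -n '
    :loop
    # no units left: this position is a zero digit
    /a/! s/[b-h]*/&0/
    s/aaaaaaaaa/9/
    s/aaaaaaaa/8/
    s/aaaaaaa/7/
    s/aaaaaa/6/
    s/aaaaa/5/
    s/aaaa/4/
    s/aaa/3/
    s/aa/2/
    s/a/1/
    # rotate letters: tens become units, hundreds become tens, ...
    y/bcdefgh/abcdefg/
    /[a-h]/ b loop
    p
  '
}

abacus_to_decimal cbbaaa    # prints: 123
abacus_to_decimal caaa      # prints: 103
```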

   ---------- Footnotes ----------

   (1) Some implementations have a limit of 199 commands per script


File: sed.info,  Node: wc -w,  Next: wc -l,  Prev: wc -c,  Up: Examples

Counting Words
==============

   This script is almost the same as the previous one, once each of the
words on the line is converted to a single `a' (in the previous script
each letter was changed to an `a').

   It is interesting that real `wc' programs have optimized loops for
`wc -c', so they are much slower at counting words than at counting
characters.  This script's bottleneck, instead, is arithmetic, and
hence the word-counting one is faster (it has to manage smaller
numbers).

   Again, the common parts are not commented to show the importance of
commenting `sed' scripts.

     #!/usr/bin/sed -nf
     
     # Convert words to a's ([[:blank:]] matches a space or a tab)
     s/[[:blank:]][[:blank:]]*/ /g
     s/^/ /
     s/ [^ ][^ ]*/a /g
     s/ //g
     
     # Append them to hold space
     H
     x
     s/\n//
     
     # From here on it is the same as in wc -c.
     /aaaaaaaaaa/! bx;   s/aaaaaaaaaa/b/g
     /bbbbbbbbbb/! bx;   s/bbbbbbbbbb/c/g
     /cccccccccc/! bx;   s/cccccccccc/d/g
     /dddddddddd/! bx;   s/dddddddddd/e/g
     /eeeeeeeeee/! bx;   s/eeeeeeeeee/f/g
     /ffffffffff/! bx;   s/ffffffffff/g/g
     /gggggggggg/! bx;   s/gggggggggg/h/g
     s/hhhhhhhhhh//g
     :x
     $! { h; b; }
     :y
     /a/! s/[b-h]*/&0/
     s/aaaaaaaaa/9/
     s/aaaaaaaa/8/
     s/aaaaaaa/7/
     s/aaaaaa/6/
     s/aaaaa/5/
     s/aaaa/4/
     s/aaa/3/
     s/aa/2/
     s/a/1/
     y/bcdefgh/abcdefg/
     /[a-h]/ by
     p


File: sed.info,  Node: wc -l,  Next: head,  Prev: wc -w,  Up: Examples

Counting Lines
==============

   No strange things are done now, because `sed' gives us `wc -l'
functionality for free!!! Look:

     #!/usr/bin/sed -nf
     $=


File: sed.info,  Node: head,  Next: tail,  Prev: wc -l,  Up: Examples

Printing the First Lines
========================

   This script is probably the simplest useful `sed' script.  It
displays the first 10 lines of input; the number of displayed lines is
right before the `q' command.

     #!/usr/bin/sed -f
     10q
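
   Since the count is simply the address of `q', it can be supplied by
the shell.  A small sketch, assuming a wrapper called `first_lines'
(a made-up name) that defaults to 10 lines:

```shell
# first_lines is a hypothetical wrapper; $1 is the number of lines to keep.
first_lines() {
  sed "${1:-10}q"
}

printf '1\n2\n3\n4\n5\n' | first_lines 2
# prints:
#   1
#   2
```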


File: sed.info,  Node: tail,  Next: uniq,  Prev: head,  Up: Examples

Printing the Last Lines
=======================

   Printing the last N lines rather than the first is more complex but
indeed possible.  N is encoded in the second line, before the bang
character.

   This script is similar to the `tac' script in that it keeps the
final output in the hold space and prints it at the end:

     #!/usr/bin/sed -nf
     
     1! {; H; g; }
     1,10 !s/[^\n]*\n//
     $p
     h

   Mainly, the script keeps a window of 10 lines and slides it by
adding a line and deleting the oldest (the substitution command on the
second line works like a `D' command but does not restart the loop).
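
   The window size can be passed in from the shell to watch the
technique at work on a short input.  This is a sketch only: the
function name `last_lines' is invented here, and the `\n' inside the
bracket expression relies on GNU `sed'.

```shell
# last_lines is a hypothetical wrapper; $1 is the size of the window.
last_lines() {
  sed -n '
    1! { H; g; }
    1,'"$1"' !s/[^\n]*\n//
    $p
    h
  '
}

printf '1\n2\n3\n4\n5\n' | last_lines 3
# prints:
#   3
#   4
#   5
```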

   The "sliding window" technique is a very powerful way to write
efficient and complex `sed' scripts, because commands like `P' would
require a lot of work if implemented manually.

   To introduce the technique, which is fully demonstrated in the rest
of this chapter and is based on the `N', `P' and `D' commands, here is
an implementation of `tail' using a simple "sliding window."

   This looks complicated but in fact it works the same way as the
last script: after we have kicked in the appropriate number of lines,
however, we stop using the hold space to keep inter-line state, and
instead use `N' and `D' to slide pattern space by one line:

     #!/usr/bin/sed -f
     
     1h
     2,10 {; H; g; }
     $q
     1,9d
     N
     D

   Note how the first, second and fourth lines are inactive after the
first ten lines of input.  After that, all the script does is exit on
the last line of input, append the next input line to pattern space,
and remove the first line.


File: sed.info,  Node: uniq,  Next: uniq -d,  Prev: tail,  Up: Examples

Make Duplicate Lines Unique
===========================

   This is an example of the art of using the `N', `P' and `D'
commands, probably the most difficult to master.

     #!/usr/bin/sed -f
     h
     
     :b
     # On the last line, print and exit
     $b
     N
     /^\(.*\)\n\1$/ {
         # The two lines are identical.  Undo the effect of
         # the `N' command.
         g
         bb
     }
     
     # If the `N' command had added the last line, print and exit
     $b
     
     # The lines are different; print the first and go
     # back working on the second.
     P
     D

   As you can see, we maintain a 2-line window using `P' and `D'.  This
technique is often used in advanced `sed' scripts.


File: sed.info,  Node: uniq -d,  Next: uniq -u,  Prev: uniq,  Up: Examples

Print Duplicated Lines of Input
===============================

   This script prints only duplicated lines, like `uniq -d'.

     #!/usr/bin/sed -nf
     
     $b
     N
     /^\(.*\)\n\1$/ {
         # Print the first of the duplicated lines
         s/.*\n//
         p
     
         # Loop until we get a different line
         :b
         $b
         N
         /^\(.*\)\n\1$/ {
             s/.*\n//
             bb
         }
     }
     
     # The last line cannot be followed by duplicates
     $b
     
     # Found a different one.  Leave it alone in the pattern space
     # and go back to the top, hunting its duplicates
     D


File: sed.info,  Node: uniq -u,  Next: cat -s,  Prev: uniq -d,  Up: Examples

Remove All Duplicated Lines
===========================

   This script prints only unique lines, like `uniq -u'.

     #!/usr/bin/sed -f
     
     # Search for a duplicate line --- until that, print what you find.
     $b
     N
     /^\(.*\)\n\1$/ ! {
         P
         D
     }
     
     :c
     # Got two equal lines in pattern space.  At the
     # end of the file we simply exit
     $d
     
     # Else, we keep reading lines with `N' until we
     # find a different one
     s/.*\n//
     N
     /^\(.*\)\n\1$/ {
         bc
     }
     
     # Remove the last instance of the duplicate line
     # and go back to the top
     D


File: sed.info,  Node: cat -s,  Prev: uniq -u,  Up: Examples

Squeezing Blank Lines
=====================

   As a final example, here are three scripts, of increasing complexity
and speed, that implement the same function as `cat -s', that is
squeezing blank lines.

   The first leaves a blank line at the beginning and end if there are
some already.

     #!/usr/bin/sed -f
     
     # on empty lines, join with next
     # Note there is a star in the regexp
     :x
     /^\n*$/ {
     N
     bx
     }
     
     # now, squeeze all '\n', this can be also done by:
     # s/^\(\n\)*/\1/
     s/\n*/\
     /

   This one is a bit more complex and removes all empty lines at the
beginning.  It does leave a single blank line at the end if one was there.

     #!/usr/bin/sed -f
     
     # delete all leading empty lines
     1,/^./{
     /./!d
     }
     
     # on an empty line we remove it and all the following
     # empty lines, but one
     :x
     /./!{
     N
     s/^\n$//
     tx
     }

   This removes leading and trailing blank lines.  It is also the
fastest.  Note that loops are completely done with `n' and `b', without
relying on `sed' to restart the script automatically at the end of
a line.

     #!/usr/bin/sed -nf
     
     # delete all (leading) blanks
     /./!d
     
     # get here: so there is a non empty
     :x
     # print it
     p
     # get next
     n
     # got chars? print it again, etc...
     /./bx
     
     # no, don't have chars: got an empty line
     :z
     # get next, if last line we finish here so no trailing
     # empty lines are written
     n
     # also empty? then ignore it, and get next... this will
     # remove ALL empty lines
     /./!bz
     
     # all empty lines were deleted/ignored, but we have a non empty.  As
     # what we want to do is to squeeze, insert a blank line artificially
     i\
     
     bx


File: sed.info,  Node: Limitations,  Next: Other Resources,  Prev: Examples,  Up: Top

GNU `sed''s Limitations and Non-limitations
*******************************************

   For those who want to write portable `sed' scripts, be aware that
some implementations have been known to limit line lengths (for the
pattern and hold spaces) to be no more than 4000 bytes.  The POSIX
standard specifies that conforming `sed' implementations shall support
at least 8192 byte line lengths.  GNU `sed' has no built-in limit on
line length; as long as it can `malloc()' more (virtual) memory, you
can feed or construct lines as long as you like.

   However, recursion is used to handle subpatterns and indefinite
repetition.  This means that the available stack space may limit the
size of the buffer that can be processed by certain patterns.


File: sed.info,  Node: Other Resources,  Next: Reporting Bugs,  Prev: Limitations,  Up: Top

Other Resources for Learning About `sed'
****************************************

   In addition to several books that have been written about `sed'
(either specifically or as chapters in books which discuss shell
programming), one can find out more about `sed' (including suggestions
of a few books) from the FAQ for the `sed-users' mailing list,
available from any of:
      `http://www.student.northpark.edu/pemente/sed/sedfaq.html'
      `http://sed.sf.net/grabbag/tutorials/sedfaq.html'

   Also of interest are
`http://www.student.northpark.edu/pemente/sed/index.htm' and
`http://sed.sf.net/grabbag', which include `sed' tutorials and other
`sed'-related goodies.

   The `sed-users' mailing list itself is maintained by Sven Guckes.  To
subscribe, visit `http://groups.yahoo.com' and search for the
`sed-users' mailing list.


File: sed.info,  Node: Reporting Bugs,  Next: Extended regexps,  Prev: Other Resources,  Up: Top

Reporting Bugs
**************

   Email bug reports to <bonzini@gnu.org>.  Be sure to include the word
"sed" somewhere in the `Subject:' field.  Also, please include the
output of `sed --version' in the body of your report if at all possible.

   Please do not send a bug report like this:

     while building frobme-1.3.4
     $ configure
     error--> sed: file sedscr line 1: Unknown option to 's'

   If GNU `sed' doesn't configure your favorite package, take a few
extra minutes to identify the specific problem and make a stand-alone
test case.  Unlike other programs such as C compilers, making such test
cases for `sed' is quite simple.

   A stand-alone test case includes all the data necessary to perform
the test, and the specific invocation of `sed' that causes the problem.
The smaller a stand-alone test case is, the better.  A test case should
not involve something as far removed from `sed' as "try to configure
frobme-1.3.4".  Yes, that is in principle enough information to look
for the bug, but that is not a very practical prospect.

   Here are a few commonly reported bugs that are not bugs.

`N' command on the last line
     Most versions of `sed' exit without printing anything when the `N'
     command is issued on the last line of a file.  GNU `sed' prints
     pattern space before exiting unless of course the `-n' command
     switch has been specified.  This choice is by design.

     For example, the behavior of
          sed N foo bar

     would depend on whether foo has an even or an odd number of
     lines(1).  Or, when writing a script to read the next few lines
     following a pattern match, traditional implementations of `sed'
     would force you to write something like
          /foo/{ $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N }

     instead of just
          /foo/{ N;N;N;N;N;N;N;N;N; }

     In any case, the simplest workaround is to use `$d;N' in scripts
     that rely on the traditional behavior, or to set the
     `POSIXLY_CORRECT' variable to a non-empty value.
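
     Guarding `N' with `$!' gives the same output whether the input
     has an even or an odd number of lines.  A minimal sketch, with
     `pair_lines' being a name made up for the example:

```shell
# pair_lines is a hypothetical helper that joins lines two by two.
pair_lines() {
  sed '$!N;s/\n/ /'
}

printf '1\n2\n3\n' | pair_lines
# prints:
#   1 2
#   3
```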

Regex syntax clashes (problems with backslashes)
     `sed' uses the POSIX basic regular expression syntax.  According to
     the standard, the meaning of some escape sequences is undefined in
     this syntax;  notable in the case of `sed' are `\|', `\+', `\?',
     `\`', `\'', `\<', `\>', `\b', `\B', `\w', and `\W'.

     As in all GNU programs that use POSIX basic regular expressions,
     `sed' interprets these escape sequences as special characters.
     So, `x\+' matches one or more occurrences of `x'.  `abc\|def'
     matches either `abc' or `def'.

     This syntax may cause problems when running scripts written for
     other `sed's.  Some `sed' programs have been written with the
     assumption that `\|' and `\+' match the literal characters `|' and
     `+'.  Such scripts must be modified by removing the spurious
     backslashes if they are to be used with modern implementations of
     `sed', like GNU `sed'.

     On the other hand, some scripts use `s|abc\|def||g' to remove
     occurrences of _either_ `abc' or `def'.  While this worked until
     `sed' 4.0.x, newer versions interpret this as removing the string
     `abc|def'.  This is again undefined behavior according to POSIX,
     and this interpretation is arguably more robust: older `sed's, for
     example, required that the regex matcher parsed `\/' as `/' in the
     common case of escaping a slash, which is again undefined
     behavior; the new behavior avoids this, and this is good because
     the regex matcher is only partially under our control.

     In addition, this version of `sed' supports several escape
     characters (some of which are multi-character) to insert
     non-printable characters in scripts (`\a', `\c', `\d', `\o', `\r',
     `\t', `\v', `\x').  These can cause similar problems with scripts
     written for other `sed's.
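
     The cases described above can be checked from the shell.  The
     sketch assumes GNU `sed' newer than 4.0.x; the function names are
     invented for the illustration, and other implementations may
     print something different.

```shell
# Hypothetical helper names; each wraps one of the commands discussed above.
one_or_more() { sed 's/x\+/y/'; }          # \+ means "one or more"
either_word() { sed 's/abc\|def/X/g'; }    # \| means "or"
literal_bar() { sed 's|abc\|def||g'; }     # | is the delimiter, so \| is a literal |

printf 'xxx\n' | one_or_more            # prints: y
printf 'abc def\n' | either_word        # prints: X X
printf 'abc|def abc\n' | literal_bar    # prints a blank followed by: abc
```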

`-i' clobbers read-only files
     In short, `sed -i' will let you delete the contents of a read-only
     file, and in general the `-i' option (*note Invocation: Invoking
     sed.) lets you clobber protected files.  This is not a bug, but
     rather a consequence of how the Unix filesystem works.

     The permissions on a file say what can happen to the data in that
     file, while the permissions on a directory say what can happen to
     the list of files in that directory.  `sed -i' will never open
     for writing a file that is already on disk.  Rather, it will work
     on a temporary file that is finally renamed to the original name:
     if you rename or delete files, you're actually modifying the
     contents of the directory, so the operation depends on the
     permissions of the directory, not of the file.  For this same
     reason, `sed' does not let you use `-i' on a writeable file in a
     read-only directory (but unbelievably nobody reports that as a
     bug...).
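
     The rename can be observed by comparing inode numbers before and
     after the edit.  This is a sketch under a few assumptions: GNU
     `sed', a `mktemp' that supports `-d', and a scratch directory and
     file name made up for the demonstration.

```shell
# All names below (dir, data, inode_of) are hypothetical.
dir=$(mktemp -d) || exit 1
printf 'old\n' > "$dir/data"
chmod a-w "$dir/data"                 # the file itself is read-only

# Print the inode number of the file named by $1.
inode_of() { set -- $(ls -i "$1"); printf '%s\n' "$1"; }

before=$(inode_of "$dir/data")
sed -i 's/old/new/' "$dir/data"       # allowed: the directory is writable
after=$(inode_of "$dir/data")
content=$(cat "$dir/data")

printf '%s\n' "$content"              # prints: new
if [ "$before" != "$after" ]; then
  echo 'the name now refers to a different file'
fi
rm -rf "$dir"
```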

`0a' does not work (gives an error)
     There is no line 0.  0 is a special address that is only used to
     treat addresses like `0,/RE/' as active when the script starts: if
     you write `1,/abc/d' and the first line includes the word `abc',
     then that match would be ignored because address ranges must span
     at least two lines (barring the end of the file); but what you
     probably wanted is to delete every line up to the first one
     including `abc', and this is obtained with `0,/abc/d'.
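
     The difference shows up as soon as the first line matches.  A
     minimal sketch, assuming GNU `sed' (the `0' address is an
     extension); the function names are made up for the example.

```shell
# Hypothetical helper names for the two addresses being compared.
from_line_one()  { sed '1,/abc/d'; }
from_line_zero() { sed '0,/abc/d'; }

printf 'abc\nx\nabc\ny\n' | from_line_one
# prints:
#   y
printf 'abc\nx\nabc\ny\n' | from_line_zero
# prints:
#   x
#   abc
#   y
```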

`[a-z]' is case insensitive
     You are encountering problems with locales.  POSIX mandates that
     `[a-z]' uses the current locale's collation order - in C parlance,
     that means using `strcoll(3)' instead of `strcmp(3)'.  Some
     locales have a case-insensitive collation order, others don't: one
     of those that have problems is Estonian.

     Another problem is that `[a-z]' tries to use collation symbols.
     This only happens if you are on the GNU system, using GNU libc's
     regular expression matcher instead of compiling the one supplied
     with GNU sed.  In a Danish locale, for example, the regular
     expression `^[a-z]$' matches the string `aa', because this is a
     single collating symbol that comes after `a' and before `b'; `ll'
     behaves similarly in Spanish locales, or `ij' in Dutch locales.

     To work around these problems, which may cause bugs in shell
     scripts, set the `LC_COLLATE' and `LC_CTYPE' environment variables
     to `C'.
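
     In a script, the locale can be set for a single command.  The
     sketch below uses `LC_ALL', which is an assumption on our part:
     it overrides both `LC_COLLATE' and `LC_CTYPE', so it also works
     when `LC_ALL' is already set in the environment.  The function
     name `strip_lowercase' is invented for the example.

```shell
# strip_lowercase is a hypothetical helper; the locale applies to sed only.
strip_lowercase() {
  LC_ALL=C sed 's/[a-z]//g'
}

printf 'Hello World\n' | strip_lowercase    # prints: H W
```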

   ---------- Footnotes ----------

   (1) which is the actual "bug" that prompted the change in behavior


File: sed.info,  Node: Extended regexps,  Next: Concept Index,  Prev: Reporting Bugs,  Up: Top

Extended regular expressions
****************************

   The only difference between basic and extended regular expressions
is in the behavior of a few characters: `?', `+', parentheses, and
braces (`{}').  While basic regular expressions require these to be
escaped if you want them to behave as special characters, when using
extended regular expressions you must escape them if you want them _to
match a literal character_.

Examples:
`abc?'
     becomes `abc\?' when using extended regular expressions.  It
     matches the literal string `abc?'.

`c\+'
     becomes `c+' when using extended regular expressions.  It matches
     one or more `c's.

`a\{3,\}'
     becomes `a{3,}' when using extended regular expressions.  It
     matches three or more `a's.

`\(abc\)\{2,3\}'
     becomes `(abc){2,3}' when using extended regular expressions.  It
     matches either `abcabc' or `abcabcabc'.

`\(abc*\)\1'
     becomes `(abc*)\1' when using extended regular expressions.
     Backreferences must still be escaped when using extended regular
     expressions.
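
   Some of the pairs above can be compared side by side from the shell.
This is a sketch that assumes GNU `sed', whose `-r' option selects
extended regular expressions; the function names are made up for the
illustration.

```shell
# Hypothetical helper names; each pair should behave identically.
bre_repeat()  { sed 's/\(abc\)\{2,3\}/X/'; }
ere_repeat()  { sed -r 's/(abc){2,3}/X/'; }
bre_literal() { sed 's/abc?/X/'; }
ere_literal() { sed -r 's/abc\?/X/'; }

printf 'abcabcabcabc\n' | bre_repeat    # prints: Xabc
printf 'abcabcabcabc\n' | ere_repeat    # prints: Xabc
printf 'abc? abcc\n' | bre_literal      # prints: X abcc
printf 'abc? abcc\n' | ere_literal      # prints: X abcc
```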


File: sed.info,  Node: Concept Index,  Next: Command and Option Index,  Prev: Extended regexps,  Up: Top

Concept Index
*************

   This is a general index of all issues discussed in this manual, with
the exception of the `sed' commands and command-line options.

* Menu:

* Additional reading about sed:          Other Resources.
* ADDR1,+N:                              Addresses.
* ADDR1,~N:                              Addresses.
* Address, as a regular expression:      Addresses.
* Address, last line:                    Addresses.
* Address, numeric:                      Addresses.
* Addresses, in sed scripts:             Addresses.
* Append hold space to pattern space:    Other Commands.
* Append next input line to pattern space: Other Commands.
* Append pattern space to hold space:    Other Commands.
* Appending text after a line:           Other Commands.
* Backreferences, in regular expressions: The "s" Command.
* Branch to a label, if s/// failed:     Extended Commands.
* Branch to a label, if s/// succeeded:  Programming Commands.
* Branch to a label, unconditionally:    Programming Commands.
* Buffer spaces, pattern and hold:       Execution Cycle.
* Bugs, reporting:                       Reporting Bugs.
* Case-insensitive matching:             The "s" Command.
* Caveat -- #n on first line:            Common Commands.
* Command groups:                        Common Commands.
* Comments, in scripts:                  Common Commands.
* Conditional branch <1>:                Extended Commands.
* Conditional branch:                    Programming Commands.
* Copy hold space into pattern space:    Other Commands.
* Copy pattern space into hold space:    Other Commands.
* Delete first line from pattern space:  Other Commands.
* Disabling autoprint, from command line: Invoking sed.
* empty regular expression:              Addresses.
* Evaluate Bourne-shell commands:        Extended Commands.
* Evaluate Bourne-shell commands, after substitution: The "s" Command.
* Exchange hold space with pattern space: Other Commands.
* Excluding lines:                       Addresses.
* Extended regular expressions, choosing: Invoking sed.
* Extended regular expressions, syntax:  Extended regexps.
* Files to be processed as input:        Invoking sed.
* Flow of control in scripts:            Programming Commands.
* Global substitution:                   The "s" Command.
* GNU extensions, /dev/stderr file <1>:  The "s" Command.
* GNU extensions, /dev/stderr file:      Other Commands.
* GNU extensions, /dev/stdin file <1>:   Other Commands.
* GNU extensions, /dev/stdin file:       Extended Commands.
* GNU extensions, /dev/stdout file <1>:  Invoking sed.
* GNU extensions, /dev/stdout file <2>:  The "s" Command.
* GNU extensions, /dev/stdout file:      Other Commands.
* GNU extensions, 0 address:             Addresses.
* GNU extensions, 0,ADDR2 addressing:    Addresses.
* GNU extensions, ADDR1,+N addressing:   Addresses.
* GNU extensions, ADDR1,~N addressing:   Addresses.
* GNU extensions, branch if s/// failed: Extended Commands.
* GNU extensions, case modifiers in s commands: The "s" Command.
* GNU extensions, checking for their presence: Extended Commands.
* GNU extensions, disabling:             Invoking sed.
* GNU extensions, evaluating Bourne-shell commands <1>: Extended Commands.
* GNU extensions, evaluating Bourne-shell commands: The "s" Command.
* GNU extensions, extended regular expressions: Invoking sed.
* GNU extensions, g and NUMBER modifier interaction in s command: The "s" Command.
* GNU extensions, I modifier <1>:        Addresses.
* GNU extensions, I modifier:            The "s" Command.
* GNU extensions, in-place editing <1>:  Reporting Bugs.
* GNU extensions, in-place editing:      Invoking sed.
* GNU extensions, L command:             Extended Commands.
* GNU extensions, M modifier:            The "s" Command.
* GNU extensions, modifiers and the empty regular expression: Addresses.
* GNU extensions, N~M addresses:         Addresses.
* GNU extensions, quitting silently:     Extended Commands.
* GNU extensions, R command:             Extended Commands.
* GNU extensions, reading a file a line at a time: Extended Commands.
* GNU extensions, reformatting paragraphs: Extended Commands.
* GNU extensions, returning an exit code <1>: Common Commands.
* GNU extensions, returning an exit code: Extended Commands.
* GNU extensions, setting line length:   Other Commands.
* GNU extensions, special escapes <1>:   Reporting Bugs.
* GNU extensions, special escapes:       Escapes.
* GNU extensions, special two-address forms: Addresses.
* GNU extensions, subprocesses <1>:      The "s" Command.
* GNU extensions, subprocesses:          Extended Commands.
* GNU extensions, to basic regular expressions <1>: Reporting Bugs.
* GNU extensions, to basic regular expressions: Regular Expressions.
* GNU extensions, two addresses supported by most commands: Other Commands.
* GNU extensions, unlimited line length: Limitations.
* GNU extensions, writing first line to a file: Extended Commands.
* Goto, in scripts:                      Programming Commands.
* Greedy regular expression matching:    Regular Expressions.
* Grouping commands:                     Common Commands.
* Hold space, appending from pattern space: Other Commands.
* Hold space, appending to pattern space: Other Commands.
* Hold space, copy into pattern space:   Other Commands.
* Hold space, copying pattern space into: Other Commands.
* Hold space, definition:                Execution Cycle.
* Hold space, exchange with pattern space: Other Commands.
* In-place editing:                      Reporting Bugs.
* In-place editing, activating:          Invoking sed.
* In-place editing, Perl-style backup file names: Invoking sed.
* Inserting text before a line:          Other Commands.
* Labels, in scripts:                    Programming Commands.
* Last line, selecting:                  Addresses.
* Line length, setting <1>:              Invoking sed.
* Line length, setting:                  Other Commands.
* Line number, printing:                 Other Commands.
* Line selection:                        Addresses.
* Line, selecting by number:             Addresses.
* Line, selecting by regular expression match: Addresses.
* Line, selecting last:                  Addresses.
* List pattern space:                    Other Commands.
* Mixing g and NUMBER modifiers in the s command: The "s" Command.
* Next input line, append to pattern space: Other Commands.
* Next input line, replace pattern space with: Common Commands.
* Non-bugs, in-place editing:            Reporting Bugs.
* Non-bugs, N command on the last line:  Reporting Bugs.
* Non-bugs, regex syntax clashes:        Reporting Bugs.
* Parenthesized substrings:              The "s" Command.
* Pattern space, definition:             Execution Cycle.
* Perl-style regular expressions, multiline: Addresses.
* Portability, comments:                 Common Commands.
* Portability, line length limitations:  Limitations.
* Portability, N command on the last line: Reporting Bugs.
* POSIXLY_CORRECT behavior, bracket expressions: Regular Expressions.
* POSIXLY_CORRECT behavior, enabling:    Invoking sed.
* POSIXLY_CORRECT behavior, escapes:     Escapes.
* POSIXLY_CORRECT behavior, N command:   Reporting Bugs.
* Print first line from pattern space:   Other Commands.
* Printing line number:                  Other Commands.
* Printing text unambiguously:           Other Commands.
* Quitting <1>:                          Extended Commands.
* Quitting:                              Common Commands.
* Range of lines:                        Addresses.
* Range with start address of zero:      Addresses.
* Read next input line:                  Common Commands.
* Read text from a file <1>:             Extended Commands.
* Read text from a file:                 Other Commands.
* Reformat pattern space:                Extended Commands.
* Reformatting paragraphs:               Extended Commands.
* Replace hold space with copy of pattern space: Other Commands.
* Replace pattern space with copy of hold space: Other Commands.
* Replacing all text matching regexp in a line: The "s" Command.
* Replacing only Nth match of regexp in a line: The "s" Command.
* Replacing selected lines with other text: Other Commands.
* Requiring GNU sed:                     Extended Commands.
* Script structure:                      sed Programs.
* Script, from a file:                   Invoking sed.
* Script, from command line:             Invoking sed.
* sed program structure:                 sed Programs.
* Selecting lines to process:            Addresses.
* Selecting non-matching lines:          Addresses.
* Several lines, selecting:              Addresses.
* Slash character, in regular expressions: Addresses.
* Spaces, pattern and hold:              Execution Cycle.
* Special addressing forms:              Addresses.
* Standard input, processing as input:   Invoking sed.
* Stream editor:                         Introduction.
* Subprocesses <1>:                      Extended Commands.
* Subprocesses:                          The "s" Command.
* Substitution of text, options:         The "s" Command.
* Text, appending:                       Other Commands.
* Text, deleting:                        Common Commands.
* Text, insertion:                       Other Commands.
* Text, printing:                        Common Commands.
* Text, printing after substitution:     The "s" Command.
* Text, writing to a file after substitution: The "s" Command.
* Transliteration:                       Other Commands.
* Unbuffered I/O, choosing:              Invoking sed.
* Usage summary, printing:               Invoking sed.
* Version, printing:                     Invoking sed.
* Working on separate files:             Invoking sed.
* Write first line to a file:            Extended Commands.
* Write to a file:                       Other Commands.
* Zero, as range start address:          Addresses.


File: sed.info,  Node: Command and Option Index,  Prev: Concept Index,  Up: Top

Command and Option Index
************************

   This is an alphabetical list of all `sed' commands and command-line
options.

* Menu:

* # (comments):                          Common Commands.
* --expression:                          Invoking sed.
* --file:                                Invoking sed.
* --help:                                Invoking sed.
* --in-place:                            Invoking sed.
* --line-length:                         Invoking sed.
* --quiet:                               Invoking sed.
* --regexp-extended:                     Invoking sed.
* --silent:                              Invoking sed.
* --unbuffered:                          Invoking sed.
* --version:                             Invoking sed.
* -e:                                    Invoking sed.
* -f:                                    Invoking sed.
* -i:                                    Invoking sed.
* -l:                                    Invoking sed.
* -n:                                    Invoking sed.
* -n, forcing from within a script:      Common Commands.
* -r:                                    Invoking sed.
* -u:                                    Invoking sed.
* : (label) command:                     Programming Commands.
* = (print line number) command:         Other Commands.
* a (append text lines) command:         Other Commands.
* b (branch) command:                    Programming Commands.
* c (change to text lines) command:      Other Commands.
* D (delete first line) command:         Other Commands.
* d (delete) command:                    Common Commands.
* e (evaluate) command:                  Extended Commands.
* G (appending Get) command:             Other Commands.
* g (get) command:                       Other Commands.
* H (append Hold) command:               Other Commands.
* h (hold) command:                      Other Commands.
* i (insert text lines) command:         Other Commands.
* L (fLow paragraphs) command:           Extended Commands.
* l (list unambiguously) command:        Other Commands.
* N (append Next line) command:          Other Commands.
* n (next-line) command:                 Common Commands.
* P (print first line) command:          Other Commands.
* p (print) command:                     Common Commands.
* q (quit) command:                      Common Commands.
* Q (silent Quit) command:               Extended Commands.
* r (read file) command:                 Other Commands.
* R (read line) command:                 Extended Commands.
* s command, option flags:               The "s" Command.
* T (test and branch if failed) command: Extended Commands.
* t (test and branch if successful) command: Programming Commands.
* v (version) command:                   Extended Commands.
* w (write file) command:                Other Commands.
* W (write first line) command:          Extended Commands.
* x (eXchange) command:                  Other Commands.
* y (transliterate) command:             Other Commands.
* {} command grouping:                   Common Commands.