GNU 'sed'
1 Introduction
2 Running sed
  2.1 Overview
  2.2 Command-Line Options
  2.3 Exit status
3 'sed' scripts
  3.1 'sed' script overview
  3.2 'sed' commands summary
  3.3 The 's' Command
  3.4 Often-Used Commands
  3.5 Less Frequently-Used Commands
  3.6 Commands for 'sed' gurus
  3.7 Commands Specific to GNU 'sed'
  3.8 Multiple commands syntax
    3.8.1 Commands Requiring a newline
4 Addresses: selecting lines
  4.1 Addresses overview
  4.2 Selecting lines by numbers
  4.3 selecting lines by text matching
  4.4 Range Addresses
5 Regular Expressions: selecting text
  5.1 Overview of regular expression in 'sed'
  5.2 Basic (BRE) and extended (ERE) regular expression
  5.3 Overview of basic regular expression syntax
  5.4 Overview of extended regular expression syntax
  5.5 Character Classes and Bracket Expressions
  5.6 regular expression extensions
  5.7 Back-references and Subexpressions
  5.8 Escape Sequences - specifying special characters
    5.8.1 Escaping Precedence
  5.9 Multibyte characters and Locale Considerations
    5.9.1 Invalid multibyte characters
    5.9.2 Upper/Lower case conversion
    5.9.3 Multibyte regexp character classes
6 Advanced 'sed': cycles and buffers
  6.1 How 'sed' Works
  6.2 Hold and Pattern Buffers
  6.3 Multiline techniques - using D,G,H,N,P to process multiple lines
  6.4 Branching and Flow Control
    6.4.1 Branching and Cycles
    6.4.2 Branching example: joining lines
7 Some Sample Scripts
  7.1 Joining lines
  7.2 Centering Lines
  7.3 Increment a Number
  7.4 Rename Files to Lower Case
  7.5 Print 'bash' Environment
  7.6 Reverse Characters of Lines
  7.7 Text search across multiple lines
  7.8 Line length adjustment
  7.9 Reverse Lines of Files
  7.10 Numbering Lines
  7.11 Numbering Non-blank Lines
  7.12 Counting Characters
  7.13 Counting Words
  7.14 Counting Lines
  7.15 Printing the First Lines
  7.16 Printing the Last Lines
  7.17 Make Duplicate Lines Unique
  7.18 Print Duplicated Lines of Input
  7.19 Remove All Duplicated Lines
  7.20 Squeezing Blank Lines
8 GNU 'sed''s Limitations and Non-limitations
9 Other Resources for Learning About 'sed'
10 Reporting Bugs
Appendix A GNU Free Documentation License
Concept Index
Command and Option Index

GNU 'sed'
*********

This file documents version 4.8 of GNU 'sed', a stream editor.

   Copyright (C) 1998-2020 Free Software Foundation, Inc.

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.3 or any later version published by the Free Software
     Foundation; with no Invariant Sections, no Front-Cover Texts, and
     no Back-Cover Texts. A copy of the license is included in the
     section entitled "GNU Free Documentation License".

1 Introduction
**************

'sed' is a stream editor. A stream editor is used to perform basic text
transformations on an input stream (a file or input from a pipeline).
While in some ways similar to an editor which permits scripted edits
(such as 'ed'), 'sed' works by making only one pass over the input(s),
and is consequently more efficient. But it is 'sed''s ability to filter
text in a pipeline which particularly distinguishes it from other types
of editors.

2 Running sed
*************

This chapter covers how to run 'sed'. Details of 'sed' scripts and
individual 'sed' commands are discussed in the next chapter.

2.1 Overview
============

Normally 'sed' is invoked like this:

     sed SCRIPT INPUTFILE...

For example, to replace the first occurrence of 'hello' on each line of
the file 'input.txt' with 'world':

     sed 's/hello/world/' input.txt > output.txt

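Without a flag, the 's' command replaces only the first match on each
line; the 'g' flag (*note The "s" Command::) replaces every match. The
following is a minimal sketch of the difference. The sample text and
the variable names 'first_only' and 'every_match' are illustrative
assumptions, not part of the original example:

```shell
# Hypothetical one-line input containing two matches.
input='hello hello'

# Without a flag, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$input" | sed 's/hello/world/')

# With the 'g' flag, every match on the line is replaced.
every_match=$(printf '%s\n' "$input" | sed 's/hello/world/g')

printf '%s\n' "$first_only" "$every_match"
```

The first line printed should keep the second 'hello'; the second line
should not contain 'hello' at all.
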
If you do not specify INPUTFILE, or if INPUTFILE is '-', 'sed'
filters the contents of the standard input. The following commands are
equivalent:

     sed 's/hello/world/' input.txt > output.txt
     sed 's/hello/world/' < input.txt > output.txt
     cat input.txt | sed 's/hello/world/' - > output.txt

'sed' writes output to standard output. Use '-i' to edit files
in-place instead of printing to standard output. See also the 'W' and
's///w' commands for writing output to other files. The following
command modifies 'file.txt' and does not produce any output:

     sed -i 's/hello/world/' file.txt

By default 'sed' prints all processed input (except input that has
been modified/deleted by commands such as 'd'). Use '-n' to suppress
output, and the 'p' command to print specific lines. The following
command prints only line 45 of the input file:

     sed -n '45p' file.txt

'sed' treats multiple input files as one long stream. The following
example prints the first line of the first file ('one.txt') and the last
line of the last file ('three.txt'). Use '-s' to treat each file as a
separate stream instead.

     sed -n '1p ; $p' one.txt two.txt three.txt

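The effect of '-s' on the command above can be sketched as follows.
The scratch directory, the file contents, and the variable names
'joined' and 'separate' are assumptions made for illustration only:

```shell
# Work in a scratch directory; the file names mirror the example above.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/one.txt"
printf 'b1\nb2\n' > "$dir/two.txt"
printf 'c1\nc2\n' > "$dir/three.txt"

# One long stream: first line of one.txt, last line of three.txt.
joined=$(cd "$dir" && sed -n '1p ; $p' one.txt two.txt three.txt)

# Separate streams: first and last line of every file.
separate=$(cd "$dir" && sed -s -n '1p ; $p' one.txt two.txt three.txt)

printf '%s\n' "$joined" "--" "$separate"
rm -r "$dir"
```

Without '-s' two lines are printed; with '-s' each two-line file
contributes both its first and its last line.
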
Without '-e' or '-f' options, 'sed' uses the first non-option
parameter as the SCRIPT, and the following non-option parameters as
input files. If '-e' or '-f' options are used to specify a SCRIPT, all
non-option parameters are taken as input files. Options '-e' and '-f'
can be combined, and can appear multiple times (in which case the final
effective SCRIPT will be the concatenation of all the individual
SCRIPTs).

The following examples are equivalent:

     sed 's/hello/world/' input.txt > output.txt

     sed -e 's/hello/world/' input.txt > output.txt
     sed --expression='s/hello/world/' input.txt > output.txt

     echo 's/hello/world/' > myscript.sed
     sed -f myscript.sed input.txt > output.txt
     sed --file=myscript.sed input.txt > output.txt

2.2 Command-Line Options
========================

The full format for invoking 'sed' is:

     sed OPTIONS... [SCRIPT] [INPUTFILE...]

'sed' may be invoked with the following command-line options:

'--version'
     Print out the version of 'sed' that is being run and a copyright
     notice, then exit.

'--help'
     Print a usage message briefly summarizing these command-line
     options and the bug-reporting address, then exit.

'-n'
'--quiet'
'--silent'
     By default, 'sed' prints out the pattern space at the end of each
     cycle through the script (*note How 'sed' works: Execution Cycle.).
     These options disable this automatic printing, and 'sed' only
     produces output when explicitly told to via the 'p' command.

'--debug'
     Print the input sed program in canonical form, and annotate program
     execution.
          $ echo 1 | sed '\%1%s21232'
          3

          $ echo 1 | sed --debug '\%1%s21232'
          SED PROGRAM:
          /1/ s/1/3/
          INPUT: 'STDIN' line 1
          PATTERN: 1
          COMMAND: /1/ s/1/3/
          PATTERN: 3
          END-OF-CYCLE:
          3

'-e SCRIPT'
'--expression=SCRIPT'
     Add the commands in SCRIPT to the set of commands to be run while
     processing the input.

'-f SCRIPT-FILE'
'--file=SCRIPT-FILE'
     Add the commands contained in the file SCRIPT-FILE to the set of
     commands to be run while processing the input.

'-i[SUFFIX]'
'--in-place[=SUFFIX]'
     This option specifies that files are to be edited in-place. GNU
     'sed' does this by creating a temporary file and sending output to
     this file rather than to the standard output(1).

     This option implies '-s'.

     When the end of the file is reached, the temporary file is renamed
     to the output file's original name. The extension, if supplied, is
     used to modify the name of the old file before renaming the
     temporary file, thereby making a backup copy(2).

     This rule is followed: if the extension doesn't contain a '*', then
     it is appended to the end of the current filename as a suffix; if
     the extension does contain one or more '*' characters, then _each_
     asterisk is replaced with the current filename. This allows you to
     add a prefix to the backup file, instead of (or in addition to) a
     suffix, or even to place backup copies of the original files into
     another directory (provided the directory already exists).

     If no extension is supplied, the original file is overwritten
     without making a backup.

     Because '-i' takes an optional argument, it should not be followed
     by other short options:
     'sed -Ei '...' FILE'
          Same as '-E -i' with no backup suffix - 'FILE' will be edited
          in-place without creating a backup.

     'sed -iE '...' FILE'
          This is equivalent to '--in-place=E', creating 'FILEE' as
          backup of 'FILE'.

     Be cautious of using '-n' with '-i': the former disables automatic
     printing of lines and the latter changes the file in-place without
     a backup. Used carelessly (and without an explicit 'p' command),
     the output file will be empty:
          # WRONG USAGE: 'FILE' will be truncated.
          sed -ni 's/foo/bar/' FILE

'-l N'
'--line-length=N'
     Specify the default line-wrap length for the 'l' command. A length
     of 0 (zero) means to never wrap long lines. If not specified, it
     is taken to be 70.

'--posix'
     GNU 'sed' includes several extensions to POSIX sed. In order to
     simplify writing portable scripts, this option disables all the
     extensions that this manual documents, including additional
     commands. Most of the extensions accept 'sed' programs that are
     outside the syntax mandated by POSIX, but some of them (such as the
     behavior of the 'N' command described in *note Reporting Bugs::)
     actually violate the standard. If you want to disable only the
     latter kind of extension, you can set the 'POSIXLY_CORRECT'
     variable to a non-empty value.

'-b'
'--binary'
     This option is available on every platform, but is only effective
     where the operating system makes a distinction between text files
     and binary files. When such a distinction is made--as is the case
     for MS-DOS, Windows, Cygwin--text files are composed of lines
     separated by a carriage return _and_ a line feed character, and
     'sed' does not see the ending CR. When this option is specified,
     'sed' will open input files in binary mode, thus not requesting
     this special processing and considering lines to end at a line
     feed.

'--follow-symlinks'
     This option is available only on platforms that support symbolic
     links and has an effect only if option '-i' is specified. In this
     case, if the file that is specified on the command line is a
     symbolic link, 'sed' will follow the link and edit the ultimate
     destination of the link. The default behavior is to break the
     symbolic link, so that the link destination will not be modified.

'-E'
'-r'
'--regexp-extended'
     Use extended regular expressions rather than basic regular
     expressions. Extended regexps are those that 'egrep' accepts; they
     can be clearer because they usually have fewer backslashes.
     Historically this was a GNU extension, but the '-E' extension has
     since been added to the POSIX standard
     (http://austingroupbugs.net/view.php?id=528), so use '-E' for
     portability. GNU sed has accepted '-E' as an undocumented option
     for years, and *BSD seds have accepted '-E' for years as well, but
     scripts that use '-E' might not port to other older systems. *Note
     Extended regular expressions: ERE syntax.

'-s'
'--separate'
     By default, 'sed' will consider the files specified on the command
     line as a single continuous long stream. This GNU 'sed' extension
     allows the user to consider them as separate files: range addresses
     (such as '/abc/,/def/') are not allowed to span several files, line
     numbers are relative to the start of each file, '$' refers to the
     last line of each file, and files invoked from the 'R' commands are
     rewound at the start of each file.

'--sandbox'
     In sandbox mode, 'e/w/r' commands are rejected - programs
     containing them will be aborted without being run. Sandbox mode
     ensures 'sed' operates only on the input files designated on the
     command line, and cannot run external programs.

'-u'
'--unbuffered'
     Buffer both input and output as minimally as practical. (This is
     particularly useful if the input is coming from the likes of 'tail
     -f', and you wish to see the transformed output as soon as
     possible.)

'-z'
'--null-data'
'--zero-terminated'
     Treat the input as a set of lines, each terminated by a zero byte
     (the ASCII 'NUL' character) instead of a newline. This option can
     be used with commands like 'sort -z' and 'find -print0' to process
     arbitrary file names.

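The backup naming rule described under '-i' above can be tried out
with a sketch like the following. The scratch directory, the file
names, the '.orig' suffix and the 'bak' directory are all hypothetical
choices made for this illustration, and GNU 'sed' is assumed:

```shell
# Scratch directory with two sample files and a backup directory.
dir=$(mktemp -d)
mkdir "$dir/bak"
printf 'hello\n' > "$dir/a.txt"
printf 'hello\n' > "$dir/b.txt"

# No '*' in the suffix: it is appended, giving the backup a.txt.orig.
sed -i.orig 's/hello/world/' "$dir/a.txt"

# A '*' in the suffix is replaced by the file name, so the backup
# lands in the existing directory as bak/b.txt.
(cd "$dir" && sed -i'bak/*' 's/hello/world/' b.txt)

# Collect the results before cleaning up.
a_new=$(cat "$dir/a.txt");  a_backup=$(cat "$dir/a.txt.orig")
b_new=$(cat "$dir/b.txt");  b_backup=$(cat "$dir/bak/b.txt")
printf '%s\n' "$a_new $a_backup" "$b_new $b_backup"
rm -r "$dir"
```

In both cases the edited file holds the new text and the backup holds
the original text.
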
If no '-e', '-f', '--expression', or '--file' options are given on
the command-line, then the first non-option argument on the command line
is taken to be the SCRIPT to be executed.

If any command-line parameters remain after processing the above,
these parameters are interpreted as the names of input files to be
processed. A file name of '-' refers to the standard input stream. The
standard input will be processed if no file names are specified.

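As a rough illustration of the '-z' option listed above, the sketch
below feeds 'sed' two NUL-terminated records through standard input.
The sample data and the variable name 'result' are assumptions; 'tr'
is used only to make the NUL terminators printable:

```shell
# Two NUL-terminated records; the first contains an embedded newline.
# With -z, that newline is ordinary data inside the record, so the
# 's' command can match it with '\n'.
result=$(printf 'a\nb\000c\000' | sed -z 's/\n/ /' | tr '\000' '\n')

printf '%s\n' "$result"
```

Each record is processed as a single "line", so the embedded newline
in the first record is replaced and the second record is untouched.
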
---------- Footnotes ----------

(1) This applies to commands such as '=', 'a', 'c', 'i', 'l', 'p'.
You can still write to the standard output by using the 'w' or 'W'
commands together with the '/dev/stdout' special file.

(2) Note that GNU 'sed' creates the backup file whether or not any
output is actually changed.

2.3 Exit status
===============

An exit status of zero indicates success, and a nonzero value indicates
failure. GNU 'sed' returns the following exit status error values:

0
     Successful completion.

1
     Invalid command, invalid syntax, invalid regular expression or a
     GNU 'sed' extension command used with '--posix'.

2
     One or more of the input files specified on the command line could
     not be opened (e.g. if a file is not found, or read permission is
     denied). Processing continued with other files.

4
     An I/O error or a serious processing error occurred at runtime;
     GNU 'sed' aborted immediately.

Additionally, the commands 'q' and 'Q' can be used to terminate 'sed'
with a custom exit code value (this is a GNU 'sed' extension):

     $ echo | sed 'Q42' ; echo $?
     42

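Exit statuses 1 and 2 from the list above can be observed with a
sketch such as this one. It assumes GNU 'sed'; the command letter 'k'
is chosen because it is not a 'sed' command, and the path
'/nonexistent/input.txt' is assumed not to exist:

```shell
# An unknown command is a script error: exit status 1.
status_bad_script=0
sed 'k' </dev/null >/dev/null 2>&1 || status_bad_script=$?

# An input file that cannot be opened: exit status 2.
status_bad_input=0
sed -n 'p' /nonexistent/input.txt >/dev/null 2>&1 || status_bad_input=$?

echo "$status_bad_script $status_bad_input"
```

Capturing the status with '|| var=$?' keeps the sketch usable in
scripts that run under 'set -e'.
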
3 'sed' scripts
***************

3.1 'sed' script overview
=========================

A 'sed' program consists of one or more 'sed' commands, passed in by one
or more of the '-e', '-f', '--expression', and '--file' options, or the
first non-option argument if none of these options are used. This
document will refer to "the" 'sed' script; this is understood to mean
the in-order concatenation of all of the SCRIPTs and SCRIPT-FILEs passed
in. *Note Overview::.

'sed' commands follow this syntax:

     [addr]X[options]

X is a single-letter 'sed' command. '[addr]' is an optional line
address. If '[addr]' is specified, the command X will be executed only
on the matched lines. '[addr]' can be a single line number, a regular
expression, or a range of lines (*note sed addresses::). Additional
'[options]' are used for some 'sed' commands.

The following example deletes lines 30 to 35 in the input. '30,35'
is an address range. 'd' is the delete command:

     sed '30,35d' input.txt > output.txt

The following example prints all input until a line starting with the
word 'foo' is found. If such a line is found, 'sed' will terminate with
exit status 42. If no such line is found (and no other error
occurs), 'sed' will exit with status 0. '/^foo/' is a
regular-expression address. 'q' is the quit command. '42' is the
command option.

     sed '/^foo/q42' input.txt > output.txt

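The 'q42' example can be checked on a small input as sketched below.
The three-line sample input and the variable names 'output' and
'status' are assumptions made for this illustration:

```shell
# Hypothetical three-line input; the second line starts with 'foo'.
status=0
output=$(printf 'one\nfoo bar\nthree\n' | sed '/^foo/q42') || status=$?

# 'q' prints the current pattern space before quitting, so the
# matching line is part of the output and 'three' is never reached.
printf '%s\n' "$output"
echo "exit status: $status"
```
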
Commands within a SCRIPT or SCRIPT-FILE can be separated by
semicolons (';') or newlines (ASCII 10). Multiple scripts can be
specified with '-e' or '-f' options.

The following examples are all equivalent. They perform two 'sed'
operations: deleting any lines matching the regular expression '/^foo/',
and replacing the first occurrence of the string 'hello' on each
remaining line with 'world':

     sed '/^foo/d ; s/hello/world/' input.txt > output.txt

     sed -e '/^foo/d' -e 's/hello/world/' input.txt > output.txt

     echo '/^foo/d' > script.sed
     echo 's/hello/world/' >> script.sed
     sed -f script.sed input.txt > output.txt

     echo 's/hello/world/' > script2.sed
     sed -e '/^foo/d' -f script2.sed input.txt > output.txt

Commands 'a', 'c', 'i', due to their syntax, cannot be followed by
semicolons working as command separators and thus should be terminated
with newlines or be placed at the end of a SCRIPT or SCRIPT-FILE.
Commands can also be preceded with optional non-significant whitespace
characters. *Note Multiple commands syntax::.

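The restriction on 'a', 'c' and 'i' can be seen in a short sketch.
It assumes GNU 'sed' (for the one-line 'a TEXT' form) and the 'seq'
utility; the variable names are illustrative:

```shell
# The semicolon is NOT a separator here: everything after 'a' is
# taken as the text to append, so '2d' never runs as a command.
swallowed=$(seq 2 | sed '1a hello ; 2d')

# Ending the 'a' command with its own '-e' fragment lets '2d' run.
separated=$(seq 2 | sed -e '1a hello' -e '2d')

printf '%s\n--\n%s\n' "$swallowed" "$separated"
```

In the first form the appended line is the literal text 'hello ; 2d'
and line 2 survives; in the second form line 2 is deleted.
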
3.2 'sed' commands summary
==========================

The following commands are supported in GNU 'sed'. Some are standard
POSIX commands, while others are GNU extensions. Details and examples
for each command are in the following sections. (Mnemonics) are shown
in parentheses.

'a\'
'TEXT'
     Append TEXT after a line.

'a TEXT'
     Append TEXT after a line (alternative syntax).

'b LABEL'
     Branch unconditionally to LABEL. The LABEL may be omitted, in
     which case the next cycle is started.

'c\'
'TEXT'
     Replace (change) lines with TEXT.

'c TEXT'
     Replace (change) lines with TEXT (alternative syntax).

'd'
     Delete the pattern space; immediately start next cycle.

'D'
     If pattern space contains newlines, delete text in the pattern
     space up to the first newline, and restart cycle with the resultant
     pattern space, without reading a new line of input.

     If pattern space contains no newline, start a normal new cycle as
     if the 'd' command was issued.

'e'
     Executes the command that is found in pattern space and replaces
     the pattern space with the output; a trailing newline is
     suppressed.

'e COMMAND'
     Executes COMMAND and sends its output to the output stream. The
     command can run across multiple lines, all but the last ending with
     a back-slash.

'F'
     (filename) Print the file name of the current input file (with a
     trailing newline).

'g'
     Replace the contents of the pattern space with the contents of the
     hold space.

'G'
     Append a newline to the contents of the pattern space, and then
     append the contents of the hold space to that of the pattern space.

'h'
     (hold) Replace the contents of the hold space with the contents of
     the pattern space.

'H'
     Append a newline to the contents of the hold space, and then append
     the contents of the pattern space to that of the hold space.

'i\'
'TEXT'
     Insert TEXT before a line.

'i TEXT'
     Insert TEXT before a line (alternative syntax).

'l'
     Print the pattern space in an unambiguous form.

'n'
     (next) If auto-print is not disabled, print the pattern space,
     then, regardless, replace the pattern space with the next line of
     input. If there is no more input then 'sed' exits without
     processing any more commands.

'N'
     Add a newline to the pattern space, then append the next line of
     input to the pattern space. If there is no more input then 'sed'
     exits without processing any more commands.

'p'
     Print the pattern space.

'P'
     Print the pattern space, up to the first <newline>.

'q[EXIT-CODE]'
     (quit) Exit 'sed' without processing any more commands or input.

'Q[EXIT-CODE]'
     (quit) This command is the same as 'q', but will not print the
     contents of pattern space. Like 'q', it provides the ability to
     return an exit code to the caller.

'r FILENAME'
     Reads file FILENAME.

'R FILENAME'
     Queue a line of FILENAME to be read and inserted into the output
     stream at the end of the current cycle, or when the next input line
     is read.

's/REGEXP/REPLACEMENT/[FLAGS]'
     (substitute) Match the regular-expression against the content of
     the pattern space. If found, replace matched string with
     REPLACEMENT.

't LABEL'
     (test) Branch to LABEL only if there has been a successful
     's'ubstitution since the last input line was read or conditional
     branch was taken. The LABEL may be omitted, in which case the next
     cycle is started.

'T LABEL'
     (test) Branch to LABEL only if there have been no successful
     's'ubstitutions since the last input line was read or conditional
     branch was taken. The LABEL may be omitted, in which case the next
     cycle is started.

'v [VERSION]'
     (version) This command does nothing, but makes 'sed' fail if GNU
     'sed' extensions are not supported, or if the requested version is
     not available.

'w FILENAME'
     Write the pattern space to FILENAME.

'W FILENAME'
     Write to FILENAME the portion of the pattern space up to the first
     newline.

'x'
     Exchange the contents of the hold and pattern spaces.

'y/SOURCE-CHARS/DEST-CHARS/'
     Transliterate any characters in the pattern space which match any
     of the SOURCE-CHARS with the corresponding character in DEST-CHARS.

'z'
     (zap) This command empties the content of pattern space.

'#'
     A comment, until the next newline.

'{ CMD ; CMD ... }'
     Group several commands together.

'='
     Print the current input line number (with a trailing newline).

': LABEL'
     Specify the location of LABEL for branch commands ('b', 't', 'T').

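A few of the commands summarized above can be tried on tiny inputs.
This is only a sketch; the sample inputs and variable names are
assumptions, and 'seq' is assumed to be available:

```shell
# 'G' appends a newline plus the (initially empty) hold space,
# which double-spaces the input.
double_spaced=$(seq 3 | sed 'G')

# 'y' maps each character of the first set to the character at the
# same position in the second set: e->i, l->p.
transliterated=$(echo 'hello' | sed 'y/el/ip/')

# '=' prints the input line number before the line itself.
numbered=$(printf 'a\nb\n' | sed '=')

printf '%s\n' "$double_spaced" "$transliterated" "$numbered"
```
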
3.3 The 's' Command
===================

The 's' command (as in substitute) is probably the most important in
'sed' and has a lot of different options. The syntax of the 's' command
is 's/REGEXP/REPLACEMENT/FLAGS'.

Its basic concept is simple: the 's' command attempts to match the
pattern space against the supplied regular expression REGEXP; if the
match is successful, then that portion of the pattern space which was
matched is replaced with REPLACEMENT.

For details about REGEXP syntax *note Regular Expression Addresses:
Regexp Addresses.

The REPLACEMENT can contain '\N' (N being a number from 1 to 9,
inclusive) references, which refer to the portion of the match which is
contained between the Nth '\(' and its matching '\)'. Also, the
REPLACEMENT can contain unescaped '&' characters which reference the
whole matched portion of the pattern space.

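The '\N' and '&' references can be sketched on a sample string. The
input text and the variable names below are illustrative assumptions:

```shell
# '\1' and '\2' refer to the first and second '\( ... \)' groups.
swapped=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')

# An unescaped '&' stands for the whole match.
wrapped=$(echo 'hello world' | sed 's/hello/[&]/')

# An escaped '\&' is a literal ampersand.
literal=$(echo 'hello world' | sed 's/hello/\&/')

printf '%s\n' "$swapped" "$wrapped" "$literal"
```
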
The '/' characters may be uniformly replaced by any other single
character within any given 's' command. The '/' character (or whatever
other character is used in its stead) can appear in the REGEXP or
REPLACEMENT only if it is preceded by a '\' character.

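Choosing a different delimiter is mostly a matter of readability, as
this sketch suggests. The sample path and the variable names are
assumptions chosen for illustration:

```shell
# With '/' as the delimiter, every '/' in the path must be escaped.
escaped=$(echo '/usr/local/bin' | sed 's/\/usr\/local/\/opt/')

# With '|' as the delimiter, the slashes need no escaping.
piped=$(echo '/usr/local/bin' | sed 's|/usr/local|/opt|')

printf '%s\n' "$escaped" "$piped"
```

Both commands perform the same substitution; only the quoting burden
differs.
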
Finally, as a GNU 'sed' extension, you can include a special sequence
made of a backslash and one of the letters 'L', 'l', 'U', 'u', or 'E'.
The meaning is as follows:

'\L'
     Turn the replacement to lowercase until a '\U' or '\E' is found,

'\l'
     Turn the next character to lowercase,

'\U'
     Turn the replacement to uppercase until a '\L' or '\E' is found,

'\u'
     Turn the next character to uppercase,

'\E'
     Stop case conversion started by '\L' or '\U'.

When the 'g' flag is being used, case conversion does not propagate
from one occurrence of the regular expression to another. For example,
when the following command is executed with 'a-b-' in pattern space:
     s/\(b\?\)-/x\u\1/g

the output is 'axxB'. When replacing the first '-', the '\u' sequence
only affects the empty replacement of '\1'. It does not affect the 'x'
character that is added to pattern space when replacing 'b-' with 'xB'.

On the other hand, '\l' and '\u' do affect the remainder of the
replacement text if they are followed by an empty substitution. With
'a-b-' in pattern space, the following command:
     s/\(b\?\)-/\u\1x/g

will replace '-' with 'X' (uppercase) and 'b-' with 'Bx'. If this
behavior is undesirable, you can prevent it by adding a '\E'
sequence--after '\1' in this case.

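The case-conversion sequences can be exercised with a sketch like the
one below, which assumes GNU 'sed'. The 'hello world' input and the
variable names are illustrative; the two 'a-b-' commands are the ones
discussed above:

```shell
# '\U' uppercases everything up to '\E'; text after '\E' is left alone.
upper=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\U\1\E \2/')

# '\u' uppercases only the next character of the replacement.
capitalized=$(echo 'hello world' | sed 's/[a-z][a-z]*/\u&/g')

# Conversion does not propagate from one match to the next ...
no_propagation=$(echo 'a-b-' | sed 's/\(b\?\)-/x\u\1/g')

# ... but within a match it carries over an empty '\1' to the 'x'.
carried_over=$(echo 'a-b-' | sed 's/\(b\?\)-/\u\1x/g')

printf '%s\n' "$upper" "$capitalized" "$no_propagation" "$carried_over"
```
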
To include a literal '\', '&', or newline in the final replacement,
be sure to precede the desired '\', '&', or newline in the REPLACEMENT
with a '\'.

The 's' command can be followed by zero or more of the following
FLAGS:

'g'
     Apply the replacement to _all_ matches to the REGEXP, not just the
     first.

'NUMBER'
     Only replace the NUMBERth match of the REGEXP.

     Note: the POSIX standard does not specify what should happen when
     you mix the 'g' and NUMBER modifiers, and currently there is no
     widely agreed upon meaning across 'sed' implementations. For GNU
     'sed', the interaction is defined to be: ignore matches before the
     NUMBERth, and then match and replace all matches from the NUMBERth
     on.

'p'
     If the substitution was made, then print the new pattern space.

     Note: when both the 'p' and 'e' options are specified, the relative
     ordering of the two produces very different results. In general,
     'ep' (evaluate then print) is what you want, but operating the
     other way round can be useful for debugging. For this reason, the
     current version of GNU 'sed' interprets specially the presence of
     'p' options both before and after 'e', printing the pattern space
     before and after evaluation, while in general flags for the 's'
     command show their effect just once. This behavior, although
     documented, might change in future versions.

'w FILENAME'
     If the substitution was made, then write out the result to the
     named file. As a GNU 'sed' extension, two special values of
     FILENAME are supported: '/dev/stderr', which writes the result to
     the standard error, and '/dev/stdout', which writes to the standard
     output.(1)

'e'
     This command allows one to pipe input from a shell command into
     pattern space. If a substitution was made, the command that is
     found in pattern space is executed and pattern space is replaced
     with its output. A trailing newline is suppressed; results are
     undefined if the command to be executed contains a NUL character.
     This is a GNU 'sed' extension.

'I'
'i'
     The 'I' modifier to regular-expression matching is a GNU extension
     which makes 'sed' match REGEXP in a case-insensitive manner.

'M'
'm'
     The 'M' modifier to regular-expression matching is a GNU 'sed'
     extension which directs GNU 'sed' to match the regular expression
     in 'multi-line' mode. The modifier causes '^' and '$' to match
     respectively (in addition to the normal behavior) the empty string
     after a newline, and the empty string before a newline. There are
     special character sequences ('\`' and '\'') which always match the
     beginning or the end of the buffer. In addition, the period
     character does not match a new-line character in multi-line mode.

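Several of the flags listed above can be compared side by side. The
sketch assumes GNU 'sed' (for 'NUMBER' combined with 'g', and for 'I'
and 'M'); the inputs and variable names are illustrative:

```shell
# NUMBER alone replaces only that match; NUMBER with 'g' replaces
# that match and every later one.
second_only=$(echo 'a a a a' | sed 's/a/X/2')
second_onward=$(echo 'a a a a' | sed 's/a/X/2g')

# 'I' makes the match case-insensitive.
ignore_case=$(echo 'Hello' | sed 's/hello/bye/I')

# With '-n', the 'p' flag prints only lines where a substitution
# was made.
changed_only=$(printf 'foo\nbar\n' | sed -n 's/foo/FOO/p')

# 'M' lets '^' match right after the newline that 'N' put into the
# pattern space.
multiline=$(printf 'a\nb\n' | sed 'N;s/^b/X/M')

printf '%s\n' "$second_only" "$second_onward" "$ignore_case" \
  "$changed_only" "$multiline"
```
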
---------- Footnotes ----------

(1) This is equivalent to 'p' unless the '-i' option is being used.

734 |
3.4 Often-Used Commands |
735 |
======================= |
736 |
|
737 |
If you use 'sed' at all, you will quite likely want to know these |
738 |
commands. |
739 |
|
740 |
'#' |
741 |
[No addresses allowed.] |
742 |
|
743 |
The '#' character begins a comment; the comment continues until the |
744 |
next newline. |
745 |
|
746 |
If you are concerned about portability, be aware that some |
747 |
implementations of 'sed' (which are not POSIX conforming) may only |
748 |
support a single one-line comment, and then only when the very |
749 |
first character of the script is a '#'. |
750 |
|
751 |
Warning: if the first two characters of the 'sed' script are '#n', |
752 |
then the '-n' (no-autoprint) option is forced. If you want to put |
753 |
a comment in the first line of your script and that comment begins |
754 |
with the letter 'n' and you do not want this behavior, then be sure |
755 |
to either use a capital 'N', or place at least one space before the |
756 |
'n'. |

'q [EXIT-CODE]'
Exit 'sed' without processing any more commands or input.

Example: stop after printing the second line:
    $ seq 3 | sed 2q
    1
    2

This command accepts only one address. Note that the current
pattern space is printed if auto-print is not disabled with the
'-n' option. The ability to return an exit code from the 'sed'
script is a GNU 'sed' extension.

See also the GNU 'sed' extension 'Q' command which quits silently
without printing the current pattern space.

'd'
Delete the pattern space; immediately start next cycle.

Example: delete the second input line:
    $ seq 3 | sed 2d
    1
    3

'p'
Print out the pattern space (to the standard output). This command
is usually only used in conjunction with the '-n' command-line
option.

Example: print only the second input line:
    $ seq 3 | sed -n 2p
    2

'n'
If auto-print is not disabled, print the pattern space, then,
regardless, replace the pattern space with the next line of input.
If there is no more input then 'sed' exits without processing any
more commands.

This command is useful to skip lines (e.g. process every Nth
line).

Example: perform substitution on every 3rd line (i.e. two 'n'
commands skip two lines):
    $ seq 6 | sed 'n;n;s/./x/'
    1
    2
    x
    4
    5
    x

GNU 'sed' provides an extension address syntax of FIRST~STEP to
achieve the same result:

    $ seq 6 | sed '0~3s/./x/'
    1
    2
    x
    4
    5
    x

'{ COMMANDS }'
A group of commands may be enclosed between '{' and '}' characters.
This is particularly useful when you want a group of commands to be
triggered by a single address (or address-range) match.

Example: perform substitution then print the second input line:
    $ seq 3 | sed -n '2{s/2/X/ ; p}'
    X

3.5 Less Frequently-Used Commands
=================================

Though perhaps less frequently used than those in the previous section,
some very small yet useful 'sed' scripts can be built with these
commands.

'y/SOURCE-CHARS/DEST-CHARS/'
Transliterate any characters in the pattern space which match any
of the SOURCE-CHARS with the corresponding character in DEST-CHARS.

Example: transliterate 'a-j' into '0-9':
    $ echo hello world | sed 'y/abcdefghij/0123456789/'
    74llo worl3

(The '/' characters may be uniformly replaced by any other single
character within any given 'y' command.)

Instances of the '/' (or whatever other character is used in its
stead), '\', or newlines can appear in the SOURCE-CHARS or
DEST-CHARS lists, provided that each instance is escaped by a '\'.
The SOURCE-CHARS and DEST-CHARS lists _must_ contain the same
number of characters (after de-escaping).

See the 'tr' command from GNU coreutils for similar functionality.
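
As a small illustration of the delimiter and escaping rules for 'y',
the sketch below uses ',' as the delimiter so that '/' needs no
escaping, and writes '\\' for a single backslash. The input string is
made up for the example:

```shell
# ',' replaces '/' as the delimiter; '\\' in SOURCE-CHARS is one backslash.
# After de-escaping, both lists hold two characters: '/' '\' and '-' '+'.
out=$(printf '%s\n' 'a/b\c' | sed 'y,/\\,-+,')
printf '%s\n' "$out"
```

With this input the command should print 'a-b+c'.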

'a TEXT'
Append TEXT after a line. This is a GNU extension to the
standard 'a' command - see below for details.

Example: Add the word 'hello' after the second line:
    $ seq 3 | sed '2a hello'
    1
    2
    hello
    3

Leading whitespace after the 'a' command is ignored. The text to
add is read until the end of the line.

'a\'
'TEXT'
Append TEXT after a line.

Example: Add 'hello' after the second line (-| indicates printed
output lines):
    $ seq 3 | sed '2a\
    hello'
    -|1
    -|2
    -|hello
    -|3

The 'a' command queues the lines of text which follow this command
(each but the last ending with a '\', which are removed from the
output) to be output at the end of the current cycle, or when the
next input line is read.

As a GNU extension, this command accepts two addresses.

Escape sequences in TEXT are processed, so you should use '\\' in
TEXT to print a single backslash.

The commands resume after the last line without a backslash ('\') -
'world' in the following example:
    $ seq 3 | sed '2a\
    hello\
    world
    3s/./X/'
    -|1
    -|2
    -|hello
    -|world
    -|X

As a GNU extension, the 'a' command and TEXT can be separated into
two '-e' parameters, enabling easier scripting:
    $ seq 3 | sed -e '2a\' -e hello
    1
    2
    hello
    3

    $ sed -e '2a\' -e "$VAR"

'i TEXT'
Insert TEXT before a line. This is a GNU extension to the standard
'i' command - see below for details.

Example: Insert the word 'hello' before the second line:
    $ seq 3 | sed '2i hello'
    1
    hello
    2
    3

Leading whitespace after the 'i' command is ignored. The text to
add is read until the end of the line.

'i\'
'TEXT'
Immediately output the lines of text which follow this command.

Example: Insert 'hello' before the second line (-| indicates
printed output lines):
    $ seq 3 | sed '2i\
    hello'
    -|1
    -|hello
    -|2
    -|3

As a GNU extension, this command accepts two addresses.

Escape sequences in TEXT are processed, so you should use '\\' in
TEXT to print a single backslash.

The commands resume after the last line without a backslash ('\') -
'world' in the following example:
    $ seq 3 | sed '2i\
    hello\
    world
    s/./X/'
    -|X
    -|hello
    -|world
    -|X
    -|X

As a GNU extension, the 'i' command and TEXT can be separated into
two '-e' parameters, enabling easier scripting:
    $ seq 3 | sed -e '2i\' -e hello
    1
    hello
    2
    3

    $ sed -e '2i\' -e "$VAR"

'c TEXT'
Replace the line(s) with TEXT. This is a GNU extension to the
standard 'c' command - see below for details.

Example: Replace the 2nd to 9th lines with the word 'hello':
    $ seq 10 | sed '2,9c hello'
    1
    hello
    10

Leading whitespace after the 'c' command is ignored. The text to
add is read until the end of the line.

'c\'
'TEXT'
Delete the lines matching the address or address-range, and output
the lines of text which follow this command.

Example: Replace 2nd to 4th lines with the words 'hello' and
'world' (-| indicates printed output lines):
    $ seq 5 | sed '2,4c\
    hello\
    world'
    -|1
    -|hello
    -|world
    -|5

If no addresses are given, each line is replaced.

A new cycle is started after this command is done, since the
pattern space will have been deleted. In the following example,
the 'c' starts a new cycle and the substitution command is not
performed on the replaced text:

    $ seq 3 | sed '2c\
    hello
    s/./X/'
    -|X
    -|hello
    -|X

As a GNU extension, the 'c' command and TEXT can be separated into
two '-e' parameters, enabling easier scripting:
    $ seq 3 | sed -e '2c\' -e hello
    1
    hello
    3

    $ sed -e '2c\' -e "$VAR"

'='
Print out the current input line number (with a trailing newline).

    $ printf '%s\n' aaa bbb ccc | sed =
    1
    aaa
    2
    bbb
    3
    ccc

As a GNU extension, this command accepts two addresses.

'l N'
Print the pattern space in an unambiguous form: non-printable
characters (and the '\' character) are printed in C-style escaped
form; long lines are split, with a trailing '\' character to
indicate the split; the end of each line is marked with a '$'.

N specifies the desired line-wrap length; a length of 0 (zero)
means to never wrap long lines. If omitted, the default as
specified on the command line is used. The N parameter is a GNU
'sed' extension.
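
A minimal sketch of the 'l' command, using a made-up input line that
contains a tab and a backslash. Auto-print is disabled with '-n' so
that only the escaped form is shown:

```shell
# The input is: a, TAB, b, backslash, c.
# 'l' should render the tab as \t, the backslash as \\, and mark the end with $.
out=$(printf 'a\tb\\c\n' | sed -n 'l')
printf '%s\n' "$out"
```

The expected output is the single line 'a\tb\\c$'.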

'r FILENAME'

Reads file FILENAME. Example:

    $ seq 3 | sed '2r/etc/hostname'
    1
    2
    fencepost.gnu.org
    3

Queue the contents of FILENAME to be read and inserted into the
output stream at the end of the current cycle, or when the next
input line is read. Note that if FILENAME cannot be read, it is
treated as if it were an empty file, without any error indication.

As a GNU 'sed' extension, the special value '/dev/stdin' is
supported for the file name, which reads the contents of the
standard input.

As a GNU extension, this command accepts two addresses. The file
will then be reread and inserted on each of the addressed lines.

'w FILENAME'
Write the pattern space to FILENAME. As a GNU 'sed' extension, two
special values of FILENAME are supported: '/dev/stderr', which
writes the result to the standard error, and '/dev/stdout', which
writes to the standard output.(1)

The file will be created (or truncated) before the first input line
is read; all 'w' commands (including instances of the 'w' flag on
successful 's' commands) which refer to the same FILENAME are
output without closing and reopening the file.
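
The following sketch shows 'w' copying selected lines to a side file
while the normal output is left untouched. It assumes GNU 'sed' (for
the 'FIRST~STEP' address) and uses a temporary directory and a file
name, 'even.txt', chosen only for this illustration:

```shell
# Assumption: GNU sed; the directory and file name are placeholders.
dir=$(mktemp -d)

# Every even-numbered line is also written to even.txt;
# auto-print still sends all four lines to standard output.
out=$(seq 4 | sed "0~2w $dir/even.txt")
saved=$(cat "$dir/even.txt")

printf 'stdout:\n%s\n' "$out"
printf 'even.txt:\n%s\n' "$saved"

rm -rf "$dir"
```

Note that the file name runs to the end of the line, so nothing else
may follow it in the same '-e' chunk.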

'D'
If pattern space contains no newline, start a normal new cycle as
if the 'd' command was issued. Otherwise, delete text in the
pattern space up to the first newline, and restart cycle with the
resultant pattern space, without reading a new line of input.

'N'
Add a newline to the pattern space, then append the next line of
input to the pattern space. If there is no more input then 'sed'
exits without processing any more commands.

When '-z' is used, a zero byte (the ASCII 'NUL' character) is added
between the lines (instead of a newline).

If there is no next input line, GNU 'sed' by default prints the
pattern space (unless '-n' is in effect) before exiting, rather
than discarding it. This is a GNU extension which can be disabled
with '--posix'. *Note N command on the last line: N_command_last_line.

'P'
Print out the portion of the pattern space up to the first newline.
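
The 'N', 'P' and 'D' commands are usually combined. As a sketch, the
script below removes consecutive duplicate lines, in the manner of
'uniq'; the input values are made up for the example:

```shell
# $!N  append the next line (except on the last line)
# /^\(.*\)\n\1$/!P  print the first line unless both lines are identical
# D    drop the first line and restart the cycle with what remains
out=$(printf '%s\n' a a b | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$out"
```

For the input 'a', 'a', 'b' this should print 'a' and 'b'.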

'h'
Replace the contents of the hold space with the contents of the
pattern space.

'H'
Append a newline to the contents of the hold space, and then append
the contents of the pattern space to that of the hold space.

'g'
Replace the contents of the pattern space with the contents of the
hold space.

'G'
Append a newline to the contents of the pattern space, and then
append the contents of the hold space to that of the pattern space.

'x'
Exchange the contents of the hold and pattern spaces.
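
To see the hold space at work, the sketch below reverses the order of
the input lines (similar to 'tac') using only 'G', 'h' and 'p':

```shell
# 1!G  on every line but the first, append the hold space (lines seen so far)
# h    save the accumulated, reversed text back into the hold space
# $p   on the last line, print the whole pattern space
out=$(seq 3 | sed -n '1!G;h;$p')
printf '%s\n' "$out"
```

For 'seq 3' the output should be '3', '2', '1', one per line.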

---------- Footnotes ----------

(1) This is equivalent to 'p' unless the '-i' option is being used.

3.6 Commands for 'sed' gurus
============================

In most cases, use of these commands indicates that you are probably
better off programming in something like 'awk' or Perl. But
occasionally one is committed to sticking with 'sed', and these commands
can enable one to write quite convoluted scripts.

': LABEL'
[No addresses allowed.]

Specify the location of LABEL for branch commands. In all other
respects, a no-op.

'b LABEL'
Unconditionally branch to LABEL. The LABEL may be omitted, in
which case the next cycle is started.

't LABEL'
Branch to LABEL only if there has been a successful 's'ubstitution
since the last input line was read or conditional branch was taken.
The LABEL may be omitted, in which case the next cycle is started.
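
A small sketch of a loop built from ':' and 't'. The label name
'again' and the input string are arbitrary choices for this example.
The substitution is repeated until it no longer succeeds, squeezing
each run of spaces down to a single space:

```shell
# ':again' marks the loop start; 't again' jumps back only if the
# preceding s command replaced something during this pass.
out=$(echo 'a    b' | sed -e ':again' -e 's/  / /' -e 't again')
printf '%s\n' "$out"
```

Each pass replaces one pair of spaces with a single space, so the
loop ends with 'a b'. Separate '-e' chunks are used so that the label
handling does not depend on GNU-specific semicolon parsing.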

3.7 Commands Specific to GNU 'sed'
==================================

These commands are specific to GNU 'sed', so you must use them with care
and only when you are sure that reduced portability is acceptable. They
allow you to check for GNU 'sed' extensions or to do tasks that are
required quite often, yet are unsupported by standard 'sed's.

'e [COMMAND]'
This command allows one to pipe input from a shell command into
pattern space. Without parameters, the 'e' command executes the
command that is found in pattern space and replaces the pattern
space with the output; a trailing newline is suppressed.

If a parameter is specified, instead, the 'e' command interprets it
as a command and sends its output to the output stream. The
command can run across multiple lines, all but the last ending with
a back-slash.

In both cases, the results are undefined if the command to be
executed contains a NUL character.

Note that, unlike the 'r' command, the output of the command will
be printed immediately; the 'r' command instead delays the output
to the end of the current cycle.
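
The two forms of 'e' can be sketched as follows. This assumes GNU
'sed' running without '--sandbox', and uses harmless 'echo' commands
as the shell commands to execute:

```shell
# Form 1: no parameter. The pattern space ("echo hello") is run as a
# shell command and replaced by its output.
out1=$(echo 'echo hello' | sed 'e')

# Form 2: with a parameter. "echo hi" is run when line 1 is processed,
# and its output is written immediately, before line 1 itself.
out2=$(seq 2 | sed '1e echo hi')

printf '%s\n' "$out1"
printf '%s\n' "$out2"
```

Since 'e' hands text to the shell, it is best avoided on input that
is not fully trusted.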

'F'
Print out the file name of the current input file (with a trailing
newline).

'Q [EXIT-CODE]'
This command accepts only one address.

This command is the same as 'q', but will not print the contents of
pattern space. Like 'q', it provides the ability to return an exit
code to the caller.

This command can be useful because the only alternative ways to
accomplish this apparently trivial function are to use the '-n'
option (which can unnecessarily complicate your script) or to
resort to the following snippet, which wastes time by reading
the whole file without any visible effect:

    :eat
    $d       Quit silently on the last line
    N        Read another line, silently
    g        Overwrite pattern space each time to save memory
    b eat
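
For comparison, a brief sketch of 'Q' itself, assuming GNU 'sed'. The
exit code 7 is an arbitrary value picked for the example:

```shell
# Quit on line 3 without printing it: only lines 1 and 2 appear.
out=$(seq 5 | sed '3Q')
printf '%s\n' "$out"

# Quit on line 2 and hand exit code 7 back to the caller.
status=0
seq 5 | sed -n '2Q7' || status=$?
printf 'exit status: %s\n' "$status"
```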

'R FILENAME'
Queue a line of FILENAME to be read and inserted into the output
stream at the end of the current cycle, or when the next input line
is read. Note that if FILENAME cannot be read, or if its end is
reached, no line is appended, without any error indication.

As with the 'r' command, the special value '/dev/stdin' is
supported for the file name, which reads a line from the standard
input.
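
Because 'R' consumes one line of FILENAME per invocation, it can
interleave two inputs. The sketch below assumes GNU 'sed' and uses a
temporary file, 'extra.txt', that exists only for this illustration:

```shell
# Assumption: GNU sed; the directory and file name are placeholders.
dir=$(mktemp -d)
printf '%s\n' A B > "$dir/extra.txt"

# After each input line, one further line of extra.txt is appended.
# Once extra.txt is exhausted, nothing more is added (and no error is shown).
out=$(seq 3 | sed "R $dir/extra.txt")
printf '%s\n' "$out"

rm -rf "$dir"
```

The expected output is '1', 'A', '2', 'B', '3', one per line.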

'T LABEL'
Branch to LABEL only if there have been no successful
's'ubstitutions since the last input line was read or conditional
branch was taken. The LABEL may be omitted, in which case the next
cycle is started.

'v VERSION'
This command does nothing, but makes 'sed' fail if GNU 'sed'
extensions are not supported, simply because other versions of
'sed' do not implement it. In addition, you can specify the
version of 'sed' that your script requires, such as '4.0.5'. The
default is '4.0' because that is the first version that implemented
this command.

This command enables all GNU extensions even if 'POSIXLY_CORRECT'
is set in the environment.

'W FILENAME'
Write to the given filename the portion of the pattern space up to
the first newline. Everything said under the 'w' command about
file handling holds here too.

'z'
This command empties the content of pattern space. It is usually
the same as 's/.*//', but is more efficient and works in the
presence of invalid multibyte sequences in the input stream. POSIX
mandates that such sequences are _not_ matched by '.', so that
there is no portable way to clear 'sed''s buffers in the middle of
the script in most multibyte locales (including UTF-8 locales).

3.8 Multiple commands syntax
============================

There are several methods to specify multiple commands in a 'sed'
program.

Using newlines is most natural when running a 'sed' script from a file
(using the '-f' option).

On the command line, all 'sed' commands may be separated by newlines.
Alternatively, you may specify each command as an argument to an '-e'
option:

    $ seq 6 | sed '1d
    3d
    5d'
    2
    4
    6

    $ seq 6 | sed -e 1d -e 3d -e 5d
    2
    4
    6

A semicolon (';') may be used to separate most simple commands:

    $ seq 6 | sed '1d;3d;5d'
    2
    4
    6

The '{','}','b','t','T',':' commands can be separated with a
semicolon (this is a non-portable GNU 'sed' extension).

    $ seq 4 | sed '{1d;3d}'
    2
    4

    $ seq 6 | sed '{1d;3d};5d'
    2
    4
    6

Labels used in 'b','t','T',':' commands are read until a semicolon.
Leading and trailing whitespace is ignored. In the examples below the
label is 'x'. The first example works with GNU 'sed'. The second is a
portable equivalent. For more information about branching and labels
*note Branching and flow control::.

    $ seq 3 | sed '/1/b x ; s/^/=/ ; :x ; 3d'
    1
    =2

    $ seq 3 | sed -e '/1/bx' -e 's/^/=/' -e ':x' -e '3d'
    1
    =2

3.8.1 Commands Requiring a newline
----------------------------------

The following commands cannot be separated by a semicolon and require a
newline:

'a','c','i' (append/change/insert)

All characters following 'a','c','i' commands are taken as the text
to append/change/insert. Using a semicolon leads to undesirable
results:

    $ seq 2 | sed '1aHello ; 2d'
    1
    Hello ; 2d
    2

Separate the commands using '-e' or a newline:

    $ seq 2 | sed -e 1aHello -e 2d
    1
    Hello

    $ seq 2 | sed '1aHello
    2d'
    1
    Hello

Note that specifying the text to add ('Hello') immediately after
'a','c','i' is itself a GNU 'sed' extension. A portable,
POSIX-compliant alternative is:

    $ seq 2 | sed '1a\
    Hello
    2d'
    1
    Hello

'#' (comment)

All characters following '#' until the next newline are ignored.

    $ seq 3 | sed '# this is a comment ; 2d'
    1
    2
    3

    $ seq 3 | sed '# this is a comment
    2d'
    1
    3

'r','R','w','W' (reading and writing files)

The 'r','R','w','W' commands parse the filename until end of the
line. If whitespace, comments or semicolons are found, they will
be included in the filename, leading to unexpected results:

    $ seq 2 | sed '1w hello.txt ; 2d'
    1
    2

    $ ls -log
    total 4
    -rw-rw-r-- 1 2 Jan 23 23:03 hello.txt ; 2d

    $ cat 'hello.txt ; 2d'
    1

Note that 'sed' silently ignores read/write errors in
'r','R','w','W' commands (such as missing files). In the following
example, 'sed' tries to read a file named ''hello.txt ; N''. The
file is missing, and the error is silently ignored:

    $ echo x | sed '1rhello.txt ; N'
    x

'e' (command execution)

Any characters following the 'e' command until the end of the line
will be sent to the shell. If whitespace, comments or semicolons
are found, they will be included in the shell command, leading to
unexpected results:

    $ echo a | sed '1e touch foo#bar'
    a

    $ ls -1
    foo#bar

    $ echo a | sed '1e touch foo ; s/a/b/'
    sh: 1: s/a/b/: not found
    a

's///[we]' (substitute with 'e' or 'w' flags)

In a substitution command, the 'w' flag writes the substitution
result to a file, and the 'e' flag executes the substitution result
as a shell command. As with the 'r/R/w/W/e' commands, these must
be terminated with a newline. If whitespace, comments or
semicolons are found, they will be included in the shell command or
filename, leading to unexpected results:

    $ echo a | sed 's/a/b/w1.txt#foo'
    b

    $ ls -1
    1.txt#foo

4 Addresses: selecting lines
****************************

4.1 Addresses overview
======================

Addresses determine on which line(s) the 'sed' command will be executed.
The following command replaces the word 'hello' with 'world' only on
line 144:

    sed '144s/hello/world/' input.txt > output.txt

If no addresses are given, the command is performed on all lines.
The following command replaces the word 'hello' with 'world' on all
lines in the input file:

    sed 's/hello/world/' input.txt > output.txt

Addresses can contain regular expressions to match lines based on
content instead of line numbers. The following command replaces the
word 'hello' with 'world' only in lines containing the word 'apple':

    sed '/apple/s/hello/world/' input.txt > output.txt

An address range is specified with two addresses separated by a comma
(','). Addresses can be numeric, regular expressions, or a mix of both.
The following command replaces the word 'hello' with 'world' only in
lines 4 to 17 (inclusive):

    sed '4,17s/hello/world/' input.txt > output.txt

Appending the '!' character to the end of an address specification
(before the command letter) negates the sense of the match. That is, if
the '!' character follows an address or an address range, then only
lines which do _not_ match the addresses will be selected. The
following command replaces the word 'hello' with 'world' only in lines
_not_ containing the word 'apple':

    sed '/apple/!s/hello/world/' input.txt > output.txt

The following command replaces the word 'hello' with 'world' only in
lines 1 to 3 and 18 till the last line of the input file (i.e.
excluding lines 4 to 17):

    sed '4,17!s/hello/world/' input.txt > output.txt
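
The effect of '!' is easy to observe on generated input. The sketch
below uses 'seq' in place of a real input file:

```shell
# Print every line that is NOT in the range 2..4.
outside=$(seq 5 | sed -n '2,4!p')

# Delete every line that does NOT contain a 3.
only3=$(seq 5 | sed '/3/!d')

printf '%s\n' "$outside"
printf '%s\n' "$only3"
```

The first command should print '1' and '5'; the second should print
only '3'.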

4.2 Selecting lines by numbers
==============================

Addresses in a 'sed' script can be in any of the following forms:
'NUMBER'
Specifying a line number will match only that line in the input.
(Note that 'sed' counts lines continuously across all input files
unless '-i' or '-s' options are specified.)

'$'
This address matches the last line of the last file of input, or
the last line of each file when the '-i' or '-s' options are
specified.

'FIRST~STEP'
This GNU extension matches every STEPth line starting with line
FIRST. In particular, lines will be selected when there exists a
non-negative N such that the current line-number equals FIRST + (N
* STEP). Thus, one would use '1~2' to select the odd-numbered
lines and '0~2' for even-numbered lines; to pick every third line
starting with the second, '2~3' would be used; to pick every fifth
line starting with the tenth, use '10~5'; and '50~0' is just an
obscure way of saying '50'.

The following commands demonstrate the step address usage:

    $ seq 10 | sed -n '0~4p'
    4
    8

    $ seq 10 | sed -n '1~3p'
    1
    4
    7
    10
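
The difference that '-s' makes to the '$' address can be sketched
with two small files. This assumes GNU 'sed' (for '-s'); the
temporary directory and the names 'a.txt' and 'b.txt' are
placeholders for the example:

```shell
# Assumption: GNU sed; the files exist only for this sketch.
dir=$(mktemp -d)
seq 1 3 > "$dir/a.txt"
seq 4 6 > "$dir/b.txt"

# Without -s the files form one stream, so '$' is the last line of b.txt.
all=$(sed -n '$p' "$dir/a.txt" "$dir/b.txt")

# With -s each file is separate, so '$' matches once per file.
each=$(sed -s -n '$p' "$dir/a.txt" "$dir/b.txt")

printf 'joined: %s\n' "$all"
printf 'separate:\n%s\n' "$each"

rm -rf "$dir"
```

The first command should print '6'; the second should print '3' and
'6'.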

4.3 selecting lines by text matching
====================================

GNU 'sed' supports the following regular expression addresses. The
default regular expression is *note Basic Regular Expression (BRE): BRE
syntax. If '-E' or '-r' options are used, the regular expression should
be in *note Extended Regular Expression (ERE): ERE syntax.
*Note BRE vs ERE::.

'/REGEXP/'
This will select any line which matches the regular expression
REGEXP. If REGEXP itself includes any '/' characters, each must be
escaped by a backslash ('\').

The following command prints lines in '/etc/passwd' which end with
'bash'(1):

    sed -n '/bash$/p' /etc/passwd

The empty regular expression '//' repeats the last regular
expression match (the same holds if the empty regular expression is
passed to the 's' command). Note that modifiers to regular
expressions are evaluated when the regular expression is compiled,
thus it is invalid to specify them together with the empty regular
expression.

'\%REGEXP%'
(The '%' may be replaced by any other single character.)

This also matches the regular expression REGEXP, but allows one to
use a different delimiter than '/'. This is particularly useful if
the REGEXP itself contains a lot of slashes, since it avoids the
tedious escaping of every '/'. If REGEXP itself includes any
delimiter characters, each must be escaped by a backslash ('\').

The following commands are equivalent. They print lines which
start with '/home/alice/documents/':

    sed -n '/^\/home\/alice\/documents\//p'
    sed -n '\%^/home/alice/documents/%p'
    sed -n '\;^/home/alice/documents/;p'

1525 |
'/REGEXP/I' |
1526 |
'\%REGEXP%I' |
1527 |
The 'I' modifier to regular-expression matching is a GNU extension |
1528 |
which causes the REGEXP to be matched in a case-insensitive manner. |
1529 |
|
1530 |
In many other programming languages, a lower case 'i' is used for |
1531 |
case-insensitive regular expression matching. However, in 'sed' |
1532 |
the 'i' is used for the insert command (*note insert command::). |
1533 |
|
1534 |
Observe the difference between the following examples. |
1535 |
|
1536 |
In this example, '/b/I' is the address: regular expression with 'I' |
1537 |
modifier. 'd' is the delete command: |
1538 |
|
1539 |
$ printf "%s\n" a b c | sed '/b/Id' |
1540 |
a |
1541 |
c |
1542 |
|
1543 |
Here, '/b/' is the address: a regular expression. 'i' is the |
1544 |
insert command. 'd' is the value to insert. A line with 'd' is |
1545 |
then inserted above the matched line: |
1546 |
|
1547 |
$ printf "%s\n" a b c | sed '/b/id' |
1548 |
a |
1549 |
d |
1550 |
b |
1551 |
c |
1552 |
|
1553 |
'/REGEXP/M' |
1554 |
'\%REGEXP%M' |
1555 |
The 'M' modifier to regular-expression matching is a GNU 'sed' |
1556 |
extension which directs GNU 'sed' to match the regular expression |
1557 |
in 'multi-line' mode. The modifier causes '^' and '$' to match |
1558 |
respectively (in addition to the normal behavior) the empty string |
1559 |
after a newline, and the empty string before a newline. There are |
1560 |
special character sequences ('\`' and '\'') which always match the |
1561 |
beginning or the end of the buffer. In addition, the period |
1562 |
character does not match a new-line character in multi-line mode. |
1563 |
|
1564 |
Regex addresses operate on the content of the current pattern space. |
1565 |
If the pattern space is changed (for example with 's///' command) the |
1566 |
regular expression matching will operate on the changed text. |
1567 |
|
1568 |
In the following example, automatic printing is disabled with '-n'. |
1569 |
The 's/2/X/' command changes lines containing '2' to 'X'. The command |
1570 |
'/[0-9]/p' matches lines with digits and prints them. Because the |
1571 |
second line is changed before the '/[0-9]/' regex, it will not match and |
1572 |
will not be printed: |
1573 |
|
1574 |
$ seq 3 | sed -n 's/2/X/ ; /[0-9]/p' |
1575 |
1 |
1576 |
3 |
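
Two of the forms described in this section have no example of their
own: the empty regular expression '//' and the 'M' modifier. The
sketch below, which assumes GNU 'sed' for the 'M' modifier, uses
made-up input to show both:

```shell
# '//' reuses the last regular expression (here 'foo' from the address),
# so the s command replaces every 'foo' without repeating the pattern.
reuse=$(echo 'foo foo' | sed '/foo/s//bar/g')

# After N the pattern space is "a<newline>b". With M, '^' and '$' also
# match around the embedded newline, so /^b$/M selects it.
with_m=$(printf 'a\nb\n' | sed -n 'N; /^b$/Mp')

# Without M, '^' and '$' match only at the buffer edges: nothing is printed.
without_m=$(printf 'a\nb\n' | sed -n 'N; /^b$/p')

printf '%s\n' "$reuse"
printf '%s\n' "$with_m"
printf 'without M: [%s]\n' "$without_m"
```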

---------- Footnotes ----------

(1) There are of course many other ways to do the same, e.g.
    grep 'bash$' /etc/passwd
    awk -F: '$7 == "/bin/bash"' /etc/passwd

4.4 Range Addresses
===================

An address range can be specified by specifying two addresses separated
by a comma (','). An address range matches lines starting from where
the first address matches, and continues until the second address
matches (inclusively):

    $ seq 10 | sed -n '4,6p'
    4
    5
    6

If the second address is a REGEXP, then checking for the ending match
will start with the line _following_ the line which matched the first
address: a range will always span at least two lines (except of course
if the input stream ends).

    $ seq 10 | sed -n '4,/[0-9]/p'
    4
    5

If the second address is a NUMBER less than (or equal to) the line
matching the first address, then only the one line is matched:

    $ seq 10 | sed -n '4,1p'
    4

1612 |
GNU 'sed' also supports some special two-address forms; all these are |
1613 |
GNU extensions: |
1614 |
'0,/REGEXP/' |
1615 |
A line number of '0' can be used in an address specification like |
1616 |
'0,/REGEXP/' so that 'sed' will try to match REGEXP in the first |
1617 |
input line too. In other words, '0,/REGEXP/' is similar to |
1618 |
'1,/REGEXP/', except that if ADDR2 matches the very first line of |
1619 |
input the '0,/REGEXP/' form will consider it to end the range, |
1620 |
whereas the '1,/REGEXP/' form will match the beginning of its range |
1621 |
and hence make the range span up to the _second_ occurrence of the |
1622 |
regular expression. |
1623 |
|
1624 |
Note that this is the only place where the '0' address makes sense; |
1625 |
there is no 0-th line and commands which are given the '0' address |
1626 |
in any other way will give an error. |
1627 |
|
1628 |
The following examples demonstrate the difference between starting |
1629 |
with address 1 and 0: |
1630 |
|
1631 |
$ seq 10 | sed -n '1,/[0-9]/p' |
1632 |
1 |
1633 |
2 |
1634 |
|
1635 |
$ seq 10 | sed -n '0,/[0-9]/p' |
1636 |
1 |
1637 |
|
1638 |
'ADDR1,+N' |
1639 |
Matches ADDR1 and the N lines following ADDR1. |
1640 |
|
1641 |
$ seq 10 | sed -n '6,+2p' |
1642 |
6 |
1643 |
7 |
1644 |
8 |
1645 |
|
1646 |
ADDR1 can be a line number or a regular expression. |
1647 |
|
1648 |
'ADDR1,~N' |
1649 |
Matches ADDR1 and the lines following ADDR1 until the next line |
1650 |
whose input line number is a multiple of N. The following command |
1651 |
prints starting at line 6, until the next line which is a multiple |
1652 |
of 4 (i.e. line 8): |
1653 |
|
1654 |
$ seq 10 | sed -n '6,~4p' |
1655 |
6 |
1656 |
7 |
1657 |
8 |
1658 |
|
1659 |
ADDR1 can be a line number or a regular expression. |
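
The three special forms above can be compared side by side. This is a
sketch only, assuming GNU 'sed' (all three forms are GNU extensions);
the variable names and sample input are illustrative and not part of
'sed' itself:

```shell
# 0,/REGEXP/: the range may end on line 1, so only the first "foo" changes.
first_only=$(printf 'foo\nfoo\nfoo\n' | sed '0,/foo/s/foo/bar/')

# 1,/REGEXP/: the end is searched from line 2 on, so two lines change.
first_two=$(printf 'foo\nfoo\nfoo\n' | sed '1,/foo/s/foo/bar/')

# /REGEXP/,+N: every line containing 3 or 7, plus the one line after it.
plus_n=$(seq 10 | sed -n '/[37]/,+1p')

# /REGEXP/,~N: from the line containing 5 up to the next multiple of 4.
mod_n=$(seq 10 | sed -n '/5/,~4p')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' \
  "$first_only" "$first_two" "$plus_n" "$mod_n"
```

The '0,/REGEXP/' form is a common way to edit only the first matching
line of a file, even when that match is on the very first line.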
1660 |
|
1661 |
5 Regular Expressions: selecting text |
1662 |
************************************* |
1663 |
|
1664 |
5.1 Overview of regular expression in 'sed' |
1665 |
=========================================== |
1666 |
|
1667 |
To use 'sed' effectively, you need to understand regular expressions
("regexp" for short). A regular expression is a pattern that is matched
against a subject string from left to right. Most characters are
"ordinary": they stand for themselves in a pattern, and match the
corresponding characters. Regular expressions in 'sed' are specified
between two slashes.
1673 |
|
1674 |
The following command prints lines containing the word 'hello': |
1675 |
|
1676 |
sed -n '/hello/p' |
1677 |
|
1678 |
The above example is equivalent to this 'grep' command: |
1679 |
|
1680 |
grep 'hello' |
1681 |
|
1682 |
The power of regular expressions comes from the ability to include |
1683 |
alternatives and repetitions in the pattern. These are encoded in the |
1684 |
pattern by the use of "special characters", which do not stand for |
1685 |
themselves but instead are interpreted in some special way. |
1686 |
|
1687 |
The character '^' (caret) in a regular expression matches the |
1688 |
beginning of the line. The character '.' (dot) matches any single |
1689 |
character. The following 'sed' command matches and prints lines which |
1690 |
start with the letter 'b', followed by any single character, followed by |
1691 |
the letter 'd': |
1692 |
|
1693 |
$ printf "%s\n" abode bad bed bit bid byte body | sed -n '/^b.d/p' |
1694 |
bad |
1695 |
bed |
1696 |
bid |
1697 |
body |
1698 |
|
1699 |
The following sections explain the meaning and usage of special |
1700 |
characters in regular expressions. |
1701 |
|
1702 |
5.2 Basic (BRE) and extended (ERE) regular expression |
1703 |
===================================================== |
1704 |
|
1705 |
Basic and extended regular expressions are two variations on the syntax |
1706 |
of the specified pattern. Basic Regular Expression (BRE) syntax is the |
1707 |
default in 'sed' (and similarly in 'grep'). Use the POSIX-specified |
1708 |
'-E' option ('-r', '--regexp-extended') to enable Extended Regular |
1709 |
Expression (ERE) syntax. |
1710 |
|
1711 |
In GNU 'sed', the only difference between basic and extended regular |
1712 |
expressions is in the behavior of a few special characters: '?', '+', |
1713 |
parentheses, braces ('{}'), and '|'. |
1714 |
|
1715 |
With basic (BRE) syntax, these characters do not have special meaning
unless prefixed with a backslash ('\'), while with extended (ERE) syntax
it is reversed: these characters are special unless they are prefixed
with a backslash ('\').
1719 |
|
1720 |
Desired pattern     Basic (BRE) Syntax          Extended (ERE) Syntax
--------------------------------------------------------------------------
literal '+' (plus   $ echo 'a+b=c' > foo        $ echo 'a+b=c' > foo
sign)               $ sed -n '/a+b/p' foo       $ sed -E -n '/a\+b/p' foo
                    a+b=c                       a+b=c

One or more 'a'     $ echo aab > foo            $ echo aab > foo
characters          $ sed -n '/a\+b/p' foo      $ sed -E -n '/a+b/p' foo
followed by 'b'     aab                         aab
(plus sign as
special
meta-character)
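
The same reversal applies to the other characters listed above. The
following sketch, which assumes GNU 'sed' (since '\?' and '\|' are GNU
extensions in basic syntax), writes each pattern in both syntaxes; the
variable names and sample words are illustrative:

```shell
# Optional 'u': '\?' in BRE, '?' in ERE.
bre_opt=$(printf '%s\n' color colour colr | sed -n '/colou\?r/p')
ere_opt=$(printf '%s\n' color colour colr | sed -E -n '/colou?r/p')

# Grouping and alternation: backslashed in BRE, bare in ERE.
bre_alt=$(echo 'cat dog bird' | sed 's/\(cat\|dog\)/[\1]/g')
ere_alt=$(echo 'cat dog bird' | sed -E 's/(cat|dog)/[\1]/g')

printf '%s\n%s\n%s\n%s\n' "$bre_opt" "$ere_opt" "$bre_alt" "$ere_alt"
```

Both variants of each pair produce the same result; only the escaping
differs.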
1733 |
|
1734 |
5.3 Overview of basic regular expression syntax |
1735 |
=============================================== |
1736 |
|
1737 |
Here is a brief description of regular expression syntax as used in |
1738 |
'sed'. |
1739 |
|
1740 |
'CHAR' |
1741 |
A single ordinary character matches itself. |
1742 |
|
1743 |
'*' |
1744 |
Matches a sequence of zero or more instances of matches for the |
1745 |
preceding regular expression, which must be an ordinary character, |
1746 |
a special character preceded by '\', a '.', a grouped regexp (see |
1747 |
below), or a bracket expression. As a GNU extension, a postfixed |
1748 |
regular expression can also be followed by '*'; for example, 'a**' |
1749 |
is equivalent to 'a*'. POSIX 1003.1-2001 says that '*' stands for |
1750 |
itself when it appears at the start of a regular expression or |
1751 |
subexpression, but many non-GNU implementations do not support this
1752 |
and portable scripts should instead use '\*' in these contexts. |
1753 |
'.' |
1754 |
Matches any character, including newline. |
1755 |
|
1756 |
'^' |
1757 |
Matches the null string at beginning of the pattern space, i.e. |
1758 |
what appears after the circumflex must appear at the beginning of |
1759 |
the pattern space. |
1760 |
|
1761 |
In most scripts, pattern space is initialized to the content of |
1762 |
each line (*note How 'sed' works: Execution Cycle.). So, it is a |
1763 |
useful simplification to think of '^#include' as matching only |
1764 |
lines where '#include' is the first thing on line--if there are |
1765 |
spaces before, for example, the match fails. This simplification |
1766 |
is valid as long as the original content of pattern space is not |
1767 |
modified, for example with an 's' command. |
1768 |
|
1769 |
'^' acts as a special character only at the beginning of the |
1770 |
regular expression or subexpression (that is, after '\(' or '\|'). |
1771 |
Portable scripts should avoid '^' at the beginning of a |
1772 |
subexpression, though, as POSIX allows implementations that treat |
1773 |
'^' as an ordinary character in that context. |
1774 |
|
1775 |
'$' |
1776 |
It is the same as '^', but refers to end of pattern space. '$' |
1777 |
also acts as a special character only at the end of the regular |
1778 |
expression or subexpression (that is, before '\)' or '\|'), and its |
1779 |
use at the end of a subexpression is not portable. |
1780 |
|
1781 |
'[LIST]' |
1782 |
'[^LIST]' |
1783 |
Matches any single character in LIST: for example, '[aeiou]' |
1784 |
matches all vowels. A list may include sequences like |
1785 |
'CHAR1-CHAR2', which matches any character between (inclusive) |
1786 |
CHAR1 and CHAR2. *Note Character Classes and Bracket |
1787 |
Expressions::. |
1788 |
|
1789 |
'\+' |
1790 |
As '*', but matches one or more. It is a GNU extension. |
1791 |
|
1792 |
'\?' |
1793 |
As '*', but only matches zero or one. It is a GNU extension. |
1794 |
|
1795 |
'\{I\}' |
1796 |
As '*', but matches exactly I sequences (I is a decimal integer; |
1797 |
for portability, keep it between 0 and 255 inclusive). |
1798 |
|
1799 |
'\{I,J\}' |
1800 |
Matches between I and J, inclusive, sequences. |
1801 |
|
1802 |
'\{I,\}' |
1803 |
Matches I or more sequences.
1804 |
|
1805 |
'\(REGEXP\)' |
1806 |
Groups the inner REGEXP as a whole. This is used to:
1807 |
|
1808 |
* Apply postfix operators, like '\(abcd\)*': this will search |
1809 |
for zero or more whole sequences of 'abcd', while 'abcd*' |
1810 |
would search for 'abc' followed by zero or more occurrences of |
1811 |
'd'. Note that support for '\(abcd\)*' is required by POSIX |
1812 |
1003.1-2001, but many non-GNU implementations do not support |
1813 |
it and hence it is not universally portable. |
1814 |
|
1815 |
* Use back references (see below). |
1816 |
|
1817 |
'REGEXP1\|REGEXP2' |
1818 |
Matches either REGEXP1 or REGEXP2. Use parentheses to use complex |
1819 |
alternative regular expressions. The matching process tries each |
1820 |
alternative in turn, from left to right, and the first one that |
1821 |
succeeds is used. It is a GNU extension. |
1822 |
|
1823 |
'REGEXP1REGEXP2' |
1824 |
Matches the concatenation of REGEXP1 and REGEXP2. Concatenation |
1825 |
binds more tightly than '\|', '^', and '$', but less tightly than |
1826 |
the other regular expression operators. |
1827 |
|
1828 |
'\DIGIT' |
1829 |
Matches the DIGIT-th '\(...\)' parenthesized subexpression in the |
1830 |
regular expression. This is called a "back reference". |
1831 |
Subexpressions are implicitly numbered by counting occurrences of |
1832 |
'\(' left-to-right. |
1833 |
|
1834 |
'\n' |
1835 |
Matches the newline character. |
1836 |
|
1837 |
'\CHAR' |
1838 |
Matches CHAR, where CHAR is one of '$', '*', '.', '[', '\', or '^'. |
1839 |
Note that the only C-like backslash sequences that you can portably |
1840 |
assume to be interpreted are '\n' and '\\'; in particular '\t' is |
1841 |
not portable, and matches a 't' under most implementations of |
1842 |
'sed', rather than a tab character. |
1843 |
|
1844 |
Note that the regular expression matcher is greedy, i.e., matches are |
1845 |
attempted from left to right and, if two or more matches are possible |
1846 |
starting at the same character, it selects the longest. |
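
Greedy matching is easiest to see with '.*'. The sketch below uses a
made-up input string and illustrative variable names; it contrasts '.*'
with a bracket expression that cannot cross a closing parenthesis:

```shell
# Greedy: '(.*)' runs from the first '(' to the last ')'.
greedy=$(echo 'foo(bar)(baz)' | sed 's/(.*)/X/')

# A bracket expression that excludes ')' stops at the first ')'.
bounded=$(echo 'foo(bar)(baz)' | sed 's/([^)]*)/X/')

printf '%s\n%s\n' "$greedy" "$bounded"
```

The first command prints 'fooX' and the second prints 'fooX(baz)'.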
1847 |
|
1848 |
Examples: |
1849 |
'abcdef' |
1850 |
Matches 'abcdef'. |
1851 |
|
1852 |
'a*b' |
1853 |
Matches zero or more 'a's followed by a single 'b'. For example, |
1854 |
'b' or 'aaaaab'. |
1855 |
|
1856 |
'a\?b' |
1857 |
Matches 'b' or 'ab'. |
1858 |
|
1859 |
'a\+b\+' |
1860 |
Matches one or more 'a's followed by one or more 'b's: 'ab' is the |
1861 |
shortest possible match, but other examples are 'aaaab' or 'abbbbb' |
1862 |
or 'aaaaaabbbbbbb'. |
1863 |
|
1864 |
'.*' |
1865 |
'.\+' |
1866 |
These two both match all the characters in a string; however, the |
1867 |
first matches every string (including the empty string), while the |
1868 |
second matches only strings containing at least one character. |
1869 |
|
1870 |
'^main.*(.*)' |
1871 |
This matches a string starting with 'main', followed by an opening |
1872 |
and closing parenthesis. The 'n', '(' and ')' need not be |
1873 |
adjacent. |
1874 |
|
1875 |
'^#' |
1876 |
This matches a string beginning with '#'. |
1877 |
|
1878 |
'\\$' |
1879 |
This matches a string ending with a single backslash. The regexp |
1880 |
contains two backslashes for escaping. |
1881 |
|
1882 |
'\$' |
1883 |
Instead, this matches a string consisting of a single dollar sign, |
1884 |
because it is escaped. |
1885 |
|
1886 |
'[a-zA-Z0-9]' |
1887 |
In the C locale, this matches any ASCII letters or digits. |
1888 |
|
1889 |
'[^ '<TAB>']\+' |
1890 |
(Here '<TAB>' stands for a single tab character.) This matches a |
1891 |
string of one or more characters, none of which is a space or a |
1892 |
tab. Usually this means a word. |
1893 |
|
1894 |
'^\(.*\)\n\1$' |
1895 |
This matches a string consisting of two equal substrings separated |
1896 |
by a newline. |
1897 |
|
1898 |
'.\{9\}A$' |
1899 |
This matches nine characters followed by an 'A' at the end of a |
1900 |
line. |
1901 |
|
1902 |
'^.\{15\}A' |
1903 |
This matches the start of a string that contains 16 characters, the |
1904 |
last of which is an 'A'. |
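
Two of the patterns above can be tried directly. This is a minimal
sketch with illustrative input and variable names; the second command
uses the 'N' command to place two input lines in the pattern space so
that '\n' has something to match:

```shell
# Interval expression: lines made of exactly two or three 'a's.
exact=$(printf '%s\n' a aa aaa aaaa | sed -n '/^a\{2,3\}$/p')

# Back reference across a newline: print the pair of lines only when
# the second line is identical to the first.
dup=$(printf 'same\nsame\n' | sed -n 'N;/^\(.*\)\n\1$/p')
nodup=$(printf 'one\ntwo\n' | sed -n 'N;/^\(.*\)\n\1$/p')

printf '%s\n---\n%s\n---\n%s\n' "$exact" "$dup" "$nodup"
```

The last command prints nothing, because the two lines differ.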
1905 |
|
1906 |
5.4 Overview of extended regular expression syntax |
1907 |
================================================== |
1908 |
|
1909 |
The only difference between basic and extended regular expressions is in |
1910 |
the behavior of a few characters: '?', '+', parentheses, braces ('{}'), |
1911 |
and '|'. While basic regular expressions require these to be escaped if |
1912 |
you want them to behave as special characters, when using extended |
1913 |
regular expressions you must escape them if you want them _to match a |
1914 |
literal character_. '|' is special here because '\|' is a GNU extension |
1915 |
- standard basic regular expressions do not provide its functionality. |
1916 |
|
1917 |
Examples: |
1918 |
'abc?' |
1919 |
becomes 'abc\?' when using extended regular expressions. It |
1920 |
matches the literal string 'abc?'. |
1921 |
|
1922 |
'c\+' |
1923 |
becomes 'c+' when using extended regular expressions. It matches |
1924 |
one or more 'c's. |
1925 |
|
1926 |
'a\{3,\}' |
1927 |
becomes 'a{3,}' when using extended regular expressions. It |
1928 |
matches three or more 'a's. |
1929 |
|
1930 |
'\(abc\)\{2,3\}' |
1931 |
becomes '(abc){2,3}' when using extended regular expressions. It |
1932 |
matches either 'abcabc' or 'abcabcabc'. |
1933 |
|
1934 |
'\(abc*\)\1' |
1935 |
becomes '(abc*)\1' when using extended regular expressions. |
1936 |
Backreferences must still be escaped when using extended regular |
1937 |
expressions. |
1938 |
|
1939 |
'a\|b' |
1940 |
becomes 'a|b' when using extended regular expressions. It matches |
1941 |
'a' or 'b'. |
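
A short sketch of the extended forms in use, assuming a 'sed' that
accepts '-E'; the input strings and variable names are illustrative:

```shell
# '(abc){2,3}' is greedy and consumes all three repetitions.
rep=$(echo 'abcabcabc abc' | sed -E 's/(abc){2,3}/X/')

# Unescaped '?' makes the 'c' optional; the '?' in the input is untouched.
opt=$(echo 'abc? abcc' | sed -E 's/abc?/X/')

# Escaped '\?' matches a literal question mark.
lit=$(echo 'abc? abcc' | sed -E 's/abc\?/X/')

printf '%s\n%s\n%s\n' "$rep" "$opt" "$lit"
```

The three commands print 'X abc', 'X? abcc' and 'X abcc' respectively.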
1942 |
|
1943 |
5.5 Character Classes and Bracket Expressions |
1944 |
============================================= |
1945 |
|
1946 |
A "bracket expression" is a list of characters enclosed by '[' and ']'. |
1947 |
It matches any single character in that list; if the first character of |
1948 |
the list is the caret '^', then it matches any character *not* in the |
1949 |
list. For example, the following command replaces the words 'gray' or |
1950 |
'grey' with 'blue': |
1951 |
|
1952 |
sed 's/gr[ae]y/blue/' |
1953 |
|
1954 |
Bracket expressions can be used in both *note basic: BRE syntax. and |
1955 |
*note extended: ERE syntax. regular expressions (that is, with or |
1956 |
without the '-E'/'-r' options). |
1957 |
|
1958 |
Within a bracket expression, a "range expression" consists of two |
1959 |
characters separated by a hyphen. It matches any single character that |
1960 |
sorts between the two characters, inclusive. In the default C locale, |
1961 |
the sorting sequence is the native character order; for example, '[a-d]' |
1962 |
is equivalent to '[abcd]'. |
1963 |
|
1964 |
Finally, certain named classes of characters are predefined within |
1965 |
bracket expressions, as follows. |
1966 |
|
1967 |
These named classes must be used _inside_ brackets themselves. |
1968 |
Correct usage: |
1969 |
$ echo 1 | sed 's/[[:digit:]]/X/' |
1970 |
X |
1971 |
|
1972 |
Incorrect usage is rejected by newer 'sed' versions. Older versions |
1973 |
accepted it but treated it as a single bracket expression (which is |
1974 |
equivalent to '[dgit:]', that is, only the characters D/G/I/T/:): |
1975 |
# current GNU sed versions - incorrect usage rejected |
1976 |
$ echo 1 | sed 's/[:digit:]/X/' |
1977 |
sed: character class syntax is [[:space:]], not [:space:] |
1978 |
|
1979 |
# older GNU sed versions |
1980 |
$ echo 1 | sed 's/[:digit:]/X/' |
1981 |
1 |
1982 |
|
1983 |
'[:alnum:]' |
1984 |
Alphanumeric characters: '[:alpha:]' and '[:digit:]'; in the 'C' |
1985 |
locale and ASCII character encoding, this is the same as |
1986 |
'[0-9A-Za-z]'. |
1987 |
|
1988 |
'[:alpha:]' |
1989 |
Alphabetic characters: '[:lower:]' and '[:upper:]'; in the 'C' |
1990 |
locale and ASCII character encoding, this is the same as |
1991 |
'[A-Za-z]'. |
1992 |
|
1993 |
'[:blank:]' |
1994 |
Blank characters: space and tab. |
1995 |
|
1996 |
'[:cntrl:]' |
1997 |
Control characters. In ASCII, these characters have octal codes |
1998 |
000 through 037, and 177 (DEL). In other character sets, these are |
1999 |
the equivalent characters, if any. |
2000 |
|
2001 |
'[:digit:]' |
2002 |
Digits: '0 1 2 3 4 5 6 7 8 9'. |
2003 |
|
2004 |
'[:graph:]' |
2005 |
Graphical characters: '[:alnum:]' and '[:punct:]'. |
2006 |
|
2007 |
'[:lower:]' |
2008 |
Lower-case letters; in the 'C' locale and ASCII character encoding, |
2009 |
this is 'a b c d e f g h i j k l m n o p q r s t u v w x y z'. |
2010 |
|
2011 |
'[:print:]' |
2012 |
Printable characters: '[:alnum:]', '[:punct:]', and space. |
2013 |
|
2014 |
'[:punct:]' |
2015 |
Punctuation characters; in the 'C' locale and ASCII character |
2016 |
encoding, this is '! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ |
2017 |
] ^ _ ` { | } ~'. |
2018 |
|
2019 |
'[:space:]' |
2020 |
Space characters: in the 'C' locale, this is tab, newline, vertical |
2021 |
tab, form feed, carriage return, and space. |
2022 |
|
2023 |
'[:upper:]' |
2024 |
Upper-case letters: in the 'C' locale and ASCII character encoding, |
2025 |
this is 'A B C D E F G H I J K L M N O P Q R S T U V W X Y Z'. |
2026 |
|
2027 |
'[:xdigit:]' |
2028 |
Hexadecimal digits: '0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f'. |
2029 |
|
2030 |
Note that the brackets in these class names are part of the symbolic |
2031 |
names, and must be included in addition to the brackets delimiting the |
2032 |
bracket expression. |
2033 |
|
2034 |
Most meta-characters lose their special meaning inside bracket |
2035 |
expressions: |
2036 |
|
2037 |
']' |
2038 |
ends the bracket expression if it's not the first list item. So, |
2039 |
if you want to make the ']' character a list item, you must put it |
2040 |
first. |
2041 |
|
2042 |
'-' |
2043 |
represents the range if it's not first or last in a list or the |
2044 |
ending point of a range. |
2045 |
|
2046 |
'^' |
2047 |
represents the characters not in the list. If you want to make the |
2048 |
'^' character a list item, place it anywhere but first. |
2049 |
|
2050 |
2051 |
|
2052 |
The characters '$', '*', '.', '[', and '\' are normally not special |
2053 |
within LIST. For example, '[\*]' matches either '\' or '*', because the |
2054 |
'\' is not special here. However, strings like '[.ch.]', '[=a=]', and |
2055 |
'[:space:]' are special within LIST and represent collating symbols, |
2056 |
equivalence classes, and character classes, respectively, and '[' is |
2057 |
therefore special within LIST when it is followed by '.', '=', or ':'. |
2058 |
Also, when not in 'POSIXLY_CORRECT' mode, special escapes like '\n' and |
2059 |
'\t' are recognized within LIST. *Note Escapes::. |
2060 |
|
2061 |
'[.' |
2062 |
represents the open collating symbol. |
2063 |
|
2064 |
'.]' |
2065 |
represents the close collating symbol. |
2066 |
|
2067 |
'[=' |
2068 |
represents the open equivalence class. |
2069 |
|
2070 |
'=]' |
2071 |
represents the close equivalence class. |
2072 |
|
2073 |
'[:' |
2074 |
represents the open character class symbol, and should be followed |
2075 |
by a valid character class name. |
2076 |
|
2077 |
':]' |
2078 |
represents the close character class symbol. |
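
The placement rules for ']', '^' and '-' can be combined in a single
list. The following sketch uses made-up input and illustrative variable
names; 'printf' is used instead of 'echo' so that the backslash in the
input is passed through unchanged:

```shell
# ']' first, '^' not first, '-' last: all three are literal list items.
lits=$(printf '%s\n' 'a]b-c^d' | sed 's/[]^-]/_/g')

# Inside a list, '\' and '*' are ordinary characters.
bs=$(printf '%s\n' 'a\b*c' | sed 's/[\*]/_/g')

# Named classes go inside the outer brackets and can be combined.
cls=$(echo 'Hello World 42' | sed 's/[[:upper:][:digit:]]/#/g')

printf '%s\n%s\n%s\n' "$lits" "$bs" "$cls"
```

The commands print 'a_b_c_d', 'a_b_c' and '#ello #orld ##'.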
2079 |
|
2080 |
5.6 regular expression extensions |
2081 |
================================= |
2082 |
|
2083 |
The following sequences have special meaning inside regular expressions |
2084 |
(used in *note addresses: Regexp Addresses. and the 's' command). |
2085 |
|
2086 |
These can be used in both *note basic: BRE syntax. and *note |
2087 |
extended: ERE syntax. regular expressions (that is, with or without the |
2088 |
'-E'/'-r' options). |
2089 |
|
2090 |
'\w' |
2091 |
Matches any "word" character. A "word" character is any letter or |
2092 |
digit or the underscore character. |
2093 |
|
2094 |
$ echo "abc %-= def." | sed 's/\w/X/g' |
2095 |
XXX %-= XXX. |
2096 |
|
2097 |
'\W' |
2098 |
Matches any "non-word" character. |
2099 |
|
2100 |
$ echo "abc %-= def." | sed 's/\W/X/g' |
2101 |
abcXXXXXdefX |
2102 |
|
2103 |
'\b' |
2104 |
Matches a word boundary; that is, it matches if the character to the
2105 |
left is a "word" character and the character to the right is a |
2106 |
"non-word" character, or vice-versa. |
2107 |
|
2108 |
$ echo "abc %-= def." | sed 's/\b/X/g' |
2109 |
XabcX %-= XdefX. |
2110 |
|
2111 |
'\B' |
2112 |
Matches everywhere but on a word boundary; that is, it matches if
2113 |
the character to the left and the character to the right are either |
2114 |
both "word" characters or both "non-word" characters. |
2115 |
|
2116 |
$ echo "abc %-= def." | sed 's/\B/X/g' |
2117 |
aXbXc X%X-X=X dXeXf.X |
2118 |
|
2119 |
'\s' |
2120 |
Matches whitespace characters (spaces and tabs). Newlines embedded |
2121 |
in the pattern/hold spaces will also match: |
2122 |
|
2123 |
$ echo "abc %-= def." | sed 's/\s/X/g' |
2124 |
abcX%-=Xdef. |
2125 |
|
2126 |
'\S' |
2127 |
Matches non-whitespace characters. |
2128 |
|
2129 |
$ echo "abc %-= def." | sed 's/\S/X/g' |
2130 |
XXX XXX XXXX |
2131 |
|
2132 |
'\<' |
2133 |
Matches the beginning of a word. |
2134 |
|
2135 |
$ echo "abc %-= def." | sed 's/\</X/g' |
2136 |
Xabc %-= Xdef. |
2137 |
|
2138 |
'\>' |
2139 |
Matches the end of a word. |
2140 |
|
2141 |
$ echo "abc %-= def." | sed 's/\>/X/g' |
2142 |
abcX %-= defX. |
2143 |
|
2144 |
'\`' |
2145 |
Matches only at the start of pattern space. This is different from |
2146 |
'^' in multi-line mode. |
2147 |
|
2148 |
Compare the following two examples: |
2149 |
|
2150 |
$ printf "a\nb\nc\n" | sed 'N;N;s/^/X/gm' |
2151 |
Xa |
2152 |
Xb |
2153 |
Xc |
2154 |
|
2155 |
$ printf "a\nb\nc\n" | sed 'N;N;s/\`/X/gm' |
2156 |
Xa |
2157 |
b |
2158 |
c |
2159 |
|
2160 |
'\'' |
2161 |
Matches only at the end of pattern space. This is different from |
2162 |
'$' in multi-line mode. |
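
The word anchors and the end-of-pattern-space anchor can be sketched as
follows, assuming GNU 'sed' (these are GNU extensions). The variable
names are illustrative. The last script is written in double quotes so
that it can contain a single quote; the shell turns '\\' into one
backslash before 'sed' sees it:

```shell
# '\<' and '\>' restrict the match to the whole word "cat".
whole=$(echo 'cat concat cats' | sed 's/\<cat\>/dog/g')

# In multi-line mode, '$' matches at the end of every line ...
every_end=$(printf 'a\nb\nc\n' | sed 'N;N;s/$/X/gm')

# ... while \' matches only at the end of the pattern space.
last_end=$(printf 'a\nb\nc\n' | sed "N;N;s/\\'/X/gm")

printf '%s\n---\n%s\n---\n%s\n' "$whole" "$every_end" "$last_end"
```

Only the last line receives the 'X' in the third command.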
2163 |
|
2164 |
5.7 Back-references and Subexpressions |
2165 |
====================================== |
2166 |
|
"Back-references" are regular expression commands which refer to a
2168 |
previous part of the matched regular expression. Back-references are |
2169 |
specified with backslash and a single digit (e.g. '\1'). The part of |
2170 |
the regular expression they refer to is called a "subexpression", and is |
2171 |
designated with parentheses. |
2172 |
|
2173 |
Back-references and subexpressions are used in two cases: in the |
2174 |
regular expression search pattern, and in the REPLACEMENT part of the |
2175 |
's' command (*note Regular Expression Addresses: Regexp Addresses. and |
2176 |
*note The "s" Command::). |
2177 |
|
2178 |
In a regular expression pattern, back-references are used to match |
2179 |
the same content as a previously matched subexpression. In the |
2180 |
following example, the subexpression is '.' - any single character |
2181 |
(being surrounded by parentheses makes it a subexpression). The |
2182 |
back-reference '\1' asks to match the same content (same character) as |
2183 |
the sub-expression. |
2184 |
|
2185 |
The command below matches words starting with any character, followed |
2186 |
by the letter 'o', followed by the same character as the first. |
2187 |
|
2188 |
$ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words |
2189 |
bob |
2190 |
mom |
2191 |
non |
2192 |
pop |
2193 |
sos |
2194 |
tot |
2195 |
wow |
2196 |
|
2197 |
Multiple subexpressions are automatically numbered from |
2198 |
left-to-right. This command searches for 6-letter palindromes (the |
2199 |
first three letters are 3 subexpressions, followed by 3 back-references |
2200 |
in reverse order): |
2201 |
|
2202 |
$ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words |
2203 |
redder |
2204 |
|
2205 |
In the 's' command, back-references can be used in the REPLACEMENT |
2206 |
part to refer back to subexpressions in the REGEXP part. |
2207 |
|
2208 |
The following example uses two subexpressions in the regular |
2209 |
expression to match two space-separated words. The back-references in |
2210 |
the REPLACEMENT part print the words in a different order:
2211 |
|
2212 |
$ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./' |
2213 |
The name is Bond, James Bond. |
2214 |
|
2215 |
When used with alternation, if the group does not participate in the |
2216 |
match then the back-reference makes the whole match fail. For example, |
2217 |
'a(.)|b\1' will not match 'ba'. When multiple regular expressions are |
2218 |
given with '-e' or from a file ('-f FILE'), back-references are local to |
2219 |
each expression. |
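
Both uses of back-references, in the pattern and in the REPLACEMENT,
can be sketched with two small substitutions. This assumes GNU 'sed'
(for '\b' and '\w'); the input strings and variable names are
illustrative:

```shell
# In the pattern: '\1' must repeat the word captured by '(\w+)',
# which collapses a doubled word into one.
dedup=$(echo 'this is is a test' | sed -E 's/\b(\w+) \1\b/\1/')

# In the replacement: swap the text before and after the first '='.
swap=$(echo 'key=value' | sed -E 's/^([^=]*)=(.*)$/\2=\1/')

printf '%s\n%s\n' "$dedup" "$swap"
```

The commands print 'this is a test' and 'value=key'.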
2220 |
|
2221 |
5.8 Escape Sequences - specifying special characters |
2222 |
==================================================== |
2223 |
|
2224 |
Until this chapter, we have only encountered escapes of the form '\^', |
2225 |
which tell 'sed' not to interpret the circumflex as a special character, |
2226 |
but rather to take it literally. For example, '\*' matches a single |
2227 |
asterisk rather than zero or more backslashes. |
2228 |
|
2229 |
This chapter introduces another kind of escape(1)--that is, escapes |
2230 |
that are applied to a character or sequence of characters that |
2231 |
ordinarily are taken literally, and that 'sed' replaces with a special |
2232 |
character. This provides a way of encoding non-printable characters in |
2233 |
patterns in a visible manner. There is no restriction on the appearance |
2234 |
of non-printing characters in a 'sed' script but when a script is being |
2235 |
prepared in the shell or by text editing, it is usually easier to use |
2236 |
one of the following escape sequences than the binary character it |
2237 |
represents: |
2238 |
|
2239 |
The list of these escapes is: |
2240 |
|
2241 |
'\a' |
2242 |
Produces or matches a BEL character, that is an "alert" (ASCII 7). |
2243 |
|
2244 |
'\f' |
2245 |
Produces or matches a form feed (ASCII 12). |
2246 |
|
2247 |
'\n' |
2248 |
Produces or matches a newline (ASCII 10). |
2249 |
|
2250 |
'\r' |
2251 |
Produces or matches a carriage return (ASCII 13). |
2252 |
|
2253 |
'\t' |
2254 |
Produces or matches a horizontal tab (ASCII 9). |
2255 |
|
2256 |
'\v' |
2257 |
Produces or matches a so-called "vertical tab" (ASCII 11).
2258 |
|
2259 |
'\cX' |
2260 |
Produces or matches 'CONTROL-X', where X is any character. The |
2261 |
precise effect of '\cX' is as follows: if X is a lower case letter, |
2262 |
it is converted to upper case. Then bit 6 of the character (hex |
2263 |
40) is inverted. Thus '\cz' becomes hex 1A, but '\c{' becomes hex |
2264 |
3B, while '\c;' becomes hex 7B. |
2265 |
|
2266 |
'\dXXX' |
2267 |
Produces or matches a character whose decimal ASCII value is XXX. |
2268 |
|
2269 |
'\oXXX' |
2270 |
Produces or matches a character whose octal ASCII value is XXX. |
2271 |
|
2272 |
'\xXX' |
2273 |
Produces or matches a character whose hexadecimal ASCII value is |
2274 |
XX. |
2275 |
|
2276 |
'\b' (backspace) was omitted because of the conflict with the |
2277 |
existing "word boundary" meaning. |
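
A few of these escapes in use, as a sketch that assumes GNU 'sed' (only
'\n' in a regular expression is portable); the variable names are
illustrative:

```shell
# '\t' in the regular expression matches a tab character.
tab=$(printf 'a\tb\n' | sed 's/\t/<TAB>/')

# '\n' in the replacement produces a newline.
split=$(echo 'a,b' | sed 's/,/\n/')

# '\x41' is the letter 'A' (hexadecimal ASCII value 41).
hex=$(echo 'ABC' | sed 's/\x41/a/')

printf '%s\n---\n%s\n---\n%s\n' "$tab" "$split" "$hex"
```

The first command prints 'a<TAB>b', the second prints 'a' and 'b' on
separate lines, and the third prints 'aBC'.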
2278 |
|
2279 |
5.8.1 Escaping Precedence |
2280 |
------------------------- |
2281 |
|
2282 |
GNU 'sed' processes escape sequences _before_ passing the text on to the
regular-expression matching of the 's///' command and address matching.
Thus the following two commands are equivalent ('0x5e' is the hexadecimal
ASCII value of the character '^'):
2286 |
|
2287 |
$ echo 'a^c' | sed 's/^/b/' |
2288 |
ba^c |
2289 |
|
2290 |
$ echo 'a^c' | sed 's/\x5e/b/' |
2291 |
ba^c |
2292 |
|
2293 |
As are the following ('0x5b','0x5d' are the hexadecimal ASCII values |
2294 |
of '[',']', respectively): |
2295 |
|
2296 |
$ echo abc | sed 's/[a]/x/'
xbc
$ echo abc | sed 's/\x5ba\x5d/x/'
xbc
2300 |
|
2301 |
However, it is recommended to avoid writing special characters this way,
due to unexpected edge-cases. For example, the following are not equivalent:
2303 |
|
2304 |
$ echo 'a^c' | sed 's/\^/b/' |
2305 |
abc |
2306 |
|
2307 |
$ echo 'a^c' | sed 's/\\\x5e/b/' |
2308 |
a^c |
2309 |
|
2310 |
---------- Footnotes ---------- |
2311 |
|
2312 |
(1) All the escapes introduced here are GNU extensions, with the |
2313 |
exception of '\n'. In basic regular expression mode, setting |
2314 |
'POSIXLY_CORRECT' disables them inside bracket expressions. |
2315 |
|
2316 |
5.9 Multibyte characters and Locale Considerations |
2317 |
================================================== |
2318 |
|
2319 |
GNU 'sed' processes valid multibyte characters in multibyte locales |
2320 |
(e.g. 'UTF-8'). (1) |
2321 |
|
2322 |
The following example uses the Greek letter Capital Sigma (Σ,
Unicode code point '0x03A3'). In a 'UTF-8' locale, 'sed' correctly
processes the Sigma as one character despite it being 2 octets (bytes):

$ locale | grep LANG
LANG=en_US.UTF-8

$ printf 'a\u03A3b'
aΣb
2331 |
|
2332 |
$ printf 'a\u03A3b' | sed 's/./X/g' |
2333 |
XXX |
2334 |
|
2335 |
$ printf 'a\u03A3b' | od -tx1 -An |
2336 |
61 ce a3 62 |
2337 |
|
2338 |
To force 'sed' to process octets separately, use the 'C' locale (also |
2339 |
known as the 'POSIX' locale): |
2340 |
|
2341 |
$ printf 'a\u03A3b' | LC_ALL=C sed 's/./X/g' |
2342 |
XXXX |
2343 |
|
2344 |
5.9.1 Invalid multibyte characters |
2345 |
---------------------------------- |
2346 |
|
2347 |
'sed''s regular expressions _do not_ match invalid multibyte sequences |
2348 |
in a multibyte locale. |
2349 |
|
2350 |
In the following examples, the ascii value '0xCE' is an incomplete |
2351 |
multibyte character (shown here as U+FFFD). The regular expression '.' |
2352 |
does not match it: |
2353 |
|
2354 |
$ printf 'a\xCEb\n' |
2355 |
aU+FFFDe |
2356 |
|
2357 |
$ printf 'a\xCEb\n' | sed 's/./X/g' |
2358 |
XU+FFFDX |
2359 |
|
2360 |
$ printf 'a\xCEc\n' | sed 's/./X/g' | od -tx1c -An |
2361 |
58 ce 58 0a |
2362 |
X X \n |
2363 |
|
2364 |
Similarly, the 'catch-all' regular expression '.*' does not match the |
2365 |
entire line: |
2366 |
|
2367 |
$ printf 'a\xCEc\n' | sed 's/.*//' | od -tx1c -An |
2368 |
ce 63 0a |
2369 |
c \n |
2370 |
|
2371 |
GNU 'sed' offers the special 'z' command to clear the current pattern |
2372 |
space regardless of invalid multibyte characters (i.e. it works like |
2373 |
's/.*//' but also removes invalid multibyte characters): |
2374 |
|
2375 |
$ printf 'a\xCEc\n' | sed 'z' | od -tx1c -An |
2376 |
0a |
2377 |
\n |
2378 |
|
2379 |
Alternatively, force the 'C' locale to process each octet separately |
2380 |
(every octet is a valid character in the 'C' locale): |
2381 |
|
2382 |
$ printf 'a\xCEc\n' | LC_ALL=C sed 's/.*//' | od -tx1c -An |
2383 |
0a |
2384 |
\n |
2385 |
|
2386 |
'sed''s inability to process invalid multibyte characters can be used
to detect such invalid sequences in a file. In the following examples,
the '\xCE\xCE' is an invalid multibyte sequence, while '\xCE\xA3' is a
valid multibyte sequence (of the Greek Sigma character).
2390 |
|
2391 |
The following 'sed' program removes all valid characters using 's/.//g'.
Any content left in the pattern space (the invalid characters) is added
to the hold space using the 'H' command. On the last line ('$'), the
hold space is retrieved ('x'), newlines are removed ('s/\n//g'), and any
remaining octets are printed unambiguously ('l'). Thus, any invalid
multibyte sequences are printed as octal values:
2397 |
|
2398 |
$ printf 'ab\nc\n\xCE\xCEde\n\xCE\xA3f\n' > invalid.txt |
2399 |
|
2400 |
$ cat invalid.txt |
2401 |
ab |
2402 |
c |
2403 |
U+FFFDU+FFFDde |
2404 |
U+03A3f |
2405 |
|
2406 |
$ sed -n 's/.//g ; H ; ${x;s/\n//g;l}' invalid.txt |
2407 |
\316\316$ |
2408 |
|
2409 |
With a few more commands, 'sed' can print the exact line number
corresponding to each invalid character (line 3). These characters can
then be removed by forcing the 'C' locale and using octal escape
sequences:
2413 |
|
2414 |
$ sed -n 's/.//g;=;l' invalid.txt | paste - - | awk '$2!="$"' |
2415 |
3 \316\316$ |
2416 |
|
2417 |
$ LC_ALL=C sed '3s/\o316\o316//' invalid.txt > fixed.txt |
2418 |
|
2419 |
5.9.2 Upper/Lower case conversion |
2420 |
--------------------------------- |
2421 |
|
2422 |
GNU 'sed''s substitute command ('s') supports upper/lower case |
2423 |
conversions using '\U','\L' codes. These conversions support multibyte |
2424 |
characters: |
2425 |
|
2426 |
$ printf 'ABC\u03a3\n'
ABCΣ

$ printf 'ABC\u03a3\n' | sed 's/.*/\L&/'
abcσ
2431 |
|
2432 |
*Note The "s" Command::. |
2433 |
|
2434 |
5.9.3 Multibyte regexp character classes |
2435 |
---------------------------------------- |
2436 |
|
2437 |
In locales other than 'C', the sorting sequence is not specified, and '[a-d]'
2438 |
might be equivalent to '[abcd]' or to '[aBbCcDd]', or it might fail to |
2439 |
match any character, or the set of characters that it matches might even |
2440 |
be erratic. To obtain the traditional interpretation of bracket |
2441 |
expressions, you can use the 'C' locale by setting the 'LC_ALL' |
2442 |
environment variable to the value 'C'. |
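A minimal sketch of that advice, using a made-up input string: with 'LC_ALL=C' the range '[a-d]' covers exactly the four lowercase letters, and a named class such as '[[:lower:]]' states the same intent without relying on a range at all:

```shell
# Force the C locale so that [a-d] means exactly a, b, c and d.
ranged=$(printf 'aBcD\n' | LC_ALL=C sed 's/[a-d]/-/g')
printf '%s\n' "$ranged"     # prints: -B-D

# A named character class avoids depending on the collation order.
classed=$(printf 'aBcD\n' | LC_ALL=C sed 's/[[:lower:]]/-/g')
printf '%s\n' "$classed"    # prints: -B-D
```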
2443 |
|
2444 |
# TODO: is there any real-world system/locale where 'A' |
2445 |
# is replaced by '-' ? |
2446 |
$ echo A | sed 's/[a-z]/-/' |
2447 |
A |
2448 |
|
2449 |
The interpretation of named character classes depends on the 'LC_CTYPE' locale; for example,
2450 |
'[[:alnum:]]' means the character class of numbers and letters in the |
2451 |
current locale. |
2452 |
|
2453 |
TODO: show example of collation |
2454 |
|
2455 |
# TODO: this works on glibc systems, not on musl-libc/freebsd/macosx. |
2456 |
$ printf 'cliché\n' | LC_ALL=fr_FR.utf8 sed 's/[[=e=]]/X/g' |
2457 |
clichX |
2458 |
|
2459 |
---------- Footnotes ---------- |
2460 |
|
2461 |
(1) Some regexp edge cases depend on the operating system and libc
implementation.  The examples shown are known to work as expected on
2463 |
GNU/Linux systems using glibc. |
2464 |
|
2465 |
6 Advanced 'sed': cycles and buffers |
2466 |
************************************ |
2467 |
|
2468 |
6.1 How 'sed' Works |
2469 |
=================== |
2470 |
|
2471 |
'sed' maintains two data buffers: the active _pattern_ space, and the |
2472 |
auxiliary _hold_ space. Both are initially empty. |
2473 |
|
2474 |
'sed' operates by performing the following cycle on each line of |
2475 |
input: first, 'sed' reads one line from the input stream, removes any |
2476 |
trailing newline, and places it in the pattern space. Then commands are |
2477 |
executed; each command can have an address associated with it: addresses
2478 |
are a kind of condition code, and a command is only executed if the |
2479 |
condition is verified before the command is to be executed. |
2480 |
|
2481 |
When the end of the script is reached, unless the '-n' option is in |
2482 |
use, the contents of pattern space are printed out to the output stream, |
2483 |
adding back the trailing newline if it was removed.(1) Then the next |
2484 |
cycle starts for the next input line. |
2485 |
|
2486 |
Unless special commands (like 'D') are used, the pattern space is |
2487 |
deleted between two cycles. The hold space, on the other hand, keeps |
2488 |
its data between cycles (see commands 'h', 'H', 'x', 'g', 'G' to move |
2489 |
data between both buffers). |
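The persistence of the hold space can be observed with a short sketch (the three input lines are made up for illustration).  Because the hold space starts out empty and survives from one cycle to the next, swapping the buffers with 'x' delays every line by one cycle:

```shell
# 'x' swaps the two buffers on every line.  The first output line is empty
# (the initial hold space); each later line is the previous input line.
swapped=$(printf 'one\ntwo\nthree\n' | sed 'x')
printf '%s\n' "$swapped"

# Remember every line with 'h'; when 'three' is seen, swap the saved line in,
# print it, and swap back.  This prints the line before the match: two
before=$(printf 'one\ntwo\nthree\n' | sed -n '/three/{x;p;x;}; h')
printf '%s\n' "$before"
```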
2490 |
|
2491 |
---------- Footnotes ---------- |
2492 |
|
2493 |
(1) Actually, if 'sed' prints a line without the terminating newline, |
2494 |
it will nevertheless print the missing newline as soon as more text is |
2495 |
sent to the same output stream, which gives the "least expected |
2496 |
surprise" even though it does not make commands like 'sed -n p' exactly |
2497 |
identical to 'cat'. |
2498 |
|
2499 |
6.2 Hold and Pattern Buffers |
2500 |
============================ |
2501 |
|
2502 |
TODO |
2503 |
|
2504 |
6.3 Multiline techniques - using D,G,H,N,P to process multiple lines |
2505 |
==================================================================== |
2506 |
|
2507 |
Multiple lines can be processed as one buffer using the
'D', 'G', 'H', 'N' and 'P' commands.  They are similar to their lowercase counterparts
2509 |
('d','g', 'h','n','p'), except that these commands append or subtract |
2510 |
data while respecting embedded newlines - allowing adding and removing |
2511 |
lines from the pattern and hold spaces. |
2512 |
|
2513 |
They operate as follows: |
2514 |
'D' |
2515 |
_deletes_ line from the pattern space until the first newline, and |
2516 |
restarts the cycle. |
2517 |
|
2518 |
'G' |
2519 |
_appends_ line from the hold space to the pattern space, with a |
2520 |
newline before it. |
2521 |
|
2522 |
'H' |
2523 |
_appends_ line from the pattern space to the hold space, with a |
2524 |
newline before it. |
2525 |
|
2526 |
'N' |
2527 |
_appends_ line from the input file to the pattern space. |
2528 |
|
2529 |
'P' |
2530 |
_prints_ line from the pattern space until the first newline. |
2531 |
|
2532 |
The following example illustrates the operation of 'N' and 'D' |
2533 |
commands: |
2534 |
|
2535 |
$ seq 6 | sed -n 'N;l;D' |
2536 |
1\n2$ |
2537 |
2\n3$ |
2538 |
3\n4$ |
2539 |
4\n5$ |
2540 |
5\n6$ |
2541 |
|
2542 |
1. 'sed' starts by reading the first line into the pattern space (i.e. |
2543 |
'1'). |
2544 |
2. At the beginning of every cycle, the 'N' command appends a newline |
2545 |
and the next line to the pattern space (i.e. '1', '\n', '2' in the |
2546 |
first cycle). |
2547 |
3. The 'l' command prints the content of the pattern space |
2548 |
unambiguously. |
2549 |
4. The 'D' command then removes the content of pattern space up to the |
2550 |
first newline (leaving '2' at the end of the first cycle). |
2551 |
5. At the next cycle the 'N' command appends a newline and the next |
2552 |
input line to the pattern space (e.g. '2', '\n', '3'). |
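The same sliding-window idea can be sketched with addresses restricting 'N' and 'D' to all lines but the last, which leaves only the final two input lines in the pattern space (similar in effect to 'tail -n 2'):

```shell
# '$!N' appends the next line; '$!D' drops the oldest line and restarts the
# cycle without reading new input.  Once the last line has been appended,
# neither command runs again and the two-line window is auto-printed.
last_two=$(seq 5 | sed '$!N;$!D')
printf '%s\n' "$last_two"   # prints: 4 and 5, one per line
```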
2553 |
|
2554 |
A common technique to process blocks of text such as paragraphs |
2555 |
(instead of line-by-line) is using the following construct: |
2556 |
|
2557 |
sed '/./{H;$!d} ; x ; s/REGEXP/REPLACEMENT/' |
2558 |
|
2559 |
1. The first expression, '/./{H;$!d}' operates on all non-empty lines, |
2560 |
and adds the current line (in the pattern space) to the hold space. |
2561 |
On all lines except the last, the pattern space is deleted and the |
2562 |
cycle is restarted. |
2563 |
|
2564 |
2. The other expressions 'x' and 's' are executed only on empty lines |
2565 |
(i.e. paragraph separators). The 'x' command fetches the |
2566 |
accumulated lines from the hold space back to the pattern space. |
2567 |
The 's///' command then operates on all the text in the paragraph |
2568 |
(including the embedded newlines). |
2569 |
|
2570 |
The following example demonstrates this technique: |
2571 |
$ cat input.txt |
2572 |
a a a aa aaa |
2573 |
aaaa aaaa aa |
2574 |
aaaa aaa aaa |
2575 |
|
2576 |
bbbb bbb bbb |
2577 |
bb bb bbb bb |
2578 |
bbbbbbbb bbb |
2579 |
|
2580 |
ccc ccc cccc |
2581 |
cccc ccccc c |
2582 |
cc cc cc cc |
2583 |
|
2584 |
$ sed '/./{H;$!d} ; x ; s/^/\nSTART-->/ ; s/$/\n<--END/' input.txt |
2585 |
|
2586 |
START--> |
2587 |
a a a aa aaa |
2588 |
aaaa aaaa aa |
2589 |
aaaa aaa aaa |
2590 |
<--END |
2591 |
|
2592 |
START--> |
2593 |
bbbb bbb bbb |
2594 |
bb bb bbb bb |
2595 |
bbbbbbbb bbb |
2596 |
<--END |
2597 |
|
2598 |
START--> |
2599 |
ccc ccc cccc |
2600 |
cccc ccccc c |
2601 |
cc cc cc cc |
2602 |
<--END |
2603 |
|
2604 |
For more annotated examples, *note Text search across multiple |
2605 |
lines:: and *note Line length adjustment::. |
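As one more sketch of the paragraph construct shown above (with a made-up two-paragraph input), the 's' commands below drop the newline that 'H' puts in front of the first line and then join the remaining lines of each paragraph with spaces:

```shell
# Collect each paragraph in the hold space, fetch it back with 'x' at the
# separator (or on the last line), then turn the embedded newlines into spaces.
joined=$(printf 'a1\na2\n\nb1\nb2\n' |
  sed '/./{H;$!d} ; x ; s/^\n// ; s/\n/ /g')
printf '%s\n' "$joined"     # prints: "a1 a2" and "b1 b2", one per line
```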
2606 |
|
2607 |
6.4 Branching and Flow Control |
2608 |
============================== |
2609 |
|
2610 |
The branching commands 'b', 't', and 'T' enable changing the flow of |
2611 |
'sed' programs. |
2612 |
|
2613 |
By default, 'sed' reads an input line into the pattern buffer, then |
2614 |
continues to process all commands in order.  Commands without
2615 |
addresses affect all lines. Commands with addresses affect only |
2616 |
matching lines. *Note Execution Cycle:: and *note Addresses overview::. |
2617 |
|
2618 |
'sed' does not support a typical 'if/then' construct. Instead, some |
2619 |
commands can be used as conditionals or to change the default flow |
2620 |
control: |
2621 |
|
2622 |
'd' |
2623 |
delete (clear) the current pattern space, and restart the program
2624 |
cycle without processing the rest of the commands and without |
2625 |
printing the pattern space. |
2626 |
|
2627 |
'D' |
2628 |
delete the contents of the pattern space _up to the first newline_, |
2629 |
and restart the program cycle without processing the rest of the |
2630 |
commands and without printing the pattern space. |
2631 |
|
2632 |
'[addr]X' |
2633 |
'[addr]{ X ; X ; X }' |
2634 |
'/regexp/X' |
2635 |
'/regexp/{ X ; X ; X }' |
2636 |
Addresses and regular expressions can be used as an 'if/then' |
2637 |
conditional: If [ADDR] matches the current pattern space, execute |
2638 |
the command(s). For example: The command '/^#/d' means: _if_ the |
2639 |
current pattern space matches the regular expression '^#' (a line
2640 |
starting with a hash), _then_ execute the 'd' command: delete the |
2641 |
line without printing it, and restart the program cycle |
2642 |
immediately. |
2643 |
|
2644 |
'b' |
2645 |
branch unconditionally (that is: always jump to a label, skipping |
2646 |
or repeating other commands, without restarting a new cycle). |
2647 |
Combined with an address, the branch can be conditionally executed |
2648 |
on matched lines. |
2649 |
|
2650 |
't' |
2651 |
branch conditionally (that is: jump to a label) _only if_ a 's///' |
2652 |
command has succeeded since the last input line was read or another |
2653 |
conditional branch was taken. |
2654 |
|
2655 |
'T' |
2656 |
similar but opposite to the 't' command: branch only if there have
2657 |
been _no_ successful substitutions since the last input line was |
2658 |
read. |
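A minimal sketch of a 't' loop (the number is made up for illustration): the substitution inserts one comma per pass, and 't' branches back to the label for as long as the substitution keeps succeeding:

```shell
# Insert a comma before the last run of three digits that still has a digit
# in front of it; repeat until no such run is left.
grouped=$(printf '1234567\n' |
  sed -e :a -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/;ta')
printf '%s\n' "$grouped"    # prints: 1,234,567
```

Writing the label in its own '-e' fragment keeps the script usable with 'sed' implementations that require a newline after a label.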
2659 |
|
2660 |
The following two 'sed' programs are equivalent. The first |
2661 |
(contrived) example uses the 'b' command to skip the 's///' command on |
2662 |
lines containing '1'. The second example uses an address with negation |
2663 |
('!') to perform substitution only on desired lines. The 'y///' command |
2664 |
is still executed on all lines: |
2665 |
|
2666 |
$ printf '%s\n' a1 a2 a3 | sed -E '/1/bx ; s/a/z/ ; :x ; y/123/456/' |
2667 |
a4 |
2668 |
z5 |
2669 |
z6 |
2670 |
|
2671 |
$ printf '%s\n' a1 a2 a3 | sed -E '/1/!s/a/z/ ; y/123/456/' |
2672 |
a4 |
2673 |
z5 |
2674 |
z6 |
2675 |
|
2676 |
6.4.1 Branching and Cycles |
2677 |
-------------------------- |
2678 |
|
2679 |
The 'b','t' and 'T' commands can be followed by a label (typically a |
2680 |
single letter). Labels are defined with a colon followed by one or more |
2681 |
letters (e.g. ':x'). If the label is omitted the branch commands |
2682 |
restart the cycle. Note the difference between branching to a label and |
2683 |
restarting the cycle: when a cycle is restarted, 'sed' first prints the |
2684 |
current content of the pattern space, then reads the next input line |
2685 |
into the pattern space; jumping to a label (even if it is at the
2686 |
beginning of the program) does not print the pattern space and does not |
2687 |
read the next input line. |
2688 |
|
2689 |
The following program is a no-op. The 'b' command (the only command |
2690 |
in the program) does not have a label, and thus simply restarts the |
2691 |
cycle. On each cycle, the pattern space is printed and the next input |
2692 |
line is read: |
2693 |
|
2694 |
$ seq 3 | sed b |
2695 |
1 |
2696 |
2 |
2697 |
3 |
2698 |
|
2699 |
The following example is an infinite loop: it doesn't terminate and
2700 |
doesn't print anything. The 'b' command jumps to the 'x' label, and a |
2701 |
new cycle is never started: |
2702 |
|
2703 |
$ seq 3 | sed ':x ; bx' |
2704 |
|
2705 |
# The above command requires GNU sed (which supports additional
2706 |
# commands following a label, without a newline). A portable equivalent: |
2707 |
# sed -e ':x' -e bx |
2708 |
|
2709 |
Branching is often complemented with the 'n' or 'N' commands: both |
2710 |
commands read the next input line into the pattern space without waiting |
2711 |
for the cycle to restart. Before reading the next input line, 'n' |
2712 |
prints the current pattern space then empties it, while 'N' appends a |
2713 |
newline and the next input line to the pattern space. |
2714 |
|
2715 |
Consider the following two examples: |
2716 |
|
2717 |
$ seq 3 | sed ':x ; n ; bx' |
2718 |
1 |
2719 |
2 |
2720 |
3 |
2721 |
|
2722 |
$ seq 3 | sed ':x ; N ; bx' |
2723 |
1 |
2724 |
2 |
2725 |
3 |
2726 |
|
2727 |
* Neither example loops forever, despite never starting a new cycle.
2728 |
|
2729 |
* In the first example, the 'n' command first prints the content of
2730 |
the pattern space, empties the pattern space then reads the next |
2731 |
input line. |
2732 |
|
2733 |
* In the second example, the 'N' command appends the next input line
2734 |
to the pattern space (with a newline). Lines are accumulated in |
2735 |
the pattern space until there are no more input lines to read, then |
2736 |
the 'N' command terminates the 'sed' program. When the program |
2737 |
terminates, the end-of-cycle actions are performed, and the entire |
2738 |
pattern space is printed. |
2739 |
|
2740 |
* The second example requires GNU 'sed', because it uses the |
2741 |
non-POSIX-standard behavior of 'N'. See the "'N' command on the |
2742 |
last line" paragraph in *note Reporting Bugs::. |
2743 |
|
2744 |
* To further examine the difference between the two examples, try the |
2745 |
following commands: |
2746 |
printf '%s\n' aa bb cc dd | sed ':x ; n ; = ; bx' |
2747 |
printf '%s\n' aa bb cc dd | sed ':x ; N ; = ; bx' |
2748 |
printf '%s\n' aa bb cc dd | sed ':x ; n ; s/\n/***/ ; bx' |
2749 |
printf '%s\n' aa bb cc dd | sed ':x ; N ; s/\n/***/ ; bx' |
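The sketch below captures the output of three of these commands so that the difference can be inspected side by side.  It assumes GNU 'sed' running without 'POSIXLY_CORRECT' set, since the behavior of 'n' and 'N' at the end of input differs otherwise; the spaces around the semicolons are dropped only for compactness:

```shell
# 'n' prints the pattern space before replacing it, so the line numbers
# printed by '=' are interleaved with the input lines: aa 2 bb 3 cc 4 dd
with_n=$(printf '%s\n' aa bb cc dd | sed ':x;n;=;bx')

# 'N' keeps accumulating lines; the numbers come first and the whole
# pattern space is printed only when the input runs out: 2 3 4 aa bb cc dd
with_N=$(printf '%s\n' aa bb cc dd | sed ':x;N;=;bx')

# Only with 'N' does the pattern space contain newlines for 's' to act on.
starred=$(printf '%s\n' aa bb cc dd | sed ':x;N;s/\n/***/;bx')

printf '%s\n---\n%s\n---\n%s\n' "$with_n" "$with_N" "$starred"
```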
2750 |
|
2751 |
6.4.2 Branching example: joining lines |
2752 |
-------------------------------------- |
2753 |
|
2754 |
As a real-world example of using branching, consider the case of |
2755 |
quoted-printable (https://en.wikipedia.org/wiki/Quoted-printable) files, |
2756 |
typically used to encode email messages. In these files long lines are |
2757 |
split and marked with a "soft line break" consisting of a single '=' |
2758 |
character at the end of the line: |
2759 |
|
2760 |
$ cat jaques.txt |
2761 |
All the wor= |
2762 |
ld's a stag= |
2763 |
e, |
2764 |
And all the= |
2765 |
men and wo= |
2766 |
men merely = |
2767 |
players: |
2768 |
They have t= |
2769 |
heir exits = |
2770 |
and their e= |
2771 |
ntrances; |
2772 |
And one man= |
2773 |
in his tim= |
2774 |
e plays man= |
2775 |
y parts. |
2776 |
|
2777 |
The following program uses an address match '/=$/' as a conditional: |
2778 |
If the current pattern space ends with a '=', it reads the next input |
2779 |
line using 'N', removes every '=' that is followed by a newline (together
with that newline), and unconditionally branches ('b') to the beginning of the
program without restarting a new cycle.  If the pattern space does not
end with '=', the default action is performed: the pattern space is
2783 |
printed and a new cycle is started: |
2784 |
|
2785 |
$ sed ':x ; /=$/ { N ; s/=\n//g ; bx }' jaques.txt |
2786 |
All the world's a stage, |
2787 |
And all the men and women merely players: |
2788 |
They have their exits and their entrances; |
2789 |
And one man in his time plays many parts. |
2790 |
|
2791 |
Here's an alternative program with a slightly different approach: On
all lines except the last, 'N' appends the next line to the pattern space.  A
substitution command then removes soft line breaks ('=' at the end of a
line, i.e. followed by a newline) by replacing them with an empty
string.  _If_ the substitution was successful (meaning the pattern space
contained a line which should be joined), the conditional branch command
't' jumps to the beginning of the program without completing or
restarting the cycle.  If the substitution failed (meaning there were no
soft line breaks), the 't' command will _not_ branch.  Then, 'P' will
print the pattern space content until the first newline, and 'D' will
delete the pattern space content until the first newline.  (To learn
more about the 'N', 'P' and 'D' commands, *note Multiline techniques::).
2803 |
|
2804 |
$ sed ':x ; $!N ; s/=\n// ; tx ; P ; D' jaques.txt |
2805 |
All the world's a stage, |
2806 |
And all the men and women merely players: |
2807 |
They have their exits and their entrances; |
2808 |
And one man in his time plays many parts. |
2809 |
|
2810 |
For more line-joining examples *note Joining lines::. |
2811 |
|
2812 |
7 Some Sample Scripts |
2813 |
********************* |
2814 |
|
2815 |
Here are some 'sed' scripts to guide you in the art of mastering 'sed'. |
2816 |
|
2817 |
7.1 Joining lines |
2818 |
================= |
2819 |
|
2820 |
This section uses 'N', 'D' and 'P' commands to process multiple lines, |
2821 |
and the 'b' and 't' commands for branching. *Note Multiline |
2822 |
techniques:: and *note Branching and flow control::. |
2823 |
|
2824 |
Join specific lines (e.g. if lines 2 and 3 need to be joined): |
2825 |
|
2826 |
$ cat lines.txt |
2827 |
hello |
2828 |
hel |
2829 |
lo |
2830 |
hello |
2831 |
|
2832 |
$ sed '2{N;s/\n//;}' lines.txt |
2833 |
hello |
2834 |
hello |
2835 |
hello |
2836 |
|
2837 |
Join backslash-continued lines: |
2838 |
|
2839 |
$ cat 1.txt |
2840 |
this \ |
2841 |
is \ |
2842 |
a \ |
2843 |
long \ |
2844 |
line |
2845 |
and another \ |
2846 |
line |
2847 |
|
2848 |
$ sed -e ':x /\\$/ { N; s/\\\n//g ; bx }' 1.txt |
2849 |
this is a long line |
2850 |
and another line |
2851 |
|
2852 |
|
2853 |
#TODO: The above requires GNU sed.
2854 |
# non-gnu seds need newlines after ':' and 'b' |
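A sketch of a variation that should also work with other 'sed' implementations: each '-e' fragment ends the current script line, so the label and the branch each sit at the end of a line.  The input is generated inline here instead of being read from '1.txt':

```shell
# One -e per script line: ':x' and 'bx' are each followed by a line end.
joined=$(printf 'this \\\nis \\\na \\\nlong \\\nline\nand another \\\nline\n' |
  sed -e ':x' -e '/\\$/{' -e 'N' -e 's/\\\n//g' -e 'bx' -e '}')
printf '%s\n' "$joined"
```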
2855 |
|
2856 |
Join lines that start with whitespace (e.g. SMTP headers):
2857 |
|
2858 |
$ cat 2.txt |
2859 |
Subject: Hello |
2860 |
World |
2861 |
Content-Type: multipart/alternative; |
2862 |
boundary=94eb2c190cc6370f06054535da6a |
2863 |
Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT) |
2864 |
Authentication-Results: mx.gnu.org; |
2865 |
dkim=pass header.i=@gnu.org; |
2866 |
spf=pass |
2867 |
Message-ID: <abcdef@gnu.org> |
2868 |
From: John Doe <jdoe@gnu.org> |
2869 |
To: Jane Smith <jsmith@gnu.org> |
2870 |
|
2871 |
$ sed -E ':a ; $!N ; s/\n\s+/ / ; ta ; P ; D' 2.txt |
2872 |
Subject: Hello World |
2873 |
Content-Type: multipart/alternative; boundary=94eb2c190cc6370f06054535da6a |
2874 |
Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT) |
2875 |
Authentication-Results: mx.gnu.org; dkim=pass header.i=@gnu.org; spf=pass |
2876 |
Message-ID: <abcdef@gnu.org> |
2877 |
From: John Doe <jdoe@gnu.org> |
2878 |
To: Jane Smith <jsmith@gnu.org> |
2879 |
|
2880 |
# A portable (non-gnu) variation: |
2881 |
# sed -e :a -e '$!N;s/\n */ /;ta' -e 'P;D' |
2882 |
|
2883 |
7.2 Centering Lines |
2884 |
=================== |
2885 |
|
2886 |
This script centers all lines of a file on an 80-column width.  To
2887 |
change that width, the number in '\{...\}' must be replaced, and the |
2888 |
number of added spaces also must be changed. |
2889 |
|
2890 |
Note how the buffer commands are used to separate parts in the |
2891 |
regular expressions to be matched--this is a common technique. |
2892 |
|
2893 |
#!/usr/bin/sed -f |
2894 |
|
2895 |
# Put 80 spaces in the buffer |
2896 |
1 { |
2897 |
x |
2898 |
s/^$/          /
2899 |
s/^.*$/&&&&&&&&/ |
2900 |
x |
2901 |
} |
2902 |
|
2903 |
# delete leading and trailing spaces |
2904 |
y/<TAB>/ / |
2905 |
s/^ *// |
2906 |
s/ *$// |
2907 |
|
2908 |
# add a newline and 80 spaces to end of line |
2909 |
G |
2910 |
|
2911 |
# keep first 81 chars (80 + a newline) |
2912 |
s/^\(.\{81\}\).*$/\1/ |
2913 |
|
2914 |
# \2 matches half of the spaces, which are moved to the beginning |
2915 |
s/^\(.*\)\n\(.*\)\2/\2\1/ |
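As a sketch of the width change described above, the following variant centers on 20 columns instead of 80: the hold space is primed with 10 spaces doubled once, and the interval becomes '\{21\}' (20 plus the newline).  The function name 'center20' and the sample input are made up for illustration, and the tab-conversion line is left out:

```shell
# Hypothetical 20-column variant of the centering script.
center20() {
  sed '
1 {
  x
  s/^$/          /
  s/^.*$/&&/
  x
}
s/^ *//
s/ *$//
G
s/^\(.\{21\}\).*$/\1/
s/^\(.*\)\n\(.*\)\2/\2\1/
'
}

# "abcd" leaves 16 columns of padding, half of which is moved to the front.
centered=$(printf 'abcd\n' | center20)
printf '[%s]\n' "$centered"
```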
2916 |
|
2917 |
7.3 Increment a Number |
2918 |
====================== |
2919 |
|
2920 |
This script is one of a few that demonstrate how to do arithmetic in |
2921 |
'sed'. This is indeed possible,(1) but must be done manually. |
2922 |
|
2923 |
To increment a number, add 1 to its last digit, replacing it with
the following digit.  There is one exception: when the digit is a nine,
the preceding digits must also be incremented until a digit other than
nine is reached.
2927 |
|
2928 |
This solution by Bruno Haible is very clever because it
2929 |
uses a single buffer; if you don't have this limitation, the algorithm |
2930 |
used in *note Numbering lines: cat -n, is faster. It works by replacing |
2931 |
trailing nines with an underscore, then using multiple 's' commands to |
2932 |
increment the last digit, and then again substituting underscores with |
2933 |
zeros. |
2934 |
|
2935 |
#!/usr/bin/sed -f |
2936 |
|
2937 |
/[^0-9]/ d |
2938 |
|
2939 |
# replace all trailing 9s by _ (any other character except digits, could |
2940 |
# be used) |
2941 |
:d |
2942 |
s/9\(_*\)$/_\1/ |
2943 |
td |
2944 |
|
2945 |
# incr last digit only. The first line adds a most-significant |
2946 |
# digit of 1 if we have to add a digit. |
2947 |
|
2948 |
s/^\(_*\)$/1\1/; tn |
2949 |
s/8\(_*\)$/9\1/; tn |
2950 |
s/7\(_*\)$/8\1/; tn |
2951 |
s/6\(_*\)$/7\1/; tn |
2952 |
s/5\(_*\)$/6\1/; tn |
2953 |
s/4\(_*\)$/5\1/; tn |
2954 |
s/3\(_*\)$/4\1/; tn |
2955 |
s/2\(_*\)$/3\1/; tn |
2956 |
s/1\(_*\)$/2\1/; tn |
2957 |
s/0\(_*\)$/1\1/; tn |
2958 |
|
2959 |
:n |
2960 |
y/_/0/ |
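A usage sketch follows, passing the script inline instead of via '-f'.  The wrapper name 'increment' and the sample numbers are made up for illustration:

```shell
# Hypothetical wrapper around the script above.
increment() {
  sed '
/[^0-9]/ d
:d
s/9\(_*\)$/_\1/
td
s/^\(_*\)$/1\1/; tn
s/8\(_*\)$/9\1/; tn
s/7\(_*\)$/8\1/; tn
s/6\(_*\)$/7\1/; tn
s/5\(_*\)$/6\1/; tn
s/4\(_*\)$/5\1/; tn
s/3\(_*\)$/4\1/; tn
s/2\(_*\)$/3\1/; tn
s/1\(_*\)$/2\1/; tn
s/0\(_*\)$/1\1/; tn
:n
y/_/0/
'
}

# 1299 becomes 12__ , then 13__ , then 1300; 999 becomes ___ , then 1___ .
next=$(printf '%s\n' 41 1299 999 | increment)
printf '%s\n' "$next"       # prints: 42, 1300 and 1000, one per line
```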
2961 |
|
2962 |
---------- Footnotes ---------- |
2963 |
|
2964 |
(1) 'sed' guru Greg Ubben wrote an implementation of the 'dc' RPN |
2965 |
calculator! It is distributed together with sed. |
2966 |
|
2967 |
7.4 Rename Files to Lower Case |
2968 |
============================== |
2969 |
|
2970 |
This is a pretty strange use of 'sed'.  We transform text into shell
commands, then just feed them to the shell.  Don't worry,
2972 |
even worse hacks are done when using 'sed'; I have seen a script |
2973 |
converting the output of 'date' into a 'bc' program! |
2974 |
|
2975 |
The main body of this is the 'sed' script, which remaps the name from |
2976 |
lower to upper (or vice-versa) and even checks whether the remapped name
2977 |
is the same as the original name. Note how the script is parameterized |
2978 |
using shell variables and proper quoting. |
2979 |
|
2980 |
#! /bin/sh |
2981 |
# rename files to lower/upper case... |
2982 |
# |
2983 |
# usage: |
2984 |
# move-to-lower * |
2985 |
# move-to-upper * |
2986 |
# or |
2987 |
# move-to-lower -R . |
2988 |
# move-to-upper -R . |
2989 |
# |
2990 |
|
2991 |
help() |
2992 |
{ |
2993 |
cat << eof |
2994 |
Usage: $0 [-n] [-R] [-h] files...
2995 |
|
2996 |
-n do nothing, only see what would be done |
2997 |
-R recursive (use find) |
2998 |
-h this message |
2999 |
files files to remap to lower case |
3000 |
|
3001 |
Examples: |
3002 |
$0 -n * (see if everything is ok, then...) |
3003 |
$0 * |
3004 |
|
3005 |
$0 -R . |
3006 |
|
3007 |
eof |
3008 |
} |
3009 |
|
3010 |
apply_cmd='sh' |
3011 |
finder='echo "$@" | tr " " "\n"' |
3012 |
files_only= |
3013 |
|
3014 |
while : |
3015 |
do |
3016 |
case "$1" in |
3017 |
-n) apply_cmd='cat' ;; |
3018 |
-R) finder='find "$@" -type f';; |
3019 |
-h) help ; exit 1 ;; |
3020 |
*) break ;; |
3021 |
esac |
3022 |
shift |
3023 |
done |
3024 |
|
3025 |
if [ -z "$1" ]; then |
3026 |
echo "Usage: $0 [-h] [-n] [-R] files..."
3027 |
exit 1 |
3028 |
fi |
3029 |
|
3030 |
LOWER='abcdefghijklmnopqrstuvwxyz' |
3031 |
UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ' |
3032 |
|
3033 |
case `basename $0` in |
3034 |
*upper*) TO=$UPPER; FROM=$LOWER ;; |
3035 |
*) FROM=$UPPER; TO=$LOWER ;; |
3036 |
esac |
3037 |
|
3038 |
eval $finder | sed -n ' |
3039 |
|
3040 |
# remove all trailing slashes |
3041 |
s/\/*$// |
3042 |
|
3043 |
# add ./ if there is no path, only a filename |
3044 |
/\//! s/^/.\// |
3045 |
|
3046 |
# save path+filename |
3047 |
h |
3048 |
|
3049 |
# remove path |
3050 |
s/.*\/// |
3051 |
|
3052 |
# do conversion only on filename |
3053 |
y/'$FROM'/'$TO'/ |
3054 |
|
3055 |
# now line contains original path+file, while |
3056 |
# hold space contains the new filename |
3057 |
x |
3058 |
|
3059 |
# add converted file name to line, which now contains |
3060 |
# path/file-name\nconverted-file-name |
3061 |
G |
3062 |
|
3063 |
# check if converted file name is equal to original file name, |
3064 |
# if it is, do not print anything |
3065 |
/^.*\/\(.*\)\n\1/b |
3066 |
|
3067 |
# escape special characters for the shell |
3068 |
s/["$`\\]/\\&/g |
3069 |
|
3070 |
# now, transform path/fromfile\n, into |
3071 |
# mv path/fromfile path/tofile and print it |
3072 |
s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p |
3073 |
|
3074 |
' | $apply_cmd |
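To see what the 'sed' part generates without running the whole script, the sketch below feeds it a single made-up path.  The commands mirror the ones above, with '$FROM' and '$TO' written out for the upper-to-lower direction; the result is only printed, never executed:

```shell
# Hypothetical input path; nothing is renamed, the 'mv' command is only shown.
rename_cmd=$(printf '%s\n' './Dir/FILE.TXT' | sed -n '
  s/\/*$//
  /\//! s/^/.\//
  h
  s/.*\///
  y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
  x
  G
  /^.*\/\(.*\)\n\1/b
  s/["$`\\]/\\&/g
  s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
')
printf '%s\n' "$rename_cmd"
```

Only the file name is converted; the directory part './Dir/' is carried over unchanged into both arguments of 'mv'.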
3075 |
|
3076 |
7.5 Print 'bash' Environment |
3077 |
============================ |
3078 |
|
3079 |
This script strips the definition of the shell functions from the output |
3080 |
of the 'set' Bourne-shell command. |
3081 |
|
3082 |
#!/bin/sh |
3083 |
|
3084 |
set | sed -n ' |
3085 |
:x |
3086 |
|
3087 |
# if no occurrence of "=()" print and load next line |
3088 |
/=()/! { p; b; } |
3089 |
/ () $/! { p; b; } |
3090 |
|
3091 |
# possible start of functions section |
3092 |
# save the line in case this is a var like FOO="() " |
3093 |
h |
3094 |
|
3095 |
# if the next line has a brace, we quit because |
3096 |
# nothing comes after functions |
3097 |
n |
3098 |
/^{/ q |
3099 |
|
3100 |
# print the old line |
3101 |
x; p |
3102 |
|
3103 |
# work on the new line now |
3104 |
x; bx |
3105 |
' |
3106 |
|
3107 |
7.6 Reverse Characters of Lines |
3108 |
=============================== |
3109 |
|
3110 |
This script can be used to reverse the position of characters in lines. |
3111 |
The technique moves two characters at a time, hence it is faster than |
3112 |
more intuitive implementations. |
3113 |
|
3114 |
Note the 'tx' command before the definition of the label. This is |
3115 |
often needed to reset the flag that is tested by the 't' command. |
3116 |
|
3117 |
Imaginative readers will find uses for this script. An example is |
3118 |
reversing the output of 'banner'.(1) |
3119 |
|
3120 |
#!/usr/bin/sed -f |
3121 |
|
3122 |
/../! b |
3123 |
|
3124 |
# Reverse a line. Begin embedding the line between two newlines |
3125 |
s/^.*$/\ |
3126 |
&\ |
3127 |
/ |
3128 |
|
3129 |
# Move first character at the end. The regexp matches until |
3130 |
# there are zero or one characters between the markers |
3131 |
tx |
3132 |
:x |
3133 |
s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/ |
3134 |
tx |
3135 |
|
3136 |
# Remove the newline markers |
3137 |
s/\n//g |
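A usage sketch, passing the script inline instead of via '-f' (the function name 'reverse_chars' and the sample words are made up for illustration):

```shell
# Hypothetical wrapper around the script above.
reverse_chars() {
  sed '
/../! b
s/^.*$/\
&\
/
tx
:x
s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
tx
s/\n//g
'
}

# Lines shorter than two characters take the early 'b' and pass through as-is.
reversed=$(printf 'hello\nabcd\na\n' | reverse_chars)
printf '%s\n' "$reversed"   # prints: olleh, dcba and a, one per line
```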
3138 |
|
3139 |
---------- Footnotes ---------- |
3140 |
|
3141 |
(1) This requires another script to pad the output of banner; for |
3142 |
example |
3143 |
|
3144 |
#! /bin/sh |
3145 |
|
3146 |
banner -w $1 $2 $3 $4 | |
3147 |
sed -e :a -e '/^.\{0,'$1'\}$/ { s/$/ /; ba; }' | |
3148 |
~/sedscripts/reverseline.sed |
3149 |
|
3150 |
7.7 Text search across multiple lines |
3151 |
===================================== |
3152 |
|
3153 |
This section uses 'N' and 'D' commands to search for consecutive words |
3154 |
spanning multiple lines. *Note Multiline techniques::. |
3155 |
|
3156 |
These examples deal with finding doubled occurrences of words in a |
3157 |
document. |
3158 |
|
3159 |
Finding doubled words in a single line is easy using GNU 'grep' and |
3160 |
similarly with GNU 'sed': |
3161 |
|
3162 |
$ cat two-cities-dup1.txt |
3163 |
It was the best of times, |
3164 |
it was the worst of times, |
3165 |
it was the the age of wisdom, |
3166 |
it was the age of foolishness, |
3167 |
|
3168 |
$ grep -E '\b(\w+)\s+\1\b' two-cities-dup1.txt |
3169 |
it was the the age of wisdom, |
3170 |
|
3171 |
$ grep -n -E '\b(\w+)\s+\1\b' two-cities-dup1.txt |
3172 |
3:it was the the age of wisdom, |
3173 |
|
3174 |
$ sed -En '/\b(\w+)\s+\1\b/p' two-cities-dup1.txt |
3175 |
it was the the age of wisdom, |
3176 |
|
3177 |
$ sed -En '/\b(\w+)\s+\1\b/{=;p}' two-cities-dup1.txt |
3178 |
3 |
3179 |
it was the the age of wisdom, |
3180 |
|
3181 |
* The regular expression '\b\w+\s+' searches for word-boundary |
3182 |
('\b'), followed by one-or-more word-characters ('\w+'), followed |
3183 |
by whitespace ('\s+'). *Note regexp extensions::. |
3184 |
|
3185 |
* Adding parentheses around the '(\w+)' expression creates a |
3186 |
subexpression. The regular expression pattern '(PATTERN)\s+\1' |
3187 |
defines a subexpression (in the parentheses) followed by a |
3188 |
back-reference, separated by whitespace. A successful match means |
3189 |
the PATTERN was repeated twice in succession. *Note |
3190 |
Back-references and Subexpressions::. |
3191 |
|
3192 |
* The word-boundary expression ('\b') at both ends ensures partial
3193 |
words are not matched (e.g. 'the then' is not a desired match). |
3194 |
|
3195 |
* The '-E' option enables extended regular expression syntax, |
3196 |
alleviating the need to add backslashes before the parentheses.
3197 |
*Note ERE syntax::. |
3198 |
|
3199 |
When the doubled word spans two lines, the above regular expression
will not find it, because 'grep' and 'sed' operate line-by-line.
3201 |
|
3202 |
By using 'N' and 'D' commands, 'sed' can apply regular expressions on |
3203 |
multiple lines (that is, multiple lines are stored in the pattern space, |
3204 |
and the regular expression works on it): |
3205 |
|
3206 |
$ cat two-cities-dup2.txt |
3207 |
It was the best of times, it was the |
3208 |
worst of times, it was the |
3209 |
the age of wisdom, |
3210 |
it was the age of foolishness, |
3211 |
|
3212 |
$ sed -En '{N; /\b(\w+)\s+\1\b/{=;p} ; D}' two-cities-dup2.txt |
3213 |
3 |
3214 |
worst of times, it was the |
3215 |
the age of wisdom, |
3216 |
|
3217 |
* The 'N' command appends the next line to the pattern space (thus |
3218 |
ensuring it contains two consecutive lines in every cycle). |
3219 |
|
3220 |
* The regular expression uses '\s+' for word separator which matches |
3221 |
both spaces and newlines. |
3222 |
|
3223 |
* If the regular expression matches, the entire pattern space is printed
3224 |
with 'p'. No lines are printed by default due to the '-n' option. |
3225 |
|
3226 |
* The 'D' removes the first line from the pattern space (up until the |
3227 |
first newline), readying it for the next cycle. |
3228 |
|
3229 |
See the GNU 'coreutils' manual for an alternative solution using 'tr |
3230 |
-s' and 'uniq' at |
3231 |
<https://gnu.org/s/coreutils/manual/html_node/Squeezing-and-deleting.html>. |
3232 |
|
3233 |
7.8 Line length adjustment |
3234 |
========================== |
3235 |
|
3236 |
This section uses the 'N' and 'P' commands to process multiple lines,
and the 'b' command for branching.  *Note
3238 |
Multiline techniques:: and *note Branching and flow control::. |
3239 |
|
3240 |
This (somewhat contrived) example deals with formatting and wrapping
3241 |
lines of text of the following input file: |
3242 |
|
3243 |
$ cat two-cities-mix.txt |
3244 |
It was the best of times, it was |
3245 |
the worst of times, it |
3246 |
was the age of |
3247 |
wisdom, |
3248 |
it |
3249 |
was |
3250 |
the age |
3251 |
of foolishness, |
3252 |
|
3253 |
The following sed program wraps lines at 40 characters: |
3254 |
$ cat wrap40.sed |
3255 |
# outer loop |
3256 |
:x |
3257 |
|
3258 |
# Append a newline followed by the next input line to the pattern buffer
3259 |
N |
3260 |
|
3261 |
# Replace all newlines in the pattern buffer with spaces
3262 |
s/\n/ /g |
3263 |
|
3264 |
|
3265 |
# Inner loop |
3266 |
:y |
3267 |
|
3268 |
# Add a newline after the first 40 characters |
3269 |
s/(.{40,40})/\1\n/ |
3270 |
|
3271 |
# If there is a newline in the pattern buffer |
3272 |
# (i.e. the previous substitution added a newline) |
3273 |
/\n/ { |
3274 |
# There are newlines in the pattern buffer - |
3275 |
# print the content until the first newline. |
3276 |
P |
3277 |
|
3278 |
# Remove the printed characters and the first newline |
3279 |
s/.*\n// |
3280 |
|
3281 |
# branch to label 'y' - repeat inner loop |
3282 |
by |
3283 |
} |
3284 |
|
3285 |
# No newlines in the pattern buffer - Branch to label 'x' (outer loop) |
3286 |
# and read the next input line |
3287 |
bx |
3288 |
|
3289 |
The wrapped output: |
3290 |
$ sed -E -f wrap40.sed two-cities-mix.txt |
3291 |
It was the best of times, it was the wor |
3292 |
st of times, it was the age of wisdom, i |
3293 |
t was the age of foolishness, |
3294 |
|
3295 |
7.9 Reverse Lines of Files
==========================

This one begins a series of totally useless (yet interesting) scripts
emulating various Unix commands. This, in particular, is a 'tac'
workalike.

Note that on implementations other than GNU 'sed' this script might
easily overflow internal buffers.

     #!/usr/bin/sed -nf

     # reverse all lines of input, i.e. first line becomes last, ...

     # from the second line, the buffer (which contains all previous lines)
     # is *appended* to current line, so, the order will be reversed
     1! G

     # on the last line we're done -- print everything
     $ p

     # store everything on the buffer again
     h

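As a quick check, the three commands of the script above can be given
on the command line. This is a minimal sketch assuming GNU 'sed' and a
POSIX shell; the sample input and the variable name 'reversed' are
made up for illustration.

```shell
# Same commands as the 'tac' script above, written as a one-liner.
# 'reversed' is a hypothetical variable name.
reversed=$(printf 'one\ntwo\nthree\n' | sed -n '1!G;$p;h')
printf '%s\n' "$reversed"
```

The lines come out in the order 'three', 'two', 'one', because each
cycle appends the accumulated hold space after the current line.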
7.10 Numbering Lines
====================

This script replaces 'cat -n'; in fact it formats its output exactly
like GNU 'cat' does.

Of course this is completely useless, for two reasons: first,
because somebody else did it in C; second, because the following
Bourne-shell script could be used for the same purpose and would be much
faster:

     #! /bin/sh
     sed -e "=" "$@" | sed -e '
       s/^/      /
       N
       s/^ *\(......\)\n/\1  /
     '

It uses 'sed' to print the line number, then groups lines two by two
using 'N'. Of course, this script does not teach as much as the one
presented below.

The algorithm used for incrementing uses both buffers, so the line is
printed as soon as possible and then discarded. The number is split so
that changing digits go in a buffer and unchanged ones go in the other;
the changed digits are modified in a single step (using a 'y' command).
The line number for the next line is then composed and stored in the
hold space, to be used in the next iteration.

     #!/usr/bin/sed -nf

     # Prime the pump on the first line
     x
     /^$/ s/^.*$/1/

     # Add the correct line number before the pattern
     G
     h

     # Format it and print it
     s/^/      /
     s/^ *\(......\)\n/\1  /p

     # Get the line number from hold space; add a zero
     # if we're going to add a digit on the next line
     g
     s/\n.*$//
     /^9*$/ s/^/0/

     # separate changing/unchanged digits with an x
     s/.9*$/x&/

     # keep changing digits in hold space
     h
     s/^.*x//
     y/0123456789/1234567890/
     x

     # keep unchanged digits in pattern space
     s/x.*$//

     # compose the new number, remove the newline implicitly added by G
     G
     s/\n//
     h

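The incrementing step can be tried on its own. The sketch below is
only an illustration, assuming GNU 'sed' and a POSIX shell: it applies
the arithmetic part of the script above to a single number, and the
function name 'increment' and the variables 'n1' and 'n2' are
hypothetical.

```shell
# increment: hypothetical helper that runs only the arithmetic part of
# the numbering script on the number given as its first argument.
increment() {
    printf '%s\n' "$1" | sed '
        # add a leading zero if every digit is a 9
        /^9*$/ s/^/0/
        # mark the boundary: last non-9 digit plus trailing 9s change
        s/.9*$/x&/
        # changing digits: rotate each one by one in hold space
        h
        s/^.*x//
        y/0123456789/1234567890/
        x
        # unchanged digits stay in pattern space
        s/x.*$//
        # glue both halves together again
        G
        s/\n//
    '
}

n1=$(increment 1299)   # split as 1x299; 299 rotates to 300
n2=$(increment 999)    # becomes 0999 first, then x0999
printf '%s %s\n' "$n1" "$n2"
```

For '1299' only the trailing '299' is rotated, giving '1300'; for '999'
the added zero is what turns into the new leading '1'.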
7.11 Numbering Non-blank Lines
==============================

Emulating 'cat -b' is almost the same as 'cat -n': we only have to
select which lines are to be numbered and which are not.

The part that is common to this script and the previous one is not
commented to show how important it is to comment 'sed' scripts
properly...

     #!/usr/bin/sed -nf

     /^$/ {
       p
       b
     }

     # Same as cat -n from now
     x
     /^$/ s/^.*$/1/
     G
     h
     s/^/      /
     s/^ *\(......\)\n/\1  /p
     x
     s/\n.*$//
     /^9*$/ s/^/0/
     s/.9*$/x&/
     h
     s/^.*x//
     y/0123456789/1234567890/
     x
     s/x.*$//
     G
     s/\n//
     h

7.12 Counting Characters
========================

This script shows another way to do arithmetic with 'sed'. In this case
we have to add possibly large numbers, so implementing this by
successive increments would not be feasible (and possibly even more
complicated to contrive than this script).

The approach is to map numbers to letters, kind of an abacus
implemented with 'sed'. 'a's are units, 'b's are tens and so on: we
simply add the number of characters on the current line as units, and
then propagate the carry to tens, hundreds, and so on.

As usual, running totals are kept in hold space.

On the last line, we convert the abacus form back to decimal. For
the sake of variety, this is done with a loop rather than with some 80
's' commands(1): first we convert units, removing 'a's from the number;
then we rotate letters so that tens become 'a's, and so on until no more
letters remain.

     #!/usr/bin/sed -nf

     # Add n+1 a's to hold space (+1 is for the newline)
     s/./a/g
     H
     x
     s/\n/a/

     # Do the carry. The t's and b's are not necessary,
     # but they do speed up the thing
     t a
     : a; s/aaaaaaaaaa/b/g; t b; b done
     : b; s/bbbbbbbbbb/c/g; t c; b done
     : c; s/cccccccccc/d/g; t d; b done
     : d; s/dddddddddd/e/g; t e; b done
     : e; s/eeeeeeeeee/f/g; t f; b done
     : f; s/ffffffffff/g/g; t g; b done
     : g; s/gggggggggg/h/g; t h; b done
     : h; s/hhhhhhhhhh//g

     : done
     $! {
       h
       b
     }

     # On the last line, convert back to decimal

     : loop
     /a/! s/[b-h]*/&0/
     s/aaaaaaaaa/9/
     s/aaaaaaaa/8/
     s/aaaaaaa/7/
     s/aaaaaa/6/
     s/aaaaa/5/
     s/aaaa/4/
     s/aaa/3/
     s/aa/2/
     s/a/1/

     : next
     y/bcdefgh/abcdefg/
     /[a-h]/ b loop
     p

---------- Footnotes ----------

(1) Some implementations have a limit of 199 commands per script.

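To see the abacus notation at work, the final conversion loop can be
run on a hand-written abacus string. This is a sketch under stated
assumptions (GNU 'sed', a POSIX shell); the function name
'abacus_to_decimal' is hypothetical and the loop is copied from the
script above without the unused 'next' label.

```shell
# abacus_to_decimal: hypothetical helper that converts a string such as
# "bbaaa" (two tens, three units) to decimal digits.
abacus_to_decimal() {
    printf '%s\n' "$1" | sed '
        :loop
        # no units left: record a zero for this decimal position
        /a/! s/[b-h]*/&0/
        # otherwise turn the run of a into one digit
        s/aaaaaaaaa/9/
        s/aaaaaaaa/8/
        s/aaaaaaa/7/
        s/aaaaaa/6/
        s/aaaaa/5/
        s/aaaa/4/
        s/aaa/3/
        s/aa/2/
        s/a/1/
        # rotate letters: tens become units, hundreds become tens, ...
        y/bcdefgh/abcdefg/
        /[a-h]/ b loop
    '
}

abacus_to_decimal bbaaa   # two tens and three units
abacus_to_decimal cb      # one hundred and one ten
```

The first call prints '23' and the second '110'; in the second case the
missing units are filled in by the 's/[b-h]*/&0/' command.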
7.13 Counting Words
===================

This script is almost the same as the previous one, once each of the
words on the line is converted to a single 'a' (in the previous script
each letter was changed to an 'a').

It is interesting that real 'wc' programs have optimized loops for
'wc -c', so they are much slower at counting words than at counting
characters. This script's bottleneck, instead, is arithmetic, and hence
the word-counting one is faster (it has to manage smaller numbers).

Again, the common parts are not commented to show the importance of
commenting 'sed' scripts.

     #!/usr/bin/sed -nf

     # Convert words to a's
     s/[ <TAB>][ <TAB>]*/ /g
     s/^/ /
     s/ [^ ][^ ]*/a /g
     s/ //g

     # Append them to hold space
     H
     x
     s/\n//

     # From here on it is the same as in wc -c.
     /aaaaaaaaaa/! bx; s/aaaaaaaaaa/b/g
     /bbbbbbbbbb/! bx; s/bbbbbbbbbb/c/g
     /cccccccccc/! bx; s/cccccccccc/d/g
     /dddddddddd/! bx; s/dddddddddd/e/g
     /eeeeeeeeee/! bx; s/eeeeeeeeee/f/g
     /ffffffffff/! bx; s/ffffffffff/g/g
     /gggggggggg/! bx; s/gggggggggg/h/g
     s/hhhhhhhhhh//g
     :x
     $! { h; b; }
     :y
     /a/! s/[b-h]*/&0/
     s/aaaaaaaaa/9/
     s/aaaaaaaa/8/
     s/aaaaaaa/7/
     s/aaaaaa/6/
     s/aaaaa/5/
     s/aaaa/4/
     s/aaa/3/
     s/aa/2/
     s/a/1/
     y/bcdefgh/abcdefg/
     /[a-h]/ by
     p

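The only new part of this script is the conversion of words to 'a's,
which can be checked in isolation. The following is a sketch, not the
script itself: it assumes a 'sed' that supports the POSIX '[[:blank:]]'
class, which is used here in place of the literal '<TAB>' placeholder,
and the variable name 'words_as_a' is made up.

```shell
# One 'a' per word; [[:blank:]] stands in for "space or <TAB>".
# 'words_as_a' is a hypothetical variable name.
words_as_a=$(printf '  foo\tbar   baz \n' |
    sed 's/[[:blank:]][[:blank:]]*/ /g; s/^/ /; s/ [^ ][^ ]*/a /g; s/ //g')
printf '%s\n' "$words_as_a"
```

The three words become 'aaa', which the rest of the script then adds to
the running total exactly as in the character-counting script.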
7.14 Counting Lines
===================

No strange things are done now, because 'sed' gives us 'wc -l'
functionality for free! Look:

     #!/usr/bin/sed -nf
     $=

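A short check of the '$=' idiom, again only a sketch assuming a POSIX
shell; the variable names are hypothetical. It also shows one
difference from 'wc -l': on empty input there is no last line, so 'sed'
prints nothing at all rather than '0'.

```shell
# '$=' prints the number of the last input line.
count=$(printf 'a\nb\nc\n' | sed -n '$=')
# With no input there is no last line, so nothing is printed.
empty=$(printf '' | sed -n '$=')
printf 'count=%s empty=[%s]\n' "$count" "$empty"
```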
7.15 Printing the First Lines
=============================

This script is probably the simplest useful 'sed' script. It displays
the first 10 lines of input; the number of displayed lines is right
before the 'q' command.

     #!/usr/bin/sed -f
     10q

7.16 Printing the Last Lines
============================

Printing the last N lines rather than the first is more complex but
indeed possible. N is encoded in the second line, before the bang
character.

This script is similar to the 'tac' script in that it keeps the final
output in the hold space and prints it at the end:

     #!/usr/bin/sed -nf

     1! {; H; g; }
     1,10 !s/[^\n]*\n//
     $p
     h

Mainly, the script keeps a window of 10 lines and slides it by
adding a line and deleting the oldest (the substitution command on the
second line works like a 'D' command but does not restart the loop).

The "sliding window" technique is a very powerful way to write
efficient and complex 'sed' scripts, because commands like 'P' would
require a lot of work if implemented manually.

To introduce the technique, which is fully demonstrated in the rest
of this chapter and is based on the 'N', 'P' and 'D' commands, here is
an implementation of 'tail' using a simple "sliding window."

This looks complicated but in fact the working is the same as the
last script: after we have kicked in the appropriate number of lines,
however, we stop using the hold space to keep inter-line state, and
instead use 'N' and 'D' to slide pattern space by one line:

     #!/usr/bin/sed -f

     1h
     2,10 {; H; g; }
     $q
     1,9d
     N
     D

Note how the first, second and fourth lines are inactive after the
first ten lines of input. After that, all the script does is exit
on the last line of input, append the next input line to pattern
space, and remove the first line.

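Both scripts are easier to follow with a smaller window. The sketch
below shrinks the window from 10 lines to 3 by changing the addresses
accordingly ('1,10' becomes '1,3', '2,10' becomes '2,3' and '1,9'
becomes '1,2'). It assumes GNU 'sed', which accepts '\n' inside a
bracket expression; the variable names are hypothetical.

```shell
# Hold-space version: keep a 3-line window, print it on the last line.
last3_hold=$(printf '%s\n' 1 2 3 4 5 |
    sed -n '1!{H;g;}; 1,3!s/[^\n]*\n//; $p; h')

# N/D version: build the window once, then slide pattern space with
# N (append next line) and D (drop first line, restart the cycle).
last3_slide=$(printf '%s\n' 1 2 3 4 5 |
    sed '1h; 2,3{H;g;}; $q; 1,2d; N; D')

printf '%s\n' "$last3_hold" "$last3_slide"
```

Both commands print the lines '3', '4' and '5'. In the second one the
'$q' command is what finally prints the window, since 'q' prints
pattern space before exiting when '-n' is not given.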
7.17 Make Duplicate Lines Unique
================================

This is an example of the art of using the 'N', 'P' and 'D' commands,
probably the most difficult to master.

     #!/usr/bin/sed -f
     h

     :b
     # On the last line, print and exit
     $b
     N
     /^\(.*\)\n\1$/ {
         # The two lines are identical. Undo the effect of
         # the N command.
         g
         bb
     }

     # If the N command had added the last line, print and exit
     $b

     # The lines are different; print the first and go
     # back working on the second.
     P
     D

As you can see, we maintain a 2-line window using 'P' and 'D'. This
technique is often used in advanced 'sed' scripts.

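The same 2-line window can be written more compactly. The one-liner
below is a widely circulated variant, not the script above: it assumes
a 'sed' that supports back-references in addresses (GNU 'sed' does),
and the variable name 'deduped' is made up for the example.

```shell
# $!N  append the next line unless we are on the last one
# !P   print the first line of the window only if the two lines differ
# D    drop the first line and restart the cycle with the remainder
deduped=$(printf 'a\na\nb\nc\nc\nc\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$deduped"
```

This prints 'a', 'b' and 'c' once each. Unlike the script above it
needs no hold space, because a duplicated first line is simply dropped
by 'D' instead of being restored with 'g'.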
7.18 Print Duplicated Lines of Input
====================================

This script prints only duplicated lines, like 'uniq -d'.

     #!/usr/bin/sed -nf

     $b
     N
     /^\(.*\)\n\1$/ {
         # Print the first of the duplicated lines
         s/.*\n//
         p

         # Loop until we get a different line
         :b
         $b
         N
         /^\(.*\)\n\1$/ {
             s/.*\n//
             bb
         }
     }

     # The last line cannot be followed by duplicates
     $b

     # Found a different one. Leave it alone in the pattern space
     # and go back to the top, hunting its duplicates
     D

7.19 Remove All Duplicated Lines
================================

This script prints only unique lines, like 'uniq -u'.

     #!/usr/bin/sed -f

     # Search for a duplicate line --- until that, print what you find.
     $b
     N
     /^\(.*\)\n\1$/ ! {
         P
         D
     }

     :c
     # Got two equal lines in pattern space. At the
     # end of the file we simply exit
     $d

     # Else, we keep reading lines with N until we
     # find a different one
     s/.*\n//
     N
     /^\(.*\)\n\1$/ {
         bc
     }

     # Remove the last instance of the duplicate line
     # and go back to the top
     D

7.20 Squeezing Blank Lines
==========================

As a final example, here are three scripts, of increasing complexity and
speed, that implement the same function as 'cat -s', that is, squeezing
blank lines.

The first leaves a blank line at the beginning and end if there are
some already.

     #!/usr/bin/sed -f

     # on empty lines, join with next
     # Note there is a star in the regexp
     :x
     /^\n*$/ {
     N
     bx
     }

     # now, squeeze all '\n', this can be also done by:
     # s/^\(\n\)*/\1/
     s/\n*/\
     /

This one is a bit more complex and removes all empty lines at the
beginning. It does leave a single blank line at the end if one was there.

     #!/usr/bin/sed -f

     # delete all leading empty lines
     1,/^./{
     /./!d
     }

     # on an empty line we remove it and all the following
     # empty lines, but one
     :x
     /./!{
     N
     s/^\n$//
     tx
     }

This removes leading and trailing blank lines. It is also the
fastest. Note that loops are completely done with 'n' and 'b', without
relying on 'sed' to restart the script automatically at the end of a
line.

     #!/usr/bin/sed -nf

     # delete all (leading) blanks
     /./!d

     # get here: so there is a non empty
     :x
     # print it
     p
     # get next
     n
     # got chars? print it again, etc...
     /./bx

     # no, don't have chars: got an empty line
     :z
     # get next, if last line we finish here so no trailing
     # empty lines are written
     n
     # also empty? then ignore it, and get next... this will
     # remove ALL empty lines
     /./!bz

     # all empty lines were deleted/ignored, but we have a non empty. As
     # what we want to do is to squeeze, insert a blank line artificially
     i\

     bx

8 GNU 'sed''s Limitations and Non-limitations
*********************************************

For those who want to write portable 'sed' scripts, be aware that some
implementations have been known to limit line lengths (for the pattern
and hold spaces) to be no more than 4000 bytes. The POSIX standard
specifies that conforming 'sed' implementations shall support at least
8192-byte line lengths. GNU 'sed' has no built-in limit on line length;
as long as it can 'malloc()' more (virtual) memory, you can feed or
construct lines as long as you like.

However, recursion is used to handle subpatterns and indefinite
repetition. This means that the available stack space may limit the
size of the buffer that can be processed by certain patterns.

9 Other Resources for Learning About 'sed'
******************************************

For up-to-date information about GNU 'sed' please visit
<https://www.gnu.org/software/sed/>.

Send general questions and suggestions to <sed-devel@gnu.org>. Visit
the mailing list archives for past discussions at
<https://lists.gnu.org/archive/html/sed-devel/>.

The following resources provide information about 'sed' (both GNU
'sed' and other variations). Note that these are not maintained by GNU
'sed' developers.

   * sed '$HOME': <http://sed.sf.net>

   * sed FAQ: <http://sed.sf.net/sedfaq.html>

   * seder's grabbag: <http://sed.sf.net/grabbag>

   * The 'sed-users' mailing list maintained by Sven Guckes:
     <http://groups.yahoo.com/group/sed-users/> (note this is _not_ the
     GNU 'sed' mailing list).

10 Reporting Bugs
*****************

Email bug reports to <bug-sed@gnu.org>. Also, please include the output
of 'sed --version' in the body of your report if at all possible.

Please do not send a bug report like this:

     while building frobme-1.3.4
     $ configure
     error-> sed: file sedscr line 1: Unknown option to 's'

If GNU 'sed' doesn't configure your favorite package, take a few
extra minutes to identify the specific problem and make a stand-alone
test case. Unlike other programs such as C compilers, making such test
cases for 'sed' is quite simple.

A stand-alone test case includes all the data necessary to perform
the test, and the specific invocation of 'sed' that causes the problem.
The smaller a stand-alone test case is, the better. A test case should
not involve something as far removed from 'sed' as "try to configure
frobme-1.3.4". Yes, that is in principle enough information to look for
the bug, but that is not a very practical prospect.

Here are a few commonly reported bugs that are not bugs.

'N' command on the last line

     Most versions of 'sed' exit without printing anything when the 'N'
     command is issued on the last line of a file. GNU 'sed' prints
     pattern space before exiting unless of course the '-n' command
     switch has been specified. This choice is by design.

     Default behavior (GNU extension, non-POSIX conforming):
          $ seq 3 | sed N
          1
          2
          3
     To force POSIX-conforming behavior:
          $ seq 3 | sed --posix N
          1
          2

     For example, the behavior of
          sed N foo bar
     would depend on whether foo has an even or an odd number of
     lines(1). Or, when writing a script to read the next few lines
     following a pattern match, traditional implementations of 'sed'
     would force you to write something like
          /foo/{ $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N }
     instead of just
          /foo/{ N;N;N;N;N;N;N;N;N; }

     In any case, the simplest workaround is to use '$d;N' in scripts
     that rely on the traditional behavior, or to set the
     'POSIXLY_CORRECT' variable to a non-empty value.

Regex syntax clashes (problems with backslashes)

     'sed' uses the POSIX basic regular expression syntax. According to
     the standard, the meaning of some escape sequences is undefined in
     this syntax; notable in the case of 'sed' are '\|', '\+', '\?',
     '\`', '\'', '\<', '\>', '\b', '\B', '\w', and '\W'.

     As in all GNU programs that use POSIX basic regular expressions,
     'sed' interprets these escape sequences as special characters. So,
     'x\+' matches one or more occurrences of 'x'. 'abc\|def' matches
     either 'abc' or 'def'.

     This syntax may cause problems when running scripts written for
     other 'sed's. Some 'sed' programs have been written with the
     assumption that '\|' and '\+' match the literal characters '|' and
     '+'. Such scripts must be modified by removing the spurious
     backslashes if they are to be used with modern implementations of
     'sed', like GNU 'sed'.

     On the other hand, some scripts use s|abc\|def||g to remove
     occurrences of _either_ 'abc' or 'def'. While this worked until
     'sed' 4.0.x, newer versions interpret this as removing the string
     'abc|def'. This is again undefined behavior according to POSIX,
     and this interpretation is arguably more robust: older 'sed's, for
     example, required that the regex matcher parsed '\/' as '/' in the
     common case of escaping a slash, which is again undefined behavior;
     the new behavior avoids this, and this is good because the regex
     matcher is only partially under our control.

     In addition, this version of 'sed' supports several escape
     characters (some of which are multi-character) to insert
     non-printable characters in scripts ('\a', '\c', '\d', '\o', '\r',
     '\t', '\v', '\x'). These can cause similar problems with scripts
     written for other 'sed's.

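The GNU interpretation of '\|' and '\+' can be observed directly. This
is a small sketch that assumes GNU 'sed' in its default (non-POSIX)
mode; other implementations may treat these escapes differently, which
is exactly the portability problem described above. The variable names
are hypothetical.

```shell
# In GNU sed, \| is alternation and \+ is "one or more" in a basic
# regular expression.
special=$(printf 'abc def xxx x+\n' | sed 's/abc\|def/-/g; s/x\+/Y/')

# Without the backslash, + is an ordinary character in a basic
# regular expression.
literal=$(printf 'x+\n' | sed 's/x+/Z/')

printf '%s\n%s\n' "$special" "$literal"
```

The first command prints '- - Y x+' and the second prints 'Z'.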
'-i' clobbers read-only files

     In short, 'sed -i' will let you delete the contents of a read-only
     file, and in general the '-i' option (*note Invocation: Invoking
     sed.) lets you clobber protected files. This is not a bug, but
     rather a consequence of how the Unix file system works.

     The permissions on a file say what can happen to the data in that
     file, while the permissions on a directory say what can happen to
     the list of files in that directory. 'sed -i' will not ever open
     for writing a file that is already on disk. Rather, it will work
     on a temporary file that is finally renamed to the original name:
     if you rename or delete files, you're actually modifying the
     contents of the directory, so the operation depends on the
     permissions of the directory, not of the file. For this same
     reason, 'sed' does not let you use '-i' on a writable file in a
     read-only directory, and will break hard or symbolic links when
     '-i' is used on such a file.

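The rename-based behavior is easy to observe in a scratch directory.
The sketch below assumes GNU 'sed', a 'mktemp' that supports '-d', and
a file system with hard links; every file name in it is hypothetical.

```shell
# Work in a fresh temporary directory that is removed at the end.
dir=$(mktemp -d)
printf 'old\n' > "$dir/file"
ln "$dir/file" "$dir/link"        # second hard link to the same data
chmod a-w "$dir/file"             # the file is now read-only

# Succeeds anyway: only the (writable) directory is modified, by
# renaming a temporary file over "file".
sed -i 's/old/new/' "$dir/file"

file_now=$(cat "$dir/file")       # contents of the replaced file
link_now=$(cat "$dir/link")       # still the original data
printf '%s %s\n' "$file_now" "$link_now"
rm -rf "$dir"
```

This prints 'new old': the read-only file was replaced, and the hard
link was broken because it still names the original data.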
'0a' does not work (gives an error)

     There is no line 0. 0 is a special address that is only used to
     treat addresses like '0,/RE/' as active when the script starts.
     If you write '1,/abc/d' and the first line includes the word
     'abc', then that match is ignored, because address ranges must
     span at least two lines (barring the end of the file). What you
     probably wanted is to delete every line up to the first one
     including 'abc', and this is obtained with '0,/abc/d'.

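The difference between the two addresses shows up as soon as the
first input line matches. A minimal sketch, assuming GNU 'sed' (the
'0,/RE/' form is a GNU extension); the sample input and variable names
are made up.

```shell
# With 1,/abc/ the end pattern is first tried on line 2, so the range
# runs up to the SECOND line containing abc.
from_one=$(printf 'abc\nx\nabc\ny\n' | sed '1,/abc/d')

# With 0,/abc/ the end pattern is also tried on line 1, so the range
# ends right there.
from_zero=$(printf 'abc\nx\nabc\ny\n' | sed '0,/abc/d')

printf '[%s]\n[%s]\n' "$from_one" "$from_zero"
```

The first command leaves only 'y'; the second leaves 'x', 'abc' and
'y'.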
'[a-z]' is case insensitive

     You are encountering problems with locales. POSIX mandates that
     '[a-z]' uses the current locale's collation order - in C parlance,
     that means using 'strcoll(3)' instead of 'strcmp(3)'. Some locales
     have a case-insensitive collation order, others don't.

     Another problem is that '[a-z]' tries to use collation symbols.
     This only happens if you are on the GNU system, using GNU libc's
     regular expression matcher instead of compiling the one supplied
     with GNU sed. In a Danish locale, for example, the regular
     expression '^[a-z]$' matches the string 'aa', because this is a
     single collating symbol that comes after 'a' and before 'b'; 'll'
     behaves similarly in Spanish locales, or 'ij' in Dutch locales.

     To work around these problems, which may cause bugs in shell
     scripts, set the 'LC_COLLATE' and 'LC_CTYPE' environment variables
     to 'C'.

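In a script, the workaround is usually applied to a single command.
The sketch below uses 'LC_ALL=C' rather than the two variables named
above, on the assumption that 'LC_ALL' may already be set in the
environment, in which case it would override 'LC_COLLATE' and
'LC_CTYPE'. The variable name 'stripped' is hypothetical.

```shell
# In the C locale [a-z] is exactly the 26 lowercase ASCII letters.
stripped=$(printf 'aBcD\n' | LC_ALL=C sed 's/[a-z]//g')
printf '%s\n' "$stripped"
```

This prints 'BD' regardless of the locale the shell was started in.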
's/.*//' does not clear pattern space

     This happens if your input stream includes invalid multibyte
     sequences. POSIX mandates that such sequences are _not_ matched by
     '.', so that 's/.*//' will not clear pattern space as you would
     expect. In fact, there is no way to clear sed's buffers in the
     middle of the script in most multibyte locales (including UTF-8
     locales). For this reason, GNU 'sed' provides a 'z' command (for
     'zap') as an extension.

     To work around these problems, which may cause bugs in shell
     scripts, set the 'LC_COLLATE' and 'LC_CTYPE' environment variables
     to 'C'.

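Both remedies can be tried on a line that contains a byte which is not
valid UTF-8. This is a sketch assuming GNU 'sed' (for the 'z' command)
and a 'printf' that understands octal escapes; the variable names are
made up, and 'LC_ALL=C' is used here as a shorthand for setting the
individual locale variables.

```shell
# \377 is a byte that can never appear in valid UTF-8 text.
# z empties pattern space whatever the locale is.
zapped=$(printf 'a\377b\n' | sed 'z; s/^$/empty/')

# In the C locale every byte is a character, so .* matches them all.
c_locale=$(printf 'a\377b\n' | LC_ALL=C sed 's/.*//; s/^$/empty/')

printf '%s %s\n' "$zapped" "$c_locale"
```

Both commands print 'empty', showing that pattern space really was
cleared before the second substitution ran.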
---------- Footnotes ----------

(1) which is the actual "bug" that prompted the change in behavior

Appendix A GNU Free Documentation License
*****************************************

Version 1.3, 3 November 2008

Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
<https://fsf.org/>

Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.

  0. PREAMBLE

     The purpose of this License is to make a manual, textbook, or other
     functional and useful document "free" in the sense of freedom: to
     assure everyone the effective freedom to copy and redistribute it,
     with or without modifying it, either commercially or
     noncommercially. Secondarily, this License preserves for the
     author and publisher a way to get credit for their work, while not
     being considered responsible for modifications made by others.

     This License is a kind of "copyleft", which means that derivative
     works of the document must themselves be free in the same sense.
     It complements the GNU General Public License, which is a copyleft
     license designed for free software.

     We have designed this License in order to use it for manuals for
     free software, because free software needs free documentation: a
     free program should come with manuals providing the same freedoms
     that the software does. But this License is not limited to
     software manuals; it can be used for any textual work, regardless
     of subject matter or whether it is published as a printed book. We
     recommend this License principally for works whose purpose is
     instruction or reference.

  1. APPLICABILITY AND DEFINITIONS

     This License applies to any manual or other work, in any medium,
     that contains a notice placed by the copyright holder saying it can
     be distributed under the terms of this License. Such a notice
     grants a world-wide, royalty-free license, unlimited in duration,
     to use that work under the conditions stated herein. The
     "Document", below, refers to any such manual or work. Any member
     of the public is a licensee, and is addressed as "you". You accept
     the license if you copy, modify or distribute the work in a way
     requiring permission under copyright law.

     A "Modified Version" of the Document means any work containing the
     Document or a portion of it, either copied verbatim, or with
     modifications and/or translated into another language.

     A "Secondary Section" is a named appendix or a front-matter section
     of the Document that deals exclusively with the relationship of the
     publishers or authors of the Document to the Document's overall
     subject (or to related matters) and contains nothing that could
     fall directly within that overall subject. (Thus, if the Document
     is in part a textbook of mathematics, a Secondary Section may not
     explain any mathematics.) The relationship could be a matter of
     historical connection with the subject or with related matters, or
     of legal, commercial, philosophical, ethical or political position
     regarding them.

     The "Invariant Sections" are certain Secondary Sections whose
     titles are designated, as being those of Invariant Sections, in the
     notice that says that the Document is released under this License.
     If a section does not fit the above definition of Secondary then it
     is not allowed to be designated as Invariant. The Document may
     contain zero Invariant Sections. If the Document does not identify
     any Invariant Sections then there are none.

     The "Cover Texts" are certain short passages of text that are
     listed, as Front-Cover Texts or Back-Cover Texts, in the notice
     that says that the Document is released under this License. A
     Front-Cover Text may be at most 5 words, and a Back-Cover Text may
     be at most 25 words.

     A "Transparent" copy of the Document means a machine-readable copy,
     represented in a format whose specification is available to the
     general public, that is suitable for revising the document
     straightforwardly with generic text editors or (for images composed
     of pixels) generic paint programs or (for drawings) some widely
     available drawing editor, and that is suitable for input to text
     formatters or for automatic translation to a variety of formats
     suitable for input to text formatters. A copy made in an otherwise
     Transparent file format whose markup, or absence of markup, has
     been arranged to thwart or discourage subsequent modification by
     readers is not Transparent. An image format is not Transparent if
     used for any substantial amount of text. A copy that is not
     "Transparent" is called "Opaque".

     Examples of suitable formats for Transparent copies include plain
     ASCII without markup, Texinfo input format, LaTeX input format,
     SGML or XML using a publicly available DTD, and standard-conforming
     simple HTML, PostScript or PDF designed for human modification.
     Examples of transparent image formats include PNG, XCF and JPG.
     Opaque formats include proprietary formats that can be read and
     edited only by proprietary word processors, SGML or XML for which
     the DTD and/or processing tools are not generally available, and
     the machine-generated HTML, PostScript or PDF produced by some word
     processors for output purposes only.

     The "Title Page" means, for a printed book, the title page itself,
     plus such following pages as are needed to hold, legibly, the
     material this License requires to appear in the title page. For
     works in formats which do not have any title page as such, "Title
     Page" means the text near the most prominent appearance of the
     work's title, preceding the beginning of the body of the text.

     The "publisher" means any person or entity that distributes copies
     of the Document to the public.

     A section "Entitled XYZ" means a named subunit of the Document
     whose title either is precisely XYZ or contains XYZ in parentheses
     following text that translates XYZ in another language. (Here XYZ
     stands for a specific section name mentioned below, such as
     "Acknowledgements", "Dedications", "Endorsements", or "History".)
     To "Preserve the Title" of such a section when you modify the
     Document means that it remains a section "Entitled XYZ" according
     to this definition.

     The Document may include Warranty Disclaimers next to the notice
     which states that this License applies to the Document. These
     Warranty Disclaimers are considered to be included by reference in
     this License, but only as regards disclaiming warranties: any other
     implication that these Warranty Disclaimers may have is void and
     has no effect on the meaning of this License.

  2. VERBATIM COPYING

     You may copy and distribute the Document in any medium, either
     commercially or noncommercially, provided that this License, the
     copyright notices, and the license notice saying this License
     applies to the Document are reproduced in all copies, and that you
     add no other conditions whatsoever to those of this License. You
     may not use technical measures to obstruct or control the reading
     or further copying of the copies you make or distribute. However,
     you may accept compensation in exchange for copies. If you
     distribute a large enough number of copies you must also follow the
     conditions in section 3.

     You may also lend copies, under the same conditions stated above,
     and you may publicly display copies.

3. COPYING IN QUANTITY |
4125 |
|
4126 |
If you publish printed copies (or copies in media that commonly |
4127 |
have printed covers) of the Document, numbering more than 100, and |
4128 |
the Document's license notice requires Cover Texts, you must |
4129 |
enclose the copies in covers that carry, clearly and legibly, all |
4130 |
these Cover Texts: Front-Cover Texts on the front cover, and |
4131 |
Back-Cover Texts on the back cover. Both covers must also clearly |
4132 |
and legibly identify you as the publisher of these copies. The |
4133 |
front cover must present the full title with all words of the title |
4134 |
equally prominent and visible. You may add other material on the |
4135 |
covers in addition. Copying with changes limited to the covers, as |
4136 |
long as they preserve the title of the Document and satisfy these |
4137 |
conditions, can be treated as verbatim copying in other respects. |
4138 |
|
4139 |
If the required texts for either cover are too voluminous to fit |
4140 |
legibly, you should put the first ones listed (as many as fit |
4141 |
reasonably) on the actual cover, and continue the rest onto |
4142 |
adjacent pages. |
4143 |
|
4144 |
If you publish or distribute Opaque copies of the Document |
4145 |
numbering more than 100, you must either include a machine-readable |
4146 |
Transparent copy along with each Opaque copy, or state in or with |
4147 |
each Opaque copy a computer-network location from which the general |
4148 |
network-using public has access to download using public-standard |
4149 |
network protocols a complete Transparent copy of the Document, free |
4150 |
of added material. If you use the latter option, you must take |
4151 |
reasonably prudent steps, when you begin distribution of Opaque |
4152 |
copies in quantity, to ensure that this Transparent copy will |
4153 |
remain thus accessible at the stated location until at least one |
4154 |
year after the last time you distribute an Opaque copy (directly or |
4155 |
through your agents or retailers) of that edition to the public. |
4156 |
|
4157 |
It is requested, but not required, that you contact the authors of |
4158 |
the Document well before redistributing any large number of copies, |
4159 |
to give them a chance to provide you with an updated version of the |
4160 |
Document. |
4161 |
|
4162 |
4. MODIFICATIONS |
4163 |
|
4164 |
You may copy and distribute a Modified Version of the Document |
4165 |
under the conditions of sections 2 and 3 above, provided that you |
4166 |
release the Modified Version under precisely this License, with the |
4167 |
Modified Version filling the role of the Document, thus licensing |
4168 |
distribution and modification of the Modified Version to whoever |
4169 |
possesses a copy of it. In addition, you must do these things in |
4170 |
the Modified Version: |
4171 |
|
4172 |
A. Use in the Title Page (and on the covers, if any) a title |
4173 |
distinct from that of the Document, and from those of previous |
4174 |
versions (which should, if there were any, be listed in the |
4175 |
History section of the Document). You may use the same title |
4176 |
as a previous version if the original publisher of that |
4177 |
version gives permission. |
4178 |
|
4179 |
B. List on the Title Page, as authors, one or more persons or |
4180 |
entities responsible for authorship of the modifications in |
4181 |
the Modified Version, together with at least five of the |
4182 |
principal authors of the Document (all of its principal |
4183 |
authors, if it has fewer than five), unless they release you |
4184 |
from this requirement. |
4185 |
|
4186 |
C. State on the Title page the name of the publisher of the |
4187 |
Modified Version, as the publisher. |
4188 |
|
4189 |
D. Preserve all the copyright notices of the Document. |
4190 |
|
4191 |
E. Add an appropriate copyright notice for your modifications |
4192 |
adjacent to the other copyright notices. |
4193 |
|
4194 |
F. Include, immediately after the copyright notices, a license |
4195 |
notice giving the public permission to use the Modified |
4196 |
Version under the terms of this License, in the form shown in |
4197 |
the Addendum below. |
4198 |
|
4199 |
G. Preserve in that license notice the full lists of Invariant |
4200 |
Sections and required Cover Texts given in the Document's |
4201 |
license notice. |
4202 |
|
4203 |
H. Include an unaltered copy of this License. |
4204 |
|
4205 |
I. Preserve the section Entitled "History", Preserve its Title, |
4206 |
and add to it an item stating at least the title, year, new |
4207 |
authors, and publisher of the Modified Version as given on the |
4208 |
Title Page. If there is no section Entitled "History" in the |
4209 |
Document, create one stating the title, year, authors, and |
4210 |
publisher of the Document as given on its Title Page, then add |
4211 |
an item describing the Modified Version as stated in the |
4212 |
previous sentence. |
4213 |
|
4214 |
J. Preserve the network location, if any, given in the Document |
4215 |
for public access to a Transparent copy of the Document, and |
4216 |
likewise the network locations given in the Document for |
4217 |
previous versions it was based on. These may be placed in the |
4218 |
"History" section. You may omit a network location for a work |
4219 |
that was published at least four years before the Document |
4220 |
itself, or if the original publisher of the version it refers |
4221 |
to gives permission. |
4222 |
|
4223 |
K. For any section Entitled "Acknowledgements" or "Dedications", |
4224 |
Preserve the Title of the section, and preserve in the section |
4225 |
all the substance and tone of each of the contributor |
4226 |
acknowledgements and/or dedications given therein. |
4227 |
|
4228 |
L. Preserve all the Invariant Sections of the Document, unaltered |
4229 |
in their text and in their titles. Section numbers or the |
4230 |
equivalent are not considered part of the section titles. |
4231 |
|
4232 |
M. Delete any section Entitled "Endorsements". Such a section |
4233 |
may not be included in the Modified Version. |
4234 |
|
4235 |
N. Do not retitle any existing section to be Entitled |
4236 |
"Endorsements" or to conflict in title with any Invariant |
4237 |
Section. |
4238 |
|
4239 |
O. Preserve any Warranty Disclaimers. |
4240 |
|
4241 |
If the Modified Version includes new front-matter sections or |
4242 |
appendices that qualify as Secondary Sections and contain no |
4243 |
material copied from the Document, you may at your option designate |
4244 |
some or all of these sections as invariant. To do this, add their |
4245 |
titles to the list of Invariant Sections in the Modified Version's |
4246 |
license notice. These titles must be distinct from any other |
4247 |
section titles. |
4248 |
|
4249 |
You may add a section Entitled "Endorsements", provided it contains |
4250 |
nothing but endorsements of your Modified Version by various |
4251 |
parties--for example, statements of peer review or that the text |
4252 |
has been approved by an organization as the authoritative |
4253 |
definition of a standard. |
4254 |
|
4255 |
You may add a passage of up to five words as a Front-Cover Text, |
4256 |
and a passage of up to 25 words as a Back-Cover Text, to the end of |
4257 |
the list of Cover Texts in the Modified Version. Only one passage |
4258 |
of Front-Cover Text and one of Back-Cover Text may be added by (or |
4259 |
through arrangements made by) any one entity. If the Document |
4260 |
already includes a cover text for the same cover, previously added |
4261 |
by you or by arrangement made by the same entity you are acting on |
4262 |
behalf of, you may not add another; but you may replace the old |
4263 |
one, on explicit permission from the previous publisher that added |
4264 |
the old one. |
4265 |
|
4266 |
The author(s) and publisher(s) of the Document do not by this |
4267 |
License give permission to use their names for publicity for or to |
4268 |
assert or imply endorsement of any Modified Version. |
4269 |
|
4270 |
5. COMBINING DOCUMENTS |
4271 |
|
4272 |
You may combine the Document with other documents released under |
4273 |
this License, under the terms defined in section 4 above for |
4274 |
modified versions, provided that you include in the combination all |
4275 |
of the Invariant Sections of all of the original documents, |
4276 |
unmodified, and list them all as Invariant Sections of your |
4277 |
combined work in its license notice, and that you preserve all |
4278 |
their Warranty Disclaimers. |
4279 |
|
4280 |
The combined work need only contain one copy of this License, and |
4281 |
multiple identical Invariant Sections may be replaced with a single |
4282 |
copy. If there are multiple Invariant Sections with the same name |
4283 |
but different contents, make the title of each such section unique |
4284 |
by adding at the end of it, in parentheses, the name of the |
4285 |
original author or publisher of that section if known, or else a |
4286 |
unique number. Make the same adjustment to the section titles in |
4287 |
the list of Invariant Sections in the license notice of the |
4288 |
combined work. |
4289 |
|
4290 |
In the combination, you must combine any sections Entitled |
4291 |
"History" in the various original documents, forming one section |
4292 |
Entitled "History"; likewise combine any sections Entitled |
4293 |
"Acknowledgements", and any sections Entitled "Dedications". You |
4294 |
must delete all sections Entitled "Endorsements." |
4295 |
|
4296 |
6. COLLECTIONS OF DOCUMENTS |
4297 |
|
4298 |
You may make a collection consisting of the Document and other |
4299 |
documents released under this License, and replace the individual |
4300 |
copies of this License in the various documents with a single copy |
4301 |
that is included in the collection, provided that you follow the |
4302 |
rules of this License for verbatim copying of each of the documents |
4303 |
in all other respects. |
4304 |
|
4305 |
You may extract a single document from such a collection, and |
4306 |
distribute it individually under this License, provided you insert |
4307 |
a copy of this License into the extracted document, and follow this |
4308 |
License in all other respects regarding verbatim copying of that |
4309 |
document. |
4310 |
|
4311 |
7. AGGREGATION WITH INDEPENDENT WORKS |
4312 |
|
4313 |
A compilation of the Document or its derivatives with other |
4314 |
separate and independent documents or works, in or on a volume of a |
4315 |
storage or distribution medium, is called an "aggregate" if the |
4316 |
copyright resulting from the compilation is not used to limit the |
4317 |
legal rights of the compilation's users beyond what the individual |
4318 |
works permit. When the Document is included in an aggregate, this |
4319 |
License does not apply to the other works in the aggregate which |
4320 |
are not themselves derivative works of the Document. |
4321 |
|
4322 |
If the Cover Text requirement of section 3 is applicable to these |
4323 |
copies of the Document, then if the Document is less than one half |
4324 |
of the entire aggregate, the Document's Cover Texts may be placed |
4325 |
on covers that bracket the Document within the aggregate, or the |
4326 |
electronic equivalent of covers if the Document is in electronic |
4327 |
form. Otherwise they must appear on printed covers that bracket |
4328 |
the whole aggregate. |
4329 |
|
4330 |
8. TRANSLATION |
4331 |
|
4332 |
Translation is considered a kind of modification, so you may |
4333 |
distribute translations of the Document under the terms of section |
4334 |
4. Replacing Invariant Sections with translations requires special |
4335 |
permission from their copyright holders, but you may include |
4336 |
translations of some or all Invariant Sections in addition to the |
4337 |
original versions of these Invariant Sections. You may include a |
4338 |
translation of this License, and all the license notices in the |
4339 |
Document, and any Warranty Disclaimers, provided that you also |
4340 |
include the original English version of this License and the |
4341 |
original versions of those notices and disclaimers. In case of a |
4342 |
disagreement between the translation and the original version of |
4343 |
this License or a notice or disclaimer, the original version will |
4344 |
prevail. |
4345 |
|
4346 |
If a section in the Document is Entitled "Acknowledgements", |
4347 |
"Dedications", or "History", the requirement (section 4) to |
4348 |
Preserve its Title (section 1) will typically require changing the |
4349 |
actual title. |
4350 |
|
4351 |
9. TERMINATION |
4352 |
|
4353 |
You may not copy, modify, sublicense, or distribute the Document |
4354 |
except as expressly provided under this License. Any attempt |
4355 |
otherwise to copy, modify, sublicense, or distribute it is void, |
4356 |
and will automatically terminate your rights under this License. |
4357 |
|
4358 |
However, if you cease all violation of this License, then your |
4359 |
license from a particular copyright holder is reinstated (a) |
4360 |
provisionally, unless and until the copyright holder explicitly and |
4361 |
finally terminates your license, and (b) permanently, if the |
4362 |
copyright holder fails to notify you of the violation by some |
4363 |
reasonable means prior to 60 days after the cessation. |
4364 |
|
4365 |
Moreover, your license from a particular copyright holder is |
4366 |
reinstated permanently if the copyright holder notifies you of the |
4367 |
violation by some reasonable means, this is the first time you have |
4368 |
received notice of violation of this License (for any work) from |
4369 |
that copyright holder, and you cure the violation prior to 30 days |
4370 |
after your receipt of the notice. |
4371 |
|
4372 |
Termination of your rights under this section does not terminate |
4373 |
the licenses of parties who have received copies or rights from you |
4374 |
under this License. If your rights have been terminated and not |
4375 |
permanently reinstated, receipt of a copy of some or all of the |
4376 |
same material does not give you any rights to use it. |
4377 |
|
4378 |
10. FUTURE REVISIONS OF THIS LICENSE |
4379 |
|
4380 |
The Free Software Foundation may publish new, revised versions of |
4381 |
the GNU Free Documentation License from time to time. Such new |
4382 |
versions will be similar in spirit to the present version, but may |
4383 |
differ in detail to address new problems or concerns. See |
4384 |
<https://www.gnu.org/copyleft/>. |
4385 |
|
4386 |
Each version of the License is given a distinguishing version |
4387 |
number. If the Document specifies that a particular numbered |
4388 |
version of this License "or any later version" applies to it, you |
4389 |
have the option of following the terms and conditions either of |
4390 |
that specified version or of any later version that has been |
4391 |
published (not as a draft) by the Free Software Foundation. If the |
4392 |
Document does not specify a version number of this License, you may |
4393 |
choose any version ever published (not as a draft) by the Free |
4394 |
Software Foundation. If the Document specifies that a proxy can |
4395 |
decide which future versions of this License can be used, that |
4396 |
proxy's public statement of acceptance of a version permanently |
4397 |
authorizes you to choose that version for the Document. |
4398 |
|
4399 |
11. RELICENSING |
4400 |
|
4401 |
"Massive Multiauthor Collaboration Site" (or "MMC Site") means any |
4402 |
World Wide Web server that publishes copyrightable works and also |
4403 |
provides prominent facilities for anybody to edit those works. A |
4404 |
public wiki that anybody can edit is an example of such a server. |
4405 |
A "Massive Multiauthor Collaboration" (or "MMC") contained in the |
4406 |
site means any set of copyrightable works thus published on the MMC |
4407 |
site. |
4408 |
|
4409 |
"CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0 |
4410 |
license published by Creative Commons Corporation, a not-for-profit |
4411 |
corporation with a principal place of business in San Francisco, |
4412 |
California, as well as future copyleft versions of that license |
4413 |
published by that same organization. |
4414 |
|
4415 |
"Incorporate" means to publish or republish a Document, in whole or |
4416 |
in part, as part of another Document. |
4417 |
|
4418 |
An MMC is "eligible for relicensing" if it is licensed under this |
4419 |
License, and if all works that were first published under this |
4420 |
License somewhere other than this MMC, and subsequently |
4421 |
incorporated in whole or in part into the MMC, (1) had no cover |
4422 |
texts or invariant sections, and (2) were thus incorporated prior |
4423 |
to November 1, 2008. |
4424 |
|
4425 |
The operator of an MMC Site may republish an MMC contained in the |
4426 |
site under CC-BY-SA on the same site at any time before August 1, |
4427 |
2009, provided the MMC is eligible for relicensing. |
4428 |
|
4429 |
ADDENDUM: How to use this License for your documents |
4430 |
==================================================== |
4431 |
|
4432 |
To use this License in a document you have written, include a copy of |
4433 |
the License in the document and put the following copyright and license |
4434 |
notices just after the title page: |
4435 |
|
4436 |
Copyright (C) YEAR YOUR NAME. |
4437 |
Permission is granted to copy, distribute and/or modify this document |
4438 |
under the terms of the GNU Free Documentation License, Version 1.3 |
4439 |
or any later version published by the Free Software Foundation; |
4440 |
with no Invariant Sections, no Front-Cover Texts, and no Back-Cover |
4441 |
Texts. A copy of the license is included in the section entitled ``GNU |
4442 |
Free Documentation License''. |
4443 |
|
4444 |
If you have Invariant Sections, Front-Cover Texts and Back-Cover |
4445 |
Texts, replace the "with...Texts." line with this: |
4446 |
|
4447 |
with the Invariant Sections being LIST THEIR TITLES, with |
4448 |
the Front-Cover Texts being LIST, and with the Back-Cover Texts |
4449 |
being LIST. |
4450 |
|
4451 |
If you have Invariant Sections without Cover Texts, or some other |
4452 |
combination of the three, merge those two alternatives to suit the |
4453 |
situation. |
4454 |
|
4455 |
If your document contains nontrivial examples of program code, we |
4456 |
recommend releasing these examples in parallel under your choice of free |
4457 |
software license, such as the GNU General Public License, to permit |
4458 |
their use in free software. |
4459 |
|
4460 |
Concept Index |
4461 |
************* |
4462 |
|
4463 |
This is a general index of all issues discussed in this manual, with the |
4464 |
exception of the 'sed' commands and command-line options. |
4465 |
|
4466 |
* Menu: |
4467 |
|
4468 |
* -e, example: Overview. (line 140) |
4469 |
* -e, example <1>: sed script overview. |
4470 |
(line 415) |
4471 |
* --expression, example: Overview. (line 140) |
4472 |
* -f, example: Overview. (line 140) |
4473 |
* -f, example <1>: sed script overview. |
4474 |
(line 415) |
4475 |
* --file, example: Overview. (line 140) |
4476 |
* -i, example: Overview. (line 120) |
4477 |
* -n, example: Overview. (line 127) |
4478 |
* -s, example: Overview. (line 134) |
4479 |
* 0 address: Reporting Bugs. (line 3934) |
4480 |
* ;, command separator: sed script overview. |
4481 |
(line 415) |
4482 |
* a, and semicolons: sed script overview. |
4483 |
(line 434) |
4484 |
* Additional reading about sed: Other Resources. (line 3809) |
4485 |
* ADDR1,+N: Range Addresses. (line 1611) |
4486 |
* ADDR1,~N: Range Addresses. (line 1611) |
4487 |
* address range, example: sed script overview. |
4488 |
(line 401) |
4489 |
* Address, as a regular expression: Regexp Addresses. (line 1492) |
4490 |
* Address, last line: Numeric Addresses. (line 1456) |
4491 |
* Address, numeric: Numeric Addresses. (line 1451) |
4492 |
* addresses, excluding: Addresses overview. (line 1431) |
4493 |
* Addresses, in sed scripts: Numeric Addresses. (line 1449) |
4494 |
* addresses, negating: Addresses overview. (line 1431) |
4495 |
* addresses, numeric: Addresses overview. (line 1406) |
4496 |
* addresses, range: Addresses overview. (line 1424) |
4497 |
* addresses, regular expression: Addresses overview. (line 1418) |
4498 |
* addresses, syntax: sed script overview. |
4499 |
(line 391) |
4500 |
* alphabetic characters: Character Classes and Bracket Expressions. |
4501 |
(line 1988) |
4502 |
* alphanumeric characters: Character Classes and Bracket Expressions. |
4503 |
(line 1983) |
4504 |
* Append hold space to pattern space: Other Commands. (line 1110) |
4505 |
* Append next input line to pattern space: Other Commands. (line 1083) |
4506 |
* Append pattern space to hold space: Other Commands. (line 1102) |
4507 |
* Appending text after a line: Other Commands. (line 871) |
4508 |
* b, joining lines with: Branching and flow control. |
4509 |
(line 2753) |
4510 |
* b, versus t: Branching and flow control. |
4511 |
(line 2753) |
4512 |
* back-reference: Back-references and Subexpressions. |
4513 |
(line 2166) |
4514 |
* Backreferences, in regular expressions: The "s" Command. (line 616) |
4515 |
* blank characters: Character Classes and Bracket Expressions. |
4516 |
(line 1993) |
4517 |
* bracket expression: Character Classes and Bracket Expressions. |
4518 |
(line 1945) |
4519 |
* Branch to a label, if s/// failed: Extended Commands. (line 1203) |
4520 |
* Branch to a label, if s/// succeeded: Programming Commands. |
4521 |
(line 1139) |
4522 |
* Branch to a label, unconditionally: Programming Commands. |
4523 |
(line 1135) |
4524 |
* branching and n, N: Branching and flow control. |
4525 |
(line 2708) |
4526 |
* branching, infinite loop: Branching and flow control. |
4527 |
(line 2698) |
4528 |
* branching, joining lines: Branching and flow control. |
4529 |
(line 2753) |
4530 |
* Buffer spaces, pattern and hold: Execution Cycle. (line 2470) |
4531 |
* Bugs, reporting: Reporting Bugs. (line 3826) |
4532 |
* c, and semicolons: sed script overview. |
4533 |
(line 434) |
4534 |
* case insensitive, regular expression: Regexp Addresses. (line 1526) |
4535 |
* Case-insensitive matching: The "s" Command. (line 715) |
4536 |
* Caveat -- #n on first line: Common Commands. (line 750) |
4537 |
* character class: Character Classes and Bracket Expressions. |
4538 |
(line 1945) |
4539 |
* character classes: Character Classes and Bracket Expressions. |
4540 |
(line 1982) |
4541 |
* classes of characters: Character Classes and Bracket Expressions. |
4542 |
(line 1982) |
4543 |
* Command groups: Common Commands. (line 821) |
4544 |
* Comments, in scripts: Common Commands. (line 742) |
4545 |
* Conditional branch: Programming Commands. |
4546 |
(line 1139) |
4547 |
* Conditional branch <1>: Extended Commands. (line 1203) |
4548 |
* control characters: Character Classes and Bracket Expressions. |
4549 |
(line 1996) |
4550 |
* Copy hold space into pattern space: Other Commands. (line 1106) |
4551 |
* Copy pattern space into hold space: Other Commands. (line 1098) |
4552 |
* cycle, restarting: Branching and flow control. |
4553 |
(line 2678) |
4554 |
* d, example: sed script overview. |
4555 |
(line 401) |
4556 |
* Delete first line from pattern space: Other Commands. (line 1077) |
4557 |
* digit characters: Character Classes and Bracket Expressions. |
4558 |
(line 2001) |
4559 |
* Disabling autoprint, from command line: Command-Line Options. |
4560 |
(line 178) |
4561 |
* empty regular expression: Regexp Addresses. (line 1501) |
4562 |
* Emptying pattern space: Extended Commands. (line 1225) |
4563 |
* Emptying pattern space <1>: Reporting Bugs. (line 3963) |
4564 |
* Evaluate Bourne-shell commands: Extended Commands. (line 1152) |
4565 |
* Evaluate Bourne-shell commands, after substitution: The "s" Command. |
4566 |
(line 706) |
4567 |
* example, address range: sed script overview. |
4568 |
(line 401) |
4569 |
* example, regular expression: sed script overview. |
4570 |
(line 406) |
4571 |
* Exchange hold space with pattern space: Other Commands. (line 1114) |
4572 |
* Excluding lines: Addresses overview. (line 1431) |
4573 |
* exit status: Exit status. (line 353) |
4574 |
* exit status, example: Exit status. (line 372) |
4575 |
* Extended regular expressions, choosing: Command-Line Options. |
4576 |
(line 290) |
4577 |
* Extended regular expressions, syntax: ERE syntax. (line 1908) |
4578 |
* File name, printing: Extended Commands. (line 1170) |
4579 |
* Files to be processed as input: Command-Line Options. |
4580 |
(line 336) |
4581 |
* Flow of control in scripts: Programming Commands. |
4582 |
(line 1128) |
4583 |
* Global substitution: The "s" Command. (line 672) |
4584 |
* GNU extensions, /dev/stderr file: The "s" Command. (line 699) |
4585 |
* GNU extensions, /dev/stderr file <1>: Other Commands. (line 1066) |
4586 |
* GNU extensions, /dev/stdin file: Other Commands. (line 1053) |
4587 |
* GNU extensions, /dev/stdin file <1>: Extended Commands. (line 1193) |
4588 |
* GNU extensions, /dev/stdout file: Command-Line Options. |
4589 |
(line 344) |
4590 |
* GNU extensions, /dev/stdout file <1>: The "s" Command. (line 699) |
4591 |
* GNU extensions, /dev/stdout file <2>: Other Commands. (line 1066) |
4592 |
* GNU extensions, 0 address: Range Addresses. (line 1611) |
4593 |
* GNU extensions, 0 address <1>: Reporting Bugs. (line 3934) |
4594 |
* GNU extensions, 0,ADDR2 addressing: Range Addresses. (line 1611) |
4595 |
* GNU extensions, ADDR1,+N addressing: Range Addresses. (line 1611) |
4596 |
* GNU extensions, ADDR1,~N addressing: Range Addresses. (line 1611) |
4597 |
* GNU extensions, branch if s/// failed: Extended Commands. (line 1203) |
4598 |
* GNU extensions, case modifiers in s commands: The "s" Command. |
4599 |
(line 627) |
4600 |
* GNU extensions, checking for their presence: Extended Commands. |
4601 |
(line 1209) |
4602 |
* GNU extensions, debug: Command-Line Options. |
4603 |
(line 184) |
4604 |
* GNU extensions, disabling: Command-Line Options. |
4605 |
(line 257) |
4606 |
* GNU extensions, emptying pattern space: Extended Commands. (line 1225) |
4607 |
* GNU extensions, emptying pattern space <1>: Reporting Bugs. |
4608 |
(line 3963) |
4609 |
* GNU extensions, evaluating Bourne-shell commands: The "s" Command. |
4610 |
(line 706) |
4611 |
* GNU extensions, evaluating Bourne-shell commands <1>: Extended Commands. |
4612 |
(line 1152) |
4613 |
* GNU extensions, extended regular expressions: Command-Line Options. |
4614 |
(line 290) |
4615 |
* GNU extensions, g and NUMBER modifier: The "s" Command. (line 678) |
4616 |
* GNU extensions, I modifier: The "s" Command. (line 715) |
4617 |
* GNU extensions, I modifier <1>: Regexp Addresses. (line 1526) |
4618 |
* GNU extensions, in-place editing: Command-Line Options. |
4619 |
(line 211) |
4620 |
* GNU extensions, in-place editing <1>: Reporting Bugs. (line 3915) |
4621 |
* GNU extensions, M modifier: The "s" Command. (line 720) |
4622 |
* GNU extensions, M modifier <1>: Regexp Addresses. (line 1554) |
4623 |
* GNU extensions, modifiers and the empty regular expression: Regexp Addresses. |
4624 |
(line 1501) |
4625 |
* GNU extensions, N~M addresses: Numeric Addresses. (line 1461) |
4626 |
* GNU extensions, quitting silently: Extended Commands. (line 1176) |
4627 |
* GNU extensions, R command: Extended Commands. (line 1193) |
4628 |
* GNU extensions, reading a file a line at a time: Extended Commands. |
4629 |
(line 1193) |
4630 |
* GNU extensions, returning an exit code: Common Commands. (line 758) |
4631 |
* GNU extensions, returning an exit code <1>: Extended Commands. |
4632 |
(line 1176) |
4633 |
* GNU extensions, setting line length: Other Commands. (line 1033) |
4634 |
* GNU extensions, special escapes: Escapes. (line 2223) |
4635 |
* GNU extensions, special escapes <1>: Reporting Bugs. (line 3908) |
4636 |
* GNU extensions, special two-address forms: Range Addresses. |
4637 |
(line 1611) |
4638 |
* GNU extensions, subprocesses: The "s" Command. (line 706) |
4639 |
* GNU extensions, subprocesses <1>: Extended Commands. (line 1152) |
4640 |
* GNU extensions, to basic regular expressions: BRE syntax. (line 1743) |
4641 |
* GNU extensions, to basic regular expressions <1>: BRE syntax. |
4642 |
(line 1789) |
4643 |
* GNU extensions, to basic regular expressions <2>: BRE syntax. |
4644 |
(line 1792) |
4645 |
* GNU extensions, to basic regular expressions <3>: BRE syntax. |
4646 |
(line 1807) |
4647 |
* GNU extensions, to basic regular expressions <4>: BRE syntax. |
4648 |
(line 1817) |
4649 |
* GNU extensions, to basic regular expressions <5>: Reporting Bugs. |
4650 |
(line 3881) |
4651 |
* GNU extensions, two addresses supported by most commands: Other Commands. |
4652 |
(line 887) |
4653 |
* GNU extensions, two addresses supported by most commands <1>: Other Commands. |
4654 |
(line 941) |
4655 |
* GNU extensions, two addresses supported by most commands <2>: Other Commands. |
4656 |
(line 1030) |
4657 |
* GNU extensions, two addresses supported by most commands <3>: Other Commands. |
4658 |
(line 1062) |
4659 |
* GNU extensions, unlimited line length: Limitations. (line 3787) |
4660 |
* GNU extensions, writing first line to a file: Extended Commands. |
4661 |
(line 1220) |
4662 |
* Goto, in scripts: Programming Commands. |
4663 |
(line 1135) |
4664 |
* graphic characters: Character Classes and Bracket Expressions. |
4665 |
(line 2004) |
4666 |
* Greedy regular expression matching: BRE syntax. (line 1843) |
4667 |
* Grouping commands: Common Commands. (line 821) |
4668 |
* hexadecimal digits: Character Classes and Bracket Expressions. |
4669 |
(line 2027) |
4670 |
* Hold space, appending from pattern space: Other Commands. (line 1102) |
4671 |
* Hold space, appending to pattern space: Other Commands. (line 1110) |
4672 |
* Hold space, copy into pattern space: Other Commands. (line 1106) |
4673 |
* Hold space, copying pattern space into: Other Commands. (line 1098) |
4674 |
* Hold space, definition: Execution Cycle. (line 2470) |
4675 |
* Hold space, exchange with pattern space: Other Commands. (line 1114) |
4676 |
* i, and semicolons: sed script overview. (line 434)
* In-place editing: Reporting Bugs. (line 3915)
* In-place editing, activating: Command-Line Options. (line 211)
* In-place editing, Perl-style backup file names: Command-Line Options. (line 222)
* infinite loop, branching: Branching and flow control. (line 2698)
* Inserting text before a line: Other Commands. (line 930)
* joining lines with branching: Branching and flow control. (line 2753)
* joining quoted-printable lines: Branching and flow control. (line 2753)
* labels: Branching and flow control. (line 2678)
* Labels, in scripts: Programming Commands. (line 1131)
* Last line, selecting: Numeric Addresses. (line 1456)
* Line length, setting: Command-Line Options. (line 252)
* Line length, setting <1>: Other Commands. (line 1033)
* Line number, printing: Other Commands. (line 1020)
* Line selection: Numeric Addresses. (line 1449)
* Line, selecting by number: Numeric Addresses. (line 1451)
* Line, selecting by regular expression match: Regexp Addresses. (line 1492)
* Line, selecting last: Numeric Addresses. (line 1456)
* List pattern space: Other Commands. (line 1033)
* lower-case letters: Character Classes and Bracket Expressions. (line 2007)
* Mixing g and NUMBER modifiers in the s command: The "s" Command. (line 678)
* multiple files: Overview. (line 134)
* multiple sed commands: sed script overview. (line 415)
* n, and branching: Branching and flow control. (line 2708)
* N, and branching: Branching and flow control. (line 2708)
* named character classes: Character Classes and Bracket Expressions. (line 1982)
* newline, command separator: sed script overview. (line 415)
* Next input line, append to pattern space: Other Commands. (line 1083)
* Next input line, replace pattern space with: Common Commands. (line 791)
* Non-bugs, 0 address: Reporting Bugs. (line 3934)
* Non-bugs, in-place editing: Reporting Bugs. (line 3915)
* Non-bugs, localization-related: Reporting Bugs. (line 3944)
* Non-bugs, localization-related <1>: Reporting Bugs. (line 3963)
* Non-bugs, N command on the last line: Reporting Bugs. (line 3850)
* Non-bugs, regex syntax clashes: Reporting Bugs. (line 3881)
* numeric addresses: Addresses overview. (line 1406)
* numeric characters: Character Classes and Bracket Expressions. (line 2001)
* omitting labels: Branching and flow control. (line 2678)
* output: Overview. (line 120)
* output, suppressing: Overview. (line 127)
* p, example: Overview. (line 127)
* paragraphs, processing: Multiline techniques. (line 2553)
* parameters, script: Overview. (line 140)
* Parenthesized substrings: The "s" Command. (line 616)
* Pattern space, definition: Execution Cycle. (line 2470)
* Portability, comments: Common Commands. (line 745)
* Portability, line length limitations: Limitations. (line 3787)
* Portability, N command on the last line: Reporting Bugs. (line 3850)
* POSIXLY_CORRECT behavior, bracket expressions: Character Classes and Bracket Expressions. (line 2051)
* POSIXLY_CORRECT behavior, enabling: Command-Line Options. (line 260)
* POSIXLY_CORRECT behavior, escapes: Escapes. (line 2228)
* POSIXLY_CORRECT behavior, N command: Reporting Bugs. (line 3876)
* Print first line from pattern space: Other Commands. (line 1095)
* printable characters: Character Classes and Bracket Expressions. (line 2011)
* Printing file name: Extended Commands. (line 1170)
* Printing line number: Other Commands. (line 1020)
* Printing text unambiguously: Other Commands. (line 1033)
* processing paragraphs: Multiline techniques. (line 2553)
* punctuation characters: Character Classes and Bracket Expressions. (line 2014)
* Q, example: Exit status. (line 372)
* q, example: sed script overview. (line 406)
* Quitting: Common Commands. (line 758)
* Quitting <1>: Extended Commands. (line 1176)
* quoted-printable lines, joining: Branching and flow control. (line 2753)
* range addresses: Addresses overview. (line 1424)
* range expression: Character Classes and Bracket Expressions. (line 1957)
* Range of lines: Range Addresses. (line 1586)
* Range with start address of zero: Range Addresses. (line 1611)
* Read next input line: Common Commands. (line 791)
* Read text from a file: Other Commands. (line 1045)
* Read text from a file <1>: Extended Commands. (line 1193)
* regex addresses and input lines: Regexp Addresses. (line 1563)
* regex addresses and pattern space: Regexp Addresses. (line 1563)
* regular expression addresses: Addresses overview. (line 1418)
* regular expression, example: sed script overview. (line 406)
* Replace hold space with copy of pattern space: Other Commands. (line 1098)
* Replace pattern space with copy of hold space: Other Commands. (line 1106)
* Replacing all text matching regexp in a line: The "s" Command. (line 672)
* Replacing only Nth match of regexp in a line: The "s" Command. (line 676)
* Replacing selected lines with other text: Other Commands. (line 983)
* Requiring GNU sed: Extended Commands. (line 1209)
* restarting a cycle: Branching and flow control. (line 2678)
* Sandbox mode: Command-Line Options. (line 312)
* script parameter: Overview. (line 140)
* Script structure: sed script overview. (line 384)
* Script, from a file: Command-Line Options. (line 206)
* Script, from command line: Command-Line Options. (line 201)
* sed commands syntax: sed script overview. (line 391)
* sed commands, multiple: sed script overview. (line 415)
* sed script structure: sed script overview. (line 384)
* Selecting lines to process: Numeric Addresses. (line 1449)
* Selecting non-matching lines: Addresses overview. (line 1431)
* semicolons, command separator: sed script overview. (line 415)
* Several lines, selecting: Range Addresses. (line 1586)
* Slash character, in regular expressions: Regexp Addresses. (line 1511)
* space characters: Character Classes and Bracket Expressions. (line 2019)
* Spaces, pattern and hold: Execution Cycle. (line 2470)
* Special addressing forms: Range Addresses. (line 1611)
* standard input: Overview. (line 112)
* Standard input, processing as input: Command-Line Options. (line 338)
* standard output: Overview. (line 120)
* stdin: Overview. (line 112)
* stdout: Overview. (line 120)
* Stream editor: Introduction. (line 86)
* subexpression: Back-references and Subexpressions. (line 2166)
* Subprocesses: The "s" Command. (line 706)
* Subprocesses <1>: Extended Commands. (line 1152)
* Substitution of text, options: The "s" Command. (line 668)
* suppressing output: Overview. (line 127)
* syntax, addresses: sed script overview. (line 391)
* syntax, sed commands: sed script overview. (line 391)
* t, joining lines with: Branching and flow control. (line 2753)
* t, versus b: Branching and flow control. (line 2753)
* Text, appending: Other Commands. (line 871)
* Text, deleting: Common Commands. (line 774)
* Text, insertion: Other Commands. (line 930)
* Text, printing: Common Commands. (line 782)
* Text, printing after substitution: The "s" Command. (line 686)
* Text, writing to a file after substitution: The "s" Command. (line 699)
* Transliteration: Other Commands. (line 837)
* Unbuffered I/O, choosing: Command-Line Options. (line 319)
* upper-case letters: Character Classes and Bracket Expressions. (line 2023)
* Usage summary, printing: Command-Line Options. (line 172)
* Version, printing: Command-Line Options. (line 168)
* whitespace characters: Character Classes and Bracket Expressions. (line 2019)
* Working on separate files: Command-Line Options. (line 303)
* Write first line to a file: Extended Commands. (line 1220)
* Write to a file: Other Commands. (line 1066)
* xdigit class: Character Classes and Bracket Expressions. (line 2027)
* Zero, as range start address: Range Addresses. (line 1611)

Command and Option Index
************************

This is an alphabetical list of all 'sed' commands and command-line
options.

* Menu:

* # (comments): Common Commands. (line 742)
* --binary: Command-Line Options. (line 269)
* --debug: Command-Line Options. (line 184)
* --expression: Command-Line Options. (line 201)
* --file: Command-Line Options. (line 206)
* --follow-symlinks: Command-Line Options. (line 280)
* --help: Command-Line Options. (line 172)
* --in-place: Command-Line Options. (line 211)
* --line-length: Command-Line Options. (line 252)
* --null-data: Command-Line Options. (line 327)
* --posix: Command-Line Options. (line 257)
* --quiet: Command-Line Options. (line 178)
* --regexp-extended: Command-Line Options. (line 290)
* --sandbox: Command-Line Options. (line 312)
* --separate: Command-Line Options. (line 303)
* --silent: Command-Line Options. (line 178)
* --unbuffered: Command-Line Options. (line 319)
* --version: Command-Line Options. (line 168)
* --zero-terminated: Command-Line Options. (line 327)
* -b: Command-Line Options. (line 269)
* -e: Command-Line Options. (line 201)
* -E: Command-Line Options. (line 290)
* -f: Command-Line Options. (line 206)
* -i: Command-Line Options. (line 211)
* -l: Command-Line Options. (line 252)
* -n: Command-Line Options. (line 178)
* -n, forcing from within a script: Common Commands. (line 750)
* -r: Command-Line Options. (line 290)
* -s: Command-Line Options. (line 303)
* -u: Command-Line Options. (line 319)
* -z: Command-Line Options. (line 327)
* : (label) command: Programming Commands. (line 1131)
* = (print line number) command: Other Commands. (line 1020)
* {} command grouping: Common Commands. (line 821)
* a (append text lines) command: Other Commands. (line 871)
* alnum character class: Character Classes and Bracket Expressions. (line 1983)
* alpha character class: Character Classes and Bracket Expressions. (line 1988)
* b (branch) command: Programming Commands. (line 1135)
* blank character class: Character Classes and Bracket Expressions. (line 1993)
* c (change to text lines) command: Other Commands. (line 983)
* cntrl character class: Character Classes and Bracket Expressions. (line 1996)
* D (delete first line) command: Other Commands. (line 1077)
* d (delete) command: Common Commands. (line 774)
* digit character class: Character Classes and Bracket Expressions. (line 2001)
* e (evaluate) command: Extended Commands. (line 1152)
* F (File name) command: Extended Commands. (line 1170)
* G (appending Get) command: Other Commands. (line 1110)
* g (get) command: Other Commands. (line 1106)
* graph character class: Character Classes and Bracket Expressions. (line 2004)
* H (append Hold) command: Other Commands. (line 1102)
* h (hold) command: Other Commands. (line 1098)
* i (insert text lines) command: Other Commands. (line 930)
* l (list unambiguously) command: Other Commands. (line 1033)
* lower character class: Character Classes and Bracket Expressions. (line 2007)
* N (append Next line) command: Other Commands. (line 1083)
* n (next-line) command: Common Commands. (line 791)
* P (print first line) command: Other Commands. (line 1095)
* p (print) command: Common Commands. (line 782)
* print character class: Character Classes and Bracket Expressions. (line 2011)
* punct character class: Character Classes and Bracket Expressions. (line 2014)
* q (quit) command: Common Commands. (line 758)
* Q (silent Quit) command: Extended Commands. (line 1176)
* r (read file) command: Other Commands. (line 1045)
* R (read line) command: Extended Commands. (line 1193)
* s command, option flags: The "s" Command. (line 668)
* space character class: Character Classes and Bracket Expressions. (line 2019)
* T (test and branch if failed) command: Extended Commands. (line 1203)
* t (test and branch if successful) command: Programming Commands. (line 1139)
* upper character class: Character Classes and Bracket Expressions. (line 2023)
* v (version) command: Extended Commands. (line 1209)
* w (write file) command: Other Commands. (line 1066)
* W (write first line) command: Extended Commands. (line 1220)
* x (eXchange) command: Other Commands. (line 1114)
* xdigit character class: Character Classes and Bracket Expressions. (line 2027)
* y (transliterate) command: Other Commands. (line 837)
* z (Zap) command: Extended Commands. (line 1225)