St Patty’s day ’12

Heavy rain pummels the Los Angeles basin. Hairy drive into the store. Near arrival it awakens. Of course the subject is funding (no surprise there). bmw’s about how much should vs. the reality of ACTUALLY having outbound elements. But in their myopic world that’s not possible. Ran for warm fresh bread. Wishing I was home watching the 6 Nations Irish v. English match. Recording for later viewing. Now to this and some productivity. (ha! Today’s supposed to be a day of fun – NOT IN MY WORLD!)

Understanding the “improved” in VIM

Sometimes, every once in a while, a programmer feels like I do. VIM should be more efficient, more effective for editing text files yet I find myself reverting to the modern mouse-based approach more often than not.

A rather famous question on Stack Overflow asked for tips on how to be more productive with VIM – and the asker got one of the best answers I’ve ever seen. Not a list of tips but a working, detailed explanation from Stack Overflow user Jim Dennis. It’s so good that I took his answer’s raw source, ran it through the Markdown parser and copied the result here.

So, without further ado, over to Jim’s amazing answer.

Your problem with Vim is that you don’t grok vi.

You mention cutting with yy and complain that you almost never want to cut whole lines. In fact programmers, editing source code, very often want to work on whole lines, ranges of lines and blocks of code. However, yy is only one of many ways to yank text into the anonymous copy buffer (or “register” as it’s called in vi).

The “Zen” of vi is that you’re speaking a language. The initial y is a verb. The statement yy is a synonym for y_. The y is doubled up to make it easier to type, since it is such a common operation.

This can also be expressed as dd P (delete the current line and paste a copy back into place; leaving a copy in the anonymous register as a side effect). The y and d “verbs” take any movement as their “subject.” Thus yW is “yank from here (the cursor) to the end of the current/next (big) word” and y'a is “yank from here to the line containing the mark named ‘a‘.”

If you only understand basic up, down, left, and right cursor movements then vi will be no more productive than a copy of “notepad” for you. (Okay, you’ll still have syntax highlighting and the ability to handle files larger than a piddling ~45KB or so; but work with me here).

vi has 26 “marks” and 26 “registers.” A mark is set to any cursor location using the m command. Each mark is designated by a single lower case letter. Thus ma sets the ‘a‘ mark to the current location, and mz sets the ‘z‘ mark. You can move to the line containing a mark using the ' (single quote) command. Thus 'a moves to the beginning of the line containing the ‘a‘ mark. You can move to the precise location of any mark using the ` (backquote) command. Thus `z will move directly to the exact location of the ‘z‘ mark.

Because these are “movements” they can also be used as subjects for other “statements.”

So, one way to cut an arbitrary selection of text would be to drop a mark (I usually use ‘a‘ as my “first” mark, ‘z‘ as my next mark, ‘b‘ as another, and ‘e‘ as yet another (I don’t recall ever having interactively used more than four marks in 15 years of using vi; one creates one’s own conventions regarding how marks and registers are used by macros that don’t disturb one’s interactive context)). Then we go to the other end of our desired text; we can start at either end, it doesn’t matter. Then we can simply use d`a to cut or y`a to copy. Thus the whole process has a 5-keystroke overhead (six if we started in “insert” mode and needed to Esc out to command mode). Once we’ve cut or copied then pasting in a copy is a single keystroke: p.

I say that this is one way to cut or copy text. However, it is only one of many. Frequently we can more succinctly describe the range of text without moving our cursor around and dropping a mark. For example if I’m in a paragraph of text I can use { and } movements to the beginning or end of the paragraph respectively. So, to move a paragraph of text I cut it using { d} (3 keystrokes). (If I happen to already be on the first or last line of the paragraph I can then simply use d} or d{ respectively.)

The notion of “paragraph” defaults to something which is usually intuitively reasonable. Thus it often works for code as well as prose.

Frequently we know some pattern (regular expression) that marks one end or the other of the text in which we’re interested. Searching forwards or backwards are movements in vi. Thus they can also be used as “subjects” in our “statements.” So I can use d/foo to cut from the current line to the next line containing the string “foo” and y?bar to copy from the current line to the most recent (previous) line containing “bar.” If I don’t want whole lines I can still use the search movements (as statements of their own), drop my mark(s) and use the `x commands as described previously.

In addition to “verbs” and “subjects” vi also has “objects” (in the grammatical sense of the term). So far I’ve only described the use of the anonymous register. However, I can use any of the 26 “named” registers by prefixing the “object” reference with " (the double quote modifier). Thus if I use "add I’m cutting the current line into the ‘a‘ register and if I use "by/foo then I’m yanking a copy of the text from here to the next line containing “foo” into the ‘b‘ register. To paste from a register I simply prefix the paste with the same modifier sequence: "ap pastes a copy of the ‘a‘ register’s contents into the text after the cursor and "bP pastes a copy from ‘b‘ to before the current line.

This notion of “prefixes” also adds the analogs of grammatical “adjectives” and “adverbs” to our text manipulation “language.” Most commands (verbs) and movements (verbs or objects, depending on context) can also take numeric prefixes. Thus 3J means “join the next three lines” and d5} means “delete from the current line through the end of the fifth paragraph down from here.”

This is all intermediate level vi. None of it is Vim specific and there are far more advanced tricks in vi if you’re ready to learn them. If you were to master just these intermediate concepts then you’d probably find that you rarely need to write any macros because the text manipulation language is sufficiently concise and expressive to do most things easily enough using the editor’s “native” language.

A sampling of more advanced tricks:

There are a number of : commands, most notably the :% s/foo/bar/g global substitution technique. (That’s not advanced but other : commands can be). The whole : set of commands was historically inherited from vi’s previous incarnations as the ed (line editor) and later the ex (extended line editor) utilities. In fact vi is so named because it’s the visual interface to ex.

: commands normally operate over lines of text. ed and ex were written in an era when terminal screens were uncommon and many terminals were “teletype” (TTY) devices. So it was common to work from printed copies of the text, using commands through an extremely terse interface (common connection speeds were 110 baud, or, roughly, 11 characters per second — which is slower than a fast typist; lags were common on multi-user interactive sessions; additionally there was often some motivation to conserve paper).

So the syntax of most : commands includes an address or range of addresses (line numbers) followed by a command. Naturally one could use literal line numbers: :127,215 s/foo/bar to change the first occurrence of “foo” into “bar” on each line between 127 and 215. One could also use some abbreviations such as . or $ for current and last lines respectively. One could also use relative prefixes + and - to refer to offsets after or before the current line, respectively. Thus: :.,$j meaning “from the current line to the last line, join them all into one line”. :% is synonymous with :1,$ (all the lines).

The :... g and :... v commands bear some explanation as they are incredibly powerful. :... g is a prefix for “globally” applying a subsequent command to all lines which match a pattern (regular expression) while :... v applies such a command to all lines which do NOT match the given pattern (“v” from “conVerse”). As with other ex commands these can be prefixed by addressing/range references. Thus :.,+21g/foo/d means “delete any lines containing the string ‘foo’ from the current one through the next 21 lines” while :.,$v/bar/d means “from here to the end of the file, delete any lines which DON’T contain the string ‘bar.’”

It’s interesting that the common Unix command grep was actually inspired by this ex command (and is named after the way in which it was documented). The ex command :g/re/p (grep) was the way they documented how to “globally” “print” lines containing a “regular expression” (re). When ed and ex were used, the :p command was one of the first that anyone learned and often the first one used when editing any file. It was how you printed the current contents (usually just one page full at a time using :.,+25p or some such).
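The g/re/p idiom can be tried outside the editor. The following is a minimal sketch, assuming a POSIX shell with sed and grep available; the file name list.txt and its contents are made up for illustration. It uses sed (a non-interactive descendant of ed) to print only the matching lines and compares that with what grep prints.

```shell
# Hypothetical sample data; the file name and contents are illustrative only.
tmpdir=$(mktemp -d)
printf '%s\n' 'foo one' 'bar two' 'foo three' > "$tmpdir/list.txt"

# "g/re/p" expressed with sed: -n suppresses the default output,
# and the p command prints only the lines matching /foo/.
sed_out=$(sed -n '/foo/p' "$tmpdir/list.txt")

# The standalone utility named after that idiom.
grep_out=$(grep 'foo' "$tmpdir/list.txt")

printf '%s\n' "$sed_out"
rm -rf "$tmpdir"
```

Both commands select the same two lines, which is the behavior the name grep records.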

Note that :% g/.../d or (its reVerse/conVerse counterpart) :% v/.../d are the most common usage patterns. However there are a couple of other ex commands which are worth remembering:

We can use m to move lines around, and j to join lines. For example if you have a list and you want to separate all the stuff matching (or conversely NOT matching) some pattern without deleting them, then you can use something like: :% g/foo/m$ … and all the “foo” lines will have been moved to the end of the file. (Note the other tip about using the end of your file as a scratch space). This will have preserved the relative order of all the “foo” lines while having extracted them from the rest of the list. (This would be equivalent to doing something like: 1G!GGmap!Ggrep foo<ENTER>1G:1,'a g/foo'/d (copy the file to its own tail, filter the tail through grep, and delete all the stuff from the head)).

To join lines usually I can find a pattern for all the lines which need to be joined to their predecessor (all the lines which start with “^ ” rather than “^ * ” in some bullet list, for example). For that case I’d use: :% g/^ /-1j (for every matching line, go up one line and join them). (BTW: for bullet lists trying to search for the bullet lines and join to the next doesn’t work for a couple reasons … it can join one bullet line to another, and it won’t join any bullet line to all of its continuations; it’ll only work pairwise on the matches).

Almost needless to mention you can use our old friend s (substitute) with the g and v (global/converse-global) commands. Usually you don’t need to do so. However, consider some case where you want to perform a substitution only on lines matching some other pattern. Often you can use a complicated pattern with captures and use back references to preserve the portions of the lines that you DON’T want to change. However, it will often be easier to separate the match from the substitution: :% g/foo/s/bar/zzz/g means “for every line containing ‘foo’ substitute all ‘bar’ with ‘zzz.’” (Something like :% s/\(.*foo.*\)bar\(.*\)/\1zzz\2/g would only work for those instances of “bar” which were PRECEDED by “foo” on the same line; it’s ungainly enough already, and would have to be mangled further to catch all the cases where “bar” preceded “foo”.)
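The same split between “which lines” and “what to do” can be sketched outside the editor with sed, which takes a pattern address in front of a substitute command. This is only an illustration under the assumption of a POSIX shell and sed; the three input lines are invented.

```shell
# Invented input: two lines contain "foo", one does not.
input='foo bar bar
baz bar
bar foo'

# Address first (/foo/), then the substitution applied only to matching lines.
# "bar" is replaced whether it comes before or after "foo" on the line.
result=$(printf '%s\n' "$input" | sed '/foo/ s/bar/zzz/g')

printf '%s\n' "$result"
```

The middle line is left alone because it never matches the address, while “bar” is replaced on the other two lines regardless of where “foo” sits.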

The point is that there are more than just 

Awesome Programming eBooks – check ’em out

A great site for quick – learn to program eBooks. Well worth the little monies.

Plus, if this quote doesn’t motivate you – well – I have nothing more to say…

“To this I have just one piece of advice: they can go to hell. The world needs more weird people who know how things work and who love to figure it all out. When they treat you like this, just remember that this is your journey, not theirs. Being different is not a crime, and people who tell you it is are just jealous that you’ve picked up a skill they never in their wildest dreams could acquire.

You can code. They cannot. That is pretty damn cool.”

http://learnpythonthehardway.org/book/advice.html

using SED

Using the sed Editor
By Emmett Dulaney

The sed editor is among the most useful assets in the Linux sysadmin’s toolbox,
so it pays to understand its applications thoroughly

One of the best things about the Linux operating system is that it is crammed full of utilities. There are so many different utilities, in fact, that it is next to impossible to know and understand all of them. One utility that can simplify life in key situations is sed. It is one of the most powerful tools in any administrator’s toolkit and can prove itself invaluable in a crunch.

The sed utility is an “editor,” but it is unlike most others. In addition to not being screen-oriented, it is also noninteractive. This means you have to insert commands to be executed on the data at the command line or in a script to be processed. When you visualize it, forget any ability to interactively edit files as you would do with Microsoft Word or most other editors. sed accepts a series of commands and executes them on a file (or set of files) noninteractively and unquestioningly. As such, it flows through text as water would through a stream, and thus sed fittingly stands for stream editor. It can be used to change all occurrences of “Mr. Smyth” to “Mr. Smith” or “tiger cub” to “wolf cub.” The stream editor is ideally suited to performing repetitive edits that would take considerable time if done manually. The parameters can be as limited as those needed for a one-time use of a simple operation, or as complex as a script file filled with thousands of lines of editing changes to be made. With very little argument, sed is one of the most useful tools in the Linux and UNIX tool chest.

How sed Works

The sed utility works by sequentially reading a file, line by line, into memory. It then performs all actions specified for the line and writes the line, with the requested changes made, to the terminal. After all actions have taken place on this one line, it reads the next line of the file and repeats the process until it is finished with the file. As mentioned, the default output is to display the contents of each line on the screen. Two important factors come into play here: first, the output can be redirected to another file to save the changes; second, the original file, by default, is left unchanged. The default is for sed to read the entire file and make changes to each line within it. It can, however, be restricted to specified lines as needed.
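The two points about output can be checked directly. The sketch below assumes a POSIX shell; the file names original.txt and edited.txt and their contents are hypothetical. It runs a substitution, redirects the result to a second file, and then reads both files back.

```shell
# Hypothetical files in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 'tiger cub' 'tiger den' > "$tmpdir/original.txt"

# sed writes the edited lines to standard output; redirect them to a new file.
sed 's/tiger/wolf/' "$tmpdir/original.txt" > "$tmpdir/edited.txt"

original=$(cat "$tmpdir/original.txt")   # still says "tiger"
edited=$(cat "$tmpdir/edited.txt")       # says "wolf"

printf 'original:\n%s\nedited:\n%s\n' "$original" "$edited"
rm -rf "$tmpdir"
```

The input file keeps its contents; only the redirected copy carries the change.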

The syntax for the utility is:

sed [options] '{command}' [filename] 

In this article, we’ll walk through the most commonly used commands and options and illustrate how they work and where they would be appropriate for use.

The Substitute Command

One of the most common uses of the sed utility, and any similar editor, is to substitute one value for another. To accomplish this, the syntax for the command portion of the operation is:

's/{old value}/{new value}/' 

Thus, the following illustrates how “tiger” can be changed to “wolf” very simply:

$ echo The tiger cubs will meet on Tuesday after school | sed 's/tiger/wolf/'
The wolf cubs will meet on Tuesday after school
$

Notice that it is not necessary to specify a filename if input is being derived from the output of a preceding command; the same is true for awk, sort, and most other Linux/UNIX command-line utility programs.

Multiple Changes

If multiple changes need to be made to the same file or line, there are three methods by which this can be accomplished. The first is to use the “-e” option, which informs the program that more than one editing command is being used. For example:

$ echo The tiger cubs will meet on Tuesday after school | sed -e 's/tiger/wolf/' -e 's/after/before/'
The wolf cubs will meet on Tuesday before school
$

This is pretty much the long way of going about it, and the “-e” option is not commonly used to any great extent. A preferable way is to separate commands with semicolons:

$ echo The tiger cubs will meet on Tuesday after school | sed 's/tiger/wolf/; s/after/before/'
The wolf cubs will meet on Tuesday before school
$

Notice that the semicolon immediately follows the closing slash. Some older sed implementations return an error if a space separates the two, although GNU sed accepts whitespace there. These two methods are well and good, but there is one more method that many administrators prefer. The key thing to note is that everything between the two apostrophes (' ') is interpreted as sed commands. The shell program reading in the commands will not assume you are finished entering until the second apostrophe is entered. This means that the command can be entered on multiple lines, with the shell changing the prompt from PS1 to a continuation prompt (usually “>”), until the second apostrophe is entered. As soon as it is entered, and Enter pressed, the processing will take place and the same results will be generated, as the following illustrates:

$ echo The tiger cubs will meet on Tuesday after school | sed '
> s/tiger/wolf/
> s/after/before/'
The wolf cubs will meet on Tuesday before school
$
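To see that the three methods are interchangeable, the following sketch (assuming a POSIX shell; the variable names are illustrative) runs the same pair of substitutions each way and keeps the results for comparison.

```shell
msg='The tiger cubs will meet on Tuesday after school'

# Method 1: one -e option per editing command.
with_e=$(echo "$msg" | sed -e 's/tiger/wolf/' -e 's/after/before/')

# Method 2: commands separated by a semicolon inside one quoted string.
with_semicolon=$(echo "$msg" | sed 's/tiger/wolf/;s/after/before/')

# Method 3: commands on separate lines inside one quoted string.
with_newline=$(echo "$msg" | sed '
s/tiger/wolf/
s/after/before/')

echo "$with_e"
```

In a script file the third form tends to be the easiest to read, since each command sits on its own line.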

Global Changes

Let’s begin with a deceptively simple edit. Suppose the message that is to be changed contains more than one occurrence of the item to be changed. By default, the result can be different than what was expected, as the following illustrates:

$ echo The tiger cubs will meet this Tuesday at the same time as the meeting last Tuesday | sed 's/Tuesday/Thursday/'
The tiger cubs will meet this Thursday at the same time as the meeting last Tuesday
$

Instead of changing every occurrence of “Tuesday” to “Thursday,” the sed editor moves on after finding a match and making the change, without processing the rest of the line. The majority of sed commands function like the substitute one, meaning they all work on the first occurrence of the chosen sequence in each line. In order for every occurrence to be substituted, in the event that more than one occurrence appears in the same line, you must specify that the action take place globally:

$ echo The tiger cubs will meet this Tuesday at the same time as the meeting last Tuesday | sed 's/Tuesday/Thursday/g'
The tiger cubs will meet this Thursday at the same time as the meeting last Thursday
$

Bear in mind that this need for globalization is true whether the sequence you are looking for consists of only one character or a phrase.
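The single-character case is easy to check. In this small sketch (the word “banana” is just an arbitrary test string), the first command changes only the first “a” and the second changes all of them.

```shell
# Without the g flag only the first match on the line is replaced.
first_only=$(echo banana | sed 's/a/o/')

# With the g flag every match on the line is replaced.
every_match=$(echo banana | sed 's/a/o/g')

echo "$first_only $every_match"
```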

sed can also be used to change record field delimiters from one to another. For example, the following will change all tabs to spaces:

sed 's/ / /g' 

where the entry between the first set of slashes is a tab, while the entry between the second set is a space. As a general rule, sed can be used to change any printable character to any other printable character. If you want to change unprintable characters to printable ones (for example, a bell to the word “bell”), sed is not the right tool for the job (but tr would be).
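Because a literal tab is hard to see (and easy to lose when copying a command), one option is to hold it in a shell variable. The sketch below assumes a POSIX shell and printf; the variable names are illustrative. It also shows tr on a bell character. Note that tr maps single characters to single characters, so this sketch turns the bell into one visible character rather than a whole word.

```shell
# Capture a literal tab so it is visible in the command as $tab.
tab=$(printf '\t')

# Double quotes let the shell expand $tab inside the sed command.
sed_result=$(printf 'a\tb\tc\n' | sed "s/$tab/ /g")

# tr performs the same character-for-character translation.
tr_result=$(printf 'a\tb\tc\n' | tr '\t' ' ')

# tr also understands escapes for unprintable characters such as the bell (\a).
bell_result=$(printf 'ring\a\n' | tr '\a' '!')

echo "$sed_result"
```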

Sometimes, you don’t want to change every occurrence that appears in a file. At times, you only want to make a change if certain conditions are met (for example, following a match of some other data). To illustrate, consider the following text file:

$ cat sample_one
one 1
two 1
three 1
one 1
two 1
two 1
three 1
$

Suppose that it would be desirable for “1” to be substituted with “2,” but only after the word “two” and not throughout every line. This can be accomplished by specifying that a match is to be found before giving the substitute command:

$ sed '/two/ s/1/2/' sample_one
one 1
two 2
three 1
one 1
two 2
two 2
three 1
$

And now, to make it even more accurate:

$ sed '
> /two/ s/1/2/
> /three/ s/1/3/' sample_one
one 1
two 2
three 3
one 1
two 2
two 2
three 3
$

Bear in mind once again that the only thing changed is the display. If you look at the original file, it is the same as it always was. You must save the output to another file to create permanence. It is worth repeating that the fact that changes are not made to the original file is a true blessing in disguise: it lets you experiment with the file without causing any real harm, until you get the right commands working exactly the way you expect and want them to.

The following saves the changed output to a new file:

$ sed '
> /two/ s/1/2/
> /three/ s/1/3/' sample_one > sample_two

The output file has all the changes incorporated in it that would normally appear on the screen. It can now be viewed with head, cat, or any other similar utility.

Script Files

The sed tool allows you to create a script file containing commands that are processed from the file, rather than at the command line, and is referenced via the “-f” option. By creating a script file, you have the ability to run the same operations over and over again, and to specify far more detailed operations than what you would want to try to tackle from the command line each time.

Consider the following script file:

$ cat sedlist
/two/ s/1/2/
/three/ s/1/3/
$

It can now be used on the data file to obtain the same results we saw earlier:

$ sed -f sedlist sample_one
one 1
two 2
three 3
one 1
two 2
two 2
three 3
$

Notice that apostrophes are not used inside the source file, or from the command line when the “-f” option is invoked. Script files, also known as source files, are invaluable for operations that you intend to repeat more than once and for complicated commands where there is a possibility that you may make an error at the command line. It is far easier to edit the source file and change one character than to retype a multiple-line entry at the command line.
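The script-file workflow can be reproduced end to end. This sketch assumes a POSIX shell and recreates sample_one and sedlist in a scratch directory (the directory is an assumption of the example, not part of the article), then compares the -f result with the equivalent command-line form.

```shell
tmpdir=$(mktemp -d)

# Recreate the article's data file.
printf '%s\n' 'one 1' 'two 1' 'three 1' 'one 1' 'two 1' 'two 1' 'three 1' \
  > "$tmpdir/sample_one"

# The script file holds bare sed commands, one per line, with no apostrophes.
cat > "$tmpdir/sedlist" <<'EOF'
/two/ s/1/2/
/three/ s/1/3/
EOF

from_script=$(sed -f "$tmpdir/sedlist" "$tmpdir/sample_one")
from_cmdline=$(sed '/two/ s/1/2/; /three/ s/1/3/' "$tmpdir/sample_one")

printf '%s\n' "$from_script"
rm -rf "$tmpdir"
```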

Restricting Lines

The default is for the editor to look at, and for editing to take place on, every line that is input to the stream editor. This can be changed by specifying restrictions preceding the command. For example, to substitute “1” with “2” only in the fifth and sixth lines of the sample file’s output, the command would be:

$ sed '5,6 s/1/2/' sample_one
one 1
two 1
three 1
one 1
two 2
two 2
three 1
$

In this case, since the lines to change were specified by number, a pattern match was not needed. Thus you have the flexibility of choosing which lines to change (essentially, restricting the changes) based upon criteria that can be either line numbers or a matched pattern.

Prohibiting the Display

The default is for sed to display on the screen (or to a file, if so redirected) every line from the original file, whether it is affected by an edit operation or not; the “-n” parameter overrides this action. “-n” overrides all printing and displays no lines whatsoever, whether they were changed by the edit or not. For example:

$ sed -n -f sedlist sample_one
$
$ sed -n -f sedlist sample_one > sample_two
$ cat sample_two
$

In the first example, nothing is displayed on the screen. In the second example, nothing is printed, and thus nothing is written to the new file; it ends up being empty. Doesn’t this negate the whole purpose of the edit? Why is this useful? It is useful only because the “-n” option can be overridden by a print command (p). To illustrate, suppose the script file were modified to now resemble the following:

$ cat sedlist
/two/ s/1/2/p
/three/ s/1/3/p
$

Then this would be the result of running it:

$ sed -n -f sedlist sample_one
two 2
three 3
two 2
two 2
three 3
$

Lines that stay the same as they were are not displayed at all. Only the lines affected by the edit are displayed. In this manner, it is possible to pull those lines only, make the changes, and place them in a separate file:

$ sed -n -f sedlist sample_one > sample_two
$
$ cat sample_two
two 2
three 3
two 2
two 2
three 3
$

Another method of utilizing this is to print only a set number of lines. For example, to print only lines two through six while making no other editing changes:

$ sed -n '2,6p' sample_one
two 1
three 1
one 1
two 1
two 1
$

All other lines are ignored, and only lines two through six are printed as output. This is something remarkable that you cannot do easily with any other utility. head will print the top of a file, and tail will print the bottom, but sed allows you to pull anything you want to from anywhere.
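For comparison, the same slice can be produced by chaining head and tail, at the cost of working out the counts by hand. This is a sketch assuming a POSIX shell; the data mirrors sample_one and the variable names are illustrative.

```shell
lines=$(printf '%s\n' 'one 1' 'two 1' 'three 1' 'one 1' 'two 1' 'two 1' 'three 1')

# sed names the range directly: print lines 2 through 6 and nothing else.
with_sed=$(printf '%s\n' "$lines" | sed -n '2,6p')

# head/tail needs arithmetic: keep the first 6 lines, then the last 5 of those.
with_head_tail=$(printf '%s\n' "$lines" | head -n 6 | tail -n 5)

printf '%s\n' "$with_sed"
```

The sed form states the intended line numbers directly, which is what makes it convenient for pulling arbitrary ranges.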

Deleting Lines

Substituting one value for another is far from the only function that can be performed with a stream editor. There are many more possibilities, and the second-most-used function in my opinion is delete. Delete works in the same manner as substitute, only it removes the specified lines (if you want to remove a word and not a line, don’t think of deleting, but think of substituting it for nothing s/cat// ).

The syntax for the command is:

'{what to find} d' 

To remove all of the lines containing “two” from the sample_one file:

$ sed '/two/ d' sample_one
one 1
three 1
one 1
three 1
$

To remove the first three lines from the display, regardless of what they are:

$ sed '1,3 d' sample_one
one 1
two 1
two 1
three 1
$

Only the remaining lines are shown, and the first three cease to exist in the display. There are several things to keep in mind with the stream editor as they relate to global expressions in general, and as they apply to deletions in particular:

  1. The caret (^) signifies the beginning of a line, thus
    sed '/^two/ d' sample_one 

    would only delete the line if “two” were the first three characters of the line.

  2. The dollar sign ($) represents the end of a line within a pattern (used on its own as an address, it refers to the last line of the file), thus
    sed '/two$/ d' sample_one 

    would delete the line only if “two” were the last three characters of the line.

The result of putting these two together:

sed '/^$/ d' {filename} 

deletes all blank lines from a file. For example, the following substitutes “2” for “1” on the “two” lines and “3” for “1” on the “three” lines, and removes any blank lines in the file:

$ sed '/two/ s/1/2/; /three/ s/1/3/; /^$/ d' sample_one
one 1
two 2
three 3
one 1
two 2
two 2
three 3
$

A common use for this is to delete a header. The following command will delete all lines in a file, from the first line through to the first blank line:

sed '1,/^$/ d' {filename} 
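A small worked case makes the range visible. The message below is invented for illustration (the header fields and body text are assumptions); the command removes everything up to and including the first blank line and leaves later blank lines alone.

```shell
# Invented message: two header lines, a blank separator, then a body
# that itself contains a blank line.
body=$(printf '%s\n' \
  'From: someone@example.com' \
  'Subject: test' \
  '' \
  'First body line' \
  '' \
  'Second body line' | sed '1,/^$/ d')

printf '%s\n' "$body"
```

Only the first blank line ends the range, so the blank line inside the body survives.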

Appending and Inserting Text

Text can be appended to the end of a file by using sed with the “a” command. This is done in the following manner:

$ sed '$a\
> This is where we stop\
> the test' sample_one
one 1
two 1
three 1
one 1
two 1
two 1
three 1
This is where we stop
the test
$

Within the command, the dollar sign ($) signifies that the text is to be appended to the end of the file. The backslashes (\) are necessary to signify that a carriage return is coming. If they are left out, an error will result proclaiming that the command is garbled; anywhere that a carriage return is to be entered, you must use the backslash.

To append the lines into the fourth and fifth positions instead of at the end, the command becomes:

$ sed '3a\
> This is where we stop\
> the test' sample_one
one 1
two 1
three 1
This is where we stop
the test
one 1
two 1
two 1
three 1
$

This appends the text after the third line. As with almost any editor, you can choose to insert rather than append if you so desire. The difference between the two is that append follows the line specified, and insert starts with the line specified. When using insert instead of append, just replace the “a” with an “i,” as shown below:

$ sed '3i\
> This is where we stop\
> the test' sample_one
one 1
two 1
This is where we stop
the test
three 1
one 1
two 1
two 1
three 1
$

The new text appears in the middle of the output, and processing resumes normally after the specified operation is carried out.

Reading and Writing Files

The ability to redirect the output has already been illustrated, but it needs to be pointed out that files can be read in and written out to simultaneously during operation of the editing commands. For example, to perform the substitution and write the lines between one and three to a file called sample_three:

$ sed '
> /two/ s/1/2/
> /three/ s/1/3/
> 1,3 w sample_three' sample_one
one 1
two 2
three 3
one 1
two 2
two 2
three 3
$
$ cat sample_three
one 1
two 2
three 3
$

Only the lines specified are written to the new file, thanks to the “1,3” specification given to the w (write) command. Regardless of those written, all lines are displayed in the default output.

The Change Command

In addition to substituting entries, it is possible to change the lines from one value to another. The thing to keep in mind is that substitute works on a character-for-character basis, whereas change functions like delete in that it affects the entire line:

$ sed '/two/ c\
> We are no longer using two' sample_one
one 1
We are no longer using two
three 1
one 1
We are no longer using two
We are no longer using two
three 1
$

Working much like substitute, the change command is greater in scale, completely replacing one entry with another, regardless of character content or context. At the risk of overstating the obvious, when substitute was used, then only the character “1” was replaced with “2,” while when using change, the entire original line was modified. In both situations, the match to look for was simply the “two.”

Change All but…

With most sed commands, the functions are spelled out as to what changes are to take place. Using the exclamation mark, it is possible to have the changes take place everywhere but those specified, completely reversing the default operation.

For example, to delete all lines that contain the phrase “two,” the operation is:

$ sed '/two/ d' sample_one
one 1
three 1
one 1
three 1
$

And to delete all lines except those that contain the phrase “two,” the syntax becomes:

$ sed '/two/ !d' sample_one
two 1
two 1
two 1
$

If you have a file that contains a list of items and want to perform an operation on each of the items in the file, then it is important that you first do an intelligent scan of those entries and think about what you are doing. To make matters easier, you can do so by combining sed with any iteration routine (for, while, until).

As an example, assume you have a text file named “animals” with the following entries:

pig
horse
elephant
cow
dog
cat

And you want to run the following routine:

#mcd.ksh
for I in $*
do
echo Old McDonald had a $I
echo E-I, E-I-O
done

The result will be that each line is printed at the end of “Old McDonald had a.” While this is correct for the majority of the entries, it is grammatically incorrect for the “elephant” entry, as the result should be “an elephant” rather than “a elephant.” Using sed, you can scan the output from your shell file for such grammatical errors and correct them on the fly, by first creating a file of commands:

#sublist
/ a a/ s/ a / an /
/ a e/ s/ a / an /
/ a i/ s/ a / an /
/ a o/ s/ a / an /
/ a u/ s/ a / an /

and then executing the process as follows:

$ sh mcd.ksh `cat animals` | sed -f sublist

Now, after the mcd script has been run, sed will scan the output for anywhere that the single letter a (space, “a,” space) is followed by a vowel. If such exists, it will change the sequence to space, “an,” space. This corrects the problem before it ever prints on the screen and ensures that editors everywhere sleep easier at night. The result is:

Old McDonald had a pig
E-I, E-I-O
Old McDonald had a horse
E-I, E-I-O
Old McDonald had an elephant
E-I, E-I-O
Old McDonald had a cow
E-I, E-I-O
Old McDonald had a dog
E-I, E-I-O
Old McDonald had a cat
E-I, E-I-O
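The five near-identical rules in sublist can also be collapsed into a single substitution by using a bracket expression and a back-reference. This is a minimal sketch of that alternative, not the article's own method, and the function name fix_article is made up for illustration:

```shell
# fix_article: hypothetical helper that rewrites " a <vowel>" as " an <vowel>".
# \([aeiou]\) captures the vowel so that \1 can put it back after " an ".
fix_article() {
    sed 's/ a \([aeiou]\)/ an \1/'
}

for animal in pig elephant cow; do
    echo "Old McDonald had a $animal"
done | fix_article
```

Only the elephant line is changed; the pig and cow lines pass through untouched because no vowel follows the " a ".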

Quitting Early

The default is for sed to read through an entire file and stop only when the end is reached. You can stop processing early, however, by using the quit command. Only one quit command can be specified, and processing will continue until the condition calling the quit command is satisfied.

For example, to perform substitution only on the first five lines of a file and then quit:

$ sed '
> /two/ s/1/2/
> /three/ s/1/3/
> 5q' sample_one
one 1
two 2
three 3
one 1
two 2
$

The entry preceding the quit command can be a line number, as shown, or a find/matching command like the following:

$ sed '
> /two/ s/1/2/
> /three/ s/1/3/
> /three/q' sample_one
one 1
two 2
three 3
$

You can also use the quit command to view lines beyond a standard number and add functionality that exceeds that of head. For example, the head command allows you to specify how many of the first lines of a file you want to see; the default number is ten, but some versions accept only a number from one to ninety-nine. If you want to see the first 110 lines of a file and your head will not accept a count that large, you can do so with sed:

sed 110q filename 
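One way to convince yourself that q stops as soon as its address is reached is to feed sed more lines than you ask for and count what comes out. The sketch below generates 200 numbered lines with awk purely as stand-in input; the function and variable names are illustrative:

```shell
# Generate 200 numbered lines as stand-in input (any long file would do).
numbers() {
    awk 'BEGIN { for (i = 1; i <= 200; i++) print i }'
}

# Print lines 1 through 110, then quit without processing the rest.
# tr strips the padding that some versions of wc add around the count.
count=$(numbers | sed 110q | wc -l | tr -d ' ')
last=$(numbers | sed 110q | tail -n 1)

echo "lines printed: $count, last line: $last"
```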

Handling Problems

The main thing to keep in mind when dealing with sed is how it works. It works by reading one line in, performing all the tasks it knows to perform on that one line, and then moving on to the next line. Each line is subjected to every editing command given.

This can be troublesome if the order of your operations is not thoroughly thought out. For example, suppose you need to change all “two” entries to “three” and all “three” to “four”:

$ sed '
> /two/ s/two/three/
> /three/ s/three/four/' sample_one
one 1
four 1
four 1
one 1
four 1
four 1
four 1
$

The very first “two” read was changed to “three.” It then meets the criteria established for the next edit and becomes “four.” The end result is not what was wanted: there are now no entries but “four” where there should be “three” and “four.”

When performing such an operation, you must pay diligent attention to the manner in which the operations are specified and arrange them in an order in which one will not clobber another. For example:

$ sed '
> /three/ s/three/four/
> /two/ s/two/three/' sample_one
one 1
three 1
four 1
one 1
three 1
three 1
four 1
$

This works perfectly, since the “three” value is changed prior to “two” becoming “three.”
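When reordering the commands is awkward, another option is to stop a line from reaching the later commands once it has been edited. The sketch below relies on the b command with no label, which sends processing to the end of the script, so a line rewritten by the first rule never meets the second. It assumes sample_one contains the seven lines implied by the outputs above and recreates it in a temporary file:

```shell
# Recreate the sample file (contents assumed from the outputs shown in the article).
sample=$(mktemp)
printf '%s\n' 'one 1' 'two 1' 'three 1' 'one 1' 'two 1' 'two 1' 'three 1' > "$sample"

# The rules are in the clobbering order, but b ends processing
# for a line as soon as two has become three.
renumbered=$(sed '
/two/ {
s/two/three/
b
}
/three/ s/three/four/
' "$sample")
printf '%s\n' "$renumbered"

rm -f "$sample"
```

The trade-off is that a line containing both “two” and “three” would receive only the first edit, so this approach suits input where each line matches at most one rule.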

Next Steps

Visit and bookmark the Linux Technology Center

Read Dale Dougherty and Arnold Robbins’ book sed & awk, 2nd Edition (O’Reilly and Associates).

Labels and Comments

Labels can be placed inside sed script files to make it easier to explain what is transpiring, once the files begin to grow in size. There are a variety of commands that relate to these labels, and they include:

  1. : The colon signifies a label name. For example:
     :HERE 

    Labels beginning with the colon can be addressed by “b” and “t” commands.

  2. b {label} Works as a “goto” statement, sending processing to the label preceded by a colon. For example,
     b HERE 

    sends processing to the line

     :HERE 

    If no label is specified following the b, processing goes to the end of the script file.

  3. t {label} Branches to the label only if substitutions have been made since the last input line or execution of a “t” command. As with “b,” if a label name is not given, processing moves to the end of the script file.
  4. # The pound sign as the first character of a line causes the entire line to be treated as a comment. Comment lines are different from labels and cannot be branched to with b or t commands.
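To see the label, t, and comment syntax working together, here is a small sketch. The label name again and the input text are made up for illustration. It turns each leading space into a dash, looping for as long as the substitution keeps succeeding:

```shell
dashed=$(printf '%s\n' '   indented' 'flush' | sed '
# Replace one leading space per pass, keeping dashes already written.
:again
s/^\(-*\) /\1-/
# t jumps back only if the s command above changed the line.
t again
')
printf '%s\n' "$dashed"
```

The first input line has three leading spaces, so the loop runs three times before the substitution fails and t falls through; the second line has none and is printed unchanged.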

Further Investigations

The sed utility is one of the most powerful and flexible tools that a Linux administrator has. While this article has covered a lot of ground, it has only scratched the surface of this versatile tool. For more information, one of the best sources is Dale Dougherty and Arnold Robbins’ book sed & awk, now in its second edition from O’Reilly and Associates (see “Next Steps”). The same publisher also puts out a pocket reference that you can carry with you.