1. Both are scripting languages
2. Both work primarily with text files
3. Both are programmable editors
4. Both accept command-line options and can be scripted (-f script_name)
5. Both GNU versions support POSIX (GREP) and EGREP RegExes
6. Lineage = ed (editor) -> sed -> awk
###SED's FEATURES###
1. Non-interactive editor
2. Stream Editor
a. Manipulates input - performing edits as instructed
b. Sed accepts input on/from: STDIN (Keyboard), File, Pipe (|)
3. Sed loops through ALL lines of the input stream or file, by DEFAULT
4. Does NOT operate on the source file, by default. (Will NOT clobber the original file, unless instructed to do so)
5. Supports addresses to indicate which lines to operate on: /^$/d - deletes blank lines
6. Stores the active (current) line in the 'pattern space' and maintains a 'hold space' for later usage
7. Used primarily to perform Search-and-Replaces
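Sketch: how the pattern space and the hold space (point 6) work together. This is a minimal illustration that reverses the order of the input lines; the three sample lines and the variable name 'reversed' are made up for the example.

```shell
# 1!G : on every line except the first, append the hold space to the pattern space
# h   : copy the pattern space into the hold space
# $p  : on the last line, print what has accumulated in the pattern space
reversed=$(printf 'ant\nbee\ncat\n' | sed -n '1!G;h;$p')
echo "$reversed"
```

Each line is stacked on top of the lines saved so far, so the last input line comes out first.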
###AWK's FEATURES###
1. Field processor based on whitespace, by default
2. Used for reporting (extracting specific columns) from data feed
3. Supports programming constructs:
a. loop (for, while, do)
b. conditions (if, else)
c. arrays (lists)
d. functions (string, numeric, user-defined)
4. Automatically tokenizes words in a line for later usage - $1, $2, $3, etc. (This is based on the current delimiter)
5. Automatically loops through input like Sed, making lines available for processing
6. Ability to execute shell commands using the 'system()' function
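Sketch: the programming constructs from point 3 (array, loop, condition) combined in one short Awk program. The two-column sample data and the names 'total', 'name' and 'totals' are assumptions made for the illustration.

```shell
# Sum the second column per animal name found in the first column.
# total[] is an associative array; the for-in loop in END walks its keys.
totals=$(printf 'dog 3\ncat 5\ndog 4\n' |
  awk '{ total[$1] += $2 }
       END { for (name in total) if (total[name] > 0) print name, total[name] }' |
  sort)
echo "$totals"
```

The output is piped through sort because Awk does not guarantee the order in which a for-in loop visits array keys.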
###REGULAR EXPRESSIONS (RegEx) REVIEW###
Regular Expressions (RegExes) are key to mastering Awk & Sed
###METACHARACTERS###
^ - matches the character(s) at the beginning of a line
a. sed -ne '/^dog/p' animals.txt
$ - matches the character(s) at the end of a line
a. sed -ne '/dog$/p' animals.txt
Task: Match line which contains only 'dog':
a. sed -ne '/^dog$/p' animals.txt
b. sed -ne '/^dog$/p' - reads from STDIN; press Enter after each line, terminate with CTRL-D
c. cat animals.txt | sed -ne '/^dog$/p'
d. cat animals.txt | sed -ne '/^dog$/Ip' - Prints matches case-insensitively (the 'I' flag is a GNU Sed extension)
. - matches any single character (typically except newline)
a. sed -ne '/^d...$/Ip' animals.txt
b. sed -ne '/^d.../Ip' animals.txt
###REGEX QUANTIFIERS###
* - 0 or more matches of the previous character
+ - 1 or more matches of the previous character
? - 0 or 1 of the previous character
a. sed -ne '/^d.\+/Ip' animals.txt
Note: In basic (GREP-style) RegExes, escape the '+' and '?' quantifiers using the escape character '\' (\+ and \? are GNU extensions); '*' needs no escape
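Sketch: the three quantifiers compared on the same made-up input. The '+' and '?' examples use Sed's -E option (EGREP-style RegExes), so those quantifiers need no backslash; the sample words and variable names are assumptions for the illustration.

```shell
sample=$(printf 'dg\ndog\ndoog\n')

# * : zero or more 'o'  -> matches dg, dog, doog
star=$(printf '%s\n' "$sample" | sed -n '/^do*g$/p')

# + : one or more 'o'   -> matches dog, doog
plus=$(printf '%s\n' "$sample" | sed -n -E '/^do+g$/p')

# ? : zero or one 'o'   -> matches dg, dog
opt=$(printf '%s\n' "$sample" | sed -n -E '/^do?g$/p')

printf '%s\n---\n%s\n---\n%s\n' "$star" "$plus" "$opt"
```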
###CHARACTER CLASSES###
Allow searching for a range of characters
a. [0-9]
b. [a-z][A-Z]
a. sed -ne '/^d.\+[0-9]/Ip' animals.txt
Note: Each Character Class matches 1, and only 1 character
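Sketch: a character class consumes exactly one character, so a single [0-9] before '$' accepts one trailing digit only. The sample names and the variable 'matches' are made up; LC_ALL=C is set on the assumption that [a-z] should mean ASCII lowercase letters only.

```shell
# Lowercase letters followed by exactly one digit at the end of the line.
# 'deer22' fails (two digits), 'Fox3' fails (uppercase F), 'cat' fails (no digit).
matches=$(printf 'dog1\ncat\ndeer22\nFox3\n' | LC_ALL=C sed -n '/^[a-z]*[0-9]$/p')
echo "$matches"
```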
###INTRO TO SED###
Usage:
1. sed [options] 'instruction' file | PIPE | STDIN
2. sed -e 'instruction1' -e 'instruction2' ...
3. sed -f script_file_name file
Note: Execute Sed by indicating instruction on one of the following:
1. Command-line
2. Script File
Note: Sed accepts instructions based on '/pattern_to_match/action'
###Print Specific Lines of a file###
Note: '-e' is optional if there is only 1 instruction to execute
sed -ne '1p' animals.txt - prints first line of file
sed -ne '2p' animals.txt - prints second line of file
sed -ne '$p' animals.txt - prints last line of file
sed -ne '2,4p' animals.txt - prints lines 2-4 from file
sed -ne '1!p' animals.txt - prints ALL EXCEPT line #1
sed -ne '1,4!p' animals.txt - prints ALL EXCEPT lines 1 - 4
sed -ne '/dog/p' animals.txt - prints ALL lines containing 'dog' - case-sensitive
sed -ne '/dog/Ip' animals.txt - prints ALL lines containing 'dog' - case-insensitive
sed -ne '/[0-9]/p' animals.txt - prints ALL lines with AT LEAST 1 numeric
sed -ne '/cat/,/deer/p' animals.txt - prints ALL lines from the first line matching 'cat' through the next line matching 'deer'
sed -ne '/deer/,+2p' animals.txt - prints the line with 'deer' plus the 2 lines that follow (GNU Sed)
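Sketch: the addressing forms above run against a small, known file so the results can be checked. The five-line animals.txt created here is a stand-in with made-up contents, and the variable names are hypothetical.

```shell
tmp=$(mktemp -d)
printf 'dog\ncat\nbird\ndeer\nfox\n' > "$tmp/animals.txt"

second=$(sed -n '2p' "$tmp/animals.txt")             # line number address
last=$(sed -n '$p' "$tmp/animals.txt")               # $ = last line
range=$(sed -n '/cat/,/deer/p' "$tmp/animals.txt")   # RegEx range, inclusive
all_but_first=$(sed -n '1!p' "$tmp/animals.txt")     # ! negates the address

printf '%s\n' "$second" "$last" "$range" "$all_but_first"
rm -r "$tmp"
```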
###Delete Lines using Sed Addresses###
sed -e '/^$/d' animals.txt - deletes blank lines from file
Note: Drop '-n' to see the new output when deleting
sed -e '1d' animals.txt - deletes the first line from animals.txt
sed -e '1,4d' animals.txt - deletes lines 1-4 from animals.txt
sed -e '1~2d' animals.txt - deletes every 2nd line beginning with line 1 - 1, 3, 5... (GNU Sed)
###Save Sed's Changes using Output Redirection###
sed -e '/^$/d' animals.txt > animals2.txt - deletes blank lines from file and creates new output file 'animals2.txt'
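Sketch: confirming that output redirection leaves the source file untouched. The file contents are made up for the illustration and the work happens in a throwaway directory.

```shell
tmp=$(mktemp -d)
printf 'dog\n\ncat\n\ndeer\n' > "$tmp/animals.txt"

# Delete blank lines; the result goes to a new file, not back to the source.
sed -e '/^$/d' "$tmp/animals.txt" > "$tmp/animals2.txt"

original_lines=$(wc -l < "$tmp/animals.txt" | tr -d ' ')   # still 5
cleaned_lines=$(wc -l < "$tmp/animals2.txt" | tr -d ' ')   # 3 non-blank lines
echo "original: $original_lines, cleaned: $cleaned_lines"
rm -r "$tmp"
```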
###SEARCH & REPLACE USING Sed###
General Usage:
sed -e 's/find/replace/g' animals.txt - replaces 'find' with 'replace'
Note: Left Hand Side (LHS) supports literals and RegExes
Note: Right Hand Side (RHS) supports literals and back references
Examples:
sed -e 's/LinuxCBT/UnixCBT/' - replaces 'LinuxCBT' with 'UnixCBT' on STDIN to STDOUT
sed -e 's/LinuxCBT/UnixCBT/I' - replaces 'LinuxCBT' with 'UnixCBT' on STDIN to STDOUT (Case-Insensitive)
Note: Replacements occur on the FIRST match on each line, unless 'g' is appended to the s/find/replace/g sequence
sed -e 's/LinuxCBT/UnixCBT/Ig' - replaces ALL occurrences of 'LinuxCBT' with 'UnixCBT' on STDIN to STDOUT (Case-Insensitive)
Task:
1. Remove ALL blank lines
2. Substitute 'cat', regardless of case, with 'Tiger'
Note: Whenever using the '-n' option, you MUST specify the print modifier 'p' to see any output
sed -ne '/^$/d' -e 's/cat/Tiger/Igp' animals.txt - removes blank lines & substitutes 'cat' with 'Tiger'; with '-n', only the lines where a substitution occurred are printed
OR sed -ne '/^$/d; s/cat/Tiger/Igp' animals.txt - does the same as above
Note: Simply separate multiple commands with semicolons
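Sketch: the effect of the 'g' flag and of '-n' combined with 'p', shown on made-up input. The variable names are hypothetical.

```shell
# Without g, only the first match on the line is replaced.
first=$(echo 'cat cat cat' | sed -e 's/cat/Tiger/')

# With g, every match on the line is replaced.
all=$(echo 'cat cat cat' | sed -e 's/cat/Tiger/g')

# With -n, only lines where the substitution succeeded are printed (p flag).
changed_only=$(printf 'dog\ncat\n' | sed -n 's/cat/Tiger/p')

printf '%s\n' "$first" "$all" "$changed_only"
```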
###Update Source File - Backup Source File###
sed -i.bak -e '/^$/d; s/Cat/Tiger/Ig' animals.txt - performs as above, but ALSO replaces the source file and backs it up as 'animals.txt.bak'
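Sketch: an in-place edit with a backup, run in a throwaway directory so nothing real is overwritten. The file contents are made up, and the example leaves out the 'p' flag on the assumption that each line should be written to the file only once.

```shell
tmp=$(mktemp -d)
printf 'cat\n\ndog\n' > "$tmp/animals.txt"

# -i.bak edits animals.txt in place and keeps the original as animals.txt.bak
sed -i.bak -e '/^$/d; s/cat/Tiger/g' "$tmp/animals.txt"

updated=$(cat "$tmp/animals.txt")
backup_lines=$(wc -l < "$tmp/animals.txt.bak" | tr -d ' ')
printf '%s\nbackup lines: %s\n' "$updated" "$backup_lines"
rm -r "$tmp"
```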
###Search & Replace (Text Substitution) Continued###
sed -e '/address/s/find/replace/g' file
sed -e '/Tiger/s/dog/mutt/g' animals.txt
sed -ne '/Tiger/s/dog/mutt/gp' animals.txt - substitutes 'dog' with 'mutt' where line contains 'Tiger'
sed -e '/Tiger/s/dog/mutt/gI' animals.txt
sed -e '/^Tiger/s/dog/mutt/gI' animals.txt - Updates lines that begin with 'Tiger'
sed -e '/^Tiger/Is/dog/mutt/gI' animals.txt - Updates lines that begin with 'Tiger' (Case-Insensitive)
###Focus on the Right Hand Side (RHS) of Search & Replace Function in SED###
Note: SED reserves a few characters to help with substitutions based on the matched pattern from the LHS
& = The full text matched by the LHS (the entire pattern space when the LHS matches the whole line)
Task:
Prefix each line with the word 'Animal '
sed -ne 's/.*/&/p' animals.txt - replaces the matched pattern with the matched pattern
sed -ne 's/.*/Animal &/p' animals.txt - Prefixes each line with 'Animal'
sed -ne 's/.*/Animal: &/p' animals.txt - Prefixes each line with 'Animal:'
sed -ne 's/.*[0-9]/&/p' animals.txt - returns lines containing at least 1 numeric
sed -ne 's/.*[0-9]\{1\}/&/p' animals.txt - \{1\} matches exactly 1 numeric, but without a '$' anchor this returns the same lines as above
sed -ne 's/[a-z][0-9]\{4\}$/&/pI' animals.txt - returns animal(s) with a letter followed by exactly 4 numeric values at the end of the line
sed -ne 's/[a-z][0-9]\{1,4\}$/&/pI' animals.txt - returns animal(s) with a letter followed by at least 1, up to 4 numeric values at the end of the line
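Sketch: '&' re-inserting whatever the LHS matched, first for a whole line and then for only the trailing digits. Input lines and variable names are made up for the illustration.

```shell
# .* matches the whole line, so & stands for the whole line.
prefixed=$(printf 'dog\ncat\n' | sed -e 's/.*/Animal: &/')

# Here the LHS matches only 1 to 4 trailing digits, so & is just those digits.
bracketed=$(printf 'dog12\ncat\n' | sed -n 's/[0-9]\{1,4\}$/[&]/p')

printf '%s\n' "$prefixed" "$bracketed"
```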
###Grouping & Backreferences###
Note: Segment matches into backreferences using escaped parentheses: \(RegEx\)
sed -ne 's/\(.*\)\([0-9]\)/&/p' animals.txt - This creates 2 variables: \1 & \2
sed -ne 's/\(.*\)\([0-9]\)$/\1/p' animals.txt - This creates 2 variables: \1 & \2 but references \1
sed -ne 's/\(.*\)\([0-9]\)$/\2/p' animals.txt - This creates 2 variables: \1 & \2 but references \2
sed -ne 's/\(.*\)\([0-9]\)$/\1 \2/p' animals.txt - This creates 2 variables: \1 & \2 but references \1 and \2
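Sketch: what lands in \1 and \2. Because '.*' is greedy, the first group takes everything except the final digit; a tighter pattern is one way to split the name from all of its digits. The input 'deer22' and the variable names are made up, and LC_ALL=C is an assumption to keep [a-z] to ASCII lowercase.

```shell
# Greedy .* leaves only the last digit for the second group.
greedy=$(echo 'deer22' | sed -n 's/\(.*\)\([0-9]\)$/\1 \2/p')

# Letters in group 1, all digits in group 2, then swap them on the RHS.
swapped=$(echo 'deer22' | LC_ALL=C sed -n 's/^\([a-z]*\)\([0-9]*\)$/\2 \1/p')

printf '%s\n' "$greedy" "$swapped"
```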
###Apply Changes to Multiple Files###
The shell expands globs (*, ?) into file names before Sed runs, so Sed can process multiple files in one command
sed -ne 's/\(.*\)\([0-9]\)$/\1 \2/p' animals*.txt - This creates 2 variables: \1 & \2 but references \1 and \2
###Sed Scripts###
Note: Sed supports scripting, which means, the ability to dump 1 or more instructions into 1 file
Sed Scripting Rules:
1. Sed applies ALL rules to each line
2. Sed applies ALL changes dynamically to the pattern space
3. Sed ALWAYS works with the current line
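Sketch: a small script file run with -f, illustrating rules 1 and 2. The script name 'animals.sed', its contents, and the sample input are all made up for the example.

```shell
tmp=$(mktemp -d)
printf 'cat\n\ncatfish\n' > "$tmp/animals.txt"

cat > "$tmp/animals.sed" <<'EOF'
# Rule 1: every instruction is tried on every line.
/^$/d
# Rule 2: the second substitution sees the result of the first.
s/cat/Tiger/
s/Tiger/Lion/
EOF

result=$(sed -f "$tmp/animals.sed" "$tmp/animals.txt")
echo "$result"
rm -r "$tmp"
```

The word 'Lion' never appears in the input; it shows up only because the pattern space already held 'Tiger' when the last instruction ran.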
###Awk - Intro###
Features:
1. Reporter
2. Field Processor
3. Supports Scripting
4. Programming Constructs
5. Default delimiter is whitespace
6. Supports: Pipes, Files, and STDIN as sources of input
7. Automatically tokenizes processed columns/fields into the variables: $1, $2, $3 .. $n
8. Supports GREP and EGREP RegExes
Tasks:
Note: $0 represents the current record or row
1. Print entire row, one at a time, from an input file (animals.txt)
a. awk '{ print $0 }' animals.txt
2. Print specific columns from (animals.txt)
a. awk '{ print $1 }' animals.txt - this prints the 1st column from the file
3. Print multiple columns from (animals.txt)
a. awk '{ print $1; print $2; }' animals.txt
b. awk '{ print $1,$2; }' animals.txt
4. Print columns from lines containing 'deer' using RegEx Support
a. awk '/deer/ { print $0 }' animals.txt
5. Print columns from lines containing digits
a. awk '/[0-9]/ { print $0 }' animals.txt
6. Remove blank lines with Sed and pipe output to awk for processing
a. sed -e '/^$/d' animals.txt | awk '/[0-9]/ { print $0 }'
7. Print blank lines
a. awk '/^$/ { print }' animals.txt
b. awk '/^$/ { print $0 }' animals.txt
8. Print ALL lines beginning with the animal 'dog' case-insensitive
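Sketch: one portable way to approach task 8, assuming the match should ignore case. It lowercases the record with tolower() before testing it; this is an illustration, not necessarily the command the notes originally used, and the sample input is made up.

```shell
# tolower($0) folds the whole record to lowercase before the RegEx test,
# so 'Dog' and 'dogfish' both match /^dog/ while 'cat' does not.
dogs=$(printf 'Dog 1\ncat 2\ndogfish 3\n' | awk 'tolower($0) ~ /^dog/ { print $1, $2 }')
echo "$dogs"
```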
b. Effect the change to ALL product files and create .new output files without clobbering the source file
for i in `ls -A products_*php`; do sed -e 's/<b>Shipping<\/b>: Free<br>//' $i > $i.new; done
2. Strip '.new' suffix from newly generated files
a. echo "products_linuxcbt.php.new" | sed -e 's/\.new$//'
b. for i in `ls -A products_*new | sed -e 's/\.new$//'`; do echo $i; done
c. for i in `ls -A products_*new | sed -e 's/\.new$//'`; do mv $i.new $i; done
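Sketch: the same rename done with the shell's own suffix removal instead of parsing 'ls' output, which also copes with spaces in file names. The file names are made up and the loop runs in a throwaway directory.

```shell
tmp=$(mktemp -d)
touch "$tmp/products_a.php.new" "$tmp/products_b.php.new"

for f in "$tmp"/products_*.new; do
  mv "$f" "${f%.new}"    # ${f%.new} strips the trailing .new suffix
done

renamed=$(ls "$tmp")
echo "$renamed"
rm -r "$tmp"
```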
3. Remove 'Free Shipping' from faq.php file
a. Code to remove: <li>Free Shipping
b. sed -e 's/<li>Free Shipping//' faq.php > faq.php.new
Use Awk & Sed Together to update specific rows in /var/log/messages
Task:
a. Update Month information for kernel messages for September 3
awk '$1 ~ /Sep/ && $2 == 3 && $5 ~ /kernel/ { print }' /var/log/messages
b. awk '$1 ~ /Sep/ && $2 == 3 && $5 ~ /kernel/ { total++; print } END { print "Total Records Updated: " total }' /var/log/messages | sed -e 's/Sep/September/'
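Sketch: the same Awk-then-Sed pipeline run on three made-up syslog-style lines instead of the real log, so the result is predictable. The host name, messages, and variable names are assumptions; the day is compared with '==' so that the 13th is not mistaken for the 3rd.

```shell
log=$(printf '%s\n' \
  'Sep 3 10:00:01 host1 kernel: eth0 up' \
  'Sep 13 10:00:02 host1 kernel: eth0 down' \
  'Sep 3 10:00:03 host1 sshd[42]: session opened')

# Awk selects and counts the kernel rows for Sep 3; Sed rewrites the month.
report=$(printf '%s\n' "$log" |
  awk '$1 == "Sep" && $2 == 3 && $5 ~ /kernel/ { total++; print }
       END { print "Total Records Updated: " total }' |
  sed -e 's/^Sep /September /')
echo "$report"
```

Sed runs without '-n' here so that the summary line printed by the END block also reaches the output.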
###Windows Support for GNU Sed & Awk###
Download GNU Sed & Awk from: http://gnuwin32.sourceforge.net
Windows Stuff:
gawk "BEGIN { max=ARGV[1]; for (i=1;i<=max; i++) print i }" 10 - reads 10 from ARGV[1] and passes it to the 'max' var for use in the 'for' loop