Awk – Command line output less awkward

One very powerful tool in Linux is the (awk) text processing language. I always remember it because it makes long log files and command output come back to you a little less (awk)ward. Yeah, corny, I know, but let’s look at what it can do and maybe you’ll agree:

I want to find what’s happening on my site:

cat /usr/local/apache/domlogs/atomlabs.net
 
173.205.126.94 - - [25/Dec/2010:01:19:57 -0500] "POST /wp-cron.php?doing_wp_cron HTTP/1.0" 200 - "-" "WordPress/3.0.3; http://atomlabs.net"
88.214.193.166 - - [25/Dec/2010:01:19:56 -0500] "GET / HTTP/1.1" 200 2715 "http://www.semrush.com/info/atomlabs.net" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
109.120.144.247 - - [25/Dec/2010:01:39:04 -0500] "GET /plugins/editors/tinymce/jscripts/tiny_mce/plugins/tinybrowser/tinybrowser.php?type=file&folder= HTTP/1.0" 404 - "http://atomlabs.net/plugins/editors/tinymce/jscripts/tiny_mce/plugins/tinybrowser/tinybrowser.php?type=file&folder=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90)"
207.46.13.87 - - [25/Dec/2010:01:46:00 -0500] "GET /robots.txt HTTP/1.1" 301 238 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.87 - - [25/Dec/2010:01:46:00 -0500] "GET /robots.txt HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
207.46.13.87 - - [25/Dec/2010:01:55:44 -0500] "GET /page2.php HTTP/1.1" 301 237 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
88.214.193.166 - - [25/Dec/2010:02:10:24 -0500] "GET / HTTP/1.1" 200 3342 "http://www.semrush.com/info/atomlabs.net" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
209.85.238.86 - - [25/Dec/2010:02:33:04 -0500] "GET /dev/?feed=rss2 HTTP/1.1" 404 - "-" "Feedfetcher-Google; (+http://www.google.com/feedfetcher.html; 1 subscribers; feed-id=17880273271851489999)"
66.249.72.100 - - [25/Dec/2010:03:14:50 -0500] "GET /?p=23 HTTP/1.1" 200 13019 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
88.214.193.166 - - [25/Dec/2010:03:22:32 -0500] "GET / HTTP/1.1" 200 5884 "http://www.whorush.com/search/?q=atomlabs.net" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
88.214.193.166 - - [25/Dec/2010:03:24:48 -0500] "GET / HTTP/1.1" 200 5884 "http://www.whorush.com/search/?q=atomlabs.net" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
66.249.72.100 - - [25/Dec/2010:04:09:11 -0500] "GET /dev HTTP/1.1" 404 - "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

 
I don’t know about you, but that gets very cumbersome to look at, especially when you just want a specific chunk of information out of all that mess. This is where (awk) comes into play: in its simplest form, we can use it to print a specific column of text.

By default (awk) splits each line into fields on whitespace (runs of spaces or tabs), so if we had the text:

66.249.72.100 - - [25/Dec/2010:04:09:11 -0500] "GET /dev HTTP/1.1" 404 - "-" "Mozilla/5.0

The date is the 4th column of data. Here is how (awk) would break the line down ($0 is always the whole line):

$0 = 66.249.72.100 - - [25/Dec/2010...]
$1 = 66.249.72.100
$2 = -
$3 = -
$4 = [25/Dec/2010:04:09:11
$5 = -0500]
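As a quick check of the breakdown above, here is a minimal sketch that pulls out $4 from that same sample line. The second command strips the leading bracket with awk's sub() function. The variable names (line, date_field, clean_date) are just placeholders I picked for the example.

```shell
# Sample line from the log above.
line='66.249.72.100 - - [25/Dec/2010:04:09:11 -0500] "GET /dev HTTP/1.1" 404 - "-" "Mozilla/5.0'

# Print the 4th whitespace-separated field as-is.
date_field=$(printf '%s\n' "$line" | awk '{print $4}')
echo "$date_field"

# Same field, with the leading "[" removed by sub().
clean_date=$(printf '%s\n' "$line" | awk '{sub(/^\[/, "", $4); print $4}')
echo "$clean_date"
```

The first echo prints the field with its bracket still attached, and the second prints just the date and time.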

 
I want to find who’s visiting my site and what pages:

cat /usr/local/apache/domlogs/atomlabs.net | awk '{print $1 " " $7}'
 
173.205.126.94 /wp-cron.php?doing_wp_cron
88.214.193.166 /
109.120.144.247 /plugins/editors/tinymce/jscripts/tiny_mce/plugins/tinybrowser/tinybrowser.php?type=file&folder=
207.46.13.87 /robots.txt
207.46.13.87 /robots.txt
207.46.13.87 /page2.php
88.214.193.166 /
209.85.238.86 /dev/?feed=rss2
66.249.72.100 /?p=23
88.214.193.166 /
88.214.193.166 /
66.249.72.100 /dev

 
You can print multiple fields with (awk), as you’ve just seen above, and you can also print string data between double quotes ("). I added a space so that the output would be the IP address, a space, then the requested path.
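You can also put a condition in front of the print block so only matching lines are shown. Here is a rough sketch that lists only the requests that returned a 404. It assumes the same log layout as above, where $9 is the HTTP status code. The sample lines are adapted from the log, with the referrer and user-agent strings trimmed, and the temp file is only there to keep the example self-contained.

```shell
# Sample lines adapted from the log above (referrer and user-agent trimmed).
log=$(mktemp)
cat > "$log" <<'EOF'
88.214.193.166 - - [25/Dec/2010:01:19:56 -0500] "GET / HTTP/1.1" 200 2715 "-" "Mozilla/5.0"
207.46.13.87 - - [25/Dec/2010:01:46:00 -0500] "GET /robots.txt HTTP/1.1" 404 - "-" "Mozilla/5.0"
66.249.72.100 - - [25/Dec/2010:04:09:11 -0500] "GET /dev HTTP/1.1" 404 - "-" "Mozilla/5.0"
EOF

# Only print IP and path when the status field ($9) is 404.
not_found=$(awk '$9 == 404 {print $1 " " $7}' "$log")
echo "$not_found"

rm -f "$log"
```

On a real server you would point awk at the actual log file instead of the temp file.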

Recall previous CLI arguments with ease

Have you ever been running various commands on the same file or folder and gotten sick of re-typing it each time? One easy way around this is the argument recall feature built into the Bash shell (it comes from the readline library). Essentially you hold Alt and press (#) then (.), where (#) is the number of the argument you are recalling from the previous command:

Run a simple command with two arguments:

grep user1@domain1.com /var/log/exim_mainlog

Now after I type in the new command I want to run:

grep user2@domain2.com

I can recall the 2nd argument to grab my log path:

grep user2@domain2.com (Hold Alt + 2 + .)
grep user2@domain2.com /var/log/exim_mainlog

Exim – Mail queue commands

exim -bp : Show summary of messages in queue
exim -bpc : Show number of messages in queue

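The following options each take a message ID (as shown by exim -bp), for example: exim -Mvh <message-id>. Some, like -Mar and -Mes, also take an address.

-M : Force delivery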
-Mar : Add recipient
-Meb : Edit message body
-Mes : Edit sender
-Mf : Freeze message
-Mg : Give up (and bounce message)
-Mmad : Mark all recipients as delivered
-Mmd : Mark recipient as delivered
-Mrm : Remove message (no bounce)
-Mt : Thaw message
-Mvb : View message body
-Mvh : View message header
-Mvl : View message log

Create file of certain size

Wish you could create a test file for file-transfer testing? It’s easy with Linux’s (dd) command:

Create 1MB file:

dd if=/dev/zero of=1MB.txt bs=1024 count=1024

Create 50MB file:

dd if=/dev/zero of=50MB.txt bs=1024 count=51200

Verify the sizes:

ls -lh | grep MB
-rw-r--r--  1 root root 1.0M Dec 23 16:06 1MB.txt
-rw-r--r--  1 root root  50M Dec 23 16:06 50MB.txt
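If you create test files often, the arithmetic can be wrapped up in a small function. This is only a sketch: make_test_file is a hypothetical helper name of my own, and it assumes you want the size in MB using the same bs=1024 block size as above.

```shell
# Hypothetical helper: create a zero-filled file of the given size in MB.
# Usage: make_test_file SIZE_MB OUTPUT_FILE
make_test_file() {
    size_mb=$1
    out=$2
    # bs=1024 bytes per block, so 1024 blocks make 1MB.
    dd if=/dev/zero of="$out" bs=1024 count=$((size_mb * 1024)) 2>/dev/null
}

tmpdir=$(mktemp -d)
make_test_file 2 "$tmpdir/2MB.txt"

# Count the bytes to confirm the size (2 * 1024 * 1024 = 2097152).
bytes=$(( $(wc -c < "$tmpdir/2MB.txt") ))
echo "$bytes"

rm -rf "$tmpdir"
```

The 2>/dev/null just hides the records in/out summary that (dd) normally prints.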

Convert text to upper-case or lower-case

There are more than a couple of ways to convert a string between upper-case and lower-case in Linux. Here are my two favorites:

awk

echo "TEXT WITH CAPS-LOCK ON" | awk '{print tolower($0)}'
text with caps-lock on
 
echo "text with caps-lock off" | awk '{print toupper($0)}'
TEXT WITH CAPS-LOCK OFF

dd

echo "TEXT WITH CAPS-LOCK ON" > upperText; dd if=upperText of=lowerText conv=lcase; cat lowerText
0+1 records in
0+1 records out
23 bytes (23 B) copied, 2.1e-05 seconds, 1.1 MB/s
text with caps-lock on
 
echo "text with caps-lock off" > lowerText; dd if=lowerText of=upperText conv=ucase; cat upperText
0+1 records in
0+1 records out
24 bytes (24 B) copied, 3.4e-05 seconds, 706 kB/s
TEXT WITH CAPS-LOCK OFF
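A third option, which is my own addition and not one of the two above, is the (tr) command. This is a minimal sketch using the POSIX character classes, and it works in a pipe without any temporary files. The variable names are just for the example.

```shell
# Translate every upper-case letter to its lower-case equivalent.
lower=$(echo "TEXT WITH CAPS-LOCK ON" | tr '[:upper:]' '[:lower:]')
echo "$lower"

# And the other way around.
upper=$(echo "text with caps-lock off" | tr '[:lower:]' '[:upper:]')
echo "$upper"
```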

Find the difference between two files

The (diff) command shows the differences between two files, while (sdiff) shows both files side by side so you can spot the difference at a glance.

echo "ABC" > a
echo "ABCD" > b
diff a b
1c1
< ABC
---
> ABCD
sdiff a b
ABC                                                           | ABCD
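In a script you often only care whether the files differ, not how. Here is a small sketch that relies on the exit status of (diff), which is 0 when the files match and 1 when they differ. The temp directory and the result variable are just scaffolding for the example.

```shell
tmpdir=$(mktemp -d)
echo "ABC" > "$tmpdir/a"
echo "ABCD" > "$tmpdir/b"

# Throw away the diff output and branch on the exit status instead.
if diff "$tmpdir/a" "$tmpdir/b" > /dev/null; then
    result="identical"
else
    result="different"
fi
echo "$result"

rm -rf "$tmpdir"
```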

If statements

In shell scripts, just like in most scripting languages, you can use if statements to step through a set of rules. Let’s say you wanted a simple script to handle clock-ins for your employees:

#!/bin/sh

# Ask who is clocking in or out.
echo "Enter your employee ID:"
read employeeID

echo "in - Clock in"
echo "out - Clock out"
read clockStatus

# Quote the variables so empty input or spaces don't break the test.
if [ "$clockStatus" = "in" ]
then
        date >> "${employeeID}_timeclock"
        echo "$employeeID logged in" >> "${employeeID}_timeclock"
else
        # Anything other than "in" is treated as a clock-out.
        date >> "${employeeID}_timeclock"
        echo "$employeeID logged out" >> "${employeeID}_timeclock"
fi
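One thing to watch in the script above is that a typo like "inn" falls through to the else branch and gets recorded as a clock-out. Here is a sketch of how elif could catch that. The clock_action function is a hypothetical name of my own, and it only prints the status text rather than writing to the timeclock file.

```shell
# Hypothetical helper: print the status text for "in" or "out",
# and fail with an error message for anything else.
clock_action() {
    if [ "$1" = "in" ]; then
        echo "logged in"
    elif [ "$1" = "out" ]; then
        echo "logged out"
    else
        echo "unknown option: $1" >&2
        return 1
    fi
}

clock_action in
clock_action out
clock_action inn || echo "rejected"
```

The last line shows the failure path: the function returns non-zero, so the text after || runs.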

For loops

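A for loop runs its body once for each word after (in). A bare word is taken literally, so this just prints the word itself:

for variable in list; do echo $variable; done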
list
for variable in a b c d; do echo $variable; done
a
b
c
d
for variable in a b c d; do echo $variable; done > list
for variable in $(cat list); do echo $variable; done
a
b
c
d
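Keep in mind that looping over $(cat list) splits on every space, not just on line breaks. If your lines contain spaces, a while loop with (read) is a common alternative. This is a minimal sketch comparing the two, with made-up sample lines and counter variable names of my own.

```shell
list=$(mktemp)
printf '%s\n' "first item" "second item" > "$list"

# for + $(cat ...) splits on whitespace, so each word is one iteration.
word_count=0
for variable in $(cat "$list"); do
    word_count=$((word_count + 1))
done

# while + read keeps each line intact, so each line is one iteration.
line_count=0
while IFS= read -r line; do
    line_count=$((line_count + 1))
done < "$list"

echo "$word_count words, $line_count lines"

rm -f "$list"
```

With the two sample lines, the for loop runs four times and the while loop runs twice.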
