Awk


AWK is a special-purpose programming language for text-processing and text-reformatting jobs

Glossary

Usage

#!/usr/bin/awk -f
awk <script> <file>
awk -f <script> <file>
awk -f <script> -v key1=value1 -v key2=value2 <file>
awk -f <script>
awk -F <delimiter> -f <script> <file>
awk -f <script> -f <path/to/library.awk> <file>

When no <file> is given, Awk reads from standard input; press CTRL-D to signal EOF
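As a rough illustration of how these options combine, the sketch below passes a field delimiter with -F and a variable with -v to an inline program. The sample data and the variable name `min` are made up for the example.

```shell
# Hypothetical sample data: name:score pairs, one per line
data='alice:42
bob:17
carol:99'

# -F sets the field separator, -v assigns a variable before the program runs
result=$(printf '%s\n' "$data" | awk -F: -v min=40 '$2 >= min { print $1 }')

printf '%s\n' "$result"
```

Only the names whose second field is at least `min` are printed.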

Rules

BEGIN { commands }

END { commands }

Pipes

|

Pipe output to a command

print "Foo" | "wc -c"

>

Output to a file

print "Foo" > "path/to/file"

>>

Append to file

print "Foo" >> "path/to/file"
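The following sketch tries the three forms together. It assumes a temporary file created with `mktemp` and uses `sort` as an arbitrary command to pipe into; both are illustrative choices rather than anything the notes above prescribe.

```shell
# Hypothetical temporary file used only for this demonstration
out=$(mktemp)

awk -v out="$out" 'BEGIN {
  print "first"  > out    # ">" truncates the file the first time it is opened
  print "second" > out    # later writes in the same run keep appending
  close(out)
  print "third"  >> out   # ">>" appends to whatever is already there
  close(out)
}'
contents=$(cat "$out")

# "|" sends each printed line to the standard input of one "sort" process
sorted=$(awk 'BEGIN { print "b" | "sort"; print "a" | "sort"; close("sort") }')

rm -f "$out"
printf '%s\n%s\n' "$contents" "$sorted"
```

Note that `>` truncates only when the file is first opened, so both "first" and "second" end up in the file.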

Functions

You can declare functions in the following form:

function name([argument1][, argument2][, ...]) {
  command1
  command2
  return [value]
}

Variables in Awk are global by default. To get local variables, declare them as extra parameters at the end of the argument list and don’t pass them when invoking the function; Awk then initializes them to an empty string:

function foo(arg1, arg2, temporary_arg1, temporary_arg2) {
  temporary_arg1 = 5
  temporary_arg2 = 7
}

foo("hello", "world")
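A small sketch of the extra-parameter idiom for local variables, run from the shell. The function name `sum_to` and its parameters are hypothetical; the extra spaces in the parameter list are only a common visual convention to separate real arguments from locals.

```shell
result=$(awk '
# sum_to(n) adds the integers 1..n; "i" and "total" are extra parameters
# that act as local variables because the caller never passes them
function sum_to(n,    i, total) {
  for (i = 1; i <= n; i++) {
    total += i
  }
  return total
}

BEGIN {
  i = "untouched"          # a global that happens to share the name "i"
  print sum_to(4)          # 1 + 2 + 3 + 4
  print i                  # the global is not modified by the function
}')

printf '%s\n' "$result"
```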

Variables

FS

The field separator; it defaults to " " (a single space, which splits on any run of spaces or tabs)

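# Split fields on a single comma
BEGIN {
  FS = ","
}

# Split fields on any run of commas or colons (FS is treated as a regex)
BEGIN {
  FS = "[,:]+"
}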

OFS

The output field separator; it defaults to a single space

BEGIN {
  OFS = "\t"
}

# Each comma in the print statement is output as OFS (`\t` here)

{
  print $1, $2, $3
}
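To see FS and OFS working together, here is a minimal sketch that converts comma-separated input to a different separator. The dash is used as OFS only so the effect is visible; the variable names are illustrative.

```shell
# Commas between print arguments are replaced by OFS
joined=$(printf 'a,b,c\n' | awk 'BEGIN { FS = ","; OFS = "-" } { print $1, $2, $3 }')

# A bare print outputs $0 as it was read, so OFS has no effect
unchanged=$(printf 'a,b,c\n' | awk 'BEGIN { FS = ","; OFS = "-" } { print }')

# Assigning to any field makes Awk rebuild $0 using OFS
rebuilt=$(printf 'a,b,c\n' | awk 'BEGIN { FS = ","; OFS = "-" } { $1 = $1; print }')

printf '%s\n%s\n%s\n' "$joined" "$unchanged" "$rebuilt"
```

The `$1 = $1` assignment is a common way to force the record to be rebuilt without listing every field.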

NF

The number of fields in the current record

NF == <number of fields> {
  commands
}

RS

The record separator; it defaults to \n (newline)

ORS

The output record separator; it defaults to \n (newline)

NR

The number of the current record (the current line number)

NR == <line number> {
  commands
}
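As a quick check of how NR and NF behave, the sketch below prints both for every record of some made-up input, including an empty line.

```shell
# Hypothetical input: four lines, the third one empty
data='one two three
four five

six'

# Print the record number followed by the field count of that record
result=$(printf '%s\n' "$data" | awk '{ print NR ": " NF }')

printf '%s\n' "$result"
```

An empty line is still a record, so NR advances while NF is 0.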

CONVFMT

Controls number-to-string conversions (for example, during concatenation); the default value is %.6g

OFMT

Controls number-to-string conversions when printing numbers with print; the default value is %.6g

BEGIN {
  OFMT = "%.2f"
}
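The difference between CONVFMT and OFMT is easy to miss, so here is a small sketch that sets them to different formats. The value of `x` and the chosen formats are arbitrary.

```shell
result=$(awk 'BEGIN {
  OFMT = "%.2f"
  CONVFMT = "%.3f"
  x = 3.14159
  print x          # printed directly as a number: OFMT applies
  print x ""       # concatenation converts to a string first: CONVFMT applies
  print 42         # integral values are printed as integers regardless
}')

printf '%s\n' "$result"
```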

ARGC

The number of command-line arguments, counting the program name stored in ARGV[0]

ARGV

Array of command-line arguments, indexed from 0 to ARGC - 1 (options and the program text are not included)

ENVIRON

Associative array of environment variables, indexed by variable name

print ENVIRON["PATH"]
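A minimal sketch of ARGC, ARGV, and ENVIRON together. The environment variable `DEMO_GREETING` and the arguments `first` and `second` are made up; because the program has only a BEGIN rule, Awk never tries to open the arguments as files.

```shell
result=$(DEMO_GREETING=hello awk 'BEGIN {
  print ENVIRON["DEMO_GREETING"]   # value taken from the environment
  print ARGC                       # program name plus two arguments
  print ARGV[1], ARGV[2]           # ARGV[0] holds the program name
}' first second)

printf '%s\n' "$result"
```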

Statements

print [expression][, expression...]

Print a string to stdout

{ print "The first field is " $1 ", and the second is " $2 }
{ print "" }
print "This is an error" > "/dev/stderr"

printf [expression][, arguments...]

Print a formatted string to stdout

{ printf "%20s\n", $1 }
{ printf "%-20s\n", $1 }
BEGIN {
  width = 15
}

{
  printf "%*s\n", width, $1
}
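To make the effect of the width and the `-` flag concrete, this sketch prints the same field right-aligned and left-aligned in a five-character column. The brackets are only there to make the padding visible.

```shell
# %5s pads on the left (right-aligned); %-5s pads on the right (left-aligned)
result=$(printf 'ab\nabcd\n' | awk '{ printf "[%5s][%-5s]\n", $1, $1 }')

printf '%s\n' "$result"
```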

next

Get the next record and start over from the first rule, skipping any remaining rules that would have matched the current record

exit [code]

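Stop reading input and exit with the given status code; the END rule, if any, still runs unless exit is called from within it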
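The sketch below combines `next` and `exit` on some invented input: comment lines are skipped, and a line reading `STOP` ends processing early. It also shows that the END rule runs after `exit`.

```shell
# Hypothetical input with a comment line and a stop marker
data='# comment
keep 1
STOP
keep 2'

result=$(printf '%s\n' "$data" | awk '
/^#/     { next }            # skip comment lines; later rules never see them
/^STOP$/ { exit }            # stop reading input; the END rule still runs
         { n++; print }
END      { print "printed " n }')

printf '%s\n' "$result"
```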

Number split(string, array, separator)

Create an array by splitting string by separator. The elements are populated into array, and the function returns the number of elements
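A short sketch of `split` on a made-up date string. The array name `parts` is arbitrary; elements are numbered starting from 1.

```shell
result=$(awk 'BEGIN {
  n = split("2024-01-15", parts, "-")   # returns the number of pieces
  print n
  print parts[1], parts[2], parts[3]
}')

printf '%s\n' "$result"
```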

delete array[subscript]

Delete an element from an array

[element] in [array]

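Test whether a subscript (index) exists in array; this checks keys, not values

if (1 in ARGV) {
  print "ARGV[1] is set"
}
# Iterate over every subscript of array (the order is unspecified)
for (i in array) {
  print array[i]
}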
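Here is a small sketch of `in` and `delete` working together on an associative array. The array `seen`, its keys, and the helper function `has` are all hypothetical names chosen for the example.

```shell
result=$(awk '
# Report whether key is a subscript of arr, without creating it
function has(arr, key) {
  return (key in arr) ? "yes" : "no"
}

BEGIN {
  seen["apple"] = 1
  seen["pear"] = 1
  delete seen["pear"]                 # remove one element
  print "apple: " has(seen, "apple")
  print "pear: " has(seen, "pear")
  print "kiwi: " has(seen, "kiwi")    # never assigned, so not present
}')

printf '%s\n' "$result"
```

Testing with `in` is preferable to comparing `seen[key]` against an empty string, because merely referencing an element creates it.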

getline

Read the next input record into $0, updating NF and NR

It returns 1 if a record was read, 0 at end of input, and -1 on error


BEGIN {
  while (getline > 0) {
    list = list $0
  }
}
# Read the next line of a file into $0
getline < "path/to/file"
# Read the next line of standard input into $0
getline < "-"
BEGIN {
  printf "Enter your name: "
  getline variable < "-"
  print variable
}

Note that when you read into a variable, the line is not split into fields as it is when it is assigned to $0; $0 and NF are left unchanged.
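The sketch below reads a whole file with `getline variable < file` from a BEGIN rule. It assumes a temporary file created with `mktemp`; the variable names are illustrative. Comparing the return value with `> 0` keeps the loop from spinning forever if the file cannot be read.

```shell
# Hypothetical two-line lookup file
lookup=$(mktemp)
printf 'x\ny\n' > "$lookup"

result=$(awk -v file="$lookup" 'BEGIN {
  while ((getline line < file) > 0) {   # 1 = got a line, 0 = EOF, -1 = error
    n++
    last = line
  }
  close(file)
  print n, last, NR                     # NR is not advanced by this form
}')

rm -f "$lookup"
printf '%s\n' "$result"
```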

"command" | getline

For example:

"whoami" | getline

You may save the output into a variable.

Note that each call to getline reads only one line of output. Use a while loop to accumulate all of it:

while (("command" | getline) > 0) {
  output = output $0
}
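A runnable sketch of collecting every line of a command's output. The command string is a stand-in that simply echoes three lines; `line`, `count`, and `output` are illustrative names.

```shell
result=$(awk 'BEGIN {
  cmd = "echo one; echo two; echo three"   # hypothetical command
  while ((cmd | getline line) > 0) {       # stop on EOF (0) or error (-1)
    count++
    output = (output == "" ? line : output "," line)
  }
  close(cmd)                               # release the pipe when done
  print count
  print output
}')

printf '%s\n' "$result"
```

Storing the command in a variable makes it easy to pass exactly the same string to `close`.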

close([file|pipe])

Close a file or pipe

For example:

BEGIN {
  "whoami" | getline user
  close("whoami")
  print user
}

system([command])

Execute a command and return its exit status; the command’s output goes straight to stdout and is not available to Awk

Operators

$<N>

{ print $2 }
BEGIN {
  one = 1
  two = 2
}

{ print $(one + two) }

<field> ~ <pattern>

$5 ~ /foo/ {
  print "The string \"foo\" is inside the fifth field"
}

<field> !~ <pattern>

$5 !~ /foo/ {
  print "The string \"foo\" is NOT inside the fifth field"
}
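To try both match operators on real input, the sketch below counts the records whose first field starts with "foo" and prints those whose first field does not end in ".txt". The file names in the sample data are invented.

```shell
# Hypothetical input: a file name and a number per line
data='foo.txt 10
bar.log 20
food.txt 30'

result=$(printf '%s\n' "$data" | awk '
$1 ~ /^foo/    { matches++ }        # first field starts with "foo"
$1 !~ /\.txt$/ { print $1 }         # first field does not end in ".txt"
END            { print matches }')

printf '%s\n' "$result"
```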

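<string> <string>

Concatenate strings by writing them next to each other; Awk has no explicit concatenation operator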

BEGIN {
  foo = "Hello" " - " "World"
}

Examples

# Count the empty lines in the input
BEGIN {
  x = 0
}

/^$/ {
  x += 1
}

END {
  print x
}

Tips & Tricks

Caveats