By default, a record is a line, and each record is split into fields on runs of blanks (spaces and tabs), the default delimiter. Program syntax is largely based on C.

BEGIN { FS="\t+" } Initialization, run before any input is read. Set the field separator to one or more TABs. FS is treated as a regular expression when it is longer than one character.
END { ... } Finalization. Execute ... after all files have been processed.
gawk -F"\t" '{ ... }' Execute with field separator TAB.
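As a minimal sketch of how BEGIN, a main rule, and END fit together, the following sums the second column of tab-separated input. The sample rows and the names sum and total are made up for illustration.

```shell
# Sum the second column; the second row deliberately contains two TABs in a row.
total=$(printf 'a\t1\nb\t\t2\nc\t3\n' | awk '
BEGIN { FS = "\t+" }   # runs once, before any input is read
      { sum += $2 }    # runs for every record
END   { print sum }    # runs once, after the last record
')
echo "$total"
```

Because FS is the regular expression \t+, the doubled TAB in the second row counts as a single separator, so $2 is still the number.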
/xyz/ { print } For each line containing xyz, print the line.
$1 == 100 { print $2, $3 } For each line whose first field equals 100, print the 2nd and 3rd fields separated by OFS (a space by default).
$3 ~ /PAT/ { print $2 $3 } If the 3rd field matches PAT, print the 2nd and 3rd fields concatenated, with nothing between them.
$3 !~ /PAT/ { x = 0 } If the 3rd field does not match PAT, set x to 0.
print $1 OFS $2 Print 1st and 2nd fields separated by output field separator.
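The pattern forms above can be tried side by side on a small input. This is only a sketch: the three-column sample data and the shell variable names are hypothetical.

```shell
# Hypothetical sample: id, name, tag.
data='100 alice PAT-1
200 bob other
100 carol PATCH'

# Regular expression matched against the whole line.
matched=$(printf '%s\n' "$data" | awk '/bob/ { print }')
# Comparison pattern; the comma puts OFS between the fields.
pairs=$(printf '%s\n' "$data" | awk '$1 == 100 { print $2, $3 }')
# Match on one field; adjacent expressions are concatenated.
joined=$(printf '%s\n' "$data" | awk '$3 ~ /PAT/ { print $2 $3 }')
# Negated match, with OFS changed so the comma produces a dash.
dashed=$(printf '%s\n' "$data" | awk 'BEGIN { OFS = "-" } $3 !~ /PAT/ { print $1, $2 }')

printf '%s\n' "$matched" "$pairs" "$joined" "$dashed"
```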
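FS, OFS, RS, ORS Input field separator, output field separator, input record separator, output record separator.
NF, NR Number of fields in the current record; number of records read so far.
FILENAME Name of the current input file.
$NF Last field of the current record.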
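A short sketch of the built-in variables in use, on made-up two-line input. The shell variable names report and firsts are arbitrary.

```shell
# NR counts records, NF counts fields in the current record, $NF is the last field.
report=$(printf 'a b c\nd e\n' | awk '{ print NR ": " NF " fields, last = " $NF }')
# ORS replaces the newline that print normally appends.
firsts=$(printf 'a b c\nd e\n' | awk 'BEGIN { ORS = ";" } { print $1 }')
printf '%s\n' "$report" "$firsts"
```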
'${val:-hello}' In the pattern region, closes the single-quoted awk program so that BASH substitutes the value of $val, or "hello" if val is unset or empty.
"'${val}'" In the action region, substitutes the value of $val from the BASH environment inside an awk string literal.
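The two quoting tricks can be sketched as follows. The input lines and the values given to val are invented for the example; the technique works by ending the single-quoted awk program, letting the shell expand the variable, and then reopening the quotes.

```shell
unset val
# Pattern region: val is unset, so the pattern awk sees is /hello/.
hits=$(printf 'hello world\ngoodbye\n' | awk '/'${val:-hello}'/ { print $2 }')

val=bye
# Action region: the shell pastes the value inside an awk string literal,
# so awk sees { print "bye" ":" $1 }.
tagged=$(printf 'x\n' | awk '{ print "'${val}'" ":" $1 }')

echo "$hits $tagged"
```

Since the expansion is unquoted at the shell level, a value containing spaces or awk metacharacters would break the program, so this is best kept to simple values.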
array[2]="hello"
array["i"]="world"
for(i in array) print array[i] Loop over every subscript of array, in unspecified order. Also works for multi-dimensional arrays.
if("i" in array) print "found" Print if "i" is a subscript of array.
n = split($1, array, ":")
for(i = 1; i <= n ; i++) print array[i]
Split 1st field into array with ":" as delimiter.
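A minimal sketch of split in a complete program, using a made-up input line whose first field is colon-separated:

```shell
parts=$(printf 'a:b:c rest\n' | awk '{
    n = split($1, array, ":")     # n is the number of pieces; subscripts start at 1
    for (i = 1; i <= n; i++)
        print i, array[i]
}')
echo "$parts"
```

Looping from 1 to n keeps the pieces in order, which a for (i in array) loop does not guarantee.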
array[2,5]="val25" Equivalent to array["2" SUBSEP "5"]="val25". SUBSEP is the separator placed between subscript components, "\034" by default.
if((i,j) in array) Test whether the subscript pair (i,j) is in the array.
delete array Delete all elements in the array.
delete array[i] Delete element i in the array.
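The array operations above can be exercised in one BEGIN block. This is a sketch only; the messages printed and the counter n are illustrative.

```shell
result=$(awk 'BEGIN {
    array[2] = "hello"
    array["i"] = "world"
    if ("i" in array) print "found i"

    array[2,5] = "val25"                          # stored under "2" SUBSEP "5"
    if ((2,5) in array) print "found 2,5"
    if (("2" SUBSEP "5") in array) print "same key"

    delete array[2]                               # remove one element
    if (!(2 in array)) print "2 deleted"

    delete array                                  # remove everything
    n = 0
    for (k in array) n++
    print n " left"
}')
echo "$result"
```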
ARGC, ARGV, ENVIRON Number of arguments, argument array, and environment array.
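index(s,t) Position of the first occurrence of t in s, counting from 1, or 0 if t does not occur.
length(s) Length of s in characters.
sub(r,s[,t]) Replace the first match of r in t with s. t is $0 by default.
match(s,p) Return the starting position of the substring of s matched by regular expression p, or 0 if there is no match.
sprintf("fmt",expr) Return expr formatted according to fmt, as in C.
substr(s,p,n) The substring of s starting at position p, at most n characters long.
toupper(s), tolower(s) Return s converted to upper or lower case.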
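A small sketch that calls each of the string functions on one sample string. The string and the variable names s, t, and pos are arbitrary; RSTART and RLENGTH are the variables that match sets as a side effect.

```shell
out=$(awk 'BEGIN {
    s = "hello world"
    print index(s, "world")              # 7
    print length(s)                      # 11
    print substr(s, 1, 5)                # hello
    print toupper(substr(s, 7))          # WORLD
    pos = match(s, /o+/)
    print pos, RSTART, RLENGTH           # 5 5 1
    t = s
    sub(/o/, "0", t)                     # only the first match is replaced
    print t                              # hell0 world
    print sprintf("%03d|%-3s|", 7, "ab") # 007|ab |
}')
echo "$out"
```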
function name(list) {
statement
}
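As an illustration of the function syntax, here is a hypothetical max function applied to two numeric fields. The function name, its parameters, and the input are invented for the example.

```shell
out=$(printf '3 4\n10 2\n' | awk '
function max(a, b) {
    return (a > b) ? a : b
}
# Adding 0 forces a numeric comparison rather than a string one.
{ print max($1 + 0, $2 + 0) }
')
echo "$out"
```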
getline Read the next input record into $0, updating NF and NR.
getline <"-" Get a line from stdin.
print > "out.txt" Print to the file out.txt.
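A sketch combining getline with output redirection: it reads input two lines at a time and writes key=value pairs to out.txt. The input data is made up, and the temporary directory created with mktemp is only there to keep the example from leaving files behind.

```shell
tmp=$(mktemp -d)
(
    cd "$tmp" || exit 1
    printf 'k1\nv1\nk2\nv2\n' | awk '{
        key = $0
        if ((getline) > 0)                 # $0 now holds the following line
            print key "=" $0 > "out.txt"   # the file is created on the first write
    }'
)
contents=$(cat "$tmp/out.txt")
rm -rf "$tmp"
echo "$contents"
```

Within a single awk run, the > redirection truncates the file only when it is first opened; later print statements to the same name append to it.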

Reference: UNIX Power Tools: sed & awk by Dale Dougherty