
Linux awk command


AWK is a language for processing text files and a powerful text analysis tool.

The name AWK comes from the initials of the family names of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.

Syntax

awk [options] 'script' var=value file(s)
or
awk [options] -f scriptfile var=value file(s)

Option descriptions:

  • -F fs or --field-separator fs
    Specify the input field separator. fs is a string or a regular expression, for example -F:.

  • -v var=value or --assign var=value
    Assign a user-defined variable.

  • -f scriptfile or --file scriptfile
    Read awk commands from a script file.

  • -mf nnn and -mr nnn
    Set an internal limit to the value nnn. The -mf option limits the maximum number of fields; the -mr option limits the maximum record size. Both are extensions from the Bell Labs version of awk and do not apply to standard awk.

  • -W compat or --compat, -W traditional or --traditional
    Run in compatibility mode: gawk behaves exactly like traditional awk, and all gawk extensions are disabled.

  • -W copyleft or --copyleft, -W copyright or --copyright
    Print brief copyright information.

  • -W help or --help, -W usage or --usage
    Print all awk options and a brief description of each option.

  • -W lint or --lint
    Print warnings about constructs that are not portable to other awk implementations.

  • -W lint-old or --lint-old
    Print warnings about constructs that are not portable to the original version of Unix awk.

  • -W posix
    Turn on compatibility mode with the following additional restrictions: \x escape sequences are not recognized; the synonym func for the keyword function is not recognized; when FS is a single space, a newline is not treated as a field separator; the operators ** and **= cannot be used in place of ^ and ^=; fflush() is not available.

  • -W re-interval or --re-interval
    Allow interval expressions (repetition counts such as {2,3}) in regular expressions; see also the POSIX character classes used in grep bracket expressions, such as [[:alpha:]].

  • -W source program-text or --source program-text
    Use program-text as awk source code. This option can be mixed with -f.

  • -W version or --version
    Print version information, which is useful to include in bug reports.
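
The most common options are often combined. The lines below are a minimal sketch, assuming a made-up colon-separated sample (the data and the variable name min are invented for illustration): -F sets the separator and -v passes a threshold into the program.

```shell
# Hypothetical sample data: name:score, one record per line.
data='alice:90
bob:72
carol:85'

# -F: sets the field separator to a colon; -v min=80 defines an awk
# variable before the program starts. Print names with a score >= min.
result=$(printf '%s\n' "$data" | awk -F: -v min=80 '$2 >= min { print $1 }')
printf '%s\n' "$result"
```

This should print alice and carol, the two names whose score is at least 80.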

Basic usage

The content of log.txt is as follows:

2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo

Usage 1:

awk '[pattern] {action}' {filenames}   # Line-matching statement; enclose the awk program in single quotes

Example:

# Split each line on spaces or tabs and print fields 1 and 4
 $ awk '{print $1,$4}' log.txt
 ---------------------------------------------
 2 a
 3 like
 This's
 10 orange,apple,mongo
 # Format the output
 $ awk '{printf "%-8s %-10s\n",$1,$4}' log.txt
 ---------------------------------------------
 2        a
 3        like
 This's
 10       orange,apple,mongo

Usage 2:

awk -F fs  # -F fs is equivalent to setting the built-in variable FS; it specifies the field separator

Example:

# Use "," as the separator
 $ awk -F, '{print $1,$2}' log.txt
 ---------------------------------------------
 2 this is a test
 3 Are you like awk
 This's a test
 10 There are orange apple
 # Or set the built-in variable FS
 $ awk 'BEGIN{FS=","} {print $1,$2}' log.txt
 ---------------------------------------------
 2 this is a test
 3 Are you like awk
 This's a test
 10 There are orange apple
 # Use multiple delimiters: split on either a space or a comma
 $ awk -F '[ ,]' '{print $1,$2,$5}' log.txt
 ---------------------------------------------
 2 this test
 3 Are awk
 This's a
 10 There apple
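
The default separator and a one-character regular expression behave differently when separators repeat. The following is a small sketch, assuming a made-up input line with two spaces in a row, that compares the field counts.

```shell
# One hypothetical input line with two spaces between "a" and "b".
line='a  b'

# Default FS: any run of blanks counts as one separator, so NF is 2.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# FS="[ ]" is a regular expression matching exactly one space, so the
# two adjacent spaces enclose an empty field and NF becomes 3.
single_nf=$(printf '%s\n' "$line" | awk -F'[ ]' '{ print NF }')

printf '%s %s\n' "$default_nf" "$single_nf"
```

The same rule explains why -F '[ ,]' can produce empty fields when a comma is followed by a space.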

Usage 3:

awk -v var=value  # Set a variable

Example:

 $ awk -va=1 '{print $1,$1+a}' log.txt
 ---------------------------------------------
 2 3
 3 4
 This's 1
 10 11
 $ awk -va=1 -vb=s '{print $1,$1+a,$1b}' log.txt
 ---------------------------------------------
 2 3 2s
 3 4 3s
 This's 1 This'ss
 10 11 10s
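
A frequent use of -v is to hand a shell variable to awk. This is a minimal sketch, assuming a made-up shell variable named threshold and made-up input numbers; inside awk the value is visible as t.

```shell
# Hypothetical shell variable holding a threshold.
threshold=3

# Pass the shell value in with -v instead of splicing it into the
# program text; the awk program itself stays in single quotes.
result=$(printf '1\n5\n2\n7\n' | awk -v t="$threshold" '$1 > t { print $1 }')
printf '%s\n' "$result"
```

Keeping the program in single quotes and passing values with -v avoids having the shell expand $1 by accident.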

Usage 4:

awk -f {awk script} {filename}

Example:

 $ awk -f cal.awk log.txt

Operators

Operator                   Description
= += -= *= /= %= ^= **=    Assignment
?:                         C-style conditional expression
||                         Logical OR
&&                         Logical AND
~ and !~                   Matches / does not match a regular expression
< <= > >= != ==            Relational operators
(space)                    Concatenation
+ -                        Addition, subtraction
* / %                      Multiplication, division, modulus
+ - !                      Unary plus, unary minus, logical NOT
^ **                       Exponentiation
++ --                      Increment, decrement (prefix or postfix)
$                          Field reference
in                         Array membership
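
Several of these operators can be tried in a single BEGIN block without any input file. The following is a small sketch; the variable names x, label, s, arr and has are made up for the example.

```shell
result=$(awk 'BEGIN {
    x = 2 ^ 10                              # exponentiation
    label = (x > 1000) ? "big" : "small"    # conditional expression
    s = "a" "b" "c"                         # concatenation by juxtaposition
    arr["k"] = 1
    has = ("k" in arr) ? "yes" : "no"       # array membership
    print x, label, s, has, 7 % 3           # modulus
}')
printf '%s\n' "$result"
```

The expected output is the single line 1024 big abc yes 1.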

Filter lines whose first column is greater than 2

$ awk '$1>2' log.txt           # Command
#Output
3 Are you like awk
This's a test
10 There are orange,apple,mongo
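
In the output above, the line This's a test is kept because $1 is not a number on that line, so awk compares the two values as strings. A small sketch, assuming only numeric matches are wanted, is to add 0 to the field to force a numeric comparison (the temporary directory only makes the example self-contained):

```shell
# Recreate the sample log.txt from this page in a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo
EOF

# $1+0 forces a numeric value; a non-numeric first field becomes 0
# and is therefore filtered out by the comparison.
result=$(awk '$1+0 > 2 { print $1 }' "$tmp/log.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With the numeric comparison only 3 and 10 are printed.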

Filter lines whose first column equals 2

$ awk '$1==2 {print $1,$3}' log.txt           # Command
#Output
2 is

Filter lines whose first column is greater than 2 and whose second column equals "Are"

$ awk '$1>2 && $2=="Are" {print $1,$2,$3}' log.txt           # Command
#Output
3 Are you

Built-in variables

Variable       Description
$n             The nth field of the current record; fields are separated by FS
$0             The complete input record
ARGC           Number of command-line arguments
ARGIND         Position of the current file on the command line (starting from 0)
ARGV           Array containing the command-line arguments
CONVFMT        Number conversion format (default is %.6g)
ENVIRON        Associative array of environment variables
ERRNO          Description of the last system error
FIELDWIDTHS    List of field widths (separated by spaces)
FILENAME       Name of the current file
FNR            Line number within the current file (counted separately for each file)
FS             Field separator (default is any whitespace)
IGNORECASE     If true, perform case-insensitive matching
NF             Number of fields in the current record
NR             Number of records read so far, i.e. the line number, starting from 1
OFMT           Output format for numbers (default is %.6g)
OFS            Output field separator (default is a single space)
ORS            Output record separator (default is a newline)
RLENGTH        Length of the string matched by the match function
RS             Record separator (default is a newline)
RSTART         Start position of the string matched by the match function
SUBSEP         Array subscript separator (default is \034)

$ awk 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}' log.txt
FILENAME ARGC FNR FS NF NR OFS ORS RS
---------------------------------------------
log.txt    2    1         5    1
log.txt    2    2         5    2
log.txt    2    3         3    3
log.txt    2    4         4    4
$ awk -F\' 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}' log.txt
FILENAME ARGC FNR FS NF NR OFS ORS RS
---------------------------------------------
log.txt    2    1        1    1
log.txt    2    2        1    2
log.txt    2    3        2    3
log.txt    2    4        1    4
# Print the record number NR and the per-file line number FNR for each line
$ awk '{print NR,FNR,$1,$2,$3}' log.txt
---------------------------------------------
1 1 2 this is
2 2 3 Are you
3 3 This's a test
4 4 10 There are
# Specify the output field separator
$ awk '{print $1,$2,$5}' OFS=" $ " log.txt
---------------------------------------------
2 $ this $ test
3 $ Are $ awk
This's $ a $
10 $ There $
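
With a single input file NR and FNR are always equal, so the difference only shows up with several files. The following is a minimal sketch, assuming two made-up files named one.txt and two.txt created in a temporary directory.

```shell
# Create two small hypothetical input files.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/one.txt"
printf 'c\nd\ne\n' > "$tmp/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
result=$(cd "$tmp" && awk '{ print FILENAME, NR, FNR, $0 }' one.txt two.txt)
printf '%s\n' "$result"
rm -rf "$tmp"
```

The third output line should read two.txt 3 1 c: the third record overall, but the first record of two.txt.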

Use regular expressions for string matching

# Select lines whose second column contains "th" and print the second and fourth columns
$ awk '$2 ~ /th/ {print $2,$4}' log.txt
---------------------------------------------
this a

~ is the match operator; the pattern is written between the two slashes (//).

# Print lines containing "re"
$ awk '/re/' log.txt
---------------------------------------------
3 Are you like awk
10 There are orange,apple,mongo
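
When the position of a match matters and not just whether a line matches, the match function sets the built-in variables RSTART and RLENGTH. This is a small sketch, assuming a made-up input line, that extracts the first run of digits.

```shell
# Hypothetical input line containing a number.
line='error code 404 here'

# match() returns the start position (0 if there is no match) and sets
# RSTART and RLENGTH; substr() then cuts out the matched text.
result=$(printf '%s\n' "$line" |
    awk 'match($0, /[0-9]+/) { print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }')
printf '%s\n' "$result"
```

The match starts at character 12 and is 3 characters long, so the output is 12 3 404.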

Ignore case

$ awk 'BEGIN{IGNORECASE=1} /this/' log.txt
---------------------------------------------
2 this is a test
This's a test
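
IGNORECASE is a gawk extension; other awk implementations treat it as an ordinary variable. A portable sketch, assuming plain POSIX awk, is to lower-case the record with tolower() before matching (the three input lines are made up):

```shell
# tolower($0) returns a lower-cased copy of the line; the original
# record is printed unchanged when the copy matches /this/.
result=$(printf 'this one\nThis two\nother\n' | awk 'tolower($0) ~ /this/')
printf '%s\n' "$result"
```

Both this one and This two are printed, with their original capitalization.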

Pattern negation

$ awk '$2 !~ /th/ {print $2,$4}' log.txt
---------------------------------------------
Are like
a
There orange,apple,mongo
$ awk '!/th/ {print $2,$4}' log.txt
---------------------------------------------
Are like
a
There orange,apple,mongo

awk script

In awk scripts, pay attention to two keywords: BEGIN and END.

  • BEGIN { statements to run once, before any input is read }

  • END { statements to run once, after all lines have been processed }

  • { statements to run for each input line }
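
The three kinds of blocks can be seen together in a very small program. The following is a minimal sketch with made-up input (the numbers 3, 4 and 5) and a made-up variable name sum.

```shell
result=$(printf '3\n4\n5\n' | awk '
    BEGIN { print "start" }                    # runs once, before any input is read
    { sum += $1 }                              # runs for every input line
    END { print "lines:", NR, "sum:", sum }    # runs once, after the last line
')
printf '%s\n' "$result"
```

The output is start followed by lines: 3 sum: 12.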

Suppose there is a file (student score sheet):

$ cat score.txt
Marry   2143 78 84 77
Jack    2321 66 78 45
Tom     2122 48 77 71
Mike    2537 87 97 95
Bob     2415 40 57 62

Our awk script is as follows:

$ cat cal.awk
#!/bin/awk -f
# Before running
BEGIN {
    math = 0
    english = 0
    computer = 0
 
    printf "NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL\n"
    printf "---------------------------------------------\n"
}
# Running
{
    math+=$3
    english+=$4
    computer+=$5
    printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3, $4, $5, $3+$4+$5
}
# After running
END {
    printf "---------------------------------------------\n"
    printf "  TOTAL:%10d %8d %8d \n", math, english, computer
    printf "AVERAGE:%10.2f %8.2f %8.2f\n", math/NR, english/NR, computer/NR
}

Let's take a look at the execution result:

$ awk -f cal.awk score.txt
NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL
---------------------------------------------
Marry  2143     78       84       77      239
Jack   2321     66       78       45      189
Tom    2122     48       77       71      196
Mike   2537     87       97       95      279
Bob    2415     40       57       62      159
---------------------------------------------
  TOTAL:       319      393      350
AVERAGE:     63.80    78.60    70.00
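
The same accumulate-then-report pattern works for other questions about the table. As a small sketch, assuming the score.txt data shown earlier (recreated here in a temporary directory) and the made-up variable names total, best and name, the following finds the student with the highest total:

```shell
# Recreate score.txt from this page in a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/score.txt" <<'EOF'
Marry   2143 78 84 77
Jack    2321 66 78 45
Tom     2122 48 77 71
Mike    2537 87 97 95
Bob     2415 40 57 62
EOF

# Remember the largest total seen so far and who achieved it;
# report the result once all lines have been read.
result=$(awk '{ total = $3 + $4 + $5
                if (total > best) { best = total; name = $1 } }
              END { print name, best }' "$tmp/score.txt")
printf '%s\n' "$result"
rm -rf "$tmp"
```

This prints Mike 279, which matches the largest value in the TOTAL column above.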

Other examples

The "Hello, world" program in AWK is:

BEGIN { print "Hello, world!" }

Calculate the total size of files

$ ls -l *.txt | awk '{sum+=$5} END {print sum}'
--------------------------------------------------
666581
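
The total printed above depends on the files in the directory. A reproducible sketch, assuming two made-up files of known size and the usual ls -l layout in which column 5 is the size in bytes, looks like this:

```shell
# Create two hypothetical files with known sizes.
tmp=$(mktemp -d)
printf '%s' 'hello' > "$tmp/a.txt"        # 5 bytes
printf '%s' 'awk rules' > "$tmp/b.txt"    # 9 bytes

# Sum column 5 of the ls -l listing and print the total at the end.
result=$(ls -l "$tmp"/*.txt | awk '{ sum += $5 } END { print sum }')
printf '%s\n' "$result"
rm -rf "$tmp"
```

Here the total is 14 bytes.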

Find lines longer than 80 characters in a file:

awk 'length>80' log.txt
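
The length function can also be used inside an action. This is a small sketch, assuming three made-up input lines and the made-up variable names max and line, that reports the number and length of the longest line.

```shell
# Keep track of the longest line seen so far and its line number.
result=$(printf 'short\na much longer line\nmid size\n' | awk '
    length($0) > max { max = length($0); line = NR }
    END { print line, max }')
printf '%s\n' "$result"
```

The second line is the longest at 18 characters, so the output is 2 18.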

Print the 9x9 multiplication table

seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
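
The same table can be produced by awk alone with two nested loops. The following is a minimal sketch; the shell function name times_table and the awk variable n are made up for the example.

```shell
# times_table N prints the N x N multiplication table (lower triangle).
times_table() {
    awk -v n="$1" 'BEGIN {
        for (row = 1; row <= n; row++)
            for (col = 1; col <= row; col++)
                printf "%dx%d=%d%s", col, row, col * row, (col == row ? "\n" : "\t")
    }'
}

times_table 9
```

Because the program consists only of a BEGIN block, awk does not read any input, so seq and sed are not needed.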
