Awk is a powerful Linux utility for writing short, statement-based programs that process text. These programs are effective at scanning a file for a given pattern or piece of text. Awk is also regarded as a scripting language, since it can manipulate data and generate reports.
Given the importance of this utility, we have compiled this guide to 20 important awk commands that you need to know.
How the AWK command works
How a command works depends largely on its syntax, so let’s start with the syntax of the awk command.
Syntax:
> awk '{action}' <filename.txt>
When a regular expression is used, the syntax is:
> awk '/regex-pattern/ {action}' <filename.txt>
- {action}: The operations that are going to be performed on a file
- filename.txt: Name of the file under analysis.
- regex-pattern: The regular expression each line is tested against; the action runs only on lines that match.
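To make the two forms concrete, here is a minimal sketch. The sample file and its contents are hypothetical and are created only for this illustration.

```shell
# Create a small sample file (hypothetical data, for illustration only)
sample=$(mktemp)
printf 'ubuntu 22\nfedora 38\ndebian 12\n' > "$sample"

# Action only: the action runs for every line and prints the first field
all_names=$(awk '{print $1}' "$sample")
echo "$all_names"

# Pattern plus action: the action runs only for lines matching the regex
matched=$(awk '/fedora/ {print $2}' "$sample")
echo "$matched"

rm -f "$sample"
```

The first command prints all three names, while the second prints only the second field of the line that contains "fedora".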
AWK Command in Linux/Unix with Examples
This section contains 20 commands that are recommended for any Linux user. Most of the examples apply awk to a file named “test.txt”.
1. Print All the Content of a File
Awk’s most basic function is to print the content of a file. Use the following command to print the content of the file “test.txt”.
Note: The output is the same as what the cat command shows for the same file.
$ awk '{print $0}' test.txt
2. Print Content with Line Numbers
You can print every line prefixed with its line number. The built-in NR variable holds the number of the current line, and printing it together with $0 gives numbered output, as shown below.
$ awk '{print NR,$0}' test.txt
3. Print Specific Fields
The $0 variable of the awk command holds the entire current line. To get a specific field instead, use $1 for the first field, $2 for the second field, and so on. By default, fields are separated by whitespace. The command written below prints the second field of each line from the “test.txt” file.
$ awk '{print $2}' test.txt
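The field variables can be sketched on a small example. The data below is hypothetical; the point is only to show which piece of each line $1, $2 and $3 refer to.

```shell
# Hypothetical three-field sample: name, age, city
sample=$(mktemp)
printf 'alice 30 london\nbob 25 paris\n' > "$sample"

# $2 is the second whitespace-separated field of each line
second=$(awk '{print $2}' "$sample")
echo "$second"

# Several fields can be printed in any order; the comma inserts a space
swapped=$(awk '{print $3, $1}' "$sample")
echo "$swapped"

rm -f "$sample"
```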
4. Print Specific Lines
The awk command can be piped into the head command to show only the first few lines. Awk passes every line of the file through, and head limits how many of them are displayed. In the example below, the command prints the first three lines of the file “test.txt”.
$ awk '{print $0}' test.txt | head -3
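Piping into head is one option; awk can also make the same kind of selection by itself by testing the NR variable. The sketch below uses a hypothetical five-line file.

```shell
# Hypothetical five-line sample file
sample=$(mktemp)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$sample"

# A condition with no action prints the matching lines: here, lines 1 to 3
first_three=$(awk 'NR <= 3' "$sample")
echo "$first_three"

# NR == 2 selects exactly the second line
only_second=$(awk 'NR == 2' "$sample")
echo "$only_second"

rm -f "$sample"
```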
5. Print the Total Number of Fields in Each Line
You can get the number of fields in each line by using NF, a built-in variable that holds the number of fields in the current line. The command written below prints the number of fields in each line of the test.txt file.
$ awk '{print NF}' test.txt
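Because NF is a number, $NF refers to the last field of a line. A short sketch with hypothetical data:

```shell
# Hypothetical sample with a different number of fields on each line
sample=$(mktemp)
printf 'a b c\nd e\n' > "$sample"

# NF is the field count of the current line
counts=$(awk '{print NF}' "$sample")
echo "$counts"

# $NF is the field whose index equals NF, i.e. the last field
last=$(awk '{print $NF}' "$sample")
echo "$last"

rm -f "$sample"
```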
6. Delete Empty Lines
The awk command can be used to filter empty lines out of a file. NF is 0 for a line that is empty or contains only whitespace, so the condition NF>0 selects only the lines that contain data. The following awk command prints those lines; the original file itself is not modified.
$ awk 'NF>0 {print $0}' test.txt
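Since awk only prints the filtered lines, one way to keep the result is to redirect the output to a new file. The sketch below assumes hypothetical input that contains one empty line and one whitespace-only line.

```shell
# Hypothetical input: an empty line and a whitespace-only line in the middle
sample=$(mktemp)
cleaned=$(mktemp)
printf 'first\n\n   \nsecond\n' > "$sample"

# NF is 0 for both blank lines, so only lines with data are written out
awk 'NF > 0' "$sample" > "$cleaned"

result=$(cat "$cleaned")
echo "$result"

rm -f "$sample" "$cleaned"
```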
7. Print the Lines That Match a String
The awk command can be used to print the lines that contain a string specified in the command. The command written below searches “distro.txt” for “linuxfoss” and prints all the lines that contain this string.
$ awk '/linuxfoss/' distro.txt
8. Print Specific Fields of Lines That Match a Pattern
You can combine a pattern with an action so that only certain fields of the matching lines are printed. For instance, the command provided below prints the second and third fields of the lines containing the word “linux”. Matching is case-sensitive, so a line that contains only “Linux” is not printed.
$ awk '/linux/{print $2, $3}' distro.txt
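Pattern matching in awk is case-sensitive. One portable way to ignore case is to lowercase the line with tolower() before matching, as in this sketch (the sample data is hypothetical):

```shell
# Hypothetical sample mixing "Linux" and "linux"
sample=$(mktemp)
printf 'Arch Linux rolling\nalpine linux stable\nFreeBSD unix stable\n' > "$sample"

# Case-sensitive: only the line with lowercase "linux" matches
sensitive=$(awk '/linux/ {print $1, $3}' "$sample")
echo "$sensitive"

# Lowercase the whole line first so that "Linux" matches too
insensitive=$(awk 'tolower($0) ~ /linux/ {print $1, $3}' "$sample")
echo "$insensitive"

rm -f "$sample"
```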
9. Print the Line That Matches the Pattern (at the start)
The awk command can be used to print the lines that start with a specific pattern. The command below prints the line(s) that start with “D”.
$ awk '/^D/' test.txt
10. Print the Line That Matches a Pattern (at the end)
Alternatively, you can print the lines ending in a specific pattern. For instance, the following command prints the lines that end with “2”.
$ awk '/2$/' test.txt
Note: The caret (^) and dollar ($) signs anchor the pattern to the start and the end of a line, respectively.
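The two anchors can also be combined in a single pattern. The following sketch runs all three variants against hypothetical data:

```shell
# Hypothetical sample file
sample=$(mktemp)
printf 'Debian 12\nUbuntu 22\nDocker 24\n' > "$sample"

# ^D matches lines that start with "D"
starts=$(awk '/^D/' "$sample")
echo "$starts"

# 2$ matches lines that end with "2"
ends=$(awk '/2$/' "$sample")
echo "$ends"

# Both anchors: starts with "D" and ends with "2"
both=$(awk '/^D.*2$/' "$sample")
echo "$both"

rm -f "$sample"
```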
11. Print the Lines That Contain Digits
The following awk command prints the lines of the test.txt file that contain at least one digit. To identify the digits, the [0-9] character class is used.
$ awk '/[0-9]/' test.txt
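Note that /[0-9]/ matches a line if a digit appears anywhere in it. A sketch of the difference between "contains a digit" and "consists only of digits", using hypothetical data:

```shell
# Hypothetical sample: letters only, digits only, and a mix
sample=$(mktemp)
printf 'abc\n123\nab12\n' > "$sample"

# Any line with at least one digit
has_digit=$(awk '/[0-9]/' "$sample")
echo "$has_digit"

# Anchored pattern: the whole line must be made of digits
only_digits=$(awk '/^[0-9]+$/' "$sample")
echo "$only_digits"

rm -f "$sample"
```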
12. Print the Lines That Contain Letters
Letter classes can be used in awk patterns as well. The class [a-z] matches lowercase letters, whereas [A-Z] matches uppercase letters. The command written below prints the lines that contain lowercase letters in the “distributions.txt” file.
$ awk '/[a-z]/{print $0}' distributions.txt
Similarly, the following command prints the lines that contain uppercase letters.
$ awk '/[A-Z]/{print $0}' distributions.txt
13. Print the Lines That Contain Alphanumeric Characters
Like digits and letters, alphanumeric characters can be matched with the character class “[A-Za-z0-9]”. This command prints all lines from the “distributions.txt” file that contain at least one alphanumeric character.
$ awk '/[A-Za-z0-9]/{print $0}' distributions.txt
14. Using Comparison Operators to Print the Lines
Comparison operators let you test a condition and print only the lines that satisfy it. For instance, the following command prints all the lines with a 3rd field value greater than “20”.
$ awk '$3 > 20 { print $0 }' numbers.txt
The command written below prints the lines that are at least 10 characters long.
$ awk 'length($0) >= 10' numbers.txt
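Both kinds of comparison can be sketched on the same hypothetical file, which shows that a field comparison and a length comparison select different lines:

```shell
# Hypothetical sample: item name, quantity, price
sample=$(mktemp)
printf 'pencil-case 3 5\nbag 1 40\nnotebook 12 25\n' > "$sample"

# Numeric comparison on the third field
big=$(awk '$3 > 20 {print $1}' "$sample")
echo "$big"

# Comparison on the length of the whole line
long_lines=$(awk 'length($0) >= 10 {print $1}' "$sample")
echo "$long_lines"

rm -f "$sample"
```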
15. Postprocessing
Postprocessing means that an action is carried out after all the data in a file has been processed. In awk, postprocessing is performed using the END keyword. In the following command, END is used to print the number of lines in the test.txt file, since NR holds the count of lines read so far.
$ awk 'END { print NR }' test.txt
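END is also a natural place to print totals that were accumulated while reading the file. A minimal sketch, assuming a hypothetical file with one number per line:

```shell
# Hypothetical sample: one number per line
sample=$(mktemp)
printf '10\n20\n30\n' > "$sample"

# The main action adds each value to total; END prints the line count and the sum
summary=$(awk '{ total += $1 } END { print NR, total }' "$sample")
echo "$summary"

rm -f "$sample"
```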
16. Preprocessing
You can use the BEGIN keyword to run an action before any input is read, for example to print a header. The command below first prints the header “The content of file is shown below” and then prints the content of the file (test.txt).
$ awk 'BEGIN {print "The content of file is shown below"} {print $0}' test.txt
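BEGIN and END can be used together in one program to wrap the output in a header and a footer. The sketch below uses hypothetical data and arbitrary header and footer text:

```shell
# Hypothetical two-line sample file
sample=$(mktemp)
printf 'alpha\nbeta\n' > "$sample"

# BEGIN runs once before the input, the middle block once per line,
# and END once after the last line
report=$(awk 'BEGIN { print "START" } { print NR ": " $0 } END { print "END" }' "$sample")
echo "$report"

rm -f "$sample"
```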
17. If Statement
An if statement checks a condition, often a comparison involving one or more fields, and runs its action only when the condition is true. The command written below prints only those values of the 1st field that are less than 10.
$ awk '{if ($1 < 10) print $1}' numbers.txt
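An if statement can also carry an else branch. The following sketch labels each value of a hypothetical file; the labels "small" and "large" are arbitrary:

```shell
# Hypothetical sample: one number per line
sample=$(mktemp)
printf '4\n15\n9\n' > "$sample"

# Values below 10 take the if branch; everything else takes the else branch
labels=$(awk '{ if ($1 < 10) print $1, "small"; else print $1, "large" }' "$sample")
echo "$labels"

rm -f "$sample"
```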
18. printf Command
The printf statement in awk produces formatted output using format specifiers. The following format specifiers are commonly used with printf:
- %o: prints an integer in octal
- %d: prints an integer value
- %f: prints a floating-point value
- %e: prints a number in scientific notation
- %c: prints a single character (a numeric argument is treated as a character code)
- %s: prints a text string
The command written below prints each value in the price.txt file in scientific notation.
$ awk '{printf "The result is: %e\n", $0 }' price.txt
Numeric values can also be printed in another base. For example, the following command prints each value in octal.
$ awk '{printf "The result is: %o\n", $0 }' price.txt
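To see the specifiers side by side, the sketch below formats a single hypothetical value (255) in several ways. The "|" separators are there only to make the output easier to read, and the ".2" in %.2f limits the output to two decimal places.

```shell
# Format one number with several specifiers
numbers=$(echo 255 | awk '{ printf "%d|%o|%.2f|%e|%s\n", $1, $1, $1, $1, $1 }')
echo "$numbers"

# %c turns a numeric argument into the character with that code
chars=$(awk 'BEGIN { printf "%c%c\n", 72, 105 }')
echo "$chars"
```

Unlike print, printf does not add a newline on its own, which is why the format strings end with \n.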
19. Change the Case of a String
Changing case is handled by awk’s built-in string functions. By using the command below, you can print the content of the test.txt file with all letters converted to uppercase.
$ awk '{print toupper($0)}' test.txt
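The counterpart of toupper() is tolower(), and both can be applied to a single field instead of the whole line. A sketch with a hypothetical one-line file:

```shell
# Hypothetical one-line sample
sample=$(mktemp)
printf 'Hello World\n' > "$sample"

# Whole line to uppercase and to lowercase
upper=$(awk '{print toupper($0)}' "$sample")
lower=$(awk '{print tolower($0)}' "$sample")
echo "$upper"
echo "$lower"

# Only the first field is changed; the rest of the line is kept as-is
first_only=$(awk '{ $1 = toupper($1); print }' "$sample")
echo "$first_only"

rm -f "$sample"
```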
20. Get the Environment Variables
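The awk command can also access system-related information. The built-in ENVIRON array holds the environment variables of the shell that started awk, so the values come from the environment, not from an input file. For instance, the following command prints the value of the PATH environment variable. Because the action sits in a BEGIN block, no input file is needed and the value is printed once.
$ awk 'BEGIN {print ENVIRON["PATH"]}'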
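Any exported variable can be read through ENVIRON in the same way. In the sketch below, GREETING is a hypothetical variable defined only for this example; the lookup runs in a BEGIN block, so no input file is required.

```shell
# GREETING is a made-up variable; export makes it visible to awk
export GREETING="hello from the shell"

# ENVIRON is indexed by the variable name
from_env=$(awk 'BEGIN { print ENVIRON["GREETING"] }')
echo "$from_env"
```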
Conclusion
The awk utility of Linux is very helpful for manipulating text files. It is also a scripting language, but unlike general-purpose programming languages it is data-driven: a program describes the patterns to look for in the data and the actions to take when they match. In this article, we have described the functionality of the awk command in Linux and explained 20 of the most used awk commands, each with an example. Although awk is a scripting language, a regular Linux user can also use it to automate tasks on text files.