How to Use split() with awk in Linux?

“awk” is a command-line utility for text processing and manipulation that comes preinstalled on most Linux distributions. It scans its input for lines that match a pattern and performs actions on them, such as printing, replacing, or calculating values.

In addition, it offers several built-in functions named after the tasks they perform, such as searching and replacing text, finding the position of a substring, and splitting strings.

This post explains how the split() function works and how to use it with the “awk” command in Linux.

  • How to Use split() with awk in Linux?
  • Using the “Space” Default Delimiter
  • Using the “Comma” Delimiter
  • Using the “Pipe” Delimiter

How to Use split() With awk in Linux?

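The “split()” function splits a string into the elements of an awk array, breaking it at each occurrence of a delimiter/separator. If no delimiter is given, awk uses its current field separator (FS), which is whitespace by default. Its generalized syntax is stated here: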

split(Source,Destination,Delimiter)

The split syntax contains the following arguments:

  • Source: Represents the string that needs to be split.
  • Destination: Names the array that receives the resulting substrings, indexed from 1.
  • Delimiter: Specifies the separator at which the string is split (optional).
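
The three arguments can be sketched together as follows, assuming a POSIX-compatible awk. split() also returns the number of elements it stored, which the loop uses as its upper bound. The names “n” and “parts”, the “:” delimiter, and the sample string are chosen only for this illustration:

```shell
# Capture the awk output in a shell variable so it can be inspected or reused
out=$(echo "Linux:Ubuntu:RHEL" | awk '{
    n = split($0, parts, ":")      # Source = $0, Destination = parts, Delimiter = ":"
    for (i = 1; i <= n; i++)       # arrays filled by split() start at index 1
        print i ": " parts[i]
}')
printf '%s\n' "$out"
```

Using the return value avoids hard-coding indices when the number of substrings is not known in advance.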

Example 1: Using the split() Function With “Space” Default Delimiter

The delimiter passed to the split() function is optional. If the user does not provide one, split() uses the value of the field separator variable FS, which defaults to whitespace (spaces and tabs).

In this scenario, a string of three words is split using the default “space” delimiter:

$ echo "Linux Ubuntu RHEL" | awk '{split($0,a); print a[3]; print a[2]; print a[1]}'

The command description is stated below:

  • echo: Prints the string “Linux Ubuntu RHEL” to standard output.
  • |(Pipe Character): Sends the output of the “echo” command to the “awk” command as its input.
  • awk: Processes the text, i.e., splits the string and prints the selected elements.
  • split(): Takes “$0” (the entire input line) as the source and “a” as the array name; no delimiter is given, so the default “space” is used.
  • print: Prints the elements of array “a” by index, i.e., “a[3]” is the third substring, “a[2]” the second, and “a[1]” the first.

The output displays the values of array “a” in the terminal in the order of the specified indices: “RHEL”, “Ubuntu”, and then “Linux”, each on its own line.
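
With the default separator, a run of several spaces or tabs counts as a single delimiter. The following is a small sketch of that behavior, assuming a POSIX-compatible awk; the irregularly spaced input and the variable “n” are made up for illustration:

```shell
# The input contains three spaces and a tab between the words (illustrative data)
out=$(printf 'Linux   Ubuntu\tRHEL\n' | awk '{
    n = split($0, a)               # no third argument, so FS (default whitespace) is used
    print n                        # number of substrings found
    print a[2]                     # second substring
}')
printf '%s\n' "$out"
```

Here split() reports three elements and “a[2]” holds “Ubuntu”, with no empty elements created by the extra whitespace.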

Example 2: Using the split() Function With “Comma” Delimiter

The “comma” can also be used as a delimiter for the split() function. It must be enclosed in double quotes and passed as the third argument, after the array name. Let’s see its practical implementation:

The command is the same as in the “default delimiter” example, but with the “,” (comma) delimiter added:

$ echo "Linux,Ubuntu,RHEL" | awk '{split($0,a,","); print a[3]; print a[2]; print a[1]}'

The original string has been split into three substrings, which are printed in the order “RHEL”, “Ubuntu”, and “Linux”.
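
The command above lists every index by hand. As a possible alternative, the same reverse order can be produced with a loop that counts down from the value returned by split(). This is only a sketch, assuming a POSIX-compatible awk; the variable “n” is not part of the original command:

```shell
# Print the comma-separated substrings from last to first
out=$(echo "Linux,Ubuntu,RHEL" | awk '{
    n = split($0, a, ",")          # n holds the element count returned by split()
    for (i = n; i >= 1; i--)       # walk the array from the last index down to 1
        print a[i]
}')
printf '%s\n' "$out"
```

The loop keeps working if the input contains more or fewer than three comma-separated values.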

Example 3: Using the split() Function With “Pipe” Delimiter

The “|” (pipe) is another character that can serve as the delimiter in the “split()” function of the “awk” command. Like the “comma” delimiter, it is enclosed in double quotes and passed as the third argument.

This example shows the practical implementation of the “|” pipe character in the split() function:

$ echo "Linux|Ubuntu|RHEL" | awk '{split($0,a,"|"); print a[3]; print a[2]; print a[1]}'

This time the input string is separated by the “|” character, and each substring is displayed on a new line in the specified order.
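
A single “|” is taken literally, as in the command above. A delimiter longer than one character, however, is treated by awk as a regular expression, in which “|” means alternation, so the pipes in a multi-character separator need to be escaped or placed in brackets. The following is a hedged sketch, assuming a POSIX-compatible awk; the “||” separator and the input string are made up for illustration:

```shell
# Substrings separated by a double pipe (illustrative data)
out=$(echo "Linux||Ubuntu||RHEL" | awk '{
    n = split($0, a, "[|][|]")     # each bracket expression matches one literal "|"
    print n                        # number of substrings found
    print a[1]                     # first substring
}')
printf '%s\n' "$out"
```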

Conclusion

“split()” is an “awk” function that splits a given string into an “awk” array at each occurrence of a delimiter. It takes three arguments, i.e., the “string,” the “array,” and the delimiter. The delimiter is optional; when it is omitted, the field separator (whitespace by default) is used, and it can be changed according to user choice.

This post has illustrated a complete procedure to use split() with the awk command in Linux.