In Linux, the “sort” command sorts the lines of a file or input stream in ascending or descending order. It can order lines alphabetically, numerically, or by specific columns or fields, and it can also sort the output of other commands through a pipe.
This article explains the difference between the “sort -u” and “sort | uniq” commands in Linux.
- What is the sort -u Command?
- What is sort | uniq Command?
- Difference Between “sort -u” and “sort | uniq” Commands
What is the sort -u Command?
The “sort -u” command sorts the input and removes duplicate lines, producing a sorted list of unique lines. Its syntax is given below:
Syntax
$ sort -u file_name
In the above syntax, the “-u” option instructs the “sort” command to output only one copy of each duplicated line in “file_name”; the file itself is not modified. Let’s practice the “sort -u” command:
Example “sort -u” Command
This example sorts a file and removes its duplicate lines in one step. First, the contents of a file named “file.txt” are displayed via the “cat” command:
$ cat file.txt
The output shows the content that is placed in the “file.txt”.
Let’s run the “sort -u” command on the same file, file.txt. The “-u” option removes the duplicates:
$ sort -u file.txt
The output shows that the list is sorted and that the duplicate “Apple” entry has been removed.
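The same behavior can be reproduced with a small sketch. The file contents below are hypothetical sample data chosen for illustration, and “LC_ALL=C” is set only to make the sort order independent of the system locale:

```shell
# Hypothetical sample data, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' Banana Apple Cherry Apple Mango > "$tmpdir/file.txt"

# LC_ALL=C pins the collation order so the result does not depend on the locale.
sorted_unique=$(LC_ALL=C sort -u "$tmpdir/file.txt")
printf '%s\n' "$sorted_unique"
# Prints:
# Apple
# Banana
# Cherry
# Mango

rm -rf "$tmpdir"
```

The second “Apple” line is dropped, and the input file is left unchanged because “sort” writes its result to standard output.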
What is sort | uniq Command?
The “sort | uniq” command is a combination of two commands, “sort” and “uniq”, connected by a pipe. The basic syntax of this combination is given below:
Syntax
$ sort file_name | uniq
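The “sort” command sorts the input, and the “uniq” command removes adjacent duplicate lines. Sorting first guarantees that identical lines end up next to each other, so “uniq” can remove all of them.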
Let’s explore practical implementation:
Example of sort | uniq Command
The “cat” command is utilized to display the content of the “file.txt” as seen below:
$ cat file.txt
The above display shows the unsorted and duplicated content in the “file.txt”.
Now, run the “sort | uniq” command on file.txt:
$ sort file.txt | uniq
It produces a sorted list of unique items by removing the duplicate lines.
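As a quick check, the following sketch compares the pipeline with “sort -u” on the same input. The sample data is hypothetical, and the variable names are chosen only for this illustration:

```shell
# Hypothetical sample data, written to a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' Banana Apple Cherry Apple Mango > "$tmpdir/file.txt"

# Two-command pipeline: sort first, then let uniq drop adjacent duplicates.
piped=$(LC_ALL=C sort "$tmpdir/file.txt" | uniq)

# Single-command form for comparison.
single=$(LC_ALL=C sort -u "$tmpdir/file.txt")

if [ "$piped" = "$single" ]; then
  echo "Both commands produced the same output"
fi

rm -rf "$tmpdir"
```

For a plain whole-line sort like this one, both forms print the same four lines.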
Difference Between “sort -u” and “sort | uniq” Commands
The differences between the “sort -u” and “sort | uniq” commands are listed below:
- The sort -u performs both sorting and removing duplicates in a single command, whereas sort | uniq uses two separate commands connected by a pipe operator to achieve the same result.
- The sort -u command is a shortcut that combines the functionality of sort and uniq commands.
- Both forms accept unsorted input. The “sort -u” command sorts the input and drops duplicates itself, whereas in “sort | uniq” the “sort” command must run first because “uniq” on its own only compares adjacent lines.
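- The key difference is how duplicates are detected. The “uniq” command by itself removes only consecutive duplicate lines, but because “sort” places identical lines next to each other, “sort | uniq” and “sort -u” give the same result for whole-line comparisons. The results can differ when “sort” is given a key or comparison option (for example “-k”, “-n”, or “-f”): “sort -u” then keeps one line from each group of lines that compare equal under that ordering, while “uniq” still compares entire lines. In addition, “uniq” offers options that “sort -u” lacks, such as “-c” to count occurrences and “-d” to print only the duplicated lines.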
- Another difference is in their performance. The “sort -u” command does the sorting and duplicate removal inside a single process, while “sort | uniq” starts two processes and passes every sorted line, duplicates included, through a pipe. This can make the pipeline somewhat slower on large datasets with many duplicates, although the difference is often small in practice.
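The points above about duplicate detection can be illustrated with a short sketch. The file names and contents are hypothetical; the options used (“-k” for “sort”, “-c” for “uniq”) are standard:

```shell
tmpdir=$(mktemp -d)

# 1. uniq without sort: the two "Apple" lines are not adjacent, so both survive.
printf '%s\n' Apple Banana Apple > "$tmpdir/fruits.txt"
uniq_only=$(uniq "$tmpdir/fruits.txt")
printf '%s\n' "$uniq_only"

# 2. With a sort key, the two forms can give different results.
printf '%s\n' 'apple 1' 'apple 2' 'banana 3' > "$tmpdir/pairs.txt"

# sort -u -k1,1 treats lines with the same first field as duplicates,
# so only one "apple" line is kept (2 lines of output).
by_key=$(LC_ALL=C sort -u -k1,1 "$tmpdir/pairs.txt")

# sort -k1,1 | uniq compares whole lines, so all 3 distinct lines remain.
whole_line=$(LC_ALL=C sort -k1,1 "$tmpdir/pairs.txt" | uniq)

# 3. uniq -c prefixes each line with its number of occurrences,
# something sort -u cannot report.
counts=$(LC_ALL=C sort "$tmpdir/fruits.txt" | uniq -c)
printf '%s\n' "$counts"

rm -rf "$tmpdir"
```

When the goal is simply a sorted list of unique lines, either form works. The pipeline is the better fit when counts or other “uniq” options are needed, and “sort -u” with a key is the better fit when only part of each line should decide what counts as a duplicate.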
Conclusion
In Linux, “sort -u” and “sort | uniq” produce the same output for plain whole-line comparisons, because sorting places duplicate lines next to each other before “uniq” removes them. The main differences are that “sort -u” does both jobs in a single command without a pipe, that it judges duplicates by the sort key when one is given, and that “uniq” provides extra options such as counting occurrences.
This guide has explained the difference between the “sort -u” and “sort | uniq” commands in Linux.