The Linux uniq command reports or removes repeated lines in a text file. Because it compares only adjacent lines, it is generally used in conjunction with the sort command.
uniq [-cdu] [-f <number of fields>] [-s <number of characters>] [-w <number of characters>] [--help] [--version] [Input File] [Output File]
Parameters:
-c or --count Prefixes each output line with the number of times it occurred.
-d or --repeated Displays only the lines that are repeated, one line per group.
-f <number of fields> or --skip-fields=<number of fields> Skips the given number of leading fields on each line when comparing.
-s <number of characters> or --skip-chars=<number of characters> Skips the given number of leading characters on each line when comparing.
-u or --unique Displays only the lines that are not repeated.
-w <number of characters> or --check-chars=<number of characters> Compares no more than the given number of characters on each line.
--help Displays help.
--version Displays version information.
[Input File] Specifies the input text file, which should already be sorted if all duplicates are to be detected. If not specified, data is read from standard input.
[Output File] Specifies the output file. If not specified, the result is written to standard output (usually the terminal).
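The three comparison options are easier to follow with a small experiment. The sketch below uses hypothetical sample lines (not taken from testfile) and assumes GNU uniq, since -w is a GNU extension that other implementations may not provide.

```shell
# Hypothetical sample data; none of these lines come from testfile.

# -f 1: ignore the first field (the leading letter) when comparing.
by_field=$(printf '%s\n' 'a test 30' 'b test 30' 'c Hello 95' | uniq -f 1)

# -s 2: ignore the first two characters (the "1:" style prefix).
by_chars=$(printf '%s\n' '1:apple' '2:apple' '3:pear' | uniq -s 2)

# -w 4: compare only the first four characters (GNU extension).
by_width=$(printf '%s\n' 'test 30' 'test 45' 'Hello 95' | uniq -w 4)

printf '%s\n' '-f 1:' "$by_field" '-s 2:' "$by_chars" '-w 4:' "$by_width"
```

In each case uniq keeps the first line of a group of lines that compare equal, so the part that was skipped or ignored is taken from that first line.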
In the file testfile, lines 2, 3, 5, 6, 7, and 9 duplicate the line before them. To delete these duplicate lines, use the uniq command as follows:
uniq testfile
The original content of testfile is:
$ cat testfile      # Original content
test 30
test 30
test 30
Hello 95
Hello 95
Hello 95
Hello 95
Linux 85
Linux 85
After using the uniq command to delete duplicate lines, the following output is obtained:
$ uniq testfile     # Content after deleting duplicate lines
test 30
Hello 95
Linux 85
To collapse the duplicate lines and also display, at the beginning of each line, the number of times that line appeared, use the following command:
uniq -c testfile
The result is as follows:
$ uniq -c testfile    # Content after deleting duplicate lines
      3 test 30       # The leading number is the count: this line appeared 3 times
      4 Hello 95      # This line appeared 4 times
      2 Linux 85      # This line appeared 2 times
The uniq command compares only adjacent lines, so it has no effect when the repeated lines are not next to each other. For example, uniq leaves the following file unchanged:
$ cat testfile1     # Original content
test 30
Hello 95
Linux 85
test 30
Hello 95
Linux 85
test 30
Hello 95
Linux 85
In this case, sort the file first so that identical lines become adjacent:
$ sort testfile1 | uniq
Hello 95
Linux 85
test 30
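The effect of sorting first can be checked by counting output lines. This is a minimal sketch on hypothetical inline data that mirrors the shape of testfile1 (three distinct lines, repeated but never adjacent); the variable names are illustrative only.

```shell
# Hypothetical sample: duplicates exist, but no two identical lines are adjacent.
data=$(printf '%s\n' 'test 30' 'Hello 95' 'Linux 85' 'test 30' 'Hello 95' 'Linux 85')

# uniq alone sees no adjacent duplicates, so every line passes through.
without_sort=$(printf '%s\n' "$data" | uniq | wc -l | tr -d ' ')

# After sort, identical lines sit next to each other and uniq can collapse them.
with_sort=$(printf '%s\n' "$data" | sort | uniq | wc -l | tr -d ' ')

echo "uniq alone: $without_sort lines; sort | uniq: $with_sort lines"
```

The tr call only strips padding that some wc implementations add around the count.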
Count the number of times each line appears in the file:
$ sort testfile1 | uniq -c
      3 Hello 95
      3 Linux 85
      3 test 30
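When the counts differ, a common follow-up is to order the result by frequency with a second sort. The sketch below assumes hypothetical inline data with unequal counts; the awk step is only there to strip the padding that uniq -c puts in front of each count.

```shell
# Hypothetical sample: "test 30" x3, "Linux 85" x2, "Hello 95" x1, not adjacent.
ranked=$(printf '%s\n' 'test 30' 'Hello 95' 'Linux 85' 'test 30' 'Linux 85' 'test 30' |
  sort | uniq -c | sort -rn |
  awk '{ print $1, $2, $3 }')   # normalize spacing: count, then the two fields

printf '%s\n' "$ranked"
```

The first sort groups identical lines for uniq -c, and sort -rn then orders the counted lines numerically from most to least frequent.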
Find duplicate lines in the file:
$ sort testfile1 | uniq -d
Hello 95
Linux 85
test 30
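The -u option is the counterpart of -d: it keeps only the lines that occur exactly once. Every line in testfile1 is repeated, so the sketch below uses a hypothetical sample in which only one line is duplicated, to show both options side by side.

```shell
# Hypothetical sample: only "test 30" occurs more than once.
sample=$(printf '%s\n' 'test 30' 'Hello 95' 'test 30' 'Linux 85')

# -u: lines that appear exactly once in the sorted input.
unique_only=$(printf '%s\n' "$sample" | sort | uniq -u)

# -d: one copy of each line that appears more than once.
repeated_only=$(printf '%s\n' "$sample" | sort | uniq -d)

printf '%s\n' 'unique:' "$unique_only" 'repeated:' "$repeated_only"
```

Together the two outputs partition the distinct lines of the input: each distinct line shows up in exactly one of them.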