The uniq command is a powerful tool in Bash for filtering and manipulating text files. It's particularly useful for removing or identifying duplicate lines in sorted input.
The basic syntax of the uniq command is:
uniq [OPTION]... [INPUT [OUTPUT]]
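By default, uniq reads from standard input and writes to standard output. It compares adjacent lines and, for each run of identical lines, keeps the first and discards the rest. Identical lines that are not next to each other are left untouched.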
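The adjacency rule is easy to check on a small sample. The following is a minimal sketch with made-up input (the variable names are illustrative only): it runs uniq on unsorted data and then again after sorting.

```shell
# Sample input: "apple" appears three times, but only two copies are adjacent.
input=$(printf '%s\n' apple apple banana apple)

# uniq collapses only the adjacent pair, so the trailing "apple" survives.
unsorted_result=$(printf '%s\n' "$input" | uniq)
printf '%s\n' "$unsorted_result"

# Sorting first brings every copy together, so each line appears once.
sorted_result=$(printf '%s\n' "$input" | sort | uniq)
printf '%s\n' "$sorted_result"
```

The first pipeline prints three lines (apple, banana, apple); the second prints two (apple, banana).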
-c: Prefix lines with the number of occurrences
-d: Only print duplicate lines
-u: Only print unique lines
-i: Ignore case when comparing lines
-f N: Skip the first N fields on each line before comparing
sort file.txt | uniq > output.txt
This command first sorts the file (since uniq only compares adjacent lines) and then removes duplicates, writing the result to output.txt.
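To try this without touching real files, the sketch below builds a throwaway file.txt in a temporary directory (it assumes mktemp is available, which is common but not guaranteed everywhere). It also runs sort -u, which for simple whole-line comparisons like this one typically produces the same result in a single step.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' pear apple pear fig apple > "$workdir/file.txt"

# Sort so duplicates become adjacent, then let uniq drop the repeats.
sort "$workdir/file.txt" | uniq > "$workdir/output.txt"
dedup_result=$(cat "$workdir/output.txt")
printf '%s\n' "$dedup_result"

# sort -u performs the sort and the de-duplication in one step.
sort_u_result=$(sort -u "$workdir/file.txt")
printf '%s\n' "$sort_u_result"

# Clean up the temporary directory.
rm -r "$workdir"
```

Both variants print apple, fig, and pear, one per line. The separate uniq stage is still needed when you want its other options, such as -c or -d.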
sort file.txt | uniq -c
This command will output each unique line prefixed with its number of occurrences.
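A common follow-up is to rank lines by how often they occur. The sketch below uses made-up log levels and pipes the counts through sort -rn; the final awk step is only there because the amount of padding uniq -c puts before each count differs between implementations.

```shell
# Sample data: "error" three times, "warn" twice, "info" once.
levels=$(printf '%s\n' error warn error info error warn)

# Count each distinct line, order by count (highest first),
# then normalise the leading padding that uniq -c adds.
counts=$(printf '%s\n' "$levels" |
  sort |
  uniq -c |
  sort -rn |
  awk '{ print $1, $2 }')
printf '%s\n' "$counts"
```

This prints "3 error", "2 warn", and "1 info", in that order.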
sort file.txt | uniq -d
This command will only output lines that appear more than once in the file.
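The -d and -u options split the input into two complementary sets, which the following sketch shows side by side on made-up data (variable names are illustrative).

```shell
# Sample data: "red" and "blue" repeat, "green" and "black" do not.
colors=$(printf '%s\n' red blue red green blue black)

# -d prints one copy of each line that occurs more than once.
repeated=$(printf '%s\n' "$colors" | sort | uniq -d)
printf 'repeated:\n%s\n' "$repeated"

# -u prints only the lines that occur exactly once.
singles=$(printf '%s\n' "$colors" | sort | uniq -u)
printf 'singles:\n%s\n' "$singles"
```

Here -d reports blue and red, while -u reports black and green.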
Sort the input for uniq to work correctly, as it only compares adjacent lines.
Use sort together with uniq for best results.
Combine uniq with Bash Pipes for efficient data flow.
For more complex text processing tasks, you can combine uniq with other command-line tools like grep, sed, or awk. This allows for powerful text manipulation and analysis capabilities.
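As one possible combination, the sketch below counts distinct error messages in a log. The log lines and their layout (date, level, message) are hypothetical, chosen only for illustration.

```shell
# Hypothetical log lines: date, level, then a free-text message.
log=$(printf '%s\n' \
  '2024-05-01 ERROR disk full' \
  '2024-05-01 INFO started' \
  '2024-05-02 ERROR disk full' \
  '2024-05-02 ERROR timeout' \
  '2024-05-03 INFO stopped')

# grep keeps ERROR lines, cut drops the date and level, sort groups equal
# messages, uniq -c counts them, and awk trims the padding uniq -c adds.
error_counts=$(printf '%s\n' "$log" |
  grep ' ERROR ' |
  cut -d ' ' -f 3- |
  sort |
  uniq -c |
  awk '{ n = $1; $1 = ""; print n $0 }')
printf '%s\n' "$error_counts"
```

This prints "2 disk full" followed by "1 timeout". Each stage does one small job, which keeps the pipeline easy to adjust when the log format changes.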
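sort -f -k 3 file.txt | uniq -i -f 2
This command ignores case differences and skips the first two fields when comparing lines, which is useful for data with leading identifiers or timestamps. The sort is keyed on the third field onward (-k 3) with case folded (-f), because a plain sort orders lines by the very fields that uniq skips and can leave matching lines apart.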
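Because uniq -f ignores the leading fields while a plain sort does not, the sort order matters here. The following is a minimal sketch with made-up event lines (an id, a time, and a message), comparing a plain sort against sort -f -k 3, which folds case and sorts on the third field onward. It assumes single-space-separated fields.

```shell
# Hypothetical events: id, time, message. Lines 1 and 3 carry the same
# message apart from letter case.
events=$(printf '%s\n' \
  '1 09:00 Login failed' \
  '2 09:05 Disk full' \
  '3 09:10 login FAILED')

# Plain sort orders by the leading id, so the two login lines stay apart
# and uniq cannot merge them.
plain_count=$(printf '%s\n' "$events" |
  sort | uniq -i -f 2 | awk 'END { print NR }')

# Sorting on field 3 onward, case-folded, makes them adjacent first.
keyed_count=$(printf '%s\n' "$events" |
  sort -f -k 3 | uniq -i -f 2 | awk 'END { print NR }')

printf 'plain sort: %s lines, keyed sort: %s lines\n' \
  "$plain_count" "$keyed_count"
```

With the plain sort all three lines survive; with the keyed sort the two login lines collapse into one, leaving two lines.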
The uniq command is an essential tool for text processing in Bash. Its ability to filter and analyze duplicate lines makes it invaluable for log analysis, data cleaning, and various text manipulation tasks. By mastering uniq and combining it with other Bash commands, you can significantly enhance your text processing capabilities in shell scripting.