The uniq Command in Bash

The uniq command filters adjacent duplicate lines out of text input. It is typically run on sorted input to remove, count, or identify duplicate lines.

Basic Syntax and Usage

The basic syntax of the uniq command is:

uniq [OPTION]... [INPUT [OUTPUT]]

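By default, uniq reads from standard input and writes to standard output. It compares each line with the one before it and collapses every run of identical adjacent lines into a single line. Note that a second file argument is treated as the output file and is overwritten; it is not read as additional input.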
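The adjacency rule is easiest to see on a tiny input. The following is a minimal sketch; the sample words and variable names are made up for illustration.

```shell
# Sample input: "apple" appears three times, but only the first two are adjacent.
input=$(printf '%s\n' apple apple banana apple)

# uniq alone collapses only the adjacent pair, so "apple" still appears twice.
unsorted_result=$(printf '%s\n' "$input" | uniq)

# Sorting first brings all copies together, so each line survives exactly once.
sorted_result=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq only:\n%s\n\n' "$unsorted_result"
printf 'sort | uniq:\n%s\n' "$sorted_result"
```

The first result still contains two "apple" lines because the last copy is separated from the others by "banana".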

Common Options

  • -c: Prefix each output line with the number of times it occurred
  • -d: Print only repeated lines, one copy per group
  • -u: Print only lines that are not repeated
  • -i: Ignore case when comparing lines
  • -f N: Skip the first N blank-separated fields on each line before comparing
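As a rough illustration of how -d, -u, and -i differ, the sketch below runs each option on a small, already sorted sample. The data and variable names are hypothetical.

```shell
# Hypothetical sample data, already sorted so duplicates are adjacent.
sample=$(printf '%s\n' apple apple banana cherry cherry cherry)

# -d: one copy of each line that occurs more than once.
repeated=$(printf '%s\n' "$sample" | uniq -d)

# -u: only the lines that occur exactly once.
singles=$(printf '%s\n' "$sample" | uniq -u)

# -i: "Apple", "apple", and "APPLE" count as the same line;
# the first line of the run is the one that is kept.
folded=$(printf '%s\n' Apple apple APPLE | uniq -i)

printf 'uniq -d:\n%s\n' "$repeated"
printf 'uniq -u:\n%s\n' "$singles"
printf 'uniq -i:\n%s\n' "$folded"
```

Together, the -d and -u outputs account for every distinct line in the input: each one appears in exactly one of the two.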

Practical Examples

1. Remove Duplicate Lines

sort file.txt | uniq > output.txt

This command first sorts the file (since uniq only compares adjacent lines) and then removes duplicates, writing the result to output.txt.

2. Count Occurrences of Lines

sort file.txt | uniq -c

This command will output each unique line prefixed with its number of occurrences.
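A common follow-up is to rank lines by frequency by piping the counts into a second, numeric sort. This is a sketch with made-up sample data; the exact padding in front of each count varies between uniq implementations.

```shell
# Hypothetical sample data: pear x3, apple x2, fig x1.
words=$(printf '%s\n' pear apple pear fig apple pear)

# Count each distinct line, then sort by the count, highest first.
#   sort      groups identical lines together
#   uniq -c   prefixes each distinct line with its count
#   sort -rn  orders the result numerically, in reverse
ranking=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn)

printf '%s\n' "$ranking"
```

The most frequent line ("pear" in this sample) comes first. Adding `| head -n 5` would keep only the top five entries.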

3. Find Duplicate Lines

sort file.txt | uniq -d

This command outputs one copy of each line that appears more than once in the file.

Important Considerations

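  • uniq compares only adjacent lines, so duplicates that are not next to each other are left in place. Sort the input first unless you only want to collapse consecutive repeats.
  • When you only need to remove duplicates, sort -u file.txt typically gives the same result as sort file.txt | uniq. Use uniq when you also need options such as -c, -d, or -u.
  • uniq processes its input as a stream, so it works well in a pipeline on large files; the sort step is usually the expensive part.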

Advanced Usage

For more complex text processing tasks, you can combine uniq with other commands such as grep, sed, or awk: filter or extract the part of each line you care about first, then sort the result and pass it to uniq.
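As one possible combination, the sketch below counts how many 404 responses each client produced. The log layout (client IP, request path, HTTP status) and all values are assumptions made up for this example, not a real log format.

```shell
# Hypothetical access-log lines: client IP, request path, HTTP status.
log=$(printf '%s\n' \
  '10.0.0.1 /index.html 200' \
  '10.0.0.2 /missing 404' \
  '10.0.0.1 /about.html 200' \
  '10.0.0.2 /old-page 404' \
  '10.0.0.3 /missing 404')

# grep keeps the lines ending in 404, awk extracts the IP column,
# and sort | uniq -c | sort -rn counts and ranks the clients.
not_found_by_ip=$(printf '%s\n' "$log" \
  | grep ' 404$' \
  | awk '{print $1}' \
  | sort | uniq -c | sort -rn)

printf '%s\n' "$not_found_by_ip"
```

Extracting the field before sorting matters here: uniq then compares only the IP addresses rather than whole log lines.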

Example: Ignore Case and Skip Fields

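sort -f -k 3 file.txt | uniq -i -f 2

This command ignores case differences and skips the first two fields when comparing lines, which is useful for data with leading identifiers or timestamps. The sort step must order lines by the same key that uniq compares (case-folded, from the third field onward). A plain sort orders by the leading fields, so lines that differ only in those fields may not end up adjacent and would not be merged.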
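To see field skipping in action, here is a small sketch on hypothetical log entries whose first two fields are a date and a time. It sorts on the third field onward with case folding (sort -f -k 3), so that lines uniq treats as equal are adjacent; LC_ALL=C is set only to keep the ordering predictable across systems.

```shell
# Hypothetical log entries: date, time, then a free-text message.
entries=$(printf '%s\n' \
  '2024-01-01 10:00 Disk full' \
  '2024-01-01 10:05 disk full' \
  '2024-01-02 09:00 Login ok' \
  '2024-01-02 11:30 disk FULL')

# Sort by the message (field 3 to end of line), ignoring case, then let
# uniq skip the two timestamp fields and compare the rest case-insensitively.
distinct_messages=$(printf '%s\n' "$entries" \
  | LC_ALL=C sort -f -k 3 \
  | uniq -i -f 2)

printf '%s\n' "$distinct_messages"
```

The result contains one line per distinct message. Adding -c to the uniq call would also show how often each message occurred.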

Conclusion

The uniq command filters and analyzes duplicate lines, which makes it useful for log analysis, data cleaning, and similar text manipulation tasks. Combined with sort and other commands in a pipeline, it handles deduplication and frequency counting in shell scripts.