
Master Linux text processing: learn how to use sed for quick edits and awk for powerful field-based reports, with examples, tips, and command pipelines.
In the world of Linux command‑line text processing, two tools reign supreme: sed (stream editor) and awk (pattern scanning and processing language). These tools provide extraordinary flexibility for transforming, filtering, and reporting data in plain-text files. Whether you’re cleaning log files, generating reports, or automating edits, mastering sed and awk can vastly improve your efficiency.
In this post, we dive deep into:
- What `sed` is and its most common one-liner edits
- What `awk` is and how it handles fields and records
- A side-by-side comparison of the two tools
- Practical examples, from log cleanup to CSV reports, plus a combined workflow
Let's decode the strengths of `sed` and `awk`, and understand when to use each.
What Is sed?
`sed`, the stream editor, reads text line by line, applies editing rules, and writes to standard output (or a file). It's ideal for:
- Find-and-replace substitutions, globally or on selected lines
- Deleting, inserting, or printing specific lines or ranges
- Quick in-place edits without opening an editor
Task | Command
---|---
Replace ‘foo’ with ‘bar’ in a file | `sed 's/foo/bar/g' input.txt`
Save changes back to the same file | `sed -i 's/foo/bar/g' input.txt`
Delete blank lines | `sed '/^$/d' input.txt`
Print lines 10–20 | `sed -n '10,20p' input.txt`
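One caution on `-i`: it rewrites the file with no undo. GNU sed (and BSD/macOS sed) lets you attach a backup suffix to `-i` so the original survives; a small sketch:
# The pre-edit file is preserved as input.txt.bak
sed -i.bak 's/foo/bar/g' input.txt
diff input.txt.bak input.txt   # review the changes
rm input.txt.bak               # remove the backup once satisfied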
What Is awk?
`awk` is a full-fledged scripting language optimized for text processing, especially tables and columns. It splits input into records (lines) and fields (columns), allowing:
- Selecting and printing individual fields
- Arithmetic and aggregation across records
- Per-record conditionals and pattern-based filtering
Task | Command
---|---
Print the 2nd and 5th columns of a space‑delimited file | `awk '{ print $2, $5 }' data.txt`
Sum values in the 3rd column | `awk '{ total += $3 } END { print total }' data.txt`
Filter rows where the 1st column > 100 | `awk '$1 > 100' data.txt`
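By default `awk` splits on whitespace; the `-F` option swaps in any separator. A quick illustration against the colon-delimited `/etc/passwd` found on any Linux system:
# Print each username (field 1) and its login shell (field 7)
awk -F: '{ print $1, $7 }' /etc/passwd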
Here’s a comparison to help you choose the right tool based on the task:
Use Case | sed | awk |
---|---|---|
Simple find-and-replace | Excellent | Possible (but verbose) |
Delete or insert specific lines | Easy | More complex |
Column selection and printing | Not designed for this | Tailor‑made |
Arithmetic on fields | Not supported | Native support |
Formatted reports (e.g., aligned data) | Limited | Powerful |
Multi-line context matching (via N) | Supported | Supported |
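The last row is worth a quick demo: `sed`'s `N` command appends the next input line to the pattern space, letting a substitution see across the line boundary. A minimal sketch that joins every pair of lines:
# N pulls in the following line; replacing the embedded \n joins lines 1+2, 3+4, ...
sed 'N; s/\n/ /' input.txt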
🔄 Example 1: Clean Up Log File (Remove Timestamps, Extract IPs)
Using `sed` to strip leading timestamps in `access.log`:
sed -r 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8} //g' access.log > cleaned.log
Then `awk` to extract the client IP (1st field) and endpoint (7th field):
awk '{ print $1, $7 }' cleaned.log | head -n 20
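The intermediate file is optional; the same result streams through one pipeline (assuming the same `access.log` layout):
# Strip the timestamp, then print IP and endpoint for the first 20 requests
sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8} //' access.log | awk '{ print $1, $7 }' | head -n 20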
🔄 Example 2: Generate Summary Report from CSV
Suppose `sales.csv` has the header `date,product,quantity,price`. Sum total revenue per product using `awk`:
awk -F, '
NR > 1 {
revenue = $3 * $4
total[$2] += revenue
}
END {
printf "%-15s %10s\n", "Product", "Revenue"
for (p in total) printf "%-15s %10.2f\n", p, total[p]
}
' sales.csv
🔹 Explanation
- `-F,` sets the field separator to a comma for CSV input
- `NR > 1` skips the header row (`NR` is the current record number)
- `revenue = $3 * $4` multiplies quantity by price for each sale
- `total[$2] += revenue` accumulates per-product revenue in an associative array keyed on the product name
- The `END` block prints a `printf`-formatted report once all rows are read
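To make this concrete, here is a hypothetical `sales.csv` (invented rows, purely for illustration) and the report it would yield; note that awk's `for (p in total)` iterates in unspecified order, so rows may appear in a different sequence:
# sales.csv (hypothetical sample)
date,product,quantity,price
2025-08-01,widget,2,3.50
2025-08-01,gadget,1,10.00
2025-08-02,widget,4,3.50
# Resulting report: widget = 2*3.50 + 4*3.50 = 21.00, gadget = 1*10.00 = 10.00
Product            Revenue
widget               21.00
gadget               10.00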
🔄 Example 3: In-Place File Audit with awk and sed
Imagine verifying config file lines containing `timeout` values > 30. Extract and test with `awk`:
awk -F'=' '/timeout/ && $2 > 30 { print FILENAME ": " $0 }' *.conf
Or edit in place with `sed` to adjust too-high timeouts (`\1` re-inserts the captured `timeout = ` prefix, and `30` is the literal new value):
sed -i -r 's/(timeout *= *)([0-9]{2,})/\130/' *.conf
One caveat: the regex matches any two-or-more-digit value, so it also rewrites timeouts between 10 and 30; `sed` cannot compare numbers.
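For a true numeric test while editing, a hedged sketch using GNU awk's inplace extension (gawk ≥ 4.1; not available in mawk or BSD awk) might look like this:
# Lower only timeouts numerically greater than 30; all other lines pass through.
# Reassigning $2 drops the space after '=', e.g. 'timeout = 45' becomes 'timeout =30'.
gawk -i inplace 'BEGIN { FS = OFS = "=" } /timeout/ && $2 + 0 > 30 { $2 = 30 } { print }' *.conf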
🔍 Visualizing sed vs awk – Suitability Chart
[sed] ──────────────•  Simple global edits (replace, delete, insert)
           │
           │  (some overlap: awk can do edits via scripting)
           │
[awk] ─────•────────•  Column-based filters, arithmetic, formatted reporting
                     \
                      •  Efficient with structured data (CSV, logs, etc.)
A feature worth highlighting: every `awk` program can run setup and teardown code. The `BEGIN` block executes before any input is read, and the `END` block after the last record, which is ideal for headers, footers, and totals:
awk 'BEGIN { print "Header" } { ... } END { print "Footer" }'
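A runnable version of that skeleton (the filename is arbitrary), counting lines between a header and a footer:
# BEGIN runs once before input, the middle block once per line, END once at the end
awk 'BEGIN { print "Report start" } { n++ } END { printf "Processed %d lines\n", n }' input.txt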
Scenario | Recommended Tool |
---|---|
Replace or delete strings across many files | sed |
Extract specific columns from a space-delimited log | awk |
Generate table summaries or reports (e.g., CSV) | awk |
Quickly strip unwanted text from logs | sed |
Complex per-record conditionals or math operations | awk |
Let's combine both tools in a useful workflow. Suppose you have `server.log` lines like:
2025-08-31 12:34:56 INFO User john.doe logged in: 192.168.1.10
🔹 Step 1: Remove timestamp with sed
sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8} //g' server.log > no_stamp.log
🔹 Step 2: Summarize logins per IP with awk
awk '{ count[$NF]++ } END {
printf "%-15s %s\n", "IP Address", "Count"
for (ip in count) printf "%-15s %d\n", ip, count[ip]
}' no_stamp.log
🔹 Step 3 (Optional): Sort descending by count:
awk '{ count[$NF]++ } END {
for (ip in count) print count[ip], ip
}' no_stamp.log | sort -nr
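The whole workflow also collapses into a single pipeline (same assumptions about the log layout as above):
# Strip timestamps, count logins per IP, and list the busiest IPs first
sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:]{8} //' server.log | awk '{ count[$NF]++ } END { for (ip in count) print count[ip], ip }' | sort -nr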
This pipeline is fast, scriptable, and replicable—a hallmark of efficient Linux text processing.
Understanding the balance between `sed` and `awk` empowers you to handle a wide range of text-processing tasks. If you're doing simple replacements or line edits, `sed` is quick and lightweight. For data-driven tasks (reporting, filtering, arithmetic), `awk` shines.
Harnessing both in tandem lets you:
- Clean and normalize raw text with `sed`
- Filter, aggregate, and report on the result with `awk`
- Chain the steps into fast, repeatable shell pipelines
Now go experiment: armed with these examples, you’re ready to tackle real-world text processing in Linux like a pro.
Did you find this article helpful? Your feedback is invaluable to us! Feel free to share this post with those who may benefit, and let us know your thoughts in the comments section below.