Pipes and Redirection
The Unix Philosophy: Small Tools, Big Results
Pipes (|) connect the output of one command to the input of another. Redirection (>, >>, <) connects commands to files. Together, they let you build powerful data processing pipelines from simple building blocks.
This is the Unix philosophy: each command does one thing well, and pipes combine them into workflows that no single command could accomplish alone.
Pipes
The pipe operator | takes the standard output (stdout) of the left command and passes it as standard input (stdin) to the right command.
Basic Pipe Examples
# Count how many POST requests returned 500
grep "POST /api" /var/log/nginx/access.log | grep "500" | wc -l
# Find the 10 most frequent error messages
grep "ERROR" app.log | sort | uniq -c | sort -rn | head -10
# Show unique IP addresses accessing your server
awk '{print $1}' /var/log/nginx/access.log | sort -u | wc -l
# Find the largest files in a directory
du -sh * | sort -rh | head -5
Breaking Down a Complex Pipe
Let us trace through the "10 most frequent error messages" example step by step:
# Step 1: Get all ERROR lines
grep "ERROR" app.log
# Output: hundreds of error lines
# Step 2: Sort them (required for uniq)
grep "ERROR" app.log | sort
# Output: error lines in alphabetical order
# Step 3: Count consecutive duplicates
grep "ERROR" app.log | sort | uniq -c
# Output:
# 47 ERROR: Connection timeout to payment-service
# 23 ERROR: Invalid session token
# 12 ERROR: Database query exceeded timeout
# Step 4: Sort by count (descending, numeric)
grep "ERROR" app.log | sort | uniq -c | sort -rn
# Output: Most frequent errors first
# Step 5: Show only the top 10
grep "ERROR" app.log | sort | uniq -c | sort -rn | head -10
# Output: The 10 most frequent error messages with counts
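As a variation, the whole chain can be collapsed into a single awk pass. This sketch writes a tiny made-up app.log inline so it is self-contained; the counting logic is the point:

```shell
# Tiny stand-in for the app.log used above
printf 'ERROR: timeout\nERROR: timeout\nERROR: bad token\nINFO: ok\n' > app.log

# One pass: count ERROR lines in an awk associative array, then sort by
# count descending -- same result as grep | sort | uniq -c | sort -rn
awk '/ERROR/ { n[$0]++ } END { for (l in n) print n[l], l }' app.log | sort -rn | head -10
```

The multi-stage pipe is usually easier to debug, since you can inspect the output after each stage; the awk version avoids two sorts of the full data set.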
Redirection
Redirection connects command input and output to files instead of the terminal.
Output Redirection
# Write output to a file (overwrite)
curl -s https://api.example.com/users > users.json
# Append output to a file
echo "Test run completed at $(date)" >> test-log.txt
# Write to a file and display on screen simultaneously
curl -s https://api.example.com/users | tee users.json
# Append with tee
echo "Another result" | tee -a test-log.txt
Error Redirection
Every process on a Unix-like system has three standard streams:
- stdin (0): Standard input
- stdout (1): Standard output (normal output)
- stderr (2): Standard error (error messages)
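A quick way to see why the distinction matters: a pipe carries only stdout, so anything written to stderr skips the pipe entirely. A minimal demonstration:

```shell
# Write one line to each stream, then pipe into cat:
# the pipe delivers only the stdout line; the stderr line
# bypasses the pipe and goes straight to the terminal
{ echo "to stdout"; echo "to stderr" >&2; } | cat
```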
# Redirect only errors to a file
npm run test 2> errors.log
# Redirect both stdout and stderr to the same file
npm run test > output.log 2>&1
# Redirect stdout and stderr to separate files
npm run test > output.log 2> errors.log
# Discard all output (useful for scripts where you only care about the exit code)
curl -s https://api.example.com/health > /dev/null 2>&1
# Discard errors only (suppress "Permission denied" noise from find)
find / -name "config.json" 2>/dev/null
Input Redirection
# Read input from a file
sort < unsorted-list.txt
# Here document (inline input)
cat <<EOF > test-config.json
{
  "baseURL": "https://staging.example.com",
  "timeout": 30000
}
EOF
# Here string (single line input)
grep "error" <<< "This has an error in it"
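One detail worth knowing about here documents: quoting the delimiter controls variable expansion. A minimal illustration (ENV_NAME is just an example variable):

```shell
ENV_NAME="staging"

# Unquoted delimiter: the shell expands variables inside the here document
cat <<EOF
deploy target: $ENV_NAME
EOF
# Prints: deploy target: staging

# Quoted delimiter: the body is passed through literally, no expansion
cat <<'EOF'
deploy target: $ENV_NAME
EOF
# Prints: deploy target: $ENV_NAME
```

The quoted form is handy when the document itself contains shell syntax, such as a script you are writing to a file.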
Essential Pipe Commands
These commands are most useful when combined with pipes:
sort
# Sort alphabetically
sort names.txt
# Sort numerically
sort -n numbers.txt
# Sort in reverse
sort -r names.txt
# Sort by specific column (e.g., 3rd column)
sort -t',' -k3 data.csv
# Sort numerically, reverse (most common in pipe chains)
sort -rn
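One gotcha with column sorts: -k alone compares lexically, so "10" sorts before "9". Appending n to the key makes the comparison numeric. A small sketch with an inline, made-up CSV:

```shell
printf 'alice,dev,9\nbob,qa,10\n' > data.csv

# Lexical sort of column 3: "10" < "9" because '1' < '9'
sort -t',' -k3 data.csv     # bob,qa,10 comes first
# Numeric sort of column 3: 9 < 10
sort -t',' -k3n data.csv    # alice,dev,9 comes first
```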
uniq
uniq removes only consecutive duplicate lines, so always sort before piping into it.
# Remove duplicates
sort names.txt | uniq
# Count occurrences
sort names.txt | uniq -c
# Show only duplicates
sort names.txt | uniq -d
# Show only unique lines (no duplicates)
sort names.txt | uniq -u
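The consecutive-only behavior is easy to demonstrate inline:

```shell
# Without sorting, the two a's are not adjacent, so uniq keeps both
printf 'a\nb\na\n' | uniq
# Output: a, b, a

# Sorting first makes duplicates adjacent, so uniq can remove them
printf 'a\nb\na\n' | sort | uniq
# Output: a, b
```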
wc (Word Count)
# Count lines
wc -l file.txt
# Count words
wc -w file.txt
# Count characters
wc -c file.txt
# Count lines from pipe (most common usage)
grep "ERROR" app.log | wc -l
head and tail
# First 10 lines (default)
head file.txt
# First 20 lines
head -20 file.txt
# Last 10 lines (default)
tail file.txt
# Last 50 lines
tail -50 file.txt
# Follow a file in real-time (live log monitoring)
tail -f /var/log/app/application.log
# Follow and show the last 100 lines first
tail -n 100 -f /var/log/app/application.log
cut
# Extract specific columns from delimited data
cut -d',' -f1,3 data.csv # Fields 1 and 3, comma-delimited
cut -d':' -f1 /etc/passwd # Username from passwd file
cut -c1-10 file.txt # First 10 characters of each line
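A limitation worth knowing: cut splits on exactly one delimiter character, so a run of spaces produces empty fields. awk splits on runs of whitespace by default, which usually fits aligned output like ls -l or ps:

```shell
# Two spaces between the words: cut's field 2 is the empty string
echo "alpha  beta" | cut -d' ' -f2
# Output: (empty line)

# awk treats any run of whitespace as a single separator
echo "alpha  beta" | awk '{print $2}'
# Output: beta
```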
tr (Translate)
# Convert to uppercase
echo "hello" | tr 'a-z' 'A-Z'
# Output: HELLO
# Replace spaces with newlines
echo "one two three" | tr ' ' '\n'
# Delete specific characters
echo "Hello, World!" | tr -d '!,'
# Output: Hello World
# Squeeze repeated characters
echo "hello world" | tr -s ' '
# Output: hello world
xargs
# Pass pipe output as arguments to a command
find . -name "*.spec.ts" | xargs grep "test.only"
# Delete files found by find
find /tmp -name "*.log" -mtime +7 | xargs rm
# Run a command for each line of input
cat urls.txt | xargs -I {} curl -s -o /dev/null -w "{}: %{http_code}\n" {}
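A caveat on the find | xargs pattern above: plain xargs splits its input on whitespace, so filenames containing spaces break it. find's -print0 with xargs -0 passes null-delimited names safely. This sketch builds its own scratch directory rather than touching real files:

```shell
# Scratch directory containing a filename with a space in it
dir=$(mktemp -d)
touch "$dir/old report.log"

# -print0 / -0: names are null-delimited, so the space is harmless
find "$dir" -name "*.log" -print0 | xargs -0 rm -f

ls -A "$dir"    # nothing left: the file was deleted despite the space
```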
Practical QA Pipe Chains
Analyzing Nginx Access Logs
# Top 10 most requested URLs
awk '{print $7}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10
# Top 10 IP addresses by request count
awk '{print $1}' /var/log/nginx/access.log | sort | uniq -c | sort -rn | head -10
# Count requests per HTTP status code
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
# All 500 errors with timestamps
grep " 500 " /var/log/nginx/access.log | awk '{print $4, $7}' | tail -20
Analyzing Test Results
# Count passed/failed results in JUnit XML reports
# (with multiple files, grep -c prints one file:count line per file)
grep -c 'status="passed"' test-results/*.xml
grep -c 'status="failed"' test-results/*.xml
# Find all test files that contain .only (accidentally focused tests)
grep -rl "\.only" tests/ --include="*.spec.ts"
# List all test descriptions
grep -rh "test\('" tests/ --include="*.spec.ts" | sed "s/.*test('//" | sed "s/',.*//" | sort
Monitoring and Health Checks
# Check multiple endpoints and show status
for url in https://api.example.com/health https://web.example.com https://admin.example.com; do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 "$url")
  echo "$url: $STATUS"
done
# Monitor a log file for errors and send an alert
# (read -r keeps backslashes in the log line intact)
tail -f /var/log/app/application.log | grep --line-buffered "CRITICAL" | while read -r line; do
  echo "ALERT: $line" | mail -s "Critical Error" qa-team@example.com
done
Process Substitution
Process substitution (a bash and zsh feature, not available in plain sh) lets you use a command's output as if it were a file:
# Compare the output of two commands
diff <(curl -s https://staging.example.com/api/config) <(curl -s https://prod.example.com/api/config)
# Compare sorted output
diff <(sort file1.txt) <(sort file2.txt)
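Process substitution also pairs well with comm, which compares two sorted inputs line by line. In this sketch the file names and contents are made up; comm -23 keeps lines unique to the first input, -13 lines unique to the second:

```shell
printf 'login\nsearch\ncheckout\n' > expected-tests.txt
printf 'login\nsearch\nsignup\n'   > actual-tests.txt

# Tests that were expected but did not run
comm -23 <(sort expected-tests.txt) <(sort actual-tests.txt)
# Output: checkout

# Tests that ran but were not expected
comm -13 <(sort expected-tests.txt) <(sort actual-tests.txt)
# Output: signup
```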
Hands-On Exercise
- Use a pipe chain to find the 5 most common words in a text file
- Redirect both stdout and stderr of a test run to separate files
- Use tail -f to monitor a log file while running tests in another terminal
- Build a pipe that analyzes an access log: count requests by status code, showing the top 5
- Use tee to save curl output to a file while also piping it through jq
- Write a one-liner that finds all .spec.ts files containing test.only and lists them with line numbers