Eliminating Leading and Trailing Whitespace with IFS in Bash Scripts
The Problem: Extra Whitespace Causing Issues in Scripts
Whitespace characters like spaces, tabs, and newlines are ubiquitous in text data. Often, these characters inadvertently get introduced at the beginning or end of strings read into Bash scripts. This can cause problems when processing or passing around such values in scripts.
For example, a script may read a filename from user input or a configuration file. If the input contains unexpected leading or trailing spaces, using that value directly can fail to find the right file. Trimming the whitespace ensures the input matches expectations.
Some common issues that arise from leading/trailing whitespace in Bash scripts include:
- Problems accessing files or directories due to hidden whitespace in names
- String comparisons failing due to discrepancies in whitespace
- Syntax errors when a value containing whitespace is expanded unquoted inside a test or command
- Configuration values and arguments getting split incorrectly
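As a small illustration of the comparison problem listed above, the sketch below uses a made-up value, answer, standing in for something read from a config file with stray spaces around it.

```shell
# Hypothetical value, as it might arrive from a config file with stray spaces
answer=" yes "

if [ "$answer" = "yes" ]; then
    result="matched"
else
    result="not matched"
fi
echo "$result"    # not matched
```

The value looks correct when printed, which is what makes this class of bug hard to spot.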
Thankfully, Bash provides easy ways to eliminate extraneous whitespace from string variables within scripts.
Understanding IFS and Word Splitting
To understand how whitespace processing works in Bash, we need to look at two key concepts – IFS and word splitting.
IFS – Internal Field Separator
The IFS shell variable designates the characters Bash treats as separators between words or fields. Its default value consists of three characters: space, tab, and newline.
Bash uses these characters when it splits the result of an unquoted expansion into words and when the read builtin splits a line into fields. With the default value, read also strips leading and trailing whitespace from the line it reads. This helps facilitate operations like iterating over the words in a string.
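The following sketch checks the default value and shows IFS being overridden for a single command. The colon-separated record and the variable names user, pass, and uid are assumptions made up for the example.

```shell
# The default IFS holds exactly three characters: space, tab, newline
echo "${#IFS}"    # 3 (when IFS has not been changed)

# Hypothetical colon-separated record, split by overriding IFS for one command
record="alice:x:1000"
IFS=: read -r user pass uid <<EOF
$record
EOF
echo "$user $uid"    # alice 1000
```

Placing the assignment directly in front of read limits the change to that one command, so the rest of the script keeps the default separators.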
Word Splitting
By default, Bash performs “word splitting” on unquoted parameter expansions, treating each character in IFS as a delimiter that breaks the expansion into separate words.
For example, if a variable fruit contains “apple banana”, the unquoted expansion $fruit splits into the two words “apple” and “banana” at the space, whereas the quoted expansion "$fruit" stays a single word.
This behavior is often useful for iterating over strings as distinct words. However, it also means that an unquoted expansion silently drops leading and trailing whitespace and breaks a value such as a filename containing spaces into several arguments.
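A quick way to observe word splitting is to count the arguments a command receives. In this sketch, count_args is a hypothetical helper defined only for the demonstration.

```shell
# count_args is a hypothetical helper that reports how many arguments it received
count_args() { echo "$#"; }

fruit="apple banana"
unquoted=$(count_args $fruit)      # split on the space
quoted=$(count_args "$fruit")      # kept as one word
echo "$unquoted $quoted"           # 2 1

# Unquoted, the padding disappears; quoted, it is preserved
padded="  apple  "
printf '[%s]\n' $padded            # [apple]
printf '[%s]\n' "$padded"          # [  apple  ]
```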
Using Parameter Expansion to Trim Leading and Trailing Whitespace
Bash provides a powerful parameter expansion option to selectively remove matching leading or trailing substrings from variables.
By specifying a wildcard pattern to strip from the start or end of the parameter’s value, we can effectively eliminate extraneous whitespace.
${parameter##word} – Remove Matching Prefix Pattern
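The ## prefix operator strips the longest match of a pattern from the start of the parameter’s value. The pattern is a glob, not a regular expression, so * matches any run of characters rather than repeating the preceding one. That makes an obvious-looking pattern such as "* " unsafe for trimming. Let’s look at what it does, and at a two-step form that removes only the leading whitespace:
    foo=" hello world "
    echo "[${foo}]"            # [ hello world ]
    # "* " matches everything up to the LAST space, so the whole value is removed
    echo "[${foo##* }]"        # []
    # Isolate the leading whitespace, then strip exactly that prefix
    lead="${foo%%[![:space:]]*}"
    echo "[${foo#"$lead"}]"    # [hello world ]
    bar=$'\t\tfoo.txt '
    echo "[${bar}]"            # [		foo.txt ]
    # The same two steps strip leading tabs
    echo "[${bar#"${bar%%[![:space:]]*}"}]"    # [foo.txt ]
As you can see, the two-step form removes only the whitespace before the first non-whitespace character, however many spaces or tabs there are, so no exact pattern needs to be hard-coded.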
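To make "longest match" concrete, it may help to compare ## with its single-character sibling #, which removes the shortest match instead. The path value below is an assumption chosen only for illustration.

```shell
# Hypothetical path used only for illustration
path="usr/local/bin"

shortest="${path#*/}"     # removes "usr/"
longest="${path##*/}"     # removes "usr/local/"
echo "$shortest"          # local/bin
echo "$longest"           # bin
```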
${parameter%%word} – Remove Matching Suffix Pattern
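The %% suffix operator does the inverse, stripping the longest match of the pattern from the end of the parameter’s value instead. The same caution applies: " *" matches from the FIRST space onward. Here are some examples:
    foo=" hello world "
    echo "[${foo}]"            # [ hello world ]
    # " *" matches from the FIRST space to the end, so the whole value is removed
    echo "[${foo%% *}]"        # []
    # Isolate the trailing whitespace, then strip exactly that suffix
    trail="${foo##*[![:space:]]}"
    echo "[${foo%"$trail"}]"   # [ hello world]
    bar=$' example.txt \t'
    # The same two steps strip trailing tabs and spaces
    echo "[${bar%"${bar##*[![:space:]]}"}]"    # [ example.txt]
The two-step form removes trailing whitespace of any length and kind without hard-coding specifics.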
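The difference between the shortest-match operator % and the longest-match operator %% can be sketched with a filename. The name archive.tar.gz is a made-up example.

```shell
# Hypothetical filename used only for illustration
file="archive.tar.gz"

shortest="${file%.*}"     # removes ".gz"
longest="${file%%.*}"     # removes ".tar.gz"
echo "$shortest"          # archive.tar
echo "$longest"           # archive
```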
Putting It All Together: Script Example
Let’s walk through a practical script example to see how we can leverage parameter expansion to eliminate leading and trailing whitespace read from input.
Reading Input with Whitespace
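Say we prompt the user to input a filename, but they accidentally add trailing whitespace. With the default IFS, read would strip that whitespace itself, so IFS is emptied for this one command to capture the line exactly as typed:
    IFS= read -r -p "Enter file: " f
    echo "$f"    # "example.txt " (trailing spaces)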
Trimming Whitespace
We can immediately trim it with suffix removal, stripping exactly the run of trailing whitespace:
    f="${f%"${f##*[![:space:]]}"}"
    echo "$f"    # "example.txt" (trailing spaces removed)
And leading whitespace as well, with prefix removal:
    f="${f#"${f%%[![:space:]]*}"}"
    echo "$f"    # "example.txt" (leading/trailing spaces stripped)
Operating on Cleaned Input
With whitespace eliminated, we can reliably pass the filename into commands now:
    cat "$f"         # displays "example.txt" contents correctly
    grep foo "$f"    # searches "example.txt" as expected
Our parameter expansion enabled us to sanitize the filename by removing extraneous whitespace that could have caused issues.
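The trimming steps can also be wrapped in a small reusable function. This is a minimal sketch, not a definitive implementation: the name trim and the sample filename are assumptions introduced here.

```shell
# trim: hypothetical helper that prints its argument without
# leading or trailing whitespace; inner whitespace is left alone
trim() {
    local s="$1"
    s="${s#"${s%%[![:space:]]*}"}"   # strip the leading whitespace run
    s="${s%"${s##*[![:space:]]}"}"   # strip the trailing whitespace run
    printf '%s' "$s"
}

f="   my report.txt  "
f="$(trim "$f")"
echo "[$f]"    # [my report.txt]
```

Because each step removes only the exact run of whitespace it isolated, a filename containing inner spaces survives intact, and a value made up entirely of whitespace becomes an empty string.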
Additional Tips for Managing Whitespace
Here are some other helpful points around handling whitespace in Bash scripts:
- Quote parameter expansions ("$foo") wherever they are used to avoid word splitting
- Use read -r so that backslashes in the input are kept literally instead of being treated as escape characters
- Set IFS to just newline instead of space/tab/newline when parsing structured data
- Use the [[:space:]] character class to match all whitespace characters
- Consider trimming inputs from users/files before further processing
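The interaction between read and IFS mentioned in these tips can itself be used for trimming. The sketch below assumes a single-line value held in a made-up variable, raw: with the default IFS, read strips leading and trailing whitespace, while an empty IFS keeps the line verbatim.

```shell
# Hypothetical single-line value with padding on both sides
raw="   example.txt   "

# Default IFS: read drops leading and trailing whitespace
read -r trimmed <<EOF
$raw
EOF

# Empty IFS for this one command: the line is kept exactly as given
IFS= read -r verbatim <<EOF
$raw
EOF

echo "[$trimmed]"     # [example.txt]
echo "[$verbatim]"    # [   example.txt   ]
```

In Bash, a here-string (read -r trimmed <<< "$raw") is a shorter equivalent of the here-document used above. This approach only handles one line at a time, so multi-line values are better served by the parameter expansion technique.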
Carefully managing whitespace is crucial for robust Bash scripting. Leveraging parameter expansion operators to strip whitespace combined with safe quoting/escaping allows even untrusted inputs to be handled reliably.