IFS: The Internal Field Separator Super-Variable in Bash
What is IFS and Why It Matters
The IFS shell variable, short for Internal Field Separator, controls a vital behavior in Bash: how the results of unquoted expansions, and the lines consumed by read, are split into separate words. By default, IFS contains the space, tab, and newline characters. When Bash encounters these characters in such input, it treats them as boundaries between words.
Understanding IFS is critical for writing effective Bash scripts. The default splitting behavior is convenient for interactive shells. But in scripts, it can cause unintended consequences like lost whitespace and split variable values. Modifying IFS is a common technique to alter this splitting for better script reliability.
For example, consider reading a file line by line using the default $IFS. Each line is kept whole, but read strips leading and trailing spaces and tabs, and without the -r option it also consumes backslashes:
while read line; do
echo "$line"
done < "file.txt"
This demonstrates the importance of IFS in scripts: proper handling can prevent bugs.
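With the default IFS, read trims leading and trailing whitespace, and without -r it treats backslashes as escapes. A minimal sketch of the usual remedy, assuming the goal is to read each line verbatim; the helper name read_lines_verbatim and the bracket formatting are illustrative only:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every line of stdin wrapped in brackets,
# so preserved leading/trailing whitespace is visible.
read_lines_verbatim() {
    local line
    # IFS= applies to this read command only; -r keeps backslashes literal.
    # The || [[ -n $line ]] clause also handles a last line with no newline.
    while IFS= read -r line || [[ -n $line ]]; do
        printf '[%s]\n' "$line"
    done
}

printf '%s\n' '  padded  ' 'back\slash' | read_lines_verbatim
```

Because the IFS assignment is a prefix to read, the rest of the script keeps its normal splitting behavior.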
Definition of IFS
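IFS is a shell variable that controls field splitting. The default value contains a space, a tab, and a newline. Because all three are invisible, echo "$IFS" prints what looks like blank lines; printf with %q shows the value in escaped form:
$ printf '%q\n' "$IFS"
$' \t\n'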
Bash uses the characters in $IFS to delimit words when it splits the results of unquoted parameter expansions, command substitutions, and arithmetic expansions, and when read divides a line into fields. Input separated by these characters is split into separate words.
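One way to observe field splitting is to count the arguments a command receives. The sketch below uses a hypothetical helper, count_args, which is not part of the original text:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it was called with.
count_args() { echo "$#"; }

value=$'one two\tthree\nfour'

unquoted=$(count_args $value)    # split on space, tab, and newline
quoted=$(count_args "$value")    # quoting suppresses field splitting

echo "unquoted: $unquoted, quoted: $quoted"
```

The unquoted expansion arrives as four arguments, the quoted one as a single argument.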
Explanation of Field Splitting Behavior
To understand IFS splitting, consider a simple variable assignment:
$ var="first second"
$ echo "$var"
first second
Although the string contains a space, echo prints it properly as a single entity. This is because variable expansions in double quotes preserve the integrity of the stored string.
However, unquoted expansions are subject to field splitting, and file names containing spaces are a common trigger:
$ touch 'text file.txt'
$ for f in $(ls); do echo "$f"; done
text
file.txt
Here the unquoted command substitution was split at the space, turning one file name into two words. This demonstrates how IFS induces splitting on word boundaries.
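A sketch of a safer way to iterate over file names that contain spaces, assuming a glob can replace parsing command output. The temporary directory and the names array are illustrative:

```shell
#!/usr/bin/env bash
# Work in a throwaway directory so the example does not touch real files.
tmpdir=$(mktemp -d)
touch "$tmpdir/text file.txt"

# A glob expands to one word per file name; its results are not field-split.
names=()
for f in "$tmpdir"/*; do
    names+=("${f##*/}")      # keep only the base name
done

printf 'found %d file(s): %s\n' "${#names[@]}" "${names[0]}"
rm -rf "$tmpdir"
```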
Examples of Issues Caused by Default IFS
As mentioned earlier, the default $IFS can cause unintended field splitting in scripts. An unquoted expansion is split at every space:
$ var="first second"
$ for word in $var; do echo "$word"; done
first
second
Quoting matters for multi-line data as well. A quoted expansion keeps the embedded newline intact:
$ var=$'line 1\nline 2'
$ echo "$var"
line 1
line 2
Unquoted, the same value is split on both spaces and newlines, and echo rejoins the resulting words with single spaces:
$ echo $var
line 1 line 2
These examples demonstrate critical IFS parsing behaviors that can break scripts.
Modifying IFS
To control field splitting, scripts often modify $IFS to contain only the newline character:
IFS=$'\n'
This prevents splitting on spaces and tabs, so those characters are preserved inside each field; unquoted expansions are still split at newlines. Understanding when and how to alter $IFS is essential to robust Bash scripting.
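A minimal sketch of splitting multi-line text into one array element per line with a newline-only IFS. The variable names are illustrative, and the unquoted expansion is safe here only because the sample text contains no glob characters:

```shell
#!/usr/bin/env bash
text=$'alpha beta\ngamma delta'

old_ifs=$IFS           # remember the current value
IFS=$'\n'
lines=($text)          # split at newlines only; spaces stay inside elements
IFS=$old_ifs           # put the previous value back

printf '%d lines; first: %s\n' "${#lines[@]}" "${lines[0]}"
```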
Setting a Custom IFS Value
Values can be assigned directly to $IFS to change delimiters. For example, to split on commas instead of the default characters:
$ IFS=,
$ var="first, second, third"
$ printf '[%s]\n' $var
[first]
[ second]
[ third]
Unquoted expansions now split on commas rather than on spaces and tabs. Note that the space after each comma is no longer a delimiter, so it stays in the field. Always remember to restore the previous $IFS when finished.
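When only a single read needs the custom delimiter, the assignment can prefix the command so that the global value is never changed. A short sketch, with fields as an illustrative array name:

```shell
#!/usr/bin/env bash
var="first, second, third"

# The IFS assignment applies to this read command only.
IFS=, read -r -a fields <<< "$var"

printf '[%s]\n' "${fields[@]}"

# The shell-wide IFS still holds its previous value.
printf 'IFS is now: %q\n' "$IFS"
```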
Code Examples for Common IFS Modifications
Some typical use cases for altering $IFS include:
# Only split on newlines
IFS=$'\n'
# Split on newlines and tabs, but not on spaces
IFS=$'\n\t'
# Parse colon-delimited records
IFS=:
Adjust $IFS based on the parsing requirements of each script.
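As an illustration of the colon-delimited case, the sketch below splits one record into named variables. The record is made-up sample data in the style of /etc/passwd, and the variable names are assumptions:

```shell
#!/usr/bin/env bash
# Made-up sample record; not read from any real file.
record='alice:x:1000:1000:Alice Example:/home/alice:/bin/bash'

# Colon is the only delimiter, so the space in the fifth field is preserved.
IFS=: read -r user pw uid gid gecos home shell <<< "$record"

echo "$user has uid $uid and shell $shell"
```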
IFS Pitfalls to Avoid
Although powerful, IFS comes with some surprising behaviors to watch out for. Failing to account for certain edge cases can still lead to unintended splitting even with a custom $IFS.
Unexpected Field Splitting Behaviors
Consider a file with comma-separated records:
$ cat file.csv
col1,col2,col3
data1,data2,data3
Parsing each record is simple by changing $IFS:
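while IFS=, read -r col1 col2 col3; do
echo "$col1,$col2,$col3"
done < file.csv
But there are hidden parsing flaws here. With IFS set to a comma, whitespace is no longer a delimiter and is no longer trimmed, so stray spaces stay inside fields, and an extra leading comma shifts every column:
col1, col2,col3
,data1,data2,data3
In the first line, col2 keeps its leading space. In the second, col1 is empty and col3 receives the remainder, data2,data3. Always trim and validate inputs when modifying $IFS to avoid surprises.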
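A sketch of trimming a field after a comma split, using only parameter expansion. The trim helper is hypothetical and not part of the original text:

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip leading and trailing whitespace from $1.
trim() {
    local s=$1
    s="${s#"${s%%[![:space:]]*}"}"   # drop leading whitespace
    s="${s%"${s##*[![:space:]]}"}"   # drop trailing whitespace
    printf '%s' "$s"
}

IFS=, read -r col1 col2 col3 <<< 'col1, col2 ,col3'
col2=$(trim "$col2")
echo "[$col1] [$col2] [$col3]"
```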
Problems with Preserving Whitespace
With IFS=$'\n' set, spaces and tabs are no longer delimiters, so they are neither split on nor trimmed. Leading and trailing whitespace stays in the value even when the expansion is unquoted:
$ IFS=$'\n'
$ var=$'\t first second \t'
$ printf '%q\n' $var
$'\t first second \t'
So parsing whitespace-separated records can require other techniques, such as using read with the -d option to change the record delimiter, instead of relying solely on IFS.
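A minimal sketch of read with -d, assuming records that end in a semicolon rather than a newline. The sample data and the records array are illustrative:

```shell
#!/usr/bin/env bash
data=$'  a b ;\tc d\t;'

records=()
# -d ';' ends each record at a semicolon instead of a newline;
# IFS= keeps the surrounding whitespace of each record intact.
while IFS= read -r -d ';' rec; do
    records+=("$rec")
done <<< "$data"

# %q prints each record in escaped form so the whitespace is visible.
printf '%q\n' "${records[@]}"
```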
Considerations for Scripts vs Interactive Shells
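A modified $IFS is never reset automatically. In an interactive shell and in a script alike, the new value stays in effect for the rest of that shell, including every function and sourced file that runs afterwards:
# Set modified IFS
IFS=,
some_function # IFS is still modified inside this (hypothetical) function!
Separately executed commands do not see the change, since IFS is not normally exported, but always save and restore $IFS in scripts to avoid interfering with other logic.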
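One way to limit the scope of a change is to declare IFS local inside a function. The sketch below assumes a hypothetical helper, join_with_commas, and relies on the fact that "$*" joins arguments with the first character of IFS:

```shell
#!/usr/bin/env bash
# Hypothetical helper: join its arguments with commas into the variable "joined".
join_with_commas() {
    local IFS=,          # reverts automatically when the function returns
    joined="$*"          # "$*" joins arguments with the first IFS character
}

join_with_commas a b c
echo "$joined"

# The caller's IFS is unchanged after the call.
printf '%q\n' "$IFS"
```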
Leveraging IFS for Better Scripts
While IFS requires caution, it remains immensely useful for scripting applications like structured text parsing.
Using IFS to Parse Structured Text Data
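With proper boundary awareness, IFS enables easy parsing routines for flat records that use a single-character delimiter. It is not suited to nested formats such as JSON: IFS is a set of characters, not a delimiter string, so IFS='} {' would split on '}', space, and '{' individually. Consider colon-delimited records instead, such as this made-up sample:
alice:admin:/bin/bash
An IFS-based parser handles this nicely:
while IFS=: read -r name role shell; do
: # process "$name", "$role", "$shell"
done < file.txt
The same approach applies to other flat formats like simple CSV and TSV.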
Simplifying for Loops with IFS
IFS also helps iterate over delimited data simply via for loops instead of while read:
$ IFS=','
$ var="1,2,3,4"
$ for x in $var; do echo "$x"; done
1
2
3
4
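An unquoted expansion in a for loop also undergoes pathname expansion, so a field such as * would be replaced by file names. A sketch of guarding against that with set -f, using illustrative variable names:

```shell
#!/usr/bin/env bash
var='1,*,3'

old_ifs=$IFS
IFS=,
set -f                   # disable globbing so the * field stays literal
items=()
for x in $var; do
    items+=("$x")
done
set +f                   # re-enable globbing (assumes it was enabled before)
IFS=$old_ifs

printf '%s\n' "${items[@]}"
```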
Code Examples for Practical IFS Use Cases
Some other handy scripting examples include:
# Iterate over the lines of a file
IFS=$'\n'
for row in $(< file.csv); do
: # process "$row"
done
# Split one comma-separated record into variables
IFS=',' read -r col1 col2 <<< 'data1,data2'
echo "$col1, $col2"
# Parse a line into an array
IFS=' ' read -r -a my_array <<< "first second third"
echo "${my_array[1]}"
When applied properly, IFS remains an invaluable tool for text processing workflows.
Restoring Default IFS
After modifying IFS, even temporarily, the best practice is to restore its previous value. This avoids interference with expected field splitting behaviors.
Resetting to the Default IFS Value
To reset $IFS, simply reassign the default characters:
# Reset field separators
IFS=" "$'\t'$'\n'
Alternatively, unset IFS. Bash does not reinitialize the variable, but while IFS is unset, field splitting behaves as if it held the default value:
unset IFS
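The effect of unsetting IFS can be checked by counting fields before and after. The sketch below uses a hypothetical count_args helper and illustrative variable names:

```shell
#!/usr/bin/env bash
# Hypothetical helper: report how many arguments it was called with.
count_args() { echo "$#"; }

value='a b c,d'

IFS=,
with_comma=$(count_args $value)    # 2 fields: "a b c" and "d"

unset IFS
after_unset=$(count_args $value)   # 3 fields: "a", "b", and "c,d"
is_set=${IFS+set}                  # empty, because the variable stays unset

echo "comma: $with_comma, unset: $after_unset, IFS is ${is_set:-unset}"
```

Splitting returns to the default behavior even though the variable itself remains unset.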
Ensuring Consistent Field Splitting Behavior
Restoring default IFS ensures consistent parsing behavior across scripts and interactive shell use:
# Script logic that modified IFS
IFS=" "$'\t'$'\n' # Reset defaults
# Interact with other commands safely again
ls # Field splitting works normally again
Save and restore IFS around any code that manipulates it for robust workflows.
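A sketch of one save-and-restore pattern, assuming the script should also cope with IFS being unset beforehand. The variable names saved_ifs and ifs_was_set are illustrative:

```shell
#!/usr/bin/env bash
# Save the current state, including the case where IFS is unset.
if [[ ${IFS+set} ]]; then
    saved_ifs=$IFS
    ifs_was_set=1
else
    ifs_was_set=0
fi

IFS=:
read -r -a parts <<< 'usr:local:bin'   # code that needs the custom IFS

# Restore exactly what was there before.
if (( ifs_was_set )); then
    IFS=$saved_ifs
else
    unset IFS
fi

printf '%s\n' "${parts[@]}"
```

Restoring the saved value, rather than hard-coding the default, keeps any customization made by the caller.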