Common Pitfalls When Assigning Output To Variables In Bash

Failing to Properly Quote Variables

One of the most common pitfalls when assigning command output to a variable in Bash is failing to quote the variable when it is later expanded. A plain assignment such as:

out=$(mycommand)

is safe by itself, because Bash does not perform word splitting or filename expansion on the right-hand side of an assignment. The trouble starts when you expand the variable without quotes, for example echo $out or cp $out /backup/: at that point Bash splits the value on whitespace and expands any glob characters it contains, which can lead to incorrect results. It is still good practice to quote the command substitution, and essential to quote the variable wherever it is used:

out="$(mycommand)"
echo "$out"

The quotes tell Bash to treat the value as a single word, so the entire output is passed along intact. Omitting them can cause whitespace to be treated as argument separators, glob characters such as * to expand into filenames, and runs of spaces or newlines to be collapsed. Consistently quoting expansions avoids these parsing issues.

Why Quoting Prevents Problems

When you expand an unquoted variable, Bash applies the following parsing steps to its value:

  1. Word splitting: the value is broken into separate words at every character in IFS, the internal field separator (by default space, tab, and newline).
  2. Filename expansion: any resulting word containing glob characters such as *, ?, or [ is replaced by the list of matching filenames.

This is why stored output containing spaces, newlines, or glob characters gets divided or expanded when the variable is used unquoted. When you quote the expansion, Bash treats the entire value as a single word and skips both steps.
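
A quick illustration of both steps (a minimal sketch; the stored string and the directory contents are made up for the example):

value="foo    bar   *"      # Runs of spaces and a literal *

echo $value                 # Unquoted: split into words, * expands to every file in the current directory
echo "$value"               # Quoted: prints foo    bar   * exactly as stored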

Examples of Problems Caused

Some examples of problems caused by unquoted expansions:

# Word splitting on whitespace
out=$(printf 'foo    bar')
echo $out   # Unquoted: prints "foo bar" - the run of spaces is collapsed
echo "$out" # Quoted: prints "foo    bar" exactly as captured

# Glob expansion
out=$(echo '*.txt')
echo $out   # Unquoted: the * expands into matching filenames (if any)
echo "$out" # Quoted: prints the literal *.txt

# Argument separation in loops
out=$(mycmd)
for x in $out; do
  echo "$x" # Runs once per whitespace-separated word of $out
done

The solution is always the same – quote the expansions so Bash leaves the value alone:

out="$(printf 'foo    bar')"
echo "$out" # Preserves the whitespace

out="$(echo '*.txt')"
echo "$out" # Prints the literal *.txt, the glob is never expanded

for x in "$out"; do
  echo "$x" # Runs exactly once, with the entire output
done

When to Quote

Always quote the command substitution whenever assigning output to a variable:

var="$(cmd)" # Double quotes
var='$(cmd)' # Single quotes

The exception is when you deliberately want the output split into separate words or arguments, in which case leaving the expansion unquoted is intentional. In every other case, quote.
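
Even when you do want the individual words or lines, reading them into an array usually expresses that intent more safely than relying on unquoted expansion (a sketch; mycmd stands in for any command):

# One array element per output line (Bash 4+)
mapfile -t lines < <(mycmd)

# Or split a single line of output into whitespace-separated fields
read -r -a fields <<< "$(mycmd)"

echo "Captured ${#lines[@]} lines; first field: ${fields[0]}"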

Ignoring Exit Codes

Another common mistake is assigning command output without checking the exit code to see if the command succeeded:

out=$(mycommand)

This ignores mycommand’s exit code. If mycommand fails, the script will continue executing with empty or incorrect output stored in $out. To handle failures properly, always check the $? exit code after command substitutions:

if out=$(mycommand); then
  echo "Success"
else
  echo "Failed with exit code $?"
fi

This tests if mycommand succeeded, and prints an error if it failed. Handling exit codes prevents errors being silently ignored.

Why Check Exit Codes

The $? variable contains a command’s exit code – an integer indicating whether it succeeded (0) or failed (non-zero). An assignment of the form out=$(mycommand) sets $? to mycommand’s exit code, but that value only survives until the next command runs:

out=$(mycommand) # $? now holds mycommand's exit code
echo "captured"  # $? has been overwritten by echo's exit code

So you must test, or save, $? immediately after the substitution:

out=$(mycommand)
status=$?
if [ "$status" -eq 0 ]; then
  echo "Succeeded"
else
  echo "Failed with exit code $status"
fi

Making exit code checks mandatory prevents overlooked errors.
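
You can also let the shell abort on unchecked failures (a sketch; whether this suits your script depends on how much manual error handling you want, and set -e has well-known edge cases, e.g. failures inside an if condition are not fatal):

set -euo pipefail # Abort on errors, unset variables, and failed pipeline stages

out=$(mycommand)  # If mycommand fails, the script stops right here
echo "Only reached on success: $out"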

Common Exit Code Checks

Some common exit code check patterns are:

# Check immediately after substitution
out=$(cmd)
if [ $? -ne 0 ]; then
  echo "cmd failed" >&2 # handle the error
fi

# Use an if statement
if out=$(cmd); then
  echo "$out"           # process the output
else
  echo "cmd failed" >&2 # handle the failure
fi

# Wrap the check in a Bash function
run() {
  if out=$(cmd); then
    return 0
  else
    return 1
  fi
}

if run; then
  echo "$out"           # success
else
  echo "run failed" >&2 # failed
fi

The key is checking $? immediately after substitutions to catch errors.
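
A compact variant chains the error handling directly onto the assignment (a sketch using the same placeholder cmd):

out=$(cmd) || { echo "cmd failed with exit code $?" >&2; exit 1; }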

Overwriting Existing Variables

A very common pitfall when assigning to variables is accidentally overwriting a value that is already stored, often by reusing a generic variable name. For example:

out=$(hostname)
out=$(ls)   # Reused the same variable name by mistake
echo "$out" # Oops, the hostname was clobbered with ls output

Since all of a script’s variables share a single global namespace, it is easy to reuse a name and silently overwrite a value. Always choose output variable names carefully to avoid collisions.

Use Unique Variable Names

The simplest way to avoid overwrites is choosing unique names for output variables:

host=$(hostname)
files=$(ls)  # A separate name for each output
echo "$host" # Original value preserved

Prefixing variables or using longer descriptive names prevents collisions:

server_name=$(hostname)
cmd_output=$(ls)

Use Namespaces for Scripts

For larger scripts, wrap the work in a function and declare its variables local, so output variable names cannot collide with globals:

myscript() {
  # Variables scoped to this function only
  local out
  local filename

  out=$(cmd)
  filename=$(generate_filename)

  # Code outside this function can use out/filename
  # for its own purposes without interference
}

This uses local to scope the variables to myscript. Assignments made inside the function no longer clobber variables of the same name elsewhere in the script, and the function’s values disappear when it returns.
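
A small demonstration of the effect (a sketch; the outer value and the commands inside myscript are placeholders):

out="important global value"

myscript    # Assigns to its own local out
echo "$out" # Still prints "important global value"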

Avoid Overwriting Environment Vars

Be very careful when naming output vars as you can easily overwrite imported environment variables by mistake:

echo "$PATH" # Oops, shadows $PATH
PATH=$(cmd)   

Stick to unique, lowercase names to avoid accidentally breaking imported environment variables (which are conventionally uppercase).
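
If a value must never change once captured, you can also ask Bash to enforce that (a sketch, reusing the server_name example from above):

server_name=$(hostname)
readonly server_name # Any later server_name=... now fails with an error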

Assigning Huge Outputs to Variables

A common mistake is trying to assign an extremely large command output to a variable:

  
out=$(command_with_large_output)

When a command outputs megabytes/gigabytes of text, allocating all that data to a variable can use up a lot of memory and slow the script to a crawl.

Memory Overhead

Variable assignments like the above store the output in memory. Bash has to allocate resources for the entire contents of $out. For small outputs this isn’t an issue – but 1GB outputs can overwhelm your system:

# 50KB Output - OK 
out=$(generate_report)

# 1GB output - Problems!!
out=$(dump_database)  

The larger the saved output, the more memory consumed. This leads to RAM shortages, slow performance, crashes etc. Storing too much data is inefficient.

Slow Processing

Parsing and manipulating large outputs stored in variables is very slow:

# Slow - first stores the entire 1GB output in memory
out=$(big_data_query)

# Then processes it line by line from the variable
while IFS= read -r line; do
  echo "$line"
done <<< "$out"

Processing the output line-by-line while it is being generated avoids having to store it entirely in memory first. This saves resources and is much faster.
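
The streaming version pipes the command straight into the loop, so only one line is held in memory at a time (a sketch; big_data_query stands in for any command with large output):

# Fast - no variable ever holds the full output
big_data_query | while IFS= read -r line; do
  echo "$line"
done

Note that a loop on the receiving end of a pipe runs in a subshell, so any variables it sets are not visible after the loop; if you need them, feed the loop with done < <(big_data_query) instead.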

Alternatives

To avoid issues with huge outputs, either:

  • Pipe the output directly to other commands instead of assigning to a variable
  • Write the output incrementally to disk instead of storing in memory
  • Only store extracts of outputs that you actually need to variables

These approaches keep memory use low and are usually much faster.
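
For example (a minimal sketch; the commands and file names are placeholders):

# 1. Pipe directly to the next command
dump_database | gzip > backup.sql.gz

# 2. Write to disk instead of holding it in a variable
big_data_query > /tmp/query_results.txt

# 3. Store only the extract you actually need
error_count=$(big_data_query | grep -c 'ERROR')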

Attempting Variable Expansion Before Assignment

Another pitfall is trying to expand variables before they have been assigned any value:

echo "$out" # Expands to empty string 
out=$(cmd)

Since out has not been set yet on the first line, the expansion produces an empty string. Using unset variables silently yields empty values (or an error, if set -u is enabled), which can hide ordering mistakes like the one above.

Why Undefined Expansion Happens

Variables in Bash do not need to be declared before use – you can expand a variable that has never been assigned:

echo "$undefined_var" # Expands to ""

So, by default, Bash cannot flag “variable used before assignment” errors; you have to guard against unassigned expansions yourself.

How to Avoid Issues

Always set variables before trying to expand them:

out=$(cmd) # Assign first
echo "$out" # Then expand

Or use parameter expansion to provide defaults:

  
echo "${out:-"Default if unset"}" # Show default if undefined

Enabling set -u (nounset) makes Bash raise an error whenever an unset variable is expanded, which also helps catch these issues while debugging.
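
For instance (a sketch):

set -u                        # Treat expansion of unset variables as an error
echo "$out"                   # Aborts here with "out: unbound variable" if out was never set

# Or demand a value for a single expansion, without set -u
echo "${out:?out is not set}" # Aborts with this message if out is unset or empty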

Confusing Standard Output and Standard Error

The final common pitfall is only capturing standard output, and missing error messages sent to standard error.

For example:

out=$(mycommand) # Captures standard output only

This assigns the standard output stream to the variable, but ignores the standard error stream. If mycommand logs errors to stderr, those messages will get lost.

Missing Error Messages

Losing error messages can mask issues and prevent proper handling of failures. For example:

out=$(mycommand)
if [ $? -eq 0 ]; then
   # Success right?
   echo "$out"
fi

The exit code says the command succeeded, but stderr may contain warnings or error messages that you never see!

Redirect Both Streams

To fix this, redirect both standard output and standard error when assigning:

  
out=$(mycommand 2>&1) # Redirect stderr to stdout

This combines both streams into the variable. Now errors are correctly captured:

out=$(mycommand 2>&1)
if [ $? -eq 0 ]; then
  echo "$out" # $out contains all output  
else
  echo "$out" # $out contains error logs
fi

If you only want the error stream in the variable, swap the redirections, as shown in the next section.

Other Ways to Access Stderr

You can also access stderr separately without redirection:

out=$(mycommand) # stdout only
errors=$(mycommand 2>&1 1>/dev/null) # stderr only 

# Or juggle file descriptors to capture stderr while stdout still reaches the terminal
exec 3>&1                     # Save the original stdout on fd 3
errors=$(mycommand 2>&1 1>&3) # stderr goes into $errors, stdout goes to fd 3
exec 3>&-                     # Close the saved descriptor

But for most scripts, simply merging stderr into stdout with 2>&1 is the simplest option.
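
If you genuinely need stdout and stderr in two separate variables, a temporary file is the most straightforward route (a sketch; it assumes mktemp is available):

err_file=$(mktemp)
out=$(mycommand 2>"$err_file") # stdout into $out, stderr into the temp file
status=$?
errors=$(<"$err_file")         # Read the captured stderr into a variable
rm -f "$err_file"

echo "exit code: $status"
echo "stdout: $out"
echo "stderr: $errors"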
