Shell Tricks: Native Regular Expression Escaping In Zsh, Ksh93 And Fish

Escaping Regular Expressions Natively

Escaping regular expressions correctly in shell scripts can be challenging. Metacharacters such as ., *, ^, $, [ ], { } and ( ) carry special meaning and do not match literally by default. Scripts often pipe strings through external utilities like sed or awk to escape these characters. Shells such as zsh, ksh93 and fish, however, include builtin ways to make a string match literally, which avoids the extra process and is usually simpler to read. Keep in mind that shells have two pattern languages, glob patterns and regular expressions, and an escaping mechanism normally targets only one of them.
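For reference, the external-utility approach mentioned above might look like the following minimal sketch. The function name escape_ere is hypothetical, and the character set assumes POSIX extended regular expression (ERE) syntax as used by grep -E; other regex flavors may need a different set.

```shell
#!/bin/sh
# Hypothetical helper: backslash-escape ERE metacharacters with sed
# so that the result matches the original string literally.
escape_ere() {
    # The bracket expression lists ] [ \ . | $ ( ) { } ? + * ^
    printf '%s\n' "$1" | sed 's/[][\.|$(){}?+*^]/\\&/g'
}

pattern=$(escape_ere 'file*.txt')
printf '%s\n' "$pattern"

# -x anchors the match to the whole line, so only the literal name matches.
printf '%s\n' 'file*.txt' 'file1.txt' | grep -E -x "$pattern"
```

Every call forks sed, which is the cost the builtin mechanisms in the following sections avoid.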

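zsh’s (b) Parameter Expansion Flag

zsh has no option called RE_MATCH. The similarly named RE_MATCH_PCRE option only selects PCRE instead of POSIX extended regular expressions for the =~ operator; it does not escape anything. What zsh does provide is the (b) parameter expansion flag, which backslash-escapes the characters that are special to zsh pattern matching (globbing). For example:

% str='file*.txt'
% print -r -- ${(b)str}
file\*.txt
% pat="${(b)str}*"
% [[ 'file*.txt.bak' == ${~pat} ]] && print match
match

Here (b) escapes the * so that it matches a literal asterisk, while the * appended afterwards still acts as a wildcard once ${~pat} turns pattern expansion on. The . is left alone because it is not special in glob patterns. Note that (b) targets glob syntax rather than regex syntax, so a regex-only metacharacter such as . is not escaped for use with =~.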

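ksh93’s Quoting and ~(...) Pattern Prefixes

In ksh93 a bare ~ at the start of a word triggers tilde expansion (home directories), so echo ~*.txt does not escape anything. Literal matching is done by quoting: inside [[ ... ]], any quoted part of the pattern matches literally. For example:

$ s='file*.txt'
$ [[ 'file*.txt' == "$s" ]] && echo match
match
$ [[ 'file1.txt' == "$s" ]] || echo 'no match'
no match

The quotes around $s make the * match only a literal asterisk, so file1.txt is rejected. The tilde does play a role in ksh93 patterns, but only in the form ~(options): the ksh93 manual lists prefixes such as ~(E) for extended regular expressions and ~(F) for fgrep-style fixed strings, which are worth checking against your ksh93 version.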

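fish’s string escape --style=regex

fish has no \[ ] escape syntax; a backslash simply quotes the single character that follows it. For regular expressions, fish provides the string escape builtin with the --style=regex option, which backslash-escapes regex metacharacters so that the result matches literally in string match -r. For example:

> set pat (string escape --style=regex 'file*.txt')
> echo $pat
file\*\.txt
> string match -r -- "^$pat\$" 'file*.txt'
file*.txt

This escapes both * and . so the pattern matches only the literal text file*.txt. Because string escape is a builtin, no external process is involved.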

Performance Differences

Avoiding external processes like sed and awk removes a fork and exec per escape. That overhead is small for a single call but adds up inside loops.

# Builtin only: no external process
$ time echo ~*.txt > /dev/null 

real	0m0.003s

# sed via command substitution
$ time echo $(sed 's:[*]:\\&:g' <<< '*.txt') > /dev/null

real	0m0.100s

In this run the builtin-only command finished roughly 30x faster than the sed version (0.003s versus 0.100s). Most of the difference is process startup, so exact figures will vary by system.
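To try a comparison like this without shell-specific features, the sketch below escapes ERE metacharacters using only builtins available in POSIX sh. The name escape_ere_builtin is hypothetical, and the metacharacter list is an assumption based on POSIX extended regular expressions.

```shell
#!/bin/sh
# Hypothetical helper: escape ERE metacharacters using only shell builtins.
# Variables are global because POSIX sh has no "local".
escape_ere_builtin() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}        # string without its first character
        c=${s%"$rest"}     # the first character itself
        case $c in
            '\'|'.'|'['|']'|'|'|'$'|'('|')'|'{'|'}'|'?'|'+'|'*'|'^')
                out="$out\\$c" ;;   # prefix metacharacters with a backslash
            *)
                out="$out$c" ;;     # copy ordinary characters unchanged
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

escape_ere_builtin 'file*.txt'
```

Calling the helper in a loop under time, next to a sed-based equivalent, gives a rough picture on a given machine. For long strings the character-by-character loop may lose its advantage, so it is worth measuring with realistic input.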

Recommendations

For builtin literal matching:

  • In zsh, use the (b) parameter expansion flag, as in ${(b)str}, to escape glob pattern characters.
  • In ksh93, quote the literal part of the pattern inside [[ ... ]].
  • In fish, use string escape --style=regex.

Prefer these over spawning an external process for each match. The builtins avoid process startup costs and keep the matching logic in one place.
