regular expressions benchmark
This page measures the overhead of regular expression matching
in JavaScript, when trying to match messages of the form:
File /usr/lib/mypatterns not readable, please use chmod!!!
File /usr/lib/mypatterns.so not found, please create it!!!
and to return the length of the matched filename.
a simple test
The most natural way to do this is using a simple regexp:
var m;
if((m = s.match(/File ([^ ]*) not (found|readable).*!!!/)) != null)
return m[1].length;
On second thought, one may try to optimize the regexp by using
non-capturing sub-groups:
var m;
if((m = s.match(/File ([^ ]*) not (?:found|readable).*!!!/)) != null)
return m[1].length;
Finally, really motivated programmers can replace the whole regexp
with some carefully designed, hand-crafted matching using only
substring extractions and comparisons:
var pos;
if(s.substr(0, 5) == "File " && ((pos = s.indexOf(' ', 5)) >= 0) &&
   s.substr(++pos, 4) == "not " &&
   (s.substr((pos += 4), 5) == "found" ||
    s.substr(pos, 8) == "readable" ) &&
   s.substr(-3) == "!!!")
  return pos - 10;
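As a sanity check, the three variants above can be wrapped in functions and run on the two sample messages. This is only a sketch: the function names and the -1 "no match" return value are assumptions made for illustration and do not come from the original page.

```javascript
// Hypothetical wrappers around the three snippets; the names and the
// -1 "no match" value are assumptions for illustration.
function simpleRegexp(s) {
  var m;
  if ((m = s.match(/File ([^ ]*) not (found|readable).*!!!/)) != null)
    return m[1].length;
  return -1;
}

function nonCapturingRegexp(s) {
  var m;
  if ((m = s.match(/File ([^ ]*) not (?:found|readable).*!!!/)) != null)
    return m[1].length;
  return -1;
}

function handCrafted(s) {
  var pos;
  if (s.substr(0, 5) == "File " && ((pos = s.indexOf(' ', 5)) >= 0) &&
      s.substr(++pos, 4) == "not " &&
      (s.substr((pos += 4), 5) == "found" ||
       s.substr(pos, 8) == "readable") &&
      s.substr(-3) == "!!!")
    // pos now points just past "not ", i.e. 5 characters after the end
    // of the filename, which itself starts at index 5.
    return pos - 10;
  return -1;
}

var samples = [
  "File /usr/lib/mypatterns not readable, please use chmod!!!",
  "File /usr/lib/mypatterns.so not found, please create it!!!"
];

// One row per sample, one column per variant.
var results = samples.map(function (s) {
  return [simpleRegexp(s), nonCapturingRegexp(s), handCrafted(s)];
});
console.log(results); // [ [ 19, 19, 19 ], [ 22, 22, 22 ] ]
```

Note that the variants are not strictly equivalent on arbitrary input: the regexps are unanchored and accept the pattern anywhere in the string, whereas the hand-crafted version requires "File " at the very start and "!!!" at the very end.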
comparison
The actual results heavily depend on the
JavaScript engine included in your browser and on your overall
platform characteristics (CPU, memory, etc.). On our platform (Firefox
3.0.19, Linux kernel 2.6.24, PC/Athlon XP 2GHz, 256MB RAM) we observe
that:
- the versions using regexp matching are considerably slower than
the hand-crafted version (between 2.5 and 3 times slower)
- the "optimized" regexp is actually sometimes a bit slower than
the simple regexp (up to 15% slower)
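To reproduce this kind of comparison on another engine, a timing loop along the following lines could be used. It is a minimal sketch and not the harness behind the figures above: the names variants and timeVariant, the iteration count, and the -1 "no match" value are all assumptions, and Date.now() only gives coarse wall-clock timings.

```javascript
// Hypothetical harness; names, iteration count and the -1 "no match"
// value are assumptions for illustration.
var variants = {
  simpleRegexp: function (s) {
    var m = s.match(/File ([^ ]*) not (found|readable).*!!!/);
    return m != null ? m[1].length : -1;
  },
  nonCapturingRegexp: function (s) {
    var m = s.match(/File ([^ ]*) not (?:found|readable).*!!!/);
    return m != null ? m[1].length : -1;
  },
  handCrafted: function (s) {
    var pos;
    if (s.substr(0, 5) == "File " && ((pos = s.indexOf(' ', 5)) >= 0) &&
        s.substr(++pos, 4) == "not " &&
        (s.substr((pos += 4), 5) == "found" ||
         s.substr(pos, 8) == "readable") &&
        s.substr(-3) == "!!!")
      return pos - 10;
    return -1;
  }
};

// Call fn(input) `iterations` times and return the elapsed wall-clock
// milliseconds plus a checksum of the results; using the results makes it
// harder for the engine to discard the calls as dead code.
function timeVariant(fn, input, iterations) {
  var checksum = 0;
  var start = Date.now();
  for (var i = 0; i < iterations; i++) checksum += fn(input);
  return { ms: Date.now() - start, checksum: checksum };
}

var input = "File /usr/lib/mypatterns not readable, please use chmod!!!";
for (var name in variants) {
  var r = timeVariant(variants[name], input, 100000);
  console.log(name + ": " + r.ms + " ms (checksum " + r.checksum + ")");
}
```

Running the loop several times and discarding the first run helps reduce the influence of engine warm-up on the measured times.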
comments
Although the slowdown factor due to regular expression matching is
significant (several times on some platforms!), it is definitely worth
it in many cases because the resulting program is so much simpler to
write and maintain, and as a consequence also much more reliable.