regular expressions benchmark

This page measures the overhead of regular expression matching in JavaScript when matching messages of the form:
File /usr/lib/mypatterns not readable, please use chmod!!!
File /usr/lib/mypatterns.so not found, please create it!!!
and returning the length of the matched filename.

a simple test

The most natural way to do this is to use a simple regexp:
  var m;
  if((m = s.match(/File ([^ ]*) not (found|readable).*!!!/)) != null)
    return m[1].length;
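
As a minimal sketch, the snippet above can be wrapped in a function and applied to the two sample messages. The function name matchSimple and the -1 result for a non-matching message are assumptions made for illustration; the original snippet does not say what happens when the match fails.

```javascript
// Hypothetical wrapper around the simple regexp
// (the name and the -1 fallback are assumptions, not part of the benchmark).
function matchSimple(s) {
  var m;
  if ((m = s.match(/File ([^ ]*) not (found|readable).*!!!/)) != null)
    return m[1].length;   // m[1] is the captured filename
  return -1;              // assumed result when the message does not match
}

var samples = [
  "File /usr/lib/mypatterns not readable, please use chmod!!!",
  "File /usr/lib/mypatterns.so not found, please create it!!!"
];

for (var i = 0; i < samples.length; i++)
  console.log(matchSimple(samples[i]));   // prints 19, then 22
```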

On second thought, one may try to optimize the regexp by using a non-capturing group, so that the engine does not have to record the found/readable alternative:

  var m;
  if((m = s.match(/File ([^ ]*) not (?:found|readable).*!!!/)) != null)
    return m[1].length;
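
The effect of the non-capturing group can be seen by comparing the arrays returned by match for the two regexps. The following is only an illustrative sketch; the variable names are chosen here and are not part of the benchmark.

```javascript
var s = "File /usr/lib/mypatterns.so not found, please create it!!!";

var capturing    = s.match(/File ([^ ]*) not (found|readable).*!!!/);
var nonCapturing = s.match(/File ([^ ]*) not (?:found|readable).*!!!/);

console.log(capturing.length);      // 3: whole match, filename, "found"
console.log(nonCapturing.length);   // 2: whole match, filename
console.log(capturing[1] === nonCapturing[1]);   // true: same filename either way
```

In both cases m[1] is the filename, so the returned length is unchanged; only the second group is no longer stored.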

Finally, really motivated programmers can replace the whole regexp with some carefully designed, hand-crafted matching using only substring extractions and comparisons:

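  var pos;
  // "File " prefix, then the space that ends the filename
  if(s.substr(0, 5) == "File " && ((pos = s.indexOf(' ', 5)) >= 0) &&
     s.substr(++pos, 4) == "not " &&
     // advance pos past "not " and test the two allowed words
     (s.substr((pos += 4), 5) == "found" ||
      s.substr(pos, 8) == "readable" ) &&
     s.substr(-3) == "!!!")
    // pos = 5 + filename length + 1 + 4, so the filename length is pos - 10
    return pos - 10;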
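
A quick way to gain confidence in the hand-crafted version is to run it side by side with the regexp on a few inputs. The sketch below does this; the function names, the -1 fallback and the third input are assumptions made for illustration. Note that the regexp is not anchored, so it also accepts a message with leading text, whereas the hand-crafted code requires "File " at position 0 and "!!!" at the very end.

```javascript
// Hypothetical wrappers; names and the -1 fallback are assumptions.
function matchRegexp(s) {
  var m = s.match(/File ([^ ]*) not (?:found|readable).*!!!/);
  return m != null ? m[1].length : -1;
}

function matchByHand(s) {
  var pos;
  if (s.substr(0, 5) == "File " && ((pos = s.indexOf(' ', 5)) >= 0) &&
      s.substr(++pos, 4) == "not " &&
      (s.substr((pos += 4), 5) == "found" ||
       s.substr(pos, 8) == "readable") &&
      s.substr(-3) == "!!!")
    return pos - 10;
  return -1;
}

var inputs = [
  "File /usr/lib/mypatterns not readable, please use chmod!!!",
  "File /usr/lib/mypatterns.so not found, please create it!!!",
  "Error: File x not found!!!"   // made-up message with a prefix
];

for (var i = 0; i < inputs.length; i++)
  console.log(matchRegexp(inputs[i]), matchByHand(inputs[i]));
// 19 19
// 22 22
// 1 -1   (the regexp is not anchored at the start of the string)
```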

comparison

The actual results depend heavily on the JavaScript engine included in your browser and on your overall platform characteristics (CPU, memory, etc.). On our platform (Firefox 3.0.19, Linux kernel 2.6.24, PC/Athlon XP 2 GHz, 256 MB RAM) we observe that:

comments

Although the slowdown due to regular expression matching is significant (a factor of several on some platforms!), it is definitely worth it in many cases because the resulting program is so much simpler to write and maintain, and as a consequence also much more reliable.
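
Since the slowdown depends on the engine, it may be worth estimating it locally. The following is a minimal sketch of such a measurement, not the harness used to produce the results on this page; the helper timeIt, the function names and the iteration count are all assumptions. Timings from a loop this simple are only indicative.

```javascript
// Hypothetical timing helper: calls fn(s) `iterations` times and
// returns the elapsed milliseconds together with the summed results.
function timeIt(fn, s, iterations) {
  var start = Date.now();
  var total = 0;
  for (var i = 0; i < iterations; i++)
    total += fn(s);   // use the result so the call is not trivially dead code
  return { ms: Date.now() - start, total: total };
}

function withRegexp(s) {
  var m = s.match(/File ([^ ]*) not (?:found|readable).*!!!/);
  return m != null ? m[1].length : -1;
}

function withSubstrings(s) {
  var pos;
  if (s.substr(0, 5) == "File " && ((pos = s.indexOf(' ', 5)) >= 0) &&
      s.substr(++pos, 4) == "not " &&
      (s.substr((pos += 4), 5) == "found" ||
       s.substr(pos, 8) == "readable") &&
      s.substr(-3) == "!!!")
    return pos - 10;
  return -1;
}

var msg = "File /usr/lib/mypatterns not readable, please use chmod!!!";
var n = 100000;   // assumed iteration count, adjust to your machine

var regexpRun = timeIt(withRegexp, msg, n);
var substrRun = timeIt(withSubstrings, msg, n);

console.log("regexp: " + regexpRun.ms + " ms, substrings: " + substrRun.ms + " ms");
```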