the predefined notation

The predefined notation in myPatterns is a superset of JSON (JavaScript Object Notation), the standard data notation.

Using this notation, one can write for instance:

  s = match([1,{a:2,b:3}], "[%x,{b:%y}]"); // binds s.x to 1 and s.y to 3
  if(s) { use(s.x, s.y); }

The main matching function, match(), takes some data and a pattern, and returns a substitution that binds the pattern variables to components of the data (or "sub-data"). If the data does not match the pattern, match() returns null.

Any type of object can be matched using this notation, which is described in detail below.

base types

The predefined notations for base types (numbers, booleans, and strings) are simply their native notations. To be more precise, their notation is as returned by the standard JavaScript global function String(), except for strings, whose content has to be wrapped within single or double quotes.

For instance, number 123 is matched by the pattern "123", the boolean true is matched by "true", and the string 'who am I' is matched by both "'who am I'" and '"who am I"'.

Note that none of these patterns contain variables. Therefore, the result of a successful match on base types using these patterns is always the empty substitution (which is not null, but rather the substitution {}, containing no bindings).
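The difference between the empty substitution and null matters in practice, because an empty object is truthy in JavaScript while null is not. The following minimal sketch illustrates this in plain JavaScript. The helper matchBase is hypothetical: it is not part of myPatterns and only mimics the behaviour described above for base types, ignoring escaped quotes inside strings.

```javascript
// Hypothetical helper (not part of myPatterns): mimics the described
// behaviour for base types only. Returns {} on success, null on failure.
function matchBase(data, pattern) {
  var notations = (typeof data === "string")
    ? ["'" + data + "'", '"' + data + '"']   // strings: content within quotes
    : [String(data)];                         // numbers, booleans: String()
  return notations.indexOf(pattern) >= 0 ? {} : null;
}

var ok = matchBase(123, "123");      // {}   (success, no bindings)
var ko = matchBase(true, "false");   // null (failure)

// An empty object is truthy, so `if (s)` still separates success from
// failure even when the pattern binds no variable.
if (ok) { console.log("matched, bindings:", Object.keys(ok).length); }
if (!ko) { console.log("no match"); }
```

This is why a test such as if(s) works for patterns without variables as well as for patterns with variables.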

As an extension to JSON, strings may also be matched using regular expressions (or regexes), written between slash ("/") characters. As in standard regex matching, the result of matching a string with a regex pattern is an array of values, one for each "capturing subgroup" (a sub-pattern within parentheses "(...)").

For instance, the same string "who am I" can be matched by the patterns "/who/", "/^who (am|are|is)/", or "/^who (\S+) (\S+)$/". The first match returns the empty substitution {}; the second match returns the substitution {0:"am"}; the third match returns the substitution {0:"am",1:"I"}. Note that, unlike traditional regex matching, the first element of the result does not automatically hold the string matched by the whole regex. The same effect can be obtained by wrapping the whole regex in a capturing group.
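To relate this to the standard JavaScript behaviour, the sketch below derives such a substitution from the result of String.prototype.match(). The helper regexSubstitution is hypothetical and is not how myPatterns is implemented; it simply drops element 0 of the standard result and re-indexes the capturing groups from 0. It assumes a regex without the g flag.

```javascript
// Hypothetical helper (not part of myPatterns): turns the standard match
// result [whole, group1, group2, ...] into {0: group1, 1: group2, ...}.
function regexSubstitution(str, re) {
  var m = str.match(re);           // null when the regex does not match
  if (m === null) return null;
  var s = {};
  for (var i = 1; i < m.length; i++) {
    s[i - 1] = m[i];               // skip m[0], the whole matched string
  }
  return s;
}

regexSubstitution("who am I", /who/);                  // {}
regexSubstitution("who am I", /^who (am|are|is)/);     // {0:"am"}
regexSubstitution("who am I", /^who (\S+) (\S+)$/);    // {0:"am", 1:"I"}
// Wrapping the whole regex in a group recovers the whole matched string:
regexSubstitution("who am I", /^(who (\S+) (\S+))$/);  // {0:"who am I", 1:"am", 2:"I"}
regexSubstitution("who am I", /^what/);                // null
```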

objects

The notation for objects smoothly generalizes their native JavaScript notation:

the pattern "{fld1:%x1,...,fldN:%xN}" matches any object containing at least the fields fld1 ... fldN, and binds the pattern variables to the values of the respective fields; in particular, the pattern "{}" matches any object

For example, match({a:1,b:2}, "{a:%x}") returns {x:1}, match({a:1,b:2}, "{b:%x,a:%y}") returns {y:1,x:2}, match({a:1,b:2}, "{}") returns {} (a successful match with an empty substitution), and match({a:1,b:2}, "{c:%x}") returns null (match failure).
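The "at least these fields" rule can be sketched in a few lines of plain JavaScript. The helper matchFields below is hypothetical and not part of myPatterns: instead of a pattern string it takes an object mapping field names to variable names, so {b:"x", a:"y"} stands for the pattern "{b:%x,a:%y}". Whether inherited fields count as present is an assumption of this sketch.

```javascript
// Hypothetical helper (not part of myPatterns). `spec` maps each required
// field name to the name of the variable that receives its value.
function matchFields(data, spec) {
  if (data === null || typeof data !== "object") return null;
  var s = {};
  for (var fld in spec) {
    if (!(fld in data)) return null;   // a required field is missing
    s[spec[fld]] = data[fld];          // bind the variable to the field value
  }
  return s;                            // extra fields in data are ignored
}

matchFields({a:1, b:2}, {a:"x"});          // {x:1}
matchFields({a:1, b:2}, {b:"x", a:"y"});   // {x:2, y:1}
matchFields({a:1, b:2}, {});               // {}
matchFields({a:1, b:2}, {c:"x"});          // null
```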

arrays

The notation for arrays is an extended version of their native JavaScript notation:

the pattern "[%x1,...,%xN]" matches any array of length N, and the N pattern variables are bound to the elements of the array, in their order; in particular, the pattern "[]" matches any empty array
the pattern "[%x1,...,%xN-1|%xN]" matches any array of length at least N-1, the first N-1 pattern variables are bound to the first N-1 elements of the array, in their order, and the last variable is bound to the rest of the array (the subsequence starting with the N-th element).

For instance, match([1,2,3], "[%h|%t]") returns {h:1, t:[2,3]}, match([1,2], "[%a,%b|%t]") returns {a:1, b:2, t:[]}, match([1,2], "[%a]") returns null, and match([1], "[%a,%b]") also returns null.
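The two array rules differ only in their length check and in the extra binding for the rest of the array. A minimal sketch, assuming a hypothetical helper matchArray that is not part of myPatterns and that receives the variable names directly instead of a pattern string:

```javascript
// Hypothetical helper (not part of myPatterns).
//   names: variables bound to the leading elements, in order
//   rest:  optional variable bound to the remaining elements (the "|" part)
function matchArray(data, names, rest) {
  if (!Array.isArray(data)) return null;
  var hasRest = (rest !== undefined);
  // Without a rest variable the lengths must be equal; with one, the
  // array only needs enough elements for the leading variables.
  if (hasRest ? data.length < names.length : data.length !== names.length) {
    return null;
  }
  var s = {};
  names.forEach(function (name, i) { s[name] = data[i]; });
  if (hasRest) s[rest] = data.slice(names.length);
  return s;
}

matchArray([1, 2, 3], ["h"], "t");      // {h:1, t:[2,3]}
matchArray([1, 2], ["a", "b"], "t");    // {a:1, b:2, t:[]}
matchArray([1, 2], ["a"]);              // null (array too long)
matchArray([1], ["a", "b"]);            // null (array too short)
```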

composing the predefined notations

Of course, the predefined notations can be freely composed, by replacing any variable in a pattern with some sub-pattern. When such a composed pattern is matched, the sub-patterns will be recursively matched against the corresponding sub-data.

For instance, match([1,{a:2,b:3}], "[%x,{b:%y}]") first binds the variable %x to 1, then recursively matches the sub-data {a:2,b:3} with the sub-pattern "{b:%y}", which binds the variable %y to 3.
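The recursion itself can be illustrated with a small stand-alone matcher. This is only a sketch under stated assumptions, not the myPatterns implementation: patterns are given as JavaScript values rather than strings, V("x") plays the role of %x, the "|" rest notation is left out, and the names Var, V and matchTree are hypothetical.

```javascript
// Hypothetical pattern representation (not the myPatterns string notation):
//   V("x")          a variable, like %x
//   [p1, ..., pN]   matches an array of exactly N elements
//   {f1: p1, ...}   matches an object having at least the fields f1, ...
//   anything else   a base value, compared with ===
function Var(name) { this.name = name; }
function V(name) { return new Var(name); }

function matchTree(data, pat, s) {
  s = s || {};                                   // substitution built so far
  if (pat instanceof Var) { s[pat.name] = data; return s; }
  if (Array.isArray(pat)) {
    if (!Array.isArray(data) || data.length !== pat.length) return null;
    for (var i = 0; i < pat.length; i++) {
      if (!matchTree(data[i], pat[i], s)) return null;   // recurse on elements
    }
    return s;
  }
  if (pat !== null && typeof pat === "object") {
    if (data === null || typeof data !== "object") return null;
    for (var fld in pat) {
      if (!(fld in data)) return null;
      if (!matchTree(data[fld], pat[fld], s)) return null; // recurse on fields
    }
    return s;
  }
  return data === pat ? s : null;                // base value
}

var s = matchTree([1, {a:2, b:3}], [V("x"), {b: V("y")}]);
console.log(s.x, s.y);   // prints: 1 3
```

Each recursive call receives the same substitution object, so bindings made in sub-patterns accumulate in the result of the outermost call.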

As a more complex example, let us consider the following function "zipping" a pair of arrays of the form {p: [a1, a2 ...], q: [b1, b2 ...]} into an array of pairs, of the form [{p:a1, q:b1}, {p:a2, q:b2} ...]:

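function zip(lsts) {
  var s;
  // Loop while both lists are non-empty; each match splits off the two heads.
  for(var res = [];
      (s = match(lsts, "{p:[%a|%x],q:[%b|%y]}"));
      lsts = {p:s.x, q:s.y})     // continue with the two tails
    res.push({p:s.a, q:s.b});    // pair up the two heads
  return res;
}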

Assigning the match result to the substitution s serves both to test whether the match succeeded and to decompose the pair and its embedded lists. Because the substitution s is represented as a JavaScript object, the value bound to the variable %x is simply s.x. Using this function, {p:[1,2], q:[3,4]} is transformed into [{p:1, q:3}, {p:2, q:4}].
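For comparison, here is a sketch of the same transformation written with explicit indexing, without pattern matching. The name zipPlain is hypothetical. Since the pattern above needs at least one element in each list, the loop in zip() stops as soon as either list is exhausted; the sketch reproduces this by iterating up to the shorter length. Unlike zip(), it assumes that both fields p and q are present and are arrays.

```javascript
// Plain JavaScript equivalent of zip(), written without match().
// Assumes lsts.p and lsts.q are both arrays.
function zipPlain(lsts) {
  var res = [];
  var n = Math.min(lsts.p.length, lsts.q.length);  // stop at the shorter list
  for (var i = 0; i < n; i++) {
    res.push({p: lsts.p[i], q: lsts.q[i]});
  }
  return res;
}

zipPlain({p:[1,2], q:[3,4]});     // [{p:1, q:3}, {p:2, q:4}]
zipPlain({p:[1,2,5], q:[3,4]});   // [{p:1, q:3}, {p:2, q:4}]
```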

This example shows that the native notations in JavaScript integrate gracefully with the generic notations for lists and objects, because the generic notations are just extensions of the native ones; the code is thereby easy to write and to understand.

summary

Without any initial investment, anyone may use pattern matching on any object type using the predefined notations in myPatterns. These notations are either exactly the native notations of the language (for numbers and booleans) or backward-compatible extensions of the native notations (for strings, objects and arrays).

The next chapter shows the flexibility offered by myPatterns, which allows users to override the predefined notations with their own notations.
