Unparsed Patterns: Source Code Pattern Matching made easy

Unparsed pattern matching is a new paradigm for source code pattern matching that is very easy to implement in existing program manipulation tools, such as compilers, model checkers, code inspectors, refactoring tools, and tools for legacy code understanding and transformation.

Source code patterns make it easy to search for code fragments that have a specific form. They are used in many different tools for program manipulation. In these tools, source code patterns are generally written in a specific notation for syntax trees, either private to the tool or publicly known, such as LISP or XML. Writing such patterns usually requires an intimate knowledge of the syntax tree representation within that tool and of its notation.

Concrete syntax patterns are a specific kind of source code pattern, conveniently expressed in the native syntax of the "subject" programming language: they are simply source code fragments with holes. Thus, concrete syntax patterns are trivial for any programmer to write and understand. They are also portable, because they are independent of the syntax tree representation used within different tools. However, they are surprisingly difficult to implement efficiently. This is why only a few tools implement traditional concrete syntax patterns without imposing severe restrictions on their form.
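To make the idea of "source code fragments with holes" concrete, here is a minimal Python sketch of a toy matcher that works on a flat token stream. It only illustrates what such a pattern looks like to the user; it is not the unparsed pattern algorithm, and it is not the API of mygcc or myPatterns. The `%NAME` hole notation, the rule that a hole binds exactly one token, and the function names `tokenize`, `match_at`, and `find_all` are all assumptions made for this example.

```python
import re

# Assumed hole notation for this sketch: %NAME. Real tools define their own.
# Tokens are holes, words (identifiers or numbers), or single punctuation marks.
TOKEN = re.compile(r"%\w+|\w+|[^\w\s]")


def tokenize(text):
    """Split text into a flat list of tokens, ignoring whitespace."""
    return TOKEN.findall(text)


def match_at(pattern, tokens, start):
    """Match pattern tokens against tokens[start:]; return bindings or None."""
    if start + len(pattern) > len(tokens):
        return None
    bindings = {}
    for p, t in zip(pattern, tokens[start:]):
        if p.startswith("%"):
            # A hole binds exactly one token (a simplification of this sketch);
            # a hole that occurs twice must bind the same token both times.
            if bindings.setdefault(p[1:], t) != t:
                return None
        elif p != t:
            return None
    return bindings


def find_all(pattern_text, source_text):
    """Return the hole bindings for every position where the pattern matches."""
    pattern = tokenize(pattern_text)
    tokens = tokenize(source_text)
    results = []
    for i in range(len(tokens)):
        bindings = match_at(pattern, tokens, i)
        if bindings is not None:
            results.append(bindings)
    return results


if __name__ == "__main__":
    code = "p = malloc(n); q = malloc(m); free(p);"
    print(find_all("%X = malloc(%Y)", code))
```

Run on the sample above, the pattern `%X = malloc(%Y)` matches twice, binding `X` to `p` and `Y` to `n`, then `X` to `q` and `Y` to `m`. The pattern is written in the subject language itself, with no reference to any syntax tree notation. A token-level matcher like this one has no notion of program structure, which hints at why practical implementations of concrete syntax patterns are harder than they look.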

Unparsed patterns are a new form of concrete syntax patterns that are not only easy to use, but also very easy to implement.

Unparsed patterns were first implemented in an extensible version of the GCC compiler, called mygcc. They are now available in the free myPatterns matching library.

Publications:

Prototype


Last updated on 30/5/2010.