Lime: An LALR(1) parser generator in and for PHP.

Interpreter pattern got you down? Time to use a real parser? Welcome to Lime.

If you're familiar with BISON or YACC, you may want to read the metagrammar. It's written in the Lime input language, so you'll get a head-start on understanding how to use Lime.

  1. If you're not running Linux on an IA32 box, then you will have to rebuild lime_scan_tokens for your system. It should be enough to erase it, and then type CFLAGS=-O2 make lime_scan_tokens at the bash prompt.

  2. Stare at the file lime/metagrammar to understand the syntax. You're seeing slightly modified and tweaked Backus-Naur forms. The main difference is that you get to name your components, instead of referring to them by numbers the way that BISON demands. This idea was stolen from the C-based "Lemon" parser from which Lime derives its name. Incidentally, the author of Lemon disclaimed copyright, so you get a copy of the C code that taught me LALR(1) parsing better than any book, despite the obvious difficulties in understanding it. Oh, and one other thing: symbols are terminal if the scanner feeds them to the parser. They are non-terminal if they appear on the left side of a production rule. Lime names semantic categories using strings instead of the numbers that BISON-based parsers use, so you don't have to declare any list of terminal symbols anywhere.

  3. Look at the file lime/lime.php to see what pragmas are defined. To be more specific, you might look at the method lime::pragma(), which, at the time of this writing, supports "%left", "%right", "%nonassoc", "%start", and "%class". The first three set operator precedence and associativity. The last two declare the start symbol and the name of the generated PHP class that will hold all the bottom-up parsing tables.

  4. Write a grammar file.

  5. Generate the parser: /path/to/lime/lime.php list-of-grammar-files > my_parser.php

  6. Read the function parse_lime_grammar() in lime.php to understand how to integrate your parser into your program.

  7. Integrate your parser as follows:

     require 'lime/parse_engine.php';
     require 'my_parser.php';
     // Later:
     $parser = new parse_engine(new my_parser());
     // And still later:
     try {
     	while (..something..) {
     		$parser->eat($type, $val);
     		// You figure out how to get the parameters.
     	}
     	// And after the last token has been eaten:
     	$parser->eat_eof();
     } catch (parse_error $e) {
     	die($e->getMessage());
     }
     return $parser->semantic;

  8. You now have the computed semantic value of whatever you parsed. Add salt and pepper to taste, and serve.
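
Step 7 leaves it to you to come up with the ($type, $val) pairs. One way to do that is sketched below, as an illustration only. The function tokenize_expression() is hypothetical and not part of Lime, and the token type names ('num' and the operator characters) are assumptions that would have to match the terminal names in your own grammar file.

```php
<?php
// Hypothetical helper, NOT part of Lime: splits a simple arithmetic
// expression into array($type, $val) pairs. The type names are assumptions;
// they must agree with the terminals your grammar actually uses.
function tokenize_expression($input) {
    $tokens = array();
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        // \G anchors each match at the current offset.
        if (preg_match('/\G\s+/', $input, $m, 0, $offset)) {
            // Whitespace separates tokens but is not a token itself.
            $offset += strlen($m[0]);
        } elseif (preg_match('/\G\d+/', $input, $m, 0, $offset)) {
            // Integer literal: type 'num', value is the integer.
            $tokens[] = array('num', (int) $m[0]);
            $offset += strlen($m[0]);
        } elseif (preg_match('/\G[-+*\/()]/', $input, $m, 0, $offset)) {
            // Operator or parenthesis: the character doubles as its type.
            $tokens[] = array($m[0], $m[0]);
            $offset += 1;
        } else {
            throw new Exception("Unexpected character at offset $offset");
        }
    }
    return $tokens;
}
```

Inside the try block from step 7, the loop could then become a foreach over tokenize_expression($source) that calls $parser->eat($token[0], $token[1]) for each pair. Collecting the tokens into an array first keeps the scanner easy to test on its own; for large inputs you would probably feed tokens to the parser as they are recognised instead.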