An improved fork of LIME, an LALR(1) parser generator written in PHP. The original source code can be found at
Vitaliy Filippov 52918751c4 Actually return false from eat() on error for reduce part 2013-08-26 16:12:51 +04:00

Lime: An LALR(1) parser generator in and for PHP.

Interpreter pattern got you down? Time to use a real parser? Welcome to Lime.

If you're familiar with BISON or YACC, you may want to read the metagrammar. It's written in the Lime input language, so you'll get a head-start on understanding how to use Lime.

  1. If you're not running Linux on an IA32 box, then you will have to rebuild lime_scan_tokens for your system. It should be enough to erase it, and then type CFLAGS=-O2 make lime_scan_tokens at the bash prompt.

  2. Stare at the file lime/metagrammar to understand the syntax. You're seeing slightly modified and tweaked Backus-Naur forms. The main difference is that you get to name your components, instead of referring to them by number the way that BISON demands. This idea was stolen from the C-based "Lemon" parser, from which Lime derives its name. Incidentally, the author of Lemon disclaimed copyright, so you get a copy of the C code that taught me LALR(1) parsing better than any book, despite the obvious difficulties in understanding it. Oh, and one other thing: symbols are terminal if the scanner feeds them to the parser. They are non-terminal if they appear on the left side of a production rule. Lime names semantic categories using strings instead of the numbers that BISON-based parsers use, so you don't have to declare any list of terminal symbols anywhere.

  3. Look at the file lime/lime.php to see what pragmas are defined. To be more specific, look at the method lime::pragma(), which, at the time of this writing, supports "%left", "%right", "%nonassoc", "%start", and "%class". The first three set operator precedence and associativity. The last two declare the start symbol and the name of the generated PHP class that holds all the bottom-up parsing tables.

  4. Write a grammar file.

  5. /path/to/lime/lime.php list-of-grammar-files > my_parser.php

  6. Read the function parse_lime_grammar() in lime.php to understand how to integrate your parser into your program.

  7. Integrate your parser as follows:

     require 'lime/parse_engine.php';
     require 'my_parser.php';
     // Later:
     $parser = new parse_engine(new my_parser());
     // And still later:
     try {
     	while (..something..) {
     		$parser->eat($type, $val);
     		// You figure out how to get the parameters.
     	}
     	// And after the last token has been eaten:
     	$parser->eat_eof();
     } catch (parse_error $e) {
     	// Report or recover from the syntax error here.
     }
     return $parser->semantic;

  8. You now have the computed semantic value of whatever you parsed. Add salt and pepper to taste, and serve.
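
Step 7 leaves it to you to produce the ($type, $val) pairs passed to eat(). As a rough illustration only, here is a minimal hand-written tokenizer for arithmetic expressions. The function name tokenize() and the token type names ('num' and the operator characters) are assumptions made up for this sketch, not part of Lime; the types must match whatever terminal names your own grammar uses.

```php
<?php
// Hypothetical helper, not part of Lime: split an arithmetic expression
// into [type, value] pairs suitable for feeding to a parser one by one.
function tokenize(string $input): array
{
    $tokens = [];
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        // Skip whitespace between tokens. \G anchors the match at $offset.
        if (preg_match('/\G\s+/', $input, $m, 0, $offset)) {
            $offset += strlen($m[0]);
            continue;
        }
        if (preg_match('/\G\d+/', $input, $m, 0, $offset)) {
            // Numbers carry their integer value; 'num' is an assumed type name.
            $tokens[] = ['num', (int) $m[0]];
        } elseif (preg_match('/\G[-+*\/()]/', $input, $m, 0, $offset)) {
            // Single-character operators use the character itself as the type.
            $tokens[] = [$m[0], $m[0]];
        } else {
            throw new InvalidArgumentException(
                "Unexpected character at offset $offset"
            );
        }
        $offset += strlen($m[0]);
    }
    return $tokens;
}

// Intended use inside the loop from step 7 (assumes $parser already exists):
//     foreach (tokenize($source) as [$type, $val]) {
//         $parser->eat($type, $val);
//     }
```

For anything beyond a toy language you will probably want a generated scanner instead; flex_token_stream.php in this repository appears to serve that purpose.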