LRX parser generator for C++

A.M.D.G.


LRX

The LR eXperiment.  Taking LR parsing into the future.  It's the new LRSTAR, with LR(*) parsing,
but it has more to offer.

LRX: LR(*) parser generator v 16.1

  • Creates LR(*) parsers in C++.
  • Creates very fast parsers - 2,500,000 lines/sec.
  • Creates small parsers and no runtime library is required.
  • Creates scalable parsers (the AST grows as needed for large input files). (new)
  • Allows EBNF notation (+,*,?) in grammars.
  • Grammars are completely separate from code.
  • Handles context-sensitivity (e.g. typedef in C).
  • Parsers have a symbol-table builder.
  • Parsers build the AST and traverse it in top-down order, calling AST functions along the way.
  • Includes 6 Visual Studio 2019 projects.

DFA: fast lexer generator v 16.1

  • Creates DFA compressed-matrix lexers in C++.
  • Lexers do keyword recognition - 30,000,000 keywords/sec.

    LR(*) Parsers

    By default, LRX creates an LALR(1) parser. The options change the parser class as follows:

      • /k=2 or more: an LALR(*) parser.
      • /mlr: a minimal LR(1) parser.
      • /mlr together with /k=2 or more: an LR(*) parser.

    The /mlr option activates the Honalee algorithm, which produces an LR(1) parser in some cases, but not all. A better LR(1) algorithm is planned for a future LRX release. In the meantime, option /k=2 provides a more powerful parser than any LR(1) parser.
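
    The option combinations above can be summarized as a small lookup. The following is only an illustrative sketch: parser_class is a hypothetical helper, not part of LRX, and k = 1 is assumed here to stand for "no /k option given".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LRX): names the parser class that the
// text associates with each combination of the /mlr and /k options.
std::string parser_class(bool mlr, int k) {
    assert(k >= 1);  // assumption: k = 1 models the default (no /k option)
    if (mlr) {
        return k >= 2 ? "LR(*)" : "minimal LR(1)";
    }
    return k >= 2 ? "LALR(*)" : "LALR(1)";
}
```

    For example, parser_class(true, 2) corresponds to running LRX with both /mlr and /k=2.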

    LR(*) Parser Speed and Size

    An ANTLR parser built with the C++ target takes 10.0 seconds to process a 227,000-line C source file, whereas the LRX parser takes 0.10 seconds. The LRX parser reads 44 MB per second and is one tenth the size of the ANTLR parser.

    Context Sensitivity

    The "typedef" declaration in the C and C++ languages is a context-sensitive issue: whether a name is a type or an ordinary identifier depends on earlier declarations. This cannot be solved by upgrading from LALR(1) to LR(1) or LR(k). The parser has to consult a symbol table while parsing. LRX has this feature built into the grammar notation.
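
    The idea of consulting a symbol table while parsing can be sketched in a few lines of C++. This is a minimal illustration, not LRX's generated code or its grammar notation; SymbolTable, Token, and classify are hypothetical names chosen for this example.

```cpp
#include <string>
#include <unordered_set>

// Token kinds a C parser needs to tell apart for the same spelling.
enum class Token { Identifier, TypeName };

// Hypothetical symbol table: remembers names introduced by typedef.
class SymbolTable {
public:
    void declare_typedef(const std::string& name) { typedef_names_.insert(name); }
    bool is_typedef(const std::string& name) const {
        return typedef_names_.count(name) != 0;
    }

private:
    std::unordered_set<std::string> typedef_names_;
};

// Classify a name using what the parser has seen so far, not just the
// characters of the name itself.
Token classify(const SymbolTable& table, const std::string& name) {
    return table.is_typedef(name) ? Token::TypeName : Token::Identifier;
}
```

    With this in place, the C statement "T * x;" can be read as a pointer declaration once "typedef int T;" has been seen, and as a multiplication otherwise. No amount of extra lookahead makes that distinction, which is why a symbol table is needed.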

    DFA Lexical Analyzers

    A DFA lexer is a deterministic finite automaton. A DFA is among the fastest recognition structures, since it performs one state transition per input character. DFAs work fine for most programming languages. Our tests showed that DFA lexers are almost twice the speed of Flex lexers and nearly as small. The DFA lexer generator uses a compressed-matrix structure to achieve this performance.
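
    A table-driven DFA with a compressed transition matrix might look like the sketch below. It assumes one common form of compression, mapping characters to a few equivalence classes so the matrix has three columns instead of 256; the compression scheme that the DFA generator actually uses is not described here. All names (CharClass, State, kTransition, scan) are made up for the example.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

enum CharClass { Letter, Digit, Other, NumClasses };
enum State { Start, InIdent, InNumber, Dead, NumStates };

// Column compression: every possible byte maps to one of three columns.
CharClass class_of(unsigned char c) {
    if (std::isalpha(c) || c == '_') return Letter;
    if (std::isdigit(c)) return Digit;
    return Other;
}

// Transition matrix indexed as kTransition[state][character class].
constexpr State kTransition[NumStates][NumClasses] = {
    /* Start    */ {InIdent, InNumber, Dead},
    /* InIdent  */ {InIdent, InIdent,  Dead},
    /* InNumber */ {Dead,    InNumber, Dead},
    /* Dead     */ {Dead,    Dead,     Dead},
};

// Scans the longest identifier or integer at the start of text.
// Returns its length and stores the final state in kind (Dead if none).
std::size_t scan(const std::string& text, State& kind) {
    State state = Start;
    std::size_t length = 0;
    for (unsigned char c : text) {
        State next = kTransition[state][class_of(c)];
        if (next == Dead) break;  // no transition: the token ends here
        state = next;
        ++length;
    }
    kind = (length > 0) ? state : Dead;
    return length;
}
```

    Each input character costs one class lookup and one matrix lookup, regardless of how many token types the lexer knows about.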

    Keyword Recognition

    The keywords of your language will be found in your parser grammar and automatically moved into the lexer grammar, which makes keyword recognition as fast as possible. Whether you have 10,000 keywords or 1, the speed is still the same. Keyword and <identifier> recognition occurs simultaneously. See the lexer state machine for a better understanding of this.
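
    One way to picture simultaneous keyword and <identifier> recognition is a scanner that walks a keyword trie while it consumes identifier characters. The sketch below is an assumption-laden illustration, not the state machine LRX generates: KeywordScanner and its members are hypothetical, and a generated lexer would use transition tables rather than std::map.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

enum class Kind { None, Identifier, Keyword };

// Hypothetical scanner: keywords live in a trie that is walked in the same
// pass that recognizes identifiers.
class KeywordScanner {
public:
    KeywordScanner() : nodes_(1) {}  // node 0 is the trie root

    void add_keyword(const std::string& word) {
        int node = 0;
        for (char c : word) {
            auto it = nodes_[node].next.find(c);
            if (it != nodes_[node].next.end()) {
                node = it->second;
                continue;
            }
            nodes_.push_back(Node{});
            int fresh = static_cast<int>(nodes_.size()) - 1;
            nodes_[node].next[c] = fresh;
            node = fresh;
        }
        nodes_[node].accepts_keyword = true;
    }

    // Scans one identifier-shaped token at the start of text.
    Kind scan(const std::string& text, std::size_t& length) const {
        int node = 0;  // becomes -1 once the text leaves every keyword path
        length = 0;
        for (unsigned char c : text) {
            if (!(std::isalnum(c) || c == '_')) break;
            if (length == 0 && std::isdigit(c)) break;  // cannot start with a digit
            if (node >= 0) {
                auto it = nodes_[node].next.find(static_cast<char>(c));
                node = (it == nodes_[node].next.end()) ? -1 : it->second;
            }
            ++length;
        }
        if (length == 0) return Kind::None;
        bool is_keyword = node >= 0 && nodes_[node].accepts_keyword;
        return is_keyword ? Kind::Keyword : Kind::Identifier;
    }

private:
    struct Node {
        std::map<char, int> next;
        bool accepts_keyword = false;
    };
    std::vector<Node> nodes_;
};
```

    The work per character is a single transition lookup, so scanning time depends on the length of the token and not on how many keywords were added.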

    (c) Copyright Parser Generator Guy 2020.  All rights reserved.