ANTLR

ANTLR is a lexer and parser generator for processing structured text. As input, it takes a grammar file written in a notation based on extended Backus-Naur form (EBNF), with regex-like patterns defining the tokens. As output, it produces a lexer and parser in a target language (Java, C++, C#, Python, etc.).
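As a rough illustration of such a grammar file, here is a minimal sketch for arithmetic expressions. The grammar name Expr and all rule names are made up for this example and do not come from any particular project.

```antlr
// Expr.g4: a hypothetical combined grammar (lexer and parser rules in one file).
grammar Expr;

// Parser rules start with a lowercase letter.
prog : stat+ EOF ;
stat : expr NEWLINE ;
expr : expr ('*' | '/') expr
     | expr ('+' | '-') expr
     | INT
     | '(' expr ')'
     ;

// Lexer rules start with an uppercase letter and use regex-like patterns.
INT     : [0-9]+ ;
NEWLINE : '\r'? '\n' ;
WS      : [ \t]+ -> skip ;   // discard spaces and tabs
```

Assuming the file is saved as Expr.g4 and the Java target is used, running the ANTLR tool on it generates classes named after the grammar, such as ExprLexer and ExprParser.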

Internally, it uses an LL(*) parsing algorithm; ANTLR 4 uses the adaptive variant, ALL(*).

Output

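ANTLR generates source files that define a lexer class and a parser class. It also generates listener and visitor classes for traversing a parse tree; listeners are generated by default, and visitors when the tool is run with the -visitor option.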

To hook into the generated code, we write our own main file, which takes an input stream, constructs a lexer over it to produce tokens, then constructs a parser over those tokens to produce a parse tree. It then walks the tree with visitors or listeners to take actions. The visitors and listeners we write ourselves inherit from the classes ANTLR generates.

We can also embed target-language code (actions) directly in the grammar by wrapping it in curly braces.
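A minimal sketch of embedded actions follows, assuming the Java target. The grammar name Count, the rule names, and the count field are hypothetical and chosen only for illustration.

```antlr
grammar Count;

// Code inside @members { ... } is copied into the generated parser class.
@members {
    int count = 0;   // Java field; this sketch assumes the Java target
}

// An action in { ... } runs when the parser reaches that point in the rule.
list : (ID { count++; })+ EOF { System.out.println(count + " identifiers"); } ;

ID : [a-zA-Z]+ ;
WS : [ \t\r\n]+ -> skip ;   // discard whitespace
```

Because the action bodies are written in the target language, a grammar with actions like these is tied to that one target. Keeping such logic in listeners or visitors instead leaves the grammar reusable across targets.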

Specification

To specify a context-free grammar, we use EBNF. Note that ANTLR's notation has a few conventions of its own:

Every rule definition must be terminated by a semicolon.
An optional element A (that is, either A or the empty string) is written as A?.
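Both conventions can be seen in a small sketch. The grammar name Decl and its rules are invented for this example.

```antlr
grammar Decl;

// Textbook EBNF often writes an optional part as [ init ],
// or as an alternative with the empty string: init | ε.
// In ANTLR the same idea is written with the ? suffix.
decl : type ID init? EOF ;   // init? matches init or nothing
type : 'int' | 'float' ;
init : '=' INT ;

// Each rule above and below ends with a semicolon.
ID  : [a-z]+ ;
INT : [0-9]+ ;
WS  : [ \t\r\n]+ -> skip ;   // discard whitespace
```

With this grammar, both "int x" and "int x = 5" would be accepted by decl, since the initializer may be absent.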

Resources

The Definitive ANTLR 4 Reference, by Terence Parr

jszhn
