Erlang-style comments, starting with '%', are allowed in grammar files.
Each declaration or rule ends with a dot (the character '.').
The grammar starts with an optional header section. The header is placed first in the generated file, before the module declaration. Its purpose is to make the documentation generated by EDoc look nicer. Each header line should be enclosed in double quotes; newlines are inserted between the lines. For example:
Header "%% Copyright (C)"
"%% @private"
"%% @Author John".
Next comes a declaration of the nonterminal categories to be used in the rules. For example:
Nonterminals sentence nounphrase verbphrase.
A nonterminal category can be used on the left-hand side (= lhs, or head) of a grammar rule. It can also appear on the right-hand side of rules.
Next comes a declaration of the terminal categories, which are the categories of tokens produced by the scanner. For example:
Terminals article adjective noun verb.
Terminal categories may only appear in the right hand sides (= rhs) of grammar rules.
Next comes a declaration of the rootsymbol, or start category of the grammar. For example:
Rootsymbol sentence.
This symbol should appear in the lhs of at least one grammar rule. It is the most general syntactic category, the one into which the parser ultimately parses every input string.
After the rootsymbol declaration comes an optional declaration of the end_of_input symbol that your scanner is expected to use. For example:
Endsymbol '$end'.
Next come one or more declarations of operator precedences, if needed. These are used to resolve shift/reduce conflicts (see yacc documentation).
Examples of operator declarations:
Right 100 '='.
Nonassoc 200 '==' '=/='.
Left 300 '+'.
Left 400 '*'.
Unary 500 '-'.
These declarations mean that '=' is a right associative binary operator with precedence 100; '==' and '=/=' are operators with no associativity and precedence 200; '+' and '*' are left associative binary operators, where '*' (precedence 400) takes precedence over '+' (precedence 300), which is the normal case; and '-' is a unary operator with precedence 500, higher than that of '*'. Because '==' has no associativity, an expression like a == b == c is considered a syntax error.
Certain rules are assigned precedence: each rule gets its precedence from the last terminal symbol mentioned in the right hand side of the rule. It is also possible to declare precedence for non-terminals, "one level up". This is practical when an operator is overloaded (see also example 3 below).
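As a minimal sketch of declaring precedence for a non-terminal, assume a small expression grammar in which '-' serves as both a binary and a unary operator. The names E, Uminus and number, and the precedence values, are chosen only for illustration:

```erlang
Nonterminals E Uminus.
Terminals '*' '-' number.
Rootsymbol E.

% Binary operators get their precedence from the terminals.
Left 100 '-'.
Left 200 '*'.
% Unary minus gets its precedence from the non-terminal Uminus,
% so it can bind tighter than binary '-' even though both use '-'.
Unary 300 Uminus.

E -> E '-' E.
E -> E '*' E.
E -> number.
E -> Uminus.

Uminus -> '-' E.
```

Under these assumptions, the rule Uminus -> '-' E would otherwise inherit precedence 100 from the terminal '-'; declaring precedence on Uminus is what lifts it above '*'.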
Next come the grammar rules. Each rule has the general form
Left_hand_side -> Right_hand_side : Associated_code.
The left hand side is a non-terminal category. The right hand side is a sequence of one or more non-terminal or terminal symbols with spaces between. The associated code is a sequence of zero or more Erlang expressions (with commas ',' as separators). If the associated code is empty, the separating colon ':' is also omitted. A final dot marks the end of the rule.
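The rule form can be sketched with the categories declared in the earlier examples. The tuple shapes built in the associated code are illustrative only, and the pseudo-variables '$1', '$2', ... (which yecc uses to refer to the values of the first, second, ... rhs symbols) are not described in this section:

```erlang
% Rules with associated code, separated from the rhs by ':'.
sentence -> nounphrase verbphrase : {sentence, '$1', '$2'}.
nounphrase -> article noun : {nounphrase, '$1', '$2'}.
nounphrase -> article adjective noun : {nounphrase, '$1', '$2', '$3'}.
verbphrase -> verb nounphrase : {verbphrase, '$1', '$2'}.

% A rule with empty associated code: the ':' is omitted as well.
verbphrase -> verb.
```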
Symbols such as '{', '.', etc., have to be enclosed in single quotes when used as terminal or non-terminal symbols in grammar rules. The use of the symbols '$empty', '$end', and '$undefined' should be avoided.
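A small sketch of quoting, assuming a hypothetical grammar for brace-delimited, comma-separated atoms (the category names tuple, elements and atom are illustrative). The punctuation symbols are quoted both where they are declared and where they are used:

```erlang
Nonterminals tuple elements.
Terminals '{' '}' ',' atom.
Rootsymbol tuple.

tuple -> '{' '}' : {tuple, []}.
tuple -> '{' elements '}' : {tuple, '$2'}.
elements -> atom : ['$1'].
elements -> atom ',' elements : ['$1' | '$3'].
```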
The last part of the grammar file is an optional section with Erlang code (= function definitions) which is included 'as is' in the resulting parser file. This section must start with the pseudo declaration, or keywords:
Erlang code.
No syntax rule definitions or other declarations may follow this section. To avoid conflicts with internal variables, do not use variable names beginning with two underscore characters ('__') in the Erlang code in this section, or in the code associated with the individual syntax rules.
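A minimal sketch of a grammar that ends with such a section is shown here. The helper value_of/1 is hypothetical, and the token shape {Category, Location, Value} is an assumption about what the scanner produces:

```erlang
Nonterminals number_list.
Terminals integer.
Rootsymbol number_list.

number_list -> integer : [value_of('$1')].
number_list -> integer number_list : [value_of('$1') | '$2'].

Erlang code.

% Assumes tokens of the form {Category, Location, Value}.
% Variable names here avoid the reserved '__' prefix.
value_of({_Category, _Location, Value}) -> Value.
```

The function is copied unchanged into the generated parser module, so the rules above can call it from their associated code.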
The optional expect declaration can be placed anywhere before the last optional section with Erlang code. It is used for suppressing the warning about conflicts that is ordinarily given if the grammar is ambiguous. An example:
Expect 2.
With this declaration, the warning is given only if the number of shift/reduce conflicts differs from 2, or if there are reduce/reduce conflicts.
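As an illustration, the classic "dangling else" grammar is ambiguous and is generally reported by yacc-style generators as having a single shift/reduce conflict on 'else'. The sketch below assumes that yecc counts it the same way, so the number after Expect should be checked against the generator's actual output. The category names stmt, expr and other are illustrative:

```erlang
Nonterminals stmt.
Terminals 'if' 'then' 'else' expr other.
Rootsymbol stmt.

% Assumed conflict count; adjust to what the generator reports.
Expect 1.

stmt -> 'if' expr 'then' stmt.
stmt -> 'if' expr 'then' stmt 'else' stmt.
stmt -> other.
```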