Saturday, April 7, 2012

EBNF - Extended Backus Naur Form

This is the next step on my way to writing a parser. As mentioned in my previous post, I thought I needed a parser, then realised I didn't, and now think I might like to re-learn the recursive descent parser because it might come in handy. It probably won't, but it's fun. Link to my post on BNF.

Why EBNF

EBNF makes BNF easier to understand. (Debatable)

The trouble with EBNF, and BNF, is that there are quite a few ways to write it. ISO has a notation (ISO/IEC 14977) and its final draft can be found here. It is a good read and worth it if you're interested. You can skip the Foreword and Introduction and jump to page 1 if you're short of patience.

EBNF adds regular-expression-style operators, such as repetition and optional items, that can make the specification more compact.

Everything in EBNF can be represented in BNF.
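
As a rough illustration of that claim, the Python sketch below rewrites the two EBNF conveniences, optional items and repetition, into plain BNF by introducing helper rules. The tuple encoding ("seq", "alt", "opt", "rep"), the function name desugar and the generated rule names such as <opt1> are all made up for this example; they are not part of any standard.

```python
from itertools import count


def desugar(grammar):
    """Turn a tiny EBNF-like grammar into plain BNF.

    grammar maps a rule name to an expression. An expression is either a
    symbol (a str) or a tuple: ("seq", ...), ("alt", ...), ("opt", x), ("rep", x).
    The result maps each rule name to a list of alternatives, where every
    alternative is a list of symbols and [] stands for the empty alternative.
    """
    bnf = {}
    counter = count(1)

    def fresh(kind):
        # Hypothetical naming scheme for the helper rules.
        return f"<{kind}{next(counter)}>"

    def alternatives(expr):
        if isinstance(expr, str):
            return [[expr]]
        tag, *parts = expr
        if tag == "alt":
            return [alt for part in parts for alt in alternatives(part)]
        if tag == "seq":
            result = [[]]
            for part in parts:
                result = [left + right
                          for left in result
                          for right in alternatives(part)]
            return result
        if tag == "opt":
            # [ x ]  becomes  helper ::= x | (empty)
            name = fresh("opt")
            bnf[name] = alternatives(parts[0]) + [[]]
            return [[name]]
        if tag == "rep":
            # { x }  becomes  helper ::= x helper | (empty)
            name = fresh("rep")
            bnf[name] = [alt + [name] for alt in alternatives(parts[0])] + [[]]
            return [[name]]
        raise ValueError(f"unknown expression tag: {tag}")

    for rule_name, expr in grammar.items():
        bnf[rule_name] = alternatives(expr)
    return bnf


# number = [ '+' | '-' ], digit, { digit } ;
example = {
    "number": ("seq", ("opt", ("alt", "'+'", "'-'")), "digit", ("rep", "digit")),
}
for rule_name, alts in desugar(example).items():
    print(rule_name, "::=", " | ".join(" ".join(alt) or "(empty)" for alt in alts))
```

The optional sign turns into a helper rule with an empty alternative, and the repetition turns into a right-recursive helper rule, which is the usual way of writing "zero or more" in BNF.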


Characters representing operators

These are listed in order of precedence, highest first.

*    repetition symbol
-    except symbol
,    concatenate symbol
|    definition separator symbol
=    defining symbol
;    terminator symbol

The specification gives alternative characters also, e.g. Table 2 in the ISO spec gives alternative terminal characters.

Precedence can be overridden using the following brackets.

'    first quote symbol       first quote symbol          '
"    second quote symbol      second quote symbol         "
(*   start comment symbol     end comment symbol         *)
(    start group symbol       end group symbol            )
[    start option symbol      end option symbol           ]
{    start repeat symbol      end repeat symbol           }
?    special sequence symbol  special sequence symbol     ?


Examples

See here for the C language in EBNF.



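Another example, this time in a regex-style dialect (suffix operators, rules ending in a full stop) rather than the ISO notation: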

number       = digit+ .
identifier   = letter (letter | digit)* .
functioncall = functionname "(" parameterlist? ")" .
+   - repeat 1 or more times
*   - repeat 0 or more times
?   - optional item
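
Because +, * and ? mean the same thing in regular expressions, the three rules above can be tried out directly with Python's re module. This is only a sketch: the post does not define letter, functionname or parameterlist, so the definitions used for them here (ASCII letters, an identifier, and a comma-separated list of identifiers or numbers) are my assumptions.

```python
import re

# Assumed character classes; the post does not define digit or letter here.
DIGIT = r"[0-9]"
LETTER = r"[A-Za-z]"

# number = digit+ .
NUMBER = rf"{DIGIT}+"

# identifier = letter (letter | digit)* .
IDENTIFIER = rf"{LETTER}(?:{LETTER}|{DIGIT})*"

# Assumptions: a function name is an identifier, and a parameter list is a
# comma-separated list of identifiers or numbers without spaces.
FUNCTIONNAME = IDENTIFIER
PARAMETER = rf"(?:{IDENTIFIER}|{NUMBER})"
PARAMETERLIST = rf"{PARAMETER}(?:,{PARAMETER})*"

# functioncall = functionname "(" parameterlist? ")" .
FUNCTIONCALL = rf"{FUNCTIONNAME}\((?:{PARAMETERLIST})?\)"


def matches(pattern, text):
    """Return True if the whole text is derived from the given rule."""
    return re.fullmatch(pattern, text) is not None


print(matches(NUMBER, "42"))
print(matches(IDENTIFIER, "x1"))
print(matches(FUNCTIONCALL, "f(a,2)"))
```

This only works because none of the three rules refers to itself. As soon as a grammar is recursive (nested brackets in an expression, say) a plain regular expression is no longer enough and a real parser is needed.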

And another:

Here's some EBNF for a simple calculator (unfinished - I'll do this in another post):
calculator   = line (newline | line)* .
newline      = '\n' | '\r' | '\r\n' .
assignment   = identifier '=' expression .
sum          = 
identifier   = 
expression   = 
number       = ( '+' | '-')? digit+ .
digit        = '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9' .
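
Since the goal is a descent parser, here is a minimal Python sketch of how the two finished rules, number and digit, might turn into hand-written functions, one function per rule. The function names and the convention of returning the position after the match (or None on failure) are my own choices for illustration, not the only way to do it.

```python
def parse_digit(text, pos):
    """digit = '0' | '1' | ... | '9' .

    Return the position just after the digit, or None if there is no digit.
    """
    if pos < len(text) and text[pos] in "0123456789":
        return pos + 1
    return None


def parse_number(text, pos=0):
    """number = ( '+' | '-')? digit+ .

    Return the position just after the number, or None if no number starts
    at pos.
    """
    # ( '+' | '-')? : consume a sign if there is one
    if pos < len(text) and text[pos] in "+-":
        pos += 1

    # digit+ : the first digit is mandatory
    end = parse_digit(text, pos)
    if end is None:
        return None

    # ... and then keep consuming digits for as long as they follow
    while end is not None:
        pos = end
        end = parse_digit(text, pos)
    return pos


print(parse_number("-12"))
print(parse_number("+"))
```

Each operator maps onto a control structure: ? becomes an if, + becomes one mandatory call followed by a loop, and | becomes a choice between alternatives. The unfinished rules would get their own functions in the same style.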
