[/==============================================================================
    Copyright (C) 2001-2015 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section Introduction]

Boost Spirit X3 is an object-oriented, recursive-descent parser library for
C++. It allows you to write grammars using a format similar to
Extended Backus-Naur Form (EBNF)[footnote
[@http://www.cl.cam.ac.uk/%7Emgk25/iso-14977.pdf ISO-EBNF]] directly in C++.
These inline grammar specifications can mix freely with other C++ code and,
thanks to the generative power of C++ templates, are immediately executable.
In contrast, conventional compiler-compilers or parser-generators have to
perform an additional translation step from the source EBNF code to C or C++
code.

Since the target input grammars are written entirely in C++, we do not need any
separate tools to compile, preprocess, or integrate them into the build process.
__spirit__ allows seamless integration of the parsing process with other C++
code. This often results in simpler and more efficient code.

The created parsers are fully attributed, which allows you to easily build and
handle hierarchical data structures in memory. These data structures resemble
the structure of the input data and can be used directly to generate
arbitrarily formatted output.

A simple EBNF grammar snippet:

    group       ::= '(' expression ')'
    factor      ::= integer | group
    term        ::= factor (('*' factor) | ('/' factor))*
    expression  ::= term (('+' term) | ('-' term))*

is approximated using facilities of Spirit's /X3/ as seen in this code snippet:

    group       = '(' >> expression >> ')';
    factor      = integer | group;
    term        = factor >> *(('*' >> factor) | ('/' >> factor));
    expression  = term >> *(('+' >> term) | ('-' >> term));

Through the magic of expression templates, this is perfectly valid and
executable C++ code. The production rule `expression` is, in fact, an object
that has a member function `parse` that does the work when given input written
in the language described by the grammar that we have just declared. Yes, it's
a calculator. We shall simplify for now by skipping the type declarations and
the definition of the rule `integer` invoked by `factor`. Now, the production
rule `expression` in our grammar specification, traditionally called the
`start` symbol, can recognize inputs such as:

    12345
    -12345
    +12345
    1 + 2
    1 * 2
    1/2 + 3/4
    1 + 2 + 3 + 4
    1 * 2 * 3 * 4
    (1 + 2) * (3 + 4)
    (-1 + 2) * (3 + -4)
    1 + ((6 * 200) - 20) / 6
    (1 + (2 + (3 + (4 + 5))))

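To make the notion of "recognizing" such inputs concrete, here is a minimal
hand-written sketch of a recursive-descent recognizer for the same four rules.
It is plain standard C++ and does not use X3 at all, nor does it reflect how X3
is implemented. The class `Recognizer`, the helper `recognizes`, the skipping
of whitespace between tokens, and the definition of `integer` as an optionally
signed run of digits are all assumptions made for this illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical hand-written recognizer; each member function mirrors one
// grammar rule. Illustration only: this is not X3 code.
class Recognizer {
public:
    explicit Recognizer(const std::string& text) : s_(text), pos_(0) {}

    // Succeeds only if the whole input is exactly one `expression`.
    bool match() {
        bool ok = expression();
        skip_spaces();
        return ok && pos_ == s_.size();
    }

private:
    void skip_spaces() {
        while (pos_ < s_.size() &&
               std::isspace(static_cast<unsigned char>(s_[pos_])))
            ++pos_;
    }

    // Consumes the character c if it comes next (after optional whitespace).
    bool lit(char c) {
        skip_spaces();
        if (pos_ < s_.size() && s_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    // integer: optional sign followed by one or more digits (assumed).
    bool integer() {
        skip_spaces();
        std::size_t p = pos_;
        if (p < s_.size() && (s_[p] == '+' || s_[p] == '-')) ++p;
        std::size_t first_digit = p;
        while (p < s_.size() &&
               std::isdigit(static_cast<unsigned char>(s_[p])))
            ++p;
        if (p == first_digit) return false;  // no digits found
        pos_ = p;
        return true;
    }

    // group ::= '(' expression ')'
    bool group() {
        std::size_t save = pos_;
        if (lit('(') && expression() && lit(')')) return true;
        pos_ = save;  // backtrack
        return false;
    }

    // factor ::= integer | group
    bool factor() { return integer() || group(); }

    // term ::= factor (('*' factor) | ('/' factor))*
    bool term() {
        if (!factor()) return false;
        for (;;) {
            std::size_t save = pos_;
            if ((lit('*') || lit('/')) && factor()) continue;
            pos_ = save;  // undo a partial match and stop repeating
            return true;
        }
    }

    // expression ::= term (('+' term) | ('-' term))*
    bool expression() {
        if (!term()) return false;
        for (;;) {
            std::size_t save = pos_;
            if ((lit('+') || lit('-')) && term()) continue;
            pos_ = save;
            return true;
        }
    }

    std::string s_;
    std::size_t pos_;
};

// Convenience wrapper (hypothetical name).
bool recognizes(const std::string& text) { return Recognizer(text).match(); }
```

With these definitions, `recognizes("(1 + 2) * (3 + 4)")` returns `true`, while
an incomplete input such as `recognizes("1 +")` returns `false`.
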
Certainly, we have modified the original EBNF syntax. This is done to
conform to C++ syntax rules. Most notably, we see an abundance of shift `>>`
operators. Since there are no juxtaposition operators in C++, it is simply not
possible to write something like:

    a b

as seen in math syntax, for example, to mean multiplication or, in our case,
as seen in EBNF syntax to mean sequencing (`b` should follow `a`). __x3__
uses the shift `>>` operator instead for this purpose. We take the `>>`
operator, with arrows pointing to the right, to mean "is followed by". Thus we
write:

    a >> b

The alternative operator `|` and the parentheses `()` remain as is. The
assignment operator `=` is used in place of EBNF's `::=`. Last but not least,
the Kleene star `*`, which is a postfix operator in EBNF, becomes a prefix
operator. Instead of:

    a* //... in EBNF syntax,

we write:

    *a //... in Spirit.

since there are no postfix stars, `*`, in C/C++. Finally, we terminate each
rule with the ubiquitous semicolon, `;`.

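The operator mapping described above works because `>>`, `|`, and unary `*`
are ordinary overloadable C++ operators. The following toy sketch shows the
idea in plain standard C++. It is not X3 and does not use expression templates;
it relies on `std::function` for brevity. The names `Parser`, `ch`, and
`matches_all` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Toy parser type (hypothetical, not part of X3): a callable that tries to
// match at position `pos` and advances `pos` on success.
struct Parser {
    std::function<bool(const std::string&, std::size_t&)> run;
};

// Matches a single literal character.
Parser ch(char c) {
    return Parser{[c](const std::string& s, std::size_t& pos) {
        if (pos < s.size() && s[pos] == c) { ++pos; return true; }
        return false;
    }};
}

// Sequence: `a >> b` reads "a is followed by b".
Parser operator>>(Parser a, Parser b) {
    return Parser{[a, b](const std::string& s, std::size_t& pos) {
        std::size_t save = pos;
        if (a.run(s, pos) && b.run(s, pos)) return true;
        pos = save;  // backtrack if either part fails
        return false;
    }};
}

// Alternative: `a | b` tries a first, then b.
Parser operator|(Parser a, Parser b) {
    return Parser{[a, b](const std::string& s, std::size_t& pos) {
        return a.run(s, pos) || b.run(s, pos);
    }};
}

// Kleene star: prefix `*a` matches a zero or more times.
Parser operator*(Parser a) {
    return Parser{[a](const std::string& s, std::size_t& pos) {
        std::size_t before;
        // Stop when a fails or succeeds without consuming any input.
        do { before = pos; } while (a.run(s, pos) && pos > before);
        return true;
    }};
}

// True if p matches and consumes the whole of text.
bool matches_all(const Parser& p, const std::string& text) {
    std::size_t pos = 0;
    return p.run(text, pos) && pos == text.size();
}
```

For example, `matches_all(ch('a') >> *(ch('b') | ch('c')), "abcb")` returns
`true`. Note that the usual C++ precedence still applies: unary `*` binds
tighter than `>>`, which in turn binds tighter than `|`.
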
[endsect]