<html>
<head>
<title>The Grammar</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>

<body>
<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
<tr>
<td width="10">
</td>
<td width="85%">
<font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>The Grammar</b></font>
</td>
<td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
</tr>
</table>
<br>
<table border="0">
<tr>
<td width="10"></td>
<td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
<td width="30"><a href="scanner.html"><img src="theme/l_arr.gif" border="0"></a></td>
<td width="30"><a href="subrules.html"><img src="theme/r_arr.gif" border="0"></a></td>
</tr>
</table>
<p>The <b>grammar</b> encapsulates a set of rules. The <tt>grammar</tt> class
  is a protocol base class; it is essentially an interface contract. The <tt>grammar</tt>
  is a class template parameterized by its derived class, <tt>DerivedT</tt>,
  and its context, <tt>ContextT</tt>. The <tt>ContextT</tt> template parameter defaults
  to <tt>parser_context</tt>, a predefined context. </p>
<p>You need not be concerned with the <tt>ContextT</tt> template parameter unless
  you wish to tweak the low-level behavior of the grammar. Detailed information
  on the <tt>ContextT</tt> template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>.
  The <tt>grammar</tt> relies on the template parameter <tt>DerivedT</tt>, a grammar subclass,
  to define the actual rules.</p>
<p>Presented below is the public API. There may actually be more template parameters
  after <tt>ContextT</tt>. Everything after the <tt>ContextT</tt> parameter is strictly
  for internal use and should not be of concern to the client.</p>
<pre><code><font color="#000000"><span class=identifier>    </span><span class=keyword>template</span><span class=special>&lt;
        </span><span class=keyword>typename </span><span class=identifier>DerivedT</span><span class=special>,
        </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special>&lt;</span><span class=special>&gt; &gt;
    </span><span class=keyword>struct </span><span class=identifier>grammar</span><span class=special>;</span></font></code></pre>
<h2>Grammar definition</h2>
<p>A concrete sub-class inheriting from <tt>grammar</tt> is expected to have a
  nested template class (or struct) named <tt>definition</tt>:</p>
<blockquote>
  <p><img src="theme/bullet.gif" width="13" height="13"> It is a nested template
    class with a typename <tt>ScannerT</tt> parameter.<br>
    <img src="theme/bullet.gif" width="13" height="13"> Its constructor defines
    the grammar rules.<br>
    <img src="theme/bullet.gif" width="13" height="13"> Its constructor is passed
    in a reference to the actual grammar <tt>self</tt>.<br>
    <img src="theme/bullet.gif" width="13" height="13"> It has a member function
    named <tt>start</tt> that returns a reference to the start <tt>rule</tt>.</p>
</blockquote>
<h2>Grammar skeleton</h2>
<pre><code><font color="#000000"><span class=special>    </span><span class=keyword>struct </span><span class=identifier>my_grammar </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special>&lt;</span><span class=identifier>my_grammar</span><span class=special>&gt;
    </span><span class=special>{
        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
        </span><span class=keyword>struct </span><span class=identifier>definition
        </span><span class=special>{
            </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=identifier>r</span><span class=special>;
            </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>my_grammar </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>self</span><span class=special>) </span><span class=special>{ </span><span class=identifier>r </span><span class=special>= </span><span class=comment>/*..define here..*/</span><span class=special>; </span><span class=special>}
            </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>r</span><span class=special>; </span><span class=special>}
        </span><span class=special>};
    </span><span class=special>};</span></font></code></pre>
<p>Decoupling the scanner type from the rules that form a grammar allows the grammar
  to be used in different contexts, possibly with different scanners. We do not
  care what scanner we are dealing with. The user-defined <tt>my_grammar</tt>
  can be used with <b>any</b> type of scanner. Unlike the rule, the grammar is
  not tied to a specific scanner type. See <a href="faq.html#scanner_business">"Scanner
  Business"</a> to see why this is important and to gain a further understanding
  of this scanner-rule coupling problem.</p>
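<p>The mechanism can be illustrated outside of Spirit. The following is a minimal,
  self-contained sketch and not Spirit's actual implementation: <tt>grammar_sketch</tt>,
  <tt>rule_sketch</tt>, <tt>char_scanner</tt>, <tt>word_scanner</tt> and <tt>non_empty</tt>
  are hypothetical names invented for this illustration. It shows how a base class
  parameterized by its derived class can instantiate the nested <tt>definition</tt>
  for whatever scanner type it is handed, so that one grammar object works with
  unrelated scanners:</p>

```cpp
#include <string>

// Hypothetical stand-ins for two unrelated scanner types.
struct char_scanner { std::string input; };
struct word_scanner { std::string input; };

// A rule is tied to one scanner type, just like rule<ScannerT>.
template <typename ScannerT>
struct rule_sketch
{
    rule_sketch() : match(0) {}
    bool operator()(ScannerT const& scan) const { return match != 0 && match(scan); }
    bool (*match)(ScannerT const&);
};

// Trivial "parser" used by the sketch: succeeds on any non-empty input.
template <typename ScannerT>
bool non_empty(ScannerT const& scan) { return !scan.input.empty(); }

// Simplified base class, parameterized by its derived class.
template <typename DerivedT>
struct grammar_sketch
{
    template <typename ScannerT>
    bool parse(ScannerT const& scan) const
    {
        DerivedT const& self = static_cast<DerivedT const&>(*this);
        // The definition is instantiated on demand for this scanner type.
        typename DerivedT::template definition<ScannerT> def(self);
        return def.start()(scan);
    }
};

struct my_grammar_sketch : grammar_sketch<my_grammar_sketch>
{
    template <typename ScannerT>
    struct definition
    {
        rule_sketch<ScannerT> r;
        definition(my_grammar_sketch const& /*self*/) { r.match = &non_empty<ScannerT>; }
        rule_sketch<ScannerT> const& start() const { return r; }
    };
};
```

<p>Calling <tt>parse()</tt> on one <tt>my_grammar_sketch</tt> object, first with
  a <tt>char_scanner</tt> and then with a <tt>word_scanner</tt>, instantiates two
  separate definitions from the same grammar class.</p>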
<h2>Instantiating and using my_grammar</h2>
<p>Our grammar above may be instantiated and put into action:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>my_grammar </span><span class=identifier>g</span><span class=special>;

    </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=identifier>first</span><span class=special>, </span><span class=identifier>last</span><span class=special>, </span><span class=identifier>g</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>).</span><span class=identifier>full</span><span class=special>)
        </span><span class=identifier>cout </span><span class=special>&lt;&lt; </span><span class=string>"parsing succeeded\n"</span><span class=special>;
    </span><span class=keyword>else
        </span><span class=identifier>cout </span><span class=special>&lt;&lt; </span><span class=string>"parsing failed\n"</span><span class=special>;</span></font></code></pre>
<p><tt>my_grammar</tt> <b>IS-A </b>parser and can be used anywhere a parser is
  expected, even referenced by another rule:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>g </span><span class=special>&gt;&gt; </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"cool huh?"</span><span class=special>);</span></font></code></pre>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"><img src="theme/alert.gif" width="16" height="16"> <b>Referencing
      grammars<br>
      </b><br>
      Like the rule, the grammar is also held by reference when it is placed on
      the right-hand side of an EBNF expression. It is the responsibility of the
      client to ensure that the referenced grammar stays in scope and is not
      destroyed while it is being referenced. </td>
  </tr>
</table>
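<p>What "held by reference" means can be seen in a small stand-alone sketch. This
  is not Spirit code: <tt>counting_parser</tt> and <tt>by_reference</tt> are hypothetical
  types made up for the illustration. The composite stores a reference to its operand
  rather than a copy, so every call made through the composite lands on the original
  object, which therefore has to outlive the composite:</p>

```cpp
// Hypothetical parser that counts how often it is invoked.
struct counting_parser
{
    counting_parser() : calls(0) {}
    bool parse() const { ++calls; return true; }
    mutable int calls;
};

// Hypothetical composite that stores its operand by reference, not by value.
template <typename ParserT>
struct by_reference
{
    explicit by_reference(ParserT const& p) : ref(p) {}
    bool parse() const { return ref.parse(); }   // forwards to the original object
    ParserT const& ref;
};
```

<p>If the <tt>counting_parser</tt> went out of scope before the <tt>by_reference</tt>
  object was last used, <tt>ref</tt> would dangle. The same hazard applies to a grammar
  referenced from a longer-lived rule.</p>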
<h2><a name="full_grammar"></a>Full Grammar Example</h2>
<p>Recalling our original calculator example, here it is now rewritten using a
  grammar:</p>
<pre><code><font color="#000000"><span class=special>    </span><span class=keyword>struct </span><span class=identifier>calculator </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special>&lt;</span><span class=identifier>calculator</span><span class=special>&gt;
    </span><span class=special>{
        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
        </span><span class=keyword>struct </span><span class=identifier>definition
        </span><span class=special>{
            </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>self</span><span class=special>)
            </span><span class=special>{
                </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>&gt;&gt; </span><span class=identifier>expression </span><span class=special>&gt;&gt; </span><span class=literal>')'</span><span class=special>;
                </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>;
                </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>&gt;&gt; </span><span class=special>*((</span><span class=literal>'*' </span><span class=special>&gt;&gt; </span><span class=identifier>factor</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'/' </span><span class=special>&gt;&gt; </span><span class=identifier>factor</span><span class=special>));
                </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>&gt;&gt; </span><span class=special>*((</span><span class=literal>'+' </span><span class=special>&gt;&gt; </span><span class=identifier>term</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'-' </span><span class=special>&gt;&gt; </span><span class=identifier>term</span><span class=special>));
            </span><span class=special>}

            </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=identifier>expression</span><span class=special>, </span><span class=identifier>term</span><span class=special>, </span><span class=identifier>factor</span><span class=special>, </span><span class=identifier>group</span><span class=special>;

            </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=keyword>const</span><span class=special>&amp;
            </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>expression</span><span class=special>; </span><span class=special>}
        </span><span class=special>};
    </span><span class=special>};</span></font></code></pre>
<p><img src="theme/lens.gif" width="15" height="16"> A fully working example with
  <a href="semantic_actions.html">semantic actions</a> can be <a href="../example/fundamental/calc_plain.cpp">viewed
  here</a>. This is part of the Spirit distribution. </p>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>self</b><br>
      <br>
      You might notice that the definition of the grammar has a constructor that
      accepts a const reference to the outer grammar. In the example above, notice
      that <tt>calculator::definition</tt> takes in a <tt>calculator const&amp;
      self</tt>. While this is unused in the example above, it is very useful in
      many cases. The <tt>self</tt> argument is the definition's window to the outside
      world. For example, the calculator class might have a reference to some
      state information that the definition can update, through
      <a href="semantic_actions.html">semantic actions</a>, while parsing proceeds. </td>
  </tr>
</table>
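<p>The pattern described in the note can be sketched without Spirit. Everything
  in the block below is hypothetical (<tt>digit_counter_sketch</tt>, its <tt>count</tt>
  member and the hand-written <tt>parse()</tt> loop are assumptions made for the
  illustration, and the nested <tt>definition</tt> is not a template here, for brevity).
  The outer object holds a <i>reference</i> to state owned by the caller. Because
  a reference member can be written through even when the enclosing object is const,
  the definition can update that state although it only receives a const reference
  to <tt>self</tt>:</p>

```cpp
#include <cctype>
#include <cstddef>
#include <string>

struct digit_counter_sketch
{
    explicit digit_counter_sketch(int& count_) : count(count_) {}

    struct definition
    {
        explicit definition(digit_counter_sketch const& self_) : self(self_) {}

        // Plays the role of a rule with a semantic action attached:
        // every digit consumed bumps the counter reached through self.
        bool parse(std::string const& input) const
        {
            for (std::size_t i = 0; i < input.size(); ++i)
            {
                if (!std::isdigit(static_cast<unsigned char>(input[i])))
                    return false;
                ++self.count;   // allowed: constness does not extend to the referent
            }
            return !input.empty();
        }

        digit_counter_sketch const& self;
    };

    int& count;   // refers to state owned by the caller
};
```

<p>After a definition built from a <tt>digit_counter_sketch</tt> has processed
  <tt>"123"</tt>, the caller's own <tt>int</tt> holds the number of digits seen.</p>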
<h2>Grammar Capsules</h2>
<p>As a grammar becomes complicated, it is a good idea to group parts into logical
  modules. For instance, when writing a language, it might be wise to put expressions
  and statements into separate grammar capsules. The grammar takes advantage of
  the encapsulation properties of C++ classes. The declarative nature of classes
  makes them a perfect fit for the definition of grammars. Since the grammar is
  nothing more than a class declaration, we can conveniently publish it in header
  files. The idea is that once written and fully tested, a grammar can be reused
  in many contexts. We now have the notion of grammar libraries.</p>
<h2><a name="multithreading"></a>Reentrancy and multithreading</h2>
<p>An instance of a grammar may be used in different places multiple times without
  any problem. The implementation is tuned to allow this at the expense of some
  overhead. However, we can save considerable cycles and bytes if we are certain
  that a grammar will only have a single instance. If this is desired, simply
  define <tt>BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE</tt> before including any Spirit
  header files.</p>
<pre><font face="Courier New, Courier, mono"><code><span class="preprocessor">    #define</span></code></font><span class="preprocessor"><code><font face="Courier New, Courier, mono"> </font><tt>BOOST_SPIRIT_SINGLE_GRAMMAR_INSTANCE</tt></code></span></pre>
<p> On the other hand, if a grammar is intended to be used in multithreaded code,
  we should then define <tt>BOOST_SPIRIT_THREADSAFE</tt> before including any
  Spirit header files. In this case it will also be required to link against <a href="http://www.boost.org/libs/thread/doc/index.html">Boost.Threads</a>.</p>
<pre><font face="Courier New, Courier, mono"><span class="preprocessor">    #define</span></font> <span class="preprocessor"><tt>BOOST_SPIRIT_THREADSAFE</tt></span></pre>
<h2>Using more than one grammar start rule </h2>
<p>Sometimes it is desirable to have more than one visible entry point to a grammar
  (apart from the start rule). To allow additional start points, Spirit provides
  a helper template <tt>grammar_def</tt>, which may be used as a base class for
  the nested <tt>definition</tt> class of your <tt>grammar</tt>. Here's an example:</p>
<pre><code>    <span class="comment">// this header has to be explicitly included</span>
    <span class="preprocessor">#include</span> <span class="string">&lt;boost/spirit/utility/grammar_def.hpp&gt;</span>

    <span class=keyword>struct </span><span class=identifier>calculator2 </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special>&lt;</span><span class=identifier>calculator2</span><span class=special>&gt;
    {
        </span><span class="keyword">enum</span>
        {
            expression = 0,
            term = 1,
            factor = 2
        };

<span class=special>        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
        </span><span class=keyword>struct </span><span class=identifier>definition
        </span><span class="special">:</span> <span class="keyword">public</span><span class=identifier> grammar_def</span><span class="special">&lt;</span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt;,</span> same<span class="special">,</span> same<span class="special">&gt;</span>
        <span class=special>{</span>
            <span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator2 </span><span class=keyword>const</span><span class=special>&amp; </span><span class=identifier>self</span><span class=special>)
            {
                </span><span class=identifier>group </span><span class=special>= </span><span class=literal>'(' </span><span class=special>&gt;&gt; </span><span class=identifier>expression </span><span class=special>&gt;&gt; </span><span class=literal>')'</span><span class=special>;
                </span><span class=identifier>factor </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>;
                </span><span class=identifier>term </span><span class=special>= </span><span class=identifier>factor </span><span class=special>&gt;&gt; *((</span><span class=literal>'*' </span><span class=special>&gt;&gt; </span><span class=identifier>factor</span><span class=special>) | (</span><span class=literal>'/' </span><span class=special>&gt;&gt; </span><span class=identifier>factor</span><span class=special>));
                </span><span class=identifier>expression </span><span class=special>= </span><span class=identifier>term </span><span class=special>&gt;&gt; *((</span><span class=literal>'+' </span><span class=special>&gt;&gt; </span><span class=identifier>term</span><span class=special>) | (</span><span class=literal>'-' </span><span class=special>&gt;&gt; </span><span class=identifier>term</span><span class=special>));</span>

                <span class="keyword">this</span><span class="special">-&gt;</span>start_parsers<span class="special">(</span>expression<span class="special">,</span> term<span class="special">,</span> factor<span class="special">);</span>
            <span class="special">}</span>

            <span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=identifier>expression</span><span class=special>, </span><span class=identifier>term</span><span class=special>, </span><span class=identifier>factor, group</span><span class=special>;
        </span><span class=special>};
    };</span></code></pre>
<p>The <tt>grammar_def</tt> template has to be instantiated with the types of
  all the rules you wish to make visible from outside the <tt>grammar</tt>:</p>
<pre><code><span class=identifier>    </span><span class=identifier>grammar_def</span><span class="special">&lt;</span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt;,</span> same<span class="special">,</span> same<span class="special">&gt;</span></code> </pre>
<p>The shorthand notation <tt>same</tt> indicates that the type specified by the
  previous template parameter should be used again (e.g. <code><tt>rule&lt;ScannerT&gt;</tt></code>).
  Obviously, <tt>same</tt> may not be used as the first template parameter. </p>
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"> <img src="theme/bulb.gif" width="13" height="18"> <strong>grammar_def
      start types</strong><br>
      <br>
      It may not be obvious, but it is interesting to note that aside from rule&lt;&gt;s,
      any parser type may be specified (e.g. chlit&lt;&gt;, strlit&lt;&gt;, int_parser&lt;&gt;,
      etc.).</td>
  </tr>
</table>
<p>Using the <tt>grammar_def</tt> class, there is no longer any need to provide
  a <tt>start()</tt> member function. Instead, you'll have to insert a call to
  <tt>this-&gt;start_parsers()</tt>
  (which is a member function of the <tt>grammar_def</tt> template) to define
  the start symbols for your <tt>grammar</tt>. <img src="theme/note.gif" width="16" height="16">
  Note that the number and the sequence of the rules used as the parameters to
  the <tt>start_parsers()</tt> function should match the types specified in the
  <tt>grammar_def</tt> template:</p>
<pre><code>    <span class="keyword">this</span><span class="special">-&gt;</span>start_parsers<span class="special">(</span>expression<span class="special">,</span> term<span class="special">,</span> factor<span class="special">);</span></code></pre>
<p> The grammar entry point may be specified using the following syntax:</p>
<pre><code><font color="#000000"><span class=identifier>    g</span><span class="special">.</span><span class=identifier>use_parser</span><span class="special">&lt;</span><span class=identifier>N</span><span class=special>&gt;() </span><span class="comment">// Where g is your grammar and N is the zero-based index of the entry.</span></font></code></pre>
<p>This sample shows how to use the <tt>term</tt> rule from the <tt>calculator2</tt>
  grammar above:</p>
<pre><code><font color="#000000"><span class=identifier>    calculator2 g</span><span class=special>;

    </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=identifier>
            first</span><span class=special>, </span><span class=identifier>last</span><span class=special>,
            </span><span class=identifier>g</span><span class="special">.</span><span class=identifier>use_parser</span><span class="special">&lt;</span><span class=identifier>calculator2::term</span><span class=special>&gt;(),</span><span class=identifier>
            space_p</span><span class=special>
        ).</span><span class=identifier>full</span><span class=special>)
    {
        </span><span class=identifier>cout </span><span class=special>&lt;&lt; </span><span class=string>"parsing succeeded\n"</span><span class=special>;
    }
    </span><span class=keyword>else</span> <span class="special">{</span>
        <span class=identifier>cout </span><span class=special>&lt;&lt; </span><span class=string>"parsing failed\n"</span><span class=special>;
    }</span></font></code></pre>
<p>The template parameter passed to <tt>use_parser&lt;&gt;</tt> should
  be the zero-based index into the list of rules specified in the <tt>start_parsers()</tt>
  function call. </p>
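<p>The relationship between the order of the <tt>start_parsers()</tt> arguments
  and the index can be sketched in plain C++. This is only an illustration of compile-time
  index selection, not how <tt>grammar_def</tt> is implemented: <tt>grammar_def_sketch</tt>,
  <tt>get_start_parser</tt>, <tt>named_rule</tt> and <tt>calculator2_sketch</tt>
  are hypothetical names, the sketch is fixed at three entry points, and it relies
  on <tt>std::tuple</tt> from C++11:</p>

```cpp
#include <cstddef>
#include <string>
#include <tuple>

// Hypothetical stand-in for a rule; it only carries a name.
struct named_rule { std::string name; };

// Simplified stand-in for grammar_def with exactly three entry points.
template <typename T0, typename T1, typename T2>
struct grammar_def_sketch
{
    void start_parsers(T0 const& p0, T1 const& p1, T2 const& p2)
    {
        starts = std::make_tuple(&p0, &p1, &p2);   // argument order fixes the indices
    }

    // N is resolved at compile time; an out-of-range N fails to compile.
    template <std::size_t N>
    typename std::tuple_element<N, std::tuple<T0, T1, T2> >::type const&
    get_start_parser() const
    {
        return *std::get<N>(starts);
    }

    std::tuple<T0 const*, T1 const*, T2 const*> starts;
};

struct calculator2_sketch
{
    // Symbolic names for the indices, mirroring the enum in calculator2.
    enum { expression = 0, term = 1, factor = 2 };

    struct definition : grammar_def_sketch<named_rule, named_rule, named_rule>
    {
        definition()
        {
            expression_r.name = "expression";
            term_r.name = "term";
            factor_r.name = "factor";
            this->start_parsers(expression_r, term_r, factor_r);
        }

        named_rule expression_r, term_r, factor_r;
    };
};
```

<p>With this sketch, <tt>get_start_parser&lt;calculator2_sketch::term&gt;()</tt>
  selects the second argument that was passed to <tt>start_parsers()</tt>. Reordering
  those arguments without updating the enum would silently select a different rule,
  which is why the two have to be kept in step. The sketch stores raw pointers, so
  copying a <tt>definition</tt> is not supported here.</p>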
<table width="80%" border="0" align="center">
  <tr>
    <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <tt><strong>use_parser&lt;0&gt;</strong></tt><br>
      <br>
      Note that using <span class="literal">0</span> (zero) as the template parameter
      to <tt>use_parser</tt> is equivalent to using the start rule, exported by
      conventional means through the <tt>start()</tt> function, as shown in the
      first <tt><a href="grammar.html#full_grammar">calculator</a></tt> sample
      above. So this notation may be used even for grammars exporting only one rule
      through their <tt>start()</tt> function. On the other hand, calling a
      <tt>grammar</tt> without the <tt>use_parser</tt> notation will execute the
      rule specified as the first parameter to the <tt>start_parsers()</tt> function.
    </td>
  </tr>
</table>
<p>The maximum number of usable start rules is limited by the preprocessor constant:</p>
<pre>    <span class="identifier">BOOST_SPIRIT_GRAMMAR_STARTRULE_TYPE_LIMIT</span> <span class="comment">// defaults to 3</span></pre>
<table border="0">
<tr>
<td width="10"></td>
<td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
<td width="30"><a href="scanner.html"><img src="theme/l_arr.gif" border="0"></a></td>
<td width="30"><a href="subrules.html"><img src="theme/r_arr.gif" border="0"></a></td>
</tr>
</table>
<br>
<hr size="1">
<p class="copyright">Copyright © 1998-2003 Joel de Guzman<br>
Copyright © 2003-2004 Hartmut Kaiser <br>
<br>
<font size="2">Use, modification and distribution is subject to the Boost Software
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt) </font> </p>
<p> </p>
</body>
</html>