<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content=
"HTML Tidy for Windows (vers 1st February 2003), see www.w3.org"
name="generator">
<title>
Preface
</title>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<link rel="stylesheet" href="theme/style.css" type="text/css">
</head>
<body>
<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
<tr>
<td width="10"></td>
<td width="85%">
<font size="6" face=
"Verdana, Arial, Helvetica, sans-serif"><b>Preface</b></font>
</td>
<td width="112">
<a href="http://spirit.sf.net"><img src="theme/spirit.gif"
width="112" height="48" align="right" border="0"></a>
</td>
</tr>
</table><br>
<table border="0">
<tr>
<td width="10"></td>
<td width="30">
<a href="../index.html"><img src="theme/u_arr.gif" border="0"></a>
</td>
<td width="30">
<img src="theme/l_arr_disabled.gif" width="20" height="19">
</td>
<td width="30">
<a href="introduction.html"><img src="theme/r_arr.gif" border="0">
</a>
</td>
</tr>
</table><br>
<table width="80%" border="0" align="center">
<tr>
<td>
<p>
<i>"Examples of designs that meet most of the criteria for
"goodness" (easy to understand, flexible, efficient) are a
recursive-descent parser, which is traditional procedural code.
Another example is the STL, which is a generic library of
containers and algorithms depending crucially on both traditional
procedural code and on parametric polymorphism."</i>
</p>
<p>
<b><font color="#003366">Bjarne Stroustrup</font></b>
</p>
</td>
</tr>
</table>
<p>
<b>History</b>
</p>
<p>
A decade and a half ago, I wrote my first calculator in Pascal. It remains
one of my most unforgettable coding experiences. I was amazed at how a
mutually recursive set of functions could model a grammar specification. In
time, the skills I acquired from that academic experience became very
practical. Periodically I was tasked with some parsing. For instance,
whenever I needed to perform any form of I/O, even in binary, I tried to
approach the task somewhat formally by writing a grammar using
Pascal-like syntax diagrams and then writing a corresponding
recursive-descent parser. This worked very well.
</p>
<p>
The arrival of the Internet and the World Wide Web magnified this need a
thousandfold. At one point I had to write an HTML parser for a Web
browser project. Working from the W3C formal specifications, I easily got a
recursive-descent HTML parser running. I was certainly glad that HTML had
a formal grammar specification. Because of the influence of the Internet,
I then had to do more parsing. RFC specifications were everywhere. SGML,
HTML, XML, even email addresses and those seemingly trivial URLs were all
formally specified using small EBNF-style grammar specifications. This
made me wish for a tool similar to big-time parser generators such as
YACC and <a href="http://www.antlr.org/">ANTLR</a>, where a parser is
built automatically from a grammar specification. Yet I wanted it to be
extremely small; small enough to fit in my pocket, yet scalable.
</p>
<p>
It had to be practical for anything from simple grammars such as email
addresses to moderately complex grammars such as XML and perhaps some
small to medium-sized scripting languages. Scalability is a prime goal.
You should be able to use it for small tasks such as parsing command
lines without incurring a heavy payload, as you do when you are using
YACC or PCCTS. Even now that it has evolved and matured into a
multi-module library, Spirit remains true to its original intent and can
still be used for extreme micro-parsing tasks. You only pay for the features
that you need. The power of Spirit comes from its modularity and
extensibility. Instead of giving you a sledgehammer, it gives you the right
ingredients to create a sledgehammer easily. For instance, it does not really
have a lexer, but you have all the raw ingredients to write one if you need
one.
</p>
<p>
The result was Spirit. Spirit was a personal project that was conceived
when I was doing R&amp;D in Japan. Inspired by the GoF's composite and
interpreter patterns, I realized that I could model a recursive-descent
parser with hierarchical-object composition of primitives (terminals) and
composites (productions). The original version was implemented with
run-time polymorphic classes. A parser was generated at run time by
feeding in production rule strings such as <tt>"prod ::= {&lsquo;A&rsquo;
| &lsquo;B&rsquo;} &lsquo;C&rsquo;;"</tt>. A compile function compiled the
parser, dynamically creating a hierarchy of objects and linking semantic
actions on the fly. A very early text can be found <a href=
"http://spirit.sourceforge.net/dl_docs/pre-spirit.htm">here</a>.
</p>
<p>
The version that we have now is a complete rewrite of the original Spirit
parser using expression templates and static polymorphism, inspired by
the work of Todd Veldhuizen ("<a href=
"http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.248">Expression
Templates</a>", C++ Report, June 1995). Initially, the
<i><b>static-Spirit</b></i> version was meant only to replace the core of
the original <i><b>dynamic-Spirit</b></i>. Dynamic-Spirit needed a parser
to implement itself anyway. The original employed a hand-coded
recursive-descent parser to parse the input grammar specification
strings.
</p>
<p>
After its initial "open-source" debut in May 2001, static-Spirit became a
success. Around November 2001, the Spirit website had an activity
percentile of 98%, making it the number one parser tool at SourceForge
at the time. Not bad for a niche project such as a parser library.
The "static" part of the name was forgotten and static-Spirit simply
became Spirit. The framework soon evolved to acquire more dynamic
features.
</p>
<p>
<b>How to use this manual</b>
</p>
<p>
The Spirit framework is organized in logical modules starting from the
core. This documentation provides a user's guide and reference for each
module in the framework. A simple and clear code example is worth a
hundred lines of documentation; therefore, the user's guide is presented
with abundant examples, annotated and explained in a step-wise manner. The
user's guide is based on examples, and lots of them.
</p>
<p>
As much as possible, forward information (i.e., citing a specific piece of
information that has not yet been discussed) is avoided in the user's
guide portion of each module. In many cases, though, it is unavoidable
that advanced but related topics are interspersed with the normal flow of
discussion. To alleviate this problem, topics categorized as "advanced"
may be skipped at first reading.
</p>
<p>
Icons mark certain topics to indicate their relevance. Each icon precedes
the text it applies to and indicates the following:
</p>
<table width="90%" border="0" align="center">
<tr>
<td>
<table width="100%" border="0">
<tr>
<td colspan="3" class="table_title">
Icons
</td>
</tr>
<tr>
<td width="19" class="table_cells">
<img src="theme/note.gif" width="16" height="16">
</td>
<td width="58" class="table_cells">
<b>Note</b>
</td>
<td width="627" class="table_cells">
Information provided is moderately important and should be
noted by the reader.
</td>
</tr>
<tr>
<td width="19" class="table_cells">
<img src="theme/alert.gif">
</td>
<td width="58" class="table_cells">
<b>Alert</b>
</td>
<td width="627" class="table_cells">
Information provided is of utmost importance.
</td>
</tr>
<tr>
<td width="19" class="table_cells">
<img src="theme/lens.gif" width="15" height="16">
</td>
<td width="58" class="table_cells">
<b>Detail</b>
</td>
<td width="627" class="table_cells">
Information provided is auxiliary but will give the reader a
deeper insight into a specific topic. May be skipped.
</td>
</tr>
<tr>
<td width="19" class="table_cells">
<img src="theme/bulb.gif" width="13" height="18">
</td>
<td width="58" class="table_cells">
<b>Tip</b>
</td>
<td width="627" class="table_cells">
A potentially useful and helpful piece of information.
</td>
</tr>
</table>
</td>
</tr>
</table>
<p>
<b>Support</b>
</p>
<p>
Please direct all questions to Spirit's mailing list. You can subscribe
to the mailing list <a href=
"https://lists.sourceforge.net/lists/listinfo/spirit-general">here</a>.
The mailing list has a searchable archive. A search link to this archive
is provided on <a href="http://spirit.sf.net">Spirit's home page</a>. You
may also read and post messages to the mailing list through an
<a href="http://news.gmane.org/thread.php?group=gmane.comp.parsers.spirit.general">NNTP
news portal</a> (thanks to <a href=
"http://www.gmane.org">www.gmane.org</a>). The newsgroup mirrors the
mailing list. Here are two links to the archives: via <a href=
"http://dir.gmane.org/gmane.comp.parsers.spirit.general">gmane</a>,
via <a href=
"http://sourceforge.net/mailarchive/forum.php?forum_id=1595gmane.org">geocrawler</a>.
</p>
<table width="100%" border="0" align="center">
<tr>
<td>
<div align="center">
<i><b><font size="5">To my dear daughter Phoenix</font></b></i>
</div>
</td>
</tr>
</table>
<table width="100%" border="0">
<tr>
<td width="72%">
</td>
<td width="28%">
<div align="right">
<p>
<b>Joel de Guzman<br></b> September 2002
</p>
</div>
</td>
</tr>
</table>
<table border="0">
<tr>
<td width="10"></td>
<td width="30">
<a href="../index.html"><img src="theme/u_arr.gif" border="0"></a>
</td>
<td width="30">
<img src="theme/l_arr_disabled.gif" width="20" height="19">
</td>
<td width="30">
<a href="introduction.html"><img src="theme/r_arr.gif" border="0">
</a>
</td>
</tr>
</table><br>
<hr size="1">
<p class="copyright">
Copyright &copy; 1998-2003 Joel de Guzman<br>
<br>
<font size="2">Use, modification and distribution is subject to the
Boost Software License, Version 1.0. (See accompanying file
LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)</font>
</p>
<p>
</p>
</body>
</html>