
[/==============================================================================
    Copyright (C) 2001-2011 Joel de Guzman
    Copyright (C) 2001-2011 Hartmut Kaiser

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]

[section:lexer_tutorials __lex__ Tutorials Overview]

The __lex__ library implements several components on top of possibly different
lexer generator libraries. It exposes a pair of iterators, which, when
dereferenced, return a stream of tokens generated from the underlying character
stream. The generated tokens are based on the token definitions supplied by the
user.
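
The iterator-pair idea described above can be sketched without any lexer
library. The following is a minimal illustration using `std::sregex_iterator`
from the C++ standard library; it is *not* the __lex__ API. The names
`token_id`, `token`, and `tokenize` are hypothetical and exist only for this
sketch. The token definitions are three regular expressions combined into one
pattern, and dereferencing the iterator yields one match (one token) at a time.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical token ids; illustrative only, not part of any library.
enum token_id { ID_WORD = 1, ID_NUMBER = 2, ID_OTHER = 3 };

// Hypothetical token: an id plus the matched text.
struct token
{
    token_id id;
    std::string text;
};

// Turn a character sequence into tokens by walking a pair of iterators.
inline std::vector<token> tokenize(std::string const& input)
{
    // Token definitions: one capture group per token kind. Whitespace is
    // matched by none of the alternatives and is therefore skipped.
    static std::regex const definitions("([A-Za-z_]+)|([0-9]+)|([^ \t\n])");

    std::vector<token> result;

    // The iterator pair: 'first' walks the matches, while the
    // default-constructed 'last' marks the end of the token stream.
    std::sregex_iterator first(input.begin(), input.end(), definitions);
    std::sregex_iterator last;

    for (; first != last; ++first)
    {
        std::smatch const& m = *first;   // dereferencing yields the next token
        token_id id = m[1].matched ? ID_WORD
                    : m[2].matched ? ID_NUMBER
                    : ID_OTHER;
        result.push_back(token{id, m.str()});
    }
    return result;
}
```

Under these assumptions, calling `tokenize("count = 42;")` produces four
tokens: a word, a single character, a number, and another single character.
The tutorials in this section show how the same pattern is expressed with the
actual __lex__ components.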

Currently, __lex__ is built on top of Ben Hanson's excellent __lexertl__
library (which is a proposed Boost library). __lexertl__ provides the necessary
functionality to build state machines based on a set of supplied regular
expressions. However, __lex__ is not restricted to __lexertl__. We expect it to
be usable in conjunction with any other lexical scanner generator library: all
that needs to be implemented is a set of wrapper objects exposing a
well-defined interface, as described in this documentation.

[note   For the sake of clarity, all examples in this documentation assume
        that __lex__ is used on top of __lexertl__.]

A lexer built using __lex__ is highly configurable, and most of this
configuration is done at compile time. Almost all of the configurable
parameters have generally useful default values, which makes starting a project
easy and straightforward. Here is an (incomplete) list of features you can
tweak to adjust the generated lexer instance to your actual needs:

* Select and customize the token type to be generated by the lexer instance.
* Select and customize the token value types the generated token instances will
  be able to hold.
* Select the iterator type of the underlying input stream, which will be used
  as the source for the character stream to tokenize.
* Customize the iterator type returned by the lexer to enable debug support,
  special handling of certain input sequences, etc.
* Select the /dynamic/ or the /static/ runtime model for the lexical
  analyzer.
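
As a rough illustration of what compile-time configuration with defaults means,
the sketch below uses template parameters with default arguments. The class
templates `basic_token` and `basic_lexer` are hypothetical and are not the
__lex__ types; they only show how a token type, its value type, and the
underlying iterator type can be selected at compile time while remaining
optional.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical token type: the underlying iterator type and the value type
// the token can hold are compile-time parameters, each with a default.
template <typename Iterator = std::string::const_iterator,
          typename Value = std::string>
struct basic_token
{
    typedef Iterator iterator_type;
    typedef Value value_type;

    std::size_t id;
    Value value;
};

// Hypothetical lexer type: the token type is selectable, and the input
// iterator type defaults to whatever the token type uses.
template <typename Token = basic_token<>,
          typename Iterator = typename Token::iterator_type>
struct basic_lexer
{
    typedef Token token_type;
    typedef Iterator base_iterator_type;
};

// Relying on the defaults only.
typedef basic_lexer<> default_lexer;

// Customizing the token: raw character pointers as input, int as value.
typedef basic_lexer<basic_token<char const*, int> > int_lexer;

// The choices are fixed at compile time and can be checked there as well.
static_assert(std::is_same<default_lexer::token_type::value_type,
                           std::string>::value,
              "default tokens hold strings");
static_assert(std::is_same<int_lexer::base_iterator_type,
                           char const*>::value,
              "iterator type follows the customized token");
```

Because every choice is a type resolved by the compiler, a configuration that
is not selected adds no runtime cost in this sketch.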

Special care has been taken during the development of the library to ensure
that optimal code is generated regardless of the configuration options
selected.

The tutorial examples in this section guide you through some common use cases
to help you understand the big picture. The first two quick start examples
(__sec_lex_quickstart_1__ and __sec_lex_quickstart_2__) introduce the __lex__
library by building two standalone applications that neither connect to nor
depend on any other part of __spirit__. The section __sec_lex_quickstart_3__
demonstrates how to use a lexer in conjunction with a parser (where the parser
is built using __qi__).

[endsect]