spirit/doc/karma/char.qbk
Hartmut Kaiser 83a792d7ed Spirit: updating copyrights
[SVN r67619]
2011-01-03 16:58:38 +00:00


[/==============================================================================
Copyright (C) 2001-2011 Hartmut Kaiser
Copyright (C) 2001-2011 Joel de Guzman
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
===============================================================================/]
[section:char Char Generators]
This module includes character-oriented generators that output
single characters. Currently, it includes literal chars (e.g. `'x'`, `L'x'`),
`char_` (single characters, ranges and character sets) and the
encoding-specific character classifiers (`alnum`, `alpha`, `digit`, `xdigit`, etc.).
[heading Module Header]

    // forwards to <boost/spirit/home/karma/char.hpp>
    #include <boost/spirit/include/karma_char.hpp>

Also, see __include_structure__.
[/////////////////////////////////////////////////////////////////////////////]
[section:char_generator Character Generators (`char_`, `lit`)]
[heading Description]
The character generators described in this section are `char_` and `lit`.
The `char_` generator emits single characters. The `char_` generator has an
associated __karma_char_encoding_namespace__. This is needed when doing basic
operations such as forcing lower or upper case and dealing with
character ranges.
There are various forms of `char_`.
[heading char_]
The no-argument form of `char_` emits any character in the associated
__karma_char_encoding_namespace__.

    char_ // emits any character as supplied by the attribute

[heading char_(ch)]
The single-argument form of `char_` (with a character argument) emits
the supplied character.

    char_('x') // emits 'x'
    char_(L'x') // emits L'x'
    char_(x) // emits x (a char)

[heading char_(first, last)]
`char_` with two arguments emits any character from a range of characters, as
supplied by the attribute.

    char_('a','z') // lowercase alphabetic characters
    char_(L'0',L'9') // digits

A range of characters is created from a low-high character pair. Such a
generator emits a single character that is in the range, including both
endpoints. Note, the first character must be /before/ the second,
according to the underlying __karma_char_encoding_namespace__.
Character mapping is inherently platform dependent. The standard does not
guarantee, for example, that `'A' < 'Z'`. This is why Spirit2 purposely
attaches a specific __karma_char_encoding_namespace__ (such as ASCII or
ISO-8859-1) to the `char_` generator to eliminate such ambiguities.
[note *Sparse bit vectors*
To accommodate 16/32 and 64 bit characters, the char-set statically
switches from a `std::bitset` implementation when the character type is
not greater than 8 bits, to a sparse bit/boolean set which uses a sorted
vector of disjoint ranges (`range_run`). The set is constructed from
ranges such that adjacent or overlapping ranges are coalesced.
`range_runs` are very space-economical in situations where there are lots
of ranges and a few individual disjoint values. Searching is O(log n)
where n is the number of ranges.]
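The idea behind such a sparse set can be sketched in a few lines of standard C++. The class below is only an illustration of the technique described in the note (a sorted vector of disjoint inclusive ranges, coalescing on insertion, binary search on lookup). The name `range_set_sketch` and its interface are assumptions made for this example and are not Spirit's actual `range_run` implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Illustrative sketch only; not the Spirit implementation.
class range_set_sketch
{
public:
    typedef std::uint32_t char_type;
    typedef std::pair<char_type, char_type> range; // inclusive: first..second

    // Insert the inclusive range lo..hi, merging it with every stored range
    // that overlaps it or is directly adjacent to it.
    void set(char_type lo, char_type hi)
    {
        assert(lo <= hi);
        std::vector<range> merged;
        merged.reserve(runs_.size() + 1);
        std::size_t i = 0;

        // Keep ranges that end before lo and do not touch it.
        while (i < runs_.size() && runs_[i].second < lo
               && lo - runs_[i].second > 1)
            merged.push_back(runs_[i++]);

        // Absorb ranges that overlap or touch the (growing) new range.
        while (i < runs_.size()
               && (runs_[i].first <= hi || runs_[i].first - hi == 1))
        {
            lo = std::min(lo, runs_[i].first);
            hi = std::max(hi, runs_[i].second);
            ++i;
        }
        merged.push_back(range(lo, hi));

        // Keep the remaining ranges unchanged.
        while (i < runs_.size())
            merged.push_back(runs_[i++]);
        runs_.swap(merged);
    }

    // Binary search: O(log n), where n is the number of stored ranges.
    bool test(char_type ch) const
    {
        std::vector<range>::const_iterator it = std::lower_bound(
            runs_.begin(), runs_.end(), ch,
            [](range const& r, char_type c) { return r.second < c; });
        return it != runs_.end() && it->first <= ch;
    }

    std::size_t size() const { return runs_.size(); }

private:
    std::vector<range> runs_; // sorted, pairwise disjoint, non-adjacent
};
```

Inserting `'a'..'f'` and then `'g'..'z'` leaves a single stored range in this sketch, because adjacent ranges are coalesced. This is what keeps the structure small when a set is built from many neighbouring ranges.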
[heading char_(def)]
Lastly, when given a string (a plain C string, a `std::basic_string`,
etc.), the string is regarded as a char-set definition string following
a syntax that resembles POSIX-style regular expression character sets
(except that double quotes delimit the set elements instead of square
brackets and there is no special negation `^` character). Examples:

    char_("a-zA-Z") // alphabetic characters
    char_("0-9a-fA-F") // hexadecimal characters
    char_("actgACTG") // DNA identifiers
    char_("\x7f\x7e") // Hexadecimal 0x7F and 0x7E

These generators emit any character from the given set of characters, as
supplied by the attribute.
[heading lit(ch)]
`lit`, when passed a single character, behaves like the single-argument
`char_` except that `lit` does not consume an attribute. A plain
`char` or `wchar_t` is equivalent to a `lit`.
[note `lit` is reused by the [karma_string String Generators], the
char generators, and the Numeric Generators (see [signed_int signed integer],
[unsigned_int unsigned integer], and [real_number real number] generators). In
general, a char generator is created when you pass in a
character, a string generator is created when you pass in a string, and a
numeric generator is created when you use a numeric literal. The
exception is when you pass a single element literal string, e.g.
`lit("x")`. In this case, we optimize this to create a char generator
instead of a string generator.]
Examples:

    'x'
    lit('x')
    lit(L'x')
    lit(c) // c is a char

[heading Header]

    // forwards to <boost/spirit/home/karma/char/char.hpp>
    #include <boost/spirit/include/karma_char_.hpp>

Also, see __include_structure__.
[heading Namespace]
[table
[[Name]]
[[`boost::spirit::lit // alias: boost::spirit::karma::lit` ]]
[[`ns::char_`]]
]
In the table above, `ns` represents a __karma_char_encoding_namespace__.
[heading Model of]
[:__primitive_generator_concept__]
[variablelist Notation
[[`ch`, `ch1`, `ch2`]
[Character-class specific character (See __char_class_types__),
or a __karma_lazy_argument__ that evaluates to a
character-class specific character value]]
[[`cs`] [Character-set specifier string (See
__char_class_types__), or a __karma_lazy_argument__ that
evaluates to a character-set specifier string, or a
pointer/reference to a null-terminated array of characters.
This string specifies a char-set definition following
a syntax that resembles POSIX-style regular expression character
sets (without the square brackets and the negation `^` character).]]
[[`ns`] [A __karma_char_encoding_namespace__.]]
[[`cg`] [A char generator, a char range generator, or a char set generator.]]]
[heading Expression Semantics]
The semantics of an expression are defined only where they differ from, or are
not defined in, __primitive_generator_concept__.
[table
[[Expression] [Description]]
[[`ch`] [Generate the character literal `ch`. This generator
never fails (unless the underlying output stream
reports an error).]]
[[`lit(ch)`] [Generate the character literal `ch`. This generator
never fails (unless the underlying output stream
reports an error).]]
[[`ns::char_`] [Generate the character provided by a mandatory
attribute interpreted in the character set defined
by `ns`. This generator never fails (unless the
underlying output stream reports an error).]]
[[`ns::char_(ch)`] [Generate the character `ch` as provided by the
immediate literal value the generator is initialized
from. If this generator has an associated attribute
it succeeds only as long as the attribute is equal
to the immediate literal (unless the underlying
output stream reports an error). Otherwise this
generator fails and does not generate any output.]]
[[`ns::char_("c")`] [Generate the character `c` as provided by the
immediate literal value the generator is initialized
from. If this generator has an associated attribute
it succeeds only as long as the attribute is equal
to the immediate literal (unless the underlying
output stream reports an error). Otherwise this
generator fails and does not generate any output.]]
[[`ns::char_(ch1, ch2)`][Generate the character provided by a mandatory
attribute interpreted in the character set defined
by `ns`. The generator succeeds as long as the
attribute belongs to the character range `[ch1, ch2]`
(unless the underlying output stream reports an
error). Otherwise this generator fails and does not
generate any output.]]
[[`ns::char_(cs)`] [Generate the character provided by a mandatory
attribute interpreted in the character set defined
by `ns`. The generator succeeds as long as the
attribute belongs to the character set `cs`
(unless the underlying output stream reports an
error). Otherwise this generator fails and does not
generate any output.]]
[[`~cg`] [Negate `cg`. The result is a negated char generator
that inverts the test condition of the character
generator it is attached to.]]
]
A character `ch` is assumed to belong to the character range defined by
`ns::char_(ch1, ch2)` if its character value (binary representation),
interpreted in the character set defined by `ns`, is not smaller than the
character value of `ch1` and not larger than the character value of `ch2` (i.e.
`ch1 <= ch <= ch2`).
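The success and failure rules from the table above can be modelled with plain functions. The following is a minimal sketch under stated assumptions. It does not use Spirit at all, the function names `emit_char_literal` and `emit_char_in_range` are hypothetical, and a `std::string` stands in for the output iterator. It only mirrors the documented behaviour of `ns::char_(ch)` with and without an attribute, and of `ns::char_(ch1, ch2)`.

```cpp
#include <string>

// Hypothetical stand-in for ns::char_(ch) used without an attribute:
// the literal is always emitted.
inline bool emit_char_literal(std::string& out, char literal)
{
    out.push_back(literal);
    return true;
}

// Hypothetical stand-in for ns::char_(ch) used with an attribute:
// succeeds only if the attribute equals the literal.
inline bool emit_char_literal(std::string& out, char literal, char attr)
{
    if (attr != literal)
        return false;               // fail without generating output
    out.push_back(literal);
    return true;
}

// Hypothetical stand-in for ns::char_(ch1, ch2): the attribute is mandatory
// and must satisfy ch1 <= attr <= ch2 when compared by character value.
inline bool emit_char_in_range(std::string& out, char ch1, char ch2, char attr)
{
    typedef unsigned char uchar;
    uchar const lo = static_cast<uchar>(ch1);
    uchar const hi = static_cast<uchar>(ch2);
    uchar const c = static_cast<uchar>(attr);
    if (c < lo || hi < c)
        return false;               // fail without generating output
    out.push_back(attr);
    return true;
}
```

In this model a failing call leaves the output untouched, which corresponds to the "fails and does not generate any output" wording used in the table.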
The `charset` parameter passed to `ns::char_(charset)` must be a string
containing more than one character. Every single character in this string is
assumed to belong to the character set defined by this expression. An exception
to this is the `'-'` character, which has a special meaning if it is
neither the first nor the last character in `charset`. If the `'-'`
is used in between two characters, it is interpreted as spanning a character
range. A character `ch` is considered to belong to the defined character set
`charset` if it matches one of the characters specified by the string
parameter described above. For example:
[table
[[Example] [Description]]
[[`char_("abc")`] ['a', 'b', and 'c']]
[[`char_("a-z")`] [all characters from 'a' to 'z' (inclusive)]]
[[`char_("a-zA-Z")`] [all characters from 'a' to 'z' and from 'A' to 'Z' (inclusive)]]
[[`char_("-1-9")`] ['-' and all characters from '1' to '9' (inclusive)]]
]
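The rules for interpreting a character-set definition string can be sketched as a small membership test. This is an illustration only. The function name `in_charset_definition` is hypothetical, the scan order is an assumption, and Spirit itself stores the set in a dedicated data structure instead of rescanning the string.

```cpp
#include <cstddef>
#include <string>

// Illustrative sketch only: returns true if ch belongs to the character set
// described by def, using the rules stated above.
inline bool in_charset_definition(std::string const& def, char ch)
{
    typedef unsigned char uchar;
    uchar const c = static_cast<uchar>(ch);
    std::size_t i = 0;
    while (i < def.size())
    {
        uchar const first = static_cast<uchar>(def[i]);
        if (i + 2 < def.size() && def[i + 1] == '-')
        {
            // A '-' in between two characters spans an inclusive range.
            uchar const last = static_cast<uchar>(def[i + 2]);
            if (first <= c && c <= last)
                return true;
            i += 3;
        }
        else
        {
            // Any other character stands for itself. This includes a '-'
            // that is the first or the last character of the string.
            if (c == first)
                return true;
            ++i;
        }
    }
    return false;
}
```

With this sketch, the definition `"-1-9"` accepts `'-'` and the digits `'1'` to `'9'` but rejects `'0'`, matching the last row of the table.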
[heading Attributes]
[table
[[Expression] [Attribute]]
[[`ch`] [__unused__]]
[[`lit(ch)`] [__unused__]]
[[`ns::char_`] [`Ch`, attribute is mandatory (otherwise compilation
will fail). `Ch` is the character type of the
__karma_char_encoding_namespace__, `ns`.]]
[[`ns::char_(ch)`] [`Ch`, attribute is optional, if it is supplied, the
generator compares the attribute with `ch` and
succeeds only if both are equal, failing otherwise.
`Ch` is the character type of the
__karma_char_encoding_namespace__, `ns`.]]
[[`ns::char_("c")`] [`Ch`, attribute is optional, if it is supplied, the
generator compares the attribute with `c` and
succeeds only if both are equal, failing otherwise.
`Ch` is the character type of the
__karma_char_encoding_namespace__, `ns`.]]
[[`ns::char_(ch1, ch2)`][`Ch`, attribute is mandatory (otherwise compilation
will fail), the generator succeeds if the attribute
belongs to the character range `[ch1, ch2]`
interpreted in the character set defined by `ns`.
`Ch` is the character type of the
__karma_char_encoding_namespace__, `ns`.]]
[[`ns::char_(cs)`] [`Ch`, attribute is mandatory (otherwise compilation
will fail), the generator succeeds if the attribute
belongs to the character set `cs`, interpreted
in the character set defined by `ns`.
`Ch` is the character type of the
__karma_char_encoding_namespace__, `ns`.]]
[[`~cg`] [Attribute of `cg`]]
]
[note In addition to their usual attribute of type `Ch` all listed generators
accept an instance of a `boost::optional<Ch>` as well. If the
`boost::optional<>` is initialized (holds a value) the generators behave
as if their attribute was an instance of `Ch` and emit the value stored
in the `boost::optional<>`. Otherwise the generators will fail.]
[heading Complexity]
[:O(1)]
The complexity of `ch`, `lit(ch)`, `ns::char_`, `ns::char_(ch)`, and
`ns::char_("c")` is constant as all generators emit exactly one character per
invocation.
The character range generator (`ns::char_(ch1, ch2)`) additionally requires
constant lookup time for the verification whether the attribute belongs to
the character range.
The character set generator (`ns::char_(cs)`) additionally requires
O(log N) lookup time for the verification whether the attribute belongs to
the character set, where N is the number of characters in the character set.
[heading Example]
[note The test harness for the example(s) below is presented in the
__karma_basics_examples__ section.]
Some includes:
[reference_karma_includes]
Some using declarations:
[reference_karma_using_declarations_char]
Basic usage of `char_` generators:
[reference_karma_char]
[endsect]
[/////////////////////////////////////////////////////////////////////////////]
[section:char_class Character Classification Generators (`alnum`, `digit`, etc.)]
[heading Description]
The library has the full repertoire of single character generators for
character classification. This includes the usual `alnum`, `alpha`,
`digit`, `xdigit`, etc. generators. These generators have an associated
__karma_char_encoding_namespace__. This is needed when doing basic operations
such as forcing lower or upper case.
[heading Header]

    // forwards to <boost/spirit/home/karma/char/char_class.hpp>
    #include <boost/spirit/include/karma_char_class.hpp>

Also, see __include_structure__.
[heading Namespace]
[table
[[Name]]
[[`ns::alnum`]]
[[`ns::alpha`]]
[[`ns::blank`]]
[[`ns::cntrl`]]
[[`ns::digit`]]
[[`ns::graph`]]
[[`ns::lower`]]
[[`ns::print`]]
[[`ns::punct`]]
[[`ns::space`]]
[[`ns::upper`]]
[[`ns::xdigit`]]
]
In the table above, `ns` represents a __karma_char_encoding_namespace__ used by the
corresponding character class generator. All listed generators except `space`
have a mandatory attribute `Ch` and will not compile if no attribute is associated.
[heading Model of]
[:__primitive_generator_concept__]
[variablelist Notation
[[`ns`] [A __karma_char_encoding_namespace__.]]]
[heading Expression Semantics]
The semantics of an expression are defined only where they differ from, or are
not defined in, __primitive_generator_concept__.
[table
[[Expression] [Semantics]]
[[`ns::alnum`] [If the mandatory attribute satisfies the concept of
`std::isalnum` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::alpha`] [If the mandatory attribute satisfies the concept of
`std::isalpha` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::blank`] [If the mandatory attribute satisfies the concept of
`std::isblank` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::cntrl`] [If the mandatory attribute satisfies the concept of
`std::iscntrl` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::digit`] [If the mandatory attribute satisfies the concept of
`std::isdigit` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::graph`] [If the mandatory attribute satisfies the concept of
`std::isgraph` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::print`] [If the mandatory attribute satisfies the concept of
`std::isprint` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::punct`] [If the mandatory attribute satisfies the concept of
`std::ispunct` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::xdigit`] [If the mandatory attribute satisfies the concept of
`std::isxdigit` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::lower`] [If the mandatory attribute satisfies the concept of
`std::islower` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::upper`] [If the mandatory attribute satisfies the concept of
`std::isupper` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything.]]
[[`ns::space`] [If the optional attribute satisfies the concept of
`std::isspace` in the __karma_char_encoding_namespace__
the generator succeeds after emitting
its attribute (unless the underlying output stream
reports an error). This generator fails otherwise
while not generating anything. If no attribute is
supplied, this generator emits a single space
character in the character set defined by `ns`.]]
]
Possible values for `ns` are described in the section __karma_char_encoding_namespace__.
[note The generators `alpha` and `alnum` might seem to behave unexpectedly if
used inside a `lower[]` or `upper[]` directive. Both directives
additionally apply the semantics of `std::islower` or `std::isupper`
to the respective character class. Some examples:
``
std::string s;
std::back_insert_iterator<std::string> out(s);
generate(out, lower[alpha], 'a'); // succeeds emitting 'a'
generate(out, lower[alpha], 'A'); // fails
``
The generator directive `upper[]` behaves correspondingly.
]
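The combined test applied by `lower[alpha]` can be modelled outside of Spirit as follows. This is a sketch under stated assumptions. The function name `emit_lower_alpha` is hypothetical, a `std::string` stands in for the output iterator, and the `<cctype>` functions in the default "C" locale stand in for the classification of the selected __karma_char_encoding_namespace__.

```cpp
#include <cctype>
#include <string>

// Hypothetical model of lower[alpha]: the attribute must be alphabetic
// and, because of the directive, lower case as well.
inline bool emit_lower_alpha(std::string& out, char attr)
{
    // The <cctype> functions require a value representable as unsigned char.
    unsigned char const c = static_cast<unsigned char>(attr);
    if (!std::isalpha(c) || !std::islower(c))
        return false;               // fail without generating output
    out.push_back(attr);
    return true;
}
```

In this model `'a'` is emitted, while `'A'` is rejected even though it is alphabetic, which is the behaviour the note describes.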
[heading Attributes]
[:All listed character class generators can take any attribute `Ch`. All
character class generators (except `space`) require an attribute and will
fail to compile otherwise.]
[note In addition to their usual attribute of type `Ch` all listed generators
accept an instance of a `boost::optional<Ch>` as well. If the
`boost::optional<>` is initialized (holds a value) the generators behave
as if their attribute was an instance of `Ch` and emit the value stored
in the `boost::optional<>`. Otherwise the generators will fail.]
[heading Complexity]
[:O(1)]
The complexity is constant as the generators emit at most one character
per invocation.
[heading Example]
[note The test harness for the example(s) below is presented in the
__karma_basics_examples__ section.]
Some includes:
[reference_karma_includes]
Some using declarations:
[reference_karma_using_declarations_char_class]
Basic usage of an `alpha` generator:
[reference_karma_char_class]
[endsect]
[endsect]