regex/doc/collating_names.qbk
John Maddock f11333d352 Added license info.
[SVN r40900]
2007-11-07 18:26:11 +00:00

129 lines
2.7 KiB
Plaintext

[/
Copyright 2006-2007 John Maddock.
Distributed under the Boost Software License, Version 1.0.
(See accompanying file LICENSE_1_0.txt or copy at
http://www.boost.org/LICENSE_1_0.txt).
]
[section:collating_names Collating Names]
[section:digraphs Digraphs]
The following are treated as valid digraphs when used as a collating name:
"ae", "Ae", "AE", "ch", "Ch", "CH", "ll", "Ll", "LL", "ss", "Ss", "SS", "nj", "Nj", "NJ", "dz", "Dz", "DZ", "lj", "Lj", "LJ".
So for example the expression:
[pre \[\[.ae.\]-c\] ]
will match any character that collates between the digraph "ae" and the character "c".
[endsect]
[section:posix_symbolic_names POSIX Symbolic Names]
The following symbolic names are recognised as valid collating element names,
in addition to any single character, this allows you to write for example:
[pre \[\[.left-square-bracket.\]\[.right-square-bracket.\]\]]
if you wanted to match either "\[" or "\]".
[table
[[Name][Character]]
[[NUL] [\\x00]]
[[SOH] [\\x01]]
[[STX] [\\x02]]
[[ETX] [\\x03]]
[[EOT] [\\x04]]
[[ENQ] [\\x05]]
[[ACK] [\\x06]]
[[alert] [\\x07]]
[[backspace] [\\x08]]
[[tab] [\\t]]
[[newline] [\\n]]
[[vertical-tab] [\\v]]
[[form-feed] [\\f]]
[[carriage-return] [\\r]]
[[SO] [\\xE]]
[[SI] [\\xF]]
[[DLE] [\\x10]]
[[DC1] [\\x11]]
[[DC2] [\\x12]]
[[DC3] [\\x13]]
[[DC4] [\\x14]]
[[NAK] [\\x15]]
[[SYN] [\\x16]]
[[ETB] [\\x17]]
[[CAN] [\\x18]]
[[EM] [\\x19]]
[[SUB] [\\x1A]]
[[ESC] [\\x1B]]
[[IS4] [\\x1C]]
[[IS3] [\\x1D]]
[[IS2] [\\x1E]]
[[IS1] [\\x1F]]
[[space] [\\x20]]
[[exclamation-mark] [!]]
[[quotation-mark] ["]]
[[number-sign] [#]]
[[dollar-sign] [$]]
[[percent-sign] [%]]
[[ampersand] [&]]
[[apostrophe] [\']]
[[left-parenthesis] [(]]
[[right-parenthesis] [)]]
[[asterisk] [\*]]
[[plus-sign] [+]]
[[comma] [,]]
[[hyphen] [-]]
[[period] [.]]
[[slash] [ / ]]
[[zero] [0]]
[[one] [1]]
[[two] [2]]
[[three] [3]]
[[four] [4]]
[[five] [5]]
[[six] [6]]
[[seven] [7]]
[[eight] [8]]
[[nine] [9]]
[[colon] [\:]]
[[semicolon] [;]]
[[less-than-sign] [<]]
[[equals-sign] [=]]
[[greater-than-sign] [>]]
[[question-mark] [?]]
[[commercial-at] [@]]
[[left-square-bracket] [\[]]
[[backslash][\\]]
[[right-square-bracket][\]]]
[[circumflex][~]]
[[underscore][_]]
[[grave-accent][`]]
[[left-curly-bracket][{]]
[[vertical-line][|]]
[[right-curly-bracket][}]]
[[tilde][~]]
[[DEL][\\x7F]]
]
[endsect]
[section:named_unicode Named Unicode Characters]
When using [link boost_regex.unicode Unicode aware regular expressions] (with the `u32regex` type), all
the normal symbolic names for Unicode characters (those given in Unidata.txt)
are recognised. So for example:
[pre \[\[.CYRILLIC CAPITAL LETTER I.\]\] ]
would match the Unicode character 0x0418.
[endsect]
[endsect]