TL;DR: The new parser fixes long-standing bugs and has full Unicode support, but removes non-standard extensions
of the old parser, which could break code:
- String concatenation: the old parser concatenated adjacent string literals like C does.
- Comments: the old parser supported C- and C++-style comments. JSON doesn't allow comments.
The JSON writer hasn't been changed; it still has all the Unicode-related problems.
The old JSON parser had quite a few problems:
- Slow to compile.
- Based on the obsolete Spirit.Classic.
- Inherited a multithreading bug from Spirit.Classic (see bug #5520).
- Poor to no support for Unicode.
- Non-standard departures from JSON, such as string concatenation and comments.
- Tightly bound to string-based property trees.
The new parser has the following features:
- Hand-written recursive descent parser - few template instantiations, fast to compile.
- Parses through a pair of iterators with support for input iterators - can parse directly from streambuf_iterators.
Doesn't need to load the entire file into memory first.
- Push-based stream model.
- Full support for Unicode. Assumes that char is UTF-8. If wchar_t is 16 bits, assumes UTF-16, with support for surrogate pairs; otherwise assumes UTF-32.
- Pluggable encoding support. The public interface doesn't expose this yet. Currently, narrow input streams are assumed to use
UTF-8 both internally and externally, and wide streams are assumed to use UTF-16 or UTF-32, depending on the bit width of wchar_t.
Malformed encodings are not accepted.
The pluggable support allows inserting other external encodings, or making narrow streams parse into wide internal trees, etc.
- Replaceable event handlers. These are also not yet exposed by the public interface. They allow parsing into non-string
property trees and preserving the type information of the JSON.