<!doctype HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
|
<html>
|
|
<!--
|
|
(C) Copyright 2002-4 Robert Ramey - http://www.rrsd.com .
|
|
Use, modification and distribution is subject to the Boost Software
|
|
License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
|
|
http://www.boost.org/LICENSE_1_0.txt)
|
|
-->
|
|
<head>
|
|
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
|
|
<link rel="stylesheet" type="text/css" href="../../../boost.css">
|
|
<link rel="stylesheet" type="text/css" href="style.css">
|
|
<title>Serialization - Tutorial</title>
|
|
</head>
|
|
<body link="#0000ff" vlink="#800080">
|
|
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary="header">
|
|
<tr>
|
|
<td valign="top" width="300">
|
|
<h3><a href="../../../index.htm"><img height="86" width="277" alt="C++ Boost" src="../../../boost.png" border="0"></a></h3>
|
|
</td>
|
|
<td valign="top">
|
|
<h1 align="center">Serialization</h1>
|
|
<h2 align="center">Tutorial</h2>
|
|
</td>
|
|
</tr>
|
|
</table>
|
|
<hr>
|
|
<dl class="page-index">
|
|
<dt><a href="#simplecase">A Very Simple Case</a>
|
|
<dt><a href="#nonintrusiveversion">Non Intrusive Version</a>
|
|
<dt><a href="#serializablemembers">Serializable Members</a>
|
|
<dt><a href="#derivedclasses">Derived Classes</a>
|
|
<dt><a href="#pointers">Pointers</a>
|
|
<dt><a href="#arrays">Arrays</a>
|
|
<dt><a href="#stl">STL Collections</a>
|
|
<dt><a href="#versioning">Class Versioning</a>
|
|
<dt><a href="#splitting">Splitting <code style="white-space: normal">serialize</code> into <code style="white-space: normal">save/load</code></a>
|
|
<dt><a href="#archives">Archives</a>
|
|
<dt><a href="#examples">List of examples</a>
|
|
</dl>
|
|
An output archive is similar to an output data stream. Data can be saved to the archive
|
|
with either the << or the & operator:
|
|
<pre><code>
|
|
ar << data;
|
|
ar & data;
|
|
</code></pre>
|
|
An input archive is similar to an input data stream. Data can be loaded from the archive
with either the >> or the & operator:
|
|
<pre><code>
|
|
ar >> data;
|
|
ar & data;
|
|
</code></pre>
|
|
<p>
|
|
When these operators are invoked for primitive data types, the data is simply saved/loaded
|
|
to/from the archive. When invoked for class data types, the class
|
|
<code style="white-space: normal">serialize</code> function is invoked. Each
|
|
<code style="white-space: normal">serialize</code> function uses the above operators
|
|
to save/load its data members. This process will continue in a recursive manner until
|
|
all the data contained in the class is saved/loaded.
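<p>
The recursion can be sketched without the library. The following is a minimal illustration under stated assumptions, not the library's implementation: <code style="white-space: normal">toy_oarchive</code>, <code style="white-space: normal">toy_iarchive</code>, <code style="white-space: normal">point</code> and <code style="white-space: normal">segment</code> are hypothetical names, and <code style="white-space: normal">int</code> is the only primitive type handled.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical output archive: writes space-separated text.
class toy_oarchive {
    std::ostream & os_;
public:
    explicit toy_oarchive(std::ostream & os) : os_(os) {}
    // primitive type: the value is simply written
    toy_oarchive & operator&(const int & v) { os_ << v << ' '; return *this; }
    // class type: the class's serialize function is invoked
    template<class T>
    toy_oarchive & operator&(const T & t) {
        const_cast<T &>(t).serialize(*this, 0);
        return *this;
    }
    template<class T>
    toy_oarchive & operator<<(const T & t) { return *this & t; }
};

// Hypothetical input archive: reads the same text back.
class toy_iarchive {
    std::istream & is_;
public:
    explicit toy_iarchive(std::istream & is) : is_(is) {}
    toy_iarchive & operator&(int & v) { is_ >> v; return *this; }
    template<class T>
    toy_iarchive & operator&(T & t) { t.serialize(*this, 0); return *this; }
    template<class T>
    toy_iarchive & operator>>(T & t) { return *this & t; }
};

struct point {
    int x;
    int y;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int /* version */) {
        ar & x;
        ar & y;
    }
};

struct segment {
    point from;
    point to;
    // each member is a class, so its own serialize is called in turn
    template<class Archive>
    void serialize(Archive & ar, const unsigned int /* version */) {
        ar & from;
        ar & to;
    }
};

std::string save_segment(const segment & s) {
    std::ostringstream os;
    toy_oarchive oa(os);
    oa << s;
    return os.str();
}

segment load_segment(const std::string & text) {
    std::istringstream is(text);
    toy_iarchive ia(is);
    segment s = segment();
    ia >> s;
    assert(!is.fail());  // the text held every expected value
    return s;
}
```

<p>
Applying one operator to a <code style="white-space: normal">segment</code> reaches all four integers, because each <code style="white-space: normal">serialize</code> function hands its members back to the archive.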
|
|
|
|
<h3><a name="simplecase">A Very Simple Case</a></h3>
|
|
These operators are used inside the <code style="white-space: normal">serialize</code>
|
|
function to save and load class data members.
|
|
<p>
|
|
Included in this library is a program called
|
|
<a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a> which illustrates how
|
|
to use this system. Below we excerpt code from this program to
|
|
illustrate with the simplest possible case how this library is
|
|
intended to be used.
|
|
<pre>
|
|
<code>
|
|
#include <fstream>
|
|
|
|
// include headers that implement an archive in simple text format
|
|
#include <boost/archive/text_oarchive.hpp>
|
|
#include <boost/archive/text_iarchive.hpp>
|
|
|
|
/////////////////////////////////////////////////////////////
|
|
// gps coordinate
|
|
//
|
|
// illustrates serialization for a simple type
|
|
//
|
|
class gps_position
|
|
{
|
|
private:
|
|
friend class boost::serialization::access;
|
|
// When the class Archive corresponds to an output archive, the
|
|
// & operator is defined similar to <<. Likewise, when the class Archive
|
|
// is a type of input archive the & operator is defined similar to >>.
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
ar & degrees;
|
|
ar & minutes;
|
|
ar & seconds;
|
|
}
|
|
int degrees;
|
|
int minutes;
|
|
float seconds;
|
|
public:
|
|
gps_position(){};
|
|
gps_position(int d, int m, float s) :
|
|
degrees(d), minutes(m), seconds(s)
|
|
{}
|
|
};
|
|
|
|
int main() {
|
|
// create and open a character archive for output
|
|
std::ofstream ofs("filename");
|
|
|
|
// create class instance
|
|
const gps_position g(35, 59, 24.567f);
|
|
|
|
// save data to archive
|
|
{
|
|
boost::archive::text_oarchive oa(ofs);
|
|
// write class instance to archive
|
|
oa << g;
|
|
// archive and stream closed when destructors are called
|
|
}
|
|
|
|
// ... some time later restore the class instance to its original state
|
|
gps_position newg;
|
|
{
|
|
// create and open an archive for input
|
|
std::ifstream ifs("filename");
|
|
boost::archive::text_iarchive ia(ifs);
|
|
// read class state from archive
|
|
ia >> newg;
|
|
// archive and stream closed when destructors are called
|
|
}
|
|
return 0;
|
|
}
|
|
</code>
|
|
</pre>
|
|
<p>For each class to be saved via serialization, there must exist a function to
|
|
save all the class members which define the state of the class.
|
|
For each class to be loaded via serialization, there must exist a function to
|
|
load these class members in the same sequence as they were saved.
|
|
In the above example, these functions are generated by the
|
|
template member function <code style="white-space: normal">serialize</code>.
|
|
|
|
<h3><a name="nonintrusiveversion">Non Intrusive Version</a></h3>
|
|
<p>The above formulation is intrusive. That is, it requires
|
|
that classes whose instances are to be serialized be
|
|
altered. This can be inconvenient in some cases.
|
|
An equivalent alternative formulation permitted by the
|
|
system would be:
|
|
<pre><code>
|
|
#include <boost/archive/text_oarchive.hpp>
|
|
#include <boost/archive/text_iarchive.hpp>
|
|
|
|
class gps_position
|
|
{
|
|
public:
|
|
int degrees;
|
|
int minutes;
|
|
float seconds;
|
|
gps_position(){};
|
|
gps_position(int d, int m, float s) :
|
|
degrees(d), minutes(m), seconds(s)
|
|
{}
|
|
};
|
|
|
|
namespace boost {
|
|
namespace serialization {
|
|
|
|
template<class Archive>
|
|
void serialize(Archive & ar, gps_position & g, const unsigned int version)
|
|
{
|
|
ar & g.degrees;
|
|
ar & g.minutes;
|
|
ar & g.seconds;
|
|
}
|
|
|
|
} // namespace serialization
|
|
} // namespace boost
|
|
</code></pre>
|
|
<p>
|
|
In this case the generated serialize functions are not members of the
|
|
<code style="white-space: normal">gps_position</code> class. The two formulations function
|
|
in exactly the same way.
|
|
<p>
|
|
The main application of non-intrusive serialization is to permit serialization
|
|
to be implemented for classes without changing the class definition.
|
|
In order for this to be possible, the class must expose enough information
|
|
to reconstruct the class state. In this example, we presumed that the
|
|
class had <code style="white-space: normal">public</code> members - not a common occurrence. Only
|
|
classes which expose enough information to save and restore the class
|
|
state will be serializable without changing the class definition.
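<p>
One way to picture how both formulations can coexist is a default free function that forwards to the member function, with a more specialized overload supplied for a class that cannot be edited. The sketch below assumes this arrangement and uses hypothetical names throughout (<code style="white-space: normal">toy::oarchive</code>, <code style="white-space: normal">ours</code>, <code style="white-space: normal">theirs</code>); it is not the library's code.

```cpp
#include <cassert>
#include <vector>

namespace toy {

// Default: forward to the member function (the intrusive form).
template<class Archive, class T>
void serialize(Archive & ar, T & t, const unsigned int version) {
    t.serialize(ar, version);
}

// Hypothetical output archive that appends ints to a vector.
class oarchive {
    std::vector<int> & out_;
public:
    explicit oarchive(std::vector<int> & out) : out_(out) {}
    oarchive & operator&(const int & v) { out_.push_back(v); return *this; }
    template<class T>
    oarchive & operator&(const T & t) {
        // unqualified call: overloads declared later in namespace toy
        // are still found when the template is instantiated
        serialize(*this, const_cast<T &>(t), 0);
        return *this;
    }
};

} // namespace toy

// A class we may edit: it carries its own serialize member.
struct ours {
    int a;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int /* version */) { ar & a; }
};

// A class we may not edit: it only exposes public data.
struct theirs {
    int b;
};

namespace toy {

// Non-intrusive overload, chosen over the default for theirs.
template<class Archive>
void serialize(Archive & ar, theirs & t, const unsigned int /* version */) {
    ar & t.b;
}

} // namespace toy

std::vector<int> save_both(const ours & o, const theirs & t) {
    std::vector<int> out;
    toy::oarchive oa(out);
    oa & o & t;
    assert(out.size() == 2);  // one int per object
    return out;
}
```

<p>
The calling code is the same for both classes; only the place where <code style="white-space: normal">serialize</code> is defined differs.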
|
|
<h3><a name="serializablemembers">Serializable Members</a></h3>
|
|
<p>
|
|
A serializable class with serializable members would look like this:
|
|
<pre><code>
|
|
class bus_stop
|
|
{
|
|
friend class boost::serialization::access;
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
ar & latitude;
|
|
ar & longitude;
|
|
}
|
|
gps_position latitude;
|
|
gps_position longitude;
|
|
protected:
|
|
bus_stop(const gps_position & lat_, const gps_position & long_) :
|
|
latitude(lat_), longitude(long_)
|
|
{}
|
|
public:
|
|
bus_stop(){}
|
|
// See item # 14 in Effective C++ by Scott Meyers.
|
|
// re non-virtual destructors in base classes.
|
|
virtual ~bus_stop(){}
|
|
};
|
|
</code></pre>
|
|
<p>That is, members of class type are serialized just as
|
|
members of primitive types are.
|
|
<p>
|
|
Note that saving an instance of the class <code style="white-space: normal">bus_stop</code>
|
|
with one of the archive operators will invoke the
|
|
<code style="white-space: normal">serialize</code> function which saves
|
|
<code style="white-space: normal">latitude</code> and
|
|
<code style="white-space: normal">longitude</code>. Each of these in turn will be saved by invoking
|
|
<code style="white-space: normal">serialize</code> in the definition of
|
|
<code style="white-space: normal">gps_position</code>. In this manner the whole
|
|
data structure is saved by the application of an archive operator to
|
|
just its root item.
|
|
|
|
|
|
<h3><a name="derivedclasses">Derived Classes</a></h3>
|
|
<p>Derived classes should include serializations of their base classes.
|
|
<pre><code>
|
|
#include <boost/serialization/base_object.hpp>
|
|
|
|
class bus_stop_corner : public bus_stop
|
|
{
|
|
friend class boost::serialization::access;
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
// serialize base class information
|
|
ar & boost::serialization::base_object<bus_stop>(*this);
|
|
ar & street1;
|
|
ar & street2;
|
|
}
|
|
std::string street1;
|
|
std::string street2;
|
|
virtual std::string description() const
|
|
{
|
|
return street1 + " and " + street2;
|
|
}
|
|
public:
|
|
bus_stop_corner(){}
|
|
bus_stop_corner(const gps_position & lat_, const gps_position & long_,
|
|
const std::string & s1_, const std::string & s2_
|
|
) :
|
|
bus_stop(lat_, long_), street1(s1_), street2(s2_)
|
|
{}
|
|
};
|
|
</code>
|
|
</pre>
|
|
<p>
|
|
Note the serialization of the base classes from the derived
|
|
class. Do <b>NOT</b> directly call the base class serialize
|
|
functions. Doing so might seem to work but will bypass the code
|
|
that tracks instances written to storage to eliminate redundancies.
|
|
It will also bypass the writing of class version information into
|
|
the archive. For this reason, it is advisable to always make member
|
|
<code style="white-space: normal">serialize</code> functions private. The declaration
|
|
<code style="white-space: normal">friend class boost::serialization::access</code> will grant to the
|
|
serialization library access to private member variables and functions.
|
|
<p>
|
|
<h3><a name="pointers">Pointers</a></h3>
|
|
Suppose we define a bus route as an array of bus stops. Given that
|
|
<ol>
|
|
<li>we might have several types of bus stops (remember bus_stop is
|
|
a base class)
|
|
<li>a given bus_stop might appear in more than one route.
|
|
</ol>
|
|
it's convenient to represent a bus route with an array of pointers
|
|
to <code style="white-space: normal">bus_stop</code>.
|
|
<pre>
|
|
<code>
|
|
class bus_route
|
|
{
|
|
friend class boost::serialization::access;
|
|
bus_stop * stops[10];
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
int i;
|
|
for(i = 0; i < 10; ++i)
|
|
ar & stops[i];
|
|
}
|
|
public:
|
|
bus_route(){}
|
|
};
|
|
</code>
|
|
</pre>
|
|
Each member of the array <code style="white-space: normal">stops</code> will be serialized.
|
|
But remember each member is a pointer - so what can this really
|
|
mean? The whole object of this serialization is to permit
|
|
reconstruction of the original data structures at another place
|
|
and time. In order to accomplish this with a pointer, it is
|
|
not sufficient to save the value of the pointer, rather the
|
|
object it points to must be saved. When the member is later
|
|
loaded, a new object has to be created and a new pointer has
|
|
to be loaded into the class member.
|
|
<p>
|
|
If the same pointer is serialized more than once, only one instance
is added to the archive. When a repeated pointer is read back, no object data is read in.
The only operation that occurs is that the repeated pointer is set equal to the first.
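<p>
The bookkeeping behind this can be pictured with a small table of addresses already seen. The following is a simplified sketch, not the library's tracking code: <code style="white-space: normal">stop</code>, <code style="white-space: normal">record</code>, <code style="white-space: normal">save_tracked</code> and <code style="white-space: normal">load_tracked</code> are hypothetical, and the "archive" is just a vector of records.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct stop {
    int id;
};

// One archive entry: ref == -1 means "new object, payload follows";
// otherwise ref is the index of an object saved earlier.
struct record {
    int ref;
    int payload;
};

std::vector<record> save_tracked(const std::vector<stop *> & ptrs) {
    std::map<const stop *, int> seen;  // address -> index of first save
    std::vector<record> out;
    for (std::size_t i = 0; i < ptrs.size(); ++i) {
        std::map<const stop *, int>::const_iterator it = seen.find(ptrs[i]);
        if (it == seen.end()) {
            const int index = static_cast<int>(seen.size());
            seen[ptrs[i]] = index;
            record r = { -1, ptrs[i]->id };  // object data written once
            out.push_back(r);
        } else {
            record r = { it->second, 0 };    // back-reference only
            out.push_back(r);
        }
    }
    return out;
}

// The caller owns the objects created here.
std::vector<stop *> load_tracked(const std::vector<record> & in) {
    std::vector<stop *> created;  // objects in order of first appearance
    std::vector<stop *> result;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i].ref < 0) {
            stop * s = new stop;
            s->id = in[i].payload;
            created.push_back(s);
            result.push_back(s);
        } else {
            assert(static_cast<std::size_t>(in[i].ref) < created.size());
            result.push_back(created[in[i].ref]);  // reuse, no new object
        }
    }
    return result;
}
```

<p>
After loading, pointers that were equal when saved are equal again, although they refer to newly created objects.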
|
|
<p>
|
|
Note that, in this example, the array consists of polymorphic pointers.
|
|
That is, each array element points to one of several possible
kinds of bus stops. So when the pointer is saved, some sort of class
identifier must be saved. When the pointer is loaded, the class
identifier must be read and an instance of the corresponding class
must be constructed. Finally, the data can be loaded into the newly created
instance of the correct type.
|
|
|
|
As can be seen in
|
|
<a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a>,
|
|
serialization of pointers to derived classes through a base
|
|
class pointer may require explicit enumeration of the derived
|
|
classes to be serialized. This is referred to as "registration" or "export"
|
|
of derived classes. This requirement and the methods of
|
|
satisfying it are explained in detail
|
|
<a href="serialization.html#derivedpointers">here</a>.
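<p>
To see why the derived classes have to be made known in advance, consider a table that maps a saved class identifier to a function that constructs the matching type. This is only an illustration of the idea, assuming string identifiers; <code style="white-space: normal">registry</code>, <code style="white-space: normal">register_class</code>, <code style="white-space: normal">stop_base</code> and <code style="white-space: normal">corner_stop</code> are hypothetical and do not reproduce the library's registration or export interface.

```cpp
#include <cassert>
#include <map>
#include <string>

struct stop_base {
    virtual ~stop_base() {}
    virtual const char * class_key() const { return "stop_base"; }
};

struct corner_stop : stop_base {
    virtual const char * class_key() const { return "corner_stop"; }
};

typedef stop_base * (*factory)();

template<class T>
stop_base * make() { return new T; }

// Hypothetical table from class identifier to constructor function.
class registry {
    std::map<std::string, factory> factories_;
public:
    template<class T>
    void register_class(const std::string & key) {
        factories_[key] = &make<T>;
    }
    // Returns a null pointer for an identifier that was never registered.
    stop_base * create(const std::string & key) const {
        std::map<std::string, factory>::const_iterator it = factories_.find(key);
        if (it == factories_.end())
            return 0;
        stop_base * p = it->second();
        assert(p != 0);
        return p;
    }
};
```

<p>
When loading, the identifier read from the archive selects the factory. An identifier with no entry cannot be turned back into an object, which is why every derived class that may appear behind a base class pointer has to be enumerated.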
|
|
<p>
|
|
All this is accomplished automatically by the serialization
|
|
library. The above code is all that is necessary to accomplish
|
|
the saving and loading of objects accessed through pointers.
|
|
<p>
|
|
<h3><a name="arrays">Arrays</a></h3>
|
|
The above formulation is in fact more complex than necessary.
|
|
The serialization library detects when the object being
|
|
serialized is an array and emits code equivalent to the above.
|
|
So the above can be shortened to:
|
|
<pre>
|
|
<code>
|
|
class bus_route
|
|
{
|
|
friend class boost::serialization::access;
|
|
bus_stop * stops[10];
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
ar & stops;
|
|
}
|
|
public:
|
|
bus_route(){}
|
|
};
|
|
</code>
|
|
</pre>
|
|
<h3><a name="stl">STL Collections</a></h3>
|
|
The above example uses an array of members. More likely such
|
|
an application would use an STL collection for such a purpose.
|
|
The serialization library contains code for serialization
|
|
of all STL classes. Hence, the reformulation below will
|
|
also work as one would expect.
|
|
<pre>
|
|
<code>
|
|
#include <boost/serialization/list.hpp>
|
|
|
|
class bus_route
|
|
{
|
|
friend class boost::serialization::access;
|
|
std::list<bus_stop *> stops;
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
ar & stops;
|
|
}
|
|
public:
|
|
bus_route(){}
|
|
};
|
|
</code>
|
|
</pre>
|
|
<h3><a name="versioning">Class Versioning</a></h3>
|
|
<p>
|
|
Suppose we're satisfied with our <code style="white-space: normal">bus_route</code> class, build a program
|
|
that uses it and ship the product. Some time later, it's decided
|
|
that the program needs enhancement and the <code style="white-space: normal">bus_route</code> class is
|
|
altered to include the name of the driver of the route. So the
|
|
new version looks like:
|
|
<pre>
|
|
<code>
|
|
#include <boost/serialization/list.hpp>
|
|
#include <boost/serialization/string.hpp>
|
|
|
|
class bus_route
|
|
{
|
|
friend class boost::serialization::access;
|
|
std::list<bus_stop *> stops;
|
|
std::string driver_name;
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
ar & driver_name;
|
|
ar & stops;
|
|
}
|
|
public:
|
|
bus_route(){}
|
|
};
|
|
</code>
|
|
</pre>
|
|
Great, we're all done. Except... what about people using our application
|
|
who now have a bunch of files created under the previous program.
|
|
How can these be used with our new program version?
|
|
<p>
|
|
In general, the serialization library stores a version number in the
|
|
archive for each class serialized. By default this version number is 0.
|
|
When the archive is loaded, the version number under which it was saved
|
|
is read. The above code can be altered to exploit this:
|
|
<pre>
|
|
<code>
|
|
#include <boost/serialization/list.hpp>
|
|
#include <boost/serialization/string.hpp>
|
|
#include <boost/serialization/version.hpp>
|
|
|
|
class bus_route
|
|
{
|
|
friend class boost::serialization::access;
|
|
std::list<bus_stop *> stops;
|
|
std::string driver_name;
|
|
template<class Archive>
|
|
void serialize(Archive & ar, const unsigned int version)
|
|
{
|
|
// only save/load driver_name for newer archives
|
|
if(version > 0)
|
|
ar & driver_name;
|
|
ar & stops;
|
|
}
|
|
public:
|
|
bus_route(){}
|
|
};
|
|
|
|
BOOST_CLASS_VERSION(bus_route, 1)
|
|
</code>
|
|
</pre>
|
|
By application of versioning to each class, there is no need to
|
|
try to maintain a versioning of files. That is, a file version
|
|
is the combination of the versions of all its constituent classes.
|
|
|
|
This system permits programs to be always compatible with archives
|
|
created by all previous versions of a program with no more
|
|
effort than required by this example.
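<p>
The effect of the stored version number can be traced by hand with a toy text format. The sketch below is an assumption-laden illustration rather than the library's archive layout: the version is written first, <code style="white-space: normal">route</code>, <code style="white-space: normal">save_route</code> and <code style="white-space: normal">load_route</code> are hypothetical, and the driver name is assumed to contain no spaces.

```cpp
#include <cassert>
#include <sstream>
#include <string>

struct route {
    std::string driver_name;  // added in version 1
    int stop_count;
};

// The version argument lets us imitate a file written by the older program.
std::string save_route(const route & r, unsigned int version) {
    std::ostringstream os;
    os << version << ' ';
    if (version > 0)
        os << r.driver_name << ' ';
    os << r.stop_count;
    return os.str();
}

route load_route(const std::string & text) {
    std::istringstream is(text);
    unsigned int version = 0;
    is >> version;
    route r;
    r.stop_count = 0;
    // only read driver_name from newer archives
    if (version > 0)
        is >> r.driver_name;
    is >> r.stop_count;
    assert(!is.fail());
    return r;
}
```

<p>
A file written before the change, such as <code style="white-space: normal">"0 12"</code>, loads with an empty driver name, while a version 1 file restores both members.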
|
|
|
|
<h3><a name="splitting">Splitting <code style="white-space: normal">serialize</code>
|
|
into <code style="white-space: normal">save/load</code></a></h3>
|
|
The <code style="white-space: normal">serialize</code> function is simple, concise, and guarantees
|
|
that class members are saved and loaded in the same sequence
|
|
- the key to the serialization system. However, there are cases
|
|
where the load and save operations are not as similar as the examples
|
|
used here. For example, this could occur with a class that has evolved through
|
|
multiple versions. The above class can be reformulated as:
|
|
<pre>
|
|
<code>
|
|
#include <boost/serialization/list.hpp>
|
|
#include <boost/serialization/string.hpp>
|
|
#include <boost/serialization/version.hpp>
|
|
#include <boost/serialization/split_member.hpp>
|
|
|
|
class bus_route
|
|
{
|
|
friend class boost::serialization::access;
|
|
std::list<bus_stop *> stops;
|
|
std::string driver_name;
|
|
template<class Archive>
|
|
void save(Archive & ar, const unsigned int version) const
|
|
{
|
|
// note, version is always the latest when saving
|
|
ar & driver_name;
|
|
ar & stops;
|
|
}
|
|
template<class Archive>
|
|
void load(Archive & ar, const unsigned int version)
|
|
{
|
|
if(version > 0)
|
|
ar & driver_name;
|
|
ar & stops;
|
|
}
|
|
BOOST_SERIALIZATION_SPLIT_MEMBER()
|
|
public:
|
|
bus_route(){}
|
|
};
|
|
|
|
BOOST_CLASS_VERSION(bus_route, 1)
|
|
</code>
|
|
</pre>
|
|
The macro <code style="white-space: normal">BOOST_SERIALIZATION_SPLIT_MEMBER()</code> generates
|
|
code which invokes the <code style="white-space: normal">save</code>
|
|
or <code style="white-space: normal">load</code>
|
|
depending on whether the archive is used for saving or loading.
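<p>
One way such a dispatch could work is to let each archive advertise its direction as a type and select an overload on it. The following sketch assumes that approach; it does not show what the macro actually expands to, and <code style="white-space: normal">saving_tag</code>, <code style="white-space: normal">loading_tag</code>, <code style="white-space: normal">split</code>, the two toy archives and <code style="white-space: normal">reading</code> are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct saving_tag {};
struct loading_tag {};

class toy_oarchive {
    std::vector<int> & data_;
public:
    typedef saving_tag direction;
    explicit toy_oarchive(std::vector<int> & d) : data_(d) {}
    toy_oarchive & operator&(const int & v) { data_.push_back(v); return *this; }
};

class toy_iarchive {
    const std::vector<int> & data_;
    std::size_t next_;
public:
    typedef loading_tag direction;
    explicit toy_iarchive(const std::vector<int> & d) : data_(d), next_(0) {}
    toy_iarchive & operator&(int & v) {
        assert(next_ < data_.size());
        v = data_[next_++];
        return *this;
    }
};

// Chosen when the archive saves.
template<class Archive, class T>
void split(Archive & ar, T & t, unsigned int version, saving_tag) {
    t.save(ar, version);
}

// Chosen when the archive loads.
template<class Archive, class T>
void split(Archive & ar, T & t, unsigned int version, loading_tag) {
    t.load(ar, version);
}

struct reading {
    int raw;
    int doubled;  // derived value, not stored in the archive
    template<class Archive>
    void save(Archive & ar, const unsigned int /* version */) const {
        ar & raw;
    }
    template<class Archive>
    void load(Archive & ar, const unsigned int /* version */) {
        ar & raw;
        doubled = raw * 2;  // rebuilt on load, so save and load differ
    }
    template<class Archive>
    void serialize(Archive & ar, const unsigned int version) {
        split(ar, *this, version, typename Archive::direction());
    }
};

std::vector<int> save_reading(reading r) {
    std::vector<int> data;
    toy_oarchive oa(data);
    r.serialize(oa, 0);
    return data;
}

reading load_reading(const std::vector<int> & data) {
    toy_iarchive ia(data);
    reading r = reading();
    r.serialize(ia, 0);
    return r;
}
```

<p>
Only the overload matching the archive's direction is instantiated, so <code style="white-space: normal">save</code> can be a <code style="white-space: normal">const</code> member function even though <code style="white-space: normal">load</code> modifies the object.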
|
|
<h3><a name="archives">Archives</a></h3>
|
|
Our discussion here has focused on adding serialization
|
|
capability to classes. The actual rendering of the data to be serialized
|
|
is implemented in the archive class. Thus the stream of serialized
|
|
data is a product of the serialization of the class and the
|
|
archive selected. It is a key design decision that these two
|
|
components be independent. This permits any serialization specification
|
|
to be usable with any archive.
|
|
<p>
|
|
In this tutorial, we have used a particular
|
|
archive class - <code style="white-space: normal">text_oarchive</code> for saving and
|
|
<code style="white-space: normal">text_iarchive</code> for loading.
|
|
Text archives render data as text and are portable across platforms. In addition
to text archives, the library includes archive classes for native binary data
and xml formatted data. The interfaces to all archive classes are identical.
|
|
Once serialization has been defined for a class, that class can be serialized to
|
|
any type of archive.
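<p>
The independence of class and archive can be illustrated with two toy renderers that share one <code style="white-space: normal">serialize</code> function. This is a sketch only: <code style="white-space: normal">plain_sketch</code>, <code style="white-space: normal">tagged_sketch</code>, <code style="white-space: normal">time_of_day</code> and <code style="white-space: normal">render</code> are hypothetical, and the output formats are invented for the example rather than taken from the library's text or xml archives.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Renders each value as space-separated text.
class plain_sketch {
    std::ostringstream os_;
public:
    plain_sketch & operator&(const int & v) { os_ << v << ' '; return *this; }
    std::string str() const { return os_.str(); }
};

// Renders each value wrapped in a tag.
class tagged_sketch {
    std::ostringstream os_;
public:
    tagged_sketch & operator&(const int & v) {
        os_ << "<int>" << v << "</int>";
        return *this;
    }
    std::string str() const { return os_.str(); }
};

// The class describes what is saved; it knows nothing about the format.
struct time_of_day {
    int hour;
    int minute;
    template<class Archive>
    void serialize(Archive & ar, const unsigned int /* version */) {
        ar & hour;
        ar & minute;
    }
};

template<class Archive>
std::string render(time_of_day t) {
    Archive ar;
    t.serialize(ar, 0);
    const std::string text = ar.str();
    assert(!text.empty());  // both sketches emit something for every int
    return text;
}
```

<p>
The same class definition yields <code style="white-space: normal">9 30</code> with one renderer and tagged values with the other; switching formats requires no change to <code style="white-space: normal">time_of_day</code>.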
|
|
<p>
|
|
If the current set of archive classes doesn't provide the
|
|
attributes, format, or behavior needed for a particular application,
|
|
one can either make a new archive class or derive from an existing one.
|
|
This is described later in the manual.
|
|
|
|
<h3><a name="examples">List of Examples</a></h3>
|
|
<dl>
|
|
<dt><a href="../example/demo.cpp" target="demo_cpp">demo.cpp</a>
|
|
<dd>This is the completed example used in this tutorial.
|
|
It does the following:
|
|
<ol>
|
|
<li>Creates a structure of differing kinds of stops, routes and schedules
|
|
<li>Displays it
|
|
<li>Serializes it to a file named "testfile.txt" with one
|
|
statement
|
|
<li>Restores to another structure
|
|
<li>Displays the restored structure
|
|
</ol>
|
|
<a href="../example/demo_output.txt" target="demo_output">Output of
|
|
this program</a> is sufficient to verify that all the
|
|
originally stated requirements for a serialization system
|
|
are met with this system. The <a href="../example/demofile.txt"
|
|
target="test_file">contents of the archive file</a> can
|
|
also be displayed as serialization files are ASCII text.
|
|
|
|
<dt><a href="../example/demo_xml.cpp" target="demo_xml_cpp">demo_xml.cpp</a>
|
|
<dd>This is a variation of the original demo which supports xml archives in addition
to the others. The extra wrapping macro, BOOST_SERIALIZATION_NVP(name), is
needed to associate a data item name with the corresponding xml
tag. It is important that 'name' be a valid xml tag; otherwise it
will be impossible to restore the archive.
|
|
For more information see
|
|
<a target="detail" href="wrappers.html#nvp">Name-Value Pairs</a>.
|
|
<a href="../example/demo_save.xml" target="demo_save_xml">Here</a>
|
|
is what an xml archive looks like.
|
|
|
|
<dt><a href="../example/demo_xml_save.cpp" target="demo_xml_save_cpp">demo_xml_save.cpp</a>
|
|
and <a href="../example/demo_xml_load.cpp" target="demo_xml_load_cpp">demo_xml_load.cpp</a>
|
|
<dd>Note also that though our examples save and load the program data
|
|
to an archive within the same program, this is merely a convenience
|
|
for purposes of illustration. In general, the archive may or may
|
|
not be loaded by the same program that created it.
|
|
</dl>
|
|
<p>
|
|
The astute reader might notice that these examples contain a subtle but important flaw.
|
|
They leak memory. The bus stops are created in the <code style="white-space: normal">
|
|
main</code> function. The bus schedules may refer to these bus stops
|
|
any number of times. At the end of the main function after the bus schedules are destroyed,
|
|
the bus stops are destroyed. This seems fine. But what about the
<code style="white-space: normal">new_schedule</code> data item created by the
|
|
process of loading from an archive? This contains its own separate set of bus stops
|
|
that are not referenced outside of the bus schedule. These won't be destroyed
|
|
anywhere in the program - a memory leak.
|
|
<p>
|
|
There are a couple of ways of fixing this. One way is to explicitly manage the bus stops.
However, a more robust and transparent approach is to use
<code style="white-space: normal">shared_ptr</code> rather than raw pointers. Along
with serialization implementations for the Standard Library, the serialization library
includes an implementation of serialization for
<code style="white-space: normal">boost::shared_ptr</code>. Given this, it should be
easy to alter any of these examples to eliminate the memory leak. This is left
as an exercise for the reader.
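<p>
The ownership side of that change can be seen in isolation. The sketch below leaves serialization out entirely and uses <code style="white-space: normal">std::shared_ptr</code> as a stand-in for <code style="white-space: normal">boost::shared_ptr</code>; the <code style="white-space: normal">stop</code> and <code style="white-space: normal">route</code> types and the <code style="white-space: normal">live</code> counter are hypothetical and exist only to make destruction observable.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct stop {
    static int live;  // number of stop objects currently alive
    stop() { ++live; }
    stop(const stop &) { ++live; }
    ~stop() { --live; }
};
int stop::live = 0;

struct route {
    std::vector<std::shared_ptr<stop> > stops;
};

// Builds two routes that share one stop, then lets everything go out
// of scope and reports how many stops are still alive.
int live_after_routes_destroyed() {
    {
        std::shared_ptr<stop> shared = std::make_shared<stop>();
        route a;
        route b;
        a.stops.push_back(shared);
        b.stops.push_back(shared);
        b.stops.push_back(std::make_shared<stop>());
        assert(stop::live == 2);          // the shared stop exists once
        assert(shared.use_count() == 3);  // shared, a and b each hold it
    }
    return stop::live;
}
```

<p>
Each stop is destroyed when the last route referring to it goes away, so a structure rebuilt from an archive needs no separate cleanup code.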
|
|
|
|
<hr>
|
|
<p><i>© Copyright <a href="http://www.rrsd.com">Robert Ramey</a> 2002-2004.
|
|
Distributed under the Boost Software License, Version 1.0. (See
|
|
accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
|
|
</i></p>
|
|
</body>
|
|
</html>
|