457 lines
19 KiB
Plaintext
457 lines
19 KiB
Plaintext
[/
|
|
Copyright (c) 2016-2019 Vinnie Falco (vinnie dot falco at gmail dot com)
|
|
|
|
Distributed under the Boost Software License, Version 1.0. (See accompanying
|
|
file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
|
|
|
|
Official repository: https://github.com/boostorg/beast
|
|
]
|
|
|
|
[section HTTP Comparison to Other Libraries]
|
|
|
|
There are a few C++ published libraries which implement some of the HTTP
|
|
protocol. We analyze the message model chosen by those libraries and discuss
|
|
the advantages and disadvantages relative to Beast.
|
|
|
|
The general strategy used by the author to evaluate external libraries is
|
|
as follows:
|
|
|
|
* Review the message model. Can it represent a complete request or
|
|
response? What level of allocator support is present? How much
|
|
customization is possible?
|
|
|
|
* Review the stream abstraction. This is the type of object, such as
|
|
a socket, which may be used to parse or serialize (i.e. read and write).
|
|
Can user defined types be specified? What's the level of conformance to
|
|
to Asio or Networking-TS concepts?
|
|
|
|
* Check treatment of buffers. Does the library manage the buffers
|
|
or can users provide their own buffers?
|
|
|
|
* How does the library handle corner cases such as trailers,
|
|
Expect: 100-continue, or deferred commitment of the body type?
|
|
|
|
[note
|
|
Declarations examples from external libraries have been edited:
|
|
portions have been removed for simplification.
|
|
]
|
|
|
|
|
|
|
|
[heading cpp-netlib]
|
|
|
|
[@https://github.com/cpp-netlib/cpp-netlib/tree/092cd570fb179d029d1865aade9f25aae90d97b9 [*cpp-netlib]]
|
|
is a network programming library previously intended for Boost but not
|
|
having gone through formal review. As of this writing it still uses the
|
|
Boost name, namespace, and directory structure although the project states
|
|
that Boost acceptance is no longer a goal. The library is based on Boost.Asio
|
|
and bills itself as ['"a collection of network related routines/implementations
|
|
geared towards providing a robust cross-platform networking library"]. It
|
|
cites ['"Common Message Type"] as a feature. As of the branch previous
|
|
linked, it uses these declarations:
|
|
```
|
|
template <class Tag>
|
|
struct basic_message {
|
|
public:
|
|
typedef Tag tag;
|
|
|
|
typedef typename headers_container<Tag>::type headers_container_type;
|
|
typedef typename headers_container_type::value_type header_type;
|
|
typedef typename string<Tag>::type string_type;
|
|
|
|
headers_container_type& headers() { return headers_; }
|
|
headers_container_type const& headers() const { return headers_; }
|
|
|
|
string_type& body() { return body_; }
|
|
string_type const& body() const { return body_; }
|
|
|
|
string_type& source() { return source_; }
|
|
string_type const& source() const { return source_; }
|
|
|
|
string_type& destination() { return destination_; }
|
|
string_type const& destination() const { return destination_; }
|
|
|
|
private:
|
|
friend struct detail::directive_base<Tag>;
|
|
friend struct detail::wrapper_base<Tag, basic_message<Tag> >;
|
|
|
|
mutable headers_container_type headers_;
|
|
mutable string_type body_;
|
|
mutable string_type source_;
|
|
mutable string_type destination_;
|
|
};
|
|
```
|
|
|
|
This container is the base class template used to represent HTTP messages.
|
|
It uses a "tag" type style specializations for a variety of trait classes,
|
|
allowing for customization of the various parts of the message. For example,
|
|
a user specializes `headers_container<T>` to determine what container type
|
|
holds the header fields. We note some problems with the container declaration:
|
|
|
|
* The header and body containers may only be default-constructed.
|
|
|
|
* No stateful allocator support.
|
|
|
|
* There is no way to defer the commitment of the type for `body_` to
|
|
after the headers are read in.
|
|
|
|
* The message model includes a "source" and "destination." This is
|
|
extraneous metadata associated with the connection which is not part
|
|
of the HTTP protocol specification and belongs elsewhere.
|
|
|
|
* The use of `string_type` (a customization point) for source,
|
|
destination, and body suggests that `string_type` models a
|
|
[*ForwardRange] whose `value_type` is `char`. This representation
|
|
is less than ideal, considering that the library is built on
|
|
Boost.Asio. Adapting a __DynamicBuffer__ to the required forward
|
|
range destroys information conveyed by the __ConstBufferSequence__
|
|
and __MutableBufferSequence__ used in dynamic buffers. The consequence
|
|
is that cpp-netlib implementations will be less efficient than an
|
|
equivalent __NetTS__ conforming implementation.
|
|
|
|
* The library uses specializations of `string<Tag>` to change the type
|
|
of string used everywhere, including the body, field name and value
|
|
pairs, and extraneous metadata such as source and destination. The
|
|
user may only choose a single type: field name, field values, and
|
|
the body container will all use the same string type. This limits
|
|
utility of the customization point. The library's use of the string
|
|
trait is limited to selecting between `std::string` and `std::wstring`.
|
|
We do not find this use-case compelling given the limitations.
|
|
|
|
* The specialized trait classes generate a proliferation of small
|
|
additional framework types. To specialize traits, users need to exit
|
|
their namespace and intrude into the `boost::network::http` namespace.
|
|
The way the traits are used in the library limits the usefulness
|
|
of the traits to trivial purpose.
|
|
|
|
* The `string<Tag> customization point constrains user defined body types
|
|
to few possible strategies. There is no way to represent an HTTP message
|
|
body as a filename with accompanying algorithms to store or retrieve data
|
|
from the file system.
|
|
|
|
The design of the message container in this library is cumbersome
|
|
with its system of customization using trait specializations. The
|
|
use of these customizations is extremely limited due to the way they
|
|
are used in the container declaration, making the design overly
|
|
complex without corresponding benefit.
|
|
|
|
|
|
|
|
[heading Boost.HTTP]
|
|
|
|
[@https://github.com/BoostGSoC14/boost.http/tree/45fc1aa828a9e3810b8d87e669b7f60ec100bff4 [*boost.http]]
|
|
is a library resulting from the 2014 Google Summer of Code. It was submitted
|
|
for a Boost formal review and rejected in 2015. It is based on Boost.Asio,
|
|
and development on the library has continued to the present. As of the branch
|
|
previously linked, it uses these message declarations:
|
|
```
|
|
template<class Headers, class Body>
|
|
struct basic_message
|
|
{
|
|
typedef Headers headers_type;
|
|
typedef Body body_type;
|
|
|
|
headers_type &headers();
|
|
|
|
const headers_type &headers() const;
|
|
|
|
body_type &body();
|
|
|
|
const body_type &body() const;
|
|
|
|
headers_type &trailers();
|
|
|
|
const headers_type &trailers() const;
|
|
|
|
private:
|
|
headers_type headers_;
|
|
body_type body_;
|
|
headers_type trailers_;
|
|
};
|
|
|
|
typedef basic_message<boost::http::headers, std::vector<std::uint8_t>> message;
|
|
|
|
template<class Headers, class Body>
|
|
struct is_message<basic_message<Headers, Body>>: public std::true_type {};
|
|
```
|
|
|
|
* This container cannot model a complete message. The ['start-line] items
|
|
(method and target for requests, reason-phrase for responses) are
|
|
communicated out of band, as is the ['http-version]. A function that
|
|
operates on the message including the start line requires additional
|
|
parameters. This is evident in one of the
|
|
[@https://github.com/BoostGSoC14/boost.http/blob/45fc1aa828a9e3810b8d87e669b7f60ec100bff4/example/basic_router.cpp#L81 example programs].
|
|
The `500` and `"OK"` arguments represent the response ['status-code] and
|
|
['reason-phrase] respectively:
|
|
```
|
|
...
|
|
http::message reply;
|
|
...
|
|
self->socket.async_write_response(500, string_ref("OK"), reply, yield);
|
|
```
|
|
|
|
* `headers_`, `body_`, and `trailers_` may only be default-constructed,
|
|
since there are no explicitly declared constructors.
|
|
|
|
* There is no way to defer the commitment of the [*Body] type to after
|
|
the headers are read in. This is related to the previous limitation
|
|
on default-construction.
|
|
|
|
* No stateful allocator support. This follows from the previous limitation
|
|
on default-construction. Buffers for start-line strings must be
|
|
managed externally from the message object since they are not members.
|
|
|
|
* The trailers are stored in a separate object. Aside from the combinatorial
|
|
explosion of the number of additional constructors necessary to fully
|
|
support arbitrary forwarded parameter lists for each of the headers, body,
|
|
and trailers members, the requirement to know in advance whether a
|
|
particular HTTP field will be located in the headers or the trailers
|
|
poses an unnecessary complication for general purpose functions that
|
|
operate on messages.
|
|
|
|
* The declarations imply that `std::vector` is a model of [*Body].
|
|
More formally, that a body is represented by the [*ForwardRange]
|
|
concept whose `value_type` is an 8-bit integer. This representation
|
|
is less than ideal, considering that the library is built on
|
|
Boost.Asio. Adapting a __DynamicBuffer__ to the required forward range
|
|
destroys information conveyed by the __ConstBufferSequence__ and
|
|
__MutableBufferSequence__ used in dynamic buffers. The consequence is
|
|
that Boost.HTTP implementations will be less efficient when dealing
|
|
with body containers than an equivalent __NetTS__ conforming
|
|
implementation.
|
|
|
|
* The [*Body] customization point constrains user defined types to
|
|
very limited implementation strategies. For example, there is no way
|
|
to represent an HTTP message body as a filename with accompanying
|
|
algorithms to store or retrieve data from the file system.
|
|
|
|
This representation addresses a narrow range of use cases. It has
|
|
limited potential for customization and performance. It is more difficult
|
|
to use because it excludes the start line fields from the model.
|
|
|
|
|
|
|
|
[heading C++ REST SDK (cpprestsdk)]
|
|
|
|
[@https://github.com/Microsoft/cpprestsdk/tree/381f5aa92d0dfb59e37c0c47b4d3771d8024e09a [*cpprestsdk]]
|
|
is a Microsoft project which ['"...aims to help C++ developers connect to and
|
|
interact with services"]. It offers the most functionality of the libraries
|
|
reviewed here, including support for Websocket services using its websocket++
|
|
dependency. It can use native APIs such as HTTP.SYS when building Windows
|
|
based applications, and it can use Boost.Asio. The WebSocket module uses
|
|
Boost.Asio exclusively.
|
|
|
|
As cpprestsdk is developed by a large corporation, it contains quite a bit
|
|
of functionality and necessarily has more interfaces. We will break down
|
|
the interfaces used to model messages into more manageable pieces. This
|
|
is the container used to store the HTTP header fields:
|
|
```
|
|
class http_headers
|
|
{
|
|
public:
|
|
...
|
|
|
|
private:
|
|
std::map<utility::string_t, utility::string_t, _case_insensitive_cmp> m_headers;
|
|
};
|
|
```
|
|
|
|
This declaration is quite bare-bones. We note the typical problems of
|
|
most field containers:
|
|
|
|
* The container may only be default-constructed.
|
|
|
|
* No support for allocators, stateful or otherwise.
|
|
|
|
* There are no customization points at all.
|
|
|
|
Now we analyze the structure of
|
|
the larger message container. The library uses a handle/body idiom. There
|
|
are two public message container interfaces, one for requests (`http_request`)
|
|
and one for responses (`http_response`). Each interface maintains a private
|
|
shared pointer to an implementation class. Public member function calls
|
|
are routed to the internal implementation. This is the first implementation
|
|
class, which forms the base class for both the request and response
|
|
implementations:
|
|
```
|
|
namespace details {
|
|
|
|
class http_msg_base
|
|
{
|
|
public:
|
|
http_headers &headers() { return m_headers; }
|
|
|
|
_ASYNCRTIMP void set_body(const concurrency::streams::istream &instream, const utf8string &contentType);
|
|
|
|
/// Set the stream through which the message body could be read
|
|
void set_instream(const concurrency::streams::istream &instream) { m_inStream = instream; }
|
|
|
|
/// Set the stream through which the message body could be written
|
|
void set_outstream(const concurrency::streams::ostream &outstream, bool is_default) { m_outStream = outstream; m_default_outstream = is_default; }
|
|
|
|
const pplx::task_completion_event<utility::size64_t> & _get_data_available() const { return m_data_available; }
|
|
|
|
protected:
|
|
/// Stream to read the message body.
|
|
concurrency::streams::istream m_inStream;
|
|
|
|
/// stream to write the msg body
|
|
concurrency::streams::ostream m_outStream;
|
|
|
|
http_headers m_headers;
|
|
bool m_default_outstream;
|
|
|
|
/// <summary> The TCE is used to signal the availability of the message body. </summary>
|
|
pplx::task_completion_event<utility::size64_t> m_data_available;
|
|
};
|
|
```
|
|
|
|
To understand these declarations we need to first understand that cpprestsdk
|
|
uses the asynchronous model defined by Microsoft's
|
|
[@https://msdn.microsoft.com/en-us/library/dd504870.aspx [*Concurrency Runtime]].
|
|
Identifiers from the [@https://msdn.microsoft.com/en-us/library/jj987780.aspx [*`pplx` namespace]]
|
|
define common asynchronous patterns such as tasks and events. The
|
|
`concurrency::streams::istream` parameter and `m_data_available` data member
|
|
indicates a lack of separation of concerns. The representation of HTTP messages
|
|
should not be conflated with the asynchronous model used to serialize or
|
|
parse those messages in the message declarations.
|
|
|
|
The next declaration forms the complete implementation class referenced by the
|
|
handle in the public interface (which follows after):
|
|
```
|
|
/// Internal representation of an HTTP request message.
|
|
class _http_request final : public http::details::http_msg_base, public std::enable_shared_from_this<_http_request>
|
|
{
|
|
public:
|
|
_ASYNCRTIMP _http_request(http::method mtd);
|
|
|
|
_ASYNCRTIMP _http_request(std::unique_ptr<http::details::_http_server_context> server_context);
|
|
|
|
http::method &method() { return m_method; }
|
|
|
|
const pplx::cancellation_token &cancellation_token() const { return m_cancellationToken; }
|
|
|
|
_ASYNCRTIMP pplx::task<void> reply(const http_response &response);
|
|
|
|
private:
|
|
|
|
// Actual initiates sending the response, without checking if a response has already been sent.
|
|
pplx::task<void> _reply_impl(http_response response);
|
|
|
|
http::method m_method;
|
|
|
|
std::shared_ptr<progress_handler> m_progress_handler;
|
|
};
|
|
|
|
} // namespace details
|
|
```
|
|
|
|
As before, we note that the implementation class for HTTP requests concerns
|
|
itself more with the mechanics of sending the message asynchronously than
|
|
it does with actually modeling the HTTP message as described in __rfc7230__:
|
|
|
|
* The constructor accepting `std::unique_ptr<http::details::_http_server_context`
|
|
breaks encapsulation and separation of concerns. This cannot be extended
|
|
for user defined server contexts.
|
|
|
|
* The "cancellation token" is stored inside the message. This breaks the
|
|
separation of concerns.
|
|
|
|
* The `_reply_impl` function implies that the message implementation also
|
|
shares responsibility for the means of sending back an HTTP reply. This
|
|
would be better if it was completely separate from the message container.
|
|
|
|
Finally, here is the public class which represents an HTTP request:
|
|
```
|
|
class http_request
|
|
{
|
|
public:
|
|
const http::method &method() const { return _m_impl->method(); }
|
|
|
|
void set_method(const http::method &method) const { _m_impl->method() = method; }
|
|
|
|
/// Extract the body of the request message as a string value, checking that the content type is a MIME text type.
|
|
/// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
|
|
pplx::task<utility::string_t> extract_string(bool ignore_content_type = false)
|
|
{
|
|
auto impl = _m_impl;
|
|
return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->extract_string(ignore_content_type); });
|
|
}
|
|
|
|
/// Extracts the body of the request message into a json value, checking that the content type is application/json.
|
|
/// A body can only be extracted once because in some cases an optimization is made where the data is 'moved' out.
|
|
pplx::task<json::value> extract_json(bool ignore_content_type = false) const
|
|
{
|
|
auto impl = _m_impl;
|
|
return pplx::create_task(_m_impl->_get_data_available()).then([impl, ignore_content_type](utility::size64_t) { return impl->_extract_json(ignore_content_type); });
|
|
}
|
|
|
|
/// Sets the body of the message to the contents of a byte vector. If the 'Content-Type'
|
|
void set_body(const std::vector<unsigned char> &body_data);
|
|
|
|
/// Defines a stream that will be relied on to provide the body of the HTTP message when it is
|
|
/// sent.
|
|
void set_body(const concurrency::streams::istream &stream, const utility::string_t &content_type = _XPLATSTR("application/octet-stream"));
|
|
|
|
/// Defines a stream that will be relied on to hold the body of the HTTP response message that
|
|
/// results from the request.
|
|
void set_response_stream(const concurrency::streams::ostream &stream);
|
|
{
|
|
return _m_impl->set_response_stream(stream);
|
|
}
|
|
|
|
/// Defines a callback function that will be invoked for every chunk of data uploaded or downloaded
|
|
/// as part of the request.
|
|
void set_progress_handler(const progress_handler &handler);
|
|
|
|
private:
|
|
friend class http::details::_http_request;
|
|
friend class http::client::http_client;
|
|
|
|
std::shared_ptr<http::details::_http_request> _m_impl;
|
|
};
|
|
```
|
|
|
|
It is clear from this declaration that the goal of the message model in
|
|
this library is driven by its use-case (interacting with REST servers)
|
|
and not to model HTTP messages generally. We note problems similar to
|
|
the other declarations:
|
|
|
|
* There are no compile-time customization points at all. The only
|
|
customization is in the `concurrency::streams::istream` and
|
|
`concurrency::streams::ostream` reference parameters. Presumably,
|
|
these are abstract interfaces which may be subclassed by users
|
|
to achieve custom behaviors.
|
|
|
|
* The extraction of the body is conflated with the asynchronous model.
|
|
|
|
* No way to define an allocator for the container used when extracting
|
|
the body.
|
|
|
|
* A body can only be extracted once, limiting the use of this container
|
|
when using a functional programming style.
|
|
|
|
* Setting the body requires either a vector or a `concurrency::streams::istream`.
|
|
No user defined types are possible.
|
|
|
|
* The HTTP request container conflates HTTP response behavior (see the
|
|
`set_response_stream` member). Again this is likely purpose-driven but
|
|
the lack of separation of concerns limits this library to only the
|
|
uses explicitly envisioned by the authors.
|
|
|
|
The general theme of the HTTP message model in cpprestsdk is "no user
|
|
definable customizations". There is no allocator support, and no
|
|
separation of concerns. It is designed to perform a specific set of
|
|
behaviors. In other words, it does not follow the open/closed principle.
|
|
|
|
Tasks in the Concurrency Runtime operate in a fashion similar to
|
|
`std::future`, but with some improvements such as continuations which
|
|
are not yet in the C++ standard. The costs of using a task based
|
|
asynchronous interface instead of completion handlers is well
|
|
documented: synchronization points along the call chain of composed
|
|
task operations which cannot be optimized away. See:
|
|
[@http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3747.pdf
|
|
[*A Universal Model for Asynchronous Operations]] (Kohlhoff).
|
|
|
|
[endsect]
|