985 lines
32 KiB
Plaintext
985 lines
32 KiB
Plaintext
////
|
|
Copyright 2015-2017 Peter Dimov
|
|
|
|
Distributed under the Boost Software License, Version 1.0.
|
|
|
|
See accompanying file LICENSE_1_0.txt or copy at
|
|
http://www.boost.org/LICENSE_1_0.txt
|
|
////
|
|
|
|
# Simple {cpp}11 metaprogramming, part 2
|
|
Peter Dimov
|
|
2015-06-20
|
|
:toc: left
|
|
:idprefix:
|
|
:docinfo: shared-footer
|
|
|
|
[.lead]
|
|
__Efficient algorithms for membership testing, random access, and retrieval by
|
|
key__
|
|
|
|
NOTE: Being late to the metaprogramming party, I make no claim of having
|
|
invented the techniques in this article. A quick look at the implementations
|
|
of, for example, Louis Dionne's https://github.com/ldionne/mpl11[mpl11] and
|
|
Eric Niebler's https://github.com/ericniebler/meta[meta], shows that most of
|
|
these tricks are already known. Dave Abrahams
|
|
https://github.com/dabrahams/mpl11[has experimented] along these lines in 2012.
|
|
The original inventor of the multiple inheritance trick and the `void*`
|
|
arguments trick is probably Richard Smith, who has posted
|
|
https://llvm.org/bugs/attachment.cgi?id=8825[two]
|
|
https://llvm.org/bugs/attachment.cgi?id=8838[examples] in response to
|
|
https://llvm.org/bugs/show_bug.cgi?id=13263[a Clang bug report].
|
|
|
|
## Vectors, sets, and maps
|
|
|
|
<<simple_cxx11_metaprogramming.adoc#,Last time>>, I outlined a style of
|
|
metaprogramming that operated on type lists -- variadic class templates:
|
|
```
|
|
template<class... T> struct mp_list {};
|
|
```
|
|
Classic Lisp uses lists as its only data structure, but operating on a list is
|
|
usually linear in the number of its elements.
|
|
|
|
In addition to `list`, the STL has `vector`, `set`, and `map`. `vector`
|
|
supports random access by index; `set` has efficient test for membership; `map`
|
|
associates keys with values and has efficient lookup based on key.
|
|
|
|
Instead of introducing separate data structure such as `mp_vector`, `mp_set`,
|
|
`mp_map`, we'll keep our data in a list form, and attempt to provide efficient
|
|
algorithms for random access, membership testing, and lookup.
|
|
|
|
## mp_contains
|
|
|
|
Let's starts with sets. A set is just a list with unique elements. To obtain a
|
|
set from an arbitrary list, we'll need an algorithm that removes the
|
|
duplicates. Let's call it `mp_unique<L>`:
|
|
[subs=+quotes]
|
|
```
|
|
// mp_if
|
|
|
|
template<bool C, class T, class E> struct mp_if_c_impl;
|
|
|
|
template<class T, class E> struct mp_if_c_impl<true, T, E>
|
|
{
|
|
using type = T;
|
|
};
|
|
|
|
template<class T, class E> struct mp_if_c_impl<false, T, E>
|
|
{
|
|
using type = E;
|
|
};
|
|
|
|
template<bool C, class T, class E>
|
|
using mp_if_c = typename mp_if_c_impl<C, T, E>::type;
|
|
|
|
template<class C, class T, class E>
|
|
using mp_if = typename mp_if_c_impl<C::value != 0, T, E>::type;
|
|
|
|
// mp_unique
|
|
|
|
template<class L> struct mp_unique_impl;
|
|
|
|
template<class L> using mp_unique = typename mp_unique_impl<L>::type;
|
|
|
|
template<template<class...> class L> struct mp_unique_impl<L<>>
|
|
{
|
|
using type = L<>;
|
|
};
|
|
|
|
template<template<class...> class L, class T1, class... T>
|
|
struct mp_unique_impl<L<T1, T...>>
|
|
{
|
|
using _rest = mp_unique<L<T...>>;
|
|
using type = mp_if<**mp_contains**<_rest, T1>, _rest, mp_push_front<_rest, T1>>;
|
|
};
|
|
```
|
|
For membership testing, we've introduced an algorithm `mp_contains<L, V>` that
|
|
returns `true` when `L` contains `V`. The straightforward recursive
|
|
implementation of `mp_contains` is:
|
|
```
|
|
template<class L, class V> struct mp_contains_impl;
|
|
|
|
template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type;
|
|
|
|
template<template<class...> class L, class V> struct mp_contains_impl<L<>, V>
|
|
{
|
|
using type = std::false_type;
|
|
};
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_contains_impl<L<V, T...>, V>
|
|
{
|
|
using type = std::true_type;
|
|
};
|
|
|
|
template<template<class...> class L, class T1, class... T, class V>
|
|
struct mp_contains_impl<L<T1, T...>, V>: mp_contains_impl<L<T...>, V>
|
|
{
|
|
};
|
|
```
|
|
Note that `mp_unique<L>` makes `N` calls to `mp_contains`, where `N` is the
|
|
length of the list `L`. This means that `mp_contains` needs to be as fast as
|
|
possible, which the above implementation, well, isn't.
|
|
|
|
Here are the compile times in seconds for invoking `mp_unique` on a list with
|
|
`N` (distinct) elements:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, recursive |2.1 |DNF ||||||
|
|
|
|
|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF |||
|
|
|
|
|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF |||
|
|
|===
|
|
(Tests done under Windows/Cygwin. All compilers are 32 bit. No optimizations.
|
|
DNF stands for "did not finish", which usually means that the compiler ran out
|
|
of heap space or crashed.)
|
|
|
|
We clearly need a better alternative.
|
|
|
|
I ended the previous article with an implementation of `mp_contains` that
|
|
relied on `mp_count`, which in turn relied on `mp_plus`. Let's see how it
|
|
fares:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF ||||
|
|
|
|
|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF |||
|
|
|
|
|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3
|
|
|===
|
|
Not _that_ bad, at least if your compiler happens to be `g$$++$$`. Still, there
|
|
ought to be room for improvement here.
|
|
|
|
To do better, we have to somehow leverage the language features, such as pack
|
|
expansion, to do more of the work for us. For inspiration, let's turn to
|
|
section 14.5.3 paragraph 4 of the {cpp}11 standard, which explains that pack
|
|
expansions can occur in the following contexts:
|
|
|
|
* **In a function parameter pack (8.3.5); the pattern is the
|
|
__parameter-declaration__ without the ellipsis.**
|
|
* In a template parameter pack that is a pack expansion (14.1):
|
|
* **In an __initializer-list__ (8.5); the pattern is an
|
|
__initializer-clause__.**
|
|
* **In a __base-specifier-list__ (Clause 10); the pattern is a
|
|
__base-specifier__.**
|
|
* In a __mem-initializer-list__ (12.6.2); the pattern is a
|
|
__mem-initializer__.
|
|
* In a __template-argument-list__ (14.3); the pattern is a
|
|
__template-argument__.
|
|
* In a __dynamic-exception-specification__ (15.4); the pattern is a
|
|
__type-id__.
|
|
* In an __attribute-list__ (7.6.1); the pattern is an __attribute__.
|
|
* In an __alignment-specifier__ (7.6.2); the pattern is the
|
|
__alignment-specifier__ without the ellipsis.
|
|
* In a __capture-list__ (5.1.2); the pattern is a __capture__.
|
|
* In a `sizeof$$...$$` expression (5.3.3); the pattern is an __identifier__.
|
|
|
|
The **emphasis** is mine and indicates possible leads.
|
|
|
|
Our first option is to expand the parameter pack into arguments for a function
|
|
call. Since we're interested in operations that occur at compile time, calling
|
|
a function may not appear useful; but {cpp}11 functions can be `constexpr`, and
|
|
`constexpr` function "calls" do occur at compile time.
|
|
|
|
Recall our `mp_count`:
|
|
```
|
|
template<class L, class V> struct mp_count_impl;
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_count_impl<L<T...>, V>
|
|
{
|
|
using type = mp_plus<std::is_same<T, V>...>;
|
|
};
|
|
|
|
template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type;
|
|
```
|
|
Instead of using the template alias `mp_plus` to sum the `is_same` expressions,
|
|
we can use a `constexpr` function:
|
|
```
|
|
constexpr std::size_t cx_plus()
|
|
{
|
|
return 0;
|
|
}
|
|
|
|
template<class T1, class... T> constexpr std::size_t cx_plus(T1 t1, T... t)
|
|
{
|
|
return t1 + cx_plus(t...);
|
|
}
|
|
|
|
// mp_size_t
|
|
|
|
template<std::size_t N> using mp_size_t = std::integral_constant<std::size_t, N>;
|
|
|
|
// mp_count
|
|
|
|
template<class L, class V> struct mp_count_impl;
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_count_impl<L<T...>, V>
|
|
{
|
|
using type = mp_size_t<cx_plus(std::is_same<T, V>::value...)>;
|
|
};
|
|
|
|
template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type;
|
|
```
|
|
with the following results:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF |||
|
|
|
|
|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8
|
|
|===
|
|
We've improved the times, but lost VC$$++$$ 2013 due to its not implementing
|
|
`constexpr`.
|
|
|
|
Let's try pack expansion into an __initializer-list__. Instead of passing the
|
|
`is_same` expressions to a function, we can build a constant array out of them,
|
|
then sum the array with a `constexpr` function:
|
|
```
|
|
constexpr std::size_t cx_plus2(bool const * first, bool const * last)
|
|
{
|
|
return first == last? 0: *first + cx_plus2(first + 1, last);
|
|
}
|
|
|
|
// mp_count
|
|
|
|
template<class L, class V> struct mp_count_impl;
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_count_impl<L<T...>, V>
|
|
{
|
|
static constexpr bool _v[] = { std::is_same<T, V>::value... };
|
|
using type = mp_size_t<cx_plus2(_v, _v + sizeof...(T))>;
|
|
};
|
|
|
|
template<class L, class V> using mp_count = typename mp_count_impl<L, V>::type;
|
|
```
|
|
This is a neat trick, but is it fast?
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF ||||
|
|
|
|
|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7
|
|
|===
|
|
That's a bit disappointing. Let's see what can we do with expanding a parameter
|
|
pack into a base-specifier-list. We would be able to define a class that
|
|
derives from every element of the pack:
|
|
```
|
|
struct U: T... {};
|
|
```
|
|
We can then use `std::is_base_of<V, U>` to test whether a type `V` is a base of
|
|
`U`, that is, whether it's one of the elements of the parameter pack. Which is
|
|
exactly what we need.
|
|
|
|
Arbitrary types such as `void`, `int`, or `void(int)` can't be used as base
|
|
classes, but we'll wrap the types in an empty class template, which we'll call
|
|
`mp_identity`.
|
|
```
|
|
template<class T> struct mp_identity
|
|
{
|
|
using type = T;
|
|
};
|
|
|
|
template<class L, class V> struct mp_contains_impl;
|
|
|
|
template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type;
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_contains_impl<L<T...>, V>
|
|
{
|
|
struct U: mp_identity<T>... {};
|
|
using type = std::is_base_of<mp_identity<V>, U>;
|
|
};
|
|
```
|
|
Performance?
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF |||
|
|
|
|
|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF |||
|
|
|
|
|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0
|
|
|===
|
|
This implementation is a clear winner.
|
|
|
|
In fairness, we ought to note that the first four implementations of
|
|
`mp_contains` do not rely on the list elements being unique. This makes
|
|
`mp_contains` an algorithm that supports arbitrary lists, not just sets.
|
|
|
|
The `is_base_of` implementation, however, does not support lists that contain
|
|
duplicates, because it's not possible to inherit directly from the same type
|
|
twice. So it does not implement the general `mp_contains`, but something that
|
|
should probably be named `mp_set_contains`.
|
|
|
|
We can avoid the "no duplicates" requirement by modifying the implementation to
|
|
inherit from `mp_identity<T>` indirectly, via an intermediate base class:
|
|
[subs=+macros]
|
|
```
|
|
// indirect_inherit
|
|
|
|
template<std::size_t I, class T> struct inherit_second: T {};
|
|
|
|
template<class L, class S> struct indirect_inherit_impl;
|
|
|
|
template<template<class...> class L, class... T, std::size_t... J>
|
|
struct indirect_inherit_impl<L<T...>, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]<std::size_t, J...>>:
|
|
inherit_second<J, mp_identity<T>>... {};
|
|
|
|
template<class L> using indirect_inherit =
|
|
indirect_inherit_impl<L, http://en.cppreference.com/w/cpp/utility/integer_sequence[make_index_sequence]<mp_size<L>::value>>;
|
|
|
|
// mp_contains
|
|
|
|
template<class L, class V> struct mp_contains_impl
|
|
{
|
|
using U = indirect_inherit<L>;
|
|
using type = std::is_base_of<mp_identity<V>, U>;
|
|
};
|
|
|
|
template<class L, class V> using mp_contains = typename mp_contains_impl<L, V>::type;
|
|
```
|
|
This, however, pretty much nullifies the spectacular performance gains we've
|
|
observed with the original `is_base_of`-based implementation:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, recursive |2.1 |DNF ||||||
|
|
|
|
|VC$$++$$ 2013, mp_count/mp_plus |1.1 |9.8 |50.5 |DNF ||||
|
|
|
|
|VC$$++$$ 2013, is_base_of |0.3 |0.6 |1.3 |2.5 |DNF |||
|
|
|
|
|VC$$++$$ 2013, is_base_of/indirect |1.0 |9.3 |49.5 |153.8 |DNF |||
|
|
|===
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|clang$$++$$ 3.5.1, recursive |0.9 |4.5 |13.2 |30.2 |DNF |||
|
|
|
|
|clang$$++$$ 3.5.1, mp_count/mp_plus |0.5 |1.4 |3.1 |6.1 |DNF |||
|
|
|
|
|clang$$++$$ 3.5.1, mp_count/cx_plus |0.4 |1.1 |2.5 |5.0 |DNF |||
|
|
|
|
|clang$$++$$ 3.5.1, mp_count/cx_plus2 |0.4 |0.9 |1.8 |DNF ||||
|
|
|
|
|clang$$++$$ 3.5.1, is_base_of |0.3 |0.4 |0.6 |0.8 |DNF |||
|
|
|
|
|clang$$++$$ 3.5.1, is_base_of/indirect |0.4 |0.9 |1.6 |2.5 |DNF |||
|
|
|===
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|g$$++$$ 4.9.2, recursive |0.7 |3.6 |10.4 |23.2 |DNF |||
|
|
|
|
|g$$++$$ 4.9.2, mp_count/mp_plus |0.5 |1.3 |2.9 |5.8 |9.7 |15.6 |22.4 |32.3
|
|
|
|
|g$$++$$ 4.9.2, mp_count/cx_plus |0.4 |0.9 |1.7 |2.9 |4.7 |6.7 |9.2 |11.8
|
|
|
|
|g$$++$$ 4.9.2, mp_count/cx_plus2 |0.4 |0.9 |1.9 |3.4 |5.4 |7.8 |11.0 |14.7
|
|
|
|
|g$$++$$ 4.9.2, is_base_of |0.3 |0.4 |0.6 |0.9 |1.3 |1.7 |2.3 |3.0
|
|
|
|
|g$$++$$ 4.9.2, is_base_of/indirect |0.5 |1.1 |2.3 |4.0 |6.6 |9.8 |13.6 |18.2
|
|
|===
|
|
|
|
## mp_map_find
|
|
|
|
A map, in the STL sense, is a data structure that associates keys with values
|
|
and can efficiently retrieve, given a key, its associated value. For our
|
|
purposes, a map will be any list of lists for which the inner lists have at
|
|
least one element, the key; the rest of the elements we'll consider to be the
|
|
associated value. For example, the list
|
|
```
|
|
[[A, B], [C, D, E], [F], [G, H]]
|
|
```
|
|
is a map with keys `A`, `C`, `F`, and `G`, with associated values `[B]`,
|
|
`[D, E]`, `[]`, and `[H]`, respectively. We'll require unique keys, for reasons
|
|
that'll become evident later.
|
|
|
|
I'll show two other examples of maps, this time using real {cpp} code:
|
|
```
|
|
using Map = mp_list<mp_list<int, int*>, mp_list<void, void*>, mp_list<char, char*>>;
|
|
```
|
|
```
|
|
using Map2 = std::tuple<std::pair<int, int[2]>, std::pair<char, char[2]>>;
|
|
```
|
|
The Lisp name of the algorithm that performs retrieval based on key is `ASSOC`,
|
|
but I'll call it `mp_map_find`. `mp_map_find<M, K>` returns the element of `M`
|
|
whose first element is `K`. For example, `mp_map_find<Map2, int>` would return
|
|
`std::pair<int, int[2]>`. If there's no such key, it returns `void`.
|
|
|
|
There's almost no need to implement and benchmark the recursive version of
|
|
`mp_map_find` -- we can be pretty sure it will perform horribly. Still,
|
|
```
|
|
template<class M, class K> struct mp_map_find_impl;
|
|
|
|
template<class M, class K> using mp_map_find = typename mp_map_find_impl<M, K>::type;
|
|
|
|
template<template<class...> class M, class K> struct mp_map_find_impl<M<>, K>
|
|
{
|
|
using type = void;
|
|
};
|
|
|
|
template<template<class...> class M, class T1, class... T, class K>
|
|
struct mp_map_find_impl<M<T1, T...>, K>
|
|
{
|
|
using type = mp_if<std::is_same<mp_front<T1>, K>, T1, mp_map_find<M<T...>, K>>;
|
|
};
|
|
```
|
|
The compile time, in seconds, for `N` lookups into a map of size `N`, is as
|
|
follows:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, recursive |38.2 |DNF ||||||
|
|
|
|
|clang$$++$$ 3.5.1, recursive |2.5 |13.7 |DNF |||||
|
|
|
|
|g$$++$$ 4.9.2, recursive |1.9 |10.2 |28.8 |DNF ||||
|
|
|===
|
|
I told you there was no point.
|
|
|
|
But, I hear some of you say, you're evaluating the else branch even if the
|
|
condition is true, and that's horribly inefficient!
|
|
|
|
Well, this would only improve the performance by a factor of approximately two
|
|
on average, and only if the element is present, but fine, let's try it. The
|
|
element happens to be present in the benchmark, so let's see.
|
|
```
|
|
// mp_eval_if
|
|
|
|
template<bool C, class T, template<class...> class E, class... A>
|
|
struct mp_eval_if_c_impl;
|
|
|
|
template<class T, template<class...> class E, class... A>
|
|
struct mp_eval_if_c_impl<true, T, E, A...>
|
|
{
|
|
using type = T;
|
|
};
|
|
|
|
template<class T, template<class...> class E, class... A>
|
|
struct mp_eval_if_c_impl<false, T, E, A...>
|
|
{
|
|
using type = E<A...>;
|
|
};
|
|
|
|
template<bool C, class T, template<class...> class E, class... A>
|
|
using mp_eval_if_c = typename mp_eval_if_c_impl<C, T, E, A...>::type;
|
|
|
|
template<class C, class T, template<class...> class E, class... A>
|
|
using mp_eval_if = typename mp_eval_if_c_impl<C::value != 0, T, E, A...>::type;
|
|
|
|
// mp_map_find
|
|
|
|
template<class M, class K> struct mp_map_find_impl;
|
|
|
|
template<class M, class K> using mp_map_find = typename mp_map_find_impl<M, K>::type;
|
|
|
|
template<template<class...> class M, class K> struct mp_map_find_impl<M<>, K>
|
|
{
|
|
using type = void;
|
|
};
|
|
|
|
template<template<class...> class M, class T1, class... T, class K>
|
|
struct mp_map_find_impl<M<T1, T...>, K>
|
|
{
|
|
using type = mp_eval_if<std::is_same<mp_front<T1>, K>, T1, mp_map_find, M<T...>, K>;
|
|
};
|
|
```
|
|
There you go:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, recursive |15.6 |DNF ||||||
|
|
|
|
|clang$$++$$ 3.5.1, recursive |1.8 |9.5 |DNF |||||
|
|
|
|
|g$$++$$ 4.9.2, recursive |1.4 |7.0 |19.7 |DNF ||||
|
|
|===
|
|
I told you there was no point.
|
|
|
|
Point or no, to establish that the recursive implementation is inefficient is
|
|
not the same as to come up with an efficient one. There are two things that
|
|
make the `mp_contains` techniques inapplicable to our present case: first,
|
|
`mp_contains` only had to return true or false, whereas `mp_map_find` returns a
|
|
type, and second, in `mp_contains` we knew the exact type of the element for
|
|
which we were looking, whereas here, we only know its `mp_front`.
|
|
|
|
Fortunately, there does exist a language feature that can solve both: {cpp} can
|
|
deduce the template parameters of base classes when passed a derived class. In
|
|
this example,
|
|
```
|
|
struct K1 {};
|
|
struct V1 {};
|
|
|
|
struct X: std::pair<K1, V1> {};
|
|
|
|
template<class A, class B> void f(std::pair<A, B> const & p);
|
|
|
|
int main()
|
|
{
|
|
f(X());
|
|
}
|
|
```
|
|
the call to `f(X())` deduces `A` as `K1` and `B` as `V1`. If we have more than
|
|
one `std::pair` base class, we can fix `A` to be `K1`:
|
|
```
|
|
struct K1 {};
|
|
struct V1 {};
|
|
|
|
struct K2 {};
|
|
struct V2 {};
|
|
|
|
struct X: std::pair<K1, V1>, std::pair<K2, V2> {};
|
|
|
|
template<class B> void f(std::pair<K1, B> const & p);
|
|
|
|
int main()
|
|
{
|
|
f(X());
|
|
}
|
|
```
|
|
and `B` will be deduced as `V1`.
|
|
|
|
We can retrieve the results of the deduction by returning the type we want:
|
|
```
|
|
template<class B> std::pair<K1, B> f(std::pair<K1, B> const & p);
|
|
```
|
|
and then using `decltype(f(X()))` to obtain this return type.
|
|
|
|
What if `X` doesn't have a base of type `std::pair<K1, B>`? The deduction will
|
|
fail and we'll get an error that `f(X())` cannot be called. To avoid it, we can
|
|
add an overload of `f` that takes anything and returns `void`. But in this
|
|
case, what will happen if `X` has two bases of the form that match the first
|
|
`f` overload, such as for example `std::pair<K1, Y>` and `std::pair<K1, Z>`?
|
|
|
|
The deduction will fail, the second overload will again be chosen and we'll get
|
|
`void`. This is why we require maps to have unique keys.
|
|
|
|
Here's an implementation of `mp_map_find` based on this technique:
|
|
```
|
|
template<class M, class K> struct mp_map_find_impl;
|
|
|
|
template<class M, class K>
|
|
using mp_map_find = typename mp_map_find_impl<M, K>::type;
|
|
|
|
template<template<class...> class M, class... T, class K>
|
|
struct mp_map_find_impl<M<T...>, K>
|
|
{
|
|
struct U: mp_identity<T>... {};
|
|
|
|
template<template<class...> class L, class... U>
|
|
static mp_identity<L<K, U...>>
|
|
f( mp_identity<L<K, U...>>* );
|
|
|
|
static mp_identity<void> f( ... );
|
|
|
|
using V = decltype( f((U*)0) );
|
|
|
|
using type = typename V::type;
|
|
};
|
|
```
|
|
and its corresponding compile times:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, deduction |0.3 |0.7 |1.8 |3.6 |6.4 |10.4 |16.2 |DNF
|
|
|
|
|clang$$++$$ 3.5.1, deduction |0.3 |0.4 |0.6 |0.9 |1.2 |1.6 |2.2 |2.7
|
|
|
|
|g$$++$$ 4.9.2, deduction |0.3 |0.5 |0.9 |1.6 |2.3 |3.4 |4.7 |6.3
|
|
|===
|
|
This looks ready to ship.
|
|
|
|
The implementation contains one inefficiency though. If we evaluate
|
|
`mp_map_find<M, K1>`, then `mp_map_find<M, K2>`, the two nested `U` types are
|
|
the same as they only depend on `M`, but the compiler doesn't know that and
|
|
will instantiate each one separately. We should move this type outside
|
|
`mp_map_find_impl` so that it can be reused:
|
|
[subs=+quotes]
|
|
```
|
|
template<class... T> struct **mp_inherit**: T... {};
|
|
|
|
template<class M, class K> struct mp_map_find_impl;
|
|
|
|
template<class M, class K>
|
|
using mp_map_find = typename mp_map_find_impl<M, K>::type;
|
|
|
|
template<template<class...> class M, class... T, class K>
|
|
struct mp_map_find_impl<M<T...>, K>
|
|
{
|
|
**using U = mp_inherit<mp_identity<T>...>;**
|
|
|
|
template<template<class...> class L, class... U>
|
|
static mp_identity<L<K, U...>>
|
|
f( mp_identity<L<K, U...>>* );
|
|
|
|
static mp_identity<void> f( ... );
|
|
|
|
using V = decltype( f((U*)0) );
|
|
|
|
using type = typename V::type;
|
|
};
|
|
```
|
|
(This same optimization, by the way, applies to our `is_base_of` implementation
|
|
of `mp_contains`.)
|
|
|
|
The improvement in compile times on our benchmark is measurable:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, deduction+mp_inherit |0.3 |0.6 |1.4 |2.6 |4.5 |7.1 |10.7 |DNF
|
|
|
|
|clang$$++$$ 3.5.1, deduction+mp_inherit |0.3 |0.4 |0.6 |0.8 |1.0 |1.4 |1.8 |2.2
|
|
|
|
|g$$++$$ 4.9.2, deduction+mp_inherit |0.3 |0.4 |0.6 |0.9 |1.3 |1.8 |2.3 |2.9
|
|
|===
|
|
|
|
## mp_at
|
|
|
|
With sets and maps covered, it's time to tackle vectors. Vectors for us are
|
|
just lists, to which we'll need to add the ability to efficiently access an
|
|
element based on its index. The customary name for this accessor is
|
|
`mp_at<L, I>`, where `L` is a list and `I` is an `integral_constant` that
|
|
represents the index. We'll also follow the Boost.MPL convention and add
|
|
`mp_at_c<L, I>`, where `I` is the index of type `size_t`.
|
|
|
|
The recursive implementation of `mp_at` is:
|
|
```
|
|
template<class L, std::size_t I> struct mp_at_c_impl;
|
|
|
|
template<class L, std::size_t I> using mp_at_c = typename mp_at_c_impl<L, I>::type;
|
|
|
|
template<class L, class I> using mp_at = typename mp_at_c_impl<L, I::value>::type;
|
|
|
|
template<template<class...> class L, class T1, class... T>
|
|
struct mp_at_c_impl<L<T1, T...>, 0>
|
|
{
|
|
using type = T1;
|
|
};
|
|
|
|
template<template<class...> class L, class T1, class... T, std::size_t I>
|
|
struct mp_at_c_impl<L<T1, T...>, I>
|
|
{
|
|
using type = mp_at_c<L<T...>, I-1>;
|
|
};
|
|
```
|
|
and the compile times for making `N` calls to `mp_at` with a list of size `N`
|
|
as the first argument are:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, recursive |3.6 |DNF ||||||
|
|
|
|
|clang$$++$$ 3.5.1, recursive |1.0 |5.1 |15.3 |DNF ||||
|
|
|
|
|g$$++$$ 4.9.2, recursive |0.9 |4.7 |14.2 |32.4 |DNF |||
|
|
|===
|
|
To improve upon this appalling result, we'll again exploit pack expansion into a
|
|
function call, but in a novel way. Let's suppose that we need to access the
|
|
fourth element (`I = 3`). We'll generate the function signature
|
|
```
|
|
template<class W> W f( void*, void*, void*, W*, ... );
|
|
```
|
|
and then, given a list `L<T1, T2, T3, T4, T5, T6, T7>`, we'll evaluate the
|
|
expression
|
|
```
|
|
decltype( f( (T1*)0, (T2*)0, (T3*)0, (T4*)0, (T5*)0, (T6*)0, (T7*)0 ) )
|
|
```
|
|
The three `void*` parameters will eat the first three elements, `W` will be
|
|
deduced as the fourth, and the ellipsis will take care of the rest.
|
|
|
|
A working implementation based on this technique is shown below:
|
|
```
|
|
// mp_repeat_c
|
|
|
|
template<std::size_t N, class... T> struct mp_repeat_c_impl
|
|
{
|
|
using _l1 = typename mp_repeat_c_impl<N/2, T...>::type;
|
|
using _l2 = typename mp_repeat_c_impl<N%2, T...>::type;
|
|
|
|
using type = mp_append<_l1, _l1, _l2>;
|
|
};
|
|
|
|
template<class... T> struct mp_repeat_c_impl<0, T...>
|
|
{
|
|
using type = mp_list<>;
|
|
};
|
|
|
|
template<class... T> struct mp_repeat_c_impl<1, T...>
|
|
{
|
|
using type = mp_list<T...>;
|
|
};
|
|
|
|
template<std::size_t N, class... T> using mp_repeat_c =
|
|
typename mp_repeat_c_impl<N, T...>::type;
|
|
|
|
// mp_at
|
|
|
|
template<class L, class L2> struct mp_at_c_impl;
|
|
|
|
template<template<class...> class L, class... T,
|
|
template<class...> class L2, class... U>
|
|
struct mp_at_c_impl<L<T...>, L2<U...>>
|
|
{
|
|
template<class W> static W f( U*..., W*, ... );
|
|
|
|
using R = decltype( f( (mp_identity<T>*)0 ... ) );
|
|
|
|
using type = typename R::type;
|
|
};
|
|
|
|
template<class L, std::size_t I> using mp_at_c =
|
|
typename mp_at_c_impl<L, mp_repeat_c<I, void>>::type;
|
|
|
|
template<class L, class I> using mp_at = mp_at_c<L, I::value>;
|
|
```
|
|
and the compile times in the following table show it to be good enough for most
|
|
practical purposes.
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, void* |0.4 |1.1 |2.4 |4.7 |DNF |||
|
|
|
|
|clang$$++$$ 3.5.1, void* |0.4 |0.7 |1.2 |1.9 |2.7 |3.8 |5.0 |6.6
|
|
|
|
|g$$++$$ 4.9.2, void* |0.3 |0.5 |0.9 |1.3 |2.1 |3.0 |4.2 |5.5
|
|
|===
|
|
Are we done with `mp_at`, then?
|
|
|
|
Let's try something else -- transform the input list `[T1, T2, T3]` into a map
|
|
`[[0, T1], [1, T2], [2, T3]]`, then use `mp_map_find` for the lookup:
|
|
[subs=+macros]
|
|
```
|
|
// mp_map_from_list
|
|
|
|
template<class L, class S> struct mp_map_from_list_impl;
|
|
|
|
template<template<class...> class L, class... T, std::size_t... J>
|
|
struct mp_map_from_list_impl<L<T...>, http://en.cppreference.com/w/cpp/utility/integer_sequence[integer_sequence]<std::size_t, J...>>
|
|
{
|
|
using type = mp_list<mp_list<mp_size_t<J>, T>...>;
|
|
};
|
|
|
|
template<class L> using mp_map_from_list = typename mp_map_from_list_impl<L,
|
|
http://en.cppreference.com/w/cpp/utility/integer_sequence[make_index_sequence]<mp_size<L>::value>>::type;
|
|
|
|
// mp_at
|
|
|
|
template<class L, std::size_t I> struct mp_at_c_impl
|
|
{
|
|
using map = mp_map_from_list<L>;
|
|
using type = mp_second<mp_map_find<map, mp_size_t<I>>>;
|
|
};
|
|
|
|
template<class L, std::size_t I> using mp_at_c = typename mp_at_c_impl<L, I>::type;
|
|
|
|
template<class L, class I> using mp_at = typename mp_at_c_impl<L, I::value>::type;
|
|
```
|
|
At first sight, this looks ridiculous, but metaprogramming has its own rules.
|
|
Let's measure:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, map |0.3 |0.7 |1.5 |2.9 |5.0 |7.8 |11.9 |DNF
|
|
|
|
|clang$$++$$ 3.5.1, map |0.3 |0.4 |0.6 |0.8 |1.1 |1.5 |1.8 |2.3
|
|
|
|
|g$$++$$ 4.9.2, map |0.3 |0.4 |0.7 |1.0 |1.4 |1.9 |2.5 |3.2
|
|
|===
|
|
Surprise, this is the best implementation.
|
|
|
|
## mp_drop
|
|
|
|
It turned out that we didn't need the `void*` trick for `mp_at`, but I'll show
|
|
an example where we do: `mp_drop`. `mp_drop<L, N>` returns the list `L` without
|
|
its first `N` elements; or, in other words, it drops the first `N` elements
|
|
(presumably on the cutting room floor) and returns what's left.
|
|
|
|
To implement `mp_drop`, we just need to change
|
|
```
|
|
template<class W> W f( void*, void*, void*, W*, ... );
|
|
```
|
|
from above to return the rest of the elements, rather than just one:
|
|
```
|
|
template<class... W> mp_list<W> f( void*, void*, void*, W*... );
|
|
```
|
|
Adding the necessary `mp_identity` seasoning produces the following working
|
|
implementation:
|
|
```
|
|
template<class L, class L2> struct mp_drop_c_impl;
|
|
|
|
template<template<class...> class L, class... T,
|
|
template<class...> class L2, class... U>
|
|
struct mp_drop_c_impl<L<T...>, L2<U...>>
|
|
{
|
|
template<class... W> static mp_identity<L<W...>> f( U*..., mp_identity<W>*... );
|
|
|
|
using R = decltype( f( (mp_identity<T>*)0 ... ) );
|
|
|
|
using type = typename R::type;
|
|
};
|
|
|
|
template<class L, std::size_t N> using mp_drop_c =
|
|
typename mp_drop_c_impl<L, mp_repeat_c<N, void>>::type;
|
|
|
|
template<class L, class N> using mp_drop = mp_drop_c<L, N::value>;
|
|
```
|
|
I'll skip the recursive implementation and the performance comparison for this
|
|
one. We can pretty much tell who's going to win, and by how much.
|
|
|
|
## mp_find_index
|
|
|
|
The final algorithm that I'll bring to your attention is `mp_find_index`.
|
|
`mp_find_index<L, V>` returns an integral constant of type `size_t` with a
|
|
value that is the index of the first occurrence of `V` in `L`. If `V` is not in
|
|
`L`, the return value is the size of `L`.
|
|
|
|
We'll start with the recursive implementation, as usual:
|
|
```
|
|
template<class L, class V> struct mp_find_index_impl;
|
|
|
|
template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type;
|
|
|
|
template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V>
|
|
{
|
|
using type = mp_size_t<0>;
|
|
};
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_find_index_impl<L<V, T...>, V>
|
|
{
|
|
using type = mp_size_t<0>;
|
|
};
|
|
|
|
template<template<class...> class L, class T1, class... T, class V>
|
|
struct mp_find_index_impl<L<T1, T...>, V>
|
|
{
|
|
using type = mp_size_t<1 + mp_find_index<L<T...>, V>::value>;
|
|
};
|
|
```
|
|
and will continue with the compile times for `N` calls to `mp_find_index` on a
|
|
list with `N` elements, as usual:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, recursive |3.5 |DNF ||||||
|
|
|
|
|clang$$++$$ 3.5.1, recursive |1.1 |5.5 |DNF |||||
|
|
|
|
|g$$++$$ 4.9.2, recursive |0.8 |4.6 |13.6 |DNF ||||
|
|
|===
|
|
What can we do here?
|
|
|
|
Let's go back to `mp_contains` and look at the "mp_count/cx_plus2"
|
|
implementation which we rejected in favor of the inheritance-based one. It
|
|
built a `constexpr` array of booleans and summed them in a `constexpr`
|
|
function. We can do the same here, except instead of summing the array, we can
|
|
find the index of the first true value:
|
|
```
|
|
template<class L, class V> struct mp_find_index_impl;
|
|
|
|
template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type;
|
|
|
|
template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V>
|
|
{
|
|
using type = mp_size_t<0>;
|
|
};
|
|
|
|
constexpr std::size_t cx_find_index( bool const * first, bool const * last )
|
|
{
|
|
return first == last || *first? 0: 1 + cx_find_index( first + 1, last );
|
|
}
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_find_index_impl<L<T...>, V>
|
|
{
|
|
static constexpr bool _v[] = { std::is_same<T, V>::value... };
|
|
|
|
using type = mp_size_t< cx_find_index( _v, _v + sizeof...(T) ) >;
|
|
};
|
|
```
|
|
The performance of this version is:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|clang$$++$$ 3.5.1, constexpr |0.5 |1.3 |2.9 |DNF ||||
|
|
|
|
|g$$++$$ 4.9.2, constexpr |0.5 |1.4 |3.1 |5.5 |8.7 |13.0 |18.0 |DNF
|
|
|===
|
|
which, while not ideal, is significantly better than our recursive punching
|
|
bag. But if our compiler of choice is VC$$++$$ 2013, we can't use `constexpr`.
|
|
|
|
We may attempt an implementation along the same lines, but with the `constexpr`
|
|
function replaced with ordinary metaprogramming:
|
|
```
|
|
template<class L, class V> struct mp_find_index_impl;
|
|
|
|
template<class L, class V> using mp_find_index = typename mp_find_index_impl<L, V>::type;
|
|
|
|
template<template<class...> class L, class V> struct mp_find_index_impl<L<>, V>
|
|
{
|
|
using type = mp_size_t<0>;
|
|
};
|
|
|
|
template<bool...> struct find_index_impl_;
|
|
|
|
template<> struct find_index_impl_<>
|
|
{
|
|
static const std::size_t value = 0;
|
|
};
|
|
|
|
template<bool B1, bool... R> struct find_index_impl_<B1, R...>
|
|
{
|
|
static const std::size_t value = B1? 0: 1 + find_index_impl_<R...>::value;
|
|
};
|
|
|
|
template<bool B1, bool B2, bool B3, bool B4, bool B5,
|
|
bool B6, bool B7, bool B8, bool B9, bool B10, bool... R>
|
|
struct find_index_impl_<B1, B2, B3, B4, B5, B6, B7, B8, B9, B10, R...>
|
|
{
|
|
static const std::size_t value = B1? 0: B2? 1: B3? 2: B4? 3: B5? 4:
|
|
B6? 5: B7? 6: B8? 7: B9? 8: B10? 9: 10 + find_index_impl_<R...>::value;
|
|
};
|
|
|
|
template<template<class...> class L, class... T, class V>
|
|
struct mp_find_index_impl<L<T...>, V>
|
|
{
|
|
using type = mp_size_t<find_index_impl_<std::is_same<T, V>::value...>::value>;
|
|
};
|
|
```
|
|
This is still recursive, so we don't expect miracles, but it wouldn't hurt to
|
|
measure:
|
|
|===
|
|
||N=100 |N=200 |N=300 |N=400 |N=500 |N=600 |N=700 |N=800
|
|
|
|
|VC$$++$$ 2013, bool... |4.7 |94.5 |488.3 |XFA ||||
|
|
|
|
|clang$$++$$ 3.5.1, bool... |0.6 |2.2 |5.8 |12.0 |21.7 |35.2 |DNF |
|
|
|
|
|g$$++$$ 4.9.2, bool... |0.6 |2.4 |6.5 |13.2 |23.8 |39.1 |59.0 |DNF
|
|
|===
|
|
(where XFA stands for "experimenter fell asleep".)
|
|
|
|
This is an interesting tradeoff for VC$$++$$ 2013 and Clang. On the one hand,
|
|
this implementation is slower; on the other, it doesn't crash the compiler as
|
|
easily. Which to prefer is a matter of taste and of stern evaluation of one's
|
|
needs to manipulate type lists of length 300.
|
|
|
|
Note that once we have `mp_drop` and `mp_find_index`, we can derive the
|
|
`mp_find<L, V>` algorithm, which returns the suffix of `L` starting with the
|
|
first occurrence of `V`, if any, and an empty list otherwise, by using
|
|
`mp_drop<L, mp_find_index<L, V>>`.
|
|
|
|
## Conclusion
|
|
|
|
In this article, I have shown efficient algorithms that allow us to treat type
|
|
lists as sets, maps and vectors, demonstrating various {cpp}11 implementation
|
|
techniques in the process.
|