Boost.Compute

Boost.Compute is a GPU/parallel-computing library for C++ based on OpenCL.

The core library is a thin C++ wrapper over the OpenCL API and provides access to compute devices, contexts, command queues and memory buffers.
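
As a rough illustration of the core layer, the sketch below (not one of the library's official examples; the data and buffer size are arbitrary) enumerates the available OpenCL devices and moves a small buffer to and from a device using only the core wrapper classes:

#include <iostream>
#include <vector>
#include <boost/compute.hpp>

namespace compute = boost::compute;

int main()
{
    // list every OpenCL device visible to the system
    for(const compute::device &device : compute::system::devices()){
        std::cout << device.name() << std::endl;
    }

    // set up a context and command queue on the default device
    compute::device gpu = compute::system::default_device();
    compute::context ctx(gpu);
    compute::command_queue queue(ctx, gpu);

    // allocate a raw device memory buffer and copy host data into it
    std::vector<int> host_data = { 1, 2, 3, 4 };
    compute::buffer buf(ctx, host_data.size() * sizeof(int));
    queue.enqueue_write_buffer(
        buf, 0, host_data.size() * sizeof(int), host_data.data()
    );

    // read the buffer back into host memory
    queue.enqueue_read_buffer(
        buf, 0, host_data.size() * sizeof(int), host_data.data()
    );

    return 0;
}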

On top of the core library is a generic, STL-like interface providing common algorithms (e.g. transform(), accumulate(), sort()) along with common containers (e.g. vector<T>, flat_set<T>). It also features a number of extensions including parallel-computing algorithms (e.g. exclusive_scan(), scatter(), reduce()) and a number of fancy iterators (e.g. transform_iterator<>, permutation_iterator<>, zip_iterator<>).
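
The sketch below shows how a few of these pieces compose (it is illustrative rather than canonical; the vector size and the lambda expression are arbitrary): iota() fills a device vector, a transform_iterator<> built from a Boost.Compute lambda expression feeds reduce() to compute a sum of squares, and exclusive_scan() produces a prefix sum on the device:

#include <iostream>
#include <boost/compute.hpp>

namespace compute = boost::compute;

int main()
{
    compute::device gpu = compute::system::default_device();
    compute::context ctx(gpu);
    compute::command_queue queue(ctx, gpu);

    // fill a device vector with 0, 1, 2, ..., 1023
    compute::vector<int> values(1024, ctx);
    compute::iota(values.begin(), values.end(), 0, queue);

    // sum of squares: square each element on the fly with a
    // transform_iterator and reduce the results into a host variable
    using compute::lambda::_1;
    int sum_of_squares = 0;
    compute::reduce(
        compute::make_transform_iterator(values.begin(), _1 * _1),
        compute::make_transform_iterator(values.end(), _1 * _1),
        &sum_of_squares,
        queue
    );

    // exclusive prefix sum of the original values, kept on the device
    compute::vector<int> prefix_sums(values.size(), ctx);
    compute::exclusive_scan(
        values.begin(), values.end(), prefix_sums.begin(), queue
    );

    std::cout << "sum of squares: " << sum_of_squares << std::endl;
    return 0;
}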

The full documentation is available at http://boostorg.github.io/compute/.

Example

The following example shows how to sort a vector of floats on the GPU:

#include <vector>
#include <algorithm>
#include <cstdlib> // for rand()

#include <boost/compute.hpp>

namespace compute = boost::compute;

int main()
{
    // get the default compute device
    compute::device gpu = compute::system::default_device();

    // create a compute context and command queue
    compute::context ctx(gpu);
    compute::command_queue queue(ctx, gpu);

    // generate random numbers on the host
    std::vector<float> host_vector(1000000);
    std::generate(host_vector.begin(), host_vector.end(), rand);

    // create vector on the device
    compute::vector<float> device_vector(1000000, ctx);

    // copy data to the device
    compute::copy(
        host_vector.begin(), host_vector.end(), device_vector.begin(), queue
    );

    // sort data on the device
    compute::sort(
        device_vector.begin(), device_vector.end(), queue
    );

    // copy data back to the host
    compute::copy(
        device_vector.begin(), device_vector.end(), host_vector.begin(), queue
    );

    return 0;
}

Boost.Compute is a header-only library, so no linking is required. The example above can be compiled with:

g++ -I/path/to/compute/include sort.cpp -lOpenCL

More examples can be found in the tutorial and under the examples directory.

Support

Questions about the library (both usage and development) can be posted to the mailing list.

Bugs and feature requests can be reported through the issue tracker.

Also feel free to send me an email with any problems, questions, or feedback.

Help Wanted

The Boost.Compute project is currently looking for additional developers with interest in parallel computing.

Please send an email to Kyle Lutz (kyle.r.lutz@gmail.com) for more information.