This changes the BOOST_COMPUTE_FUNCTION() macro (and the related
BOOST_COMPUTE_CLOSURE() macro) to use custom, user-provided argument
names instead of auto-generating them based on their index.
This is an API-breaking change. Users must now provide argument
names when using the BOOST_COMPUTE_FUNCTION() macro. The examples
and documentation have been updated to reflect the new API.
This changes the vector<T> constructors which copy or initialize
data to take a queue argument used for performing the operations.
Previously they just took a context argument used to initialize the
buffer and then created a new command queue to use. This improves
performance by not requiring a new command queue and also fixes issues
when performing operations on a different command queue while the
vector was still being initialized.
This adds interoperability support between Boost.Compute and various
other C/C++ libraries (Eigen, OpenCV, OpenGL, Qt and VTK). This eases
development for users who combine these libraries with Boost.Compute.
See kylelutz/compute#21
This adds a program::build_with_source() function that both creates and
builds the program for the given context with the supplied source and
compile options. If the BOOST_COMPUTE_USE_OFFLINE_CACHE macro is
defined, it also saves the compiled program binary for reuse in the
offline cache, located in the $HOME/.boost_compute folder on UNIX-like
systems and in the %APPDATA%/boost_compute folder on Windows.
All internal uses of program::create_with_source() followed by
program::build() are replaced with program::build_with_source().
This improves the Monte Carlo example by using the count_if()
algorithm instead of a custom kernel with atomics. It also includes
only the required headers instead of all the Boost.Compute headers.
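
The idea behind the count_if() change can be sketched on the host with
the standard library. This is only an analogue, not the example's code:
std::count_if stands in for the Boost.Compute algorithm, and the
estimate_pi() name, the seed parameter and the point layout are
assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Estimate pi from n random points in the unit square by counting the
// points that fall inside the quarter circle of radius 1 (n must be > 0).
double estimate_pi(std::size_t n, unsigned seed)
{
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> dist(0.0, 1.0);

    std::vector<std::pair<double, double>> points(n);
    for (auto& p : points) {
        p.first = dist(gen);
        p.second = dist(gen);
    }

    // The predicate replaces the per-point atomic increment that a
    // hand-written counting kernel would need.
    const auto inside = std::count_if(points.begin(), points.end(),
        [](const std::pair<double, double>& p) {
            return p.first * p.first + p.second * p.second <= 1.0;
        });

    return 4.0 * static_cast<double>(inside) / static_cast<double>(n);
}
```

Expressing the count as a predicate leaves the reduction strategy to the
algorithm instead of hard-coding it in the kernel.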
This changes the checks for the device type to use the bitwise-and
operator instead of the equality operator. The returned type is a
bitfield, so an equality comparison would fail when multiple bits
were set.
This fixes a bug on POCL which returns the device type as a
combination of CL_DEVICE_TYPE_DEFAULT and CL_DEVICE_TYPE_CPU. Now
the correct device type (device::cpu) is detected for POCL.
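
A minimal sketch of the difference between the two checks. The constants
below are local stand-ins that mirror the OpenCL cl_device_type bit
values, and the function names are hypothetical; this is not the
library's code.

```cpp
#include <cstdint>

// Local stand-ins for the OpenCL device type bits (cl_device_type is a
// 64-bit bitfield in the OpenCL headers).
using device_type_bits = std::uint64_t;
constexpr device_type_bits type_default = 1u << 0; // CL_DEVICE_TYPE_DEFAULT
constexpr device_type_bits type_cpu     = 1u << 1; // CL_DEVICE_TYPE_CPU
constexpr device_type_bits type_gpu     = 1u << 2; // CL_DEVICE_TYPE_GPU

// Old style: only true when CPU is the single bit set.
constexpr bool is_cpu_by_equality(device_type_bits t)
{
    return t == type_cpu;
}

// New style: true whenever the CPU bit is set, whatever else is set.
constexpr bool is_cpu_by_mask(device_type_bits t)
{
    return (t & type_cpu) != 0;
}
```

For a device reporting type_default | type_cpu, as described for POCL,
the equality check is false while the mask check is true.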
This removes the timer class. The technique of measuring the time
difference between two different OpenCL markers on a command queue
is not portable to all OpenCL implementations (only works on NVIDIA).
A new internal timer class has been added which uses boost::chrono
(or std::chrono if BOOST_COMPUTE_TIMER_USE_STD_CHRONO is defined).
This new timer is used by the benchmarks to measure time elapsed
on the host.
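
A host-side timer of this kind might look roughly like the sketch below.
It uses std::chrono only; the host_timer name and its members are
assumptions for illustration and do not describe the internal class.

```cpp
#include <chrono>

// Measures wall-clock time elapsed on the host since construction or
// since the last reset().
class host_timer
{
    using clock_type = std::chrono::steady_clock;

public:
    host_timer() : start_(clock_type::now()) {}

    // Restart the measurement from now.
    void reset() { start_ = clock_type::now(); }

    // Time elapsed since the start point, in nanoseconds.
    std::chrono::nanoseconds elapsed() const
    {
        return std::chrono::duration_cast<std::chrono::nanoseconds>(
            clock_type::now() - start_);
    }

private:
    clock_type::time_point start_;
};
```

A steady clock is used here because it never goes backwards, which keeps
successive readings ordered. A benchmark would also need to wait for the
queue to finish before reading the timer, since device work may run
asynchronously.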
This cleans up the example code. Now all of the examples use
the "namespace compute = boost::compute" alias. This shortens
the example code and makes it clearer. It also cleans up a few
style issues.
This adds a system-wide default command queue. This queue is
accessible via the new static system::default_queue() method.
The default command queue is created for the default compute
device in the default context and is analogous to the default
stream in CUDA.
This changes how algorithms operate when invoked without an
explicit command queue. Previously, each algorithm had two
overloads, the first expected a command queue to be explicitly
passed and the second would create and use a temporary command
queue. Now, all algorithms take a command queue argument which
has a default value equal to system::default_queue().
This fixes a number of race conditions and performance issues
throughout the library associated with creating, using, and
destroying many separate command queues.
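
The pattern of replacing two overloads with one defaulted queue argument
can be sketched in plain C++. fake_queue, default_queue() and sum() are
hypothetical stand-ins for the command queue, system::default_queue()
and an algorithm; no OpenCL is involved.

```cpp
#include <numeric>
#include <vector>

// Hypothetical stand-in for a command queue. It counts how many queues
// were created and how many operations were submitted to each one.
struct fake_queue
{
    static int& created() { static int n = 0; return n; }
    fake_queue() { ++created(); }
    int submitted = 0;
};

// One shared queue, created on first use.
fake_queue& default_queue()
{
    static fake_queue q;
    return q;
}

// A single signature replaces the two overloads: callers may pass a
// queue explicitly, otherwise the shared default queue is used.
int sum(const std::vector<int>& v, fake_queue& queue = default_queue())
{
    ++queue.submitted;
    return std::accumulate(v.begin(), v.end(), 0);
}
```

Repeated calls without an explicit queue reuse the same queue object, so
no temporary queue is created and destroyed per call.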
This refactors the system::default_device() method. Now, the
default compute device for the system is only found once and
stored in a static variable. This eliminates many redundant
calls to clGetPlatformIDs() and clGetDeviceIDs().
Also, the default_cpu_device() and default_gpu_device() methods
have been removed and their usages replaced with default_device().
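
Finding the device once and storing it in a static variable could be
sketched as follows. find_default_device() and enumeration_count() are
hypothetical stand-ins for the platform and device queries, and the
string return type is a placeholder for a device object.

```cpp
#include <string>

// Counts how often the (stand-in) enumeration runs.
int& enumeration_count()
{
    static int n = 0;
    return n;
}

// Stand-in for the expensive platform/device enumeration.
std::string find_default_device()
{
    ++enumeration_count();
    return "device-0"; // placeholder result
}

// The lookup runs on the first call only; later calls return the
// cached value. In C++11 and later this initialization is thread-safe.
const std::string& default_device()
{
    static const std::string device = find_default_device();
    return device;
}
```

Every call after the first returns a reference to the same cached
object without touching the enumeration again.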
boost::compute::system::default_device() supports the following
environment variables:
  BOOST_COMPUTE_DEFAULT_DEVICE for the device name
  BOOST_COMPUTE_DEFAULT_PLATFORM for the OpenCL platform name
  BOOST_COMPUTE_DEFAULT_VENDOR for the device vendor name
If one or more of these variables is set, then the device that satisfies
all of the conditions is selected. If no such device is available, then
the first available GPU is selected. If there are no GPUs in the
system, then the first available CPU is selected. Otherwise,
default_device() returns a null device.
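
The selection order described above can be sketched as a small
standalone function. The device_info struct, the select_default() name
and the -1 "null device" result are hypothetical, and substring matching
on the filter values is an assumption; the filters would come from the
three environment variables (for example via std::getenv), with an empty
string meaning "not set".

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class device_kind { cpu, gpu, other };

// Hypothetical description of one enumerated device.
struct device_info
{
    std::string name;
    std::string platform;
    std::string vendor;
    device_kind kind;
};

// An empty filter matches everything (assumed substring match otherwise).
bool matches(const std::string& value, const std::string& filter)
{
    return filter.empty() || value.find(filter) != std::string::npos;
}

// Index of the first device of the given kind, or -1 if there is none.
int first_of_kind(const std::vector<device_info>& devices, device_kind kind)
{
    for (std::size_t i = 0; i < devices.size(); ++i) {
        if (devices[i].kind == kind) return static_cast<int>(i);
    }
    return -1;
}

// Returns the index of the selected device, or -1 for "null device".
int select_default(const std::vector<device_info>& devices,
                   const std::string& name,
                   const std::string& platform,
                   const std::string& vendor)
{
    // 1. If any filter is set, prefer a device satisfying all of them.
    if (!name.empty() || !platform.empty() || !vendor.empty()) {
        for (std::size_t i = 0; i < devices.size(); ++i) {
            const device_info& d = devices[i];
            if (matches(d.name, name) && matches(d.platform, platform) &&
                matches(d.vendor, vendor)) {
                return static_cast<int>(i);
            }
        }
    }
    // 2. Otherwise the first GPU, then 3. the first CPU, else -1.
    const int gpu = first_of_kind(devices, device_kind::gpu);
    return gpu >= 0 ? gpu : first_of_kind(devices, device_kind::cpu);
}
```

Passing the filter values as arguments instead of reading the
environment inside the function keeps the fallback order easy to test.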
The hello_world example is modified to use default_device() instead
of default_gpu_device().