Commit Graph

1404 Commits

Author SHA1 Message Date
Kyle Lutz
e46828a9d6 Fix issues involving iterators with void value_type
This fixes a few issues encountered when using iterators with a
void value_type (e.g. std::insert_iterator<>).

The is_contiguous_iterator meta-function was refactored to always
return false for iterators with a void value_type and avoid
instantiating types for containers with a void value_type
(e.g. std::vector<void>::iterator) which previously resulted
in compilation errors.
2013-05-20 19:57:13 -04:00
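The refactoring described in e46828a9d6 can be illustrated with standard C++ alone. This is a minimal sketch, not Boost.Compute's actual implementation: the names `is_vector_iterator_sketch` and `is_vector_iterator_impl` are hypothetical, and the trait only recognizes `std::vector<T>::iterator`. It shows how dispatching on the value type first avoids ever naming `std::vector<void>`.

```cpp
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper. Primary template: chosen when Value is not void,
// so naming std::vector<Value>::iterator is safe.
template<class Iterator, class Value>
struct is_vector_iterator_impl
    : std::is_same<Iterator, typename std::vector<Value>::iterator> {};

// Specialization for a void value_type: answers false without ever
// mentioning std::vector<void>, which would fail to compile.
template<class Iterator>
struct is_vector_iterator_impl<Iterator, void> : std::false_type {};

// Hypothetical trait: looks up the iterator's value_type and dispatches.
template<class Iterator>
struct is_vector_iterator_sketch
    : is_vector_iterator_impl<
          Iterator,
          typename std::iterator_traits<Iterator>::value_type> {};
```

With this sketch, `is_vector_iterator_sketch<std::insert_iterator<std::vector<int>>>::value` is `false` and compiles cleanly, because the output iterator's `value_type` is `void` and the specialization is selected.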
Kyle Lutz
4ab37ada07 Add system-wide default command queue
This adds a system-wide default command queue. This queue is
accessible via the new static system::default_queue() method.
The default command queue is created for the default compute
device in the default context and is analogous to the default
stream in CUDA.

This changes how algorithms operate when invoked without an
explicit command queue. Previously, each algorithm had two
overloads: the first expected a command queue to be explicitly
passed and the second would create and use a temporary command
queue. Now, all algorithms take a command queue argument which
has a default value equal to system::default_queue().

This fixes a number of race conditions and performance issues
throughout the library associated with creating, using, and
destroying many separate command queues.
2013-05-15 20:59:56 -04:00
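The single-overload pattern from 4ab37ada07 can be sketched without OpenCL. Everything here is a hypothetical stand-in (`toy_queue`, `default_queue()`, `run_algorithm()`), not the library's code. The sketch assumes the shared queue is a function-local static created on first use.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a command queue. It only counts how many
// instances have been constructed, so the effect of sharing is visible.
class toy_queue {
public:
    toy_queue() { ++created(); }
    toy_queue(const toy_queue&) = delete;
    toy_queue& operator=(const toy_queue&) = delete;

    static std::size_t& created() {
        static std::size_t count = 0;
        return count;
    }
};

// Analogue of system::default_queue(): one shared queue, built on first use.
inline toy_queue& default_queue() {
    static toy_queue queue;
    return queue;
}

// One overload instead of two: callers may pass a queue explicitly or
// rely on the default argument. Returns the queue that was used.
inline toy_queue& run_algorithm(toy_queue& queue = default_queue()) {
    return queue;
}
```

Calling `run_algorithm()` repeatedly reuses the same queue object, whereas the earlier design described in the commit would have constructed a temporary queue on every call.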
Kyle Lutz
1b9e904cc7 Add CHECK_RANGE_EQUAL() test macro
This adds a new macro for the unit-tests which checks a range of
values on the device against an array of values on the host. This
simplifies writing tests and removes the need to explicitly copy
values back to the host for verification.
2013-05-13 23:06:40 -04:00
Kyle Lutz
a2bda0610d Fix memory issues with device_ptr and allocator
This fixes a few memory handling issues between device_ptr,
buffer_iterator, buffer_value, allocator, and malloc/free.

Previously, memory buffers that were allocated by allocator and
malloc were being retained (via clRetainMemObject() in buffer's
constructor) by device_ptr, buffer_iterator and buffer_value.

Now, false is passed for the retain parameter to buffer's
constructor so that the buffer's reference count is not
incremented. Furthermore, the classes now set the buffer to
null before being destructed so that they will not decrement its
reference count (which normally occurs in buffer's destructor).

The main effect of this change is that objects which refer to a
memory buffer but do not own it (e.g. device_ptr, buffer_iterator)
will not modify the reference count for the buffer. This fixes a
number of memory leaks which occurred in longer-running programs.
2013-05-13 22:27:02 -04:00
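The ownership rule from a2bda0610d can be modelled with a plain counter in place of an OpenCL memory object. The types below (`toy_mem`, `toy_buffer`, `toy_device_ptr`) are hypothetical and only mimic the retain/release behaviour the commit describes.

```cpp
#include <cassert>

// Hypothetical stand-in for an OpenCL memory object and its reference count.
struct toy_mem {
    int ref_count = 1;  // the allocator holds the initial reference
};

// Toy analogue of a buffer wrapper: optionally retains on construction and
// releases on destruction unless it has been reset to null first.
class toy_buffer {
public:
    toy_buffer(toy_mem* mem, bool retain) : mem_(mem) {
        if (mem_ && retain) ++mem_->ref_count;  // like clRetainMemObject()
    }
    toy_buffer(const toy_buffer&) = delete;
    toy_buffer& operator=(const toy_buffer&) = delete;
    ~toy_buffer() {
        if (mem_) {
            assert(mem_->ref_count > 0);
            --mem_->ref_count;                  // like clReleaseMemObject()
        }
    }
    void reset() { mem_ = nullptr; }

private:
    toy_mem* mem_;
};

// Non-owning view: passes retain = false and nulls the buffer in its own
// destructor body, which runs before the member's destructor.
class toy_device_ptr {
public:
    explicit toy_device_ptr(toy_mem* mem) : buffer_(mem, false) {}
    ~toy_device_ptr() { buffer_.reset(); }

private:
    toy_buffer buffer_;
};
```

In this model, creating and destroying a `toy_device_ptr` leaves `ref_count` untouched, while a `toy_buffer` constructed with `retain = true` increments it and restores it on destruction.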
Kyle Lutz
a5ddeae614 Add scalar<T> container
This adds a new scalar<T> "container" which stores a single
value in a memory buffer. This simplifies memory handling in
algorithms which read and write a single value.
2013-05-11 20:20:27 -04:00
Kyle Lutz
eea16e61c7 Remove simple_histogram from flat_map test
This removes the inefficient simple_histogram test-case from
the flat_map test-suite.
2013-05-11 20:17:24 -04:00
Kyle Lutz
130f8c30f1 Rename kernel::num_args() method to arity()
This renames the kernel::num_args() method to arity().
2013-05-11 20:15:00 -04:00
Kyle Lutz
0b466fec72 Fix indexing bug in list_devices example
This fixes an indexing bug in the list_devices example in
which the information would only be printed for the first
device on a platform.
2013-05-11 20:11:58 -04:00
Kyle Lutz
ffec5fd34a Remove unnecessary includes from transform_reduce
This removes a couple of unnecessary includes from the
transform_reduce.hpp header file.
2013-05-11 20:10:28 -04:00
Kyle Lutz
178676df4f Refactor the system::default_device() method
This refactors the system::default_device() method. Now, the
default compute device for the system is only found once and
stored in a static variable. This eliminates many redundant
calls to clGetPlatformIDs() and clGetDeviceIDs().

Also, the default_cpu_device() and default_gpu_device() methods
have been removed and their usages replaced with default_device().
2013-05-10 22:49:05 -04:00
Kyle Lutz
d40eddc56b Fix compilation error with get<N>() and tuple
This fixes a compilation error which occurred when using
the get<N>() function with tuple types.
2013-05-10 21:51:28 -04:00
Kyle Lutz
fecbe63043 Check for partitioning support in device test
This adds checks to the device test-suite to ensure that the
current device supports the partitioning types before attempting
to use the corresponding device::partition_*() methods.
2013-05-09 23:05:11 -04:00
Kyle Lutz
8934bea627 Skip enqueue_write_buffer_rect() test on AMD GPUs
This skips the enqueue_write_buffer_rect() test on AMD GPUs which
don't correctly implement the clEnqueueWriteBufferRect() function.
2013-05-09 22:13:28 -04:00
Kyle Lutz
705b3f35a3 Fix narrowing conversion warnings in device
This fixes a couple of narrowing conversion warnings in the
device partitioning methods which were seen when compiling
VexCL with Boost.Compute in C++11 mode.
2013-05-09 22:04:00 -04:00
Kyle Lutz
9a64f6b39a Add get<N>() function
This adds a get<N>() function which returns the n'th element
of an aggregate type (e.g. vector type, pair, tuple).

This unifies the functionality of, and replaces, the get_pair()
and vector_component() functions.
2013-05-05 12:46:05 -04:00
Kyle Lutz
3e840fa306 Add transform_if() algorithm
This adds a new algorithm named transform_if() which applies
a given unary function to an input value only if it passes a
separate predicate function.
2013-05-05 11:51:21 -04:00
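A host-side sketch of the semantics described in 3e840fa306, written against the standard library rather than OpenCL. The name `transform_if_sketch` and the helper `tens_of_evens` are hypothetical. The sketch assumes that elements failing the predicate are skipped, not copied through unchanged, which the commit message does not spell out.

```cpp
#include <cassert>
#include <iterator>
#include <vector>

// Applies f to each element that satisfies pred and writes the results
// contiguously to result. Elements failing pred produce no output.
template<class InputIt, class OutputIt, class UnaryFunction, class Predicate>
OutputIt transform_if_sketch(InputIt first, InputIt last, OutputIt result,
                             UnaryFunction f, Predicate pred)
{
    for (; first != last; ++first) {
        if (pred(*first)) {
            *result = f(*first);
            ++result;
        }
    }
    return result;
}

// Usage example: multiply the even values by ten and drop the odd ones.
inline std::vector<int> tens_of_evens(const std::vector<int>& input)
{
    std::vector<int> output;
    transform_if_sketch(input.begin(), input.end(),
                        std::back_inserter(output),
                        [](int x) { return x * 10; },
                        [](int x) { return x % 2 == 0; });
    return output;
}
```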
Kyle Lutz
49a34442e5 Remove unused histogram() algorithm
This removes the unused histogram() algorithm.
2013-05-05 10:56:14 -04:00
Dominic Meiser
7c5e321c2a Fixing build issues under Windows 2013-05-03 18:37:09 -04:00
Kyle Lutz
3e93d01475 Add default constructors to image2d and image3d
This adds default constructors to the image2d and image3d
classes which initialize them with null memory objects.
2013-05-02 21:01:30 -04:00
Kyle Lutz
5d28d3887e Make pick_copy_work_group_size() inline
This makes the pick_copy_work_group_size() function inline.
2013-05-02 20:55:22 -04:00
Kyle Lutz
0ab2fe85eb Don't auto-initialize values in vector
This changes the vector class to not auto-initialize values
when it is created or resized. This improves performance by
eliminating a call to fill(). If needed, user code can call
fill() explicitly on the newly allocated values.
2013-04-27 10:30:26 -04:00
Kyle Lutz
03195275b3 Increase work-group size for copy() kernel
This increases the work-group size for the copy() kernel to be
up to 32 items based on the size of the input. This increases the
performance of copy() and related algorithms (e.g. transform()).
2013-04-27 10:21:47 -04:00
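The commit message for 03195275b3 does not state how the work-group size is derived from the input size. One plausible rule, sketched here as an assumption, is to pick the largest power of two up to 32 that evenly divides the element count, since OpenCL 1.x requires the global work size to be evenly divisible by the work-group size. The function name `pick_work_group_size_sketch` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Returns the largest power of two, capped at 32, that evenly divides n.
// Falls back to 1 when n is odd.
inline std::size_t pick_work_group_size_sketch(std::size_t n)
{
    for (std::size_t size = 32; size > 1; size /= 2) {
        if (n % size == 0) {
            return size;
        }
    }
    return 1;
}
```

Under this rule an input of 48 elements would use work-groups of 16, while an input of 7 elements would fall back to a work-group size of 1.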
Kyle Lutz
9a350f65cf Change clamp_range() test to use float
This changes the clamp_range() test to use float values instead
of int values. The OpenCL clamp() function is only defined for
float values and this test caused kernel compilation errors on
certain platforms.

Also updates the test to use the new global context.
2013-04-27 10:20:47 -04:00
Kyle Lutz
dde1747b32 Merge remote-tracking branch 'ddemidov/master' 2013-04-27 09:45:32 -04:00
Dominic Meiser
2a93124ef5 Using FindOpenCL module from VexCL
Changed CMakeLists.txt files in Boost.Compute to use the variables
defined by FindOpenCL.
2013-04-23 20:03:38 -04:00
Kyle Lutz
ea107ae5d6 Add clamp_range() algorithm
This adds a clamp_range() algorithm which clamps a range
of values between a low and high value. This is based on
the algorithm of the same name in Boost.Algorithm.
2013-04-22 22:06:04 -04:00
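A host-side sketch of the behaviour added in ea107ae5d6, using only the standard library. The names `clamp_range_sketch` and `clamp_to_unit_interval` are hypothetical, and the argument order (input range, output, low, high) is assumed to follow the Boost.Algorithm function the commit refers to.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Writes each input value, limited to the closed interval [lo, hi],
// to the output range. Requires lo <= hi.
template<class InputIt, class OutputIt, class T>
OutputIt clamp_range_sketch(InputIt first, InputIt last, OutputIt out,
                            const T& lo, const T& hi)
{
    assert(!(hi < lo));
    return std::transform(first, last, out, [&](const T& value) {
        return value < lo ? lo : (hi < value ? hi : value);
    });
}

// Usage example: clamp a vector of floats to [0, 1].
inline std::vector<float> clamp_to_unit_interval(const std::vector<float>& input)
{
    std::vector<float> output(input.size());
    clamp_range_sketch(input.begin(), input.end(), output.begin(), 0.0f, 1.0f);
    return output;
}
```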
Kyle Lutz
3b24d0d15e Add test for SAXPY
This adds a test for the SAXPY function.
2013-04-22 20:40:50 -04:00
Kyle Lutz
425ada2d03 Add documentation for platform::unload_compiler()
This adds documentation for the unload_compiler() method in
the platform class.
2013-04-22 20:33:35 -04:00
Kyle Lutz
1fbb7b1b9a Add documentation for platform::get_extension_function_address()
This adds documentation for the get_extension_function_address()
method in the platform class.
2013-04-22 20:30:09 -04:00
Kyle Lutz
2f7ae1bc9c Remove documentation for non-existent platform methods
This removes the documentation for the non-existent platforms()
and platform_count() methods in the platform class. These methods
have been moved to the system class and are documented there.
2013-04-22 20:24:51 -04:00
Kyle Lutz
00cdca5b55 Add documentation for type-traits
This adds documentation for the type-traits.
2013-04-22 20:20:17 -04:00
Kyle Lutz
e0f78ff81e Update README file
This replaces the README file with a new, improved README.md
file written in Markdown which contains more information and a
simple example.
2013-04-20 22:29:47 -04:00
Denis Demidov
13887e8ed5 Rely on system::default_context() to hold static context
refs kylelutz/compute#9
2013-04-19 15:16:46 +04:00
Denis Demidov
5d77bbebee Global setup for OpenCL context in tests
refs kylelutz/compute#9

device, context, and queue are initialized statically in `context_setup.hpp`.
With this change all tests are able to complete when an NVIDIA GPU is in
exclusive compute mode.

Side effect of the change:
Time for all tests to complete reduced from 15.71 to 13.03 sec on a Tesla C2075.
2013-04-19 14:53:59 +04:00
Kyle Lutz
8142e5d5f9 Add move-constructors to wrapper classes
This adds move-constructors and move-assignment operators
to the OpenCL wrapper classes.
2013-04-17 20:45:04 -04:00
Kyle Lutz
5cb51569eb Add test for command_queue::enqueue_write_buffer_rect()
This adds a test for the enqueue_write_buffer_rect() method
in the command_queue class. This method copies a rectangular
region of memory from the host to a device buffer.
2013-04-13 20:52:42 -04:00
Kyle Lutz
4bdec761cd Add memory_object::reference_count() method
This adds a reference_count() method to the memory_object
class which returns its current reference count.
2013-04-13 11:07:04 -04:00
Kyle Lutz
d58b7c0902 Return event from command_queue::enqueue_task()
This changes the command_queue::enqueue_task() method to return
an event object.
2013-04-13 10:23:29 -04:00
Kyle Lutz
da4cb81679 Return event from command_queue::enqueue_nd_range_kernel()
This changes the enqueue_nd_range_kernel() method to return an
event object. This allows clients to monitor the progress of a
kernel executing on a device.
2013-04-13 10:23:01 -04:00
Kyle Lutz
001b3ff7fe Add get() methods to wrapper classes
This adds a get() method to each wrapper class which returns
a reference to the underlying OpenCL object.
2013-04-13 09:44:51 -04:00
Denis Demidov
8b78d4187d Adds support for selecting devices with environment variables
boost::compute::system::default_device() supports the following
environment variables:

BOOST_COMPUTE_DEFAULT_DEVICE   for device name
BOOST_COMPUTE_DEFAULT_PLATFORM for OpenCL platform name
BOOST_COMPUTE_DEFAULT_VENDOR   for device vendor name

If one or more of these variables is set, then the device that
satisfies all conditions gets selected. If such a device is
unavailable, then the first available GPU is selected. If there are
no GPUs in the system, then the first available CPU is selected.
Otherwise, default_device() returns a null device.

The hello_world example is modified to use default_device() instead
of default_gpu_device().
2013-04-12 17:22:25 -04:00
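The selection order described in 8b78d4187d can be sketched over a plain list of device descriptions. The types and functions below (`toy_device`, `select_device`, `select_default_device`) are hypothetical. Matching by substring is an assumption, since the commit message only says a device must satisfy all conditions.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

enum class device_kind { cpu, gpu };

// Hypothetical description of a compute device.
struct toy_device {
    std::string name;
    std::string platform;
    std::string vendor;
    device_kind kind;
};

// Reads an environment variable, returning an empty string when it is unset.
inline std::string env_or_empty(const char* variable)
{
    const char* value = std::getenv(variable);
    return value ? std::string(value) : std::string();
}

// An empty filter matches anything; otherwise match by substring (assumed).
inline bool matches(const std::string& field, const std::string& filter)
{
    return filter.empty() || field.find(filter) != std::string::npos;
}

// Filtered match first, then first GPU, then first CPU, else null.
inline const toy_device* select_device(const std::vector<toy_device>& devices,
                                       const std::string& name,
                                       const std::string& platform,
                                       const std::string& vendor)
{
    if (!name.empty() || !platform.empty() || !vendor.empty()) {
        for (const toy_device& d : devices) {
            if (matches(d.name, name) && matches(d.platform, platform) &&
                matches(d.vendor, vendor)) {
                return &d;
            }
        }
    }
    for (const toy_device& d : devices) {
        if (d.kind == device_kind::gpu) return &d;
    }
    for (const toy_device& d : devices) {
        if (d.kind == device_kind::cpu) return &d;
    }
    return nullptr;
}

// Wires the three environment variables named in the commit to the filters.
inline const toy_device* select_default_device(const std::vector<toy_device>& devices)
{
    return select_device(devices,
                         env_or_empty("BOOST_COMPUTE_DEFAULT_DEVICE"),
                         env_or_empty("BOOST_COMPUTE_DEFAULT_PLATFORM"),
                         env_or_empty("BOOST_COMPUTE_DEFAULT_VENDOR"));
}
```

Keeping the filter strings as parameters of `select_device` separates the policy from the environment lookup, which makes the fallback order easy to exercise in isolation.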
Kyle Lutz
1be19a6305 Add multiplies<T> specialization for std::complex<T>
This adds a specialization of multiplies<T> for std::complex<T>
which implements complex number multiplication.

Also adds a simple test using transform() to verify the complex
multiplication works correctly.
2013-04-10 22:04:04 -04:00
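For reference, the arithmetic that a complex specialization such as the one in 1be19a6305 has to perform is (a + bi)(c + di) = (ac - bd) + (ad + bc)i, which differs from multiplying the real and imaginary components independently. The host-side function below is a hypothetical sketch of that formula, not the OpenCL code the library generates.

```cpp
#include <cassert>
#include <complex>

// Complex product written out component by component:
// real part = ac - bd, imaginary part = ad + bc.
template<class T>
std::complex<T> multiply_complex_sketch(const std::complex<T>& x,
                                        const std::complex<T>& y)
{
    return std::complex<T>(x.real() * y.real() - x.imag() * y.imag(),
                           x.real() * y.imag() + x.imag() * y.real());
}
```

For example, multiplying 1 + 2i by 3 + 4i gives -5 + 10i, whereas a component-wise product would give 3 + 8i.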
Kyle Lutz
8d13920dc4 Move swizzle_iterator to detail namespace
This moves the swizzle_iterator class to the detail
namespace.
2013-04-10 21:51:24 -04:00
Kyle Lutz
bcc3aed40f Move pixel_input_iterator to detail namespace
This moves the pixel_input_iterator class to the detail
namespace.
2013-04-10 21:38:05 -04:00
Kyle Lutz
5cce555d8c Move binary_transform_iterator to detail namespace
This moves the binary_transform_iterator class to the
detail namespace.
2013-04-10 21:33:29 -04:00
Kyle Lutz
e30ec9f26c Move adjacent_transform_iterator to detail namespace
This moves the adjacent_transform_iterator class to the
detail namespace.
2013-04-10 21:24:15 -04:00
Kyle Lutz
6dd6e11c7d Fix unused variable warning in get_base_iterator_buffer()
This fixes an unused variable warning which occurs in the
get_base_iterator_buffer() function when the base iterator
is not a buffer iterator and thus the iter argument is not
used.
2013-04-10 21:09:17 -04:00
Kyle Lutz
6fdffd8a2b Replace usages of result_of() with tr1_result_of()
This fixes a bug in which boost::result_of() would return the
wrong result type for a function due to the new implementation
using decltype instead of the result_of protocol on compilers
that sufficiently support C++11 (such as clang >= 3.2).

Now, boost::tr1_result_of() is used to explicitly request that
the result_of protocol be used even when decltype is supported
by the compiler.
2013-04-10 20:17:34 -04:00
Kyle Lutz
430a76bb6c Add generate_fibonacci_sequence test-case
This adds a new test case which computes the first twenty-five
Fibonacci numbers using the transform() algorithm and a custom
generator function.
2013-04-09 22:36:00 -04:00
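The commit message for 430a76bb6c does not show the generator function. One way to compute each Fibonacci number independently, which suits an element-wise transform(), is Binet's closed form. The host-side sketch below assumes that approach and assumes the sequence starts at F(0) = 0. The names `nth_fibonacci_sketch` and `fibonacci_sequence_sketch` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Binet's closed form: F(n) = round(phi^n / sqrt(5)).
// Accurate in double precision for the small n used here.
inline long long nth_fibonacci_sketch(int n)
{
    assert(n >= 0);
    const double sqrt5 = std::sqrt(5.0);
    const double golden_ratio = (1.0 + sqrt5) / 2.0;
    return std::llround(std::pow(golden_ratio, n) / sqrt5);
}

// Fills indices 0..count-1, then maps each index to its Fibonacci number.
inline std::vector<long long> fibonacci_sequence_sketch(std::size_t count)
{
    std::vector<int> indices(count);
    std::iota(indices.begin(), indices.end(), 0);

    std::vector<long long> result(count);
    std::transform(indices.begin(), indices.end(), result.begin(),
                   nth_fibonacci_sketch);
    return result;
}
```

Because each output depends only on its own index, no element needs the previous two values, which is what makes the computation expressible as a single transform.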
Kyle Lutz
652f99e449 Fix bug in get_buffer() for iterator adaptors
This fixes a bug in which the get_buffer() method was not properly
disabled for iterator adaptors with a non-buffer base iterator.
2013-04-09 21:56:24 -04:00