compute

Author	SHA1	Message	Date
Kyle Lutz	e46828a9d6	Fix issues involving iterators with void value_type This fixes a few issues encountered when using iterators with a void value_type (e.g. std::insert_iterator<>). The is_contiguous_iterator meta-function was refactored to always return false for iterators with a void value_type and avoid instantiating types for containers with a void value_type (e.g. std::vector<void>::iterator) which previously resulted in compilation errors.	2013-05-20 19:57:13 -04:00
Kyle Lutz	4ab37ada07	Add system-wide default command queue This adds a system-wide default command queue. This queue is accessible via the new static system::default_queue() method. The default command queue is created for the default compute device in the default context and is analogous to the default stream in CUDA. This changes how algorithms operate when invoked without an explicit command queue. Previously, each algorithm had two overloads: the first expected a command queue to be explicitly passed and the second would create and use a temporary command queue. Now, all algorithms take a command queue argument which has a default value equal to system::default_queue(). This fixes a number of race conditions and performance issues throughout the library associated with creating, using, and destroying many separate command queues.	2013-05-15 20:59:56 -04:00
Kyle Lutz	1b9e904cc7	Add CHECK_RANGE_EQUAL() test macro This adds a new macro for the unit-tests which checks a range of values on the device against an array of values on the host. This simplifies writing tests and removes the need to explicitly copy values back to the host for verification.	2013-05-13 23:06:40 -04:00
Kyle Lutz	a2bda0610d	Fix memory issues with device_ptr and allocator This fixes a few memory handling issues between device_ptr, buffer_iterator, buffer_value, allocator, and malloc/free. Previously, memory buffers that were allocated by allocator and malloc were being retained (via clRetainMemObject() in buffer's constructor) by device_ptr, buffer_iterator and buffer_value. Now, false is passed for the retain parameter to buffer's constructor so that the buffer's reference count is not incremented. Furthermore, the classes now set the buffer to null before being destructed so that they will not decrement its reference count (which normally occurs in buffer's destructor). The main effect of this change is that objects which refer to a memory buffer but do not own it (e.g. device_ptr, buffer_iterator) will not modify the reference count for the buffer. This fixes a number of memory leaks which occurred in longer running programs.	2013-05-13 22:27:02 -04:00
Kyle Lutz	a5ddeae614	Add scalar<T> container This adds a new scalar<T> "container" which stores a single value in a memory buffer. This simplifies memory handling in algorithms which read and write a single value.	2013-05-11 20:20:27 -04:00
Kyle Lutz	eea16e61c7	Remove simple_histogram from flat_map test This removes the inefficient simple_histogram test-case from the flat_map test-suite.	2013-05-11 20:17:24 -04:00
Kyle Lutz	130f8c30f1	Rename kernel::num_args() method to arity() This renames the kernel::num_args() method to arity().	2013-05-11 20:15:00 -04:00
Kyle Lutz	0b466fec72	Fix indexing bug in list_devices example This fixes an indexing bug in the list_devices example in which the information would only be printed for the first device on a platform.	2013-05-11 20:11:58 -04:00
Kyle Lutz	ffec5fd34a	Remove unnecessary includes from transform_reduce This removes a couple of unnecessary includes from the transform_reduce.hpp header file.	2013-05-11 20:10:28 -04:00
Kyle Lutz	178676df4f	Refactor the system::default_device() method This refactors the system::default_device() method. Now, the default compute device for the system is only found once and stored in a static variable. This eliminates many redundant calls to clGetPlatformIDs() and clGetDeviceIDs(). Also, the default_cpu_device() and default_gpu_device() methods have been removed and their usages replaced with default_device().	2013-05-10 22:49:05 -04:00
Kyle Lutz	d40eddc56b	Fix compilation error with get<N>() and tuple This fixes a compilation error which occurred when using the get<N>() function with tuple types.	2013-05-10 21:51:28 -04:00
Kyle Lutz	fecbe63043	Check for partitioning support in device test This adds checks to the device test-suite to ensure that the current device supports the partitioning types before attempting to use the corresponding device::partition_*() methods.	2013-05-09 23:05:11 -04:00
Kyle Lutz	8934bea627	Skip enqueue_write_buffer_rect() test on AMD GPUs This skips the enqueue_write_buffer_rect() test on AMD GPUs which don't correctly implement the clEnqueueWriteBufferRect() function.	2013-05-09 22:13:28 -04:00
Kyle Lutz	705b3f35a3	Fix narrowing conversion warnings in device This fixes a couple of narrowing conversion warnings in the device partitioning methods which were seen when compiling VexCL with Boost.Compute in C++11 mode.	2013-05-09 22:04:00 -04:00
Kyle Lutz	9a64f6b39a	Add get<N>() function This adds a get<N>() function which returns the n'th element of an aggregate type (e.g. vector type, pair, tuple). This unifies the functionality of, and replaces, the get_pair() and vector_component() functions.	2013-05-05 12:46:05 -04:00
Kyle Lutz	3e840fa306	Add transform_if() algorithm This adds a new algorithm named transform_if() which applies a given unary function to an input value only if it passes a separate predicate function.	2013-05-05 11:51:21 -04:00
Kyle Lutz	49a34442e5	Remove unused histogram() algorithm This removes the unused histogram() algorithm.	2013-05-05 10:56:14 -04:00
Dominic Meiser	7c5e321c2a	Fixing build issues under Windows	2013-05-03 18:37:09 -04:00
Kyle Lutz	3e93d01475	Add default constructors to image2d and image3d This adds default constructors to the image2d and image3d classes which initialize them with null memory objects.	2013-05-02 21:01:30 -04:00
Kyle Lutz	5d28d3887e	Make pick_copy_work_group_size() inline This makes the pick_copy_work_group_size() function inline.	2013-05-02 20:55:22 -04:00
Kyle Lutz	0ab2fe85eb	Don't auto-initialize values in vector This changes the vector class to not auto-initialize values when it is created or resized. This improves performance by eliminating a call to fill(). If needed, user code can call fill() explicitly on the newly allocated values.	2013-04-27 10:30:26 -04:00
Kyle Lutz	03195275b3	Increase work-group size for copy() kernel This increases the work-group size for the copy() kernel to be up to 32 items based on the size of the input. This increases the performance of copy() and related algorithms (e.g. transform()).	2013-04-27 10:21:47 -04:00
Kyle Lutz	9a350f65cf	Change clamp_range() test to use float This changes the clamp_range() test to use float values instead of int values. The OpenCL clamp() function is only defined for float values and this test caused kernel compilation errors on certain platforms. Also updates the test to use the new global context.	2013-04-27 10:20:47 -04:00
Kyle Lutz	dde1747b32	Merge remote-tracking branch 'ddemidov/master'	2013-04-27 09:45:32 -04:00
Dominic Meiser	2a93124ef5	Using FindOpenCL module from VexCL Changed CMakeLists.txt files in Boost.Compute to use the variables defined by FindOpenCL.	2013-04-23 20:03:38 -04:00
Kyle Lutz	ea107ae5d6	Add clamp_range() algorithm This adds a clamp_range() algorithm which clamps a range of values between a low and high value. This is based on the algorithm of the same name in Boost.Algorithm.	2013-04-22 22:06:04 -04:00
Kyle Lutz	3b24d0d15e	Add test for SAXPY This adds a test for the SAXPY function.	2013-04-22 20:40:50 -04:00
Kyle Lutz	425ada2d03	Add documentation for platform::unload_compiler() This adds documentation for the unload_compiler() method in the platform class.	2013-04-22 20:33:35 -04:00
Kyle Lutz	1fbb7b1b9a	Add documentation for platform::get_extension_function_address() This adds documentation for the get_extension_function_address() method in the platform class.	2013-04-22 20:30:09 -04:00
Kyle Lutz	2f7ae1bc9c	Remove documentation for non-existent platform methods This removes the documentation for the non-existent platforms() and platform_count() methods in the platform class. These methods have been moved to the system class and are documented there.	2013-04-22 20:24:51 -04:00
Kyle Lutz	00cdca5b55	Add documentation for type-traits This adds documentation for the type-traits.	2013-04-22 20:20:17 -04:00
Kyle Lutz	e0f78ff81e	Update README file This replaces the README file with a new, improved README.md file written in markdown which contains more information and a simple example.	2013-04-20 22:29:47 -04:00
Denis Demidov	13887e8ed5	Rely on system::default_context() to hold static context refs kylelutz/compute#9	2013-04-19 15:16:46 +04:00
Denis Demidov	5d77bbebee	Global setup for OpenCL context in tests refs kylelutz/compute#9 The device, context, and queue are initialized statically in `context_setup.hpp`. With this change all tests are able to complete when an NVIDIA GPU is in exclusive compute mode. Side effect of the change: the time for all tests to complete is reduced from 15.71 to 13.03 sec on a Tesla C2075.	2013-04-19 14:53:59 +04:00
Kyle Lutz	8142e5d5f9	Add move-constructors to wrapper classes This adds move-constructors and move-assignment operators to the OpenCL wrapper classes.	2013-04-17 20:45:04 -04:00
Kyle Lutz	5cb51569eb	Add test for command_queue::enqueue_write_buffer_rect() This adds a test for the enqueue_write_buffer_rect() method in the command_queue class. This method copies a rectangular region of memory from the host to a device buffer.	2013-04-13 20:52:42 -04:00
Kyle Lutz	4bdec761cd	Add memory_object::reference_count() method This adds a reference_count() method to the memory_object class which returns its current reference count.	2013-04-13 11:07:04 -04:00
Kyle Lutz	d58b7c0902	Return event from command_queue::enqueue_task() This changes the command_queue::enqueue_task() method to return an event object.	2013-04-13 10:23:29 -04:00
Kyle Lutz	da4cb81679	Return event from command_queue::enqueue_nd_range_kernel() This changes the enqueue_nd_range_kernel() method to return an event object. This allows clients to monitor the progress of a kernel executing on a device.	2013-04-13 10:23:01 -04:00
Kyle Lutz	001b3ff7fe	Add get() methods to wrapper classes This adds a get() method to each wrapper class which returns a reference to the underlying OpenCL object.	2013-04-13 09:44:51 -04:00
Denis Demidov	8b78d4187d	Adds support for selecting devices with environment variables boost::compute::system::default_device() supports the following environment variables: BOOST_COMPUTE_DEFAULT_DEVICE for device name, BOOST_COMPUTE_DEFAULT_PLATFORM for OpenCL platform name, and BOOST_COMPUTE_DEFAULT_VENDOR for device vendor name. If one or more of these variables is set, then the device that satisfies all conditions gets selected. If such a device is unavailable, then the first available GPU is selected. If there are no GPUs in the system, then the first available CPU is selected. Otherwise, default_device() returns a null device. The hello_world example is modified to use default_device() instead of default_gpu_device().	2013-04-12 17:22:25 -04:00
Kyle Lutz	1be19a6305	Add multiplies<T> specialization for std::complex<T> This adds a specialization of multiplies<T> for std::complex<T> which implements complex number multiplication. Also adds a simple test using transform() to verify the complex multiplication works correctly.	2013-04-10 22:04:04 -04:00
Kyle Lutz	8d13920dc4	Move swizzle_iterator to detail namespace This moves the swizzle_iterator class to the detail namespace.	2013-04-10 21:51:24 -04:00
Kyle Lutz	bcc3aed40f	Move pixel_input_iterator to detail namespace This moves the pixel_input_iterator class to the detail namespace.	2013-04-10 21:38:05 -04:00
Kyle Lutz	5cce555d8c	Move binary_transform_iterator to detail namespace This moves the binary_transform_iterator class to the detail namespace.	2013-04-10 21:33:29 -04:00
Kyle Lutz	e30ec9f26c	Move adjacent_transform_iterator to detail namespace This moves the adjacent_transform_iterator class to the detail namespace.	2013-04-10 21:24:15 -04:00
Kyle Lutz	6dd6e11c7d	Fix unused variable warning in get_base_iterator_buffer() This fixes an unused variable warning which occurs in the get_base_iterator_buffer() function when the base iterator is not a buffer iterator and thus the iter argument is not used.	2013-04-10 21:09:17 -04:00
Kyle Lutz	6fdffd8a2b	Replace usages of result_of() with tr1_result_of() This fixes a bug in which boost::result_of() would return the wrong result type for a function due to the new implementation using decltype instead of the result_of protocol on compilers that sufficiently support C++11 (such as clang >= 3.2). Now, boost::tr1_result_of() is used to explicitly request that the result_of protocol be used even when decltype is supported by the compiler.	2013-04-10 20:17:34 -04:00
Kyle Lutz	430a76bb6c	Add generate_fibonacci_sequence test-case This adds a new test case which computes the first twenty-five Fibonacci numbers using the transform() algorithm and a custom generator function.	2013-04-09 22:36:00 -04:00
Kyle Lutz	652f99e449	Fix bug in get_buffer() for iterator adaptors This fixes a bug in which the get_buffer() method was not properly disabled for iterator adaptors with a non-buffer base iterator.	2013-04-09 21:56:24 -04:00
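
Note on commit 8b78d4187d: the device-selection order described in that message (match the environment variables, else the first GPU, else the first CPU, else a null device) can be sketched roughly as below. This is only an illustration using the standard library, not the Boost.Compute implementation. The names DeviceInfo, DeviceKind, DeviceFilter, env_or_empty, filter_from_environment, matches, and select_default_device are hypothetical, and exact string comparison is an assumption, since the commit message does not say how names are matched.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Boost.Compute types.
enum class DeviceKind { gpu, cpu, other };

struct DeviceInfo {
    std::string name;
    std::string platform;
    std::string vendor;
    DeviceKind kind;
};

// An empty string means the corresponding environment variable is not set.
struct DeviceFilter {
    std::string name;
    std::string platform;
    std::string vendor;

    bool any_set() const {
        return !name.empty() || !platform.empty() || !vendor.empty();
    }
};

// Reads an environment variable, returning "" when it is unset.
inline std::string env_or_empty(const char* key) {
    const char* value = std::getenv(key);
    return value ? std::string(value) : std::string();
}

// Builds a filter from the three variables named in the commit message.
inline DeviceFilter filter_from_environment() {
    return DeviceFilter{env_or_empty("BOOST_COMPUTE_DEFAULT_DEVICE"),
                        env_or_empty("BOOST_COMPUTE_DEFAULT_PLATFORM"),
                        env_or_empty("BOOST_COMPUTE_DEFAULT_VENDOR")};
}

// Assumption: a set field must equal the device's field exactly.
inline bool matches(const DeviceInfo& d, const DeviceFilter& f) {
    return (f.name.empty() || d.name == f.name) &&
           (f.platform.empty() || d.platform == f.platform) &&
           (f.vendor.empty() || d.vendor == f.vendor);
}

// Returns a pointer into `devices`, or nullptr to model the "null device".
inline const DeviceInfo* select_default_device(
    const std::vector<DeviceInfo>& devices, const DeviceFilter& filter) {
    // 1. If any variable is set, prefer a device satisfying all set conditions.
    if (filter.any_set()) {
        for (const DeviceInfo& d : devices) {
            if (matches(d, filter)) return &d;
        }
    }
    // 2. Otherwise (or if nothing matched), take the first available GPU.
    for (const DeviceInfo& d : devices) {
        if (d.kind == DeviceKind::gpu) return &d;
    }
    // 3. With no GPU present, take the first available CPU.
    for (const DeviceInfo& d : devices) {
        if (d.kind == DeviceKind::cpu) return &d;
    }
    // 4. Nothing usable was found.
    return nullptr;
}
```

Calling select_default_device(devices, filter_from_environment()) would apply whatever variables are set in the current process; passing a DeviceFilter built by hand keeps the logic testable without touching the environment.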


1404 Commits