compute

Author	SHA1	Message	Date
Kyle Lutz	ab0a365060	Only allocate temporary vector if necessary in generic_reduce()	2014-04-22 21:42:28 -07:00
Kyle Lutz	72a5449ffe	Fix BOOST_COMPUTE_FUNCTION() usage in struct.hpp documentation	2014-04-22 21:31:14 -07:00
Kyle Lutz	127b350411	Remove unnecessary typename in function.hpp and closure.hpp This fixes a warning when compiling these files with clang.	2014-04-20 19:34:21 -07:00
Kyle Lutz	4b67907023	Change BOOST_COMPUTE_FUNCTION() to use custom argument names This changes the BOOST_COMPUTE_FUNCTION() macro (and the related BOOST_COMPUTE_CLOSURE() macro) to use custom, user-provided argument names instead of auto-generating them based on their index. This is an API-breaking change. Users should now provide argument names when using the BOOST_COMPUTE_FUNCTION() macro. The examples and documentation have been updated to reflect the new API.	2014-04-20 19:13:48 -07:00
Kyle Lutz	2511bdb436	Merge pull request #97 from roshanr95/unique Fix errors in unique	2014-04-20 16:42:56 -07:00
roshanr	3f537d806e	Add unique_copy, modify unique to use it	2014-04-21 01:43:10 +05:30
Kyle Lutz	a78212fdde	Rename K to K_BITS in radix_sort() This should fix the following error seen on the Apple OpenCL implementation when compiling the radix_sort program: "error: definition of macro 'K' conflicts with an identifier used in the precompiled header".	2014-04-20 10:16:02 -07:00
Kyle Lutz	8b06e3f7bb	Add event::duration() method	2014-04-19 12:31:37 -07:00
Kyle Lutz	6ac757887c	Support generic function callbacks for event	2014-04-19 11:38:10 -07:00
Kyle Lutz	7629748e49	Merge pull request #90 from roshanr95/perf_sort_by_key Fix errors in perf_sort_by_key	2014-04-18 08:15:08 -07:00
roshanr	70da4979f5	Refactor recurring code into preprocessor Makes it easy to add specialisations	2014-04-18 10:31:12 +05:30
Kyle Lutz	21d81fcd76	Add user_event class	2014-04-16 21:11:47 -07:00
Kyle Lutz	ac0be42cfc	Change program::build() to return void	2014-04-16 21:01:31 -07:00
Kyle Lutz	b3ab16578b	Include <boost/mpl/size.hpp> in function.hpp	2014-04-16 19:12:29 -07:00
Kyle Lutz	e84987b3f4	Fix unused parameter warning in reduce_on_gpu.hpp	2014-04-16 19:12:15 -07:00
Kyle Lutz	4f6c591362	Remove unnecessary typename in discard_iterator	2014-04-13 14:55:08 -07:00
Kyle Lutz	663ab01425	Add documentation for the random number generator classes	2014-04-13 14:33:03 -07:00
Kyle Lutz	3d8616e27e	Add mersenne_twister_engine::generate() overload with transform	2014-04-13 14:18:41 -07:00
Kyle Lutz	6336b81911	Rename mersenne_twister_engine::fill() to generate()	2014-04-13 14:13:53 -07:00
Kyle Lutz	2ebb04caac	Add discard() method to mersenne_twister_engine	2014-04-13 13:57:12 -07:00
Kyle Lutz	dd0b1fcb7b	Add discard_iterator class	2014-04-13 13:45:01 -07:00
Kyle Lutz	6a9efd6d03	Optimize vector<T>::erase() when last equals end()	2014-04-12 16:16:01 -07:00
Kyle Lutz	dadced4703	Remove unused context variable in random_fill()	2014-04-12 11:33:09 -07:00
Kyle Lutz	b7c4f0ce18	Change mersenne_twister::seed() to take a command_queue	2014-04-12 11:14:44 -07:00
Kyle Lutz	7b2ca68539	Add documentation for the enqueue_1d_range_kernel() method	2014-04-12 10:11:02 -07:00
Kyle Lutz	8dac90de3a	Fix spelling error in enqueue_native_kernel() documentation	2014-04-12 10:09:44 -07:00
Kyle Lutz	420c3dd15b	Remove cl_int return values from command_queue This updates the methods in command_queue to either return void (for synchronous operations) or an event object (for asynchronous operations). The caller will be notified of OpenCL errors via an exception being thrown.	2014-04-12 10:02:45 -07:00
Kyle Lutz	7966768c80	Remove read/write buffer convenience overloads in command_queue	2014-04-12 09:40:37 -07:00
Kyle Lutz	e3604817df	Remove explicit call to finish() in command_queue destructor This removes the explicit call to finish() in the destructor for the command_queue class. The clFinish() function will be called automatically by the clReleaseCommandQueue() function once the reference count for the command queue drops to zero.	2014-04-12 09:35:39 -07:00
Kyle Lutz	7ec4566a00	Remove default local_work_size argument for enqueue_nd_range_kernel()	2014-04-12 09:22:28 -07:00
Kyle Lutz	89d97768d2	Remove enqueue_1d_range_kernel() overload with no local work-size	2014-04-12 09:17:48 -07:00
Kyle Lutz	15cf54cc48	Fix ambiguous member template warning with clang	2014-04-10 22:31:01 -07:00
Kyle Lutz	acb2188382	Improve reduce() performance with generic iterators	2014-04-10 22:16:04 -07:00
Kyle Lutz	b897b1f023	Copy multiple values per thread in copy_on_device()	2014-04-02 21:46:36 -07:00
Kyle Lutz	bae2bb6c7f	Add get_nvidia_compute_capability() function	2014-04-02 21:30:22 -07:00
Kyle Lutz	01eb24f36c	Fix bug in copy-constructor for wait_list	2014-03-23 21:26:09 -07:00
Kyle Lutz	6334d67720	Merge pull request #75 from roshanr95/unique Unique algorithm	2014-03-23 21:24:05 -07:00
Kyle Lutz	5efecbdaad	Merge pull request #74 from ddemidov/master Fixing several warnings given by pedantic g++-4.8.2	2014-03-23 21:23:21 -07:00
roshanr	1e81b7ec2e	Unique algorithm Added unique() algorithm, tests and benchmarks. Removed unused variable in scan_on_gpu() to remove warnings	2014-03-24 06:30:28 +05:30
Denis Demidov	d653df535d	Fixing several warnings given by pedantic g++-4.8.2	2014-03-22 22:37:39 +04:00
Kyle Lutz	667aa9c200	Add buffer::clone() method	2014-03-20 23:31:41 -07:00
Kyle Lutz	0446e24baf	Fix BOOST_COMPUTE_FUNCTION() with non-default-constructible types	2014-03-20 23:31:36 -07:00
Kyle Lutz	c15f35b0be	Check for empty strings in get_object_info()	2014-03-20 23:17:27 -07:00
Kyle Lutz	21f053fe00	Check binary status in program::create_with_binary()	2014-03-20 23:17:27 -07:00
Kyle Lutz	53d6e95054	Release v0.1	2014-03-16 15:18:26 -07:00
Kyle Lutz	a439709fc2	Improve documentation	2014-03-16 13:59:14 -07:00
Kyle Lutz	8e086104a0	Add event::set_callback() method This adds a method to the event class which allows the user to register a callback function to be invoked when the event reaches the specified state (e.g. when it completes).	2014-03-16 13:20:57 -07:00
Kyle Lutz	0c3a325554	Move transform_if() algorithm to experimental	2014-03-16 13:16:39 -07:00
Kyle Lutz	9bf22a41d1	Merge pull request #66 from roshanr95/rotate_copy rotate_copy algorithm and test	2014-03-13 08:07:11 -07:00
Kyle Lutz	bae7432c04	Improve sort_by_key() performance	2014-03-12 23:40:57 -07:00
Kyle Lutz	cf8e972e55	Improve kernel::set_arg() method	2014-03-12 21:02:22 -07:00
Kyle Lutz	e1e84252d0	Merge pull request #65 from roshanr95/mersenneTwister Fix warnings in Mersenne Twister	2014-03-12 18:21:13 -07:00
roshanr	f1b7f39655	rotate_copy algorithm and test	2014-03-13 03:56:56 +05:30
Kyle Lutz	5fb6f94cea	Merge pull request #62 from roshanr95/rotate Rotate algorithm	2014-03-12 10:38:56 -07:00
roshanr	d1a87603f0	Fix warnings in Mersenne Twister	2014-03-12 23:05:06 +05:30
roshanr	03edbbbdab	Rotate algorithm	2014-03-12 22:41:30 +05:30
Kyle Lutz	0a1c378731	Add opengl_renderbuffer class	2014-03-09 22:17:36 -07:00
Kyle Lutz	ad48527dcd	Add documentation for OpenGL interop headers	2014-03-09 22:16:10 -07:00
Kyle Lutz	6c8f158c00	Fix documentation for the wait_list class	2014-03-09 22:06:45 -07:00
Kyle Lutz	83d104f24f	Add BOOST_COMPUTE_CLOSURE() macro This adds a new macro which allows users to create closure functions which can capture C++ variables and make them available in OpenCL.	2014-03-08 18:44:03 -08:00
Kyle Lutz	dec92cc438	Add BOOST_COMPUTE_ADAPT_STRUCT() macro This adds a new macro which allows the user to adapt a C++ struct or class for use with OpenCL given its type, name, and members. This allows for custom user-defined data-types to be used with the Boost.Compute containers and algorithms.	2014-03-08 18:21:34 -08:00
Kyle Lutz	6f3f30bee9	Add enqueue_native_kernel() method to command_queue	2014-03-08 15:21:57 -08:00
Kyle Lutz	3b49cf14f8	Add wait_list class This adds a wait_list class which contains a vector of OpenCL events that can be waited on before executing further commands.	2014-03-08 14:09:41 -08:00
Kyle Lutz	71af014b3d	Add mapped_view container	2014-03-08 13:17:55 -08:00
Kyle Lutz	51e89596b1	Simplify accumulate() with reduce()	2014-03-08 13:13:32 -08:00
Kyle Lutz	b8de46d4de	Add experimental directory This adds an experimental directory which contains various experimental algorithms and functions. The files and APIs under this directory are experimental and unstable.	2014-03-08 13:02:06 -08:00
Kyle Lutz	86c0bb0a12	Add inline specifier to opengl_enqueue_release_gl_objects()	2014-02-28 21:09:34 -08:00
Kyle Lutz	b1b50f5e3a	Add meta_kernel::insert_function_call() method	2014-02-24 19:56:52 -08:00
Kyle Lutz	d9a45b06d3	Move float vector stream operators in meta_kernel	2014-02-24 19:42:31 -08:00
Kyle Lutz	80781ce9d2	Add OpenCV-OCL interop functions	2014-02-22 10:57:42 -08:00
Kyle Lutz	dacdbf0ffd	Bug in fill() with uchar4	2014-02-22 10:51:39 -08:00
Kyle Lutz	e7a76c343a	Remove unused variable in reduce_on_gpu() kernel	2014-02-14 18:14:18 -08:00
Kyle Lutz	ec11d8cdc4	Add third-party perf tests This adds third-party performance tests to use in comparing Boost.Compute with other parallel/GPGPU frameworks like Intel's TBB and NVIDIA's Thrust along with the C++ STL. Also refactors the timing and profiling infrastructure and adds a simple perf.py driver script for running performance tests.	2014-02-02 13:12:17 -08:00
Kyle Lutz	6de0b65d18	Improve documentation	2014-02-02 11:32:49 -08:00
Kyle Lutz	f3c2384af4	Add opengl_create_shared_context() function	2014-02-01 12:27:23 -08:00
Kyle Lutz	0c88eca831	Add platform::id() method	2014-02-01 12:17:21 -08:00
Kyle Lutz	9a0aa33c2f	Make platform::get_extension_function_address() const	2014-02-01 12:15:53 -08:00
Kyle Lutz	ccd6f21d98	Change vector constructors to take queue argument This changes the vector<T> constructors which copy or initialize data to take a queue argument used for performing the operations. Previously they just took a context argument used to initialize the buffer and then created a new command queue to use. This improves performance by not requiring a new command queue and also fixes issues when performing operations on a different command queue while the vector was still being initialized.	2014-01-27 23:39:19 -08:00
Kyle Lutz	47922aa780	Add Boost version check to config.hpp This adds a compile-time check to config.hpp which ensures that the minimum supported Boost version (1.48) is found.	2014-01-20 18:31:18 -08:00
Kyle Lutz	dc20f09d92	Add make_tuple() lambda function	2014-01-14 22:18:35 -08:00
Kyle Lutz	ea7c2bf2f4	Add make_pair() lambda function	2014-01-14 22:03:48 -08:00
Kyle Lutz	c784ae994e	Add third lambda placeholder	2014-01-14 22:00:22 -08:00
Kyle Lutz	46ef3fffb5	Make lambda function expressions variadic	2014-01-14 21:58:09 -08:00
Kyle Lutz	c57e1953d8	Make lambda get<N>() variadic	2014-01-14 21:54:54 -08:00
Kyle Lutz	8aad57612b	Make function_signature_to_mpl_vector<> meta-function variadic	2014-01-14 21:52:34 -08:00
Kyle Lutz	72664c8de9	Add test for generate() with pair<T1, T2>	2014-01-14 21:31:51 -08:00
Kyle Lutz	68412f5ae0	Refactor function handling in lambda expressions	2014-01-13 18:27:57 -08:00
Kyle Lutz	936d801466	Add support for host iterators to sort()	2014-01-13 18:27:52 -08:00
Kyle Lutz	413267b32a	Improve accumulate() performance This improves the performance for the accumulate() algorithm for types/operations that can be performed with reduce().	2014-01-13 18:27:48 -08:00
Kyle Lutz	ac148e8f1f	Fix extra semicolon warning in interop/eigen/core.hpp	2014-01-13 18:27:40 -08:00
Denis Demidov	5e912dff1c	Move BOOST_COMPUTE_MAX_ARITY definition to compute/config.hpp	2014-01-10 09:55:35 +04:00
Denis Demidov	52bae83504	Make zip_iterator take more than three elements This uses Boost.Preprocessor macros to allow zip iterators to work with arbitrary number of elements (the current limit is maximum boost::tuple size which is 10 by default). Refs #50	2014-01-09 23:39:58 +04:00
Denis Demidov	d24749ae52	Use SHA1 for online cache keys This makes online cache use sha1 of the program source as key. Introduces boost::compute::detail::sha1() function, which is moved from compute::program into its own header file.	2014-01-07 23:07:18 +04:00
Kyle Lutz	6f52e3ce1f	Merge pull request #46 from ddemidov/offline-cache Use the original program source for program creation/compilation	2014-01-07 10:11:08 -08:00
Denis Demidov	41d2052c2a	Fix linkage problem with detail::getenv() detail::getenv() function was not declared inline, which led to `multiple definition` errors at link time when a program consisted of multiple objects that included Boost.Compute headers. Fixed the problem and added core.multiple_objects test.	2014-01-07 21:29:18 +04:00
Denis Demidov	f519ad3639	Use the original program source for program creation/compilation Instead of building the program from source with the added comment block (used for distinction between different platforms and devices when offline cache is in use), only use the altered source for the hash computation. This way users will not get unexpected results from program.source().	2014-01-07 21:05:26 +04:00
Kyle Lutz	aad03486d9	Add interop support This adds interoperability support between Boost.Compute and various other C/C++ libraries (Eigen, OpenCV, OpenGL, Qt and VTK). This eases development for users using external libraries with Boost.Compute.	2014-01-06 23:35:38 -08:00
Kyle Lutz	b47e74df6f	Add is_fundamental type-trait	2014-01-06 23:04:36 -08:00
Kyle Lutz	eca81df028	Merge pull request #39 from ddemidov/offline-cache Implements offline kernel caching	2014-01-06 22:47:52 -08:00
Denis Demidov	562f149b18	Implements offline kernel caching See kylelutz/compute#21 This adds program::build_with_source() function that both creates and builds the program for the given context with supplied source and compile options. In case BOOST_COMPUTE_USE_OFFLINE_CACHE macro is defined, it also saves the compiled program binary for reuse in the offline cache located in $HOME/.boost_compute folder on UNIX-like systems and in %APPDATA%/boost_compute folder on Windows. All internal uses of program::create_with_source() followed by program::build() are replaced with program::build_with_source().	2014-01-07 09:07:00 +04:00
Kyle Lutz	b17888b604	Move future header to async directory	2014-01-06 18:44:37 -08:00
Kyle Lutz	55eeada078	Add getenv() wrapper This adds a getenv() wrapper which can be used to avoid having to explicitly disable MSVC warnings when checking for environment variables.	2014-01-06 07:53:07 -08:00
Kyle Lutz	e337f632da	Add height() and width() methods to image2d	2014-01-05 18:36:29 -08:00
Kyle Lutz	3bc4a6366d	Add BOOST_COMPUTE_STRINGIZE_SOURCE() macro	2014-01-05 18:30:34 -08:00
Kyle Lutz	6b30645d6d	Remove extra semicolon in accumulate.hpp	2014-01-05 18:18:45 -08:00
Kyle Lutz	0d9be38326	Fix issues with gather() algorithm This fixes some issues with the gather algorithm and also adds another test for it.	2013-12-21 15:34:29 -08:00
Kyle Lutz	55783258e7	Add cache support to meta_kernel::compile() This updates the meta_kernel::compile() method to support caching of program objects. The programs are cached based on a hash of their source code.	2013-12-21 11:44:02 -08:00
Kyle Lutz	ac1ff45eff	Add reduce_on_gpu() algorithm This adds an improved reduce() algorithm implementation for GPUs. Also adds checks to accumulate() which allow it to use the higher-performance reduce() algorithm if possible.	2013-12-21 10:56:55 -08:00
Kyle Lutz	26612823a4	Add merge() overload with custom compare function This adds a merge() function overload which uses a custom compare function instead of the default less<T>() to compare the values.	2013-12-07 15:15:37 -08:00
Kyle Lutz	6b6f66b6ba	Add reduce() overload without function argument This adds an overload of the reduce() function which uses plus<T>() as the reductor. This simplifies the common case of calculating the sum for a range of values.	2013-12-07 15:02:04 -08:00
Kyle Lutz	ba9e64e316	Remove init argument from reduce() This removes the init argument from reduce. This simplifies the implementation and avoids copying a value from the host to the device on every call to reduce. If an initial value is required, the accumulate function can be called instead.	2013-12-07 14:49:46 -08:00
Kyle Lutz	7db9ad715f	Fix compilation error on Windows for context error handler This fixes a compilation error which occurs on Windows when registering the default error handler callback when creating a new context object. In OpenCL 1.1 and later the callback function is expected to use the __stdcall calling convention. This is optionally defined by the CL_CALLBACK macro on WIN32 platforms. If available, it is defined with the BOOST_COMPUTE_CL_CALLBACK macro which is then used to annotate the callback functions.	2013-12-06 23:11:01 -08:00
Kyle Lutz	4b2aa35326	Increase work-group size for copy() kernel This increases the work-group size for the copy() kernel to 256 which improves performance on several benchmarks.	2013-11-23 12:22:55 -08:00
Kyle Lutz	701bc8a5f3	Add nth_element() algorithm This adds an implementation of the nth_element() algorithm. For now the algorithm is trivially implemented by calling sort().	2013-11-15 20:51:13 -08:00
Kyle Lutz	0daa62e41f	Add experimental copy_index_if() algorithm This adds an experimental algorithm like copy_if() which copies the index of the values for which predicate returns true instead of the values themselves.	2013-11-15 20:30:30 -08:00
Kyle Lutz	adde232fc8	Add context error handler This adds an error handler function which is invoked when an OpenCL context encounters an error condition. The context error is converted to a C++ exception containing the error information and thrown.	2013-11-15 20:26:01 -08:00
Kyle Lutz	953ebb4e26	Add variadic tuple support This adds support for variadic tuples on C++11 compilers.	2013-11-15 20:07:39 -08:00
Kyle Lutz	b5ff4743bb	Add field() function This adds a new function which will return the named field from a value. For example, this can be used to return one of the components of a pair object or to swizzle a vector value.	2013-11-10 15:44:45 -08:00
Kyle Lutz	8213697307	Add BOOST_COMPUTE_FUNCTION() macro This adds a new macro to ease the definition of custom user functions. The BOOST_COMPUTE_FUNCTION() macro creates a new boost::compute::function<> object with the provided return type, argument types, function name and OpenCL source code.	2013-11-10 15:32:15 -08:00
Kyle Lutz	8608e60116	Refactor invoked_function<> This refactors the invoked_function<> classes. Previously each function arity (e.g. unary, binary) had a separate invoked_function<> template class. Now they all use the same class which simplifies the logic in function<> and meta_kernel.	2013-11-10 15:31:56 -08:00
Kyle Lutz	43678410be	Fix bugs with type definitions in meta_kernel This fixes a bug in which type definitions were being inserted into meta_kernel's multiple times. Also forces zip_iterator to insert its type definitions when used in a kernel.	2013-11-10 15:13:46 -08:00
Kyle Lutz	a0b635e201	Add type_name<void>() specialization This adds a type_name<>() specialization for void types.	2013-11-10 14:35:04 -08:00
Kyle Lutz	85812f4e93	Add BOOST_COMPUTE_TYPE_NAME() macro This adds a macro for registering custom type names for C++ types to be used in OpenCL kernel code. Internally the macro specializes the type_name<T>() function.	2013-10-02 21:40:22 -04:00
Kyle Lutz	a2b7595f36	Make type_name<T>() inline This adds the inline specifier to the type_name<T>() function.	2013-10-02 21:23:09 -04:00
Kyle Lutz	feb510a019	Add unpack() function adaptor This adds a new unpack() function adaptor which converts a function with N arguments to a function which takes a single tuple argument with N components. This is useful for calling built-in functions with the tuple values returned from zip_iterator. This also removes the now un-needed binary_transform_iterator.	2013-09-24 23:05:08 -04:00
Kyle Lutz	736f3a17a6	Add min_and_max reduce() test This adds a test for computing the minimum and maximum values of a vector simultaneously using reduce() with a custom reduction function. Also fixes a bug in reduce() in which inplace_reduce() was being used even if the input type and result type differed.	2013-09-24 22:47:16 -04:00
Kyle Lutz	a1155bc343	Store source strings for binary and ternary functions This fixes an issue in which the source strings for binary and ternary functions were not being stored and thus not being inserted into kernels when they were invoked.	2013-09-24 22:42:50 -04:00
Kyle Lutz	dc6b3228eb	Add as() and convert() type-conversion functions This adds the as() and convert() functions for converting between OpenCL types.	2013-09-24 22:27:50 -04:00
Kyle Lutz	3412d0935d	Add not1() and not2() function adaptors This adds the not1() and not2() function adaptors which negate unary and binary functions respectively.	2013-09-24 22:22:52 -04:00
Kyle Lutz	07e4a6b3aa	Remove BLAS functions This removes the incomplete BLAS API functions.	2013-09-24 22:19:56 -04:00
Kyle Lutz	d16309f57e	Add program_cache This adds a program cache which can be used by algorithms and other functions to store programs which may be re-used. This improves performance by reducing the need for costly recompilation of commonly used programs. Program caches are context specific and multiple copies of the same context will use the same program cache. They are created and accessed by the global get_program_cache() function. For now, only a few algorithms and functions (radix sort, mersenne twister, fixed size sorts) make use of the program cache.	2013-09-07 22:58:34 -04:00
Kyle Lutz	d04e628367	Add experimental sort_by_transform() algorithm This adds a sort_by_transform() algorithm which sorts a set of values based on the value of a transform function. For example, this can be used to sort a set of vectors by their length (when used with the length<T>() function) or by a single component (when used with the get<N>() function).	2013-09-07 17:10:15 -04:00
Kyle Lutz	3389a5c741	Add sort_by_key() algorithm This adds a new sort_by_key() algorithm which sorts a range of values by a range of keys with a comparison operator. For now this is only implemented by the serial insertion sort algorithm. In the future it will be ported to the other sorting algorithms (e.g. radix sort).	2013-09-07 17:02:08 -04:00
Kyle Lutz	f9d887e30d	Add experimental tabulate() algorithm This adds a tabulate() algorithm which fills a range with values calculated from a function given each element's index.	2013-09-07 16:53:08 -04:00
Kyle Lutz	a96c9c0182	Add result argument to reduce() algorithm This adds an output iterator result argument to the reduce() algorithm. Now, instead of returning the reduced result, the result is written to an output iterator. This allows the value to stay on the device and avoids a device-to-host copy in cases where the result is not needed on the host (e.g. it is part of a larger computation). This is an API breaking change to users of reduce(). Affected code should now declare a result variable and then pass a pointer to it as the new result argument.	2013-09-07 15:36:49 -04:00
Kyle Lutz	a8f4421739	Add copy() specialization for host-to-host transfers This adds a copy() specialization for host-to-host transfers which simply forwards the call to std::copy(). This is useful in templated algorithms which may in certain circumstances copy() between data ranges on the host.	2013-09-07 15:29:48 -04:00
Kyle Lutz	78a561eff1	Add scan_on_cpu() algorithm This adds a new scan_on_cpu() algorithm which implements the scan() algorithm for CPU devices. Also renames the existing scan() algorithm to scan_on_gpu(). This fixes some test failures on POCL which were caused by the prior GPU scan() algorithm not functioning properly with POCL.	2013-09-07 15:03:42 -04:00
Kyle Lutz	518d39fc2b	Use bitwise-and to check device::type() This changes the checks for the device type to use the bitwise-and operator instead of the equality operator. The returned type is a bitset and this would cause errors when multiple bits were set. This fixes a bug on POCL which returns the device type as a combination of CL_DEVICE_TYPE_DEFAULT and CL_DEVICE_TYPE_CPU. Now the correct device type (device::cpu) is detected for POCL.	2013-09-07 14:16:20 -04:00
Kyle Lutz	3a7b90ff06	Fix issue with comparison operators in lambda expressions This fixes an issue in which comparison operators (e.g. <, ==) in lambda expressions would return the wrong result type causing compilation errors. Also adds a few test cases to ensure the correct result type and that lambda expressions can be properly used with count_if().	2013-08-15 22:10:03 -04:00
Kyle Lutz	bacec5b8fe	Add uniform_real_distribution This adds a random number distribution which generates random numbers in a uniform distribution. Also adds a convenience algorithm which fills a range with uniformly distributed random numbers between two values.	2013-08-13 20:40:42 -04:00
Kyle Lutz	767589fe0d	Rearrange type headers This rearranges the type headers to live under the <boost/compute/types/...> directory instead of the top-level <boost/compute/...> directory.	2013-08-13 20:37:56 -04:00
Kyle Lutz	b539e8413c	Add Doxygen documentation This replaces the BoostBook/XML based reference documentation with Doxygen auto-generated documentation.	2013-07-16 21:48:16 -04:00
Kyle Lutz	b3d2fbb7eb	Add fill_async() algorithm This adds a fill_async() which fills a range with a given value asynchronously.	2013-07-02 21:57:19 -04:00
Kyle Lutz	5203506c16	Add support for on-device copy_async() This adds support for copy_async() when copying between memory objects on a compute device.	2013-07-02 21:57:19 -04:00
Kyle Lutz	8459fdeb0e	Change meta_kernel::exec*() methods to return events This changes the exec() and exec_1d() methods in the meta_kernel class to return event objects.	2013-07-02 21:57:19 -04:00
Kyle Lutz	d8f5a5b503	Change enqueue_*_buffer() methods to return events This changes the enqueue_copy_buffer() and enqueue_fill_buffer() methods in the command_queue class to return event objects.	2013-07-02 21:57:19 -04:00
Kyle Lutz	c1bf707b41	Add event::get_command_type() method This adds a get_command_type() method to the event class which returns the OpenCL type for an event object.	2013-07-02 21:57:19 -04:00
Kyle Lutz	ee5f581094	Add command_queue::enqueue_migrate_memory_objects() method This adds an enqueue_migrate_memory_objects() method to the command_queue class which allows memory objects to be migrated between compute devices and to the host.	2013-07-02 21:57:19 -04:00
Kyle Lutz	2ca028c37b	Improve reduce() performance This makes a few tweaks to the reduce() algorithm in order to improve performance. An unnecessary barrier() has been removed and now multiple values are reduced on the initial read.	2013-07-02 21:57:15 -04:00
Denis Demidov	84394de119	Get rid of type conversion warnings inside VS2010	2013-06-24 09:57:22 +02:00
Denis Demidov	b28d8697bc	Silence MSVC security warning C4996 in system.hpp	2013-06-24 09:55:40 +02:00
Denis Demidov	f5c86057a1	Get rid of clang v3.3 warning -Wconstexpr-not-const	2013-06-21 15:27:00 +04:00
Kyle Lutz	f2b812019c	Fix bugs with char/uchar/bool literals in meta_kernel This fixes a few issues that occurred when using char, uchar and bool literals with meta_kernel.	2013-06-19 23:55:22 -04:00
Kyle Lutz	e01569049b	Add type_name<bool>() specialization This adds a type_name() specialization for bool.	2013-06-19 23:48:49 -04:00
Kyle Lutz	0d285d8a30	Change meta_kernel::add_arg(name, value) to add_set_arg() This changes the meta_kernel::add_arg() overload with a name and a value to a separate method. This fixes conflict when using add_arg() with string values.	2013-06-11 21:19:47 -04:00
Kyle Lutz	7fb77ef9c5	Add test for any/all/none_if() with NaN and inf This adds a test for the any_of(), all_of() and none_of() functions with NaN and Inf values.	2013-06-11 21:16:15 -04:00
Kyle Lutz	8e51a0a162	Refactor lambda expression framework to use meta_kernel This refactors the lambda expression framework to use meta_kernel to construct kernel source code instead of using plain strings.	2013-06-11 21:14:28 -04:00
Kyle Lutz	64e94549b3	Add specialization for get<N>() with zip_iterator This adds a specialization for the get<N>() function when used with zip_iterator's. Now, only the N'th iterator for the expression will be dereferenced instead of dereferencing all of the iterators into a tuple and then extracting the N'th component.	2013-06-11 20:37:23 -04:00
Kyle Lutz	15bc98b94f	Remove cv-qualifiers from get<N>()'s value-type This removes the cv-qualifiers for the value-type returned from get<N>() expressions. This fixes issues when specializing based on the type (e.g. pair, tuple).	2013-06-11 20:29:06 -04:00
Kyle Lutz	98b593b937	Fix meta_kernel streaming operators with float This fixes a bug in the meta_kernel streaming operators with float values. Now, float scalar and vector literals are inserted into the kernel source with the proper 'f' suffix.	2013-06-11 20:23:47 -04:00
Kyle Lutz	36dd3f1306	Improve the system::find_default_device() method This makes some improvements to the system::find_default_device() method. Now, the devices on the system will only be queried once when searching for the default device. This reduces the number of calls to clGetPlatformIDs() and clGetDeviceIDs(). Also, in the case that no GPU or CPU devices are found, the first device on the system will be selected as the default device. This fixes issues when using Boost.Compute with pocl.	2013-05-24 20:07:38 -04:00
Kyle Lutz	aa7fd2f6fa	Add asserts for clRelease() functions in destructors This adds assert()'s verifying that the clRelease() functions in the destructors for the OpenCL wrapper classes return CL_SUCCESS.	2013-05-23 23:15:43 -04:00
Kyle Lutz	b5068b2027	Fix minor version macro This fixes the minor version macro.	2013-05-23 22:46:52 -04:00
Kyle Lutz	5b12d04d4e	Mark streaming operators for boost::tuple<> inline This marks the meta_kernel streaming operators for boost::tuple<> literals as inline.	2013-05-22 22:50:51 -04:00
Kyle Lutz	c2187b89c0	Mark streaming operator std::pair<> inline This marks the meta_kernel streaming operator for std::pair<> literals as inline.	2013-05-22 22:50:46 -04:00
Kyle Lutz	0405c3cdc3	Check for valid range in reverse() This adds a check to the reverse() algorithm to ensure that the range contains at least two elements. Previously, passing zero or one element ranges to reverse() would result in errors.	2013-05-22 22:41:12 -04:00
Kyle Lutz	f07caa1ddd	Fix compilation error in future<void> assignment operator This fixes a compilation error which occurred when assigning to a future<void> from a future<T>. For different future types the event member variable is private and must be accessed via the get_event() method.	2013-05-21 23:20:36 -04:00
Kyle Lutz	bac6fb7332	Check for valid pattern size in fill() dispatcher This checks for a valid pattern value size before dispatching to the clEnqueueFillBuffer() function for the fill() algorithm.	2013-05-21 23:17:32 -04:00
Kyle Lutz	2560600122	Fix issues with boost::tuple<>, char, and fill() This fixes issues when using boost::tuple<> containing char types with the fill() algorithm.	2013-05-21 23:10:56 -04:00
Kyle Lutz	9141732b3e	Fix issues with std::pair<>, char, and fill() This fixes issues when using std::pair<> containing char types with the fill() algorithm.	2013-05-21 23:10:56 -04:00
Kyle Lutz	f4ecbd1e6c	Fix issues with char literals in meta_kernel This fixes issues when using char and unsigned char literals in a meta_kernel. Previously the character values would be directly inserted without quotes (e.g. c instead of 'c') which led to kernel compilation errors.	2013-05-21 23:10:40 -04:00
Kyle Lutz	1caebe6de8	Fix bug in in-place scan() This fixes a bug when creating a temporary vector for use in the in-place scan() algorithm. Previously, a separate command queue was used to copy the input values to the temporary vector. Now, the same command queue is used for copying the input values and performing the scan.	2013-05-20 23:05:51 -04:00
Kyle Lutz	9f231d7b13	Fix conversion warnings in buffer_iterator This fixes conversion warnings for buffer_iterator.	2013-05-20 23:05:40 -04:00
Kyle Lutz	3bc5bfaf78	Remove timer class This removes the timer class. The technique of measuring the time difference between two different OpenCL markers on a command queue is not portable to all OpenCL implementations (only works on NVIDIA). A new internal timer class has been added which uses boost::chrono (or std::chrono if BOOST_COMPUTE_TIMER_USE_STD_CHRONO is defined). This new timer is used by the benchmarks to measure time elapsed on the host.	2013-05-20 21:08:42 -04:00
Kyle Lutz	fab7be5f43	Add inplace_merge() algorithm This adds a simple inplace_merge() algorithm which merges two contiguous sorted ranges in-place. For now, the implementation simply copies the ranges to two temporary vectors and calls merge().	2013-05-20 20:50:12 -04:00
Kyle Lutz	b43e79b983	Add support for get<N>() in lambda expressions This adds support for using the get<N>() function in lambda expressions to extract a single component of an aggregate type. Also adds a test of using boost::tuple<> to store a user-defined data type on the device and sort them by their first component using a lambda expression as the comparator.	2013-05-20 20:50:10 -04:00
Kyle Lutz	e46828a9d6	Fix issues involving iterators with void value_type This fixes a few issues encountered when using iterators with a void value_type (e.g. std::insert_iterator<>). The is_contiguous_iterator meta-function was refactored to always return false for iterators with a void value_type and avoid instantiating types for containers with a void value_type (e.g. std::vector<void>::iterator) which previously resulted in compilation errors.	2013-05-20 19:57:13 -04:00
Kyle Lutz	4ab37ada07	Add system-wide default command queue This adds a system-wide default command queue. This queue is accessible via the new static system::default_queue() method. The default command queue is created for the default compute device in the default context and is analogous to the default stream in CUDA. This changes how algorithms operate when invoked without an explicit command queue. Previously, each algorithm had two overloads: the first expected a command queue to be explicitly passed and the second would create and use a temporary command queue. Now, all algorithms take a command queue argument which has a default value equal to system::default_queue(). This fixes a number of race-conditions and performance issues throughout the library associated with creating, using, and destroying many separate command queues.	2013-05-15 20:59:56 -04:00
Kyle Lutz	a2bda0610d	Fix memory issues with device_ptr and allocator This fixes a few memory handling issues between device_ptr, buffer_iterator, buffer_value, allocator, and malloc/free. Previously, memory buffers that were allocated by allocator and malloc were being retained (via clRetainMemObject() in buffer's constructor) by device_ptr, buffer_iterator and buffer_value. Now, false is passed for the retain parameter to buffer's constructor so that the buffer's reference count is not incremented. Furthermore, the classes now set the buffer to null before being destructed so that they will not decrement its reference count (which normally occurs in buffer's destructor). The main effect of this change is that objects which refer to a memory buffer but do not own it (e.g. device_ptr, buffer_iterator) will not modify the reference count for the buffer. This fixes a number of memory leaks which occurred in longer running programs.	2013-05-13 22:27:02 -04:00
Kyle Lutz	a5ddeae614	Add scalar<T> container This adds a new scalar<T> "container" which stores a single value in a memory buffer. This simplifies memory handling in algorithms which read and write a single value.	2013-05-11 20:20:27 -04:00
Kyle Lutz	130f8c30f1	Rename kernel::num_args() method to arity() This renames the kernel::num_args() method to arity().	2013-05-11 20:15:00 -04:00
Kyle Lutz	ffec5fd34a	Remove unnecessary includes from transform_reduce This removes a couple of unnecessary includes from the transform_reduce.hpp header file.	2013-05-11 20:10:28 -04:00
Kyle Lutz	178676df4f	Refactor the system::default_device() method This refactors the system::default_device() method. Now, the default compute device for the system is only found once and stored in a static variable. This eliminates many redundant calls to clGetPlatformIDs() and clGetDeviceIDs(). Also, the default_cpu_device() and default_gpu_device() methods have been removed and their usages replaced with default_device().	2013-05-10 22:49:05 -04:00
Kyle Lutz	d40eddc56b	Fix compilation error with get<N>() and tuple This fixes a compilation error which occurred when using the get<N>() function with tuple types.	2013-05-10 21:51:28 -04:00
Kyle Lutz	705b3f35a3	Fix narrowing conversion warnings in device This fixes a couple of narrowing conversion warnings in the device partitioning methods which were seen when compiling VexCL with Boost.Compute in C++11 mode.	2013-05-09 22:04:00 -04:00
Kyle Lutz	9a64f6b39a	Add get<N>() function This adds a get<N>() function which returns the n'th element of an aggregate type (e.g. vector type, pair, tuple). This unifies the functionality of, and replaces, the get_pair() and vector_component() functions.	2013-05-05 12:46:05 -04:00
Kyle Lutz	3e840fa306	Add transform_if() algorithm This adds a new algorithm named transform_if() which applies a given unary function to an input value only if it passes a separate predicate function.	2013-05-05 11:51:21 -04:00
Kyle Lutz	49a34442e5	Remove unused histogram() algorithm This removes the unused histogram() algorithm.	2013-05-05 10:56:14 -04:00
Dominic Meiser	7c5e321c2a	Fixing build issues under windows	2013-05-03 18:37:09 -04:00
Kyle Lutz	3e93d01475	Add default constructors to image2d and image3d This adds default constructors to the image2d and image3d classes which initialize them with null memory objects.	2013-05-02 21:01:30 -04:00
Kyle Lutz	5d28d3887e	Make pick_copy_work_group_size() inline This makes the pick_copy_work_group_size() function inline.	2013-05-02 20:55:22 -04:00
Kyle Lutz	0ab2fe85eb	Don't auto-initialize values in vector This changes the vector class to not auto-initialize values when it is created or resized. This improves performance by eliminating a call to fill(). If needed, user code can call fill() explicitly on the newly allocated values.	2013-04-27 10:30:26 -04:00
Kyle Lutz	03195275b3	Increase work-group size for copy() kernel This increases the work-group size for the copy() kernel to be up to 32 items based on the size of the input. This increases the performance of copy() and related algorithms (e.g. transform()).	2013-04-27 10:21:47 -04:00
Kyle Lutz	ea107ae5d6	Add clamp_range() algorithm This adds a clamp_range() algorithm which clamps a range of values between a low and high value. This is based on the algorithm of the same name in Boost.Algorithm.	2013-04-22 22:06:04 -04:00
Kyle Lutz	8142e5d5f9	Add move-constructors to wrapper classes This adds move-constructors and move-assignment operators to the OpenCL wrapper classes.	2013-04-17 20:45:04 -04:00
Kyle Lutz	4bdec761cd	Add memory_object::reference_count() method This adds a reference_count() method to the memory_object class which returns its current reference count.	2013-04-13 11:07:04 -04:00
Kyle Lutz	d58b7c0902	Return event from command_queue::enqueue_task() This changes the command_queue::enqueue_task() method to return an event object.	2013-04-13 10:23:29 -04:00
Kyle Lutz	da4cb81679	Return event from command_queue::enqueue_nd_range_kernel() This changes the enqueue_nd_range_kernel() method to return an event object. This allows clients to monitor the progress of a kernel executing on a device.	2013-04-13 10:23:01 -04:00
Kyle Lutz	001b3ff7fe	Add get() methods to wrapper classes This adds a get() method to each wrapper class which returns a reference to the underlying OpenCL object.	2013-04-13 09:44:51 -04:00
Denis Demidov	8b78d4187d	Adds support for selecting devices with environment variables boost::compute::system::default_device() supports the following environment variables: BOOST_COMPUTE_DEFAULT_DEVICE for device name; BOOST_COMPUTE_DEFAULT_PLATFORM for OpenCL platform name; BOOST_COMPUTE_DEFAULT_VENDOR for device vendor name. If one or more of these variables is set, then the device that satisfies all conditions gets selected. If such a device is unavailable, then the first available GPU is selected. If there are no GPUs in the system, then the first available CPU is selected. Otherwise, default_device() returns a null device. The hello_world example is modified to use default_device() instead of default_gpu_device().	2013-04-12 17:22:25 -04:00
Kyle Lutz	1be19a6305	Add multiplies<T> specialization for std::complex<T> This adds a specialization of multiplies<T> for std::complex<T> which implements complex number multiplication. Also adds a simple test using transform() to verify the complex multiplication works correctly.	2013-04-10 22:04:04 -04:00
Kyle Lutz	8d13920dc4	Move swizzle_iterator to detail namespace This moves the swizzle_iterator class to the detail namespace.	2013-04-10 21:51:24 -04:00
Kyle Lutz	bcc3aed40f	Move pixel_input_iterator to detail namespace This moves the pixel_input_iterator class to the detail namespace.	2013-04-10 21:38:05 -04:00
Kyle Lutz	5cce555d8c	Move binary_transform_iterator to detail namespace This moves the binary_transform_iterator class to the detail namespace.	2013-04-10 21:33:29 -04:00
Kyle Lutz	e30ec9f26c	Move adjacent_transform_iterator to detail namespace This moves the adjacent_transform_iterator class to the detail namespace.	2013-04-10 21:24:15 -04:00
Kyle Lutz	6dd6e11c7d	Fix unused variable warning in get_base_iterator_buffer() This fixes an unused variable warning which occurs in the get_base_iterator_buffer() function when the base iterator is not a buffer iterator and thus the iter argument is not used.	2013-04-10 21:09:17 -04:00
Kyle Lutz	6fdffd8a2b	Replace usages of result_of() with tr1_result_of() This fixes a bug in which boost::result_of() would return the wrong result type for a function due to the new implementation using decltype instead of the result_of protocol on compilers that sufficiently support C++11 (such as clang >= 3.2). Now, boost::tr1_result_of() is used to explicitly request that the result_of protocol be used even when decltype is supported by the compiler.	2013-04-10 20:17:34 -04:00
Kyle Lutz	652f99e449	Fix bug in get_buffer() for iterator adaptors This fixes a bug in which the get_buffer() method was not properly disabled for iterator adaptors with a non-buffer base iterator.	2013-04-09 21:56:24 -04:00
Kyle Lutz	5164ab4bd0	Cleanup constructors for wrapper classes This cleans up the constructor methods for the OpenCL wrapper classes and unifies the API used for creating a wrapper class object from the underlying OpenCL objects. Now, every wrapper class has a constructor taking the OpenCL object and an optional boolean retain parameter which indicates whether the constructor should increment the reference count.	2013-04-07 15:03:24 -04:00
Kyle Lutz	25a084deda	Fix indentation in kernel::get_arg_info() This fixes the indentation in the kernel::get_arg_info() method.	2013-04-07 12:57:26 -04:00
Kyle Lutz	48e1bb4da0	Update image2d/3d constructors for OpenCL 1.2 This updates the constructors for the image2d and image3d classes to use the new clCreateImage() function instead of the deprecated clCreateImage2D/3D() functions.	2013-03-31 15:01:30 -04:00
Kyle Lutz	d56e58b48e	Add OpenCL 1.2 error codes to runtime_exception This adds support for the OpenCL 1.2 error codes to the runtime_exception class.	2013-03-31 14:58:00 -04:00
Kyle Lutz	0aa3d024dc	Fix command_queue::enqueue_marker() for OpenCL 1.2 This changes the enqueue_marker() method in the command_queue class to use clEnqueueMarkerWithWaitList() instead of the deprecated clEnqueueMarker() function when compiling with OpenCL 1.2.	2013-03-31 12:08:23 -04:00
Kyle Lutz	7e7e09b704	Fix command_queue::enqueue_barrier() for OpenCL 1.2 This changes the enqueue_barrier() method in the command_queue class to use clEnqueueBarrierWithWaitList() instead of the deprecated clEnqueueBarrier() function when compiling with OpenCL 1.2.	2013-03-31 12:03:38 -04:00
Kyle Lutz	52fef4de6b	Remove command_queue::enqueue_wait_for_event() method This removes the enqueue_wait_for_event() method from the command_queue class as the clEnqueueWaitForEvents() function has been deprecated in OpenCL 1.2.	2013-03-31 11:59:14 -04:00
Kyle Lutz	c7a3bc8af6	Move unload_compiler() method to platform This moves the unload_compiler() method from the system class to the platform class. Also changes the method to use the clUnloadPlatformCompiler() function instead of the deprecated clUnloadCompiler() when compiling with OpenCL 1.2.	2013-03-31 11:29:40 -04:00
Kyle Lutz	d28354184c	Move get_extension_function_address() method to platform This moves the get_extension_function_address() method from the system class to the platform class. Also changes the method to use the clGetExtensionFunctionAddressForPlatform() function instead of the deprecated clGetExtensionFunctionAddress() when compiling with OpenCL 1.2.	2013-03-31 11:26:18 -04:00
Kyle Lutz	00fcb737cc	Fix bug in move-constructor for vector<T> This fixes a bug in the move-constructor for the vector<T> class. Previously, the moved-from object was also deallocating the memory buffer leading to an error when the moved-to object attempted to use it. Now, the constructor checks if the buffer is non-empty before deallocating it.	2013-03-30 19:55:51 -04:00
Kyle Lutz	1161f89031	Make get_object_info() inline This marks the get_object_info() method as inline.	2013-03-30 19:38:40 -04:00
Kyle Lutz	d585fbebad	Make stream operator for vector types inline This marks the stream operator for vector types as inline.	2013-03-30 19:35:56 -04:00
Kyle Lutz	da1d7794b5	Remove support for cl_half This removes support for cl_half (typedef'd to half_). The issue is that the cl_half type is indistinguishable from the cl_ushort type (both are typedefs for uint16_t) which caused the cl_khr_fp16 pragma to be injected into kernels using cl_ushort which causes errors on platforms that do not support the cl_khr_fp16 extension.	2013-03-27 00:09:51 -04:00
Kyle Lutz	4338b311f7	Add device::partition() method This adds a new set of methods to the device class allowing device objects to be partitioned into multiple sub-devices using the clCreateSubDevices() function. For now, device partitioning is only supported on systems with OpenCL version 1.2 (or later).	2013-03-26 23:36:54 -04:00
Kyle Lutz	4752fb2404	Support returning std::vector<T> from get_info<T>() This adds support for returning a std::vector<T> from the various get_info<T>() methods. This provides a simpler interface to get the values in an array returned from one of the clGet*Info() functions. This also adds a test using the new API to get the maximum work item sizes in each dimension for a device.	2013-03-26 22:44:56 -04:00
Denis Demidov	2df059738a	Option to not retain program on creation	2013-03-25 20:59:50 -04:00
Kyle Lutz	ab0575dd0c	Fix bug in remove_if() algorithm This fixes a bug in which the remove_if() function would overwrite parts of the input before they were properly copied to the output range. This is fixed by first copying the input values to a temporary vector and then passing that as the input range to copy_if().	2013-03-21 23:23:39 -04:00
Kyle Lutz	147baa05fe	Fix bug in count_if() and find_if() with Intel CPUs This fixes a bug in which the Intel OpenCL compiler would fail to compile the count_if() and find_if() kernels for vector types with the following error: error: no matching function for call to 'all' note: candidate function not viable: 1st argument ('__global int4') is in address space 16776960, but parameter must be in address space 0 This is caused when the predicate compares a value from the input buffer (in the global memory space) to a literal value (in the private memory space). This is fixed by first reading the value into a local variable in the private memory space and then calling the predicate function.	2013-03-21 22:46:01 -04:00
Kyle Lutz	25b71579a5	Fix bug when calling fill() with vector types This fixes a bug in which the fill() algorithm was called by scan_impl() with an integer zero rather than zero of the value type which caused issues when using scan() with vector values.	2013-03-20 18:10:48 -04:00
Denis Demidov	6766b07fad	Allow buffer and queue initialization from lowlevel types	2013-03-19 17:18:43 +04:00
Kyle Lutz	2d81f561c4	Add zip_iterator class This adds a zip_iterator class which allows for one or more iterators to be combined into a single iterator object.	2013-03-17 23:42:56 -04:00
Kyle Lutz	ada2351812	Add support for boost::tuple<> This adds support for using boost::tuple<> types with the Boost.Compute containers and algorithms.	2013-03-17 23:28:07 -04:00
Kyle Lutz	ad9309740e	Add meta_kernel::inject_type<Type>() method This adds a new method which allows for type definitions and type pragmas to be added to a meta_kernel. This provides a more generic and general interface and replaces the previously used add_pair_type() method along with the special case handling of half and double types.	2013-03-17 23:20:19 -04:00
Kyle Lutz	2f81872403	Pass binary_status argument to clCreateProgramWithBinary() This fixes a bug in which certain platforms would return CL_INVALID_VALUE from clCreateProgramWithBinary() if the binary_status argument was not provided.	2013-03-17 22:20:32 -04:00
Kyle Lutz	233df55978	Fix bug in program::binary() This fixes a bug in the program::binary() method in which the return size was not being passed to clGetProgramInfo().	2013-03-17 22:17:28 -04:00
Kyle Lutz	32a1926f6b	Fix local array size in serial_insertion_sort() This fixes the local array allocation size for the serial_insertion_sort() function.	2013-03-14 21:27:00 -04:00
Kyle Lutz	77e75bd4cc	Remove unused variable in serial_insertion_sort() This removes the unused 'op' variable from the serial_insertion_sort() function.	2013-03-14 21:22:40 -04:00
Kyle Lutz	ff204e1b61	Add asserts for null host pointers to command_queue This adds assert()'s to the read and write methods in the command_queue class to check for null host_ptr's.	2013-03-12 22:36:36 -04:00
Kyle Lutz	35984ae412	Remove default type_name_trait::value() implementation This removes the default type_name_trait::value() function implementation. Previously, the default implementation would return a null pointer leading to run-time errors if a type name was not provided. Now, a compile-time error will occur if type_name() is called for an unknown type.	2013-03-12 20:29:44 -04:00
Kyle Lutz	71df0d5fa6	Add type_name() specialization for char This adds a type_name() specialization for char which is different than the built-in cl_char type and thus was not covered before.	2013-03-12 20:23:33 -04:00
Kyle Lutz	418468cc4b	Add lambda wrapper for length() function This adds a lambda wrapper for the length() function which allows it to be used in lambda expressions.	2013-03-10 20:16:14 -04:00
Kyle Lutz	69aef15cab	Add merge() algorithm This implements the merge() algorithm which merges two ranges of sorted values into a single sorted range. The current implementation uses a simple serial merge algorithm. A GPU optimized version is coming soon.	2013-03-10 20:10:58 -04:00
Kyle Lutz	d34cdaac59	Initial commit	2013-03-02 15:14:17 -05:00

791 Commits