Commit Graph

791 Commits

Author SHA1 Message Date
Kyle Lutz
44895606ac Merge pull request #268 from kylelutz/search-use-find
Change search() and search_n() to use find()
2014-09-28 14:38:35 -07:00
Kyle Lutz
05afa5f707 Use thread-local storage for global program cache 2014-09-28 12:37:48 -07:00
Kyle Lutz
76a416c5a9 Change search() and search_n() to use find() 2014-09-28 10:16:14 -07:00
Kyle Lutz
10d79c6689 Fix bug when using bind() with adapted structs 2014-09-27 11:42:29 -07:00
Kyle Lutz
0b0cbd399e Merge pull request #265 from roshanr95/uniform_int_distribution
Uniform int distribution
2014-09-21 11:06:08 -07:00
Kyle Lutz
7e05c0f9a5 Merge pull request #262 from kylelutz/opencl-2.0
Add OpenCL 2.0 support
2014-09-21 11:05:15 -07:00
roshanr
ba09a9f3d0 Fix issue #159 2014-09-20 15:35:35 +05:30
Kyle Lutz
4b10ea608b Merge pull request #264 from kylelutz/refactor-get-info
Refactor get_info() functions
2014-09-14 21:13:36 -07:00
Kyle Lutz
cdcd4c5a32 Refactor get_info() functions 2014-09-13 18:42:43 -07:00
Kyle Lutz
7912b344d1 Add variadic bind() implementation 2014-09-13 13:18:16 -07:00
Kyle Lutz
ec254c04bb Add OpenCL 2.0 support 2014-09-13 12:29:03 -07:00
Kyle Lutz
60f0709bc1 Merge pull request #259 from kylelutz/dynamic-bitset
Add dynamic_bitset class
2014-09-07 11:37:50 -07:00
Kyle Lutz
8310e8e729 Add dynamic_bitset class 2014-09-07 11:21:46 -07:00
Kyle Lutz
49fc80d204 Fix bug when using popcount() with ulong 2014-09-07 11:00:46 -07:00
Kyle Lutz
a4c11ddb5d Merge pull request #243 from kylelutz/address-space-enum
Address space enum
2014-09-06 12:04:50 -07:00
Kyle Lutz
d44af78be5 Use address_space enum for meta_kernel::add_arg() 2014-09-06 11:43:31 -07:00
Kyle Lutz
c5546c92a1 Add address_space enum to memory_object 2014-09-06 11:37:46 -07:00
Kyle Lutz
94d182d47d Rearrange allocator headers
This moves the allocator headers from 'container' to a new
top-level 'allocator' directory.

Also renames allocator<T> to buffer_allocator<T>.
2014-09-06 09:51:46 -07:00
Kyle Lutz
891aff215d Fix bug when calling reduce() with empty ranges 2014-09-04 20:42:37 -07:00
Kyle Lutz
0151195871 Remove usage of 'uint' in linear_congruential_engine 2014-08-25 23:35:17 -07:00
Kyle Lutz
744359715f Implement is_sorted() with adjacent_find() 2014-08-21 22:43:46 -07:00
Kyle Lutz
c69ea170fb Remove adjacent_transform_iterator class 2014-08-21 07:00:13 -07:00
Kyle Lutz
b3ea818248 Rewrite unique() algorithm 2014-08-21 07:00:10 -07:00
Kyle Lutz
b533df6a5c Rewrite adjacent_find() algorithm 2014-08-20 22:46:51 -07:00
Kyle Lutz
45c5ec3281 Rewrite adjacent_difference() algorithm 2014-08-20 22:43:27 -07:00
roshanr
515e1b29ba Enforce same tile_size for all kernels 2014-08-18 16:55:20 +05:30
Kyle Lutz
417a50e3f3 Merge pull request #238 from roshanr95/templating
Modify templating
2014-08-15 19:30:34 -07:00
roshanr
d10d992f62 Move templating from kernel to its member function 2014-08-15 20:32:18 +05:30
Kyle Lutz
8c7efd24fd Add support for multi-device contexts 2014-08-13 20:28:34 -07:00
Kyle Lutz
9f5cc79606 Merge pull request #228 from roshanr95/merge
Merge algorithm
2014-08-11 19:59:17 -07:00
roshanr
c48760fd90 Add a merge-path based merge algorithm and change merge to use it
Added a merge_path kernel and a merge algorithm based on it.
Also changed merge to use the new algorithm.
2014-08-10 07:29:08 +05:30
Kyle Lutz
bd427b8a1b Merge pull request #224 from kylelutz/capture-containers-with-closure
Capture containers with closure
2014-08-09 09:00:43 -07:00
Kyle Lutz
06e0ae10ee Merge pull request #225 from f-koehler/issue217
added wait_list as argument and event as return for opengl enqueue_* methods
2014-08-09 08:56:57 -07:00
f-koehler
3c15712941 added wait_list as argument and event as return for opengl enqueue_* methods 2014-08-08 19:12:10 +02:00
Kyle Lutz
fd8c8f934b Support capturing containers with BOOST_COMPUTE_CLOSURE() 2014-08-07 22:20:16 -07:00
Kyle Lutz
95c331fa84 Capture references with BOOST_COMPUTE_CLOSURE() 2014-08-07 20:57:34 -07:00
Kyle Lutz
4703488c45 Merge pull request #212 from roshanr95/nth-element
Nth element
2014-08-05 19:00:30 -07:00
roshanr
30082abd59 Improve nth_element performance and unit testing 2014-08-05 22:12:45 +05:30
Kyle Lutz
5d663ff338 Fix bug with count_if() on AMD
This fixes an issue in which the count_if_with_reduce()
function fails to compile because convert_ulong(bool) is
not supported.

See issue #202.
2014-07-30 21:40:00 -07:00
Kyle Lutz
2cb564c59c Merge pull request #200 from kylelutz/bind
Add bind() function
2014-07-29 18:29:16 -07:00
Kyle Lutz
f21abdff7e Add bind() function 2014-07-27 10:33:59 -07:00
roshanr
92ae416a32 Improve binary_find performance 2014-07-18 12:32:18 +05:30
Kyle Lutz
f50e9c0110 Release v0.3 2014-07-15 21:07:20 -07:00
Kyle Lutz
132fc85270 Add OpenCL 1.2 memory flags to mem_flags enum 2014-07-12 14:44:03 -07:00
Kyle Lutz
89eee4b60e Merge pull request #193 from kylelutz/unify-seed-method-interface
Unify seed() method interface in the random number engines
2014-07-12 14:35:50 -07:00
Kyle Lutz
e8afaf1e7c Add perf_copy_if benchmark 2014-07-12 14:19:23 -07:00
Kyle Lutz
84b24fcacb Unify seed() method interface in the random number engines
This updates the random number engines to both use the same
interface for their seed() method.
2014-07-12 14:07:48 -07:00
Kyle Lutz
61645c41c3 Fix issues with popcount() on OpenCL 1.1 devices 2014-07-12 12:47:41 -07:00
Kyle Lutz
dfe9399c9f Check OpenCL version before calling enqueue_fill_buffer()
This adds a check for OpenCL version 1.2 before calling the
enqueue_fill_buffer() function in the fill() algorithm.
2014-07-12 11:22:01 -07:00
Kyle Lutz
9dc87712e9 Improve documentation 2014-07-11 23:25:41 -07:00
Kyle Lutz
a4ae254adc Rename mersenne_twister_engine header
This renames the mersenne_twister_engine header from
"mersenne_twister.hpp" to "mersenne_twister_engine.hpp".
2014-07-11 22:27:16 -07:00
Kyle Lutz
40d0166cb2 Merge pull request #182 from kylelutz/rename-create-with-builtin-kernels
Rename create_with_builtin_kernels() method
2014-07-10 20:02:51 -07:00
Kyle Lutz
94306ce2e8 Merge pull request #179 from kylelutz/deprecate-device-ptr
Move device_ptr to the detail namespace
2014-07-10 20:02:39 -07:00
Kyle Lutz
48cee2b619 Rename create_with_builtin_kernels() method 2014-07-10 19:27:17 -07:00
Kyle Lutz
9106222e61 Merge pull request #178 from kylelutz/equality-operators
Add equality operators for all wrapper classes
2014-07-09 22:10:36 -07:00
Kyle Lutz
e0535d7233 Move device_ptr to the detail namespace
This deprecates the device_ptr class and moves it to the detail
namespace. The buffer_iterator class should be used instead of
device_ptr for referencing a memory location on the device.
2014-07-08 21:04:23 -07:00
Kyle Lutz
7d377989ee Add equality operators for all wrapper classes 2014-07-08 20:33:02 -07:00
Kyle Lutz
4a1b3edf48 Fix resize bug with vector::assign()
This fixes a bug in which vector::assign() would not resize
itself to accommodate the assigned values. Now the behavior
matches that of std::vector.
2014-07-07 21:56:39 -07:00
Kyle Lutz
cae813ec3c Add get_info<Info>() specializations 2014-07-07 19:14:14 -07:00
Kyle Lutz
bd164a24b2 Merge pull request #175 from kylelutz/fix-platform-empty-devices
Fix errors when using platforms with no devices
2014-07-07 18:57:16 -07:00
Kyle Lutz
5d6921f162 Merge pull request #170 from kylelutz/create-program-with-built-in-kernels
Add program::create_with_built_in_kernels() method
2014-07-07 08:04:12 -07:00
Kyle Lutz
c156ca7198 Merge pull request #169 from kylelutz/hash-function
Add hash() function
2014-07-07 08:02:09 -07:00
Kyle Lutz
8f952b3c5b Merge pull request #168 from kylelutz/struct-is-packed
Add check for packed structs in BOOST_COMPUTE_ADAPT_STRUCT()
2014-07-07 08:01:50 -07:00
Kyle Lutz
73443e0111 Fix errors when using platforms with no devices
This fixes errors caused when handling OpenCL platforms with
no devices. Now the platform::devices() method will properly
return an empty vector without throwing an exception when such
a platform is encountered.

Previously, default_device() would throw an exception when
confronted with a platform with no devices even if devices
from other platforms on the system were available.

Thanks to Godeffroy Valet for reporting this issue.
2014-07-07 07:57:02 -07:00
Kyle Lutz
6103ff6cfb Add program::create_with_built_in_kernels() method 2014-07-02 23:45:35 -07:00
Kyle Lutz
6af32b3d8f Add hash() function 2014-07-02 23:24:18 -07:00
Kyle Lutz
235261a977 Add check for packed structs in BOOST_COMPUTE_ADAPT_STRUCT()
This adds a compile-time check for non-padded structs in the
BOOST_COMPUTE_ADAPT_STRUCT() macro. Also updates the macro to
add "__attribute__((packed))" to the OpenCL struct definition.
2014-07-02 22:53:48 -07:00
Kyle Lutz
90217d055b Add compile() and link() methods to program 2014-07-02 22:32:58 -07:00
roshanr
acf4698af1 Add benchmark for lce
Added benchmark for lce and changed threads to 1024
2014-06-30 23:41:58 +05:30
roshanr
d81edfc387 Fix errors in lce
Remove unused parameter
Fix discard() to work for all sizes
2014-06-30 23:41:16 +05:30
Kyle Lutz
2fa478a11f Merge pull request #158 from roshanr95/search-algorithms
Search algorithms
2014-06-25 18:37:56 -07:00
roshanr
f3d20639f7 Change function signature of search algorithms to match STL versions 2014-06-26 01:27:21 +05:30
Kyle Lutz
be9be67348 Merge pull request #155 from Mageswaran1989/image_format-support-grey-images
image_format-support-grey-images
2014-06-24 18:39:52 -07:00
Mageswaran
f10d3d549d updated interop/opencv/core.hpp for grey images 2014-06-24 20:59:48 +05:30
Mageswaran
81994ed6c9 support for OpenCV grey images 2014-06-21 16:48:42 +05:30
Kyle Lutz
1d5a088e94 Merge pull request #152 from roshanr95/set_algorithms
Set algorithms
2014-06-20 18:55:25 -07:00
roshanr
5b832d8ee1 Fix errors in set algorithms
Fix all the kernels so that they work with a very low number
of elements in the two sets
2014-06-21 02:42:27 +05:30
jamboree
c518869dd4 Use BOOST_NOEXCEPT 2014-06-19 18:03:09 +08:00
Kyle Lutz
b670212ee8 Use "OpenCL API" rather than "OpenCL C API"
This changes references to the OpenCL API in the documentation
from "OpenCL C API" to just "OpenCL API". See issue #142.
2014-06-18 18:42:21 -07:00
roshanr
fd2506b161 Add quick and dirty linear congruential engine 2014-06-18 02:47:00 +05:30
Kyle Lutz
3ac85e3fbc Merge pull request #141 from roshanr95/discrete_distribution
Add discrete_distribution
2014-06-16 18:39:59 -07:00
roshanr
a0a7a85d0d Add discrete_distribution 2014-06-16 03:01:36 +05:30
roshanr
79063d4fb4 Add uniform_int_distribution 2014-06-15 02:28:59 +05:30
Kyle Lutz
6ea122adb7 Add support for specifying wait-lists in command_queue 2014-06-08 22:23:59 -07:00
Kyle Lutz
57470c48c2 Add device::get() method 2014-06-08 14:03:27 -07:00
Kyle Lutz
a639e408b8 Cleanup move-semantics for all core types 2014-06-08 13:40:27 -07:00
Kyle Lutz
c2c1346f19 Rename event::get_status() to event::status() 2014-06-08 10:45:35 -07:00
Kyle Lutz
69e483696b Use reduce() for accumulate() with min()/max() 2014-06-08 10:35:49 -07:00
Kyle Lutz
d2379699d0 Limit maximum number of program_cache instances
This limits the maximum number of program_cache instances to eight
using an LRU cache. This prevents potential CL_OUT_OF_RESOURCES
errors when creating and using many different context objects.
2014-06-07 22:04:06 -07:00
Kyle Lutz
60aca4401b Simplify the program::create_kernel() implementation 2014-06-07 19:58:36 -07:00
Kyle Lutz
732ab539f3 Improve documentation 2014-06-07 15:03:56 -07:00
Kyle Lutz
fc9f014526 Add core.hpp header 2014-06-07 11:57:11 -07:00
Kyle Lutz
b228de76a7 Add memory_object::set_destructor_callback() method 2014-06-07 11:13:52 -07:00
Kyle Lutz
d30dcfb564 Use clEnqueueWriteBuffer() for writing single values with fill() 2014-06-06 23:10:27 -07:00
Kyle Lutz
2ca2c16839 Swap allocators in vector::swap() 2014-06-06 08:29:45 -07:00
Kyle Lutz
07ae8638de Remove bogus assert() in command_queue::enqueue_nd_range_kernel() 2014-06-04 18:50:43 -07:00
Kyle Lutz
53c3e1af83 Add normal_distribution.hpp to random.hpp header 2014-06-04 18:44:55 -07:00
Kyle Lutz
36681a6ad5 Merge pull request #131 from roshanr95/bernoulli-distribution
Add bernoulli distribution
2014-06-04 18:41:37 -07:00
roshanr
1777b7e5bf Add bernoulli distribution 2014-06-04 03:41:19 +05:30
Kyle Lutz
99891752a6 Merge pull request #129 from vaa-msu/patch-1
Changed some 'uint's to 'uint_'s
2014-06-02 22:55:49 -07:00
Kyle Lutz
b2f80daf17 Merge pull request #128 from 'daniel-murray/develop'
add BOOST_COMPUTE_DEFAULT_DEVICE_TYPE env variable support
2014-06-02 22:48:07 -07:00
daniel-murray
b34c02c239 add BOOST_COMPUTE_DEFAULT_DEVICE_TYPE env variable support
Allows a user to easily select between a GPU and a CPU without recompiling.
2014-06-02 22:47:18 -07:00
vaa-msu
d65a5cf881 Changed some 'uint's to 'uint_'s 2014-06-03 09:27:17 +04:00
roshanr
b772ae4849 Add algorithm and test for is_permutation 2014-06-01 06:01:38 +05:30
roshanr
5e08e1b0b6 Add algorithm and test for prev_permutation 2014-06-01 06:01:31 +05:30
roshanr
9943683627 Minor fixes for next_permutation
Add it to algorithm.hpp.
Remove unnecessary includes.
Change naming so as to prevent conflicts.
2014-06-01 06:00:57 +05:30
Kyle Lutz
ff11312fd5 Merge pull request #126 from roshanr95/next_permutation
Add algorithm and test for next_permutation
2014-05-29 23:16:31 -07:00
roshanr
79d374f646 Minor fix in binary_search
Changed it to use read_single_value instead of
* operator
2014-05-29 22:49:25 +05:30
roshanr
e4f0783ecd Add algorithm and test for next_permutation 2014-05-29 22:31:11 +05:30
roshanr
c8d4836a6f Add partition_point algorithm and test 2014-05-26 03:43:19 +05:30
roshanr
1bf09a5624 Add binary_find algorithm
Added binary_find algorithm.
Changed upper_bound, lower_bound, binary_search to use it
2014-05-25 20:41:12 +05:30
Kyle Lutz
52886775f8 Support conversion of lambda expressions to function objects 2014-05-24 13:42:43 -07:00
roshanr
4d3d114285 Add algorithm and test for stable_partition
Added algorithm and test for stable_partition.
Changed partition to use it for now till a
better implementation is found
2014-05-23 14:19:17 +05:30
Kyle Lutz
c7802d0a49 Move variadic event handling to async/wait.hpp 2014-05-20 23:08:46 -07:00
Kyle Lutz
85af9d2630 Add variadic wait_for_all() function 2014-05-18 19:47:19 -07:00
Kyle Lutz
13bc000117 Add image2d::clone() method 2014-05-18 19:34:24 -07:00
Kyle Lutz
376713f1b4 Simplify lambda wrappers for binary geometric functions 2014-05-18 16:13:47 -07:00
roshanr
8dbc2b658d Add algorithm and test for includes 2014-05-19 01:42:16 +05:30
roshanr
de9b2ef6ff Add algorithm and test for set_symmetric_difference 2014-05-19 01:08:41 +05:30
roshanr
acbabfeeff Add algorithm and test for set_difference 2014-05-18 22:50:00 +05:30
roshanr
5f7cd290cf Add algorithm and test for set_union 2014-05-18 22:49:54 +05:30
Kyle Lutz
4683a3b5be Add default constructor for image_sampler 2014-05-17 17:26:43 -07:00
roshanr
1bb652cbea Minor fixes
Change tile_size from string to int, use read_single_value instead of
finish
2014-05-17 20:41:03 +05:30
roshanr
c522d9a337 Add algorithm and test for set_intersection 2014-05-16 19:14:48 +05:30
roshanr
c0f844f018 Add compact kernel
Added kernel to compact the results of set kernels into actual sets
2014-05-16 19:13:27 +05:30
roshanr
78d06b2e26 Add tile_sets kernel
Added kernel to tile sets based on a balanced path
2014-05-16 19:13:03 +05:30
Kyle Lutz
b41ec2b1cb Fix bug when invoking binary closures 2014-05-14 10:44:52 -07:00
Kyle Lutz
72747e0830 Release v0.2 2014-05-11 10:34:16 -07:00
Kyle Lutz
b89c886462 Refactor exception classes
This refactors and improves the exception classes. Additional
documentation as well as testing has been added. This also adds
a new static method to opencl_error which converts OpenCL error
codes to human-readable strings.
2014-05-10 14:59:33 -07:00
Kyle Lutz
b1eef72ec2 Add device vendor predicate functions
This adds a couple new functions for checking the vendor of a
compute device. This is useful for algorithms which specialize
based on the type of the underlying hardware.
2014-05-10 10:33:24 -07:00
Benoit Dequidt
985414ed50 Improve reduce_on_gpu kernel if device.vendor() == NVIDIA
An internal template functor is used to keep the code clear.
When the device is from NVIDIA, we can do:

  - loop unrolling
  - warp reduction using volatile local memory
2014-05-10 10:16:22 -07:00
Kyle Lutz
4bb78de369 Merge pull request #110 from roshanr95/search_n
Search_n
2014-05-10 10:09:16 -07:00
roshanr
d3b3881d9e Add algorithm and test for search_n 2014-05-09 22:48:40 +05:30
roshanr
09760b4372 Add algorithm and test for find_end 2014-05-09 05:16:04 +05:30
Kyle Lutz
6b7d83b40e Merge pull request #103 from roshanr95/search_algorithms
Search algorithm
2014-05-07 21:04:40 -07:00
roshanr
79d5353a4e Minor fixes
Change documentation style, add test for search(), remove unused
variable, remove trailing whitespaces
2014-05-07 15:53:03 +05:30
Kyle Lutz
88b6a8b3d4 Only call clRetainDevice()/clReleaseDevice() for sub-devices 2014-05-06 19:56:11 -07:00
roshanr
747fe2d41f Search algorithm
Add algorithm and test for search()
2014-05-04 23:41:43 +05:30
roshanr
f49f8ca36a Add kernel for pattern matching
Finds all matches. Can tell if there is a match starting at a
particular index.
2014-05-03 01:52:39 +05:30
Kyle Lutz
d23894322f Merge pull request #102 from roshanr95/gather/scatter
Gather/scatter
2014-05-01 19:06:55 -07:00
roshanr
cd88be2e94 Rewrite scatter using meta_kernel 2014-05-01 14:53:34 +05:30
roshanr
d0e5efdbb7 Rewrite gather using meta_kernel 2014-05-01 14:26:58 +05:30
Kyle Lutz
37a00f5c4e Add normal_distribution class 2014-04-27 15:27:15 -07:00
Kyle Lutz
714416f17f Change uniform_real_distribution to use BOOST_COMPUTE_FUNCTION() 2014-04-27 15:17:30 -07:00
Kyle Lutz
7429589ca7 Change mersenne_twister_engine::generate() to use temporary storage 2014-04-27 15:14:59 -07:00
Kyle Lutz
02fc4fa170 Add define() method to function<> and closure<> 2014-04-27 15:02:48 -07:00
Kyle Lutz
0117e462e2 Fix unused parameter warnings 2014-04-27 13:11:24 -07:00
Kyle Lutz
6b00246e09 Simplify function/closure macro implementations 2014-04-27 12:56:14 -07:00
Kyle Lutz
9343b99085 Improve documentation 2014-04-24 19:51:45 -07:00
Kyle Lutz
2dc1a8cbfe Change count_if() to use reduce() 2014-04-22 22:24:51 -07:00
Kyle Lutz
ab0a365060 Only allocate temporary vector if necessary in generic_reduce() 2014-04-22 21:42:28 -07:00
Kyle Lutz
72a5449ffe Fix BOOST_COMPUTE_FUNCTION() usage in struct.hpp documentation 2014-04-22 21:31:14 -07:00
Kyle Lutz
127b350411 Remove unnecessary typename in function.hpp and closure.hpp
This fixes a warning when compiling these files with clang.
2014-04-20 19:34:21 -07:00
Kyle Lutz
4b67907023 Change BOOST_COMPUTE_FUNCTION() to use custom argument names
This changes the BOOST_COMPUTE_FUNCTION() macro (and the related
BOOST_COMPUTE_CLOSURE() macro) to use custom, user-provided argument
names instead of auto-generating them based on their index.

This is an API-breaking change. Users should now provide argument
names when using the BOOST_COMPUTE_FUNCTION() macro. The examples
and documentation have been updated to reflect the new API.
2014-04-20 19:13:48 -07:00
Kyle Lutz
2511bdb436 Merge pull request #97 from roshanr95/unique
Fix errors in unique
2014-04-20 16:42:56 -07:00
roshanr
3f537d806e Add unique_copy, modify unique to use it 2014-04-21 01:43:10 +05:30
Kyle Lutz
a78212fdde Rename K to K_BITS in radix_sort()
This should fix the following error seen on the Apple OpenCL
implementation when compiling the radix_sort program: "error:
definition of macro 'K' conflicts with an identifier used in
the precompiled header".
2014-04-20 10:16:02 -07:00
Kyle Lutz
8b06e3f7bb Add event::duration() method 2014-04-19 12:31:37 -07:00
Kyle Lutz
6ac757887c Support generic function callbacks for event 2014-04-19 11:38:10 -07:00
Kyle Lutz
7629748e49 Merge pull request #90 from roshanr95/perf_sort_by_key
Fix errors in perf_sort_by_key
2014-04-18 08:15:08 -07:00
roshanr
70da4979f5 Refactor recurring code into preprocessor
Makes it easy to add specialisations
2014-04-18 10:31:12 +05:30
Kyle Lutz
21d81fcd76 Add user_event class 2014-04-16 21:11:47 -07:00
Kyle Lutz
ac0be42cfc Change program::build() to return void 2014-04-16 21:01:31 -07:00
Kyle Lutz
b3ab16578b Include <boost/mpl/size.hpp> in function.hpp 2014-04-16 19:12:29 -07:00
Kyle Lutz
e84987b3f4 Fix unused parameter warning in reduce_on_gpu.hpp 2014-04-16 19:12:15 -07:00
Kyle Lutz
4f6c591362 Remove unnecessary typename in discard_iterator 2014-04-13 14:55:08 -07:00
Kyle Lutz
663ab01425 Add documentation for the random number generator classes 2014-04-13 14:33:03 -07:00
Kyle Lutz
3d8616e27e Add mersenne_twister_engine::generate() overload with transform 2014-04-13 14:18:41 -07:00
Kyle Lutz
6336b81911 Rename mersenne_twister_engine::fill() to generate() 2014-04-13 14:13:53 -07:00
Kyle Lutz
2ebb04caac Add discard() method to mersenne_twister_engine 2014-04-13 13:57:12 -07:00
Kyle Lutz
dd0b1fcb7b Add discard_iterator class 2014-04-13 13:45:01 -07:00
Kyle Lutz
6a9efd6d03 Optimize vector<T>::erase() when last equals end() 2014-04-12 16:16:01 -07:00
Kyle Lutz
dadced4703 Remove unused context variable in random_fill() 2014-04-12 11:33:09 -07:00
Kyle Lutz
b7c4f0ce18 Change mersenne_twister::seed() to take a command_queue 2014-04-12 11:14:44 -07:00
Kyle Lutz
7b2ca68539 Add documentation for the enqueue_1d_range_kernel() method 2014-04-12 10:11:02 -07:00
Kyle Lutz
8dac90de3a Fix spelling error in enqueue_native_kernel() documentation 2014-04-12 10:09:44 -07:00
Kyle Lutz
420c3dd15b Remove cl_int return values from command_queue
This updates the methods in command_queue to either return void
(for synchronous operations) or an event object (for asynchronous
operations). The caller will be notified of OpenCL errors via an
exception being thrown.
2014-04-12 10:02:45 -07:00
Kyle Lutz
7966768c80 Remove read/write buffer convenience overloads in command_queue 2014-04-12 09:40:37 -07:00
Kyle Lutz
e3604817df Remove explicit call to finish() in command_queue destructor
This removes the explicit call to finish() in the destructor
for the command_queue class.

The clFinish() function will be called automatically by the
clReleaseCommandQueue() function once the reference count for
the command queue drops to zero.
2014-04-12 09:35:39 -07:00
Kyle Lutz
7ec4566a00 Remove default local_work_size argument for enqueue_nd_range_kernel() 2014-04-12 09:22:28 -07:00
Kyle Lutz
89d97768d2 Remove enqueue_1d_range_kernel() overload with no local work-size 2014-04-12 09:17:48 -07:00
Kyle Lutz
15cf54cc48 Fix ambiguous member template warning with clang 2014-04-10 22:31:01 -07:00
Kyle Lutz
acb2188382 Improve reduce() performance with generic iterators 2014-04-10 22:16:04 -07:00
Kyle Lutz
b897b1f023 Copy multiple values per thread in copy_on_device() 2014-04-02 21:46:36 -07:00
Kyle Lutz
bae2bb6c7f Add get_nvidia_compute_capability() function 2014-04-02 21:30:22 -07:00
Kyle Lutz
01eb24f36c Fix bug in copy-constructor for wait_list 2014-03-23 21:26:09 -07:00
Kyle Lutz
6334d67720 Merge pull request #75 from roshanr95/unique
Unique algorithm
2014-03-23 21:24:05 -07:00
Kyle Lutz
5efecbdaad Merge pull request #74 from ddemidov/master
Fixing several warnings given by pedantic g++-4.8.2
2014-03-23 21:23:21 -07:00
roshanr
1e81b7ec2e Unique algorithm
Added unique() algorithm, tests and benchmarks. Removed unused variable
in scan_on_gpu() to remove warnings
2014-03-24 06:30:28 +05:30
Denis Demidov
d653df535d Fixing several warnings given by pedantic g++-4.8.2 2014-03-22 22:37:39 +04:00
Kyle Lutz
667aa9c200 Add buffer::clone() method 2014-03-20 23:31:41 -07:00
Kyle Lutz
0446e24baf Fix BOOST_COMPUTE_FUNCTION() with non-default-constructible types 2014-03-20 23:31:36 -07:00
Kyle Lutz
c15f35b0be Check for empty strings in get_object_info() 2014-03-20 23:17:27 -07:00
Kyle Lutz
21f053fe00 Check binary status in program::create_with_binary() 2014-03-20 23:17:27 -07:00
Kyle Lutz
53d6e95054 Release v0.1 2014-03-16 15:18:26 -07:00
Kyle Lutz
a439709fc2 Improve documentation 2014-03-16 13:59:14 -07:00
Kyle Lutz
8e086104a0 Add event::set_callback() method
This adds a method to the event class which allows the user to
register a callback function to be invoked when the event reaches
the specified state (e.g. when it completes).
2014-03-16 13:20:57 -07:00
Kyle Lutz
0c3a325554 Move transform_if() algorithm to experimental 2014-03-16 13:16:39 -07:00
Kyle Lutz
9bf22a41d1 Merge pull request #66 from roshanr95/rotate_copy
rotate_copy algorithm and test
2014-03-13 08:07:11 -07:00
Kyle Lutz
bae7432c04 Improve sort_by_key() performance 2014-03-12 23:40:57 -07:00
Kyle Lutz
cf8e972e55 Improve kernel::set_arg() method 2014-03-12 21:02:22 -07:00
Kyle Lutz
e1e84252d0 Merge pull request #65 from roshanr95/mersenneTwister
Fix warnings in Mersenne Twister
2014-03-12 18:21:13 -07:00
roshanr
f1b7f39655 rotate_copy algorithm and test 2014-03-13 03:56:56 +05:30
Kyle Lutz
5fb6f94cea Merge pull request #62 from roshanr95/rotate
Rotate algorithm
2014-03-12 10:38:56 -07:00
roshanr
d1a87603f0 Fix warnings in Mersenne Twister 2014-03-12 23:05:06 +05:30
roshanr
03edbbbdab Rotate algorithm 2014-03-12 22:41:30 +05:30
Kyle Lutz
0a1c378731 Add opengl_renderbuffer class 2014-03-09 22:17:36 -07:00
Kyle Lutz
ad48527dcd Add documentation for OpenGL interop headers 2014-03-09 22:16:10 -07:00
Kyle Lutz
6c8f158c00 Fix documentation for the wait_list class 2014-03-09 22:06:45 -07:00
Kyle Lutz
83d104f24f Add BOOST_COMPUTE_CLOSURE() macro
This adds a new macro which allows users to create closure functions
which can capture C++ variables and make them available in OpenCL.
2014-03-08 18:44:03 -08:00
Kyle Lutz
dec92cc438 Add BOOST_COMPUTE_ADAPT_STRUCT() macro
This adds a new macro which allows the user to adapt a C++ struct
or class for use with OpenCL given its type, name, and members.

This allows for custom user-defined data-types to be used with the
Boost.Compute containers and algorithms.
2014-03-08 18:21:34 -08:00
Kyle Lutz
6f3f30bee9 Add enqueue_native_kernel() method to command_queue 2014-03-08 15:21:57 -08:00
Kyle Lutz
3b49cf14f8 Add wait_list class
This adds a wait_list class which contains a vector of OpenCL
events that can be waited on before executing further commands.
2014-03-08 14:09:41 -08:00
Kyle Lutz
71af014b3d Add mapped_view container 2014-03-08 13:17:55 -08:00
Kyle Lutz
51e89596b1 Simplify accumulate() with reduce() 2014-03-08 13:13:32 -08:00
Kyle Lutz
b8de46d4de Add experimental directory
This adds an experimental directory which contains various
experimental algorithms and functions. The files and APIs
under this directory are experimental and unstable.
2014-03-08 13:02:06 -08:00
Kyle Lutz
86c0bb0a12 Add inline specifier to opengl_enqueue_release_gl_objects() 2014-02-28 21:09:34 -08:00
Kyle Lutz
b1b50f5e3a Add meta_kernel::insert_function_call() method 2014-02-24 19:56:52 -08:00
Kyle Lutz
d9a45b06d3 Move float vector stream operators in meta_kernel 2014-02-24 19:42:31 -08:00
Kyle Lutz
80781ce9d2 Add OpenCV-OCL interop functions 2014-02-22 10:57:42 -08:00
Kyle Lutz
dacdbf0ffd Bug in fill() with uchar4 2014-02-22 10:51:39 -08:00
Kyle Lutz
e7a76c343a Remove unused variable in reduce_on_gpu() kernel 2014-02-14 18:14:18 -08:00
Kyle Lutz
ec11d8cdc4 Add third-party perf tests
This adds third-party performance tests to use in comparing
Boost.Compute with other parallel/GPGPU frameworks like Intel's
TBB and NVIDIA's Thrust along with the C++ STL.

Also refactors the timing and profiling infrastructure and adds
a simple perf.py driver script for running performance tests.
2014-02-02 13:12:17 -08:00
Kyle Lutz
6de0b65d18 Improve documentation 2014-02-02 11:32:49 -08:00
Kyle Lutz
f3c2384af4 Add opengl_create_shared_context() function 2014-02-01 12:27:23 -08:00
Kyle Lutz
0c88eca831 Add platform::id() method 2014-02-01 12:17:21 -08:00
Kyle Lutz
9a0aa33c2f Make platform::get_extension_function_address() const 2014-02-01 12:15:53 -08:00
Kyle Lutz
ccd6f21d98 Change vector constructors to take queue argument
This changes the vector<T> constructors which copy or initialize
data to take a queue argument used for performing the operations.

Previously they just took a context argument used to initialize the
buffer and then created a new command queue to use. This improves
performance by not requiring a new command queue and also fixes issues
when performing operations on a different command queue while the
vector was still being initialized.
2014-01-27 23:39:19 -08:00
Kyle Lutz
47922aa780 Add Boost version check to config.hpp
This adds a compile-time check to config.hpp which ensures
that the minimum supported Boost version (1.48) is found.
2014-01-20 18:31:18 -08:00
Kyle Lutz
dc20f09d92 Add make_tuple() lambda function 2014-01-14 22:18:35 -08:00
Kyle Lutz
ea7c2bf2f4 Add make_pair() lambda function 2014-01-14 22:03:48 -08:00
Kyle Lutz
c784ae994e Add third lambda placeholder 2014-01-14 22:00:22 -08:00
Kyle Lutz
46ef3fffb5 Make lambda function expressions variadic 2014-01-14 21:58:09 -08:00
Kyle Lutz
c57e1953d8 Make lambda get<N>() variadic 2014-01-14 21:54:54 -08:00
Kyle Lutz
8aad57612b Make function_signature_to_mpl_vector<> meta-function variadic 2014-01-14 21:52:34 -08:00
Kyle Lutz
72664c8de9 Add test for generate() with pair<T1, T2> 2014-01-14 21:31:51 -08:00
Kyle Lutz
68412f5ae0 Refactor function handling in lambda expressions 2014-01-13 18:27:57 -08:00
Kyle Lutz
936d801466 Add support for host iterators to sort() 2014-01-13 18:27:52 -08:00
Kyle Lutz
413267b32a Improve accumulate() performance
This improves the performance for the accumulate() algorithm
for types/operations that can be performed with reduce().
2014-01-13 18:27:48 -08:00
Kyle Lutz
ac148e8f1f Fix extra semicolon warning in interop/eigen/core.hpp 2014-01-13 18:27:40 -08:00
Denis Demidov
5e912dff1c Move BOOST_COMPUTE_MAX_ARITY definition to compute/config.hpp 2014-01-10 09:55:35 +04:00
Denis Demidov
52bae83504 Make zip_iterator take more than three elements
This uses Boost.Preprocessor macros to allow zip iterators to work with
an arbitrary number of elements (the current limit is the maximum
boost::tuple size, which is 10 by default).

Refs #50
2014-01-09 23:39:58 +04:00
Denis Demidov
d24749ae52 Use SHA1 for online cache keys
This makes the online cache use the SHA1 of the program source as the key.
Introduces boost::compute::detail::sha1() function, which is moved
from compute::program into its own header file.
2014-01-07 23:07:18 +04:00
Kyle Lutz
6f52e3ce1f Merge pull request #46 from ddemidov/offline-cache
Use the original program source for program creation/compilation
2014-01-07 10:11:08 -08:00
Denis Demidov
41d2052c2a Fix linkage problem with detail::getenv()
detail::getenv() function was not declared inline, which led to
`multiple definition` errors at link time when a program consisted of
multiple objects that included Boost.Compute headers.

Fixed the problem and added core.multiple_objects test.
2014-01-07 21:29:18 +04:00
Denis Demidov
f519ad3639 Use the original program source for program creation/compilation
Instead of building the program from source with the added comment
block (used for distinction between different platforms and devices
when offline cache is in use), only use the altered source for the
hash computation. This way users will not get unexpected results from
program.source().
2014-01-07 21:05:26 +04:00
Kyle Lutz
aad03486d9 Add interop support
This adds interoperability support between Boost.Compute and various
other C/C++ libraries (Eigen, OpenCV, OpenGL, Qt and VTK). This eases
development for users using external libraries with Boost.Compute.
2014-01-06 23:35:38 -08:00
Kyle Lutz
b47e74df6f Add is_fundamental type-trait 2014-01-06 23:04:36 -08:00
Kyle Lutz
eca81df028 Merge pull request #39 from ddemidov/offline-cache
Implements offline kernel caching
2014-01-06 22:47:52 -08:00
Denis Demidov
562f149b18 Implements offline kernel caching
See kylelutz/compute#21

This adds the program::build_with_source() function, which both creates
and builds the program for the given context with the supplied source and
compile options. If the BOOST_COMPUTE_USE_OFFLINE_CACHE macro is
defined, it also saves the compiled program binary for reuse in the
offline cache located in the $HOME/.boost_compute folder on UNIX-like
systems and in the %APPDATA%/boost_compute folder on Windows.

All internal uses of program::create_with_source() followed by
program::build() are replaced with program::build_with_source().
2014-01-07 09:07:00 +04:00
Kyle Lutz
b17888b604 Move future header to async directory 2014-01-06 18:44:37 -08:00
Kyle Lutz
55eeada078 Add getenv() wrapper
This adds a getenv() wrapper which can be used to avoid having to
explicitly disable MSVC warnings when checking for environment
variables.
2014-01-06 07:53:07 -08:00
Kyle Lutz
e337f632da Add height() and width() methods to image2d 2014-01-05 18:36:29 -08:00
Kyle Lutz
3bc4a6366d Add BOOST_COMPUTE_STRINGIZE_SOURCE() macro 2014-01-05 18:30:34 -08:00
Kyle Lutz
6b30645d6d Remove extra semicolon in accumulate.hpp 2014-01-05 18:18:45 -08:00
Kyle Lutz
0d9be38326 Fix issues with gather() algorithm
This fixes some issues with the gather algorithm and also
adds another test for it.
2013-12-21 15:34:29 -08:00
Kyle Lutz
55783258e7 Add cache support to meta_kernel::compile()
This updates the meta_kernel::compile() method to support
caching of program objects. The programs are cached based
on a hash of their source code.
2013-12-21 11:44:02 -08:00
Kyle Lutz
ac1ff45eff Add reduce_on_gpu() algorithm
This adds an improved reduce() algorithm implementation for
GPUs. Also adds checks to accumulate() which allow it to
use the higher-performance reduce() algorithm if possible.
2013-12-21 10:56:55 -08:00
Kyle Lutz
26612823a4 Add merge() overload with custom compare function
This adds a merge() function overload which uses a custom compare
function instead of the default less<T>() to compare the values.
2013-12-07 15:15:37 -08:00
Kyle Lutz
6b6f66b6ba Add reduce() overload without function argument
This adds an overload of the reduce() function which
uses plus<T>() as the reductor. This simplifies the common
case of calculating the sum for a range of values.
2013-12-07 15:02:04 -08:00
Kyle Lutz
ba9e64e316 Remove init argument from reduce()
This removes the init argument from reduce. This simplifies the
implementation and avoids copying a value from the host to the
device on every call to reduce.

If an initial value is required, the accumulate function can be
called instead.
2013-12-07 14:49:46 -08:00
Kyle Lutz
7db9ad715f Fix compilation error on Windows for context error handler
This fixes a compilation error which occurs on Windows when
registering the default error handler callback when creating
a new context object.

In OpenCL 1.1 and later the callback function is expected to
use the __stdcall calling convention. This is optionally defined
by the CL_CALLBACK macro on WIN32 platforms. If available, it is
defined with the BOOST_COMPUTE_CL_CALLBACK macro which is then
used to annotate the callback functions.
2013-12-06 23:11:01 -08:00
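The macro fallback described in 7db9ad715f can be sketched roughly as follows. This is a minimal illustration, not the library's actual header: the handler name record_context_error is hypothetical, and it stores the message in a std::string instead of throwing. The parameter list follows the pfn_notify callback of clCreateContext().

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// CL_CALLBACK is provided by the OpenCL headers when available (OpenCL 1.1+).
// When it is missing, fall back to an empty annotation.
#ifdef CL_CALLBACK
#  define BOOST_COMPUTE_CL_CALLBACK CL_CALLBACK
#else
#  define BOOST_COMPUTE_CL_CALLBACK
#endif

// Hypothetical handler: copies the error text into the std::string that
// user_data points to. The annotation sits between the return type and
// the function name, which is where a calling convention belongs.
inline void BOOST_COMPUTE_CL_CALLBACK
record_context_error(const char *errinfo,
                     const void * /*private_info*/,
                     std::size_t /*cb*/,
                     void *user_data)
{
    *static_cast<std::string *>(user_data) = errinfo;
}
```

On platforms where CL_CALLBACK is not defined the annotation expands to nothing, so the same declaration compiles unchanged.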
Kyle Lutz
4b2aa35326 Increase work-group size for copy() kernel
This increases the work-group size for the copy() kernel to 256 which
improves performance on several benchmarks.
2013-11-23 12:22:55 -08:00
Kyle Lutz
701bc8a5f3 Add nth_element() algorithm
This adds an implementation of the nth_element() algorithm. For
now the algorithm is trivially implemented by calling sort().
2013-11-15 20:51:13 -08:00
Kyle Lutz
0daa62e41f Add experimental copy_index_if() algorithm
This adds an experimental algorithm like copy_if() which copies
the index of the values for which predicate returns true instead
of the values themselves.
2013-11-15 20:30:30 -08:00
Kyle Lutz
adde232fc8 Add context error handler
This adds an error handler function which is invoked when an OpenCL
context encounters an error condition. The context error is converted
to a C++ exception containing the error information and thrown.
2013-11-15 20:26:01 -08:00
Kyle Lutz
953ebb4e26 Add variadic tuple support
This adds support for variadic tuples on C++11 compilers.
2013-11-15 20:07:39 -08:00
Kyle Lutz
b5ff4743bb Add field() function
This adds a new function which will return the named field
from a value. For example, this can be used to return one of
the components of a pair object or to swizzle a vector value.
2013-11-10 15:44:45 -08:00
Kyle Lutz
8213697307 Add BOOST_COMPUTE_FUNCTION() macro
This adds a new macro to ease the definition of custom user
functions. The BOOST_COMPUTE_FUNCTION() macro creates a new
boost::compute::function<> object with the provided return
type, argument types, function name and OpenCL source code.
2013-11-10 15:32:15 -08:00
Kyle Lutz
8608e60116 Refactor invoked_function<>
This refactors the invoked_function<> classes. Previously each
function arity (e.g. unary, binary) had a separate invoked_function<>
template class. Now they all use the same class which simplifies the
logic in function<> and meta_kernel.
2013-11-10 15:31:56 -08:00
Kyle Lutz
43678410be Fix bugs with type definitions in meta_kernel
This fixes a bug in which type definitions were being inserted
into meta_kernels multiple times. Also forces zip_iterator to
insert its type definitions when used in a kernel.
2013-11-10 15:13:46 -08:00
Kyle Lutz
a0b635e201 Add type_name<void>() specialization
This adds a type_name<>() specialization for void types.
2013-11-10 14:35:04 -08:00
Kyle Lutz
85812f4e93 Add BOOST_COMPUTE_TYPE_NAME() macro
This adds a macro for registering custom type names for C++ types
to be used in OpenCL kernel code. Internally the macro specializes
the type_name<T>() function.
2013-10-02 21:40:22 -04:00
Kyle Lutz
a2b7595f36 Make type_name<T>() inline
This adds the inline specifier to the type_name<T>() function.
2013-10-02 21:23:09 -04:00
Kyle Lutz
feb510a019 Add unpack() function adaptor
This adds a new unpack() function adaptor which converts
a function with N arguments to a function which takes a
single tuple argument with N components.

This is useful for calling built-in functions with the tuple
values returned from zip_iterator. This also removes the now
un-needed binary_transform_iterator.
2013-09-24 23:05:08 -04:00
Kyle Lutz
736f3a17a6 Add min_and_max reduce() test
This adds a test for computing the minimum and maximum
values of a vector simultaneously using reduce() with a
custom reduction function.

Also fixes a bug in reduce() in which inplace_reduce() was
being used even if the input type and result type differed.
2013-09-24 22:47:16 -04:00
Kyle Lutz
a1155bc343 Store source strings for binary and ternary functions
This fixes an issue in which the source strings for binary
and ternary functions were not being stored and thus not
being inserted into kernels when they were invoked.
2013-09-24 22:42:50 -04:00
Kyle Lutz
dc6b3228eb Add as() and convert() type-conversion functions
This adds the as() and convert() functions for converting
between OpenCL types.
2013-09-24 22:27:50 -04:00
Kyle Lutz
3412d0935d Add not1() and not2() function adaptors
This adds the not1() and not2() function adaptors which
negate unary and binary functions respectively.
2013-09-24 22:22:52 -04:00
Kyle Lutz
07e4a6b3aa Remove BLAS functions
This removes the incomplete BLAS API functions.
2013-09-24 22:19:56 -04:00
Kyle Lutz
d16309f57e Add program_cache
This adds a program cache which can be used by algorithms and other
functions to store programs which may be re-used. This improves
performance by reducing the need for costly recompilation of commonly
used programs.

Program caches are context specific and multiple copies of the same
context will use the same program cache. They are created and accessed
by the global get_program_cache() function.

For now, only a few algorithms and functions (radix sort, mersenne
twister, fixed size sorts) make use of the program cache.
2013-09-07 22:58:34 -04:00
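The idea behind d16309f57e (compile once per context, reuse afterwards) can be sketched on the host as follows. Everything here is an assumption made for illustration: Program, ProgramCache, get_program_cache_for() and the integer context id are hypothetical stand-ins, the cache is keyed by the full source string, and the sketch is not thread-safe.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a compiled program.
struct Program {
    std::string source;
};

class ProgramCache {
public:
    ProgramCache() : compile_count_(0) {}

    // Returns the cached program for this source; "compiles" only on a miss.
    std::shared_ptr<Program> get_or_compile(const std::string &source)
    {
        std::map<std::string, std::shared_ptr<Program> >::iterator it =
            programs_.find(source);
        if(it != programs_.end()){
            return it->second;
        }
        ++compile_count_;  // the costly step happens once per distinct source
        std::shared_ptr<Program> program = std::make_shared<Program>();
        program->source = source;
        programs_[source] = program;
        return program;
    }

    int compile_count() const { return compile_count_; }

private:
    std::map<std::string, std::shared_ptr<Program> > programs_;
    int compile_count_;
};

// One cache per context, looked up through a global accessor.
// Contexts are identified by a hypothetical integer id in this sketch.
inline ProgramCache &get_program_cache_for(int context_id)
{
    static std::map<int, ProgramCache> caches;
    return caches[context_id];
}
```

Two lookups with the same context id return the same cache object, which mirrors the statement that copies of one context share a cache.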
Kyle Lutz
d04e628367 Add experimental sort_by_transform() algorithm
This adds a sort_by_transform() algorithm which sorts a set of
values based on the value of a transform function.

For example, this can be used to sort a set of vectors by their
length (when used with the length<T>() function) or by a single
component (when used with the get<N>() function).
2013-09-07 17:10:15 -04:00
Kyle Lutz
3389a5c741 Add sort_by_key() algorithm
This adds a new sort_by_key() algorithm which sorts a range
of values by a range of keys with a comparison operator.

For now this is only implemented by the serial insertion sort
algorithm. In the future it will be ported to the other sorting
algorithms (e.g. radix sort).
2013-09-07 17:02:08 -04:00
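A serial insertion sort that reorders values according to their keys, as 3389a5c741 describes, might look like this on the host. The function name sort_by_key_host and the use of std::vector are assumptions for the sketch; the device implementation is not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Sorts keys with comp and applies the same moves to values, so that
// values[i] stays paired with keys[i] throughout.
template<class Key, class Value, class Compare>
void sort_by_key_host(std::vector<Key> &keys,
                      std::vector<Value> &values,
                      Compare comp)
{
    assert(keys.size() == values.size());

    for(std::size_t i = 1; i < keys.size(); ++i){
        const Key key = keys[i];
        const Value value = values[i];

        // Shift larger elements one slot to the right.
        std::size_t j = i;
        while(j > 0 && comp(key, keys[j - 1])){
            keys[j] = keys[j - 1];
            values[j] = values[j - 1];
            --j;
        }

        keys[j] = key;
        values[j] = value;
    }
}
```

Because elements are only shifted when comp(key, previous) holds, equal keys keep their original relative order.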
Kyle Lutz
f9d887e30d Add experimental tabulate() algorithm
This adds a tabulate() algorithm which fills a range with values
calculated from a function given each element's index.
2013-09-07 16:53:08 -04:00
Kyle Lutz
a96c9c0182 Add result argument to reduce() algorithm
This adds an output iterator result argument to the reduce()
algorithm. Now, instead of returning the reduced result, the
result is written to an output iterator. This allows the value
to stay on the device and avoids a device-to-host copy in cases
where the result is not needed on the host (e.g. it is part of
a larger computation).

This is an API breaking change to users of reduce(). Affected code
should now declare a result variable and then pass a pointer to it
as the new result argument.
2013-09-07 15:36:49 -04:00
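The calling convention that a96c9c0182 introduces (the result is written through an output iterator instead of being returned) can be illustrated with a host-only analog. The name reduce_to is hypothetical, and writing nothing for an empty range is a choice made for this sketch.

```cpp
#include <cassert>
#include <functional>
#include <iterator>
#include <vector>

// Reduces [first, last) with op and writes the single result to *result.
// No initial value is taken; the first element seeds the accumulator.
template<class InputIterator, class OutputIterator, class BinaryFunction>
void reduce_to(InputIterator first,
               InputIterator last,
               OutputIterator result,
               BinaryFunction op)
{
    if(first == last){
        return;  // empty range: leave *result untouched (sketch-only choice)
    }

    typename std::iterator_traits<InputIterator>::value_type acc = *first;
    for(++first; first != last; ++first){
        acc = op(acc, *first);
    }

    *result = acc;
}
```

Caller code declares a result variable and passes a pointer to it, for example `int sum = 0; reduce_to(v.begin(), v.end(), &sum, std::plus<int>());`.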
Kyle Lutz
a8f4421739 Add copy() specialization for host-to-host transfers
This adds a copy() specialization for host-to-host transfers
which simply forwards the call to std::copy().

This is useful in templated algorithms which may in certain
circumstances copy() between data ranges on the host.
2013-09-07 15:29:48 -04:00
Kyle Lutz
78a561eff1 Add scan_on_cpu() algorithm
This adds a new scan_on_cpu() algorithm which implements the scan()
algorithm for CPU devices. Also renames the existing scan() algorithm
to scan_on_gpu().

This fixes some test failures on POCL which were caused by the prior
GPU scan() algorithm not functioning properly with POCL.
2013-09-07 15:03:42 -04:00
Kyle Lutz
518d39fc2b Use bitwise-and to check device::type()
This changes the checks for the device type to use the bitwise-and
operator instead of the equality operator. The returned type is a
bitset and this would cause errors when multiple bits were set.

This fixes a bug on POCL which returns the device type as a
combination of CL_DEVICE_TYPE_DEFAULT and CL_DEVICE_TYPE_CPU. Now
the correct device type (device::cpu) is detected for POCL.
2013-09-07 14:16:20 -04:00
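The difference between the two checks in 518d39fc2b is easy to see in isolation. The constants below are local stand-ins that mirror the bit values of the OpenCL cl_device_type flags; the helper names are hypothetical.

```cpp
#include <cassert>

// Local mirrors of the OpenCL device-type bit flags.
typedef unsigned long device_type_bits;
const device_type_bits DEVICE_TYPE_DEFAULT = 1ul << 0;
const device_type_bits DEVICE_TYPE_CPU     = 1ul << 1;
const device_type_bits DEVICE_TYPE_GPU     = 1ul << 2;

// Old check: fails as soon as any additional bit is set.
inline bool is_cpu_by_equality(device_type_bits type)
{
    return type == DEVICE_TYPE_CPU;
}

// New check: tests only the CPU bit and ignores the others.
inline bool is_cpu_by_mask(device_type_bits type)
{
    return (type & DEVICE_TYPE_CPU) != 0;
}
```

For a device that reports DEFAULT and CPU together, the equality check returns false while the mask check returns true.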
Kyle Lutz
3a7b90ff06 Fix issue with comparison operators in lambda expressions
This fixes an issue in which comparison operators (e.g. <, ==)
in lambda expressions would return the wrong result type causing
compilation errors.

Also adds a few test cases to ensure the correct result type
and that lambda expressions can be properly used with count_if().
2013-08-15 22:10:03 -04:00
Kyle Lutz
bacec5b8fe Add uniform_real_distribution
This adds a random number distribution which generates random
numbers in a uniform distribution.

Also adds a convenience algorithm which fills a range with
uniformly distributed random numbers between two values.
2013-08-13 20:40:42 -04:00
Kyle Lutz
767589fe0d Rearrange type headers
This rearranges the type headers to live under the
<boost/compute/types/...> directory instead of the
top-level <boost/compute/...> directory.
2013-08-13 20:37:56 -04:00
Kyle Lutz
b539e8413c Add Doxygen documentation
This replaces the BoostBook/XML based reference documentation
with Doxygen auto-generated documentation.
2013-07-16 21:48:16 -04:00
Kyle Lutz
b3d2fbb7eb Add fill_async() algorithm
This adds a fill_async() which fills a range with a
given value asynchronously.
2013-07-02 21:57:19 -04:00
Kyle Lutz
5203506c16 Add support for on-device copy_async()
This adds support for copy_async() when copying between
memory objects on a compute device.
2013-07-02 21:57:19 -04:00
Kyle Lutz
8459fdeb0e Change meta_kernel::exec*() methods to return events
This changes the exec() and exec_1d() methods in the meta_kernel
class to return event objects.
2013-07-02 21:57:19 -04:00
Kyle Lutz
d8f5a5b503 Change enqueue_*_buffer() methods to return events
This changes the enqueue_copy_buffer() and enqueue_fill_buffer()
methods in the command_queue class to return event objects.
2013-07-02 21:57:19 -04:00
Kyle Lutz
c1bf707b41 Add event::get_command_type() method
This adds a get_command_type() method to the event class
which returns the OpenCL type for an event object.
2013-07-02 21:57:19 -04:00
Kyle Lutz
ee5f581094 Add command_queue::enqueue_migrate_memory_objects() method
This adds an enqueue_migrate_memory_objects() method to the
command_queue class which allows memory objects to be migrated
between compute devices and to the host.
2013-07-02 21:57:19 -04:00
Kyle Lutz
2ca028c37b Improve reduce() performance
This makes a few tweaks to the reduce() algorithm in order to
improve performance. An unnecessary barrier() has been removed
and now multiple values are reduced on the initial read.
2013-07-02 21:57:15 -04:00
Denis Demidov
84394de119 Get rid of type conversion warnings inside VS2010 2013-06-24 09:57:22 +02:00
Denis Demidov
b28d8697bc Silence MSVC security warning C4996 in system.hpp 2013-06-24 09:55:40 +02:00
Denis Demidov
f5c86057a1 Get rid of clang v3.3 warning -Wconstexpr-not-const 2013-06-21 15:27:00 +04:00
Kyle Lutz
f2b812019c Fix bugs with char/uchar/bool literals in meta_kernel
This fixes a few issues that occurred when using char, uchar
and bool literals with meta_kernel.
2013-06-19 23:55:22 -04:00
Kyle Lutz
e01569049b Add type_name<bool>() specialization
This adds a type_name() specialization for bool.
2013-06-19 23:48:49 -04:00
Kyle Lutz
0d285d8a30 Change meta_kernel::add_arg(name, value) to add_set_arg()
This changes the meta_kernel::add_arg() overload with a name
and a value to a separate method. This fixes a conflict when
using add_arg() with string values.
2013-06-11 21:19:47 -04:00
Kyle Lutz
7fb77ef9c5 Add test for any/all/none_if() with NaN and inf
This adds a test for the any_of(), all_of() and none_of() functions
with NaN and Inf values.
2013-06-11 21:16:15 -04:00
Kyle Lutz
8e51a0a162 Refactor lambda expression framework to use meta_kernel
This refactors the lambda expression framework to use meta_kernel
to construct kernel source code instead of using plain strings.
2013-06-11 21:14:28 -04:00
Kyle Lutz
64e94549b3 Add specialization for get<N>() with zip_iterator
This adds a specialization for the get<N>() function when used
with zip_iterator's. Now, only the N'th iterator for the expression
will be dereferenced instead of dereferencing all of the iterators
into a tuple and then extracting the N'th component.
2013-06-11 20:37:23 -04:00
Kyle Lutz
15bc98b94f Remove cv-qualifiers from get<N>()'s value-type
This removes the cv-qualifiers for the value-type returned from
get<N>() expressions. This fixes issues when specializing based
on the type (e.g. pair, tuple).
2013-06-11 20:29:06 -04:00
Kyle Lutz
98b593b937 Fix meta_kernel streaming operators with float
This fixes a bug in the meta_kernel streaming operators with
float values. Now, float scalar and vector literals are inserted
into the kernel source with the proper 'f' suffix.
2013-06-11 20:23:47 -04:00
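As a rough illustration of what 98b593b937 is about, the following host-side helper formats a float as a literal with the 'f' suffix. The name float_literal is hypothetical, the sketch handles finite values only, and it is not the streaming operator used by meta_kernel.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Formats a finite float as a C-style literal, e.g. 2.5f -> "2.5f".
// inf and NaN are not handled in this sketch.
inline std::string float_literal(float value)
{
    std::ostringstream stream;
    stream.precision(9);  // enough digits to round-trip a 32-bit float
    stream << value;

    std::string text = stream.str();

    // "1f" is not a valid floating literal, so ensure the text contains
    // either a decimal point or an exponent before adding the suffix.
    if(text.find('.') == std::string::npos &&
       text.find('e') == std::string::npos){
        text += ".0";
    }

    return text + "f";
}
```

Without the suffix a literal such as 2.5 has type double in C-like kernel source, which is what makes the suffix matter for float code.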
Kyle Lutz
36dd3f1306 Improve the system::find_default_device() method
This makes some improvements to the system::find_default_device()
method. Now, the devices on the system will only be queried once
when searching for the default device. This reduces the number of
calls to clGetPlatformIDs() and clGetDeviceIDs().

Also, in the case that no GPU or CPU devices are found, the first
device on the system will be selected as the default device. This
fixes issues when using Boost.Compute with pocl.
2013-05-24 20:07:38 -04:00
Kyle Lutz
aa7fd2f6fa Add asserts for clRelease*() functions in destructors
This adds assert()'s verifying that the clRelease*() functions
in the destructors for the OpenCL wrapper classes return
CL_SUCCESS.
2013-05-23 23:15:43 -04:00
Kyle Lutz
b5068b2027 Fix minor version macro
This fixes the minor version macro.
2013-05-23 22:46:52 -04:00
Kyle Lutz
5b12d04d4e Mark streaming operators for boost::tuple<> inline
This marks the meta_kernel streaming operators for
boost::tuple<> literals as inline.
2013-05-22 22:50:51 -04:00
Kyle Lutz
c2187b89c0 Mark streaming operator std::pair<> inline
This marks the meta_kernel streaming operator for
std::pair<> literals as inline.
2013-05-22 22:50:46 -04:00
Kyle Lutz
0405c3cdc3 Check for valid range in reverse()
This adds a check to the reverse() algorithm to ensure that
the range contains at least two elements. Previously, passing
zero or one element ranges to reverse() would result in errors.
2013-05-22 22:41:12 -04:00
Kyle Lutz
f07caa1ddd Fix compilation error in future<void> assignment operator
This fixes a compilation error which occurred when assigning
to a future<void> from a future<T>. For different future types
the event member variable is private and must be accessed via
the get_event() method.
2013-05-21 23:20:36 -04:00
Kyle Lutz
bac6fb7332 Check for valid pattern size in fill() dispatcher
This checks for a valid pattern value size before dispatching
to the clEnqueueFillBuffer() function for the fill() algorithm.
2013-05-21 23:17:32 -04:00
Kyle Lutz
2560600122 Fix issues with boost::tuple<>, char, and fill()
This fixes issues when using boost::tuple<> containing char
types with the fill() algorithm.
2013-05-21 23:10:56 -04:00
Kyle Lutz
9141732b3e Fix issues with std::pair<>, char, and fill()
This fixes issues when using std::pair<> containing char
types with the fill() algorithm.
2013-05-21 23:10:56 -04:00
Kyle Lutz
f4ecbd1e6c Fix issues with char literals in meta_kernel
This fixes issues when using char and unsigned char literals in
a meta_kernel. Previously the character values would be directly
inserted without quotes (e.g. c instead of 'c') which led to
kernel compilation errors.
2013-05-21 23:10:40 -04:00
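The quoting problem fixed in f4ecbd1e6c can be shown with a small host-side helper. The name char_literal is hypothetical; the sketch escapes only the quote and backslash characters and does not handle non-printable characters.

```cpp
#include <cassert>
#include <string>

// Returns the source text for a character literal, e.g. c -> 'c'.
inline std::string char_literal(char c)
{
    std::string text = "'";
    if(c == '\'' || c == '\\'){
        text += '\\';  // these two need a backslash inside a char literal
    }
    text += c;
    text += '\'';
    return text;
}
```

Inserting the bare character instead would produce an identifier (c) rather than a literal ('c') in the generated kernel source.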
Kyle Lutz
1caebe6de8 Fix bug in in-place scan()
This fixes a bug when creating a temporary vector for use in the
in-place scan() algorithm. Previously, a separate command queue
was used to copy the input values to the temporary vector. Now,
the same command queue is used for copying the input values and
performing the scan.
2013-05-20 23:05:51 -04:00
Kyle Lutz
9f231d7b13 Fix conversion warnings in buffer_iterator
This fixes conversion warnings for buffer_iterator.
2013-05-20 23:05:40 -04:00
Kyle Lutz
3bc5bfaf78 Remove timer class
This removes the timer class. The technique of measuring the time
difference between two different OpenCL markers on a command queue
is not portable to all OpenCL implementations (only works on NVIDIA).

A new internal timer class has been added which uses boost::chrono
(or std::chrono if BOOST_COMPUTE_TIMER_USE_STD_CHRONO is defined).
This new timer is used by the benchmarks to measure time elapsed
on the host.
2013-05-20 21:08:42 -04:00
Kyle Lutz
fab7be5f43 Add inplace_merge() algorithm
This adds a simple inplace_merge() algorithm which merges
two contiguous sorted ranges in-place.

For now, the implementation simply copies the ranges to
two temporary vectors and calls merge().
2013-05-20 20:50:12 -04:00
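The strategy described in fab7be5f43 (copy both halves to temporaries, then merge back) has a direct host analog. The function name inplace_merge_host and the index-based middle argument are assumptions for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Merges the sorted ranges [0, middle) and [middle, size) of data.
// Both halves are copied out first, so the merge never reads a slot
// that it has already overwritten.
template<class T>
void inplace_merge_host(std::vector<T> &data, std::size_t middle)
{
    assert(middle <= data.size());

    const std::vector<T> left(data.begin(), data.begin() + middle);
    const std::vector<T> right(data.begin() + middle, data.end());

    std::merge(left.begin(), left.end(),
               right.begin(), right.end(),
               data.begin());
}
```

The cost is temporary storage equal to the size of the whole range, which is the trade-off the commit accepts "for now".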
Kyle Lutz
b43e79b983 Add support for get<N>() in lambda expressions
This adds support for using the get<N>() function in lambda
expressions to extract a single component of an aggregate type.

Also adds a test of using boost::tuple<> to store a user-defined
data type on the device and sort them by their first component
using a lambda expression as the comparator.
2013-05-20 20:50:10 -04:00
Kyle Lutz
e46828a9d6 Fix issues involving iterators with void value_type
This fixes a few issues encountered when using iterators with a
void value_type (e.g. std::insert_iterator<>).

The is_contiguous_iterator meta-function was refactored to always
return false for iterators with a void value_type and avoid
instantiating types for containers with a void value_type
(e.g. std::vector<void>::iterator) which previously resulted
in compilation errors.
2013-05-20 19:57:13 -04:00
Kyle Lutz
4ab37ada07 Add system-wide default command queue
This adds a system-wide default command queue. This queue is
accessible via the new static system::default_queue() method.
The default command queue is created for the default compute
device in the default context and is analogous to the default
stream in CUDA.

This changes how algorithms operate when invoked without an
explicit command queue. Previously, each algorithm had two
overloads, the first expected a command queue to be explicitly
passed and the second would create and use a temporary command
queue. Now, all algorithms take a command queue argument which
has a default value equal to system::default_queue().

This fixes a number of race-conditions and performance issues
throughout the library associated with creating, using, and
destroying many separate command queues.
2013-05-15 20:59:56 -04:00
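The single-overload pattern from 4ab37ada07 (a queue parameter that defaults to a shared queue) can be sketched without OpenCL. CommandQueue, default_queue() and fill_host() below are hypothetical stand-ins; the enqueued counter exists only so the sketch can show which queue was used.

```cpp
#include <cassert>

// Hypothetical stand-in for a command queue.
struct CommandQueue {
    int enqueued;
    CommandQueue() : enqueued(0) {}
};

// Created on first use and shared by every later call.
inline CommandQueue &default_queue()
{
    static CommandQueue queue;
    return queue;
}

// One overload: callers may pass a queue, otherwise the shared default
// is used. No temporary queue is created or destroyed per call.
inline void fill_host(int *first, int *last, int value,
                      CommandQueue &queue = default_queue())
{
    for(; first != last; ++first){
        *first = value;
    }
    ++queue.enqueued;
}
```

Calling `fill_host(a, a + 3, 7)` uses the shared queue, while `fill_host(a, a + 3, 7, q)` uses q.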
Kyle Lutz
a2bda0610d Fix memory issues with device_ptr and allocator
This fixes a few memory handling issues between device_ptr,
buffer_iterator, buffer_value, allocator, and malloc/free.

Previously, memory buffers that were allocated by allocator and
malloc were being retained (via clRetainMemObject() in buffer's
constructor) by device_ptr, buffer_iterator and buffer_value.

Now, false is passed for the retain parameter to buffer's
constructor so that the buffer's reference count is not
incremented. Furthermore, the classes now set the buffer to
null before being destructed so that they will not decrement its
reference count (which normally occurs in buffer's destructor).

The main effect of this change is that objects which refer to a
memory buffer but do not own it (e.g. device_ptr, buffer_iterator)
will not modify the reference count for the buffer. This fixes a
number of memory leaks which occurred in longer running programs.
2013-05-13 22:27:02 -04:00
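The ownership rule described in a2bda0610d can be modeled with a fake reference-counted object. FakeMem, Buffer and BufferView are hypothetical types for this sketch; retain() and release() stand in for clRetainMemObject() and clReleaseMemObject().

```cpp
#include <cassert>

// Fake memory object; starts with one reference held by its allocator.
struct FakeMem {
    int refs = 1;
};

inline void retain(FakeMem *mem)  { ++mem->refs; }
inline void release(FakeMem *mem) { --mem->refs; }

// Owning wrapper: optionally retains on construction, releases on destruction.
class Buffer {
public:
    explicit Buffer(FakeMem *mem, bool retain_mem = true) : mem_(mem)
    {
        if(mem_ && retain_mem){
            retain(mem_);
        }
    }
    ~Buffer()
    {
        if(mem_){
            release(mem_);
        }
    }
    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    FakeMem *&handle() { return mem_; }

private:
    FakeMem *mem_;
};

// Non-owning view: passes false for retain and clears the handle before
// its Buffer member is destroyed, so the count is never touched.
class BufferView {
public:
    explicit BufferView(FakeMem *mem) : buffer_(mem, false) {}
    ~BufferView() { buffer_.handle() = nullptr; }

private:
    Buffer buffer_;
};
```

The view's destructor body runs before its member's destructor, which is why clearing the handle there is enough to skip the release.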
Kyle Lutz
a5ddeae614 Add scalar<T> container
This adds a new scalar<T> "container" which stores a single
value in a memory buffer. This simplifies memory handling in
algorithms which read and write a single value.
2013-05-11 20:20:27 -04:00
Kyle Lutz
130f8c30f1 Rename kernel::num_args() method to arity()
This renames the kernel::num_args() method to arity().
2013-05-11 20:15:00 -04:00
Kyle Lutz
ffec5fd34a Remove unnecessary includes from transform_reduce
This removes a couple of unnecessary includes from the
transform_reduce.hpp header file.
2013-05-11 20:10:28 -04:00
Kyle Lutz
178676df4f Refactor the system::default_device() method
This refactors the system::default_device() method. Now, the
default compute device for the system is only found once and
stored in a static variable. This eliminates many redundant
calls to clGetPlatformIDs() and clGetDeviceIDs().

Also, the default_cpu_device() and default_gpu_device() methods
have been removed and their usages replaced with default_device().
2013-05-10 22:49:05 -04:00
Kyle Lutz
d40eddc56b Fix compilation error with get<N>() and tuple
This fixes a compilation error which occurred when using
the get<N>() function with tuple types.
2013-05-10 21:51:28 -04:00
Kyle Lutz
705b3f35a3 Fix narrowing conversion warnings in device
This fixes a couple of narrowing conversion warnings in the
device partitioning methods which were seen when compiling
VexCL with Boost.Compute in C++11 mode.
2013-05-09 22:04:00 -04:00
Kyle Lutz
9a64f6b39a Add get<N>() function
This adds a get<N>() function which returns the n'th element
of an aggregate type (e.g. vector type, pair, tuple).

This unifies the functionality of, and replaces, the get_pair()
and vector_component() functions.
2013-05-05 12:46:05 -04:00
Kyle Lutz
3e840fa306 Add transform_if() algorithm
This adds a new algorithm named transform_if() which applies
a given unary function to an input value only if it passes a
separate predicate function.
2013-05-05 11:51:21 -04:00
Kyle Lutz
49a34442e5 Remove unused histogram() algorithm
This removes the unused histogram() algorithm.
2013-05-05 10:56:14 -04:00
Dominic Meiser
7c5e321c2a Fixing build issues under windows 2013-05-03 18:37:09 -04:00
Kyle Lutz
3e93d01475 Add default constructors to image2d and image3d
This adds default constructors to the image2d and image3d
classes which initialize them with null memory objects.
2013-05-02 21:01:30 -04:00
Kyle Lutz
5d28d3887e Make pick_copy_work_group_size() inline
This makes the pick_copy_work_group_size() function inline.
2013-05-02 20:55:22 -04:00
Kyle Lutz
0ab2fe85eb Don't auto-initialize values in vector
This changes the vector class to not auto-initialize values
when it is created or resized. This improves performance by
eliminating a call to fill(). If needed, user code can call
fill() explicitly on the newly allocated values.
2013-04-27 10:30:26 -04:00
Kyle Lutz
03195275b3 Increase work-group size for copy() kernel
This increases the work-group size for the copy() kernel to be
up to 32 items based on the size of the input. This increases the
performance of copy() and related algorithms (e.g. transform()).
2013-04-27 10:21:47 -04:00
Kyle Lutz
ea107ae5d6 Add clamp_range() algorithm
This adds a clamp_range() algorithm which clamps a range
of values between a low and high value. This is based on
the algorithm of the same name in Boost.Algorithm.
2013-04-22 22:06:04 -04:00
Kyle Lutz
8142e5d5f9 Add move-constructors to wrapper classes
This adds move-constructors and move-assignment operators
to the OpenCL wrapper classes.
2013-04-17 20:45:04 -04:00
Kyle Lutz
4bdec761cd Add memory_object::reference_count() method
This adds a reference_count() method to the memory_object
class which returns its current reference count.
2013-04-13 11:07:04 -04:00
Kyle Lutz
d58b7c0902 Return event from command_queue::enqueue_task()
This changes the command_queue::enqueue_task() method to return
an event object.
2013-04-13 10:23:29 -04:00
Kyle Lutz
da4cb81679 Return event from command_queue::enqueue_nd_range_kernel()
This changes the enqueue_nd_range_kernel() method to return an
event object. This allows clients to monitor the progress of a
kernel executing on a device.
2013-04-13 10:23:01 -04:00
Kyle Lutz
001b3ff7fe Add get() methods to wrapper classes
This adds a get() method to each wrapper class which returns
a reference to the underlying OpenCL object.
2013-04-13 09:44:51 -04:00
Denis Demidov
8b78d4187d Adds support for selecting devices with environment variables
boost::compute::system::default_device() supports the following
environment variables:

BOOST_COMPUTE_DEFAULT_DEVICE   for device name
BOOST_COMPUTE_DEFAULT_PLATFORM for OpenCL platform name
BOOST_COMPUTE_DEFAULT_VENDOR   for device vendor name

If one or more of these variables is set, then the device that satisfies
all conditions is selected. If no such device is available, then
the first available GPU is selected. If there are no GPUs in the
system, then the first available CPU is selected. Otherwise,
default_device() returns a null device.

The hello_world example is modified to use default_device() instead
of default_gpu_device().
2013-04-12 17:22:25 -04:00
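The selection order spelled out in 8b78d4187d can be sketched as a pure function. DeviceInfo and select_device() are hypothetical, the filters are passed in as strings (a caller would read them with std::getenv), an empty string means "not set", and substring matching is an assumption of this sketch rather than something the commit states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of one device.
struct DeviceInfo {
    std::string name;
    std::string platform;
    std::string vendor;
    bool is_gpu;
    bool is_cpu;
};

// An empty filter matches everything (assumption: substring match).
inline bool matches(const std::string &value, const std::string &filter)
{
    return filter.empty() || value.find(filter) != std::string::npos;
}

// Returns the selected device, or a null pointer if the list has
// neither a matching device nor any GPU or CPU.
inline const DeviceInfo *select_device(const std::vector<DeviceInfo> &devices,
                                       const std::string &name,
                                       const std::string &platform,
                                       const std::string &vendor)
{
    const bool any_filter =
        !name.empty() || !platform.empty() || !vendor.empty();

    // 1. A device that satisfies every filter that is set.
    if(any_filter){
        for(std::size_t i = 0; i < devices.size(); ++i){
            if(matches(devices[i].name, name) &&
               matches(devices[i].platform, platform) &&
               matches(devices[i].vendor, vendor)){
                return &devices[i];
            }
        }
    }

    // 2. Otherwise the first GPU.
    for(std::size_t i = 0; i < devices.size(); ++i){
        if(devices[i].is_gpu) return &devices[i];
    }

    // 3. Otherwise the first CPU.
    for(std::size_t i = 0; i < devices.size(); ++i){
        if(devices[i].is_cpu) return &devices[i];
    }

    return nullptr;
}
```

Keeping the environment lookup outside the function makes the fallback order easy to test on its own.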
Kyle Lutz
1be19a6305 Add multiplies<T> specialization for std::complex<T>
This adds a specialization of multiplies<T> for std::complex<T>
which implements complex number multiplication.

Also adds a simple test using transform() to verify the complex
multiplication works correctly.
2013-04-10 22:04:04 -04:00
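Complex multiplication, which 1be19a6305 implements for multiplies<T>, follows (a + bi)(c + di) = (ac - bd) + (ad + bc)i. A host-side sketch with a two-component struct, where the type name float2_like and the function name are hypothetical:

```cpp
#include <cassert>

// Two-component value: x is the real part, y is the imaginary part.
struct float2_like {
    float x;
    float y;
};

// (a.x + a.y i) * (b.x + b.y i)
inline float2_like complex_multiply(float2_like a, float2_like b)
{
    float2_like result;
    result.x = a.x * b.x - a.y * b.y;  // real part: ac - bd
    result.y = a.x * b.y + a.y * b.x;  // imaginary part: ad + bc
    return result;
}
```

A component-wise product would instead give (ac, bd), which is why a dedicated specialization is needed for complex values.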
Kyle Lutz
8d13920dc4 Move swizzle_iterator to detail namespace
This moves the swizzle_iterator class to the detail
namespace.
2013-04-10 21:51:24 -04:00
Kyle Lutz
bcc3aed40f Move pixel_input_iterator to detail namespace
This moves the pixel_input_iterator class to the detail
namespace.
2013-04-10 21:38:05 -04:00
Kyle Lutz
5cce555d8c Move binary_transform_iterator to detail namespace
This moves the binary_transform_iterator class to the
detail namespace.
2013-04-10 21:33:29 -04:00
Kyle Lutz
e30ec9f26c Move adjacent_transform_iterator to detail namespace
This moves the adjacent_transform_iterator class to the
detail namespace.
2013-04-10 21:24:15 -04:00
Kyle Lutz
6dd6e11c7d Fix unused variable warning in get_base_iterator_buffer()
This fixes an unused variable warning which occurs in the
get_base_iterator_buffer() function when the base iterator
is not a buffer iterator and thus the iter argument is not
used.
2013-04-10 21:09:17 -04:00
Kyle Lutz
6fdffd8a2b Replace usages of result_of() with tr1_result_of()
This fixes a bug in which boost::result_of() would return the
wrong result type for a function due to the new implementation
using decltype instead of the result_of protocol on compilers
that sufficiently support C++11 (such as clang >= 3.2).

Now, boost::tr1_result_of() is used to explicitly request that
the result_of protocol be used even when decltype is supported
by the compiler.
2013-04-10 20:17:34 -04:00
Kyle Lutz
652f99e449 Fix bug in get_buffer() for iterator adaptors
This fixes a bug in which the get_buffer() method was not properly
disabled for iterator adaptors with a non-buffer base iterator.
2013-04-09 21:56:24 -04:00
Kyle Lutz
5164ab4bd0 Cleanup constructors for wrapper classes
This cleans up the constructor methods for the OpenCL wrapper
classes and unifies the API used for creating a wrapper class
object from the underlying OpenCL objects.

Now, every wrapper class has a constructor taking the OpenCL
object and an optional boolean retain parameter which indicates
whether the constructor should increment the reference count.
2013-04-07 15:03:24 -04:00
Kyle Lutz
25a084deda Fix indentation in kernel::get_arg_info()
This fixes the indentation in the kernel::get_arg_info()
method.
2013-04-07 12:57:26 -04:00
Kyle Lutz
48e1bb4da0 Update image2d/3d constructors for OpenCL 1.2
This updates the constructors for the image2d and image3d
classes to use the new clCreateImage() function instead of
the deprecated clCreateImage2D/3D() functions.
2013-03-31 15:01:30 -04:00
Kyle Lutz
d56e58b48e Add OpenCL 1.2 error codes to runtime_exception
This adds support for the OpenCL 1.2 error codes to the
runtime_exception class.
2013-03-31 14:58:00 -04:00
Kyle Lutz
0aa3d024dc Fix command_queue::enqueue_marker() for OpenCL 1.2
This changes the enqueue_marker() method in the command_queue
class to use clEnqueueMarkerWithWaitList() instead of the
deprecated clEnqueueMarker() function when compiling with
OpenCL 1.2.
2013-03-31 12:08:23 -04:00
Kyle Lutz
7e7e09b704 Fix command_queue::enqueue_barrier() for OpenCL 1.2
This changes the enqueue_barrier() method in the command_queue
class to use clEnqueueBarrierWithWaitList() instead of the
deprecated clEnqueueBarrier() function when compiling with
OpenCL 1.2.
2013-03-31 12:03:38 -04:00
Kyle Lutz
52fef4de6b Remove command_queue::enqueue_wait_for_event() method
This removes the enqueue_wait_for_event() method from the
command_queue class as the clEnqueueWaitForEvents() function
has been deprecated in OpenCL 1.2.
2013-03-31 11:59:14 -04:00
Kyle Lutz
c7a3bc8af6 Move unload_compiler() method to platform
This moves the unload_compiler() method from the system class
to the platform class. Also changes the method to use the
clUnloadPlatformCompiler() function instead of the deprecated
clUnloadCompiler() when compiling with OpenCL 1.2.
2013-03-31 11:29:40 -04:00
Kyle Lutz
d28354184c Move get_extension_function_address() method to platform
This moves the get_extension_function_address() method from
the system class to the platform class. Also changes the method
to use the clGetExtensionFunctionAddressForPlatform() function
instead of the deprecated clGetExtensionFunctionAddress() when
compiling with OpenCL 1.2.
2013-03-31 11:26:18 -04:00
Kyle Lutz
00fcb737cc Fix bug in move-constructor for vector<T>
This fixes a bug in the move-constructor for the vector<T>
class.

Previously, the moved-from object was also deallocating the
memory buffer leading to an error when the moved-to object
attempted to use it. Now, the constructor checks if the buffer
is non-empty before deallocating it.
2013-03-30 19:55:51 -04:00
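The ownership rule described in this fix can be illustrated with a minimal host-only sketch. It is not the Boost.Compute implementation: `toy_vector` is a hypothetical class, plain heap memory stands in for the OpenCL buffer, and C++11 move semantics are assumed.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a device vector; heap memory replaces the
// OpenCL memory buffer purely for illustration.
class toy_vector {
public:
    explicit toy_vector(std::size_t n) : data_(new int[n]()), size_(n) {}

    // Move constructor: take over the buffer and leave the source empty.
    toy_vector(toy_vector&& other) noexcept
        : data_(other.data_), size_(other.size_) {
        other.data_ = nullptr;
        other.size_ = 0;
    }

    toy_vector(const toy_vector&) = delete;
    toy_vector& operator=(const toy_vector&) = delete;

    ~toy_vector() {
        // Release the buffer only if this object still owns one, so a
        // moved-from object does not free memory used by the moved-to object.
        if (data_ != nullptr) {
            delete[] data_;
        }
    }

    std::size_t size() const { return size_; }
    bool empty() const { return data_ == nullptr; }
    int& operator[](std::size_t i) { return data_[i]; }

private:
    int* data_;
    std::size_t size_;
};
```

After `toy_vector b(std::move(a));` the object `a` reports itself as empty and its destructor leaves the buffer alone, while `b` keeps using the original storage.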
Kyle Lutz
1161f89031 Make get_object_info() inline
This marks the get_object_info() method as inline.
2013-03-30 19:38:40 -04:00
Kyle Lutz
d585fbebad Make stream operator for vector types inline
This marks the stream operator for vector types as
inline.
2013-03-30 19:35:56 -04:00
Kyle Lutz
da1d7794b5 Remove support for cl_half
This removes support for cl_half (typedef'd to half_).

The issue is that the cl_half type is indistinguishable
from the cl_ushort type (both are typedefs for uint16_t).
This caused the cl_khr_fp16 pragma to be injected into
kernels using cl_ushort, leading to errors on platforms
that do not support the cl_khr_fp16 extension.
2013-03-27 00:09:51 -04:00
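Why two typedefs of the same underlying type cannot be told apart can be shown with a small sketch. The names `toy_half`, `toy_ushort` and `needs_fp16_pragma` are hypothetical and only mirror the situation described in the commit message. They are not the library's own traits.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical aliases: both name the very same type, as the commit
// message describes for cl_half and cl_ushort.
typedef std::uint16_t toy_half;
typedef std::uint16_t toy_ushort;

// Hypothetical trait deciding whether a kernel needs the fp16 pragma.
template<class T>
struct needs_fp16_pragma : std::false_type {};

// A specialization intended for the half type only...
template<>
struct needs_fp16_pragma<toy_half> : std::true_type {};

// ...necessarily matches toy_ushort too, because a typedef introduces
// a new name and not a new type.
static_assert(std::is_same<toy_half, toy_ushort>::value,
              "both aliases denote the same type");
```

Any dispatch keyed on the type alone therefore treats unsigned 16-bit integers as half-precision values, which matches the failure mode the commit describes.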
Kyle Lutz
4338b311f7 Add device::partition() method
This adds a new set of methods to the device class allowing
device objects to be partitioned into multiple sub-devices
using the clCreateSubDevices() function.

For now, device partitioning is only supported on systems
with OpenCL version 1.2 (or later).
2013-03-26 23:36:54 -04:00
Kyle Lutz
4752fb2404 Support returning std::vector<T> from get_info<T>()
This adds support for returning a std::vector<T> from the
various get_info<T>() methods. This provides a simpler
interface to get the values in an array returned from one
of the clGet*Info() functions.

This also adds a test using the new API to get the maximum
work item sizes in each dimension for a device.
2013-03-26 22:44:56 -04:00
Denis Demidov
2df059738a Option to not retain program on creation 2013-03-25 20:59:50 -04:00
Kyle Lutz
ab0575dd0c Fix bug in remove_if() algorithm
This fixes a bug in which the remove_if() function would overwrite
parts of the input before they were properly copied to the output
range. This is fixed by first copying the input values to a temporary
vector and then passing that as the input range to copy_if().
2013-03-21 23:23:39 -04:00
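The copy-then-filter approach described in this fix can be sketched on the host with the standard library. This is an illustration under assumptions, not the OpenCL implementation: `remove_if_via_copy` is a hypothetical name and `std::vector` stands in for device memory.

```cpp
#include <algorithm>
#include <vector>

// Host-side sketch: removes the elements that satisfy pred and returns an
// iterator to the new logical end of the range.
template<class T, class Predicate>
typename std::vector<T>::iterator
remove_if_via_copy(std::vector<T>& values, Predicate pred)
{
    // Step 1: snapshot the input so later writes cannot clobber values
    // that have not been read yet.
    const std::vector<T> tmp(values.begin(), values.end());

    // Step 2: copy the elements that do NOT satisfy pred back into the
    // original storage, reading only from the snapshot.
    return std::copy_if(tmp.begin(), tmp.end(), values.begin(),
                        [&pred](const T& x) { return !pred(x); });
}
```

The temporary costs one extra copy of the input. On a parallel device, where work items may read and write in any order, that copy is what keeps the reads independent of the writes.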
Kyle Lutz
147baa05fe Fix bug in count_if() and find_if() with Intel CPUs
This fixes a bug in which the Intel OpenCL compiler would
fail to compile the count_if() and find_if() kernels for
vector types with the following error:

error: no matching function for call to 'all'
note: candidate function not viable: 1st argument ('__global int4')
is in address space 16776960, but parameter must be in address space 0

This is caused when the predicate compares a value from the input
buffer (in the global memory space) to a literal value (in the
private memory space).

This is fixed by first reading the value into a local variable in
the private memory space and then calling the predicate function.
2013-03-21 22:46:01 -04:00
Kyle Lutz
25b71579a5 Fix bug when calling fill() with vector types
This fixes a bug in which the fill() algorithm was called by
scan_impl() with an integer zero rather than zero of the value
type which caused issues when using scan() with vector values.
2013-03-20 18:10:48 -04:00
Denis Demidov
6766b07fad Allow buffer and queue initialization from low-level types 2013-03-19 17:18:43 +04:00
Kyle Lutz
2d81f561c4 Add zip_iterator class
This adds a zip_iterator class which allows for one or more
iterators to be combined into a single iterator object.
2013-03-17 23:42:56 -04:00
Kyle Lutz
ada2351812 Add support for boost::tuple<>
This adds support for using boost::tuple<> types with the
Boost.Compute containers and algorithms.
2013-03-17 23:28:07 -04:00
Kyle Lutz
ad9309740e Add meta_kernel::inject_type<Type>() method
This adds a new method which allows for type definitions and
type pragmas to be added to a meta_kernel.

This provides a more general interface and replaces
the previously used add_pair_type() method along with the special
case handling of half and double types.
2013-03-17 23:20:19 -04:00
Kyle Lutz
2f81872403 Pass binary_status argument to clCreateProgramWithBinary()
This fixes a bug in which certain platforms would return
CL_INVALID_VALUE from clCreateProgramWithBinary() if the
binary_status argument was not provided.
2013-03-17 22:20:32 -04:00
Kyle Lutz
233df55978 Fix bug in program::binary()
This fixes a bug in the program::binary() method in which
the return size was not being passed to clGetProgramInfo().
2013-03-17 22:17:28 -04:00
Kyle Lutz
32a1926f6b Fix local array size in serial_insertion_sort()
This fixes the local array allocation size for the
serial_insertion_sort() function.
2013-03-14 21:27:00 -04:00
Kyle Lutz
77e75bd4cc Remove unused variable in serial_insertion_sort()
This removes the unused 'op' variable from the
serial_insertion_sort() function.
2013-03-14 21:22:40 -04:00
Kyle Lutz
ff204e1b61 Add asserts for null host pointers to command_queue
This adds assert()'s to the read and write methods in the
command_queue class to check for null host_ptr's.
2013-03-12 22:36:36 -04:00
Kyle Lutz
35984ae412 Remove default type_name_trait::value() implementation
This removes the default type_name_trait::value() function
implementation.

Previously, the default implementation would return a null
pointer leading to run-time errors if a type name was not
provided. Now, a compile-time error will occur if type_name()
is called for an unknown type.
2013-03-12 20:29:44 -04:00
Kyle Lutz
71df0d5fa6 Add type_name() specialization for char
This adds a type_name() specialization for char, which is distinct
from the built-in cl_char type and thus was not covered before.
2013-03-12 20:23:33 -04:00
Kyle Lutz
418468cc4b Add lambda wrapper for length() function
This adds a lambda wrapper for the length() function which
allows it to be used in lambda expressions.
2013-03-10 20:16:14 -04:00
Kyle Lutz
69aef15cab Add merge() algorithm
This implements the merge() algorithm which merges two
ranges of sorted values into a single sorted range.

The current implementation uses a simple serial merge
algorithm. A GPU optimized version is coming soon.
2013-03-10 20:10:58 -04:00
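A simple serial merge of the kind mentioned here can be sketched on the host as follows. `serial_merge` is a hypothetical helper operating on `std::vector`, not the kernel used by the library.

```cpp
#include <cstddef>
#include <vector>

// Merges two sorted ranges into a single sorted range with one sequential pass.
template<class T>
std::vector<T> serial_merge(const std::vector<T>& a, const std::vector<T>& b)
{
    std::vector<T> out;
    out.reserve(a.size() + b.size());

    std::size_t i = 0;
    std::size_t j = 0;
    while (i < a.size() && j < b.size()) {
        // Take from b only when it is strictly smaller, so equal elements
        // from a keep their position ahead of those from b.
        if (b[j] < a[i]) {
            out.push_back(b[j++]);
        } else {
            out.push_back(a[i++]);
        }
    }

    // Append whatever remains of either input.
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}
```

Each step depends on the previous comparison, so the loop runs in linear time but does not parallelize as written. That is consistent with the commit's note that a GPU-optimized version was planned separately.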
Kyle Lutz
d34cdaac59 Initial commit 2013-03-02 15:14:17 -05:00