townforge/zstd - zstd - Townforge git

Author	SHA1	Message	Date
Stephen Kitt	e81d567547	Distinguish static symbols, allow hiding them Even with -fvisibility=hidden added to CFLAGS, any symbol which is given a default visibility attribute ends up exported in the dynamic library. This happens through zstd_internal.h which defines ..._STATIC_LINKING_ONLY before including various header files, and is included for example in lib/common/pool.c. To avoid this, this patch distinguishes static and non-static APIs, by using ZSTDLIB_API only for the latter, and introducing ZSTDLIB_STATIC_API for the former. For now, both are exported, but static APIs can be hidden by overriding the definition of ZSTDLIB_STATIC_API. lib/Makefile is modified to allow this using make CPPFLAGS_DYNLIB=-DZSTDLIB_STATIC_API=ZSTDLIB_HIDDEN In addition, API declarations are dropped from zstd_compress.c (they aren't needed there). Signed-off-by: Stephen Kitt <steve@sk2.org>	2021-05-14 19:41:59 +02:00
Nick Terrell	03c4111299	[lib] Fix dictionary invalidation logic Call `ZSTD_enforceMaxDist()` before each block with the beginning of the block. This ensures that `lowLimit` is updated to `dictLimit` whenever the ext-dict is out of range, so we can use prefix mode for speed. This can cause non-determinism because prefix mode and ext-dict mode match finders can return different results. It can also hurt speed because ext-dict match finders are slower. The scenario is: 1. Compress large data with a dictionary. 2. The dictionary goes out of bounds, so we invalidate it. 3. However, we still have `lowLimit < dictLimit`, since it is never updated. 4. We will call the ext-dict match finder instead of the prefix one.	2021-05-13 17:05:59 -07:00
Nick Terrell	10b35b312b	[lib] Fix off-by-one error in repcode checks The repcode checks disallowed repcodes that are equal to `windowLow`. This is slightly inefficient, but isn't a problem on its own. Together with the next commit, it causes non-determinism.	2021-05-13 17:05:59 -07:00
Nick Terrell	91c9a247b6	[lib] Fix determinism bug in the optimal parser `ZSTD_insertBt1()` has a speed optimization that skips the prefix of very long matches. `40def70387/lib/compress/zstd_opt.c (L476)` This optimization is based on the length of the longest match found. However, when indices are reset, we only ensure that we can reference the whole window starting from `ip`. If the previous block ended with a long match then `nextToUpdate` could be much less than `ip`. It might be far enough back that `nextToUpdate < maxDist`, so it doesn't have a full window of data to reference. This can cause non-determinism bugs, because we may find a match that is beyond `ip - maxDist`, and may sometimes be un-referencable, and that match triggers the speed optimization. The fix is to base the `windowLow` off of the `target` of `ZSTD_updateTree_internal()`, because anything below that value will be obsolete by the time `ZSTD_updateTree_internal()` completes.	2021-05-13 17:05:59 -07:00
Yann Collet	cb0cad9b79	reduce Max nb Workers to 64 in 32-bit mode and restored limit to 256 when in 64-bit mode (it was reduced to 200 to give more room for 32-bit). This should fix test instability issues when using a lot of threads in 32-bit environments.	2021-05-12 13:10:25 -07:00
Nick Terrell	66772efe73	Merge pull request #2627 from terrelln/timeout-fix [lib] Fix fuzzer timeouts by backing off overflow correction	2021-05-07 10:55:26 -07:00
sen	9e94b7cac5	Assert no division by 0, correct superblocks 0 sequences case (#2592)	2021-05-07 13:26:56 -04:00
Nick Terrell	c2555f8c6f	[lib] Fix fuzzer timeouts by backing off overflow correction Linearly back off the frequency of overflow correction based on the number of times the `ZSTD_window_t` has been overflow corrected. This will still allow the fuzzer to quickly find overflow correction bugs, while also keeping good speed for larger inputs. Additionally, the `nbOverflowCorrections` variable can be useful for debugging coredumps, since we can inspect the `ZSTD_CCtx` to see if overflow correction has happened yet. I've verified this fixes the timeouts in OSS-Fuzz (176 seconds -> 6 seconds). I've also verified that fuzzers and `fuzzer` and `zstreamtest` still catch the row-hash overflow correction bug.	2021-05-06 22:03:41 -07:00
sen	698f261b35	[1.5.0] Deprecate some functions (#2582) * Add deprecated macro to zstd.h, mark certain functions as deprecated * Remove ZSTD_compress.c dependencies on deprecated functions	2021-05-06 17:59:32 -04:00
Nick Terrell	207e33bb61	Merge pull request #2616 from terrelln/deterministic-dict [lib] Add ZSTD_c_deterministicRefPrefix	2021-05-06 11:09:22 -07:00
Nick Terrell	172b4b6ac4	[lib] Add ZSTD_c_deterministicRefPrefix This flag forces zstd to always load the prefix in ext-dict mode, even if it happens to be contiguous, to force determinism. It also applies to dictionaries that are re-processed. A determinism test case is also added, which fails without `ZSTD_c_deterministicRefPrefix` and passes with it set. Question: Should this be the default behavior? It isn't in this PR.	2021-05-05 18:49:56 -07:00
Nick Terrell	eb7e74ccb7	[tests] Set `DEBUGLEVEL=2` by default This allows us to quickly check for compile errors in debug log messages, which are compiled out when `DEBUGLEVEL < 2`.	2021-05-05 13:29:06 -07:00
Nick Terrell	c2183d7cdf	[lib] Move some ZSTD_CCtx_params off the stack * Take `params` by const reference in `ZSTD_resetCCtx_internal()`. * Add `simpleApiParams` to the CCtx and use them in the simple API functions, instead of creating those parameters on the stack. I think this is a good direction to move in, because we shouldn't need to worry about adding parameters to `ZSTD_CCtx_params`, since it should always be on the heap (unless they become absolutely gigantic). Some `ZSTD_CCtx_params` are still on the stack in the CDict functions, but I've left them for now, because it was a little more complex, and we don't use those functions in stack-constrained environments currently.	2021-05-05 13:25:16 -07:00
Yann Collet	c077f257b4	Merge pull request #2611 from facebook/smallerJobs allow jobSize to be as low as 512 KB	2021-05-05 00:03:29 -07:00
Nick Terrell	8389a5122b	Merge pull request #2602 from terrelln/ldm-opt [LDM] Speed optimization on repetitive data	2021-05-04 23:13:09 -07:00
Nick Terrell	d40f55cd95	Merge pull request #2610 from senhuang42/lazy_underflow_fix Fix bad integer wraparound in repcode index for fast, dfast, lazy	2021-05-04 23:10:23 -07:00
Nick Terrell	0b88c2582c	[test] Add large dict/data --patch-from test Dictionary size must be > `ZSTD_CHUNKSIZE_MAX`.	2021-05-04 17:31:32 -07:00
Sen Huang	e6c8a5dd40	Fix incorrect usages of repIndex across all strategies	2021-05-04 19:50:55 -04:00
Nick Terrell	94db4398a0	[lib] Always load the dictionary in one go Dictionaries larger than `ZSTD_CHUNKSIZE_MAX` used to have to be loaded in multiple segments. Instead, when we detect large dictionaries, ensure that we reset the context's indices. Then, for dictionaries larger than `ZSTD_CURRENT_MAX - 1`, only load the suffix of the dictionary. Finally, enable DDS for large dictionaries, since we no longer load in multiple segments. This simplifies the dictionary loading code, and reduces opportunities for non-determinism to slip in.	2021-05-04 16:45:25 -07:00
Yann Collet	1026b9fa10	fix rsyncable mode	2021-05-04 15:59:27 -07:00
Nick Terrell	1ffa80a09e	[easy] Rewrite rowHashLog computation `ZSTD_highbit32(1u << x) == x` when it isn't undefined behavior.	2021-05-04 11:43:20 -07:00
Yann Collet	8f86c29c06	allow jobSize to be as low as 512 KB previous lower limit was 1 MB. Note : by default, the lowest job size is 2 MB, achieved at level 1. Even lower job sizes can be achieved by manipulating this value directly, or manually modifying window sizes to lower amounts. Updated unit test to ensure that this new limit works fine (test would fail with previous 1 MB limit).	2021-05-04 11:02:55 -07:00
Nick Terrell	32823bc150	[LDM] Speed optimization on repetitive data LDM does especially poorly on repetitive data when that data's hash happens to have `(hash & stopMask) == 0`. Either because the `stopMask == 0` or random chance. Optimize this case by skipping over repetitive patterns. The detection is very simplistic, but should catch most of the offending cases. ``` head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long 21.187881087 seconds time elapsed head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long 1.149707921 seconds time elapsed ```	2021-05-04 10:57:42 -07:00
Nick Terrell	34aff7ea06	Bug fix & run overflow correction much more frequently in tests * Fix overflow correction when `windowLog < cycleLog`. Previously, we got the correction wrong in this case, and our chain tables and binary trees would be corrupted. Now, we work as long as `maxDist` is a power of two, by adding `MAX(maxDist, cycleSize)` to our indices. * When `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` is defined to non-zero run overflow correction as frequently as allowed without impacting compression ratio. * Enable `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` in `fuzzer` and `zstreamtest` as well as all the OSS-Fuzz fuzzers. This has a 5-10% speed penalty at most, which seems reasonable.	2021-05-03 15:21:47 -07:00
senhuang42	61fe571af6	Fix chaintable check to include rowhash in ZSTD_reduceIndex()	2021-04-30 19:52:04 -04:00
Nick Terrell	6cee3c2c4f	[trace] Remove default definitions of weak symbols Instead of providing a default no-op implementation, check the symbols for `NULL` before accessing them. Providing a default implementation doesn't reliably work with dynamic linking. Depending on link order the default implementations may not be overridden. By skipping the default implementation, all link order issues are resolved. If the symbols aren't provided the weak function will be `NULL`.	2021-04-26 16:05:39 -07:00
felixhandte	efa6dfa729	Apply DDS adjustments to avoid assert failures	2021-04-23 16:41:00 -04:00
sen	12c045f74d	Merge pull request #2574 from senhuang42/repcode_mismatch_detector_fix Correct the block splitter mismatched repcodes detection.	2021-04-12 23:27:43 -04:00
Sen Huang	8844f93957	Adjust nb elements to prefetch in ZSTD_row_fillHashCache()	2021-04-12 14:24:58 -04:00
Sen Huang	550f76f131	Correct the detection of mismatched repcodes	2021-04-09 09:08:51 -07:00
Sen Huang	4d63d6e8aa	Update results.csv, add Row hash to regression test	2021-04-07 10:31:41 -07:00
Nick Terrell	4694423c4f	Add and integrate lazy row hash strategy	2021-04-07 09:53:34 -07:00
sen	f71aabb5b5	Move clevel override to after initLocalDict() (#2571)	2021-04-06 21:05:37 -04:00
sen	f1e8b565c2	Maintain two repcode histories for block splitting, replace invalid repcodes (#2569)	2021-04-06 17:25:55 -04:00
sen	e38124555e	Fix dictionary force reloading clevel selection (#2570) * Move cdict clevel override to before localdict init * Update results.csv after dict load changes	2021-04-06 15:35:09 -04:00
Nick Terrell	8383fc828d	Merge pull request #2541 from ihsinme/patch-1 simple fix for using bit operator.	2021-04-02 13:01:09 -07:00
sen	980f3bbf83	[cwksp] Align all allocated "tables" and "aligneds" to 64 bytes (#2546) * Perform 64-byte alignment of wksp tables and aligneds internally * Clean up cwskp_finalize() function to only do two allocs * Refactor aligned/buffer reservation code, remove ASAN req for alignment reservations * Change from allocating 128 bytes always to allocating only buffer space as needed for tables/aligned * Back out aligned/table reservation order restriction * Add stricter bounds for new/resized wksps, fix comment in zstd_cwksp.h	2021-04-01 20:07:19 -04:00
sen	255925c231	Fix repcode-related OSS-fuzz issues in block splitter (#2560) * Do not emit last partitions of blocks as RLE/uncompressed * Fix repcode updates within block splitter * Add an entropytables confirm function, redo ZSTD_confirmRepcodesAndEntropyTables() for better function signature * Add a repcode updater to block splitter, no longer need to force emit compressed blocks	2021-03-31 15:14:59 -04:00
Nick Terrell	a494308ae9	[copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files * Switch to yearless copyright per FB policy * Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources * Add zstd copyright/license header to the `contrib/linux-kernel` sources * Update the `tests/test-license.py` to check for yearless copyright * Improvements to `tests/test-license.py` * Check `contrib/linux-kernel` in `tests/test-license.py`	2021-03-30 10:30:43 -07:00
sen	84ccb81e7c	Merge pull request #2561 from senhuang42/longlength_enum Add enum for representing long length ID	2021-03-26 15:55:12 -04:00
Sen Huang	b1a43455f8	Add enum for representing long length ID	2021-03-26 10:41:09 -07:00
sen	4fe2e7ae14	Merge pull request #2558 from senhuang42/msan_block_splitter_fix Fix block splitter minor MSAN warning.	2021-03-25 13:51:43 -04:00
sen	b0407b9f0e	Merge pull request #2555 from senhuang42/default_clevel_func Add ZSTD_defaultCLevel() function to public API	2021-03-25 13:07:28 -04:00
Sen Huang	2a907bf4aa	Move lastCountSize into a returned struct, fix MSAN error	2021-03-25 09:11:15 -07:00
Sen Huang	e398744a35	Add ZSTD_defaultCLevel() function to public API	2021-03-25 08:04:00 -07:00
Nick Terrell	f8ac0ea7ef	Merge pull request #2539 from terrelln/linux-kernel-fixes Fixes for the next linux kernel patch version	2021-03-24 10:34:29 -07:00
sen	bf542c8a8d	Merge pull request #2447 from senhuang42/block_splitter_v2 Recursive block splitting	2021-03-24 12:27:22 -04:00
Sen Huang	5b566ebe08	Rename compressSequences() functions for clarity	2021-03-24 08:21:29 -07:00
Sen Huang	0ef1f935b7	Add a fallback in case the total blocksize of split blocks exceeds raw block size	2021-03-24 08:21:29 -07:00
Sen Huang	c90e81a692	Enable block splitter by default when applicable	2021-03-24 08:21:29 -07:00
Sen Huang	e34332834a	Clean up various functions, add debuglogging for estimate vs. actual sizes	2021-03-24 08:21:29 -07:00
Sen Huang	41c3eae6d9	Fix various fuzzer failures: repcode history, superblocks	2021-03-24 08:21:29 -07:00
senhuang42	0633bf17c3	Change 1.3.4 bugfix to be cross-compatible with superblocks and normal compression	2021-03-24 08:21:29 -07:00
senhuang42	eb1ee8686d	Refactor buildSequencesStatistics() to avoid pointer increment for superblocks	2021-03-24 08:21:29 -07:00
senhuang42	e2bb215117	Add unit tests and fuzzer param	2021-03-24 08:21:09 -07:00
senhuang42	de52de1347	Add recursive block split algorithm	2021-03-24 08:21:09 -07:00
senhuang42	f06f6626ed	Update function names for consistency	2021-03-24 08:20:54 -07:00
senhuang42	c56d6e49e8	Add block splitter to experimental params	2021-03-24 08:20:54 -07:00
senhuang42	2949a95224	Refactor block compression logic into single function	2021-03-24 08:20:54 -07:00
senhuang42	c05c090cc2	Centralize entropy statistics calculations to zstd_compress.c	2021-03-24 08:20:29 -07:00
sen	c48889f097	Merge pull request #2538 from senhuang42/monotonicity_test Add memory monotonicity test over srcSize	2021-03-22 16:54:34 -04:00
Sen Huang	dff4a0e867	Make ZSTD_estimateCCtxSize_internal() loop through all srcSize parameter sets as well	2021-03-21 16:15:31 -07:00
ihsinme	a5bf09d764	simple fix for using bit operator. good day. It seems to me that the developer intended to use a logical operator. so I suggest a simple fix.	2021-03-17 11:37:42 +03:00
Sen Huang	77ae664ba6	Fix ZSTD_dedicatedDictSearch_isSupported() requirements	2021-03-16 17:36:05 -07:00
senhuang42	386111adec	Add a nbSeq argument to compressSequences() Refactor ZSTD_compressBlock_internal() to do the block header write within and add nbSeq argument to compressSequences()	2021-03-16 14:04:22 -07:00
senhuang42	98764493cf	Move block header write into compressBlock_internal()	2021-03-16 14:04:22 -07:00
Nick Terrell	cd1551d261	[lib][tracing] Add ZSTD_NO_TRACE macro When defined, it disables tracing, and avoids including the header.	2021-03-16 11:47:27 -07:00
Nick Terrell	b5fd348a85	Merge pull request #2523 from terrelln/huf-stack-reduction Add HUF_writeCTable_wksp() function	2021-03-05 12:35:09 -08:00
Nick Terrell	5df2a21f1e	Add HUF_writeCTable_wksp() function This saves ~700 bytes of stack space in HUF_writeCTable.	2021-03-05 10:29:18 -08:00
Nick Terrell	27498ff00f	Reduce stack usage of ZSTD_buildCTable() It is a stack high-point for some compression strategies and has an easy fix. This moves the normalized count into the entropy workspace.	2021-03-04 16:12:11 -08:00
Nick Terrell	7736549bea	[bug-fix] Make simple single-pass functions ignore advanced parameters The simple compression functions are intended to ignore the advanced parameters, but they were accidentally using them. All the `ZSTD_parameters` were set correctly, but any extra parameters were used as-is. E.g. `ZSTD_c_format`. This PR makes all the simple single-pass functions listed below ignore the advanced parameters, as intended. * `ZSTD_compressCCtx()` * `ZSTD_compress_usingDict()` * `ZSTD_compress_usingCDict()` * `ZSTD_compress_advanced()` * `ZSTD_compress_usingCDict_advanced()` It also adds a test case that ensures that each of these functions ignore the advanced parameters.	2021-02-12 19:11:23 -08:00
Nick Terrell	c62eb05964	[lib] Set appliedParams.compressionLevel correctly Forward the correct compressionLevel to the appliedParams in all cases. It was already correct for the advanced API, so only the old single-pass functions needed to be fixed. This compression level is unused by the library, but is set so that the tracing framework can consume it.	2021-02-12 15:00:14 -08:00
Nick Terrell	f520f6dfbe	[trace] Minor fixes found during integration * Mark `ZSTD_CCtx_getParameter()` as const * Add `extern "C"` guards to `zstd_trace.h`	2021-02-11 16:20:04 -08:00
Yann Collet	8884cb887d	Merge pull request #2483 from mpu/ldmgear New algorithms for the long distance matcher	2021-02-11 08:38:23 -08:00
Quentin Carbonneaux	552efcac2d	relocate large arrays from the stack to ldmState_t	2021-02-10 16:16:54 +01:00
Nick Terrell	e59c9459a5	[trace] Keep track of a uint64_t tracing context The most common information that you want to track between begin() and end() is the timestamp of the begin function, so you can measure the duration of the (de)compression call. Allow the tracing library to put this information inside the `ZSTD_TraceCtx`, so it doesn't need to keep a global map in this case. If a single uint64_t is not enough, the tracing library can return a unique identifier (like the context pointer) instead, and use it as a key in a map. This keeps the simple case simple.	2021-02-09 11:37:05 -08:00
Quentin Carbonneaux	e2ad174d73	fix some compiler warnings	2021-02-08 20:19:16 +01:00
Nick Terrell	54a4998a80	Add basic tracing functionality	2021-02-05 16:28:52 -08:00
Yann Collet	b9748757b0	fixed minor cast warning	2021-02-05 09:55:54 -08:00
Quentin Carbonneaux	874a590e5c	deal safely with short inputs in ZSTD_ldm_generateSequences The fuzzer CI found this bug.	2021-02-04 11:15:24 +01:00
Quentin Carbonneaux	9f327c02fd	new core ldm algorithm	2021-02-03 22:24:07 +01:00
Quentin Carbonneaux	aee3dc877f	fix a variable name to reflect its nature	2021-01-22 02:24:19 -08:00
Quentin Carbonneaux	d6e3de77dc	fix warning and remove one more occurrence of makeEntryAndInsertByTag	2021-01-20 01:39:16 -08:00
Quentin Carbonneaux	e0d5eca8fa	fix forgotten numTagBits in getTagMask	2021-01-20 00:54:20 -08:00
Quentin Carbonneaux	1e65711ca5	a couple performance improvement changes for ldm	2021-01-20 00:54:20 -08:00
Thomas Waldmann	92a2b5ccc9	fixup: lits means literals	2021-01-07 23:30:42 +01:00
Thomas Waldmann	f9802d80a0	fix typos (work done by Andrea Gelmini)	2021-01-07 18:47:23 +01:00
Nick Terrell	58476bcf7f	Don't shrink window log in ZSTD_getCParams() Treat ZSTD_getCParams() and ZSTD_adjustCParams() in the same way we treat streaming compression. Choose parameters based on the dictionary size + source size, and assume the source size is small if unknown. But, don't shrink the window log down in ZSTD_adjustCParams_internal().	2021-01-04 15:54:09 -08:00
Nick Terrell	9d31c704d5	Don't shrink window log when streaming with a dictionary Fixes #2442. 1. When creating a dictionary keep the same behavior as before. Assume the source size is 513 bytes when adjusting parameters. 2. When calling ZSTD_getCParams() or ZSTD_adjustCParams() keep the same behavior as before. 3. When attaching a dictionary keep the same behavior of ignoring the dictionary size. When streaming this will select the largest parameters and not adjust them down. But, the CDict will use the correctly sized parameters, which seems like the right tradeoff. 4. When not attaching a dictionary (either forced not to, or using a prefix dictionary) we select parameters based on the dictionary size + source size, and assume the source size is small, which is the same behavior as before. But, now we don't adjust the window log (and hash and chain log) down when the source size is unknown. When the source size is unknown all cdicts should attach, except when the user disables attaching, or `forceWindow` is used. This means that when streaming with a CDict we end up in the good case where we get small CDict parameters, and large source parameters. TODO: Add a streaming + dictionary regression test case.	2021-01-04 15:54:09 -08:00
Nick Terrell	66e811d782	[license] Update year to 2021	2021-01-04 17:53:52 -05:00
senhuang42	5c41490bfe	Use pre-defined constants	2020-12-21 11:52:05 -05:00
senhuang42	7e11bd012b	Implement skippable frame function	2020-12-21 11:13:22 -05:00
Yann Collet	a7cb4af573	added emphasis on the alignment condition of workspace and made it a programming mistake (`assert()`) rather than a runtime error.	2020-12-18 15:04:09 -08:00
Nick Terrell	ae85676d44	Fix alignment of scratchBuffer in HUF_compressWeights() The scratch buffer must be 4-byte aligned. This causes test failures in 32-bit systems, where the stack isn't aligned. Fixes Issue #2428.	2020-12-17 14:30:27 -08:00
Yann Collet	0b39531d75	moving all references to `release` branch was previously `master`	2020-12-16 23:00:35 -08:00
Yann Collet	b8c3a473ec	Merge pull request #2420 from terrelln/huf-comment [huf_compress] Refactor and comment HUF_buildCTable()	2020-12-14 16:14:07 -08:00
W. Felix Handte	9dab03db90	Create Enum to Represent Static/Dynamic Allocation Distinction in cwksp	2020-12-09 14:57:37 -05:00
W. Felix Handte	db9e73cb07	Don't ASAN-Poison Statically-Allocated Workspaces Addresses #2286.	2020-12-09 13:00:47 -05:00
Nick Terrell	1bbcf07bd5	[huf_compress] Refactor and comment HUF_buildCTable() Comment and refactor `HUF_buildCTable()` and the helper functions it calls as I read and understand the code. Hopefully this refactor makes the code a bit more clear.	2020-12-08 13:57:01 -08:00
Yann Collet	b86e3c9304	Merge pull request #2415 from facebook/fix_aliasing fix gcc-10 strict aliasing warnings	2020-12-04 21:30:57 -08:00
Yann Collet	6132df8dd3	fix gcc-10 strict aliasing warnings by exposing HUF_CElt declaration.	2020-12-04 16:43:19 -08:00
Yann Collet	68c14bdff2	minor speed improvement to HUF_readCTable() faster by ~+1-2%	2020-12-04 16:33:39 -08:00
Nick Terrell	c238db046f	Merge pull request #2414 from terrelln/mt-progress [lib] Ensure that multithreaded compression always makes some progress	2020-12-04 16:30:08 -08:00
Nick Terrell	4c58cb8383	[lib] Ensure that multithreaded compression always makes some progress	2020-12-03 20:25:14 -08:00
Nick Terrell	6672689e7e	Merge pull request #2406 from terrelln/linux-wrapper-api [linux] Add the linux wrapper API	2020-12-02 16:49:03 -08:00
Nick Terrell	894ae36675	Merge pull request #2390 from animalize/clamp_level Clamp compression level	2020-12-02 14:35:58 -08:00
senhuang42	2cbd038528	Move max nb seq check to per-block	2020-12-02 12:11:32 -05:00
Nick Terrell	3cda5fae77	[minor][lib] Remove double semicolon	2020-12-02 01:08:08 -08:00
senhuang42	3efe9c902b	Add sequence nb validation to compressSequences(), adjust minMatch comparisons	2020-12-01 10:54:45 -05:00
senhuang42	4c5f337248	Use cctx's minMatch instead of global MINMATCH, make fuzzer use validation	2020-11-30 15:41:20 -05:00
sen	c5fbd55dac	Merge pull request #2387 from senhuang42/compress_sequence_API [RFC] New sequence compression API	2020-11-20 16:54:20 -05:00
senhuang42	7742f076b4	Add experimental param for sequence validation	2020-11-20 11:57:41 -05:00
senhuang42	0e32928b7d	Remove unnecessary repcode backup, apply style choices, use function pointer	2020-11-20 11:02:19 -05:00
sen	e924a0fa51	Explicit cast for visual warnings Github has automatic commits now! Cool Co-authored-by: Nick Terrell <nickrterrell@gmail.com>	2020-11-19 17:32:40 -05:00
senhuang42	dcbbf7c09f	Unroll isRLE loop	2020-11-19 12:38:13 -05:00
senhuang42	05c0229668	Clean up visual conversion warnings	2020-11-18 15:36:29 -05:00
senhuang42	d6d7ba2a1f	Modification to offset validation to include entire sequence	2020-11-17 10:13:22 -05:00
senhuang42	8f3136a9c7	Fix assert edge case, improve documentation in zstd.h	2020-11-16 18:05:35 -05:00
senhuang42	f6baad87d6	Fix warnings and make validation enabled by default	2020-11-16 12:00:06 -05:00
senhuang42	55b90ef010	Fix unit tests to agree with new changes	2020-11-16 11:36:37 -05:00
senhuang42	7f563b0519	Add new sequence format as an experimental CCtx param	2020-11-16 10:49:17 -05:00
senhuang42	347824ad73	Overhaul logic to simplify, add in proper validations, fix match splitting	2020-11-16 10:49:17 -05:00
senhuang42	46824cb018	Add new sequence compress api params to cctx	2020-11-16 10:49:17 -05:00
senhuang42	48405b4633	Fix srcSize=0 edge case	2020-11-16 10:49:17 -05:00
senhuang42	022e6d81e7	Fix literals length calculation	2020-11-16 10:49:17 -05:00
senhuang42	dad20b5ccb	Remove dstCapacity error check	2020-11-16 10:49:17 -05:00
senhuang42	b8e16a2057	Remove extraneous function in this API	2020-11-16 10:49:17 -05:00
senhuang42	f29507c4fc	Add check comparing offset to window size	2020-11-16 10:49:17 -05:00
senhuang42	7a6e46a92f	Fix MSAN errors	2020-11-16 10:49:17 -05:00
senhuang42	cc2642bd17	Address edge case with endPosInSequence	2020-11-16 10:49:17 -05:00
senhuang42	fd10007174	Change debug levels to appropriate ones	2020-11-16 10:49:17 -05:00
senhuang42	2db8441245	Add RLE support	2020-11-16 10:49:17 -05:00
senhuang42	dfef298336	Fix various build warnings	2020-11-16 10:49:17 -05:00
senhuang42	2bbdddf24e	Add test case to roundtrip using ZSTD_getSequences() and ZSTD_compressSequences()	2020-11-16 10:49:16 -05:00
senhuang42	5fd69f8173	Add documentation for new api functions	2020-11-16 10:49:16 -05:00
senhuang42	e8b7fdb64b	Refactor for enhanced code clarity	2020-11-16 10:49:16 -05:00
senhuang42	c675fb46f1	Rename internal function compressSequences(), and promote new *_ext() functions to their actual name	2020-11-16 10:49:16 -05:00
senhuang42	013434e1e4	Add another API function to compress with existing CCTX	2020-11-16 10:49:16 -05:00
senhuang42	c44ce29013	More adjustments to improve code clarity	2020-11-16 10:49:16 -05:00
senhuang42	48f67da854	Pull compressStream2() transparent initialization into its own function	2020-11-16 10:49:16 -05:00
senhuang42	c86151f53c	Add initial support for new ZSTD_Sequence mode	2020-11-16 10:49:16 -05:00
senhuang42	e0f26afce9	Add sequence compression format param	2020-11-16 10:49:16 -05:00
senhuang42	f51af9a609	Always ensure sequenceRange updates properly, add more error forwarding	2020-11-16 10:49:16 -05:00
senhuang42	1a449688fd	Various minor logical refactors to improve clarity	2020-11-16 10:49:16 -05:00
senhuang42	e5fe485dcc	Fix cSize calculation for noCompressBlocks	2020-11-16 10:49:16 -05:00
senhuang42	6145ebb400	Rebased, roundtrips silesia.tar	2020-11-16 10:49:16 -05:00
senhuang42	b5b61cc216	Refactor for better debugging info	2020-11-16 10:49:16 -05:00
senhuang42	293fad6b45	Corrections and edge-case fixes to be able to roundtrip dickens	2020-11-16 10:49:16 -05:00
senhuang42	7eb6fa7be4	Multi-block compression scaffolding - works on single-block files	2020-11-16 10:49:16 -05:00
senhuang42	75b01f34b9	Add support for uncompressible blocks	2020-11-16 10:49:16 -05:00
senhuang42	e04da68157	Enable usage of ZSTD_sequenceRange for single-block compression	2020-11-16 10:49:16 -05:00
senhuang42	337fac216d	Add logic to handle ZSTD_sequenceRange	2020-11-16 10:49:16 -05:00
senhuang42	85822ddd53	Add last literals handling like getSequences()	2020-11-16 10:49:16 -05:00
senhuang42	2cff8df1a2	Pull block compression out of main compressSequences() function	2020-11-16 10:49:16 -05:00
senhuang42	cfced9344a	Implement ZSTD_updateSequenceRange	2020-11-16 10:49:16 -05:00
senhuang42	b116e1f211	Modify SequenceRange to have posInSequence	2020-11-16 10:49:16 -05:00
senhuang42	d99b675112	Add function definition for sequenceRange updater	2020-11-16 10:49:16 -05:00
senhuang42	74e95c05cc	Add ZSTD_SequenceRange to count ranges in array of ZSTD_Sequence	2020-11-16 10:49:16 -05:00
senhuang42	89f3848310	Add support for repcodes	2020-11-16 10:49:16 -05:00
senhuang42	3e930fd044	Code cleanup, add debuglog statements	2020-11-16 10:49:16 -05:00
senhuang42	086513b5b9	Implement first pass at compressSequences()	2020-11-16 10:49:16 -05:00
senhuang42	a9327b1e9b	Add initial function prototype for ZSTD_compressSequences_ext (to be renamed later)	2020-11-16 10:33:35 -05:00
animalize	52f8c07a3f	Clamp compression level in ZSTD_getCParams_internal() function	2020-11-14 13:26:08 +08:00
senhuang42	9d936d61d2	Reduce number of memcpy() calls	2020-11-13 19:43:30 -05:00
senhuang42	be4ac6c5bc	Use existing repcode update function to implement updates	2020-11-12 16:51:12 -05:00
senhuang42	674c9b9235	Add in proper block repcode histories	2020-11-12 15:34:37 -05:00
senhuang42	06c7f14066	Let block reps persist	2020-11-12 12:24:44 -05:00
senhuang42	396275068c	Fix incorrect repcode setting	2020-11-12 11:57:01 -05:00
senhuang42	1a8af0de73	Improve unit test	2020-11-12 11:09:09 -05:00
senhuang42	4d4fd2c55f	Overhaul repcode handling logic	2020-11-12 10:59:35 -05:00
sen	f62edf0fe9	Merge pull request #2381 from senhuang42/expand_sequence_extraction_api Add enum to define ZSTD_Sequence type and update sequence extraction API	2020-11-06 13:00:31 -05:00
senhuang42	7d1dea070c	Update unit tests	2020-11-06 11:10:37 -05:00
senhuang42	779df995c6	Implement mergeGeneratedSequences()	2020-11-06 10:55:46 -05:00
senhuang42	51abd58208	Rename getSequences() to generateSequences()	2020-11-06 10:53:22 -05:00
Luke Pitt	eac309c71b	Add ZSTD_getDictID_fromCDict function to experimental section	2020-11-04 11:37:37 +00:00
senhuang42	f782cac3d4	Change block delimiter removing to linear time approach	2020-11-02 17:06:20 -05:00
senhuang42	3434049c1f	Use ZSTD_memmove() instead of memmove()	2020-11-02 11:43:19 -05:00
senhuang42	d4d0346b40	Update name of enum, clarify documentation	2020-11-02 11:38:17 -05:00
senhuang42	e6178f837f	Revert unnecessary seqCollector adjustment	2020-11-02 10:59:20 -05:00
senhuang42	e8501e00b8	Fix incorrect index increment in merge algorithm	2020-11-02 10:58:41 -05:00
senhuang42	a36fdada57	Add algorithm to remove all delimiters	2020-11-02 10:46:52 -05:00
senhuang42	435a3a0428	Update seqCollector definition	2020-11-02 10:19:26 -05:00
senhuang42	3327932609	Update ZSTD_getSequences function signature	2020-11-02 10:17:59 -05:00
Nick Terrell	7205e609a9	Merge pull request #2354 from terrelln/stable-buffer Add ZSTD_c_stable{In,Out}Buffer and optimize when set	2020-10-30 15:06:56 -07:00
sen	c37c714ef1	Merge pull request #2376 from senhuang42/clarify_sequence_extraction_api Refine external ZSTD_Sequence API	2020-10-30 15:47:25 -04:00
Nick Terrell	d4e021fe35	[lib] Avoid allocating the input buffer when ZSTD_c_stableInBuffer is set We don't use it when we have a stable input buffer, so don't allocate it. I had to slightly modify `ZSTD_copyCCtx()` by storing the `ZSTD_buffered_policy_e` in the `ZSTD_CCtx`, since `inBuffSize > 0` is no longer the correct signal for the buffered mode.	2020-10-30 10:55:34 -07:00
Nick Terrell	24f72789e2	[lib] Skip the input window buffer when ZSTD_c_stableInBuffer is set Compress directly from the `ZSTD_inBuffer`. We still allocate the input buffer. A following commit will remove that allocation.	2020-10-30 10:55:34 -07:00
Nick Terrell	6bd6b6f7d3	[cwksp] Return NULL when 0 bytes are requested This ensures that the buffer is never used.	2020-10-30 10:55:34 -07:00
Nick Terrell	fcf81cee5e	[lib] Avoid allocating output buffer when ZSTD_c_stableOutBuffer is set We compress directly to the `ZSTD_outBuffer` so we don't need to allocate it.	2020-10-30 10:55:34 -07:00
Nick Terrell	6d5dc93d4e	[lib] Compress directly into output when ZSTD_c_stableOutBuffer is set When we have a stable output buffer always compress directly into the `ZSTD_outBuffer`. We are allowed to return `dstSizeTooSmall`.	2020-10-30 10:55:34 -07:00
Nick Terrell	987cb4ca6a	[lib] Take the shortcut when ZSTD_c_stableOutBuffer is set When we have a stable output buffer take the single-pass shortcut. It is okay to return `dstSizeTooSmall` if the output buffer isn't big enough, because we know it will never grow.	2020-10-30 10:55:34 -07:00
Nick Terrell	809b2f2071	[lib] Set ZSTD_c_stable{In,Out}Buffer in ZSTD_compress2() Sets these parameters in ZSTD_compress2() then resets them to their original values after the compression call. An alternative design could be to add a flush mode `ZSTD_e_singlePass` which implies `ZSTD_c_stable{In,Out}Buffer` but only for a single compression call, by directly setting the applied parameters. I've opted for the smaller change, but this is open for discussion.	2020-10-30 10:55:34 -07:00
Nick Terrell	c74be3f6de	[lib] Validate buffers when ZSTD_c_stable{In,Out}Buffer is set Adds the validation of the input/output buffers only. They are still unused.	2020-10-30 10:55:34 -07:00
Nick Terrell	e3e0775cc8	[API] Add ZSTD_c_stable{In,Out}Buffer parameters This commit adds the parameters and sets the value in the CCtxParams but it does not do anything with the value.	2020-10-30 10:54:39 -07:00
Nick Terrell	e2581d9572	[lib] Set appliedParams in zstdmt mode Previously only `nbWorkers` was set. Set all parameters, because that is what is expected. This is needed for the `ZSTD_c_stable{In,Out}Buffer` parameters.	2020-10-30 10:54:38 -07:00
senhuang42	536e89c723	Sequence extractor should update CBlockState	2020-10-30 12:13:19 -04:00
senhuang42	32cac2627a	Emit last literals of 0 size as well, to indicate block boundary	2020-10-29 16:41:17 -04:00
senhuang42	69bd5f0654	Correct literalsRead calculation to include longLength	2020-10-29 14:49:37 -04:00
senhuang42	59624f3163	Remove implicit typecast to appease appVeyor windows build	2020-10-28 16:25:09 -04:00
senhuang42	3ed5d053d8	Clarify comments in zstd.h some more	2020-10-28 09:53:09 -04:00
Nick Terrell	599ff58e08	Merge pull request #2339 from terrelln/zstdmt-stability Fix zstdmt stability issues and clean up the zstdmt code	2020-10-27 19:43:13 -07:00
sen	17b700d78a	Merge pull request #2366 from senhuang42/enable_ldm_by_default Enable LDM by default if window size >= 128MB and strategy uses opt parser	2020-10-27 14:59:28 -04:00
Nick Terrell	0953645837	Merge pull request #2362 from senhuang42/fix_ldm_fuzz_issue Fix long distance matcher OSS-fuzz issue	2020-10-27 11:13:03 -07:00
senhuang42	3163909d14	Remove unused variable position	2020-10-27 12:58:12 -04:00
senhuang42	dc448563e9	Add test compatibility with last literals in sequences	2020-10-27 12:35:28 -04:00
senhuang42	1d221ecc03	Add support for representing last literals in the extracted seqs	2020-10-27 11:19:48 -04:00
senhuang42	9171f920cd	Improve documentation of seqStore_t	2020-10-27 10:50:22 -04:00
senhuang42	96b0ff7886	Improve documentation regarding various operations in copyBlockSequences	2020-10-27 10:36:06 -04:00
senhuang42	3a11c7eb03	Modify ZSTD_copyBlockSequences to agree with new API	2020-10-27 10:31:40 -04:00
senhuang42	8bdb32aebe	Add a function for LDM enable check	2020-10-20 13:46:02 -04:00
senhuang42	578e889ec1	Move ldm enable to compressStream2()	2020-10-20 13:04:45 -04:00
senhuang42	d28d8a1d72	Include LDM tables size for CCtx size estimation where relevant	2020-10-20 09:21:30 -04:00
senhuang42	b1c7fc5768	Add compatibility for multithreading	2020-10-19 12:07:06 -04:00
senhuang42	590f7f55f0	Add ldm enable condition in ZSTD_resetCCtx_internal	2020-10-19 10:26:17 -04:00
senhuang42	4d01979b62	Expose and call ZSTD_ldm_skipRawSeqStoreBytes()	2020-10-16 20:30:00 -04:00
Yann Collet	a0ec50c2dc	Merge pull request #2355 from senhuang42/change_ldm_mt_config Reduce --long mode MT jobsize at higher levels	2020-10-16 13:35:50 -07:00
senhuang42	f49926edf4	Change cycleLog adjustment to +3 from +4	2020-10-15 09:56:05 -04:00
senhuang42	ee84817fe7	Reset posInSequence when using ZSTD_referenceExternalSequences()	2020-10-14 22:06:08 -04:00
senhuang42	d0550bb18f	Clarify argument names, fix DEBUGLOG() statements	2020-10-14 15:45:43 -04:00
senhuang42	3f99c9b38d	Adjust match backwards count args	2020-10-14 15:23:03 -04:00
senhuang42	bf0d559449	Introduce, implement, and call ZSTD_ldm_countBackwardsMatch_2segments()	2020-10-14 12:58:06 -04:00
senhuang42	467e4383b0	Merge branch 'dev' of github.com:senhuang42/zstd into change_ldm_mt_config	2020-10-14 10:17:50 -04:00
Yann Collet	f5d5cd3b40	Merge pull request #2341 from senhuang42/ldm_optimized_for_opt_parser Integrate long distance matches into optimal parser	2020-10-13 13:09:07 -07:00
Nick Terrell	7e6f91ed84	[minor] Improve docs and add an assert in response to review	2020-10-12 16:43:17 -07:00
senhuang42	354b5f1c0a	Use cycleLog instead of chainLog to determine LDM jobLog	2020-10-12 16:09:59 -04:00
Nick Terrell	441ce4178f	[zstdmt] Clarify a comment	2020-10-12 12:58:13 -07:00
Nick Terrell	efff5d8b2d	[zstdmt] Fix determinism issue with rsyncable mode The problem occurs in this scenario: 1. We find a synchronization point. 2. We attempt to create the job. 3. We fail because the job table is full: `mtctx->nextJobID > mtctx->doneJobID + mtctx->jobIDMask`. 4. We call `ZSTDMT_compressStream_generic` again. 5. We forget that we're at a sync point already, and we continue looking for the next sync point. This fix is to detect if we're currently paused at a sync point, and if we are then don't load any more input. Caught by zstreamtest. I modified it to make the bug occur more often (~1/100K -> ~1/200) and verified that it is fixed after. I then ran a few hundred thousand unmodified zstreamtest iterations to verify.	2020-10-12 12:55:17 -07:00
Nick Terrell	ede4f97153	[zstdmt] Fix bug where extra empty blocks are emitted When zstdmt cannot get a buffer and `ZSTD_e_end` is passed an empty compression job can be created. Additionally, `mtctx->frameEnded` can be set to 1, which could potentially cause problems like unterminated blocks. The fix is to adjust to `ZSTD_e_flush` even when we can't get a buffer.	2020-10-12 12:55:17 -07:00
Nick Terrell	c51a9e79b9	[zstdmt] Rip out the zstdmt API This commit leaves only the functions used by zstd_compress.c. All other functions have been removed from the API. The ZSTDMT unit tests in fuzzer.c and zstreamtest.c have been rewritten to use the ZSTD API. And the --mt zstreamtest tests have been ripped out.	2020-10-12 12:55:16 -07:00
Nick Terrell	1784c4b4ab	[zstdmt] Remove single-pass shortcut Simplifies the code and removes blocking from zstdmt. At this point we could completely delete `ZSTDMT_compress_advanced_internal()`. However I'm leaving it in because I think we want to do that in the zstd-1.5.0 release, in case anyone is still using the ZSTDMT API, even though it is not installed by default. Fixes #2327.	2020-10-12 12:53:26 -07:00
Nick Terrell	b55ae009ac	[zstdmt] Remove singleBlockingThread mode This is already handled by zstd, so this logic is never used.	2020-10-12 12:53:26 -07:00
Nick Terrell	d5c688e8ae	Fix ZSTD_adjustCParams_internal() to handle dictionary logic Pass in the `ZSTD_cParamMode_e` to select how we define our cparams. Based on the mode we either take the `dictSize` into account or we set it to `0`. See the documentation for `ZSTD_cParamMode_e`. Some of the modes currently share the same behavior. But they have distinct modes because they are drastically different cases. E.g. compression + reprocessing the dictionary and creating a cdict. Additionally, when downsizing the hashLog and chainLog take the (adjusted) dictionary size into account, since the size of the dictionary gets added onto the window size. Adds a simple test to ensure that we aren't downsizing too far.	2020-10-12 12:50:04 -07:00
Nick Terrell	fadaab8c7c	[minor improvement] Pass 0 as the content size in the DDS The DDS structure can't be copied into the working tables like the DMS. So it doesn't need to account for the source size when sizing its parameters, just the dictionary size.	2020-10-12 12:47:21 -07:00
Nick Terrell	48ef15fb47	[minor improvement] Pass dictSize when selecting parameters When selecting parameters in streaming compression with a dictionary use the dictionary size to select the parameters.	2020-10-12 12:47:19 -07:00
Nick Terrell	012818df99	[refactor] Remove ZSTD_resetCStream_internal() This function is only called in one place. It isn't a logical separation of duties, and it was only obfuscating the code now, so inline it.	2020-10-12 12:46:10 -07:00
Nick Terrell	7083f79008	[bug] Fix dictContentType when reprocessing cdict Conditions to trigger: * CDict is loaded as raw content. * CDict starts with the zstd dictionary magic number. * The CDict is reprocessed (not attached or copied). * The new API is used (streaming or `ZSTD_compress2()`). Bug: The dictionary is loaded as a zstd dictionary, not a raw content dictionary, because the dict content type is set to `ZSTD_dct_auto`. Fix: Pass in the dictionary content type from cdict creation to the call to `ZSTD_compress_insertDictionary()`. Test: Added a test case that exposes the bug, and fixed the raw content tests to not modify the `dictBuffer`, which makes all future tests with the `dictBuffer` raw content, which doesn't seem intentional.	2020-10-12 12:46:10 -07:00
senhuang42	d6911b86be	Require LDM matches to be strictly greater in length	2020-10-09 12:56:18 -04:00
Yann Collet	12541931fa	Merge pull request #2328 from marxin/zstd-pool-api Allow external creation of POOLs that can be shared.	2020-10-09 01:00:50 -07:00
Yann Collet	6fdb0cb8d9	Merge pull request #2303 from senhuang42/let_cdict_take_clevel_priority For ZSTD_compressStream2(), let cdict take compression level priority	2020-10-09 00:48:30 -07:00
senhuang42	b9c8033cde	Define kNullRawSeqStore for every file	2020-10-07 19:02:41 -04:00
senhuang42	a6165c1b28	Change matchState_t::ldmSeqStore to pointer	2020-10-07 14:13:57 -04:00
senhuang42	abce708a56	Move posInSequence correction to correct location	2020-10-07 13:56:25 -04:00
senhuang42	0c515590d8	Replace offCode of largest match if ldm's offCode is superior	2020-10-07 13:56:25 -04:00
senhuang42	0fac8e07e1	Refactor usage of ms->ldmSeqStore so that it is not modified during compressBlock(), and simplify skipRawSeqStoreBytes	2020-10-07 13:56:25 -04:00
senhuang42	a5500cf2af	Refactor separate ldm variables all into one struct	2020-10-07 13:56:25 -04:00
senhuang42	0731b94e7c	Use kNullRawSeqStore constant in zstdmt_compress.c	2020-10-07 13:56:25 -04:00
senhuang42	0325d878f2	Remove bubbling down matches with longer offCode and same matchLen	2020-10-07 13:56:25 -04:00
senhuang42	031b7ec15f	Disable LDM minMatch adjustment when using opt parser	2020-10-07 13:56:25 -04:00
senhuang42	ddf8a3f1b9	Enable inclusion of mid-flight LDMs in opt parser	2020-10-07 13:56:25 -04:00
senhuang42	88f72ed942	Correct incorrect offcode calculation	2020-10-07 13:56:25 -04:00
senhuang42	d8b43a4202	Add explicit conversion of size_t to U32	2020-10-07 13:56:25 -04:00
senhuang42	b8bfc4e63d	Add cSize regression test to fuzzer.c	2020-10-07 13:56:25 -04:00
senhuang42	c87d2e5866	Prefix new static ldm helpers with ZSTD_opt	2020-10-07 13:56:25 -04:00
senhuang42	429dec4f42	Add DEBUGLOG() calls in ldm helpers	2020-10-07 13:56:25 -04:00
senhuang42	10647924f1	Make function descriptions more accurate	2020-10-07 13:56:25 -04:00
senhuang42	1a687b3fcb	Improve documentation of relevant structs	2020-10-07 13:56:25 -04:00
senhuang42	37617e23d7	Correct matchLength calculation and remove unnecessary functions	2020-10-07 13:56:25 -04:00
senhuang42	7dee62c287	Reset ldmSeqStore after initStats_ultra() pass for btultra2	2020-10-07 13:56:25 -04:00
senhuang42	0718aa70df	Refactor existing functions to use posInSequence	2020-10-07 13:56:25 -04:00
senhuang42	7348b40a87	Adjustments to ldm_calculateMatchRange() to calculate bounds correctly	2020-10-07 13:56:25 -04:00
senhuang42	a1ef2db5b2	Add ldm_calculateMatchRange() function	2020-10-07 13:56:25 -04:00
senhuang42	ef823e0299	Remove rawSeqStore.base and add rawSeqStore.posInSequence	2020-10-07 13:56:25 -04:00
senhuang42	4793ae3b84	Prevent duplicate LDMs from being inserted	2020-10-07 13:56:25 -04:00
senhuang42	65f9cfeeec	Add extra bounds check to prevent heap access after free ASAN error	2020-10-07 13:56:25 -04:00
senhuang42	bff5785fd5	Address mixed variables C90 warning	2020-10-07 13:56:25 -04:00
senhuang42	724b94ed18	ldm_getNextMatch fixed return values	2020-10-07 13:56:25 -04:00
senhuang42	ea92fb3a68	Cleanups, add comments and explanations	2020-10-07 13:56:25 -04:00
senhuang42	78da2e1808	Fixed sifting algorithm	2020-10-07 13:56:25 -04:00
senhuang42	6ccd97fc96	Fixed end of match boundary update issues	2020-10-07 13:56:25 -04:00
senhuang42	28394b64f2	Add proper bounds check on adding ldms	2020-10-07 13:56:25 -04:00
senhuang42	a2f2b58d04	Add a function ldm_voidSequences()	2020-10-07 13:56:25 -04:00
senhuang42	9c3c7cd20e	Fix function argument to getNextMatch()	2020-10-07 13:56:25 -04:00
senhuang42	c8b8572b38	Adjustments to no longer segfault on nci	2020-10-07 13:56:25 -04:00
senhuang42	f57c7e6bbf	Add base adjustment correction	2020-10-07 13:56:25 -04:00
senhuang42	5df9b5e05f	Add initial getNextMatch() in opt parser	2020-10-07 13:56:25 -04:00
senhuang42	f8ce7cabc3	Added more debugging	2020-10-07 13:56:25 -04:00
senhuang42	84009a076a	Add re-copying of ldmSeqStore after processing	2020-10-07 13:56:25 -04:00
senhuang42	42395a70c2	Add debug statements, flesh out functions	2020-10-07 13:56:25 -04:00
senhuang42	dd3dd199bb	Get zstd to build with new functions and callsites, fix arguments	2020-10-07 13:56:25 -04:00
senhuang42	766c4a8c28	Implement part of ldm_maybeAddLdm()	2020-10-07 13:56:25 -04:00
senhuang42	84777059d2	Implement ldm_getNextMatch()	2020-10-07 13:56:24 -04:00
senhuang42	28c74bf591	Implement basic splitSequence and skipSequence functions	2020-10-07 13:56:24 -04:00
senhuang42	634ab7830d	Flesh out required args for ldm_handleLdm()	2020-10-07 13:56:24 -04:00
senhuang42	db70761032	Add callsites to appropriate locations in ..opt_generic()	2020-10-07 13:56:24 -04:00
senhuang42	aea61e3c91	Add ldm helper function declarations into opt parser	2020-10-07 13:56:24 -04:00
senhuang42	35d9f488f5	Modify codepath to use opt parser exclusively if the compression level is high enough	2020-10-07 13:56:24 -04:00
senhuang42	e1ae398ad5	Add rawSeqStore to match state	2020-10-07 13:56:24 -04:00
Martin Liska	b684900a4a	Allow external creation of POOLs that can be shared.	2020-10-07 12:44:33 +02:00
Nick Terrell	27c969ed07	Add comments to ZSTD_getLowest{Match,Prefix}Index() Clarify how we handle dictionaries in each case.	2020-10-01 13:21:46 -07:00
Yann Collet	cc88eb7594	Merge pull request #2317 from animalize/msvc_inline Let MSVC force inline ZSTD_hashPtr() function	2020-09-30 08:27:53 -07:00
Nick Terrell	f1cbeec039	[superblock] Reduce stack usage by correctly sizing header buffers	2020-09-24 19:42:04 -07:00
Nick Terrell	6a1e526ea7	[lib] Add ZSTD_COMPRESS_HEAPMODE tuning parameter	2020-09-24 19:42:04 -07:00
Nick Terrell	b841387218	[freestanding] Improve macro resolution to handle #if X	2020-09-24 19:42:04 -07:00
Nick Terrell	caecd8c211	Allow user to override ASAN/MSAN detection Rename ADDRESS_SANITIZER -> ZSTD_ADDRESS_SANITIZER and same for MEMORY_SANITIZER. Also set it to 0/1 instead of checking for defined. This allows the user to override ASAN/MSAN detection for platforms that don't support it.	2020-09-24 19:42:04 -07:00
Nick Terrell	88fac5d514	Remove call to memset The previous commit fixes the test so it errors on calls to mem*() functions from <string.h>.	2020-09-24 19:42:04 -07:00
Nick Terrell	9261476b7d	[lib] Wrap customMem xor checks in parens for readability This clarifies operator precedence, and quiets cppcheck in the Kernel Test Robot. I think this is a slight bonus to readability, so I am accepting the suggestion.	2020-09-23 23:26:07 -07:00
animalize	2e5d73dd72	Use `MEM_STATIC FORCE_INLINE_ATTR` instead of `FORCE_INLINE_TEMPLATE` It adds `__attribute__((unused))` for __GNUC__, to eliminate `-Werror=unused-function` error.	2020-09-21 13:26:38 +08:00
animalize	0a69a6b1ca	Let MSVC force inline ZSTD_hashPtr() function ZSTD_hashPtr() function was not expanded by MSVC, which led to low performance compared to GCC.	2020-09-21 10:38:55 +08:00
Felix Handte	200c960f1d	Merge pull request #2311 from felixhandte/ddss-fix-cparam-derivation Fix Compression Parameter Derivation Bugs Introduced by DDSS Changes	2020-09-18 14:02:14 -04:00
W. Felix Handte	8930c6e551	Use ZSTD_CCtxParams_init() to Init CCtxParams, not memset() Even if the discrepancies are at the moment benign, it's probably better to standardize on using the one true initializer, rather than trying (and failing) to correctly duplicate its behavior.	2020-09-17 12:15:33 -04:00

2136 Commits