townforge/zstd - zstd - Townforge git

Author	SHA1	Message	Date
Stephen Kitt	b2582de3c9	Apply flags to libzstd-nomt in libzstd style ... for consistency (this doesn't actually change the build flags used in practice, currently). Signed-off-by: Stephen Kitt <steve@sk2.org>	2021-05-07 13:25:27 +02:00
Nick Terrell	c2555f8c6f	[lib] Fix fuzzer timeouts by backing off overflow correction Linearly back off the frequency of overflow correction based on the number of times the `ZSTD_window_t` has been overflow corrected. This will still allow the fuzzer to quickly find overflow correction bugs, while also keeping good speed for larger inputs. Additionally, the `nbOverflowCorrections` variable can be useful for debugging coredumps, since we can inspect the `ZSTD_CCtx` to see if overflow correction has happened yet. I've verified this fixes the timeouts in OSS-Fuzz (176 seconds -> 6 seconds). I've also verified that fuzzers and `fuzzer` and `zstreamtest` still catch the row-hash overflow correction bug.	2021-05-06 22:03:41 -07:00
Yann Collet	ee425faaa7	Merge branch 'dev' into d_prefetch_refactor	2021-05-06 19:49:26 -07:00
Nick Terrell	b052b583e5	[lib] Fix UBSAN warning in ZSTD_decompressSequences()	2021-05-06 15:31:30 -07:00
sen	698f261b35	[1.5.0] Deprecate some functions (#2582) * Add deprecated macro to zstd.h, mark certain functions as deprecated * Remove ZSTD_compress.c dependencies on deprecated functions	2021-05-06 17:59:32 -04:00
Nick Terrell	2b82948e58	Merge pull request #2622 from terrelln/zdict-api [zdict] Add a FAQ to the top of zdict.h	2021-05-06 12:42:56 -07:00
Nick Terrell	1874f0844d	[zdict] Add a FAQ to the top of zdict.h The FAQ covers the questions asked in Issue #2566. It first covers why you would want to use a dictionary, then what a dictionary is, and finally it tells you how to train a dictionary, and clarifies some of the parameters. There is definitely more that could be said about some of the advanced trainers, but this should be a good start.	2021-05-06 12:48:19 -07:00
Nick Terrell	207e33bb61	Merge pull request #2616 from terrelln/deterministic-dict [lib] Add ZSTD_c_deterministicRefPrefix	2021-05-06 11:09:22 -07:00
Nick Terrell	d2925de98a	Merge pull request #2615 from terrelln/stack-space [lib] Move some ZSTD_CCtx_params off the stack	2021-05-05 19:43:39 -07:00
Nick Terrell	172b4b6ac4	[lib] Add ZSTD_c_deterministicRefPrefix This flag forces zstd to always load the prefix in ext-dict mode, even if it happens to be contiguous, to force determinism. It also applies to dictionaries that are re-processed. A determinism test case is also added, which fails without `ZSTD_c_deterministicRefPrefix` and passes with it set. Question: Should this be the default behavior? It isn't in this PR.	2021-05-05 18:49:56 -07:00
Nick Terrell	eb7e74ccb7	[tests] Set `DEBUGLEVEL=2` by default This allows us to quickly check for compile errors in debug log messages, which are compiled out when `DEBUGLEVEL < 2`.	2021-05-05 13:29:06 -07:00
Nick Terrell	c2183d7cdf	[lib] Move some ZSTD_CCtx_params off the stack * Take `params` by const reference in `ZSTD_resetCCtx_internal()`. * Add `simpleApiParams` to the CCtx and use them in the simple API functions, instead of creating those parameters on the stack. I think this is a good direction to move in, because we shouldn't need to worry about adding parameters to `ZSTD_CCtx_params`, since it should always be on the heap (unless they become absolutely gigantic). Some `ZSTD_CCtx_params` are still on the stack in the CDict functions, but I've left them for now, because it was a little more complex, and we don't use those functions in stack-constrained environments currently.	2021-05-05 13:25:16 -07:00
Yann Collet	7ef6d7b36c	deeper prefetching pipeline for decompressSequencesLong pipeline increased from 4 to 8 slots. This change substantially improves decompression speed when there are long distance offsets. example with enwik9 compressed at level 22 : gcc-9 : 947 -> 1039 MB/s clang-10: 884 -> 946 MB/s I also checked the "cold dictionary" scenario, and found a smaller benefit, around ~2% (measurements are more noisy for this scenario).	2021-05-05 10:04:03 -07:00
Yann Collet	8cde167a27	Merge branch 'dev' into d_prefetch_refactor	2021-05-05 09:13:38 -07:00
Yann Collet	455fd1a067	updated documentation regarding minimum job size	2021-05-05 09:03:11 -07:00
Yann Collet	c077f257b4	Merge pull request #2611 from facebook/smallerJobs allow jobSize to be as low as 512 KB	2021-05-05 00:03:29 -07:00
Nick Terrell	8389a5122b	Merge pull request #2602 from terrelln/ldm-opt [LDM] Speed optimization on repetitive data	2021-05-04 23:13:09 -07:00
Nick Terrell	d40f55cd95	Merge pull request #2610 from senhuang42/lazy_underflow_fix Fix bad integer wraparound in repcode index for fast, dfast, lazy	2021-05-04 23:10:23 -07:00
Nick Terrell	0b88c2582c	[test] Add large dict/data --patch-from test Dictionary size must be > `ZSTD_CHUNKSIZE_MAX`.	2021-05-04 17:31:32 -07:00
Sen Huang	e6c8a5dd40	Fix incorrect usages of repIndex across all strategies	2021-05-04 19:50:55 -04:00
Nick Terrell	94db4398a0	[lib] Always load the dictionary in one go Dictionaries larger than `ZSTD_CHUNKSIZE_MAX` used to have to be loaded in multiple segments. Instead, when we detect large dictionaries, ensure that we reset the context's indices. Then, for dictionaries larger than `ZSTD_CURRENT_MAX - 1`, only load the suffix of the dictionary. Finally, enable DDS for large dictionaries, since we no longer load in multiple segments. This simplifies the dictionary loading code, and reduces opportunities for non-determinism to slip in.	2021-05-04 16:45:25 -07:00
Yann Collet	1026b9fa10	fix rsyncable mode	2021-05-04 15:59:27 -07:00
Nick Terrell	8a8899fc08	Merge pull request #2612 from terrelln/minor-fix [easy] Rewrite rowHashLog computation	2021-05-04 15:02:00 -07:00
Yann Collet	40cabd0efd	Merge pull request #2608 from facebook/docMinVer Documented minimum version numbers	2021-05-04 12:10:52 -07:00
Nick Terrell	1ffa80a09e	[easy] Rewrite rowHashLog computation `ZSTD_highbit32(1u << x) == x` when it isn't undefined behavior.	2021-05-04 11:43:20 -07:00
Nick Terrell	a8ecf4ff88	Merge pull request #2597 from terrelln/public-headers [1.5.0] Move `zstd_errors.h` and `zdict.h` to `lib/` root	2021-05-04 11:28:41 -07:00
Yann Collet	8f86c29c06	allow jobSize to be as low as 512 KB previous lower limit was 1 MB. Note : by default, the lowest job size is 2 MB, achieved at level 1. Even lower job sizes can be achieved by manipulating this value directly, or manually modifying window sizes to lower amounts. Updated unit test to ensure that this new limit works fine (test would fail with previous 1 MB limit).	2021-05-04 11:02:55 -07:00
Nick Terrell	32823bc150	[LDM] Speed optimization on repetitive data LDM does especially poorly on repetitive data when that data's hash happens to have `(hash & stopMask) == 0`, either because `stopMask == 0` or by random chance. Optimize this case by skipping over repetitive patterns. The detection is very simplistic, but should catch most of the offending cases. ``` head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long 21.187881087 seconds time elapsed head -c 1G /dev/zero | perf stat -- ./zstd -1 -o /dev/null -v --zstd=ldmHashRateLog=1 --long 1.149707921 seconds time elapsed ```	2021-05-04 10:57:42 -07:00
W. Felix Handte	ee122baacf	Detect Presence of `md5` on Darwin This fixes #2568.	2021-05-04 12:33:19 -04:00
Yann Collet	8aafbd3604	Documented minimum version numbers Any stable API entry point introduced after v1.0 should be documented with its minimum version number. This PR addresses this requirement, mostly updating entry points added since v1.4.0 and those newly introduced for the future v1.5.0.	2021-05-04 09:05:22 -07:00
Nick Terrell	34aff7ea06	Bug fix & run overflow correction much more frequently in tests * Fix overflow correction when `windowLog < cycleLog`. Previously, we got the correction wrong in this case, and our chain tables and binary trees would be corrupted. Now, we work as long as `maxDist` is a power of two, by adding `MAX(maxDist, cycleSize)` to our indices. * When `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` is defined to non-zero run overflow correction as frequently as allowed without impacting compression ratio. * Enable `ZSTD_WINDOW_OVERFLOW_CORRECT_FREQUENTLY` in `fuzzer` and `zstreamtest` as well as all the OSS-Fuzz fuzzers. This has a 5-10% speed penalty at most, which seems reasonable.	2021-05-03 15:21:47 -07:00
sen	cc31bb8b66	Merge pull request #2598 from senhuang42/reduce_index_rowhash_fix Fix chaintable check to include rowhash in ZSTD_reduceIndex()	2021-05-03 17:34:39 -04:00
sen	4c5cc345fb	Merge pull request #2581 from senhuang42/lcm_stable [1.5.0] Promote ZSTD_c_literalCompressionMode to stable params	2021-05-03 11:59:19 -04:00
sen	cdc979ddb3	Merge pull request #2580 from senhuang42/defaultclevel_to_stable [1.5.0] Promote ZSTD_defaultCLevel() into stable API	2021-05-03 11:59:05 -04:00
senhuang42	61fe571af6	Fix chaintable check to include rowhash in ZSTD_reduceIndex()	2021-04-30 19:52:04 -04:00
Nick Terrell	09149beaf8	[1.5.0] Move `zstd_errors.h` and `zdict.h` to `lib/` root `zstd_errors.h` and `zdict.h` are public headers, so they deserve to be in the root `lib/` directory with `zstd.h`, not mixed in with our private headers.	2021-04-30 15:13:54 -07:00
Nick Terrell	6cee3c2c4f	[trace] Remove default definitions of weak symbols Instead of providing a default no-op implementation, check the symbols for `NULL` before accessing them. Providing a default implementation doesn't reliably work with dynamic linking. Depending on link order the default implementations may not be overridden. By skipping the default implementation, all link order issues are resolved. If the symbols aren't provided the weak function will be `NULL`.	2021-04-26 16:05:39 -07:00
sen	3e2fbfd056	Merge pull request #2579 from senhuang42/getcdictID_to_stable [1.5.0] Promote ZSTD_getDictID_fromCDict() into stable API	2021-04-26 09:55:43 -04:00
Sen Huang	3c595a4a79	Add literalCompressionMode to stable cParams	2021-04-26 09:55:06 -04:00
felixhandte	efa6dfa729	Apply DDS adjustments to avoid assert failures	2021-04-23 16:41:00 -04:00
senhuang42	3b98987496	Remove building of ZBUFF/deprecated folder by default	2021-04-19 17:12:00 -04:00
Sen Huang	c5869677d9	Moved ZSTD_defaultCLevel() into stable API	2021-04-16 10:15:40 -07:00
Sen Huang	9c1ca3c00b	Moved ZSTD_getDictID_fromCDict() into stable API	2021-04-16 10:14:29 -07:00
sen	12c045f74d	Merge pull request #2574 from senhuang42/repcode_mismatch_detector_fix Correct the block splitter mismatched repcodes detection.	2021-04-12 23:27:43 -04:00
Sen Huang	8844f93957	Adjust nb elements to prefetch in ZSTD_row_fillHashCache()	2021-04-12 14:24:58 -04:00
Sen Huang	550f76f131	Correct the detection of mismatched repcodes	2021-04-09 09:08:51 -07:00
Sen Huang	4d63d6e8aa	Update results.csv, add Row hash to regression test	2021-04-07 10:31:41 -07:00
Nick Terrell	4694423c4f	Add and integrate lazy row hash strategy	2021-04-07 09:53:34 -07:00
sen	f71aabb5b5	Move clevel override to after initLocalDict() (#2571)	2021-04-06 21:05:37 -04:00
sen	f1e8b565c2	Maintain two repcode histories for block splitting, replace invalid repcodes (#2569)	2021-04-06 17:25:55 -04:00
sen	e38124555e	Fix dictionary force reloading clevel selection (#2570) * Move cdict clevel override to before localdict init * Update results.csv after dict load changes	2021-04-06 15:35:09 -04:00
Nick Terrell	8383fc828d	Merge pull request #2541 from ihsinme/patch-1 simple fix for using bit operator.	2021-04-02 13:01:09 -07:00
sen	980f3bbf83	[cwksp] Align all allocated "tables" and "aligneds" to 64 bytes (#2546) * Perform 64-byte alignment of wksp tables and aligneds internally * Clean up cwskp_finalize() function to only do two allocs * Refactor aligned/buffer reservation code, remove ASAN req for alignment reservations * Change from allocating 128 bytes always to allocating only buffer space as needed for tables/aligned * Back out aligned/table reservation order restriction * Add stricter bounds for new/resized wksps, fix comment in zstd_cwksp.h	2021-04-01 20:07:19 -04:00
sen	255925c231	Fix repcode-related OSS-fuzz issues in block splitter (#2560) * Do not emit last partitions of blocks as RLE/uncompressed * Fix repcode updates within block splitter * Add an entropytables confirm function, redo ZSTD_confirmRepcodesAndEntropyTables() for better function signature * Add a repcode updater to block splitter, no longer need to force emit compressed blocks	2021-03-31 15:14:59 -04:00
Nick Terrell	a494308ae9	[copyright][license] Switch to yearless copyright and some cleanup in the linux-kernel files * Switch to yearless copyright per FB policy * Fix up SPDX-License-Identifier lines in `contrib/linux-kernel` sources * Add zstd copyright/license header to the `contrib/linux-kernel` sources * Update the `tests/test-license.py` to check for yearless copyright * Improvements to `tests/test-license.py` * Check `contrib/linux-kernel` in `tests/test-license.py`	2021-03-30 10:30:43 -07:00
sen	84ccb81e7c	Merge pull request #2561 from senhuang42/longlength_enum Add enum for representing long length ID	2021-03-26 15:55:12 -04:00
Sen Huang	b1a43455f8	Add enum for representing long length ID	2021-03-26 10:41:09 -07:00
sen	4fe2e7ae14	Merge pull request #2558 from senhuang42/msan_block_splitter_fix Fix block splitter minor MSAN warning.	2021-03-25 13:51:43 -04:00
sen	b0407b9f0e	Merge pull request #2555 from senhuang42/default_clevel_func Add ZSTD_defaultCLevel() function to public API	2021-03-25 13:07:28 -04:00
Sen Huang	2a907bf4aa	Move lastCountSize into a returned struct, fix MSAN error	2021-03-25 09:11:15 -07:00
Sen Huang	e398744a35	Add ZSTD_defaultCLevel() function to public API	2021-03-25 08:04:00 -07:00
Nick Terrell	f8ac0ea7ef	Merge pull request #2539 from terrelln/linux-kernel-fixes Fixes for the next linux kernel patch version	2021-03-24 10:34:29 -07:00
sen	bf542c8a8d	Merge pull request #2447 from senhuang42/block_splitter_v2 Recursive block splitting	2021-03-24 12:27:22 -04:00
Sen Huang	5b566ebe08	Rename compressSequences() functions for clarity	2021-03-24 08:21:29 -07:00
Sen Huang	0ef1f935b7	Add a fallback in case the total blocksize of split blocks exceeds raw block size	2021-03-24 08:21:29 -07:00
Sen Huang	c90e81a692	Enable block splitter by default when applicable	2021-03-24 08:21:29 -07:00
Sen Huang	e34332834a	Clean up various functions, add debuglogging for estimate vs. actual sizes	2021-03-24 08:21:29 -07:00
Sen Huang	41c3eae6d9	Fix various fuzzer failures: repcode history, superblocks	2021-03-24 08:21:29 -07:00
senhuang42	0633bf17c3	Change 1.3.4 bugfix to be cross-compatible with superblocks and normal compression	2021-03-24 08:21:29 -07:00
senhuang42	eb1ee8686d	Refactor buildSequencesStatistics() to avoid pointer increment for superblocks	2021-03-24 08:21:29 -07:00
senhuang42	e2bb215117	Add unit tests and fuzzer param	2021-03-24 08:21:09 -07:00
senhuang42	de52de1347	Add recursive block split algorithm	2021-03-24 08:21:09 -07:00
senhuang42	f06f6626ed	Update function names for consistency	2021-03-24 08:20:54 -07:00
senhuang42	c56d6e49e8	Add block splitter to experimental params	2021-03-24 08:20:54 -07:00
senhuang42	2949a95224	Refactor block compression logic into single function	2021-03-24 08:20:54 -07:00
senhuang42	c05c090cc2	Centralize entropy statistics calculations to zstd_compress.c	2021-03-24 08:20:29 -07:00
sen	c48889f097	Merge pull request #2538 from senhuang42/monotonicity_test Add memory monotonicity test over srcSize	2021-03-22 16:54:34 -04:00
Nick Terrell	ebc2dfa821	Merge pull request #2524 from terrelln/huf-stack-reduction [huf] Reduce stack usage of HUF_readDTableX2 by ~972 bytes	2021-03-22 12:37:54 -07:00
Nick Terrell	634bfd339f	[FSE] Clean up workspace using dynamically sized struct	2021-03-22 11:07:07 -07:00
Sen Huang	dff4a0e867	Make ZSTD_estimateCCtxSize_internal() loop through all srcSize parameter sets as well	2021-03-21 16:15:31 -07:00
Yann Collet	f5434663ea	Refactor prefetching for the decoding loop Following #2545, I noticed that one field in `seq_t` is optional, and only used in combination with prefetching. (This may have contributed to static analyzer failure to detect correct initialization). I then wondered if it would be possible to rewrite the code so that this optional part is handled directly by the prefetching code rather than delegated as an option into `ZSTD_decodeSequence()`. This resulted in this refactoring exercise where the prefetching responsibility is better isolated into its own function and `ZSTD_decodeSequence()` is streamlined to contain strictly Sequence decoding operations. Incidentally, due to better code locality, it reduces the need to send information around, leading to simplified interface, and smaller state structures.	2021-03-19 15:48:17 -07:00
Nick Terrell	756bd59322	[huf][fse] Clean up workspaces * Move `counting` to a struct in `FSE_decompress_wksp_body()` * Fix error code in `FSE_decompress_wksp_body()` * Rename a variable in `HUF_ReadDTableX2_Workspace`	2021-03-17 16:50:37 -07:00
ihsinme	a5bf09d764	simple fix for using bit operator. good day. It seems to me that the developer intended to use a logical operator. so I suggest a simple fix.	2021-03-17 11:37:42 +03:00
Sen Huang	77ae664ba6	Fix ZSTD_dedicatedDictSearch_isSupported() requirements	2021-03-16 17:36:05 -07:00
senhuang42	386111adec	Add a nbSeq argument to compressSequences() Refactor ZSTD_compressBlock_internal() to do the block header write within and add nbSeq argument to compressSequences()	2021-03-16 14:04:22 -07:00
senhuang42	98764493cf	Move block header write into compressBlock_internal()	2021-03-16 14:04:22 -07:00
Nick Terrell	ea288e0d8e	[lib] Bump zstd version number	2021-03-16 11:47:27 -07:00
Nick Terrell	cd1551d261	[lib][tracing] Add ZSTD_NO_TRACE macro When defined, it disables tracing, and avoids including the header.	2021-03-16 11:47:27 -07:00
Yann Collet	3d6c903056	Merge pull request #2521 from animalize/doc_free doc: ZSTD_free*() functions accept NULL pointer	2021-03-06 21:33:28 -08:00
Nick Terrell	3b1aba42cc	[fse] Reduce stack usage of FSE_decompress_wksp() by 512 bytes * Move `counting` into the workspace * Increase `HUF_DECOMPRESS_WORKSPACE_SIZE` by 512 bytes	2021-03-05 13:24:27 -08:00
Nick Terrell	0f18059a4e	[huf] Reduce stack usage of HUF_readDTableX2 by ~460 bytes * Use `HUF_readStats_wksp()` * Use workspace in `HUF_fillDTableX2()` Clean up workspace usage to use a workspace struct	2021-03-05 12:39:46 -08:00
Nick Terrell	b5fd348a85	Merge pull request #2523 from terrelln/huf-stack-reduction Add HUF_writeCTable_wksp() function	2021-03-05 12:35:09 -08:00
Nick Terrell	5df2a21f1e	Add HUF_writeCTable_wksp() function This saves ~700 bytes of stack space in HUF_writeCTable.	2021-03-05 10:29:18 -08:00
Nick Terrell	e50f88ca4c	Merge pull request #2522 from terrelln/stack-reduction Reduce stack usage of ZSTD_buildCTable()	2021-03-04 20:55:58 -08:00
Nick Terrell	27498ff00f	Reduce stack usage of ZSTD_buildCTable() It is a stack high-point for some compression strategies and has an easy fix. This moves the normalized count into the entropy workspace.	2021-03-04 16:12:11 -08:00
animalize	0933775d79	doc: ZSTD_free*() functions accept NULL pointer	2021-03-04 12:02:52 +08:00
Yann Collet	029f974ddc	strengthen compilation flags	2021-03-02 15:43:52 -08:00
W. Felix Handte	d7db928f72	Bump Library Version 1.4.8 -> 1.4.9	2021-03-01 17:45:30 -05:00
Yann Collet	61b63e9060	Merge pull request #2492 from niacat/dev Use standard md5 tool on NetBSD.	2021-02-24 16:38:10 -08:00
Nick Terrell	7736549bea	[bug-fix] Make simple single-pass functions ignore advanced parameters The simple compression functions are intended to ignore the advanced parameters, but they were accidentally using them. All the `ZSTD_parameters` were set correctly, but any extra parameters were used as-is. E.g. `ZSTD_c_format`. This PR makes all the simple single-pass functions listed below ignore the advanced parameters, as intended. * `ZSTD_compressCCtx()` * `ZSTD_compress_usingDict()` * `ZSTD_compress_usingCDict()` * `ZSTD_compress_advanced()` * `ZSTD_compress_usingCDict_advanced()` It also adds a test case that ensures that each of these functions ignore the advanced parameters.	2021-02-12 19:11:23 -08:00
Nick Terrell	c62eb05964	[lib] Set appliedParams.compressionLevel correctly Forward the correct compressionLevel to the appliedParams in all cases. It was already correct for the advanced API, so only the old single-pass functions needed to be fixed. This compression level is unused by the library, but is set so that the tracing framework can consume it.	2021-02-12 15:00:14 -08:00
Nick Terrell	f520f6dfbe	[trace] Minor fixes found during integration * Mark `ZSTD_CCtx_getParameter()` as const * Add `extern "C"` guards to `zstd_trace.h`	2021-02-11 16:20:04 -08:00
Yann Collet	8884cb887d	Merge pull request #2483 from mpu/ldmgear New algorithms for the long distance matcher	2021-02-11 08:38:23 -08:00
nia	74f85818a6	Use standard md5 tool on NetBSD. This avoids a GNU coreutils dependency. -n is used to match the output format of coreutils: http://man.netbsd.org/md5.1	2021-02-11 10:50:11 +01:00
Quentin Carbonneaux	552efcac2d	relocate large arrays from the stack to ldmState_t	2021-02-10 16:16:54 +01:00
Nick Terrell	e59c9459a5	[trace] Keep track of a uint64_t tracing context The most common information that you want to track between begin() and end() is the timestamp of the begin function, so you can measure the duration of the (de)compression call. Allow the tracing library to put this information inside the `ZSTD_TraceCtx`, so it doesn't need to keep a global map in this case. If a single uint64_t is not enough, the tracing library can return a unique identifier (like the context pointer) instead, and use it as a key in a map. This keeps the simple case simple.	2021-02-09 11:37:05 -08:00
Quentin Carbonneaux	e2ad174d73	fix some compiler warnings	2021-02-08 20:19:16 +01:00
Nick Terrell	54a4998a80	Add basic tracing functionality	2021-02-05 16:28:52 -08:00
Nick Terrell	f9b1e711ba	[zstd] Fix NULL pointer addition in ZSTD_checkContinuity() Don't start a new section when `dstSize == 0` to avoid NULL pointer addition.	2021-02-05 12:18:06 -08:00
Yann Collet	b9748757b0	fixed minor cast warning	2021-02-05 09:55:54 -08:00
Quentin Carbonneaux	874a590e5c	deal safely with short inputs in ZSTD_ldm_generateSequences The fuzzer CI found this bug.	2021-02-04 11:15:24 +01:00
Quentin Carbonneaux	9f327c02fd	new core ldm algorithm	2021-02-03 22:24:07 +01:00
Quentin Carbonneaux	aee3dc877f	fix a variable name to reflect its nature	2021-01-22 02:24:19 -08:00
Quentin Carbonneaux	d6e3de77dc	fix warning and remove one more occurrence of makeEntryAndInsertByTag	2021-01-20 01:39:16 -08:00
Quentin Carbonneaux	e0d5eca8fa	fix forgotten numTagBits in getTagMask	2021-01-20 00:54:20 -08:00
Quentin Carbonneaux	1e65711ca5	a couple performance improvement changes for ldm	2021-01-20 00:54:20 -08:00
Yann Collet	0bad3e5c0f	parallel make build on linux fix #2474	2021-01-18 11:33:03 -08:00
yumeyao	821d9acd17	Fix visibility of symbols in .so (#2441) Fix visibility of symbols in .so and add an alias for renamed API	2021-01-08 14:27:31 -08:00
sen	69085db61c	Merge pull request #2446 from senhuang42/multiple_ddicts_v3 [RFC] Support references to multiple DDicts	2021-01-08 16:49:45 -05:00
Yann Collet	33b73db33c	Merge pull request #2457 from facebook/cli-dll zstd CLI can now be linked to libzstd dynamic library	2021-01-07 17:10:13 -08:00
Yann Collet	c416015ab5	Merge pull request #2459 from ThomasWaldmann/fix-typos fix typos (work done by Andrea Gelmini)	2021-01-07 16:19:10 -08:00
senhuang42	9ae0dd9336	Fix Visual and staticanalyze warnings	2021-01-07 17:58:37 -05:00
Thomas Waldmann	92a2b5ccc9	fixup: lits means literals	2021-01-07 23:30:42 +01:00
Yann Collet	2901b5e675	Merge branch 'dev' into cli-dll	2021-01-07 10:24:09 -08:00
Thomas Waldmann	f9802d80a0	fix typos (work done by Andrea Gelmini)	2021-01-07 18:47:23 +01:00
senhuang42	c2c9b8a7ec	Address comments, clean up interface/internals	2021-01-07 12:29:12 -05:00
senhuang42	22b7bff2bc	Add unit test, improve documentation	2021-01-07 12:29:12 -05:00
senhuang42	ea52fc3606	Use XXHash for hash function, create a sensible public interface	2021-01-07 12:29:12 -05:00
senhuang42	7c1a79f232	Add debuglog statements	2021-01-07 12:29:11 -05:00
senhuang42	d1a6a9d285	Reference requested dict ID at decompression time	2021-01-07 12:29:11 -05:00
senhuang42	5a6d3eef2b	Allocate memory for DDict hash set when parameter is set	2021-01-07 12:29:11 -05:00
senhuang42	fd5b608f1c	Add parameter to control multiple DDicts	2021-01-07 12:29:11 -05:00
senhuang42	f933668d3f	Implement hashset for dictIDs	2021-01-07 12:29:11 -05:00
Yann Collet	cefdc023f7	The CLI can be linked to libzstd dynamic library invoking target zstd-dll	2021-01-06 18:00:24 -08:00
Yann Collet	9866148e22	removed redundant tests clang v3.8 tests are either flaky or redundant, prefer using clang-latest.	2021-01-06 15:40:20 -08:00
Nick Terrell	58476bcf7f	Don't shrink window log in ZSTD_getCParams() Treat ZSTD_getCParams() and ZSTD_adjustCParams() in the same way we treat streaming compression. Choose parameters based on the dictionary size + source size, and assume the source size is small if unknown. But, don't shrink the window log down in ZSTD_adjustCParams_internal().	2021-01-04 15:54:09 -08:00
Nick Terrell	9d31c704d5	Don't shrink window log when streaming with a dictionary Fixes #2442. 1. When creating a dictionary keep the same behavior as before. Assume the source size is 513 bytes when adjusting parameters. 2. When calling ZSTD_getCParams() or ZSTD_adjustCParams() keep the same behavior as before. 3. When attaching a dictionary keep the same behavior of ignoring the dictionary size. When streaming this will select the largest parameters and not adjust them down. But, the CDict will use the correctly sized parameters, which seems like the right tradeoff. 4. When not attaching a dictionary (either forced not to, or using a prefix dictionary) we select parameters based on the dictionary size + source size, and assume the source size is small, which is the same behavior as before. But, now we don't adjust the window log (and hash and chain log) down when the source size is unknown. When the source size is unknown all cdicts should attach, except when the user disables attaching, or `forceWindow` is used. This means that when streaming with a CDict we end up in the good case where we get small CDict parameters, and large source parameters. TODO: Add a streaming + dictionary regression test case.	2021-01-04 15:54:09 -08:00
Nick Terrell	66e811d782	[license] Update year to 2021	2021-01-04 17:53:52 -05:00
Yann Collet	cfff4c1cd5	Merge pull request #2439 from senhuang42/skippable_frame_api Generate skippable frame API	2020-12-28 11:22:07 -08:00
Gregory Szorc	dd1a7e41ee	Add ifndef guards for _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE This ensures the symbols aren't redefined, which would result in a compiler error. I was getting redefined symbols for _LARGEFILE64_SOURCE when building for 32-bit x86 Linux on an older CentOS release in a CI environment. With this change, I'm able to compile the single file library in this environment. Closes #2443.	2020-12-26 10:02:45 -07:00
senhuang42	5c41490bfe	Use pre-defined constants	2020-12-21 11:52:05 -05:00
senhuang42	7e11bd012b	Implement skippable frame function	2020-12-21 11:13:22 -05:00
Yann Collet	9648bf027b	try to keep libzstd.a "as is" once created to be compatible with scenarios such as `make -j allmost`	2020-12-20 17:10:57 -08:00
Yann Collet	7c495e8ea2	updated version number to v1.4.8	2020-12-18 15:52:11 -08:00
Yann Collet	a7cb4af573	added emphasis on the alignment condition of workspace and made it a programming mistake (`assert()`) rather than a runtime error.	2020-12-18 15:04:09 -08:00
Nick Terrell	ae85676d44	Fix alignment of scratchBuffer in HUF_compressWeights() The scratch buffer must be 4-byte aligned. This causes test failures in 32-bit systems, where the stack isn't aligned. Fixes Issue #2428.	2020-12-17 14:30:27 -08:00
Yann Collet	0b39531d75	moving all references to `release` branch was previously `master`	2020-12-16 23:00:35 -08:00
Yann Collet	f647a759fe	updated version number to v1.4.7 and updated doc	2020-12-15 20:53:05 -08:00
Yann Collet	b8c3a473ec	Merge pull request #2420 from terrelln/huf-comment [huf_compress] Refactor and comment HUF_buildCTable()	2020-12-14 16:14:07 -08:00
Yann Collet	c56723ab03	replace final links by direct copy link can behave slightly differently from real binaries, breaking a few scripts relying on "real binary" assumption.	2020-12-10 13:25:08 -08:00
Felix Handte	f861e8c07b	Merge pull request #2421 from felixhandte/pc-no-sed Don't Use Regexes to Build Pkg-Config File	2020-12-09 18:58:17 -05:00
W. Felix Handte	9dab03db90	Create Enum to Represent Static/Dynamic Allocation Distinction in cwksp	2020-12-09 14:57:37 -05:00
W. Felix Handte	db9e73cb07	Don't ASAN-Poison Statically-Allocated Workspaces Addresses #2286.	2020-12-09 13:00:47 -05:00
W. Felix Handte	b521183c74	Avoid Use of Regexes in Building Package-Config File	2020-12-08 20:10:05 -05:00
Nick Terrell	1bbcf07bd5	[huf_compress] Refactor and comment HUF_buildCTable() Comment and refactor `HUF_buildCTable()` and the helper functions it calls as I read and understand the code. Hopefully this refactor makes the code a bit more clear.	2020-12-08 13:57:01 -08:00
Yann Collet	b86e3c9304	Merge pull request #2415 from facebook/fix_aliasing fix gcc-10 strict aliasing warnings	2020-12-04 21:30:57 -08:00
Nick Terrell	fad175f9c1	Merge pull request #2412 from animalize/dict_compressionlevel use ZSTD_CLEVEL_DEFAULT in zdict.c	2020-12-04 17:09:30 -08:00
Yann Collet	6132df8dd3	fix gcc-10 strict aliasing warnings by exposing HUF_CElt declaration.	2020-12-04 16:43:19 -08:00
Yann Collet	68c14bdff2	minor speed improvement to HUF_readCTable() faster by ~+1-2%	2020-12-04 16:33:39 -08:00
Nick Terrell	c238db046f	Merge pull request #2414 from terrelln/mt-progress [lib] Ensure that multithreaded compression always makes some progress	2020-12-04 16:30:08 -08:00
Nick Terrell	4c58cb8383	[lib] Ensure that multithreaded compression always makes some progress	2020-12-03 20:25:14 -08:00
animalize	1aec77ea89	use ZSTD_CLEVEL_DEFAULT in zdict.c	2020-12-03 12:46:57 +08:00
Nick Terrell	6672689e7e	Merge pull request #2406 from terrelln/linux-wrapper-api [linux] Add the linux wrapper API	2020-12-02 16:49:03 -08:00
Yann Collet	91c1b57be9	Merge pull request #2409 from facebook/test_makefile Minor refactor	2020-12-02 15:33:54 -08:00
Nick Terrell	894ae36675	Merge pull request #2390 from animalize/clamp_level Clamp compression level	2020-12-02 14:35:58 -08:00
senhuang42	2cbd038528	Move max nb seq check to per-block	2020-12-02 12:11:32 -05:00
Nick Terrell	3cda5fae77	[minor][lib] Remove double semicolon	2020-12-02 01:08:08 -08:00
Yann Collet	9f8b180d5d	fixed API documentation	2020-12-02 00:15:07 -08:00
Yann Collet	6112b82526	Merge pull request #2348 from dscheg/dev Fix dll path in case of cross-compilation	2020-12-01 17:59:56 -08:00
senhuang42	3efe9c902b	Add sequence nb validation to compressSequences(), adjust minMatch comparisons	2020-12-01 10:54:45 -05:00
senhuang42	4c5f337248	Use cctx's minMatch instead of global MINMATCH, make fuzzer use validation	2020-11-30 15:41:20 -05:00
Dmitriy Titarenko	61f71753d4	Pass dictBufferCapacity to COVER_selectDict() closes #2371	2020-11-22 23:45:18 +05:00
sen	c5fbd55dac	Merge pull request #2387 from senhuang42/compress_sequence_API [RFC] New sequence compression API	2020-11-20 16:54:20 -05:00
senhuang42	7742f076b4	Add experimental param for sequence validation	2020-11-20 11:57:41 -05:00
senhuang42	0e32928b7d	Remove unnecessary repcode backup, apply style choices, use function pointer	2020-11-20 11:02:19 -05:00
sen	e924a0fa51	Explicit cast for visual warnings Github has automatic commits now! Cool Co-authored-by: Nick Terrell <nickrterrell@gmail.com>	2020-11-19 17:32:40 -05:00
senhuang42	dcbbf7c09f	Unroll isRLE loop	2020-11-19 12:38:13 -05:00
senhuang42	05c0229668	Clean up visual conversion warnings	2020-11-18 15:36:29 -05:00
senhuang42	3c4454769b	Improve documentation on ZSTD_compressSequences()	2020-11-18 09:52:24 -05:00
senhuang42	d6d7ba2a1f	Modification to offset validation to include entire sequence	2020-11-17 10:13:22 -05:00
senhuang42	8f3136a9c7	Fix assert edge case, improve documentation in zstd.h	2020-11-16 18:05:35 -05:00
senhuang42	f6baad87d6	Fix warnings and make validation enabled by default	2020-11-16 12:00:06 -05:00
senhuang42	55b90ef010	Fix unit tests to agree with new changes	2020-11-16 11:36:37 -05:00
senhuang42	7f563b0519	Add new sequence format as an experimental CCtx param	2020-11-16 10:49:17 -05:00
senhuang42	347824ad73	Overhaul logic to simplify, add in proper validations, fix match splitting	2020-11-16 10:49:17 -05:00
senhuang42	46824cb018	Add new sequence compress api params to cctx	2020-11-16 10:49:17 -05:00
senhuang42	48405b4633	Fix srcSize=0 edge case	2020-11-16 10:49:17 -05:00
senhuang42	022e6d81e7	Fix literals length calculation	2020-11-16 10:49:17 -05:00
senhuang42	dad20b5ccb	Remove dstCapacity error check	2020-11-16 10:49:17 -05:00
senhuang42	b8e16a2057	Remove extraneous function in this API	2020-11-16 10:49:17 -05:00
senhuang42	f29507c4fc	Add check comparing offset to window size	2020-11-16 10:49:17 -05:00
senhuang42	7a6e46a92f	Fix MSAN errors	2020-11-16 10:49:17 -05:00
senhuang42	cc2642bd17	Address edge case with endPosInSequence	2020-11-16 10:49:17 -05:00
senhuang42	fd10007174	Change debug levels to appropriate ones	2020-11-16 10:49:17 -05:00
senhuang42	2db8441245	Add RLE support	2020-11-16 10:49:17 -05:00
senhuang42	dfef298336	Fix various build warnings	2020-11-16 10:49:17 -05:00
senhuang42	2bbdddf24e	Add test case to roundtrip using ZSTD_getSequences() and ZSTD_compressSequences()	2020-11-16 10:49:16 -05:00
senhuang42	5fd69f8173	Add documentation for new api functions	2020-11-16 10:49:16 -05:00
senhuang42	e8b7fdb64b	Refactor for enhanced code clarity	2020-11-16 10:49:16 -05:00
senhuang42	c675fb46f1	Rename internal function compressSequences(), and promote new *_ext() functions to their actual name	2020-11-16 10:49:16 -05:00
senhuang42	013434e1e4	Add another API function to compress with existing CCTX	2020-11-16 10:49:16 -05:00
senhuang42	c44ce29013	More adjustments to improve code clarity	2020-11-16 10:49:16 -05:00
senhuang42	48f67da854	Pull compressStream2() transparent initialization into its own function	2020-11-16 10:49:16 -05:00
senhuang42	c86151f53c	Add initial support for new ZSTD_Sequence mode	2020-11-16 10:49:16 -05:00
senhuang42	e0f26afce9	Add sequence compression format param	2020-11-16 10:49:16 -05:00
senhuang42	f51af9a609	Always ensure sequenceRange updates properly, add more error forwarding	2020-11-16 10:49:16 -05:00
senhuang42	1a449688fd	Various minor logical refactors to improve clarity	2020-11-16 10:49:16 -05:00
senhuang42	e5fe485dcc	Fix cSize calculation for noCompressBlocks	2020-11-16 10:49:16 -05:00
senhuang42	6145ebb400	Rebased, roundtrips silesia.tar	2020-11-16 10:49:16 -05:00
senhuang42	b5b61cc216	Refactor for better debugging info	2020-11-16 10:49:16 -05:00
senhuang42	293fad6b45	Corrections and edge-case fixes to be able to roundtrip dickens	2020-11-16 10:49:16 -05:00
senhuang42	7eb6fa7be4	Multi-block compression scaffolding - works on single-block files	2020-11-16 10:49:16 -05:00
senhuang42	75b01f34b9	Add support for uncompressible blocks	2020-11-16 10:49:16 -05:00
senhuang42	e04da68157	Enable usage of ZSTD_sequenceRange for single-block compression	2020-11-16 10:49:16 -05:00
senhuang42	337fac216d	Add logic to handle ZSTD_sequenceRange	2020-11-16 10:49:16 -05:00
senhuang42	85822ddd53	Add last literals handling like getSequences()	2020-11-16 10:49:16 -05:00
senhuang42	2cff8df1a2	Pull block compression out of main compressSequences() function	2020-11-16 10:49:16 -05:00
senhuang42	cfced9344a	Implement ZSTD_updateSequenceRange	2020-11-16 10:49:16 -05:00
senhuang42	b116e1f211	Modify SequenceRange to have posInSequence	2020-11-16 10:49:16 -05:00
senhuang42	d99b675112	Add function definition for sequenceRange updater	2020-11-16 10:49:16 -05:00
senhuang42	74e95c05cc	Add ZSTD_SequenceRange to count ranges in array of ZSTD_Sequence	2020-11-16 10:49:16 -05:00
senhuang42	89f3848310	Add support for repcodes	2020-11-16 10:49:16 -05:00
senhuang42	3e930fd044	Code cleanup, add debuglog statements	2020-11-16 10:49:16 -05:00
senhuang42	086513b5b9	Implement first pass at compressSequences()	2020-11-16 10:49:16 -05:00
senhuang42	a9327b1e9b	Add initial function prototype for ZSTD_compressSequences_ext (to be renamed later)	2020-11-16 10:33:35 -05:00
animalize	52f8c07a3f	Clamp compression level in ZSTD_getCParams_internal() function	2020-11-14 13:26:08 +08:00
senhuang42	9d936d61d2	Reduce number of memcpy() calls	2020-11-13 19:43:30 -05:00
senhuang42	be4ac6c5bc	Use existing repcode update function to implement updates	2020-11-12 16:51:12 -05:00
senhuang42	674c9b9235	Add in proper block repcode histories	2020-11-12 15:34:37 -05:00
senhuang42	06c7f14066	Let block reps persist	2020-11-12 12:24:44 -05:00
senhuang42	396275068c	Fix incorrect repcode setting	2020-11-12 11:57:01 -05:00
senhuang42	1a8af0de73	Improve unit test	2020-11-12 11:09:09 -05:00
senhuang42	4d4fd2c55f	Overhaul repcode handling logic	2020-11-12 10:59:35 -05:00
Yann Collet	69b8361b0c	Merge pull request #2388 from facebook/fix2386 fix incorrect assert	2020-11-06 11:38:08 -08:00
sen	f62edf0fe9	Merge pull request #2381 from senhuang42/expand_sequence_extraction_api Add enum to define ZSTD_Sequence type and update sequence extraction API	2020-11-06 13:00:31 -05:00
Yann Collet	95e74616d5	fix multiple minor conversion warnings unrelated to #2386, just cleaning up while I'm updating this file ...	2020-11-06 09:57:05 -08:00
Yann Collet	2769e4d459	fix incorrect assert fix #2386, reported by @Neumann-A	2020-11-06 09:44:04 -08:00
senhuang42	7d1dea070c	Update unit tests	2020-11-06 11:10:37 -05:00
senhuang42	779df995c6	Implement mergeGeneratedSequences()	2020-11-06 10:55:46 -05:00
senhuang42	51abd58208	Rename getSequences() to generateSequences()	2020-11-06 10:53:22 -05:00
senhuang42	261ea69661	Add new mergeGeneratedSequences() function	2020-11-06 10:52:34 -05:00
Luke Pitt	eac309c71b	Add ZSTD_getDictID_fromCDict function to experimental section	2020-11-04 11:37:37 +00:00
senhuang42	f782cac3d4	Change block delimiter removing to linear time approach	2020-11-02 17:06:20 -05:00
senhuang42	3c9b43da1d	Remove trailing comma	2020-11-02 11:53:58 -05:00
senhuang42	3434049c1f	Use ZSTD_memmove() instead of memmove()	2020-11-02 11:43:19 -05:00
senhuang42	d4d0346b40	Update name of enum, clarify documentation	2020-11-02 11:38:17 -05:00
senhuang42	e6178f837f	Revert unnecessary seqCollector adjustment	2020-11-02 10:59:20 -05:00
senhuang42	e8501e00b8	Fix incorrect index increment in merge algorithm	2020-11-02 10:58:41 -05:00
senhuang42	a36fdada57	Add algorithm to remove all delimiters	2020-11-02 10:46:52 -05:00
senhuang42	435a3a0428	Update seqCollector definition	2020-11-02 10:19:26 -05:00
senhuang42	3327932609	Update ZSTD_getSequences function signature	2020-11-02 10:17:59 -05:00
senhuang42	7397d0102f	Add new enum for different sequence formats for ingestion/extraction	2020-11-02 10:15:53 -05:00
Nick Terrell	7205e609a9	Merge pull request #2354 from terrelln/stable-buffer Add ZSTD_c_stable{In,Out}Buffer and optimize when set	2020-10-30 15:06:56 -07:00
sen	c37c714ef1	Merge pull request #2376 from senhuang42/clarify_sequence_extraction_api Refine external ZSTD_Sequence API	2020-10-30 15:47:25 -04:00
Nick Terrell	d4e021fe35	[lib] Avoid allocating the input buffer when ZSTD_c_stableInBuffer is set We don't use it when we have a stable input buffer, so don't allocate it. I had to slightly modify `ZSTD_copyCCtx()` by storing the `ZSTD_buffered_policy_e` in the `ZSTD_CCtx`, since `inBuffSize > 0` is no longer the correct signal for the buffered mode.	2020-10-30 10:55:34 -07:00
Nick Terrell	24f72789e2	[lib] Skip the input window buffer when ZSTD_c_stableInBuffer is set Compress directly from the `ZSTD_inBuffer`. We still allocate the input buffer. A following commit will remove that allocation.	2020-10-30 10:55:34 -07:00
Nick Terrell	6bd6b6f7d3	[cwksp] Return NULL when 0 bytes are requested This ensures that the buffer is never used.	2020-10-30 10:55:34 -07:00
Nick Terrell	fcf81cee5e	[lib] Avoid allocating output buffer when ZSTD_c_stableOutBuffer is set We compress directly to the `ZSTD_outBuffer` so we don't need to allocate it.	2020-10-30 10:55:34 -07:00
Nick Terrell	6d5dc93d4e	[lib] Compress directly into output when ZSTD_c_stableOutBuffer is set When we have a stable output buffer always compress directly into the `ZSTD_outBuffer`. We are allowed to return `dstSizeTooSmall`.	2020-10-30 10:55:34 -07:00
Nick Terrell	987cb4ca6a	[lib] Take the shortcut when ZSTD_c_stableOutBuffer is set When we have a stable output buffer take the single-pass shortcut. It is okay to return `dstSizeTooSmall` if the output buffer isn't big enough, because we know it will never grow.	2020-10-30 10:55:34 -07:00
Nick Terrell	809b2f2071	[lib] Set ZSTD_c_stable{In,Out}Buffer in ZSTD_compress2() Sets these parameters in ZSTD_compress2() then resets them to their original values after the compression call. An alternative design could be to add a flush mode `ZSTD_e_singlePass` which implies `ZSTD_c_stable{In,Out}Buffer` but only for a single compression call, by directly setting the applied parameters. I've opted for the smaller change, but this is open for discussion.	2020-10-30 10:55:34 -07:00
Nick Terrell	c74be3f6de	[lib] Validate buffers when ZSTD_c_stable{In,Out}Buffer is set Adds the validation of the input/output buffers only. They are still unused.	2020-10-30 10:55:34 -07:00
Nick Terrell	e3e0775cc8	[API] Add ZSTD_c_stable{In,Out}Buffer parameters This commit adds the parameters and sets the value in the CCtxParams but it does not do anything with the value.	2020-10-30 10:54:39 -07:00
Nick Terrell	e2581d9572	[lib] Set appliedParams in zstdmt mode Previously only `nbWorkers` was set. Set all parameters, because that is what is expected. This is needed for the `ZSTD_c_stable{In,Out}Buffer` parameters.	2020-10-30 10:54:38 -07:00
senhuang42	f0da97642a	Specify that getSequences() will always emit block boundary sequences	2020-10-30 12:31:17 -04:00
senhuang42	536e89c723	Sequence extractor should update CBlockState	2020-10-30 12:13:19 -04:00
senhuang42	32cac2627a	Emit last literals of 0 size as well, to indicate block boundary	2020-10-29 16:41:17 -04:00
senhuang42	69bd5f0654	Correct literalsRead calculation to include longLength	2020-10-29 14:49:37 -04:00
senhuang42	59624f3163	Remove implicit typecast to appease appVeyor windows build	2020-10-28 16:25:09 -04:00
Yann Collet	09e3bb95d2	Merge branch 'dev' into libzstd_autoconf_full	2020-10-28 10:53:08 -07:00
Yann Collet	0adce4631d	Merge branch 'libzstd_autoconf_full' of github.com:facebook/zstd into libzstd_autoconf_full	2020-10-28 10:25:55 -07:00
Yann Collet	f6ecf1568f	minor Makefile refactor hopefully improving readability	2020-10-28 09:39:15 -07:00
senhuang42	3ed5d053d8	Clarify comments in zstd.h some more	2020-10-28 09:53:09 -04:00
Nick Terrell	599ff58e08	Merge pull request #2339 from terrelln/zstdmt-stability Fix zstdmt stability issues and clean up the zstdmt code	2020-10-27 19:43:13 -07:00
Yann Collet	ceccd7ae2d	Merge branch 'dev' into libzstd_autoconf_full	2020-10-27 15:45:30 -07:00
Yann Collet	2d2507b9db	Merge pull request #2374 from bket/portability 'head -c BYTES' is non-portable	2020-10-27 14:15:35 -07:00
sen	17b700d78a	Merge pull request #2366 from senhuang42/enable_ldm_by_default Enable LDM by default if window size >= 128MB and strategy uses opt parser	2020-10-27 14:59:28 -04:00
Nick Terrell	0953645837	Merge pull request #2362 from senhuang42/fix_ldm_fuzz_issue Fix long distance matcher OSS-fuzz issue	2020-10-27 11:13:03 -07:00
senhuang42	3163909d14	Remove unused variable position	2020-10-27 12:58:12 -04:00
senhuang42	dc448563e9	Add test compatibility with last literals in sequences	2020-10-27 12:35:28 -04:00
Björn Ketelaars	1f661b5f6b	'head -c BYTES' is non-portable	2020-10-27 16:55:23 +01:00
senhuang42	1d221ecc03	Add support for representing last literals in the extracted seqs	2020-10-27 11:19:48 -04:00
senhuang42	9171f920cd	Improve documentation of seqStore_t	2020-10-27 10:50:22 -04:00
senhuang42	96b0ff7886	Improve documentation regarding various operations in copyBlockSequences	2020-10-27 10:36:06 -04:00
senhuang42	3a11c7eb03	Modify ZSTD_copyBlockSequences to agree with new API	2020-10-27 10:31:40 -04:00
senhuang42	761f40d1c6	Clarify and modify ZSTD_Sequence definition	2020-10-27 09:41:32 -04:00
Yann Collet	456db0c377	make install only rebuild binaries if they don't exist Now `make` followed by `make install` doesn't rebuild binaries also : only generate target directories if they don't already exist	2020-10-23 16:46:49 -07:00
Yann Collet	a6ee614a44	make zstd is now differentiated from zstd-nomt avoids mixing object files using different flags	2020-10-23 16:08:21 -07:00
Yann Collet	89b961ea46	simplified silent mode maintenance	2020-10-23 10:41:17 -07:00
Yann Collet	ffe8d9e428	fix partial lib test	2020-10-23 10:27:12 -07:00
Yann Collet	b5d4728713	simplified silent mode	2020-10-23 10:22:52 -07:00
Yann Collet	a7ad05bf57	fixed building libzstd with manual BUILD_DIR and when HASH is not found	2020-10-23 10:14:04 -07:00
Yann Collet	d3f1a9b5bd	fix partial-build test sometimes, the scope difference is solely determined by the list of source files, not by the flags.	2020-10-22 21:36:09 -07:00
Yann Collet	a912ef0952	can integrate later dynamic flags changes for example `libzstd-mt` is differentiated from `libzstd`	2020-10-22 18:48:06 -07:00
Yann Collet	f90424da2d	Merge pull request #2368 from facebook/progressive_libzstd faster rebuild of libzstd	2020-10-22 17:36:56 -07:00
Yann Collet	ce6cd07c33	updated build documentation	2020-10-22 12:31:23 -07:00
Yann Collet	e3867fb735	fixed libzstd.dll compilation on mingw and zstd linking	2020-10-22 11:52:19 -07:00
Yann Collet	494f7169ed	fix directory creation for Windows' libzstd	2020-10-22 00:15:31 -07:00
Yann Collet	dd24496951	programs/zstd also automatically generate object dir per conf same rules as lib/libzstd can also be controlled via HASH and BUILD_DIR	2020-10-21 23:38:33 -07:00
Yann Collet	0f8ee5c51e	Merge branch 'dev' into libzstd_autoconf	2020-10-21 22:36:09 -07:00
Yann Collet	d0436b2a45	automatically detect configuration changes Makefile now automatically detects modifications of compilation flags, and produces object files in directories dedicated to these compilation flags. This makes it possible, for example, to compile libzstd with different DEBUGLEVEL. Object files sharing the same configuration will be generated into their dedicated directories. Also : new compilation variables - DEBUGLEVEL : select the debug level (assert & traces) inserted during compilation (default == 0 == release) - HASH : select a hash function to differentiate configuration (default == md5sum) - BUILD_DIR : skip the hash stage, store object files into manually specified directory	2020-10-21 19:22:45 -07:00
Yann Collet	8a453a34c5	automatic %.h header dependency tracking also : BUILD_DIR can be manually specified	2020-10-21 17:25:07 -07:00
Yann Collet	2224ec33ed	Merge pull request #2367 from facebook/progressive_build faster rebuild of zstd	2020-10-21 15:43:14 -07:00
Yann Collet	2b99bc29bf	consolidated vpath	2020-10-21 04:01:01 -07:00
Yann Collet	e8eb2939fe	store %.o object files into obj/ both static and dynamic libraries have their own object directory	2020-10-21 03:44:38 -07:00
Yann Collet	3e519be965	minor cleaning	2020-10-21 03:22:27 -07:00
Yann Collet	911dbdbb4b	build libzstd.so from object files %.o object files generated for dynamic library must be different from those generated for static library. Due to this difference, %.o were so far only generated for the static library. The dynamic library was rebuilt from %.c source. This meant that, for every minor change, the entire dynamic library had to be rebuilt. This is fixed in this PR : only the modified %.c source get rebuilt.	2020-10-20 22:19:57 -07:00
senhuang42	8bdb32aebe	Add a function for LDM enable check	2020-10-20 13:46:02 -04:00
senhuang42	578e889ec1	Move ldm enable to compressStream2()	2020-10-20 13:04:45 -04:00
senhuang42	d28d8a1d72	Include LDM tables size for CCtx size estimation where relevant	2020-10-20 09:21:30 -04:00
senhuang42	b1c7fc5768	Add compatibility for multithreading	2020-10-19 12:07:06 -04:00
senhuang42	aad436da37	Document ldm enabled by default in zstd.h	2020-10-19 11:02:29 -04:00
senhuang42	590f7f55f0	Add ldm enable condition in ZSTD_resetCCtx_internal	2020-10-19 10:26:17 -04:00
senhuang42	4d01979b62	Expose and call ZSTD_ldm_skipRawSeqStoreBytes()	2020-10-16 20:30:00 -04:00
Yann Collet	a0ec50c2dc	Merge pull request #2355 from senhuang42/change_ldm_mt_config Reduce --long mode MT jobsize at higher levels	2020-10-16 13:35:50 -07:00
Yann Collet	314c7df170	minor : change test order to reduce a warning with `clang` on Windows	2020-10-16 13:26:47 -07:00
senhuang42	f49926edf4	Change cycleLog adjustment to +3 from +4	2020-10-15 09:56:05 -04:00
senhuang42	ee84817fe7	Reset posInSequence when using ZSTD_referenceExternalSequences()	2020-10-14 22:06:08 -04:00
senhuang42	d0550bb18f	Clarify argument names, fix DEBUGLOG() statements	2020-10-14 15:45:43 -04:00
senhuang42	3f99c9b38d	Adjust match backwards count args	2020-10-14 15:23:03 -04:00
senhuang42	bf0d559449	Introduce, implement, and call ZSTD_ldm_countBackwardsMatch_2segments()	2020-10-14 12:58:06 -04:00
senhuang42	467e4383b0	Merge branch 'dev' of github.com:senhuang42/zstd into change_ldm_mt_config	2020-10-14 10:17:50 -04:00
Nick Terrell	8c46c1d851	Merge pull request #2356 from bsdimp/neon aarch64: use __ARM_NEON instead of __aarch64__ to control use of neon	2020-10-13 15:42:46 -07:00
Yann Collet	1283935ac2	Merge pull request #2281 from likema/fix-aix-51 Fix building on AIX 5.1	2020-10-13 13:09:33 -07:00
Yann Collet	f5d5cd3b40	Merge pull request #2341 from senhuang42/ldm_optimized_for_opt_parser Integrate long distance matches into optimal parser	2020-10-13 13:09:07 -07:00
Warner Losh	43c0054405	aarch64: use __ARM_NEON instead of __aarch64__ to control use of neon There are compilation environments in aarch64 where NEON isn't available. While these environments could define ZSTD_NO_INTRINSICS, it's more fail-safe to use the more specific symbol to know if NEON extensions are available. __ARM_NEON is the proper symbol, defined in ARM C Language Extensions Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some sources suggest __ARM_NEON__, but that's the obsolete spelling from prior versions of the standard. Signed-off-by: Warner Losh <imp@bsdimp.com>	2020-10-13 12:12:46 -06:00
Nick Terrell	7e6f91ed84	[minor] Improve docs and add an assert in response to review	2020-10-12 16:43:17 -07:00
senhuang42	354b5f1c0a	Use cycleLog instead of chainLog to determine LDM jobLog	2020-10-12 16:09:59 -04:00
Nick Terrell	441ce4178f	[zstdmt] Clarify a comment	2020-10-12 12:58:13 -07:00
Nick Terrell	efff5d8b2d	[zstdmt] Fix determinism issue with rsyncable mode The problem occurs in this scenario: 1. We find a synchronization point. 2. We attempt to create the job. 3. We fail because the job table is full: `mtctx->nextJobID > mtctx->doneJobID + mtctx->jobIDMask`. 4. We call `ZSTDMT_compressStream_generic` again. 5. We forget that we're at a sync point already, and we continue looking for the next sync point. This fix is to detect if we're currently paused at a sync point, and if we are then don't load any more input. Caught by zstreamtest. I modified it to make the bug occur more often (~1/100K -> ~1/200) and verified that it is fixed after. I then ran a few hundred thousand unmodified zstreamtest iterations to verify.	2020-10-12 12:55:17 -07:00
Nick Terrell	ede4f97153	[zstdmt] Fix bug where extra empty blocks are emitted When zstdmt cannot get a buffer and `ZSTD_e_end` is passed an empty compression job can be created. Additionally, `mtctx->frameEnded` can be set to 1, which could potentially cause problems like unterminated blocks. The fix is to adjust to `ZSTD_e_flush` even when we can't get a buffer.	2020-10-12 12:55:17 -07:00
Nick Terrell	c51a9e79b9	[zstdmt] Rip out the zstdmt API This commit leaves only the functions used by zstd_compress.c. All other functions have been removed from the API. The ZSTDMT unit tests in fuzzer.c and zstreamtest.c have been rewritten to use the ZSTD API. And the --mt zstreamtest tests have been ripped out.	2020-10-12 12:55:16 -07:00
Nick Terrell	1784c4b4ab	[zstdmt] Remove single-pass shortcut Simplifies the code and removes blocking from zstdmt. At this point we could completely delete `ZSTDMT_compress_advanced_internal()`. However I'm leaving it in because I think we want to do that in the zstd-1.5.0 release, in case anyone is still using the ZSTDMT API, even though it is not installed by default. Fixes #2327.	2020-10-12 12:53:26 -07:00
Nick Terrell	b55ae009ac	[zstdmt] Remove singleBlockingThread mode This is already handled by zstd, so this logic is never used.	2020-10-12 12:53:26 -07:00
Nick Terrell	d5c688e8ae	Fix ZSTD_adjustCParams_internal() to handle dictionary logic Pass in the `ZSTD_cParamMode_e` to select how we define our cparams. Based on the mode we either take the `dictSize` into account or we set it to `0`. See the documentation for `ZSTD_cParamMode_e`. Some of the modes currently share the same behavior. But they have distinct modes because they are drastically different cases. E.g. compression + reprocessing the dictionary and creating a cdict. Additionally, when downsizing the hashLog and chainLog take the (adjusted) dictionary size into account, since the size of the dictionary gets added onto the window size. Adds a simple test to ensure that we aren't downsizing too far.	2020-10-12 12:50:04 -07:00
Nick Terrell	fadaab8c7c	[minor improvement] Pass 0 as the content size in the DDS The DDS structure can't be copied into the working tables like the DMS. So it doesn't need to account for the source size when sizing its parameters, just the dictionary size.	2020-10-12 12:47:21 -07:00
Nick Terrell	48ef15fb47	[minor improvement] Pass dictSize when selecting parameters When selecting parameters in streaming compression with a dictionary use the dictionary size to select the parameters.	2020-10-12 12:47:19 -07:00
Nick Terrell	012818df99	[refactor] Remove ZSTD_resetCStream_internal() This function is only called in one place. It isn't a logical separation of duties, and it was only obfuscating the code now, so inline it.	2020-10-12 12:46:10 -07:00
Nick Terrell	7083f79008	[bug] Fix dictContentType when reprocessing cdict Conditions to trigger: * CDict is loaded as raw content. * CDict starts with the zstd dictionary magic number. * The CDict is reprocessed (not attached or copied). * The new API is used (streaming or `ZSTD_compress2()`). Bug: The dictionary is loaded as a zstd dictionary, not a raw content dictionary, because the dict content type is set to `ZSTD_dct_auto`. Fix: Pass in the dictionary content type from cdict creation to the call to `ZSTD_compress_insertDictionary()`. Test: Added a test case that exposes the bug, and fixed the raw content tests to not modify the `dictBuffer`, which makes all future tests with the `dictBuffer` raw content, which doesn't seem intentional.	2020-10-12 12:46:10 -07:00
Dmitriy Titarenko	1b28d6501c	Fixed dll path in case of cross-compilation	2020-10-11 23:51:44 +05:00
senhuang42	d6911b86be	Require LDM matches to be strictly greater in length	2020-10-09 12:56:18 -04:00
Like Ma	cc907770bd	Fix building on AIX 5.1	2020-10-09 18:34:00 +08:00
Yann Collet	12541931fa	Merge pull request #2328 from marxin/zstd-pool-api Allow external creation of POOLs that can be shared.	2020-10-09 01:00:50 -07:00
Yann Collet	6fdb0cb8d9	Merge pull request #2303 from senhuang42/let_cdict_take_clevel_priority For ZSTD_compressStream2(), let cdict take compression level priority	2020-10-09 00:48:30 -07:00
senhuang42	b9c8033cde	Define kNullRawSeqStore for every file	2020-10-07 19:02:41 -04:00
senhuang42	a6165c1b28	Change matchState_t::ldmSeqStore to pointer	2020-10-07 14:13:57 -04:00
senhuang42	abce708a56	Move posInSequence correction to correct location	2020-10-07 13:56:25 -04:00
senhuang42	0c515590d8	Replace offCode of largest match if ldm's offCode is superior	2020-10-07 13:56:25 -04:00
senhuang42	0fac8e07e1	Refactor usage of ms->ldmSeqStore so that it is not modified during compressBlock(), and simplify skipRawSeqStoreBytes	2020-10-07 13:56:25 -04:00
senhuang42	a5500cf2af	Refactor separate ldm variables all into one struct	2020-10-07 13:56:25 -04:00
senhuang42	0731b94e7c	Use kNullRawSeqStore constant in zstdmt_compress.c	2020-10-07 13:56:25 -04:00
senhuang42	0325d878f2	Remove bubbling down matches with longer offCode and same matchLen	2020-10-07 13:56:25 -04:00
senhuang42	031b7ec15f	Disable LDM minMatch adjustment when using opt parser	2020-10-07 13:56:25 -04:00
senhuang42	ddf8a3f1b9	Enable inclusion of mid-flight LDMs in opt parser	2020-10-07 13:56:25 -04:00
senhuang42	88f72ed942	Correct incorrect offcode calculation	2020-10-07 13:56:25 -04:00
senhuang42	d8b43a4202	Add explicit conversion of size_t to U32	2020-10-07 13:56:25 -04:00
senhuang42	b8bfc4e63d	Add cSize regression test to fuzzer.c	2020-10-07 13:56:25 -04:00
senhuang42	c87d2e5866	Prefix new static ldm helpers with ZSTD_opt	2020-10-07 13:56:25 -04:00
senhuang42	429dec4f42	Add DEBUGLOG() calls in ldm helpers	2020-10-07 13:56:25 -04:00
senhuang42	10647924f1	Make function descriptions more accurate	2020-10-07 13:56:25 -04:00
senhuang42	1a687b3fcb	Improve documentation of relevant structs	2020-10-07 13:56:25 -04:00
senhuang42	37617e23d7	Correct matchLength calculation and remove unnecessary functions	2020-10-07 13:56:25 -04:00
senhuang42	7dee62c287	Reset ldmSeqStore after initStats_ultra() pass for btultra2	2020-10-07 13:56:25 -04:00
senhuang42	0718aa70df	Refactor existing functions to use posInSequence	2020-10-07 13:56:25 -04:00
senhuang42	7348b40a87	Adjustments to ldm_calculateMatchRange() to calculate bounds correctly	2020-10-07 13:56:25 -04:00
senhuang42	a1ef2db5b2	Add ldm_calculateMatchRange() function	2020-10-07 13:56:25 -04:00
senhuang42	ef823e0299	Remove rawSeqStore.base and add rawSeqStore.posInSequence	2020-10-07 13:56:25 -04:00
senhuang42	4793ae3b84	Prevent duplicate LDMs from being inserted	2020-10-07 13:56:25 -04:00
senhuang42	65f9cfeeec	Add extra bounds check to prevent heap access after free ASAN error	2020-10-07 13:56:25 -04:00
senhuang42	bff5785fd5	Address mixed variables C90 warning	2020-10-07 13:56:25 -04:00
senhuang42	724b94ed18	ldm_getNextMatch fixed return values	2020-10-07 13:56:25 -04:00
senhuang42	ea92fb3a68	Cleanups, add comments and explanations	2020-10-07 13:56:25 -04:00
senhuang42	78da2e1808	Fixed sifting algorithm	2020-10-07 13:56:25 -04:00
senhuang42	6ccd97fc96	Fixed end of match boundary update issues	2020-10-07 13:56:25 -04:00
senhuang42	28394b64f2	Add proper bounds check on adding ldms	2020-10-07 13:56:25 -04:00
senhuang42	a2f2b58d04	Add a function ldm_voidSequences()	2020-10-07 13:56:25 -04:00
senhuang42	9c3c7cd20e	Fix function argument to getNextMatch()	2020-10-07 13:56:25 -04:00
senhuang42	c8b8572b38	Adjustments to no longer segfault on nci	2020-10-07 13:56:25 -04:00
senhuang42	f57c7e6bbf	Add base adjustment correction	2020-10-07 13:56:25 -04:00
senhuang42	5df9b5e05f	Add initial getNextMatch() in opt parser	2020-10-07 13:56:25 -04:00
senhuang42	f8ce7cabc3	Added more debugging	2020-10-07 13:56:25 -04:00
senhuang42	84009a076a	Add re-copying of ldmSeqStore after processing	2020-10-07 13:56:25 -04:00
senhuang42	42395a70c2	Add debug statements, flesh out functions	2020-10-07 13:56:25 -04:00
senhuang42	dd3dd199bb	Get zstd to build with new functions and callsites, fix arguments	2020-10-07 13:56:25 -04:00
senhuang42	766c4a8c28	Implement part of ldm_maybeAddLdm()	2020-10-07 13:56:25 -04:00
senhuang42	84777059d2	Implement ldm_getNextMatch()	2020-10-07 13:56:24 -04:00
senhuang42	28c74bf591	Implement basic splitSequence and skipSequence functions	2020-10-07 13:56:24 -04:00
senhuang42	634ab7830d	Flesh out required args for ldm_handleLdm()	2020-10-07 13:56:24 -04:00
senhuang42	db70761032	Add callsites to appropriate locations in ..opt_generic()	2020-10-07 13:56:24 -04:00
senhuang42	aea61e3c91	Add ldm helper function declarations into opt parser	2020-10-07 13:56:24 -04:00
senhuang42	35d9f488f5	Modify codepath to use opt parser exclusively if the compression level is high enough	2020-10-07 13:56:24 -04:00
senhuang42	e1ae398ad5	Add rawSeqStore to match state	2020-10-07 13:56:24 -04:00
Martin Liska	b684900a4a	Allow external creation of POOLs that can be shared.	2020-10-07 12:44:33 +02:00
Nick Terrell	4b4d8b4dc9	Merge pull request #2338 from terrelln/comments Add comments to ZSTD_getLowest{Match,Prefix}Index()	2020-10-01 18:56:24 -07:00
Nick Terrell	0057c4acf7	Merge pull request #2333 from terrelln/stable-dst Reset all decompression parameters in ZSTD_DCtx_reset()	2020-10-01 18:56:11 -07:00
Nick Terrell	2e7d174130	Reset all decompression parameters in ZSTD_DCtx_reset() * Reset all decompression parameters in `ZSTD_DCtx_reset()` when resetting parameters. * Add a test case.	2020-10-01 14:19:21 -07:00
Nick Terrell	27c969ed07	Add comments to ZSTD_getLowest{Match,Prefix}Index() Clarify how we handle dictionaries in each case.	2020-10-01 13:21:46 -07:00
Yann Collet	cc88eb7594	Merge pull request #2317 from animalize/msvc_inline Let MSVC force inline ZSTD_hashPtr() function	2020-09-30 08:27:53 -07:00
Nick Terrell	f1cbeec039	[superblock] Reduce stack usage by correctly sizing header buffers	2020-09-24 19:42:04 -07:00
Nick Terrell	6a1e526ea7	[lib] Add ZSTD_COMPRESS_HEAPMODE tuning parameter	2020-09-24 19:42:04 -07:00
Nick Terrell	b841387218	[freestanding] Improve macro resolution to handle #if X	2020-09-24 19:42:04 -07:00
Nick Terrell	caecd8c211	Allow user to override ASAN/MSAN detection Rename ADDRESS_SANITIZER -> ZSTD_ADDRESS_SANITIZER and same for MEMORY_SANITIZER. Also set it to 0/1 instead of checking for defined. This allows the user to override ASAN/MSAN detection for platforms that don't support it.	2020-09-24 19:42:04 -07:00
Nick Terrell	88fac5d514	Remove call to memset The previous commit fixes the test so it errors on calls to mem*() functions from <string.h>.	2020-09-24 19:42:04 -07:00
Nick Terrell	9ae0483858	Reorganize zstd_deps.h and mem.h + replace mem.h for the kernel	2020-09-24 19:41:59 -07:00
Nick Terrell	260fc75028	Move __has_builtin() fallback define to compiler.h	2020-09-24 15:51:08 -07:00
Nick Terrell	4d63ee57f5	Move ASAN/MSAN support declarations to compiler.h	2020-09-24 15:51:08 -07:00
Nick Terrell	b09ec5c2b9	Remove MEM_STATIC_ASSERT and use DEBUG_STATIC_ASSERT instead	2020-09-24 15:51:04 -07:00
Nick Terrell	9261476b7d	[lib] Wrap customMem xor checks in parens for readability This clarifies operator precedence, and quiets cppcheck in the Kernel Test Robot. I think this is a slight bonus to readability, so I am accepting the suggestion.	2020-09-23 23:26:07 -07:00
Nick Terrell	dec7fb03ec	[lib] Silence -Wunused-const-variable warnings	2020-09-23 12:59:57 -07:00
animalize	2e5d73dd72	Use `MEM_STATIC FORCE_INLINE_ATTR` instead of `FORCE_INLINE_TEMPLATE` It adds `__attribute__((unused))` for __GNUC__, to eliminate `-Werror=unused-function` error.	2020-09-21 13:26:38 +08:00
animalize	0a69a6b1ca	Let MSVC force inline ZSTD_hashPtr() function ZSTD_hashPtr() function was not expanded by MSVC, led to low performance compared to GCC.	2020-09-21 10:38:55 +08:00
Felix Handte	200c960f1d	Merge pull request #2311 from felixhandte/ddss-fix-cparam-derivation Fix Compression Parameter Derivation Bugs Introduced by DDSS Changes	2020-09-18 14:02:14 -04:00
W. Felix Handte	8930c6e551	Use ZSTD_CCtxParams_init() to Init CCtxParams, not memset() Even if the discrepancies are at the moment benign, it's probably better to standardize on using the one true initializer, rather than trying (and failing) to correctly duplicate its behavior.	2020-09-17 12:15:33 -04:00
W. Felix Handte	e8a44326fa	Avoid Redundancy in ZSTD_initCDict_internal() Args; Don't Take CParams + CCtxParams	2020-09-17 12:08:36 -04:00
W. Felix Handte	eee51a664a	Fall Back if Derived CParams are Incompatible with DDSS; Refactor CDict Creation Rewrite ZSTD_createCDict_advanced() as a wrapper around ZSTD_createCDict_advanced2(). Evaluate whether to use DDSS mode after fully resolving cparams. If not, fall back.	2020-09-15 18:01:08 -04:00
W. Felix Handte	bc6521a6f6	Make ZSTD_createCDict_advanced2() cctxParams Arg Const	2020-09-15 14:06:10 -04:00
W. Felix Handte	26a96a5b35	Do More Complete CParams Deduction in Non-DDSS Path of ZSTD_createCDict_advanced2 Call ZSTD_getCParamsFromCCtxParams() instead of ZSTD_getCParams_internal().	2020-09-15 13:57:43 -04:00
W. Felix Handte	a2af804129	Pull CParam Override Logic into Helper	2020-09-15 13:38:05 -04:00
Yann Collet	c91a0855f8	check endDirective in ZSTD_compressStream2() fix #2297 also : - `assert()` `endDirective` in `ZSTD_compressStream_internal()`, for debug mode - add relevant tests	2020-09-14 10:56:08 -07:00
senhuang42	17b56f934e	Coding style cleanup	2020-09-11 11:42:12 -04:00
senhuang42	801513b5e7	Modify params rather than cctx->requestedParams	2020-09-11 11:41:10 -04:00
W. Felix Handte	c5fab8848a	Document searchFuncs Table	2020-09-10 22:10:02 -04:00
W. Felix Handte	85a95840e4	Further Consolidate Dict Mode Checks	2020-09-10 22:10:02 -04:00
W. Felix Handte	032010fcc1	Improve Documentation Slightly	2020-09-10 22:10:02 -04:00
W. Felix Handte	0faefbf1b3	Make DDSS Selection Override ForceCopy Directive	2020-09-10 22:10:02 -04:00
W. Felix Handte	efa33861f2	Attempt to Fix MSVC Warnings	2020-09-10 22:10:02 -04:00
W. Felix Handte	ed43832770	Simplify Match Limit Checks Seems like a ~1.25% speedup.	2020-09-10 22:10:02 -04:00
W. Felix Handte	06d240b8a7	Use All Available Space in the Hash Table to Extend Chain Table Reach Rather than restrict our temp chain table to 2 ** chainLog entries, this commit uses all available space to reach further back to gather longer chains to pack into the DDSS chain table.	2020-09-10 22:10:02 -04:00
W. Felix Handte	b2b0641ea0	Rewrite Table Fill to Retain Cache Entries Beyond Chain Window	2020-09-10 22:10:02 -04:00
W. Felix Handte	916238d9dc	Avoid Malloc in Table Fill; Pack Tmp Structure into Hash Table	2020-09-10 22:10:02 -04:00
W. Felix Handte	f42c5bddd9	Truncate Chain at Last Possible Attempt Make the chain table denser?	2020-09-10 22:10:02 -04:00
W. Felix Handte	20a020edbc	Prefetch Chain Table Matches	2020-09-10 22:10:02 -04:00
W. Felix Handte	9b9feb84f2	Lay Out Chain Table Chains Contiguously Rather than interleave all of the chain table entries, tying each entry's position to the corresponding position in the input, this commit changes the layout so that all the entries in a single chain are laid out next to each other. The last entry in the hash table's bucket for this hash is now a packed pointer of position + length of this chain. This cannot be merged as written, since it allocates temporary memory inside ZSTD_dedicatedDictSearch_lazy_loadDictionary().	2020-09-10 22:10:02 -04:00
W. Felix Handte	66509c7bf4	Only Insert Positions Inside the Chain Window	2020-09-10 22:10:02 -04:00
W. Felix Handte	13c5ec3e41	Only Allow Dedicated Dict Search for Dicts Loaded in 1 Chunk The load algorithm requires we do it all in one go.	2020-09-10 22:10:02 -04:00
W. Felix Handte	07793547e6	Fix Bug: Only Use DDSS Insertion on CDict MatchStates Previously, if DDSS was enabled on a CCtx and a dictionary was inserted into the CCtx, the CCtx MatchState would be filled as a DDSS struct, causing segfaults etc. This changes the check to use whether the MatchState is marked as using the DDSS (which is only ever set for CDict MatchStates), rather than looking at the CCtxParams.	2020-09-10 18:51:52 -04:00
W. Felix Handte	d214d8c859	Shorten Dict Mode Conditionals in Order to Improve Readability	2020-09-10 18:51:52 -04:00
W. Felix Handte	f49c1563ff	Force-Inline ZSTD_insertAndFindFirstIndex_internal() Without this, gcc was declining to inline the function in `ZSTD_noDict` mode, resulting in a ~10% slowdown.	2020-09-10 18:51:52 -04:00
W. Felix Handte	cab86b074f	Clean Up Search Function Selection	2020-09-10 18:51:52 -04:00
W. Felix Handte	2ffbde0d95	Fix `-Wshorten-64-to-32` Error	2020-09-10 18:51:52 -04:00
W. Felix Handte	7b5d2f72ea	Adjust Working Context Table Sizes Back Down	2020-09-10 18:51:52 -04:00
W. Felix Handte	c09454e28f	Add Warning Comment to ZSTD_createCDict_advanced2() Declaration	2020-09-10 18:51:52 -04:00
W. Felix Handte	d332f57897	Permit Matching Against Lowest Valid Position This comparison was previously faulty: the lowest valid position is itself valid, and we should therefore be allowed to match against it.	2020-09-10 18:51:52 -04:00
W. Felix Handte	a3659fe1ef	Make ZSTD_dedicatedDictSearch_getCParams Wrap ZSTD_getCParams Fixes up bounds-checking, and lets us clean up what is at the moment an unnecessary duplication of the default cparams tables.	2020-09-10 18:51:52 -04:00
W. Felix Handte	7b9a755ac9	Remove Chain Limit on Hash Cache Entries; Slightly Improve Compression Entries in the hashTable chain cache aren't subject to the same aliasing that the circular chain table is subject to. As such, we don't need to stop when we cross the chain limit. We can delve deeper. :)	2020-09-10 18:51:52 -04:00
W. Felix Handte	e8b4011b52	Split Lookups in Hash Cache and Chain Table into Two Loops Sliiiight speedup.	2020-09-10 18:51:52 -04:00
W. Felix Handte	9e83c782f8	Simplify DDS Hash Table Construction No need to walk the chainTable; we can just keep shifting the entries in the hashTable.	2020-09-10 18:51:52 -04:00
W. Felix Handte	ad9f98ac3f	Document the ZSTD_c_enableDedicatedDictSearch Parameter	2020-09-10 18:51:52 -04:00
W. Felix Handte	5390fee4f7	Rename and Move DD_BLOG Constant to ZSTD_LAZY_DDSS_BUCKET_LOG	2020-09-10 18:51:52 -04:00
W. Felix Handte	5e91ae27eb	Prefetch First Batch of Match Positions; +11% Speed in Level 5 w/ 1 Dict	2020-09-10 18:51:52 -04:00

4226 Commits