4226 Commits

Author SHA1 Message Date
Felix Handte
f861e8c07b
Merge pull request #2421 from felixhandte/pc-no-sed
Don't Use Regexes to Build Pkg-Config File
2020-12-09 18:58:17 -05:00
W. Felix Handte
9dab03db90 Create Enum to Represent Static/Dynamic Allocation Distinction in cwksp 2020-12-09 14:57:37 -05:00
W. Felix Handte
db9e73cb07 Don't ASAN-Poison Statically-Allocated Workspaces
Addresses #2286.
2020-12-09 13:00:47 -05:00
W. Felix Handte
b521183c74 Avoid Use of Regexes in Building Package-Config File 2020-12-08 20:10:05 -05:00
Nick Terrell
1bbcf07bd5 [huf_compress] Refactor and comment HUF_buildCTable()
Comment and refactor `HUF_buildCTable()` and the helper functions
it calls as I read and understand the code. Hopefully this refactor
makes the code a bit more clear.
2020-12-08 13:57:01 -08:00
Yann Collet
b86e3c9304
Merge pull request #2415 from facebook/fix_aliasing
fix gcc-10 strict aliasing warnings
2020-12-04 21:30:57 -08:00
Nick Terrell
fad175f9c1
Merge pull request #2412 from animalize/dict_compressionlevel
use ZSTD_CLEVEL_DEFAULT in zdict.c
2020-12-04 17:09:30 -08:00
Yann Collet
6132df8dd3 fix gcc-10 strict aliasing warnings
by exposing HUF_CElt declaration.
2020-12-04 16:43:19 -08:00
Yann Collet
68c14bdff2 minor speed improvement to HUF_readCTable()
faster by ~+1-2%
2020-12-04 16:33:39 -08:00
Nick Terrell
c238db046f
Merge pull request #2414 from terrelln/mt-progress
[lib] Ensure that multithreaded compression always makes some progress
2020-12-04 16:30:08 -08:00
Nick Terrell
4c58cb8383 [lib] Ensure that multithreaded compression always makes some progress 2020-12-03 20:25:14 -08:00
animalize
1aec77ea89 use ZSTD_CLEVEL_DEFAULT in zdict.c 2020-12-03 12:46:57 +08:00
Nick Terrell
6672689e7e
Merge pull request #2406 from terrelln/linux-wrapper-api
[linux] Add the linux wrapper API
2020-12-02 16:49:03 -08:00
Yann Collet
91c1b57be9
Merge pull request #2409 from facebook/test_makefile
Minor refactor
2020-12-02 15:33:54 -08:00
Nick Terrell
894ae36675
Merge pull request #2390 from animalize/clamp_level
Clamp compression level
2020-12-02 14:35:58 -08:00
senhuang42
2cbd038528 Move max nb seq check to per-block 2020-12-02 12:11:32 -05:00
Nick Terrell
3cda5fae77 [minor][lib] Remove double semicolon 2020-12-02 01:08:08 -08:00
Yann Collet
9f8b180d5d fixed API documentation 2020-12-02 00:15:07 -08:00
Yann Collet
6112b82526
Merge pull request #2348 from dscheg/dev
Fix dll path in case of cross-compilation
2020-12-01 17:59:56 -08:00
senhuang42
3efe9c902b Add sequence nb validation to compressSequences(), adjust minMatch comparisons 2020-12-01 10:54:45 -05:00
senhuang42
4c5f337248 Use cctx's minMatch instead of global MINMATCH, make fuzzer use validation 2020-11-30 15:41:20 -05:00
Dmitriy Titarenko
61f71753d4 Pass dictBufferCapacity to COVER_selectDict()
closes #2371
2020-11-22 23:45:18 +05:00
sen
c5fbd55dac
Merge pull request #2387 from senhuang42/compress_sequence_API
[RFC] New sequence compression API
2020-11-20 16:54:20 -05:00
senhuang42
7742f076b4 Add experimental param for sequence validation 2020-11-20 11:57:41 -05:00
senhuang42
0e32928b7d Remove unnecessary repcode backup, apply style choices, use function pointer 2020-11-20 11:02:19 -05:00
sen
e924a0fa51
Explicit cast for visual warnings
Github has automatic commits now! Cool

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2020-11-19 17:32:40 -05:00
senhuang42
dcbbf7c09f Unroll isRLE loop 2020-11-19 12:38:13 -05:00
senhuang42
05c0229668 Clean up visual conversion warnings 2020-11-18 15:36:29 -05:00
senhuang42
3c4454769b Improve documentation on ZSTD_compressSequences() 2020-11-18 09:52:24 -05:00
senhuang42
d6d7ba2a1f Modification to offset validation to include entire sequence 2020-11-17 10:13:22 -05:00
senhuang42
8f3136a9c7 Fix assert edge case, improve documentation in zstd.h 2020-11-16 18:05:35 -05:00
senhuang42
f6baad87d6 Fix warnings and make validation enabled by default 2020-11-16 12:00:06 -05:00
senhuang42
55b90ef010 Fix unit tests to agree with new changes 2020-11-16 11:36:37 -05:00
senhuang42
7f563b0519 Add new sequence format as an experimental CCtx param 2020-11-16 10:49:17 -05:00
senhuang42
347824ad73 Overhaul logic to simplify, add in proper validations, fix match splitting 2020-11-16 10:49:17 -05:00
senhuang42
46824cb018 Add new sequence compress api params to cctx 2020-11-16 10:49:17 -05:00
senhuang42
48405b4633 Fix srcSize=0 edge case 2020-11-16 10:49:17 -05:00
senhuang42
022e6d81e7 Fix literals length calculation 2020-11-16 10:49:17 -05:00
senhuang42
dad20b5ccb Remove dstCapacity error check 2020-11-16 10:49:17 -05:00
senhuang42
b8e16a2057 Remove extraneous function in this API 2020-11-16 10:49:17 -05:00
senhuang42
f29507c4fc Add check comparing offset to window size 2020-11-16 10:49:17 -05:00
senhuang42
7a6e46a92f Fix MSAN errors 2020-11-16 10:49:17 -05:00
senhuang42
cc2642bd17 Address edge case with endPosInSequence 2020-11-16 10:49:17 -05:00
senhuang42
fd10007174 Change debug levels to appropriate ones 2020-11-16 10:49:17 -05:00
senhuang42
2db8441245 Add RLE support 2020-11-16 10:49:17 -05:00
senhuang42
dfef298336 Fix various build warnings 2020-11-16 10:49:17 -05:00
senhuang42
2bbdddf24e Add test case to roundtrip using ZSTD_getSequences() and ZSTD_compressSequences() 2020-11-16 10:49:16 -05:00
senhuang42
5fd69f8173 Add documentation for new api functions 2020-11-16 10:49:16 -05:00
senhuang42
e8b7fdb64b Refactor for enhanced code clarity 2020-11-16 10:49:16 -05:00
senhuang42
c675fb46f1 Rename internal function compressSequences(), and promote new *_ext() functions to their actual name 2020-11-16 10:49:16 -05:00
senhuang42
013434e1e4 Add another API function to compress with existing CCTX 2020-11-16 10:49:16 -05:00
senhuang42
c44ce29013 More adjustments to improve code clarity 2020-11-16 10:49:16 -05:00
senhuang42
48f67da854 Pull compressStream2() transparent initialization into its own function 2020-11-16 10:49:16 -05:00
senhuang42
c86151f53c Add initial support for new ZSTD_Sequence mode 2020-11-16 10:49:16 -05:00
senhuang42
e0f26afce9 Add sequence compression format param 2020-11-16 10:49:16 -05:00
senhuang42
f51af9a609 Always ensure sequenceRange updates properly, add more error forwarding 2020-11-16 10:49:16 -05:00
senhuang42
1a449688fd Various minor logical refactors to improve clarity 2020-11-16 10:49:16 -05:00
senhuang42
e5fe485dcc Fix cSize calculation for noCompressBlocks 2020-11-16 10:49:16 -05:00
senhuang42
6145ebb400 Rebased, roundtrips silesia.tar 2020-11-16 10:49:16 -05:00
senhuang42
b5b61cc216 Refactor for better debugging info 2020-11-16 10:49:16 -05:00
senhuang42
293fad6b45 Corrections and edge-case fixes to be able to roundtrip dickens 2020-11-16 10:49:16 -05:00
senhuang42
7eb6fa7be4 Multi-block compression scaffolding - works on single-block files 2020-11-16 10:49:16 -05:00
senhuang42
75b01f34b9 Add support for uncompressible blocks 2020-11-16 10:49:16 -05:00
senhuang42
e04da68157 Enable usage of ZSTD_sequenceRange for single-block compression 2020-11-16 10:49:16 -05:00
senhuang42
337fac216d Add logic to handle ZSTD_sequenceRange 2020-11-16 10:49:16 -05:00
senhuang42
85822ddd53 Add last literals handling like getSequences() 2020-11-16 10:49:16 -05:00
senhuang42
2cff8df1a2 Pull block compression out of main compressSequences() function 2020-11-16 10:49:16 -05:00
senhuang42
cfced9344a Implement ZSTD_updateSequenceRange 2020-11-16 10:49:16 -05:00
senhuang42
b116e1f211 Modify SequenceRange to have posInSequence 2020-11-16 10:49:16 -05:00
senhuang42
d99b675112 Add function definition for sequenceRange updater 2020-11-16 10:49:16 -05:00
senhuang42
74e95c05cc Add ZSTD_SequenceRange to count ranges in array of ZSTD_Sequence 2020-11-16 10:49:16 -05:00
senhuang42
89f3848310 Add support for repcodes 2020-11-16 10:49:16 -05:00
senhuang42
3e930fd044 Code cleanup, add debuglog statements 2020-11-16 10:49:16 -05:00
senhuang42
086513b5b9 Implement first pass at compressSequences() 2020-11-16 10:49:16 -05:00
senhuang42
a9327b1e9b Add initial function prototype for ZSTD_compressSequences_ext (to be renamed later) 2020-11-16 10:33:35 -05:00
animalize
52f8c07a3f Clamp compression level in ZSTD_getCParams_internal() function 2020-11-14 13:26:08 +08:00
senhuang42
9d936d61d2 Reduce number of memcpy() calls 2020-11-13 19:43:30 -05:00
senhuang42
be4ac6c5bc Use existing repcode update function to implement updates 2020-11-12 16:51:12 -05:00
senhuang42
674c9b9235 Add in proper block repcode histories 2020-11-12 15:34:37 -05:00
senhuang42
06c7f14066 Let block reps persist 2020-11-12 12:24:44 -05:00
senhuang42
396275068c Fix incorrect repcode setting 2020-11-12 11:57:01 -05:00
senhuang42
1a8af0de73 Improve unit test 2020-11-12 11:09:09 -05:00
senhuang42
4d4fd2c55f Overhaul repcode handling logic 2020-11-12 10:59:35 -05:00
Yann Collet
69b8361b0c
Merge pull request #2388 from facebook/fix2386
fix incorrect assert
2020-11-06 11:38:08 -08:00
sen
f62edf0fe9
Merge pull request #2381 from senhuang42/expand_sequence_extraction_api
Add enum to define ZSTD_Sequence type and update sequence extraction API
2020-11-06 13:00:31 -05:00
Yann Collet
95e74616d5 fix multiple minor conversion warnings
unrelated to #2386, just cleaning up while I'm updating this file ...
2020-11-06 09:57:05 -08:00
Yann Collet
2769e4d459 fix incorrect assert
fix #2386, reported by @Neumann-A
2020-11-06 09:44:04 -08:00
senhuang42
7d1dea070c Update unit tests 2020-11-06 11:10:37 -05:00
senhuang42
779df995c6 Implement mergeGeneratedSequences() 2020-11-06 10:55:46 -05:00
senhuang42
51abd58208 Rename getSequences() to generateSequences() 2020-11-06 10:53:22 -05:00
senhuang42
261ea69661 Add new mergeGeneratedSequences() function 2020-11-06 10:52:34 -05:00
Luke Pitt
eac309c71b Add ZSTD_getDictID_fromCDict function to experimental section 2020-11-04 11:37:37 +00:00
senhuang42
f782cac3d4 Change block delimiter removing to linear time approach 2020-11-02 17:06:20 -05:00
senhuang42
3c9b43da1d Remove trailing comma 2020-11-02 11:53:58 -05:00
senhuang42
3434049c1f Use ZSTD_memmove() instead of memmove() 2020-11-02 11:43:19 -05:00
senhuang42
d4d0346b40 Update name of enum, clarify documentation 2020-11-02 11:38:17 -05:00
senhuang42
e6178f837f Revert unnecessary seqCollector adjustment 2020-11-02 10:59:20 -05:00
senhuang42
e8501e00b8 Fix incorrect index increment in merge algorithm 2020-11-02 10:58:41 -05:00
senhuang42
a36fdada57 Add algorithm to remove all delimiters 2020-11-02 10:46:52 -05:00
senhuang42
435a3a0428 Update seqCollector definition 2020-11-02 10:19:26 -05:00
senhuang42
3327932609 Update ZSTD_getSequences function signature 2020-11-02 10:17:59 -05:00
senhuang42
7397d0102f Add new enum for different sequence formats for ingestion/extraction 2020-11-02 10:15:53 -05:00
Nick Terrell
7205e609a9
Merge pull request #2354 from terrelln/stable-buffer
Add ZSTD_c_stable{In,Out}Buffer and optimize when set
2020-10-30 15:06:56 -07:00
sen
c37c714ef1
Merge pull request #2376 from senhuang42/clarify_sequence_extraction_api
Refine external ZSTD_Sequence API
2020-10-30 15:47:25 -04:00
Nick Terrell
d4e021fe35 [lib] Avoid allocating the input buffer when ZSTD_c_stableInBuffer is set
We don't use it when we have a stable input buffer, so don't allocate
it. I had to slightly modify `ZSTD_copyCCtx()` by storing the
`ZSTD_buffered_policy_e` in the `ZSTD_CCtx`, since `inBuffSize > 0` is
no longer the correct signal for the buffered mode.
2020-10-30 10:55:34 -07:00
Nick Terrell
24f72789e2 [lib] Skip the input window buffer when ZSTD_c_stableInBuffer is set
Compress directly from the `ZSTD_inBuffer`. We still allocate the input
buffer. A following commit will remove that allocation.
2020-10-30 10:55:34 -07:00
Nick Terrell
6bd6b6f7d3 [cwksp] Return NULL when 0 bytes are requested
This ensures that the buffer is never used.
2020-10-30 10:55:34 -07:00
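
The `[cwksp]` commit above returns NULL for zero-byte requests so that an unused buffer can never be touched by accident. A minimal sketch of that idea, assuming a simple bump allocator (the `Workspace` type and the `ws_init`/`ws_reserve` names are hypothetical and do not mirror the real `ZSTD_cwksp` layout):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bump-allocator workspace; not the real ZSTD_cwksp layout. */
typedef struct {
    unsigned char* next;  /* first free byte */
    unsigned char* end;   /* one past the last usable byte */
} Workspace;

static void ws_init(Workspace* ws, void* buf, size_t size)
{
    assert(ws != NULL && buf != NULL);
    ws->next = (unsigned char*)buf;
    ws->end  = ws->next + size;
}

/* Returns NULL for a zero-byte request, so the result cannot be
 * dereferenced by mistake, and NULL when the workspace is exhausted. */
static void* ws_reserve(Workspace* ws, size_t bytes)
{
    assert(ws != NULL);
    if (bytes == 0) return NULL;
    if ((size_t)(ws->end - ws->next) < bytes) return NULL;
    {   void* const p = ws->next;
        ws->next += bytes;
        return p;
    }
}
```

A caller that skips the input buffer can then request 0 bytes and keep a NULL pointer, which makes any stray use fail fast instead of silently aliasing the next allocation.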
Nick Terrell
fcf81cee5e [lib] Avoid allocating output buffer when ZSTD_c_stableOutBuffer is set
We compress directly to the `ZSTD_outBuffer` so we don't need to
allocate it.
2020-10-30 10:55:34 -07:00
Nick Terrell
6d5dc93d4e [lib] Compress directly into output when ZSTD_c_stableOutBuffer is set
When we have a stable output buffer always compress directly into the
`ZSTD_outBuffer`. We are allowed to return `dstSizeTooSmall`.
2020-10-30 10:55:34 -07:00
Nick Terrell
987cb4ca6a [lib] Take the shortcut when ZSTD_c_stableOutBuffer is set
When we have a stable output buffer take the single-pass shortcut.
It is okay to return `dstSizeTooSmall` if the output buffer isn't
big enough, because we know it will never grow.
2020-10-30 10:55:34 -07:00
Nick Terrell
809b2f2071 [lib] Set ZSTD_c_stable{In,Out}Buffer in ZSTD_compress2()
Sets these parameters in ZSTD_compress2() then resets them to their
original values after the compression call.

An alternative design could be to add a flush mode `ZSTD_e_singlePass`
which implies `ZSTD_c_stable{In,Out}Buffer` but only for a single
compression call, by directly setting the applied parameters. I've opted
for the smaller change, but this is open for discussion.
2020-10-30 10:55:34 -07:00
Nick Terrell
c74be3f6de [lib] Validate buffers when ZSTD_c_stable{In,Out}Buffer is set
Adds the validation of the input/output buffers only. They are still
unused.
2020-10-30 10:55:34 -07:00
Nick Terrell
e3e0775cc8 [API] Add ZSTD_c_stable{In,Out}Buffer parameters
This commit adds the parameters and sets the value in the CCtxParams
but it does not do anything with the value.
2020-10-30 10:54:39 -07:00
Nick Terrell
e2581d9572 [lib] Set appliedParams in zstdmt mode
Previously only `nbWorkers` was set. Set all parameters, because that is
what is expected. This is needed for the `ZSTD_c_stable{In,Out}Buffer`
parameters.
2020-10-30 10:54:38 -07:00
senhuang42
f0da97642a Specify that getSequences() will always emit block boundary sequences 2020-10-30 12:31:17 -04:00
senhuang42
536e89c723 Sequence extractor should update CBlockState 2020-10-30 12:13:19 -04:00
senhuang42
32cac2627a Emit last literals of 0 size as well, to indicate block boundary 2020-10-29 16:41:17 -04:00
senhuang42
69bd5f0654 Correct literalsRead calculation to include longLength 2020-10-29 14:49:37 -04:00
senhuang42
59624f3163 Remove implicit typecast to appease appVeyor windows build 2020-10-28 16:25:09 -04:00
Yann Collet
09e3bb95d2 Merge branch 'dev' into libzstd_autoconf_full 2020-10-28 10:53:08 -07:00
Yann Collet
0adce4631d Merge branch 'libzstd_autoconf_full' of github.com:facebook/zstd into libzstd_autoconf_full 2020-10-28 10:25:55 -07:00
Yann Collet
f6ecf1568f minor Makefile refactor
hopefully improving readability
2020-10-28 09:39:15 -07:00
senhuang42
3ed5d053d8 Clarify comments in zstd.h some more 2020-10-28 09:53:09 -04:00
Nick Terrell
599ff58e08
Merge pull request #2339 from terrelln/zstdmt-stability
Fix zstdmt stability issues and clean up the zstdmt code
2020-10-27 19:43:13 -07:00
Yann Collet
ceccd7ae2d Merge branch 'dev' into libzstd_autoconf_full 2020-10-27 15:45:30 -07:00
Yann Collet
2d2507b9db
Merge pull request #2374 from bket/portability
'head -c BYTES' is non-portable
2020-10-27 14:15:35 -07:00
sen
17b700d78a
Merge pull request #2366 from senhuang42/enable_ldm_by_default
Enable LDM by default if window size >= 128MB and strategy uses opt parser
2020-10-27 14:59:28 -04:00
Nick Terrell
0953645837
Merge pull request #2362 from senhuang42/fix_ldm_fuzz_issue
Fix long distance matcher OSS-fuzz issue
2020-10-27 11:13:03 -07:00
senhuang42
3163909d14 Remove unused variable position 2020-10-27 12:58:12 -04:00
senhuang42
dc448563e9 Add test compatibility with last literals in sequences 2020-10-27 12:35:28 -04:00
Björn Ketelaars
1f661b5f6b 'head -c BYTES' is non-portable 2020-10-27 16:55:23 +01:00
senhuang42
1d221ecc03 Add support for representing last literals in the extracted seqs 2020-10-27 11:19:48 -04:00
senhuang42
9171f920cd Improve documentation of seqStore_t 2020-10-27 10:50:22 -04:00
senhuang42
96b0ff7886 Improve documentation regarding various operations in copyBlockSequences 2020-10-27 10:36:06 -04:00
senhuang42
3a11c7eb03 Modify ZSTD_copyBlockSequences to agree with new API 2020-10-27 10:31:40 -04:00
senhuang42
761f40d1c6 Clarify and modify ZSTD_Sequence definition 2020-10-27 09:41:32 -04:00
Yann Collet
456db0c377 make install only rebuild binaries if they don't exist
Now `make` followed by `make install` doesn't rebuild binaries

also : only generate target directories if they don't already exist
2020-10-23 16:46:49 -07:00
Yann Collet
a6ee614a44 make zstd is now differentiated from zstd-nomt
avoids mixing object files using different flags
2020-10-23 16:08:21 -07:00
Yann Collet
89b961ea46 simplified silent mode maintenance 2020-10-23 10:41:17 -07:00
Yann Collet
ffe8d9e428 fix partial lib test 2020-10-23 10:27:12 -07:00
Yann Collet
b5d4728713 simplified silent mode 2020-10-23 10:22:52 -07:00
Yann Collet
a7ad05bf57 fixed building libzstd with manual BUILD_DIR
and when HASH is not found
2020-10-23 10:14:04 -07:00
Yann Collet
d3f1a9b5bd fix partial-build test
sometimes, the scope difference is solely determined by the list of source files,
not by the flags.
2020-10-22 21:36:09 -07:00
Yann Collet
a912ef0952 can integrate later dynamic flags changes
for example `libzstd-mt` is differentiated from `libzstd`
2020-10-22 18:48:06 -07:00
Yann Collet
f90424da2d
Merge pull request #2368 from facebook/progressive_libzstd
faster rebuild of libzstd
2020-10-22 17:36:56 -07:00
Yann Collet
ce6cd07c33 updated build documentation 2020-10-22 12:31:23 -07:00
Yann Collet
e3867fb735 fixed libzstd.dll compilation on mingw
and zstd linking
2020-10-22 11:52:19 -07:00
Yann Collet
494f7169ed fix directory creation for Windows' libzstd 2020-10-22 00:15:31 -07:00
Yann Collet
dd24496951 programs/zstd also automatically generate object dir per conf
same rules as lib/libzstd
can also be controlled via HASH and BUILD_DIR
2020-10-21 23:38:33 -07:00
Yann Collet
0f8ee5c51e Merge branch 'dev' into libzstd_autoconf 2020-10-21 22:36:09 -07:00
Yann Collet
d0436b2a45 automatically detect configuration changes
Makefile now automatically detects modifications of compilation flags,
and produces object files in directories dedicated to these compilation flags.
This makes it possible, for example, to compile libzstd with different DEBUGLEVEL.
Object files sharing the same configuration will be generated into their dedicated directories.

Also : new compilation variables
- DEBUGLEVEL : select the debug level (assert & traces) inserted during compilation (default == 0 == release)
- HASH : select a hash function to differentiate configuration (default == md5sum)
- BUILD_DIR : skip the hash stage, store object files into manually specified directory
2020-10-21 19:22:45 -07:00
Yann Collet
8a453a34c5 automatic %.h header dependency tracking
also : BUILD_DIR can be manually specified
2020-10-21 17:25:07 -07:00
Yann Collet
2224ec33ed
Merge pull request #2367 from facebook/progressive_build
faster rebuild of zstd
2020-10-21 15:43:14 -07:00
Yann Collet
2b99bc29bf consolidated vpath 2020-10-21 04:01:01 -07:00
Yann Collet
e8eb2939fe store %.o object files into obj/
both static and dynamic libraries have their own object directory
2020-10-21 03:44:38 -07:00
Yann Collet
3e519be965 minor cleaning 2020-10-21 03:22:27 -07:00
Yann Collet
911dbdbb4b build libzstd.so from object files
%.o object files generated for dynamic library
must be different from those generated for static library.

Due to this difference, %.o were so far only generated for the static library.
The dynamic library was rebuilt from %.c source.

This meant that, for every minor change, the entire dynamic library had to be rebuilt.

This is fixed in this PR :
only the modified %.c sources get rebuilt.
2020-10-20 22:19:57 -07:00
senhuang42
8bdb32aebe Add a function for LDM enable check 2020-10-20 13:46:02 -04:00
senhuang42
578e889ec1 Move ldm enable to compressStream2() 2020-10-20 13:04:45 -04:00
senhuang42
d28d8a1d72 Include LDM tables size for CCtx size estimation where relevant 2020-10-20 09:21:30 -04:00
senhuang42
b1c7fc5768 Add compatibility for multithreading 2020-10-19 12:07:06 -04:00
senhuang42
aad436da37 Document ldm enabled by default in zstd.h 2020-10-19 11:02:29 -04:00
senhuang42
590f7f55f0 Add ldm enable condition in ZSTD_resetCCtx_internal 2020-10-19 10:26:17 -04:00
senhuang42
4d01979b62 Expose and call ZSTD_ldm_skipRawSeqStoreBytes() 2020-10-16 20:30:00 -04:00
Yann Collet
a0ec50c2dc
Merge pull request #2355 from senhuang42/change_ldm_mt_config
Reduce --long mode MT jobsize at higher levels
2020-10-16 13:35:50 -07:00
Yann Collet
314c7df170 minor : change test order
to reduce a warning with `clang` on Windows
2020-10-16 13:26:47 -07:00
senhuang42
f49926edf4 Change cycleLog adjustment to +3 from +4 2020-10-15 09:56:05 -04:00
senhuang42
ee84817fe7 Reset posInSequence when using ZSTD_referenceExternalSequences() 2020-10-14 22:06:08 -04:00
senhuang42
d0550bb18f Clarify argument names, fix DEBUGLOG() statements 2020-10-14 15:45:43 -04:00
senhuang42
3f99c9b38d Adjust match backwards count args 2020-10-14 15:23:03 -04:00
senhuang42
bf0d559449 Introduce, implement, and call ZSTD_ldm_countBackwardsMatch_2segments() 2020-10-14 12:58:06 -04:00
senhuang42
467e4383b0 Merge branch 'dev' of github.com:senhuang42/zstd into change_ldm_mt_config 2020-10-14 10:17:50 -04:00
Nick Terrell
8c46c1d851
Merge pull request #2356 from bsdimp/neon
aarch64: use __ARM_NEON instead of __aarch64__ to control use of neon
2020-10-13 15:42:46 -07:00
Yann Collet
1283935ac2
Merge pull request #2281 from likema/fix-aix-51
Fix building on AIX 5.1
2020-10-13 13:09:33 -07:00
Yann Collet
f5d5cd3b40
Merge pull request #2341 from senhuang42/ldm_optimized_for_opt_parser
Integrate long distance matches into optimal parser
2020-10-13 13:09:07 -07:00
Warner Losh
43c0054405 aarch64: use __ARM_NEON instead of __aarch64__ to control use of neon
There are compilation environments in aarch64 where NEON isn't
available. While these environments could define ZSTD_NO_INTRINSICS,
it's more fail-safe to use the more specific symbol to know if NEON
extensions are available.

__ARM_NEON is the proper symbol, defined in ARM C Language Extensions
Release 2.1 (https://developer.arm.com/documentation/ihi0053/d/). Some
sources suggest __ARM_NEON__, but that's the obsolete spelling from
prior versions of the standard.

Signed-off-by: Warner Losh <imp@bsdimp.com>
2020-10-13 12:12:46 -06:00
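
The NEON commit above gates intrinsics on `__ARM_NEON` rather than on the architecture macro. A minimal sketch of that guard, assuming a 16-byte copy helper (the `copy16` function and the `COPY16_USES_NEON` macro are illustrative names, not taken from the commit):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Use NEON only when the compiler advertises it and the user has not
 * opted out; being on aarch64 alone is not treated as sufficient. */
#if !defined(ZSTD_NO_INTRINSICS) && defined(__ARM_NEON)
#  include <arm_neon.h>
#  define COPY16_USES_NEON 1
#else
#  define COPY16_USES_NEON 0
#endif

/* Copies exactly 16 bytes from src to dst (buffers must not overlap). */
static void copy16(void* dst, const void* src)
{
    assert(dst != NULL && src != NULL);
#if COPY16_USES_NEON
    vst1q_u8((uint8_t*)dst, vld1q_u8((const uint8_t*)src));
#else
    memcpy(dst, src, 16);
#endif
}
```

On toolchains without NEON the helper falls back to `memcpy()`, so the same source builds in both environments.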
Nick Terrell
7e6f91ed84 [minor] Improve docs and add an assert in response to review 2020-10-12 16:43:17 -07:00
senhuang42
354b5f1c0a Use cycleLog instead of chainLog to determine LDM jobLog 2020-10-12 16:09:59 -04:00
Nick Terrell
441ce4178f [zstdmt] Clarify a comment 2020-10-12 12:58:13 -07:00
Nick Terrell
efff5d8b2d [zstdmt] Fix determinism issue with rsyncable mode
The problem occurs in this scenario:
1. We find a synchronization point.
2. We attempt to create the job.
3. We fail because the job table is full: `mtctx->nextJobID > mtctx->doneJobID + mtctx->jobIDMask`.
4. We call `ZSTDMT_compressStream_generic` again.
5. We forget that we're at a sync point already, and we continue looking
   for the next sync point.

This fix is to detect if we're currently paused at a sync point, and if
we are then don't load any more input.

Caught by zstreamtest. I modified it to make the bug occur more often
(~1/100K -> ~1/200) and verified that it is fixed after. I then ran a
few hundred thousand unmodified zstreamtest iterations to verify.
2020-10-12 12:55:17 -07:00
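
The rsyncable fix described above can be illustrated with a toy model: once the buffered input ends at a sync point, no further input is loaded until the job is actually created. This is only a sketch under stated assumptions. The `ToyMT` struct, `toy_step()`, and the `atSyncPoint` flag are hypothetical; only the job-table-full condition is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the job scheduler; every name here is hypothetical. */
typedef struct {
    size_t   filled;       /* bytes buffered for the next job */
    size_t   lastJobSize;  /* size of the most recently created job */
    int      atSyncPoint;  /* buffered input already ends at a sync point */
    unsigned nextJobID, doneJobID, jobIDMask;
} ToyMT;

/* Same condition as quoted in the commit message. */
static int toy_jobTableFull(const ToyMT* mt)
{
    return mt->nextJobID > mt->doneJobID + mt->jobIDMask;
}

/* One streaming call: `avail` input bytes are offered and the next sync
 * point lies `bytesToSync` bytes ahead. Returns the bytes consumed. */
static size_t toy_step(ToyMT* mt, size_t avail, size_t bytesToSync)
{
    size_t consumed = 0;
    assert(mt != NULL);
    if (!mt->atSyncPoint) {  /* the fix: never load past a pending sync point */
        consumed = (bytesToSync < avail) ? bytesToSync : avail;
        mt->filled += consumed;
        if (bytesToSync <= avail) mt->atSyncPoint = 1;
    }
    if (mt->atSyncPoint && !toy_jobTableFull(mt)) {
        mt->lastJobSize = mt->filled;  /* job boundary is the sync point */
        mt->filled = 0;
        mt->atSyncPoint = 0;
        mt->nextJobID++;
    }
    return consumed;
}
```

In this model the job size depends only on where the sync point falls, not on whether the job table happened to be full when it was reached, which is the determinism property the commit restores.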
Nick Terrell
ede4f97153 [zstdmt] Fix bug where extra empty blocks are emitted
When zstdmt cannot get a buffer and `ZSTD_e_end` is passed an empty
compression job can be created. Additionally, `mtctx->frameEnded` can be
set to 1, which could potentially cause problems like unterminated blocks.

The fix is to adjust to `ZSTD_e_flush` even when we can't get a buffer.
2020-10-12 12:55:17 -07:00
Nick Terrell
c51a9e79b9 [zstdmt] Rip out the zstdmt API
This commit leaves only the functions used by zstd_compress.c. All other
functions have been removed from the API. The ZSTDMT unit tests in
fuzzer.c and zstreamtest.c have been rewritten to use the ZSTD API. And
the --mt zstreamtest tests have been ripped out.
2020-10-12 12:55:16 -07:00
Nick Terrell
1784c4b4ab [zstdmt] Remove single-pass shortcut
Simplifies the code and removes blocking from zstdmt.

At this point we could completely delete
`ZSTDMT_compress_advanced_internal()`. However I'm leaving it in because
I think we want to do that in the zstd-1.5.0 release, in case anyone is
still using the ZSTDMT API, even though it is not installed by default.

Fixes #2327.
2020-10-12 12:53:26 -07:00
Nick Terrell
b55ae009ac [zstdmt] Remove singleBlockingThread mode
This is already handled by zstd, so this logic is never used.
2020-10-12 12:53:26 -07:00
Nick Terrell
d5c688e8ae Fix ZSTD_adjustCParams_internal() to handle dictionary logic
Pass in the `ZSTD_cParamMode_e` to select how we define our cparams.
Based on the mode we either take the `dictSize` into account or we set
it to `0`. See the documentation for `ZSTD_cParamMode_e`.

Some of the modes currently share the same behavior. But they have
distinct modes because they are drastically different cases. E.g.
compression + reprocessing the dictionary and creating a cdict.

Additionally, when downsizing the hashLog and chainLog take the
(adjusted) dictionary size into account, since the size of the
dictionary gets added onto the window size.

Adds a simple test to ensure that we aren't downsizing too far.
2020-10-12 12:50:04 -07:00
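
The downsizing rule in the commit above (the dictionary size is added onto the window size before the table logs are capped) might look roughly like the following. This is a simplified sketch, not the library's code: the helper names are hypothetical, and capping `hashLog` at one more than the combined log is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest n such that 2^n >= v. */
static unsigned ceil_log2(uint64_t v)
{
    unsigned n = 0;
    assert(v >= 1 && v <= (1ULL << 62));
    while ((1ULL << n) < v) n++;
    return n;
}

/* Log2 of the span that must be indexable: window plus dictionary. */
static unsigned dict_and_window_log(unsigned windowLog, uint64_t dictSize)
{
    assert(windowLog <= 31 && dictSize <= (1ULL << 32));
    return ceil_log2((1ULL << windowLog) + dictSize);
}

/* Shrink hashLog only as far as the combined span allows
 * (the "+ 1" headroom is an assumption of this sketch). */
static unsigned downsize_hashLog(unsigned hashLog, unsigned windowLog,
                                 uint64_t dictSize)
{
    unsigned const cap = dict_and_window_log(windowLog, dictSize) + 1;
    return (hashLog > cap) ? cap : hashLog;
}
```

With `windowLog = 10`, a 3000-byte dictionary raises the cap compared with the no-dictionary case, so the tables are not downsized too far for the data they must index.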
Nick Terrell
fadaab8c7c [minor improvement] Pass 0 as the content size in the DDS
The DDS structure can't be copied into the working tables like the DMS.
So it doesn't need to account for the source size when sizing its
parameters, just the dictionary size.
2020-10-12 12:47:21 -07:00
Nick Terrell
48ef15fb47 [minor improvement] Pass dictSize when selecting parameters
When selecting parameters in streaming compression with a dictionary use
the dictionary size to select the parameters.
2020-10-12 12:47:19 -07:00
Nick Terrell
012818df99 [refactor] Remove ZSTD_resetCStream_internal()
This function is only called in one place. It isn't a logical separation
of duties, and it was only obfuscating the code now, so inline it.
2020-10-12 12:46:10 -07:00
Nick Terrell
7083f79008 [bug] Fix dictContentType when reprocessing cdict
Conditions to trigger:
* CDict is loaded as raw content.
* CDict starts with the zstd dictionary magic number.
* The CDict is reprocessed (not attached or copied).
* The new API is used (streaming or `ZSTD_compress2()`).

Bug: The dictionary is loaded as a zstd dictionary, not a raw content
dictionary, because the dict content type is set to `ZSTD_dct_auto`.

Fix: Pass in the dictionary content type from cdict creation to the call
to `ZSTD_compress_insertDictionary()`.

Test: Added a test case that exposes the bug, and fixed the raw
content tests to not modify the `dictBuffer`, which makes all future
tests with the `dictBuffer` raw content, which doesn't seem intentional.
2020-10-12 12:46:10 -07:00
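
The `dictContentType` bug above comes down to auto-detection overriding an explicit raw-content request when the buffer happens to start with the dictionary magic number. A minimal sketch of the distinction, assuming simplified stand-in types (the enums and `load_dict()` are hypothetical, and the 8-byte minimum for auto-detection is an assumption of this sketch):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DICT_MAGIC 0xEC30A437u  /* zstd dictionary magic number */

/* Hypothetical stand-ins for the library's dictionary content types. */
typedef enum { dct_auto, dct_rawContent, dct_fullDict } DictContentType;
typedef enum { loaded_raw, loaded_full } LoadedAs;

static uint32_t read_le32(const unsigned char* p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decides how a dictionary buffer is interpreted. An explicit type always
 * wins; only dct_auto inspects the magic number. */
static LoadedAs load_dict(const unsigned char* dict, size_t size,
                          DictContentType type)
{
    assert(dict != NULL);
    if (type == dct_rawContent) return loaded_raw;
    if (type == dct_fullDict)   return loaded_full;  /* real code would validate */
    if (size >= 8 && read_le32(dict) == DICT_MAGIC) return loaded_full;
    return loaded_raw;
}
```

Passing the content type chosen at CDict creation through to the load step, as the fix does, corresponds to calling `load_dict()` with `dct_rawContent` instead of `dct_auto`.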
Dmitriy Titarenko
1b28d6501c Fixed dll path in case of cross-compilation 2020-10-11 23:51:44 +05:00
senhuang42
d6911b86be Require LDM matches to be strictly greater in length 2020-10-09 12:56:18 -04:00
Like Ma
cc907770bd Fix building on AIX 5.1 2020-10-09 18:34:00 +08:00
Yann Collet
12541931fa
Merge pull request #2328 from marxin/zstd-pool-api
Allow external creation of POOLs that can be shared.
2020-10-09 01:00:50 -07:00
Yann Collet
6fdb0cb8d9
Merge pull request #2303 from senhuang42/let_cdict_take_clevel_priority
For ZSTD_compressStream2(), let cdict take compression level priority
2020-10-09 00:48:30 -07:00
senhuang42
b9c8033cde Define kNullRawSeqStore for every file 2020-10-07 19:02:41 -04:00
senhuang42
a6165c1b28 Change matchState_t::ldmSeqStore to pointer 2020-10-07 14:13:57 -04:00
senhuang42
abce708a56 Move posInSequence correction to correct location 2020-10-07 13:56:25 -04:00
senhuang42
0c515590d8 Replace offCode of largest match if ldm's offCode is superior 2020-10-07 13:56:25 -04:00
senhuang42
0fac8e07e1 Refactor usage of ms->ldmSeqStore so that it is not modified during compressBlock(), and simplify skipRawSeqStoreBytes 2020-10-07 13:56:25 -04:00
senhuang42
a5500cf2af Refactor separate ldm variables all into one struct 2020-10-07 13:56:25 -04:00
senhuang42
0731b94e7c Use kNullRawSeqStore constant in zstdmt_compress.c 2020-10-07 13:56:25 -04:00
senhuang42
0325d878f2 Remove bubbling down matches with longer offCode and same matchLen 2020-10-07 13:56:25 -04:00
senhuang42
031b7ec15f Disable LDM minMatch adjustment when using opt parser 2020-10-07 13:56:25 -04:00
senhuang42
ddf8a3f1b9 Enable inclusion of mid-flight LDMs in opt parser 2020-10-07 13:56:25 -04:00
senhuang42
88f72ed942 Correct incorrect offcode calculation 2020-10-07 13:56:25 -04:00
senhuang42
d8b43a4202 Add explicit conversion of size_t to U32 2020-10-07 13:56:25 -04:00
senhuang42
b8bfc4e63d Add cSize regression test to fuzzer.c 2020-10-07 13:56:25 -04:00
senhuang42
c87d2e5866 Prefix new static ldm helpers with ZSTD_opt 2020-10-07 13:56:25 -04:00
senhuang42
429dec4f42 Add DEBUGLOG() calls in ldm helpers 2020-10-07 13:56:25 -04:00
senhuang42
10647924f1 Make function descriptions more accurate 2020-10-07 13:56:25 -04:00
senhuang42
1a687b3fcb Improve documentation of relevant structs 2020-10-07 13:56:25 -04:00
senhuang42
37617e23d7 Correct matchLength calculation and remove unnecessary functions 2020-10-07 13:56:25 -04:00
senhuang42
7dee62c287 Reset ldmSeqStore after initStats_ultra() pass for btultra2 2020-10-07 13:56:25 -04:00
senhuang42
0718aa70df Refactor existing functions to use posInSequence 2020-10-07 13:56:25 -04:00
senhuang42
7348b40a87 Adjustments to ldm_calculateMatchRange() to calculate bounds correctly 2020-10-07 13:56:25 -04:00
senhuang42
a1ef2db5b2 Add ldm_calculateMatchRange() function 2020-10-07 13:56:25 -04:00
senhuang42
ef823e0299 Remove rawSeqStore.base and add rawSeqStore.posInSequence 2020-10-07 13:56:25 -04:00
senhuang42
4793ae3b84 Prevent duplicate LDMs from being inserted 2020-10-07 13:56:25 -04:00
senhuang42
65f9cfeeec Add extra bounds check to prevent heap access after free ASAN error 2020-10-07 13:56:25 -04:00
senhuang42
bff5785fd5 Address mixed variables C90 warning 2020-10-07 13:56:25 -04:00
senhuang42
724b94ed18 ldm_getNextMatch fixed return values 2020-10-07 13:56:25 -04:00
senhuang42
ea92fb3a68 Cleanups, add comments and explanations 2020-10-07 13:56:25 -04:00
senhuang42
78da2e1808 Fixed sifting algorithm 2020-10-07 13:56:25 -04:00
senhuang42
6ccd97fc96 Fixed end of match boundary update issues 2020-10-07 13:56:25 -04:00
senhuang42
28394b64f2 Add proper bounds check on adding ldms 2020-10-07 13:56:25 -04:00
senhuang42
a2f2b58d04 Add a function ldm_voidSequences() 2020-10-07 13:56:25 -04:00
senhuang42
9c3c7cd20e Fix function argument to getNextMatch() 2020-10-07 13:56:25 -04:00
senhuang42
c8b8572b38 Adjustments to no longer segfault on nci 2020-10-07 13:56:25 -04:00
senhuang42
f57c7e6bbf Add base adjustment correction 2020-10-07 13:56:25 -04:00
senhuang42
5df9b5e05f Add initial getNextMatch() in opt parser 2020-10-07 13:56:25 -04:00
senhuang42
f8ce7cabc3 Added more debugging 2020-10-07 13:56:25 -04:00
senhuang42
84009a076a Add re-copying of ldmSeqStore after processing 2020-10-07 13:56:25 -04:00
senhuang42
42395a70c2 Add debug statements, flesh out functions 2020-10-07 13:56:25 -04:00
senhuang42
dd3dd199bb Get zstd to build with new functions and callsites, fix arguments 2020-10-07 13:56:25 -04:00
senhuang42
766c4a8c28 Implement part of ldm_maybeAddLdm() 2020-10-07 13:56:25 -04:00
senhuang42
84777059d2 Implement ldm_getNextMatch() 2020-10-07 13:56:24 -04:00
senhuang42
28c74bf591 Implement basic splitSequence and skipSequence functions 2020-10-07 13:56:24 -04:00
senhuang42
634ab7830d Flesh out required args for ldm_handleLdm() 2020-10-07 13:56:24 -04:00
senhuang42
db70761032 Add callsites to appropriate locations in ..opt_generic() 2020-10-07 13:56:24 -04:00
senhuang42
aea61e3c91 Add ldm helper function declarations into opt parser 2020-10-07 13:56:24 -04:00
senhuang42
35d9f488f5 Modify codepath to use opt parser exclusively if the compression level is high enough 2020-10-07 13:56:24 -04:00
senhuang42
e1ae398ad5 Add rawSeqStore to match state 2020-10-07 13:56:24 -04:00
Martin Liska
b684900a4a Allow external creation of POOLs that can be shared. 2020-10-07 12:44:33 +02:00
Nick Terrell
4b4d8b4dc9
Merge pull request #2338 from terrelln/comments
Add comments to ZSTD_getLowest{Match,Prefix}Index()
2020-10-01 18:56:24 -07:00
Nick Terrell
0057c4acf7
Merge pull request #2333 from terrelln/stable-dst
Reset all decompression parameters in ZSTD_DCtx_reset()
2020-10-01 18:56:11 -07:00
Nick Terrell
2e7d174130 Reset all decompression parameters in ZSTD_DCtx_reset()
* Reset all decompression parameters in `ZSTD_DCtx_reset()` when
  resetting parameters.
* Add a test case.
2020-10-01 14:19:21 -07:00
Nick Terrell
27c969ed07 Add comments to ZSTD_getLowest{Match,Prefix}Index()
Clarify how we handle dictionaries in each case.
2020-10-01 13:21:46 -07:00
Yann Collet
cc88eb7594
Merge pull request #2317 from animalize/msvc_inline
Let MSVC force inline ZSTD_hashPtr() function
2020-09-30 08:27:53 -07:00
Nick Terrell
f1cbeec039 [superblock] Reduce stack usage by correctly sizing header buffers 2020-09-24 19:42:04 -07:00
Nick Terrell
6a1e526ea7 [lib] Add ZSTD_COMPRESS_HEAPMODE tuning parameter 2020-09-24 19:42:04 -07:00
Nick Terrell
b841387218 [freestanding] Improve macro resolution to handle #if X 2020-09-24 19:42:04 -07:00
Nick Terrell
caecd8c211 Allow user to override ASAN/MSAN detection
Rename ADDRESS_SANITIZER -> ZSTD_ADDRESS_SANITIZER and same for
MEMORY_SANITIZER. Also set it to 0/1 instead of checking for defined.
This allows the user to override ASAN/MSAN detection for platforms that
don't support it.
2020-09-24 19:42:04 -07:00
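The 0/1 convention described in the commit above can be sketched as follows. This is a minimal illustration, not the actual contents of compiler.h: the detection logic and the helper `example_asan_enabled()` are assumptions made for the example. The point is that a user can predefine the macro to 0 or 1 before this block, and all later checks test its value rather than whether it is defined.

```c
/* Sketch only: the detection logic below is an assumption, not zstd's code.
 * A user-supplied -DZSTD_ADDRESS_SANITIZER=0 (or =1) skips detection. */
#ifndef ZSTD_ADDRESS_SANITIZER
#  if defined(__has_feature)
#    if __has_feature(address_sanitizer)
#      define ZSTD_ADDRESS_SANITIZER 1
#    else
#      define ZSTD_ADDRESS_SANITIZER 0
#    endif
#  elif defined(__SANITIZE_ADDRESS__)
#    define ZSTD_ADDRESS_SANITIZER 1
#  else
#    define ZSTD_ADDRESS_SANITIZER 0
#  endif
#endif

/* Hypothetical helper: the macro is tested by value, not with defined(). */
static int example_asan_enabled(void)
{
#if ZSTD_ADDRESS_SANITIZER
    return 1;
#else
    return 0;
#endif
}
```

Because the macro is always defined after this block, `#if ZSTD_ADDRESS_SANITIZER` is well-formed everywhere, and a platform without sanitizer support can force it to 0.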
Nick Terrell
88fac5d514 Remove call to memset
The previous commit fixes the test so it errors on calls to mem*()
functions from <string.h>.
2020-09-24 19:42:04 -07:00
Nick Terrell
9ae0483858 Reorganize zstd_deps.h and mem.h + replace mem.h for the kernel 2020-09-24 19:41:59 -07:00
Nick Terrell
260fc75028 Move __has_builtin() fallback define to compiler.h 2020-09-24 15:51:08 -07:00
Nick Terrell
4d63ee57f5 Move ASAN/MSAN support declarations to compiler.h 2020-09-24 15:51:08 -07:00
Nick Terrell
b09ec5c2b9 Remove MEM_STATIC_ASSERT and use DEBUG_STATIC_ASSERT instead 2020-09-24 15:51:04 -07:00
Nick Terrell
9261476b7d [lib] Wrap customMem xor checks in parens for readability
This clarifies operator precedence, and quiets cppcheck in
the Kernel Test Robot. I think this is a slight bonus to
readability, so I am accepting the suggestion.
2020-09-23 23:26:07 -07:00
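A minimal sketch of the kind of check the commit above refers to. The struct and function names are hypothetical and only modeled on a custom allocator pair. In C, unary `!` already binds tighter than `^`, so the parentheses do not change the result; they only make the grouping explicit for readers and static analyzers.

```c
#include <stddef.h>

/* Hypothetical struct for illustration: an allocate/free callback pair. */
typedef struct {
    void* (*customAlloc)(void* opaque, size_t size);
    void  (*customFree)(void* opaque, void* address);
    void* opaque;
} example_customMem;

/* Returns 1 when exactly one of the two callbacks is missing, which is the
 * inconsistent case; returns 0 when both are set or both are NULL. */
static int example_customMem_isInvalid(example_customMem mem)
{
    return (!mem.customAlloc) ^ (!mem.customFree);
}

/* Dummy callbacks so the check can be exercised. */
static void* example_alloc(void* opaque, size_t size)
{
    (void)opaque; (void)size;
    return NULL;
}

static void example_free(void* opaque, void* address)
{
    (void)opaque; (void)address;
}
```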
Nick Terrell
dec7fb03ec [lib] Silence -Wunused-const-variable warnings 2020-09-23 12:59:57 -07:00
animalize
2e5d73dd72 Use MEM_STATIC FORCE_INLINE_ATTR instead of FORCE_INLINE_TEMPLATE
It adds `__attribute__((unused))` for __GNUC__, to eliminate `-Werror=unused-function` error.
2020-09-21 13:26:38 +08:00
animalize
0a69a6b1ca Let MSVC force inline ZSTD_hashPtr() function
ZSTD_hashPtr() function was not expanded by MSVC, led to low performance compared to GCC.
2020-09-21 10:38:55 +08:00
Felix Handte
200c960f1d
Merge pull request #2311 from felixhandte/ddss-fix-cparam-derivation
Fix Compression Parameter Derivation Bugs Introduced by DDSS Changes
2020-09-18 14:02:14 -04:00
W. Felix Handte
8930c6e551 Use ZSTD_CCtxParams_init() to Init CCtxParams, not memset()
Even if the discrepancies are at the moment benign, it's probably better to
standardize on using the one true initializer, rather than trying (and failing)
to correctly duplicate its behavior.
2020-09-17 12:15:33 -04:00
W. Felix Handte
e8a44326fa Avoid Redundancy in ZSTD_initCDict_internal() Args; Don't Take CParams + CCtxParams 2020-09-17 12:08:36 -04:00
W. Felix Handte
eee51a664a Fall Back if Derived CParams are Incompatible with DDSS; Refactor CDict Creation
Rewrite ZSTD_createCDict_advanced() as a wrapper around
ZSTD_createCDict_advanced2(). Evaluate whether to use DDSS mode *after* fully
resolving cparams. If not, fall back.
2020-09-15 18:01:08 -04:00
W. Felix Handte
bc6521a6f6 Make ZSTD_createCDict_advanced2() cctxParams Arg Const 2020-09-15 14:06:10 -04:00
W. Felix Handte
26a96a5b35 Do More Complete CParams Deduction in Non-DDSS Path of ZSTD_createCDict_advanced2
Call ZSTD_getCParamsFromCCtxParams() instead of ZSTD_getCParams_internal().
2020-09-15 13:57:43 -04:00
W. Felix Handte
a2af804129 Pull CParam Override Logic into Helper 2020-09-15 13:38:05 -04:00
Yann Collet
c91a0855f8 check endDirective in ZSTD_compressStream2()
fix #2297
also :
- `assert()` `endDirective` in `ZSTD_compressStream_internal()`, for debug mode
- add relevant tests
2020-09-14 10:56:08 -07:00
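The range check described above can be sketched like this. The enum and function names are hypothetical stand-ins, and the return convention (0 for valid, -1 for invalid) is an assumption for the example rather than zstd's error-code scheme.

```c
/* Hypothetical three-value end directive; names are illustrative only. */
typedef enum {
    example_e_continue = 0,
    example_e_flush    = 1,
    example_e_end      = 2
} example_EndDirective;

/* Returns 0 when endOp is one of the known directives, -1 otherwise.
 * Casting to unsigned folds negative values into the "too large" case,
 * so a single comparison covers both ends of the range. */
static int example_checkEndDirective(int endOp)
{
    if ((unsigned)endOp > (unsigned)example_e_end) return -1;
    return 0;
}
```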
senhuang42
17b56f934e Coding style cleanup 2020-09-11 11:42:12 -04:00
senhuang42
801513b5e7 Modify params rather than cctx->requestedParams 2020-09-11 11:41:10 -04:00
W. Felix Handte
c5fab8848a Document searchFuncs Table 2020-09-10 22:10:02 -04:00
W. Felix Handte
85a95840e4 Further Consolidate Dict Mode Checks 2020-09-10 22:10:02 -04:00
W. Felix Handte
032010fcc1 Improve Documentation Slightly 2020-09-10 22:10:02 -04:00
W. Felix Handte
0faefbf1b3 Make DDSS Selection Override ForceCopy Directive 2020-09-10 22:10:02 -04:00
W. Felix Handte
efa33861f2 Attempt to Fix MSVC Warnings 2020-09-10 22:10:02 -04:00
W. Felix Handte
ed43832770 Simplify Match Limit Checks
Seems like a ~1.25% speedup.
2020-09-10 22:10:02 -04:00
W. Felix Handte
06d240b8a7 Use All Available Space in the Hash Table to Extend Chain Table Reach
Rather than restrict our temp chain table to 2 ** chainLog entries, this
commit uses all available space to reach further back to gather longer
chains to pack into the DDSS chain table.
2020-09-10 22:10:02 -04:00
W. Felix Handte
b2b0641ea0 Rewrite Table Fill to Retain Cache Entries Beyond Chain Window 2020-09-10 22:10:02 -04:00
W. Felix Handte
916238d9dc Avoid Malloc in Table Fill; Pack Tmp Structure into Hash Table 2020-09-10 22:10:02 -04:00
W. Felix Handte
f42c5bddd9 Truncate Chain at Last Possible Attempt
Make the chain table denser?
2020-09-10 22:10:02 -04:00
W. Felix Handte
20a020edbc Prefetch Chain Table Matches 2020-09-10 22:10:02 -04:00
W. Felix Handte
9b9feb84f2 Lay Out Chain Table Chains Contiguously
Rather than interleave all of the chain table entries, tying each entry's
position to the corresponding position in the input, this commit changes the
layout so that all the entries in a single chain are laid out next to each
other. The last entry in the hash table's bucket for this hash is now a packed
pointer of position + length of this chain.

This cannot be merged as written, since it allocates temporary memory inside
ZSTD_dedicatedDictSearch_lazy_loadDictionary().
2020-09-10 22:10:02 -04:00
W. Felix Handte
66509c7bf4 Only Insert Positions Inside the Chain Window 2020-09-10 22:10:02 -04:00
W. Felix Handte
13c5ec3e41 Only Allow Dedicated Dict Search for Dicts Loaded in 1 Chunk
The load algorithm requires we do it all in one go.
2020-09-10 22:10:02 -04:00
W. Felix Handte
07793547e6 Fix Bug: Only Use DDSS Insertion on CDict MatchStates
Previously, if DDSS was enabled on a CCtx and a dictionary was inserted into
the CCtx, the CCtx MatchState would be filled as a DDSS struct, causing
segfaults etc. This changes the check to use whether the MatchState is marked
as using the DDSS (which is only ever set for CDict MatchStates), rather than
looking at the CCtxParams.
2020-09-10 18:51:52 -04:00
W. Felix Handte
d214d8c859 Shorten Dict Mode Conditionals in Order to Improve Readability 2020-09-10 18:51:52 -04:00
W. Felix Handte
f49c1563ff Force-Inline ZSTD_insertAndFindFirstIndex_internal()
Without this, gcc was declining to inline the function in `ZSTD_noDict` mode,
resulting in a ~10% slowdown.
2020-09-10 18:51:52 -04:00
W. Felix Handte
cab86b074f Clean Up Search Function Selection 2020-09-10 18:51:52 -04:00
W. Felix Handte
2ffbde0d95 Fix -Wshorten-64-to-32 Error 2020-09-10 18:51:52 -04:00
W. Felix Handte
7b5d2f72ea Adjust Working Context Table Sizes Back Down 2020-09-10 18:51:52 -04:00
W. Felix Handte
c09454e28f Add Warning Comment to ZSTD_createCDict_advanced2() Declaration 2020-09-10 18:51:52 -04:00
W. Felix Handte
d332f57897 Permit Matching Against Lowest Valid Position
This comparison was previously faulty: the lowest valid position is itself
valid, and we should therefore be allowed to match against it.
2020-09-10 18:51:52 -04:00
W. Felix Handte
a3659fe1ef Make ZSTD_dedicatedDictSearch_getCParams Wrap ZSTD_getCParams
Fixes up bounds-checking, and lets us clean up what is at the moment an
unnecessary duplication of the default cparams tables.
2020-09-10 18:51:52 -04:00
W. Felix Handte
7b9a755ac9 Remove Chain Limit on Hash Cache Entries; Slightly Improve Compression
Entries in the hashTable chain cache aren't subject to the same aliasing that
the circular chain table is subject to. As such, we don't need to stop when we
cross the chain limit. We can delve deeper. :)
2020-09-10 18:51:52 -04:00
W. Felix Handte
e8b4011b52 Split Lookups in Hash Cache and Chain Table into Two Loops
Sliiiight speedup.
2020-09-10 18:51:52 -04:00
W. Felix Handte
9e83c782f8 Simplify DDS Hash Table Construction
No need to walk the chainTable; we can just keep shifting the entries in the
hashTable.
2020-09-10 18:51:52 -04:00
W. Felix Handte
ad9f98ac3f Document the ZSTD_c_enableDedicatedDictSearch Parameter 2020-09-10 18:51:52 -04:00
W. Felix Handte
5390fee4f7 Rename and Move DD_BLOG Constant to ZSTD_LAZY_DDSS_BUCKET_LOG 2020-09-10 18:51:52 -04:00
W. Felix Handte
5e91ae27eb Prefetch First Batch of Match Positions; +11% Speed in Level 5 w/ 1 Dict 2020-09-10 18:51:52 -04:00
W. Felix Handte
df386b3d8d Fix Off-By-One Error in Counting DDS Search Attempts
This caused us to double-search the first position and fail to search the
last position in the chain, slowing down search and making it less effective.
2020-09-10 18:51:52 -04:00
W. Felix Handte
914bfe7ee4 Init CCtx's Local Dict with CCtxParams 2020-09-10 18:51:52 -04:00
W. Felix Handte
db2aa25252 Decision for Whether to Attach Should be Based on CDict Config, not CCtx 2020-09-10 18:51:52 -04:00
W. Felix Handte
a494111385 Move Prefetch Before Insertion; Speed Up ~6% 2020-09-10 18:51:52 -04:00
W. Felix Handte
eede46a47e Misc Refactor of DDS Search Code 2020-09-10 18:51:52 -04:00
W. Felix Handte
f1b428fdac Rename enableDedicatedDictSearch to dedicatedDictSearch in MatchState
This makes it clear that not only is the feature allowed here, we're actually
using it, as opposed to the CCtxParam field, in which it's enabled, but we may
or may not be using it.
2020-09-10 18:51:52 -04:00
W. Felix Handte
41012193ad Always Init CDict's enableDedicatedDictSearch Field 2020-09-10 18:51:52 -04:00
W. Felix Handte
34b545acb0 Add a ZSTD_dedicatedDictSearch ZSTD_dictMode_e to Allow Const Propagation
Speed +1.5%.
2020-09-10 18:51:52 -04:00
W. Felix Handte
beefdb0d3d Fix ZSTD_c_forceAttachDict Bounds 2020-09-10 18:51:52 -04:00
W. Felix Handte
c204110eff Make ZSTD_c_enableDedicatedDictSearch an Experimental Param 2020-09-10 18:51:52 -04:00
W. Felix Handte
ae4ebf6b8c TODO: Comment 2020-09-10 18:51:52 -04:00
W. Felix Handte
def62e2d3e Fix Compilation Warnings 2020-09-10 18:51:52 -04:00
Bimba Shrestha
9c628238d3 creating ZSTD_createCDict_advanced_internal 2020-09-10 18:51:52 -04:00
Bimba Shrestha
0a9787c3e1 changing to int for consistency 2020-09-10 18:51:52 -04:00
Bimba Shrestha
e29bc3a009 using dict mls instead of src mls 2020-09-10 18:51:52 -04:00
Bimba Shrestha
145c2d12f9 add hashtable head prefetching 2020-09-10 18:51:52 -04:00
Bimba Shrestha
5d5507788d change method name for consistency 2020-09-10 18:51:52 -04:00
Bimba Shrestha
b30f71becf pass correct cparams 2020-09-10 18:51:52 -04:00
Bimba Shrestha
a3f6e4026e removing wrong comment 2020-09-10 18:51:52 -04:00
Bimba Shrestha
71fda0362f making cctxParams a pointer 2020-09-10 18:51:52 -04:00
Bimba Shrestha
628559d0e4 loading dict using new algorithm 2020-09-10 18:51:52 -04:00
Bimba Shrestha
22705f0c93 adding dedicatedDictSearch algorithm 2020-09-10 18:51:52 -04:00
Bimba Shrestha
31e581bf65 adding enableDedicatedDictSearch to matchState_t 2020-09-10 18:51:52 -04:00
Bimba Shrestha
50550a14ad adding dedicated dict load method to lazy 2020-09-10 18:51:52 -04:00
Bimba Shrestha
75b6360036 adding ZSTD_createCDict_advanced2 to zstd.h 2020-09-10 18:51:52 -04:00
Bimba Shrestha
b7dddbe89b always attach dict when using dedicatedDictSearch 2020-09-10 18:51:52 -04:00
Bimba Shrestha
e36a373df4 adding dedicatedDictSearch cParams helper methods 2020-09-10 18:51:52 -04:00
Bimba Shrestha
f10d4e313c adding ZSTD_dedicatedDictSearch_defaultCParameters variable 2020-09-10 18:51:52 -04:00
Bimba Shrestha
c497cb6716 Add ZSTD_c_enableDedicatedDictSearch Param 2020-09-10 18:51:52 -04:00
senhuang42
64bd68e44b Adjust ZSTD_createCDict_byReference() function, and check for cdict when using compressStream2 2020-09-10 13:42:26 -04:00
Nick Terrell
da30a78c68 [lib] Bump version number to 1.4.6 2020-09-09 17:13:45 -07:00
Nick Terrell
b92569a522 [doc] Document new build macros in lib/README.md 2020-09-09 17:13:16 -07:00
Nick Terrell
aab4bf7b0d [linux-kernel] Add test that checks the ifdef hardwiring 2020-09-09 14:36:19 -07:00
Nick Terrell
79ded1b4a9 [lib] Add ZSTD_NO_UNUSED_FUNCTIONS macro to hide unused functions
The unused function definitions are hidden behind a
`#ifndef ZSTD_NO_UNUSED_FUNCTIONS` check.

Initially hiding all functions which are unused and take up more than
2KB of stack space, because these will show up as warnings in the
Linux Kernel build system.
2020-09-09 14:35:39 -07:00
Nick Terrell
ac3a136b0a [lib] Replace 64-bit divisions with ZSTD_div64() 2020-09-09 14:35:39 -07:00
Nick Terrell
a90779397a [lib] Reduce zstd stack usage by 1KB 2020-09-09 14:35:39 -07:00
Nick Terrell
046aca190f Fix ZSTD_initCStream_advanced() with no dictionary and static allocation 2020-09-09 14:35:39 -07:00
Nick Terrell
e975de289c Add ZSTD_NO_INTRINSICS macro to avoid explicit intrinsics 2020-09-09 14:35:39 -07:00
Nick Terrell
f91ed5c766 [lib] s/current/curr because it collides with Linux Kernel macro 2020-09-09 14:35:39 -07:00
Nick Terrell
5e4efd22d4
Merge pull request #2291 from i-do-cpp/fix-compression-level-default
Fix setParameter not falling back to default compression level
2020-09-08 16:42:34 -07:00
Felix Handte
8db661dd7f
Merge pull request #2294 from felixhandte/makefile-lib-fix-var-order
Fix Makefile Variable Concatenation Order
2020-09-04 10:58:57 -04:00
W. Felix Handte
75bc289911 Fix Makefile Variable Concatenation Order
Previously, this construct would add `-O3` onto the end of the compiler flags
variable, **after** `MOREFLAGS`, which meant that it was impossible to
override. This commit fixes this order and should otherwise be a no-op.
2020-09-03 17:30:29 -04:00
Nick Terrell
6da8acd231
Merge pull request #2293 from allanjude/coverity
Resolve Coverity 1432392 Unintentional integer overflow
2020-09-03 13:58:45 -07:00
Allan Jude
8665793164 Resolve Coverity 1432392 Unintentional integer overflow
Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression:
cdict->dictContentSize * 6U
with type unsigned int (32 bits, unsigned) is evaluated using 32-bit
arithmetic, and then used in a context that expects an expression of
type U64 (64 bits, unsigned).
2020-09-03 19:31:50 +00:00
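The overflow-before-widen pattern reported above can be reproduced in isolation. This is a sketch with hypothetical function names and typedefs, not the zstd source: it only shows why multiplying in 32 bits and then converting to 64 bits differs from widening first.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the zstd typedefs. */
typedef uint32_t U32;
typedef uint64_t U64;

/* The multiplication is carried out in 32 bits and wraps around
 * before the result is converted to the 64-bit return type. */
static U64 example_scaled_narrow(U32 dictContentSize)
{
    return dictContentSize * 6U;
}

/* Widening one operand first makes the multiplication 64-bit. */
static U64 example_scaled_wide(U32 dictContentSize)
{
    return (U64)dictContentSize * 6U;
}
```

For an input of 0x40000000 (1 GiB), the narrow version wraps to 0x80000000 while the widened version yields 0x180000000.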
i-do-cpp
aec8b27fff
Update zstd_compress.c 2020-08-31 09:34:08 +02:00
i-do-cpp
d514281e73
Fix setParameter not falling back to default compression level on 0 value
See documentation for `ZSTD_c_compressionLevel`: `Special: value 0 means default, which is controlled by ZSTD_CLEVEL_DEFAULT`
2020-08-31 09:25:43 +02:00
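A minimal sketch of the fallback rule quoted above. The constant and function names are hypothetical; `EXAMPLE_CLEVEL_DEFAULT` stands in for `ZSTD_CLEVEL_DEFAULT`, and its value here is an assumption for the example.

```c
/* Hypothetical constant playing the role of ZSTD_CLEVEL_DEFAULT;
 * the value 3 is assumed for illustration. */
#define EXAMPLE_CLEVEL_DEFAULT 3

/* A requested level of 0 means "use the default"; any other value,
 * including negative levels, is passed through unchanged. */
static int example_resolveCompressionLevel(int requested)
{
    return (requested == 0) ? EXAMPLE_CLEVEL_DEFAULT : requested;
}
```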
Yann Collet
c6d5a2cad0
Merge pull request #2288 from animalize/doc_version
[doc] Add ZSTD_versionString() to manual
2020-08-27 12:32:07 -07:00
animalize
6365e0e32f Add ZSTD_versionString() function to manual. 2020-08-27 13:51:22 +08:00
Nick Terrell
c465f24457 ZSTD_ prefix mem{cpy,move,set},malloc,calloc,free 2020-08-26 12:26:03 -07:00
Nick Terrell
a686d306d2 Rename ZSTD_{malloc,calloc,free} to ZSTD_custom{Malloc,Calloc,Free} 2020-08-26 12:25:08 -07:00
Nick Terrell
80f577baa2 Move standard includes to zstd_deps.h 2020-08-26 12:25:08 -07:00
Nick Terrell
cf83aceaf3
Merge pull request #2282 from terrelln/ncount-fix
[bug] Fix FSE_readNCount()
2020-08-26 10:31:07 -07:00
Nick Terrell
4193638996 [bug] Fix FSE_readNCount()
* Fix bug introduced in PR #2271
* Fix long-standing bug that is impossible to trigger inside of zstd
* Add a fuzzer that makes sure the normalized count always round trips
  correctly
2020-08-25 15:42:41 -07:00
Yann Collet
f82d9865b9
Merge pull request #2278 from senhuang42/ignore_checksum_advanced_param
New advanced decompression param to ignore checksums
2020-08-25 12:08:53 -07:00
Nick Terrell
614e446000
Merge pull request #2271 from terrelln/small-blocks
Small block optimizations
2020-08-24 18:54:33 -07:00
Nick Terrell
52f33a1da5 Fix compiler warnings 2020-08-24 16:09:45 -07:00
Nick Terrell
6f301a7903
Merge pull request #2272 from terrelln/dstSize_tooSmall
[fix] Always return dstSize_tooSmall when it is the case
2020-08-24 15:01:17 -07:00
Nick Terrell
6d2f750b37 Document the BMI2 default() functions 2020-08-24 14:44:33 -07:00
senhuang42
a030560d62 Add new DCtx param: validateChecksum and update unit tests 2020-08-24 17:28:00 -04:00
Nick Terrell
cebe0b5c0b Improve FSE_normalizeCount() docs 2020-08-24 13:58:34 -07:00
Nick Terrell
1302f8d676 [fix] Always return dstSize_tooSmall when it is the case 2020-08-24 13:38:13 -07:00
senhuang42
44c54a3e31 Addressing comments: more comments, cleanup, remove extra function, checksum logic 2020-08-24 16:14:19 -04:00
Nick Terrell
8def0e5fd3 Fix up code after reading through 2020-08-24 12:24:45 -07:00
senhuang42
ffaa0df76d Document change in CLI for --no-check during decompression in --help menu 2020-08-24 09:49:12 -04:00
senhuang42
e3f5f9658a Added CLI tests for --no-check, fixed ignore checksum logic 2020-08-22 16:05:40 -04:00
senhuang42
20eb095882 Added unit test to fuzzer.c, changed definition param name 2020-08-22 13:26:33 -04:00
senhuang42
47685ac856 Move enum into zstd.h, and fix pesky switch() logic 2020-08-21 18:18:53 -04:00
senhuang42
08d3567ba8 Add function prototype 2020-08-21 16:51:43 -04:00
senhuang42
6a8dbdcd1f Modify decompression loop to ignore checksums if flag is enabled 2020-08-21 16:46:46 -04:00
senhuang42
2f39124342 Rename to ZSTD_d_forceIgnoreChecksum, add to DCtx, add function to set the advanced param 2020-08-21 16:23:39 -04:00
senhuang42
b5cddda073 Add new definition of ZSTD_d_forceSkipChecksum in experimental section 2020-08-21 15:59:03 -04:00
Nick Terrell
8f8bd2d1ac [regression] Update results.csv 2020-08-20 12:41:35 -07:00
Antonio Bueno
77c97089fc
Fixed Markdown warnings. No visible changes. 2020-08-19 12:36:28 +02:00
Nick Terrell
575731b6db Use ncount=1 when < 4096 symbols 2020-08-18 16:47:53 -07:00
Nick Terrell
612e947c5e wire up bmi2 support 2020-08-17 16:35:28 -07:00
Nick Terrell
ba1fd17a9f speed up literal header decoding 2020-08-17 12:17:53 -07:00
Nick Terrell
6004c1117f speed up small blocks 2020-08-16 23:03:38 -07:00
Felix Handte
bb265da4ae
Merge pull request #2270 from felixhandte/fix-doc-cctx-set-param
Fix Documentation for ZSTD_CCtxParams_setParameter()
2020-08-14 21:44:56 -04:00
W. Felix Handte
99746eea7e Fix Documentation for ZSTD_CCtxParams_setParameter()
It does not only return 0 on success.
2020-08-14 14:44:08 -04:00
Carl Woffenden
4c81fae146 Fix clang -Wcomma warning 2020-08-13 16:11:22 +02:00
Nick Terrell
e3bda594ae Prefer __builtin_prefetch over inline asm
Reorder the ifdefs for the PREFETCH macros so that the compiler builtin is
favored over the inline assembly for aarch64.
2020-08-10 22:17:18 -07:00
Yann Collet
38e38546a4
Merge pull request #2258 from Niadb/dev
Added STATIC_BMI2 for compile time detection of BMI2 on MSVC, when enabled various intrinsics are used
2020-08-04 09:43:59 -07:00
Yann Collet
60b56e3b5f
Merge pull request #2253 from facebook/histvec
optimized histogram
2020-08-02 14:08:42 -07:00
Nick Terrell
f85a0f8bcf
Merge pull request #2256 from helloguo/dev
Optimize ZSTD_wildcopy
2020-07-29 11:57:49 -07:00
Carl Woffenden
5d81d44e40 Fixed VS variable shadowing warning (and added test) 2020-07-29 12:33:39 +02:00
helloguo
acb3dd9a68 Use ZSTD_copy16 instead of memcpy 2020-07-28 11:58:46 -07:00
Niadb
a8ebc14035
Update bitstream.h
Profiler showed some of these not being inlined on MSVC
2020-07-28 11:17:04 -06:00
Niadb
216a63dcf7
Add files via upload 2020-07-28 02:52:52 -06:00
Niadb
493fd40dca
Add files via upload 2020-07-28 02:52:15 -06:00
helloguo
82b0cd844f Optimize ZSTD_wildcopy 2020-07-27 22:08:52 -07:00
Yann Collet
8b9cdd2597 fixed overlapping count & workspace special case 2020-07-26 22:40:21 -07:00
Yann Collet
051232223f optimized histogram
new version easier to vectorize
leads to smaller code and faster execution
notably at the last recombination stage
(basically, fixed cost per block).

Assembly inspected with godbolt

On my laptop, with `clang` and `-mavx2` :
2K block : 1280 MB/s -> 1550 MB/s
8K block : 1750 MB/s -> 1860 MB/s
2020-07-26 22:24:22 -07:00
helloguo
6de87b3a74 fix preprocessor in ZSTD_wildcopy 2020-07-24 10:53:58 -07:00
Yann Collet
c224367ede ensure workspace is large enough
even when MAX_TABLELOG is reduced
2020-07-16 20:33:50 -07:00
Yann Collet
21c273da84 import some minor fixes from FSE project 2020-07-16 20:25:15 -07:00
Yann Collet
a44671b281
Revert "Fix -Wunused-variable under FUZZING_BUILD_MODE..." 2020-07-15 12:42:18 -07:00
Mitch Phillips
23b55d6b3e Fix -Wunused-variable under FUZZING_BUILD_MODE...
Fuzzing build modes (FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION) don't
necessarily imply that assert() is enabled, according to the manual.

When the current do-nothing is expanded under -Wunused-variable (-Wall),
it results in unused variables in some of the FUZZING_BUILD_MODE...
blocks.

This patch extends the do-nothing to avoid the unused variable.
2020-07-14 09:03:02 -07:00
Yann Collet
16b353b207 minor doc clarification regarding MT parameters 2020-07-11 02:16:52 -07:00
Yann Collet
2cdd33ae16
Merge pull request #2227 from yoshihitoh/single-file-dict-emscripten
[contrib] Fix single-file compilation error on Emscripten build.
2020-07-07 08:51:20 -07:00
dkcasset
82e7e2b47e Add variable for sed extended RE option (defaults to -E) 2020-06-29 13:44:23 -07:00
yoshihitoh
c6548eac8e Rename static vars to avoid redefinition error. 2020-06-29 10:51:50 +09:00
dkcasset
b0ed66ef92 Replace -E option with equivalent -r for older versions of sed 2020-06-26 10:43:28 -07:00
Nick Terrell
7afd5d85d3
Merge pull request #2218 from terrelln/assert-seq
Fix unused variable warnings in fuzzing build mode without asserts
2020-06-22 17:41:18 -07:00
Nick Terrell
081691a3aa
Merge pull request #2217 from terrelln/cover-redundant
[cover] Remove unnecessary mask and dedup hash functions
2020-06-22 17:41:13 -07:00
Nick Terrell
370933fa20
Merge pull request #2209 from Niadb/dev
Explicitly use __cdecl for qsort, to avoid warning when default calling convention is not __cdecl
2020-06-22 17:41:03 -07:00
Nick Terrell
cce0edfdbe Fix unused variable warnings in fuzzing build mode without asserts
Fix unused variable warnings when `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is defined but asserts are disabled.

Fixes #2210.
2020-06-22 12:56:57 -07:00
Nick Terrell
2312b819af [cover] Remove unnecessary mask and dedup hash functions
* Remove the unnecessary mask, since `ZSTD_hash*()` already ensures
  the output is mod 2^h.
* Dedup the hash functions and use `ZSTD_hash*()` directly.
2020-06-22 12:52:13 -07:00
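Why the extra mask is redundant can be seen with a small multiplicative hash. This is a sketch under stated assumptions: `example_hash4()` is a hypothetical function, not `ZSTD_hash*()`, and the multiplier is only an illustrative constant. Keeping the top `h` bits of a 32-bit product via a right shift already yields a value below 2^h, so masking with `2^h - 1` afterwards changes nothing.

```c
#include <stdint.h>

/* Hypothetical multiplicative hash, valid for 1 <= h <= 31.
 * The right shift keeps only the top h bits of the 32-bit product,
 * so the result is always in the range [0, 2^h). */
static uint32_t example_hash4(uint32_t value, unsigned h)
{
    return (uint32_t)(value * 2654435761U) >> (32 - h);
}

/* Same hash with an explicit mask applied afterwards; the mask is a no-op. */
static uint32_t example_hash4_masked(uint32_t value, unsigned h)
{
    uint32_t const mask = ((uint32_t)1 << h) - 1;
    return example_hash4(value, h) & mask;
}
```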
Nick Terrell
78601d0806
Merge pull request #2212 from cwoffenden/single-file-dict
Single-file libs now include dictBuilder
2020-06-22 12:46:54 -07:00
Nick Terrell
1047097dad [superblock] Add defensive assert and bounds check
The bounds check condition should always be met because we selected `set_basic` as
our encoding type. But that code is very far away, so assert that the condition
holds, so that we catch it if it is ever false, and add a bounds check.

Fixes #2213.
2020-06-22 10:21:38 -07:00
Carl Woffenden
38cdb6a072 Renamed cover and fast cover hash functions/vars 2020-06-22 11:54:24 +02:00
Carl Woffenden
4a9b7d136f Initial implementation (files added, macros fixed)
Hashing functions still to fix.
2020-06-22 10:31:36 +02:00
Niadb
74f65f624c
Update compiler.h
clean wording
2020-06-19 09:51:00 -06:00
Niadb
8c115cbe23
Update compiler.h
Added a comment explaining the purpose of the WIN_CDECL macro
2020-06-19 09:48:35 -06:00
Niadb
2962fda93f
Add files via upload 2020-06-19 03:34:05 -06:00
Niadb
405586d40a
Add files via upload 2020-06-19 03:32:11 -06:00
Niadb
a4c8aa5e02
Add files via upload 2020-06-19 03:31:47 -06:00
Nick Terrell
08981d2638 [lib] Allow compression dictionaries with missing symbols
Allow compression to use dictionaries with missing symbols in their
entropy tables. We set the FSE repeat mode to check when there are
missing symbols, and set the FSE repeat mode to valid when all symbols
are present.

Note that when not all symbols are present, the heuristics which favor
dictionary tables for lower compression levels won't activate.

Tested by manually creating a dictionary with missing symbols of every
type, and validating that the compressor rejects it before this change,
and accepts it after this change. Also, I ran the `dictionary_loader`
fuzzer for >1 hour of CPU time without running into cases where
compression succeeds, but decompression fails.

Fixes #2174.
2020-06-12 17:57:19 -07:00
Felix Handte
2af4e07326
Merge pull request #2133 from felixhandte/single-size-calculation
Consolidate CCtx Size Estimation Code
2020-05-28 13:07:18 -04:00
Yann Collet
39a32f40c9 fixed default rule for lib/Makefile
default rule is `lib-release`

`lib-release` wasn't working : it was just skipped.

Removing `lib-release` from the list of .PHONY targets fixes it.

Same for `lib-mt`.
2020-05-25 06:50:45 -07:00
Yann Collet
082755bd3f do not install zbuff.h
this API is deprecated, for a long time now,
all related symbols will be removed in a future version (likely v1.5.0)
and the header file `zbuff.h` doesn't compile from `include/` anyway,
because it needs to be positioned one directory below `zstd.h`.

Also removed `cover.h` from `cmake` installer,
as it should have never been part of this list to begin with.
2020-05-22 15:35:54 -07:00
Orivej Desh
93cec0c1d6 Fix legacy build after #2103 2020-05-22 12:48:02 +00:00
Nick Terrell
3cc227e90e [ldm][mt] Fix loadedDictEnd 2020-05-19 15:55:03 -07:00
Yann Collet
fdc56baa42
fix 22294 (#2151) 2020-05-18 21:05:10 -07:00
Nick Terrell
b2092c6dc4 [ldm] Reset loadedDictEnd when the context is reset 2020-05-18 12:35:44 -07:00
Nick Terrell
add7ed2d4a [lib] Fix bug in loading LDM dictionary in MT mode
Exposed when loading a dictionary < LDM minMatch bytes in MT mode.

Test Plan:
```
CC=clang make -j zstreamtest MOREFLAGS="-O0 -fsanitize=address"
./zstreamtest -vv -i100000000 -t1 --newapi -s7065 -t3925297
```

TODO: Add an explicit test that loads a small dictionary in MT mode
2020-05-14 11:52:28 -07:00
W. Felix Handte
3bb7992350 Fix Size Estimate for LDM Seq Space 2020-05-14 13:50:53 -04:00
Nick Terrell
70c80e19e6 [greedy] Fix performance instability 2020-05-12 17:51:16 -07:00
Nick Terrell
c3e921c639
Merge pull request #2131 from terrelln/raw-dict-fuzzer
Fix rare scenario with lazy parser, dictionary, and repcodes
2020-05-12 17:44:31 -07:00
W. Felix Handte
d9a1e37aec Nit: Fix Size Type for 32-bit 2020-05-12 18:03:31 -04:00
Nick Terrell
f800e72a3c [lib] Fix assertion when dictionary is prefix 2020-05-12 14:33:59 -07:00
W. Felix Handte
1aa6c7ccce Assert We Allocated Approximately What We Expected To 2020-05-12 16:55:03 -04:00
W. Felix Handte
27e2482217 Minor Refactor 2020-05-12 16:55:03 -04:00
W. Felix Handte
afc2488973 Handle Non-Static CCtxes in Estimation 2020-05-12 16:54:33 -04:00
W. Felix Handte
7ed996f5a0 Consolidate CCtx Size Estimation Code
This commit pulls out the internals of `ZSTD_estimateCCtxSize_usingCCtxParams`
into a helper. It then migrates two other callsites to use that helper,
a small optimization for `ZSTD_estimateCStreamSize_usingCCtxParams`, which
folds the buffer sizing into the helper, and then `ZSTD_resetCCtx_internal`,
which is more invasive.

This attempts to guarantee that the estimates returned to users are always
correct.
2020-05-12 16:26:53 -04:00
Nick Terrell
3c1eba4d99 [lib] Fix lazy repcode validity checks 2020-05-12 12:25:06 -07:00
Nick Terrell
4e0515916d [lib] Fix repcode validation in no dict mode 2020-05-12 11:57:15 -07:00
Nick Terrell
6d687a8816 [lib] Fix dictionary + repcodes + optimal parser 2020-05-12 10:36:53 -07:00
Nick Terrell
4b88bd3ee0 [lib][fuzz] Assert sequences are valid in round trip tests 2020-05-11 20:38:49 -07:00
Yann Collet
20bd246045 blindfix for VS macro redefinition 2020-05-11 19:29:36 -07:00
Yann Collet
76e726e3be updated documentation for ZSTD_estimate*()
make it clearer that tighter memory estimation
can be provided using advanced functions
on the condition of a defined input size bound.
2020-05-11 19:21:50 -07:00
Nick Terrell
80d3585e31 [lib] Fix lazy parser with dictionary + repcodes 2020-05-11 19:04:30 -07:00
Yann Collet
608f1bfc4c fixed context downsize with initStatic
When context is created using initStatic,
no resize is possible.

fix : only bump oversizeDuration when !initStatic
2020-05-11 18:16:38 -07:00
W. Felix Handte
c6636afbbb Fix ZSTD_estimateCCtxSize() Under ASAN
`ZSTD_estimateCCtxSize()` provides estimates for one-shot compression, which
is guaranteed not to buffer inputs or outputs. So it ignores the sizes of the
buffers, assuming they'll be zero. However, the actual workspace allocation
logic always allocates those buffers, and when running under ASAN, the
workspace surrounds every allocation with 256 bytes of redzone. So the 0-sized
buffers end up consuming 512 bytes of space, which is accounted for in the
actual allocation path through the use of `ZSTD_cwksp_alloc_size()` but isn't
in the estimation path, since it ignores the buffers entirely.

This commit fixes this.
2020-05-11 18:58:19 -04:00
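The accounting gap described above can be sketched with a few lines of arithmetic. All names here are hypothetical, and the model is simplified to two buffers (input and output), each surrounded by 256 bytes of redzone under ASAN as the commit message states. It is not the actual `ZSTD_cwksp_alloc_size()` logic.

```c
#include <stddef.h>

/* Redzone bytes added around each workspace allocation under ASAN,
 * taken from the commit message; the helper names are hypothetical. */
#define EXAMPLE_REDZONE_BYTES 256

/* Space an allocation of `size` bytes actually consumes in the workspace. */
static size_t example_alloc_size(size_t size, int asanEnabled)
{
    return size + (asanEnabled ? (size_t)EXAMPLE_REDZONE_BYTES : 0);
}

/* Estimate that ignores zero-sized buffers entirely
 * (the behavior the commit describes as underestimating). */
static size_t example_estimate_skipping(size_t inSize, size_t outSize, int asanEnabled)
{
    size_t total = 0;
    if (inSize)  total += example_alloc_size(inSize, asanEnabled);
    if (outSize) total += example_alloc_size(outSize, asanEnabled);
    return total;
}

/* Estimate that counts both buffers even when they are empty,
 * matching an allocation path that always allocates them. */
static size_t example_estimate_counting(size_t inSize, size_t outSize, int asanEnabled)
{
    return example_alloc_size(inSize, asanEnabled)
         + example_alloc_size(outSize, asanEnabled);
}
```

With both buffers empty and ASAN enabled, the skipping estimate returns 0 while the counting estimate returns 512, which is the discrepancy the commit message describes.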
W. Felix Handte
87c541c5f9 Only Trigger libzstd.pc Build on Unix-Like Platforms
We don't even define the rule on unsupported platforms.
2020-05-08 16:11:32 -04:00
W. Felix Handte
78aa9373cb Add libzstd.pc Build to More Aggregate Targets in Makefiles 2020-05-08 16:11:32 -04:00
W. Felix Handte
15561bcf74 Fix pkg-config File Generation Again Again
Resubmission of #2001. This switches the `sed` invocations to use `-E`,
extended regex syntax, which is better standardized across platforms.
I guess.

Same test plan:

```
make -C lib clean libzstd.pc
cat lib/libzstd.pc

echo # should fail
make -C lib clean libzstd.pc     LIBDIR=/foo
make -C lib clean libzstd.pc INCLUDEDIR=/foo
make -C lib clean libzstd.pc     LIBDIR=/usr/localfoo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/localfoo
make -C lib clean libzstd.pc     LIBDIR=/usr/local/lib     prefix=/foo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/include prefix=/foo

echo # should succeed
make -C lib clean libzstd.pc     LIBDIR=/usr/local/foo
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/foo
make -C lib clean libzstd.pc     LIBDIR=/usr/local/
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/
make -C lib clean libzstd.pc     LIBDIR=/usr/local
make -C lib clean libzstd.pc INCLUDEDIR=/usr/local
make -C lib clean libzstd.pc     LIBDIR=/tmp/foo prefix=/tmp
make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp
make -C lib clean libzstd.pc     LIBDIR=/tmp/foo prefix=/tmp/foo
make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp/foo

echo # should also succeed
make -C lib clean libzstd.pc prefix=/foo LIBDIR=/foo/bar INCLUDEDIR=/foo/
cat lib/libzstd.pc

mkdir out
cd out
cmake ../build/cmake
make
cat lib/libzstd.pc
```
2020-05-08 16:11:32 -04:00
Bimba Shrestha
df9e5b6f4c adding 2020-05-07 22:07:40 -05:00
Yann Collet
efc656c9a6
Merge pull request #2114 from facebook/verbose
support for verbose make
2020-05-07 13:16:33 -07:00
Yann Collet
1afe57cff7
Merge pull request #2112 from facebook/cfast
small speed improvement for strategy fast
2020-05-07 13:13:34 -07:00
caoyzh
969ba4f2b9 Change the modification of ZSTD_wildcopy() 2020-05-07 13:10:46 -07:00
caoyzh
a7e34ff693 revert ZSTD_reduceTable_internal()'s modification 2020-05-07 13:10:46 -07:00
caoyzh
9e802ede9c Modify indent of comments 2020-05-07 13:10:46 -07:00
caoyzh
7f75f05e84 Change "arm_neon.h" to system include <arm_neon.h> 2020-05-07 13:10:46 -07:00
caoyzh
b2e56f7f7f Optimize compression by using neon function. 2020-05-07 13:10:46 -07:00
Nick Terrell
45c66dd298 [zdict] Stabilize ZDICT_finalizeDictionary() 2020-05-07 10:37:01 -07:00
Yann Collet
cf854f4660 support for verbose make
A commonly accepted makefile idiom is V=1 or VERBOSE=1
to request the printing of all commands.

This is not "default" though, and must be manually added.

Example :
Before :
```
make libzstd
compiling dynamic library 1.4.5
creating versioned links

make libzstd V=1
compiling dynamic library 1.4.5
creating versioned links
```

After :
```
make libzstd
compiling dynamic library 1.4.5
creating versioned links

make libzstd V=1
compiling dynamic library 1.4.5
cc -DXXH_NAMESPACE=ZSTD_ -DZSTD_LEGACY_SUPPORT=5 -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef -Wpointer-arith -Wvla -Wformat=2 -Winit-self -Wfloat-equal -Wwrite-strings -Wredundant-decls -Wmissing-prototypes -Wc++-compat  -O3 common/debug.c common/entropy_common.c common/error_private.c common/fse_decompress.c common/pool.c common/threading.c common/xxhash.c common/zstd_common.c compress/fse_compress.c compress/hist.c compress/huf_compress.c compress/zstd_compress.c compress/zstd_compress_literals.c compress/zstd_compress_sequences.c compress/zstd_compress_superblock.c compress/zstd_double_fast.c compress/zstd_fast.c compress/zstd_lazy.c compress/zstd_ldm.c compress/zstd_opt.c compress/zstdmt_compress.c decompress/huf_decompress.c decompress/zstd_ddict.c decompress/zstd_decompress.c decompress/zstd_decompress_block.c deprecated/zbuff_common.c deprecated/zbuff_compress.c deprecated/zbuff_decompress.c dictBuilder/cover.c dictBuilder/divsufsort.c dictBuilder/fastcover.c dictBuilder/zdict.c legacy/zstd_v05.c legacy/zstd_v06.c legacy/zstd_v07.c -shared -fPIC -fvisibility=hidden -Wl,-soname=libzstd.so.1 -o libzstd.so.1.4.5
creating versioned links
ln -sf libzstd.so.1.4.5 libzstd.so.1
ln -sf libzstd.so.1.4.5 libzstd.so
```
2020-05-07 08:04:10 -07:00
Yann Collet
54144285fd small speed improvement for strategy fast
gcc 9.3.0 :
kennedy : 459 -> 466
silesia : 360 -> 365
enwik8  : 267 -> 269

clang 10.0.0 :
kennedy : 436 -> 441
silesia : 364 -> 366
enwik8  : 271 -> 272
2020-05-07 06:15:58 -07:00
Nick Terrell
5717bd39ee [lib] Fix NULL pointer dereference
When the output buffer is `NULL` with size 0, but the frame content size
is non-zero, we will write to the NULL pointer because our bounds check
underflowed.

This was exposed by a recent PR that allowed an empty frame into the
single-pass shortcut in streaming mode.

* Fix the bug.
* Fix another NULL dereference in zstd-v1.
* Overflow checks in 32-bit mode.
* Add a dedicated test.
* Expose the bug in the dedicated simple_decompress fuzzer.
* Switch all mallocs in fuzzers to return NULL for size=0.
* Fix a new timeout in a fuzzer.

Neither clang nor gcc show a decompression speed regression on x86-64.
On x86-32 clang is slightly positive and gcc loses 2.5% of speed.

Credit to OSS-Fuzz.
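The class of bug described above can be illustrated with a minimal sketch. The function names and the `margin` parameter are hypothetical and are not taken from the zstd sources; they only show how an unsigned subtraction inside a bounds check wraps around when the capacity is 0, and one way to guard against it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check: subtracts before comparing, so the unsigned
 * subtraction wraps around whenever capacity < margin. */
static int fits_unchecked(size_t capacity, size_t needed, size_t margin)
{
    return needed <= capacity - margin;
}

/* Hypothetical fix: rule out the wrap-around before subtracting. */
static int fits_checked(size_t capacity, size_t needed, size_t margin)
{
    if (capacity < margin) return 0;
    assert(capacity >= margin); /* the subtraction below cannot wrap */
    return needed <= capacity - margin;
}
```

With a capacity of 0 the unchecked variant reports that the write fits, which is how a `NULL` destination could end up being written to; the checked variant rejects it.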
2020-05-06 12:09:02 -07:00
Felix Handte
ad8dbae1b7
Merge pull request #2103 from felixhandte/relative-includes
Migrate Includes to Relative Paths
2020-05-06 09:42:23 -07:00
Yann Collet
c29fd7cd8b some more conversion warnings
hunting down some static analyzer warnings
2020-05-05 10:16:59 -07:00
Yann Collet
c1b836f4c3 fix minor conversion warnings 2020-05-04 14:43:09 -07:00
Felix Handte
8b327149a8
Merge pull request #1976 from felixhandte/minimal-lib-target
Add Minification Variable to `lib/Makefile`
2020-05-04 12:42:56 -07:00
W. Felix Handte
7b75d772b1 Remove Useless Assignment in Makefile 2020-05-04 15:20:26 -04:00
W. Felix Handte
6028827fee Rewrite Include Paths to be Relative
Addresses #1998.
2020-05-04 15:20:26 -04:00
Felix Handte
7e9aabd652
Merge pull request #2099 from felixhandte/compile-under-pedantic
Compile Under `-pedantic -Werror` and `-std=c90`
2020-05-04 10:07:13 -07:00
W. Felix Handte
fa5e01c467 Add Space-Optimized Helper Variable to Lib Makefile
This diff reorganizes the `lib/Makefile` to extract various settings that a
user would normally invoke together (supposing that they were aware of them)
if they were trying to build the smallest `libzstd` possible. It collects
these settings under a master setting `ZSTD_LIB_MIN_SIZE`.

Also document this new option.
2020-05-04 11:19:25 -04:00
Felix Handte
816ed80774
Merge pull request #1984 from MeghnaM/1636-Reduce-stack-usage-of-HUF_sort
Reduce stack usage of HUF_sort()
2020-05-04 08:15:31 -07:00
W. Felix Handte
3764859060 Switch Helper Declaration to Not Force Inline
It was causing build issues in ANSI mode.
2020-05-04 10:59:15 -04:00
W. Felix Handte
c7da66c9cf Purge C++-Style Comments (// ...), Make Compilation Succeed Under C90 2020-05-04 10:59:15 -04:00
W. Felix Handte
952427aebf Avoid inline Keyword in C90
Previously we would use it for all gcc-like compilations, even when a
restrictive mode that disallowed it had been selected.
2020-05-04 10:59:15 -04:00
W. Felix Handte
baa4e2e36c Don't Evaluate Arguments to Dummy Function 2020-05-04 10:59:15 -04:00
W. Felix Handte
450542d3a7 Allow Empty Format Strings in Error Macro Invocations
`-Wall` implies `-Wformat-zero-length`, which will cause compilation to fail
under `-Werror` when an empty string is passed as the format string to a
`printf`-family function. This commit moves us back to prefixing the provided
format string, which successfully avoids that warning.

However, this removes the failure mode where that `RAWLOG` invocation would
fail to compile when no format string was provided at all (which was desirable
to avoid having code that would successfully compile normally but fail under
`-pedantic`, which *does* require that a non-zero number of args are provided).

So this commit also introduces a function which does nothing at all, but will
fail to compile if not provided with at least one argument, which is a string.
This successfully links the compilability of pedantic and non-pedantic builds.
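A minimal sketch of the two ideas above, assuming C99 variadic macros. The names `require_format_string`, `LOG_INTO` and `log_demo`, and the `"LOG: "` prefix, are hypothetical and are not the macros used in the library.

```c
#include <assert.h>
#include <stdio.h>

/* Does nothing at run time. A call with no arguments, or with a first
 * argument that is not a string, is rejected by the compiler. */
static void require_format_string(const char *format, ...)
{
    (void)format;
}

/* Hypothetical logging macro. The "LOG: " prefix is concatenated with the
 * caller's string literal, so the final format string is never empty, even
 * when the caller passes "". The dead call enforces that a string is given. */
#define LOG_INTO(buf, size, ...)                        \
    do {                                                \
        if (0) { require_format_string(__VA_ARGS__); }  \
        snprintf((buf), (size), "LOG: " __VA_ARGS__);   \
    } while (0)

/* Usage: both invocations compile; LOG_INTO(buf, size) with no string
 * would fail because require_format_string() would get no argument. */
static void log_demo(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    LOG_INTO(buf, size, "");            /* buf holds "LOG: " */
    LOG_INTO(buf, size, "code %d", 7);  /* buf holds "LOG: code 7" */
}
```

Because the check lives in a call that is never executed, it costs nothing at run time while still being seen by the compiler in every build mode.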
2020-05-04 10:59:15 -04:00
W. Felix Handte
6696933b32 Make All Invocations Start With Literal Format String 2020-05-04 10:59:15 -04:00
W. Felix Handte
2745f7a7d5 Make Error Macro Invocation Without Info String Fail to Compile
Even without `-pedantic`, these macros will now fail to compile unless you
provide an info string argument. This will prevent us from regressing.
2020-05-04 10:59:15 -04:00
W. Felix Handte
5e5f262612 Add (Possibly Empty) Info Strings to All Variadic Error Handling Macro Invocations 2020-05-04 10:58:55 -04:00
Nick Terrell
e103d7b4a6
Fix superblock mode (#2100)
Fixes:

* Enable RLE blocks for superblock mode
* Fix the limitation that the literals block must shrink. Instead, when we're within 200 bytes of the next header byte size, we will just use the next one up. That way we should (almost?) always have space for the table.
* Remove the limitation that the first sub-block MUST have compressed literals and be compressed. Now one sub-block MUST be compressed (otherwise we fall back to raw block which is okay, since that is streamable). If no block has compressed literals that is okay, we will fix up the next Huffman table.
* Handle the case where the last sub-block is uncompressed (maybe it is very small). Before it would skip superblock in this case, now we allow the last sub-block to be uncompressed. To do this we need to regenerate the correct repcodes.
* Respect disableLiteralsCompression in superblock mode
* Fix superblock mode to handle a block consisting of only compressed literals
* Fix an off-by-1 error in superblock mode that disabled it whenever there were last literals
* Fix superblock mode with long literals/matches (> 0xFFFF)
* Allow superblock mode to repeat Huffman tables
* Respect ZSTD_minGain().

Tests:

* Simple check for the condition in #2096.
* When the simple_round_trip fuzzer enables superblock mode, it checks that the compressed size isn't expanded too much.

Remaining limitations:

* O(targetCBlockSize^2) because we recompute statistics every sequence
* Unable to split literals of length > targetCBlockSize into multiple sequences
* Refuses to generate sub-blocks that don't shrink the compressed data, so we could end up with large sub-blocks. We should emit those sections as uncompressed blocks instead.
...
Fixes #2096
2020-05-01 16:11:47 -07:00
Meghna Malhotra
0adfc8dfce Fix broken CI; make changes in response to the comments 2020-05-01 13:45:48 -07:00
Meghna Malhotra
53d76dc20f Remove magic constant and made other changes addressing the comments 2020-05-01 13:45:48 -07:00
Meghna Malhotra
fe8402b522 WIP: Still getting an error 2020-05-01 13:45:48 -07:00
Meghna Malhotra
a084d959bd WIP: Increased wksp size, but it's segfaulting 2020-05-01 13:45:48 -07:00
Meghna Malhotra
fdb2780c47 Move rank table into HUF_buildCTable_wksp() 2020-05-01 13:45:48 -07:00
Yann Collet
da2748a855
Merge pull request #2097 from facebook/underlink
Fix underlinked libzstd
2020-04-30 10:16:24 -07:00
Yann Collet
f77fd5ced0 generalized pattern rules 2020-04-28 18:43:55 -07:00
Yann Collet
c6ae2e83bc fix libzstd-mt underlinking issue
fix #2045
When compiling `libzstd` in multithreading mode,
the `libzstd-mt` recipe would not include `-pthread`,
resulting in an underlinked dynamic library.

Added a test on Travis to check that the library is fully linked.

This makes it possible, in some future release,
to build a multi-threaded `libzstd` dynamic library by default
as it would no longer impact the build script of user programs.
2020-04-28 18:29:20 -07:00
Nick Terrell
55a57d46be Add extra warnings about not modifying the ZSTD_outBuffer 2020-04-28 12:07:42 -07:00
Nick Terrell
77a2945c43 Add some comments 2020-04-27 20:04:04 -07:00
Nick Terrell
f33de06c3e [lib] Fix single-pass mode for empty frames 2020-04-27 20:04:01 -07:00
Nick Terrell
a4ff217baf [lib] Add ZSTD_d_stableOutBuffer 2020-04-27 18:09:44 -07:00
Nick Terrell
b104f8e3eb [zstd] Fix typo in ZSTD_dParameter 2020-04-27 12:12:28 -07:00
Bimba Shrestha
1875f616ce passing dictContentType instead of rawContent every time 2020-04-21 22:29:35 -07:00
Bimba Shrestha
5b0a452cac
Adding --long support for --patch-from (#1959)
* adding long support for patch-from

* adding refPrefix to dictionary_decompress

* adding refPrefix to dictionary_loader

* conversion nit

* triggering long mode on chainLog < fileLog and removing old threshold

* adding refPrefix to dictionary_round_trip

* adding docs

* adding enableldm + forceWindow test for dict

* separate patch-from logic into FIO_adjustParamsForPatchFromMode

* moving memLimit adjustment to outside ifdefs (need for decomp)

* removing refPrefix gate on dictionary_round_trip

* rebase on top of dev refPrefix change

* making sure refPrefix + ldm is < 1% of srcSize

* combining notes for patch-from

* moving memlimit logic inside fileio.c

* adding display for optimal parser and long mode trigger

* conversion nit

* fuzzer found heap-overflow fix

* another conversion nit

* moving FIO_adjustMemLimitForPatchFromMode outside ifndef

* making params immutable

* moving memLimit update before createDictBuffer call

* making maxSrcSize unsigned long long

* making dictSize and maxSrcSize params unsigned long long

* error on files larger than 4gb

* extend refPrefix test to include round trip

* conversion to size_t

* making sure ldm is at least 10x better

* removing break

* including zstd_compress_internal and removing redundant macros

* exposing ZSTD_cycleLog()

* using cycleLog instead of chainLog

* add some more docs about user optimizations

* formatting
2020-04-17 15:58:53 -05:00
Nick Terrell
5fcbc484c8
Merge pull request #2040 from caoyzh/dev-2
Optimize by prefetching on aarch64
2020-04-08 13:14:47 -07:00
Bimba Shrestha
c0d4b2b5a3
Merge pull request #2075 from bimbashrestha/dict_fuzzer_ref
[bug] handling case where prefix is NULL or 0 sized in refPrefix_advanced
2020-04-07 17:37:19 -05:00
Bimba Shrestha
1658ae75cd handling nil case for refprefix 2020-04-07 14:41:53 -07:00
Carl Woffenden
a93fadfcd9 Further replication removed
`CHECK_F` is now in `error_private.h`. Minor tidy.
2020-04-07 11:25:16 +02:00
Carl Woffenden
7af7735fa3 Merge remote-tracking branch 'upstream/dev' into single-file-lib 2020-04-07 11:13:02 +02:00
Carl Woffenden
edd9a07322 Code replicated in compression and decompression moved to shared headers
`CHECK_F` macro moved to `error_private.h` (shared between `fse_compress.c` and `fse_decompress.c`). `ZSTD_limitCopy()` moved to `zstd_internal.h` (shared between `zstd_compress.c` and `zstd_decompress.c`). Erroneous build artefact `zstd.h` removed from repo.
2020-04-07 11:02:06 +02:00
Bimba Shrestha
0154866749 moving consts to zstd_internal and reusing them 2020-04-03 14:26:15 -07:00
Bimba Shrestha
0a172c5e43 converting to if 2020-04-03 14:21:24 -07:00
Bimba Shrestha
3a4c8cc9b3 adding dctx to function name 2020-04-03 14:14:46 -07:00
Bimba Shrestha
ae47d50355 only computing sizes once 2020-04-03 14:12:23 -07:00
Bimba Shrestha
a4cbe79ccb Using in and out size together 2020-04-03 14:09:21 -07:00
Bimba Shrestha
936aa63ff1 adding oversized check on decompression 2020-04-03 13:25:32 -07:00
Bimba Shrestha
05574ec141 adding oversizeDuration to dctx and macros 2020-04-03 13:08:29 -07:00
Carl Woffenden
7c420344d2 Single-file decoder script can now (optionally) create an encoder
To complement the single-file decoder a new script was added to create an amalgamated single-file of all of the Zstd source, along with examples and (simple) tests.
2020-04-03 19:07:46 +02:00
Carl Woffenden
7202184ee0
Fixes decompressor when using -Wshorten-64-to-32 (#2062)
Spotted on iOS when building with `-Wshorten-64-to-32` (since `__builtin_expect` returns a `long`).
2020-04-03 02:55:29 -07:00
Nick Terrell
ac58c8d720 Fix copyright and license lines
* All copyright lines now have -2020 instead of -present
* All copyright lines include "Facebook, Inc"
* All licenses are now standardized

The copyright in `threading.{h,c}` is not changed because it comes from
zstdmt.

The copyright and license of `divsufsort.{h,c}` is not changed.
2020-03-26 17:02:06 -07:00
Nick Terrell
f5029e285f
Merge pull request #2050 from terrelln/align
Align decompress sequences loop to 32+16 bytes
2020-03-24 11:42:59 -07:00
Nick Terrell
8d0ee37ac0 Align decompress sequences loop to 32+16 bytes
The alignment is added before the loop, so this shouldn't hurt
performance in any case. The only way it hurts is if there is already
performance instability, and we force it to be stable but in the bad
case.

This consistently gets us into the good case with gcc-{7,8,9} on an
Intel i9-9900K and clang-9. gcc-5 is 5% worse than its best case but has
stable performance. We get consistently good behavior on my Macbook Pro
compiled with both clang and gcc-8. It ends up in the 50% from DSB and
50% from MITE case, but the performance is the same as the 85% DSB case,
so that's fine.
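A rough sketch of what "32+16 bytes" alignment before a loop can look like with GCC or Clang on x86-64. The function `sum_bytes` is a hypothetical stand-in for the real loop, and the exact directives used by the commit are not shown in this message, so treat the sequence below as an assumption. The compiler remains free to place instructions between the directives and the loop head, so this is a request rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n bytes. On x86-64 GCC/Clang builds, the directives before the loop
 * ask the assembler to place the following code 16 bytes past a 32-byte
 * boundary. Other targets compile the plain loop. */
static unsigned long sum_bytes(const unsigned char *src, size_t n)
{
    unsigned long total = 0;
    size_t i;
    assert(src != NULL || n == 0);
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__(".p2align 5"); /* pad to a 32-byte boundary */
    __asm__("nop");        /* step one byte past it */
    __asm__(".p2align 4"); /* pad to the next 16-byte boundary: 32k + 16 */
#endif
    for (i = 0; i < n; ++i) total += src[i];
    return total;
}
```

The padding is emitted before the loop and executed once, which is why the message expects it not to cost anything inside the loop itself.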
2020-03-23 19:40:31 -07:00
Nick Terrell
d34204a7b7
Merge pull request #2029 from terrelln/minor-opt
[opt] Update repcodes less often
2020-03-23 18:12:32 -07:00
caoyzh
7201980650 Optimize by prefetching on aarch64 2020-03-14 15:25:59 +08:00
Bimba Shrestha
66607d0eac
Merge pull request #2033 from bimbashrestha/icc
[opt] Small icc level 1 compression speed gain using #pragma vector
2020-03-10 20:42:19 -05:00
Bimba Shrestha
a89c45bdbd Typo 2020-03-10 15:19:48 -05:00
Bimba Shrestha
43fc88f443 Adding comment and removing ivdep 2020-03-10 14:57:27 -05:00
Bimba Shrestha
dba3abc95a Missed returns 2020-03-05 12:20:59 -08:00
Bimba Shrestha
a75e5f2ffc bitscan add undef check 2020-03-05 11:52:15 -08:00
Bimba Shrestha
85d0efd619 Removing no-tree-vectorize for intel 2020-03-05 10:02:48 -08:00
Bimba Shrestha
4c72a1a9c2 adding vector to main loop 2020-03-05 09:55:38 -08:00
Nick Terrell
81fda0419e [opt] Only update repcodes upon arrival 2020-03-04 17:57:15 -08:00
Nick Terrell
04744e52dc
Merge pull request #2028 from terrelln/minor-opt
[opt] Don't recompute initial literals price
2020-03-04 17:40:59 -08:00
Nick Terrell
0f9882deb9 [opt] Don't recompute repcodes while emitting sequences 2020-03-04 17:23:00 -08:00
Nick Terrell
c6caa2d04e [opt] Delete ZSTD_litLengthContribution 2020-03-04 16:35:26 -08:00
Nick Terrell
610171ed86 [opt] Explain why we don't include literals price 2020-03-04 16:29:19 -08:00
Nick Terrell
5f49578be7 [opt] Don't recompute initial literals price 2020-03-04 16:27:17 -08:00
Bimba Shrestha
cba46e9b7b Fixing ZSTD_c_compressionLevel confusing note 2020-03-03 13:12:02 -08:00
Nick Terrell
c836992be1 Don't log errors when ZSTD_fseBitCost() returns an error 2020-03-02 11:13:18 -08:00
Felix Handte
b669c5347a
Revert "Fix pkg-config File Generation Again" (#2016) 2020-02-26 10:52:49 -08:00
W. Felix Handte
e5ef935cf6 Fix Variable Capitalization 2020-02-18 13:40:58 -05:00
W. Felix Handte
73737231b9 Allow Manual Overriding of pkg-config Lib and Include Dirs
When the `PCLIBDIR` or `PCINCDIR` is non-empty (either because we succeeded
in removing the prefix, or because it was manually set), we don't need to
perform the check. This lets us trust users who go to the trouble of setting
a manual override, rather than still blindly failing the make.

They'll still be prefixed with `${prefix}/` / `${exec_prefix}/` in the
pkg-config file though.
2020-02-18 13:17:17 -05:00
W. Felix Handte
e668c9b528 Fix pkg-config File Generation Again
Revises #1851. Fixes #1900. Replaces #1930.

Thanks to @orbea, @neheb, @Polynomial-C, and particularly @eli-schwartz for
pointing out the problem and suggesting solutions.

Tested with

  ```
  make -C lib clean libzstd.pc
  cat lib/libzstd.pc

  # should fail
  make -C lib clean libzstd.pc     LIBDIR=/foo
  make -C lib clean libzstd.pc INCLUDEDIR=/foo
  make -C lib clean libzstd.pc     LIBDIR=/usr/localfoo
  make -C lib clean libzstd.pc INCLUDEDIR=/usr/localfoo
  make -C lib clean libzstd.pc     LIBDIR=/usr/local/lib     prefix=/foo
  make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/include prefix=/foo

  # should succeed
  make -C lib clean libzstd.pc     LIBDIR=/usr/local/foo
  make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/foo
  make -C lib clean libzstd.pc     LIBDIR=/usr/local/
  make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/
  make -C lib clean libzstd.pc     LIBDIR=/usr/local
  make -C lib clean libzstd.pc INCLUDEDIR=/usr/local
  make -C lib clean libzstd.pc     LIBDIR=/tmp/foo prefix=/tmp
  make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp
  make -C lib clean libzstd.pc     LIBDIR=/tmp/foo prefix=/tmp/foo
  make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp/foo

  # should also succeed
  make -C lib clean libzstd.pc prefix=/foo LIBDIR=/foo/bar INCLUDEDIR=/foo/
  cat lib/libzstd.pc

  mkdir out
  cd out
  cmake ../build/cmake
  make
  cat lib/libzstd.pc
  ```
2020-02-18 12:23:50 -05:00
Bimba Shrestha
80c26117a9 Line-wrapping 2020-02-03 09:38:16 -08:00
Bimba Shrestha
ee8a712af3 Using appliedParams instead of supplied params 2020-01-31 15:49:07 -08:00
Nick Terrell
e32e3e8662 Improve wildcopy performance across the board 2020-01-28 20:37:04 -08:00
Nick Terrell
7627759b4e
Merge pull request #1972 from terrelln/check-cont
Move ZSTD_checkContinuity() to zstd_decompress_block.c
2020-01-23 22:02:50 -08:00
Nick Terrell
fa6a772f38 Initialize dctx->bType to silence valgrind false positive 2020-01-23 17:54:48 -08:00
Nick Terrell
cb2abc3dbe Fix performance regression on aarch64 with clang 2020-01-23 17:31:14 -08:00
Nick Terrell
6e3cd5b024 Move ZSTD_checkContinuity() to zstd_decompress_block.c 2020-01-23 12:27:39 -08:00
Nick Terrell
a11a9271d6 Fix lowLimit underflow in overflow correction 2020-01-17 12:10:18 -08:00
Nick Terrell
036b30b555
Fix super block compression and stream raw blocks in decompression (#1947)
Super blocks must never violate the zstd block bound of input_size + ZSTD_blockHeaderSize. The individual sub-blocks may, but not the super block. If the superblock violates the block bound we are liable to violate ZSTD_compressBound(), which we must not do. Whenever the super block violates the block bound we instead emit an uncompressed block.

This means we increase the latency because of the single uncompressed block. I fix this by enabling streaming an uncompressed block, so the latency of an uncompressed block is 1 byte. This doesn't reduce the latency of the buffer-less API, but I don't think we really care.

* I added a test case that verifies that the decompression has 1 byte latency.
* I rely on existing zstreamtest / fuzzer / libfuzzer regression tests for correctness. During development I had several correctness bugs, and they easily caught them.
* The added assert that the superblock doesn't violate the block bound will help us discover any missed conditions (though I think I got them all).

Credit to OSS-Fuzz.
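The fallback rule described above can be sketched as follows. The helper `choose_block` and the enum are hypothetical names for illustration only; the 3-byte value is the size of a zstd block header.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_HEADER_SIZE 3 /* a zstd block header is 3 bytes */

typedef enum { EMIT_SUPERBLOCK, EMIT_RAW_BLOCK } block_choice;

/* Hypothetical helper: keep the super block only if its total size
 * (all sub-blocks together) stays within input_size + header size.
 * Otherwise fall back to a single uncompressed block. */
static block_choice choose_block(size_t inputSize, size_t superBlockSize)
{
    size_t const bound = inputSize + BLOCK_HEADER_SIZE;
    assert(inputSize <= (size_t)-1 - BLOCK_HEADER_SIZE); /* no overflow */
    return superBlockSize <= bound ? EMIT_SUPERBLOCK : EMIT_RAW_BLOCK;
}
```

For a 100-byte input, a 103-byte super block is still accepted, while a 104-byte one is replaced by an uncompressed block.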
2020-01-10 18:02:11 -08:00
Nick Terrell
d1cc9d2797
[fuzz] Allow zero sized buffers for streaming fuzzers (#1945)
* Allow zero sized buffers in `stream_decompress`. Ensure that we never have two
  zero sized buffers in a row so we guarantee forwards progress.
* Make case 4 in `stream_round_trip` do a zero sized buffer call followed by
  a full call to guarantee forwards progress.
* Fix `limitCopy()` in legacy decoders.
* Fix memcpy in `zstdmt_compress.c`.

Catches the bug fixed in PR #1939
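
The "never two zero sized buffers in a row" rule can be sketched like this. `next_buffer_size` and its parameters are hypothetical and do not mirror the fuzzer's actual code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper for a streaming fuzzer: `requested` is the size the
 * fuzzer input asked for, `prevWasEmpty` records whether the previous call
 * used a zero sized buffer. Never returns 0 twice in a row. */
static size_t next_buffer_size(size_t requested, size_t maxSize, int *prevWasEmpty)
{
    size_t size = requested < maxSize ? requested : maxSize;
    assert(prevWasEmpty != NULL && maxSize > 0);
    if (size == 0 && *prevWasEmpty) size = 1; /* force forward progress */
    *prevWasEmpty = (size == 0);
    return size;
}
```

A zero sized call is still exercised, but the call after it always has room for at least one byte, so the stream cannot stall.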
2020-01-09 11:38:50 -08:00
Igor Sugak
03ffda7b88 fix UBSAN's invalid-null-argument error in zstd_decompress.c (#1939) 2020-01-08 16:17:42 -08:00
Bimba Shrestha
b1f53b1a10 [fuzz] Dividing by targetCBlockSize instead of blockSize for nbBlocks fit (#1936)
* Adding fail logging for superblock flow

* Dividing by targetCBlockSize instead of blockSize

* Adding new const and using more accurate formula for nbBlocks

* Only do dstCapacity check if using superblock

* Removing disabling logic

* Updating test to make it catch more extreme case of previous bug

* Also updating comment

* Only taking compressEnd shortcut on non-superblock
2020-01-03 16:53:51 -08:00
Bimba Shrestha
56415efc76 Constifying, malloc check and naming nit 2019-12-17 17:16:51 -08:00
Bimba Shrestha
5225dcfc0f Adding bool to check if enough room left for noCompress superblocks 2019-12-13 15:47:28 -08:00
Yann Collet
d73e2fb465
Merge pull request #1891 from bimbashrestha/oss
[fuzz] Superblock fuzz issues
2019-12-10 13:17:00 -08:00
Bimba Shrestha
e1913dc87f Making const, removing unnecessary indent, changing parameter order 2019-12-04 15:51:17 -08:00
Bimba Shrestha
2ec556fec2 Moving init/end functions, moving compressSuperBlock inside body() 2019-12-04 15:23:13 -08:00
Bimba Shrestha
ffb0463041 Refactor 2019-12-04 14:52:27 -08:00
Bimba Shrestha
49c6d49247 [fuzz] msan uninitialized unsigned value (#1908)
Fixes new fuzz issue

Credit to OSS-Fuzz

* Initializing unsigned value

* Initializing to 1 instead of 0 because it's more conservative

* Unconditionally setting to check first and then checking zero

* Moving bool to before block for c90

* Move check set before block
2019-12-04 10:02:17 -08:00
Yann Collet
5120883a9c bumped version number
so that potential issue reports do not confuse `dev` with latest release
2019-12-03 17:06:42 -08:00
Bimba Shrestha
1fc9352f81 Using bss var instead of creating new bool 2019-12-02 21:39:06 -08:00
Bimba Shrestha
1f681d8592 Merge branch 'oss' of https://github.com/bimbashrestha/zstd into oss 2019-11-27 10:56:54 -08:00
Bimba Shrestha
a3a3c62b81 [fuzz] Only set HUF_repeat_valid if loaded table has all non-zero weights (#1898)
Fixes a fuzz issue where dictionary_round_trip failed because the compressor was generating corrupt files thanks to zero weights in the table.

* Only setting loaded dict huf table to valid on non-zero

* Adding hasNoZeroWeights test to fse tables

* Forbidding nbBits != 0 when weight == 0

* Reverting the last commit

* Setting table log to 0 when weight == 0

* Small (invalid) zero weight dict test

* Small (valid) zero weight dict test

* Initializing repeatMode vars to check before zero check

* Removing FSE changes to separate PR

* Reverting accidentally changed file

* Negating bool, using unsigned, optimization nit
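
A minimal sketch of the check named in the title, assuming the weights are available as a byte array. The helper names and the `repeat_mode` enum are hypothetical stand-ins, not the library's types.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when every symbol has a non-zero weight, 0 otherwise. */
static int has_no_zero_weights(const unsigned char *weights, size_t count)
{
    size_t i;
    assert(weights != NULL || count == 0);
    for (i = 0; i < count; ++i) {
        if (weights[i] == 0) return 0;
    }
    return 1;
}

typedef enum { REPEAT_CHECK, REPEAT_VALID } repeat_mode; /* hypothetical */

/* Start from "check" and only upgrade a loaded table to "valid" when it
 * can encode every symbol. */
static repeat_mode loaded_table_mode(const unsigned char *weights, size_t count)
{
    repeat_mode mode = REPEAT_CHECK;
    if (has_no_zero_weights(weights, count)) mode = REPEAT_VALID;
    return mode;
}
```

A table left in the "check" state has to be validated against the data before reuse, which avoids encoding a symbol that has no code.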
2019-11-26 12:24:19 -08:00
Bimba Shrestha
d4e17d0776 Negating bool, updating bool on inner branches 2019-11-26 12:17:43 -08:00
Nick Terrell
718f00ff6f
Optimize decompression speed for gcc and clang (#1892)
* Optimize `ZSTD_decodeSequence()`
* Optimize Huffman decoding
* Optimize `ZSTD_decompressSequences()`
* Delete `ZSTD_decodeSequenceLong()`
2019-11-25 18:26:19 -08:00
Bimba Shrestha
826b555463
Merge branch 'dev' into oss 2019-11-22 17:29:33 -08:00
Bimba Shrestha
10bce1919e Mixed declaration fix 2019-11-21 13:08:27 -08:00
Bimba Shrestha
0451accab1 Checking noCompressBlock explicitly for rep code confirmation 2019-11-21 13:06:26 -08:00
Nick Terrell
659e9f05cf Fix null pointer addition 2019-11-20 18:36:04 -08:00
Yann Collet
2d4dcce55f
Merge pull request #1894 from felixhandte/doc-clarify-dctx-reset
Easy: Update Comment on `ZSTD_initDStream()`
2019-11-19 16:18:56 -08:00
Nick Terrell
e0d6daabac Fix Appveyor failure 2019-11-19 11:12:26 -08:00
Bimba Shrestha
8f0c2d04c8 Going back to original flow but removing else return 2019-11-19 10:03:07 -08:00
W. Felix Handte
722149cf2b Easy: Update Comment on ZSTD_initDStream() 2019-11-19 01:57:15 -05:00
Nick Terrell
6a7f65117e
Merge pull request #1866 from legrosbuffle/dev
Optimized loop bounds to allow the compiler to unroll the loop.
2019-11-18 16:16:30 -08:00
Nick Terrell
a839d6852c
Merge pull request #1888 from senhuang42/superblocks_fixed
RLE test and re-enable RLE in main compression loop
2019-11-18 16:09:33 -08:00
Bimba Shrestha
80586f5e80 Reversing condition order and forwarding error 2019-11-18 13:53:55 -08:00
Bimba Shrestha
dade64428f Output regular uncompressed block when compressSequences fails 2019-11-18 08:43:14 -08:00
Bimba Shrestha
2d5d961a60 Typo in comment 2019-11-15 19:00:53 -08:00
Bimba Shrestha
dba767c0bb Leaving room for checksum 2019-11-15 18:44:51 -08:00
Vincent Torri
6b5c10b48c shared library: rename import library with .dll.a extension
Most open source projects use this extension for the import library.
The Win32 linker supports this extension; see
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/win32.html
section "direct linking to a dll"
2019-11-15 19:46:06 +01:00
Clement Courbet
b3c9fc27b4 Optimized loop bounds to allow the compiler to unroll the loop.
This has no measurable impact on large files but improves small file
decompression by ~1-2% for 10kB, benchmarked with:

head -c 10000 silesia.tar > /tmp/test
make CC=/usr/local/bin/clang-9 BUILD_STATIC=1 && ./lzbench -ezstd -t1,5 /tmp/test
2019-11-15 08:27:05 +01:00
Sen Huang
d9646dcbb5 Fixed main compression logic changes 2019-11-14 19:39:09 -05:00
Yann Collet
4b1ac69f19
Merge pull request #1868 from senhuang42/superblocks_fixed
Superblocks rebased for merge
2019-11-14 13:31:34 -08:00
Sen Huang
c26d32c91c Change superblock #include to be last 2019-11-14 13:12:17 -05:00
Yann Collet
d67742bc5d
Merge pull request #1858 from senhuang42/dictionary_header_size
Method to get dictionary header size
2019-11-14 09:44:07 -08:00
Sen Huang
c85d10d0ea Remove mixed declarations 2019-11-08 13:57:26 -05:00
Sen Huang
d9c475f3b3 Fix static analyze error, use proper bounds for dictEnd 2019-11-08 13:57:26 -05:00
Sen Huang
d06b90692b Move asserts to loadZstdDictionary() 2019-11-08 13:57:26 -05:00
Sen Huang
b39149e156 Expose ZSTD_reset_compressedBlockState() to shared API 2019-11-08 13:57:26 -05:00
Sen Huang
6ce335371b Add error forwarding to loadCEntropy(), make check for dictSize >= 8 from bad merge 2019-11-08 13:57:26 -05:00
Sen Huang
4a61aaf368 Remove redundant comment 2019-11-08 13:57:26 -05:00
Sen Huang
c787b351ea Use ZSTD Error codes, improve explanation of ZSTD_loadCEntropy() and ZSTD_loadDEntropy() 2019-11-08 13:57:26 -05:00
Sen Huang
04fb42b4f3 Integrated refactor into getDictHeaderSize, now passes tests 2019-11-08 13:57:26 -05:00
Sen Huang
0bcaf6db08 First working pass at refactor of loadZstdDictionary() 2019-11-08 13:57:26 -05:00
Sen Huang
4b141b63e0 Revert "Move decompress symbols into zstd_internal.h, remove dependency"
This reverts commit a152b4c67a5266f611db4a2eac4a79003852a795.
2019-11-08 13:57:26 -05:00
Sen Huang
84404cff6e Move decompress symbols into zstd_internal.h, remove dependency 2019-11-08 13:57:26 -05:00
Sen Huang
341e0641ed Checks malloc() for failure, returns 0 if so 2019-11-08 13:57:26 -05:00
Sen Huang
97b7f712f3 Change to heap allocation, remove implicit type conversion 2019-11-08 13:57:25 -05:00
Sen Huang
3c36a7f13a Add ZDICT_getHeaderSize() 2019-11-08 13:57:08 -05:00
Nick Terrell
8c474f9845 Fix parameter selection and adjustment with srcSize == 0 2019-11-07 08:58:43 -08:00
Felix Handte
5688447758
Merge pull request #1873 from felixhandte/make-overlap-log-multithread-only
Fix #1861: Restrict overlapLog Parameter When Not Built With Multithreading
2019-11-06 16:56:37 -05:00
Felix Handte
ba4613602f
Merge pull request #1843 from moozzyk/issue-1637
Take ZSTD_parameters as a const pointer
2019-11-06 16:56:14 -05:00
W. Felix Handte
c13f81905a Fix #1861: Restrict overlapLog Parameter When Not Built With Multithreading
This parameter is unused in single-threaded compression. We should make it
behave like the other multithread-only parameters, for which we only accept
zero when we are not built with multithreading.
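
The rule can be sketched as below. `BUILD_WITH_MULTITHREAD`, `params_t` and `set_overlap_log` are hypothetical names for this illustration, and the `-1` return stands in for whatever error code the library reports.

```c
#include <assert.h>
#include <stddef.h>

/* BUILD_WITH_MULTITHREAD is a hypothetical build flag for this sketch. */
typedef struct { int overlapLog; } params_t;

/* Returns 0 on success, -1 when the value is not supported by this build. */
static int set_overlap_log(params_t *params, int value)
{
    assert(params != NULL);
#ifndef BUILD_WITH_MULTITHREAD
    if (value != 0) return -1; /* multithread-only parameter: accept only 0 */
#endif
    params->overlapLog = value;
    return 0;
}
```

In a single-threaded build, setting the parameter to 0 still succeeds, so callers that always set it to the default keep working.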
2019-11-06 16:05:02 -05:00
Sen Huang
13bb7500e8 Fix frame argument to compression 2019-11-05 16:15:55 -05:00
Sen Huang
f2932fb5eb Fix more merge conflicts 2019-11-05 15:54:05 -05:00
Sen Huang
7ce891870c Fix merge conflicts 2019-11-05 15:51:25 -05:00
Bimba Shrestha
3fb5b106da Replacing some literals with constants 2019-11-05 10:26:57 -08:00
Nick Terrell
60205fec02 Fix 2 bugs in dictionary loading
* Silently skip dictionaries less than 8 bytes, unless using `ZSTD_dct_fullDict`.
  This changes the compressor, which silently skips dictionaries <= 8 bytes.
* Allow repcodes that are equal to the dictionary content size, since it is in bounds.
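
Both rules can be sketched as small predicates. The names below are hypothetical. Treating the `ZSTD_dct_fullDict` case as an error, and rejecting a repcode of 0, are assumptions of this sketch; the message above only states that short dictionaries are not silently skipped in that mode and that a repcode equal to the content size is in bounds.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DICT_AUTO, DICT_FULL } dict_type; /* hypothetical stand-ins */

/* Returns 1 to load, 0 to silently skip, -1 for an error (assumed). */
static int dict_load_decision(size_t dictSize, dict_type type)
{
    assert(type == DICT_AUTO || type == DICT_FULL);
    if (dictSize < 8) return type == DICT_FULL ? -1 : 0;
    return 1;
}

/* A repcode equal to the dictionary content size is still in bounds. */
static int repcode_in_bounds(size_t repcode, size_t dictContentSize)
{
    return repcode != 0 && repcode <= dictContentSize;
}
```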
2019-11-01 16:52:07 -07:00
Sen Huang
b9ede1c8c2 Make sure contentsize is known 2019-10-30 16:03:58 -04:00
Nick Terrell
9c1860861e Fix assert in ZSTD_safecopy
In the case that `op >= oend_w` it is possible that `diff < 8` because
the two buffers could be adjacent.

Credit to OSS-Fuzz, which found the bug. It isn't reproducible because
it depends on the memory layout.
2019-10-28 17:51:17 -07:00