Commit Graph

1736 Commits

Author SHA1 Message Date
Yann Collet
fc2ea97442 refactored fuzzer tests for sequence compression api
add explicit delimiter mode to libfuzzer test
2022-01-26 00:19:35 -08:00
Yann Collet
cc7d23bcec
Merge pull request #2965 from facebook/offbase
Converge sumtype (offset | repcode) numeric representation towards offBase
2022-01-24 15:47:42 -08:00
binhdvo
17017ac8db
Change zstdless behavior to align with zless (#2909)
* Change zstdless behavior to align with zless
2022-01-21 19:57:19 -05:00
Yonatan Komornik
1598e6c634
Async write for decompression (#2975)
* Async IO decompression:
- Added --[no-]asyncio flag for CLI decompression.
- Replaced dstBuffer in decompression with a pool of write jobs.
- Added an ability to execute write jobs in a separate thread.
- Added an ability to wait (join) on all jobs in a thread pool (queued and running).
2022-01-21 13:55:41 -08:00
Yann Collet
71921e596f
Merge pull request #2983 from facebook/minLitPricev2
[opt] minor compression ratio improvement
2022-01-20 16:02:31 -08:00
Nick Terrell
8ea3d57de4 [build][asm] Pass ASFLAGS to the assembler instead of CFLAGS
* Add `-Wa,--noexecstack` to both `ASFLAGS` and `CFLAGS`
* Pass `ASFLAGS` to `.S` compilation instead of `CFLAGS`

Fixes #3006.
2022-01-18 15:11:29 -08:00
Yann Collet
05e884f288
Merge pull request #2956 from sunwire/fix-tar-test-cases
Fix tar test cases
2022-01-11 10:16:34 -08:00
Yann Collet
5595aec629 updated regression results 2022-01-07 15:08:06 -08:00
Nick Terrell
c7b03c217c [license] Fix license header of huf_decompress_amd64.S
* Add the license header for `huf_decompress_amd64.S`
* Add `.S` files to the `test-license.py` test
2022-01-07 09:35:27 -08:00
Felix Handte
7e679511a8
Merge pull request #2964 from felixhandte/noexecstack-all-archs
Mark Huffman Decoder Assembly `noexecstack` on All Architectures
2022-01-05 16:52:39 -05:00
W. Felix Handte
ff5d1daf33 Clean Up Debugging Statements 2022-01-05 16:13:00 -05:00
W. Felix Handte
35208f702f Add Test Validating Stack is not Executable in playTests.sh 2022-01-04 16:57:35 -05:00
Yann Collet
a0b9520e38 fullbench: added compress_freshCCtx scenario 2022-01-02 08:52:33 -08:00
Yann Collet
213dc6110f fixed fullbench freshCCtx scenario 2022-01-01 23:46:49 -08:00
Yann Collet
03903f5701 fixed minor compression difference in btlazy2
subtle dependency on sumtype numeric representation
2021-12-29 18:51:03 -08:00
Yann Collet
7a18d709ae updated all names to offBase convention 2021-12-29 17:30:43 -08:00
Yann Collet
8da414231d found a few more places which were dependent on seqStore offcode sumtype numeric representation 2021-12-28 17:03:24 -08:00
Yann Collet
681c81f06c abstracted storeSeq() sumtype numeric representation from decodecorpus.c 2021-12-28 11:58:33 -08:00
Yann Collet
435f5a2e6d fixed regression test assert
optLdm->offset might be == 0 in invalid case.
Only use STORE_OFFSET() after validating it's a correct case.
2021-12-28 09:55:31 -08:00
Paweł Marciniak
666372c7bf Fix tar test cases 2021-12-24 15:05:26 +01:00
Yann Collet
1aed962216 introduce macros STORE_OFFSET() and STORE_REPCODE()
this meant to abstract the sumtype representation required
to transfert `offcode` to `ZSTD_storeSeq()`.

Unfortunately, the sumtype numeric representation is currently a leaky abstraction
that has permeated many other parts of the code,
especially within `zstd_lazy.c` and also within `zstd_opt.c` and `zstd_compress.c`.

While this PR makes a good job a transfering a large nb of call sites
to using the new macros, there are still a few sites where this transformation is more complex,
or where the numeric representation itself it used "as is".

One of the problematics area is the decision to use the numeric format of the sumtype
within the match finders of `zstd_lazy`.

This commit doesn't change the behavior, it only introduces and employes the macros,
but eventually the resulting code remains identical.

At target, if the numeric representation of the sumtype can be completely abstracted
and no other part of the code depends on it,
it will be possible to move it towards something slightly more efficient.
2021-12-23 22:03:30 -08:00
Yann Collet
aeff128331 change seqDef.offset into seqDef.offBase
to better reflect the value stored in this field.
2021-12-23 17:56:08 -08:00
Yann Collet
e145b58cfd changed seqDef.matchLength into seqDef.mlBase
since this is effectively what is stored in this field (== matchLength - MINMATCH).
This makes it clearer what needs to be done when reading from / writing to this field.
2021-12-23 13:39:46 -08:00
Yann Collet
b77fcac61f change ZSTD_storeSeq() interface to accept matchLength
instead of mlBase.

This removes the need to do `- MINMATCH` at every call site.

The new interface contract is checked with an `assert()`.
2021-12-23 12:03:33 -08:00
Yann Collet
a9e43b37d0
Revert "Limit ZSTD_maxCLevel to 21 for 32-bit binaries." 2021-12-20 11:43:14 -08:00
Yann Collet
01adddc3e0 update regression results 2021-12-16 20:43:23 -08:00
Yann Collet
7c7b9244d6 update regression results 2021-12-16 16:07:54 -08:00
Yann Collet
9a32492730 updated regression results.csv 2021-12-16 14:39:30 -08:00
Elliot Gorokhovsky
c5f1e826ca
Merge pull request #2925 from embg/dict_training_sample_limit_size
Allow user to specify memory limit for dictionary training
2021-12-15 15:58:17 -05:00
Elliot Gorokhovsky
71c0c07c19 Allow user to specify memory limit for dictionary training 2021-12-14 14:29:01 -05:00
Felix Handte
5e2fede604
Merge pull request #2921 from felixhandte/neg-lvl-stagger-step
Stagger Stepping in Negative Levels
2021-12-14 14:13:57 -05:00
W. Felix Handte
450fca9704 Update Regression Tests w/ New Sizes 2021-12-13 17:29:32 -05:00
Nick Terrell
3e2a70b6fb
Merge pull request #2905 from 15596858998/dev_1205
add test case
2021-12-13 13:45:23 -08:00
zx123123
c69d13eb99
Update playTests.sh 2021-12-13 08:58:42 +08:00
Felix Handte
0c26d98c0d
Merge pull request #2910 from felixhandte/reject-irregular-dicts
Reject Irregular Dictionary Files
2021-12-09 11:44:37 -05:00
W. Felix Handte
9985e10fda Reject Irregular Dictionary Files
I hadn't seen #2890, so I wrote my own version. I like this approach a little
better, since it does an explicit check for a regular file, rather than
passing a magic value.

Addresses #2874.
2021-12-08 16:17:04 -05:00
Yann Collet
fb3522a3fe fixed very minor cast warning under cygwin 2021-12-08 09:48:56 -08:00
binhdvo
38dfc4699e
Imply -q when stderr is not a tty (#2884)
* Imply -q when stderr is not a tty
2021-12-07 16:56:19 -05:00
15596858998
ba7e8f1051 add test case 2021-12-05 19:12:52 +08:00
Nick Terrell
01ecd6ffc0
Merge pull request #2892 from terrelln/issue-2785
[CircleCI] Fix short-tests-0
2021-12-02 16:20:56 -05:00
Alexander Kanavin
1e514feec6 Makefile: sort all wildcard file list expansions
Otherwise the order is non-deterministic and breaks
reproducible builds.
2021-12-02 12:04:11 +01:00
Nick Terrell
91f5891dd0 [CircleCI] Fix short-tests-0
short-tests-0 were silently failing. I think because of the && make clean construction. Switch to ; instead.

Also fix all the test failures that were exposed.

`make all` is failing on CircleCI because it is missing Docker. Move that test
to GitHub actions, and switch the pedantic CircleCI test to `make allmost`.
2021-12-01 17:43:46 -08:00
Nick Terrell
9b97fdf74f
Merge pull request #2887 from terrelln/issue-2815
[zdict] Remove ZDICT_CONTENTSIZE_MIN restriction for ZDICT_finalizeDictionary
2021-12-01 18:15:53 -05:00
Yann Collet
49d578763d reduce storage requirement
51 MB seems excessive for CI storate and considering the nature of the test.

(note : maybe we should consider using `/tmp` for files generated during tests,
as tmpfs is typically using RAM, thus preserving storage.)
2021-12-01 15:10:55 -08:00
Yann Collet
3133d1e86e
Merge pull request #2876 from 15596858998/dev
Solve the bug of extra output newline character
2021-12-01 15:10:08 -08:00
Yann Collet
8031dc7a48
Merge pull request #2885 from yoniko/limit-level-32bit-systems
Limit `ZSTD_maxCLevel` to 21 for 32-bit binaries.
2021-12-01 14:19:16 -08:00
Nick Terrell
360c2630e4 [test] Test that the exec-stack bit isn't set on libzstd.so
Tests that libzstd.so doesn't have the exec-stack bit set using
readelf. If the stack is marked executable systemd will refuse
to link against zstd. We now test that it isn't set on every PR.

Adds a test for PR #2857
Fixes Issue #2865
2021-11-30 18:13:00 -08:00
Nick Terrell
0356e05f07 [zdict] Remove ZDICT_CONTENTSIZE_MIN restriction for ZDICT_finalizeDictionary
Allow the `dictContentSize` to be any size. The finalized dictionary
content size must be at least as large as the maximum repcode (8). So we
add zero bytes to the dictionary to ensure that we meet that
requirement.

I've removed this restriction because its been causing us headaches when
people complain that dictionary training failed. It fails because there
isn't enough useful content to put in the dictionary. Either because
every sample is exactly the same and less than ZDICT_CONTENTSIZE_MIN bytes,
or there isn't enough content. Instead, we should succeed in creating
the dictionary, and it is up to the user to decide if it is worthwhile.
It is possible that the tables alone provide enough value.

NOTE: This allows us to produce dictionaries with finalized
`dictContentSize < ZDICT_CONTENTSIZE_MIN`. But, they are still valid
zstd dictionaries. We could remove the `ZDICT_CONTENTSIZE_MIN` macro,
but I've decided to leave that for now, so we don't break users.
2021-11-30 18:02:26 -08:00
Yonatan Komornik
ef2cba609d ZSTD_maxCLevel now limited to 21 for 32-bit binaries.
CI tests for constrained memory runs with max level on 32-bit binaries.
2021-11-30 10:31:52 -08:00
15596858998
7ce6f71c6f 更新 playTests.sh 2021-11-21 12:35:58 +08:00
Dimitris Apostolou
ebbd675998
Fix typos 2021-11-13 10:04:04 +02:00
Yann Collet
ddae153947
Merge pull request #2847 from Svetlitski-FB/improve-verbose-output-2
Display command line parameters with concrete values in verbose mode
2021-11-12 18:51:18 -08:00
Kevin Svetlitski
9b28c26cbf Integrate verbose mode tests into playTests.sh 2021-11-12 14:10:21 -08:00
Kevin Svetlitski
f6ffd39230 Add test case for detailed compression parameter verbose output 2021-11-11 11:59:54 -08:00
binhdvo
931778ed9b
Fix fullbench CI failure (#2851) 2021-11-09 15:15:35 -05:00
binhdvo
6a7ede3dfc
Reduce size of dctx by reutilizing dst buffer (#2751)
* Reduce size of dctx by reutilizing dst buffer

Co-authored-by: Binh Vo <binhvo@fb.com>
2021-10-25 10:38:01 -04:00
Felix Handte
23c1a2d260
Merge pull request #2774 from felixhandte/zstd-dfast-pipelined-single
Pipelined Implementation of ZSTD_dfast
2021-10-13 16:38:43 -04:00
Nick Terrell
c6c482fe07 [binary-tree] Fix underflow of nbCompares
Fix underflow of `nbCompares` by switching to an `int` and comparing
`nbCompares > 0`. This is a minimal fix, because I don't want to change
the logic. These loops seem to be doing `nbCompares + 1` comparisons.

The bug was reported by Dan Carpenter and found by Smatch static
checker.

https://lore.kernel.org/all/20211008063704.GA5370@kili/
2021-10-08 13:22:55 -07:00
W. Felix Handte
168d0a3c89 Fix Flaky Test
This test depended on `_extDict` and `_noDict` compressing identically, which
is not a guarantee we make, AFAIK.
2021-10-05 16:18:09 -04:00
W. Felix Handte
c2c32839dc Update results.csv 2021-10-05 16:18:00 -04:00
Yann Collet
7868f38019
Merge pull request #2747 from Helflym/dev
Add AIX support in Makefile
2021-10-01 08:13:39 -07:00
Sen Huang
9360367371 Update regression test 2021-09-28 08:29:11 -07:00
Sen Huang
b8fd6bf30c Skip most long matches in lazy hash table update 2021-09-28 08:19:39 -07:00
Norbert Lange
0d45540695 decompress: conditionally remove bmi2 from context
Use an helper function, which will just return 0 in case
the feature is disabled.
Allows constant propagation and removal of dead code.
2021-09-26 14:41:37 +02:00
sen
044c8b4722
Merge pull request #2779 from senhuang42/fse_fix
Fix NCountWriteBound
2021-09-22 13:51:21 -04:00
sen
1e99d36361
Merge pull request #2788 from senhuang42/param_switch
Use new paramSwitch enum for row matchfinder and block splitter
2021-09-22 13:27:55 -04:00
senhuang42
99b5e7b8c2 Add test case for FSE over-write 2021-09-22 12:03:46 -04:00
senhuang42
b5c35d7ea3 Use new paramSwitch enum for LCM, row matchfinder, and block splitter 2021-09-21 14:22:02 -04:00
Nick Terrell
a5f2c45528 Huffman ASM 2021-09-20 14:46:43 -07:00
Nick Terrell
d7542aacd9 [fuzzer] Add huf_decompress fuzzer
Add a fuzzer for Huffman decompression. Fix several bugs in Huffman
decompression, mostly related to `op == NULL` and pointer underflow.
2021-09-17 15:00:49 -07:00
Nick Terrell
8bf699aa59 [build] Add support for ASM files in Make + CMake
* Extract out common portion of `lib/Makefile` into `lib/libzstd.mk`.
  Most relevantly, the way we find library files.
* Use `lib/libzstd.mk` in the other Makefiles instead of repeating the
  same code.
* Add a test `tests/test-variants.sh` that checks that the builds of
  `make -C programs allVariants` are correct, and run it in Actions.
* Adds support for ASM files in the CMake build.

The Meson build is not updated because it lists every file in zstd,
and supports ASM off the bat, so the Huffman ASM commit will just add
the ASM file to the list.

The Visual Studios build is not updated because I'm not adding ASM
support to Visual Studios yet.
2021-09-17 14:13:53 -07:00
Yann Collet
fd94b9d1c9 Merge branch 'dev' into opt_investigation 2021-09-14 01:15:51 -07:00
Sen Huang
d45d0ad9d8 Update regression test 2021-09-13 12:41:02 -04:00
Sen Huang
4a498fb9c3 Add a dictionary training large corpus test 2021-09-13 12:29:17 -04:00
Yann Collet
b6b2855b80 updated regression tests 2021-09-12 10:22:35 -07:00
Yann Collet
f58e63bee7 Merge branch 'dev' into opt_investigation 2021-09-12 01:42:49 -07:00
Yann Collet
640c5b1f77 fix automated_benchmarking
make it able to process text output sent into either stdout or stderr
2021-09-12 01:36:18 -07:00
Felix Handte
d68aa19a2f
Merge pull request #2749 from felixhandte/zstd-fast-pipelined
Pipelined Implementation of ZSTD_fast (~+5% Speed)
2021-09-09 17:05:30 -04:00
Yann Collet
5449ede2e6 make automated-benchmarking faster
by employing parallel compilation of object files.
2021-09-08 15:12:28 -07:00
Yann Collet
4f0b1b9ee5 update regression tests 2021-09-08 14:37:42 -07:00
Yann Collet
7fce9a41b5 change update rate to 12/11/11/11
better for large files, and sources with relatively "stable" entropy,
like silesia.tar.
slightly worse for files with rapidly changing entropy,
like Calgary.tar/.

Updated small files tests in fuzzer
2021-09-08 14:05:57 -07:00
Yann Collet
b096a5c626 updated regression tests 2021-09-07 09:55:14 -07:00
Yann Collet
27a8bbe265 new initializer for ll price 2021-09-03 16:07:31 -07:00
Yann Collet
f0fc8cb3e1 Disable console notification by default within the library
As a library, the default shouldn't be to write anything on console.
`cover` and `fastcover` have a `g_displayLevel` variable to control this behavior.
It's now set to 0 (no display) by default.
Setting notification to a higher level should be an explicit operation by a console application.
2021-09-03 13:44:07 -07:00
Yann Collet
40e44bd56d updated regression tests 2021-09-01 13:26:39 -07:00
W. Felix Handte
b0977e4ed2 Update results.csv 2021-09-01 14:45:00 -04:00
W. Felix Handte
98d3df326b Change Target Size in Fuzzer
It's a bit strange, because this is hitting the dictionary special case where
the dictionary is contiguous with the input and still runs in the single-
segment path.

We should probably change that to hit the `extDict` path instead?
2021-09-01 14:15:04 -04:00
Yann Collet
c1de65535f Merge branch 'dev' into qemu 2021-08-31 08:16:46 -07:00
Yann Collet
f21977c5e6 fix playTests.sh when EXE_PREFIX not null 2021-08-29 17:20:12 -07:00
Yann Collet
72bd2a83a0 reduce length of scanbuild static analyzer test
This was ~30mn, by far the longest run on travisCI.
That's because it re-analyzes multiple times the same files (library files notably).
It also performs actions that make no sense for the static analyzer purpose,
such as building the single-file library.

Reduced time spent in this test by reducing its scope :
just build the CLI, and obviously the library along it.
These are the only ones that really deserve to be analyzed.

Unfortunately, it still results in a number of false positives when using newer versions of scanbuild
(each version of scanbuild generates a different list of false positives).
These will have to be fixed before transfering to Github Actions.
2021-08-29 15:26:31 -07:00
Yann Collet
7f37b8a547 accelerate versionsCompatibilityTest
by allowing parallel build of units,
and reducing optimization levels.

Parallel build is only effective on "recent" versions of `zstd`,
as previously, the list of units was passed as a list of source files,
which is something neither `make` nor `gcc` can parallelize.
So its impact is mildly effective (-20%).

Reducing optimization level to `-O1` makes compilation much faster.
It also makes runtime slower,
but in this test, compilation time dominates run time.
The savings are very significant (-50%).

On my test system, it reduces the length of this test from 13mn to 5mn.
2021-08-29 14:48:11 -07:00
Clément Chigot
6ef6cd7999 test: avoid /dev/full on AIX 2021-08-13 10:25:50 +02:00
Clément Chigot
399849e236 Makefile: add AIX support
For lib, AIX linker doesn't allow --soname.
2021-08-13 10:25:14 +02:00
sen
6a258043f5
Merge pull request #2692 from senhuang42/rebalance_clevel
[RFC] Rebalance compression levels
2021-08-06 12:51:31 -04:00
Sen Huang
539b3aab9b Optimize 32-bit VecMask_next() 2021-08-04 17:14:58 -04:00
Felix Handte
ae131282f5
Merge pull request #2742 from felixhandte/set-mtime
Set mtime on Output Files
2021-08-04 16:44:46 -04:00
senhuang42
e411040ea1 Add 64 row entry support for lazy 2021-08-04 16:19:12 -04:00
W. Felix Handte
a8f4612eab Remove sleep()s in Test; Replace with Artificial mtime
This behavior of `touch` is standardized in POSIX, so it should be available.
2021-08-04 15:56:46 -04:00
senhuang42
aa1957477b Improve Huffman sorting algorithm 2021-08-04 12:43:34 -04:00
W. Felix Handte
ead41bcb4e Add Test to Verify mtime is Copied to Destination 2021-08-03 17:22:58 -04:00