Commit Graph

9629 Commits

Author SHA1 Message Date
Yonatan Komornik
4b24ebdcf3
Travis CI: fix by installing pip compatible with python 3.6 (#3041)
Pip install script no longer supports python3.6 by default, switched to a script that does.
2022-01-31 16:49:49 -08:00
Yann Collet
9a758ce520 update sequence_compression_api fuzzer test
to check for under-sized dstCapacity.
2022-01-31 16:17:11 -08:00
Yonatan Komornik
cc0657f27d
AsyncIO compression part 2 - added async read and asyncio to compression code (#3022)
* Compression asyncio:
- Added asyncio functionality for compression flow
- Added ReadPool for async reads, implemented in both comp and decomp flows
2022-01-31 15:43:41 -08:00
Nick Terrell
0b70da6277
Merge pull request #3020 from terrelln/cli-tests
Add new CLI testing platform
2022-01-31 10:02:27 -08:00
Eli Schwartz
c01582dc8a
travis CI: update meson image to one with a python that isn't EOL
Currently the build errors out with:

```
ERROR: This script does not work on Python 3.6 The minimum supported Python version is 3.7. Please use https://bootstrap.pypa.io/pip/3.6/get-pip.py instead.
```

While in theory this advice could be followed to get a better pip on
xenial, Meson has now deprecated python 3.6 support too, and the next
(unreleased) version requires python 3.7

There are a couple solutions to this:
- hold the version of pip, allow pip to only install 3.6-compatible
  versions of meson (effectively freezing meson going forward)
- install python 3.7 on xenial
- update to a 2-year-old image instead of a 4-year-old one

Option 3 is the simplest.
2022-01-30 22:45:23 -05:00
Eli Schwartz
ef78b9af30
meson: valgrind wrapper should return correct errors
While trying to raise an exception on failures, it instead raised an
exception for misusing the exception. CalledProcessError is only
supposed to be used when given a return code and a command, and it
prints:

Command '{cmd}' returned non-zero exit status {ret}

Passing an error message string instead, just errored out with:

TypeError: __init__() missing 1 required positional argument

Instead use the subprocess module's base error which does accept string
messages. Everything that used to error out, still errors out, but now
they do so with a slightly prettier console message.
2022-01-30 21:50:42 -05:00
Eli Schwartz
84c05453db
meson: never require a libm
libm is not guaranteed to exist. POSIX requires the math functions to
exist, but doesn't require to have it be a standalone library.

On platforms where libm exists as a standalone library, it will always
be found by meson -- it is shipped with libc.

If it is not found, then we can safely assume the linker will make the
math functions available by default.

See https://mesonbuild.com/howtox.html#add-math-library-lm-portably

Fixes building with bin_tests=true on Windows.
2022-01-30 21:50:42 -05:00
Eli Schwartz
5b2c6c776a
meson: fix resource file compilation on Windows
It needs to know about the correct include directories on its own.
2022-01-30 21:50:42 -05:00
Nick Terrell
8d65f87416 Fix static analysis false-positives
* It couldn't detect that the `fastCoverParams` can't be non-null, since it was just an assertion.
* It thought we were accesing `wksp->dtable` beyond the bounds because we were using it to set the `workSpace` value. Instead, compute the workspace size used in a different way.
2022-01-30 12:16:16 -08:00
Yann Collet
637b2d7a24 fixed bug 44168
discovered by oss-fuzz

It's a bug in the test itself :
ZSTD_compressBound() as an upper bound of the compress size
only works for data compressed "normally".
But in situations where many flushes are forcefully introduced,
this creates many more blocks,
each of which has a potential to increase the size by 3 bytes.
In extreme cases (lots of small incompressible blocks), the expansion can go beyond ZSTD_compressBound().

This situation is similar when using the CompressSequences() API
with Explicit Block Delimiters.
In which case, each explicit block acts like a deliberate flush.
When employed by a fuzzer, it's possible to generate scenarios like the one described above,
with tons of incompressible blocks of small sizes,
thus going beyond ZSTD_compressBound().

fix : when using Explicit Block Delimiters, use a larger bound, to account for this scenario.
2022-01-29 16:36:20 -08:00
Yann Collet
c9072dd60a
Merge pull request #3027 from brailovich/dev
fix for -r on empty directory
2022-01-29 15:40:43 -08:00
Yann Collet
85a13250c7
Merge pull request #3030 from terrelln/verbose-version
Print zlib/lz4/lzma library versions in verbose version output
2022-01-29 15:40:09 -08:00
Yann Collet
9a68840176 minor refactor to blocksplit
notably simplication of ZSTD_deriveSeqStoreChunk()
2022-01-27 20:24:35 -08:00
Yann Collet
a7285955f1
Merge pull request #3034 from facebook/fix44122
fix 44122 test error
2022-01-27 16:01:39 -08:00
Yann Collet
d64d5ddc57 fix 44122 test error
It's a bug in the test itself, in exceptional circumstances (no more space for additional sequence).

There should be enough room for all cases to work fine from now on,
and if not, we have an additional `assert()` to catch that situation.
2022-01-27 14:54:18 -08:00
Nick Terrell
1fc42de86a [CI] Hook cli-tests up to CI
Add cli-tests to `make test`. This adds a `python3` dependency to `make
test`, but not `make check`. We could make this dependency optional by
skipping the tests if `python3` is not present.
2022-01-27 13:56:59 -08:00
Nick Terrell
f3096ff6d1 [test] Add new CLI testing platform
Adds the new CLI testing platform that I'm proposing.
See the added `README.md` for details.
2022-01-27 13:56:59 -08:00
Nick Terrell
f088c430e3 [datagen] Remove extra newline printed
`datagen` was printing a `\n` even when it had no other output. Raise
the output level for the final `\n` to the minimum output level used.

This minor bug was caught by the new testing framework.
2022-01-27 13:56:59 -08:00
Nick Terrell
495dcb839a [zstdcli] Fix option detection for --auto-threads
The option `--auto-threads` should still be accepted and parsed, even if
`ZSTD_MULTITHREAD` is not defined. It doesn't mean anything, but we
should still accept the option. Since we want scripts to be able to work
generically.

This bug was caught by tests I added to the new testing framework.
2022-01-27 13:56:59 -08:00
Nick Terrell
246982e782 [dibio] Fix assertion triggered by no inputs
Passing 0 inputs to `DiB_shuffle()` caused an assertion failure where
it should just return.

A test is added in a later commit, with the initial introduction of the
new testing framework.

Fixes #3007.
2022-01-27 13:56:59 -08:00
Yann Collet
5d70ec0bc4
Merge pull request #3033 from facebook/fix44108
fix issue 44108
2022-01-27 10:57:48 -08:00
brailovich
9f37d1fede
Update playTests.sh
combination of -r with empty folder simplified to comply with sh compatibility tests
2022-01-27 08:22:05 -08:00
Yann Collet
bad7f82300
Merge pull request #2974 from facebook/fix2966_part3
Lazy parameters adaptation (part 1 - ZSTD_c_stableInBuffer)
2022-01-27 06:14:04 -08:00
Yann Collet
8df1257c3c fix issue 44108
credit to oss-fuzz

In rare circumstances, the block-splitter might cut a block at the exact beginning of a repcode.
In which case, since litlength=0, if the repcode expected 1+ literals in front, its signification changes.
This scenario is controlled in ZSTD_seqStore_resolveOffCodes(),
and the repcode is transformed into a raw offset when its new meaning is incorrect.

In more complex scenarios, the previous block might be emitted as uncompressed after all,
thus modifying the expected repcode history.
In the case discovered by oss-fuzz, the first block is emitted as uncompressed,
so the repcode history remains at default values: 1,4,8.

But since the starting repcode is repcode3, and the literal length is == 0,
its meaning is : = repcode1 - 1.
Since repcode1==1, it results in an offset value of 0, which is invalid.

So that's what the `assert()` was verifying : the result of the repcode translation should be a valid offset.

But actually, it doesn't matter, because this result will then be compared to reality,
and since it's an invalid offset, it will necessarily be discarded if incorrect,
then the repcode will be replaced by a raw offset.

So the `assert()` is not useful.
Furthermore, it's incorrect, because it assumes this situation cannot happen, but it does, as described in above scenario.
2022-01-27 05:49:59 -08:00
brailovich
501a353b91
Update playTests.sh 2022-01-26 18:56:52 -08:00
Yann Collet
f2d9652ad8 more usage of new error code stabilityCondition_notRespected
as suggested by @terrelln
2022-01-26 18:30:55 -08:00
Nick Terrell
e60eba58bf Print zlib/lz4/lzma library versions in verbose version output
Knowing the version of zlib/lz4/lzma we're linking against is very
useful for debugging issues with those libraries, so print it out in the
verbosity 4 version output.

Also print this information at the top of `playTests.sh`.
2022-01-26 18:25:58 -08:00
Yann Collet
7543085013
Merge pull request #3019 from facebook/huf_traces
More traces to improved debugging of literals compression
2022-01-26 18:02:05 -08:00
brailovich
5e7523385b
Update playTests.sh 2022-01-26 16:53:11 -08:00
brailovich
beb4872241
Update zstdcli.c 2022-01-26 16:51:18 -08:00
Yann Collet
8b46895588 removed new huffman depth heuristic
results are now identical to before this PR
2022-01-26 15:22:06 -08:00
Yann Collet
a66e8bb437 introduced LitHufLog constant
which properly represents the maximum bit size of compressed literals (11) as defined in the specification.

To be preferred from HUF_TABLELOG_DEFAULT which represents the same value but by accident.

Name selected to keep the same convention as existing width definitions,
MLFSELog, LLFSELog and OffFSELog.
2022-01-26 14:47:24 -08:00
Yann Collet
2d154e627a renamed HufLog into ZSTD_HUFFDTABLE_CAPACITY_LOG
old name was not descriptive and actually misleading
2022-01-26 14:47:24 -08:00
Yann Collet
32a5d95dcb moved HufLog to lib/decompress
it's only used to size decompression tables
2022-01-26 14:47:24 -08:00
Yann Collet
e9dd923fa4 only declare debug functions in debug mode 2022-01-26 14:47:24 -08:00
Yann Collet
5db717af10 proper max limit to 11 2022-01-26 14:47:24 -08:00
Yann Collet
4684836f4f update regression tests
minor compression ratio benefits in some cases,
no compression ratio regression in the measured scenarios.
2022-01-26 14:47:24 -08:00
Yann Collet
51da2d2ff2 improved compression of literals in specific corner cases
In rare cases, the default huffman depth selector is a bit too harsh,
requiring brutal adaptations to the tree,
resulting is some loss of compression ratio.
This new heuristic avoids the worse cases, favoring compression ratio.

As an example, compression of a specific distribution of 771 literals
is now improved to 441 bytes, from 601 bytes before.
2022-01-26 14:47:24 -08:00
Yann Collet
7616e39f3b adding traces to better track processing of literals 2022-01-26 14:47:21 -08:00
Yann Collet
a0acf9aa49
Merge pull request #3023 from facebook/fix_seqCompress_withDelimiter
fix sequence compression API in Explicit Delimiter mode
2022-01-26 14:15:28 -08:00
Yann Collet
dda4c10f07 added ZSTD_compressStream2() + ZSTD_c_stableInBuffer test 2022-01-26 13:33:04 -08:00
Yann Collet
cbff372d10 added helper function inBuffer_forEndFlush() 2022-01-26 11:05:57 -08:00
Yann Collet
b99ece96b9 converted checks into user validation generating error codes
had to create a new error code for this condition,
none of the existing ones were fitting enough.
2022-01-26 10:43:50 -08:00
Yann Collet
af3d9c506e added streaming test starting from non-0 pos 2022-01-26 10:31:25 -08:00
Yann Collet
c1668a00d2 fix extended case combining stableInBuffer with continue() and flush() modes 2022-01-26 10:31:25 -08:00
Yann Collet
270f9bf005 better consistency in accessing @input
as suggested by @terrelln.

Also : commented zstreamtest more
to ensure ZSTD_stableInBuffer is tested/
2022-01-26 10:31:24 -08:00
Yann Collet
8296be4a0a pretend consuming input to provide a sense of forward progress 2022-01-26 10:31:24 -08:00
Yann Collet
4b9d1dd9ff fixed incorrect comment 2022-01-26 10:31:24 -08:00
Yann Collet
27d336b099 minor behavior refinements
specifically, there is no obligation to start streaming compression with pos=0.
stableSrc mode is now compatible with this setup.
2022-01-26 10:31:24 -08:00
Yann Collet
37b87add7a make stableSrc compatible with regular streaming API
including flushStream().

Now the only condition is for `input.size` to continuously grow.
2022-01-26 10:31:24 -08:00