Commit Graph

152 Commits

Author SHA1 Message Date
Nick Terrell
1185dfb8d1 [fuzz] Add raw dictionary content fuzzer 2020-05-11 19:03:33 -07:00
Nick Terrell
301a62fe08 [fuzz] Fix compress bound for dictionary_round_trip 2020-05-11 19:00:52 -07:00
Nick Terrell
5717bd39ee [lib] Fix NULL pointer dereference
When the output buffer is `NULL` with size 0, but the frame content size
is non-zero, we will write to the NULL pointer because our bounds check
underflowed.

This was exposed by a recent PR that allowed an empty frame into the
single-pass shortcut in streaming mode.

* Fix the bug.
* Fix another NULL dereference in zstd-v1.
* Overflow checks in 32-bit mode.
* Add a dedicated test.
* Expose the bug in the dedicated simple_decompress fuzzer.
* Switch all mallocs in fuzzers to return NULL for size=0.
* Fix a new timeout in a fuzzer.

Neither clang nor gcc show a decompression speed regression on x86-64.
On x86-32 clang is slightly positive and gcc loses 2.5% of speed.

Credit to OSS-Fuzz.
2020-05-06 12:09:02 -07:00
Nick Terrell
e103d7b4a6
Fix superblock mode (#2100)
Fixes:

Enable RLE blocks for superblock mode
Fix the limitation that the literals block must shrink. Instead, when we're within 200 bytes of the next header byte size, we will just use the next one up. That way we should (almost?) always have space for the table.
Remove the limitation that the first sub-block MUST have compressed literals and be compressed. Now one sub-block MUST be compressed (otherwise we fall back to raw block which is okay, since that is streamable). If no block has compressed literals that is okay, we will fix up the next Huffman table.
Handle the case where the last sub-block is uncompressed (maybe it is very small). Before it would skip superblock in this case, now we allow the last sub-block to be uncompressed. To do this we need to regenerate the correct repcodes.
Respect disableLiteralsCompression in superblock mode
Fix superblock mode to handle a block consisting of only compressed literals
Fix a off by 1 error in superblock mode that disabled it whenever there were last literals
Fix superblock mode with long literals/matches (> 0xFFFF)
Allow superblock mode to repeat Huffman tables
Respect ZSTD_minGain().
Tests:

Simple check for the condition in #2096.
When the simple_round_trip fuzzer enables superblock mode, it checks that the compressed size isn't expanded too much.
Remaining limitations:

O(targetCBlockSize^2) because we recompute statistics every sequence
Unable to split literals of length > targetCBlockSize into multiple sequences
Refuses to generate sub-blocks that don't shrink the compressed data, so we could end up with large sub-blocks. We should emit those sections as uncompressed blocks instead.
...
Fixes #2096
2020-05-01 16:11:47 -07:00
Nick Terrell
1343b815f8 [fuzz] Fuzz test ZSTD_d_stableOutBuffer 2020-04-27 20:04:04 -07:00
Bimba Shrestha
5b0a452cac
Adding --long support for --patch-from (#1959)
* adding long support for patch-from

* adding refPrefix to dictionary_decompress

* adding refPrefix to dictionary_loader

* conversion nit

* triggering log mode on chainLog < fileLog and removing old threshold

* adding refPrefix to dictionary_round_trip

* adding docs

* adding enableldm + forceWindow test for dict

* separate patch-from logic into FIO_adjustParamsForPatchFromMode

* moving memLimit adjustment to outside ifdefs (need for decomp)

* removing refPrefix gate on dictionary_round_trip

* rebase on top of dev refPrefix change

* making sure refPrefx + ldm is < 1% of srcSize

* combining notes for patch-from

* moving memlimit logic inside fileio.c

* adding display for optimal parser and long mode trigger

* conversion nit

* fuzzer found heap-overflow fix

* another conversion nit

* moving FIO_adjustMemLimitForPatchFromMode outside ifndef

* making params immutable

* moving memLimit update before createDictBuffer call

* making maxSrcSize unsigned long long

* making dictSize and maxSrcSize params unsigned long long

* error on files larger than 4gb

* extend refPrefix test to include round trip

* conversion to size_t

* making sure ldm is at least 10x better

* removing break

* including zstd_compress_internal and removing redundant macros

* exposing ZSTD_cycleLog()

* using cycleLog instead of chainLog

* add some more docs about user optimizations

* formatting
2020-04-17 15:58:53 -05:00
Bimba Shrestha
794f03459e adding refPrefix 2020-04-06 22:57:49 -07:00
Nick Terrell
ac58c8d720 Fix copyright and license lines
* All copyright lines now have -2020 instead of -present
* All copyright lines include "Facebook, Inc"
* All licenses are now standardized

The copyright in `threading.{h,c}` is not changed because it comes from
zstdmt.

The copyright and license of `divsufsort.{h,c}` is not changed.
2020-03-26 17:02:06 -07:00
Nick Terrell
d1cc9d2797
[fuzz] Allow zero sized buffers for streaming fuzzers (#1945)
* Allow zero sized buffers in `stream_decompress`. Ensure that we never have two
  zero sized buffers in a row so we guarantee forwards progress.
* Make case 4 in `stream_round_trip` do a zero sized buffers call followed by
  a full call to guarantee forwards progress.
* Fix `limitCopy()` in legacy decoders.
* Fix memcpy in `zstdmt_compress.c`.

Catches the bug fixed in PR #1939
2020-01-09 11:38:50 -08:00
Nick Terrell
b77ad810c9
[fuzz] Fix regression_driver.c with directory input (#1944)
The `numFiles` variable wasn't updated, so the fuzzer didn't do anything.
I did two things to fix this:

1. Remove the `numFiles` variable entirely.
2. Error if we can't open a file and print the number of files tested.
2020-01-08 13:20:56 -08:00
Yann Collet
c71bd45a3b Merge branch 'dev' into ahmed_file 2019-11-26 11:20:26 -08:00
Nick Terrell
e68db76b4b Update .gitignore 2019-11-20 16:36:40 -08:00
Yann Collet
aea2ff5d8d fixed wrong assert() in regression driver 2019-11-06 14:56:21 -08:00
Yann Collet
a7e33e3e10 updated fuzz tests to use FileNamesTable* abstraction 2019-11-06 14:42:13 -08:00
Sen Huang
e21a8bbecd Fix FUZZ_rand32() bug 2019-11-05 16:43:24 -05:00
Sen Huang
f2932fb5eb Fix more merge conflicts 2019-11-05 15:54:05 -05:00
Nick Terrell
60205fec02 Fix 2 bugs in dictionary loading
* Silently skip dictionaries less than 8 bytes, unless using `ZSTD_dct_fullDict`.
  This changes the compressor, which silently skips dictionaries <= 8 bytes.
* Allow repcodes that are equal to the dictionary content size, since it is in bounds.
2019-11-01 16:52:07 -07:00
Nick Terrell
75e7c0d107 [fuzz] Add dictionary_loader fuzzer
* Adds the fuzzer
* Adds an additional `InputType` for the fuzzer

I ran the fuzzer for about 10 minutes and it found 2 bugs:

* Catches the original bug without any help
* Catches an additional bug with 8-byte dictionaries
2019-11-01 15:54:24 -07:00
Nick Terrell
8c11f089a1 [fuzz] Increase output buffer size of stream_round_trip
Fixes OSS-Fuzz crash.
Credit to OSS-Fuzz
2019-10-18 13:39:08 -07:00
Nick Terrell
d721fcf3ee [fuzz] Fix leak in block_round_trip 2019-09-13 10:32:38 -07:00
Nick Terrell
7c4578160e [fuzz] Generate seed data up to 256KB 2019-09-12 15:02:01 -07:00
Dario Pavlovic
51e9d29a51 Merge branch 'improvDataGen' of github.com:darxsys/zstd into improvDataGen 2019-09-12 13:11:02 -07:00
Dario Pavlovic
cd8588077e It's time for all of rng seed code to go. Goodbye 2019-09-12 13:10:34 -07:00
Dario Pavlovic
47bb4c6a23
Update tests/fuzz/fuzz_data_producer.h 2019-09-12 12:45:28 -07:00
Dario Pavlovic
92c58c4d5d Use range instead of the generic uint32 method to use less bytes when generating necessary numbers. 2019-09-12 12:40:12 -07:00
Dario Pavlovic
b5b24c2a0d Combining fuzz_data_producer restrict calls into a single function 2019-09-11 10:09:29 -07:00
Dario Pavlovic
23cc2d8510 All tests should give some portion of data to the producer and use the rest. 2019-09-10 16:52:38 -07:00
Dario Pavlovic
0630d084cb [Fuzz] Improve data generation #1723
Converting the rest of the tests to use the new data producer.
2019-09-10 16:14:43 -07:00
Dario Pavlovic
ea1ad123da Addressing nits 2019-09-09 16:13:24 -07:00
Dario Pavlovic
3932fcfebc Fixing issues with double usage of data. 2019-09-09 15:39:04 -07:00
Dario Pavlovic
a71bbba7be [Fuzz] Improve data generation #1723 2019-09-09 08:43:22 -07:00
Nick Terrell
d0750a1c9c
Merge pull request #1733 from nmagerko/size-hint
Add --size-hint=# option
2019-08-23 10:16:10 -07:00
Nick Terrell
e2030a2c40 [fuzz] Add a DEBUGLOG(3) statement to print file
Enable it by building with this command:

```
./fuzz.py build all --debug 3
```
2019-08-22 17:27:15 -07:00
Nick Magerko
493f95c7df Fix merge conflicts 2019-08-22 11:51:41 -07:00
Nick Terrell
3982935aef [fuzz] Improve fuzzer build script and docs
* Remove the `make libFuzzer` target since it is broken and obsoleted
  by `CC=clang CXX=clang++ ./fuzz.py build all --enable-fuzzer`. The
  new `-fsanitize=fuzzer` is much better because it works with MSAN
  by default.
* Improve the `./fuzz.py gen` command by making the input type explicit
  when creating a new target.
* Update the `README` for `--enable-fuzzer`.

Fixes #1727.
2019-08-20 16:44:50 -07:00
Nick Magerko
c7a24d7a14 Define ZSTD_SRCSIZEHINT_MIN as 0 2019-08-20 13:06:15 -07:00
Nick Magerko
ea9d35922c Add size-hint to fuzz tests 2019-08-19 15:12:29 -07:00
Nick Terrell
e962f07d19
[fuzz] Add a compression fuzzer with randomly sized output buffer (#1670) 2019-07-02 22:05:07 -07:00
Nick Terrell
6810dd6191 [fuzz] Remove max_len from the options 2019-06-10 11:05:45 -07:00
Nick Terrell
610a81ecf9 [fuzzer] Compile with legacy support 2019-04-18 12:44:55 -07:00
Nick Terrell
cc669006dc [fuzzer] Size the decompression output buffer randomly 2019-04-18 12:44:21 -07:00
Nick Terrell
58bcc328a4 [fuzz] Add a seedcorpora target for oss-fuzz 2019-04-17 12:13:06 -07:00
Nick Terrell
09caa4d800 [fuzzer] Add a fuzzer for frame info functions
Add a fuzzer that fuzzes all helper functions that take compressed
input. This fuzzer caught one out of bounds read in
`ZSTD_decompressBound()`.
2019-04-17 11:29:42 -07:00
Josh Soref
a880ca239b Spelling (#1582)
* spelling: accidentally

* spelling: across

* spelling: additionally

* spelling: addresses

* spelling: appropriate

* spelling: assumed

* spelling: available

* spelling: builder

* spelling: capacity

* spelling: compiler

* spelling: compressibility

* spelling: compressor

* spelling: compression

* spelling: contract

* spelling: convenience

* spelling: decompress

* spelling: description

* spelling: deflate

* spelling: deterministically

* spelling: dictionary

* spelling: display

* spelling: eliminate

* spelling: preemptively

* spelling: exclude

* spelling: failure

* spelling: independence

* spelling: independent

* spelling: intentionally

* spelling: matching

* spelling: maximum

* spelling: meaning

* spelling: mishandled

* spelling: memory

* spelling: occasionally

* spelling: occurrence

* spelling: official

* spelling: offsets

* spelling: original

* spelling: output

* spelling: overflow

* spelling: overridden

* spelling: parameter

* spelling: performance

* spelling: probability

* spelling: receives

* spelling: redundant

* spelling: recompression

* spelling: resources

* spelling: sanity

* spelling: segment

* spelling: series

* spelling: specified

* spelling: specify

* spelling: subtracted

* spelling: successful

* spelling: return

* spelling: translation

* spelling: update

* spelling: unrelated

* spelling: useless

* spelling: variables

* spelling: variety

* spelling: verbatim

* spelling: verification

* spelling: visited

* spelling: warming

* spelling: workers

* spelling: with
2019-04-12 11:18:11 -07:00
Nick Terrell
c45dec12c5 [fuzzer] Use ZSTD_DCtx_loadDictionary_advanced() half the time 2019-04-09 18:02:22 -07:00
Nick Terrell
10a3d4dca9 [fuzzer] Make the regression_driver work while fuzzers are active 2019-04-09 18:01:49 -07:00
Nick Terrell
c5d70b7dbb [fuzzer] Sometimes fuzz with one less output byte
Zstd compression sometimes does different stuff when it has at least
`ZSTD_compressBound()` output bytes, or not. Half of the time fuzz with
`ZSTD_compressBound() - 1` output bytes. Ensure that we have at least
one byte of overhead by disabling either the dictionary ID or checksum.
2019-04-09 16:47:59 -07:00
Nick Terrell
7a1fde2957 [fuzzer] Add dictionary fuzzers 2019-04-08 21:07:28 -07:00
Nick Terrell
462918560c [fuzzer] Fix stream_round_trip for the new options 2019-04-08 21:06:19 -07:00
Nick Terrell
f871b5144e [fuzz] Use the new advanced API 2019-04-08 20:01:38 -07:00
Nick Terrell
4b0024a97d [fuzz] Add --enable-fuzzer for clang fuzzing 2019-02-27 17:15:52 -08:00
Peter (Stig) Edwards
cdb3e7af2f
-Wformat-security not needed with -Wformat=2 2019-02-01 09:38:49 +00:00
Yann Collet
39e28982cf introduced constants ZSTD_STRATEGY_MIN and ZSTD_STRATEGY_MAX 2018-12-06 16:16:16 -08:00
Yann Collet
be9e561da4 changed ZSTD_c_compressionStrategy into ZSTD_c_strategy
also : fixed paramgrill, and limit conditions
2018-12-06 15:00:52 -08:00
Yann Collet
3583d19c4e changed parameter names from ZSTD_p_* to ZSTD_c_*
for naming consistency
2018-12-05 17:26:02 -08:00
Yann Collet
d8e215cbee created ZSTD_compress2() and ZSTD_compressStream2()
ZSTD_compress_generic() is renamed ZSTD_compressStream2().

Note that, for the time being,
the "stable" API and advanced one use different parameter planes :
setting parameters using the advanced API does not influence ZSTD_compressStream()
and using ZSTD_initCStream() does not influence parameters for ZSTD_compressStream2().
2018-11-30 11:25:56 -08:00
Yann Collet
41c7d0b1e1 changed hashEveryLog into hashRateLog 2018-11-21 14:36:57 -08:00
Yann Collet
2e7fd6a2cb fixed remaining searchLength invocations 2018-11-20 15:13:27 -08:00
Yann Collet
f07e13b729 fixed fuzz src 2018-11-15 16:53:45 -08:00
Yann Collet
cf9f4b63b8 fixed fuzz test src code 2018-11-14 14:46:49 -08:00
W. Felix Handte
2242330b26 Fix Fuzz Range 2018-11-12 13:01:14 -08:00
Ethan Jones
953c7b9463 Fix libFuzzer location in makefile.
libFuzzer was moved into compiler-rt, update the repo location
accordingly.
2018-10-22 11:19:13 -05:00
Rohit Jain
5dc9443053 Changing tests/fuzz/Makefile to move util.o to FUZZ_SRC instead 2018-10-12 19:06:58 -07:00
Rohit Jain
23e727e3a2 Fixing regressiontest makefile 2018-10-11 17:08:42 -07:00
Nick Terrell
3ff6040848 Publish artifacts with CircleCI
* Updates CircleCI to use workflows.
  We can now specify any number of test jobs to run in parallel.
* Switch the image to `buildpack-deps:trusty` which is only 500 MB
  instead of 7 GB, so that saves 7 minutes to download it if it isn't
  already cached on the host.
* Publish the source tarball and sha256sum as artifacts.
* If the `GITHUB_TOKEN` environment variable is set, we will also
  add the tarball + sha256sum to the tagged release, after manual
  approval.
2018-09-26 13:23:28 -07:00
W. Felix Handte
01bb1c1016 Add CCtx Param Controlling Dict Attachment Behavior 2018-06-21 17:29:25 -04:00
Yann Collet
fa41bcc2c2 grouped debug functions into debug.h
There were 2 competing set of debug functions
within zstd_internal.h and bitstream.h.
They were mostly duplicate, and required care to avoid messing with each other.

There is now a single implementation, shared by both.

Significant change :
The macro variable ZSTD_DEBUG does no longer exist,
it has been replaced by DEBUGLEVEL,
which required modifying several source files.
2018-06-13 15:43:09 -04:00
Yann Collet
0b1833c2cb fixed regressiontest
ZSTD_TARGETLEN_MIN no longer exists
since is would be tautological to check if an unsigned value is < 0.
2018-06-07 16:07:46 -07:00
Nick Terrell
fdd4d8510f Improve compiler detection to work on Mac 2018-05-24 14:21:12 -07:00
Nick Terrell
ac852abb8b Define BIT_DEBUG for --debug 2018-05-24 14:21:12 -07:00
Nick Terrell
2a9975f77b Increase the maximum file size 2018-05-24 14:21:12 -07:00
Nick Terrell
e712a3a0a3 Small fixes to fuzz.py 2018-05-24 14:21:12 -07:00
Yann Collet
f15e2fd55e fixed fuzz test 2018-03-21 16:45:15 -07:00
Yann Collet
ffc335bccf complete ignore list
fuzz tests artifacts
2017-12-29 14:39:49 +01:00
Yann Collet
e4ec427720 Merge branch 'dev' into shorterTests
fixed conflicts
2017-09-28 12:19:28 -07:00
Yann Collet
824f75ea7c Merge pull request #863 from facebook/newFormats
magicless frames (#591)
2017-09-28 00:32:16 -07:00
Nick Terrell
d9c1e9125f [fuzz] Small changes for oss-fuzz integration 2017-09-27 18:23:06 -07:00
Yann Collet
54a827fff0 Merge branch 'dev' into newFormats
Fixed conflicts in zstdmt_compress.c
2017-09-27 16:39:40 -07:00
Yann Collet
bdd0f6f046 improved make clean in tests/fuzz 2017-09-27 15:20:08 -07:00
Yann Collet
4791561c4a silence minor gcc warning -Wempty-body
also silence fuzz test artefacts
2017-09-26 17:57:38 -07:00
Nick Terrell
471aa385b3 [fuzz] Speed up round trip tests
* Enforce smaller maximum values for parameters
* Adjust parameters to the source size

The memory usage is reduced by about 5x, which makes the fuzzers run at
least twice as fast, even more so with ASAN/MSAN enabled.
2017-09-26 14:03:43 -07:00
Nick Terrell
917a213254 [fuzz] Determine flags based on compiler version 2017-09-25 15:32:36 -07:00
Nick Terrell
11e21f23cb [fuzz] Mention the corpora in the README 2017-09-25 15:31:38 -07:00
Nick Terrell
6bb781e0f1 [fuzz] Add regressiontest targets 2017-09-25 15:31:33 -07:00
Nick Terrell
bfad5568b5 [fuzz] Make simple_round_trip compile cleanly 2017-09-25 13:28:45 -07:00
Nick Terrell
23199b6daf [fuzz] Fix fuzz.py env flags parsing 2017-09-25 13:28:18 -07:00
Nick Terrell
1c23b64049 [fuzz] fuzz.py can minimize and zip corpora
* "minimize" minimizes the corpora into an output directory.
* "zip" zips up the minimized corpora, which are ready to deploy.
2017-09-25 12:04:12 -07:00
Nick Terrell
39357c41cb [fuzzer] Fuzz long range matching & new API 2017-09-14 14:48:08 -07:00
Nick Terrell
9712d5ebe6 [fuzzer] Fix bugs in fuzz.py 2017-09-13 19:08:35 -07:00
Nick Terrell
a6f08b4783 [fuzzer] Fix FUZZ_seed() 2017-09-13 18:41:32 -07:00
Nick Terrell
6c6412cef9 [fuzzer] Update README.md 2017-09-13 18:23:52 -07:00
Nick Terrell
6b8236cf7e [fuzz] Add fuzzing helper script 2017-09-13 17:45:21 -07:00
Nick Terrell
b7e1522330 Add block fuzzers 2017-09-13 17:44:41 -07:00
Nick Terrell
def3214d74 [fuzzer] Handle single empty directory 2017-09-13 17:44:30 -07:00
Nick Terrell
8b6c80ada8 Update fuzzer Makefile 2017-09-13 16:16:57 -07:00
Nick Terrell
677c2cbf89 Update fuzzer sources 2017-09-13 16:16:57 -07:00
Stella Lau
af4068a697 Fix function name in tests/fuzz/regression_driver 2017-09-05 22:14:41 -07:00
Eiichi Tsukata
7492e7f1c7 tests/fuzz: change ZSTD_BLOCKSIZE_ABSOLUTEMAX into ZSTD_BLOCKSIZE_MAX
ZSTD_BLOCKSIZE_ABSOLUTEMAX is changed at the commit:
fa3671eac7
2017-09-01 16:37:39 +09:00
Eiichi Tsukata
6639395979 tests/fuzz: fix make all target names 2017-09-01 16:32:40 +09:00
Yann Collet
e21384fffb fixed more file headers after license change (#825) 2017-08-31 12:11:57 -07:00