townforge/zstd - zstd - Townforge git

Author	SHA1	Message	Date
caoyzh	a7e34ff693	revert ZSTD_reduceTable_internal()'s modification	2020-05-07 13:10:46 -07:00
caoyzh	9e802ede9c	Modify indent of comments	2020-05-07 13:10:46 -07:00
caoyzh	7f75f05e84	Change "arm_neon.h" to system include <arm_neon.h>	2020-05-07 13:10:46 -07:00
caoyzh	b2e56f7f7f	Optimize compression by using neon function.	2020-05-07 13:10:46 -07:00
Nick Terrell	45c66dd298	[zdict] Stabilize ZDICT_finalizeDictionary()	2020-05-07 10:37:01 -07:00
Yann Collet	cf854f4660	support for verbose make A commonly accepted makefile idiom is V=1 or VERBOSE=1 to request the printing of all commands. This is not "default" though, and must be manually added. Example : Before : ``` make libzstd compiling dynamic library 1.4.5 creating versioned links make libzstd V=1 compiling dynamic library 1.4.5 creating versioned links ``` After : ``` make libzstd compiling dynamic library 1.4.5 creating versioned links make libzstd V=1 compiling dynamic library 1.4.5 cc -DXXH_NAMESPACE=ZSTD_ -DZSTD_LEGACY_SUPPORT=5 -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow -Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement -Wstrict-prototypes -Wundef -Wpointer-arith -Wvla -Wformat=2 -Winit-self -Wfloat-equal -Wwrite-strings -Wredundant-decls -Wmissing-prototypes -Wc++-compat -O3 common/debug.c common/entropy_common.c common/error_private.c common/fse_decompress.c common/pool.c common/threading.c common/xxhash.c common/zstd_common.c compress/fse_compress.c compress/hist.c compress/huf_compress.c compress/zstd_compress.c compress/zstd_compress_literals.c compress/zstd_compress_sequences.c compress/zstd_compress_superblock.c compress/zstd_double_fast.c compress/zstd_fast.c compress/zstd_lazy.c compress/zstd_ldm.c compress/zstd_opt.c compress/zstdmt_compress.c decompress/huf_decompress.c decompress/zstd_ddict.c decompress/zstd_decompress.c decompress/zstd_decompress_block.c deprecated/zbuff_common.c deprecated/zbuff_compress.c deprecated/zbuff_decompress.c dictBuilder/cover.c dictBuilder/divsufsort.c dictBuilder/fastcover.c dictBuilder/zdict.c legacy/zstd_v05.c legacy/zstd_v06.c legacy/zstd_v07.c -shared -fPIC -fvisibility=hidden -Wl,-soname=libzstd.so.1 -o libzstd.so.1.4.5 creating versioned links ln -sf libzstd.so.1.4.5 libzstd.so.1 ln -sf libzstd.so.1.4.5 libzstd.so ```	2020-05-07 08:04:10 -07:00
Yann Collet	54144285fd	small speed improvement for strategy fast gcc 9.3.0 : kennedy : 459 -> 466 silesia : 360 -> 365 enwik8 : 267 -> 269 clang 10.0.0 : kennedy : 436 -> 441 silesia : 364 -> 366 enwik8 : 271 -> 272	2020-05-07 06:15:58 -07:00
Nick Terrell	5717bd39ee	[lib] Fix NULL pointer dereference When the output buffer is `NULL` with size 0, but the frame content size is non-zero, we will write to the NULL pointer because our bounds check underflowed. This was exposed by a recent PR that allowed an empty frame into the single-pass shortcut in streaming mode. * Fix the bug. * Fix another NULL dereference in zstd-v1. * Overflow checks in 32-bit mode. * Add a dedicated test. * Expose the bug in the dedicated simple_decompress fuzzer. * Switch all mallocs in fuzzers to return NULL for size=0. * Fix a new timeout in a fuzzer. Neither clang nor gcc show a decompression speed regression on x86-64. On x86-32 clang is slightly positive and gcc loses 2.5% of speed. Credit to OSS-Fuzz.	2020-05-06 12:09:02 -07:00
Felix Handte	ad8dbae1b7	Merge pull request #2103 from felixhandte/relative-includes Migrate Includes to Relative Paths	2020-05-06 09:42:23 -07:00
Yann Collet	c29fd7cd8b	some more conversion warnings hunting down some static analyzer warnings	2020-05-05 10:16:59 -07:00
Yann Collet	c1b836f4c3	fix minor conversion warnings	2020-05-04 14:43:09 -07:00
Felix Handte	8b327149a8	Merge pull request #1976 from felixhandte/minimal-lib-target Add Minification Variable to `lib/Makefile`	2020-05-04 12:42:56 -07:00
W. Felix Handte	7b75d772b1	Remove Useless Assignment in Makefile	2020-05-04 15:20:26 -04:00
W. Felix Handte	6028827fee	Rewrite Include Paths to be Relative Addresses #1998.	2020-05-04 15:20:26 -04:00
Felix Handte	7e9aabd652	Merge pull request #2099 from felixhandte/compile-under-pedantic Compile Under `-pedantic -Werror` and `-std=c90`	2020-05-04 10:07:13 -07:00
W. Felix Handte	fa5e01c467	Add Space-Optimized Helper Variable to Lib Makefile This diff reorganizes the `lib/Makefile` to extract various settings that a user would normally invoke together (supposing that they were aware of them) if they were trying to build the smallest `libzstd` possible. It collects these settings under a master setting `ZSTD_LIB_MIN_SIZE`. Also document this new option.	2020-05-04 11:19:25 -04:00
Felix Handte	816ed80774	Merge pull request #1984 from MeghnaM/1636-Reduce-stack-usage-of-HUF_sort Reduce stack usage of HUF_sort()	2020-05-04 08:15:31 -07:00
W. Felix Handte	3764859060	Switch Helper Declaration to Not Force Inline It was causing build issues in ANSI mode.	2020-05-04 10:59:15 -04:00
W. Felix Handte	c7da66c9cf	Purge C++-Style Comments (`// ...`), Make Compilation Succeed Under C90	2020-05-04 10:59:15 -04:00
W. Felix Handte	952427aebf	Avoid inline Keyword in C90 Previously we would use it for all gcc-like compilations, even when a restrictive mode that disallowed it had been selected.	2020-05-04 10:59:15 -04:00
W. Felix Handte	baa4e2e36c	Don't Evaluate Arguments to Dummy Function	2020-05-04 10:59:15 -04:00
W. Felix Handte	450542d3a7	Allow Empty Format Strings in Error Macro Invocations `-Wall` implies `-Wformat-zero-length`, which will cause compilation to fail under `-Werror` when an empty string is passed as the format string to a `printf`-family function. This commit moves us back to prefixing the provided format string, which successfully avoids that warning. However, this removes the failure mode where that `RAWLOG` invocation would fail to compile when no format string was provided at all (which was desirable to avoid having code that would successfully compile normally but fail under `-pedantic`, which does require that a non-zero number of args are provided). So this commit also introduces a function which does nothing at all, but will fail to compile if not provided with at least one argument, which is a string. This successfully links the compilability of pedantic and non-pedantic builds.	2020-05-04 10:59:15 -04:00
W. Felix Handte	6696933b32	Make All Invocations Start With Literal Format String	2020-05-04 10:59:15 -04:00
W. Felix Handte	2745f7a7d5	Make Error Macro Invocation Without Info String Fail to Compile Even without `-pedantic`, these macros will now fail to compile unless you provide an info string argument. This will prevent us from regressing.	2020-05-04 10:59:15 -04:00
W. Felix Handte	5e5f262612	Add (Possibly Empty) Info Strings to All Variadic Error Handling Macro Invocations	2020-05-04 10:58:55 -04:00
Nick Terrell	e103d7b4a6	Fix superblock mode (#2100) Fixes: Enable RLE blocks for superblock mode Fix the limitation that the literals block must shrink. Instead, when we're within 200 bytes of the next header byte size, we will just use the next one up. That way we should (almost?) always have space for the table. Remove the limitation that the first sub-block MUST have compressed literals and be compressed. Now one sub-block MUST be compressed (otherwise we fall back to raw block which is okay, since that is streamable). If no block has compressed literals that is okay, we will fix up the next Huffman table. Handle the case where the last sub-block is uncompressed (maybe it is very small). Before it would skip superblock in this case, now we allow the last sub-block to be uncompressed. To do this we need to regenerate the correct repcodes. Respect disableLiteralsCompression in superblock mode Fix superblock mode to handle a block consisting of only compressed literals Fix an off-by-one error in superblock mode that disabled it whenever there were last literals Fix superblock mode with long literals/matches (> 0xFFFF) Allow superblock mode to repeat Huffman tables Respect ZSTD_minGain(). Tests: Simple check for the condition in #2096. When the simple_round_trip fuzzer enables superblock mode, it checks that the compressed size isn't expanded too much. Remaining limitations: O(targetCBlockSize^2) because we recompute statistics every sequence Unable to split literals of length > targetCBlockSize into multiple sequences Refuses to generate sub-blocks that don't shrink the compressed data, so we could end up with large sub-blocks. We should emit those sections as uncompressed blocks instead. ... Fixes #2096	2020-05-01 16:11:47 -07:00
Meghna Malhotra	0adfc8dfce	Fix broken CI; make changes in response to the comments	2020-05-01 13:45:48 -07:00
Meghna Malhotra	53d76dc20f	Remove magic constant and made other changes addressing the comments	2020-05-01 13:45:48 -07:00
Meghna Malhotra	fe8402b522	WIP: Still getting an error	2020-05-01 13:45:48 -07:00
Meghna Malhotra	a084d959bd	WIP: Increased wksp size, but it's segfaulting	2020-05-01 13:45:48 -07:00
Meghna Malhotra	fdb2780c47	Move rank table into HUF_buildCTable_wksp()	2020-05-01 13:45:48 -07:00
Yann Collet	da2748a855	Merge pull request #2097 from facebook/underlink Fix underlinked libzstd	2020-04-30 10:16:24 -07:00
Yann Collet	f77fd5ced0	generalized pattern rules	2020-04-28 18:43:55 -07:00
Yann Collet	c6ae2e83bc	fix libzstd-mt underlinking issue fix #2045 When compiling `libzstd` in multithreading mode, the `libzstd-mt` recipe would not include `-pthread`, resulting in an underlinked dynamic library. Added a test on Travis to check that the library is fully linked. This makes it possible, in some future release, to build a multi-threaded `libzstd` dynamic library by default as it would no longer impact the build script of user programs.	2020-04-28 18:29:20 -07:00
Nick Terrell	55a57d46be	Add extra warnings about not modifying the ZSTD_outBuffer	2020-04-28 12:07:42 -07:00
Nick Terrell	77a2945c43	Add some comments	2020-04-27 20:04:04 -07:00
Nick Terrell	f33de06c3e	[lib] Fix single-pass mode for empty frames	2020-04-27 20:04:01 -07:00
Nick Terrell	a4ff217baf	[lib] Add ZSTD_d_stableOutBuffer	2020-04-27 18:09:44 -07:00
Nick Terrell	b104f8e3eb	[zstd] Fix typo in ZSTD_dParameter	2020-04-27 12:12:28 -07:00
Bimba Shrestha	1875f616ce	passing dictContentType instead of rawContent every time	2020-04-21 22:29:35 -07:00
Bimba Shrestha	5b0a452cac	Adding --long support for --patch-from (#1959) * adding long support for patch-from * adding refPrefix to dictionary_decompress * adding refPrefix to dictionary_loader * conversion nit * triggering log mode on chainLog < fileLog and removing old threshold * adding refPrefix to dictionary_round_trip * adding docs * adding enableldm + forceWindow test for dict * separate patch-from logic into FIO_adjustParamsForPatchFromMode * moving memLimit adjustment to outside ifdefs (need for decomp) * removing refPrefix gate on dictionary_round_trip * rebase on top of dev refPrefix change * making sure refPrefix + ldm is < 1% of srcSize * combining notes for patch-from * moving memlimit logic inside fileio.c * adding display for optimal parser and long mode trigger * conversion nit * fuzzer found heap-overflow fix * another conversion nit * moving FIO_adjustMemLimitForPatchFromMode outside ifndef * making params immutable * moving memLimit update before createDictBuffer call * making maxSrcSize unsigned long long * making dictSize and maxSrcSize params unsigned long long * error on files larger than 4gb * extend refPrefix test to include round trip * conversion to size_t * making sure ldm is at least 10x better * removing break * including zstd_compress_internal and removing redundant macros * exposing ZSTD_cycleLog() * using cycleLog instead of chainLog * add some more docs about user optimizations * formatting	2020-04-17 15:58:53 -05:00
Nick Terrell	5fcbc484c8	Merge pull request #2040 from caoyzh/dev-2 Optimize by prefetching on aarch64	2020-04-08 13:14:47 -07:00
Bimba Shrestha	c0d4b2b5a3	Merge pull request #2075 from bimbashrestha/dict_fuzzer_ref [bug] handling case where prefix is NULL or 0 sized in refPrefix_advanced	2020-04-07 17:37:19 -05:00
Bimba Shrestha	1658ae75cd	handling nil case for refprefix	2020-04-07 14:41:53 -07:00
Carl Woffenden	a93fadfcd9	Further replication removed `CHECK_F` is now in `error_private.h`. Minor tidy.	2020-04-07 11:25:16 +02:00
Carl Woffenden	7af7735fa3	Merge remote-tracking branch 'upstream/dev' into single-file-lib	2020-04-07 11:13:02 +02:00
Carl Woffenden	edd9a07322	Code replicated in compression and decompression moved to shared headers `CHECK_F` macro moved to `error_private.h` (shared between `fse_compress.c` and `fse_decompress.c`). `ZSTD_limitCopy()` moved to `zstd_internal.h` (shared between `zstd_compress.c` and `zstd_decompress.c`). Erroneous build artefact `zstd.h` removed from repo.	2020-04-07 11:02:06 +02:00
Bimba Shrestha	0154866749	moving consts to zstd_internal and reusing them	2020-04-03 14:26:15 -07:00
Bimba Shrestha	0a172c5e43	converting to if	2020-04-03 14:21:24 -07:00
Bimba Shrestha	3a4c8cc9b3	adding dctx to function name	2020-04-03 14:14:46 -07:00
Bimba Shrestha	ae47d50355	only computing sizes once	2020-04-03 14:12:23 -07:00
Bimba Shrestha	a4cbe79ccb	Using in and out size together	2020-04-03 14:09:21 -07:00
Bimba Shrestha	936aa63ff1	adding oversized check on decompression	2020-04-03 13:25:32 -07:00
Bimba Shrestha	05574ec141	adding oversizeDuration to dctx and macros	2020-04-03 13:08:29 -07:00
Carl Woffenden	7c420344d2	Single-file decoder script can now (optionally) create an encoder To complement the single-file decoder a new script was added to create an amalgamated single-file of all of the Zstd source, along with examples and (simple) tests.	2020-04-03 19:07:46 +02:00
Carl Woffenden	7202184ee0	Fixes decompressor when using -Wshorten-64-to-32 (#2062) Spotted on iOS when building with `-Wshorten-64-to-32` (since `__builtin_expect` returns a `long`).	2020-04-03 02:55:29 -07:00
Nick Terrell	ac58c8d720	Fix copyright and license lines * All copyright lines now have -2020 instead of -present * All copyright lines include "Facebook, Inc" * All licenses are now standardized The copyright in `threading.{h,c}` is not changed because it comes from zstdmt. The copyright and license of `divsufsort.{h,c}` is not changed.	2020-03-26 17:02:06 -07:00
Nick Terrell	f5029e285f	Merge pull request #2050 from terrelln/align Align decompress sequences loop to 32+16 bytes	2020-03-24 11:42:59 -07:00
Nick Terrell	8d0ee37ac0	Align decompress sequences loop to 32+16 bytes The alignment is added before the loop, so this shouldn't hurt performance in any case. The only way it hurts is if there is already performance instability, and we force it to be stable but in the bad case. This consistently gets us into the good case with gcc-{7,8,9} on an Intel i9-9900K and clang-9. gcc-5 is 5% worse than its best case but has stable performance. We get consistently good behavior on my Macbook Pro compiled with both clang and gcc-8. It ends up in the 50% from DSB and 50% from MITE case, but the performance is the same as the 85% DSB case, so that's fine.	2020-03-23 19:40:31 -07:00
Nick Terrell	d34204a7b7	Merge pull request #2029 from terrelln/minor-opt [opt] Update repcodes less often	2020-03-23 18:12:32 -07:00
caoyzh	7201980650	Optimize by prefetching on aarch64	2020-03-14 15:25:59 +08:00
Bimba Shrestha	66607d0eac	Merge pull request #2033 from bimbashrestha/icc [opt] Small icc level 1 compression speed gain using #pragma vector	2020-03-10 20:42:19 -05:00
Bimba Shrestha	a89c45bdbd	Typo	2020-03-10 15:19:48 -05:00
Bimba Shrestha	43fc88f443	Adding comment and removing ivdep	2020-03-10 14:57:27 -05:00
Bimba Shrestha	dba3abc95a	Missed returns	2020-03-05 12:20:59 -08:00
Bimba Shrestha	a75e5f2ffc	bitscan add undef check	2020-03-05 11:52:15 -08:00
Bimba Shrestha	85d0efd619	Removing no-tree-vectorize for intel	2020-03-05 10:02:48 -08:00
Bimba Shrestha	4c72a1a9c2	adding vector to main loop	2020-03-05 09:55:38 -08:00
Nick Terrell	81fda0419e	[opt] Only update repcodes upon arrival	2020-03-04 17:57:15 -08:00
Nick Terrell	04744e52dc	Merge pull request #2028 from terrelln/minor-opt [opt] Don't recompute initial literals price	2020-03-04 17:40:59 -08:00
Nick Terrell	0f9882deb9	[opt] Don't recompute repcodes while emitting sequences	2020-03-04 17:23:00 -08:00
Nick Terrell	c6caa2d04e	[opt] Delete ZSTD_litLengthContribution	2020-03-04 16:35:26 -08:00
Nick Terrell	610171ed86	[opt] Explain why we don't include literals price	2020-03-04 16:29:19 -08:00
Nick Terrell	5f49578be7	[opt] Don't recompute initial literals price	2020-03-04 16:27:17 -08:00
Bimba Shrestha	cba46e9b7b	Fixing ZSTD_c_compressionLevel confusing note	2020-03-03 13:12:02 -08:00
Nick Terrell	c836992be1	Don't log errors when ZSTD_fseBitCost() returns an error	2020-03-02 11:13:18 -08:00
Felix Handte	b669c5347a	Revert "Fix pkg-config File Generation Again" (#2016)	2020-02-26 10:52:49 -08:00
W. Felix Handte	e5ef935cf6	Fix Variable Capitalization	2020-02-18 13:40:58 -05:00
W. Felix Handte	73737231b9	Allow Manual Overriding of pkg-config Lib and Include Dirs When the `PCLIBDIR` or `PCINCDIR` is non-empty (either because we succeeded in removing the prefix, or because it was manually set), we don't need to perform the check. This lets us trust users who go to the trouble of setting a manual override, rather than still blindly failing the make. They'll still be prefixed with `${prefix}/` / `${exec_prefix}/` in the pkg-config file though.	2020-02-18 13:17:17 -05:00
W. Felix Handte	e668c9b528	Fix pkg-config File Generation Again Revises #1851. Fixes #1900. Replaces #1930. Thanks to @orbea, @neheb, @Polynomial-C, and particularly @eli-schwartz for pointing out the problem and suggesting solutions. Tested with ``` make -C lib clean libzstd.pc cat lib/libzstd.pc # should fail make -C lib clean libzstd.pc LIBDIR=/foo make -C lib clean libzstd.pc INCLUDEDIR=/foo make -C lib clean libzstd.pc LIBDIR=/usr/localfoo make -C lib clean libzstd.pc INCLUDEDIR=/usr/localfoo make -C lib clean libzstd.pc LIBDIR=/usr/local/lib prefix=/foo make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/include prefix=/foo # should succeed make -C lib clean libzstd.pc LIBDIR=/usr/local/foo make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/foo make -C lib clean libzstd.pc LIBDIR=/usr/local/ make -C lib clean libzstd.pc INCLUDEDIR=/usr/local/ make -C lib clean libzstd.pc LIBDIR=/usr/local make -C lib clean libzstd.pc INCLUDEDIR=/usr/local make -C lib clean libzstd.pc LIBDIR=/tmp/foo prefix=/tmp make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp make -C lib clean libzstd.pc LIBDIR=/tmp/foo prefix=/tmp/foo make -C lib clean libzstd.pc INCLUDEDIR=/tmp/foo prefix=/tmp/foo # should also succeed make -C lib clean libzstd.pc prefix=/foo LIBDIR=/foo/bar INCLUDEDIR=/foo/ cat lib/libzstd.pc mkdir out cd out cmake ../build/cmake make cat lib/libzstd.pc ```	2020-02-18 12:23:50 -05:00
Bimba Shrestha	80c26117a9	Line-wrapping	2020-02-03 09:38:16 -08:00
Bimba Shrestha	ee8a712af3	Using appliedParams instead of supplied params	2020-01-31 15:49:07 -08:00
Nick Terrell	e32e3e8662	Improve wildcopy performance across the board	2020-01-28 20:37:04 -08:00
Nick Terrell	7627759b4e	Merge pull request #1972 from terrelln/check-cont Move ZSTD_checkContinuity() to zstd_decompress_block.c	2020-01-23 22:02:50 -08:00
Nick Terrell	fa6a772f38	Initialize dctx->bType to silence valgrind false positive	2020-01-23 17:54:48 -08:00
Nick Terrell	cb2abc3dbe	Fix performance regression on aarch64 with clang	2020-01-23 17:31:14 -08:00
Nick Terrell	6e3cd5b024	Move ZSTD_checkContinuity() to zstd_decompress_block.c	2020-01-23 12:27:39 -08:00
Nick Terrell	a11a9271d6	Fix lowLimit underflow in overflow correction	2020-01-17 12:10:18 -08:00
Nick Terrell	036b30b555	Fix super block compression and stream raw blocks in decompression (#1947) Super blocks must never violate the zstd block bound of input_size + ZSTD_blockHeaderSize. The individual sub-blocks may, but not the super block. If the superblock violates the block bound we are liable to violate ZSTD_compressBound(), which we must not do. Whenever the super block violates the block bound we instead emit an uncompressed block. This means we increase the latency because of the single uncompressed block. I fix this by enabling streaming an uncompressed block, so the latency of an uncompressed block is 1 byte. This doesn't reduce the latency of the buffer-less API, but I don't think we really care. * I added a test case that verifies that the decompression has 1 byte latency. * I rely on existing zstreamtest / fuzzer / libfuzzer regression tests for correctness. During development I had several correctness bugs, and they easily caught them. * The added assert that the superblock doesn't violate the block bound will help us discover any missed conditions (though I think I got them all). Credit to OSS-Fuzz.	2020-01-10 18:02:11 -08:00
Nick Terrell	d1cc9d2797	[fuzz] Allow zero sized buffers for streaming fuzzers (#1945) * Allow zero sized buffers in `stream_decompress`. Ensure that we never have two zero sized buffers in a row so we guarantee forwards progress. * Make case 4 in `stream_round_trip` do a zero sized buffers call followed by a full call to guarantee forwards progress. * Fix `limitCopy()` in legacy decoders. * Fix memcpy in `zstdmt_compress.c`. Catches the bug fixed in PR #1939	2020-01-09 11:38:50 -08:00
Igor Sugak	03ffda7b88	fix UBSAN's invalid-null-argument error in zstd_decompress.c (#1939)	2020-01-08 16:17:42 -08:00
Bimba Shrestha	b1f53b1a10	[fuzz] Dividing by targetCBlockSize instead of blockSize for nbBlocks fit (#1936) * Adding fail logging for superblock flow * Dividing by targetCBlockSize instead of blockSize * Adding new const and using more accurate formula for nbBlocks * Only do dstCapacity check if using superblock * Removing disabling logic * Updating test to make it catch more extreme case of previous bug * Also updating comment * Only taking compressEnd shortcut on non-superblock	2020-01-03 16:53:51 -08:00
Bimba Shrestha	56415efc76	Constifying, malloc check and naming nit	2019-12-17 17:16:51 -08:00
Bimba Shrestha	5225dcfc0f	Adding bool to check if enough room left for noCompress superblocks	2019-12-13 15:47:28 -08:00
Yann Collet	d73e2fb465	Merge pull request #1891 from bimbashrestha/oss [fuzz] Superblock fuzz issues	2019-12-10 13:17:00 -08:00
Bimba Shrestha	e1913dc87f	Making const, removing unnecessary indent, changing parameter order	2019-12-04 15:51:17 -08:00
Bimba Shrestha	2ec556fec2	Moving init/end functions, moving compressSuperBlock inside body()	2019-12-04 15:23:13 -08:00
Bimba Shrestha	ffb0463041	Refactor	2019-12-04 14:52:27 -08:00
Bimba Shrestha	49c6d49247	[fuzz] msan uninitialized unsigned value (#1908) Fixes new fuzz issue Credit to OSS-Fuzz * Initializing unsigned value * Initializing to 1 instead of 0 because it's more conservative * Unconditionally setting to check first and then checking zero * Moving bool to before block for c90 * Move check set before block	2019-12-04 10:02:17 -08:00
Yann Collet	5120883a9c	bumped version number so that potential issue reports do not confuse `dev` with latest release	2019-12-03 17:06:42 -08:00
Bimba Shrestha	1fc9352f81	Using bss var instead of creating new bool	2019-12-02 21:39:06 -08:00
Bimba Shrestha	1f681d8592	Merge branch 'oss' of https://github.com/bimbashrestha/zstd into oss	2019-11-27 10:56:54 -08:00
Bimba Shrestha	a3a3c62b81	[fuzz] Only set HUF_repeat_valid if loaded table has all non-zero weights (#1898) Fixes a fuzz issue where dictionary_round_trip failed because the compressor was generating corrupt files thanks to zero weights in the table. * Only setting loaded dict huf table to valid on non-zero * Adding hasNoZeroWeights test to fse tables * Forbidding nbBits != 0 when weight == 0 * Reverting the last commit * Setting table log to 0 when weight == 0 * Small (invalid) zero weight dict test * Small (valid) zero weight dict test * Initializing repeatMode vars to check before zero check * Removing FSE changes to separate PR * Reverting accidentally changed file * Negating bool, using unsigned, optimization nit	2019-11-26 12:24:19 -08:00
Bimba Shrestha	d4e17d0776	Negating bool, updating bool on inner branches	2019-11-26 12:17:43 -08:00
Nick Terrell	718f00ff6f	Optimize decompression speed for gcc and clang (#1892) * Optimize `ZSTD_decodeSequence()` * Optimize Huffman decoding * Optimize `ZSTD_decompressSequences()` * Delete `ZSTD_decodeSequenceLong()`	2019-11-25 18:26:19 -08:00
Bimba Shrestha	826b555463	Merge branch 'dev' into oss	2019-11-22 17:29:33 -08:00
Bimba Shrestha	10bce1919e	Mixed declaration fix	2019-11-21 13:08:27 -08:00
Bimba Shrestha	0451accab1	Checking noCompressBlock explicitly for rep code confirmation	2019-11-21 13:06:26 -08:00
Nick Terrell	659e9f05cf	Fix null pointer addition	2019-11-20 18:36:04 -08:00
Yann Collet	2d4dcce55f	Merge pull request #1894 from felixhandte/doc-clarify-dctx-reset Easy: Update Comment on `ZSTD_initDStream()`	2019-11-19 16:18:56 -08:00
Nick Terrell	e0d6daabac	Fix Appveyor failure	2019-11-19 11:12:26 -08:00
Bimba Shrestha	8f0c2d04c8	Going back to original flow but removing else return	2019-11-19 10:03:07 -08:00
W. Felix Handte	722149cf2b	Easy: Update Comment on `ZSTD_initDStream()`	2019-11-19 01:57:15 -05:00
Nick Terrell	6a7f65117e	Merge pull request #1866 from legrosbuffle/dev Optimized loop bounds to allow the compiler to unroll the loop.	2019-11-18 16:16:30 -08:00
Nick Terrell	a839d6852c	Merge pull request #1888 from senhuang42/superblocks_fixed RLE test and re-enable RLE in main compression loop	2019-11-18 16:09:33 -08:00
Bimba Shrestha	80586f5e80	Reversing condition order and forwarding error	2019-11-18 13:53:55 -08:00
Bimba Shrestha	dade64428f	Output regular uncompressed block when compressSequences fails	2019-11-18 08:43:14 -08:00
Bimba Shrestha	2d5d961a60	Typo in comment	2019-11-15 19:00:53 -08:00
Bimba Shrestha	dba767c0bb	Leaving room for checksum	2019-11-15 18:44:51 -08:00
Vincent Torri	6b5c10b48c	shared library: rename import library with .dll.a extension Most open source projects use this extension for the import library. The Win32 linker supports this extension, see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/win32.html section "direct linking to a dll"	2019-11-15 19:46:06 +01:00
Clement Courbet	b3c9fc27b4	Optimized loop bounds to allow the compiler to unroll the loop. This has no measurable impact on large files but improves small file decompression by ~1-2% for 10kB, benchmarked with: head -c 10000 silesia.tar > /tmp/test make CC=/usr/local/bin/clang-9 BUILD_STATIC=1 && ./lzbench -ezstd -t1,5 /tmp/test	2019-11-15 08:27:05 +01:00
Sen Huang	d9646dcbb5	Fixed main compression logic changes	2019-11-14 19:39:09 -05:00
Yann Collet	4b1ac69f19	Merge pull request #1868 from senhuang42/superblocks_fixed Superblocks rebased for merge	2019-11-14 13:31:34 -08:00
Sen Huang	c26d32c91c	Change superblock #include to be last	2019-11-14 13:12:17 -05:00
Yann Collet	d67742bc5d	Merge pull request #1858 from senhuang42/dictionary_header_size Method to get dictionary header size	2019-11-14 09:44:07 -08:00
Sen Huang	c85d10d0ea	Remove mixed declarations	2019-11-08 13:57:26 -05:00
Sen Huang	d9c475f3b3	Fix static analyze error, use proper bounds for dictEnd	2019-11-08 13:57:26 -05:00
Sen Huang	d06b90692b	Move asserts to loadZstdDictionary()	2019-11-08 13:57:26 -05:00
Sen Huang	b39149e156	Expose ZSTD_reset_compressedBlockState() to shared API	2019-11-08 13:57:26 -05:00
Sen Huang	6ce335371b	Add error forwarding to loadCEntropy(), make check for dictSize >= 8 from bad merge	2019-11-08 13:57:26 -05:00
Sen Huang	4a61aaf368	Remove redundant comment	2019-11-08 13:57:26 -05:00
Sen Huang	c787b351ea	Use ZSTD Error codes, improve explanation of ZSTD_loadCEntropy() and ZSTD_loadDEntropy()	2019-11-08 13:57:26 -05:00
Sen Huang	04fb42b4f3	Integrated refactor into getDictHeaderSize, now passes tests	2019-11-08 13:57:26 -05:00
Sen Huang	0bcaf6db08	First working pass at refactor of loadZstdDictionary()	2019-11-08 13:57:26 -05:00
Sen Huang	4b141b63e0	Revert "Move decompress symbols into zstd_internal.h, remove dependency" This reverts commit a152b4c67a5266f611db4a2eac4a79003852a795.	2019-11-08 13:57:26 -05:00
Sen Huang	84404cff6e	Move decompress symbols into zstd_internal.h, remove dependency	2019-11-08 13:57:26 -05:00
Sen Huang	341e0641ed	Checks malloc() for failure, returns 0 if so	2019-11-08 13:57:26 -05:00
Sen Huang	97b7f712f3	Change to heap allocation, remove implicit type conversion	2019-11-08 13:57:25 -05:00
Sen Huang	3c36a7f13a	Add ZDICT_getHeaderSize()	2019-11-08 13:57:08 -05:00
Nick Terrell	8c474f9845	Fix parameter selection and adjustment with srcSize == 0	2019-11-07 08:58:43 -08:00
Felix Handte	5688447758	Merge pull request #1873 from felixhandte/make-overlap-log-multithread-only Fix #1861: Restrict overlapLog Parameter When Not Built With Multithreading	2019-11-06 16:56:37 -05:00
Felix Handte	ba4613602f	Merge pull request #1843 from moozzyk/issue-1637 Take ZSTD_parameters as a const pointer	2019-11-06 16:56:14 -05:00
W. Felix Handte	c13f81905a	Fix #1861: Restrict overlapLog Parameter When Not Built With Multithreading This parameter is unused in single-threaded compression. We should make it behave like the other multithread-only parameters, for which we only accept zero when we are not built with multithreading.	2019-11-06 16:05:02 -05:00
Sen Huang	13bb7500e8	Fix frame argument to compression	2019-11-05 16:15:55 -05:00
Sen Huang	f2932fb5eb	Fix more merge conflicts	2019-11-05 15:54:05 -05:00
Sen Huang	7ce891870c	Fix merge conflicts	2019-11-05 15:51:25 -05:00
Bimba Shrestha	3fb5b106da	Replacing some literals with constants	2019-11-05 10:26:57 -08:00
Nick Terrell	60205fec02	Fix 2 bugs in dictionary loading * Silently skip dictionaries less than 8 bytes, unless using `ZSTD_dct_fullDict`. This changes the compressor, which silently skips dictionaries <= 8 bytes. * Allow repcodes that are equal to the dictionary content size, since it is in bounds.	2019-11-01 16:52:07 -07:00
Sen Huang	b9ede1c8c2	Make sure contentsize is known	2019-10-30 16:03:58 -04:00
Nick Terrell	9c1860861e	Fix assert in ZSTD_safecopy In the case that `op >= oend_w` it is possible that `diff < 8` because the two buffers could be adjacent. Credit to OSS-Fuzz, which found the bug. It isn't reproducible because it depends on the memory layout.	2019-10-28 17:51:17 -07:00
Felix Handte	01ec595b85	Merge pull request #1851 from felixhandte/pkg-config-prefix-fix In pkg-config File, Derive Lib and Include Dir from Prefix at Use-Time	2019-10-28 14:24:56 -04:00
Yann Collet	74065da4c5	updated API inline doc and manual regarding ZSTD_CDict created without a dictBuffer.	2019-10-28 11:15:41 -07:00
W. Felix Handte	74bd76c3ff	In pkg-config File, Derive Lib and Include Dir from Prefix at Use-Time Addresses #1794. Instead of deriving the lib dir and include dir at build-time, let's do it like everyone else does at pkg-config run-time. This has the disadvantage that we can no longer override LIBDIR and INCLUDEDIR in the Makefile and have that reflected in the .pc file.	2019-10-25 15:07:31 -04:00
Yann Collet	c2140e9db0	Merge pull request #1845 from facebook/zbuff improve deprecation warning macro	2019-10-25 09:59:00 -07:00
Yann Collet	a9a216a846	Merge pull request #1824 from senhuang42/new_path_for_cdict Avoid using CDict params when input is large.	2019-10-23 12:04:40 -07:00
Yann Collet	63e435dda1	improve deprecation warning macro fix #1488 although, curiously enough, I was never able to reproduce the issue (according to the bug report, it should be present while using gcc 4.8).	2019-10-23 11:59:32 -07:00
moozzyk	eda7946a36	Take ZSTD_parameters as a const pointer Fixes: #1637	2019-10-22 23:21:54 -07:00
Yann Collet	f966cd080a	added documentation on DYNAMIC_BMI2 build macro	2019-10-22 17:43:09 -07:00
Yann Collet	5d5c895b18	fix initCStream_advanced() for fast strategies Compression ratio of fast strategies (levels 1 & 2) was seriously reduced, due to accidental disabling of Literals compression. Credit to @QrczakMK, who perfectly described the issue, and implementation details, making the fix straightforward. Example : initCStream with level 1 on synthetic sample P50 : Before : 5,273,976 bytes After : 3,154,678 bytes ZSTD_compress (for comparison) : 3,154,550 Fix #1787. To follow : refactor the test which was supposed to catch this issue (and failed)	2019-10-22 15:01:38 -07:00
Yann Collet	111b0c53b0	update documentation on deprecated functions mostly : note that these functions will soon generate deprecation warnings	2019-10-22 13:51:18 -07:00
Nick Terrell	b1ec94e63c	Fix ZSTD_f_zstd1_magicless for small data * Fix `ZSTD_FRAMEHEADERSIZE_PREFIX` and `ZSTD_FRAMEHEADERSIZE_MIN` to take a `format` parameter, so it is impossible to get the wrong size. * Fix the places that called `ZSTD_FRAMEHEADERSIZE_PREFIX` without taking the format into account, which is now impossible by design. * Call `ZSTD_frameHeaderSize_internal()` with `dctx->format`. * The added tests catch both bugs in `ZSTD_decompressFrame()`. Fixes #1813.	2019-10-21 21:16:17 -07:00
Sen Huang	c2e1e54f24	((x or y) or z) == (x or y or z), remove brackets	2019-10-21 19:16:50 -04:00
Sen Huang	59c81aa31b	Line up comments :)	2019-10-21 19:12:15 -04:00
Sen Huang	dbda8c318a	Trailing comma	2019-10-21 19:10:13 -04:00
Sen Huang	0c00455ea6	Merge branch 'dev' of github.com:senhuang42/zstd into new_path_for_cdict	2019-10-21 19:06:51 -04:00
Sen Huang	5b2f4ac1a8	merge	2019-10-21 19:02:52 -04:00
Sen Huang	2ab484a5f9	Fix bad merge	2019-10-21 18:55:17 -04:00
Nick Terrell	919d1d8e93	Merge pull request #1831 from terrelln/zstdmt-bad-memset [zstdmt] Don't memset the jobDescription	2019-10-21 15:53:57 -07:00
Sen Huang	b6c3459d50	merge	2019-10-21 18:46:17 -04:00
Yann Collet	6cf04c0344	Merge pull request #1834 from facebook/winFix Windows fixes	2019-10-21 13:45:17 -07:00
Sen Huang	676f89902a	Added multiplier, renamed new enum to something more useful	2019-10-21 15:36:12 -04:00
Sen Huang	1f3a51fb52	Updated forceAttachDict param bounds	2019-10-21 15:36:12 -04:00
Sen Huang	8f69c47643	Add enum to decision process	2019-10-21 15:36:12 -04:00
Sen Huang	e4de8b098a	Added support for forcing new CDict behavior and updated enum	2019-10-21 15:36:12 -04:00
Sen Huang	9294f4826b	Changed to int from BYTE	2019-10-21 15:36:12 -04:00
Sen Huang	f0fccc8847	Changed to int from BYTE	2019-10-21 15:36:12 -04:00
Sen Huang	bb2df8c499	Trailing whitespace	2019-10-21 15:36:12 -04:00
Sen Huang	cf51501d2f	Fix test	2019-10-21 15:36:12 -04:00
Sen Huang	ea3cb6988f	Cast to BYTE to appease appveyor	2019-10-21 15:36:12 -04:00
Sen Huang	a727a85a7e	merge conflicts round 2	2019-10-21 15:36:12 -04:00
Sen Huang	053a35fd64	formatting	2019-10-21 15:35:33 -04:00
Sen Huang	3fa4daaa55	Fix error	2019-10-21 15:35:33 -04:00
Sen Huang	3328348c63	Add compressionlevel to cdict	2019-10-21 15:32:39 -04:00
Felix Handte	cf725630a6	Merge pull request #1795 from felixhandte/workspace-asan Add Poisoned Redzones to the Workspace When Compiling with ASAN	2019-10-21 12:15:17 -04:00
Sen Huang	e8aa3e486d	Updated forceAttachDict param bounds	2019-10-20 22:01:08 -04:00
Sen Huang	6d297265f9	Add enum to decision process	2019-10-20 19:02:47 -04:00
Sen Huang	1daa898c93	Added support for forcing new CDict behavior and updated enum	2019-10-20 14:03:09 -04:00
Nick Terrell	0bc39bc3a0	[zstdmt] Don't memset the jobDescription	2019-10-18 15:05:51 -07:00
Nick Terrell	243824551f	[threading] Add debug utilities	2019-10-18 15:05:34 -07:00
Yann Collet	1795133c45	refactored FIO_compressMultipleFilenames() prototype for consistency	2019-10-17 15:32:03 -07:00
Yann Collet	6446ffb277	Merge pull request #1827 from facebook/dm_Dct updated erroneous comments using ZSTD_dm_*	2019-10-17 10:30:58 -07:00
Yann Collet	19741c7d99	Merge pull request #1815 from facebook/zlibwrap make zlibWrapper strict ISO-C90 compatible	2019-10-16 16:45:15 -07:00
Yann Collet	6323966e53	updated erroneous comments using ZSTD_dm_* instead of the current ZSTD_dct_*, reported by @nigeltao (#1822)	2019-10-16 16:14:04 -07:00
Yann Collet	2d5201b0ab	removed wildcopy8() which is no longer used, noticed by @davidbolvansky	2019-10-16 14:51:33 -07:00
Sen Huang	4455f00cb8	Changed to int from BYTE	2019-10-16 15:06:02 -04:00
Sen Huang	4f7d26b0ee	Changed to int from BYTE	2019-10-16 15:05:29 -04:00
Sen Huang	cf00ea367a	Trailing whitespace	2019-10-16 10:31:27 -04:00
Sen Huang	8cb2174446	Fix test	2019-10-16 10:29:31 -04:00
Sen Huang	5e901b6f32	Cast to BYTE to appease appveyor	2019-10-15 13:58:44 -04:00
Sen Huang	5c010c9d2d	merge conflicts round 2	2019-10-15 13:10:05 -04:00
Sen Huang	a06b51879c	merge conflict	2019-10-15 12:58:50 -04:00
Sen Huang	23dac23a49	formatting	2019-10-15 12:44:48 -04:00
Sen Huang	0c8df5c928	Fix error	2019-10-15 12:28:23 -04:00
Sen Huang	a65eb39f9d	Add compressionlevel to cdict	2019-10-15 10:22:06 -04:00
Yann Collet	fb77afc626	Merge pull request #1760 from bimbashrestha/extract_sequences_api Adding api for extracting sequences from seqstore	2019-10-10 13:11:18 -07:00
W. Felix Handte	ede31da2ea	Fix CCtx Size Estimation	2019-10-10 15:02:08 -04:00
W. Felix Handte	bd6a20b8a0	Expand Default Redzone Size	2019-10-10 13:45:55 -04:00
W. Felix Handte	2c80a9f8ac	Check if CCtx in Workspace after Null Check	2019-10-10 13:40:16 -04:00
W. Felix Handte	b6987acbbf	Declare the ASAN Functions We Need, Don't Include the Header	2019-10-10 13:40:16 -04:00
W. Felix Handte	0ffae7e440	Stop Allocating Extra Space for Table Redzones	2019-10-10 13:40:16 -04:00
W. Felix Handte	a07037b784	Don't Try to Redzone the Tables	2019-10-10 13:40:16 -04:00
W. Felix Handte	0cc481ef66	Fix Workspace Size Calculation	2019-10-10 13:40:16 -04:00
W. Felix Handte	b6c0a02a17	Fix ZSTD_sizeof_matchState() Calculation	2019-10-10 13:40:16 -04:00
W. Felix Handte	8cffd6ed08	Avoid ASAN Failure in ZSTD_cwksp_free()	2019-10-10 13:40:16 -04:00
W. Felix Handte	ef0b5707c5	Refactor Freeing CCtxes / CDicts Inside Workspaces	2019-10-10 13:40:16 -04:00
W. Felix Handte	143b296cf6	Surround Workspace Allocs with Dead Zone	2019-10-10 13:40:16 -04:00
W. Felix Handte	19a0955ec9	Add `ZSTD_cwksp_alloc_size()` to Help Calculate Needed Workspace Size	2019-10-10 13:40:16 -04:00
W. Felix Handte	da88c35d41	Stop Assuming Tables are Adjacent	2019-10-10 13:40:16 -04:00
W. Felix Handte	35c30d6ca7	Poison Unused Workspace Memory	2019-10-10 13:40:16 -04:00
W. Felix Handte	edb6d884a5	Detect Whether We're Being Compiled with ASAN	2019-10-10 13:40:16 -04:00
W. Felix Handte	dc1fb684bf	Remove Unused MEM_SKIP_MSAN Macro	2019-10-10 13:40:16 -04:00
Bimba Shrestha	36528b96c4	Manually moving instead of memcpy on decoder and using genBuffer()	2019-10-03 09:26:51 -07:00
Bimba Shrestha	61ec4c2e7f	Cleaning sequence parsing logic	2019-10-03 06:42:40 -07:00
Yann Collet	cb18fffe65	enforce C90 compatibility for zlibWrapper	2019-09-24 17:50:58 -07:00
Yann Collet	ad2a2785f7	bump version number to v1.4.4 so that future reports on `dev` branch use this number instead	2019-09-24 15:15:33 -07:00
Bimba Shrestha	c04245b257	Replacing assert with memory_allocation error code throw	2019-09-23 15:42:16 -07:00
Bimba Shrestha	be0bebd24e	Adding test and null check for malloc	2019-09-23 15:08:18 -07:00
Dávid Bolvanský	1ab1a40c9c	Fixed one more place	2019-09-23 21:32:56 +02:00
Dávid Bolvanský	1f7228c040	Use clz ^ 31 instead of 31 - clz; better codegen for GCC	2019-09-23 21:23:09 +02:00
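The identity behind the `clz ^ 31` commit above can be checked with a small sketch. For any count `c` in the range 0..31, `31 - c` equals `c ^ 31`, because 31 has all five low bits set and the subtraction never borrows. The helper names below are hypothetical and are not zstd functions; the portable loop stands in for a compiler builtin so the sketch does not depend on one. The codegen benefit on GCC is the commit's claim and is not demonstrated here.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative, portable count-leading-zeros for a non-zero 32-bit value.
 * (Hypothetical helper; a real build would normally use a compiler builtin.) */
static unsigned sketch_clz32(uint32_t v)
{
    unsigned n = 0;
    assert(v != 0);
    while ((v & 0x80000000u) == 0) { v <<= 1; n++; }
    return n;
}

/* Index of the highest set bit, written as a subtraction. */
static unsigned sketch_highbit_sub(uint32_t v)
{
    return 31u - sketch_clz32(v);
}

/* Same value written as an XOR: for c in [0, 31], 31 - c == c ^ 31,
 * since 31 is 0b11111 and subtracting from it never borrows. */
static unsigned sketch_highbit_xor(uint32_t v)
{
    return sketch_clz32(v) ^ 31u;
}
```

Both forms return the same bit index for every non-zero input, so the change is a pure rewrite of the expression, not a change in behavior.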
Nick Terrell	7451c6578c	Merge pull request #1804 from terrelln/wild-and-fast Optimize (de)compression and fix wildcopy overread	2019-09-21 17:04:36 -07:00
Nick Terrell	5cb7615f1f	Add UNUSED_ATTR to ZSTD_storeSeq()	2019-09-20 21:37:13 -07:00
Nick Terrell	5dc0a1d659	HINT_INLINE ZSTD_storeSeq() Clang on Mac wasn't inlining `ZSTD_storeSeq()` in level 1, which was causing a 5% performance regression. This fixes it.	2019-09-20 16:39:27 -07:00
Bimba Shrestha	f3c4fd17e3	Passing in dummy dst buffer of compressbound(srcSize)	2019-09-20 15:50:58 -07:00
Felix Handte	c047fcf7bf	Merge pull request #1806 from felixhandte/estimate-cctx-doc Update Comment on `ZSTD_estimateCCtxSize()`	2019-09-20 15:36:00 -04:00
Nick Terrell	44c65da97e	Remove literals overread in ZSTD_storeSeq() for ~neutral perf	2019-09-20 12:23:25 -07:00
W. Felix Handte	f7d9b36835	Update Comment on `ZSTD_estimateCCtxSize()`	2019-09-20 14:11:29 -04:00
Nick Terrell	fde217df04	Fix bounds check in ZSTD_storeSeq()	2019-09-20 08:25:12 -07:00
Nick Terrell	67b1f5fc72	Fix too strict assert	2019-09-20 01:23:35 -07:00
Nick Terrell	ddab2a94e8	Pass iend into ZSTD_storeSeq() to allow ZSTD_wildcopy()	2019-09-20 00:56:20 -07:00
Nick Terrell	cdad7fa512	Widen ZSTD_wildcopy to 32 bytes	2019-09-20 00:52:15 -07:00
Nick Terrell	efd37a64ea	Optimize decompression and fix wildcopy overread * Bump `WILDCOPY_OVERLENGTH` to 16 to fix the wildcopy overread. * Optimize `ZSTD_wildcopy()` by removing unnecessary branches and unrolling the loop. * Extract `ZSTD_overlapCopy8()` into its own function. * Add `ZSTD_safecopy()` for `ZSTD_execSequenceEnd()`. It is optimized for single long sequences, since that is the important case that can end up in `ZSTD_execSequenceEnd()`. Without this optimization, decompressing a block with 1 long match goes from 5.7 GB/s to 800 MB/s. * Refactor `ZSTD_execSequenceEnd()`. * Increase the literal copy shortcut to 16. * Add a shortcut for offset >= 16. * Simplify `ZSTD_execSequence()` by pushing more cases into `ZSTD_execSequenceEnd()`. * Delete `ZSTD_execSequenceLong()` since it is exactly the same as `ZSTD_execSequence()`. clang-8 sees +17.5% on silesia and +21.8% on enwik8. gcc-9 sees +12% on silesia and +15.5% on enwik8. TODO: More detailed measurements, and on more datasets. Credit to OSS-Fuzz for finding the wildcopy overread.	2019-09-19 21:07:14 -07:00
Bimba Shrestha	ae6d0e64ae	Addressing comments	2019-09-19 15:25:20 -07:00
Yann Collet	3cac061db5	Merge pull request #1802 from bimbashrestha/rle_block_bound_fix_pt2 Adding 4 blocks to FSE_BLOCKBOUND() in lib/common (different from las…	2019-09-18 16:32:37 -07:00
Bimba Shrestha	6e9f6813bb	adding bit container size	2019-09-18 13:49:45 -07:00
Bimba Shrestha	f9b6abb896	Adding 4 blocks to FSE_BLOCKBOUND() in lib/common (different from last week)	2019-09-18 13:29:05 -07:00
Yann Collet	bfff5b30a4	Merge pull request #1756 from mgrice/dev Improvements in zstd decode performance	2019-09-18 11:35:50 -07:00
Yann Collet	243200e5bf	minor refactor of ZSTD_fast - reduced variables lifetime - more accurate code comments	2019-09-17 14:02:57 -07:00
Bimba Shrestha	76fea3fb99	Resolving appveyor test failure implicit conversion	2019-09-16 14:02:23 -07:00
Bimba Shrestha	a874435478	Merge branch 'dev' into extract_sequences_api	2019-09-16 13:29:59 -07:00
Felix Handte	2164a130f3	Merge pull request #1780 from felixhandte/workspace-efficiency-3 Avoid Clearing Tables Even When Changing CParams	2019-09-16 14:37:05 -04:00
W. Felix Handte	72ea79cacd	Don't Include `sanitizer/msan_interface.h`, Since Not All Platforms Provide It Instead, explicitly declare the functions we use.	2019-09-16 12:08:03 -04:00
Bimba Shrestha	bff6072e3a	Bailing early when collecting sequences and documentation	2019-09-16 08:26:21 -07:00
Nick Terrell	fbeaf6989e	[libzstd] Improve advanced API docs	2019-09-15 12:41:24 -07:00
Yann Collet	09b1844d9b	Merge pull request #1784 from bimbashrestha/fse_block_bound_err Rearranging assert and allowing 4 extra for FSE_BLOCKBOUND()	2019-09-12 19:09:27 -07:00
Bimba Shrestha	fe9af338ed	Added assert to BIT_flushBits()	2019-09-12 15:35:27 -07:00
Bimba Shrestha	43da5bf27e	Rearranging assert and allowing 4 extra for FSE_BLOCKBOUND()	2019-09-12 14:43:50 -07:00
W. Felix Handte	20c69077d1	Shrink Table Valid End During Alloc Alignment / Phase Change	2019-09-11 17:14:59 -04:00
W. Felix Handte	51d90668ba	Add Assertions to Confirm that Workspace Pointers are Correctly Ordered	2019-09-11 17:14:59 -04:00
W. Felix Handte	a10c191613	`__msan_poison()` Workspace When Preparing for Re-Use	2019-09-11 17:14:45 -04:00
W. Felix Handte	7c57e2b9ca	Zero `h3size` When `h3log` is 0 This led to a nasty edgecase, where index reduction for modes that don't use the h3 table would have a degenerate table (size 4) allocated and marked clean, but which would not be re-indexed.	2019-09-11 13:14:26 -04:00
W. Felix Handte	bc020eec92	Also Shrink Clean Table Area When Reducing Indices	2019-09-11 11:40:57 -04:00
W. Felix Handte	1999b2ed9b	Update DEBUGLOG Statements	2019-09-11 11:21:00 -04:00
W. Felix Handte	13e29a56de	Shrink Clean Table Area When Copying Table Contents into Context The source matchState is potentially at a lower current index, which means that any extra table space not overwritten by the copy may now contain invalid indices. The simple solution is to unconditionally shrink the valid table area to just the area overwritten.	2019-09-11 11:18:45 -04:00
W. Felix Handte	edb3ad053e	Comments	2019-09-10 18:25:45 -04:00
W. Felix Handte	f31ef28ff8	Only Reset Indexing in `ZSTD_resetCCtx_internal()` When Necessary	2019-09-10 18:25:45 -04:00
W. Felix Handte	9968a53e91	Remove No-Longer-Used Continuation Functions	2019-09-10 18:25:45 -04:00
W. Felix Handte	1b28e80416	Remove Fast Continue Path in `ZSTD_resetCCtx_internal()`	2019-09-10 18:25:45 -04:00
W. Felix Handte	ad16eda5e4	`ZSTD_reset_matchState` Optionally Doesn't Restart Indexing	2019-09-10 18:25:45 -04:00
W. Felix Handte	5b10bb5ec3	Rename `ZSTD_compResetPolicy_e` Values and Add Comment	2019-09-10 18:25:45 -04:00
W. Felix Handte	0492b9a9ec	Accept `ZSTD_indexResetPolicy_e` Param in `ZSTD_reset_matchState()`	2019-09-10 18:25:45 -04:00
W. Felix Handte	14c5471d5e	Introduce `ZSTD_indexResetPolicy_e` Enum	2019-09-10 18:25:45 -04:00
W. Felix Handte	17b6da2e0f	Track Usable Table Space in Compression Workspace	2019-09-10 18:25:37 -04:00
Yann Collet	22bd158e0f	Merge pull request #1712 from felixhandte/workspace-efficiency-2 Allocate Internal Buffers via Workspace Abstraction	2019-09-10 15:20:29 -07:00
Bimba Shrestha	1407919d13	Addressing comments on parsing	2019-09-10 15:10:50 -07:00
Bimba Shrestha	47199480da	Cleaning up parsing per suggestion	2019-09-10 13:18:59 -07:00
W. Felix Handte	a9d373f093	Remove Empty lib/compress/zstd_cwksp.c	2019-09-10 16:03:13 -04:00
Yann Collet	5ba495b622	Merge pull request #1775 from facebook/edufix fix educational decoder	2019-09-10 12:12:08 -07:00
Yann Collet	41416f0927	Merge pull request #1773 from bimbashrestha/rle_first_block_decompression_fix Removing redundant condition in decompression, making first block rle…	2019-09-10 11:17:29 -07:00
Bimba Shrestha	e3c5825918	Fixing litLength == 0 case	2019-09-10 10:38:13 -07:00
Bimba Shrestha	9e7bb55e14	Addressing comments	2019-09-09 20:04:46 -07:00
W. Felix Handte	81208fd7c2	Forward Declare `ZSTD_cwksp_available_space` to Fix Build	2019-09-09 19:10:09 -04:00
W. Felix Handte	91bf1babd1	Inline Workspace Functions	2019-09-09 18:53:53 -04:00
W. Felix Handte	0db3ffe7ee	Forward resetCCtx Errors when Using CDict	2019-09-09 16:47:19 -04:00
W. Felix Handte	eb6f69d978	Fix sizeof_CCtx and sizeof_CDict Calculations for Statically Init'ed Objects	2019-09-09 16:45:17 -04:00
W. Felix Handte	e3703825a8	Fix workspaceTooSmall Calculation	2019-09-09 15:12:14 -04:00
W. Felix Handte	0a65a67901	Shorten `&zc->workspace` -> `ws` in `ZSTD_resetCCtx_internal()`	2019-09-09 14:59:09 -04:00
W. Felix Handte	1120e4d962	Clean Up TODOs and Comments pt. II	2019-09-09 14:04:39 -04:00
W. Felix Handte	c60e1c3be5	Nit	2019-09-09 13:34:08 -04:00
W. Felix Handte	7d7b665c90	Pull Phase Advance Logic Out into Internal Function	2019-09-09 13:34:08 -04:00
W. Felix Handte	8549ae9f1d	Hide Workspace Movement Behind Helper Function	2019-09-09 13:34:08 -04:00
W. Felix Handte	2405c03bcd	Fix DEBUGLOG Statement Levels	2019-09-09 13:34:08 -04:00
W. Felix Handte	7100d24221	Fix Rescale Continue Special Case	2019-09-09 13:34:08 -04:00
W. Felix Handte	7321e4c9f3	Remove Unused noRealloc CRP Value	2019-09-09 13:34:08 -04:00
W. Felix Handte	901bba4ca6	Re-Implement Workspace Shrinking when Oversized	2019-09-09 13:34:08 -04:00
W. Felix Handte	881bcd80ca	Cleanup from Move	2019-09-09 13:34:08 -04:00
W. Felix Handte	b511a84adc	Move Workspace Functions to Their Own File	2019-09-09 13:34:08 -04:00
W. Felix Handte	077a2d7dc9	Rename	2019-09-09 13:34:08 -04:00
W. Felix Handte	ebd162194f	Clean Up TODOs and Comments	2019-09-09 13:34:08 -04:00
W. Felix Handte	2abe0145b1	Improve Comments a Bit	2019-09-09 13:34:08 -04:00
W. Felix Handte	7a2416a863	Allocate CDict in Workspace (Rather than in Separate Allocation)	2019-09-09 13:34:08 -04:00
W. Felix Handte	65057cf009	Rewrite ZSTD_initStaticCCtx to Alloc CCtx in Workspace	2019-09-09 13:34:08 -04:00
W. Felix Handte	58b69ab15c	Only the CCtx Itself Needs to be Cleared during Static CCtx Init	2019-09-09 13:34:08 -04:00
W. Felix Handte	88c2fcd0ee	Align Alloc Pointer When Transitioning from Buffers to Aligned Allocs	2019-09-09 13:34:08 -04:00
W. Felix Handte	e936b73889	Remove Overly-Restrictive Assert	2019-09-09 13:34:08 -04:00
W. Felix Handte	75d574368b	When Loading Dict By Copy, Always Put it in the Workspace	2019-09-09 13:34:08 -04:00
W. Felix Handte	e69b67e33a	Alloc Tables Separately	2019-09-09 13:34:08 -04:00
W. Felix Handte	6177354b36	Begin Introducing Phases	2019-09-09 13:34:08 -04:00
W. Felix Handte	786f2266bb	TMP	2019-09-09 13:34:08 -04:00
W. Felix Handte	c25283cf00	Disambiguate 'workspace' and 'entropyWorkspace'	2019-09-09 13:34:08 -04:00
W. Felix Handte	ccaac852e8	Normalize Case 'workSpace' -> 'workspace'	2019-09-09 13:27:18 -04:00
Bimba Shrestha	44e122053b	Mentioning cli only in the comment as suggested	2019-09-06 14:48:41 -07:00
Yann Collet	2b0a271ed2	fix educational decoder fix #1774 also : - fix minor compilation warnings - make sure the `test` is run during CI tests	2019-09-06 14:30:13 -07:00
Bimba Shrestha	a917cd597d	Put back omission for first rle block and updated comment as suggested	2019-09-06 13:44:25 -07:00
Bimba Shrestha	d687d603e4	Removing redundant condition in decompression, making first block rles valid to decompress	2019-09-06 10:46:19 -07:00
Varun S Nair	9816560649	Fixing assert and DEBUGLOG due to ZSTD_CCtx_params parameter change to const pointer	2019-09-05 15:47:17 +05:30
Varun S Nair	771645471f	Passing ZSTD_CCtx_params by const pointer	2019-09-05 15:28:30 +05:30
Bimba Shrestha	5f8b0f6890	Changing api to get sequences across all blocks	2019-08-30 09:18:44 -07:00
Yann Collet	5198347382	Merge pull request #1744 from bimbashrestha/dev Generate RLE blocks in the encoder	2019-08-29 15:19:10 -07:00
Bimba Shrestha	623b90f85d	Fixing ci-circle test complaints	2019-08-29 13:09:42 -07:00
mgrice	5d89771529	fix warning: always_inline function might not be inlinable	2019-08-29 12:32:15 -07:00
Bimba Shrestha	ece465644b	Adding api for extracting sequences from seqstore	2019-08-29 12:29:39 -07:00
mgrice	b830599582	Improvements in zstd decode performance Summary: The idea behind wildcopy is that it can be cheaper to copy more bytes (say 8) than it is to copy less (say, 3). This change takes that further by exploiting some properties: 1. it's almost always OK to copy 16 bytes instead of 8, which means fewer copy instructions, and fewer branches 2. A 16 byte chunk size means that ~90% of wildcopy invocations will have a trip count of 1, so branch prediction will be improved. Speedup on Xeon E5-2680v4 is in the range of 3-5%. Measured wildcopy length distributions on silesia.tar: level <=8 <=16 <=24 >24 1 78.05% 11.49% 3.52% 6.94% 3 82.14% 8.99% 2.44% 6.43% 6 85.81% 6.51% 2.92% 4.76% 8 83.02% 7.31% 3.64% 6.03% 10 84.13% 6.67% 3.29% 5.91% 15 77.58% 7.55% 5.21% 9.66% 16 80.07% 7.20% 3.98% 8.75% Test Plan: benchmark silesia, make check	2019-08-29 12:25:56 -07:00
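The wildcopy idea summarized in the commit above can be illustrated with a minimal sketch. It is not the zstd implementation: the function name and the `SKETCH_OVERLENGTH` constant are assumptions made for illustration. The point is only that copying whole 16-byte chunks, and deliberately overrunning the requested length, replaces a variable-length copy with fixed-size moves and a loop that usually runs once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical slack that callers must reserve after both buffers. */
#define SKETCH_OVERLENGTH 16

/* Copies at least `length` bytes from src to dst in 16-byte chunks.
 * May read and write up to SKETCH_OVERLENGTH bytes past `length`, so both
 * buffers need that much slack. The two buffers must not overlap. */
static void sketch_wildcopy16(unsigned char* dst, const unsigned char* src, size_t length)
{
    unsigned char* const end = dst + length;
    assert(dst != NULL && src != NULL);
    do {
        memcpy(dst, src, 16);   /* fixed-size copy, typically one or two vector moves */
        dst += 16;
        src += 16;
    } while (dst < end);        /* for length <= 16 the body runs exactly once */
}
```

With the length distribution quoted in the commit message, most calls fall in the `<=16` bucket, which is the case where this loop takes a single iteration.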
Bimba Shrestha	c3e3c8bf32	Undoing the last commit (that was an accident)	2019-08-29 12:05:47 -07:00
bimbashrestha	4a1ca5e0a8	Adding method for extracting sequences.	2019-08-29 11:55:12 -07:00
bimbashrestha	e5704bbfdf	Added test for multiple blocks of zeros and fixed nit about comments	2019-08-28 08:32:34 -07:00
Nick Terrell	e9c0fc12d2	Merge pull request #1748 from terrelln/cover-deadlock [dictBuilder] Fix deadlock in *COVER error case	2019-08-27 10:17:28 -07:00
Nick Terrell	0932de54bc	[dictBuilder] Fix deadlock in *COVER error case The COVER and FASTCOVER dictionary builders can deadlock when dictionary construction errors, likely because there are too few samples, or too few distinct dmers. The deadlock only occurs when there are errors. Fixes #1746.	2019-08-26 18:19:29 -07:00
bimbashrestha	96201d9774	Added bool to cctx and fixed some comment nits	2019-08-26 15:30:41 -07:00
bimbashrestha	991cbc9024	Fixing mixed declaration compiler complaint	2019-08-26 15:00:50 -07:00
bimbashrestha	ce264ce53b	Forbidding emission of RLE when it's the first block	2019-08-26 14:54:29 -07:00
bimbashrestha	33b6446ca7	Removing accidental method call	2019-08-26 14:34:43 -07:00
bimbashrestha	7b041b552e	Removing assert for rle that doesn't always hold	2019-08-26 12:26:53 -07:00
bimbashrestha	1f2bf77f2a	Using typedef U32 instead of int	2019-08-26 09:00:22 -07:00
bimbashrestha	ba46932492	Removing implicit conversion from const void* to const BYTE* and added constant for threshold	2019-08-26 08:51:34 -07:00
Carl Woffenden	c690f22e96	Merge branch 'dev' into amalgamate	2019-08-23 23:05:02 +02:00
Carl Woffenden	5144e66095	Revert "Merge remote-tracking branch 'origin/master' into dev" This reverts commit `0df29a4e5f`, reversing changes made to `69c875a0cc`.	2019-08-23 23:04:21 +02:00
Carl Woffenden	0fcaa675e0	Merge remote-tracking branch 'upstream/dev' into dev	2019-08-23 23:03:52 +02:00
Carl Woffenden	0df29a4e5f	Merge remote-tracking branch 'origin/master' into dev	2019-08-23 22:57:06 +02:00
bimbashrestha	0e3ba02cf1	Fixing more test failure errors	2019-08-22 13:54:41 -07:00
bimbashrestha	4faf3a5911	Fixing ci-circle test failure issues	2019-08-22 13:46:15 -07:00
bimbashrestha	cba5350f88	Moving RLE logic to inside ZSTD_compressBlock_internal and adding assert	2019-08-22 12:12:44 -07:00
Nick Magerko	493f95c7df	Fix merge conflicts	2019-08-22 11:51:41 -07:00
bimbashrestha	4c90d862e3	Generate RLE blocks in the encoder	2019-08-22 11:27:20 -07:00
Nick Terrell	54ad33448c	Merge pull request #1737 from terrelln/legacy-fix [legacy] Fix buffer overflow in v0.2 and v0.4 raw literals decompression	2019-08-21 10:10:24 -07:00
Carl Woffenden	901ea61f83	Tweaks to create a single-file decoder The CHECK_F macros differ slightly (but eventually do the same thing). Older GCC needs to fallback on the old-style pragma optimisation flags.	2019-08-21 17:49:17 +02:00
Yann Collet	38b6428fcd	Merge pull request #1725 from emaste/dev remove extraneous doubled ;s	2019-08-21 05:19:30 -07:00
Yann Collet	fe0877c664	Merge pull request #1721 from facebook/seq127 fixed very minor inefficiency (nbSeq==127)	2019-08-21 05:19:12 -07:00
Yann Collet	757ab66879	Merge pull request #1713 from cemeyer/fix_gcc4_build Fix the build on GCC 4.x after `812e8f2a1`	2019-08-21 05:17:42 -07:00
Nick Terrell	07f22d465d	[legacy] Fix buffer overflow in v0.2 and v0.4 raw literals decompression Extends the fix in PR#1722 to v0.2 and v0.4. These aren't built into zstd by default, and v0.5 onward are not affected. I only add the `srcSize > BLOCKSIZE` check to v0.4 because the comments say that it must hold, but the equivalent comment isn't present in v0.2. Credit to OSS-Fuzz.	2019-08-20 17:13:04 -07:00
Nick Magerko	de6a6c7364	Fix ZSTD_SRCSIZEHINT_MIN typo	2019-08-20 13:07:51 -07:00
Nick Magerko	c7a24d7a14	Define ZSTD_SRCSIZEHINT_MIN as 0	2019-08-20 13:06:15 -07:00
Nick Magerko	2d39b43906	Use int for srcSizeHint when sensible	2019-08-19 16:49:25 -07:00
Nick Magerko	09894dc2eb	Add mention of regression with poor size hints	2019-08-19 13:41:36 -07:00
Nick Magerko	fee8fbcddf	Make upper bound INT_MAX	2019-08-19 12:58:54 -07:00
Nick Magerko	edf2abf106	Fix fall-through case	2019-08-19 12:32:43 -07:00
Nick Magerko	dffbac5f89	Add --size-hint=# option	2019-08-19 11:38:49 -07:00
Ed Maste	b81d7cc6a0	remove extraneous doubled ;s	2019-08-15 21:17:06 -04:00
W. Felix Handte	a42bbb4e05	Fix Buffer Overflow in Legacy (v0.3) Raw Literals Decompression	2019-08-15 14:28:30 -04:00
Yann Collet	782bfb858a	fixed very minor inefficiency (nbSeq==127) The nbSeq "short" format (1-byte) is compatible with any value < 128. However, the code would cautiously only accept values < 127. This is not an error, because the general 2-bytes format is compatible with small values < 128. Hence the inefficiency never triggered any warning. Spotted by Intel's Smita Kumar.	2019-08-15 16:41:34 +02:00
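The `nbSeq` header forms mentioned in the commit above can be sketched as follows. The layout follows the sequences section header of the Zstandard format: one byte for values below 128, two bytes below 0x7F00, three bytes otherwise. The function names and the `SKETCH_LONGNBSEQ` constant are hypothetical and are not the library's own code. The writer uses the `< 128` bound, which corresponds to the fixed behavior. Per the commit message, the earlier `< 127` bound sent the value 127 through the two-byte form, which is still valid but one byte longer.

```c
#include <assert.h>
#include <stddef.h>

/* Values at or above this threshold use the 3-byte form (illustrative name). */
#define SKETCH_LONGNBSEQ 0x7F00u

/* Writes nbSeq in its shortest form; returns the number of bytes written. */
static size_t sketch_write_nbseq(unsigned char* op, unsigned nbSeq)
{
    assert(nbSeq <= 0xFFFFu + SKETCH_LONGNBSEQ);
    if (nbSeq < 128) {                       /* short form: 1 byte */
        op[0] = (unsigned char)nbSeq;
        return 1;
    }
    if (nbSeq < SKETCH_LONGNBSEQ) {          /* general form: 2 bytes */
        op[0] = (unsigned char)((nbSeq >> 8) + 0x80);
        op[1] = (unsigned char)(nbSeq & 0xFF);
        return 2;
    }
    {                                        /* long form: 0xFF + 16-bit little-endian */
        unsigned const v = nbSeq - SKETCH_LONGNBSEQ;
        op[0] = 0xFF;
        op[1] = (unsigned char)(v & 0xFF);
        op[2] = (unsigned char)(v >> 8);
        return 3;
    }
}

/* Reads nbSeq back; stores the number of header bytes consumed. */
static unsigned sketch_read_nbseq(const unsigned char* ip, size_t* consumed)
{
    if (ip[0] < 128) { *consumed = 1; return ip[0]; }
    if (ip[0] == 0xFF) {
        *consumed = 3;
        return (unsigned)ip[1] + ((unsigned)ip[2] << 8) + SKETCH_LONGNBSEQ;
    }
    *consumed = 2;
    return (((unsigned)ip[0] - 0x80u) << 8) + ip[1];
}
```

Because the reader accepts the two-byte form for small values as well, the older bound produced valid frames, which is why the inefficiency never surfaced as an error.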
Conrad Meyer	ff6c81d90c	Fix the build on GCC 4.x after `812e8f2a1` The ancient GCC 4.x doesn't understand the "optimize" attribute until 4.4. Fix the build on platforms with GCC 4.x < 4.4 by limiting the DONT_VECTORIZE definition to GCC 5 and greater. Noticed and patch proposed by Warner Losh <imp@FreeBSD.org>.	2019-08-08 17:25:49 -07:00
Yann Collet	01b2331ad1	bumped version number to v1.4.3	2019-08-05 17:17:16 +02:00
Yann Collet	61936ba42a	Merge pull request #1705 from josepho0918/dev Add support for IAR C/C++ Compiler for Arm	2019-08-05 15:57:28 +02:00
Yann Collet	facbe8b2c2	factored the logic selecting lowest match index as suggested by @terrelln	2019-08-05 15:18:43 +02:00
Yann Collet	0b0b83e8f3	fix test 122 it's an unsupported scenario.	2019-08-03 16:51:26 +02:00
Yann Collet	98e7c344cd	fixed strategies btopt+	2019-08-02 14:42:53 +02:00
Yann Collet	b4257b04e7	fixed strategy btlazy2	2019-08-02 14:26:26 +02:00
Yann Collet	5cf1b24aca	fixed strategies greedy, lazy & lazy2 restore dictionary compression ratio	2019-08-02 14:21:39 +02:00
Yann Collet	98692c2838	fixed compression ratio regression when dictionary-compressing medium-size inputs at levels 1-3	2019-08-01 15:58:17 +02:00
Joseph Chen	3855bc4295	Add support for IAR C/C++ Compiler for Arm	2019-07-29 15:25:58 +08:00
W. Felix Handte	8083581f9a	Bump Library Version Number to 1.4.2	2019-07-24 17:35:19 -04:00
Nick Terrell	e6edcfa795	[legacy] Fix bug in zstd-0.5 decoder The match length and literal length extra bytes could either be 2 bytes or 3 bytes in version 0.5. All earlier versions were always 3 bytes, and later versions didn't have dumps. The bug, introduced by commit `0fd322f812`, was triggered when the last dump was a 2-byte dump, because we didn't separate that case from a 3-byte dump, and thought we were over-reading. I've tested this fix with every zstd version < 1.0.0 on the buggy file, and we are now always successfully decompressing with the right checksum. Fixes #1693.	2019-07-22 13:05:09 -07:00
Yann Collet	be3d2e2de8	Merge pull request #1679 from ephiepark/dev Restructure the source files	2019-07-19 15:29:07 -07:00
Vivek Miglani	c7be7d2efb	Fixing compressed block size checks	2019-07-17 12:53:15 -07:00
Ephraim Park	1dc98de279	Restructure the source files	2019-07-15 17:39:18 -07:00
Vivek Miglani	3f108f82fb	Return error if block size exceeds maximum	2019-07-15 12:10:21 -07:00
Yann Collet	8fb08b68cc	Merge pull request #1681 from facebook/level3 updated double_fast complementary insertion	2019-07-12 16:16:06 -07:00
Nick Terrell	75cfe1dc69	[ldm] Fix bug in overflow correction with large job size (#1678 ) * [ldm] Fix bug in overflow correction with large job size * [zstdmt] Respect ZSTDMT_JOBSIZE_MAX (1G in 64-bit mode) * [test] Add test that exposes the bug Sadly the test fails on our CI because it uses too much memory, so I had to comment it out.	2019-07-12 18:45:18 -04:00
Yann Collet	eaeb7f00b5	updated the _extDict variant of double fast	2019-07-12 14:17:17 -07:00
Yann Collet	e8a7f5d3ce	double-fast: changed the trade-off for a smaller positive change same number of complementary insertions, just organized differently (long at `ip-2`, short at `ip-1`).	2019-07-12 11:34:53 -07:00
mgrice	812e8f2a16	perf improvements for zstd decode (#1668 ) * perf improvements for zstd decode tldr: 7.5% average decode speedup on silesia corpus at compression levels 1-3 (sandy bridge) Background: while investigating zstd perf differences between clang and gcc I noticed that even though gcc is vectorizing the loop in wildcopy, it was not being done as well as could be done by hand. The sites where wildcopy is invoked have an interesting distribution of lengths to be copied. The loop trip count is rarely above 1, yet long copies are common enough to make their performance important. The code in zstd_decompress.c to invoke wildcopy handles the latter well but the gcc autovectorizer introduces a needlessly expensive startup check for vectorization. See how GCC autovectorizes the loop here: https://godbolt.org/z/apr0x0 Here is the code after this diff has been applied: (left hand side is the good one, right is with vectorizer on) After: https://godbolt.org/z/OwO4F8 Note that autovectorization still does not do a good job on the optimized version, so it's turned off via attribute and flag. I found that neither attribute nor command-line flag were entirely successful in turning off vectorization, which is why there were both. 
silesia benchmark data - second triad of each file is with the original code: file orig compressedratio encode decode change 1#dickens 10192446-> 4268865(2.388), 198.9MB/s 709.6MB/s 2#dickens 10192446-> 3876126(2.630), 128.7MB/s 552.5MB/s 3#dickens 10192446-> 3682956(2.767), 104.6MB/s 537MB/s 1#dickens 10192446-> 4268865(2.388), 195.4MB/s 659.5MB/s 7.60% 2#dickens 10192446-> 3876126(2.630), 127MB/s 516.3MB/s 7.01% 3#dickens 10192446-> 3682956(2.767), 105MB/s 479.5MB/s 11.99% 1#mozilla 51220480-> 20117517(2.546), 285.4MB/s 734.9MB/s 2#mozilla 51220480-> 19067018(2.686), 220.8MB/s 686.3MB/s 3#mozilla 51220480-> 18508283(2.767), 152.2MB/s 669.4MB/s 1#mozilla 51220480-> 20117517(2.546), 283.4MB/s 697.9MB/s 5.30% 2#mozilla 51220480-> 19067018(2.686), 225.9MB/s 665MB/s 3.20% 3#mozilla 51220480-> 18508283(2.767), 154.5MB/s 640.6MB/s 4.50% 1#mr 9970564-> 3840242(2.596), 262.4MB/s 899.8MB/s 2#mr 9970564-> 3600976(2.769), 181.2MB/s 717.9MB/s 3#mr 9970564-> 3563987(2.798), 116.3MB/s 620MB/s 1#mr 9970564-> 3840242(2.596), 253.2MB/s 827.3MB/s 8.76% 2#mr 9970564-> 3600976(2.769), 177.4MB/s 655.4MB/s 9.54% 3#mr 9970564-> 3563987(2.798), 111.2MB/s 564.2MB/s 9.89% 1#nci 33553445-> 2849306(11.78), 575.2MB/s , 1335.8MB/s 2#nci 33553445-> 2890166(11.61), 509.3MB/s , 1238.1MB/s 3#nci 33553445-> 2857408(11.74), 431MB/s , 1210.7MB/s 1#nci 33553445-> 2849306(11.78), 565.4MB/s , 1220.2MB/s 9.47% 2#nci 33553445-> 2890166(11.61), 508.2MB/s , 1128.4MB/s 9.72% 3#nci 33553445-> 2857408(11.74), 429.1MB/s , 1097.7MB/s 10.29% 1#ooffice 6152192-> 3590954(1.713), 231.4MB/s , 662.6MB/s 2#ooffice 6152192-> 3323931(1.851), 162.8MB/s , 592.6MB/s 3#ooffice 6152192-> 3145625(1.956), 99.9MB/s , 549.6MB/s 1#ooffice 6152192-> 3590954(1.713), 224.7MB/s , 624.2MB/s 6.15% 2#ooffice 6152192-> 3323931 (1.851), 155MB/s , 564.5MB/s 4.98% 3#ooffice 6152192-> 3145625(1.956), 101.1MB/s , 521.2MB/s 5.45% 1#osdb 10085684-> 3739042(2.697), 271.9MB/s 876.4MB/s 2#osdb 10085684-> 3493875(2.887), 208.2MB/s 857MB/s 3#osdb 
10085684-> 3515831(2.869), 135.3MB/s 805.4MB/s 1#osdb 10085684-> 3739042(2.697), 257.4MB/s 793.8MB/s 10.41% 2#osdb 10085684-> 3493875(2.887), 209.7MB/s 776.1MB/s 10.42% 3#osdb 10085684-> 3515831(2.869), 130.6MB/s 727.7MB/s 10.68% 1#reymont 6627202-> 2152771(3.078), 198.9MB/s 696.2MB/s 2#reymont 6627202-> 2071140(3.200), 170MB/s 595.2MB/s 3#reymont 6627202-> 1953597(3.392), 128.5MB/s 609.7MB/s 1#reymont 6627202-> 2152771(3.078), 199.6MB/s 655.2MB/s 6.26% 2#reymont 6627202-> 2071140(3.200), 168.2MB/s 554.4MB/s 7.36% 3#reymont 6627202-> 1953597(3.392), 128.7MB/s 557.4MB/s 9.38% 1#samba 21606400-> 5510994(3.921), 338.1MB/s 1066MB/s 2#samba 21606400-> 5240208(4.123), 258.7MB/s 992.3MB/s 3#samba 21606400-> 5003358(4.318), 200.2MB/s 991.1MB/s 1#samba 21606400-> 5510994(3.921), 330.8MB/s 974MB/s 9.45% 2#samba 21606400-> 5240208(4.123), 257.9MB/s 919.4MB/s 7.93% 3#samba 21606400-> 5003358(4.318), 198.5MB/s 908.9MB/s 9.04% 1#sao 7251944-> 6256401(1.159), 194.6MB/s 602.2MB/s 2#sao 7251944-> 5808761(1.248), 128.2MB/s 532.1MB/s 3#sao 7251944-> 5556318(1.305), 73MB/s 509.4MB/s 1#sao 7251944-> 6256401(1.159), 198.7MB/s 580.7MB/s 3.70% 2#sao 7251944-> 5808761(1.248), 129.1MB/s 502.7MB/s 5.85% 3#sao 7251944-> 5556318(1.305), 74.6MB/s 493.1MB/s 3.31% 1#webster 41458703-> 13692222(3.028), 222.3MB/s 752MB/s 2#webster 41458703-> 12842646(3.228), 157.6MB/s 532.2MB/s 3#webster 41458703-> 12191964(3.400), 124MB/s 468.5MB/s 1#webster 41458703-> 13692222(3.028), 219.7MB/s 697MB/s 7.89% 2#webster 41458703-> 12842646(3.228), 153.9MB/s 495.4MB/s 7.43% 3#webster 41458703-> 12191964(3.400), 124.8MB/s 444.8MB/s 5.33% 1#xml 5345280-> 696652(7.673), 485MB/s , 1333.9MB/s 2#xml 5345280-> 681492(7.843), 405.2MB/s , 1237.5MB/s 3#xml 5345280-> 639057(8.364), 328.5MB/s , 1281.3MB/s 1#xml 5345280-> 696652(7.673), 473.1MB/s , 1232.4MB/s 8.24% 2#xml 5345280-> 681492(7.843), 398.6MB/s , 1145.9MB/s 7.99% 3#xml 5345280-> 639057(8.364), 327.1MB/s , 1175MB/s 9.05% 1#x-ray 8474240-> 6772557(1.251), 521.3MB/s 
762.6MB/s 2#x-ray 8474240-> 6684531(1.268), 230.5MB/s 688.5MB/s 3#x-ray 8474240-> 6166679(1.374), 68.7MB/s 478.8MB/s 1#x-ray 8474240-> 6772557(1.251), 502.8MB/s 736.7MB/s 3.52% 2#x-ray 8474240-> 6684531(1.268), 224.4MB/s 662MB/s 4.00% 3#x-ray 8474240-> 6166679(1.374), 67.3MB/s 437.8MB/s 9.37% 7.51% * makefile changed to only pass -fno-tree-vectorize to gcc * Don't add "no-tree-vectorize" attribute on clang (which defines __GNUC__) * fix for warning/error with subtraction of void* pointers * fix c90 conformance issue - ISO C90 forbids mixed declarations and code * Fix assert for negative diff, only when there is no overlap * fix overflow revealed in fuzzing tests * tweak for small speed increase	2019-07-11 18:31:07 -04:00
Yann Collet	d1327738c2	updated double_fast complementary insertion in a way which is more favorable to compression ratio, though very slightly slower (~-1%). More details in the PR.	2019-07-11 15:25:22 -07:00
Yann Collet	b01c1c679f	Merge pull request #1675 from ephiepark/dev Factor out the logic to build sequences	2019-07-10 13:32:31 -07:00
Yann Collet	b8ec4b0fd6	updated version number (to v1.4.1) also : added doc on context re-use, as suggested by @scherepanov at #1676	2019-07-09 11:43:59 -07:00
Yann Collet	096714d1b8	Merge pull request #1671 from ephiepark/dev Adding targetCBlockSize param	2019-07-03 17:47:44 -07:00
Ephraim Park	f57ac7b09e	Factor out the logic to build sequences	2019-07-03 15:42:38 -07:00
Ephraim Park	9007701670	Adding targetCBlockSize param	2019-07-03 15:41:52 -07:00
Nick Terrell	6c92ba774e	ZSTD_compressSequences_internal assert op <= oend (#1667) When we wrote one byte beyond the end of the buffer for RLE blocks back in 1.3.7, we would then have `op > oend`. That is a problem when we use `oend - op` for the size of the destination buffer, and allows further writes beyond the end of the buffer for the rest of the function. Let's assert that it doesn't happen.	2019-07-02 15:45:47 -07:00
Yann Collet	857e608b51	Merge pull request #1658 from facebook/memset memset() rather than reduceIndex()	2019-07-01 15:01:43 -07:00
Yann Collet	4d611ca405	Merge pull request #1664 from ephiepark/dev decodecorpus	2019-07-01 14:13:49 -07:00
Tyler-Tran	c55d2e7ba3	Adding shrinking flag for cover and fastcover (#1656) * Changed ERROR(GENERIC) excluding inits * editing git ignore * Edited init functions to size_t returns * moved declarations earlier * resolved issues with changes to init functions * fixed style and an error check * attempting to add tests that might trigger changes * added && die to cases expecting to fail * resolved no die on expected failed command * fixed accel to be incorrect value * Adding an automated shrinking option * Fixing build * finalizing fixes * fix? * Removing added comment in cover.h * Styling fixes * Merging with fb dev * removing magic number for default regression * Requested revisions * fixing support for fast cover * fixing casting errors * parenthesis fix * fixing some build nits * resolving travis ci syntax * might resolve all compilation issues * removed unused variable * remodeling the selectDict function * fixing bad memory access * fixing error checks * fixed erroring check in selectDict * fixing mixed declarations * modify mixed declaration * fixing nits and adding test cases * Adding requested changes + fixed bug for error checking * switched double comparison from != to < * fixed declaration typing * refactoring COVER_best_finish() and changing shrinkDict * removing the const's * modifying ZDICT_optimizeTrainFromBuffer_cover functions * fixing potential bad memcpy * fixing the error function for dict size	2019-06-27 16:26:57 -07:00
Ephraim Park	c7c1ba3a19	Fix a constraint stricter than the spec	2019-06-26 16:43:37 -07:00
Yann Collet	621adde3b2	changed naming to ZSTD_indexTooCloseToMax() Also : minor speed optimization : shortcut to ZSTD_reset_matchState() rather than the full reset process. It still needs to be completed with ZSTD_continueCCtx() for proper initialization. Also : changed position of LDM hash tables in the context, so that the "regular" hash tables can be at a predictable position, hence allowing the shortcut to ZSTD_reset_matchState() without complex conditions.	2019-06-24 14:39:29 -07:00
Yann Collet	45c9fbd6d9	prefer memset() rather than reduceIndex() when close to index range limit by disabling continue mode when index is close to limit.	2019-06-21 16:19:21 -07:00
Yann Collet	944e2e9e12	benchfn : added macro CONTROL() like assert() but cannot be disabled. proper separation of user contract errors (CONTROL()) and invariant verification (assert()).	2019-06-21 15:58:55 -07:00
Nick Terrell	674534a700	[zstd] Fix data corruption in niche use case * Extract the overflow correction into a helper function. * Load the dictionary `ZSTD_CHUNKSIZE_MAX = 512 MB` bytes at a time and overflow correct between each chunk. Data corruption could happen when all these conditions are true: * You are using multithreading mode * Your overlap size is >= 512 MB (implies window size >= 512 MB) * You are using a strategy >= ZSTD_btlazy * You are compressing more than 4 GB The problem is that when loading a large dictionary we don't do overflow correction. We can only load 512 MB at a time, and may need to do overflow correction before each chunk.	2019-06-21 15:47:31 -07:00
Nick Terrell	4156060ca4	[zstdmt] Update assert to use ZSTD_WINDOWLOG_MAX	2019-06-21 15:39:33 -07:00
Nick Terrell	95e2b430ea	[opt] Add asserts for corruption in ZSTD_updateTree()	2019-06-21 15:22:29 -07:00
Yann Collet	9af909bf35	Merge pull request #1624 from facebook/smallwlog Improves compression ratio for small windowLog	2019-06-14 17:28:21 -07:00
Nick Terrell	cdb9481e38	[libzstd] Optimize ZSTD_insertBt1() for repetitive data

We would only skip at most 192 bytes at a time before this diff. This was added to optimize long matches and skip the middle of the match. However, it doesn't handle the case of repetitive data. This patch keeps the optimization, but also handles repetitive data by taking the max of the two return values.

```
> for n in $(seq 9); do echo strategy=$n; dd status=none if=/dev/zero bs=1024k count=1000 | command time -f %U ./zstd --zstd=strategy=$n >/dev/null; done
strategy=1
0.27
strategy=2
0.23
strategy=3
0.27
strategy=4
0.43
strategy=5
0.56
strategy=6
0.43
strategy=7
0.34
strategy=8
0.34
strategy=9
0.35
```

At level 19 with multithreading the compressed size of `silesia.tar` regresses 300 bytes, and `enwik8` regresses 100 bytes. In single threaded mode `enwik8` is also within 100 bytes, and I didn't test `silesia.tar`. Fixes Issue #1634.	2019-06-05 20:34:00 -07:00
Yann Collet	b3af1873a0	better title formatting for html documentation must pay attention to /** and /*! patterns.	2019-06-04 10:35:40 -07:00
Yann Collet	b5c98fbfd0	Added comments on I/O buffer sizes for streaming It seems this is still a confusing topic, as in https://github.com/klauspost/compress/issues/109 .	2019-06-04 10:26:16 -07:00
Yann Collet	80d6ccea79	removed UINT32_MAX apparently not guaranteed on all platforms, replaced by UINT_MAX.	2019-05-31 17:27:07 -07:00
Yann Collet	fce4df3ab7	fixed wrong assert in double_fast	2019-05-31 17:06:28 -07:00
Yann Collet	a968099038	minor code cleaning for new index invalidation strategy	2019-05-31 16:52:37 -07:00
Yann Collet	d605f482c7	make double_fast compatible with new index invalidation strategy	2019-05-31 16:50:04 -07:00
Yann Collet	a30febaeeb	Made fast strategy compatible with new offset validation strategy fast mode does the same thing as before : it pre-emptively invalidates any index that could lead to offset > maxDistance. It's supposed to help speed. But this logic is performed inside zstd_fast, so that other strategies can select a different behavior.	2019-05-31 16:34:55 -07:00
Yann Collet	58adb1059f	extended exact window size to greedy/lazy modes	2019-05-31 16:08:48 -07:00
Yann Collet	bc601bdc6d	first implementation of small window size for btopt

noticeably improves compression ratio when window size is small (< 18).

enwik7 level 19

| windowLog | `dev` | `smallwlog` | improvement |
|-----------|-------|-------------|-------------|
| 23        | 3.577 | 3.577       | 0.02%       |
| 22        | 3.536 | 3.538       | 0.06%       |
| 21        | 3.462 | 3.467       | 0.14%       |
| 20        | 3.364 | 3.377       | 0.39%       |
| 19        | 3.244 | 3.272       | 0.86%       |
| 18        | 3.110 | 3.166       | 1.80%       |
| 17        | 2.843 | 3.057       | 7.53%       |
| 16        | 2.724 | 2.943       | 8.04%       |
| 15        | 2.594 | 2.822       | 8.79%       |
| 14        | 2.456 | 2.686       | 9.36%       |
| 13        | 2.312 | 2.523       | 9.13%       |
| 12        | 2.162 | 2.361       | 9.20%       |
| 11        | 2.003 | 2.182       | 8.94%       |
	2019-05-31 15:55:12 -07:00
Yann Collet	b13a9207f9	Merge pull request #1623 from facebook/fullbench fullbench minor improvements	2019-05-31 14:40:19 -07:00
Yann Collet	ed38b645db	fullbench: pass proper parameters in scenario 43	2019-05-29 15:26:06 -07:00
Yann Collet	9719fd616c	removed nextToUpdate3 from ZSTD_window it's now a local variable of ZSTD_compressBlock_opt()	2019-05-28 16:18:12 -07:00
Yann Collet	33dabc8c80	get bt matches : made it a bit clearer which parameters are input and output	2019-05-28 16:11:32 -07:00
Yann Collet	327cf6fac1	nextToUpdate3 does not need to be maintained outside of zstd_opt.c It's re-synchronized with nextToUpdate at beginning of each block. It only needs to be tracked from within zstd_opt block parser. Made the logic clear, so that no code tried to maintain this variable. An even better solution would be to make nextToUpdate3 an internal variable of ZSTD_compressBlock_opt_generic(). That would make it possible to remove it from ZSTD_matchState_t, thus restricting its visibility to only where it's actually useful. This would require deeper changes though, since the matchState is the natural structure to transport parameters into and inside the parser.	2019-05-28 15:26:52 -07:00
Yann Collet	6453f8158f	complementary code comments on variables used / impacted during maxDist check	2019-05-28 14:12:16 -07:00
Yann Collet	4baecdf72a	added comments to better understand enforceMaxDist()	2019-05-28 13:15:48 -07:00
Tyler-Tran	cb47871a0a	[dictBuilder] Be more specific than ERROR(generic) (#1616) * Specify errors at a finer granularity than `ERROR(generic)`. * Add tests for bad parameters in the dictionary builder.	2019-05-22 18:57:50 -07:00
Nick Terrell	5f228f8db2	[libzstd] Add a ZSTD_STATIC_ASSERT for BIT_DStream_status	2019-04-23 14:22:16 -07:00
Nick Terrell	a892e25374	[libzstd] Error if all sequence bits aren't consumed	2019-04-23 14:07:36 -07:00
Nick Terrell	0fd322f812	[legacy] Fix ZSTDv0_decodeSequence() Version <= 0.5 could read beyond the end of `dumps`, which points into the input buffer. * Check the validity of `dumps` before using it, if it is out of bounds return garbage values. There is no return code for this function. * Introduce `MEM_readLE24()` for simplicity, since I don't want to trust that there is an extra byte after `dumps`.	2019-04-19 11:34:52 -07:00
Nick Terrell	2536771134	[legacy] Fix Huffman jump table reads in v01 and v05	2019-04-18 16:20:42 -07:00
Nick Terrell	579f3d7794	[legacy] Fix bug in ZSTD_decodeSeqHeaders()	2019-04-18 13:41:10 -07:00
Nick Terrell	ac098c7f5f	[legacy] Fix a bug in ZSTDv06_findFrameSizeInfoLegacy()	2019-04-18 13:33:26 -07:00
Nick Terrell	ee130a9889	[libzstd] Check the size in readSkippableFrameSize()	2019-04-17 11:41:55 -07:00
Nick Terrell	5922f4e2ae	[legacy] Return the right error code	2019-04-17 11:34:52 -07:00
Nick Terrell	450feb0f95	[libzstd] Fix ZSTD_decompressBound() on bad skippable frames The function didn't verify that the skippable frame size is correct.	2019-04-17 11:29:42 -07:00
Nick Terrell	a17fe4c9e5	[visual] Fix unreachable code warning	2019-04-16 11:32:35 -07:00
Nick Terrell	de0499f7fa	[libzstd] Require ZSTD_MULTITHREAD to create a ZSTDMT_CCtx ZSTDMT was broken when compiled without ZSTD_MULTITHREAD defined, because `ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, nbWorkerss)` failed. It was detected by the MSVC test which runs the fuzzer with multithreading disabled. This is a very niche use case of a deprecated API, because the API is inefficient and synchronous, since `threading.h` will be synchronous. Users almost certainly don't want this, and anyone who tested their code should realize that it is broken. Therefore, I think it is safe to require `ZSTD_MULTITHREAD` to be defined to use ZSTDMT.	2019-04-15 23:04:46 -07:00
Josh Soref	a880ca239b	Spelling (#1582) * spelling: accidentally * spelling: across * spelling: additionally * spelling: addresses * spelling: appropriate * spelling: assumed * spelling: available * spelling: builder * spelling: capacity * spelling: compiler * spelling: compressibility * spelling: compressor * spelling: compression * spelling: contract * spelling: convenience * spelling: decompress * spelling: description * spelling: deflate * spelling: deterministically * spelling: dictionary * spelling: display * spelling: eliminate * spelling: preemptively * spelling: exclude * spelling: failure * spelling: independence * spelling: independent * spelling: intentionally * spelling: matching * spelling: maximum * spelling: meaning * spelling: mishandled * spelling: memory * spelling: occasionally * spelling: occurrence * spelling: official * spelling: offsets * spelling: original * spelling: output * spelling: overflow * spelling: overridden * spelling: parameter * spelling: performance * spelling: probability * spelling: receives * spelling: redundant * spelling: recompression * spelling: resources * spelling: sanity * spelling: segment * spelling: series * spelling: specified * spelling: specify * spelling: subtracted * spelling: successful * spelling: return * spelling: translation * spelling: update * spelling: unrelated * spelling: useless * spelling: variables * spelling: variety * spelling: verbatim * spelling: verification * spelling: visited * spelling: warming * spelling: workers * spelling: with	2019-04-12 11:18:11 -07:00
Nick Terrell	aafe97b67d	[libzstd] Switch dictUses to an enum	2019-04-10 16:50:35 -07:00
Nick Terrell	50b9c41196	[libzstd] Fix decompression dictionary bugs and clean up initialization Bugs: * `ZSTD_DCtx_refPrefix()` didn't clear the dictionary after the first use. Fix and add a test case. * `ZSTD_DCtx_reset()` always cleared the dictionary. Fix and add a test case. * After calling `ZSTD_resetDStream()` you could no longer load a dictionary, since the stage was set to `zdss_loadHeader`. Fix and add a test case. Cleanup: * Make `ZSTD_initDStream()` and `ZSTD_resetDStream()` wrap the new advanced API, and add test cases. Document the equivalent of these functions in the advanced API and document the unstable functions as deprecated.	2019-04-10 12:59:02 -07:00
Nick Terrell	824aaa695f	[libzstd] Fix ZSTD_decompressDCtx() with a dictionary * `ZSTD_decompressDCtx()` did not use the dictionary loaded by `ZSTD_DCtx_loadDictionary()`. * Add a unit test. * A stacked diff uses `ZSTD_decompressDCtx()` in the `dictionary_round_trip` and `dictionary_decompress` fuzzers.	2019-04-09 17:59:27 -07:00
Nick Terrell	48a6427d22	[libzstd] Fix ZSTD_compress2() for multithreaded compression `ZSTD_compress2()` wouldn't wait for multithreaded compression to finish. We didn't find this because ZSTDMT will block when it can compress all in one go, but it can't do that if it doesn't have enough output space, or if `ZSTD_c_rsyncable` is enabled. Since we will already sometimes block when using `ZSTD_e_end`, I've changed `ZSTD_e_end` and `ZSTD_e_flush` to guarantee maximum forward progress. This simplifies the API, and helps users avoid the easy bug that was made in `ZSTD_compress2()` * Found by the libfuzzer fuzzers. * Added a test case that catches the problem. * I will make the fuzzers sometimes allocate less than `ZSTD_compressBound()` output space.	2019-04-09 16:24:17 -07:00
Nick Terrell	e649fad7aa	[dictBuilder] Fix displayLevel for corpus warning Pass the displaylevel into the corpus warning, because it is used in fast cover and cover, so it needs to respect the local level.	2019-04-08 20:00:18 -07:00
Nick Terrell	bfcd5b81d7	[libzstd] Don't check the dictID in fuzzing mode When `FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION` is defined don't check the dictID. This check makes the fuzzers job harder, and it is at the very beginning.	2019-04-08 19:57:41 -07:00
Nick Terrell	947548c24f	Remove double the from README	2019-04-08 16:50:18 -07:00
Nick Terrell	641e594309	[libzstd] Remove ZSTDMT from the shared object * Remove ZSTDMT from the shared object by default. * Provide a macro `ZSTD_LEGACY_MULTITHREADED_API` to override it. * Document it in `lib/README.md`.	2019-04-07 18:47:52 -07:00
Nick Terrell	1dfe37fea9	[libzstd] Stabilize ZSTD_getDictID_*() functions	2019-04-05 18:59:30 -07:00
Nick Terrell	ce388fe4d2	[libzstd] Fix return value docs for ZSTD_compressStream2()	2019-04-05 17:44:07 -07:00
Nick Terrell	7231ea72a8	[libzstd] Reword the streaming docs for the new API	2019-04-03 19:21:05 -07:00
Nick Terrell	cf7d601bf5	Move the dictionary API and mark the legacy API * Move the dictionary API below the streaming API * Mark the legacy streaming API as redundant	2019-04-03 19:16:40 -07:00
Nick Terrell	d7d89513d6	Stabilize advanced API This commit moves the candidate advanced API to the stable section. It makes some minor whitespace changes, but it doesn't change any of the wording of the documentation. I'll put up a separate PR that tweaks some of the documentation once this lands, so that it is easier to review. NOTE: Even though these functions are now in stable, they aren't stable until the next release (in under 1 month). It is possible that they change until then.	2019-04-03 18:43:20 -07:00
Nick Terrell	0827edeace	[libzstd] Bump the library version to 1.4.0 Bumps the library version to 1.4.0 in preparation to stabilize the advanced API.	2019-04-03 18:43:20 -07:00
Nick Terrell	72a3fbc0e4	Merge pull request #1562 from terrelln/2fast [libzstd] Speed up single segment zstd_fast by 5%	2019-04-03 18:08:15 -07:00
Nick Terrell	00679da22b	[libzstd] Setting ZSTD_d_maxWindowLog to 0 means default	2019-04-02 19:20:52 -07:00
Nick Terrell	95624b77e4	[libzstd] Speed up single segment zstd_fast by 5%

This PR is based on top of PR #1563. The optimization is to process two input pointers per loop. It is based on ideas from [igzip] level 1, and talking to @gbtucker.

| Platform                | Silesia     | Enwik8 |
|-------------------------|-------------|--------|
| OSX clang-10            | +5.3%       | +5.4%  |
| i9 5 GHz gcc-8          | +6.6%       | +6.6%  |
| i9 5 GHz clang-7        | +8.0%       | +8.0%  |
| Skylake 2.4 GHz gcc-4.8 | +6.3%       | +7.9%  |
| Skylake 2.4 GHz clang-7 | +6.2%       | +7.5%  |

Testing on all Silesia files on my Intel i9-9900k with gcc-8

| Silesia File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| silesia.tar  | +0.17%       | +6.6%        |
| dickens      | +0.25%       | +7.0%        |
| mozilla      | +0.02%       | +6.8%        |
| mr           | -0.30%       | +10.9%       |
| nci          | +1.28%       | +4.5%        |
| ooffice      | -0.35%       | +10.7%       |
| osdb         | +0.75%       | +9.8%        |
| reymont      | +0.65%       | +4.6%        |
| samba        | +0.70%       | +5.9%        |
| sao          | -0.01%       | +14.0%       |
| webster      | +0.30%       | +5.5%        |
| xml          | +0.92%       | +5.3%        |
| x-ray        | -0.00%       | +1.4%        |

Same tests on Calgary. For brevity, I've only included files where compression ratio regressed or was much better.

| Calgary File | Ratio Change | Speed Change |
|--------------|--------------|--------------|
| calgary.tar  | +0.30%       | +7.1%        |
| geo          | -0.14%       | +25.0%       |
| obj1         | -0.46%       | +15.2%       |
| obj2         | -0.18%       | +6.0%        |
| pic          | +1.80%       | +9.3%        |
| trans        | -0.35%       | +5.5%        |

We gain 0.1% of compression ratio on Silesia. We gain 0.3% of compression ratio on enwik8. I also tested on the GitHub and hg-commands datasets without a dictionary, and we gain a small amount of compression ratio on each, as well as speed.

I tested the negative compression levels on Silesia on my Intel i9-9900k with gcc-8:

| Level | Ratio Change | Speed Change |
|-------|--------------|--------------|
| -1    | +0.13%       | +6.4%        |
| -2    | +4.6%        | -1.5%        |
| -3    | +7.5%        | -4.8%        |
| -4    | +8.5%        | -6.9%        |
| -5    | +9.1%        | -9.1%        |

Roughly, the negative levels now scale half as quickly. E.g. the new level 16 is roughly equivalent to the old level 8, but a bit quicker and smaller. If you don't think this is the right trade off, we can change it to multiply the step size by 2, instead of adding 1. I think this makes sense, because it gives a bit slower ratio decay.

[igzip]: https://github.com/01org/isa-l/tree/master/igzip	2019-04-02 19:02:50 -07:00
Nick Terrell	56682a7709	Fix ZSTD_estimateCStreamSize_usingCCtxParams() It wasn't using the ZSTD_CCtx_params correctly. It must actualize the compression parameters by calling ZSTD_getCParamsFromCCtxParams() to get the real window log. Tested by updating the streaming memory usage example in the next commit. The CHECK() failed before this patch, and passes after. I also added a unit test to zstreamtest.c that failed before this patch, and passes after.	2019-04-01 18:02:52 -07:00
Nick Terrell	425ce5547c	Merge pull request #1563 from terrelln/dms-sep [libzstd] Split out zstd_fast dict match state function	2019-03-29 16:19:21 -06:00
Nick Terrell	f00407b640	Split out zstd_fast dict match state function	2019-03-29 10:39:16 -06:00
shakeelrao	dca73db30c	fix srcSize typo and add new UTIL func to comment	2019-03-28 17:50:34 -07:00
Nick Terrell	d0f5ba36fb	[cover] Improvements for small or homogeneous data * The algorithm would bail as soon as it found one epoch that contained no new segments. Change it so it now has to fail >= 10 times in a row (10 for fastcover, 10-100 for cover). * The algorithm uses the `maxDict` size to decide the epoch size. When this size is absurdly large, it causes tiny epochs. Lower bound the epoch size at 10x the segment size, and warn the user that their training set is too small. Fixes #1554	2019-03-22 14:14:46 -07:00
Nick Terrell	6b053b9f60	[lib] Allow ZSTD_CCtx_loadDictionary() to be called before parameters are set * After loading a dictionary only create the cdict once we've started the compression job. This allows the user to pass the dictionary before they set other settings, and is in line with the rest of the API. * Add tests that mix the 3 dictionary loading APIs. * Add extra tests for `ZSTD_CCtx_loadDictionary()`. * The first 2 tests added fail before this patch. * Run the regression test suite.	2019-03-21 16:13:53 -07:00
Nick Terrell	20f9ff7e53	Update documentation to tell how to replace the old streaming API with the new one.	2019-03-21 16:08:58 -07:00
Nick Terrell	e55da9e963	Wrap the new advanced api completely	2019-03-21 10:54:40 -07:00
shakeelrao	186ded6d91	Fix typo in legacy documentation	2019-03-19 01:44:08 -07:00
shakeelrao	5740eb6769	Remove extraneous spacing in comments	2019-03-18 21:05:35 -07:00
shakeelrao	0a3fa6f909	Add legacy mode in documentation	2019-03-18 20:33:15 -07:00
shakeelrao	20aa1b455c	Stylistic changes	2019-03-17 19:35:43 -07:00
shakeelrao	0033bb4785	Update documentation for ZSTD_frameSizeInfo	2019-03-17 17:41:27 -07:00
shakeelrao	19b75b6ecb	Test new ZSTD_findFrameCompressedSize and update documentation	2019-03-15 18:04:19 -07:00
shakeelrao	8cd423a659	Reorder declaration in ZSTD_findFrameSizeInfoLegacy	2019-03-15 16:20:34 -07:00
shakeelrao	60796e76b0	Add legacy support to decompressBound	2019-03-15 16:10:37 -07:00
Nick Terrell	f52a7d8faa	Merge pull request #1547 from shakeelrao/fix-error Fix incorrect error code in ZSTD_errorFrameSizeInfo	2019-03-15 10:57:49 -07:00
Nick Terrell	787b76904a	[libzstd] Allow compression parameters to be set with a cdict The order you set parameters in the advanced API is not supposed to matter. However, once you call `ZSTD_CCtx_refCDict()` the compression parameters cannot be changed. Remove that restriction, and document what parameters are used when using a CDict. If the CCtx is in dictionary mode, then the CDict's parameters are used. If the CCtx is not in dictionary mode, then its requested parameters are used.	2019-03-13 16:10:05 -07:00
Nick Terrell	0594e8135b	[libzstd] Free local cdict when referencing cdict We no longer care about the `cdictLocal` after calling `ZSTD_CCtx_refCDict()`, so we should free it to save some memory.	2019-03-13 14:54:31 -07:00
shakeelrao	79827a179f	Fix incorrectly assigned value in ZSTD_errorFrameSizeInfo As documented in `zstd.h`, ZSTD_decompressBound returns `ZSTD_CONTENTSIZE_ERROR` if an error occurs (not `ZSTD_CONTENTSIZE_UNKNOWN`). This is consistent with the error checking made in ZSTD_decompressBound, particularly line 545.	2019-03-13 01:23:07 -07:00
shakeelrao	9ad3f31d33	update documentation for decompressBound	2019-03-02 17:56:10 -08:00
shakeelrao	95dfd48143	update formatting	2019-03-01 23:11:15 -08:00
shakeelrao	1e08c49f75	add stylistic changes	2019-03-01 18:29:35 -08:00
shakeelrao	2bb5eec711	update missing error case to CONTENTSIZE_ERROR	2019-03-01 00:12:16 -08:00
shakeelrao	44ae395b3e	change nbBlocks to size_t for consistency	2019-03-01 00:05:59 -08:00
shakeelrao	03026c3b1d	change compressedBound to ULL	2019-03-01 00:03:50 -08:00
shakeelrao	8930c3c79b	implement API-level changes	2019-02-28 22:55:18 -08:00
shakeelrao	dce9a09772	initialize local vars in decompressBound	2019-02-28 03:01:21 -08:00
shakeelrao	515c506b4c	switch frameBound type to ULL	2019-02-28 02:10:17 -08:00
shakeelrao	d0a3f25697	change return type to ULL	2019-02-28 01:52:01 -08:00
shakeelrao	c9d674b60d	Remove autogenerated test file	2019-02-28 01:29:04 -08:00
shakeelrao	97d3d28dab	Fix decl-after-stmnt build error	2019-02-28 01:24:54 -08:00
shakeelrao	820af1e078	Provide an API function to estimate decompressed size. Introduces a new utility function `ZSTD_findFrameCompressedSize_internal` which is equivalent to `ZSTD_findFrameCompressedSize`, but accepts an additional output parameter `bound` that computes an upper-bound for the decompressed data in the frame. The new API function is named `ZSTD_decompressBound` to be consistent with `ZSTD_compressBound` (the inverse operation). Clients will now be able to compute an upper-bound for their decompressed payloads instead of guessing a large size. Implements https://github.com/facebook/zstd/issues/1536.	2019-02-28 00:42:49 -08:00
Nick Terrell	be3bd70c57	Merge pull request #1532 from terrelln/cctx-params [libzstd] Rename ZSTD_CCtxParam_* to ZSTD_CCtxParams_*	2019-02-20 10:46:46 -08:00
Nick Terrell	7ad7ba3178	[libzstd] Rename ZSTD_CCtxParam_* to ZSTD_CCtxParams_*	2019-02-19 17:44:52 -08:00
Nick Terrell	9f9630f455	[Windows] Don't use a .def file	2019-02-19 16:52:38 -08:00
Nick Terrell	0c86d23467	[Windows] Move public headers to include/	2019-02-19 15:49:48 -08:00
Nick Terrell	f4abba02ba	[libzstd] Clean up parameter code * Move all ZSTDMT parameter setting code to ZSTD_CCtxParams_Parameter(). ZSTDMT now calls these functions, so we can keep all the logic in the same place. Clean up `ZSTD_CCtx_setParameter()` to only add extra checks where needed. * Clean up `ZSTDMT_initJobCCtxParams()` by copying all parameters by default, and then zeroing the ones that need to be zeroed. We've missed adding several parameters here, and it makes more sense to only have to update it if you change something in ZSTDMT. * Add `ZSTDMT_cParam_clampBounds()` to clamp a parameter into its valid range. Use this to keep backwards compatibility when setting ZSTDMT parameters, which clamp into the valid range.	2019-02-19 13:22:37 -08:00
Nick Terrell	3d7377b874	[libzstd] Handle uncompressed literals	2019-02-15 14:58:11 -08:00
Nick Terrell	f9513115e4	[libzstd] Add ZSTD_c_literalCompressionMode flag It controls the literals compression. It is either `auto`, `huffman`, or `uncompressed`. It defaults to `auto`, which is the current behavior.	2019-02-13 14:59:22 -08:00
Nick Terrell	197a5737c8	Merge pull request #1516 from terrelln/dict-doc [zdict] Improve documentation	2019-02-01 19:04:05 -05:00
Nick Terrell	21616d8a77	[zdict] Improve documentation	2019-02-01 15:19:32 -08:00
Peter (Stig) Edwards	894bbda44c	-Wformat-security not needed with -Wformat=2	2019-02-01 09:31:02 +00:00
W. Felix Handte	501eb25102	Rename FORWARD_ERROR -> FORWARD_IF_ERROR	2019-01-29 12:56:07 -05:00
W. Felix Handte	429987c9a6	Add Comment	2019-01-28 17:35:31 -05:00
W. Felix Handte	2179ce00e1	Remove CHECK_E Macro	2019-01-28 17:33:13 -05:00
W. Felix Handte	03e040a966	Replace Uses of CHECK_E with RETURN_ERROR_IF(*_isError(...	2019-01-28 17:33:01 -05:00
W. Felix Handte	7ebd897157	Remove CHECK_F Macro	2019-01-28 17:16:32 -05:00
W. Felix Handte	64bb6640f2	Replace CHECK_F Uses in zstdmt_compress.c and zstd_ddict.c	2019-01-28 17:15:57 -05:00
W. Felix Handte	cafc3b1bcb	Also Convert zstd_compress.c	2019-01-28 17:05:18 -05:00
W. Felix Handte	324e9654d3	Add grep-able String to Error Macros	2019-01-28 12:50:36 -05:00
W. Felix Handte	32fed9c7be	Switch CHECK_F Calls to FORWARD_ERROR	2019-01-28 12:45:34 -05:00
W. Felix Handte	800c87fed0	Switch Unconditional RETURN_ERROR_IF Calls to RETURN_ERROR	2019-01-28 12:45:34 -05:00
W. Felix Handte	a3538bbc6f	Add RETURN_ERROR and FORWARD_ERROR Macros	2019-01-28 12:45:26 -05:00
W. Felix Handte	c823237d7b	Convert Checks in zstd_decompress.c to RETURN_ERROR_IF	2019-01-28 12:23:14 -05:00
W. Felix Handte	ea031f4ea2	Convert Checks in zstd_decompress_block.c to RETURN_ERROR_IF	2019-01-28 11:56:39 -05:00
W. Felix Handte	54fa31f03b	Add RETURN_ERROR_IF Macro That Logs Debug Information When Check Fails	2019-01-28 11:43:33 -05:00
Yann Collet	f9e4f89252	improved comments for adjustCParams() and getCParams()	2019-01-02 12:18:40 -08:00
Yann Collet	0fb4b21d1a	updated libzstd documentation	2018-12-25 03:10:07 -08:00
Yann Collet	e980ba212f	Merge pull request #1471 from facebook/nofloat guard functions using floating point for debug mode only	2018-12-23 12:35:51 -08:00
Yann Collet	aae5bc538a	Merge pull request #1470 from facebook/U32 fix confusion between unsigned <-> U32	2018-12-23 12:35:39 -08:00
Yann Collet	c9dfb7e445	guard functions using floating point for debug mode only they are only used to print debug messages. Requested in #1386,	2018-12-22 09:09:40 -08:00
Yann Collet	ededcfca57	fix confusion between unsigned <-> U32 as suggested in #1441. generally U32 and unsigned are the same thing, except when they are not ... case : 32-bit compilation for MIPS (uint32_t == unsigned long) A vast majority of transformation consists in transforming U32 into unsigned. In rare cases, it's the other way around (typically for internal code, such as seeds). Among a few issues this patch solves : - some parameters were declared with type `unsigned` in .h, but with type `U32` in their implementation .c . - some parameters have type unsigned*, but the caller uses a pointer to U32 instead. These fixes are useful. However, the bulk of changes is about %u formatting, which requires unsigned type, but generally receives U32 values instead, often just for brevity (U32 is shorter than unsigned). These changes are generally minor, or even annoying. As a consequence, the amount of code changed is larger than I would expect for such a patch. Testing is also a pain : it requires manually modifying `mem.h`, in order to lie about `U32` and force it to be an `unsigned long` typically. On a 64-bit system, this will break the equivalence unsigned == U32. Unfortunately, it will also break a few static_assert(), controlling structure sizes. So it also requires modifying `debug.h` to make `static_assert()` a noop. And then reverting these changes. So it's inconvenient, and as a consequence, this property is currently not checked during CI tests. Therefore, these problems can emerge again in the future. I wonder if it is worth ensuring proper distinction of U32 != unsigned in CI tests. It's another restriction for coding, adding more frustration during merge tests, since most platforms don't need this distinction (hence contributor will not see it), and while this can matter in theory, the number of platforms impacted seems minimal. Thoughts ?	2018-12-21 18:09:41 -08:00
Yann Collet	c8d1fda982	update aarch64 test to xenial in an attempt to circumvent the `ld` bug	2018-12-21 15:08:48 -08:00
Yann Collet	8f35c7f94c	Merge pull request #1466 from facebook/noDictPresent fixed : better error message	2018-12-20 19:01:27 -08:00
Yann Collet	41b45b84a1	Merge pull request #1465 from facebook/noFilePresent fixed : detection of non-existing file	2018-12-20 17:21:04 -08:00
Yann Collet	ed2fb6bd57	fixed : better error message when dictionary missing during benchmark. Also : refactored ZSTD_fillHashTable(), just for readability (it does the same thing)	2018-12-20 17:20:07 -08:00
Yann Collet	e4ae24c229	Merge pull request #1420 from felixhandte/zstd-decompress-minimal Various Macros to Allow Building Extremely Minimal Decoder Library	2018-12-20 15:17:37 -08:00
Yann Collet	95784c654c	fixed shadowing of stat variable some standard lib declares a `stat` variable at global scope shadowing local declarations ....	2018-12-20 14:56:44 -08:00
Yann Collet	ffba142406	fixed file identity detection in 32-bit mode also : some library decided to use `index` as a global variable declared in standard header shadowing the ones used in fastcover.c :(	2018-12-20 14:30:30 -08:00
W. Felix Handte	91b7309115	Mask Off Unused Functions When ZSTD_FORCE_DECOMPRESS_SEQUENCES_LONG	2018-12-20 12:20:34 -08:00
W. Felix Handte	038aabde28	Mask Off Unused Functions When ZSTD_FORCE_DECOMPRESS_SEQUENCES_SHORT	2018-12-20 12:15:07 -08:00
Yann Collet	2898afab52	fixed OSSfuzz 11849 The problem was already masked, due to no longer accepting tiny blocks for statistics. But in case it could still happen with not-so-tiny blocks, there is a stricter control which ensures that nothing was already loaded prior to statistics collection.	2018-12-19 16:54:15 -08:00
W. Felix Handte	8e61ac8161	Use Unused Variable in ERR_getErrorString()	2018-12-19 12:36:10 -08:00
Yann Collet	8e0e495ce8	fixed: compression ratio discrepancy depending on initialization, the first byte of a new frame was invalidated or not. As a consequence, one match opportunity was available or not, resulting in slightly different compressed sizes (on average, 1 or 2 bytes once every 20 frames). It impacted ratio comparison between one-shot and streaming modes. This fix makes the first byte of a new frame always a valid match. Now compressed size is always the same. It also improves compressed size by a negligible amount.	2018-12-19 10:11:06 -08:00
Yann Collet	d0e15f8d32	Merge pull request #1458 from terrelln/estimate [libzstd] Fix estimate with negative levels	2018-12-18 15:12:21 -08:00
Yann Collet	04baecaeed	Merge pull request #1457 from facebook/btultra2.1 btultra2 and very small input	2018-12-18 14:46:55 -08:00
Nick Terrell	d7def456d8	[libzstd] Fix estimate with negative levels * Fix `ZSTD_estimateCCtxSize()` with negative levels. * Fix `ZSTD_estimateCStreamSize()` with negative levels. * Add a unit test to test for this error.	2018-12-18 14:24:49 -08:00
Yann Collet	ef984e7307	fix debug levels as reported by @terrelln. 2 is reserved for temporary usage only.	2018-12-18 13:40:07 -08:00
W. Felix Handte	0d606ee3db	Fix Incorrect assert()	2018-12-18 13:36:39 -08:00
W. Felix Handte	bd4afc389f	Add Logic to Makefile to Convert Make Vars to Defines	2018-12-18 13:36:39 -08:00
W. Felix Handte	ece2c18372	Document Macros in README	2018-12-18 13:36:39 -08:00
W. Felix Handte	c2d51637d9	Add Mutual-Exclusion Error	2018-12-18 13:36:39 -08:00
W. Felix Handte	c560e34c86	Add HUF_FORCE_DECOMPRESS_X2	2018-12-18 13:36:39 -08:00
W. Felix Handte	abd1567d3c	Move HUF_DGEN Up Out of X1 Definitions	2018-12-18 13:36:39 -08:00
W. Felix Handte	4a0572b215	Refactor Huffman Decompression Away From Ternary Tree in ZSTD_decodeLiteralsBlock	2018-12-18 13:36:39 -08:00
W. Felix Handte	432314b58a	Rename HUF_DECOMPRESS_MINIMAL -> HUF_FORCE_DECOMPRESS_X1	2018-12-18 13:36:39 -08:00
W. Felix Handte	4bbb8a48ad	Add ZSTD_FORCE_DECOMPRESS_SEQUENCES_LONG This macro forces behavior in the opposite direction.	2018-12-18 13:36:39 -08:00
W. Felix Handte	64553a0e35	Rename ZSTD_DECOMPRESS_MINIMAL -> ZSTD_FORCE_DECOMPRESS_SEQUENCES_SHORT	2018-12-18 13:36:39 -08:00
W. Felix Handte	605dd576ee	Remove Error Strings with ZSTD_STRIP_ERROR_STRINGS	2018-12-18 13:36:39 -08:00
W. Felix Handte	9d5f3963ff	Add Option to Not Request Inlining with ZSTD_NO_INLINE	2018-12-18 13:36:39 -08:00
W. Felix Handte	df28e5babd	Add ZSTD_DECOMPRESS_MINIMAL Macro, Which Reduces Branching of Decompress Variants	2018-12-18 13:36:39 -08:00
W. Felix Handte	f45c9df42e	Totally Hide/Disable X2 Variants when HUF_DECOMPRESS_MINIMAL is Defined	2018-12-18 13:36:39 -08:00
W. Felix Handte	36a84b07a8	Load Dictionaries as X1 Tables	2018-12-18 13:36:39 -08:00
W. Felix Handte	f9cb348776	Add HUF_DECOMPRESS_MINIMAL Macro, Which Avoids Using X2 Variants	2018-12-18 13:36:39 -08:00
Yann Collet	635783da12	btultra2 and very small srcSize When srcSize is small, the nb of symbols produced is likely too small to warrant dedicated probability tables. In which case, predefined distribution tables will be used instead. There is a cheap algorithm in btultra initialization : it presumes default distribution will be used if srcSize <= 1024. btultra2 now uses the same threshold to shut down probability estimation, since measured frequencies won't be used at entropy stage, and therefore relying on them to determine sequence cost is misleading, resulting in worse compression ratios. This fixes btultra2 performance issue on very small input. Note that, a proper way should be to determine which symbol is going to use predefined probability and which symbol is going to use dynamic ones. But the current algorithm is unable to make a "per-symbol" decision. So this will require significant modifications.	2018-12-18 12:32:58 -08:00
Yann Collet	517d8c984c	Merge pull request #1449 from facebook/ovlog_def overlapLog default values	2018-12-18 09:45:53 -08:00
Yann Collet	373ff8b983	play around with rescale weights	2018-12-17 15:48:34 -08:00
Yann Collet	8be145a8c1	fixed default job size	2018-12-13 16:38:08 -08:00
Nick Terrell	75fa3f2eb7	Merge pull request #1446 from terrelln/overflow [libzstd] Fix infinite loop in decompression	2018-12-13 16:21:15 -08:00
Yann Collet	62180b27d5	zstdmt parameter getter/setter use `int`	2018-12-13 15:47:34 -08:00
Nick Terrell	aaea4ef924	[libzstd] Fix infinite loop in decompression When we switched `ZSTD_SKIPPABLEHEADERSIZE` to a macro, the places where we do: MEM_readLE32(ptr) + ZSTD_SKIPPABLEHEADERSIZE can now overflow `(unsigned)-8` to `0` and we infinite loop. We now check the frame size and reject sizes that overflow a U32. Note that this bug never made it into a release, and was only in the dev branch for a few days. Credit to OSS-Fuzz	2018-12-13 15:13:19 -08:00
Yann Collet	34f01e600f	fixed multiple conversions from 64-bit to 32-bit	2018-12-13 14:02:22 -08:00
Yann Collet	1993f5d412	fixed ovlog tests and updated man page	2018-12-12 21:09:14 -08:00
Yann Collet	f2f86d369b	Merge branch 'btultra2' into ovlog_def	2018-12-12 20:58:14 -08:00
Yann Collet	9a92ed401d	updated compression results.csv and fixed nit	2018-12-12 20:30:09 -08:00
Yann Collet	9792acda3b	Merge branch 'dev' into btultra2	2018-12-12 20:18:27 -08:00
Yann Collet	7bb8dfc62f	new overlapLog default values varies between 6 and 9, depending on strategy	2018-12-11 18:10:29 -08:00
Yann Collet	eee789b7ea	continued: changed to overlapLog in deeper code layer. for consistency.	2018-12-11 17:41:42 -08:00
Yann Collet	9b784dec7f	changed parameter name to ZSTD_c_overlapLog from overlapSizeLog. Reasoning : `overlapLog` is already used everywhere, in the code, command line and documentation. `ZSTD_c_overlapSizeLog` feels unnecessarily different.	2018-12-11 16:55:33 -08:00
Yann Collet	52b94f902c	add clarification for ZSTD_CCtx_setPledgedSrcSize() as requested in #1391	2018-12-11 12:08:21 -08:00
Yann Collet	9c3265a53f	Merge pull request #1417 from facebook/advancedAPI Advanced API	2018-12-10 18:48:15 -08:00
Yann Collet	5e6aaa3abb	fixed btultra2 usage with prefix notably while using multi-threading	2018-12-10 18:45:03 -08:00
Yann Collet	3619c34399	fix assert position within ZSTD_compress2()	2018-12-10 17:42:35 -08:00
Yann Collet	5a1e01e5f1	clarified experimentalParam	2018-12-10 17:36:20 -08:00
Yann Collet	c226a7b9f3	fixed ZSTD_compress2() as suggested by @terrelln	2018-12-10 17:33:49 -08:00
Yann Collet	37e314a68d	updated clevel table for large inputs	2018-12-09 22:38:05 -08:00
Yann Collet	c9c4c7ec8c	update clevel table for 256K	2018-12-08 21:40:08 -08:00
Yann Collet	8075d75f9c	update clevel table for 128K	2018-12-08 10:42:55 -08:00
Yann Collet	95b152ab33	updated clevel table for 16K to introduce btultra2	2018-12-07 20:12:43 -08:00
Yann Collet	d613fd9afe	linked btultra2 as strategy9 and ensure zstdbench detects out-of-bound parameters	2018-12-06 19:27:37 -08:00
Yann Collet	34aa401afd	updated documentation introducing ZSTD_btultra2	2018-12-06 17:22:19 -08:00
Yann Collet	ae370b0e12	minor bound refinements	2018-12-06 16:51:17 -08:00
Yann Collet	39e28982cf	introduced constants ZSTD_STRATEGY_MIN and ZSTD_STRATEGY_MAX	2018-12-06 16:16:16 -08:00
Yann Collet	c3c3488981	fixed c++ assignment to enum	2018-12-06 15:57:55 -08:00
Yann Collet	be9e561da4	changed ZSTD_c_compressionStrategy into ZSTD_c_strategy also : fixed paramgrill, and limit conditions	2018-12-06 15:00:52 -08:00
Yann Collet	e9448cdf4c	introduced strategy btultra2 note : not yet applied on any compression level	2018-12-06 13:38:09 -08:00
Yann Collet	0c404a48f0	moved ZSTD_WINDOWLOG_LIMIT_DEFAULT into static-linking-only area	2018-12-06 10:57:19 -08:00
Yann Collet	96d887429b	clarified usage of word "job" only applies in MT / async context now.	2018-12-06 10:14:34 -08:00
Yann Collet	3583d19c4e	changed parameter names from ZSTD_p_* to ZSTD_c_* for naming consistency	2018-12-05 17:26:02 -08:00
Yann Collet	c2053310e5	updated API documentation	2018-12-05 16:23:00 -08:00
Yann Collet	3e042d5cc0	ZSTD_decompressDCtx() is compatible with sticky parameters	2018-12-04 17:30:58 -08:00
Yann Collet	d7da3fc90a	merge dedicated dParam setters	2018-12-04 17:06:48 -08:00
Yann Collet	4b5a4f02d7	write the switch()case: differently so that it pleases both compilers which warn for dead code after the switch and compilers which do not detect that all branches terminate.	2018-12-04 16:59:26 -08:00
Yann Collet	85b02bf142	fixed silent conversion warning	2018-12-04 15:57:16 -08:00
Yann Collet	aec945f0dc	implemented ZSTD_dParam_getBounds() and ZSTD_DCtx_setParameter()	2018-12-04 15:35:37 -08:00
Yann Collet	34e146f548	advanced decompression function replaced by normal streaming one advanced parameters compatible with ZSTD_decompressStream().	2018-12-04 10:28:36 -08:00
Yann Collet	7ef7dc561a	check availability of --color=never command on grep and egrep before applying them. Fixes #1436	2018-12-03 15:46:55 -08:00
Yann Collet	6ced8f7c7c	joined normal streaming API with advanced one	2018-12-03 14:22:38 -08:00
Yann Collet	da1f3066a3	preparative for ZSTD_DCtx_setParameter()	2018-11-30 15:59:50 -08:00
Yann Collet	d8e215cbee	created ZSTD_compress2() and ZSTD_compressStream2() ZSTD_compress_generic() is renamed ZSTD_compressStream2(). Note that, for the time being, the "stable" API and advanced one use different parameter planes : setting parameters using the advanced API does not influence ZSTD_compressStream() and using ZSTD_initCStream() does not influence parameters for ZSTD_compressStream2().	2018-11-30 11:25:56 -08:00
Mitchell Grenier	a424899637	Fix buck for lib	2018-11-30 13:45:16 +00:00
Yann Collet	d3a0c71259	pushed experimental parameters into ZSTD_STATIC_LINKING_ONLY section	2018-11-21 16:18:55 -08:00
Yann Collet	d4d4e109e9	getParameter fills an int* rather than an unsigned* for consistency since type of setParameter() changed to int.	2018-11-21 15:37:26 -08:00
Yann Collet	fea920615c	promote ZSTD_findFrameCompressedSize() into staging area	2018-11-21 15:25:50 -08:00
Yann Collet	41c7d0b1e1	changed hashEveryLog into hashRateLog	2018-11-21 14:36:57 -08:00
Yann Collet	5d3592398d	fixed fall-through	2018-11-20 16:09:33 -08:00
Yann Collet	5c6d4b18ac	completed implementation of ZSTD_cParam_getBounds() for all parameters	2018-11-20 16:06:00 -08:00
Yann Collet	2e7fd6a2cb	fixed remaining searchLength invocations	2018-11-20 15:13:27 -08:00
Yann Collet	e874dacc08	changed searchLength into minMatch refactored all relevant API and calls for consistency.	2018-11-20 14:56:07 -08:00
Yann Collet	114bd4346e	changed enum type name to ZSTD_ResetDirective for naming consistency : types should start with a capital letter (after prefix)	2018-11-20 12:00:20 -08:00
Yann Collet	3b838abf97	ZSTD_CCtx_setParameter : `value` argument is now `int` for compatibility with compression level	2018-11-20 11:53:01 -08:00
Yann Collet	19e5f2a35b	removed some constants and _simpleArgs() from staging constants that may change in the future will be accessed through functions instead (to be created). _simpleArgs() variants do not have (yet) a clear enough added value to deserve "stable" status.	2018-11-19 17:38:15 -08:00
Ryan Schmidt	ef4df0df4a	Fix i386 build failure "Junk character 13"	2018-11-16 02:16:21 -06:00
Yann Collet	5c68639186	updated ZSTD_DCtx_reset() signature and behavior is now the same as ZSTD_CCtx_reset()	2018-11-15 16:12:39 -08:00
Yann Collet	06c8d5a4f4	Merge branch 'dev' into advancedAPI fixed rsyncable	2018-11-15 10:51:24 -08:00
Nick Terrell	b9693d3a49	[lib] Add rsyncable mode - Add rsyncable mode to multithreaded mode - Factor out LDM's hash function for reuse	2018-11-14 16:59:57 -08:00
Yann Collet	21a42bf5f9	added advanced decompression api	2018-11-14 16:54:54 -08:00
Yann Collet	cf9f4b63b8	fixed fuzz test src code	2018-11-14 14:46:49 -08:00
Yann Collet	7b0391e37e	finalized retrofit of ZSTD_CCtx_reset() updated all depending sources	2018-11-14 13:05:35 -08:00
Yann Collet	ff8d371708	modified ZSTD_CCtx_reset() which now accepts an enum, to distinguish between resetting the session, or the parameters (or both). removed ZSTD_CCtx_resetParameters(), which is redundant. start replacing invocation of ZSTD_CCtx_reset*() functions Updated advanced API documentation trimmed down amount of API staged in RC, in particular, all functions related to ZSTD_CCtxParams() seem too advanced.	2018-11-14 12:33:57 -08:00
Yann Collet	d7e10a774a	added constant ZSTD_WINDOWLOG_LIMIT_DEFAULT answering #1407. Also : removed obsolete function ZSTD_setDStreamParameter() which could only be used with one parameter (DStream_p_maxWindowSize). Now replaced by ZSTD_DCtx_setWindowSize() (which exists since a few revisions)	2018-11-13 18:12:34 -08:00
Yann Collet	2c8fde538f	added constant ZSTD_MAGIC_SKIPPABLE_MASK and updated several API comments	2018-11-13 17:36:35 -08:00
Yann Collet	b83d1e7714	removed some `static const` variables and replaced by traditional macro constants. Unfortunately, C doesn't consider `static const` to mean "constant"	2018-11-13 16:56:32 -08:00
Yann Collet	768a264200	Merge branch 'dev' of github.com:facebook/zstd into dev	2018-11-13 15:56:36 -08:00
Yann Collet	092c4abd4c	bumped version number to v1.3.8	2018-11-13 15:53:38 -08:00
Yann Collet	f28af025d9	Merge pull request #1413 from felixhandte/attach-dict-fix-unsigned-compare Fix #1412: Perform Signed Comparison When Setting Attach Dict Param	2018-11-12 17:53:11 -08:00
Yann Collet	626040ab53	changed PREFETCH() macro into PREFETCH_L2() which is more accurate	2018-11-12 17:05:32 -08:00
W. Felix Handte	5faef4d378	Const	2018-11-12 14:48:42 -08:00
W. Felix Handte	2d9332eb21	Fix Types	2018-11-12 12:52:31 -08:00
W. Felix Handte	4127de5fa6	Switch Enum to Only Non-Negative Values, Update Comments	2018-11-12 12:47:47 -08:00
W. Felix Handte	596f7d1256	Fix #1412 : Perform Signed Comparison When Setting Attach Dict Param	2018-11-12 12:07:57 -08:00
Yann Collet	7b0c551bff	Merge pull request #1411 from facebook/prefetch_dict Improves decompression speed when using cold dictionary	2018-11-09 11:31:35 -08:00
Yann Collet	1b4a9c518b	Merge pull request #1410 from facebook/prefetch_dec improve long-range decoder speed	2018-11-08 18:41:58 -08:00
Yann Collet	483759a3de	Improves decompression speed when using cold dictionary by triggering the prefetching decoder path (which used to be dedicated to long-range offsets only). Figures on my laptop : no content prefetch : ~300 MB/s (for reference) full content prefetch : ~325 MB/s (before this patch) new prefetch path : ~375 MB/s (after this patch) The benchmark speed is already significant, but another side-effect is that this version prefetches less data into memory, since it only prefetches what's needed, instead of the full dictionary. This is supposed to help highly active environments such as active databases, that can't be properly measured in benchmark environment (too clean). Also : fixed the largeNbDict test program which was working improperly when setting nbBlocks > nbFiles.	2018-11-08 17:00:23 -08:00
Yann Collet	20fb9e7f36	reduced assertion strength one limit case can apparently be generated during fuzzer tests	2018-11-08 12:57:34 -08:00
Yann Collet	9126da5b5c	improve long-range decoder speed on enwik9 at level 22 (which is almost a worst case scenario), speed improves by +7% on my laptop (415 -> 445 MB/s)	2018-11-08 12:47:46 -08:00
Yann Collet	8bed4012bd	fixed decompression-only benchmark	2018-11-08 12:36:39 -08:00
Nick Terrell	a8daa2d683	Signal before unlocking in pool.c	2018-11-08 10:45:53 -08:00
Bartosz Szreder	5c5c476338	Prevent deadlock on malloc() failure.	2018-11-08 10:29:31 +01:00
Yann Collet	e0701d3c5d	Merge pull request #1404 from facebook/T36302429 fixed T36302471	2018-11-06 11:53:20 -08:00
Yann Collet	3e5cdf1b6a	fixed T36302429	2018-11-05 17:50:30 -08:00
Yann Collet	2caa995558	just add an assert() in ZSTD_insertBtAndGetAllMatches() to express a condition on ll0 . May help static analyzer as in #1397	2018-11-05 17:13:32 -08:00
Yann Collet	3a90229616	Merge pull request #1395 from facebook/decompressblock created zstd_decompress_block module	2018-10-29 16:28:09 -07:00
Yann Collet	acd75a1448	fixed a second memset() on NULL not sure why it only triggers now, this code has been around for a while. Introduced a new error code : dstBuffer_null, I couldn't express anything even remotely similar with existing error codes set.	2018-10-29 15:03:57 -07:00
Yann Collet	9c58098200	fixed memcpy() on NULL warning memcpy(NULL, src, 0) is undefined behavior.	2018-10-29 13:57:37 -07:00
Yann Collet	ea966c8fb1	Merge pull request #1396 from facebook/huf_refactor refactor HUF_compress_internal for clarity	2018-10-29 13:06:45 -07:00
Yann Collet	fc20b3c441	added flag -Wc++-compat for library and cli	2018-10-26 16:38:23 -07:00
Yann Collet	c3c7deb1e1	Merge pull request #1392 from coetry/dev provide consistent spacing to enum field	2018-10-26 15:29:07 -07:00
Yann Collet	1866bd374a	Merge branch 'dev' into huf_Refactor	2018-10-26 15:25:01 -07:00
Yann Collet	8d56f4baee	added a few comments for clarifications	2018-10-26 15:21:52 -07:00
Yann Collet	450356b5af	Merge branch 'dev' into decompressblock	2018-10-26 15:03:43 -07:00
Yann Collet	7d4960a5e8	Merge pull request #1390 from facebook/nullAsOutput support decompressing an empty frame into NULL	2018-10-26 14:43:16 -07:00
Allen Hai	3783720f70	vertically align code comment	2018-10-26 16:16:06 -05:00
Yann Collet	7b74405150	refactor HUF_compress_internal for clarity changed workspace parameter convention to always provide workspaceSize, so that size can be explicitly checked. Also, use more enum to make the meaning of some parameters more explicit.	2018-10-26 13:21:37 -07:00
Allen Hai	26e34d8a73	provide consistent spacing to enum field	2018-10-25 18:45:20 -05:00
Yann Collet	2b4914082e	created zstd_decompress_block module isolate all logic associated with block decompression into its own module. zstd_decompress is still in charge of context creation/destruction, frames, headers, streaming, special blocks, etc. Compressed blocks themselves are now handled within zstd_decompress_block .	2018-10-25 16:28:41 -07:00
Yann Collet	cb320a9fc0	added comment on public ddict functions	2018-10-24 16:50:03 -07:00
Yann Collet	806a5c84e4	support decompressing an empty frame into NULL fix #1385 decompressing into NULL was an automatic error. It is now allowed, as long as the content of the frame is empty. Seems to simplify things for `arrow`. Maybe some other projects rely on this behavior ?	2018-10-24 16:34:35 -07:00
Yann Collet	debff3929b	fixed warnings in testpools	2018-10-24 10:36:06 -07:00
Yann Collet	cc3612e1c5	added simple guard macros in case of accidental multi-includes	2018-10-23 17:55:23 -07:00
Yann Collet	ccd2d426fc	separate DDict logic into its own module created zstd_ddict.c within lib/decompress	2018-10-23 17:25:49 -07:00
Yann Collet	f181799082	fix decodecorpus incorrect frame generation fix #1379 decodecorpus was generating one extraneous byte when `nbSeq==0`. This is disallowed by the specification. The reference decoder was just skipping the extraneous byte. It is now stricter, and flags such a situation as an error.	2018-10-20 18:56:21 -07:00
Yann Collet	1e6208e75e	bumped version number to v1.3.7 updated documentation	2018-10-11 14:40:12 -07:00
Ori Livneh	f31715f5e0	Enable use of bswap intrinsics in clang Necessary because clang disguises itself as an older (__GNUC_MINOR__ = 2) GCC.	2018-10-11 15:01:09 -04:00
Yann Collet	6ed3b526e4	restored bitMask for shift values since corrupted bitstreams can generate too large values. This slightly reduces the benefits from clang on my laptop. gcc results and code generation are not affected.	2018-10-10 18:29:50 -07:00
Yann Collet	c012e9540a	removed one assert() that can be triggered by a corrupted bitstream.	2018-10-10 17:33:04 -07:00
Yann Collet	7791f192ee	removed one assert() which can be triggered when input is corrupted.	2018-10-10 16:39:15 -07:00
Yann Collet	d3ec23313d	improved decompression speed while reviewing #1364, I found a decompression speed improvement. On my laptop, the new code decompresses +5-6% faster on clang and +2-3% faster on gcc. not bad for an accidental optimization...	2018-10-10 15:48:43 -07:00
Yann Collet	942df522cc	Merge pull request #1361 from facebook/streamdoc Clarify streaming api doc	2018-10-08 19:19:34 -07:00
W. Felix Handte	b8235be865	Avoid Searching Dictionary in ZSTD_btlazy2 When an Optimal Match is Found Bailing here is important to avoid reading past the end of the input buffer.	2018-10-08 15:59:32 -07:00
W. Felix Handte	d121b3451c	Clean Up Debug Log Statements	2018-10-08 15:59:32 -07:00
W. Felix Handte	08da9ad316	Remove Unused Variable	2018-10-08 15:59:32 -07:00
Yann Collet	8fc79fac07	clarify streaming api doc as suggested by @indygreg in #1360	2018-10-08 15:53:29 -07:00
Yann Collet	11cd2ea43d	finalized minor warnings on Haiku	2018-10-03 16:37:50 -07:00
Yann Collet	bc93b801f0	Merge pull request #1330 from korli/haiku Enable building zstd on Haiku.	2018-10-03 13:36:00 -07:00
Jerome Duval	87c10e2f58	Enable building zstd on Haiku.	2018-10-03 09:51:56 +02:00
Yann Collet	22ddf3523a	fixed msan warning on btlazy2 strategy with dictAttach	2018-10-02 18:20:20 -07:00
Yann Collet	c9843ec232	Merge pull request #1348 from facebook/donotdelete Fix #1082	2018-10-02 16:37:58 -07:00
Yann Collet	3ca6261223	fixed static analyzer warnings note : for some reason, scan-build version on my laptop found problems within fastcover.c that scan-build on travisCI does not flag. They are, as usual, false positive : the analyzer does not understand that a table (`offset`) is correctly filled before usage.	2018-10-02 15:59:11 -07:00
Yann Collet	228c6e5147	Merge pull request #1317 from felixhandte/split-logs Independent Dictionary and Working Context Table Logs	2018-10-01 17:20:12 -07:00
W. Felix Handte	5b296869df	Revert Ability to Set HashLog and ChainLog on Context When Dict is Attached This capability is not needed / used in the current unit of work. I'll re-introduce it later, when we start allowing users to override the deduced working context logs.	2018-10-01 13:28:13 -07:00
W. Felix Handte	c2369fedc4	Restore Passing CParams to `ZSTD_insertAndFindFirstIndex_internal`	2018-09-28 17:12:54 -07:00
W. Felix Handte	bad74c4781	Use Working Ctx Logs when not in DMS Mode We pre-hash the ptr for the dict match state sometimes. When that actually happens, a hashlog of 0 can produce undefined behavior (right shift a long long by 64). Only applies to unoptimized compilations, since when optimizations are applied, those hash operations are dropped when we're not actually in dms mode.	2018-09-28 17:12:54 -07:00
W. Felix Handte	c38acff94f	When Attaching Dictionary, Size Working Tables Based on Input Size Only	2018-09-28 17:12:54 -07:00
W. Felix Handte	9d87d50878	Remove Log Overriding for the Time Being	2018-09-28 17:12:54 -07:00
W. Felix Handte	77fd17d93f	Remove Strategy-Dependency in Making Attachment Decision	2018-09-28 17:12:54 -07:00
W. Felix Handte	00c088b32d	Support Split Logs in ZSTD_btopt..ZSTD_btultra	2018-09-28 17:12:54 -07:00
W. Felix Handte	0783492178	Bump Split Log Support to ZSTD_btultra	2018-09-28 17:12:54 -07:00
W. Felix Handte	e4ac4a0f16	Support Split Logs in ZSTD_greedy..ZSTD_btlazy2	2018-09-28 17:12:54 -07:00
W. Felix Handte	e710dc3369	Bump Split Log Support to ZSTD_btlazy2	2018-09-28 17:12:54 -07:00
W. Felix Handte	22fcb8d4c7	Support Split Logs in ZSTD_dfast	2018-09-28 17:12:54 -07:00
W. Felix Handte	a232b3bb7c	Bump Split Log Support to ZSTD_dfast	2018-09-28 17:12:54 -07:00
W. Felix Handte	fe96e98f81	Support a Separate Hash Log in ZSTD_fast	2018-09-28 17:12:54 -07:00
W. Felix Handte	bc880ebe8f	Stop Passing in `hashLog` and `stepSize` to `ZSTD_compressBlock_fast_generic`	2018-09-28 17:12:54 -07:00
W. Felix Handte	b3107c7799	Temporary Commit to Retain Requested Hash and Chain Logs During Dict Attach	2018-09-28 17:12:54 -07:00
W. Felix Handte	34e0193129	Allow Setting Hash and Chain Logs on Contexts with Attached CDict	2018-09-28 17:12:54 -07:00
W. Felix Handte	eae8232f50	For Supported Strategies, Attach Dict Even When Params Don't Match	2018-09-28 17:12:54 -07:00
W. Felix Handte	01ff945eae	Split Attach and Copy Reset Strategies into Separate Implementation Functions	2018-09-28 17:12:54 -07:00
W. Felix Handte	a6d6bbeae1	Pull Attachment Decision into Separate Function	2018-09-28 17:12:54 -07:00
W. Felix Handte	b7fba599ae	And Then Avoid the Unused Parameter Warning	2018-09-28 17:12:54 -07:00
W. Felix Handte	1f188ae655	Move Asserts into Function to Avoid Unused Function Warning	2018-09-28 17:12:54 -07:00
W. Felix Handte	7212b5e5c2	Move Match State CParams Setting into `resetCCtx` and `continueCCtx`	2018-09-28 17:12:54 -07:00
W. Felix Handte	01e34d365b	Strengthen Assertion to Assert Equality	2018-09-28 17:12:53 -07:00
W. Felix Handte	50cc1cf4d5	Remove CParams Arg from ZSTD_ldm_blockCompress	2018-09-28 17:12:53 -07:00
W. Felix Handte	14764de49f	Stop Separately Passing CParams in ZSTD_lazy Internal Functions	2018-09-28 17:12:53 -07:00
W. Felix Handte	97149f22c3	Stop Separately Passing CParams in ZSTD_opt Internal Functions	2018-09-28 17:10:42 -07:00
W. Felix Handte	dcdf437fed	Also Remove CParams from Table Filling Functions' Args	2018-09-28 17:10:42 -07:00
W. Felix Handte	3483f89101	Also Assert Equivalency When Filling MatchState with Prefix	2018-09-28 17:10:42 -07:00
W. Felix Handte	6cb2454646	Remove CParams from Block Compressor Functions' Args	2018-09-28 17:10:42 -07:00
W. Felix Handte	03103269de	Assert `ctx` and `ms` cparams Equivalency	2018-09-28 17:10:42 -07:00
W. Felix Handte	4e3ecee9ed	Remove cParams from CDict	2018-09-28 17:10:42 -07:00
W. Felix Handte	76ef87ed9d	Add ZSTD_compressionParameters to ZSTD_matchState_t	2018-09-28 17:10:42 -07:00
Nick Terrell	6391cd1030	[zstd] Fix newly added test case	2018-09-28 12:09:28 -07:00
Yann Collet	73773c6b6a	fixed legacy compilation tests for some reason, these tests started failing recently on CircleCI	2018-09-27 18:15:14 -07:00
Nick Terrell	a180ea07c4	Restore ZSTD_noCompressBlock() for clarity	2018-09-27 16:06:02 -07:00
Nick Terrell	aec1a3ec58	Change byte to value to avoid a GRUB typedef	2018-09-27 15:24:48 -07:00
Nick Terrell	109bd37474	Include stddef.h for size_t	2018-09-27 15:24:48 -07:00
Nick Terrell	f2d6db45cd	[zstd] Add -Wmissing-prototypes	2018-09-27 15:24:48 -07:00
Yann Collet	2a5cd8535a	Merge pull request #1342 from facebook/fixcatyd fix : huge (>4GB) chain of blocks	2018-09-27 10:20:14 -07:00
Yann Collet	404a7bfed0	moved again overflow correction cannot work from within ZSTD_compressBlock()	2018-09-26 18:06:53 -07:00
Yann Collet	0e2dbac18a	changed overflow correction place keep one in compress_frameChunk(), so that it's tested at every loop in case some user simply sends some large multi-GB input in a single invocation. Add one in ZSTD_compressBlock(), since compressBlock() explicitly skips frameChunk().	2018-09-26 15:35:38 -07:00
Yann Collet	e74eade251	Merge pull request #1339 from facebook/grep_colors fixed usage of grep in Makefile	2018-09-26 14:39:20 -07:00
Yann Collet	8883af6a1e	Merge pull request #1327 from facebook/adapt Adaptive compression	2018-09-26 14:39:08 -07:00
Yann Collet	f98c69d77c	fix : huge (>4GB) stream of blocks experimental function ZSTD_compressBlock() is designed with very small data in mind, for situations where saving the ~12 bytes of frame header can actually make a difference. Some systems though may have to deal with small and large data entangled. If it's larger than a block (> 128KB), compressBlock() cannot compress them in one round. That's why it's possible to compress in multiple rounds. This is a chain of compressed blocks. Some users push this capability to the limit, encoding gigantic chains of blocks. On crossing the 4GB limit, some internal overflow occurs. This fix moves the overflow correction mechanism higher in the call chain, so that it's applied also to gigantic chains of blocks. Added a test case in fuzzer.c, which crashes before the fix, and passes now.	2018-09-26 14:24:28 -07:00
Yann Collet	8ff17a6a09	Merge pull request #1329 from facebook/v04isout Changed default legacy support to v0.5+	2018-09-26 13:39:05 -07:00
Yann Collet	08f68d83c5	fixed usage of grep in Makefile when terminal uses colors as suggested by @danielshir (#1294)	2018-09-25 16:56:53 -07:00
Yann Collet	04f47bbdd2	Merge branch 'dev' into adapt	2018-09-24 16:56:45 -07:00
Yann Collet	9bb6c15f79	Merge pull request #1332 from facebook/minclevel defined a minimum negative level	2018-09-24 16:01:13 -07:00
Yann Collet	292d8e4a83	added some tests based on limits.h in order to ensure proper type mapping when not using stdint.h	2018-09-23 23:57:30 -07:00
Yann Collet	71a5210617	avoid recompiling dll every time under mingw	2018-09-21 17:40:30 -07:00
Yann Collet	c484345a82	Merge branch 'mingw' into adapt	2018-09-21 16:00:46 -07:00
Yann Collet	bfff4f4809	ensure all writes to job->cSize are mutex protected even when reporting errors, using a macro for code brevity, as suggested by @terrelln	2018-09-21 16:00:39 -07:00
Yann Collet	32b7cf1bcf	fixed tautological tests involving ZSTD_TARGETLENGTH_MIN (== 0)	2018-09-21 15:04:43 -07:00
Yann Collet	c044345f8f	Merge branch 'mingw' into minclevel	2018-09-21 14:56:57 -07:00
Yann Collet	de6c75e4e5	Merge pull request #1318 from felixhandte/shadow-dict-matches Don't Search Dictionary Context When Working Context Search Resulted in Mismatch	2018-09-21 12:15:33 -07:00
Yann Collet	a54c86cfc6	defined a minimum negative level which can be probed using new function ZSTD_minCLevel(). Also : redefined ZSTD_TARGETLENGTH_MIN/MAX for consistency used the opportunity to bump version number to v1.3.6	2018-09-20 16:52:03 -07:00
Yann Collet	b2939163e1	Changed default legacy support to v0.5+ thus dropping read support for v0.4. It's always possible to re-enable it, by changing build macro ZSTD_LEGACY_SUPPORT to 4.	2018-09-20 14:30:20 -07:00
Yann Collet	7992942d66	fixed complex tsan issue when job->consumed == job->src.size , compression job is presumed completed, so it must be the very last action done in worker thread.	2018-09-20 13:47:31 -07:00
Yann Collet	6b07a66aec	fixed minor reporting discrepancy in MT mode	2018-09-19 16:30:55 -07:00
Yann Collet	ca02ebee07	removed static variables so that --adapt can work on multiple input files too	2018-09-19 15:25:50 -07:00
Yann Collet	89bc309d90	error out when --adapt is associated with --single-thread since they are not compatible	2018-09-19 14:49:13 -07:00
Yann Collet	2f78228f65	Merge branch 'dev' into adapt	2018-09-19 12:43:42 -07:00
Yann Collet	005f000aed	updated documentation of *refPrefix() indicating the equivalence with `diff` operation.	2018-09-18 13:07:08 -07:00
ko-zu	18b4a1da61	Fix clang build Fix doxygen comment Fix clang binary path	2018-09-16 10:27:02 +09:00
Yann Collet	7269fe6cd3	minor code comment update	2018-09-14 16:06:35 -07:00
Yann Collet	0403148315	Merge pull request #1295 from felixhandte/hdr-intro-comment-negative-lvls Proposed Update to Zstd.h Introduction Comment	2018-09-14 15:29:19 -07:00
W. Felix Handte	b76c888497	ZSTD_dfast: Don't Search Dict Context When Mismatch Was Found	2018-09-14 15:24:25 -07:00
W. Felix Handte	b048af5999	ZSTD_fast: Don't Search Dict Context When Mismatch Was Found	2018-09-14 15:23:35 -07:00
Yann Collet	0e5b447aaa	Merge pull request #1316 from facebook/coldDict Cold dictionary mitigation	2018-09-14 10:37:46 -07:00
Yann Collet	5512400677	updated code comments, based on @terrelln review	2018-09-13 16:44:04 -07:00
Yann Collet	d195eec97e	fixed msan error cold dictionary is detected through a comparison with dictEnd, which was not initialized at the beginning of first DCtx usage.	2018-09-13 12:29:52 -07:00
Yann Collet	674dd21bd0	final parameter tuning	2018-09-12 17:25:34 -07:00
Yann Collet	419dfd4ea3	clean traces	2018-09-12 16:40:28 -07:00
Yann Collet	2618253da2	fixed PREFETCH() macro for corner cases and platforms without this instruction	2018-09-12 16:15:37 -07:00
Yann Collet	44d3b83bb1	conditional dict content prefetching based on nbSeq.	2018-09-12 15:35:21 -07:00
Yann Collet	5fb5ed3b31	adjust heuristic decisions	2018-09-12 12:32:09 -07:00
Nick Terrell	f6daddf2db	Also allow x86	2018-09-12 12:05:32 -07:00
Nick Terrell	1e0bac6a9c	[libzstd] Fix cpu for MSFT ARM The `__cpuid()` and `__cpuidex()` intrinsics are only available on x86 and x86_64.	2018-09-12 10:35:16 -07:00
Yann Collet	4de344d505	added conditional prefetch depending on amount of work to do.	2018-09-12 10:29:47 -07:00
Yann Collet	63a519dbf6	implemented first prefetch based on dictID. dictContent is prefetched up to 32 KB (no contentSize adaptation)	2018-09-11 17:23:44 -07:00
Yann Collet	3675ef4762	added comment about minimum size of FSE tables required for DDict creation, which use this space as workspace during Huffman table building stage.	2018-09-10 11:24:17 -07:00
Yann Collet	f97ca36eab	strengthened conditions for using workplace into fse table space ensure that the structure layout is as expected. will trigger an error if it changes in the future. Another solution would be to use a union, this would be cleaner and get rid of these static asserts. However, in order to keep the current code unmodified, it would be necessary to use an un-named union. And apparently, un-named unions are only possible on "recent" compilers (C99+).	2018-09-06 17:54:13 -07:00
Yann Collet	87406548f0	reduced DDict size, by -2KB corresponding to the removal of workspace which is needed while building huffman table and is now either present in DCtx, or temporarily borrowed from available FSE table space.	2018-09-06 17:07:53 -07:00
Yann Collet	50b216146f	Merge pull request #1304 from facebook/largeNbDicts contrib/largeNbDicts	2018-09-06 09:50:56 -07:00
Jennifer Liu	21721b75a3	Change default f to 20	2018-09-04 17:15:14 -07:00
Jennifer Liu	944c9986e0	Update comment on default steps of cover and fastcover	2018-08-30 15:37:29 -07:00
Jennifer Liu	16db0337b1	Always use splitPoint=1.0 for non-optimize cover and fastcover	2018-08-30 14:59:22 -07:00
Yann Collet	31ebb26945	Merge pull request #1301 from terrelln/lit-size [zstd] Fix seqStore growth	2018-08-28 17:10:25 -07:00
Nick Terrell	5e580de6da	[zstd] Fix seqStore growth We could undersize the literals buffer by up to 11 bytes, due to a combination of 2 bugs: * The literals buffer didn't have `WILDCOPY_OVERLENGTH` extra space, like it is supposed to. * We didn't check the literals buffer size in `ZSTD_sufficientBuff()`.	2018-08-28 13:24:44 -07:00
Yann Collet	b37a0a6bde	Merge pull request #1298 from facebook/bench Refactored bench.c	2018-08-28 12:25:02 -07:00
modbw	d14edf259f	Fixed memory leak detected by cppcheck cppcheck (which is run regularly in our CI environment) detected a possible memory leak.	2018-08-28 07:25:05 +02:00
Yann Collet	6782725155	first sketch for largeNbDicts test program	2018-08-26 19:29:12 -07:00
Yann Collet	af23d39eb8	Merge pull request #1297 from felixhandte/check-offset-table Fix Missing Offset Table Check	2018-08-24 17:36:44 -07:00
W. Felix Handte	37f17ee237	Mark Repeated Offset Table as Needing Check	2018-08-24 14:33:34 -07:00
Nick Terrell	e34e917655	Fix compiler warning	2018-08-23 17:48:06 -07:00
Nick Terrell	5ee5e71be3	[zstd] Add note about empty ZSTD_CDict	2018-08-23 17:48:06 -07:00
Nick Terrell	924944e471	[zstd] Reuse the ZSTD_CCtx more often with small data.	2018-08-23 17:48:06 -07:00
Yann Collet	2e45badff4	refactored bench.c for clarity and safety, especially at interface level	2018-08-23 14:21:18 -07:00
Jennifer Liu	9d6ed9def3	Merge fastCover into DictBuilder (#1274) * Minor fix * Run non-optimize FASTCOVER 5 times in benchmark * Merge fastCover into dictBuilder * Fix mixed declaration issue * Add fastcover to symbol.c * Add fastCover.c and cover.h to build * Change fastCover.c to fastcover.c * Update benchmark to run FASTCOVER in dictBuilder * Undo splitting fastcover_param into cover_param and f * Remove convert param functions * Assign f to parameter * Add zdict.h to Makefile in lib * Add cover.h to BUCK * Cast 1 to U64 before shifting * Remove trimming of zero freq head and tail in selectSegment and rebenchmark * Remove f as a separate parameter of tryParam * Read 8 bytes when d is 6 * Add trimming off zero frequency head and tail * Use best functions from COVER and remove trimming part (which leads to worse compression ratio after previous bugs were fixed) * Add finalize= argument to FASTCOVER to specify percentage of training samples passed to ZDICT_finalizeDictionary * Change nbDmer to always read 8 bytes even when d=6 * Add skip=# argument to allow skipping dmers in computeFrequency in FASTCOVER * Update comments and benchmarking result * Change default method of ZDICT_trainFromBuffer to ZDICT_optimizeTrainFromBuffer_fastCover * Add dictType enum and fix bug about passing zParam when converting to coverParam * Combine finalize and skip into a single parameter * Update acceleration parameters and benchmark on 3 sample sets * Change default splitPoint of FASTCOVER to 0.75 and benchmark first 3 sample sets * Initialize variables outside of for loop in benchmark.c * Update benchmark result for hg-manifest * Remove cover.h from install-includes * Add explanation of f * Set default compression level for trainFromBuffer to 3 * Add assertion of fastCoverParams in DiB_trainFromFiles * Add checkTotalCompressedSize function + some minor fixes * Add test for multithreading fastCover * Initialize segmentFreqs in every FASTCOVER_selectSegment and move mutex_unlock to end of COVER_best_finish * Free segmentFreqs * Initialize segmentFreqs before calling FASTCOVER_buildDictionary instead of in FASTCOVER_selectSegment * Add FASTCOVER_MEMMULT * Minor fix * Update benchmarking result	2018-08-23 12:06:20 -07:00
W. Felix Handte	e589ac6276	Reformat Introduction Comment and Mention Negative Levels	2018-08-22 17:07:34 -07:00
Yann Collet	c71c4f23d7	fix "unused parameter" in single-thread mode within newly added ZSTD_toFlushNow()	2018-08-20 11:40:10 -07:00
Yann Collet	105677c6db	created ZSTDMT_toFlushNow() tells in a non-blocking way if there is something ready to flush right now. only works with multi-threading for the time being. Useful to know if flush speed will be limited by lack of production.	2018-08-17 18:11:54 -07:00
Yann Collet	36d6165a2d	Makefile: added variable SCANBUILD so that a different version of scan-build can be selected	2018-08-16 16:44:13 -07:00
Yann Collet	1515f0bb0d	fixed more issues detected by recent version of scan-build test run on Linux	2018-08-16 15:20:25 -07:00
Yann Collet	5291d9ac31	fix scope of scan-build tests exclude zlib code	2018-08-15 17:41:44 -07:00
Yann Collet	42a02ab745	fixed minor warnings issued by scan-build	2018-08-15 14:36:02 -07:00
Yann Collet	3692c31598	Merge branch 'dev' into scanbuild	2018-08-15 13:50:49 -07:00
Yann Collet	6e66bbf5dd	fixed several minor issues detected by scan-build only notable one : writeNCount() resists better vs invalid distributions (though it should never happen within zstd anyway)	2018-08-14 16:55:35 -07:00
Yann Collet	3e4617ef54	frameProgression reports nbActiveWorkers and output flushed	2018-08-14 11:49:25 -07:00
Yann Collet	e7a49c6683	introduced command --adapt	2018-08-11 20:48:06 -07:00
Yann Collet	2dd76037be	zstd cli can increase level when input is too slow	2018-08-09 15:51:30 -07:00
Yann Collet	79a35ac20d	minor code comments improvements	2018-08-09 15:16:31 -07:00
W. Felix Handte	2ca7c69167	Fix CDict Attachment to Handle CDicts with Non-Zero Starts CDicts were previously guaranteed to be generated with `lowLimit=dictLimit=0`. This is no longer true, and so the old length and index calculations are no longer valid. This diff fixes them to handle non-zero start indices in CDicts.	2018-08-07 18:14:14 -07:00
Yann Collet	5808027abf	Merge branch 'dev' into fix1241	2018-08-03 16:08:33 -07:00
Yann Collet	5892dd5da4	Merge pull request #1255 from terrelln/norm-fix [FSE] Fix division by zero	2018-08-02 11:48:56 -07:00
Nick Terrell	dc5a67cb7b	Disallow tableLog == srcLog	2018-08-02 11:12:17 -07:00
Jennifer Liu	f5228f2c44	Refactoring	2018-07-31 13:58:54 -07:00
Jennifer Liu	4e29bc2469	Use CDict instead of CCtx in analyzeEntropy	2018-07-31 10:36:45 -07:00
cyan4973	3f535007e4	fix %zu support under minGW and relevant test on Appveyor	2018-07-30 16:56:18 +02:00
cyan4973	aade1e5904	Merge branch 'dev' into fix1241	2018-07-30 16:30:35 +02:00
Nick Terrell	9889bca530	[FSE] Fix division by zero When the primary normalization method fails, and `(1 << tableLog) == (maxSymbolValue + 1)`, and every symbol gets assigned normalized weight 1 or -1 in the first loop, then the next division can raise `SIGFPE`.	2018-07-27 17:30:03 -07:00
Yann Collet	6e490a2f09	Merge pull request #1237 from terrelln/init-cstream-adv Set requestedParams in ZSTD_initCStream*()	2018-07-18 16:33:30 +02:00
cyan4973	9597b438e9	fix #1241 Ensure that first input position is valid for a match even during first usage of context by starting reference at 1 (avoiding the problematic 0).	2018-07-17 18:52:57 +02:00
cyan4973	53e1f0504e	zstdmt debug traces compatible with mingw since mingw does not have `sys/times.h`, remove this path when detecting mingw compilation.	2018-07-17 14:39:44 +02:00
Nick Terrell	45821fac0c	Merge pull request #1225 from jennifermliu/dev Split samples when building dictionary for COVER	2018-07-13 13:26:15 -07:00
Nick Terrell	6d222c437c	Set requestedParams in ZSTD_initCStream() The correct parameters are used once, but once `ZSTD_resetCStream()` is called the default parameters (level 3) are used. Fix this by setting `requestedParams` in the `ZSTD_initCStream()` functions. The added tests both fail before this patch and pass after.	2018-07-12 18:35:55 -07:00
Jennifer Liu	612b346ed5	Add explanation for split=100	2018-07-11 15:50:28 -07:00
Jennifer Liu	5021441d86	Change default splitPoint to 100	2018-07-10 11:19:33 -07:00
Jennifer Liu	456f290e31	Change back to splitPoint<=0	2018-07-09 13:53:25 -07:00
Jennifer Liu	7efabb2cf6	Only make 0.0 default splitPoint	2018-07-09 12:26:53 -07:00
Yann Collet	bbd78df59b	add build macro NO_PREFETCH prevent usage of prefetch intrinsic commands which are not supported by c2rust (see https://github.com/immunant/c2rust/issues/13)	2018-07-06 17:06:04 -07:00
Jennifer Liu	015a00af0f	Change cover_sum back to 2 parameters and fix splitPoint issues	2018-07-06 14:24:18 -07:00
Jennifer Liu	0bbff01211	Fix testing parameter	2018-07-05 22:40:32 -07:00
Jennifer Liu	a085d1aae1	Allow splitPoint==1.0 (using all samples for both training and testing)	2018-07-05 10:38:45 -07:00
Jennifer Liu	0881184c89	Some edits based on pull request comments	2018-07-03 17:53:27 -07:00
Jennifer Liu	16e75e8804	Update minimal training sample size	2018-07-03 12:07:06 -07:00
Jennifer Liu	348e5f77a9	Add split=# to cli	2018-06-29 17:54:41 -07:00
Jennifer Liu	52fbbbcb6b	Explicitly cast double to unsigned	2018-06-29 16:17:20 -07:00
Jennifer Liu	f9d19b83fb	Fix variable declaration problem	2018-06-29 15:46:56 -07:00
Jennifer Liu	e061d84016	Another fix to comparator	2018-06-29 15:38:08 -07:00
Jennifer Liu	59797d3328	Fix splitPoint floating point comparison problem	2018-06-29 12:47:03 -07:00
Jennifer Liu	0ef06f2e8a	Split samples into train and test sets	2018-06-29 12:33:34 -07:00
Yann Collet	121aa2c388	Merge pull request #1211 from facebook/staticAssert updated DEBUG_STATIC_ASSERT()	2018-06-27 12:19:17 -07:00
Yann Collet	4489daec09	slightly adjusted default-distribution threshold depending on strategy. fast favors faster compression and decompression speeds.	2018-06-26 20:10:45 -07:00
Yann Collet	ff773bfcde	zeroise freq table with memset() improves decoding speed by ~5% in github_users sample set	2018-06-26 17:24:41 -07:00
Yann Collet	7b9bbf77c9	switched to a sizeof() version avoid -Werror=unused-variable issue	2018-06-26 14:08:35 -07:00
Yann Collet	f98ec46979	updated DEBUG_STATIC_ASSERT() following suggestion from #1209	2018-06-26 12:04:59 -07:00
Nick Terrell	b426bcc097	[zstdmt] Fix jobsize bugs (#1205) * `ZSTDMT_serialState_reset()` should use `targetSectionSize`, not `jobSize` when sizing the seqstore. Add an assert that checks that we sized the seqstore using the right job size. * `ZSTDMT_compressionJob()` should check if `rawSeqStore.seq == NULL`. * `ZSTDMT_initCStream_internal()` should not adjust `mtctx->params.jobSize` (clamping to MIN/MAX is okay).	2018-06-25 15:21:08 -07:00
Yann Collet	3b53bfe4f3	Merge pull request #1200 from felixhandte/zstd-attach-dict-pref Add CCtx Param Controlling Dict Attachment Behavior	2018-06-25 12:42:31 -07:00
Yann Collet	31769ce702	error on no forward progress streaming decoders, such as ZSTD_decompressStream() or ZSTD_decompress_generic(), may end up making no forward progress, (aka no byte read from input __and__ no byte written to output), due to unusual parameters conditions, such as providing an output buffer already full. In such case, the caller may be caught in an infinite loop, calling the streaming decompression function again and again, without making any progress. This version detects such situation, and generates an error instead : ZSTD_error_dstSize_tooSmall when output buffer is full, ZSTD_error_srcSize_wrong when input buffer is empty. The detection tolerates a number of attempts before triggering an error, controlled by ZSTD_NO_FORWARD_PROGRESS_MAX macro constant, which is set to 16 by default, and can be re-defined at compilation time. This behavior tolerates potentially existing implementations where such cases happen sporadically, like once or twice, which is not dangerous (only infinite loops are), without generating an error, hence without breaking these implementations.	2018-06-22 17:58:21 -07:00
Yann Collet	3934e010a2	Merge pull request #1197 from facebook/poolResize Thread Pool resize	2018-06-22 14:20:07 -07:00
Yann Collet	fbd5dfc1b1	changed POOL_resize() return type to int return is now just an error code. This guarantees that `ctx` remains valid after POOL_resize(). Gets rid of internal POOL_free() operation.	2018-06-22 12:14:59 -07:00
Yann Collet	1d5648ca10	Merge pull request #1196 from felixhandte/zstd-btopt-in-place-dict ZSTD_btopt: Support Searching the Dictionary Context In-Place	2018-06-22 11:53:23 -07:00
Yann Collet	f6242d30b7	Merge pull request #1202 from facebook/barelyCompressible Increase threshold detection of poorly compressible data	2018-06-22 11:52:52 -07:00
Yann Collet	698fd00afb	huf: increase threshold detection of poorly compressible data	2018-06-21 18:32:38 -07:00
Yann Collet	243cd9d8bb	add a cond_broadcast after resize to make sure all threads (notably newly available threads) get awakened to immediately process potential items in the queue.	2018-06-21 18:04:58 -07:00
Yann Collet	818e72b4d5	added extended POOL test abrupt end + downsizing with running jobs remaining in queue. also : POOL_resize() requires numThreads >= 1	2018-06-21 14:58:59 -07:00
W. Felix Handte	01bb1c1016	Add CCtx Param Controlling Dict Attachment Behavior	2018-06-21 17:29:25 -04:00
W. Felix Handte	3e91dc4d6a	Add Repcode Bounds Check	2018-06-21 15:54:41 -04:00
W. Felix Handte	5bd3d4b7d2	Add Debug Log Statement	2018-06-21 15:54:07 -04:00
W. Felix Handte	3caba150c6	Fix `dmsBtLow` Test	2018-06-21 15:53:40 -04:00
W. Felix Handte	5da9bbc38e	Conceivably Dedup ZSTD_noDict and ZSTD_dictMatchState _insertBt1 Impls By reverting to the bool extDict flag, we call ZSTD_insertBt1 with the same const args in both non-extDict dictModes.	2018-06-21 11:20:01 -04:00
Yann Collet	6de249c1c6	fixed: bug when counting nb of active threads when queueSize > 1 also : added a test in testpool.c verifying resizing is effective.	2018-06-20 18:28:49 -07:00
Yann Collet	6b48eb12c0	change control of threadLimit now limits maximum nb of active threads even when queueSize > 1.	2018-06-20 14:35:39 -07:00
W. Felix Handte	5d81f71e83	Consistency in Guarding DMS-Only Variable Initializations	2018-06-20 16:54:53 -04:00
W. Felix Handte	9c14eafe3d	Also Use `matchLow` for HC3 Match	2018-06-20 15:51:14 -04:00
W. Felix Handte	0a6cf7cd1d	Minor Changes	2018-06-20 15:27:23 -04:00
W. Felix Handte	ae1f3898a2	Remove Dead(!) HC3 DMS Lookup	2018-06-20 15:27:12 -04:00
Yann Collet	93702a7a62	Merge pull request #1198 from facebook/msdebug made Visual Studio compatible with DEBUGLEVEL >= 2	2018-06-20 12:26:31 -07:00
cyan4973	ae0b7ffa0a	made Visual Studio compatible with DEBUGLEVEL >= 2	2018-06-20 09:45:02 -07:00
Yann Collet	62469c9f41	fixed wrong size in pthread struct transfer	2018-06-19 20:14:03 -07:00
Yann Collet	166901dc72	reduced POOL_resize() restriction It's not necessary to ensure that no job is ongoing. The pool is only expanded, existing threads are preserved. In case of error, the only option is to return NULL and terminate the thread pool anyway.	2018-06-19 18:07:18 -07:00
Yann Collet	066fbbfe1c	make zstdmt resize its context when nbThreads change. Technically, it only expands. But when instructed to use less threads, the thread pool will limit nb of concurrent threads.	2018-06-19 17:28:56 -07:00
Yann Collet	4567c57199	finalized POOL_resize() POOL_ctx* POOL_resize(POOL_ctx* ctx, size_t numThreads) The function may fail, and returns a NULL pointer in this case.	2018-06-19 16:03:12 -07:00
Yann Collet	6768cf53fd	Merge pull request #1190 from terrelln/ldm-adjust Adjust advanced parameters to source size	2018-06-19 14:40:56 -07:00
W. Felix Handte	03c39c540b	Fix Incorrect Param	2018-06-19 15:36:33 -04:00
W. Felix Handte	de639502aa	Update Dict Attachment Cut-Offs	2018-06-19 15:36:13 -04:00
W. Felix Handte	f0a13bcd68	Make Sure Position 0 Gets Into the Tree	2018-06-19 15:10:06 -04:00
W. Felix Handte	87fe4788a3	Fix Compression Ratio Regression #1	2018-06-19 13:01:21 -04:00
W. Felix Handte	4bb79f9c55	Misc Changes	2018-06-19 13:01:21 -04:00
W. Felix Handte	2091f34e9e	Find Proper Matches	2018-06-19 13:01:21 -04:00
W. Felix Handte	64348a15f1	Misc Fixes	2018-06-19 13:01:21 -04:00
W. Felix Handte	ade8586ce6	Find `mls == 3` Matches	2018-06-19 13:01:21 -04:00
W. Felix Handte	ce743312e2	Fix Typo	2018-06-19 13:01:21 -04:00
W. Felix Handte	a075864756	Switch `!= ZSTD_extDict` to `== ZSTD_noDict`	2018-06-19 13:01:21 -04:00
W. Felix Handte	1e03377bde	Implement RepCode Check	2018-06-19 13:01:21 -04:00
W. Felix Handte	ccbf067973	Add _dictMatchState Functions	2018-06-19 13:01:21 -04:00
W. Felix Handte	d5d8240967	Convert `extDict` Flag to `dictMode` Enum	2018-06-19 13:01:21 -04:00
W. Felix Handte	93c3184d44	Attach Dicts when Using ZSTD_btopt and ZSTD_btultra	2018-06-19 13:01:21 -04:00
Yann Collet	1c714fda3f	introduced POOL_resize() not complete yet : finalize behavior in case of unfinished expansion	2018-06-18 20:46:39 -07:00
Nick Terrell	3841dbac84	Adjust advanced parameters to source size In the new advanced API, adjust the parameters even if they are explicitly set. This mainly applies to the `windowLog`, and accordingly the `hashLog` and `chainLog`, when the source size is known.	2018-06-18 15:49:31 -07:00
Yann Collet	e30f13bde0	Merge pull request #1185 from felixhandte/zstd-btlazy-in-place-dict ZSTD_btlazy2: Support Searching the Dictionary Context In-Place	2018-06-18 13:29:44 -07:00
Yann Collet	d8462ecba2	Merge branch 'dev' into huf_rename	2018-06-14 20:42:10 -04:00
Yann Collet	b7e5ebef2a	grouped X2 function together	2018-06-14 20:41:50 -04:00
Yann Collet	9698d2fb72	Merge pull request #1189 from facebook/hist histogram module	2018-06-14 20:39:52 -04:00
Yann Collet	6901c94cd6	avoid duplicate code comments when a function is described in hist.h, do not describe it again in hist.c to avoid future doc synchronization issues.	2018-06-14 19:47:05 -04:00
Yann Collet	f70f829ff5	Merge pull request #1187 from facebook/fix1186 fix dctx initialization within ZSTD_decompress in stack mode	2018-06-14 16:22:22 -04:00
Yann Collet	a71513bec6	Merge pull request #1184 from facebook/debug Grouped debug functions into debug.h	2018-06-14 16:21:53 -04:00
Yann Collet	1adf84ccb7	renamed all HUF_decompressX4() functions into X2 to underline they generate up to 2 symbols per decoding, in preparation for a future X3 variant.	2018-06-14 15:17:03 -04:00
Yann Collet	a09af5eb6b	renamed all HUF_decompressX2() functions into X1 to underline they generate one symbol per decoding operation. The new naming scheme will make it easier to introduce an X3 variant.	2018-06-14 15:08:43 -04:00
W. Felix Handte	0c654d22c8	Force Inline BtFindBestMatch	2018-06-14 14:54:39 -04:00
Yann Collet	7fee966f02	fix dctx initialization within ZSTD_decompress in stack mode when ZSTD_HEAPMODE=0 (which is not default). Also : added an associated test (test-fuzzer-stackmode) run on travis CI fix #1186	2018-06-14 10:22:24 -04:00
Yann Collet	fc682263d0	fixed g_debuglevel variable name in debug.h	2018-06-13 20:02:33 -04:00
Yann Collet	2d76defbfe	grouped all histogram functions into hist.c renamed functions with HIST_* prefix	2018-06-13 19:49:31 -04:00
W. Felix Handte	0551de4b5a	Search Dict for Matches	2018-06-13 16:06:28 -04:00
W. Felix Handte	ace9cfa950	Attach Dicts when Using ZSTD_btlazy2	2018-06-13 16:06:28 -04:00
Yann Collet	fa41bcc2c2	grouped debug functions into debug.h There were 2 competing sets of debug functions within zstd_internal.h and bitstream.h. They were mostly duplicate, and required care to avoid messing with each other. There is now a single implementation, shared by both. Significant change : The macro variable ZSTD_DEBUG no longer exists, it has been replaced by DEBUGLEVEL, which required modifying several source files.	2018-06-13 15:43:09 -04:00
W. Felix Handte	d53200a846	Fix Cast Warning	2018-06-13 14:58:36 -04:00
W. Felix Handte	b82063b266	Extend Dictionary Matches Backwards	2018-06-13 14:58:36 -04:00
W. Felix Handte	d53a04211c	Update Dictionary Attachment Cutoff Values Again	2018-06-13 14:58:36 -04:00
W. Felix Handte	2162aa9f18	Do Not Inline DMS Search Function	2018-06-13 14:58:36 -04:00
W. Felix Handte	338bede9b5	Also Implement Depth Repcode Checks	2018-06-13 14:58:36 -04:00
W. Felix Handte	555ab9f8cf	Apply Match Continuation Bug Fix	2018-06-13 14:58:36 -04:00
W. Felix Handte	c87dd2121d	Update Dictionary Attachment Cutoff Values	2018-06-13 14:58:36 -04:00
W. Felix Handte	6204b6d592	Check Dict Match State in ZSTD_HcFindBestMatch_generic	2018-06-13 14:58:36 -04:00
W. Felix Handte	211a61b69b	Focus on Non-BT Impls for the Moment	2018-06-13 14:58:36 -04:00
W. Felix Handte	2e93736a77	Remove Pre-Existing Repcode Check	2018-06-13 14:58:36 -04:00
W. Felix Handte	3b82a23a35	Second Repcode Check	2018-06-13 14:58:36 -04:00
W. Felix Handte	a2a24bebec	First Repcode Check	2018-06-13 14:58:36 -04:00
W. Felix Handte	f74c2cd673	Disallow Too-Long Repcodes When Using an Attached Dict	2018-06-13 14:58:36 -04:00
W. Felix Handte	c14db94450	Rename `base` -> `prefixLowest`	2018-06-13 14:58:36 -04:00
W. Felix Handte	5d90708a0a	Go Back to Separate Intermediate Functions for Different Dict Modes	2018-06-13 14:58:36 -04:00
W. Felix Handte	f84fc63a43	Further Templatize Intermediate Functions on dictMode	2018-06-13 14:58:36 -04:00
W. Felix Handte	529d3a5acd	Convert Existing U32 extDict Vars to ZSTD_dictMode Enums	2018-06-13 14:58:36 -04:00
W. Felix Handte	33e2240fac	Attach Dict When Using ZSTD_lazy Strategies	2018-06-13 14:58:36 -04:00
W. Felix Handte	90cfc799e5	Add _dictMatchState Stubs for ZSTD_lazy Functions	2018-06-13 14:58:36 -04:00
W. Felix Handte	a85ecb32bd	Add dictMode Param to ZSTD_compressBlock_lazy_generic	2018-06-13 14:58:36 -04:00
Yann Collet	750ee87a92	Merge pull request #1175 from ryandesign/macos Fix name of macOS	2018-06-13 11:32:06 -04:00
Yann Collet	b2632bcf6c	Merge pull request #1174 from duc0/document_default_level Expose ZSTD_CLEVEL_DEFAULT and update documentation	2018-06-12 12:09:01 -07:00
Duc Ngo	869e2718f6	Line break	2018-06-11 10:02:15 -07:00
Duc Ngo	e8ef725e13	Address comments	2018-06-11 10:01:35 -07:00
Ryan Schmidt	b567ce9d68	Fix name of macOS	2018-06-09 14:31:17 -05:00
Duc Ngo	e34c000e44	Expose ZSTD_CLEVEL_DEFAULT and update documentation	2018-06-08 11:33:44 -07:00
Yann Collet	3050733042	Merge branch 'dev' into negLevels	2018-06-07 15:51:35 -07:00
Yann Collet	c2c47e24e0	support targetlen==0 with strategy==ZSTD_fast to mean "normal compression", targetlen >= 1 now means "disable huffman compression of literals"	2018-06-07 15:49:01 -07:00
Yann Collet	a57b4df85f	removed literalCompression directive in this version, literal compression is always disabled for ZSTD_fast strategy. Performance parity between ZSTD_compress_advanced() and ZSTD_compress_generic()	2018-06-07 15:24:12 -07:00
Yann Collet	8537bfd85c	fuzzer: make negative compression level fail result of ZSTD_compress_advanced() is different from ZSTD_compress_generic() when using negative compression levels because the disabling of huffman compression is not passed in parameters.	2018-06-07 15:12:13 -07:00
Yann Collet	8ef75547ef	Merge pull request #1165 from facebook/ctxSizeDown Dynamic context downsize	2018-06-07 14:44:32 -07:00
Yann Collet	e3c42c739b	clean ZSTD_compress() initialization The (pretty old) code inside ZSTD_compress() was making some pretty bold assumptions on what's inside a CCtx and how to init it. This is pretty fragile by design. CCtx content evolves. Knowledge of how to handle that should be concentrated in one place. A side effect of this strategy is that ZSTD_compress() wouldn't check for BMI2 capability, and is therefore missing out on some potential speed opportunity. This patch makes ZSTD_compress() use the same initialization and release functions as the normal creator / destructor ones. Measured on my laptop, with a custom version of bench manually modified to use ZSTD_compress() (instead of the advanced API) : This patch : 1#silesia.tar : 211984896 -> 73651053 (2.878), 312.2 MB/s , 723.8 MB/s 2#silesia.tar : 211984896 -> 70163650 (3.021), 226.2 MB/s , 649.8 MB/s 3#silesia.tar : 211984896 -> 66996749 (3.164), 169.4 MB/s , 636.7 MB/s 4#silesia.tar : 211984896 -> 65998319 (3.212), 136.7 MB/s , 619.2 MB/s dev branch : 1#silesia.tar : 211984896 -> 73651053 (2.878), 291.7 MB/s , 727.5 MB/s 2#silesia.tar : 211984896 -> 70163650 (3.021), 216.2 MB/s , 655.7 MB/s 3#silesia.tar : 211984896 -> 66996749 (3.164), 162.2 MB/s , 633.1 MB/s 4#silesia.tar : 211984896 -> 65998319 (3.212), 130.6 MB/s , 618.6 MB/s	2018-06-07 14:05:25 -07:00
Yann Collet	b27c7389e3	Merge pull request #1164 from GeorgeLu97/CustomMacros Partial Compilation Macros	2018-06-06 16:47:42 -07:00
Yann Collet	24319975b6	bumped version number to v1.3.5	2018-06-06 15:51:55 -07:00
Yann Collet	f1ea383f45	context can be sized down even with constant parameters when parameters are "equivalent", the context is re-used in continue mode, hence needed workspace size is not recalculated. This incidentally also evades the size-down check and action. This patch intercepts the "continue mode" so that the size-down check and action is actually triggered.	2018-06-06 15:04:12 -07:00
Yann Collet	e5e17d009f	changed member name to workSpaceOversizedDuration	2018-06-06 15:00:27 -07:00
Yann Collet	f7392f3dc9	added test case	2018-06-05 14:53:28 -07:00
George Lu	11d5bfdaa9	Revert "Partial compilation test?" This reverts commit `b2496ab606`.	2018-06-05 13:55:36 -07:00
George Lu	b2496ab606	Partial compilation test?	2018-06-05 13:24:00 -07:00
Yann Collet	3d523c741b	added workSpaceTooLarge and workSpaceWasteful also : slightly increased speed of test fuzzer.16	2018-06-05 11:42:48 -07:00
George Lu	b3ef314830	Fix Typos	2018-06-04 17:19:06 -07:00
Yann Collet	357c648c3f	changed a few variable names to unify naming convention	2018-06-04 17:10:50 -07:00
George Lu	609d72b0ca	Added Deprecated Dependencies	2018-06-04 14:33:21 -07:00
George Lu	9437021d2f	Remove old file declaration	2018-06-04 13:32:41 -07:00
George Lu	6a617d70ed	Documentation	2018-06-04 09:56:37 -07:00
George Lu	65de25a463	Created Macros	2018-06-04 09:56:29 -07:00
Yann Collet	2108decb41	Fixed a nasty corruption bug recently introduced into the new dictionary mode. The bug could be reproduced with this command : ./zstreamtest -v --opaqueapi --no-big-tests -s4092 -t639 error was in function ZSTD_count_2segments() : the beginning of the 2nd segment corresponds to prefixStart and not the beginning of the current block (istart == src). This would result in comparing the wrong byte.	2018-06-01 18:54:34 -07:00
Yann Collet	143fc9ff6c	Merge pull request #1157 from facebook/decompressedSize minor : improved zstd.h API code comment	2018-06-01 10:28:17 -07:00
Yann Collet	7c33b48221	Merge pull request #1151 from felixhandte/zstd-dfast-in-place-dict-goto ZSTD_dfast: Support Searching the Dictionary Context In-Place (Alternate `goto` Implementation)	2018-05-31 17:37:09 -07:00
W. Felix Handte	48deab92de	Allow Different Dict Attachment Cut-Offs for Different Strategies	2018-05-31 17:37:44 -04:00
W. Felix Handte	f86796639e	Remove Incorrect and Extraneous Repcode Bounds Check	2018-05-31 17:02:29 -04:00
Yann Collet	9b979d0e33	minor : improved API code comment Extend guarantee that ZSTD_getFrameContentSize() will deliver the decompressed size to any single-pass compression function. Answer #1156	2018-05-31 11:12:18 -07:00
Yann Collet	809f2f9322	minor update of literal cost function just assert() there is no negative cost evaluation for literals	2018-05-29 15:34:50 -07:00
Yann Collet	463a0fe38b	simplified optimal parser removed "cached" structure. prices are now saved in the optimal table. Primarily done for simplification. Might improve speed by a little. But actually, and surprisingly, also improves ratio in some circumstances.	2018-05-29 14:07:25 -07:00
Yann Collet	bb6eaf6495	Merge pull request #1153 from facebook/dynThreshold changed dynamic fse threshold for offset	2018-05-26 08:43:45 -07:00
Yann Collet	e916c365a1	fixed minor visual warning	2018-05-25 20:43:09 -07:00
Yann Collet	a7fdceeccd	changed dynamic fse threshold for offset recent experience showed that default distribution table for offset can get it wrong pretty quickly with the nb of symbols, while it remains a reasonable choice much longer for lengths symbols. Changed the formula, so that dynamic threshold is now 32 symbols for offsets. It remains at 64 symbols for lengths. Detection based on defaultNormLog	2018-05-25 17:41:16 -07:00
Yann Collet	4b3a36d5d8	Merge branch 'dev' into lowCompression	2018-05-25 15:45:03 -07:00
Yann Collet	5f177f1c53	btultra accepts blocks with poorer compression ratio zstd rejects blocks which do not compress by at least a certain amount. In which case, such block is simply emitted uncompressed (even if a little bit of compression could be achieved). This is better for decompression speed, hence for energy. The logic is controlled by ZSTD_minGain(). The rule is applied uniformly, at all compression levels. This change makes btultra accept blocks with poor compression ratios. We presume that users of btultra mode prefer compression ratio over some decompress speed gains. The threshold for minimum gain is lowered for btultra from s>>6 (~1.5% minimum gain) to s>>7 (~0.8% minimum gain). This is a prudent change. Not sure if it's large enough.	2018-05-25 15:19:52 -07:00
Yann Collet	e2c0e3d437	slightly nudge choices towards less sequences also slightly improve some strange detrimental corner cases.	2018-05-25 14:52:21 -07:00
W. Felix Handte	5b292b5685	Check Long + 1 Matches in Both Prefix and Dict in Both Short Match Paths	2018-05-25 13:13:57 -04:00
W. Felix Handte	88b733b380	Interleave Prefix and Dict Searches	2018-05-25 13:13:57 -04:00
W. Felix Handte	1850025156	Refactor ZSTD_dfast to Use `goto`s	2018-05-25 13:13:57 -04:00
W. Felix Handte	43606f9c83	... When I Said "HashTable", I Meant "ChainTable"	2018-05-25 13:13:28 -04:00
W. Felix Handte	ec7efe88f5	Fix Off-By-One Error	2018-05-25 13:13:28 -04:00
W. Felix Handte	2bfe43267e	Disallow Too-Long Repcodes When Using an Attached Dict	2018-05-25 13:13:28 -04:00
W. Felix Handte	b97ad3f457	Port Changes Made to ZSTD_fast to ZSTD_dfast	2018-05-25 13:13:28 -04:00
W. Felix Handte	2313cca1b7	Implement Second Repcode Check	2018-05-25 13:13:28 -04:00
W. Felix Handte	0998f10813	Implement First Repcode Check	2018-05-25 13:13:28 -04:00
W. Felix Handte	50c5b2bb90	Find Dict Hash Table Matches	2018-05-25 13:13:28 -04:00
W. Felix Handte	7a25f7ef5b	Existing Repcode Check Only Applies to noDict Case	2018-05-25 13:13:28 -04:00
W. Felix Handte	8b241da4df	Properly Initialize Repcode Values	2018-05-25 13:13:28 -04:00
W. Felix Handte	7097a03749	Add Necessary Dict Variables	2018-05-25 13:13:28 -04:00
W. Felix Handte	aacbbf4f9a	Rename 'lowest' to 'localLowest' to Prepare to Introduce Dict Indices	2018-05-25 13:13:28 -04:00
W. Felix Handte	c10d1b4011	Skeleton for In-Place Impl for ZSTD_dfast	2018-05-25 13:13:28 -04:00
Yann Collet	f6ad59ab5c	Merge branch 'dev' into staticDictCost	2018-05-24 16:21:02 -07:00
Yann Collet	b5ef32fea7	Merge branch 'dev' into fracFse	2018-05-24 14:09:49 -07:00
Yann Collet	776128d16f	fix corner case when requiring cost of an FSE symbol ensure that, when frequency[symbol]==0, result is (tableLog + 1) bits with both upper-bit and fractional-bit estimates. Also : enable BIT_DEBUG in /tests	2018-05-24 13:59:11 -07:00
Yann Collet	08c5be5db3	Merge pull request #1117 from felixhandte/zstd-fast-in-place-dict ZSTD_fast: Support Searching the Dictionary Context In-Place	2018-05-23 19:32:25 -07:00
Nick Terrell	06b70179da	Work around bug in zstd decoder (#1147) Work around bug in zstd decoder Pull request #1144 exercised a new path in the zstd decoder that proved to be buggy. Avoid the extremely rare bug by emitting an uncompressed block.	2018-05-23 18:02:30 -07:00
Nick Terrell	f2d0924b87	Variable declarations	2018-05-23 14:58:58 -07:00
W. Felix Handte	d9c7e67125	Assert that Dict and Current Window are Adjacent in Index Space	2018-05-23 17:53:03 -04:00
W. Felix Handte	298d24fa57	Make loadedDictEnd an Index, not the Dict Len	2018-05-23 17:53:03 -04:00
W. Felix Handte	7ef85e0618	Fixes in re Comments	2018-05-23 17:53:03 -04:00
W. Felix Handte	582b7f85ed	Don't Attach Empty Dict Contents In weird corner cases, they produce unexpected results...	2018-05-23 17:53:03 -04:00
W. Felix Handte	9c92223468	Avoid Undefined Behavior in Match Ptr Calculation	2018-05-23 17:53:03 -04:00
W. Felix Handte	a44ab3b475	Remove Out-of-Date Comment	2018-05-23 17:53:03 -04:00
W. Felix Handte	95bdf20a87	Moar Renames	2018-05-23 17:53:03 -04:00
W. Felix Handte	7e0402e738	Also Attach Dict When Source Size is Unknown	2018-05-23 17:53:03 -04:00
W. Felix Handte	3ba70cc759	Clear the Dictionary When Sliding the Window	2018-05-23 17:53:03 -04:00
W. Felix Handte	b05ae9b608	Refine ip Initialization to Avoid ARM Weirdness	2018-05-23 17:53:03 -04:00
W. Felix Handte	1a7b34ef28	Use New Index Invariant to Simplify Conditionals	2018-05-23 17:53:03 -04:00
W. Felix Handte	2d598e6fed	Force Working Context Indices Greater than Dict Indices	2018-05-23 17:53:03 -04:00
W. Felix Handte	d005e5daf4	Whitespace Fix	2018-05-23 17:53:03 -04:00
W. Felix Handte	154eb09419	Switch to Original Match Calc for noDict Repcode Check	2018-05-23 17:53:03 -04:00
W. Felix Handte	191fc74a51	Rename 'hasDict' to 'dictMode'	2018-05-23 17:53:03 -04:00
W. Felix Handte	ae4fcf7816	Respond to PR Comments; Formatting/Style/Lint Fixes	2018-05-23 17:53:03 -04:00
W. Felix Handte	ca26cecc7a	Rename and Reformat	2018-05-23 17:53:03 -04:00
W. Felix Handte	66bc1ca641	Change Cut-Off to 8 KB	2018-05-23 17:53:03 -04:00
W. Felix Handte	c31ee3c7f8	Fix Rep Code Initialization	2018-05-23 17:53:03 -04:00
W. Felix Handte	b67196f30d	Coalesce hasDictMatchState and extDict Checks into One Enum and Rename Stuff	2018-05-23 17:53:03 -04:00
W. Felix Handte	265c2869d1	Split Wrapper Functions to Cause Inlining	2018-05-23 17:53:03 -04:00
W. Felix Handte	6929964d65	Add bounds check in repcode tests	2018-05-23 17:53:03 -04:00
W. Felix Handte	70a537d1d7	Initial Repcode Check Support for Ext Dict Ctx	2018-05-23 17:53:03 -04:00
W. Felix Handte	8d24ff0353	Preliminary Support in ZSTD_compressBlock_fast_generic() for Ext Dict Ctx	2018-05-23 17:53:03 -04:00
W. Felix Handte	d18a405779	Refer to the Dictionary Match State In-Place (Sometimes)	2018-05-23 17:53:03 -04:00
Nick Terrell	c92dd11940	Error if reported size is too large in edge case	2018-05-23 14:47:20 -07:00
Nick Terrell	a97e9a627a	[zstd] Fix decompression edge case This edge case is only possible with the new optimal encoding selector, since before zstd would always choose `set_basic` for small numbers of sequences. Fix `FSE_readNCount()` to support buffers < 4 bytes. Credit to OSS-Fuzz	2018-05-23 12:16:00 -07:00
Nick Terrell	e3959d5eba	Fixes	2018-05-22 16:06:33 -07:00
Yann Collet	7a8b3496b4	Merge branch 'dev' into staticDictCost	2018-05-22 15:10:05 -07:00
Yann Collet	a8ddf1d370	disable 2-passes strategy	2018-05-22 15:06:36 -07:00
Nick Terrell	49cf880513	Approximate FSE encoding costs for selection Estimate the cost for using FSE modes `set_basic`, `set_compressed`, and `set_repeat`, and select the one with the lowest cost. * The cost of `set_basic` is computed using the cross-entropy cost function `ZSTD_crossEntropyCost()`, using the normalized default count and the count. * The cost of `set_repeat` is computed using `FSE_bitCost()`. We check the previous table to see if it is able to represent the distribution. * The cost of `set_compressed` is computed with the entropy cost function `ZSTD_entropyCost()`, together with the cost of writing the normalized count `ZSTD_NCountCost()`.	2018-05-22 14:33:22 -07:00
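The selection step described in 49cf880513 can be sketched as follows. This is not the code from the commit: it uses a coarse whole-bit upper bound in place of `ZSTD_crossEntropyCost()` and `FSE_bitCost()`, and every name in it (`EncodingMode`, `table_cost_bits`, `select_mode`) is hypothetical. The caller is assumed to add the cost of writing the normalized count to the `set_compressed` estimate before comparing.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for set_basic, set_repeat and set_compressed. */
typedef enum { MODE_BASIC, MODE_REPEAT, MODE_COMPRESSED } EncodingMode;

/* floor(log2(v)) for v > 0 */
static unsigned floor_log2(unsigned v)
{
    unsigned r = 0;
    while (v >>= 1) r++;
    return r;
}

/* Upper-bound cost, in whole bits, of coding the histogram `count` with
 * a normalized table `norm` whose entries sum to 1 << tableLog.
 * Returns SIZE_MAX when a symbol that occurs has probability 0, meaning
 * the table cannot represent the distribution. */
static size_t table_cost_bits(const unsigned *norm, unsigned tableLog,
                              const unsigned *count, size_t nbSymbols)
{
    size_t bits = 0;
    size_t s;
    for (s = 0; s < nbSymbols; s++) {
        if (count[s] == 0) continue;
        if (norm[s] == 0) return SIZE_MAX;
        bits += (size_t)count[s] * (tableLog - floor_log2(norm[s]));
    }
    return bits;
}

/* Pick the mode with the lowest estimated cost; ties favor the simpler
 * mode (basic, then repeat, then compressed). */
static EncodingMode select_mode(size_t basicCost, size_t repeatCost,
                                size_t compressedCost)
{
    if (basicCost <= repeatCost && basicCost <= compressedCost) return MODE_BASIC;
    if (repeatCost <= compressedCost) return MODE_REPEAT;
    return MODE_COMPRESSED;
}
```

Returning `SIZE_MAX` for an unusable table lets the same comparison rule out `set_repeat` when the previous table cannot represent the current distribution.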
Yann Collet	27af35c110	Merge pull request #1143 from facebook/tableLevels Update table of compression levels	2018-05-19 14:40:37 -07:00
Yann Collet	5381369cb1	Merge branch 'dev' into tableLevels	2018-05-18 18:23:27 -07:00
Yann Collet	b0b3fb517d	updated compression levels for blocks of 256KB	2018-05-18 17:17:12 -07:00
Nick Terrell	7cbb8bbbbf	[cover] Small compression ratio improvement	2018-05-18 16:15:27 -07:00

The cover algorithm selects one segment per epoch, and it selects the epoch size such that `epochs * segmentSize ~= dictSize`. Selecting fewer epochs gives the algorithm more candidates to choose from for each segment it selects, and then it will loop back to the first epoch when it hits the last one. The trade off is that now it takes longer to select each segment, since it has to look at more data before making a choice.

I benchmarked on the following data sets using this command:

```sh
$ZSTD -T0 -3 --train-cover=d=8,steps=256 $DIR -r -o dict && $ZSTD -3 -D dict -rc $DIR | wc -c
```

| Data set     | k (approx) | Before   | After    | % difference |
|--------------|------------|----------|----------|--------------|
| GitHub       | ~1000      | 738138   | 746610   | +1.14%       |
| hg-changelog | ~90        | 4295156  | 4285336  | -0.23%       |
| hg-commands  | ~500       | 1095580  | 1079814  | -1.44%       |
| hg-manifest  | ~400       | 16559892 | 16504346 | -0.34%       |

There is some noise in the measurements, since small changes to `k` can have large differences, which is why I'm using `steps=256`, to try to minimize the noise. However, the GitHub data set still has some noise. If I run the GitHub data set on my Mac, which presumably lists directory entries in a different order, so the dictionary builder sees the files in a different order, or I use `steps=1024` I see these results.

| Run        | Before | After  | % difference |
|------------|--------|--------|--------------|
| steps=1024 | 738138 | 734470 | -0.50%       |
| MacBook    | 738451 | 737132 | -0.18%       |

Question: Should we expose this as a parameter? I don't think it is necessary. Someone might want to turn it up to exchange a much longer dictionary building time in exchange for a slightly better dictionary. I tested `2`, `4`, and `16`, and `4` got most of the benefit of `16` with a faster running time.
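A minimal sketch of the epoch arithmetic discussed in 7cbb8bbbbf, assuming the change amounts to dividing the original epoch count (`dictSize / segmentSize`) by a fixed factor. The function names and the `divisor` parameter are hypothetical; the message reports testing factors of `2`, `4` and `16`.

```c
#include <stddef.h>

/* Number of epochs when the baseline count (dictSize / segmentSize) is
 * reduced by `divisor`. Never returns fewer than one epoch. */
static size_t epoch_count(size_t dictSize, size_t segmentSize, size_t divisor)
{
    size_t const epochs = dictSize / segmentSize / divisor;
    return epochs ? epochs : 1;
}

/* Amount of input each epoch spans: fewer epochs means each epoch covers
 * more data, so more candidates are examined per selected segment. */
static size_t epoch_size(size_t inputSize, size_t epochs)
{
    return inputSize / epochs;
}
```

For example, with a 112640-byte dictionary and 1000-byte segments, a divisor of 4 reduces 112 epochs to 28, so each epoch spans four times as much input.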
Yann Collet	5cbef6e094	Merge branch 'dev' into staticDictCost	2018-05-18 16:03:06 -07:00
Yann Collet	a95e9e80d1	adding some debug functions to observe statistics	2018-05-18 14:09:42 -07:00
fbrosson	291824f49d	__builtin_prefetch did probably not exist before gcc 3.1.	2018-05-18 18:40:11 +00:00
fbrosson	16bb8f1f9e	Drop colon in asm snippet to make old versions of gcc happy.	2018-05-18 17:05:36 +00:00
Yann Collet	af3da079d1	fixed minor conversion warning	2018-05-17 17:27:27 -07:00
Yann Collet	8572b4d09f	fixed a pretty complex bug when combining ldm + btultra	2018-05-17 16:13:53 -07:00
Yann Collet	134388ba6b	collect statistics for first block in ultra mode this patch makes btultra do 2 passes on the first block, the first one being dedicated to collecting statistics so that the 2nd pass is more accurate. It translates into a very small compression ratio gain : enwik7, level 20: blocks 4K : 2.142 -> 2.153 blocks 16K : 2.447 -> 2.457 blocks 64K : 2.716 -> 2.726 On the other hand, the cpu cost is doubled. The trade off looks bad. Though, that's ultimately a price to pay to reach better compression ratio. So it's only enabled when setting btultra.	2018-05-17 12:24:30 -07:00
Yann Collet	a243020d37	slightly improved weight calculation translating into a tiny compression ratio improvement	2018-05-17 11:19:44 -07:00
Yann Collet	63eeeaa1dd	update table levels for blocks <= 16K also : allow hlog to be slightly larger than windowlog, as it's apparently good for both speed and compression ratio.	2018-05-16 16:13:37 -07:00
Yann Collet	18fc3d3cd5	introduced bit-fractional cost evaluation this improves compression ratio by a tiny amount. It also reduces speed by a small amount. Consequently, bit-fractional evaluation is only turned on for btultra.	2018-05-16 14:53:35 -07:00
Yann Collet	9938b17d4c	Merge pull request #1135 from facebook/frameCSize decompress: changed error code when input is too large	2018-05-15 11:02:53 -07:00
Nick Terrell	30d9c84b1a	Fix failing Travis tests	2018-05-15 09:46:20 -07:00
Yann Collet	0b31304c8d	Merge branch 'dev' into staticDictCost	2018-05-14 18:09:26 -07:00
Yann Collet	2c26df0e13	opt: removed static prices after testing, it's actually always better to use dynamic prices albeit initialised from dictionary.	2018-05-14 18:04:08 -07:00
Yann Collet	f372ffc64d	Merge pull request #1127 from facebook/staticDictCost Improved optimal parser with dictionary	2018-05-14 17:45:50 -07:00
Yann Collet	d59cf02df0	decompress: changed error code when input is too large ZSTD_decompress() can decompress multiple frames sent as a single input. But the input size must be the exact sum of all compressed frames, no more. In the case of a mistake on srcSize, being larger than required, ZSTD_decompress() will try to decompress a new frame after current one, and fail. As a consequence, it will issue an error code, ERROR(prefix_unknown). While the error is technically correct (the decoder could not recognise the header of _next_ frame), it's confusing, as users will believe that the first header of the first frame is wrong, which is not the case (it's correct). It makes it more difficult to understand that the error is in the source size, which is too large. This patch changes the error code provided in such a scenario. If (at least) a first frame was successfully decoded, and the following bytes are garbage values, the decoder assumes the provided input size is wrong (too large), and issues the error code ERROR(srcSize_wrong).	2018-05-14 15:32:28 -07:00
Yann Collet	c9227ee16b	update table for 128 KB blocks	2018-05-13 17:15:07 -07:00
Yann Collet	b4250489cf	update compression levels for large inputs	2018-05-13 01:53:38 -07:00
Yann Collet	761758982e	replaced FSE_count by FSE_count_simple to reduce usage of stack memory. Also : tweaked a few comments, as suggested by @terrelln	2018-05-11 16:03:37 -07:00
Yann Collet	3193d692c2	minor patch, ensuring LIBDIR is created before installation follow-up from #1123	2018-05-11 11:31:48 -07:00
Yann Collet	99ddca43a6	fixed wrong assertion base can actually overflow	2018-05-10 19:48:09 -07:00
Yann Collet	0d7626672d	fixed c++ conversion warning	2018-05-10 18:17:21 -07:00
Yann Collet	09d0fa29ee	minor adjusting of weights	2018-05-10 18:13:48 -07:00
Yann Collet	1a26ec6e8d	opt: init statistics from dictionary instead of starting from fake "default" statistics.	2018-05-10 17:59:12 -07:00
Yann Collet	74b1c75d64	btopt : minor adjustment of update frequencies	2018-05-10 16:32:36 -07:00
Yann Collet	ac6105463a	opt: minor improvements to log traces slight improvement when using fractional-bit evaluation (opt:dictionary)	2018-05-09 15:46:11 -07:00
Yann Collet	c39061cb7b	fixed declaration-after-statement warning	2018-05-09 12:07:25 -07:00
Yann Collet	4d5bd32a00	added traces to look at symbol costs evaluation looks correct.	2018-05-09 12:00:12 -07:00
Yann Collet	c0da0f5e9e	switchable bit-approximation / fractional-bit accuracy modes also : makes it possible to select nb of fractional bits.	2018-05-09 10:48:09 -07:00
Yann Collet	ba2ad9b6b9	implemented fractional bit cost evaluation for FSE symbols. While it seems to work, the gains are negligible compared to rough maxNbBits evaluation. There are even a few losses sometimes, that still need to be explained. Furthermore, there are still cases where btlazy2 does a better job than btopt, which seems rather strange too.	2018-05-08 17:43:13 -07:00
Yann Collet	1aff63b114	opt: shift all costs by 8 bits (* 256) making it possible to represent fractional bit costs.	2018-05-08 16:19:04 -07:00
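A sketch of what scaling costs by 256 makes possible, as in 1aff63b114: with 8 fractional bits, a symbol cost such as 1.5 bits becomes the integer 384. The log2 approximation below (integer part from the highest set bit, fraction by linear interpolation) is an assumption chosen for illustration, not necessarily the formula used by the optimal parser, and all names are hypothetical.

```c
#define COST_FRAC_BITS 8
#define COST_ONE_BIT   (1u << COST_FRAC_BITS)   /* 256 units = 1 bit */

/* floor(log2(v)) for v > 0 */
static unsigned floor_log2(unsigned v)
{
    unsigned r = 0;
    while (v >>= 1) r++;
    return r;
}

/* Approximates log2(v) * 256 for 0 < v < 2^24 (the bound keeps the
 * left shift below from overflowing 32 bits). */
static unsigned log2_fixed(unsigned v)
{
    unsigned const hb = floor_log2(v);
    unsigned const frac = ((v << COST_FRAC_BITS) >> hb) - COST_ONE_BIT; /* 0..255 */
    return hb * COST_ONE_BIT + frac;
}

/* Estimated cost, in 1/256 bit units, of a symbol seen `freq` times out
 * of `total`, i.e. log2(total / freq). Requires 0 < freq <= total. */
static unsigned symbol_cost(unsigned freq, unsigned total)
{
    return log2_fixed(total) - log2_fixed(freq);
}
```

With whole-bit costs, a symbol seen 3 times out of 8 would be charged 2 bits; here it is charged 384 units (1.5 bits), closer to the true value of about 1.42 bits.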
Yann Collet	6a3c34aa58	opt: estimate cost of both Huffman and FSE symbols For FSE symbols : provide an upper bound, in nb of bits, since cost function is not able to store fractional bit costs.	2018-05-08 16:11:21 -07:00
Yann Collet	338f738c24	pass entropy tables to optimal parser for proper estimation of symbol's weights when using dictionary compression. Note : using only huffman costs is not good enough, presumably because sequence symbol costs are incorrect.	2018-05-08 15:37:06 -07:00
Yann Collet	a155061328	minor code refactor for readability removed some useless operations from optimal parser (should not change performance, too small a difference)	2018-05-08 12:32:44 -07:00
Baruch Siach	9a0643b633	lib/Makefile: create include directory before headers installation Make sure that $(INCLUDEDIR) exists before copying the headers there. Otherwise, the content of header files is copied over $(DESTDIR)$(INCLUDEDIR), making it a regular file. While at it, remove $(DESTDIR)$(INCLUDEDIR) from the list of directories to create in the install-pc target. The install-pc target does not need this directory.	2018-05-08 20:59:44 +03:00
Yann Collet	ad4524d605	fix ZSTD_compressBlock() associated with CDict reported by @let-def. It's actually a bug in ZSTD_compressBegin_usingCDict() which would pass a wrong pledgedSrcSize value (0 instead of ZSTD_CONTENTSIZE_UNKNOWN) resulting in wrong window size, resulting in downsized seqStore, resulting in segfault when writing into the seqStore later in the process. Added a test in fuzzer to cover this use case (fails before the patch).	2018-05-07 12:54:13 -07:00
Peter Seiderer	64bfdca5b9	Split library install target into pc, static, shared and include only target Signed-off-by: Peter Seiderer <ps.report@gmx.net>	2018-04-30 20:32:32 +02:00
Nick Terrell	ca77822ddf	Fix parameter adjustment with dictionary The new advanced API basically set `requestedParams = appliedParams` when using a dictionary. This halted all parameter adjustment, which can hurt compression ratio if, for example, the window log is small for the first call, but the rest of the files are large. This patch fixes the bug, and checks that the `requestedParams` don't change in the new advanced API when using a dictionary, and generally in the fuzzer.	2018-04-25 16:32:29 -07:00
Yann Collet	12f60b8c98	clarified documentation related to refPrefix()	2018-04-25 10:17:06 -07:00
Yann Collet	ace856a835	updated documentation of streaming compression api	2018-04-24 14:44:27 -07:00
taigacon	2c3ad05812	Fix the problem that enables DYNAMIC_BMI2 macro by mistake on ARM architecture with Clang (#1110)	2018-04-23 15:41:50 -07:00
Nick Terrell	e8c9dc5cea	Fix documentation	2018-04-13 12:43:38 -07:00
Nick Terrell	c0987986e5	Only reset CDict in ZSTD_CCtx_resetParameters()	2018-04-13 11:26:40 -07:00
Nick Terrell	9f76eebd17	Add ZSTD_CCtx_resetParameters() function * Fix docs for `ZSTD_CCtx_reset()`. * Add `ZSTD_CCtx_resetParameters()`. Fixes #1094.	2018-04-12 16:54:07 -07:00
Nick Terrell	3c3f59e68f	Enforce pledgeSrcSize whenever known (#1106) The test fails before the patch and passes after. Fixes #1095.	2018-04-12 16:02:03 -07:00
Nick Terrell	280a236e9e	Add ZSTD_CCtx(Param)?_getParameter() function Closes #1096.	2018-04-12 11:50:12 -07:00
Yann Collet	04212178b5	doc : clarified advanced API usage sticky parameters only work with `ZSTD_compress_generic()`	2018-04-10 11:40:36 -07:00
Yann Collet	ad5ba6cdcf	updated comment on parameters that can be changed during compression	2018-04-09 17:39:07 -07:00
Yann Collet	1da629f2ad	Merge pull request #1104 from terrelln/fast-train Allow negative compression levels in training	2018-04-09 14:16:20 -07:00
Nick Terrell	569e2abccd	Allow negative compression levels in training * Set `dictCLevel` in `zstdcli.c`. * Only set to default level if the compression level `== 0`, not `<= 0`.	2018-04-09 12:12:03 -07:00
Yann Collet	4195b36dd7	Merge pull request #1100 from bket/stable_sort zstd requires a stable sort.	2018-04-05 11:39:27 -07:00
Yann Collet	f35b8ba9da	updated ZSTD_p_chainLog description	2018-04-05 11:05:11 -07:00
Björn Ketelaars	462aed6811	zstd requires a stable sort. On OpenBSD qsort() is not guaranteed to be stable, their mergesort() is. This fixes issue #1088. All the hard work has been done by @terrelln.	2018-04-05 07:59:16 +02:00
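The stability problem behind 462aed6811 can be illustrated with a different, portable technique: breaking ties on the original position so that any `qsort()` produces a stable order. This is not the fix from the commit, which relies on OpenBSD's `mergesort()`. The `Entry` type and function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical record: sorted by `key`, with `index` remembering the
 * original position and `tag` standing in for the payload. */
typedef struct {
    unsigned key;
    size_t   index;
    char     tag;
} Entry;

/* Orders by key, then by original position. No two entries compare
 * equal, so the result does not depend on qsort() being stable. */
static int cmp_stable(const void *a, const void *b)
{
    const Entry *x = (const Entry *)a;
    const Entry *y = (const Entry *)b;
    if (x->key != y->key) return (x->key < y->key) ? -1 : 1;
    if (x->index != y->index) return (x->index < y->index) ? -1 : 1;
    return 0;
}

/* Records each entry's position, then sorts with the tie-breaking
 * comparator above. */
static void stable_sort(Entry *entries, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) entries[i].index = i;
    qsort(entries, n, sizeof *entries, cmp_stable);
}
```

Entries with equal keys keep their input order on every platform, at the cost of one extra field per element.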
Yann Collet	55f67502f4	Merge pull request #1098 from terrelln/nd-mt Only load extra table positions for CDicts	2018-04-02 15:38:20 -07:00
Nick Terrell	295ab0dbfa	Only load extra table positions for CDicts Zstdmt uses prefixes to load the overlap between segments. Loading extra positions makes compression non-deterministic, depending on the previous job the context was used for. Since loading extra position takes extra time as well, only do it when creating a `ZSTD_CDict`. Fixes #1077.	2018-04-02 14:41:30 -07:00
Yann Collet	5b616fa269	Merge pull request #1090 from bket/openbsd Fix building zstd on OpenBSD.	2018-04-02 14:15:26 -07:00
Björn Ketelaars	9d3048346d	Fix building zstd on OpenBSD.	2018-03-31 10:46:20 +02:00
Yann Collet	8be984ec45	fixed comments as suggested by @terrelln	2018-03-30 20:09:27 -07:00
Yann Collet	e6e848bfe9	added ZSTD_getFrameHeader_advanced() makes it possible to request frame header from a magicless frame	2018-03-29 17:51:08 -06:00
Yann Collet	a6694838e1	added more code documentation for ZSTD_getFrameHeader()	2018-03-29 15:24:17 -06:00
René Rebe	21eb26d664	fixed legacy/zstd_v* with older gcc version, by guarding builtin_* like in other files	2018-03-25 20:35:15 +02:00


4226 Commits