OSS-Fuzz uncovered a scenario where we evaluate the cost of litLength = 131072,
which can't be represented in the zstd format, so we read one element past the end of LL_bits.
Fix the issue by making litLength = 131072 cost 1 bit more than litLength = 131071.
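A sketch of the shape of the fix (the constant and pricing helper are illustrative, not zstd's actual code):

```c
#include <stdint.h>

/* Illustrative only: 131071 is the largest litLength the format can
 * encode; a 128 KB block of literals yields 131072, one past it. */
#define MAX_REPRESENTABLE_LITLEN 131071u

static uint32_t priceLitLength(uint32_t litLength,
                               uint32_t basePrice(uint32_t))
{
    if (litLength > MAX_REPRESENTABLE_LITLEN)            /* e.g. 131072 */
        return basePrice(MAX_REPRESENTABLE_LITLEN) + 1;  /* +1 bit */
    return basePrice(litLength);  /* normal path: indexes LL_bits safely */
}
```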
There are still follow-ups:
1. This happened because literals_cost[0] = 0, so the optimal parser chose 36 literals
over a match. Should we require literals_cost[literal] > 0, unless the block truly only
has one literal value?
2. When no matches are found, the cost model isn't updated. In this case, no matches were
found for an entire block, so the literals cost model wasn't updated at all. That made
the optimal parser think literals_cost[0] = 0, when it is actually quite high, since
the block was entirely random noise.
Credit to OSS-Fuzz.
This commit makes several changes:
1. It adds modules for the dictionary builder and errors headers.
2. It captures all of the macros that are used to configure these headers.
When the headers are imported as modules and one of these macros is defined,
the compiler issues a warning that it needs to be defined on the CLI.
3. It promotes the modulemap file into the root of the lib directory.
Experimentation shows that clang's `-fimplicit-module-maps` will find the
modulemap when placed here, but not when it's put in a subdirectory.
When re-using a compression state across multiple successive compressions,
the state should minimize the amount of allocation and initialization required.
This mostly matters in situations where initialization is an overwhelming task
compared to compression itself.
This can happen when the amount to compress is small,
while the compression state was given the impression that it would be much larger,
i.e. streaming mode without providing a srcSize hint.
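For reference, the scenario uses zstd's public streaming API roughly like this (the helper is illustrative; it assumes `dstCapacity >= ZSTD_compressBound(srcSize)` so a single call suffices):

```c
#include <zstd.h>

/* Reuses one cctx for many small inputs; no pledged srcSize is given,
 * so the context was sized for a potentially large input, and reuse
 * should stay cheap rather than re-initializing from scratch. */
static size_t compressOneShot(ZSTD_CCtx* cctx,
                              void* dst, size_t dstCapacity,
                              const void* src, size_t srcSize)
{
    ZSTD_inBuffer  in  = { src, srcSize, 0 };
    ZSTD_outBuffer out = { dst, dstCapacity, 0 };
    ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only);  /* keep parameters */
    {   size_t const r = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_end);
        /* r == 0 means fully flushed (guaranteed by the capacity assumption) */
        return ZSTD_isError(r) ? r : out.pos;
    }
}
```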
This lean-initialization optimization was broken in 980f3bbf83.
This commit fixes it, making this scenario once again on par with v1.4.9.
Note that this does not completely fix #2966,
since another heavy initialization, specific to row mode,
is also happening (and was not present in v1.4.9).
This will be fixed in a separate commit.
Apparently, even when the assembly file is empty (because
`ZSTD_ENABLE_ASM_X86_64_BMI2` is false), it is still marked as possibly
needing an executable stack, and so the whole library is marked as such. This
commit applies a simple patch for this problem by moving the noexecstack
indication (the `.note.GNU-stack` section note) outside the macro guard, so it
is always emitted.
This commit builds on #2857.
This commit addresses #2963.
directly at the ZSTD_storeSeq() interface.
In the process, remove ZSTD_REP_MOVE.
This makes it possible, in future commits,
to update and effectively simplify the naming scheme
to properly label the updated processing pipeline:
offset | repcode => offBase => offCode + offBits
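For concreteness, here is a sketch of the sumtype encoding (macro names are approximations of the direction taken, not necessarily the final ones):

```c
#define ZSTD_REP_NUM 3   /* number of repeat offsets (zstd's value) */

/* One unsigned value carries either a repcode or a real offset:
 *   1 .. ZSTD_REP_NUM  -> repcode
 *   > ZSTD_REP_NUM     -> offset, stored as offset + ZSTD_REP_NUM  */
#define OFFSET_TO_OFFBASE(o)    ((o) + ZSTD_REP_NUM)   /* requires o > 0 */
#define REPCODE_TO_OFFBASE(r)   (r)                    /* 1 <= r <= 3 */
#define OFFBASE_IS_OFFSET(ob)   ((ob) > ZSTD_REP_NUM)
#define OFFBASE_IS_REPCODE(ob)  ((ob) >= 1 && (ob) <= ZSTD_REP_NUM)
#define OFFBASE_TO_OFFSET(ob)   ((ob) - ZSTD_REP_NUM)
#define OFFBASE_TO_REPCODE(ob)  (ob)
```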
The new contracts seem to make more sense:
updateRep() updates an array of repeat offsets _in place_,
while newRep() generates a new structure with the updated repeat-offset array.
Most callers actually expect the in-place variant,
and a limited subset, mainly in `zstd_opt.c`, prefers `newRep()`.
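Roughly, the two contracts look like this (signatures are illustrative):

```c
#include <stdint.h>
typedef uint32_t U32;
#define ZSTD_REP_NUM 3
typedef struct { U32 rep[ZSTD_REP_NUM]; } repcodes_t;

/* In-place variant: what most call sites expect. */
void ZSTD_updateRep(U32 rep[ZSTD_REP_NUM], U32 offBase, U32 ll0);

/* Pure variant: returns an updated copy, input array untouched. */
repcodes_t ZSTD_newRep(U32 const rep[ZSTD_REP_NUM], U32 offBase, U32 ll0);
```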
to act on values stored / expressed in the sumtype numeric representation required by `ZSTD_storeSeq()`.
This makes it possible to abstract away this representation by using the macros to extract these values.
First user: ZSTD_updateRep().
This is meant to abstract the sumtype representation required
to transfer `offcode` to `ZSTD_storeSeq()`.
Unfortunately, the sumtype numeric representation is currently a leaky abstraction
that has permeated many other parts of the code,
especially within `zstd_lazy.c`, and also within `zstd_opt.c` and `zstd_compress.c`.
While this PR does a good job of transferring a large number of call sites
to the new macros, there are still a few sites where this transformation is more complex,
or where the numeric representation itself is used "as is".
One of the problematic areas is the decision to use the numeric format of the sumtype
within the match finders of `zstd_lazy`.
This commit doesn't change the behavior; it only introduces and employs the macros,
and the resulting code remains identical.
Ultimately, if the numeric representation of the sumtype can be completely abstracted
and no other part of the code depends on it,
it will be possible to move it towards something slightly more efficient.
`CFLAGS=-O0 make`
will now use `-O0` instead of enforcing `-O3`
which used to be the behavior before the introduction of `libzstd.mk`.
This should result in faster tests,
since a few tests depend on this capability for faster roundtrips.
since this is effectively what is stored in this field (== matchLength - MINMATCH).
This makes it clearer what needs to be done when reading from / writing to this field.
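As a reminder of the convention (MINMATCH is 3 in zstd; the helpers are just for illustration):

```c
#define MINMATCH 3

/* The field stores matchLength - MINMATCH ("mlBase"), so: */
static unsigned ML_store(unsigned matchLength) { return matchLength - MINMATCH; }
static unsigned ML_load(unsigned mlBase)       { return mlBase + MINMATCH; }
```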
The variable has only very limited usage,
being read just once, at the beginning of the block, for prefetching;
hence the error had no impact on compression ratio.
This saves some 1.7 KB in the rodata section (x86_64, zstd tool),
while the assembler code stays the same except for
the type of a few load/extend instructions.
It should not have negative performance implications.
mostly for maintenance convenience.
Performance-wise, there is very little change:
slightly faster for slog 3 & 4,
neutral or very slightly negative for slog 5 & 6.
I couldn't find a good way to spread `ip0` and `ip1` apart when we accelerate
due to incompressible inputs. (The methods I tried slowed things down quite a
bit.)
Since we aren't splaying ip0 and ip1 apart (which would be like `0_1_2_3_`, as
opposed to the `01__23__` we were actually doing), it's a bit ambitious to
increment `step` by 2. Instead, let's increment it by 1, which has the benefit
of slightly improving compression. Speed remains pretty much unchanged.
The position updates are rewritten from `ip[N] = ip[N-1] + step` to be
`ip[N] = ip[N-2] + step`. This lets us only deal with the asymmetric spacing
of gaps at setup and then we only have to keep a single `step` variable.
This seems to work quite well on GCC and Clang!
This replicates the behavior of @terrelln's `ZSTD_fast` implementation. That
is, it always looks at adjacent pairs of positions, and only applies the
acceleration every other position. This produces a more fine-grained
acceleration.
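A toy version of the stepping scheme described above (the real zstd_fast loop differs, notably in growing `step` only after a run of unmatched positions):

```c
#include <stddef.h>

static size_t visit_positions(const unsigned char* istart,
                              const unsigned char* iend)
{
    const unsigned char* ip0 = istart;   /* pair: ip0 and ip1 = ip0 + 1 */
    const unsigned char* ip1 = istart + 1;
    size_t step = 2;                     /* distance between pairs */
    size_t visited = 0;

    while (ip1 < iend) {
        visited += 2;     /* a real match finder searches ip0 and ip1 */
        ip0 += step;      /* ip[N] = ip[N-2] + step: one step variable */
        ip1 += step;
        step += 1;        /* accelerate by 1, not 2 (toy: every pair) */
    }
    return visited;
}
```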
No idea why Visual Studio + clang-cl + AppVeyor don't like them;
I've not been able to reproduce the issue locally,
but these static asserts are very unlikely to deliver a useful signal:
I can't imagine a situation where they would be wrong,
and if they were, a ton of other things would be broken way before reaching that point.
Move portability macros to `lib/common/portability_macros.h`. This file
only contains platform/feature detection (e.g. 0/1 macros). This file is
shared between C and ASM code, so it cannot include any C code.
Rename `HUF_` ASM macros to be `ZSTD_` prefixed, and move to the new
header.
Restrict `ZSTD_ASM_SUPPORTED` to `__GNUC__`, because we need the GAS
assembler.
Finally, only include the ASM code if we are actually going to use it.
This disables it on all Windows platforms, which should resolve the
problem brought up in Issue #2789.
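For illustration, the 0/1 detection style looks roughly like this (sketch, not the verbatim header):

```c
/* portability_macros.h style: always defined, always 0 or 1 */
#ifndef ZSTD_ASM_SUPPORTED
# if defined(__GNUC__)          /* we need the GAS assembler */
#  define ZSTD_ASM_SUPPORTED 1
# else
#  define ZSTD_ASM_SUPPORTED 0
# endif
#endif
```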
Use the same trick as we did for zstd_lazy in PR #2828:
* Create one search function specialization for each (dictMode, mls).
* Select the search function pointer at the top of the match finder.
Additionally, we no longer inline `ZSTD_compressBlock_opt_generic` into
every function, since `dictMode` is no longer used as a template. Create
two specializations, for opt levels 0 and 2, and call one of the two
specializations.
Lastly, remove the hack that disabled inlining for zstd_opt for the
Linux Kernel, as we've gotten most of the benefit already.
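The pattern, reduced to a toy (names and bodies are illustrative):

```c
#include <stddef.h>
typedef unsigned char BYTE;
typedef size_t (*searchFn)(const BYTE* ip, const BYTE* iend);

/* One small specialization per (dictMode, mls) combination. */
#define GEN_SEARCH(dictMode, mls)                                      \
    static size_t search_##dictMode##_##mls(const BYTE* ip,            \
                                            const BYTE* iend)          \
    {   (void)ip; (void)iend;                                          \
        return (size_t)(mls);  /* real body is constant-folded here */ \
    }

GEN_SEARCH(noDict, 4)
GEN_SEARCH(noDict, 5)

/* Selected once, at the top of the match finder. */
static searchFn selectSearch(int mls)
{
    return (mls == 5) ? search_noDict_5 : search_noDict_4;
}
```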
Compilation time sees a ~4x reduction:
| Compiler | Flags | Dev Time (s) | PR Time (s) | Delta |
|----------|----------------------------------|--------------|-------------|-------|
| gcc | -O3 | 10.1 | 2.3 | -77% |
| gcc | -O3 -fsanitize=address,undefined | 61.1 | 10.2 | -83% |
| clang | -O3 | 9.0 | 2.1 | -76% |
| clang | -O3 -fsanitize=address,undefined | 33.5 | 5.1 | -84% |
Build size is reduced by 150KB - 200KB:
| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc | 1327476 | 1177108 | -11% |
| clang | 1378324 | 1167780 | -15% |
Speed changes are small, within roughly ±2.5% in all cases (usually a slight loss):
| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta |
|----------|-------|------------------|-----------------|--------|
| gcc | 16 | 4.78 | 4.72 | -1.25% |
| gcc | 17 | 3.49 | 3.46 | -0.85% |
| gcc | 18 | 2.92 | 2.86 | -2.04% |
| gcc | 19 | 2.61 | 2.61 | 0.00% |
| clang | 16 | 4.69 | 4.80 | 2.34% |
| clang | 17 | 3.53 | 3.49 | -1.13% |
| clang | 18 | 2.86 | 2.85 | -0.34% |
| clang | 19 | 2.61 | 2.61 | 0.00% |
Fixes Issue #2862.
`lib/deprecated` is no longer built by zstd's bundled build files. However,
users may try to build these files when they import the source tree into
their own build systems. And if they have `-Wdeprecated-declarations` on,
this can produce warnings.
This PR migrates these files away from using deprecated declarations.
This addresses #2767.
because mem.h is dropped in the Linux kernel.
Changed the macro definition order (gcc/clang/msvc before C11)
due to a limitation in the kernel source builder.
Changed the backup to sizeof(),
reverting to the previous behavior when no support for alignof() is detected.
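The resulting detection order, sketched (the macro name is illustrative):

```c
/* compiler-specific operators first (kernel builder limitation),
 * then C11 _Alignof, then sizeof() as the conservative backup */
#if defined(__GNUC__)
#  define MEM_ALIGNOF(T) __alignof__(T)
#elif defined(_MSC_VER)
#  define MEM_ALIGNOF(T) __alignof(T)
#elif defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 201112L)
#  define MEM_ALIGNOF(T) _Alignof(T)
#else
#  define MEM_ALIGNOF(T) sizeof(T)   /* previous behavior */
#endif
```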
short-tests-0 was silently failing, I think because of the `&& make clean` construction. Switch to `;` instead.
Also fix all the test failures that were exposed.
`make all` is failing on CircleCI because it is missing Docker. Move that test
to GitHub actions, and switch the pedantic CircleCI test to `make allmost`.
Allow the `dictContentSize` to be any size. The finalized dictionary
content size must be at least as large as the maximum repcode (8). So we
add zero bytes to the dictionary to ensure that we meet that
requirement.
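A sketch of the padding step (illustrative, not ZDICT's actual code):

```c
#include <string.h>

#define MIN_CONTENT_FOR_REPCODES 8   /* maximum repcode */

/* Pads short dictionary content with zero bytes up to 8 bytes;
 * returns the finalized content size, assuming sufficient capacity. */
static size_t padDictContent(unsigned char* content, size_t contentSize)
{
    if (contentSize < MIN_CONTENT_FOR_REPCODES) {
        memset(content + contentSize, 0,
               MIN_CONTENT_FOR_REPCODES - contentSize);
        contentSize = MIN_CONTENT_FOR_REPCODES;
    }
    return contentSize;
}
```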
I've removed this restriction because it's been causing us headaches when
people complain that dictionary training failed. It fails because there
isn't enough useful content to put in the dictionary, either because
every sample is exactly the same and less than ZDICT_CONTENTSIZE_MIN bytes,
or because there isn't enough content. Instead, we should succeed in creating
the dictionary, and it is up to the user to decide if it is worthwhile.
It is possible that the tables alone provide enough value.
NOTE: This allows us to produce dictionaries with finalized
`dictContentSize < ZDICT_CONTENTSIZE_MIN`. But, they are still valid
zstd dictionaries. We could remove the `ZDICT_CONTENTSIZE_MIN` macro,
but I've decided to leave that for now, so we don't break users.
* When dynamically dispatching to bmi2, add lzcnt and bmi to the
TARGET_ATTRIBUTE.
* Centralize the bmi2 TARGET_ATTRIBUTE definition into
BMI2_TARGET_ATTRIBUTE so we can change it in the future (see the sketch
after this list).
* Only enable bmi2 when both bmi1 & bmi2 are supported. There shouldn't
be any case where bmi2 is supported but bmi1 isn't, but since we are
using its instructions, we should check bmi1 as well.
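A sketch of the centralized attribute, grounded in the points above (the exact macro body is illustrative):

```c
#if defined(__GNUC__)
#  define TARGET_ATTRIBUTE(target) __attribute__((__target__(target)))
#else
#  define TARGET_ATTRIBUTE(target)
#endif

/* single definition to change in the future; lzcnt + bmi added */
#define BMI2_TARGET_ATTRIBUTE TARGET_ATTRIBUTE("lzcnt,bmi,bmi2")

BMI2_TARGET_ATTRIBUTE
static unsigned trailingZeros(unsigned long long v)
{
    return (unsigned)__builtin_ctzll(v);   /* may compile to TZCNT */
}
```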
PR #2850 attempted to fix a determinism bug that was uncovered by OSS-Fuzz. It
succeeded in addressing that source of non-determinism, but introduced a new
one: it was possible, when index reduction occurred, to map indices in the
window to the reserved value, which would cause them to be zeroed, potentially
altering parsing of the input.
This PR addresses this issue. It makes sure that the bottom of the window is
always `>= ZSTD_WINDOW_START_INDEX`.
I'm not sure if this makes #2850 redundant. I think it's probably still
valuable to have that protection as well.
Credit to OSS-Fuzz for discovering this issue.
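A minimal sketch of the invariant (the constant's value matches current zstd, but treat the code as illustrative):

```c
#define ZSTD_WINDOW_START_INDEX 2   /* lowest usable window index */

/* After subtracting reducerValue (assumed <= lowLimit), never let the
 * window bottom drop below ZSTD_WINDOW_START_INDEX, so live indices
 * cannot be remapped onto reserved values. */
static unsigned reducedLowLimit(unsigned lowLimit, unsigned reducerValue)
{
    unsigned const reduced = lowLimit - reducerValue;
    return reduced < ZSTD_WINDOW_START_INDEX ? ZSTD_WINDOW_START_INDEX
                                             : reduced;
}
```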
The optimal parser is unlikely to be used in the Linux kernel in
practice. There is no reason these functions should be force-inlined,
since we gain nothing and pay for it in build size.
| Compiler | Before (Bytes) | After (Bytes) | Delta (Bytes) |
|----------|----------------|---------------|---------------|
| gcc-11 | 1142090 | 952754 | -189336 |
| clang-12 | 1228402 | 976290 | -252112 |
This is a temporary solution pending the resolution of PR #2862 in the
`dev` branch.
Take the same approach as in PR #2828 [0]: remove functions that force-inline
many function bodies and `switch` between them. Instead, create one function per
"template" combination, and then switch between these functions. This
allows the compiler to break the large function into many small
functions, which generally helps codegen.
Also, in the `extDict` modes, when there is no ext-dict, call the top-level
function instead of the force-inlined one, to save on code size.
I'm specifically doing this because gcc on the parisc architecture doesn't
handle the large function body well, and ends up using a lot of excess
stack space. Outlining these functions fixes it.
Putting a stack marking into every assembly file is required to indicate
that the stack does not need to be executable.
An executable stack flag conflicts with some security measures, for example
systemd's MemoryDenyWriteExecute=yes.
Previously, if an index was equal to `reducerValue + 1`, it would get remapped
during index reduction to 1, i.e. `ZSTD_DUBT_UNSORTED_MARK`. This can affect the
parsing of the input slightly, by causing tree nodes to be nullified when they
otherwise wouldn't be. This hardly matters from a correctness or efficiency
perspective, but it does impact determinism.
So this commit changes index reduction to avoid remapping indices onto
`ZSTD_DUBT_UNSORTED_MARK`.
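A toy version of the adjusted mapping (simplified; zstd's actual reduction logic differs):

```c
#define ZSTD_DUBT_UNSORTED_MARK 1   /* reserved: marks unsorted BT nodes */

/* Indices that would land on the reserved mark after subtracting
 * reducerValue are zeroed (out of window) instead, so a reduced index
 * can never collide with ZSTD_DUBT_UNSORTED_MARK. */
static unsigned reduceIndex(unsigned index, unsigned reducerValue)
{
    if (index <= reducerValue + ZSTD_DUBT_UNSORTED_MARK)
        return 0;
    return index - reducerValue;   /* always >= 2 here */
}
```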
that's clearer than finding the tables somewhere in the middle of `compress.c`.
Also, down the line, it may potentially allow zstd to feature tables adjusted to the target CPU.
Speed up compilation times by moving each specialized search function
into its own function. This is faster because compilers can handle many
smaller functions much faster than one gigantic function. The previous
approach generated one giant function with `switch` statements and
inlining to select the implementation.
| Compiler | Flags | Dev Time (s) | PR Time (s) | Delta |
|----------|-------------------------------------|--------------|-------------|-------|
| gcc | -O3 | 16.5 | 5.6 | -66% |
| gcc | -O3 -g -fsanitize=address,undefined | 158.9 | 38.2 | -75% |
| clang | -O3 | 36.5 | 5.5 | -85% |
| clang | -O3 -g -fsanitize=address,undefined | 27.8 | 17.5 | -37% |
This also reduces the binary size because the search functions are no
longer inlined into the main body.
| Compiler | Dev libzstd.a Size (B) | PR libzstd.a Size (B) | Delta |
|----------|------------------------|-----------------------|-------|
| gcc | 1563868 | 1308844 | -16% |
| clang | 1924372 | 1376020 | -28% |
Finally, performance is not impacted significantly by this change;
in fact, we generally see a small speed boost.
| Compiler | Level | Dev Speed (MB/s) | PR Speed (MB/s) | Delta |
|----------|-------|------------------|-----------------|-------|
| gcc | 5 | 110.6 | 110.0 | -0.5% |
| gcc | 7 | 70.4 | 72.2 | +2.5% |
| gcc | 9 | 53.2 | 53.5 | +0.5% |
| gcc | 13 | 12.7 | 12.9 | +1.5% |
| clang | 5 | 113.9 | 110.4 | -3.0% |
| clang | 7 | 67.7 | 70.6 | +4.2% |
| clang | 9 | 51.9 | 52.2 | +0.5% |
| clang | 13 | 12.4 | 13.3 | +7.2% |
The compression strategy is unmodified in this PR, so the compressed size
should be exactly the same. I may have a follow up PR to slightly improve
the compression ratio, if it doesn't cost too much speed.
Fix underflow of `nbCompares` by switching to an `int` and comparing
`nbCompares > 0`. This is a minimal fix, because I don't want to change
the logic. These loops seem to be doing `nbCompares + 1` comparisons.
The bug was reported by Dan Carpenter and found by Smatch static
checker.
https://lore.kernel.org/all/20211008063704.GA5370@kili/
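The shape of the fix (illustrative loop, not the exact zstd code):

```c
/* Before (sketch): with an unsigned counter, decrementing past zero
 * wraps around to a huge value instead of terminating the loop. */
static void searchCandidates(int nbCompares)
{
    for (; nbCompares > 0; nbCompares--) {
        /* ... one binary-tree comparison ... */
    }
}
```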
There is no minimum value check, so the parameter could be negative.
Switch to the standard pattern of using `BOUNDCHECK()`.
The bug was reported by Dan Carpenter and found by Smatch static
checker.
https://lore.kernel.org/all/20211008063704.GA5370@kili/
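The pattern, sketched (zstd's real `BOUNDCHECK()` macro reports an error code; this toy just returns a flag):

```c
typedef struct { int lowerBound; int upperBound; } bounds_sketch;

/* checks both ends, so a negative parameter is rejected too */
static int boundCheck(bounds_sketch b, int value)
{
    return (value >= b.lowerBound) && (value <= b.upperBound);
}
```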
Since we're now hashing the position ahead even if we find a long match and
don't search that next position, we can write it back into the hashtable even
in long matches. This seems to cost us no speed, and improves compression
ratio slightly!
Aside from maybe a latency win in the loop, this means that when we find a
short match, we've already done the hash we need to check the next long match.
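A toy illustration of the write-back (zstd_double_fast's real tables and hash function differ):

```c
#include <stdint.h>
#include <stddef.h>

/* The hash of ip+1 is computed up front; even when the long match at
 * ip succeeds and we skip searching ip+1, recording it is nearly free. */
static void recordNextPosition(uint32_t* hashTable, size_t tableMask,
                               const uint8_t* base, const uint8_t* ip,
                               uint64_t (*hash8)(const uint8_t*))
{
    size_t const h1 = (size_t)(hash8(ip + 1) & tableMask);
    hashTable[h1] = (uint32_t)((ip + 1) - base);
}
```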
* Limit training samples size to 2GB
* simplified the DISPLAYLEVEL() macro to use a global variable instead of a local one.
* refactored training samples loading
* fixed compiler warning
* addressed comments from the pull request
* addressed @terrelln comments
* missed some fixes
* fixed type mismatch
* Fixed a bug that passed the estimated number of samples rather than the loaded number of samples.
Changed unit conversion not to use bit-shifts.
* fixed a declaration after code
* fixed type conversion compile errors
* fixed more type casting
* fixed more type mismatches
* changed sizes type to size_t
* move type casting
* more type cast fixes