townforge/zstd - zstd - Townforge git

Author	SHA1	Message	Date
senhuang42	4c5f337248	Use cctx's minMatch instead of global MINMATCH, make fuzzer use validation	2020-11-30 15:41:20 -05:00
sen	c5fbd55dac	Merge pull request #2387 from senhuang42/compress_sequence_API [RFC] New sequence compression API	2020-11-20 16:54:20 -05:00
senhuang42	7742f076b4	Add experimental param for sequence validation	2020-11-20 11:57:41 -05:00
senhuang42	0e32928b7d	Remove unnecessary repcode backup, apply style choices, use function pointer	2020-11-20 11:02:19 -05:00
sen	e924a0fa51	Explicit cast for visual warnings Github has automatic commits now! Cool Co-authored-by: Nick Terrell <nickrterrell@gmail.com>	2020-11-19 17:32:40 -05:00
senhuang42	dcbbf7c09f	Unroll isRLE loop	2020-11-19 12:38:13 -05:00
senhuang42	05c0229668	Clean up visual conversion warnings	2020-11-18 15:36:29 -05:00
senhuang42	d6d7ba2a1f	Modification to offset validation to include entire sequence	2020-11-17 10:13:22 -05:00
senhuang42	8f3136a9c7	Fix assert edge case, improve documentation in zstd.h	2020-11-16 18:05:35 -05:00
senhuang42	f6baad87d6	Fix warnings and make validation enabled by default	2020-11-16 12:00:06 -05:00
senhuang42	55b90ef010	Fix unit tests to agree with new changes	2020-11-16 11:36:37 -05:00
senhuang42	7f563b0519	Add new sequence format as an experimental CCtx param	2020-11-16 10:49:17 -05:00
senhuang42	347824ad73	Overhaul logic to simplify, add in proper validations, fix match splitting	2020-11-16 10:49:17 -05:00
senhuang42	46824cb018	Add new sequence compress api params to cctx	2020-11-16 10:49:17 -05:00
senhuang42	48405b4633	Fix srcSize=0 edge case	2020-11-16 10:49:17 -05:00
senhuang42	022e6d81e7	Fix literals length calculation	2020-11-16 10:49:17 -05:00
senhuang42	dad20b5ccb	Remove dstCapacity error check	2020-11-16 10:49:17 -05:00
senhuang42	b8e16a2057	Remove extraneous function in this API	2020-11-16 10:49:17 -05:00
senhuang42	f29507c4fc	Add check comparing offset to window size	2020-11-16 10:49:17 -05:00
senhuang42	7a6e46a92f	Fix MSAN errors	2020-11-16 10:49:17 -05:00
senhuang42	cc2642bd17	Address edge case with endPosInSequence	2020-11-16 10:49:17 -05:00
senhuang42	fd10007174	Change debug levels to appropriate ones	2020-11-16 10:49:17 -05:00
senhuang42	2db8441245	Add RLE support	2020-11-16 10:49:17 -05:00
senhuang42	dfef298336	Fix various build warnings	2020-11-16 10:49:17 -05:00
senhuang42	2bbdddf24e	Add test case to roundtrip using ZSTD_getSequences() and ZSTD_compressSequences()	2020-11-16 10:49:16 -05:00
senhuang42	5fd69f8173	Add documentation for new api functions	2020-11-16 10:49:16 -05:00
senhuang42	e8b7fdb64b	Refactor for enhanced code clarity	2020-11-16 10:49:16 -05:00
senhuang42	c675fb46f1	Rename internal function compressSequences(), and promote new *_ext() functions to their actual name	2020-11-16 10:49:16 -05:00
senhuang42	013434e1e4	Add another API function to compress with existing CCTX	2020-11-16 10:49:16 -05:00
senhuang42	c44ce29013	More adjustments to improve code clarity	2020-11-16 10:49:16 -05:00
senhuang42	48f67da854	Pull compressStream2() transparent initialization into its own function	2020-11-16 10:49:16 -05:00
senhuang42	c86151f53c	Add initial support for new ZSTD_Sequence mode	2020-11-16 10:49:16 -05:00
senhuang42	e0f26afce9	Add sequence compression format param	2020-11-16 10:49:16 -05:00
senhuang42	f51af9a609	Always ensure sequenceRange updates properly, add more error forwarding	2020-11-16 10:49:16 -05:00
senhuang42	1a449688fd	Various minor logical refactors to improve clarity	2020-11-16 10:49:16 -05:00
senhuang42	e5fe485dcc	Fix cSize calculation for noCompressBlocks	2020-11-16 10:49:16 -05:00
senhuang42	6145ebb400	Rebased, roundtrips silesia.tar	2020-11-16 10:49:16 -05:00
senhuang42	b5b61cc216	Refactor for better debugging info	2020-11-16 10:49:16 -05:00
senhuang42	293fad6b45	Corrections and edge-case fixes to be able to roundtrip dickens	2020-11-16 10:49:16 -05:00
senhuang42	7eb6fa7be4	Multi-block compression scaffolding - works on single-block files	2020-11-16 10:49:16 -05:00
senhuang42	75b01f34b9	Add support for uncompressible blocks	2020-11-16 10:49:16 -05:00
senhuang42	e04da68157	Enable usage of ZSTD_sequenceRange for single-block compression	2020-11-16 10:49:16 -05:00
senhuang42	337fac216d	Add logic to handle ZSTD_sequenceRange	2020-11-16 10:49:16 -05:00
senhuang42	85822ddd53	Add last literals handling like getSequences()	2020-11-16 10:49:16 -05:00
senhuang42	2cff8df1a2	Pull block compression out of main compressSequences() function	2020-11-16 10:49:16 -05:00
senhuang42	cfced9344a	Implement ZSTD_updateSequenceRange	2020-11-16 10:49:16 -05:00
senhuang42	b116e1f211	Modify SequenceRange to have posInSequence	2020-11-16 10:49:16 -05:00
senhuang42	d99b675112	Add function definition for sequenceRange updater	2020-11-16 10:49:16 -05:00
senhuang42	74e95c05cc	Add ZSTD_SequenceRange to count ranges in array of ZSTD_Sequence	2020-11-16 10:49:16 -05:00
senhuang42	89f3848310	Add support for repcodes	2020-11-16 10:49:16 -05:00
senhuang42	3e930fd044	Code cleanup, add debuglog statements	2020-11-16 10:49:16 -05:00
senhuang42	086513b5b9	Implement first pass at compressSequences()	2020-11-16 10:49:16 -05:00
senhuang42	a9327b1e9b	Add initial function prototype for ZSTD_compressSequences_ext (to be renamed later)	2020-11-16 10:33:35 -05:00
animalize	52f8c07a3f	Clamp compression level in ZSTD_getCParams_internal() function	2020-11-14 13:26:08 +08:00
senhuang42	9d936d61d2	Reduce number of memcpy() calls	2020-11-13 19:43:30 -05:00
senhuang42	be4ac6c5bc	Use existing repcode update function to implement updates	2020-11-12 16:51:12 -05:00
senhuang42	674c9b9235	Add in proper block repcode histories	2020-11-12 15:34:37 -05:00
senhuang42	06c7f14066	Let block reps persist	2020-11-12 12:24:44 -05:00
senhuang42	396275068c	Fix incorrect repcode setting	2020-11-12 11:57:01 -05:00
senhuang42	1a8af0de73	Improve unit test	2020-11-12 11:09:09 -05:00
senhuang42	4d4fd2c55f	Overhaul repcode handling logic	2020-11-12 10:59:35 -05:00
sen	f62edf0fe9	Merge pull request #2381 from senhuang42/expand_sequence_extraction_api Add enum to define ZSTD_Sequence type and update sequence extraction API	2020-11-06 13:00:31 -05:00
senhuang42	7d1dea070c	Update unit tests	2020-11-06 11:10:37 -05:00
senhuang42	779df995c6	Implement mergeGeneratedSequences()	2020-11-06 10:55:46 -05:00
senhuang42	51abd58208	Rename getSequences() to generateSequences()	2020-11-06 10:53:22 -05:00
Luke Pitt	eac309c71b	Add ZSTD_getDictID_fromCDict function to experimental section	2020-11-04 11:37:37 +00:00
senhuang42	f782cac3d4	Change block delimiter removing to linear time approach	2020-11-02 17:06:20 -05:00
senhuang42	3434049c1f	Use ZSTD_memmove() instead of memmove()	2020-11-02 11:43:19 -05:00
senhuang42	d4d0346b40	Update name of enum, clarify documentation	2020-11-02 11:38:17 -05:00
senhuang42	e6178f837f	Revert unnecessary seqCollector adjustment	2020-11-02 10:59:20 -05:00
senhuang42	e8501e00b8	Fix incorrect index increment in merge algorithm	2020-11-02 10:58:41 -05:00
senhuang42	a36fdada57	Add algorithm to remove all delimiters	2020-11-02 10:46:52 -05:00
senhuang42	435a3a0428	Update seqCollector definition	2020-11-02 10:19:26 -05:00
senhuang42	3327932609	Update ZSTD_getSequences function signature	2020-11-02 10:17:59 -05:00
Nick Terrell	7205e609a9	Merge pull request #2354 from terrelln/stable-buffer Add ZSTD_c_stable{In,Out}Buffer and optimize when set	2020-10-30 15:06:56 -07:00
sen	c37c714ef1	Merge pull request #2376 from senhuang42/clarify_sequence_extraction_api Refine external ZSTD_Sequence API	2020-10-30 15:47:25 -04:00
Nick Terrell	d4e021fe35	[lib] Avoid allocating the input buffer when ZSTD_c_stableInBuffer is set We don't use it when we have a stable input buffer, so don't allocate it. I had to slightly modify `ZSTD_copyCCtx()` by storing the `ZSTD_buffered_policy_e` in the `ZSTD_CCtx`, since `inBuffSize > 0` is no longer the correct signal for the buffered mode.	2020-10-30 10:55:34 -07:00
Nick Terrell	24f72789e2	[lib] Skip the input window buffer when ZSTD_c_stableInBuffer is set Compress directly from the `ZSTD_inBuffer`. We still allocate the input buffer. A following commit will remove that allocation.	2020-10-30 10:55:34 -07:00
Nick Terrell	fcf81cee5e	[lib] Avoid allocating output buffer when ZSTD_c_stableOutBuffer is set We compress directly to the `ZSTD_outBuffer` so we don't need to allocate it.	2020-10-30 10:55:34 -07:00
Nick Terrell	6d5dc93d4e	[lib] Compress directly into output when ZSTD_c_stableOutBuffer is set When we have a stable output buffer always compress directly into the `ZSTD_outBuffer`. We are allowed to return `dstSizeTooSmall`.	2020-10-30 10:55:34 -07:00
Nick Terrell	987cb4ca6a	[lib] Take the shortcut when ZSTD_c_stableOutBuffer is set When we have a stable output buffer take the single-pass shortcut. It is okay to return `dstSizeTooSmall` if the output buffer isn't big enough, because we know it will never grow.	2020-10-30 10:55:34 -07:00
Nick Terrell	809b2f2071	[lib] Set ZSTD_c_stable{In,Out}Buffer in ZSTD_compress2() Sets these parameters in ZSTD_compress2() then resets them to their original values after the compression call. An alternative design could be to add a flush mode `ZSTD_e_singlePass` which implies `ZSTD_c_stable{In,Out}Buffer` but only for a single compression call, by directly setting the applied parameters. I've opted for the smaller change, but this is open for discussion.	2020-10-30 10:55:34 -07:00
Nick Terrell	c74be3f6de	[lib] Validate buffers when ZSTD_c_stable{In,Out}Buffer is set Adds the validation of the input/output buffers only. They are still unused.	2020-10-30 10:55:34 -07:00
Nick Terrell	e3e0775cc8	[API] Add ZSTD_c_stable{In,Out}Buffer parameters This commit adds the parameters and sets the value in the CCtxParams but it does not do anything with the value.	2020-10-30 10:54:39 -07:00
Nick Terrell	e2581d9572	[lib] Set appliedParams in zstdmt mode Previously only `nbWorkers` was set. Set all parameters, because that is what is expected. This is needed for the `ZSTD_c_stable{In,Out}Buffer` parameters.	2020-10-30 10:54:38 -07:00
senhuang42	536e89c723	Sequence extractor should update CBlockState	2020-10-30 12:13:19 -04:00
senhuang42	32cac2627a	Emit last literals of 0 size as well, to indicate block boundary	2020-10-29 16:41:17 -04:00
senhuang42	69bd5f0654	Correct literalsRead calculation to include longLength	2020-10-29 14:49:37 -04:00
senhuang42	59624f3163	Remove implicit typecast to appease appVeyor windows build	2020-10-28 16:25:09 -04:00
senhuang42	3ed5d053d8	Clarify comments in zstd.h some more	2020-10-28 09:53:09 -04:00
sen	17b700d78a	Merge pull request #2366 from senhuang42/enable_ldm_by_default Enable LDM by default if window size >= 128MB and strategy uses opt parser	2020-10-27 14:59:28 -04:00
senhuang42	3163909d14	Remove unused variable position	2020-10-27 12:58:12 -04:00
senhuang42	dc448563e9	Add test compatibility with last literals in sequences	2020-10-27 12:35:28 -04:00
senhuang42	1d221ecc03	Add support for representing last literals in the extracted seqs	2020-10-27 11:19:48 -04:00
senhuang42	9171f920cd	Improve documentation of seqStore_t	2020-10-27 10:50:22 -04:00
senhuang42	96b0ff7886	Improve documentation regarding various operations in copyBlockSequences	2020-10-27 10:36:06 -04:00
senhuang42	3a11c7eb03	Modify ZSTD_copyBlockSequences to agree with new API	2020-10-27 10:31:40 -04:00
senhuang42	8bdb32aebe	Add a function for LDM enable check	2020-10-20 13:46:02 -04:00
senhuang42	578e889ec1	Move ldm enable to compressStream2()	2020-10-20 13:04:45 -04:00
senhuang42	d28d8a1d72	Include LDM tables size for CCtx size estimation where relevant	2020-10-20 09:21:30 -04:00
senhuang42	b1c7fc5768	Add compatibility for multithreading	2020-10-19 12:07:06 -04:00
senhuang42	590f7f55f0	Add ldm enable condition in ZSTD_resetCCtx_internal	2020-10-19 10:26:17 -04:00
senhuang42	4d01979b62	Expose and call ZSTD_ldm_skipRawSeqStoreBytes()	2020-10-16 20:30:00 -04:00
senhuang42	ee84817fe7	Reset posInSequence when using ZSTD_referenceExternalSequences()	2020-10-14 22:06:08 -04:00
Yann Collet	f5d5cd3b40	Merge pull request #2341 from senhuang42/ldm_optimized_for_opt_parser Integrate long distance matches into optimal parser	2020-10-13 13:09:07 -07:00
Nick Terrell	7e6f91ed84	[minor] Improve docs and add an assert in response to review	2020-10-12 16:43:17 -07:00
Nick Terrell	d5c688e8ae	Fix ZSTD_adjustCParams_internal() to handle dictionary logic Pass in the `ZSTD_cParamMode_e` to select how we define our cparams. Based on the mode we either take the `dictSize` into account or we set it to `0`. See the documentation for `ZSTD_cParamMode_e`. Some of the modes currently share the same behavior. But they have distinct modes because they are drastically different cases. E.g. compression + reprocessing the dictionary and creating a cdict. Additionally, when downsizing the hashLog and chainLog take the (adjusted) dictionary size into account, since the size of the dictionary gets added onto the window size. Adds a simple test to ensure that we aren't downsizing too far.	2020-10-12 12:50:04 -07:00
Nick Terrell	fadaab8c7c	[minor improvement] Pass 0 as the content size in the DDS The DDS structure can't be copied into the working tables like the DMS. So it doesn't need to account for the source size when sizing its parameters, just the dictionary size.	2020-10-12 12:47:21 -07:00
Nick Terrell	48ef15fb47	[minor improvement] Pass dictSize when selecting parameters When selecting parameters in streaming compression with a dictionary use the dictionary size to select the parameters.	2020-10-12 12:47:19 -07:00
Nick Terrell	012818df99	[refactor] Remove ZSTD_resetCStream_internal() This function is only called in one place. It isn't a logical separation of duties, and it was only obfuscating the code now, so inline it.	2020-10-12 12:46:10 -07:00
Nick Terrell	7083f79008	[bug] Fix dictContentType when reprocessing cdict Conditions to trigger: * CDict is loaded as raw content. * CDict starts with the zstd dictionary magic number. * The CDict is reprocessed (not attached or copied). * The new API is used (streaming or `ZSTD_compress2()`). Bug: The dictionary is loaded as a zstd dictionary, not a raw content dictionary, because the dict content type is set to `ZSTD_dct_auto`. Fix: Pass in the dictionary content type from cdict creation to the call to `ZSTD_compress_insertDictionary()`. Test: Added a test case that exposes the bug, and fixed the raw content tests to not modify the `dictBuffer`, which makes all future tests with the `dictBuffer` raw content, which doesn't seem intentional.	2020-10-12 12:46:10 -07:00
Yann Collet	12541931fa	Merge pull request #2328 from marxin/zstd-pool-api Allow external creation of POOLs that can be shared.	2020-10-09 01:00:50 -07:00
Yann Collet	6fdb0cb8d9	Merge pull request #2303 from senhuang42/let_cdict_take_clevel_priority For ZSTD_compressStream2(), let cdict take compression level priority	2020-10-09 00:48:30 -07:00
senhuang42	b9c8033cde	Define kNullRawSeqStore for every file	2020-10-07 19:02:41 -04:00
senhuang42	a6165c1b28	Change matchState_t::ldmSeqStore to pointer	2020-10-07 14:13:57 -04:00
senhuang42	10647924f1	Make function descriptions more accurate	2020-10-07 13:56:25 -04:00
senhuang42	37617e23d7	Correct matchLength calculation and remove unnecessary functions	2020-10-07 13:56:25 -04:00
senhuang42	ef823e0299	Remove rawSeqStore.base and add rawSeqStore.posInSequence	2020-10-07 13:56:25 -04:00
senhuang42	ea92fb3a68	Cleanups, add comments and explanations	2020-10-07 13:56:25 -04:00
senhuang42	6ccd97fc96	Fixed end of match boundary update issues	2020-10-07 13:56:25 -04:00
senhuang42	c8b8572b38	Adjustments to no longer segfault on nci	2020-10-07 13:56:25 -04:00
Martin Liska	b684900a4a	Allow external creation of POOLs that can be shared.	2020-10-07 12:44:33 +02:00
Nick Terrell	6a1e526ea7	[lib] Add ZSTD_COMPRESS_HEAPMODE tuning parameter	2020-09-24 19:42:04 -07:00
Nick Terrell	b841387218	[freestanding] Improve macro resolution to handle #if X	2020-09-24 19:42:04 -07:00
Nick Terrell	88fac5d514	Remove call to memset The previous commit fixes the test so it errors on calls to mem*() functions from <string.h>.	2020-09-24 19:42:04 -07:00
Nick Terrell	9261476b7d	[lib] Wrap customMem xor checks in parens for readability This clarifies operator precedence, and quiets cppcheck in the Kernel Test Robot. I think this is a slight bonus to readability, so I am accepting the suggestion.	2020-09-23 23:26:07 -07:00
Felix Handte	200c960f1d	Merge pull request #2311 from felixhandte/ddss-fix-cparam-derivation Fix Compression Parameter Derivation Bugs Introduced by DDSS Changes	2020-09-18 14:02:14 -04:00
W. Felix Handte	8930c6e551	Use ZSTD_CCtxParams_init() to Init CCtxParams, not memset() Even if the discrepancies are at the moment benign, it's probably better to standardize on using the one true initializer, rather than trying (and failing) to correctly duplicate its behavior.	2020-09-17 12:15:33 -04:00
W. Felix Handte	e8a44326fa	Avoid Redundancy in ZSTD_initCDict_internal() Args; Don't Take CParams + CCtxParams	2020-09-17 12:08:36 -04:00
W. Felix Handte	eee51a664a	Fall Back if Derived CParams are Incompatible with DDSS; Refactor CDict Creation Rewrite ZSTD_createCDict_advanced() as a wrapper around ZSTD_createCDict_advanced2(). Evaluate whether to use DDSS mode after fully resolving cparams. If not, fall back.	2020-09-15 18:01:08 -04:00
W. Felix Handte	bc6521a6f6	Make ZSTD_createCDict_advanced2() cctxParams Arg Const	2020-09-15 14:06:10 -04:00
W. Felix Handte	26a96a5b35	Do More Complete CParams Deduction in Non-DDSS Path of ZSTD_createCDict_advanced2 Call ZSTD_getCParamsFromCCtxParams() instead of ZSTD_getCParams_internal().	2020-09-15 13:57:43 -04:00
W. Felix Handte	a2af804129	Pull CParam Override Logic into Helper	2020-09-15 13:38:05 -04:00
Yann Collet	c91a0855f8	check endDirective in ZSTD_compressStream2() fix #2297 also : - `assert()` `endDirective` in `ZSTD_compressStream_internal()`, for debug mode - add relevant tests	2020-09-14 10:56:08 -07:00
senhuang42	17b56f934e	Coding style cleanup	2020-09-11 11:42:12 -04:00
senhuang42	801513b5e7	Modify params rather than cctx->requestedParams	2020-09-11 11:41:10 -04:00
W. Felix Handte	0faefbf1b3	Make DDSS Selection Override ForceCopy Directive	2020-09-10 22:10:02 -04:00
W. Felix Handte	13c5ec3e41	Only Allow Dedicated Dict Search for Dicts Loaded in 1 Chunk The load algorithm requires we do it all in one go.	2020-09-10 22:10:02 -04:00
W. Felix Handte	07793547e6	Fix Bug: Only Use DDSS Insertion on CDict MatchStates Previously, if DDSS was enabled on a CCtx and a dictionary was inserted into the CCtx, the CCtx MatchState would be filled as a DDSS struct, causing segfaults etc. This changes the check to use whether the MatchState is marked as using the DDSS (which is only ever set for CDict MatchStates), rather than looking at the CCtxParams.	2020-09-10 18:51:52 -04:00
W. Felix Handte	7b5d2f72ea	Adjust Working Context Table Sizes Back Down	2020-09-10 18:51:52 -04:00
W. Felix Handte	a3659fe1ef	Make ZSTD_dedicatedDictSearch_getCParams Wrap ZSTD_getCParams Fixes up bounds-checking, and lets us clean up what is at the moment an unnecessary duplication of the default cparams tables.	2020-09-10 18:51:52 -04:00
W. Felix Handte	5390fee4f7	Rename and Move DD_BLOG Constant to ZSTD_LAZY_DDSS_BUCKET_LOG	2020-09-10 18:51:52 -04:00
W. Felix Handte	914bfe7ee4	Init CCtx's Local Dict with CCtxParams	2020-09-10 18:51:52 -04:00
W. Felix Handte	db2aa25252	Decision for Whether to Attach Should be Based on CDict Config, not CCtx	2020-09-10 18:51:52 -04:00
W. Felix Handte	f1b428fdac	Rename enableDedicatedDictSearch to dedicatedDictSearch in MatchState This makes it clear that not only is the feature allowed here, we're actually using it, as opposed to the CCtxParam field, in which it's enabled, but we may or may not be using it.	2020-09-10 18:51:52 -04:00
W. Felix Handte	41012193ad	Always Init CDict's enableDedicatedDictSearch Field	2020-09-10 18:51:52 -04:00
W. Felix Handte	34b545acb0	Add a ZSTD_dedicatedDictSearch ZSTD_dictMode_e to Allow Const Propagation Speed +1.5%.	2020-09-10 18:51:52 -04:00
W. Felix Handte	beefdb0d3d	Fix ZSTD_c_forceAttachDict Bounds	2020-09-10 18:51:52 -04:00
W. Felix Handte	def62e2d3e	Fix Compilation Warnings	2020-09-10 18:51:52 -04:00
Bimba Shrestha	9c628238d3	creating ZSTD_createCDict_advanced_internal	2020-09-10 18:51:52 -04:00
Bimba Shrestha	0a9787c3e1	changing to int for consistency	2020-09-10 18:51:52 -04:00
Bimba Shrestha	5d5507788d	change method name for consistency	2020-09-10 18:51:52 -04:00
Bimba Shrestha	b30f71becf	pass correct cparams	2020-09-10 18:51:52 -04:00
Bimba Shrestha	71fda0362f	making cctxParams a pointer	2020-09-10 18:51:52 -04:00
Bimba Shrestha	628559d0e4	loading dict using new algorithm	2020-09-10 18:51:52 -04:00
Bimba Shrestha	31e581bf65	adding enableDedicatedDictSearch to matchState_t	2020-09-10 18:51:52 -04:00
Bimba Shrestha	75b6360036	adding ZSTD_createCDict_advanced2 to zstd.h	2020-09-10 18:51:52 -04:00
Bimba Shrestha	b7dddbe89b	always attach dict when using dedicatedDictSearch	2020-09-10 18:51:52 -04:00
Bimba Shrestha	e36a373df4	adding dedicatedDictSearch cParams helper methods	2020-09-10 18:51:52 -04:00
Bimba Shrestha	f10d4e313c	adding ZSTD_dedicatedDictSearch_defaultCParameters variable	2020-09-10 18:51:52 -04:00
Bimba Shrestha	c497cb6716	Add ZSTD_c_enableDedicatedDictSearch Param	2020-09-10 18:51:52 -04:00
senhuang42	64bd68e44b	Adjust ZSTD_createCDict_byReference() function, and check for cdict when using compressStream2	2020-09-10 13:42:26 -04:00
Nick Terrell	a90779397a	[lib] Reduce zstd stack usage by 1KB	2020-09-09 14:35:39 -07:00
Nick Terrell	046aca190f	Fix ZSTD_initCStream_advanced() with no dictionary and static allocation	2020-09-09 14:35:39 -07:00
Nick Terrell	f91ed5c766	[lib] s/current/curr because it collides with Linux Kernel macro	2020-09-09 14:35:39 -07:00
Nick Terrell	5e4efd22d4	Merge pull request #2291 from i-do-cpp/fix-compression-level-default Fix setParameter not falling back to default compression level	2020-09-08 16:42:34 -07:00
Nick Terrell	6da8acd231	Merge pull request #2293 from allanjude/coverity Resolve Coverity 1432392 Unintentional integer overflow	2020-09-03 13:58:45 -07:00
Allan Jude	8665793164	Resolve Coverity 1432392 Unintentional integer overflow Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN) overflow_before_widen: Potentially overflowing expression: cdict->dictContentSize * 6U with type unsigned int (32 bits, unsigned) is evaluated using 32-bit arithmetic, and then used in a context that expects an expression of type U64 (64 bits, unsigned).	2020-09-03 19:31:50 +00:00
i-do-cpp	aec8b27fff	Update zstd_compress.c	2020-08-31 09:34:08 +02:00
i-do-cpp	d514281e73	Fix setParameter not falling back to default compression level on 0 value See documentation for `ZSTD_c_compressionLevel`: `Special: value 0 means default, which is controlled by ZSTD_CLEVEL_DEFAULT`	2020-08-31 09:25:43 +02:00
Nick Terrell	c465f24457	ZSTD_ prefix mem{cpy,move,set},malloc,calloc,free	2020-08-26 12:26:03 -07:00
Nick Terrell	a686d306d2	Rename ZSTD_{malloc,calloc,free} to ZSTD_custom{Malloc,Calloc,Free}	2020-08-26 12:25:08 -07:00
Nick Terrell	80f577baa2	Move standard includes to zstd_deps.h	2020-08-26 12:25:08 -07:00
Nick Terrell	08981d2638	[lib] Allow compression dictionaries with missing symbols Allow compression to use dictionaries with missing symbols in their entropy tables. We set the FSE repeat mode to check when there are missing symbols, and set the FSE repeat mode to valid when all symbols are present. Note that when not all symbols are present, the heuristics which favor dictionary tables for lower compression levels won't activate. Tested by manually creating a dictionary with missing symbols of every type, and validating that the compressor rejects it before this change, and accepts it after this change. Also, I ran the `dictionary_loader` fuzzer for >1 hour of CPU time without running into cases where compression succeeds, but decompression fails. Fixes #2174.	2020-06-12 17:57:19 -07:00
Felix Handte	2af4e07326	Merge pull request #2133 from felixhandte/single-size-calculation Consolidate CCtx Size Estimation Code	2020-05-28 13:07:18 -04:00
Nick Terrell	b2092c6dc4	[ldm] Reset loadedDictEnd when the context is reset	2020-05-18 12:35:44 -07:00
Nick Terrell	add7ed2d4a	[lib] Fix bug in loading LDM dictionary in MT mode Exposed when loading a dictionary < LDM minMatch bytes in MT mode. Test Plan: ``` CC=clang make -j zstreamtest MOREFLAGS="-O0 -fsanitize=address" ./zstreamtest -vv -i100000000 -t1 --newapi -s7065 -t3925297 ``` TODO: Add an explicit test that loads a small dictionary in MT mode	2020-05-14 11:52:28 -07:00
W. Felix Handte	3bb7992350	Fix Size Estimate for LDM Seq Space	2020-05-14 13:50:53 -04:00
Nick Terrell	c3e921c639	Merge pull request #2131 from terrelln/raw-dict-fuzzer Fix rare scenario with lazy parser, dictionary, and repcodes	2020-05-12 17:44:31 -07:00
W. Felix Handte	d9a1e37aec	Nit: Fix Size Type for 32-bit	2020-05-12 18:03:31 -04:00
W. Felix Handte	1aa6c7ccce	Assert We Allocated Approximately What We Expected To	2020-05-12 16:55:03 -04:00
W. Felix Handte	27e2482217	Minor Refactor	2020-05-12 16:55:03 -04:00
W. Felix Handte	afc2488973	Handle Non-Static CCtxes in Estimation	2020-05-12 16:54:33 -04:00
W. Felix Handte	7ed996f5a0	Consolidate CCtx Size Estimation Code This commit pulls out the internals of `ZSTD_estimateCCtxSize_usingCCtxParams` into a helper. It then migrates two other callsites to use that helper, a small optimization for `ZSTD_estimateCStreamSize_usingCCtxParams`, which folds the buffer sizing into the helper, and then `ZSTD_resetCCtx_internal`, which is more invasive. This attempts to guarantee that the estimates returned to users are always correct.	2020-05-12 16:26:53 -04:00
Nick Terrell	4e0515916d	[lib] Fix repcode validation in no dict mode	2020-05-12 11:57:15 -07:00
Yann Collet	608f1bfc4c	fixed context downsize with initStatic When context is created using initStatic, no resize is possible. fix : only bump oversizeDuration when !initStatic	2020-05-11 18:16:38 -07:00
W. Felix Handte	c6636afbbb	Fix ZSTD_estimateCCtxSize() Under ASAN `ZSTD_estimateCCtxSize()` provides estimates for one-shot compression, which is guaranteed not to buffer inputs or outputs. So it ignores the sizes of the buffers, assuming they'll be zero. However, the actual workspace allocation logic always allocates those buffers, and when running under ASAN, the workspace surrounds every allocation with 256 bytes of redzone. So the 0-sized buffers end up consuming 512 bytes of space, which is accounted for in the actual allocation path through the use of `ZSTD_cwksp_alloc_size()` but isn't in the estimation path, since it ignores the buffers entirely. This commit fixes this.	2020-05-11 18:58:19 -04:00
caoyzh	a7e34ff693	revert ZSTD_reduceTable_internal()'s modification	2020-05-07 13:10:46 -07:00
caoyzh	b2e56f7f7f	Optimize compression by using neon function.	2020-05-07 13:10:46 -07:00
W. Felix Handte	6028827fee	Rewrite Include Paths to be Relative Addresses #1998.	2020-05-04 15:20:26 -04:00
W. Felix Handte	6696933b32	Make All Invocations Start With Literal Format String	2020-05-04 10:59:15 -04:00
W. Felix Handte	5e5f262612	Add (Possibly Empty) Info Strings to All Variadic Error Handling Macro Invocations	2020-05-04 10:58:55 -04:00
Nick Terrell	e103d7b4a6	Fix superblock mode (#2100) Fixes: Enable RLE blocks for superblock mode Fix the limitation that the literals block must shrink. Instead, when we're within 200 bytes of the next header byte size, we will just use the next one up. That way we should (almost?) always have space for the table. Remove the limitation that the first sub-block MUST have compressed literals and be compressed. Now one sub-block MUST be compressed (otherwise we fall back to raw block which is okay, since that is streamable). If no block has compressed literals that is okay, we will fix up the next Huffman table. Handle the case where the last sub-block is uncompressed (maybe it is very small). Before it would skip superblock in this case, now we allow the last sub-block to be uncompressed. To do this we need to regenerate the correct repcodes. Respect disableLiteralsCompression in superblock mode Fix superblock mode to handle a block consisting of only compressed literals Fix an off-by-1 error in superblock mode that disabled it whenever there were last literals Fix superblock mode with long literals/matches (> 0xFFFF) Allow superblock mode to repeat Huffman tables Respect ZSTD_minGain(). Tests: Simple check for the condition in #2096. When the simple_round_trip fuzzer enables superblock mode, it checks that the compressed size isn't expanded too much. Remaining limitations: O(targetCBlockSize^2) because we recompute statistics every sequence Unable to split literals of length > targetCBlockSize into multiple sequences Refuses to generate sub-blocks that don't shrink the compressed data, so we could end up with large sub-blocks. We should emit those sections as uncompressed blocks instead. ... Fixes #2096	2020-05-01 16:11:47 -07:00
Bimba Shrestha	1875f616ce	passing dictContentType instead of rawContent every time	2020-04-21 22:29:35 -07:00
Bimba Shrestha	5b0a452cac	Adding --long support for --patch-from (#1959) * adding long support for patch-from * adding refPrefix to dictionary_decompress * adding refPrefix to dictionary_loader * conversion nit * triggering log mode on chainLog < fileLog and removing old threshold * adding refPrefix to dictionary_round_trip * adding docs * adding enableldm + forceWindow test for dict * separate patch-from logic into FIO_adjustParamsForPatchFromMode * moving memLimit adjustment to outside ifdefs (need for decomp) * removing refPrefix gate on dictionary_round_trip * rebase on top of dev refPrefix change * making sure refPrefix + ldm is < 1% of srcSize * combining notes for patch-from * moving memlimit logic inside fileio.c * adding display for optimal parser and long mode trigger * conversion nit * fuzzer found heap-overflow fix * another conversion nit * moving FIO_adjustMemLimitForPatchFromMode outside ifndef * making params immutable * moving memLimit update before createDictBuffer call * making maxSrcSize unsigned long long * making dictSize and maxSrcSize params unsigned long long * error on files larger than 4gb * extend refPrefix test to include round trip * conversion to size_t * making sure ldm is at least 10x better * removing break * including zstd_compress_internal and removing redundant macros * exposing ZSTD_cycleLog() * using cycleLog instead of chainLog * add some more docs about user optimizations * formatting	2020-04-17 15:58:53 -05:00
Bimba Shrestha	c0d4b2b5a3	Merge pull request #2075 from bimbashrestha/dict_fuzzer_ref [bug] handling case where prefix is NULL or 0 sized in refPrefix_advanced	2020-04-07 17:37:19 -05:00
Bimba Shrestha	1658ae75cd	handling nil case for refprefix	2020-04-07 14:41:53 -07:00
Carl Woffenden	edd9a07322	Code replicated in compression and decompression moved to shared headers `CHECK_F` macro moved to `error_private.h` (shared between `fse_compress.c` and `fse_decompress.c`). `ZSTD_limitCopy()` moved to `zstd_internal.h` (shared between `zstd_compress.c` and `zstd_decompress.c`). Erroneous build artefact `zstd.h` removed from repo.	2020-04-07 11:02:06 +02:00
Carl Woffenden	7c420344d2	Single-file decoder script can now (optionally) create an encoder To complement the single-file decoder a new script was added to create an amalgamated single-file of all of the Zstd source, along with examples and (simple) tests.	2020-04-03 19:07:46 +02:00
Nick Terrell	ac58c8d720	Fix copyright and license lines * All copyright lines now have -2020 instead of -present * All copyright lines include "Facebook, Inc" * All licenses are now standardized The copyright in `threading.{h,c}` is not changed because it comes from zstdmt. The copyright and license of `divsufsort.{h,c}` is not changed.	2020-03-26 17:02:06 -07:00

1312 Commits