between lz4.c and lz4hc.c.
was left in a strange state after the "amalgamation" patch.
Now only 3 directives remain,
with the same name across both implementations,
and a single place of definition.
This might allow some light simplification, due to the reduced number of possible states.
Also fixed an overly cautious overflow-risk flag:
overflow is actually impossible there, due to the test just one line above.
Changed the cast position, just to please the analyzer.
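For illustration, a hedged sketch of the kind of pattern involved (the function, variable names and guard are hypothetical, not the actual lz4 code):

```c
#include <stddef.h>

int encodeLength(size_t len)
{
    if (len > 1000) return -1;   /* the guard "just one line above" */
    /* before: `(int)(len + 1)` could be flagged as a potential overflow
     * in the addition, even though the guard above makes it impossible;
     * after: cast first, then add; same value given the guard */
    return (int)len + 1;
}
```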
Make it possible to generate LZ4-compressed blocks
with a controlled maximum offset (necessarily <= 65535).
This could be useful for compatibility with decoders
using a very limited memory budget (<64 KB).
Answers #154.
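Released lz4 versions expose such a cap as the `LZ4_DISTANCE_MAX` macro in lz4.h; assuming that is the mechanism introduced here (an assumption), usage looks like this sketch, with 4095 as an example value:

```c
/* Sketch: compile the whole library with a reduced history window, e.g.
 *
 *     cc -DLZ4_DISTANCE_MAX=4095 -c lz4.c
 *
 * Every offset produced by this build stays <= 4095, so a decoder only
 * needs a ~4 KB history buffer; the blocks remain valid LZ4 and are
 * decodable by any standard decoder. */
#define LZ4_DISTANCE_MAX 4095   /* <= 65535; must match the lz4.c build */
#include "lz4.h"
```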
- promoted LZ4_resetStreamHC_fast() to stable
- moved LZ4_resetStreamHC() into deprecated status (but no deprecation warning yet)
- updated doc, to highlight the difference between init and reset (see the sketch after this list)
- switched all invocations of LZ4_resetStreamHC() to LZ4_initStreamHC()
- misc: ensure `make all` also builds /tests
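A minimal sketch of the intended calling pattern (the function names are the real lz4hc.h API; the surrounding usage is illustrative):

```c
#include "lz4hc.h"

void example(void)
{
    LZ4_streamHC_t state;   /* arbitrary memory, content unknown */

    /* init: required once, makes memory of unknown content usable */
    LZ4_streamHC_t* const ctx = LZ4_initStreamHC(&state, sizeof(state));
    if (ctx == NULL) return;   /* size too small, or misaligned */

    /* ... first compression job using ctx ... */

    /* reset: sufficient between jobs, once the state is known valid */
    LZ4_resetStreamHC_fast(ctx, LZ4HC_CLEVEL_DEFAULT);
    /* ... next compression job ... */
}
```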
Every `0xff` byte in the compressed block corresponds to a length of 255 (not 256) in the input data. For long repeating sequences, using `length >> 8` may generate bad compressed blocks.
It was a fairly complex scenario,
involving source files > 64K
and some extraordinary conditions related to a specific layout of ranges of zeroes,
and it only happened at level 9.
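For reference, a sketch of a correct emission loop for those optional length bytes (illustrative names, not the exact lz4 code): each extra byte contributes 255, so the loop must subtract 255, not 256:

```c
#include <stddef.h>
#include <stdint.h>

/* append the extension bytes for a length that exceeded the token */
static uint8_t* writeLengthBytes(uint8_t* op, size_t len)
{
    while (len >= 255) { *op++ = 0xff; len -= 255; }
    *op++ = (uint8_t)len;   /* final byte < 255 terminates the run */
    return op;
}
```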
Having a dedicated file for the optimal parser
made sense during its creation:
it allowed Przemyslaw to work more freely on lz4opt, with less dependency on lz4hc.
Moreover, the optimal parser was more complex, with its own search functions.
Since the optimal parser was rewritten last year, it's now a lot lighter.
It makes more sense now to integrate it directly inside lz4hc.c,
making it easier to edit (editors are a bit "lost" inside a `*.h` dependent on its `#include` position).
It also reduces the number of files in the project,
which fits pretty well with lz4 objectives
(adding lz4hc requires "just" lz4hc.h and lz4hc.c).
The LZ4 block format specification
states that the last match must start
at a minimum distance of 12 bytes from the end of the block.
However, out of an abundance of caution,
the reference implementation would actually stop searching for matches
at 13 bytes from the end of the block.
This patch fixes this small detail.
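For reference, the arithmetic behind the rule, using the constant values from lz4.c (a sketch; `blockEnd` is a hypothetical pointer to one past the end of the block):

```c
#include <stdint.h>

#define MFLIMIT      12   /* last match must start >= 12 bytes before block end */
#define LASTLITERALS  5   /* the last 5 bytes must remain literals */

/* the last position where a match may legally start ... */
static const uint8_t* lastMatchStart(const uint8_t* blockEnd)
{
    return blockEnd - MFLIMIT;
}
/* ... and a match starting there can span at most
 * MFLIMIT - LASTLITERALS = 7 bytes, the "maximum match length of 7"
 * mentioned below. */
```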
The new version is now able to properly compress a limit case
such as `aaaaaaaabaaa\n`
as reported by Gao Xiang (@hsiangkao).
Obviously, it doesn't change much.
This is just one additional match candidate per block, with a maximum match length of 7 (since the last 5 bytes must remain literals).
With default policy, blocks are 4 MB long, so it doesn't happen too often.
Compressing silesia.tar at default level 1 saves 5 bytes (100930101 -> 100930096).
At max level 12, it saves a grand 16 bytes (77389871 -> 77389855).
The impact is a bit more visible when blocks are smaller, hence more numerous.
For example, compressing silesia with blocks of 64 KB (using -12 -B4D) saves 543 bytes (77304583 -> 77304040).
So the smaller the packet size, the more visible the impact.
And it happens that we have a ton of scenarios with small blocks using LZ4 compression...
And a useless "hooray" side note:
the patch improves the LZ4 compression record of silesia (using -12 -B7D --no-frame-crc) by 16 bytes (77270672 -> 77270656)
and the record on enwik9 by 44 bytes (371680396 -> 371680352) (previously claimed by [smallz4](http://create.stephan-brumme.com/smallz4/)).
Pattern analysis (currently limited to long ranges of identical bytes)
is actually detrimental to performance
when `nbSearches` is low.
The reason: `nbSearches` already provides built-in protection for these cases.
The problem with patterns is that they dramatically increase the number of candidates to visit.
But with a low `nbSearches`, the match finder just aborts early.
In such cases, pattern analysis adds some complexity without reducing the total number of candidates visited.
It actually increases compression ratio a little bit, by filtering only "good" candidates,
but at a measurable speed cost, so it's not a good trade-off.
This patch makes pattern analysis optional.
It's enabled for levels 8+ only.
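A hypothetical sketch of the gating (illustrative names, not the exact lz4hc.c code):

```c
typedef enum { noPatternAnalysis = 0, withPatternAnalysis = 1 } patternAnalysis_e;

/* pattern analysis only pays off when the level grants enough
 * nbSearches for the extra candidates to be worth visiting */
static patternAnalysis_e selectPatternAnalysis(int cLevel)
{
    return (cLevel >= 8) ? withPatternAnalysis : noPatternAnalysis;
}
```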
lz4opt is only competitive with lz4hc from level 10 onwards.
Below that level, it doesn't match the speed / compression effectiveness of the regular hc parser.
This patch proposes to extend lz4opt to levels 10-12.
The new level 10 tends to compress a bit better and a bit faster than the previous one (mileage varies depending on the file).
The only downside is that `limitedDestSize` mode is now limited to max level 9 (vs 10),
since it's only compatible with the regular HC parser.
(Note: I suspect it's possible to convert lz4opt to support it too, but haven't spent time on it.)
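Summarized as a sketch (illustrative, not the actual dispatch code):

```c
/* parser selection by compression level after this patch */
static const char* parserFor(int cLevel)
{
    return (cLevel >= 10) ? "optimal parser (lz4opt)"   /* levels 10-12 */
                          : "regular HC parser";        /* levels <= 9; also limitedDestSize */
}
```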
The first byte used to be skipped
to avoid an infinite self-comparison.
This is no longer necessary, since init() ensures that the index starts at 64K.
The first byte is also useless to search when each block is independent,
but it's no longer the case when blocks are linked.
Removing the first-byte-skip saves
about 10 bytes / MB on files compressed with -BD4 (linked 64 KB blocks),
which feels correct, as each MB contains 16 blocks of 64 KB.
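A sketch of why the skip became unnecessary (hypothetical names; the 64 KB start index comes from the note above):

```c
#include <stdint.h>

#define START_INDEX  0x10000u   /* init() indexes the block from 64 KB */
#define MAX_DISTANCE 65535u

/* an empty hash-table entry reads as index 0; for any position in the
 * current block (index >= 64 KB) that candidate is out of range, so
 * position 0 can never be matched against itself */
static int candidateInRange(uint32_t current, uint32_t candidate)
{
    return (current - candidate) <= MAX_DISTANCE;
}
/* e.g. candidateInRange(START_INDEX, 0) == 0 : rejected */
```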
- works with byte values other than `0`
- works for any repetitive pattern of length 1, 2 or 4 (but not 3!)
- works for little and big endian systems
- preserves the speed of the previous implementation
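A minimal sketch in the spirit of that list (illustrative names, not the actual lz4hc.c function): count how many bytes at `ip` follow a repeating 32-bit pattern, which covers repetition periods 1, 2 and 4, works for any byte value, and stays endian-neutral by comparing in native memory order:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static size_t countPattern(const uint8_t* ip, const uint8_t* const iEnd,
                           uint32_t pattern32 /* 4-byte pattern, native order */)
{
    const uint8_t* const iStart = ip;

    /* bulk phase: compare 4 bytes at a time (unaligned-safe) */
    while (iEnd - ip >= 4) {
        uint32_t v;
        memcpy(&v, ip, sizeof(v));
        if (v != pattern32) break;
        ip += 4;
    }

    /* tail phase: byte by byte; the bulk phase consumed a multiple
     * of 4 bytes, so the pattern phase is still aligned */
    {
        const uint8_t* const p = (const uint8_t*)&pattern32;
        size_t i = 0;
        while ((ip < iEnd) && (*ip == p[i & 3])) { ip++; i++; }
    }
    return (size_t)(ip - iStart);
}
```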