143 Commits

Author SHA1 Message Date
Martin Mitáš
481fbfbdf7
Check for hard breaks more carefully to avoid false positives...
... caused by trailing tab characters.

Fixes #250.
2024-02-25 20:51:06 +01:00
Martin Mitáš
64f36805b0
Fix handling tab when removing trailing whitespace.
Especially in connection with ATX headers.
2024-02-25 16:24:50 +01:00
Martin Mitáš
3848bfb6cc Make strikethrough spans follow the same flanking rules...
... as other emphasis spans.

Fixes #242.
2024-02-21 09:09:31 +01:00
Martin Mitáš
aa53f82c29 Introduce an overall limit on link ref. def. instantiations.
This prevents time and output size explosion for input patterns like
the one generated by this:

    $ python -c 'N=1000; print("[x]: " + "x" * N + "\n[x]" * N)'

We roughly allow link reference definitions to blow up the input size
of the document 16 times, or by up to 1 MB, whichever is smaller. Once
the threshold is reached, any further instantiations are left
unresolved and sent to output as plain text.

Fixes #238.
2024-02-07 14:45:09 +01:00
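A minimal sketch of the budget rule described in the commit message above, for illustration only. The function names, the reading of "1 MB" as 1024 * 1024 bytes, and the accounting per reference are assumptions; MD4C's actual C implementation may count differently.

```python
# Hypothetical illustration of the limit; not MD4C's actual code.
ONE_MB = 1024 * 1024  # assumption: "1 MB" means 1024 * 1024 bytes


def ref_def_budget(input_size: int) -> int:
    """Bytes that link ref. def. instantiations may add to the output:
    16 times the input size or 1 MB, whichever is smaller."""
    return min(16 * input_size, ONE_MB)


def count_resolved(input_size: int, instantiation_sizes) -> int:
    """Return how many instantiations fit in the budget. Everything
    after the first one that does not fit stays unresolved text."""
    budget = ref_def_budget(input_size)
    used = 0
    resolved = 0
    for size in instantiation_sizes:
        if used + size > budget:
            break  # threshold reached: the rest is output as plain text
        used += size
        resolved += 1
    return resolved


# The pattern from the commit message: one definition with an N-byte
# destination, referenced N times.
N = 1000
doc = "[x]: " + "x" * N + "\n[x]" * N
resolved = count_resolved(len(doc), [N] * N)
```

With this accounting, only part of the N references in the pathological input is expanded, so the output stays proportional to the input instead of growing with N^2.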
Martin Mitáš
f37a89f5d7
md_is_inline_link_spec: Use md_lookup_line() instead of walking.
Fixes #236.
2024-02-01 22:14:36 +01:00
Martin Mitáš
485619fef7
test/spec.txt: Upgrade to spec version 0.31.2.
It's essentially the same as 0.31 and 0.31.1; it only fixes the release
date metadata in the spec.txt file.

Also fix link in CHANGELOG.md accordingly.
2024-01-30 01:46:15 +01:00
Martin Mitáš
f852aaed31
test/LICENSE: Update to reflect recent file renaming.
Also rename the file to test/LICENSE.md.
2024-01-28 20:37:08 +01:00
Martin Mitáš
1883132b4e
Update test/spec.txt from upstream.
(Spec 0.31 was erroneously released with version 0.30 still inside
it. The re-release 0.31.1 fixes that.)
2024-01-28 19:05:39 +01:00
Martin Mitas
136b39ace0 Update test/spec.txt from upstream. 2024-01-28 09:00:08 +01:00
Martin Mitas
aeddaf587f Simplify and fix handling of newline in code span.
Fixes #223 properly (one corner case went unnoticed, hidden by the test
suite's normalization feature).

Fixes #230 (strictly speaking a duplicate of that corner case).
2024-01-25 22:24:17 +01:00
Martin Mitas
d082cdd8fe test/run-testsuite.py: Allow disabling normalisation on a per-unittest basis.
And use it for a few tests in regressions.txt where the whitespace
matters.
2024-01-25 21:38:45 +01:00
Martin Mitas
a3c510ac0b Improve coverage testing of UTF-8 routines. 2024-01-21 14:15:52 +01:00
Martin Mitas
cd7c326f1c Add code coverage test for MD_FLAG_COLLAPSEWHITESPACE. 2024-01-21 14:15:52 +01:00
Martin Mitas
65957f5369 Limit number of table columns to prevent explosion of output...
... with an input pattern of the form generated by this one-liner:

    $ python3 -c 'N=1000; print("x|" * N + "\n" + "-|" * N + "\n" + "x\n" * N)'

Here the amount of HTML output grows with N^2.
2024-01-19 22:42:56 +01:00
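The quadratic growth mentioned in the commit message above can be seen without running the parser. The sketch below rebuilds the same input and counts the cells an HTML renderer would emit if it padded every short row to the header width, as GFM tables do. The helper names are made up for this illustration, and the counting is a simplification that only handles this specific input shape.

```python
def pathological_table(n: int) -> str:
    """Same input as the one-liner in the commit message."""
    return "x|" * n + "\n" + "-|" * n + "\n" + "x\n" * n


def cell_count(markdown: str) -> int:
    """Cells emitted when every row is padded to the header width.
    Simplified: assumes line 1 is the header and line 2 the underline."""
    lines = markdown.splitlines()
    columns = lines[0].count("|")      # one column per "x|" in the header
    body_rows = len(lines) - 2         # every line after the underline
    return columns * (1 + body_rows)   # header row plus body rows


for n in (10, 100, 1000):
    doc = pathological_table(n)
    print(n, len(doc), cell_count(doc))
```

The input length grows linearly (6N + 2 characters) while the cell count is N * (N + 1), which is why capping the number of columns bounds the output size.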
Martin Mitas
70b247cf7d md_analyze_permissive_autolink: Accept path ending with '/'.
Fixes #226.
2024-01-19 13:59:45 +01:00
Martin Mitas
601ff05326 Fix handling new line at beginning/end of a code span.
Fixes #223.
2024-01-18 16:49:37 +01:00
Martin Mitas
23b1416866 pathological-tests.py: Fix output if a test unit ends with non-zero exit code. 2024-01-18 15:12:26 +01:00
Martin Mitas
a08f6a05f1 Improve/fix latex math extension.
To mitigate false positives:

* We accept $ and $$ as a potential opener only if it's not preceded
  by an alnum char.

* Similarly, a closer cannot be followed by an alnum char.

* We now also match a closer with the last preceding potential opener,
  not the first one. (And to avoid nesting, any previous openers are
  ignored.)

* Also revert an unintended change in 3fc207affa
  which allowed keeping nested resolved marks in it.
2024-01-18 12:29:31 +01:00
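The opener and closer rules from the commit message above can be sketched as follows. This is a simplified illustration, not a port of MD4C's C code: the function names are invented, Python's Unicode-aware str.isalnum() stands in for whatever "alnum" means in MD4C, and other conditions the real parser checks are left out.

```python
def can_open(text: str, start: int) -> bool:
    """A '$' run starting at `start` may open only if it is not
    preceded by an alphanumeric character."""
    return start == 0 or not text[start - 1].isalnum()


def can_close(text: str, end: int) -> bool:
    """A '$' run ending just before `end` may close only if it is not
    followed by an alphanumeric character."""
    return end >= len(text) or not text[end].isalnum()


def find_math_spans(text: str) -> list:
    """Return the contents of math spans, pairing each closer with the
    last preceding potential opener of the same length."""
    spans = []
    opener = None  # (start index, run length) of the latest potential opener
    i = 0
    while i < len(text):
        if text[i] != "$":
            i += 1
            continue
        j = i
        while j < len(text) and text[j] == "$":
            j += 1
        run = j - i
        if opener is not None and opener[1] == run and can_close(text, j):
            spans.append(text[opener[0] + run:i])
            opener = None
        elif can_open(text, i):
            opener = (i, run)  # any earlier opener is ignored from now on
        i = j
    return spans
```

For example, find_math_spans("it costs 5$ or 10$") finds no span because neither '$' qualifies as an opener, while find_math_spans("$a $b$") pairs the closer with the second opener and yields ["b"].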
Martin Mitas
4728cd981d md_analyze_tilde: Pop from chain tail like other emphasis.
The function incorrectly popped from the head of the chain, leading to
a wrong result (incompatible with e.g. GFM) and, even worse, to a bad
internal state which md_rollback() is then potentially unable to resolve.

Fixes #222.
2024-01-17 16:04:14 +01:00
Martin Mitas
f45dd4420e Add regression test for #213.
As it's now possible to add tests with multiple cmdline options easily.
2024-01-17 02:58:12 +01:00
Martin Mitáš
d955c495ee
Rework permissive autolinks. (#220)
 * We now have a dedicated run over the inline marks for them.

 * We check more thoroughly whether it really looks like a URL or e-mail
   address. The old implementation recognized even heavily broken ones.

 * This allows us to be much more careful in order not to cross already
   resolved marks.

 * Share substantial parts of the code between all three types of the
   permissive autolinks (URL, WWW, e-mail).

 * Merge their tests into one file, spec-permissive-autolinks.txt.

 * Add one pathological case which triggered quadratic behavior in the
   old implementation.
2024-01-17 02:48:57 +01:00
Martin Mitas
a715b884ac Rename many files in test dir for better organization. 2024-01-16 19:05:52 +01:00
Martin Mitas
4b9e4d7cdd Move one more forgotten regression test to regressions.txt. 2024-01-16 19:05:52 +01:00
Martin Mitas
6685df9c50 Move all regression tests into new tests/regressions.txt.
(And update scripts/run-tests.sh accordingly.)
2024-01-16 15:09:33 +01:00
Martin Mitas
74e5f7a9a7 Tests: Specify md2html command line options for each example as needed.
Previously the caller (or the script scripts/run_tests.sh) needed to
know what options to specify.
2024-01-16 14:56:09 +01:00
Martin Mitas
359406bfb2 Test: Add support for per-example command line options.
(We also removed support for calling directly into the library. It was
inherited from cmark, as the testsuite was originally taken from there,
but it was never actually updated to work with MD4C.)
2024-01-16 14:45:00 +01:00
Martin Mitas
7882942708 Fix some emphasis parsing issues.
 * We incorrectly applied the infamous rule of three only to
   asterisk-encoded emphasis; it has to be applied to underscores as
   well.

 * We incorrectly applied the rule of three only if the opener
   and/or closer was inside a word. It also has to be applied if the
   mark is both preceded and followed by punctuation.

Fixes #217.
2024-01-13 03:11:29 +01:00
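For reference, the "rule of three" named in the commit message above is the CommonMark condition that if either delimiter run can both open and close emphasis, the sum of the two run lengths must not be a multiple of 3 unless both lengths are multiples of 3. A minimal sketch of that check, with a hypothetical function name and parameters (this is not MD4C's code):

```python
def rule_of_three_allows(opener_len: int, closer_len: int,
                         opener_can_close: bool,
                         closer_can_open: bool) -> bool:
    """Return True if the two delimiter runs may be paired.

    The check is independent of the delimiter character, so it applies
    to '*' and '_' runs alike.
    """
    if not (opener_can_close or closer_can_open):
        return True  # the rule only concerns runs that can do both
    if (opener_len + closer_len) % 3 != 0:
        return True
    return opener_len % 3 == 0 and closer_len % 3 == 0
```

Whether a run "can both open and close" depends on the flanking rules, which is where the second bullet of the commit message comes in: a run surrounded by punctuation on both sides can be both, just like a run inside a word.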
Martin Mitas
5592352fdb HTML declaration doesn't require whitespace before the closer.
Fixes #216.
2024-01-13 00:30:08 +01:00
Martin Mitas
7497ea92b3 Allow tabs after setext header underline.
Fixes #215.
2024-01-13 00:17:08 +01:00
Martin Mitas
0d10b60b19 Move test/fuzz-input/ to test/fuzzers/seed-corpus/. 2024-01-12 22:44:31 +01:00
Martin Mitas
821477b1da Fix typo in fuzz-mdhtml.c, preventing oss-fuzz from working. 2024-01-10 17:35:46 +01:00
Martin Mitas
c6942ef03e Treat TABLECELLBOUNDARIES chain as special one.
It's not an ordinary openers chain as (most of) the others, and
md_rollback() must not touch it.

Fixes #212.
2024-01-10 17:33:06 +01:00
Martin Mitas
ca169a92d5 Fix HTML renderer to handle nested images correctly.
Fixes #210.
2024-01-10 12:23:17 +01:00
Martin Mitas
38303af369 Make md_is_html_block_end_condition() reuse the same data...
... as md_is_html_block_start_condition() for type 1, so that all
tags are used consistently there.

Fixes #207.
2024-01-09 00:01:35 +01:00
Martin Mitas
8699cd5d8e test/hard-soft-breaks.txt: Fix wording. 2024-01-08 21:58:26 +01:00
l-m
6ef3be6e69
MD_FLAG_HARD_SOFT_BREAKS (#193) 2024-01-08 21:09:57 +01:00
Martin Mitas
4d2f8a2e0b Add test for issue #201.
It seems the issue got fixed by a combination of previous commits.

Fixes #201.
2024-01-08 19:35:53 +01:00
Martin Mitas
132c29dcd0 Allow indented code block to follow any block except paragraph without a blank line.
Fixes #200.
2024-01-08 19:31:37 +01:00
Martin Mitas
601c8ab70e Restore parent's block indentation when interrupting a list item with a double blank line.
Fixes #190.
2024-01-08 19:27:16 +01:00
Martin Mitas
a27f8dc093 test/fuzzers.fuzz-mdhtml.c: Remove stale comment. 2023-12-12 19:31:30 +01:00
Martin Mitas
d3c1c0bb2d fuzz-mdhtml.c: Cleanup of the code. 2022-01-14 17:27:05 +01:00
Martin Mitas
b42e7f5cea md_resolve_links: Avoid link ref. def. lookup if...
... we know that the bracket pair contains nested brackets. That makes
the label invalid anyway, so there is no link ref. def. to be found.

In case of heavily nested bracket pairs, the lookup could lead to
quadratic parsing times.

Fixes #172.
2022-01-10 11:41:25 +01:00
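The shortcut from the commit message above can be sketched as follows: a link label may not contain unescaped brackets, so a bracket pair that contains nested brackets cannot name a reference definition and the lookup can be skipped. The helper names and the dictionary standing in for the ref. def. table are hypothetical, and label normalization is reduced to a case fold here.

```python
def has_nested_brackets(label: str) -> bool:
    """True if the label text contains an unescaped '[' or ']'."""
    i = 0
    while i < len(label):
        if label[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if label[i] in "[]":
            return True
        i += 1
    return False


def lookup_ref_def(label: str, ref_defs: dict):
    """Look up a link ref. def., skipping labels that cannot be valid."""
    if has_nested_brackets(label):
        return None  # invalid label: no definition can match it
    return ref_defs.get(label.casefold())
```

For deeply nested input such as "[" * N + "]" * N, every outer pair is rejected by the cheap check instead of triggering a lookup over an ever longer label.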
Martin Mitas
7f44e1ad6c pathological_tests.py: Improve code alignment. 2022-01-10 10:40:29 +01:00
Martin Mitas
a8bb4d3020 md_is_table_underline: Remove requirement for minimal length of a cell underline.
Fixes #169.
2022-01-06 16:01:55 +01:00
Martin Mitas
c01aa6b394 Update CommonMark spec file to v. 0.30 2021-06-27 18:49:33 +02:00
Martin Mitas
bcb55d0d40 md_resolve_links: Suppress bogus nested permissive autolink.
Fixes #152.
2021-04-14 09:18:09 +02:00
DavidKorczynski
3478ec69c1
Added fuzzer for oss-fuzz integration. (#151) 2021-02-23 15:01:31 +01:00
Martin Mitas
fd7b5fe085 md_analyze_line: Fix implicit ending of HTML blocks...
... when the HTML block is not explicitly ended (before the enclosing
container block ends).

Fixes #149.
2021-02-05 21:50:57 +01:00
Martin Mitas
da5821ae0d Fix testcase for issue #142. 2020-12-14 19:53:40 +01:00
Martin Mitas
5a44e327a0 md_link_label_cmp: Fix the loop end condition.
The old version could likely stop prematurely in a corner case when
there was a Unicode character at the end of either string which
maps into multiple fold info codepoints.

Fixes #142.
2020-12-14 19:30:23 +01:00
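The corner case in the commit message above comes from Unicode case folding, where a single character can fold into several codepoints. The sketch below illustrates the effect with Python's str.casefold(). It is not how md_link_label_cmp() is written: the function names are invented, and Python's whitespace handling is broader than the spaces, tabs and line endings that CommonMark label matching collapses.

```python
import re


def normalize_label(label: str) -> str:
    """Strip the label, collapse internal whitespace, then case-fold."""
    collapsed = re.sub(r"\s+", " ", label.strip())
    return collapsed.casefold()


def labels_match(a: str, b: str) -> bool:
    """Compare two link labels the way reference matching expects."""
    return normalize_label(a) == normalize_label(b)


# "ß" folds to the two codepoints "ss", so labels of different lengths
# can still be equal, including when the character is the last one.
print(len("ß".casefold()))
print(labels_match("Fuß", "FUSS"))
```

A comparison loop that walks both strings in lockstep has to keep going until the folded output of the final character is fully consumed, which is the kind of end condition the commit fixes.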