townforge/md4c - md4c - Townforge git

Author	SHA1	Message	Date
Martin Mitas	3254b7cb00	md_process_table_block_contents: Suppress empty TBODY block generation. When the table has no body rows, do not call the callback with MD_BLOCK_TBODY events. Fixes #138.	2020-11-13 12:02:39 +01:00
Martin Mitas	4585088ad7	md_analyze_permissive_url_autolink: Better GFM compatibility. Autolinks now allow unmatched parentheses; only trailing closing parentheses are handled specially, to deal with the situation where the autolink is entirely inside outer parentheses. Somehow our tests were broken and avoided the cases with unmatched parenthesis pairs inside the autolink. That's now fixed and in sync with GFM specs too. Fixes #135.	2020-11-13 10:22:34 +01:00
Martin Mitas	002f76c975	md_resolve_links: Skip [...] used as a reference link/image label. Fixes #131.	2020-10-18 09:43:06 +02:00
Martin Mitas	c501c891b9	Fix spelling of "than" in many occurrences. I often spell it erroneously as "then". I make this mistake way too often when typing fast.	2020-07-30 10:13:05 +02:00
Martin Mitas	c595c2ed00	md_process_verbatim_block_contents: Fix off-by-one error. This caused outputting wrong indentation inside fenced code blocks for lines indented with more than 16 spaces. Fixes #124.	2020-07-30 08:38:19 +02:00
Martin Mitas	0c4d7f3d85	test/normalize.py: Use html.escape instead of cgi.escape. Fixes #123.	2020-07-28 07:20:51 +02:00
Martin Mitas	d0e3ed79bf	md2html: Skip UTF-8 BOM, if present in the input.	2020-03-12 23:08:29 +01:00
Martin Mitas	9e6ab76c24	Minor fuzz-input cleanup. Move some permissive links incorrectly placed in commonmark.md into gfm.md.	2020-02-17 12:41:50 +01:00
Martin Mitas	cc9a9d28ca	test/fuzz-test: Add some fuzzing testing initial input.	2020-02-16 15:29:54 +01:00
Martin Mitas	5d7c35973e	md_analyze_emph: Detect correctly opener chain when resolving the range. Fixes #107.	2020-02-16 13:51:05 +01:00
Martin Mitas	b4c30cd6e6	Improve wiki-link parsing. * Get rid of MD_LINE::total_indent. * Remove some special complicated branching for nested images: Instead we use md_rollback() on the wiki-link destination span to kill _any_ marks resolved so far, including the images. * Remove any length limit from label. Only destination length is limited, regardless of whether '|' is present or not. * Move the special handling of `[[foo|]]` from md_process_inlines() into md_resolve_links(). We simply expand the closer mark to consume the `|`. * Do not modify the opener and closer marks until we really know it is indeed a wiki-link.	2020-02-13 02:50:15 +01:00
Martin Mitas	403043bba3	md_mark_chain_append: Set next of the tail mark to -1. Fixes #104.	2020-01-16 16:27:37 +01:00
Martin Mitáš	e6661f23dc	Implement an underline extension. (#103) Closes #101.	2020-01-10 19:27:10 +01:00
Martin Mitas	82d7d087cc	Rework/improve recognition of strike-through spans. Closes #102.	2020-01-10 16:11:21 +01:00
Martin Mitas	561f52e05f	md_is_autolink_email: Fix an off-by-one error. Fixes #100.	2020-01-05 18:33:46 +01:00
Martin Mitas	46f25f0b47	md_analyze_emph: Call md_resolve_range() with proper chain. Erroneously, we have called md_resolve_range() with mark chain derived from the closer mark. In the case that the opener and closer marks differ in length (and we have split one or the other), we pass in an incorrect chain, which may lead to strange behavior in subsequent analysis. Fixes #98.	2019-11-12 21:48:26 +01:00
niblo	e336e6404f	Add support for Wiki links (#92) With a new flag MD_FLAG_WIKILINKS, recognize wiki-style links such as [[foo]] and [[foo|bar]]. Update also the HTML renderer accordingly, to output a custom HTML tag <x-wikilink> when seeing it.	2019-11-04 15:20:59 +01:00
Martin Mitáš	ef85cfc278	Simplify parsing of tables (#97) We do so by removing the function md_is_table_row(). md_is_table_row() did some crazy inline parsing to detect whether the line contains at least one pipe which is not inside a code span or other high-priority inline element. This was very complicated under the hood and it was actually breaking the clean design which separates block analysis from inline analysis of each block's contents. We now just use the table underline for determining that the block is a table and its properties, e.g. the column count. This means a paragraph now cannot interrupt a table. This is a change in behavior but likely an acceptable one, as it actually brings the behavior closer to the behavior of tables in cmark-gfm in this regard. Last but not least, the old approach seems to prevent adoption of other useful features; for more about that, see the discussion in PR #92.	2019-11-04 15:05:07 +01:00
Martin Mitas	993c7b9b88	Render LaTeX math into HTML as a tag <x-equation>... ... instead of <equation>. This is to highlight that it is not a standard HTML tag.	2019-11-03 23:32:46 +01:00
Martin Mitas	e97d0250bb	Link label comparison fixes. * md_link_label_cmp: To match the labels, the loop has to reach the ends of both labels. * md_link_label_cmp_load_fold_info: Collapse consecutive whitespace into a single ' ' for the label comparison purposes. Fixes #96.	2019-11-03 13:57:00 +01:00
Martin Mitas	0354e1ab5a	md_is_container_mark: Ordered list mark requires at least one digit. Fixes #95.	2019-10-04 22:35:54 +02:00
Martin Mitas	9760636977	Fix the last test case in latex-math.txt.	2019-07-07 11:19:21 +02:00
Martin Mitas	099ce69b04	Add missing file into git.	2019-07-07 11:15:44 +02:00
Martin Mitas	2e965941ed	Add/improve docs for the LaTeX math spans.	2019-07-07 10:59:20 +02:00
Tilman Roeder	8bac86aa43	Added support for LaTeX math (#87) Addresses #86.	2019-07-07 10:46:10 +02:00
Martin Mitas	ce8b5d9440	md_analyze_line: Blockquote with blank line can interrupt a paragraph. Fixes #83.	2019-05-27 22:16:35 +02:00
Martin Mitas	5138616445	md_link_label_cmp: Fix handling non-trivial folding info. Fixes #78.	2019-05-19 11:46:26 +02:00
Martin Mitas	4f6a9e546f	Update Unicode support to 12.1. * scripts/build__map.py: Implement helper pythonic scripts used to generate some Unicode search maps and data for helper Unicode functions used in MD4C. This should simplify updating to future Unicode versions. * md_get_unicode_fold_info: Use data generated by the scripts. * md_is_unicode_whitespace__: Ditto. * md_is_unicode_punct__: Ditto.	2019-05-19 11:00:40 +02:00
Martin Mitas	aca5c27f1f	test/spec.txt: Update from upstream head.	2019-05-16 22:48:08 +02:00
Martin Mitas	64a1bc37f5	test/coverage.txt: Sort the regression test cases by the issue number.	2019-05-15 23:25:05 +02:00
Martin Mitas	919a0cc9e0	test/*.txt: Fix some formatting.	2019-05-08 07:38:33 +02:00
Martin Mitas	1757ff55c6	test/spec_tests.py: Make ready for spec.txt from cmark-gfm project. This allows easier checking of our GFM dialect compatibility.	2019-05-07 23:10:46 +02:00
Martin Mitas	83047d3eb1	md_analyze_permissive_url_autolink: Improve. * Fix domain recognition so that it has to have at least two dot-delimited components. * Fix handling of parentheses so that they have to form balanced pairs; i.e. the first ')' not having a preceding opener ends the path. Fixes #76.	2019-05-07 22:24:29 +02:00
Martin Mitas	609dfb0b1e	md_analyze_line: Treat blank lines inside a HTML block more carefully... ... with respect to the parent list containers. Fixes #10 (but now really).	2019-05-05 15:56:51 +02:00
Martin Mitas	952791318f	When undoing complete block from ctx->block_bytesp[], reset ctx->current_block properly. Fixes #74.	2019-04-30 00:32:36 +02:00
Martin Mitas	d4d1091511	Improve parsing of inline raw HTML. * Isolate some common code for scanning HTML closer into a new function so most HTML scanner functions reuse the same code. * Improve the scanning for the closer so that on failure we remember the range where no closer is present. So any later scanning attempts may fail early. Fixes #73.	2019-04-29 19:03:16 +02:00
Martin Mitáš	d7920b9c25	Merge pull request #67 from mity/spec-0.29. This merges all changes for CommonMark specification 0.28 -> 0.29 transition.	2019-04-08 19:35:06 +02:00
Martin Mitas	5b78f295c6	test/spec.txt: Update from upstream head.	2019-04-08 11:00:27 +02:00
Martin Mitas	2a7b97ed46	test/spec.txt: Update from upstream head.	2019-04-05 08:18:54 +02:00
Martin Mitas	b858698784	md_collect_mark: Add missing 'continue' to '~' branch. Fixes #69.	2019-04-03 08:28:27 +02:00
Martin Mitas	855a1bfccf	test/spec.txt: Update from upstream head.	2019-03-27 02:04:24 +02:00
Martin Mitas	94c86fe292	Revert "Fix problematic link destinations with angle brackets." The updated specification now explicitly requests the behavior we implemented before fixing #24. This reverts commit `2e0a74ba99`. Also remove associated regression test as it is no longer valid.	2019-03-26 14:45:23 +02:00
Martin Mitas	0959975a8c	md_analyze_emph: Follow specs changes to the "rule of three".	2019-03-26 14:01:02 +02:00
Martin Mitas	98968e22ed	Update spec.txt from upstream head. (I previously used an updated revision of it by mistake.)	2019-03-26 13:33:05 +02:00
Martin Mitas	1edd0c9cf5	test/spec.txt: Update to current upstream HEAD.	2019-03-26 11:49:25 +02:00
Martin Mitas	2dd96ab4ac	Fix O(n^2) in handling the "rule of three". We had to break the list of potential '*' openers into multiple ones so we do not have to walk it when looking for matching length due to the "rule of three" for intraword delimiter runs. Fixes #63.	2019-03-12 10:27:36 +02:00
Martin Mitas	b21086522e	md_analyze_line: Fix O(n^2) in thematic break handling. Fixes #66.	2019-03-11 21:13:15 +02:00
Martin Mitas	37104fc281	md_is_code_span: Fix crash at EOF. Fixes #65.	2019-03-11 20:26:58 +02:00
Martin Mitas	966b8e39b5	md_is_link_title: Stop on ')' in ()-style title. Fixes #60.	2019-03-11 19:56:46 +02:00
Martin Mitas	fc27108e71	test/pathological_tests.py: Output test durations.	2019-03-11 19:55:08 +02:00
Martin Mitas	53f65852be	test/spec.txt: Little update. Somehow we had a slightly different spec.txt version than the one from CommonMark repo tag 0.28. But we still pass its whole compliance test suite.	2019-03-11 19:03:34 +02:00
Martin Mitas	685b714453	Move codespan detection from md_analyze_backtick() into... md_is_code_span(), called from md_collect_marks(). We have to do this at the same time as detecting raw inline HTML to follow CommonMark priority requirements. Also it is done very differently now: When scanning for the closer mark, we remember (the latest) position of potential closers for all other lengths as well. This means that: (1) If we find it, we have reduced the task because all subsequent scans shall begin after the closer. (2) If we do not find it, then we have to reach the end of the block and hence we then know (for every allowed marker length) the position of the last such backtick sequence. (3) That gives the guarantee that any subsequent call will either succeed in its scan (and reduce the task even further), or that we shall be able to detect instantly there is no suitable closer. I.e. every call either reduces the task by O(n) scan (1); or collects all the data in O(n) because (2) happens at most once; or fails in O(1) (3). This guarantees O(n) complexity of the function. Fixes #59.	2019-03-11 13:02:17 +01:00
Martin Mitas	0cb61205b1	Move raw inline HTML detection from md_analyze_lt_qt() into md_collect_marks(). Fixes #58: For resolving raw inline HTML the function tried the closer with all potential openers, because raw HTML can have '<' inside of an attribute. However this caused O(n^2) for input like "<><><><><><><>...". We solve this by handling raw HTML at an earlier stage, directly in md_collect_marks(), where we can scan linearly forward. Fixes #61: As a side effect, this also fixes the issue that MD_FLAG_NOHTMLSPANS also disabled recognition of CommonMark autolinks.	2019-03-11 13:02:17 +01:00
Martin Mitáš	8e01a769ea	Implement task lists. (#50) Fixes #30.	2019-02-10 22:58:42 +01:00
Martin Mitas	d32aa2e076	Fix conflict in parsing permissive autolinks and ordinary links. The issue is caused by the fact that we do not know the exact position of a permissive autolink at the time of md_collect_marks() because there is no syntax to mark its end in the first place. As a result, the closer mark in ctx->marks[] can eventually be somewhat out-of-order. As a consequence, if some other mark range (e.g. ordinary link) shadows the autolink, the closer mark may be left outside the shadowed range and survive till the phase when we generate the output. We fix this by using an extra mark flag to remember that we really did output the opener mark, and output the closer only in such case. Fixes #53.	2019-02-09 10:40:52 +01:00
Martin Mitas	67401e7019	md_analyze_inlines: Resolve table cell boundaries before links. This brings some corner cases closer to cmark-gfm. Also fixes #51.	2019-02-06 04:31:25 +01:00
Martin Mitas	8fc692badc	md_rollback: Do not touch TABLECELLBOUNDARIES chain. This chain is not normal opener/closer inline mark chain. Fixes #42.	2018-06-11 18:18:56 +02:00
Martin Mitas	e6e2ea4c5a	md_analyze_line: Fix mixing list and table parsing. If table header underline is not nested the same way as the preceding line (i.e. the wannabe table header line), then it cannot form a table. Fixes #41.	2018-06-11 11:43:47 +02:00
Martin Mitas	4ef024fbb7	md_process_inlines: Fix link/image closers spanning over multiple lines. Fixes #40.	2018-05-29 23:30:02 +02:00
Martin Mitas	7deaccf65d	md_is_link_label: Fix if the link label contains just backslash escapes. The function did not remember the label start line index, leading to bad consequences. Fixes #39.	2018-05-29 18:38:51 +02:00
Martin Mitas	bf022cb656	Fix md_split_simple_pairing_mark(). When splitting a mark into two, make sure each of them gets the right share of dummies in case we have to split once more. Fixes #36.	2018-05-28 21:16:29 +02:00
Martin Mitas	e7b84d65a4	pathological_tests.py: Fix test compatibility with Windows.	2018-05-28 21:09:32 +02:00
Martin Mitas	81e2a5cac2	pathological_tests.py: Test deeply nested lists.	2018-04-12 17:04:12 +02:00
Martin Mitas	0d1a41a4d2	md_build_attr_append_substr: Fix +1 allocation error. Fixes #33.	2018-03-28 08:21:21 +02:00
Martin Mitas	19b24bdd11	Simplify the pathological test "many references".	2017-08-16 18:16:49 +02:00
Martin Mitas	07cec7dcd6	Add regression test for #24.	2017-08-16 16:34:50 +02:00
Martin Mitas	ee3bee1a5d	Upgrade to CommonMark specification 0.28.	2017-08-02 00:38:54 +02:00
Martin Mitas	938460d564	Improve/unify output of test scripts.	2017-07-25 03:25:42 +02:00
Martin Mitas	c52a50a3db	pathological_tests.py: Add test for reference definition lookup.	2017-07-25 03:25:42 +02:00
Martin Mitas	c51fb31058	md_analyze_marks: Walk only required range of the marks. This change means that when recursing into analysis of link contents, only the marks between the link opener and closer are iterated in md_analyze_marks(). Fixes #22.	2017-07-24 23:33:25 +02:00
Martin Mitas	a27aefded9	pathological_tests.py: Allow short option -p as a synonym of --program.	2017-07-24 20:17:50 +02:00
Martin Mitas	f4f7b2230c	pathological_tests.py: Allow Windowish line ends.	2017-07-24 20:15:09 +02:00
Martin Mitas	26f14899ed	Add pathological_tests.py from cmark.	2017-07-24 20:12:13 +02:00
Martin Mitas	ad4f28bb85	md_analyze_simple_pairing_mark: Fix the "rule of three". If the first emphasis opener is refused due to the rule of three, a previous opener is examined. However the variable opener_orig_size_module3 was not (re)set accordingly. Fixes #21.	2017-07-24 20:09:23 +02:00
Martin Mitas	cfbce75910	Rework ref. def. dictionary. It now uses FNV1a and we now sort/bsearch only contents of single bucket. Additionally we fix #20 by disabling the invalid ref. definitions during hashtable build.	2017-07-18 18:49:52 +02:00
Martin Mitas	f2821cbd8e	md_analyze_permissive_email_autolink: Make it compatible with CMark-gfm.	2017-07-14 17:10:45 +02:00
Martin Mitas	1bc7f3a84e	render_url_escaped: Fix escaping of ampersand. This affected generating the href attribute of links or the src attribute of images.	2017-07-14 02:24:21 +02:00
Martin Mitas	f3f9404e53	Improve URL autolinks extension. It is now much more compatible to Cmark-gfm. With the flag MD_FLAG_PERMISSIVEWWWAUTOLINKS, we now also support the WWW autolinks (when the http: scheme is omitted).	2017-07-14 02:06:23 +02:00
Martin Mitas	25a156ee1b	Implement strikethrough extension.	2017-07-12 23:30:14 +02:00
Martin Mitas	8999e1844a	Fix "rule of three" for emphasis resolution (issue #14).	2017-01-04 15:20:46 +01:00
Martin Mitas	c63909df8e	When splitting emphasis opener mark, we have to retain 'dummy' marks available for more splitting in the future (issue #15).	2017-01-04 15:06:14 +01:00
Martin Mitas	5271238426	When parsing tables, pipes inside a link/image/code span cannot form a cell boundary (issue #7).	2016-12-27 22:52:06 +01:00
Martin Mitas	f9b4cb8f6e	md_process_inlines: Fix when an expanded mark shadows some nested marks (issue #11).	2016-12-15 16:47:41 +01:00
Martin Mitas	c235a02ee8	test/coverage.txt: Add some tests for higher code coverage.	2016-12-15 13:18:48 +01:00
Martin Mitas	a725fee3f6	md_enter_child_containers: Fix crash (issue #10). Calling md_push_container_bytes() may result in ending a current block which may result in removing some contents from ctx->block_bytes when removing some lines with link reference definitions. This in effect means we have to end the block explicitly before storing the offset into the ctx->block_bytes.	2016-12-14 16:51:24 +01:00
Martin Mitas	ba29d0075e	md_is_link_reference_definition: Fix handling of multiline label (issue #9).	2016-12-12 23:31:59 +01:00
Martin Mitas	09ae86095f	Handle images more like links. Remove MD_SPAN_IMG_DETAIL::alt. Instead, the contents of the image are propagated to the renderer via the MD_RENDERER::text() callback. * This fixes handling of entities inside the image text (issue #4). * It simplifies parsing and, more importantly, it better distinguishes what is the responsibility of the parser and of the renderer respectively. * This allows more flexibility on the renderer's side. Renderers which do not really support images can just output the image content as any other text. The cost is that a renderer into HTML (if it wants to render image contents into the ALT attribute of the IMG tag) has to handle images with more care. Typically such a renderer has to track whether it is inside an image and, if so, render span enter/leave as an empty string.	2016-12-07 23:56:47 +01:00
Martin Mitas	23312d6d65	md_is_html_tag: Fix parsing unquoted attribute value (issue #2).	2016-12-05 11:13:43 +01:00
Martin Mitas	b40d595044	Fix file permissions of python scripts.	2016-12-04 17:01:00 +01:00
Martin Mitas	be7fcc16ff	Implement tables. Note it is implemented as an extension. To enable it, the flag MD_FLAG_TABLES must be explicitly specified.	2016-11-21 13:39:45 +01:00
Martin Mitas	809e611b3c	Migrate to CommonMark specification 0.27.	2016-11-20 00:57:32 +01:00
Martin Mitas	ef5f230ffa	Implement permissive autolinks extensions. With MD_FLAG_PERMISSIVEURLAUTOLINKS, we treat not overly complicated URLs as autolinks even without '<' and '>'. With MD_FLAG_PERMISSIVEEMAILAUTOLINKS, we treat not overly complicated e-mail addresses as autolinks even without '<', '>' and without the 'mailto:' scheme. Also expanded md2html utility and tests to cover these.	2016-10-14 19:56:05 +02:00
Martin Mitas	1cfc6a5f42	Incorporate the specification testsuite from CommonMark.	2016-10-11 01:10:11 +02:00
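
Commit cfbce75910 above mentions that the ref. def. dictionary uses FNV1a and searches only within a single bucket. As an illustration only, here is a minimal sketch of the standard 32-bit FNV-1a hash in C. It is not taken from MD4C's sources; the function names `fnv1a_32` and `bucket_index` and the modulo-based bucket selection are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1a: XOR each byte into the hash, then multiply
 * by the FNV prime. Illustrative only; not MD4C's actual code. */
uint32_t fnv1a_32(const void *data, size_t len)
{
    const unsigned char *p = (const unsigned char *)data;
    uint32_t hash = 2166136261u;    /* FNV-1a 32-bit offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        hash ^= p[i];
        hash *= 16777619u;          /* FNV 32-bit prime */
    }
    return hash;
}

/* Hypothetical helper: pick a bucket for a label, so that sorting and
 * binary search are needed only within that one bucket. */
size_t bucket_index(const char *label, size_t len, size_t n_buckets)
{
    assert(n_buckets > 0);
    return (size_t)(fnv1a_32(label, len) % n_buckets);
}
```

A caller would hash each reference definition label once (after whatever normalization the parser applies) and then compare labels only within the selected bucket.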

143 Commits