Commit Graph

127 Commits

Author SHA1 Message Date
Martin Mitas
c6535ff3da Fix EOF handling in the middle of a task list item. 2024-01-10 21:39:24 +01:00
Martin Mitas
ebbb12e506 Revert most of PR #168
i.e. of the commit f436c30298.

It added a bunch of checks all over the place, but most of them
shouldn't be needed: If they are true, our internal state is
already broken. In other words, those checks are hiding real bugs
and making debugging harder.

Hopefully the underlying bugs are already fixed in some of the previous
commits addressing some fuzzing issues, like these:

 * d775b5103e
 * c6942ef03e
2024-01-10 20:47:34 +01:00
Martin Mitas
d775b5103e More fixes of TABLECELLBOUNDARIES chain handling.
Fixes #213.
2024-01-10 18:33:32 +01:00
Martin Mitas
c6942ef03e Treat TABLECELLBOUNDARIES chain as special one.
It's not an ordinary openers chain as (most of) the others, and
md_rollback() must not touch it.

Fixes #212.
2024-01-10 17:33:06 +01:00
Martin Mitas
ca169a92d5 Fix HTML renderer to handle nested images correctly.
Fixes #210.
2024-01-10 12:23:17 +01:00
Jens Alfke
efcfd7e7cd
Added MD_SPAN_A_DETAIL.is_autolink (#181)
This allows the processor to tell whether an <A> tag is the result of
an autolink, and customize its output. For example, I want to emit an
autolink of an image URL as an <IMG> tag, and an autolink of a YouTube
URL as a video embed.
2024-01-09 11:32:17 +01:00
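A minimal sketch of how a renderer might act on such a flag. The `LinkDetail` struct is a hypothetical stand-in, not the library's real type; only the `is_autolink` flag comes from the commit message, and the suffix and YouTube checks are illustrative assumptions.

```c
#include <string.h>

/* Hypothetical stand-in for the detail passed with an <A> span. Only the
 * is_autolink flag is taken from the commit message. */
typedef struct {
    const char* href;
    int is_autolink;
} LinkDetail;

typedef enum { RENDER_ANCHOR, RENDER_IMAGE, RENDER_VIDEO } RenderKind;

/* Returns non-zero if `s` ends with `suffix`. */
static int has_suffix(const char* s, const char* suffix) {
    size_t n = strlen(s), m = strlen(suffix);
    return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Decide how to render a link. Ordinary (non-autolink) links always stay
 * plain anchors; only autolinks are considered for special output. */
static RenderKind choose_render_kind(const LinkDetail* det) {
    if (!det->is_autolink)
        return RENDER_ANCHOR;
    if (has_suffix(det->href, ".png") || has_suffix(det->href, ".jpg"))
        return RENDER_IMAGE;
    if (strstr(det->href, "youtube.com/watch") != NULL)
        return RENDER_VIDEO;
    return RENDER_ANCHOR;
}
```

An explicit link to an image URL keeps rendering as an anchor here, because the author chose link syntax on purpose; only bare autolinks get rewritten.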
Martin Mitas
61949ee9d1 Update to Unicode 15.1. 2024-01-09 02:08:48 +01:00
Martin Mitas
38303af369 Make md_is_html_block_end_condition() reuse the same data...
... as md_is_html_block_start_condition() for the type 1 so that all
tags are used consistently there.

Fixes #207.
2024-01-09 00:01:35 +01:00
Martin Mitas
319631f67e Don't merge multiple HTML blocks together.
Fixes #202.
2024-01-08 21:52:30 +01:00
l-m
6ef3be6e69
MD_FLAG_HARD_SOFT_BREAKS (#193) 2024-01-08 21:09:57 +01:00
step
f554bf1108
Don't trim HTML block lines (MD_LINE_HTML) (#206)
Markdown 0.30 doesn't mandate right-trimming the contents of HTML lines.
Doing so is more work and breaks output compatibility with cmark, tested
with https://github.com/commonmark/cmark/commit/9393560.
2024-01-08 20:55:54 +01:00
Martin Mitas
132c29dcd0 Allow indented code block to follow any block except paragraph without a blank line.
Fixes #200.
2024-01-08 19:31:37 +01:00
Martin Mitas
601c8ab70e Restore parent's block indentation when interrupting a list item with double blank line.
Fixes #190.
2024-01-08 19:27:16 +01:00
Martin Mitas
28f253d75c Fix some gcc warnings with -pedantic.
Fixes #187.
2024-01-08 18:18:51 +01:00
Martin Mitas
f7c8db7588 md_rollback: Fix dummization of virtual closers.
Fixes #173.
2022-01-14 11:04:02 +01:00
Martin Mitas
6abb7789f6 Remove debug messages left by mistake in the previous commit. 2022-01-14 10:13:28 +01:00
Martin Mitas
62b60979f6 Reset TABLECELLBOUNDARIES with ordinary opener chains.
This is needed because special handling of '|' is now done also if the
wiki-links extension is enabled so the chain is populated even with that
extension.

Fixes #174.
2022-01-14 10:12:30 +01:00
Martin Mitas
db9ab417b1 Improve wiki-link parsing.
 * md_rollback: Restore dummy marks changed to virtual zero-length
   closers.

 * md_analyze_links: Be more careful in how we roll back contents
   of a full wiki link (`[[destination|label]]`). The destination has to
   be rolled back completely (MD_ROLLBACK_ALL) while the label only with
   MD_ROLLBACK_CROSSING.

Fixes #173.
2022-01-12 16:16:00 +01:00
Martin Mitas
8dd35762d9 md_analyze_dollar: Simplify the function. 2022-01-11 20:53:04 +01:00
Martin Mitas
4358c40ab7 md_lookup_line: Advance to the next line even if the offset...
falls into a gap between two lines, instead of returning NULL.
Fixes NULL dereference in md_is_link_reference(). This was a regression
in 2e9b13cc51.
2022-01-11 10:32:57 +01:00
Martin Mitas
c058e82c6a md_is_table_underline: Fix detection by the end of file.
This was a regression in a8bb4d3020.
2022-01-10 12:34:57 +01:00
Martin Mitas
b42e7f5cea md_resolve_links: Avoid link ref. def. lookup if...
... we know that the bracket pair contains nested brackets. That makes
the label invalid, therefore we know that there is no link ref.
def. to be found anyway.

In case of heavily nested bracket pairs, the lookup could lead to
quadratic parsing times.

Fixes #172.
2022-01-10 11:41:25 +01:00
Martin Mitas
2e9b13cc51 md_lookup_line: New function.
The function performs a binary search over an array of MD_LINE structs to
find the line the given offset lives on.

Replaced a few linear scans for such lines with a call to this function.
2022-01-10 03:16:47 +01:00
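A minimal sketch of such a lookup, assuming a simplified, hypothetical `Line` record with half-open byte offsets rather than the real MD_LINE layout. It follows the behaviour described by the later commit 4358c40ab7 in this log: an offset that falls into a gap between two lines resolves to the next line instead of NULL.

```c
#include <stddef.h>

/* Hypothetical, simplified line record: byte offsets [beg, end) into the
 * input. Lines are sorted by offset and do not overlap. */
typedef struct {
    unsigned beg;
    unsigned end;
} Line;

/* Binary search for the first line whose end lies after `off`. That is the
 * line containing `off`, or the next line if `off` falls into a gap.
 * Returns NULL only if `off` lies at or beyond the end of the last line. */
static const Line* lookup_line(const Line* lines, size_t n_lines, unsigned off) {
    size_t lo = 0, hi = n_lines;            /* search window is [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (off >= lines[mid].end)
            lo = mid + 1;                   /* target is after this line */
        else
            hi = mid;                       /* this line is a candidate */
    }
    return lo < n_lines ? &lines[lo] : NULL;
}
```

Each call takes O(log n) comparisons, compared with O(n) for the linear scans it would replace.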
Thierry Coppey
f436c30298
Fix buffer overflows and other errors found with fuzzing. (#168)
Fix multiple buffer overflows on input found with fuzzing.
2022-01-06 16:21:51 +01:00
Martin Mitáš
eeb32ecc9e
Merge pull request #167 from dtldarek/master
Two buffer overflow fixes.
2022-01-06 16:16:45 +01:00
Martin Mitas
a8bb4d3020 md_is_table_underline: Remove requirement for minimal length of a cell underline.
Fixes #169.
2022-01-06 16:01:55 +01:00
dtldarek
260cd3394d Fix buffer overflow on input found with fuzzing (in C-string format):
"\n# h1\nc  hh##e2ked\n\n A | rong__ ___strong \u0000\u0000\u0000\u0000\u0000\u0000\a\u0000\u0000\u0000\u0000\n# h1\nh# #2\n### h3\n#### h4\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\\\n##### h5\n#*#####\u0000\n6"
2021-08-25 15:02:38 +02:00
dtldarek
933388a657 This is a fix for a buffer overflow that happens on input found with fuzzing (in C-string format): "\xA9##r[](r[](". 2021-08-25 14:41:49 +02:00
Martin Mitas
ab422e83ff md4c-html.h: Fix typo in a comment. 2021-07-15 19:12:49 +02:00
Martin Mitas
ccc8b64a96 md_html: Add ~ to the list of characters not escaped in URIs.
Fixes #165.
2021-07-15 18:56:46 +02:00
Martin Mitas
82b226ffb5 md_is_html_block_start_condition: Accept lower-case HTML declaration.
The change is mandated by the spec v. 0.30.
2021-06-27 18:49:33 +02:00
Martin Mitas
d50a0142a0 md_is_html_block_start_condition: Update for 0.30.
The spec. 0.30 adds the tag <textarea> to the list of tags for HTML block
start condition type 1.
2021-06-27 18:49:33 +02:00
Kai Koehne
e828594220
Fix MSVC compiler level 3 warnings (#162)
Fix various C4244 warnings with the MSVC compiler for 64 bit
2021-06-14 09:47:17 +02:00
Martin Mitas
b2ee4b194c md_resolve_links: Fix the test for the nested autolink covering whole link text.
This fixes the fix for #152.
2021-04-14 18:27:19 +02:00
Martin Mitas
bcb55d0d40 md_resolve_links: Suppress bogus nested permissive autolink.
Fixes #152.
2021-04-14 09:18:09 +02:00
Martin Mitas
4fc808d8fe md_analyze_line: Avoid reading 1 byte beyond the input size.
Fixes #155.
2021-03-29 12:51:48 +02:00
Martin Mitas
aa65423091 md_enter_child_containers: Propagate list mark character properly.
Fixes #153, #154.
2021-03-22 14:04:55 +01:00
Martin Mitas
fe2f242774 Fix copy&paste error in a comment. 2021-02-11 11:35:54 +01:00
Martin Mitas
fd7b5fe085 md_analyze_line: Fix implicit ending of HTML blocks...
... when the HTML block is not explicitly ended (before the enclosing
container block ends).

Fixes #149.
2021-02-05 21:50:57 +01:00
Niclas Rosenvik
2b6ebdfa39
Fix use of the cmake package (#146)
Fix use of the cmake package

Fix use of the cmake package and its imported targets.
Make sure that the include dir comes with the cmake targets
Put everything under md4cConfig so that the md4c-html can see
md4c.
Use md4c namespace so that the targets become md4c::md4c and
md4c::md4c-html following cmake standards for imported targets.

Fixes #145.
2021-01-09 11:54:27 +01:00
Martin Mitas
9ba57ccb2e md_link_label_cmp_load_fold_info: Remove bogus code.
The input into the function is already guaranteed not to contain newline
characters. (And handling of them in the function was broken anyway.)
2020-12-14 19:53:58 +01:00
Martin Mitas
5a44e327a0 md_link_label_cmp: Fix the loop end condition.
The old version likely could stop prematurely in a corner case when
there was a Unicode character at the end of either string, which
maps into multiple fold info codepoints.

Fixes #142.
2020-12-14 19:30:23 +01:00
Martin Mitas
d4a78622a1 Minor cleanup. 2020-12-14 18:58:00 +01:00
Martin Mitas
701a06266b Make MD_UNICODE_FOLD_INFO::n_codepoints unsigned. 2020-12-14 18:57:55 +01:00
Giuseppe D'Angelo
a45f839b7b Fix mixed signed/unsigned comparisons
Force both operands to unsigned. n_codepoints does not seem to ever
contain negative offsets anyhow, should it actually be unsigned?
2020-12-14 18:41:30 +01:00
Giuseppe D'Angelo
6dd6434653 Silence "unused parameter" warnings
Merely added a suitable macro. Didn't refactor any code to
actually figure out why the parameters were not used.
2020-12-14 18:41:30 +01:00
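A minimal sketch of such a macro. The name `UNUSED_PARAM` and the callback are hypothetical; the log does not show the macro the commit actually added. Casting to `void` is a portable way to mark a parameter as intentionally unused.

```c
/* Hypothetical macro name: evaluates the parameter and discards the result,
 * which silences "unused parameter" warnings without changing behavior. */
#define UNUSED_PARAM(x)   ((void)(x))

/* A callback whose signature is fixed by its caller; this particular
 * implementation only needs `size`. */
static int count_bytes(int type, const char* text, unsigned size, void* userdata) {
    UNUSED_PARAM(type);
    UNUSED_PARAM(text);
    UNUSED_PARAM(userdata);
    return (int)size;
}
```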
Giuseppe D'Angelo
569defae40 Silence -Wimplicit-fallthrough warnings
Use a macro that dispatches to the compiler-specific magic
to silence implicit fallthrough warnings when the fallthrough
was actually intended. The code already featured comments,
so these are actually safe to place.

(Unfortunately, Clang does not recognize any comment as
"fall through" comment, and GCC only recognizes some variations
of "fall through", not "pass through". Moreover, one of the
comments replaced here had a typo...)
2020-12-14 18:41:30 +01:00
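A minimal sketch of a dispatching macro of this kind. The name `FALLTHROUGH` is hypothetical, and the detection shown covers only GCC and Clang through `__has_attribute`; other compilers get a no-op statement.

```c
/* Use the compiler attribute where it is available, otherwise expand to a
 * harmless no-op statement. */
#if defined(__has_attribute)
    #if __has_attribute(fallthrough)
        #define FALLTHROUGH   __attribute__((fallthrough))
    #endif
#endif
#ifndef FALLTHROUGH
    #define FALLTHROUGH   ((void)0)
#endif

/* Counts how many steps remain from a given stage. Each case deliberately
 * falls through into the next one. */
static int steps_remaining(int stage) {
    int n = 0;

    switch (stage) {
        case 0:  n++; FALLTHROUGH;
        case 1:  n++; FALLTHROUGH;
        case 2:  n++; break;
        default: break;
    }
    return n;
}
```

Unlike a comment, the macro is understood the same way by every compiler that supports the attribute, so the exact wording of the comment no longer matters.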
Giuseppe D'Angelo
e1b4187611 Enable more warnings when building under GCC/Clang 2020-12-14 18:41:30 +01:00
Martin Mitas
26003b8881 md_is_container_mark: Recognize list item marks just before EOF.
We were recognizing the list item marks when a new line or a blank
character follows.

However, given end-of-file means implicitly also an end-of-line, we
should recognize them in that situation too.

Fixes #139.
2020-12-04 22:34:06 +01:00
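A minimal sketch of the rule described above, restricted to bullet marks. The helper `is_bullet_mark` is hypothetical and much simpler than the real md_is_container_mark(); it only illustrates treating end-of-input like end-of-line.

```c
#include <stddef.h>

/* Hypothetical helper: returns non-zero if text[off] is a bullet character
 * that can start a list item, i.e. it is followed by a blank, a new line,
 * or the end of the input. */
static int is_bullet_mark(const char* text, size_t size, size_t off) {
    size_t next;

    if (off >= size)
        return 0;
    if (text[off] != '-' && text[off] != '+' && text[off] != '*')
        return 0;

    next = off + 1;
    if (next >= size)
        return 1;   /* end of file implies end of line */

    return text[next] == ' '  || text[next] == '\t' ||
           text[next] == '\n' || text[next] == '\r';
}
```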
Martin Mitas
3254b7cb00 md_process_table_block_contents: Suppress empty TBODY block generation.
When the table has no body rows, do not call the callback with
MD_BLOCK_TBODY events.

Fixes #138.
2020-11-13 12:02:39 +01:00
Martin Mitas
a997cb21bf Add MD_BLOCK_TABLE_DETAIL.
This allows renderers to have the info about table dimension (table
column and row count) in advance and e.g. simplify their memory
allocation strategy.
2020-11-13 11:59:49 +01:00
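A minimal sketch of how a renderer could use such dimensions to allocate once up front. `TableDetail` is a hypothetical stand-in and its field names are assumptions; the commit message only states that column and row counts are provided.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the table detail; field names are assumed. */
typedef struct {
    unsigned col_count;
    unsigned head_row_count;
    unsigned body_row_count;
} TableDetail;

/* A renderer-side buffer holding one text pointer per table cell. */
typedef struct {
    unsigned cols;
    unsigned rows;
    const char** cells;
} TableBuffer;

/* Allocate all cells in a single call when the table block is entered.
 * Returns 0 on success, -1 on allocation failure. */
static int table_buffer_init(TableBuffer* buf, const TableDetail* det) {
    size_t n;

    buf->cols = det->col_count;
    buf->rows = det->head_row_count + det->body_row_count;
    n = (size_t)buf->cols * buf->rows;
    buf->cells = calloc(n ? n : 1, sizeof *buf->cells);
    return buf->cells != NULL ? 0 : -1;
}

static void table_buffer_free(TableBuffer* buf) {
    free(buf->cells);
    buf->cells = NULL;
}
```

Without the dimensions, the renderer would have to grow the buffer cell by cell as the cell callbacks arrive.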
Martin Mitas
4585088ad7 md_analyze_permissive_url_autolink: Better GFM compatibility.
The autolinks now allow unmatched parentheses; only the trailing
closing parentheses are handled specially, to deal with the situation where
the autolink is entirely inside outer parentheses.

Somehow our tests were broken and avoided the cases with unmatched
parenthesis pairs inside the autolink. That's now fixed and in sync
with the GFM spec too.

Fixes #135.
2020-11-13 10:22:34 +01:00
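A minimal sketch of the trailing-parenthesis rule as the GFM spec describes it for extended autolinks: count all parentheses in the candidate and drop trailing closers only while closers outnumber openers. The function `autolink_length` is hypothetical and is not the code in md_analyze_permissive_url_autolink().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given a candidate autolink, return how many leading
 * characters belong to the link once surplus trailing ')' are excluded. */
static size_t autolink_length(const char* url) {
    size_t len = strlen(url);
    size_t openers = 0, closers = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if (url[i] == '(')
            openers++;
        else if (url[i] == ')')
            closers++;
    }

    /* Unmatched openers are allowed. Only surplus trailing closers are
     * treated as belonging to the surrounding text. */
    while (len > 0 && url[len - 1] == ')' && closers > openers) {
        len--;
        closers--;
    }
    return len;
}
```

With this rule, a link written inside parentheses in running text does not swallow the closing parenthesis of the surrounding sentence.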
Martin Mitas
c3a18d5587 md_collect_marks: continue -> break
Does not cause any change in behavior: we just avoid needless loop
iterations now.
2020-11-13 09:28:56 +01:00
Martin Mitas
baa1dd06eb Fix some English wording in comments. 2020-11-09 16:03:00 +01:00
Rasmus Andersson
125e8e03e6 Initializes an uninitialized variable in md_analyze_emph
Fixes the following, reported by clang analysis:

src/md4c.c:3729:61: warning: variable 'opener_index' may be uninitialized when used here [-Wconditional-uninitialized]
            MD_MARKCHAIN* opener_chain = md_mark_chain(ctx, opener_index);
                                                            ^~~~~~~~~~~~
src/md4c.c:3686:25: note: initialize the variable 'opener_index' to silence this warning
        int opener_index;
                        ^
                         = 0
2020-10-24 19:26:57 +02:00
Rasmus Andersson
1a2f4816a7 Adds missing field initializers (undefined behavior)
src/md4c.c:5667:72: warning: missing field 'beg' initializer [-Wmissing-field-initializers]
static const MD_LINE_ANALYSIS md_dummy_blank_line = { MD_LINE_BLANK, 0 };
2020-10-24 19:02:48 +02:00
Martin Mitas
002f76c975 md_resolve_links: Skip [...] used as a reference link/image label.
Fixes #131.
2020-10-18 09:43:06 +02:00
Martin Mitas
22ca89a300 Fix ISANYOF encountering a zero byte in the input.
When it happened, it could lead to unexpected results, including broken
internal state of the parser.

Fixes #130.
2020-09-29 21:33:43 +02:00
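A minimal sketch of one way such a bug can arise, assuming the membership test is built on `strchr()`; the log does not confirm how ISANYOF is implemented, and both macro names below are hypothetical. `strchr()` treats the terminating `'\0'` of the palette string as part of the string, so a zero byte in the input is reported as a member unless it is excluded explicitly.

```c
#include <string.h>

/* Naive version: a zero byte matches the palette's own terminator. */
#define IS_ANY_OF_NAIVE(ch, palette) \
    (strchr((palette), (ch)) != NULL)

/* Guarded version: a zero byte in the input never matches. */
#define IS_ANY_OF(ch, palette) \
    ((ch) != '\0' && strchr((palette), (ch)) != NULL)
```

For example, `IS_ANY_OF_NAIVE('\0', "*_~")` is true, while `IS_ANY_OF('\0', "*_~")` is false.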
Karsten
440ccd82f3 Update md4c.h 2020-08-19 10:13:23 +02:00
Martin Mitas
9651f78051 Improve docs comments. 2020-08-16 11:28:21 +02:00
Martin Mitas
67214417dd Make mark_chain[] helper macro definitions safer. 2020-08-05 10:53:52 +02:00
Martin Mitas
70d0ef7c91 Avoid simple {0} to initialize a more complex object.
Should fix #125.
2020-08-05 09:18:41 +02:00
Martin Mitas
c501c891b9 Fix spelling of "than" in many occurrences.
I often spell it erroneously as "then". Doing this mistake way too
often when typing fast.
2020-07-30 10:13:05 +02:00
Martin Mitas
c595c2ed00 md_process_verbatim_block_contents: Fix off by 1 error.
This caused wrong indentation to be output inside fenced code blocks for
lines indented with more than 16 spaces.

Fixes #124.
2020-07-30 08:38:19 +02:00
Nazar Vinnichuk
4ad801b771 Replace deprecated MD_RENDERER mentions in md4c.h. 2020-07-28 07:15:27 +02:00
Evan Klitzke
da27ee0dcd fix a comment typo in md4c-html.h, md_render_html -> md_html 2020-07-12 22:20:18 +02:00
Martin Mitas
dec6e22b0e Fix entity rendering with MD_HTML_FLAG_VERBATIM_ENTITIES.
Fixes #118.
2020-06-27 20:27:28 +02:00
Dominick C. Pastore
3e5d64bf44
Add missing <img /> tag to XHTML support (#116) 2020-05-29 16:42:38 +02:00
Martin Mitas
72dad97ed6 scripts/build_folding_map.py: Handle properly "ranges" of length 2.
Update the data structures in md_get_unicode_fold_info() to reflect the
update in the script and handle the previously omitted characters.

Fixes #113.
2020-05-20 16:44:07 +02:00
Dmitry Atamanov
3d64d6be9b
Update to Unicode 13.0 (#111) 2020-05-07 23:13:55 +02:00
Martin Mitas
ddcc1f34df HTML renderer: Add support for XHTML mode. 2020-05-04 12:54:15 +02:00
Martin Mitas
2728b9eb0f Move md2html utility to a standalone dir. 2020-04-20 19:37:18 +02:00
Martin Mitas
ce8c66060c Fix the build. 2020-04-20 19:32:13 +02:00
Martin Mitas
1f78c867ff Rename HTML renderer public identifier names.
This is to reflect we make it a public API.
2020-04-20 19:24:28 +02:00
Martin Mitas
77c2669bd1 Update generating pkgconfig .pc files.
 * Output also package URL.
 * Output also package description.
 * Output also package version.
 * Generate .pc file for the new renderer lib.
2020-04-20 19:22:42 +02:00
Martin Mitas
ed0ef280b3 Build the HTML renderer as a standalone library. 2020-04-20 19:22:42 +02:00
Martin Mitas
7f2d880f9a Refactor dir structure.
We place all the sources in a single directory in order to avoid having
many dirs with too few sources.
2020-04-20 19:22:42 +02:00