Happy April!
Reverts:
- I had to remove the Typst formatter I was so happy about in the last release! This is very unfortunate. It turned out to be LLM-generated, and I just haven't had the time or energy to replace it myself. (by @kivikakk in #781)
Changed APIs:
- Deprecate the option name `header_ids` in favor of `header_id_prefix`, to make it clear that the option value actually adds a prefix to the `id` attribute, and add the `header_id_prefix_in_href` option, which adds the same prefix to generated `href`s. (by @miketheman in #776)
- Decouple greentext handling from blockquote parsing. (by @Martin005 in #789)
  - This means a lone `>` amongst blockquotes won't trigger greentext when enabled.
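As a rough illustration of the two header ID options above, here is a minimal sketch of how a prefix might be applied to a heading anchor. The function name, its arguments, and the simplified markup are assumptions made for illustration; this is not Comrak's renderer.

```rust
/// Build a simplified heading anchor. Hypothetical helper, not Comrak's API.
/// `prefix` plays the role of `header_id_prefix`; `prefix_in_href` plays the
/// role of `header_id_prefix_in_href`.
fn heading_anchor(slug: &str, prefix: &str, prefix_in_href: bool) -> String {
    // The prefix is always added to the `id` attribute.
    let id = format!("{}{}", prefix, slug);
    // The `href` only carries the prefix when explicitly requested.
    let href = if prefix_in_href {
        format!("#{}", id)
    } else {
        format!("#{}", slug)
    };
    format!("<a href=\"{}\" id=\"{}\"></a>", href, id)
}

fn main() {
    println!("{}", heading_anchor("install", "user-content-", false));
    println!("{}", heading_anchor("install", "user-content-", true));
}
```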
New APIs:
- Add parse option for char-based columns in `Sourcepos`. They report by default in byte columns according to the input UTF-8 source. (by @Martin005 in #779)
- Add block directive extension. (by @P-SiZK in #782)
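To make the difference between byte columns and char columns concrete, here is a small sketch that converts one into the other for a single line. The helper `char_column` is hypothetical and not part of Comrak; it only assumes 1-based columns.

```rust
/// Convert a 1-based byte column on a single line into a 1-based char column.
/// Hypothetical helper for illustration; not part of Comrak's API.
fn char_column(line: &str, byte_column: usize) -> Option<usize> {
    // A byte column of N points at the byte with offset N - 1.
    let byte_offset = byte_column.checked_sub(1)?;
    // Only offsets on a UTF-8 character boundary map to a char column.
    if !line.is_char_boundary(byte_offset) {
        return None;
    }
    // Count the characters before the offset, then return to 1-based.
    Some(line[..byte_offset].chars().count() + 1)
}

fn main() {
    let line = "héllo *wörld*";
    // 'é' occupies two bytes, so byte and char columns diverge after it.
    let byte_col = line.find('*').unwrap() + 1;
    println!(
        "byte column {}, char column {:?}",
        byte_col,
        char_column(line, byte_col)
    );
}
```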
Dependency updates:
- Bump toml from 1.0.3+spec-1.1.0 to 1.0.6+spec-1.1.0. (by `@dependabot[bot]` in #778)
- I've lengthened the dependency cooldown in Dependabot to 90 days, given *gestures aimlessly* the state of everything. (15c22fa2)
Documentation:
- Added a `CONTRIBUTING.md` that directs the reader to the README's contributing section. (ddae1558)
Build changes:
- Set `codegen-units = 1` for release builds. (suggested by @zamazan4ik, added in ea026ef3)
Behind the scenes:
- Stop fuzz targets regressing by adding the build to CI. (by @kivikakk in #771)
- @P-SiZK made their first contribution in #782
Diff: https://github.com/kivikakk/comrak/compare/v0.51.0...v0.52.0
Happy New Year (新年快乐)! :) What a lovely assortment of improvements, fixes, and new contributors we have in this release. Of particular note, Comrak has gained initial support for formatting to Typst, thanks to @neilberkman! This is a first cut and there are some known issues (see the PR), but I'm super happy to have folks testing it sooner rather than later!
Changed APIs:
- Add "css" to CLI's syntax highlighting options; set CSS as the default highlighting mode. (by @gjtorikian in #739)
- Allow raw nodes to be children of anything. (by @JeanMertz in #743)
- Support comma-delimited language tokens in Syntect plugin. (by @neilberkman in #752)
New APIs:
- Added `RenderPlugins::codefence_renderers` to register language-specific codefence renderers. (by @neilberkman in #751)
- Added `CodefenceRendererAdapter` for language-specific codefence rendering. Its `write` method receives parsed codefence language and metadata (`lang`, `meta`) alongside code and source position. (by @neilberkman in #751)
- Add `++insert++` extension and guillemet smart punctuation. (by @neilberkman in #754)
- Add Typst formatter. (by @neilberkman in #763)
- Add `compact_html` render option to suppress newlines in pretty-printing. (by @xvchris in #769)
Bug fixes:
- Don't wrap text in table cells in CommonMark output. (by @cinerea0 in #737)
- Fix incorrect `sourcepos` for inserted table cells. (by @Martin005 in #747)
- Fix HEEx nested element edge cases. (by @leandrocp in #749)
- Fix off-by-one in hex entity digit limit. (by @neilberkman in #753)
Dependency updates:
- Replace `unicode_categories` with `finl_unicode` for Unicode character categories. (by @Martin005 in #757)
- Bump bon from `3.8.2` to `3.9.0`. (by `@dependabot[bot]` in #759)
- Bump clap from `4.5.54` to `4.5.60`. (by `@dependabot[bot]` in #748, #756, #758, #766)
- Bump strum from `0.27.2` to `0.28.0`. (by `@dependabot[bot]` in #765)
- Bump time from `0.3.36` to `0.3.47` in `/fuzz`. (by `@dependabot[bot]` in #744)
- Bump toml from `0.9.10+spec-1.1.0` to `1.0.3+spec-1.1.0`. (by `@dependabot[bot]` in #742, #760, #767)
Documentation:
- Add `alerts` (admonitions) to README. (by @kritzelkrak in #761)
- Explain column counting in `LineColumn`, and fix shortcode context link. (by @Martin005 in #764)
- @cinerea0 made their first contribution in #737
- @JeanMertz made their first contribution in #743
- @neilberkman made their first contribution in #751
- @kritzelkrak made their first contribution in #761
- @xvchris made their first contribution in #769
Diff: https://github.com/kivikakk/comrak/compare/v0.50.0...v0.51.0
The big news is that I've updated Comrak to use Rust 2024, which means our MSRV has been updated to 1.85. I'm sorry if this affects you negatively! Keeping dependencies up-to-date while respecting our MSRV is basically impossible without the MSRV resolver, and I've spent far too many hours of my short life trying to do without it. Thank you for your understanding 🤍
Bug fixes:
- Incorrect `sourcepos` for lists before CodeBlocks fixed. (by @Martin005 in #712)
- Incorrect `sourcepos` for HTML and HEEx blocks inside blockquotes fixed. (by @Martin005 in #714)
- HTML block in blockquote in multiline block quotes had no content; now it does! (by @Martin005 in #713)
- Fixed some bugs with HEEx components mixed with Markdown. (by @leandrocp in #719)
Performance:
- Reduced some heap allocations and improved hash function performance. (by @gjtorikian in #717)
Dependency updates:
- A whole lot of Dependabot updates:
- Upgrade to edition 2021, then 2024. (by @kivikakk in #731, #732)
  - The latter PR includes even more dependency updates (using the v3 resolver), using the latest versions possible on 1.85.
Documentation:
- Added a policy on LLM-generated code. (by @kivikakk in #718)
Build changes:
- Dependency updates to help building on LoongArch64. (by @kivikakk in #708)
- Don't run the pathological test suite on RISC-V. (by @kivikakk in #710)
- Turns out the Darwin builds were also linking to the Nix store! (by @kivikakk in #736)
Behind the scenes:
- Integrated CodSpeed benchmarks on PRs. (by @kivikakk in #715)
Diff: https://github.com/kivikakk/comrak/compare/v0.49.0...v0.50.0
New APIs:
- Support for Phoenix HEEx has been added! Comrak can now avoid interpreting the contents of HEEx tags, representing them in their own node type in the AST. (by @leandrocp in #693, #702)
- Add sourcepos for the task item symbol (i.e. the `x` in `[x]`). (by @kivikakk in #705)
Build changes:
- Fix typos in repository and add typos CI config. (by @liamwhite in #691)
- Promote clippy warns to errors in CI. (by @leandrocp in #701)
Behind the scenes:
- Clean up `find_special_char` logic for inlines. (by @liamwhite in #692)
- Add `byte_matches` helper to simplify `get`/`map_or` chaining tests in parser. (by @liamwhite in #694)
- Apply clippy suggestions. (by @leandrocp in #700)
- DRY out the `create_formatter` macro a bit. (by @nberlette in #704)
- @nberlette made their first contribution in #704
Diff: https://github.com/kivikakk/comrak/compare/v0.48.0...v0.49.0
The breaking changes are listed right at the top! Please note that AST content now represents NUL bytes (codepoint number zero) as they were in the input; these used to be translated to the lovely � character at the very beginning of the input process, presumably so the rest of the reference C parser didn't have to deal with the possibility of strings containing NUL bytes. We can do better, though, so let's! The � character is now emitted by our formatters in place of NUL, but if you use custom or manual formatters and emit any part of the AST content directly (without using comrak::html::escape, context::html::escape_href, or the same-named functions on Context), you may need to do the same translation yourself.
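If you do emit AST content yourself, the translation described above might look something like the following sketch. `escape_text` is a simplified stand-in written for this example (it is not `comrak::html::escape`, and a real escaper may cover more cases); it only shows where the NUL to U+FFFD substitution would sit.

```rust
use std::fmt::{self, Write};

/// Escape text for HTML output, emitting U+FFFD in place of NUL.
/// A simplified stand-in for illustration; not Comrak's own escaper.
fn escape_text(output: &mut dyn Write, text: &str) -> fmt::Result {
    for c in text.chars() {
        match c {
            // The AST keeps NUL as-is; translate it only on output.
            '\0' => output.write_char('\u{FFFD}')?,
            '&' => output.write_str("&amp;")?,
            '<' => output.write_str("&lt;")?,
            '>' => output.write_str("&gt;")?,
            '"' => output.write_str("&quot;")?,
            _ => output.write_char(c)?,
        }
    }
    Ok(())
}

fn main() {
    let mut out = String::new();
    escape_text(&mut out, "a\0b <c>").unwrap();
    println!("{}", out);
}
```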
We also no longer append a newline to the end of the file where there wasn't one originally, which meant a lot of places in the parser had to adapt to their strings not necessarily containing a newline before they ended. Careful review and extensive fuzzing should have squeaked out any unexpected overruns, but consider my eyes peeled for reports regarding this. (Ew.) We've cleaned up some sourcepos calculation which depended on this behaviour in odd ways, but there may yet be more to discover which our test suite didn't catch.
Did you know November is Trans Month? I didn't! I'm guessing it's because Trans Awareness Week falls within it, and we've been having a pretty bad time of it rights-wise around the world lately!
Happy Trans Month, and if you happen to typo it as Trans Moth, we can be happy about that too! 🏳️⚧️ ᖭི༏ᖫྀ
Parser changes:
- No longer translate `NUL` bytes into `U+FFFD REPLACEMENT CHARACTER` in the parse stage; do it in formatters instead. (by @kivikakk in #681)
  - This means the AST now contains `NUL` bytes where they were present in input, preserving the difference between `NUL` and literally-entered `�` characters.
- No longer append a virtual newline at the end of the file where missing. (by @kivikakk in #682)
  - The spec allows a line to end with either a newline or EOF; the reference parser assumed any given input string would always have a terminating linefeed and forced that to be the case, and so Comrak used to do the same. Comrak no longer does.
  - We also now handle line feed, carriage return, and carriage return plus line feed (as allowed by the spec) without pretending they're all just a line feed, meaning e.g. sourcepos for softbreaks now correctly spans two bytes when it was produced by a carriage return plus line feed.
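The line-ending handling described above can be sketched with a small splitter that keeps each ending distinct and doesn't require a trailing newline. This is an illustrative function only (`split_lines` is a made-up name), not Comrak's actual line feeder.

```rust
/// Split input into (line content, line ending) pairs.
/// Recognises "\n", "\r" and "\r\n"; the final line may end at EOF with no
/// ending at all. An illustrative sketch, not Comrak's line feeder.
fn split_lines(input: &str) -> Vec<(&str, &str)> {
    let bytes = input.as_bytes();
    let mut lines = Vec::new();
    let mut start = 0;
    let mut i = 0;
    while i < bytes.len() {
        let ending_len = match bytes[i] {
            // A carriage return followed by a line feed is one two-byte ending.
            b'\r' if bytes.get(i + 1) == Some(&b'\n') => 2,
            b'\r' | b'\n' => 1,
            _ => {
                i += 1;
                continue;
            }
        };
        lines.push((&input[start..i], &input[i..i + ending_len]));
        i += ending_len;
        start = i;
    }
    // Input that doesn't end in a line ending still yields a final line.
    if start < bytes.len() {
        lines.push((&input[start..], ""));
    }
    lines
}

fn main() {
    for (content, ending) in split_lines("a\r\nb\rc\nd") {
        println!("{:?} ended by {:?}", content, ending);
    }
}
```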
Changed APIs:
- Remove mandatory space before fenced codeblock info string in CommonMark output. (by @kivikakk in #686)
- Write out `%25` in hrefs where not part of a percent-encode sequence. (by @kivikakk in #687)
  - We used to leave any `%` character alone, such that `[link](%%20)` would roundtrip without change. It now roundtrips to `[link](%25%20)`.
- Relaxed tasklist matching now supports a full Unicode scalar for the character between the `[…]`, and no longer turns single-byte UTF-8 characters into the Unicode codepoint numbered at the UTF-8 byte (!). (by @kivikakk in #689)
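The `%25` rule above can be sketched as follows. This assumes a percent-encode sequence means a `%` followed by exactly two hex digits; `escape_percent` is a hypothetical helper that handles only the `%` character, whereas a real href escaper deals with many more.

```rust
/// Write `%` as `%25` unless it starts a percent-encode sequence, taken here
/// to mean `%` followed by two hex digits. Illustrative sketch only.
fn escape_percent(href: &str) -> String {
    let bytes = href.as_bytes();
    let mut out = String::with_capacity(href.len());
    for (i, c) in href.char_indices() {
        if c == '%' {
            let is_sequence = bytes.len() >= i + 3
                && bytes[i + 1].is_ascii_hexdigit()
                && bytes[i + 2].is_ascii_hexdigit();
            if !is_sequence {
                out.push_str("%25");
                continue;
            }
        }
        out.push(c);
    }
    out
}

fn main() {
    // The first `%` is bare; the second begins the sequence `%20`.
    println!("{}", escape_percent("%%20"));
}
```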
New APIs:
- Add `highlight` extension, `==for highlights==`! These render with `<mark>` in the HTML formatter. (by @pferreir in #672)
- Add `comrak::Node<'a>` as an alias for `comrak::nodes::Node<'a>`. (by @kivikakk in #673)
- Add `comrak::Arena<'a>` as an alias for `typed_arena::Arena<comrak::nodes::AstNode<'a>>`. (by @kivikakk in #675)
- Add `From<(LineColumn, LineColumn)>` impl for `Sourcepos`. (by @kivikakk in #675)
- Make `comrak::nodes::NodeValue::xml_node_name` public, when you want a handy-to-access name for a node type. (by @kivikakk in #673)
- Add `options.parse.leave_footnote_definitions`; this option causes footnote definitions to not be relocated to the bottom of the document, and unused references not to be garbage collected, for use with custom formatters. (by @kivikakk in #673)
Bug fixes:
- Fix relaxed autolink email in footnote edge case/panic. (by @kivikakk in #677)
- Prevent unexpected post-processing, such as `\[x]` still being eligible for tasklist inclusion despite the escaped `[`. (by @kivikakk in #679)
Performance:
- Simplify internal feed function, no longer requiring any allocation before the block parser. (by @kivikakk in #679, #681)
- Don't buffer CommonMark output unless necessary. (by @kivikakk in #684)
  - The full output was always buffered in a string before being written to the destination, which in many cases is going to be another string. Buffering is now done only to the extent required by output options, which often will be "not at all."
- Use SIMD for core line feed process. (by @kivikakk in #688)
Build changes:
- We now build Linux release binaries against musl, making them actually useful for anyone not running my exact Nix build :') (by @kivikakk in #671)
- The benchmark CI job no longer causes the whole PR to fail checks if it can't post its comment. (by @kivikakk in #674)
Behind the scenes:
- Factor out `inlines::Scanner`, reducing some needless allocations. (by @kivikakk in #675)
- The `all_options` fuzzer now fuzzes across all options, and not just with most of them switched on. (by @kivikakk in #678)
- @pferreir made their first contribution in #672
Diff: https://github.com/kivikakk/comrak/compare/v0.47.0...v0.48.0
Martin Chrástek has fixed all known sourcepos issues in Comrak, while closing a number of other bugs at the same time! I'm so happy.
New APIs:
- `NodeCodeBlock` now has a `closed` property. (by @Martin005 in #661)
- `NodeHeading` now has a `closed` property, for closed ATX-style headings. (by @Martin005 in #665)
Bug fixes:
- Source position information for lists and their children is fixed. (by @Martin005 in #666)
- Source position information for unclosed fenced code blocks is fixed. (by @Martin005 in #661)
- `Escaped` and `EscapedTag` no longer fail AST validation when formatting as CommonMark with debug assertions. (by @kivikakk in #662, #664)
Build changes:
- The fuzzer now also runs on CommonMark and XML output formats. (by @kivikakk in #663)
Diff: https://github.com/kivikakk/comrak/compare/v0.46.0...v0.47.0
Please note the MSRV has been bumped from 1.65 to 1.70; see the pull request for more details. It's a kind of sticky and awkward situation — thanks to the inevitability of Progress — with no particularly clean solution. (wherein telling GCC 15 users "sorry it just won't build from source for you without messing with dependencies" is not a solution.)
Security:
- Footnote resolution no longer recurses over the document tree; on documents with deeply nested elements, this could cause a stack overflow, with resultant denial of service. (by @kivikakk in #659)
- Inline footnotes are restricted to a depth of 5 for similar reasons. An iterative rewrite here to avoid a limit is possible, but for now I'm hoping we can all pretend to be responsible adult human beings and limit our recursive inline footnote usage accordingly. (PRs welcome tho, non-human users are very welcome!) (by @kivikakk in #659)
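For anyone curious what "no longer recurses" means in practice, here is a generic sketch of walking a tree with an explicit stack instead of the call stack. The `Node` type and `max_depth` function are made up for this example and are unrelated to Comrak's AST types or its footnote code.

```rust
/// A minimal tree node; a stand-in for illustration, not Comrak's AST.
struct Node {
    children: Vec<Node>,
}

/// Build a single chain of nodes `depth` levels deep (at least one level).
fn chain(depth: usize) -> Node {
    let mut node = Node { children: Vec::new() };
    for _ in 1..depth {
        node = Node { children: vec![node] };
    }
    node
}

/// Find the deepest nesting level using a heap-allocated stack, so deeply
/// nested input grows a `Vec` rather than the call stack.
fn max_depth(root: &Node) -> usize {
    let mut stack = vec![(root, 1usize)];
    let mut deepest = 0;
    while let Some((node, depth)) = stack.pop() {
        deepest = deepest.max(depth);
        for child in &node.children {
            stack.push((child, depth + 1));
        }
    }
    deepest
}

fn main() {
    let tree = chain(1000);
    println!("depth: {}", max_depth(&tree));
}
```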
Parser changes:
- U+2069 POP DIRECTIONAL ISOLATE will be treated as terminating an autolink, rather than included as part of the link, making autolinks much easier to use correctly in RTL text. (by @SethFalco in #654)
- HTML start condition 4 is correctly detected when non-capital letters follow "<!". (by @kivikakk in #658)
New APIs:
Bug fixes:
- Source position information is corrected for description lists, HTML blocks, multiline block quotes, links with newlines following the destination, tables with leading indentation, and escaped character spans. (by @Martin005 in #646, #651, #652, #653, #656, #657)
- `escaped_char_span` users can now successfully format to CommonMark with debug assertions enabled. These ASTs previously failed validation, which is currently enabled (experimentally) only for CommonMark output in debug builds. (by @kivikakk in #659)
Build changes:
- Comrak's MSRV is bumped from 1.65 to 1.70. (by @kivikakk in #649)
- @Martin005 made their first contribution in #646
- @Kuuuube made their first contribution in #648
- @SethFalco made their first contribution in #654
Diff: https://github.com/kivikakk/comrak/compare/v0.45.0...v0.46.0
Welcome to v0.45.0! This is a big update, much of it already part of rc.1 from last week. More context on the size of the update is in the changelog there.
The biggest user-facing library changes are ergonomic: `Node<'a>` instead of `&'a AstNode<'a>` is nice, and likewise `node.data()` instead of `node.data.borrow()`. They're small, but I appreciate them a lot in my own work.
You'll also notice more bovine creatures in the Comrak pasture: there's a few Cow<str> instead of String, such as in NodeValue::Text. At most an extra .into() will be required; take note if you use any 'static str, as they'll no longer need to be heap-allocated. Some Boxes have been added, too, to reduce the size of every NodeValue. Let the types guide you.
Other than this, the options have been put in their own module (comrak::options), and a lot of things generally cleaned up. Read below for all the deets! Here's the final performance comparison to v0.44.0 on aarch64:
Benchmark 1: ./bench.sh ./comrak-0.44.0
Time (mean ± σ): 88.1 ms ± 1.9 ms [User: 71.2 ms, System: 17.8 ms]
Range (min … max): 86.2 ms … 93.2 ms 31 runs
Benchmark 2: ./bench.sh ./comrak-0.45.0
Time (mean ± σ): 67.0 ms ± 1.2 ms [User: 51.2 ms, System: 17.0 ms]
Range (min … max): 65.2 ms … 70.0 ms 42 runs
Summary
./bench.sh ./comrak-0.45.0 ran
1.32 ± 0.04 times faster than ./bench.sh ./comrak-0.44.0
Be well!
Parser changes:
- Runs of more than two `~` are no longer recognised as valid delimiters, meaning they will not prevent strikethrough recognition when they occur within correct delimiters. See the PR for discussion. (by @miketheman in #635)
  - This does not impact spec compatibility, matches `cmark-gfm`, and follows the intent of the original implementation and implementor (hi!).
Changed APIs:
- `r#unsafe` is used instead of `unsafe_`. (by @kivikakk in #640)
- `--gemojis` is renamed to `--gemoji`. (by @kivikakk in #641)
- `NodeValue::Text` now contains a `Cow<'static, str>` instead of a `String`. This is a pretty major change, but means we can now create text nodes with static content without duplicating the string on the heap. This particularly benefits smart quotes and HTML entity resolution. (by @kivikakk in #627)
  - Adapting to this change usually means nothing on the read-only side (you can use it as a `&str` without issues); to write in-place, use `.to_mut()` on the `Cow` to get a `&mut String`. To assign, use `.into()` on a `&str` or `String`, like `NodeValue::Text("moo".into())`.
  - `NodeValue::text()` now returns a `&str`. It used to return a `&String` (!).
  - `NodeValue::text_mut()` now returns a `&mut Cow<'static, str>`, instead of a `&mut String`. This permits writing a borrowed reference.
  - I am experimenting with parameterising the lifetime on the `Cow`; it'd be amazing to refer continuously to the input where possible.
- `NodeValue`'s `CodeBlock`, `Table`, `Link`, `Image`, `ShortCode` and `Alert` variants' payloads are now boxed. (by @kivikakk in #632)
  - Adapting to this change usually means adding a `Box::new` call when constructing these nodes, and on matches, pulling the box out and then just dereferencing it directly (e.g. `NodeValue::Table(nt) => &nt.alignments` instead of `NodeValue::Table(NodeTable { ref alignments })`).
  - These payloads were larger than average, increasing the size of every node considerably. The changes reduce an `Ast` to 128 bytes, and a full `AstNode<'_>` to 176 bytes.
  - This produces a performance sweet spot: boxing the whole `NodeValue` results in worse performance than doing nothing at all. This change appreciably improves matters.
  - We now assert the size of a node during build to ensure future payload changes don't increase the total size of an `Ast`.
- Options now live in `comrak::options`. Structs have been renamed to remove `Options` from their name: `comrak::RenderOptions` is now `comrak::options::Render`, etc. The old names are marked deprecated. (by @kivikakk in #636)
  - Traits cannot be aliased yet :( `URLRewriter` and `BrokenLinkCallback` have been moved, without a deprecation period.
- `SyntaxHighlighterAdapter`'s `attributes` arguments now take `HashMap<&'static str, Cow<'s, str>>`; they used to take `HashMap<String, String>`. (by @kivikakk in #633)
- `html::write_opening_tag` can now take different `AsRef<str>` types for the attribute key and value.
- `parse_document_with_broken_link_callback` has been removed! This entrypoint has been deprecated since 0.25.0. (by @kivikakk in #623)
- `options.render.ignore_setext` was moved to `options.parse.ignore_setext`, as its effect takes place only in the parse stage. (by @kivikakk in #623)
- `nodes::can_contain_type` is now `Node::can_contain_type`. (by @kivikakk in #625)
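The `Cow` text and boxed payload changes above follow a pattern that can be tried out with nothing but the standard library. In the sketch below, `Value`, `InlineValue` and `TablePayload` are cut-down stand-ins invented for this example, not Comrak's `NodeValue` or `NodeTable`; the sizes printed depend on the target and are not the figures quoted above.

```rust
use std::borrow::Cow;
use std::mem::size_of;

/// Stand-in for a large payload such as table data; not Comrak's `NodeTable`.
#[allow(dead_code)]
struct TablePayload {
    alignments: Vec<u8>,
    num_columns: usize,
    num_rows: usize,
}

/// Cut-down stand-in for a node value with `Cow` text and a boxed payload.
enum Value {
    Text(Cow<'static, str>),
    Table(Box<TablePayload>),
}

/// The same shape with the payload stored inline, for size comparison only.
#[allow(dead_code)]
enum InlineValue {
    Text(Cow<'static, str>),
    Table(TablePayload),
}

/// Read side: `Cow` text is used like a `&str`, and a boxed payload is
/// dereferenced directly in the match arm.
fn describe(value: &Value) -> String {
    match value {
        Value::Text(text) => format!("text: {}", text),
        Value::Table(table) => format!("table with {} columns", table.alignments.len()),
    }
}

/// Write side: `to_mut()` yields a `&mut String`, allocating only if the
/// `Cow` was still borrowed.
fn append(value: &mut Value, suffix: &str) {
    if let Value::Text(text) = value {
        text.to_mut().push_str(suffix);
    }
}

fn main() {
    // A static string needs no heap allocation; `.into()` borrows it.
    let mut text = Value::Text("moo".into());
    append(&mut text, "!");
    println!("{}", describe(&text));

    // Boxed payloads need a `Box::new` at construction.
    let table = Value::Table(Box::new(TablePayload {
        alignments: vec![0, 1],
        num_columns: 2,
        num_rows: 0,
    }));
    println!("{}", describe(&table));

    println!(
        "boxed: {} bytes, inline: {} bytes",
        size_of::<Value>(),
        size_of::<InlineValue>()
    );
}
```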
New APIs:
- `node.data()` and `node.data_mut()` are added as short-hand for `node.data.borrow()` and `node.data.borrow_mut()` respectively. (by @kivikakk in #643)
- `comrak::nodes::Node<'a>` is introduced as an alias for `&'a comrak::nodes::AstNode<'a>`. (by @kivikakk in #627)
- `options.parse.tasklist_in_table` added: parse a tasklist item if it's the only content of a table cell. (by @kivikakk in #622)
Performance:
- Inline content is transferred to Text nodes without copying where possible. (by @kivikakk in #642).
- Have you looked at your 7 year old code lately? A detail in the C-to-Rust translation meant essentially every line of input was being copied completely unnecessarily at the very beginning of the line processing stage. This no longer happens. We regret the error. (by @kivikakk in #629)
- Preprocess entity data at build-time so we don't spend time doing a linear search over an unsorted array, some of which we will never match. (by @kivikakk in #631)
- Inline content is consumed by the inline processor, instead of being borrowed by it and retained in memory indefinitely. (by @kivikakk in #631)
- Don't try to do better than the stdlib at guessing buffer sizes; it's very good at it. (by @kivikakk in #626)
- Use `str` internally in block and inline processing, eliminating many UTF-8 rechecks. The `strings` module actually operates on strings now. (by @kivikakk in #626)
- Many, many needless clones have been removed in almost every subsystem.
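As one illustration of why preprocessed entity data helps, a table that is already sorted can be searched with a binary search rather than a linear scan. The sketch below assumes that approach purely for demonstration; the tiny `ENTITIES` table and `lookup_entity` are invented here and do not reflect how Comrak's build step actually stores or searches its entity data.

```rust
/// A tiny table sorted by name, standing in for build-time generated data.
/// The entries are a handful of well-known HTML entities for illustration.
const ENTITIES: &[(&str, &str)] = &[
    ("amp", "&"),
    ("gt", ">"),
    ("lt", "<"),
    ("nbsp", "\u{a0}"),
    ("quot", "\""),
];

/// Look up an entity name with a binary search over the sorted table.
fn lookup_entity(name: &str) -> Option<&'static str> {
    ENTITIES
        .binary_search_by(|(key, _)| key.cmp(&name))
        .ok()
        .map(|index| ENTITIES[index].1)
}

fn main() {
    println!("{:?}", lookup_entity("lt"));
    println!("{:?}", lookup_entity("nope"));
}
```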
Dependency updates:
- `memchr` removed from `Cargo.toml`; it wasn't used directly, though it still is included unconditionally due to `caseless`. (by @kivikakk in #630)
- `slug` is moved to a development-only dependency; it's only used in an example. (by @kivikakk in #630)
- `jetscii` is added for faster string searching, including SIMD on x86_64. (by @kivikakk in #630)
  - I'm experimenting with aarch64 SIMD.
Documentation:
- The CLI help text has been copy-edited to a consistent style. (by @kivikakk in #641)
- The `README` example code is updated to build with recent API changes. (by @kivikakk in #621)
Build changes:
- `shortcodes` is enabled by default (but still optional) for CLI builds. (by @kivikakk in #641)
- `syntect` is now optional (but still default) in CLI builds. (by @kivikakk in #624)
Behind the scenes:
- Much of the block parser code has been re-organised, and many C-isms from the original port have been refactored into readable Rust. (by @kivikakk in #627)
- Likewise the inline parser has been re-organised. (by @kivikakk in #644)
- All `unsafe` blocks now have a `SAFETY` comment describing why their actions are safe.
- @miketheman made their first contribution in #635
Diff: https://github.com/kivikakk/comrak/compare/v0.44.0...v0.45.0
Parser changes:
- Autolink validation is now stricter in the default mode, to maintain conformance with the GitHub Flavored Markdown autolinks extension spec. Parses which previously worked but no longer do, such as `http://localhost` (!), `www.com` (!?), or `https://` (!?!), are now part of the `relaxed_autolinks` option. See more discussion in the PR. (by @chamlis in #618)
New APIs:
- You can write footnotes with their body inline by enabling the `inline_footnotes` extension and using the syntax `^[footnote content]`. (by @sheremetyev in #619)
Diff: https://github.com/kivikakk/comrak/compare/v0.43.0...v0.44.0
Parser changes:
- `superscript` or `subscript` extensions only: punctuation following a superscript or subscript delimiter no longer disqualifies the delimiter from being considered left-flanking, such that `e^-i^` and `n~-i~` now parse as superscript or subscript respectively. (by @kivikakk in #593)
Changed APIs:
- `html::format_document`, `xml::format_document`, `cm::format_document` and friends now take an `std::fmt::Write` as their `output` argument, instead of an `std::io::Write`, to avoid revalidating UTF-8. (by @kivikakk in #601)
- bin: allow `--header-ids ''` for prefix-less headers. (by @kivikakk in #610)
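In practical terms, a function that takes an `std::fmt::Write` can write straight into a `String`, with no byte buffer to convert afterwards. The sketch below uses a made-up `format_paragraph` function to show the shape of such a call; it is not one of Comrak's `format_document` functions.

```rust
use std::fmt::{self, Write};

/// Hypothetical formatter for illustration: writes a paragraph as HTML into
/// anything implementing `std::fmt::Write`.
fn format_paragraph<W: Write>(output: &mut W, text: &str) -> fmt::Result {
    writeln!(output, "<p>{}</p>", text)
}

fn main() {
    // A `String` implements `fmt::Write`, so the output is already known to
    // be valid UTF-8 and needs no `String::from_utf8` step.
    let mut html = String::new();
    format_paragraph(&mut html, "Hello, world!").unwrap();
    print!("{}", html);
}
```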
New APIs:
- Add CJK Friendly Emphasis to CLI option (by @tats-u in #607)
Documentation updates:
Diff: https://github.com/kivikakk/comrak/compare/v0.42.0...v0.43.0
New APIs:
- `cm::escape_inline` (aliased at crate level as `escape_commonmark_inline`) is added; escapes input text suitable for inclusion in a CommonMark document where regular inline processing takes place. (by @kivikakk in #602)
- `cm::escape_link_destination` (aliased at crate level as `escape_commonmark_link_destination`) is added; escapes input URL suitable for use as a link destination in a CommonMark document. (by @kivikakk in #605)
Changed APIs:
- `html::collect_text` now returns a `String`. `html::collect_text_append` is added if you still want to start with your own (`String`) buffer. (by @kivikakk in #600)
  - There was no particular reason for this populating a `Vec<u8>` instead of a `String`; it was just old.
- `Anchorizer::anchorizer` now takes `&str` instead of a `String`. (by @kivikakk in #603)
  - As above.
Updates:
- Update `is_cjk` in CJK Friendly Emphasis to Unicode 17. (by @tats-u in #598)
Behind the scenes:
- Rename a bunch of "nch" to "nh". (by @kivikakk in #599)
Diff: https://github.com/kivikakk/comrak/compare/v0.41.1...v0.42.0
Bug fixes:
- Fix the range of non-emoji general purpose variation selector by @tats-u in #596
Stability:
- html: remove some panics on unusual ASTs, and document others. by @kivikakk in #589
Behind the scenes:
- Cleanup fuzzers, add unresolved `relaxed_autolink_email_in_footnote` test by @Mrmaxmeier in #594
- build(deps): bump actions/checkout from 4 to 5 by @dependabot[bot] in #590
- @Mrmaxmeier made their first contribution in #594
Diff: https://github.com/kivikakk/comrak/compare/v0.41.0...v0.41.1
New features:
- Add CJK friendly emphasis extension by @tats-u in #582
- Add CJK friendly emphasis to README by @tats-u in #583
Build changes:
- Use syntect's default-fancy feature for ios by @tvanderstad in #585
Diff: https://github.com/kivikakk/comrak/compare/v0.40.0...v0.41.0
Reverts:
- "Fix header-ids accessibility" was reverted in https://github.com/kivikakk/comrak/commit/2cb6188bfb4c5d69bf55c73c19b2049e9dfe5dba
  - See discussion at #574 (comment) for details.
Bug fixes:
- html: don't escape IPv6 address square bracket delimiters in links. by @kivikakk in #578
New APIs:
- Syntect syntax highlighting: Make a CSS class prefix configurable. by @SteveBinary in #580
- @SteveBinary made their first contribution in #580
Diff: https://github.com/kivikakk/comrak/compare/v0.39.1...v0.40.0
Bug fixes:
- Fix header-ids accessibility by @davemackintosh in #574
- Recursively join text nodes inside links, images, and wikilinks by @JamieMagee in #575
- @davemackintosh made their first contribution in #574
- @JamieMagee made their first contribution in #575
Diff: https://github.com/kivikakk/comrak/compare/v0.39.0...v0.39.1
New APIs:
- Make dangerous_url public by @digitalmoksha in #564
Diff: https://github.com/kivikakk/comrak/compare/v0.38.0...v0.39.0
Bug fixes:
- Only delete parent if the node has no siblings by @digitalmoksha in #559
New features/APIs:
- html: add user data to context. by @kivikakk in #555
Coming of age???:
- experimental-inline-sourcepos: not really experimental any more! by @kivikakk in #560
Diff: https://github.com/kivikakk/comrak/compare/v0.37.0...v0.38.0
Bug fixes:
- add --all-features to CI matrix, add missing shortcode case. by @charlottia in #546
- Inline sourcepos fixes. by @kivikakk in #542
Documentation:
- docs: add mdex (elixir bindings) in the list of related projects by @leandrocp in #547
- Add Commonmarker to related projects by @gjtorikian in #548
Diff: https://github.com/kivikakk/comrak/compare/v0.36.0...v0.37.0
Bug fixes:
- Stop at first suitable $ at inline math by @Bubbis in #533
New features/APIs:
- Create custom HTML formatters. by @kivikakk in #540
- make `AlertType` methods public by @fiji-flo in #532
- commonmark: experimental minimize by @charlottia in #523
Behind the scenes:
- tests: don't let sourcepos stand in the way of roundtrip tests. by @kivikakk in #543
- Refactor html output functions by @digitalmoksha in #529
- Write a sourcepos test for each `NodeValue` variant by @SamWilsn in #498
Documentation:
- docs: add Python bindings repo and PyPI links to Related Projects by @lmmx in #539
Diff: https://github.com/kivikakk/comrak/compare/v0.35.0...v0.36.0
- Use CSS class `markdown-alert` instead of `alert` by @digitalmoksha in #524
Diff: https://github.com/kivikakk/comrak/compare/v0.34.0...v0.35.0
Admonition special!
- Add GitHub style alerts / admonitions by @digitalmoksha in #519
- Enable GitLab multiline alerts by @digitalmoksha in #521
Diff: https://github.com/kivikakk/comrak/compare/v0.33.0...v0.34.0
Happy new year! Thanks to @nicoburns for these changes, enabling much faster compiles if you don't need the builders!
- Eliminate `regex` and `once_cell` dependencies. by @nicoburns in #514
- Make bon builders optional by @nicoburns in #515
- Make options structs exhaustive by @nicoburns in #516
Diff: https://github.com/kivikakk/comrak/compare/v0.32.0...v0.33.0
- rust-toolchain: remove by @charlottia in #493
- Account for front matter when calculating `sourcepos` by @SamWilsn in #494
- callbacks: constrain to input lifetime by @liamwhite in #499
- Refactor open_new_blocks by @digitalmoksha in #505
- Refactors open_new_blocks by lifting out handlers by @digitalmoksha in #506
- Make `wikilinks_title_after_pipe` override `wikilinks_title_before_pipe` by @SamWilsn in #500
- Detect ending front matter delimiter at EOF by @kivikakk in #508
- Add Raw Node by @wakairo in #511
- @charlottia made their first contribution in #493
- @SamWilsn made their first contribution in #494
- @wakairo made their first contribution in #511
Diff: https://github.com/kivikakk/comrak/compare/v0.31.0...v0.32.0
- Enhance description lists by @digitalmoksha in #462
Diff: https://github.com/kivikakk/comrak/compare/v0.30.0...v0.31.0
- Add `task-list-item` class to task list items by @nicoburns in #468
- Add option for specifying a minimum width of ordered lists by @edwar4rd in #465
- Use `bon` for an infallible and compile-time-checked builder by @Veetaha in #466
- Add support for image and link URL rewriting by @liamwhite in #481
- Unwrap Mutex from broken_link_callback by @liamwhite in #484
- Prevent panic in format_item by @silverpill in #486
- Bump `bon` version to 3.0 by @Veetaha in #487
- Add support for subscript extension by @liamwhite in #488
- Add macro for character tables by @liamwhite in #490
- @nicoburns made their first contribution in #468
- @edwar4rd made their first contribution in #465
- @Veetaha made their first contribution in #466
Diff: https://github.com/kivikakk/comrak/compare/v0.29.0...v0.30.0
- Add support for backslash escape in wikilinks by @digitalmoksha in #471
Diff: https://github.com/kivikakk/comrak/compare/v0.28.0...v0.29.0
- Add a render option to render the image as by @JmPotato in #458
- Fix edge cases for relaxed-autolink option by @digitalmoksha in #461
Diff: https://github.com/kivikakk/comrak/compare/v0.27.0...v0.28.0
- Track line offsets for better accuracy of inline sourcepos by @digitalmoksha in #453
- Add experimental-inline-sourcepos to cli options by @digitalmoksha in #455
Diff: https://github.com/kivikakk/comrak/compare/v0.26.0...v0.27.0
- Restore inline sourcepos as experimental. by @kivikakk in #444
  - This is needed by some downstream users, so we re-introduce it, with a clearly labelled option.
Diff: https://github.com/kivikakk/comrak/compare/v0.25.0...v0.26.0
- Discord-flavored Markdown by @Meow and @liamwhite in #421
- Three new extensions and two render options are added:
extension.underlineadds support for__underlined__text.extension.spoileradds support for||spoiler||text.extension.greentextadds support for image board-style>greentext, which isn't transformed into a blockquote.render.ignore_setextdisables parsing setext-style headings.render.ignore_empty_linkscauses links with no text (like[](xyz)) to remain in the text as-is.
- Three new extensions and two render options are added:
- nodes: add From impls for AstNode. by @kivikakk in #424
- Back by popular demand:
AstNode::from(NodeValue). - Also added is
AstNode::from(Ast), if you have sourcepos.
- Back by popular demand:
- AST validation by @yannham in #425
- The AST is validated when formatting a document as CommonMark in debug builds.
- Address autolink edge cases. by @kivikakk in #426
- Autolinks had many edge cases where output differed from upstream
cmark-gfm. These have been fixed by following upstream's parser design closely.
- Autolinks had many edge cases where output differed from upstream
- shortcodes: capture all known aliases. by @kivikakk in #427
- We didn't parse shortcodes containing numbers or
+. We do now.
- We didn't parse shortcodes containing numbers or
- Support both upstream CommonMark and GFM's differences in the base spec. by @kivikakk in #428
- GFM modifies even base CommonMark output somewhat. We now support and validate against both.
- cm: count ol items from start of each list. by @kivikakk in #429
- Ordered list item numbers are normalised on formatting back to CommonMark.
- arena_tree: panic if iterator invalidation causes trouble. by @kivikakk in #437
- `arena_tree` would silently stop iteration when trying to proceed from a child that had lost its parent. It now panics instead, as the old behaviour is incorrect and impossible to notice.
- broken reflink callback updates & big cleanup. by @kivikakk in #438
- The broken reference link callback has been moved into `ParseOptions` (which now takes a lifetime, meaning `Options` does too).
- The callback now takes a struct containing both the normalised reference and the original text, and the return value has changed from a 2-tuple to a struct for clarity.
- `parse_document_with_broken_link_callback` has been marked deprecated.
- Inline sourcepos fixes. by @kivikakk in #439
- Inline sourcepos was provided on a best-effort basis, but there are multiple correctness issues which can't be fixed without significant work.
- Inline sourcepos is no longer reported in HTML output. It remains in the AST and in XML output, but it is not reliable. See the PR for details.
- Link sourcepos is slightly better than it was when it spans multiple lines.
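
To make the broken reference link callback change above concrete, here is a minimal, self-contained sketch of the new shape: a struct goes in, and a struct (rather than a 2-tuple) comes out. The `BrokenLinkReference` and `ResolvedReference` types below are local stand-ins defined purely for illustration, and `resolve_broken_link` is a hypothetical resolver. The real definitions, and the way a callback is registered on `ParseOptions`, are in the comrak documentation and may differ from this sketch.

```rust
// Stand-in types for illustration only. They mirror the shape described in
// the notes above; the real comrak definitions may differ in names, fields,
// and lifetimes.

/// What the callback receives: the normalised reference and the original text.
#[derive(Debug)]
struct BrokenLinkReference<'a> {
    normalized: &'a str,
    original: &'a str,
}

/// What the callback returns, in place of the old `(url, title)` 2-tuple.
#[derive(Debug, PartialEq)]
struct ResolvedReference {
    url: String,
    title: String,
}

/// Hypothetical resolver: points any unresolved reference at a wiki page.
/// Returning `None` means "leave the text as it was".
fn resolve_broken_link(link: BrokenLinkReference<'_>) -> Option<ResolvedReference> {
    if link.normalized.is_empty() {
        return None;
    }
    Some(ResolvedReference {
        // Build the URL from the normalised form...
        url: format!("/wiki/{}", link.normalized.replace(' ', "-")),
        // ...and keep the author's original spelling for the title.
        title: link.original.to_string(),
    })
}
```

With these stand-ins, calling `resolve_broken_link` with `normalized: "home page"` and `original: "Home Page"` yields a `ResolvedReference` whose `url` is `/wiki/home-page` and whose `title` is `Home Page`. Named fields make it harder to swap the URL and title by accident, which is the clarity gain the struct return value is after.
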
Diff: https://github.com/kivikakk/comrak/compare/v0.24.1...v0.25.0
- Add GH_TOKEN to release workflow by @digitalmoksha in #418
Diff: https://github.com/kivikakk/comrak/compare/v0.24.0...v0.24.1
- Miscellany. by @kivikakk in #387
- Add automation to release new crates by @gjtorikian in #374
- build(deps): bump emojis from 0.5.2 to 0.6.2 by @dependabot in #393
- build(deps): bump arbitrary from 1.3.0 to 1.3.2 by @dependabot in #394
- build(deps): bump actions/checkout from 3 to 4 by @dependabot in #389
- build(deps): bump once_cell from 1.17.0 to 1.19.0 by @dependabot in #390
- build(deps): bump xdg from 2.4.1 to 2.5.2 by @dependabot in #391
- build(deps): bump derive_builder from 0.12.0 to 0.20.0 by @dependabot in #392
- build(deps): bump memchr from 2.5.0 to 2.7.2 by @dependabot in #396
- build(deps): bump ntest from 0.9.0 to 0.9.2 by @dependabot in #397
- build(deps): bump typed-arena from 2.0.1 to 2.0.2 by @dependabot in #398
- Update automerge.yml by @gjtorikian in #401
- build(deps): bump clap from 4.0.32 to 4.5.4 by @dependabot in #400
- build(deps): bump regex from 1.7.0 to 1.10.4 by @dependabot in #402
- Fix release workflows by @gjtorikian in #395
- workflows: check MSRV in CI. by @kivikakk in #406
- Add support for wikilinks format by @digitalmoksha in #407
- Autolink should ignore wikilinks by @digitalmoksha in #413
- Bump version to 0.24.0 by @digitalmoksha in #415
Diff: https://github.com/kivikakk/comrak/compare/0.23.0...v0.24.0
- add traverse() demo example by @kaesluder in #370
- Avoid backslashes before a new block. by @jneem in #373
- Expand traverse and descendants documentation: Issue #369 by @kaesluder in #375
- Feat/inplace: add new parameter `--inplace` (`-i`) for in-place formatting by @bioinformatist in #377
- Change `relaxed-autolinks` to allow any url scheme by @digitalmoksha in #380
- Fix sourcepos for setext headers by @digitalmoksha in #381
- Add iterative search/replace example to examples and README.md by @kaesluder in #383
- un-Nix in CI. by @kivikakk in #384
- Return brackets in autolinks behavior back to cmark-gfm by @digitalmoksha in #386
- Fix broken docs link in README by @ohakutsu in #364
- Make non public nodes public by @mfontanini in #363
- cargo update -p rustix --precise 0.36.17 by @kivikakk in #368
- Add render option to wrap escaped chars in span by @digitalmoksha in #367
- Add math support by @digitalmoksha in #366
- Add a multiline blockquote extension by @digitalmoksha in #359
- build(deps): bump rustix from 0.36.11 to 0.36.16 in /fuzz by @dependabot in #346
- Use Nix for CI. by @charlottia in #338
- Allow for Syntect to simply generate CSS classes by @gjtorikian in #347
- Simplify anchorize() by @kornelski in #297
- Use footnote name for reference id by @digitalmoksha in #300
- Escape footnote name by @digitalmoksha in #308
- Add in-doc labels for public facing features by @CosmicHorrorDev in #304
- build(deps): bump xml-rs from 0.8.4 to 0.8.14 by @dependabot in #312
- Handle footnote names that have been parsed into multiple nodes by @digitalmoksha in #311
- Sync with cmark-gfm-0.29.0.gfm.3 by @digitalmoksha in #313
- Sync with cmark-gfm-0.29.0.gfm.4 by @digitalmoksha in #314
- Sync with cmark-gfm-0.29.0.gfm.5 by @digitalmoksha in #315
- Fix backslash in a link issue by @vpetrigo in #317
- Sync with cmark-gfm-0.29.0.gfm.7 by @digitalmoksha in #318
- Rename `ComrakFoo` types to just `Foo` for easier usage by @tgross35 in #320
- Make `ComrakExtensionOptions` non-exhaustive by @CosmicHorrorDev in #305
- Add builder derive and non_exhaustive for option structs by @YJDoc2 in #292
- add PartialEq and Eq derive for Ast and its components by @YJDoc2 in #322
- Sync with cmark-gfm-0.29.0.gfm.11 by @digitalmoksha in #319
- Fix autolink detection inside wiki style link brackets by @digitalmoksha in #325
- Add CI for running benchmarks by @YJDoc2 in #326
- Make adapters Send + Sync by @lucperkins in #337
- docs: fix-up broken docs.rs link by @silverjam in #341
- Use github/cmark-gfm submodule by @digitalmoksha in #344
- Sync with cmark-gfm-0.29.0.gfm.12 by @digitalmoksha in #343
- Sync with cmark-gfm-0.29.0.gfm.13 by @digitalmoksha in #345
- Improve performance of bundled plugins, and streaming I/O by @kivikakk in #288
- Implement Default for enums without using #[default] attribute by @silverpill in #293
- XML and sourcepos support by @kivikakk in #232
- Add a quadratic fuzzer by @philipturnbull in #295
- Fix some panics found by trivial fuzzing.
Missed from the 0.17.0 changelog:
- Add footnote attributes that mirror cmark-gfm by @digitalmoksha in #273
- Add support for full_info_string render option by @digitalmoksha in #276
- chore: improve debug performance by @conradludgate in #283
This contains some breaking changes from an API point of view, but output is largely unchanged. Spec compliance is improved, and benchmark runtime is over 20% faster.
- SECURITY: GHSA-8hqf-xjwp-p67v / Quadratic runtime when parsing Markdown (GHSL-2023-047)
- https://github.com/kivikakk/comrak/security/advisories/GHSA-8hqf-xjwp-p67v
- A variety of quadratic runtime issues that could lead to DoS were reported and addressed.
- We replaced pest with an re2c-based scanner.
- SECURITY: GHSA-xxmq-4vph-956w / Excessive output when parsing Markdown (GHSL-2023-048)
- https://github.com/kivikakk/comrak/security/advisories/GHSA-xxmq-4vph-956w
- Reference output is limited to 100Kb.
- SECURITY: GHSA-5r3x-p7xx-x6q5 / Attacker controlled data in AST nodes is not validated (GHSL-2023-049)
- https://github.com/kivikakk/comrak/security/advisories/GHSA-5r3x-p7xx-x6q5
- AST nodes no longer store raw `Vec<u8>`s, and instead store `String`s.
- Various API points were cleaned up.
- Comrak now targets Rust 2018.
Many thanks to @philipturnbull and @darakian of the GitHub Security Lab for bringing these issues to my attention and detailing the reproduction steps for each case.
- Track which symbol was used to mark task item as checked by @felipesere in #252
- improve tagfilter performance by @fiji-flo in #256
- [ShortCode] Add support for gemojis via shortcodes extension by @eklipse2k8 in #260
- "mod three rule" fix by @kivikakk in #262
- Add `shortcodes` to the README by @gjtorikian in #263
- Cargo.toml: remove timebomb by @kivikakk in #264
- Add custom heading adapter by @lucperkins in #266
- Keep track of "^" symbol when within footnotes by @gjtorikian in #274
- table: fix start_line of Table itself by @kivikakk in #231
- Rename header file to match c libname by @gjtorikian in #233
- Change the name of the ifdef by @gjtorikian in #234
- Add `comrak_set_parse_option_smart` by @gjtorikian in #235
- Allow `c_char` options to be NULL by @gjtorikian in #237
- Replace `lazy_static` dependency with `once_cell` by @Turbo87 in #238
- Make `comrak --help` readable on my terminal by @mgeisler in #242
- c-api: fix CI build by @kivikakk in #240
- Bump versions of some dependencies by @helmet91 in #243
- Adding functionality to build SyntectAdapters with custom themes, syntax sets, etc. by @ArvinSKushwaha in #239
- Make shell-words and xdg dependencies optional by @silverpill in #245
- Bump clap version to 4.0 and switch to the Derive API by @tranzystorek-io in #248
- c-api: remove by @kivikakk in #249
- Add C FFI, allowing Comrak to be used from other languages. (#171, Garen Torikian)
- Fix line wrapping in CommonMark output. (#228, Edward Loveall)
- Add option to specify character used for unordered list bullets in CommonMark output. (#229, Edward Loveall)
- Fix Windows build.
- Support compiling for WASM. (#222, Ben Wishoshavich)
- Replace deprecated twoway dependency. (#224)
- SECURITY: Bump regex to 1.5.5. (#221, Dependabot)
- Drop unneeded YAML dependency from Syntect. (#199, Chris Wong)
- Match newline handling in code inlines to upstream, and improve test failure reporting. (#210, Michael Anderson)
- Make all node value fields public. (#216, Evan Schwartz)
- Line break handling adjustments. (#214, Michael Anderson)
- Disable control characters in link definitions. (#219, Michael Anderson)
- Only load syntax and theme sets once, on Syntect plugin instantiation. (#197)
- Match syntax highlighting language names more loosely. (#198)
- Add pluggable syntax highlighting, and default implementation with syntect. (Daniel Simon, #194)
- Allow short URLs even with non-empty path. (#191, Bernard Teo)
- Expose NodeCode struct in AST. (#192, Vojtech Kral)
- SECURITY: it was possible to smuggle unsafe URLs, like `javascript:` ones, even without using the "unsafe" mode of operation. Thanks to Sam Sanoop (snoopysecurity) for reporting.
- Recognise tables without a preceding newline. (#183)
- 0.9.1 was a semver-breaking change.
- Add -o/--output CLI option. (#177)
- SECURITY: we were matching unsafe URL prefixes, such as `data:` or `javascript:`, in a case-sensitive manner. This meant prefixes like `Data:` were untouched. Please upgrade as soon as possible. (Kouhei Morita)
- Add support for ignoring front matter. (#170, Eitan Mosenkis.)
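
The case-sensitivity problem described in the security note above is easy to reproduce in miniature. The following is a small sketch, not Comrak's actual implementation: `UNSAFE_PREFIXES` and `has_unsafe_prefix` are hypothetical names, and the prefix list is only an assumed example. It shows the general fix, which is to compare the start of the URL against each prefix while ignoring ASCII case.

```rust
/// Assumed example list of URL prefixes treated as unsafe in this sketch.
/// It is not taken from Comrak's source.
const UNSAFE_PREFIXES: [&str; 4] = ["javascript:", "vbscript:", "file:", "data:"];

/// Returns true if `url` starts with one of the unsafe prefixes,
/// ignoring ASCII case, so `Data:` and `JAVASCRIPT:` are caught too.
fn has_unsafe_prefix(url: &str) -> bool {
    let url = url.trim_start();
    UNSAFE_PREFIXES.iter().any(|prefix| {
        // Compare raw bytes so a multi-byte character near the cut-off
        // can never cause a slicing panic.
        url.len() >= prefix.len()
            && url.as_bytes()[..prefix.len()].eq_ignore_ascii_case(prefix.as_bytes())
    })
}
```

A plain `starts_with("data:")` check would let `Data:text/html,...` through; the case-insensitive comparison above rejects it while still accepting ordinary URLs such as `https://example.com`.
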
- 0.8.2 was a semver-breaking change, so we're now bumping to 0.9.0. Some tests have been added to catch this in future.
- Allow image/ prefix on data URIs. (#169, Daniel Sorichetti)
- Fix some lint issues. (#152, Caleb Maclennan)
- Build benchmarks separately to tests. (#154)
- Add support for a config file for CLI use. (#157, with thanks to AJ ONeal.)
- Add escape option to escape raw HTML instead of clobbering it. (#150, Ryan Westlund)
- 0.7.1 was a semver-breaking change. This is now 0.8.0.
- Reduce list item indentation in line with spec. (#135, Casey Rodarmor)
- Split uber-struct ComrakOptions into substructures.
- Refactor HTML formatter escaping. (#140, Donough Liu)
- Don't render … inside … tags. (#145)
- Supporting stable and newer again, since dependencies keep breaking for 1.27.0. (#134)
- Exclude unneeded files from crate. (#120, Igor Gnatenko)
- Bump the twoway dependency. (#121, Igor Gnatenko)
- Add --gfm flag to CLI to enable all GitHub Flavored Markdown extensions and options. (#118, James R Miller)
- Add TaskItem variant to NodeValue. (#115, Élisabeth Henry)
- Support building on Rust versions back to 1.27.0. (#114)
- Update API so that footnote reference and definition identifiers match. (#110, Élisabeth Henry)
- Update to CommonMark spec 0.29. (#112)
- Add From impl to AstNode. (#105, Sunjay Varma)
- Add a Default derive and Ast::new to make ASTs programmatically constructible. (#101, Sunjay Varma and #102)
- Add a callback to fill in broken reference links, per pulldown_cmark's Parser::new_with_broken_link_callback. (#100, Sunjay Varma)
- Update to latest spec. (#99)
- Fix a bug in anchor generation; it should now be on par with GitHub's. (#97, Clifford T. Matthews)
- Expose anchor generation for use in library consumers. (#94, Clifford T. Matthews)
- Invert default-false `safe` flag to default-false `unsafe_` flag. If you were not enabling safe mode before, you'll need to enable unsafe mode now.
- Keep up-to-date with the spec.
- Significant test coverage and code clean up. (#82, #83, Brian Anderson)
- Description list support. (#86, Ayose Cazorla)
- Example use of comrak to convert CommonMark documents into S-expressions. (#86, Ayose Cazorla)
- Footnotes are now enabled via an extension option, not a flag of its own. (#87)
- Extend `cmark-gfm` compatibility to include all extension and regression tests. (#87)
- Speed enhancements. (#76, Brian Anderson)
- Target latest spec; bring comrak closer into line with cmark. (#81, Brian Anderson and Ashe Connor)
- Speed enhancements. (#75, Shaquille Johnson)
- Add safety options per the reference C implementation. (#67)
- Expose Arena type so users don't need to bring it in themselves (#66, Vincent Prouillet).
- Bring up to date with latest spec.
- Fix parsing of tables nested in other block elements (#61, Brian Anderson).
- Protect against stack smashing in inline processors and CommonMark and HTML formatters (#63, Brian Anderson).
- Fix a corner case in the ATX header parser (#53, Brian Anderson).
- Fix grammar for scanning table marker rows (#55, Brian Anderson).
- Add smart punctuation (#57).
- Add `default-info-string` argument/option to specify a default language in fenced code blocks. (Thanks to @steveklabnik for the suggestion.)
- Use `pest` instead of regexes for lexing.
- Fixed a bug where back-to-back emphases would not be processed correctly. (#45; thanks to @SSJohns for the report.)
- Fixed a bug where an exclamation mark "!" followed by a footnote would be eaten by the parser.
- Added footnotes support.
- Added header IDs extension.
- Fix for pathological reference link parsing.
- Speed optimisations.
- The formatters no longer produce Strings themselves; you must specify an output stream.
- Speed up whitespace normalisation.
- Multibyte character fix for autolink (#35, Shaquille Johnson).
- Resolve panics with tables in awkward situations (#36).
- Fix possible DoS in link parsing (#33, Demi Obenour).