diff --git a/docs/.vuepress/config.js b/docs/.vuepress/config.js
index 85ebf762..5a0cbe66 100644
--- a/docs/.vuepress/config.js
+++ b/docs/.vuepress/config.js
@@ -48,14 +48,7 @@ module.exports = {
'/docs/v3/ext-strikethrough/',
'/docs/v3/ext-tables/',
'/docs/v3/ext-tasklist/',
- {
- title: 'HTML',
- collapsable: false,
- children: [
- '/docs/v3/html/',
- '/docs/v3/html/custom-tag-handler.md'
- ]
- },
+ '/docs/v3/html/',
'/docs/v3/image/gif.md',
'/docs/v3/image/okhttp.md',
'/docs/v3/image/svg.md',
diff --git a/docs/docs/v3/core/html-renderer.md b/docs/docs/v3/core/html-renderer.md
index d0756b12..17ef63e3 100644
--- a/docs/docs/v3/core/html-renderer.md
+++ b/docs/docs/v3/core/html-renderer.md
@@ -79,4 +79,23 @@ builder.setHandler("a", new TagHandler() {
}
}
});
-```
\ No newline at end of file
+```
+
+:::tip
+Sometimes HTML content might include tags that are not closed (although
+they are required to be by the spec, for example a `div`).
+Markwon by default disallows such tags and ignores them. Still,
+there is an option to allow them _explicitly_ via a builder method:
+```java
+final Markwon markwon = Markwon.builder(context)
+ .usePlugin(new AbstractMarkwonPlugin() {
+ @Override
+ public void configureHtmlRenderer(@NonNull MarkwonHtmlRenderer.Builder builder) {
+ builder.allowNonClosedTags(true);
+ }
+ })
+ .build();
+```
+Please note that if `allowNonClosedTags` is set to `true`, all non-closed tags will be closed
+at the end of the document.
+:::
\ No newline at end of file
diff --git a/docs/docs/v3/html/README.md b/docs/docs/v3/html/README.md
index 96a75431..a8a12d1b 100644
--- a/docs/docs/v3/html/README.md
+++ b/docs/docs/v3/html/README.md
@@ -1,47 +1,21 @@
----
-title: 'Overview'
----
+# HTML
-# HTML
+This artifact encapsulates HTML parsing from the core artifact and provides
+a few predefined `TagHandlers`.
-
-
-Starting with version `2.0.0` `Markwon` brings the whole HTML parsing/rendering
-stack _on-site_. The main reason for this are _special_ definitions of HTML nodes
-by . More specifically:
-and .
-These two are _a bit_ different from _native_ HTML understanding.
-Well, they are _completely_ different and share only the same names as
- and
-elements. This leads to situations when for example an `` tag is considered
-a block when it's used like this:
-
-```markdown
-
-Hello from italics tag
-
+```java
+final Markwon markwon = Markwon.builder(context)
+ .usePlugin(HtmlPlugin.create())
+ .build();
```
-:::tip A bit of background
-
- had brought attention to differences between HTML & commonmark implementations.
-:::
+Because this artifact bundles a modified [jsoup](https://github.com/jhy/jsoup) library,
+it was moved to a standalone module in order to minimize dependencies and unused code
+in applications that do not require HTML rendering capabilities.
-Let's modify code snippet above _a bit_:
-
-```markdown{3}
-
-Hello from italics tag
-
-
-```
-
-We have just added a `new-line` before closing `` tag. And this
-changes everything as now, according to the ,
-we have 2 HtmlBlocks: one before `new-line` (containing open `` tag and text content)
-and one after (containing as little as closing `` tag).
-
-If we modify code snippet _a bit_ again:
+Previously, `Markwon` used the Android `Html` class for parsing and
+rendering. Unfortunately, according to the markdown specification, markdown can contain
+HTML in an _unpredictable_ way when rendered _outside_ of a browser. For example:
```markdown{4}
@@ -50,260 +24,38 @@ Hello from italics tag
bold>
```
-We will have 1 HtmlBlock (from previous snippet) and a bunch of HtmlInlines:
+This snippet could be represented as:
+* HtmlBlock (`\nHello from italics tag`)
* HtmlInline (``)
* HtmlInline (``)
* Text (`bold`)
* HtmlInline (``)
-Those _little_ differences render `Html.fromHtml` (which was used in `1.x.x` versions)
-useless. And actually it renders most of the HTML parsers implementations useless,
-as most of them do not allow processing of HTML fragments in a raw fashion
-without _fixing_ content on-the-fly.
-
-Both `TagSoup` and `Jsoup` HTML parsers (that were considered for this project) are built to deal with
-_malicious_ HTML code (*all HTML code*? :no_mouth:). So, when supplied
-with a `italic` fragment they will make it `italic`.
-And it's a good thing, but consider these fragments for the sake of markdown:
-
-* `italic `
-* `bold italic`
-* ``
-
-We will get:
-
-* `italic `
-* `bold italic`
-
-_* Or to be precise: `italic ` &
-`bold italic`_
-
-Which will be rendered in a final document:
-
-
-|expected|actual|
-|---|---|
-|italic bold italic|italic bold italic|
-
-This might seem like a minor problem, but add more tags to a document,
-introduce some deeply nested structures, spice openning and closing tags up
-by adding markdown markup between them and finally write _malicious_ HTML code :laughing:!
-
-There is no such problem on the _frontend_ for which commonmark specification is mostly
-aimed as _frontend_ runs in a web-browser environment. After all _parsed_ markdown
-will become HTML tags (most common usage). And web-browser will know how to render final result.
-
-We, on the other hand, do not posess HTML heritage (*thank :robot:!*), but still
-want to display some HTML to style resulting markdown a bit. That's why `Markwon`
-incorporated own HTML parsing logic. It is based on the project.
-And makes usage of the `Tokekiser` class that allows to _tokenise_ input HTML.
-All other code that doesn't follow this purpose was removed. It's safe to use
-in projects that already have `jsoup` dependency as `Markwon` repackaged **jsoup** source classes
-(which could be found )
-
-## Parser
-
-There are no additional steps to configure HTML parsing. It's enabled by default.
-If you wish to _exclude_ it, please follow the [exclude](#exclude-html-parsing) section below.
-
-The key class here is: `MarkwonHtmlParser` that is defined in `markwon-html-parser-api` module.
-`markwon-html-parser-api` is a simple module that defines HTML parsing contract and
-does not provide implementation.
-
-To change what implementation `Markwon` should use, `SpannableConfiguration` can be used:
-
-```java{2}
-SpannableConfiguration.builder(context)
- .htmlParser(MarkwonHtmlParser)
- .build();
-```
-
-`markwon-html-parser-impl` on the other hand provides `MarkwonHtmlParser` implementation.
-It's called `MarkwonHtmlParserImpl`. It can be created like this:
-
-```java
-final MarkwonHtmlParser htmlParser = MarkwonHtmlParserImpl.create();
-// or
-final MarkwonHtmlParser htmlParser = MarkwonHtmlParserImpl.create(HtmlEmptyTagReplacement);
-```
-
-### Empty tag replacement
-
-In order to append text content for self-closing, void or just _empty_ HTML tags,
-`HtmlEmptyTagReplacement` can be used. As we cannot set Span for empty content,
-we must represent empty tag with text during parsing stage (if we want it to be represented).
-
-Consider this:
-* ``
-* ` `
-* ``
-
-By default (`HtmlEmptyTagReplacement.create()`) will handle `img` and `br` tags.
-`img` will be replaced with `alt` property if it is present and `\uFFFC` if it is not.
-And `br` will insert a new line.
-
-### Non-closed tags
-
-It's possible that your HTML can contain non-closed tags. By default `Markwon` will ignore them,
-but if you wish to get a bit closer to a web-browser experience, you can allow this behaviour:
-
-```java{2}
-SpannableConfiguration.builder(context)
- .htmlAllowNonClosedTags(true)
- .build();
-```
-
-:::warning Note
-If there is (for example) an `` tag at the start of a document and it's not closed
-and `Markwon` is configured to **not** ignore non-closed tags (`.htmlAllowNonClosedTags(true)`),
-it will make the whole document in italics
+:::tip A bit of background
+
+ had brought attention to differences between HTML & commonmark implementations.
:::
-### Implementation note
+Unfortunately, the Android `Html` class cannot parse a _fragment_ of HTML to later
+be included in a bigger set of content. This is why the decision was made to bring
+HTML parsing _in-markwon-house_.
-`MarkwonHtmlParserImpl` does not create a unified HTML node. Instead it creates
-2 collections: inline tags and block tags. Inline tags are represented as a `List`
-of inline tags (). And
-block tags are structured in a tree. This helps to achieve _browser_-like behaviour,
-when open inline tag is applied to all content (even if inside blocks) until closing tag.
-All tags that are not _inline_ are considered to be _block_ ones.
+## Predefined TagHandlers
+* ``
+* ``
+* `