Dumbdown – A Tree Language that compiles to HTML

Kwpolska · on Sept 2, 2019

This is not an alternative to Markdown in any shape or form. Markdown resembles plain text and can be easily parsed by anyone without syntax highlighting, a Markdown → visual format compiler, or prior knowledge of the format; moreover, Markdown is meant to be quick to type without wondering about the intricacies of the output format.

This thing is anything but. You need to know the keywords and type them out all the time, you can’t use it in an e-mail because your readers won’t understand you, and if you like trees so much, just write regular HTML — or a “simplified” HTML format like Pug — and call it a day.

cybervegan · on Sept 2, 2019

I've been working on something similar, with the intention of making something that separates semantic tags from formatting tags, such that the resulting document could be directly narrated or dictated, without resorting to too many (if any) "special" words for punctuation. Renderers could easily translate/format the output for better style or into a form more appropriate for their purposes; "editors" could do the same.

Currently, screen-readers do not cope well with modern web pages, and can you imagine trying to dictate HTML to a speech-to-text based editor? Using our approach means you can literally speak the content to the stt, and have it appear on-screen in a displayable format. I'm thinking this would be useful for verbal/conversation applications where you can effectively dispense with a visual display.

I haven't put my ideas into any kind of proper documentation, but they are very close to yours.

mr__y · on Sept 2, 2019

This sounds interesting to me! I actually thought about something relatively similar with the intention of creating js-based renderer to make it web browser compatible. I was also contemplating the idea to put semantic data into a separate structure, similarly to schema.org data in json-ld format. While this is probably worse from the editor point of view, the advantage would be more clear and more human readable content since the semantic tags would be omitted. This could also save some bandwidth if client software could fetch only the part that it needs (for instance if a given reader does not support semantic tags parsing it could download only the main content, while search engines could download only the semantic part). The problem is that thinking about that for a while is all that I've achieved so far :)

breck · on Sept 2, 2019

> schema.org data in json-ld format

It would be cool if you explored Tree Notation in your thoughts.

We have an idea roughly called "World Wide Tree" (or at least one person calls it "World Wide Forest").

We think Tree Notation might be the trick to getting the semantic web vision realized. One simple universal syntax for HTML, CSS, Javascript, JSON, Data schemas, data itself, etc.

The semantics you still need to define and build machines for, but if we had a simpler syntax (without sacrificing a single capability!) that might move the ball pretty far.

tempo33 · on Sept 2, 2019

> We think Tree Notation might be the trick to getting the semantic web vision realized.

I'm not sure what Tree Notation fundamentally brings to the table that RDF and microformats, etc do not..

Have you read: https://people.well.com/user/doctorow/metacrap.htm

How does Tree Notation address anything listed there?

The semantic web has never been about technological limitations or tooling problems. And even if it were, that is solving the simple problem.

breck · on Sept 2, 2019

That's a fantastic, fantastic link. Thanks for sharing. Very informative. I may have read it before, but don't see it in my notes.

> How does Tree Notation address anything listed there?

The 2 things that have changed since 2001:

- git
- Tree Notation

A very powerful combination. In two ways. First as a collaborative database system (https://treenotation.org/treeBase/). Second as collaborative grammars (done via Github, gitlab, or any gitX).

1. "People lie". Complexity can be measured directly in Tree Notation. Complexity is where corruption hides. Tree Notation + git (blaming, etc), makes it much harder to lie.

2. "People are lazy": Tree Notation requires the fewest keystrokes (or pen/pencil strokes--it works great on paper too! very important in clinical settings. for instance, in some countries, 80% of hospitals have no digital medical records at all--I was recently told today!). Tree Notation and our grammar language gives you type checking, autocomplete, autocorrections, and more.

3. "People are stupid": see response to #2.

4. "Mission: Impossible -- know thyself". I'm not sure the problem here. The semantic web shouldn't be about forcing some model of behavior on people.

5. "Schemas aren't neutral". Tree Notation makes this very simple: just fork a grammar! We are carefully designing our Grammar language so you can simply do a file concat of N files to create a new grammar. We are making it as easy as possible to build, fork, and combine new grammars.

6. "Metrics influence results". In our database of 10k notations and computer languages, I quickly realized that you can't bucket things so cleanly. Terms like "a functional language" and "imperative language" are mildly useful, but not so precise. Instead, we now have over 1K columns. Tree Notation/TreeBase/Grammars make this very easy. Amongst other things, this will allow for better precision medicine.

7. "There's more than one way to describe something". We agree! It's so easy to fork a Grammar if you think you can do it better. Let the market decide. We have this idea we talk about, the "World Wide Tree". But at least one person thinks we should call it the "World Wide Forest". I think they may be right.

FWIW, I pitched Tree Notation for the semantic web to w3c in 2017 but never heard back. This is a reminder that I should ping them again.

Thanks again for the link. A very good read and I've long been a fan of CD's work.

cybervegan · on Sept 2, 2019

I know what you mean.

breck · on Sept 2, 2019

We think alike!

> such that the resulting document could be directly narrated or dictated, without resorting to too many (if any) "special" words for punctuation.

This is a design test I put every new Tree Language through. The early languages still had some special punctuation, like # for comment. Once syntax highlighting and autocomplete was good, I realized we should do away with all such instances, in most cases. It's made them much more of a joy to use.

> Currently, screen-readers do not cope well with modern web pages, and can you imagine trying to dictate HTML to a speech-to-text based editor?

Agreed! I had a friend who is blind who I spent a couple days with years ago observing how he used his machine. It was both incredibly complicated and also amazing (I couldn't believe how fast he had the computer speak to him and how he was able to understand anything). I hadn't thought about that use case for Tree Notation languages until your comment just now. Thanks for sharing it. Perhaps there is something that could be done in that domain. Let me know if there are ways we could help.

solarkraft · on Sept 2, 2019

I'm interested in this, partly because I'd like to have an easy way to get richer web pages: Instead of reading "This was X dollars in year Y, the equivalent of Z dollars in <years ago>" I'd like the author to be able to just put in the facts: Something like $[X|Y], with Z being generated automatically by the web page, which means it'd always be current and you could get a ton more context, like currency conversion.
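A minimal sketch of how a renderer might expand the `$[X|Y]` idea above, assuming a price-index table is supplied by the page. The function name `expand_prices`, the `price_index` parameter, and the index values in the usage note are all hypothetical and made up for illustration; they are not real statistics or part of any existing tool.

```python
import re

# Matches the proposed inline form $[amount|year], e.g. $[100|1990].
PRICE = re.compile(r"\$\[(\d+(?:\.\d+)?)\|(\d{4})\]")


def expand_prices(text, price_index, current_year):
    """Replace each $[X|Y] with X plus its equivalent in current_year.

    price_index maps a year to a price-index value; the caller supplies it,
    so the output stays current whenever the table is updated.
    """
    def repl(match):
        amount = float(match.group(1))
        year = int(match.group(2))
        # Scale the amount by the ratio of the two index values.
        adjusted = amount * price_index[current_year] / price_index[year]
        return f"${amount:,.2f} in {year} (about ${adjusted:,.2f} in {current_year})"

    return PRICE.sub(repl, text)
```

With a toy table such as `{1990: 50.0, 2019: 100.0}` (placeholder numbers only), `expand_prices("It cost $[100|1990].", table, 2019)` would scale the amount by 100/50. Currency conversion could be layered on the same way, with a second lookup table.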

rraghur · on Sept 2, 2019

markdown has a bunch of flaws - most egregious is that it lacks a spec. GFM brings in a few nice QOL improvements and CommonMark tried to standardize - but very soon you run into its warts - for example, try to have a list which has a code block in it (or other complex content)

For me, Asciidoc (and Asciidoctor) have become the default - formal spec, test cases and extension mechanism so that you don't have N + 1 flavors of it. It also has a markdown migration mode that eases moving from md.

IMO, it is better in almost every way than markdown - the only reason it isn't as popular is that Markdown was made popular (and its adoption driven) by GitHub & others.

geraldbauer · on Sept 2, 2019

So how is:

.Ordered

. number

  .. letter

  .. letter

better and more readable than

1. number

2. number

   a. letter

   b. letter

For sure Markdown has some cruft but isn't it better to improve Markdown itself than to try to "establish" a completely new format with some weird conventions such as:

Level 3 Heading

^^^^^^^

Level 4 Heading

+++++++++++++++

Why? ^^ or ++ is this in any way intuitive or an established convention that is so much better?

PS: See Text with Instructions (.texti) for a (better) Markdown evolved variant / flavor :-) - https://texti.github.io

TuringTest · on Sept 2, 2019

> So how is (X) better? For sure Markdown has some cruft but isn't it better to improve Markdown itself than to try to "establish" a completely new format

> PS: See Text with Instructions (.texti) for a (better) Markdown evolved variant / flavor

You have answered yourself: markdown is not a language that can be improved, it's a mashup of several different slightly incompatible implementations, much like HTML in the early web during browser wars.

You could try to create a standard body that defined a homogeneous definition that everybody adheres to and which could be extended with new features.

But by that time you'd be better served off by using asciidoc, which already did that job and which is actually based on Docbook, supporting all the features of that complete standard for book publishing.

geraldbauer · on Sept 2, 2019

Sorry, you are misreading and misstating what I said - markdown is a language and can of course be improved, as many flavors / dialects / extensions and so on prove. Evolution is usually way better at finding conventions than your top-down Docbook in Ascii (Asciidoc) flavor. Again, (re)read the post above and tell me how the Asciidoc ordered list using . and .. is more readable, or the heading level 3 or 4 using ^^ or ++ and so on.

rraghur · on Sept 3, 2019

> post above and tell me how the Asciidoc ordered list using . and .. is more readable

It isn't - the point is that asciidoc has a 'spec' whereas markdown has dialects because it's not rigorously specified (core or extensions). Which one you might run into is the luck of the draw.

Should you try to improve markdown (and lots of people have), you end up with the N+1 standards problem (famous xkcd cartoon) and further fragmenting the implementation if at all it takes off.

Getting one 'Markdown' that has features (and extensions) that work properly wherever you go isn't a technical problem. It's a problem created by the lack of a spec in the first place and IMO pretty much unsolvable now.

icegreentea2 · on Sept 2, 2019

I don't think that your examples for asciidoc are quite correct.

Ordered lists do not need ".Ordered" to begin (https://asciidoctor.org/docs/asciidoc-syntax-quick-reference... scroll down a little). You don't need the leading spaces for sublists either.

I'll grant you that the period is a little more annoying than the numbers, but it's honestly not a big deal, and also means that long lists (longer than 10 elements) have consistent spacing.

For headers, you don't need those different formats (https://asciidoctor.org/docs/asciidoc-syntax-quick-reference...). It's just equal signs the whole way.

kerkeslager · on Sept 2, 2019

It's better because it's specified.

This is a case of, "It doesn't matter what side of the road you drive on, as long as everyone drives on the same side."

Your link asks repeatedly whether we've learned anything over the last 10+ years. One thing we've learned over the last 10+ years is that underspecified organically emergent "standards" result in a bunch of inconsistent behavior that confuses users.

I won't claim that AsciiDoc fixes the problem--it would need to be adopted more widely and it's not really trying to do the same thing as Markdown--it's not trying to make something that's both readable as plain text and generatable into rich text. But simply focusing on some aesthetic qualities of the markup shows you aren't really understanding the problem with Markdown.

GhostVII · on Sept 2, 2019

The first example is nice because you don't have to change all of the numbers around every time you add/remove an entry from the list. But I don't like having to put the ".Ordered" before every ordered list, and in Markdown I think it will automatically make the numbering sequential if the source isn't anyways.

joshuamorton · on Sept 2, 2019

Yeah, you can number everything as 1. in markdown and it will work, but if you want the source to also be readable, which is the point of markdown, the numbers should match.
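To keep the source readable without renumbering by hand, a small script can do it. This is a minimal sketch, assuming list items look like `<digits>. <text>` and that nesting is expressed purely by leading whitespace; the function name `renumber` is hypothetical, and real Markdown dialects have more list rules than this handles.

```python
import re

# An ordered-list item: optional indent, digits, a dot, whitespace, then text.
ITEM = re.compile(r"^(\s*)\d+\.\s+(.*)$")


def renumber(lines):
    """Rewrite ordered-list numbers so the source reads 1., 2., 3., ..."""
    out = []
    counters = {}  # indent width -> last number used at that depth
    for line in lines:
        match = ITEM.match(line)
        if match is None:
            # Blank lines keep the list going; any other text ends it.
            if line.strip():
                counters.clear()
            out.append(line)
            continue
        indent, text = match.groups()
        depth = len(indent)
        # Leaving a nested list: forget the counters of deeper levels.
        for deeper in [d for d in counters if d > depth]:
            del counters[deeper]
        counters[depth] = counters.get(depth, 0) + 1
        out.append(f"{indent}{counters[depth]}. {text}")
    return out
```

Feeding it a list where every item is numbered `1.` yields sequential numbers per nesting level, so the plain-text source matches what a renderer would display.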

edraferi · on Sept 2, 2019

I evaluated several light markup languages a while back for use in technical contracts. Basically I was fed up dealing with MSFT Word’s formatting problems while iterating on complex content with internal and external lawyers.

Asciidoc definitely seemed like the best option. The biggest thing for me was the ability to do complex numbered headers.

Unfortunately the tooling wasn’t really good enough for my use case. I needed a clean way to move between Asciidoc and Word, and it wasn’t really possible. Pandoc claimed support for both Asciidoc and Word, but it was only partial and couldn’t round-trip the document. I considered developing & contributing the functionality, but I didn’t know the language the tool was written in and couldn’t justify investing more time in the experiment.

Maybe it’s time to review the tooling situation and take another crack at it...

rraghur · on Sept 3, 2019

I've found decent results with .adoc --(asciidoctor)--> docbook --(pandoc)--> docx

It won't roundtrip (and I doubt anything will ever) but I usually go only one way - converting to docx in the last stage.
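The one-way pipeline above could be scripted roughly as follows. This is a sketch, assuming `asciidoctor` and `pandoc` are installed and on the PATH; the helper names `conversion_commands` and `convert` are hypothetical, and the intermediate DocBook file is simply written next to the source.

```python
import subprocess
from pathlib import Path


def conversion_commands(adoc_path):
    """Build the two commands: .adoc -> DocBook XML -> .docx."""
    src = Path(adoc_path)
    docbook = src.with_suffix(".xml")
    docx = src.with_suffix(".docx")
    return [
        # Step 1: Asciidoctor renders the document with its DocBook backend.
        ["asciidoctor", "-b", "docbook", "-o", str(docbook), str(src)],
        # Step 2: Pandoc reads DocBook and writes a Word document.
        ["pandoc", "-f", "docbook", "-t", "docx", "-o", str(docx), str(docbook)],
    ]


def convert(adoc_path):
    """Run both steps in order, stopping at the first failure."""
    for command in conversion_commands(adoc_path):
        subprocess.run(command, check=True)
```

Calling `convert("contract.adoc")` would leave `contract.xml` and `contract.docx` beside the source. Building the command lists separately from running them makes the pipeline easy to inspect or log before anything is executed.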

a-nikolaev · on Sept 2, 2019

Agreed. Writing lecture notes [1] in Asciidoc was a real pleasure. The clarity of specs and convenience beats Markdown in every way. It even beats TeX, I'd say, unless you want real control of the typography. A very nice default styling in Asciidoctor is a big plus for me too.

[1] for example, this: https://a-nikolaev.github.io/fp/lec/2/

616c · on Sept 2, 2019

The FP book you cited is impressive, direct, and clear enough that I will likely give OCaml a go again, not just AsciiDoc. Thanks!

a-nikolaev · on Sept 2, 2019

Cool, thanks!

breck · on Sept 2, 2019

Great references and description of the field. If anyone wants to submit a pull request to the Dumbdown grammar, I would love one that adds links to all the great languages in this space, like markdown and asciidoc. Something like:

    link http://asciidoc.org/ AsciiDoc

joshuamorton · on Sept 2, 2019

The consistent reposting and commenting about tree notation is tiring, especially when it has obvious flaws. As a simple example, this language requires you to label each node-type. A list is

    list
     - one
     - two

instead of just

    - one
    - two

which is an obvious deficiency compared to markdown. This seems like a general problem with something like tree notation, maybe there's a way to fix it, but it isn't obvious. Why's this preferable?
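To make the trade-off concrete, here is a minimal sketch of a prefix-keyword compiler for the two node types that appear in this thread, `list` and `link`. It is not the actual Dumbdown implementation; the function name `compile_tree` and the exact rules (children indented by at least one space, `link <url> <text>`) are assumptions for illustration.

```python
from html import escape


def compile_tree(source):
    """Compile a tiny indented, keyword-first language to HTML."""
    html = []
    in_list = False
    for raw in source.splitlines():
        if not raw.strip():
            continue
        indented = raw.startswith(" ")
        # The first word of every line names the node type.
        keyword, _, rest = raw.strip().partition(" ")
        # Any line that is not an indented "-" child closes an open list.
        if in_list and not (indented and keyword == "-"):
            html.append("</ul>")
            in_list = False
        if keyword == "list" and not indented:
            html.append("<ul>")
            in_list = True
        elif keyword == "-" and in_list:
            html.append(f"<li>{escape(rest)}</li>")
        elif keyword == "link":
            url, _, text = rest.partition(" ")
            html.append(f'<a href="{escape(url)}">{escape(text)}</a>')
        else:
            # A bare "- one" with no enclosing "list" lands here.
            raise ValueError(f"unknown node type: {keyword!r}")
    if in_list:
        html.append("</ul>")
    return "".join(html)
```

Because the dispatcher only looks at the first word, a bare `- one` at the top level is rejected unless `-` is itself registered as a node type, which is the extra labelling the comment above objects to.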

n1000 · on Sept 2, 2019

Markdown remains the best hybrid for me, because it strikes the perfect balance between a. formal enough to use pandoc and make html, docx, etc. with it and b. still fully human readable, which allows me to take all my meeting minutes with it and be able to quickly copy & paste them into an email or store them on our project share and nobody will really wonder.

osrec · on Sept 2, 2019

I like markdown, but also like asciidoc for when you need your documents to be a bit smarter.

cmroanirgo · on Sept 2, 2019

I'm a bit of a fan of textile [0] and first came across the redcloth version[1] years ago, especially for non technical writing. In particular, it's closely aligned to the html that gets generated without all the angles. Eg. You can easily add class and/or id to a paragraph:

> Some text

p.red Some red text

It has easy to do left, centre and right alignments.

Markdown was (slightly) influenced by textile too [2]:

> While Markdown’s syntax has been influenced by several existing text-to-HTML filters — including Setext, atx, Textile, reStructuredText, Grutatext, and EtText — the single biggest source of inspiration for Markdown’s syntax is the format of plain text email.

Too bad it's gone the way of the dodo [3].

[0] https://en.wikipedia.org/wiki/Textile_(markup_language)

[1] http://redcloth.org/

[2] https://daringfireball.net/projects/markdown/syntax

[3] https://github.blog/2016-03-01-upgrading-your-textile-posts-...

pattisapu · on Sept 2, 2019

Indeed, Textile is slightly more intuitive and powerful than Markdown, but these days it's unaccountably Betamaxing.

pattisapu · on Sept 10, 2019

And simple, helpful things like automatic list numbering are in Textile but not in canonical Markdown.

myhf · on Sept 2, 2019

Being explicit about node-types is useful if you are converting those names directly into tagNames. But then you just end up reinventing HAML/Pug.

joshuamorton · on Sept 2, 2019

Sure, it's simpler to implement, but that isn't useful for an end user, especially when markdown already exists.

breck · on Sept 2, 2019

You could make - a top level node type. You could also use pattern matching, instead of prefix notation. Tree Languages can use prefix notation, infix, postfix, pattern matching, omnifix, etc. prefix notation seems to work well for this case.

hyperpallium · on Sept 2, 2019

RE the thread at https://lobste.rs/s/solbtw/this_is_dumbdown_alternative_mark... could you just write a parser (in Tree Notation) that converts a language in Tree Notation into BNF?

e.g. to produce BNF of dumbdown

Also, to parse BNF and produce a (bare bones) version in Tree Notation.

This would make it more accessible while also demonstrating its power.

breck · on Sept 2, 2019

That's a really good idea!

If you give it a shot, I'd be happy to add any features or fix any bugs that you come across in the Grammar language. I'd also be happy to do a screenshare to explain any questions you might have about Grammar before you get started (since the UX is still pretty bad--I apologize. Slowly getting there!).

hyperpallium · on Sept 3, 2019

It was a challenge to you. Though thinking it over, it's grammar transformation, not the parsing that Tree Notation is for.

nine_k · on Sept 2, 2019

There is a great, time-tested, full-featured tree text format. It's called Org.

The problem is that the only editor that has a full support for it is Emacs (org mode).

If I wanted to make a difference in this area, I would dedicate my time to writing a very good HTML renderer / JS control for it, then a good VIM mode, then maybe a good VScode mode.

It's not easy because the spec is large, and every part of it is useful.

eitland · on Sept 2, 2019

FWIW and IIRC this is the first time I see this here :-)

Oh, and thanks for pointing out the problems you see.

chriszhq · on Sept 2, 2019

Maybe we don't need so many "light" markup languages. Writing html directly is as easy as writing any markup language if you have a good html editor. I have been taking notes with html-notepad (https://html-notepad.com/) for a long time and the writing experience is much more enjoyable than writing md, rst or emacs org mode (I only tried these three). Taking notes with html doesn't mean you have to write verbose html tags or that the html tags would clutter the edit area. The html-notepad is a WYSIWYG editor, which means what you see is a rendered html page when you are editing. There are keyboard shortcuts for oft-used elements like header1~6, lists, code and so on, and you could edit the html source if you like. Although the html-notepad has some flaws and lacks some really useful functions (like search and replace), it's definitely a very convenient tool for me.

pgcj_poster · on Sept 2, 2019

I think the main reason people like using lightweight markup languages is because they don't have to depend on any special tools. I can comfortably edit markdown in any text editor, whether I'm using VS Code on my laptop, Notepad on a public library PC, Vim on a remote server, a form in my source host's web interface, or whatever it is that people use to edit text on phones. And I don't need to worry about the status of someone's proprietary hobby project.

gagan2020 · on Sept 2, 2019

It's more like what YAML is to XML. HTML is also a tree structure.

Markdown is just a plain readable file that can be beautified with HTML and CSS.

geraldbauer · on Sept 2, 2019

For a more pragmatic alternative to Markdown see Text with Instructions (.texti) that gets you the best of Markdown, Wikipedia Markup, LaTeX & Friends. See https://texti.github.io

masklinn · on Sept 2, 2019

I don't see anything pragmatic about it. It seems like a complete mess, less convenient than markdown (putting a space before a title makes it into a comment? uwot?), and no more extensible.

geraldbauer · on Sept 2, 2019

You're misreading what texti is all about. A comment is unix-style # and a heading (title) is using the wikipedia markup convention. You can put as many spaces before as you like it makes no difference to a comment or to a heading (title). See some samples @ https://github.com/texti/texti.github.io/tree/master/samples to compare markdown, texti and wikipedia markup using a real-world article (from wikipedia itself on markup languages).

About extensible - it's no different from markdown with two additions - 1) evolution / changes are more than welcome and 2) texti (like wikipedia markup) has (recursive) template extensions / includes built-in.

breck · on Sept 2, 2019

Interesting. Thanks for sharing the link! Always helpful to read about related projects.

boomlinde · on Sept 2, 2019

I think the only redeeming feature of markdown is that its formatting markup mimics plain text formatting conventions. In that sense it’s hard to agree that this is an alternative to markdown. It lacks the one quality markdown has that otherwise more complete document formats lack, really its only distinguishing feature.

wetpaws · on Sept 2, 2019

Now let's add attributes, some stylesheets and perhaps some small script language for basic interactivity and we are done

jspash · on Sept 2, 2019

I'd love to volunteer for the scripting language but I only have about 10 days. And I'm not very bright. But that's ok since it's unlikely to get much adoption. How does Scripty McScriptface sound for the name?

jobigoud · on Sept 2, 2019

For the name it would be better to use an existing programming language and add "script" at the end, it won't be confusing. Maybe JavaScriptScript?

ourmandave · on Sept 2, 2019

Will there also be components?

And can I get a compiler, in say, 25 years?

fao_ · on Sept 2, 2019

I wrote something vaguely similar (Except instead of being inspired by Markdown I was inspired by Lisp) a few months back and then wrote my website in it:

https://gitlab.com/finnoleary/wisp-new

friend-monoid · on Sept 2, 2019

But... why?

mr__y · on Sept 2, 2019

For the glory of Syntax :)

ToJans · on Sept 2, 2019

Just a short remark: this reminds me of Cobol.

cttet · on Sept 2, 2019

Elm markup has a similar approach and I quite like this

enriquto · on Sept 2, 2019

Unpopular opinion: markdown is mostly useless, and plain text is better nearly in all cases.

As a corollary, markdown so complex that it is unreadable before being "rendered" is worse than useless.