Markdown Processing in Ruby

Jesse Herrick

Markdown is a fantastic markup language that compiles into HTML. Although its original implementation was written in Perl, Markdown has been ported to many languages, and the ports differ in the features they offer. We’re going to focus on four Ruby implementations of Markdown: kramdown, maruku, rdiscount, and redcarpet.

Processing Markdown in Ruby

Obviously, we should install these gems first, so run gem install kramdown maruku rdiscount redcarpet. This may take a little bit of time because both RDiscount and Redcarpet use C extensions for faster processing speed.
This also means that, if your Ruby interpreter doesn’t support C extensions, you won’t be able to use RDiscount or Redcarpet. I’m looking at you, JRuby.
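
If you’re not sure which of the four gems will actually load on your interpreter, a quick check like the one below can tell you. This is only a sketch: the helper name available_markdown_gems is made up for this example, and it assumes that a gem which is missing (or whose C extension can’t be loaded) raises LoadError on require.

```ruby
# The four gem names come from this article; the helper name is hypothetical.
GEMS = %w[kramdown maruku rdiscount redcarpet].freeze

# Returns a hash mapping each gem name to true if `require` succeeds.
def available_markdown_gems(names = GEMS)
  names.each_with_object({}) do |name, result|
    begin
      require name
      result[name] = true
    rescue LoadError
      # Assumption: a missing gem or an unloadable extension ends up here.
      result[name] = false
    end
  end
end

available_markdown_gems.each do |name, ok|
  puts format('%-10s %s', name, ok ? 'available' : 'not available')
end
```

Anything reported as not available can simply be left out of the examples that follow.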

For our processing needs, we’ll use the Markdown from John Gruber’s Markdown syntax page saved to markup.md.

In each implementation we can process this Markdown file in a single line of code.

markdown = File.read('markup.md')

# kramdown
require 'kramdown'
Kramdown::Document.new(markdown).to_html

# maruku
require 'maruku'
Maruku.new(markdown).to_html

# rdiscount
require 'rdiscount'
RDiscount.new(markdown).to_html

# redcarpet
require 'redcarpet'
Redcarpet::Markdown.new(Redcarpet::Render::HTML.new).render(markdown)

Sweet! We can see that each library (with the exception of Redcarpet, which needs a renderer object first) has a simple API for processing Markdown.

Features

Markdown’s original implementation was great in its day, but people eventually decided that they wanted more. For example, Markdown’s original implementation includes support for images and code blocks, but many users wanted a different syntax or better extensibility for these features. Thus, the features of various implementations of Markdown vary greatly.

Let’s look at several popular feature additions, and their support in our Ruby implementations.

Code Blocks

Markdown’s original code block syntax relies on indenting every line of the code (by four spaces or a tab), like so:

    if foo == bar
      "Markdown is awesome."
    end

The improved syntax is called fenced code blocks. It is written like this:

```ruby
if foo == bar
  "Markdown is awesome."
end
```

Notice the specification of Ruby as the language at the end of the first set of “fences”. This is optional, but allows for the Markdown implementation to do language-specific syntax highlighting.
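
To make the “optional language” part concrete, here is a rough sketch of how a processor might pull the language name off an opening fence. The regular expression and the fence_language helper are assumptions made up for this example; the real gems each have their own parsing rules.

```ruby
# Matches an opening fence: three or more backticks, then an optional language name.
FENCE_OPEN = /\A`{3,}\s*(?<lang>[\w+-]*)\s*\z/

# Returns the language hint as a string, or nil when the fence has no hint
# or the line is not an opening fence at all.
def fence_language(line)
  match = FENCE_OPEN.match(line.chomp)
  return nil unless match

  lang = match[:lang]
  lang.empty? ? nil : lang
end

fence = '`' * 3                  # built at runtime to keep this sample easy to paste
fence_language("#{fence}ruby")   # => "ruby"
fence_language(fence)            # => nil
fence_language('puts 1')         # => nil
```

A processor that gets nil back would just emit a plain code block without highlighting.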

Strikethrough

Strikethrough is an added feature to Markdown. It is written by wrapping a word (or words) in double tildes like so:

~~Something outdated~~

Tables

Tables are an added feature to Markdown. They are written in plain text much as they look once rendered in HTML.

| Header 1 | Header 2 | Header 3 |
|----------|----------|----------|
| Foo      | Bar      | Baz      |

This outputs an HTML table that looks very similar. Although this syntax is convenient for simple tables, I find it inconvenient when editing tables because it’s hard to maintain an even cell width in plain text. Fortunately, even widths are optional: the following renders the same as above (it just makes your Markdown look uglier):

| Header 1      | Header 2|Header 3 |
|-------|----|----------|
|Foo|Bar|Baz|
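
If the uneven version bothers you, realigning a table is easy to script. The following is a minimal sketch, not part of any of the gems: split_row and align_table are hypothetical helpers, and they assume the second row is the dash separator, that no cell contains an escaped pipe, and that you don’t need alignment colons preserved.

```ruby
# Splits one pipe-table row into stripped cell strings.
def split_row(line)
  line.strip.sub(/\A\|/, '').sub(/\|\z/, '').split('|', -1).map(&:strip)
end

# Re-pads every cell so the pipes line up, and rebuilds the dash separator.
def align_table(text)
  rows = text.lines.map(&:strip).reject(&:empty?).map { |line| split_row(line) }
  header, _separator, *body = rows

  # Each column is as wide as its longest cell.
  widths = Array.new(header.size) do |i|
    [header, *body].map { |row| row[i].to_s.length }.max
  end

  format_row = lambda do |row|
    cells = widths.each_with_index.map { |width, i| row[i].to_s.ljust(width) }
    "| #{cells.join(' | ')} |"
  end
  rule = "|#{widths.map { |width| '-' * (width + 2) }.join('|')}|"

  [format_row.call(header), rule, *body.map(&format_row)].join("\n")
end

messy = <<~TABLE
  | Header 1      | Header 2|Header 3 |
  |-------|----|----------|
  |Foo|Bar|Baz|
TABLE

puts align_table(messy)
```

Run against the ugly table above, this prints the evenly padded version from the first example.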

Header IDs

This feature allows for HTML anchors (e.g. http://something.com/document.html#id-ref) in Markdown generated documents. It is done automatically in headings and subheadings. Each implementation of Markdown generates these differently though, so it’s best to check documentation on this.
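
To give a feel for what such an ID looks like, here is one plausible way to derive it from the heading text. This is an illustration only: header_id is a made-up helper, it is not the algorithm of any of the four gems, and it simply drops non-ASCII characters.

```ruby
# Turns heading text into a lowercase, hyphen-separated anchor ID.
def header_id(text)
  text.downcase
      .gsub(/[^a-z0-9\s-]/, '') # drop punctuation and anything non-ASCII
      .strip
      .gsub(/\s+/, '-')         # runs of whitespace become a single hyphen
end

header_id('Advanced Redcarpet Usage') # => "advanced-redcarpet-usage"
header_id('A Comparison')             # => "a-comparison"
```

With an ID like that, the second heading could be linked as document.html#a-comparison.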

Typographic Substitution (“SmartyPants” Style)

It turns out that Markdown users also love another Daring Fireball project called SmartyPants. This program converts plain ASCII punctuation into “smart” typographic HTML output. For example:

"Ruby", the programming language.
Becomes: “Ruby”, the programming language.

Other typographic substitutions are also performed, such as --- to — (em-dash) and -- to – (en-dash).
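
The idea can be approximated in a few lines of Ruby. This is a deliberately naive sketch and not how SmartyPants itself works: the smarten helper is hypothetical, it only handles the three substitutions mentioned above, it emits Unicode characters directly, and it ignores edge cases such as apostrophes or punctuation inside code.

```ruby
# A toy version of "smart" punctuation covering quotes and dashes only.
def smarten(text)
  text
    .gsub(/"([^"]*)"/, "\u201C\\1\u201D") # "quoted" becomes curly double quotes
    .gsub('---', "\u2014")                # three hyphens become an em-dash
    .gsub('--', "\u2013")                 # two hyphens become an en-dash
end

puts smarten('"Ruby", the programming language.')
puts smarten('pages 10--20 --- roughly')
```

Note that the three-hyphen rule has to run before the two-hyphen rule, otherwise every em-dash would be consumed as an en-dash plus a stray hyphen.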

A Comparison

Let’s take a look at support for those features in our Ruby Markdown implementations.

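| Processor | Fenced Code | Strikethrough | Tables | Header IDs | Typographic Substitution |
|-----------|-------------|---------------|--------|------------|--------------------------|
| Kramdown  | Yes         | No            | Yes    | Yes        | Yes                      |
| Maruku    | Yes         | No            | Yes    | Yes        | Yes                      |
| Redcarpet | Yes         | Yes           | Yes    | Yes        | Yes                      |
| RDiscount | Yes         | Yes           | Yes    | No         | Yes                      |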

From this data, it’s pretty obvious that Redcarpet is the way to go. Not only is Redcarpet extensible, but it’s also amazingly fast (we’ll get into this next).

Benchmarks

I decided to test the speed of each Markdown processor by benchmarking how fast they process this file. It turns out that doing this is surprisingly easy in Ruby using the Benchmark module. Here’s how I set it up:

require 'benchmark'

markdown = File.read('TestDoc.md')

Benchmark.bm(15) do |x|
  x.report('Kramdown') {
    require 'kramdown'
    Kramdown::Document.new(markdown.dup).to_html
  }

  x.report('Maruku') {
    require 'maruku'
    Maruku.new(markdown.dup).to_html
  }

  x.report('RDiscount') {
    require 'rdiscount'
    RDiscount.new(markdown.dup).to_html
  }

  x.report('RedCarpet') {
    require 'redcarpet'
    Redcarpet::Markdown.new(Redcarpet::Render::HTML.new).render(markdown.dup)
  }
end

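As there are variations in execution time, I ran this benchmark 5 times (on a late 2013 MacBook Pro: 2.4 GHz Intel Core i5 running Ruby 2.2.0) and averaged the results. Note that each require sits inside its timed block, so every figure also includes the one-time cost of loading that library: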

| Processor | Avg. Time (in seconds) |
|-----------|------------------------|
| Kramdown  | 0.1054152              |
| Maruku    | 0.1226444              |
| RDiscount | 0.0131436              |
| Redcarpet | 0.007233               |
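
The averaging itself doesn’t have to be done by hand. Here is a small sketch of one way to automate it with Benchmark.realtime; the average_time helper is an assumption for this example, and the workload is a stand-in so the snippet runs without any of the Markdown gems installed.

```ruby
require 'benchmark'

# Runs the given block `runs` times and returns the mean wall-clock time in seconds.
def average_time(runs = 5)
  times = Array.new(runs) { Benchmark.realtime { yield } }
  times.reduce(:+) / runs
end

# Stand-in workload; swap in a real Markdown call when measuring a processor.
text = "# Title\n\nSome *sample* text.\n" * 1_000
average = average_time(5) { text.upcase }
puts format('%.7f seconds', average)
```

To measure a processor, replace the stand-in block with something like Kramdown::Document.new(markdown.dup).to_html, and load the library before the timed block if you only want to measure rendering.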

Those numbers are cool, but to really get the impact we need a graph (the shorter the bar, the faster the processor):

Wow. It’s obvious that Redcarpet is really fast, with RDiscount coming in at a close second. Given Redcarpet’s features, I highly recommend it.

Let’s put these numbers into perspective. This next table shows the number of documents that Redcarpet can process in the time that it takes each of the other processors to process one.

| Processor | Number of Documents Processed |
|-----------|-------------------------------|
| Maruku    | 17 documents                  |
| Kramdown  | 15 documents                  |
| RDiscount | 2 documents                   |

So yeah, Redcarpet is pretty freaking fast.
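
For the curious, those document counts are just each processor’s average time divided by Redcarpet’s, rounded to the nearest whole number. A short sketch of that arithmetic, using the averages reported above (the variable names are mine):

```ruby
# Average times in seconds, copied from the benchmark results.
AVERAGES = {
  'Kramdown'  => 0.1054152,
  'Maruku'    => 0.1226444,
  'RDiscount' => 0.0131436,
  'Redcarpet' => 0.007233
}.freeze

baseline = AVERAGES.fetch('Redcarpet')

# How many documents Redcarpet renders while each other processor renders one.
ratios = AVERAGES.reject { |name, _| name == 'Redcarpet' }
                 .map { |name, seconds| [name, (seconds / baseline).round] }
                 .sort_by { |_, ratio| -ratio }
                 .to_h

ratios.each { |name, ratio| puts "#{name}: #{ratio} documents" }
```

Keep the rounding in mind when reading the table: RDiscount’s unrounded ratio is about 1.8, so it is closer to Redcarpet than “2 documents” suggests.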

Advanced Redcarpet Usage

Not only is Redcarpet fast, but it’s also very extensible. Let’s see if we can make the coolest Redcarpet setup ever.

The first thing we have to decide is what extra features we want from our Markdown. Let’s make a list of what we want: (you can reference these features here)

  • Tables (:tables)
  • Fenced Code Blocks (:fenced_code_blocks)
  • Autolinking (:autolink)
  • Strikethrough (:strikethrough)

Awesome. Next we need to get a little meta with our HTML and decide what attributes we want it to have. These options will be put into the renderer itself, rather than the Markdown processor. This is just a part of Redcarpet’s amazing extensibility. However, most of these options are safety features (like :safe_links_only and :filter_html), so we don’t actually need to use them in our case, but it’s good to know about them.

So using our list, let’s create our Markdown processor!

require 'redcarpet'

# our markdown extensions
md_options = {
  tables: true,
  fenced_code_blocks: true,
  autolink: true,
  strikethrough: true
}

# our markdown processor
processor = Redcarpet::Markdown.new(Redcarpet::Render::HTML, md_options)

Now we can parse our Markdown with a simple processor.render(some_markdown_string).

Conclusion

Markdown is a very useful markup language. Even this article is written in Markdown.

As its syntax becomes more and more ubiquitous, so will the ability to parse it in various languages. When it comes to performance and extensibility, Redcarpet can’t be beat, but don’t let that stop you from trying out the other Ruby Markdown processors as well. Happy rendering!
