I grew up using word processors. By all rights, we should be old friends.
My first word processor was Final Writer on the Amiga. I used it to type my school assignments all through grade school. It was a great tool, and easy enough that even a grade schooler could figure it out. Back then, I was happy just to be able to type instead of handwriting everything. I’d print my documents out on our Epson dot matrix printer, and, later, on our HP inkjet. What I saw on-screen was, more or less, what I got from the printer; I never thought much of this because it seemed like the obvious way things ought to work. Without the web, I likely never would have realized that it’s a fundamentally broken way to work.
From Final Writer to ClarisWorks to Microsoft Word, somehow the same little frustrations carried over. Make a word bold, clear the bold setting before entering the next word, make a typo, erase, find that the text is bold again. Repeatedly change formatting for headers and other elements, then change it back for regular text. Never, ever figure out the logic behind the little fiddly bits in the ruler that control indentation according to some incomprehensible arcane formula.
Writing for the web is different. On the web, you author pages in plain text, and what you see is most certainly not what you get. Printers have nothing to do with it; content is paged according to its needs instead of the limited physical space of a piece of paper. Most importantly, web designers who know their stuff create semantically structured documents that separate their content from its presentation.
If you’re not a web designer, the words “semantically structured documents” probably mean bupkis to you. Let me explain.
When you want to create a header in a traditional word processor, what do you do? Font size 16, bold, maybe use a different font? What you’ve done is make the text look like a header without actually calling it a header. Your WYSIWYG document has no structure; it’s just styled text. (Named styles kinda-sorta structure WYSIWYG documents, but they only go halfway. They have the form of semantic structuring, but they deny the power thereof.)
Semantic structuring takes the opposite approach. Instead of applying a style to create a header, you apply a bit of markup that names it a header. Each bit of text is given a name that describes what it is: headers are marked as headers, emphasized text is marked as emphasized, and so on. Writing this way frees you from having to deal with the vagaries of formatting while writing. It’s incredibly liberating.
It’s also deceptively powerful. Write your documents with semantic structure, and you’ll have a source that can be automatically transformed into any sort of output format you desire. Web designers exploit this property to make web sites easier to maintain, but the concept need not be limited to web pages. After drinking the sweet semantic Kool-Aid, I found myself wanting to write everything using semantic structure.
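To make that concrete, here is a minimal Python sketch of one semantically marked-up source rendered to two different outputs. The `Block` type, the element names, and both renderers are hypothetical illustrations made up for this example; they are not part of any tool discussed in this post.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Block:
    kind: str  # what the text *is*: "header" or "paragraph" (hypothetical names)
    text: str


def to_html(doc):
    """Render each block with the HTML tag that matches its meaning."""
    tags = {"header": "h1", "paragraph": "p"}
    return "\n".join(
        f"<{tags[b.kind]}>{escape(b.text)}</{tags[b.kind]}>" for b in doc
    )


def to_plain(doc):
    """Render the same source as plain text, underlining headers."""
    lines = []
    for b in doc:
        lines.append(b.text)
        if b.kind == "header":
            lines.append("=" * len(b.text))
    return "\n".join(lines)


# One source, two outputs: the document itself never mentions fonts or sizes.
doc = [Block("header", "Chapter One"), Block("paragraph", "It was a dark night.")]
print(to_html(doc))
print(to_plain(doc))
```

The point of the design is that `doc` only says what each piece of text is. Changing how headers look means editing a renderer once, not revisiting every header in the document.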
So, that’s that. We had a good run, WYSIWYG and I, but now I fear we’re breaking up, and I’ll be happy if I never have to use another word processor ever again.
Tools for Semantic Writing
When I went shopping for a semantic writing toolchain, I had a few basic requirements in mind. Whatever I settled on had to have the following four characteristics:
- Simple to author, clear to read in source form. As I would mostly be creating simple documents, I didn’t need a lot of advanced features and the complexity they bring.
- Works well with Git. I recognize that this is not a common requirement for a writer to have, but I’m also a programmer and am thus addicted to version control.
- Has tools to transform into a number of output formats, including pretty PDF and RTF in standard manuscript format.
- Elegant to use. Not too much effort to set up if I could help it.
I evaluated HTML, Markdown, LaTeX, and dedicated applications such as Ulysses against these four criteria, and this is what I found.
My First Markup Language: HTML
I’ll be honest: I didn’t spend long thinking I might use HTML for regular writing. While it is a powerful markup language that I know very well, it’s not really designed for writing things other than web pages. It would be unnecessarily cumbersome to write (for example) fiction using HTML, and the tools that convert from HTML to RTF are, from what little I’ve seen, not very good at it.
HTML is still a part of my life. It was my first love in semantic writing, but it couldn’t be my last.
The Mother of All Markup Languages: LaTeX
As far as I know, LaTeX was among the first widely used systems to separate presentation from content. It consists of a markup language and a toolchain that converts said language into a number of pretty output formats. It’s frequently used in the publishing industry as a basis for typesetting books, and is the standard for writing academic papers and books. These attributes made LaTeX look very promising to me.
That is, until I started to learn it. I quickly discovered that, while I’m sure it’s a great tool for a lot of people, it didn’t fit well with my needs.
- LaTeX is old. It’s crufty. It has a lot of features that are obsoleted by modern character encodings, for example.
- LaTeX is complex. It has a lot of features and does a lot of things I didn’t need it to. As a result of this, it was often hard to know which markup construct to use for my specific use case.
- LaTeX is kind of hard to style. Its style tags are obtuse and complicated.
- Even after all this time, the toolchain still has that “duct tape and baling wire” feel common to all too many open source products.
Ultimately, I found LaTeX to be overkill, and not really suited to my needs.
Mark Down, Not Up
Markdown is a simple markup language created by John Gruber of Daring Fireball. It was designed to be natural to read and write, and is typically compiled down to HTML. I love Markdown; it’s great for taking notes and writing simple documents, not to mention its original purpose (blogging). In fact, I’m writing this blog post in Markdown right now.
I’ve been doing most of my (non-coding) schoolwork in Markdown for the past year and a half, and when I take notes, I use Markdown. I’ve even written three short stories using the format. It is very clean, and has a small, focused feature set (if you need more, you can simply drop into HTML). But it doesn’t scale well to large documents, like novels.
MultiMarkdown is an extension to Markdown created by Fletcher Penney. It was designed to address the large document issue, and also to broaden the range of Markdown’s output formats. I think MultiMarkdown is a great concept, and can recommend it to anyone looking to get into semantic writing. I almost settled on it myself.
What stopped me were the tools. They felt immature and cobbled together. Yes, they are usable, but they simply can’t compete with the elegance and ease of use provided by a dedicated application. If said applications didn’t exist, this is where this post would end, but since they do, we soldier on.
Ulysses the Epic
Yes, there is more than one dedicated semantic writing app for the Mac, but since Ulysses is the one I settled on and most heartily recommend, I will omit mention of the others.
Ulysses brings semantic writing, powerful exporting, and project management together in a capable, more-or-less elegant package. Its simple markup is designed specifically for the needs of fiction writers, and is customizable. It outputs a large number of formats, including RTF, LaTeX, PDF, and (soon) HTML. It provides fine-grained control over styling of output. It is a focused, purpose-built tool, and it is very good at what it does.
Within a project, the actual writing is saved in any number of documents, which can be tagged, reordered, annotated with notes, and otherwise managed. This setup is great for writing really big projects, like novels. (Yes, I’m crazy enough to have started writing a novel. Yes, this fact is what prompted me to really nail down my semantic writing toolchain. And yes, I know my novel is going to suck. Shut up. It’s a first try, and I’m writing it for fun.)
That said, Ulysses is not perfect. Its interface is not the best-designed in the world. In particular, I find its implementation of document groups and filters unusually removed from the documents that they group and filter. There may be a good reason for this that I’ve yet to discover. And do we really need three different types of notes?
Also, of all the tools I’ve explored, Ulysses is the only one that costs money. In fact, it’s a bit on the pricy side, though the educational discount helps. Still, at 35 EUR (around 45 USD at time of purchase) for an educational license, I had to think long and hard about whether it was worth the money, or if I should just stick with MultiMarkdown. That I opted to spend the money shows that I’m a sucker for good tools.
Lastly, some notes about version control. Those not interested in using a programmer’s tools for writing can skip this.
Ulysses projects are standard Mac OS bundles. Inside, they contain a few plists that manage mundane stuff like UI state and preferences, and a series of folders that contain each individual document in the project. Each document has an Info.plist, a folder for notes, a text file that stores an excerpt (not sure what I’d use that feature for), and triply-redundant storage for the document text itself. There is a plain text file, an RTF file, and a binary .textarchive, each of which contains the full text of the document. I have absolutely no idea why they chose to store documents this way.
The bad news is that the binary textarchive is the “canonical” file. It’s the one that gets loaded when a document is opened, and if the text/rtf files don’t agree with the textarchive, they are simply replaced. The good news is that the only information about the documents that is managed in the central plist is a list of all documents in the project. Furthermore, this list uses folder names (a monotonically increasing numerical index), and not the names you assign the documents through Ulysses’ UI.
This means that you can use version control to manage Ulysses projects, but you do have to step carefully. First, if you always add all new files and commit all changes, even to the plists, you should be able to switch between versions of the entire project with impunity. Second, you can revert individual documents to previous revisions as long as you, again, make sure your commits include the entire contents of the document’s subfolder.
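For the programmers still reading, those two rules might be sketched in Python roughly like this. It assumes the Git repository lives at the root of the project bundle; the helper names, the commit message, and the document folder name `12` are hypothetical. The sketch only prints the Git commands, and the `run` helper is there if you want to execute them for real.

```python
import subprocess


def commit_all_commands(message):
    """Commands that stage every change (new files, deletions, plists) and commit."""
    return [["git", "add", "-A"], ["git", "commit", "-m", message]]


def revert_document_commands(revision, doc_folder):
    """Command that restores one document's whole subfolder from an earlier revision."""
    return [["git", "checkout", revision, "--", doc_folder]]


def run(commands, project_path):
    """Run each command inside the project bundle, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, cwd=project_path, check=True)


# Print rather than execute, so the sketch is safe to try anywhere.
planned = commit_all_commands("Draft chapter 3") + revert_document_commands("HEAD~2", "12")
for cmd in planned:
    print(" ".join(cmd))
```

One caveat: `git checkout <revision> -- <folder>` restores the files that existed in that revision, but it does not delete files added to the folder since then, so check the folder afterwards.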
Lastly, absolutely no merging of any kind. I’m not too concerned about that, anyway; I don’t anticipate needing it for my fiction.
Oh, and Ulysses keeps a redundant copy of each open document within the project. These open document folders are prefixed with an “o”, and persist even when the project is closed. From the perspective of version control, these are transient noise that need not be versioned. However, if you remove them without also editing the plist, Ulysses will make you save your project to a different file when you open it again, which could get annoying. At the moment, I think the best way to deal with this is simply to close all open documents before making a commit.
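If you would rather not trust your memory, a small check before each commit can look for leftover open-document folders. This is a minimal sketch that assumes those folders sit directly inside the project bundle and that nothing else in there has a name starting with “o”; I have not verified either assumption beyond what I described above, and the function name is my own invention.

```python
import tempfile
from pathlib import Path


def open_document_folders(project_path):
    """Return names of subfolders that look like open-document copies.

    Assumption: every such folder's name starts with "o", and no other
    folder directly inside the bundle does.
    """
    root = Path(project_path)
    return sorted(
        p.name for p in root.iterdir() if p.is_dir() and p.name.startswith("o")
    )


# Demo against a throwaway directory standing in for a project bundle.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("1", "2", "o2"):
        (Path(tmp) / name).mkdir()
    leftovers = open_document_folders(tmp)

# A non-empty list means: close your documents in Ulysses before committing.
print(leftovers or "safe to commit")
```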
Semantic writing has made my writing life easier in a multitude of ways. If you, like me, find yourself dissatisfied with your word processor, you should give semantic writing a try. It might just sour you on WYSIWYG forever.