DocBook Experiment
I have recently been doing some writing in DocBook. It was certainly refreshing to write in an environment in which I did not need to worry about formatting while I wrote. With DocBook it is easy to have consistent formatting throughout a document, something that I find incredibly time-consuming and nearly impossible with Microsoft Word’s style mechanism. I have always been frustrated with WYSIWYG word processors, since one of my first experiences writing a large document involved using IBM Script, a flavour of GML, which is an ancestor of XML.
XML shines as a descriptive markup of text. It is easy to read, and it lets you structure your document. But I feel that a simple XML editor (I use nxml-mode in Emacs) leaves a lot to be desired for writing text. I would like spatial rather than syntactical cues as to the structure of the document (e.g. the section level could be indicated by indentation rather than by counting nested <section> elements, and paragraphs by a blank line rather than <para> elements). It would be really nice to find an editor that lets me focus on the structure of the document visually rather than on the syntax (XML+DocBook) or the formatting (Word, Open Office). One only needs to look at the typical coding standards at most organizations and at most blogging tools (e.g. WordPress turns a line break into a paragraph in HTML) to see the importance of using white space as a visual clue to syntax.
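To make the point concrete, here is a minimal sketch of two levels of section nesting in DocBook. The titles and text are invented for illustration, and I have left out the DOCTYPE declaration that a DocBook 4 document would normally carry.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative fragment only; titles and text are made up. -->
<article>
  <title>Example Article</title>
  <section>
    <title>Outer section</title>
    <para>Every paragraph needs an explicit open and close tag.</para>
    <section>
      <title>Inner section</title>
      <!-- The nesting depth, not the indentation, defines the section level. -->
      <para>Another paragraph, one level deeper.</para>
    </section>
  </section>
</article>
```

The indentation here is purely cosmetic. The processor only sees the nested elements, which is exactly why I end up counting tags.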
What I really miss from Word when editing DocBook is the spelling and grammar check. The visual clues for spelling mistakes and grammar errors in Word are quite helpful. Spell-checking XML in Emacs is possible (I’ll leave it as an exercise for the reader to figure out) but not that great an experience, especially if you write in a dialect of English other than American. I haven’t found a way to perform a grammar check yet.
Cobbling together a system to produce PDF output from DocBook isn’t too hard if you are capable of the usual open source yak-shaving routine. I cobbled together FOP, the DocBook XSL stylesheets, and xsltproc (though I plan to switch to 4xslt, the 4Suite XSLT processor) with make (to be replaced with SCons). I am currently living with the default fonts FOP handles. It’s a bit more yak-shaving to get more fonts working.
There is a steep learning curve before you can change the way a printed document looks. Some simple changes are made with arguments to the DocBook XSLT stylesheets; for more complicated ones you have to override templates in the DocBook styling. That basically means searching through the source, finding the template that produces the output you want to change, pasting it into a local stylesheet, and editing it there. That isn’t too hard if you are a programmer. I am familiar with XSLT and had little trouble there. I feel that XPath predicates are a nice way to select the elements to which styling applies, and that most people could either get that part of it or operate a user interface that helps them construct equivalent rules. Learning XSL-FO and the construction of the DocBook XSLT stylesheets is another story. The syntactical clutter of XML in XSLT and XSL-FO sometimes makes them hard for people to wrap their heads around. I really like XSLT as an elegant and time-efficient way to develop XML translations, but I am reminded both of how easy Haskell and Python are to read and of Koranteng Ofosu-Amaah’s comments on Lisp (Lisp’s s-expressions have similarities to programming structures expressed in XML):
I would hazard here that the largest impediment to the widespread adoption of the elegant programming model of Lisp is not that something like recursion is difficult to understand but rather the dissonance that the proliferation of parentheses can cause when Jane Programmer scans a listing in an editor.
It is interesting how XML and SGML dialects like HTML and DocBook used as descriptive markup of text are very comprehensible to people (as compared with s-expressions) yet programming languages like XSLT are not.
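For anyone who has not seen one, the local stylesheet I keep talking about (usually called a customization layer) looks roughly like the sketch below. It assumes the non-namespaced DocBook 4 stylesheets; the import URL is the canonical SourceForge one and may need to point at a local copy instead. The role="important" attribute and the specific values are my own illustration, and the override is written from scratch rather than copied from the stock templates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Pull in the stock FO stylesheets; adjust the path to your installation. -->
  <xsl:import href="http://docbook.sourceforge.net/release/xsl/current/fo/docbook.xsl"/>

  <!-- Simple changes: stylesheet parameters (values are illustrative). -->
  <xsl:param name="paper.type">A4</xsl:param>
  <xsl:param name="body.font.master">11</xsl:param>

  <!-- Attribute sets adjust the FO properties of a whole class of elements. -->
  <xsl:attribute-set name="section.title.level1.properties">
    <xsl:attribute name="font-size">18pt</xsl:attribute>
  </xsl:attribute-set>

  <!-- Template override: the XPath predicate selects only paragraphs
       that carry the hypothetical role="important" attribute. -->
  <xsl:template match="para[@role='important']">
    <fo:block font-weight="bold" space-before="0.5em" space-after="0.5em">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

Because the stock stylesheets are imported rather than included, anything defined in this file takes precedence over them, so the file only needs to contain the differences.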
XSL-FO is a big spec, so once you figure out which template in the DocBook XSLT needs changing, you need to figure out what to change it to. XSL-FO seems to be a box model similar to CSS, but a lot more complicated and harder to learn. With some tenacity I was able to change a few simple things like the appearance of lists and tables. Expect to spend some time. This is the hardest part to deal with. I certainly spent more time getting my document formatted than I would have with Word.
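To give a feel for the target format, here is a minimal hand-written XSL-FO document. It is a sketch, not actual DocBook stylesheet output; the master name, page dimensions, and text are arbitrary. The border and padding on the block are the CSS-like box properties I mentioned.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <!-- Page geometry: one A4 page master with 20mm margins. -->
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page"
        page-height="297mm" page-width="210mm"
        margin-top="20mm" margin-bottom="20mm"
        margin-left="20mm" margin-right="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>

  <!-- Content flows into the body region of pages built from that master. -->
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block border="1pt solid black" padding="4pt" space-after="6pt">
        A paragraph rendered as a bordered box.
      </fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```

Even this much ceremony is needed before a single line of text appears on a page, which goes some way to explaining the learning curve.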
Easy things in DocBook are cross references and image inclusion. It takes a bit of hunting through examples to figure these out the first time, but overall it’s a lot easier than using Word.
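As a rough sketch of what those two look like, assuming DocBook 4 (DocBook 5 uses xml:id instead of id): the ids, titles, and image path below are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<article>
  <title>Cross references and images</title>
  <section id="sec-toolchain">
    <title>Toolchain</title>
    <!-- xref generates its own link text from the target, e.g. a figure number. -->
    <para>The build is summarised in <xref linkend="fig-pipeline"/>.</para>
    <figure id="fig-pipeline">
      <title>Build pipeline</title>
      <mediaobject>
        <imageobject>
          <!-- Hypothetical path, relative to the document. -->
          <imagedata fileref="images/pipeline.png" format="PNG"/>
        </imageobject>
      </mediaobject>
    </figure>
  </section>
  <section id="sec-fonts">
    <title>Fonts</title>
    <para>See <xref linkend="sec-toolchain"/> for the tools involved.</para>
  </section>
</article>
```

The nice part is that the generated text of each reference is recomputed on every build, so renumbering figures or retitling sections never leaves a stale reference behind.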
A nice thing about DocBook is that a template really is a template: you can apply it to other documents or edit the current document and still have consistent formatting. I envision a need for a lot of documents that look the same, have a layout that is nice and possibly more elaborate than I would normally attempt in Word, and have parts of the document generated automatically. That is my motivation for learning how to be productive in DocBook. I believe I can be more productive in Emacs and DocBook than in Word or Open Office (which is sad - I’d rather have a word processor).
I really do believe a word processor will come along (with every new release announcement of Word I think “this will be the one”) that:
- easily allows for a consistent formatting within a document and among documents that share the same template (i.e. a simple edit or user mistake won’t result in the wrong formatting being applied)
- will provide the advanced formatting afforded by XSL-FO without programming
- will provide WYSIWYG-like editing without leading the writer astray
- will separate structure from formatting
- will gain market share with either a price tag between $0 and $99 and some clever promotion, or the trade name “Microsoft Word”
- will save its users a ton of time compared to Word and Open Office
