
The benefits of scripting

So, I'm converting my website over to Drupal. This involves uploading all the Usenet articles. On the old website, there was -- more or less -- one page per newsgroup. That's about 16 pages. In Drupal, it seems to make sense to make each article a separate node. I've just counted, and that's nearly 700 nodes.

Blissfully unaware of how much work lay ahead of me, I started uploading them "by hand". This was relatively quick: each article required only about 8 mouse clicks, plus some scrolling, plus some switching between windows, to copy & paste it (from its reST source, converted with pandoc) into Drupal.

I managed to get 138 articles uploaded over a couple of days. This was done as a background task in odd moments, so that probably represents only 2-3 hours' work. Nevertheless, it was becoming clear that my aim of "get this finished by the end of the week" would elude me.

What I needed was a command line drupal-add program. Automating web-based interaction is a touch awkward, of course. I swim well in SQL, so I looked briefly at the possibility of writing straight to the Drupal database. But it's a hairy beast, and the chances of getting this to work in a sensible time scale seemed slim. On reflection, the right way to do it would be a PHP script that imports all the Drupal modules. But I'm only slightly acquainted with PHP, and even less with Drupal internals; and to be quite frank, this option hadn't occurred to me at the time.

So I settled on the route of a shell script driving Drupal through its usual web interface, with the marvellous wget doing most of the hard work.
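For readers who want a feel for what "driving Drupal through its web interface" involves, here is a rough sketch of the form-handling half of the job. It is written in Python rather than shell and is not the drupal-add script itself. The idea is that a web form usually carries hidden inputs (a form token, for instance) that have to be echoed back along with the visible fields. The field names `title`, `body` and `op`, and the button label `Submit`, are assumptions for illustration; the real names depend on the Drupal version and content type. Fetching the form page and sending the result (the part wget did for me, cookies and all) is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute values may be None (e.g. a bare attribute), so guard them.
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(form_html):
    """Return the hidden inputs of a form page as a dict."""
    parser = HiddenFieldParser()
    parser.feed(form_html)
    return parser.fields


def build_post_body(form_html, title, body):
    """Build the URL-encoded POST body for submitting one article.

    The hidden fields are echoed back unchanged; the visible field
    names below are hypothetical and must be checked against the
    actual form.
    """
    fields = hidden_fields(form_html)
    fields["title"] = title   # assumed field name
    fields["body"] = body     # assumed field name
    fields["op"] = "Submit"   # assumed submit-button label
    return urlencode(fields).encode("utf-8")
```

With the form page saved to a file, `build_post_body(page, "Some title", article_text)` gives the bytes to send back as the POST data, for example via wget's `--post-file` option.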

It took me about half a day to get the drupal-add script working to my satisfaction. And then less than an hour to get the remaining 554 articles uploaded. Clearly an enormous win! What's more, although drupal-add as it stands is geared towards that particular task, it won't take me much more work to turn it into a generalized Drupal-uploader, which could be very useful. (Isn't there such a thing already? A brief look didn't turn up anything. For this fairly simple task, it's not clear that it would actually be any quicker to find somebody else's script, learn how to use it, perhaps modify it. And rolling my own also feeds into my side goal of understanding Drupal.)

A couple of conclusions. First, it's sad that the vast majority of people who use computers would be completely unable to automate a task like this. How to unleash (some of) the power of programming, without having to teach everybody how to program, remains a challenge.

Secondly, command line interfaces are vital. So far as I can tell, Drupal doesn't have one. It was easy to come up with a simplistic, inefficient, incomplete one. It would be only somewhat harder for the Drupal developers to include a full-featured CLI.

And why does a web-based content management system need a command-line interface? Well, just sometimes, you might need to upload 692 articles and not have several weeks to do it in!