I’ve just imported the blog content out of the SnipSnap wiki I’ve been running for almost 3 years. It wasn’t the cleanest process, and I’ve got some stray links to dead content, but I’m happy enough with the results. If there’s a better way, I wasn’t patient enough to find it. I’ll look through the rest of the wiki for any non-blog pages later and pull that handful in manually if it’s worth it.
SnipSnap outputs RSS, and WordPress imports RSS. Sounds easy, but it’s more involved than you might expect. Two issues: SnipSnap only outputs the last 10 items, and the RSS it outputs is pre-encoded with some SnipSnap specifics. Sadly, there’s no way to change that in configuration, so some Java source editing is required.
Grab one of the latest source projects from http://snipforge.org/download/. I used snipsnap-1.0b3-uttoxeter-20060208-src.
I was too impatient to track down exactly the right source to change, so I found several relevant places and raised the count from 10 to 10000, enough to cover every post in the wiki.
SnipSnap source files where I changed that number:
- RecentlySnipChangedFeeder.java
  - List changed = space.getChanged(10);
- Rssify.java
  - while (iterator.hasNext() && result.size() <= 10)
  - if (list.size() < 10)
- BlogImpl.java
  - return Rssify.rssify(getPosts(10));
- SnipImpl.java
  - .getChildrenDateOrder((Snip) Aspects.getThis(), 10)
  - .getChildrenModifiedOrder((Snip) Aspects.getThis(), 10)
Rebuild the SnipSnap project with Ant and install your new app. I exported from my existing SnipSnap app and imported into the new one. Once that completed, I could download and save /snipsnap/exec/rss?type=rss, the full RSS feed containing all weblog posts.
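Before going further, it’s worth a quick check that the saved feed really does hold more than 10 items. Here’s a minimal Python sketch of one way to do that. It counts items with a regular expression rather than an XML parser, since the feed may not be valid XML at this point. The function name count_items is my own, not anything from SnipSnap or WordPress.

```python
import re

# Matches the opening tag of an item, with or without attributes.
# The character class keeps it from matching other tags such as <items>.
ITEM_RE = re.compile(r"<item[\s>]")


def count_items(rss_text):
    """Return the number of <item> opening tags found in the feed text."""
    return len(ITEM_RE.findall(rss_text))
```

To use it, read the saved feed into a string (the file name is whatever you saved the download as) and pass it to count_items. If the result is still 10, one of the hard-coded limits was probably missed.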
Now some minor file scrubbing remains. WordPress requires valid XML, and it also appears to expect the item/content:encoded element to contain your HTML markup wrapped in CDATA. What exists instead is HTML-encoded, with no CDATA. There are also some SnipSnap-inserted images and formatting markup. I manually found and replaced strings with BBEdit, but a series of regular expressions would have worked just as well, and would certainly have earned more geek points.
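For the regular-expression route, here’s a rough Python sketch of the CDATA part of the scrubbing. It is not what I actually ran (I used BBEdit), and it assumes each content:encoded element holds entity-encoded HTML with no attributes on the tag. The names scrub_feed and wrap_in_cdata are made up for illustration. It does not touch the SnipSnap-inserted images or formatting markup, since those patterns depend on your own feed.

```python
import html
import re

# Matches each content:encoded element; DOTALL lets the body span lines.
CONTENT_RE = re.compile(
    r"(<content:encoded>)(.*?)(</content:encoded>)", re.DOTALL
)


def wrap_in_cdata(match):
    """Decode the entity-encoded body and wrap it in a CDATA section."""
    open_tag, body, close_tag = match.groups()
    if body.lstrip().startswith("<![CDATA["):
        return match.group(0)  # already wrapped, leave it untouched
    markup = html.unescape(body)  # e.g. &lt;p&gt; becomes <p>
    # A literal "]]>" would end the CDATA section early, so split it in two.
    markup = markup.replace("]]>", "]]]]><![CDATA[>")
    return f"{open_tag}<![CDATA[{markup}]]>{close_tag}"


def scrub_feed(rss_text):
    """Return the feed text with every content:encoded body CDATA-wrapped."""
    return CONTENT_RE.sub(wrap_in_cdata, rss_text)
```

Read the downloaded feed into a string, pass it through scrub_feed, and write the result to a new file for the importer. Running it twice is harmless, because bodies that already start with a CDATA section are skipped.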
All that’s left is to load the RSS import screen in the WordPress admin, select your downloaded RSS, and let it do its thing. I got a few database errors up front, but it then displayed the list of imported snips, and it looks like everything came through correctly.
Remaining issues: the RSS importer assigns every post to userId 1, so if that’s not you, you’ll have to update it. Also, categories are not assigned, so you have to either address that manually or add the needed category data to the RSS for the WordPress importer to pick up.
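If you go the route of adding category data to the RSS, something like the Python sketch below could do it before import. This is untested against the importer itself. It assumes the importer reads a standard RSS category element inside each item, and it stamps one category name of your choosing on every item that lacks one. The function name add_category is hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Matches one whole item, from its opening tag to its closing tag.
ITEM_BLOCK_RE = re.compile(r"<item[\s>].*?</item>", re.DOTALL)


def add_category(rss_text, category):
    """Append a <category> element to every item that has none."""
    tag = f"<category>{escape(category)}</category>"

    def patch(match):
        item = match.group(0)
        if "<category>" in item:
            return item  # keep items that already carry a category
        # Insert the new element just before the closing </item> tag.
        return item[: -len("</item>")] + tag + "</item>"

    return ITEM_BLOCK_RE.sub(patch, rss_text)
```

This only covers the simple case of one category for the whole import. Assigning different categories per post would need some mapping from post to category that the feed doesn’t provide.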
I’m going to do this at least once more for other sites, so we’ll see if any major issues come up.