Last weeked I found some spam at two of my BlogEngine.NET blogs. It is not the first time, and in the past updating to the latest major solved the issue, but this time I had to switch from 2.9.X to 3.2, and I already suffered a migration from 1.9 to 2.0 that gave quite some headaches. One of my main reasons to go far away from Wordpress was to stop this tiring battle between spammers and new versions, that forced you to update way too frequently or face serious security bugs and spam-holes. Combine that with a general feeling of being tired of big, admin-driven blog engines, and I needed a big change.
My premises were:
- Administration fully optional: even prefered not to have one
- Static site generation: No more security issues, plus FTPing the changes once or twice per month is more than enough considering my posting frequency
- As less setup requirements as possible
- Local testing: Including (for the near future) Windows support without extra effort
- Favor Python 3 over other languages: I want to improve my Python n00b skills and the best way is to use it as much as possible
Reading some articles and checking some static generators, I found Pelican. I peeked at bit at the source code, documentation and plugins, and it looked quite simple. Did some local tests, wrote some posts using existing content and read the documentation, and decided to keep it.
The setup is so easy I got a blog running locally with some base theme in a few minutes. But I had one issue, all my posts are in XMLs (thankfully I had chosen not to use a DB for storage) and Pelican uses Markdown... so I had to transform the data somehow.
I don't do too complex stuff like
<meta> keywords and I set some post tags but not even show them, so in general I just needed a few fields for the pelican post format:
Title: ... Slug: ... Date: ... Tags: ...,...
...CONTENT (which can be directly HTML)...
The XML has a very simple structure, with all the fields and just HTML-encoded the content, some regular expression searches and reverse transformations were enough to port everything. I have plans to migrate another blog, and prefer to be able to reproduce the whole migration any number of times, I built a small script. It only handles basic stuff and doesn't extracts other "basic" fields like authors, and only works with posts (I manually migrated the pages as I had few), but it does what it does well and might be of some use to somebody else.
You can find both my scripts and a small plugin (to limit RSS/Atom syndication feed output to only a certain amount of items, as by default dumps every post ever written) at my GitHub.
I'll probably work on more small plugins in the future, as I have some improvement ideas regarding output content generated, but I can't be happier with the results. A proper example of "the Python way of life", simple yet practical code, easy to setup and use and does it's job without many features.
UPDATE: Added another small script to GitHub to do some post-generation tasks like creating duplicates (as "aliases") and moving or removing certain files from the output folder.