Kartones.Net CS2007 Addon 1.4, Related Posts and a bit of caching

This post is a 3-in-1 because I'd rather not write several smaller posts when the topics are so closely related.

First, I want to announce a new revision of my CS2007 Addon Pack, 1.4.0. Here's what's new:

  • Internal refactoring of the Assembly. I've created a separate Assembly for all Kartones.Net-specific features, so this public one now has fewer classes and weighs a few kilobytes less.
  • Internal refactoring of Twitter-related components. This is still in semi-beta, but it seems to work perfectly except when Twitter is down or too slow (drop me a comment or email if it doesn't). The only downside is that the source now always appears as "web"; I have to investigate that.
    I expect to finish a few more things soon, and there will be small but interesting changes related to it.
  • RelatedPosts Community Server 2007 component. It's a blog feature that is starting to spread and, if done correctly (I didn't check which blog engines they were using, but I've seen multiple terrible examples), it can help engage new readers with your blog.
    My implementation, although not perfect, has the following features:
    • Searching of related posts by tags
    • 5 Results maximum
    • Caching of the results to ease SQL usage on subsequent calls
    • Only appears when viewing a single blog post. Not in a post list, not in RSS... I don't want to waste anybody's time, so it's mainly aimed at web visitors arriving, for example, from Google search results.
      [Image: related posts example]

And now, the third part of the subject: a bit of caching.

As I've just mentioned, the Related Posts component caches the results for 30 minutes. In fact, the final HTML markup is what gets cached, so almost nothing is done server-side apart from checking the cache and retrieving the string containing the markup.
So it's a per-post cache (the keys are "RELATEDPOSTS_{0}", with {0} being the post ID).
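
A minimal sketch of that per-post markup cache, written in Python purely for illustration (the real component is a Community Server .NET control). The `TtlCache` class, the `related_posts_markup` function and the `render` callback are hypothetical names standing in for the server cache and the SQL search plus HTML generation; only the 30-minute lifetime and the key format come from the text above.

```python
import time

CACHE_TTL_SECONDS = 30 * 60  # results are cached for 30 minutes


class TtlCache:
    """Tiny in-memory cache with per-entry expiration (hypothetical stand-in
    for the web server's cache)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value, ttl=CACHE_TTL_SECONDS):
        self._entries[key] = (self._clock() + ttl, value)


def related_posts_markup(cache, post_id, render):
    """Return the cached HTML for a post, rendering it only on a cache miss."""
    key = "RELATEDPOSTS_{0}".format(post_id)
    markup = cache.get(key)
    if markup is None:
        markup = render(post_id)  # expensive path: search + HTML generation
        cache.set(key, markup)
    return markup
```

On a hit, the only work done is one dictionary lookup and one expiration check, which is why this approach is so cheap per request.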

This approach is fast as hell, and if you have tons of visits and tons of RAM you'll be good to go. But is it really reusable?

Not really: each post ID is unique to a given post and isn't used anywhere else, so a cached entry can only ever serve that one post. So I made an internal version for this blog community that manages the cache differently.

First, as part of my CS optimization, I've created a WeblogPostLite class containing just the minimal, critical fields needed to operate with a post, returned by a new WeblogPost.GetLite() method. I plan to use it in multiple places that do heavy operations with post objects (but use just a few common fields of them); for now, it gives me fast, 100% serializable objects.

When I search for related posts, I get (as in the normal component) 5 posts for each category (the generic component lets you choose between most-viewed and most-recent ordering). And here comes the trick: I insert into the cache a List<WeblogPostLite> per blog and per tag (the cache key could be "RELATEDPOSTSLITE{0}").
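
The per-tag idea can be sketched roughly like this, again in Python for illustration only. Everything here is an assumption except the `WeblogPostLite` name, the key prefix and the 5-posts-per-tag limit: the fields of the lite class, the way the blog ID and tag are combined into the key placeholder, the `search_by_tag` callback, and the final cut to 5 merged results are hypothetical. A plain dict stands in for the cache, so expiration is left out.

```python
from dataclasses import dataclass

MAX_RESULTS = 5  # up to 5 posts per tag, and (assumed) 5 merged results


@dataclass(frozen=True)
class WeblogPostLite:
    """Only the few fields needed to render a link (field names are assumed)."""
    post_id: int
    title: str
    url: str


def related_posts(cache, blog_id, post_id, tags, search_by_tag):
    """Collect related posts from per-blog, per-tag cached lists."""
    seen = {post_id}  # never list the post being viewed as related to itself
    related = []
    for tag in tags:
        # Assumed key layout: the placeholder holds "<blog id>_<tag>".
        key = "RELATEDPOSTSLITE{0}".format("%s_%s" % (blog_id, tag))
        posts = cache.get(key)
        if posts is None:
            # Cache miss: run the expensive search once for this blog and tag.
            posts = search_by_tag(blog_id, tag)[:MAX_RESULTS]
            cache[key] = posts
        for post in posts:
            if post.post_id not in seen:
                seen.add(post.post_id)
                related.append(post)
    return related[:MAX_RESULTS]
```

Because the cached lists are keyed by tag rather than by post, a second post sharing a tag with an already visited one is served without touching the database.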

This caching means that for a small number of posts I use more memory (4-5 fields per "lite" post, up to 5 posts per category, easily 5 or 6 categories per real post...). But as soon as visits are spread across multiple posts, the benefits start to appear: any decently used tagging of posts will achieve a high percentage of cache hits. And the amount of memory consumed does not grow linearly with the number of posts (it stabilizes, unless you use an insane number of tags).
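
To make the trade-off concrete, here is a small Python sketch that only counts cache keys under both strategies. The visit data is made up (a single blog, 5 tags per post drawn from a pool of 20), and the function names are hypothetical. It ignores entry size: a per-post entry holds one HTML string, while a per-tag entry holds up to 5 lite posts.

```python
def cache_entry_counts(visits):
    """Count distinct cache keys for per-post vs per-blog-and-tag caching.

    visits: iterable of (blog_id, post_id, tags) for each post viewed.
    """
    per_post_keys = set()
    per_tag_keys = set()
    for blog_id, post_id, tags in visits:
        per_post_keys.add(post_id)  # one entry per visited post
        per_tag_keys.update((blog_id, tag) for tag in tags)  # one per blog+tag
    return len(per_post_keys), len(per_tag_keys)


def sample_visits(post_count, tag_pool=20, tags_per_post=5):
    """Made-up data: one blog, each post tagged with 5 of 20 possible tags."""
    return [
        (1, post_id,
         ["tag%d" % ((post_id + k) % tag_pool) for k in range(tags_per_post)])
        for post_id in range(post_count)
    ]


print(cache_entry_counts(sample_visits(2)))     # (2, 6)
print(cache_entry_counts(sample_visits(1000)))  # (1000, 20)
```

With only two visited posts the per-tag cache holds more entries, but it stops growing once every tag has been seen, while the per-post cache keeps adding one entry per visited post.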

In my opinion, caching itself is not hard to do; the hard part is the strategic decisions about how, what, and when to cache. And as usual, it's really subjective and different in each scenario.

Note: The related posts search is not perfect, and while it yields nice results most of the time, sometimes it doesn't :) I plan to do "something else" more interesting to actually get the best related results, but it needs one important change in the CS core and, most importantly, time (my greatest problem: lack of free time!).

Comments?

Posted by Kartones on 2009-10-04