Five basic concepts of scalability

Disclaimer: I've only been working on high-scalability stuff for a few months, so don't expect super-secret techniques or expert advice. This is just a newbie's basic five-point list, based on what I've learned and read about the subject so far ;)

  1. Don't touch the data layer. No matter what, try to avoid running any SELECT query or reading any XML file more than once. Use ASP.NET page caching, XML caching or data caching (for example memcached, popular in the PHP world, or Microsoft's Velocity). Avoid unnecessary queries: 15 queries per page multiplied by millions of pageviews equals insta-death for a DB-only data layer.
  2. Size matters. Every KB counts when daily pageviews go crazy. Compress JavaScript, CSS, output HTML, images, JSON data, DB table fields... everything you can. Remove all unnecessary data.
  3. Design for redundancy, load balancing and partitioning, and implement failover architectures. Shit happens, no matter how good the system is, so be prepared to apply countermeasures, or at least to fail softly.
  4. Servers are not magical devices with unlimited resources. Even leaving aside coding errors like memory leaks, high-traffic sites need to be as efficient as possible with everything. If you use heavyweight objects here and there, think about refactoring them into smaller ones (or split the most important and frequently used data from the "extra" data, as when normalizing DB schemas).
  5. Design for scalability. Database partitioning, distributed web services, load balancers, archiving... We all hate Twitter's fail whale, but handling millions of tweets per day requires a lot of stuff underneath (and that's just one example, there are many more out there).
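
For point 2, a small Python sketch of what "every KB counts" can look like for a JSON response. The sample `rows` data is invented for the example, and in a real site the web server usually handles gzip for you, so take it as a demonstration of the effect and not as the way to wire it up.

```python
import gzip
import json

# Invented sample payload: 200 similar records, as a list endpoint might return.
rows = [
    {"id": i, "title": f"Post number {i}", "tags": ["scalability", "web"]}
    for i in range(200)
]

# Default serialization puts a space after every comma and colon.
payload = json.dumps(rows).encode("utf-8")

# Step 1: remove unnecessary bytes (here, the whitespace between tokens).
compact = json.dumps(rows, separators=(",", ":")).encode("utf-8")

# Step 2: compress what is left before sending it over the wire.
compressed = gzip.compress(compact, compresslevel=6)

print(len(payload), len(compact), len(compressed))

# The client gets exactly the same data back after decompressing.
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
```

Repetitive text like JSON, HTML, CSS and JavaScript compresses very well; formats that are already compressed (JPEG, PNG) gain little from a second pass.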
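
And for the database partitioning mentioned in points 3 and 5, a rough sketch of the simplest scheme I know of: pick a shard by hashing the key. The shard names and the `shard_for` helper are hypothetical, and real systems often prefer consistent hashing or a lookup table, because with plain modulo most keys move to a different shard whenever the number of shards changes.

```python
import hashlib
from collections import Counter

SHARDS = ["db-0", "db-1", "db-2", "db-3"]  # hypothetical shard names


def shard_for(key, shards=SHARDS):
    """Map a key to a shard using a stable hash of its string form.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and so would route differently on each server.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The same key always lands on the same shard...
print(shard_for("kartones"), shard_for("kartones"))

# ...and many keys spread roughly evenly across all shards.
distribution = Counter(shard_for(user_id) for user_id in range(1000))
print(sorted(distribution.items()))
```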

And as an extra, speaking of the web: prepare to enter the hell of browser incompatibilities, limitations and hacks. Desktop development is a piece of cake compared to any moderately complex web project.

Update: Here's an interesting round-up article about an example of .NET scaling.


Posted by Kartones on 2009-07-05