Paul Nielsen has written an informative post that chronicles lessons learned from developing a high-transaction (35K tps) database system. Here are some of the topics he touched on, which I’ve always advocated:
1) Use your database as a database. Too often, large projects incorporate a massive middle tier that turns a system into a quagmire of complexity (poor reliability/extensibility), sluggishness (poor latency), slowness (poor throughput), and un-maintainability (the first step toward failure). I’m not saying: don’t use a middle tier. I am saying: use your database as a database, and use your middle tier for as little as possible (i.e., use your middle tier for what your database cannot already do). A database’s power draws upon the foundation of ACID: Atomicity, Consistency, Isolation, and Durability. In my opinion, way too many projects waste man-centuries attempting to get a massively complex middle-tier-plus-database combination to achieve what a database can do alone (or with a thin middle tier). Too many projects try to reinvent the database inside the middle tier. Too many projects fail to leverage the proven functionality already contained in their database. Too many projects don’t understand what their system’s foundation really is (hint: the database). Using your database as a database really is a much bigger topic.
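To make the ACID point concrete, here is a minimal sketch of atomicity using Python’s stdlib sqlite3 module. This is my illustration, not anything from Nielsen’s system (which ran SQL Server), but the principle is identical: let the database enforce all-or-nothing transactions and constraints instead of rebuilding that machinery in a middle tier.

```python
import sqlite3

# Illustrative stand-in for "use your database as a database": the database,
# not application code, guarantees the transfer is all-or-nothing.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # opens a transaction; rolls back on any exception
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?",
                         (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?",
                         (amount, dst))
        return True
    except sqlite3.IntegrityError:
        return False  # CHECK constraint fired; the database undid everything

transfer(conn, 1, 2, 30)    # succeeds
transfer(conn, 1, 2, 500)   # overdraft: CHECK fails, whole transfer rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(balances)  # {1: 70, 2: 80} -- the failed transfer left no trace
```

Notice how little application code is needed once the constraint and the transaction live in the database itself.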
2) Use common-sense database denormalization. I have seen some really wacky architectures that merge bunches of tables into one table for reasons such as convenience (“simplifies development!”), performance (“scales better”), or ignorance (“how else could we have done it?”). Denormalization is not all bad, and I’m certainly not a stickler for doing normalization exactly by the book. But a certain amount of clear thinking is required when architecting a database. And when something overly clever is attempted, there is a good chance that both performance and convenience will be degraded.
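A tiny sketch of the trade-off hiding behind “merge the tables, it simplifies development.” The schema below is hypothetical (mine, not from the post): customer_name is denormalized onto every order row, so a single missed UPDATE leaves the data contradicting itself, which is exactly the anomaly normalization exists to prevent.

```python
import sqlite3

# Denormalized schema: customer_name is duplicated on every order row.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER,
    customer_name TEXT,     -- duplicated fact; must be updated everywhere
    total         INTEGER)""")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)",
                 [(1, 42, "Acme Corp", 100),
                  (2, 42, "Acme Corp", 250)])

# Customer 42 renames itself, but a buggy code path updates only one row:
conn.execute("UPDATE orders SET customer_name = 'Acme LLC' WHERE order_id = 1")

names = {row[0] for row in conn.execute(
    "SELECT DISTINCT customer_name FROM orders WHERE customer_id = 42")}
print(names)  # two conflicting names for one customer
```

Denormalize when you have a measured reason to, and pair the duplicated data with a disciplined update path; don’t do it by default.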
3) Be careful with Microsoft SQL Server 2005 CLR. One of the major architectural additions to SQL Server 2005 was the inclusion of the CLR (Common Language Runtime). I’m not saying the CLR is all bad, but… it was a major addition that brings sizable complexity to the table. And as experience shows: be very careful about opting into interaction with massively complex systems, especially when it comes to performance. Microsoft does illustrate some scenarios where the CLR is considered to offer clear performance advantages. So my best advice is: understand best practices and test your own system using real data.
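“Test your own system using real data” boils down to: measure, don’t assume. A minimal benchmark harness using the stdlib timeit module, with two hypothetical workloads standing in for the alternatives you are weighing (e.g., a set-based T-SQL approach versus a CLR routine):

```python
import timeit

# Hypothetical stand-ins for the two implementations under comparison.
def workload_a(data):
    # e.g., the set-based approach
    return sum(data)

def workload_b(data):
    # e.g., the row-by-row approach
    total = 0
    for x in data:
        total += x
    return total

# Use YOUR real data shape and volume here, not toy inputs.
data = list(range(10_000))
t_a = timeit.timeit(lambda: workload_a(data), number=200)
t_b = timeit.timeit(lambda: workload_b(data), number=200)
print(f"A: {t_a:.4f}s  B: {t_b:.4f}s")
```

The decision comes from the numbers on your data, not from which option looks cleaner or from vendor guidance alone.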
4) When done right, Iterative Development is gold. Since the start of my professional career developing software and systems, all my successful programs have been run using some form of intuitive, common-sense Iterative Development, and all my less-than-successful programs have been run using some less stellar development methodology, which ended up reflecting more Waterfall than anything else. I do feel Iterative is very intuitive; it is actually how I ran programs when I was first pushed into running them, before I had any experience (the good old days!). When done right, I think Iterative Development could be called Intuitive Development.
5) High-transaction SQL Server systems: be aware of index fragmentation. Index fragmentation is especially bad when it comes to range operations, such as: show all records after a date (or between two dates). If you have high index fragmentation, do many range-based operations, and have a large database, then you are taking a potentially severe performance hit. Here is a good overview of MSSQL 2000 and MSSQL 2005 index fragmentation.
6) Developers must spend time testing. From my experience, developers should spend roughly half their time testing their own code. This testing includes unit testing, sub-system/system testing, and performance testing. When your technology stack includes a debugger, spend serious, focused time in the debugger. The debugger is your friend and an awesome way to understand exactly what your code is doing under different conditions.
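A minimal sketch of the kind of developer-owned unit test this point argues for, using the stdlib unittest module (the function under test is hypothetical). Note that the edge cases get as much attention as the happy path:

```python
import unittest

def apply_discount(total, pct):
    """Return total reduced by pct percent; clamp pct to [0, 100]."""
    pct = max(0, min(100, pct))
    return total * (100 - pct) / 100

class ApplyDiscountTests(unittest.TestCase):
    def test_normal(self):
        self.assertEqual(apply_discount(200, 10), 180)

    def test_edge_cases(self):
        self.assertEqual(apply_discount(200, 0), 200)
        self.assertEqual(apply_discount(200, 100), 0)
        self.assertEqual(apply_discount(200, 150), 0)  # out-of-range pct clamped

# run with: python -m unittest <this_module>
```

Tests like these are cheap to write while the code is fresh in your head, and they are what makes the later debugger sessions targeted rather than exploratory.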