Thursday, May 19, 2005

I will try to complement Apoorv's blog (Apoorv Durga's Blog on Portals and Content Management) in whichever ways I can.


In one of his recent posts, Apoorv mentioned sizing. That is one thing I happen to have worked on a lot.

Here are a few things I have noticed about the content management systems I have worked on:

- It is the database access that kills.

- SQL queries on the presentation layer tend to be a lot heavier than the queries on the content management backend.

- Unless you have a very simple presentation, it will always make sense to cache the presentation as HTML pages and serve those to customers instead of dynamic pages (of course everyone knows that).

- If the update frequency or volume is large (let's say more than a page a minute on average) and the database gets large, even the publishing process takes its toll. It is good to archive content aggressively. If archiving is not feasible (after all, it is a content management system), a replicated database for presentation may be the only option.

- Coming to sizing, you are likely to get better projections by benchmarking against existing applications.

- To benchmark against an existing application, you need the page views per second for the most frequently used dynamic pages, and the database size.

- If you have a benchmark, you can halve the expected performance for every tenfold increase in data size.

- Typically, if you can serve 7 pages per second with 1 CPU of application server and 2 CPUs of database server for a non-cached application, that is considered good performance. CMS pages that update only "one content item" can typically achieve this kind of performance on a database holding 10 to 50 thousand content items.

- XML processing is usually a big killer, so if you are passing large structured documents around using web services, and a document is expected to be larger than 20 KB, then you really have to look at the performance. In a benchmark I am running now, 1 CPU can consume a web service returning 1 MB of data only 3 times a second. This is with no processing at all - just a web service call using a regular SOAP client. So as a rule of thumb, if you are making web service calls, a good start is to halve the above benchmark of 7 pages per second per CPU on the app server and use that as the target to aim for.
- Some CMSs have object or XML databases. I am not sure how you can size for them once the content grows beyond a certain size.
- Search engines fall in a different league. I am not sure how to size for them.
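
The rules of thumb above can be combined into a rough back-of-the-envelope projection. The following is a minimal Python sketch, not a definitive sizing method: the function names, the 50,000-item baseline, and the 20 pages-per-second peak load are assumptions chosen for illustration, and applying the "halve per tenfold" rule to fractional multiples of ten is an interpolation the rule itself does not spell out.

```python
import math


def projected_pages_per_sec(benchmark_pps, benchmark_items, target_items,
                            uses_web_services=False):
    """Project per-CPU throughput from a benchmark (hypothetical helper).

    Applies the rule of thumb that performance halves for every tenfold
    increase in data size, and halves again if pages make web service calls.
    """
    if benchmark_pps <= 0 or benchmark_items <= 0 or target_items <= 0:
        raise ValueError("all inputs must be positive")
    # Number of tenfold steps between the benchmark and target data sizes.
    tenfold_steps = math.log10(target_items / benchmark_items)
    pps = benchmark_pps * 0.5 ** tenfold_steps
    if uses_web_services:
        pps /= 2  # rule of thumb: web service calls halve the target
    return pps


def app_cpus_needed(peak_pps, pps_per_cpu):
    """Round up to whole application server CPUs for a given peak load."""
    return math.ceil(peak_pps / pps_per_cpu)


# Assumed baseline: 7 pages/sec per app CPU at 50,000 content items.
per_cpu = projected_pages_per_sec(7, 50_000, 500_000)
per_cpu_ws = projected_pages_per_sec(7, 50_000, 500_000, uses_web_services=True)
print(per_cpu, per_cpu_ws)            # projected pages/sec per app CPU
print(app_cpus_needed(20, per_cpu))   # app CPUs for an assumed 20 pages/sec peak
```

Going by the 1:2 ratio in the benchmark above, each application server CPU would be paired with roughly two database CPUs. Treat the result as a starting target to validate with a real benchmark, not as a guarantee.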

I have a question, if anyone can help me answer it: can I use Google Desktop on my CMS server and let people search that?
