I’m in doubt and I need your help on this important subject. It’s about caching and about whether to use absolute or sliding expiration. I’ve been thinking really hard about this and can find both pros and cons to each solution. The scenario is this:
When the BlogEngine.NET application starts, it loads all posts into memory. Because BlogEngine.NET is a single-blog-per-installation application, it doesn’t really matter much, since each post only uses about 50 KB of memory. So my 350 posts only use about 17.5 MB of RAM. However, some people have thousands of posts even on a single installation. The goal is to reduce the memory footprint by using intelligent caching.
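To make the arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. The 50 KB per post is the rough figure from above, the function name is made up for illustration, and the 5,000-post blog is a hypothetical example, not a measured installation.

```python
KB_PER_POST = 50  # rough per-post figure quoted above, not a measurement


def estimated_memory_mb(post_count: int, kb_per_post: int = KB_PER_POST) -> float:
    """Estimated RAM in MB, using 1 MB = 1000 KB to match the 17.5 MB figure."""
    return post_count * kb_per_post / 1000


small_blog = estimated_memory_mb(350)    # the 350 posts mentioned above
large_blog = estimated_memory_mb(5000)   # hypothetical blog with thousands of posts

print(small_blog)   # 17.5
print(large_blog)   # 250.0
```

At a few hundred posts the cost is negligible, but it grows linearly with the number of posts, which is why the large installations are the ones that matter here.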
The simplest solution is to take the post body, the text of a post, and make it lazy loaded, so it only loads when a visitor requests it. By lazy loading I mean that the body is read from IO, whether that is a database or an XML file, only when it is requested, instead of being put into memory when the application starts.
The post body is by far the biggest part of a post and therefore has the single biggest influence on memory consumption.
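As a rough illustration of the lazy loading idea, here is a small Python sketch. It is not BlogEngine.NET code: the `Post` class, the `load_body` callback and `fake_storage_read` are all hypothetical stand-ins for whatever actually reads the database or XML file.

```python
from typing import Callable, List, Optional


class Post:
    """Hypothetical post whose body is read from storage on first access."""

    def __init__(self, post_id: str, title: str, load_body: Callable[[str], str]):
        self.post_id = post_id
        self.title = title                  # small metadata stays in memory
        self._load_body = load_body         # stand-in for the database or XML read
        self._body: Optional[str] = None    # nothing loaded at startup

    @property
    def body(self) -> str:
        if self._body is None:              # first request triggers the IO read
            self._body = self._load_body(self.post_id)
        return self._body


reads: List[str] = []


def fake_storage_read(post_id: str) -> str:
    reads.append(post_id)                   # record each simulated IO read
    return f"body of {post_id}"


post = Post("hello-world", "Hello World", fake_storage_read)
reads_before_access = len(reads)            # 0: creating the post reads nothing
first_body = post.body                      # triggers the single simulated read
second_body = post.body                     # served from memory
```

Note that this sketch keeps the body in memory forever once it has been loaded. Deciding when to drop it again is exactly the expiration question the rest of this post is about.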
What I’ve done is to cache the post body with a sliding expiration of 3 minutes. That means that when a visitor requests a particular post, the body is cached for three minutes from that point on. One minute later, another visitor requests the same post and then the cache expiration is reset to 3 minutes again and so on.
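To pin down what I mean by sliding expiration, here is a minimal Python sketch. It is an illustration only and not the ASP.NET cache API; the `SlidingCache` class and the injectable `clock` (used so the example can fake the passing of time) are assumptions of this sketch.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class SlidingCache:
    """Entries expire `ttl` seconds after their most recent access."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._items: Dict[str, Tuple[str, float]] = {}  # key -> (value, expires_at)

    def set(self, key: str, value: str) -> None:
        self._items[key] = (value, self.clock() + self.ttl)

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        now = self.clock()
        if now >= expires_at:
            del self._items[key]                     # expired: drop the entry
            return None
        self._items[key] = (value, now + self.ttl)   # hit: restart the window
        return value


now = [0.0]                                          # fake clock, in seconds
cache = SlidingCache(ttl=180, clock=lambda: now[0])
cache.set("post-1", "body")                          # expires at t=180

now[0] = 60
first = cache.get("post-1")                          # hit, expiry moves to t=240
now[0] = 230
second = cache.get("post-1")                         # hit, expiry moves to t=410
now[0] = 420
third = cache.get("post-1")                          # miss, entry has expired
```

The request at t=230 would have been a miss if the first hit had not pushed the expiry forward, which is the whole point of the sliding window.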
The problem with this approach shows up on a rather popular blog where people request a lot of both new and old posts. Then you might end up in a situation where almost all post bodies are cached all the time and never expire. On the positive side, performance remains high, since no IO loading occurs when all post bodies are already in memory.
For less popular blogs, another positive is that all the rarely visited posts are removed from the cache, but the front page still loads directly from memory most of the time.
The other possibility is to use an absolute expiration, also of 3 minutes. That means that no matter how many requests a post gets, its cache entry always expires 3 minutes after the first request. The problem here is that on a popular blog performance takes a hit, since every requested post has to be read from IO again every 3 minutes. The positive thing is that cached bodies are released for garbage collection very frequently.
Another issue with this solution is that a new blog with few requests takes the biggest performance hit. It has to do an IO read on almost every request, since more than 3 minutes pass between requests. As I see it, absolute expiration will reduce memory consumption the most, but also take the biggest performance hit.
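Here is a minimal Python sketch of absolute expiration, again only as an illustration and not the ASP.NET cache API. The `AbsoluteCache` class, its `loads` counter and the fake clock are assumptions made for the example, and the one-request-per-minute traffic is invented.

```python
import time
from typing import Callable, Dict, Tuple


class AbsoluteCache:
    """Entries expire `ttl` seconds after they were loaded, however often they are read."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.loads = 0                                   # number of simulated IO reads
        self._items: Dict[str, Tuple[str, float]] = {}   # key -> (value, expires_at)

    def get_or_load(self, key: str, loader: Callable[[str], str]) -> str:
        now = self.clock()
        entry = self._items.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]                              # hit: expiry is NOT extended
        value = loader(key)                              # miss or expired: read again
        self.loads += 1
        self._items[key] = (value, now + self.ttl)
        return value


now = [0.0]                                              # fake clock, in seconds
cache = AbsoluteCache(ttl=180, clock=lambda: now[0])

# One request per minute for the same post: t = 0, 60, ..., 540 (10 requests).
for t in range(0, 600, 60):
    now[0] = float(t)
    cache.get_or_load("post-1", lambda key: f"body of {key}")

print(cache.loads)   # 4 (reloads at t = 0, 180, 360 and 540)
```

Even though the post is requested every minute, it is read from IO again every 3 minutes, which is the performance cost described above.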
There are pros and cons with each solution and I can’t figure out what makes the most sense. I lean toward sliding expiration since it will make sure the front page always loads from memory, but then again, it also keeps certain posts in memory all the time.
What is your take on this issue, and is three minutes the right expiration time span?