Sunday, April 8, 2012

Boost Performance Using Caching

Caching is widely used for static files (such as JavaScript files, CSS, etc.). All browsers support client-side caching for these static files, and IIS also supports server-side caching. Done right, caching can improve a web application's serving performance tremendously.

Let's consider this scenario. You have a static HTML page, say this one: http://www.google.com/intl/en/about/index.html. The payload of this page is about 9K bytes. It is very small, but payload is still payload. If nothing is cached, every time we go to that page our browser has to download 9K bytes of data. Now we turn on browser caching. The first time we go to that page, we still download the full 9K bytes of data, but subsequent visits load the page from the browser cache. Look at the comparison of network traffic below (the top image is the first visit and the bottom image is the subsequent visit):


The HTTP status code "200" means the content was retrieved successfully from the web server, and "304" means the content has not been modified, so the browser reuses the copy in its local cache. As we see, on the subsequent visit the data transmitted over the network was more than 40 times smaller (218B vs 9.5KB) and the request was about 4 times faster (47ms vs 203ms). This performance boost comes from browser caching. Now imagine doing this for a more complicated page, with all its CSS files, JavaScript files, and images: the performance gain can be quite significant.
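To make the 200 vs 304 exchange concrete, here is a minimal sketch of how a server might answer a conditional request. It assumes the validator is an ETag sent back in an If-None-Match header (the screen shots do not show which validator was actually used), and the function names `make_etag` and `respond` are made up for illustration.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the content; any stable fingerprint would do."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, bytes]:
    """Return (status, payload) for a plain or conditional GET."""
    if if_none_match == make_etag(body):
        # The browser's cached copy is still valid: send headers only.
        return 304, b""
    # No validator, or the content changed: send the full page.
    return 200, body


page = b"<html>" + b"x" * 9000 + b"</html>"

# First visit: nothing cached, so no validator is sent.
first_status, first_payload = respond(page)

# Subsequent visit: the browser sends back the ETag it stored.
second_status, second_payload = respond(page, if_none_match=make_etag(page))

print(first_status, len(first_payload))
print(second_status, len(second_payload))
```

The second response carries no body at all, which is why the subsequent visit in the comparison transfers only a couple of hundred bytes of headers.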

There is also a different type of caching called data caching (which I used in an application in this blog post about Session variable performance). It basically means that the application temporarily stores retrieved data in memory instead of querying the database every time. The benefit of this type of caching is that retrieval from memory is much faster than a database query.
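As a rough illustration of the idea only, here is a small in-memory cache sketch. The class name `TtlCache`, the time-to-live policy, and the `load_customers` function standing in for a database query are all assumptions for the example, not something taken from a real application.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Keeps loaded values in memory until their time-to-live runs out."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: skip the expensive load
        value = loader()             # missing or expired: load it again
        self._entries[key] = (now + self._ttl, value)
        return value


query_count = 0


def load_customers() -> list:
    """Hypothetical stand-in for a slow database query."""
    global query_count
    query_count += 1
    return ["alice", "bob"]


cache = TtlCache(ttl_seconds=60)
first = cache.get_or_load("customers", load_customers)
second = cache.get_or_load("customers", load_customers)
print(query_count)  # the loader ran once for two lookups
```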

In this post, I will focus on browser caching and write about data caching in a future post.

Browser caching is actually pretty easy. In short, you need to add an expiration header ("Expires", or "Cache-Control: max-age") to the responses for your static files. The header makes these parts of the website cacheable in client browsers. I usually set the expiration to 30 days after a component is first cached. This means that the first time a component is downloaded, the browser stores it in its cache and reuses it again and again until 30 days have elapsed. After that, the browser treats the cached copy as stale and retrieves a new one from the server.
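Here is a small sketch of what a 30-day expiration looks like as header values, and how a browser could decide whether a cached copy is still fresh. The helper names `expiry_headers` and `is_fresh` are hypothetical, and the freshness check is a simplification of what real browsers do.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

THIRTY_DAYS = timedelta(days=30)


def expiry_headers(now: datetime, lifetime: timedelta) -> dict:
    """Build both common forms of the expiration header."""
    return {
        # Relative form: lifetime in seconds from the time of the response.
        "Cache-Control": f"max-age={int(lifetime.total_seconds())}",
        # Absolute form: an HTTP date in GMT (now must be a UTC datetime).
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }


def is_fresh(stored_at: datetime, lifetime: timedelta, now: datetime) -> bool:
    """A cached copy may be reused only while it is younger than its lifetime."""
    return now - stored_at < lifetime


downloaded = datetime(2012, 4, 8, tzinfo=timezone.utc)
headers = expiry_headers(downloaded, THIRTY_DAYS)
print(headers["Cache-Control"])
print(headers["Expires"])

# Day 29: reuse the cached copy. Day 31: go back to the server.
print(is_fresh(downloaded, THIRTY_DAYS, downloaded + timedelta(days=29)))
print(is_fresh(downloaded, THIRTY_DAYS, downloaded + timedelta(days=31)))
```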

In .NET development, you can set this expiration in a couple of ways: by code or via IIS. Both approaches do the same thing in the end, so it does not matter much which one you pick, but it's good to know both.

If you want to do this by code, put a web.config file in the folder of static files you want to cache. For example, if you have an "images" folder or a "scripts" folder, put this web.config file in that folder. In the screen shot, I have a "Contents" folder that holds all my images and CSS files. Inside the web.config file, I set the clientCache element with my expiration values:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <staticContent>
            <clientCache 
                cacheControlMode="UseMaxAge" 
                cacheControlMaxAge="30.00:00:00" />
        </staticContent>
    </system.webServer>
</configuration>
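The cacheControlMaxAge value uses the d.hh:mm:ss time span format, so "30.00:00:00" means 30 days. As a quick sanity check, the sketch below reads the same configuration and works out the Cache-Control value that setting corresponds to. The helpers `parse_timespan` and `cache_control_header` are hypothetical, and the parser is simplified (it ignores fractional seconds).

```python
import xml.etree.ElementTree as ET
from datetime import timedelta

WEB_CONFIG = b"""<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <staticContent>
            <clientCache
                cacheControlMode="UseMaxAge"
                cacheControlMaxAge="30.00:00:00" />
        </staticContent>
    </system.webServer>
</configuration>
"""


def parse_timespan(text: str) -> timedelta:
    """Parse '[d.]hh:mm:ss' into a timedelta (no fractional seconds)."""
    days = 0
    if "." in text.split(":")[0]:
        day_part, text = text.split(".", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def cache_control_header(config_xml: bytes) -> str:
    """Work out the max-age value implied by the clientCache element."""
    root = ET.fromstring(config_xml)
    client_cache = next(root.iter("clientCache"))
    if client_cache.get("cacheControlMode") != "UseMaxAge":
        raise ValueError("this sketch only handles cacheControlMode=UseMaxAge")
    max_age = parse_timespan(client_cache.get("cacheControlMaxAge"))
    return f"max-age={int(max_age.total_seconds())}"


print(cache_control_header(WEB_CONFIG))
```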

If you want to do it via IIS, open the website that you want to apply this to and click or highlight the folder of files that you want to be cached. I am using the same folder as above, the "Contents" folder.

Then open/double-click the "HTTP Response Headers" feature.

Then click "Set Common Headers..." on the right pane.

This will launch the "Set Common HTTP Response Headers" window, where you can set the expiration values. Once you hit "OK", IIS will create a web.config file for you, producing a similar result to the code/manual approach above.

So how is the result, at least for my web application? Here is a screen shot of the payload and timing on the initial load, before browser caching takes effect: 523KB of data in total for just scripts and image files, taking around 515ms.

Now, here is the result on the subsequent request, with caching applied: 1.8KB of data in total, taking around 37ms.

So in the end, going by these totals, the payload is roughly 290X smaller and the time is about 14X faster. Quite a big payoff for a small change. Half a second may not seem like much, but it does bring a lot of user satisfaction when their experience of using your web application is seamless and without delay. Jeff Atwood wrote excellently about this in his blog post: Performance is a Feature.
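For anyone who wants to check the ratios, they follow directly from the totals quoted above. This is just a small sketch, assuming those totals are the ones shown in the screen shots; the `improvement` helper is made up for the example.

```python
def improvement(before: float, after: float) -> float:
    """How many times smaller (or faster) 'after' is compared with 'before'."""
    if after <= 0:
        raise ValueError("after must be positive")
    return before / after


payload_ratio = improvement(523.0, 1.8)   # KB on first load vs KB when cached
time_ratio = improvement(515.0, 37.0)     # ms on first load vs ms when cached

print(f"payload: {payload_ratio:.1f}x smaller")
print(f"time: {time_ratio:.1f}x faster")
```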
