A couple of months ago, I blogged about how to improve your YSlow score under IIS7. One of the recommendations from the Yahoo! best practices guide against which YSlow rates your site is that you should set far-future expiry dates for resources such as scripts, images and stylesheets.
This is good advice, and straightforward to implement under IIS7, but can leave you with a bit of a problem – once you’ve released your site with all of your scripts and stylesheets set to expire next millennium, you can no longer edit them as you please. If you change them and release them with the same name, returning visitors to your site will not see the changes – they will use the cached version that you told them was good for another 1000 years.
Instead, if you need to change one of your files, you need to give it a new name so that clients who downloaded and cached the original version are sure to get the update when it’s deployed. This means you need to update any references to point to the new file, which has the potential to be a bit of a management nightmare.
Obviously, if you’ve only got one or two of each type of file and you’ve referenced them from a master page or similar shared component then you can handle this manually. However, if you’ve got more than that, or if you’re following the recommendation to combine and pack files, then life can get more fiddly.
On a previous project, I’ve used the Aptimize Website Accelerator, which handles all this stuff for you in an extremely slick way. However, for projects that don’t use this, I wanted to come up with a way of making the process easier.
In order to do this I decided to take advantage of ASP.NET’s internal URL rewriting feature, exposed via the HttpContext.RewritePath method. What I ended up with – which you can download from Codeplex if you’d like to give it a try – has two parts: the logic to add and remove version numbers from file names, and an HttpModule to apply this logic to incoming URLs. The idea is that the code writing out links to the JS/CSS files passes those filenames through the provider to add a version number, and the HttpModule intercepts incoming requests for those files and rewrites them to point to the correct filename.
The resource versioning provider
Whilst I’m not a massive fan of the ASP.NET provider model, I’ve recently been feeling the pain of using a variety of different open source projects with overlapping dependencies on different versions of the same tools. I therefore decided to use the provider model to allow me to swap out different ways of generating the version number.
The first important class here is ResourceVersioningProvider. This defines the two methods that providers need to implement.
It should be pretty obvious what they do from their names.
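For reference, here’s a minimal sketch of what that class could look like – AddVersionToFileName is named later in this post, while RemoveVersionFromFileName is my assumption for its counterpart (the real class is in the Codeplex download):

```csharp
using System.Configuration.Provider;

// A sketch only. Providers in the ASP.NET provider model conventionally
// derive from ProviderBase so they can be declared in configuration.
public abstract class ResourceVersioningProvider : ProviderBase
{
    // e.g. "site.css" -> "site.v1-0-3-0.css" (the exact format is up to the provider)
    public abstract string AddVersionToFileName(string fileName);

    // Maps a versioned request back to the physical file name on disk.
    public abstract string RemoveVersionFromFileName(string fileName);
}
```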
I’ve then got a base class, ResourceVersioningProviderBase, which contains the code that’s independent of how the version number is generated (basically the filename parsing/modifying logic, as well as some caching). This class is abstract and defines a single abstract method, GetVersionNumber(). This should return the version number that’s going to be used to make the filename unique.
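As a rough sketch of that filename logic – the version-segment format is an assumption, and the configured extension check (described below) and the caching are left out for brevity:

```csharp
using System.IO;
using System.Text.RegularExpressions;

public abstract class ResourceVersioningProviderBase : ResourceVersioningProvider
{
    // Matches a version segment of the form ".v1-0-3-0" at the end of the name.
    private static readonly Regex VersionSegment =
        new Regex(@"\.v[\d-]+$", RegexOptions.Compiled);

    // Implemented by derived classes, e.g. from an assembly version or a file hash.
    protected abstract string GetVersionNumber();

    public override string AddVersionToFileName(string fileName)
    {
        string name = Path.GetFileNameWithoutExtension(fileName);
        string extension = Path.GetExtension(fileName); // includes the dot, e.g. ".css"
        string version = GetVersionNumber().Replace('.', '-');
        return name + ".v" + version + extension; // "site.css" -> "site.v1-0-3-0.css"
    }

    public override string RemoveVersionFromFileName(string fileName)
    {
        string name = Path.GetFileNameWithoutExtension(fileName);
        string extension = Path.GetExtension(fileName);

        // Strip the ".v…" segment if present; unversioned names pass through unchanged.
        return VersionSegment.Replace(name, string.Empty) + extension;
    }
}
```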
The final relevant class is AssemblyBasedResourceVersioningProvider, which inherits ResourceVersioningProviderBase and implements GetVersionNumber by returning the full version number of an assembly specified in the configuration. The configuration itself looks like this:
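Reconstructed as a sketch – element and attribute names here are my assumptions, following the usual provider-model configuration shape:

```xml
<!-- A sketch; element, attribute and type names are illustrative assumptions. -->
<resourceVersioning defaultProvider="assemblyBased">
  <providers>
    <add name="assemblyBased"
         type="ResourceVersioning.AssemblyBasedResourceVersioningProvider, ResourceVersioning"
         assemblyName="MyWebApplication"
         fileExtensions=".js,.css" />
  </providers>
</resourceVersioning>
```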
As you can see, there’s also an attribute that controls which file extensions the provider will add version numbers to. This means that if you call “AddVersionToFileName” and pass in a filename with a different extension, no version number will be added.
The resource versioning HttpModule
The HttpModule that forms the other half of the solution is pretty straightforward. Every incoming URL is cracked open using the Uri.Segments property, and the final segment (assumed to be the requested file name) is passed into the provider to remove the version number, if present. If the provider returns a filename that differs from the one in the request URL, it is used to build a new URL that can be passed to HttpContext.RewritePath. Processed URLs are cached (regardless of whether or not they require a rewrite) to improve performance.
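A sketch of that module – ResourceVersioning.Provider is an assumed static accessor for the configured provider, and the URL-level caching just mentioned is omitted for brevity:

```csharp
using System;
using System.Web;

public class ResourceVersioningModule : IHttpModule
{
    public void Init(HttpApplication application)
    {
        application.BeginRequest += OnBeginRequest;
    }

    private static void OnBeginRequest(object sender, EventArgs e)
    {
        HttpContext context = ((HttpApplication)sender).Context;
        string[] segments = context.Request.Url.Segments;

        // The final segment is assumed to be the requested file name.
        string fileName = segments[segments.Length - 1];
        string unversioned = ResourceVersioning.Provider.RemoveVersionFromFileName(fileName);

        if (!string.Equals(unversioned, fileName, StringComparison.OrdinalIgnoreCase))
        {
            // Rewrite internally to the physical file; the browser keeps the
            // versioned URL, so the far-future Expires header still applies.
            string path = context.Request.Path;
            string newPath = path.Substring(0, path.Length - fileName.Length) + unversioned;
            context.RewritePath(newPath);
        }
    }

    public void Dispose() { }
}
```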
An Alternative Approach
Despite having written this post a month or so ago, I was only spurred into posting it by reading about this alternative approach from Karl Seguin. He’s got an interesting series going on at the moment covering ASP.NET performance – it’s well worth a read. I can see that it would be fairly straightforward to replace my AssemblyBasedResourceVersioningProvider with a FileHashBasedResourceVersioningProvider, although given that no. 1 on my to-do list at the moment is learning NHibernate, it might be a while before I get round to it!
This is the final part of a series of posts on optimisation work we carried out on my last project, www.fancydressoutfitters.co.uk – an ASP.NET MVC web site built using S#arp Architecture, NHibernate, the Spark view engine and Solr. There’s not much point starting here – please have a look at parts 1, 2, 3 and 4, as well as my post on improving YSlow scores for IIS7 sites, for the full picture.
In the posts in this series, I’ve reflected the separation of concerns inherent in ASP.NET MVC applications by talking about how we optimised each layer of the application independently. Good separation of concerns is by no means unique to applications built using the MVC pattern, but what stood out for me as I became familiar with the project was that for the first time it seemed like I hardly had to think to achieve it, because it’s so baked into the framework. I know I share this feeling with Howard and James (respectively architect and developer on the project), who’ve both talked about it in their own blogs.
The MVC pattern also makes it much easier to apply optimisations in the code. For example, it’s much easier to identify the points where caching will be effective, as the Model-View-ViewModel pattern makes it straightforward to apply a simple and highly effective caching pattern within the controllers. I know that this kind of thing isn’t limited to performance work – for example, our team security guru certainly felt that it was easier to carry out his threat modelling for this site than it would have been in a WebForms equivalent.
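To illustrate the kind of pattern meant here – a sketch only, with illustrative names rather than the project’s actual code, caching the view model that the controller builds so repeat requests skip the model layer entirely:

```csharp
using System;
using System.Web.Caching;
using System.Web.Mvc;

// Illustrative names throughout (ProductController, ProductViewModel, BuildViewModel).
public class ProductController : Controller
{
    public ActionResult Detail(int id)
    {
        string cacheKey = "product-viewmodel-" + id;
        var viewModel = HttpContext.Cache[cacheKey] as ProductViewModel;

        if (viewModel == null)
        {
            viewModel = BuildViewModel(id); // the only call that touches the model layer
            HttpContext.Cache.Insert(cacheKey, viewModel, null,
                DateTime.UtcNow.AddMinutes(10), Cache.NoSlidingExpiration);
        }

        return View(viewModel);
    }

    private ProductViewModel BuildViewModel(int id)
    {
        // Query the domain model (e.g. via NHibernate) and flatten the results
        // into a cache-friendly view model.
        throw new NotImplementedException();
    }
}

public class ProductViewModel { /* flattened page data */ }
```

Because the cached object is a flat view model rather than a live domain entity, there’s no risk of serving stale lazy-loaded state – which is part of why the MVVM separation makes this caching pattern so straightforward to apply.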
On the flip side, this process also brought home to me some of the dangers of using NHibernate. It’s an absolutely awesome product, and has totally converted me to the use of an ORM (be it NHib or Entity Framework). However, the relatively high learning curve and the fact that most of the setup was done before I joined the project made it easy for me to ignore what it was doing under the covers and code away against my domain objects in a state of blissful ignorance. Obviously this is not a brilliant idea, and properly getting to grips with NHibernate is now jostling for first place on my to-do list (up against PostSharp 2 and ASP.NET MVC 2.0, amongst other things).
My big challenge for future projects is ensuring that the optimisations I’ve talked about are baked in from the start instead of being bolted on at the end. The problem with this is that it’s not always clear where to stop. The goal of optimising the site is to get it to the point where it performs as we need it to, not to get it to the point where we can’t optimise any more. The process of optimisation is one of diminishing returns, so it’s essential to cover issues you know need to be covered and to then use testing tools to uncover any further areas to work on.
That said, in an ideal world I’d like to be able to build performance tests early and use them to benchmark pages on a regular basis. Assuming you work in short iterations, this can be done on an iteration by iteration basis, with results feeding into the plan for the next iteration. My next series of posts will be on performance and load testing, and as well as covering what we did for this project I will be looking at ways of building these processes into the core engineering practices of a project.
Was it all worth it?
I’ll be talking separately about the performance and load testing we carried out on the site prior to go live, but in order to put these posts into some context I thought it might be interesting to include some final numbers. For our soak testing, we built a load profile based on 6 user journeys through the site:
- Homepage: 20% of total concurrent user load
- Browse (Home -> Category Landing -> Category Listing -> Product): 30%
- Search (Home -> Search Results): 30%
- News (Home -> News list -> News story): 10%
- Static Pages (Home -> Static page): 5%
- Checkout (As for browse journey, then -> Add to basket -> View Basket -> Checkout): 5%
With a random think time of 8 – 12 seconds between each step of each journey, we demonstrated that each of the web servers in the farm could sustainably support 1000 concurrent users and generate 90 pages per second. Given the hardware in question, this far exceeds any project I’ve worked on recently.
In the end, we put www.fancydressoutfitters.co.uk live in the run up to Halloween, the busiest time of the year for the fancy dress industry. We did this with no late nights and enough confidence to go to the pub for a celebratory pint within the hour. It was also interesting that the majority of technical colleagues who responded to our go-live announcement commented on how fast it runs (which, given the machinations of our corporate network’s internet routing, is even more remarkable). And best of all, we’ve had no major shocks since the site went live.
A final note
If you’ve read this series of posts, I hope you’ve got something out of it. I’d certainly be interested in any feedback that you might have – as always, please feel free to leave a comment or contact me on Twitter. In addition, the EMC Consulting blog site has been nominated in the Computer Weekly IT Blog Awards 2009, under the “Corporate/Large Enterprise” category – please consider voting for us.
I’d also like to extend a final thanks to Howard for proof reading the first draft of these posts and giving me valuable feedback, as well as for actually doing a lot of the work I’ve talked about here.