This is the final part of a series of posts on optimisation work we carried out on my last project, www.fancydressoutfitters.co.uk – an ASP.NET MVC web site built using S#arp Architecture, NHibernate, the Spark view engine and Solr. There’s not much point starting here – please have a look at parts 1, 2, 3 and 4, as well as my post on improving YSlow scores for IIS7 sites, for the full picture.
In the posts on this series, I’ve reflected the separation of concerns inherent in ASP.NET MVC applications by talking about how we optimised each layer of the application independently. Good separation of concerns is by no means unique to applications built using the MVC pattern, but what stood out for me as I became familiar with the project was that for the first time it seemed like I hardly had to think to achieve it, because it’s so baked into the framework. I know I share this feeling with Howard and James (respectively architect and developer on the project), who’ve both talked about it in their own blogs.
The MVC pattern also makes it much easier to apply optimisations in the code. For example, it’s much easier to identify the points where caching will be effective, as the Model-View-ViewModel pattern makes it straightforward to apply a simple and highly effective caching pattern within the controllers. I know that this kind of thing isn’t limited to performance work – for example, our team security guru certainly felt that it was easier to carry out his threat modelling for this site than it would have been in a WebForms equivalent.
On the flip side, this process also brought home to me some of the dangers of using NHibernate. It’s an absolutely awesome product, and has totally converted me to the use of an ORM (be it NHib or Entity Framework). However, the relatively high learning curve and the fact that most of the setup was done before I joined the project made it easy for me to ignore what it was doing under the covers and code away against my domain objects in a state of blissful ignorance. Obviously this is not a brilliant idea, and properly getting to grips with NH it now jostling for first place on my to-do list (up against PostSharp 2 and ASP.NET MVC 2.0, amongst other things.)
My big challenge for future projects is ensuring that the optimisations I’ve talked about are baked in from the start instead of being bolted on at the end. The problem with this is that it’s not always clear where to stop. The goal of optimising the site is to get it to the point where it performs as we need it to, not to get it to the point where we can’t optimise any more. The process of optimisation is one of diminishing returns, so it’s essential to cover issues you know need to be covered and to then use testing tools to uncover any further areas to work on.
That said, in an ideal world I’d like to be able to build performance tests early and use them to benchmark pages on a regular basis. Assuming you work in short iterations, this can be done on an iteration by iteration basis, with results feeding into the plan for the next iteration. My next series of posts will be on performance and load testing, and as well as covering what we did for this project I will be looking at ways of building these processes into the core engineering practices of a project.
Was it all worth it?
I’ll be talking separately about the performance and load testing we carried out on the site prior to go live, but in order to put these posts into some context I thought it might be interesting to include some final numbers. For our soak testing, we built a load profile based on 6 user journeys through the site:
- Homepage: 20% of total concurrent user load
- Browse (Home -> Category Landing -> Category Listing -> Product): 30%
- Search (Home -> Search Results): 30%
- News (Home -> News list -> News story): 10%
- Static Pages (Home -> Static page): 5%
- Checkout (As for browse journey, then -> Add to basket -> View Basket -> Checkout): 5%
With a random think time of 8 – 12 seconds between each step of each journey, we demonstrated that each of the web servers in the farm could sustainably support 1000 concurrent users and generate 90 pages per second. Given the hardware in question, this far exceeds any project I’ve worked on recently.
In the end, we put www.fancydressoutfitters.co.uk live in the run up to Halloween, the busiest time of the year for the fancy dress industry. We did this with no late nights and enough confidence to go to the pub for a celebratory pint within the hour. It was also interesting that the majority of technical colleagues who responded to our go-live announcement commented on how fast it runs (which given the machinations of our corporate network’s internet routing is even more remarkable.) And best of all, we’ve had no major shocks since the site went live.
A final note
If you’ve read this series of posts, I hope you’ve got something out of it. I’d certainly be interested in any feedback that you might have – as always, please feel free to leave a comment or contact me on Twitter. In addition, the EMC Consulting blog site has been nominated in the Computer Weekly IT Blog Awards 2009, under the “Corporate/Large Enterprise” category – please consider voting for us.
I’d also like to extend a final thanks to Howard for proof reading the first draft of these posts and giving me valuable feedback, as well as for actually doing a lot of the work I’ve talked about here.
This is part 2 in a series of posts on optimisation work we carried out on my current project – an ASP.NET MVC web site built using S#arp Architecture, NHibernate, the Spark view engine and Solr. Please see “Optimising an ASP.NET MVC web site part 1 – Introduction” for the background.
Whilst there are many different things you can do to a web application to kill performance, having a badly implemented database – or a well implemented database that you access in a bad way – has got to be right up there with the best. We therefore made the database our first port of call, and started off with a pretty simple approach – fire up SQL Server Profiler, start a new trace and see what the pain points are in each database.
Using SQL Server Profiler to find data access mistakes
As I mentioned in the introduction, the goal of our optimisation wasn’t to do everything, just to get the main things we thought were causing problems before we began our load testing. If you use the TSQL_Duration template in SQL Profiler for your trace, you can hit your site and quickly see what the most expensive queries are.
The screenshot above shows the results of repeatedly executing three different stored procedures in a random order. As you can see, the results are organised in ascending order of cost, with the interesting column being “Duration” – which shows the execution time in milliseonds.
When Howard first ran this against some of our landing and search result pages he quickly noticed a large nugget of smelliness at the bottom of the list, in the form of repeated calls to a stored procedure that was taking a whopping 400ms to execute. Whilst 400ms is not a long time – I’m sure it’s taken longer than that for you to read this sentence so far – when you call it 20 times you’ve suddenly spent 8 seconds just on database calls before any other page activity.
Digging into this identified three separate, but related issues.
1. One of our NHibernate mappings included a view based on a recursive query
Of course, this was something we’d told NHibernate to do. Our entities are organised hierarchically, and an entity can appear in multiple positions in the hierarchy. When we pass entity data to Solr for indexing, we work out the full list of paths for an entity and pass that through (see item 3 for more details.) This was done by creating a CTE to generate the list of hierarchy paths for each entity.
For the uninitiated, a Common Table Expression (or CTE) is a T-SQL construct that (amongst other things) enables you to execute recursive queries. They are very handy when dealing with hierarchical datasets, but aren’t the most efficient queries you can execute. Some time after the initial creation of the view, we’d realised that it would be useful to have the data it contained as part of our entity so we added a mapping into our NHibernate configuration. This meant that accessing that data would cause NHibernate to go off and execute a SELECT statement which included a join from the main entity table to the view. This query took in the region of 400ms.
We have two main types of page on the site landing pages and search/browse pages. The landing pages were causing this query to be executed 13 times and the browse pages were causing it to be executed 20 times, so it’s no wonder that performance had dropped. Whilst the view was never intended for use in this way, the requirement to have the data as part of our entity was still valid.
The simple solution to the problem was essentially to materialize the view. SQL Server can do this by turning the view into an indexed view – adding a unique clustered index to it does this. However, this approach isn’t applicable when the SELECT statement for the view uses a CTE, so we went with a more basic approach – since our product catalogue is actually managed in the back office and populated via ETL, we replaced the view with a table (complete with all the appropriate indexes) and tweaked the ETL to populate this table at the same time as all the rest.
For the pages in question, load time dropped from around 8 seconds to under 2, at which point we all got quite excited. However, this wasn’t solely to do with the materialisation of the view, as the investigation also showed up that…
2. Everything was being lazy loaded
By default, NHibernate uses lazy loading across the board. Depending on your object model and how it is used, this can lead to massive inefficiences. The classic example is the "SELECT N+1” anti-pattern, in which you retrieve an entity from a repository then iterate over a collection on that entity. If you’ve left NHibernate lazy loading the values, then this results in a SELECT statement being issued for every iteration of the loop. Have a look at this page on the NHibernate Profiler site for more information.
Howard spent some time using the NHibernate profiler to better understand what our app was doing, and made a number of tweaks to our mapping configuration to apply eager loading where it made sense. This provided another significant improvement, dramatically reducing the number of database calls made by the application.
3. An architectural rule imposed for performance reasons had been broken
We’d made an architectural decision to drive the search and browse functions on our website exclusively from Solr. We did this because it gives us more options for scalability. Since all the data in question comes from a single database, pulling the data directly out of that database would mean that as website load increases then so does the load on the database. The problem with that is that it’s difficult to scale a single database out – you can throw CPUs and RAM at your database server, but you’re always going to hit a limit, at which point you face some architectural challenges. This kind of basic architecture is shown in the following diagram:
Even though the main entity database is static, meaning that it would be possible to have a number of replicas of this database (potentially even one per webhead), this would require architectural change and would bring with it a new set of issues around data consistency. By pushing that load onto Solr, which has a far more straightforward scale-out story, we can grow far further without requiring a redesign. Solr basically stores a flat set of data in a form optimised for searching, and provides access via a web service. This means it is straightforward to have multiple instances of Solr running behind a load balancer. Solr makes this kind of setup even easier, as it supports a master-slave configuration as shown in the following diagram (I apologise now for the proliferation of connecting lines – I’m more of a reluctant architect than a Powerpoint architect):
In this example, the web tier will still talk direct to the data tier for some tasks – it’s unavoidable. However, we introduce the search tier which consists of a set of load balanced search servers, each of which contains an identical copy of the search index. In order to build that index, we push data from the database into the Solr master server, and the Solr master server indexes it and pushes the result out to each slave. If you can see past the nasty mess of lines, it should be obvious that as load grows, adding more webheads and/or Solr slaves is a relatively trivial operation.
However, you can have the best intentions in the world when you design your solution, but if you then go on to break the rules then it’s not going to happen. In our case, the code had ended up in a state where for each result retrieved from a Solr search, a database query would be made. Not only that, but the query in question was the horribly expensive one I mentioned in point 1.
This will no doubt cause some brow-wrinkling activity if you’re paying attention, as I mentioned that the original intended purpose of the view being used for the expensive query was to push data into Solr – so why, if the data was already in there, would we be accessing it again from the database once we’d pulled it out of Solr?
I can think of a number of good explanations, the best of which might be “My name’s Jon, and I’m a terrible programmer”. The day I get it right 100% of the time is quite probably the day I get to decide which of my Ferrari’s I drive to work, as well as the day that my projects no longer require testers, and I doubt that will ever happen. Maybe I just missed the Ballmer Peak on that day, but whatever happened, I’m just happy when the mistakes I make are as easy to identify and fix as this one was.
Using the SQL Server Database Engine Tuning Advisor
In tandem with this, we also made use of the SQL Server Database Engine Tuning Advisor. This is the next stage of evolution for the Index Tuning Wizard that (I believe) first appeared in SQL Server 2000. The MSDN page for it is here; the short (aka .NET developer) version is that you put a trace file into one end, and out of the other comes a variety of changes you can make to your database to make it go faster.
In order to generate the input, you use SQL Server Profiler with the “Tuning” template. Once this is running, and saving the trace to a file, you need to generate a “typical” load against your application. There are various ways to do this, ranging from fully manual to fully automated. We were fortunate on our project that we had a complete set of Selenium tests for the site and a way of running them as part of the build. I’m hoping that Tom, the consultant responsible for this, will start blogging about it at some point as it’s really interesting stuff. It meant that to generate our trace, all we had to do was start the profiler, kick of the test suite and go for a beer. The next morning, we pushed the resultant file through the tuning advisor and received our recommendations out of the other end. There was little we disagreed with, and most of the recommendations were subsequently applied to the database.
You may have noticed that the main focus in this post has been around the way the application accessed the data, rather than the way the database was built. Over the duration of the project, and in addition to the fantastic data architect we had in the team, we’ve had guest appearances by JRJ, Jamie and Simon, so it shouldn’t come as a massive surprise that the database works well. EMC Consulting is rightly proud of the database guys we have on the team and whilst I’m sure that there are many further tweaks that could be made to our databases to add further go-faster stripes, they aren’t needed at the moment. Optimisation is one of many things that doesn’t benefit by having the word “premature” put in front of it – it’s basically another form of YAGNI. So, until testing proved otherwise, we were happy to stop at this point and move on to something else.
In the next exciting epsiode of “Optimising an ASP.NET MVC web site”, we look at a pattern for applying application layer caching. Don’t touch that dial!
Please let me know what you think of my blog posts, either by commenting, dropping me an email, or via Twitter.