When we were facing performance issues with the new StackShare app, we originally thought it was a server issue. So we did quite a bit of research to see how many dynos we should be using for our sort of application and traffic profile. We couldn’t find anything useful online, so I ended up asking my buddy Alain over at BlockScore. After a quick convo with him, I knew we should be totally fine with just 2 dynos.
We also tested the theory by increasing the number of dynos and re-running the load tests. The extra dynos had little to no effect on the error rate, which confirmed that it wasn’t a server issue.
So that meant it was an application issue. New Relic wasn’t any help. I spoke with another friend who suggested we use a profiler. We totally should have been using one all along. We added mini-profiler, which was great for identifying slow queries and overall page load times. We also had the Rails Chrome extension so we could see how long view rendering was taking. So we cleaned up the slowest queries.
We tried to use mini-profiler in production on the new StackShare app and, for some reason, we couldn’t get it to work. We were in a time crunch, so I asked Alain what they used, and he said they use Skylight in production. Funnily enough, I remembered the name Skylight because we had listed it on the site a while back. So we set it up, and at first we couldn’t really see how it was useful. Then we realized that what we were seeing was a ton of repeat queries on some of the pages we load tested.
Skylight is cool because it sort of gives you the full MVC profile. We were able to pinpoint specific db queries that were being repeated, so we cleaned those up pretty quickly. But then we noticed the views were taking up most of the load time, so we started implementing caching more aggressively. After we cleaned up the db queries and added more caching, our pages went from this: [before screenshot] to this: [after screenshot]
Skylight ended up being super useful. We use it in production now.
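To make "repeat queries" concrete, here is a minimal sketch of how you could spot them yourself from a list of SQL statements logged during a single page load. This is not how Skylight works internally; the function names, the table names, and the sample queries are all hypothetical and only there for illustration.

```python
import re
from collections import Counter


def normalize(sql: str) -> str:
    """Replace literal values with '?' so queries that differ only by id look identical."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # bare numeric literals
    return re.sub(r"\s+", " ", sql).strip()


def repeated_queries(queries, threshold=2):
    """Return each normalized query that ran at least `threshold` times, with its count."""
    counts = Counter(normalize(q) for q in queries)
    return {q: n for q, n in counts.items() if n >= threshold}


# Hypothetical queries logged while rendering one page (assumed data, not from our app).
log = [
    "SELECT * FROM stacks WHERE id = 1",
    "SELECT * FROM tools WHERE stack_id = 1",
    "SELECT * FROM users WHERE id = 7",
    "SELECT * FROM users WHERE id = 8",
    "SELECT * FROM users WHERE id = 9",
]

print(repeated_queries(log))
```

In this made-up log the `users` lookup shows up three times with different ids, which is the kind of pattern that usually means one batched query could replace several single-row ones.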
I don’t remember exactly how I heard about Loader.io. I think I was adding load testing services to Leanstack. I saw it was a SendGrid Labs project, so there would be competent people behind it. And since they had a Heroku Add-On it was easy to get started. Loader.io is cool because it’s super simple to set up.
When executing tests, you can see error rate and average response times. But we also check the Heroku logs to see if they are real errors.
My biggest complaint: figuring out what load to set for your tests is difficult. We don’t understand the language they use, and no one we’ve spoken to who has used Loader.io understands it either. We’ve been testing at 250 clients (maintain client load) for all of our tests on 2 dynos. I thought that meant a constant load of 250 people using the site over a minute. But the number of requests at the end of the test suggests it’s more like 250 additional clients hitting the site every second for a minute. I guess accommodating a higher load is better anyway? 250 concurrent users seems to be our average HN traffic spike, so that’s why we went with that load.
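For what it’s worth, here is a rough back-of-the-envelope sketch of the two readings described above. It is only arithmetic under stated assumptions, not a description of how Loader.io actually schedules clients: the function names are made up, and the average response times are hypothetical values chosen to show the range.

```python
def requests_if_new_clients_each_second(clients_per_second, duration_s):
    """Reading A: a fresh batch of clients arrives every second, one request each."""
    return clients_per_second * duration_s


def requests_if_constant_load(concurrent_clients, duration_s, avg_response_s):
    """Reading B: a fixed pool of clients, each sending a new request as soon as the last one returns."""
    return int(concurrent_clients * duration_s / avg_response_s)


CLIENTS = 250      # load setting used in our tests
DURATION_S = 60    # one-minute test

print("new clients every second:", requests_if_new_clients_each_second(CLIENTS, DURATION_S))

# Assumed average response times in seconds (hypothetical, for illustration only).
for avg_response_s in (0.5, 1.0, 2.0):
    total = requests_if_constant_load(CLIENTS, DURATION_S, avg_response_s)
    print(f"constant load, {avg_response_s}s avg response:", total)
```

Under these assumptions the total request count for a constant pool of 250 clients depends heavily on response time, and at roughly one second per response it matches the "250 new clients every second" figure exactly. That may be part of why the end-of-test numbers are hard to interpret on their own.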