What is AppSignal?
What is New Relic?
What are the cons of using AppSignal?
Which #APM / #Infrastructure #Monitoring solution to use?
The 2 major players in that space are New Relic and Datadog. Both are very comparable in terms of pricing and capabilities (Datadog recently introduced APM as well).
In our use case, keeping the number of tools minimal was a major selection criterion.
As we were already using #NewRelic, my recommendation was to move to the pro tier so we would benefit from advanced APM features, synthetics, and mobile & infrastructure monitoring, and gain a 360-degree view of our infrastructure.
A few things I liked about New Relic:
- Mobile app and push notifications
- Ease of setting up new alerts
- Being notified via email and push notifications without requiring another third-party alerting solution
I've certainly seen use cases where New Relic is also used as an input data source for Datadog. Depending on your use case, it might therefore also be worth evaluating joint usage of both solutions.
We currently monitor performance with the following tools:
- Heroku Metrics: our main app is hosted on Heroku, so it is the best place to get quick server metrics like memory usage, load averages, or response times.
- Good old New Relic for detailed general metrics, including transaction times.
- Skylight for more specific Rails Controller#action transaction times. Navigating those timings is much better than with New Relic, as you get a clear full breakdown of everything that happens for a given request.
Skylight offers better Rails performance insights, so why use New Relic? Because it does frontend monitoring, while Skylight doesn't. Now that we have a separate frontend app though, our frontend engineers are looking into more specialized frontend monitoring solutions.
Finally, if one of our apps goes down, Pingdom alerts us on Slack and texts some of us.
Regarding Continuous Integration - we started with something very easy to set up, CircleCI, but over time we've been adding more & more complex pipelines, and we use Jenkins to configure & run those. It's much more effort, but at some point we had to pay for the flexibility we expected. Our source code version control is Git (which probably doesn't require a rationale these days) and we have kept our repos in GitHub since the very beginning & never considered moving out. Our primary monitoring these days is in New Relic (Ruby & SPA apps) and AppSignal (Elixir apps) - we're considering unifying it in New Relic, but this will require some improvements in Elixir app observability. For error reporting we use Sentry (a very popular choice in this class) & we collect our distributed logs using Logentries (to avoid semi-manual handling here).
If you follow the registration flow, you end up with analytics running in virtually a minute. Awesome first experience.
I don't have my application in production, so I needed to enable Skylight in development, but Skylight pointed me to the exact paragraph of the documentation that covers this, which helped.
When we were facing performance issues with the new StackShare app, we originally thought it was a server issue. So we did quite a bit of research to see how many dynos we should be using for the sort of application and traffic profile we have. We couldn’t find anything useful online, so I ended up asking my buddy Alain over at BlockScore. After a quick convo with him, I knew we should be totally fine with just 2 dynos.
We also tested the theory by increasing the number of dynos and running the load tests. They had little to no effect on error rate, so this also confirmed that it wasn’t a server issue.
So that meant it was an application issue. New Relic wasn’t any help. I spoke with another friend who suggested we use a profiler. We totally should have been using one all along. We added mini-profiler, which was great for identifying slow queries and overall page load times. We also had the Rails Chrome extension so we could see how long view rendering was taking. So we cleaned up the slowest queries.
We tried to use mini-profiler in production on the new StackShare app and for some reason, we couldn’t get it to work. We were in a time crunch so I asked Alain what they used and he said that they use Skylight in production. Funny enough, I remembered the name Skylight because we listed it on the site a while back. So we did that, and at first we couldn’t really see how it was useful. Then we realized what we were seeing were a ton of repeat queries on some of the pages we load tested.
Skylight is cool because it sort of gives you the full MVC profile. We were able to pinpoint specific db queries that were being repeated, so we cleaned those up pretty quickly. But then we noticed the views were taking up all the load time, so we started implementing caching more aggressively. After we cleaned up the db queries and added more caching, our pages went from this: [screenshot] to this: [screenshot]
Skylight ended up being super useful. We use it in production now.
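To make the "repeat queries" problem above concrete, here is a minimal Ruby sketch of how repeated queries can be spotted in a list of SQL statements captured during one request. This is not how Skylight works internally; `normalize_sql`, `repeated_queries`, and the sample log are all hypothetical names and data made up for illustration.

```ruby
# Hypothetical helper: replace literal values with "?" so that statements
# differing only in their parameters collapse into one shape.
def normalize_sql(sql)
  sql.gsub(/'[^']*'/, "?")  # string literals
     .gsub(/\b\d+\b/, "?")  # standalone numeric literals
     .gsub(/\s+/, " ")      # collapse whitespace
     .strip
end

# Hypothetical helper: count each normalized statement and keep those
# seen at least `threshold` times.
def repeated_queries(statements, threshold: 2)
  counts = statements.each_with_object(Hash.new(0)) do |sql, acc|
    acc[normalize_sql(sql)] += 1
  end
  counts.select { |_sql, count| count >= threshold }
end

# Made-up sample of statements issued while rendering a single page.
log = [
  "SELECT * FROM tools WHERE id = 1",
  "SELECT * FROM tools WHERE id = 2",
  "SELECT * FROM tools WHERE id = 3",
  "SELECT * FROM users WHERE email = 'a@example.com'"
]

repeated_queries(log).each do |sql, count|
  puts "#{count}x #{sql}"
end
```

A statement shape that shows up many times within one request is usually a candidate for batching into a single query or for caching.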
Free Heroku add-on. Not particularly useful for us. Rails profilers tend to do a better job at the app level. And I can never really figure out what’s going on with Heroku by looking at New Relic. I don’t know if we’re just not using New Relic correctly or if it really does just suck for our use case. But I guess some insight is better than none.
How do you know what parts of the workflow need improvement? Measure it. With New Relic in place, we have graphs of our API performance and can directly see if a server or zone is causing trouble, and the impact of our changes. There’s no comparison between a real-time performance graph and “Strange, the site seems slow, I should tail the logs”.
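As a rough illustration of "measure it", the sketch below times labelled blocks of work in-process and summarizes them. It is not New Relic's API; the `Timings` class, its methods, and the endpoint labels are hypothetical, and a hosted APM tool does far more (aggregation across servers, graphs, alerting).

```ruby
# Hypothetical in-process timer that records durations per label.
class Timings
  def initialize
    @samples = Hash.new { |hash, key| hash[key] = [] }
  end

  # Runs the block, records how long it took, and returns the block's value.
  def measure(label)
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    @samples[label] << elapsed
  end

  # Returns count, mean, and max duration (in seconds) for each label.
  def summary
    @samples.map do |label, values|
      [label, { count: values.size, mean: values.sum / values.size, max: values.max }]
    end.to_h
  end
end

timings = Timings.new

# Simulated requests; `sleep` stands in for real work.
3.times { timings.measure("GET /api/items") { sleep 0.01 } }
result = timings.measure("GET /api/slow") { sleep 0.05; :done }

timings.summary.each do |label, stats|
  puts format("%-16s count=%d mean=%.1fms max=%.1fms",
              label, stats[:count], stats[:mean] * 1000, stats[:max] * 1000)
end
```

Using a monotonic clock avoids skew from system clock adjustments, which matters when the numbers are compared before and after a change.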
We monitor and troubleshoot our app's performance using New Relic, which gives us a great view into each type of request that hits our servers. It also gives us a nice weekly summary of error rates and response times so that we know how well we've done in the past week.
I'm trying to wring more instrumentation out of New Relic as it pertains to Rack, but for the time being, New Relic is monitoring/alerting uptime and some basic performance metrics.