New Relic vs Prometheus: What are the differences?
Developers describe New Relic as "SaaS Application Performance Management for Ruby, PHP, .Net, Java, Python, and Node.js Apps". New Relic is the all-in-one web application performance tool that lets you see performance from the end user experience, through servers, and down to the line of application code. On the other hand, Prometheus is detailed as "An open-source service monitoring system and time series database, developed by SoundCloud". Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
New Relic and Prometheus are primarily classified as "Performance Monitoring" and "Monitoring" tools respectively.
Some of the features offered by New Relic are:
- Performance Data Retention
- Real-User Response Time, Throughput, & Breakdown by Layer
- App Response Time, Throughput, & Breakdown by Component
On the other hand, Prometheus provides the following key features:
- a multi-dimensional data model (timeseries defined by metric name and set of key/value dimensions)
- a flexible query language to leverage this dimensionality
- no dependency on distributed storage
"Easy setup" is the top reason why over 411 developers like New Relic, while over 32 developers mention "Powerful easy to use monitoring" as the leading cause for choosing Prometheus.
Prometheus is an open source tool with 25K GitHub stars and 3.56K GitHub forks. Here's a link to Prometheus's open source repository on GitHub.
According to the StackShare community, New Relic has a broader approval, being mentioned in 3142 company stacks & 576 developers stacks; compared to Prometheus, which is listed in 243 company stacks and 85 developer stacks.
What is New Relic?
What is Prometheus?
Need advice about which tool to choose?Ask the StackShare community!
Sign up to add, upvote and see more prosMake informed product decisions
Sign up to add, upvote and see more consMake informed product decisions
Sign up to get full access to all the companiesMake informed product decisions
Sign up to get full access to all the tool integrationsMake informed product decisions
We recently implemented Thanos alongside Prometheus into our Kubernetes clusters, we had previously used a variety of different metrics systems and we wanted to make life simpler for everyone by just picking one.
Prometheus seemed like an obvious choice due to its powerful querying language, native Kubernetes support and great community. However we found it somewhat lacking when it came to being highly available, something that would be very important if we wanted this to be the single source of all our metrics.
Thanos came along and solved a lot of these problems. It allowed us to run multiple Prometheis without duplicating metrics, query multiple Prometheus clusters at once, and easily back up data and then query it. Now we have a single place to go if you want to view metrics across all our clusters, with many layers of redundancy to make sure this monitoring solution is as reliable and resilient as we could reasonably make it.
If you're interested in a bit more detail feel free to check out the blog I wrote on the subject that's linked.
Why we spent several years building an open source, large-scale metrics alerting system, M3, built for Prometheus:
By late 2014, all services, infrastructure, and servers at Uber emitted metrics to a Graphite stack that stored them using the Whisper file format in a sharded Carbon cluster. We used Grafana for dashboarding and Nagios for alerting, issuing Graphite threshold checks via source-controlled scripts. While this worked for a while, expanding the Carbon cluster required a manual resharding process and, due to lack of replication, any single node’s disk failure caused permanent loss of its associated metrics. In short, this solution was not able to meet our needs as the company continued to grow.
To ensure the scalability of Uber’s metrics backend, we decided to build out a system that provided fault tolerant metrics ingestion, storage, and querying as a managed platform...
(GitHub : https://github.com/m3db/m3)
Which #APM / #Infrastructure #Monitoring solution to use?
The 2 major players in that space are New Relic and Datadog Both are very comparable in terms of pricing, capabilities (Datadog recently introduced APM as well).
In our use case, keeping the number of tools minimal was a major selection criteria.
As we were already using #NewRelic, my recommendation was to move to the pro tier so we would benefit from advanced APM features, synthetics, mobile & infrastructure monitoring. And gain 360 degree view of our infrastructure.
Few things I liked about New Relic: - Mobile App and push notificatin - Ease of setting up new alerts - Being notified via email and push notifications without requiring another alerting 3rd party solution
I've certainly seen use cases where NewRelic can also be used as an input data source for Datadog. Therefore depending on your use case, it might also be worth evaluating a joint usage of both solutions.
We currently monitor performance with the following tools:
- Heroku Metrics: our main app is Hosted on Heroku, so it is the best place to get quick server metrics like memory usage, load averages, or response times.
- Good old New Relic for detailed general metrics, including transaction times.
- Skylight for more specific Rails
Controller#actiontransaction times. Navigating those timings is much better than with New Relic, as you get a clear full breakdown of everything that happens for a given request.
Skylight offers better Rails performance insights, so why use New Relic? Because it does frontend monitoring, while Skylight doesn't. Now that we have a separate frontend app though, our frontend engineers are looking into more specialized frontend monitoring solutions.
Finally, if one of our apps go down, Pingdom alerts us on Slack and texts some of us.
Regarding Continuous Integration - we've started with something very easy to set up - CircleCI , but with time we're adding more & more complex pipelines - we use Jenkins to configure & run those. It's much more effort, but at some point we had to pay for the flexibility we expected. Our source code version control is Git (which probably doesn't require a rationale these days) and we keep repos in GitHub - since the very beginning & we never considered moving out. Our primary monitoring these days is in New Relic (Ruby & SPA apps) and AppSignal (Elixir apps) - we're considering unifying it in New Relic , but this will require some improvements in Elixir app observability. For error reporting we use Sentry (a very popular choice in this class) & we collect our distributed logs using Logentries (to avoid semi-manual handling here).
We have Prometheus as a monitoring engine as a part of our stack which contains Kubernetes cluster, container images and other open source tools. Also, I am aware that Sysdig can be integrated with Prometheus but I really wanted to know whether Sysdig or sysdig+prometheus will make better monitoring solution.
Free Heroku add-on. Not particularly useful for us. Rails profilers tend to do a better job at the app level. And I can never really figure out what’s going on with Heroku by looking at New Relic. I don’t know if we’re just not using New Relic correctly or if it really does just suck for our use case. But I guess some insight is better than none.
How do you know what parts of the workflow need improvement? Measure it. With New Relic in place, we have graphs of our API performance and can directly see if a server or zone is causing trouble, and the impact of our changes. There’s no comparison between a real-time performance graph and “Strange, the site seems slow, I should tail the logs”.
We monitor and troubleshoot our app's performance using New Relic, which gives us a great view into each type of request that hits our servers. It also gives us a nice weekly summary of error rates and response times so that we know how well we've done in the past week.
We primarily use Prometheus to gather metrics and statistics to display them in Grafana. Aside from that we poll Prometheus for our orchestration-solution "JCOverseer" to determine, which host is least occupied at the moment.
I'm trying to wring more instrumentation out of New Relic as it pertains to Rack, but for the time being, New Relic is monitoring/alerting uptime and some basic performance metrics.
Just like we care about errors, we care about metrics - especially around performance. You'd be crazy not to use it - and not surprisingly, it's a one-click add-on in Heroku.
Gather metrics from systems and applications. Evaluate alerting rules. Alerts are pushed to OpsGenie and Slack.
We primarily use Prometheus to gather metrics and statistics to display them in Grafana.