Datadog vs Rollbar: What are the differences?
What is Datadog? Unify logs, metrics, and traces from across your distributed infrastructure. Datadog is the leading service for cloud-scale monitoring. It is used by IT, operations, and development teams who build and operate applications that run on dynamic or hybrid cloud infrastructure. Start monitoring in minutes with Datadog!
What is Rollbar? Full-stack error monitoring for developers. Rollbar helps development teams find and fix errors faster. Quickly pinpoint what’s broken and why. View exceptions from all of your languages, frameworks, platforms & environments in one place. Get context & insights to defeat all errors.
Datadog and Rollbar are primarily classified as "Performance Monitoring" and "Exception Monitoring" tools respectively.
Some of the features offered by Datadog are:
- 14-day Free Trial for an unlimited number of hosts
- 200+ turn-key integrations for data aggregation
- Clean graphs of StatsD and other integrations
On the other hand, Rollbar provides the following key features:
- Errors get queued, de-duped, grouped and prioritized
- View detailed stack traces with local variables
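As a rough illustration of what "de-duped and grouped" can mean in practice, the sketch below groups error occurrences by a fingerprint built from the exception type and stack frames. This is not Rollbar's actual algorithm; the `fingerprint` and `group_occurrences` helpers and the sample data are hypothetical.

```python
import hashlib
from collections import Counter


def fingerprint(exc_type, frames):
    """Hash the exception type plus the (file, function) of each frame.

    Line numbers and message text are deliberately left out, so small code
    edits or varying messages do not split one bug into many groups.
    """
    key = exc_type + "|" + "|".join(f"{path}:{func}" for path, func in frames)
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def group_occurrences(occurrences):
    """Return (fingerprint, count) pairs, most frequent group first."""
    counts = Counter(fingerprint(o["type"], o["frames"]) for o in occurrences)
    return counts.most_common()


# Hypothetical occurrences: two share a stack, one does not.
occurrences = [
    {"type": "KeyError", "frames": [("cart.py", "total"), ("views.py", "checkout")]},
    {"type": "KeyError", "frames": [("cart.py", "total"), ("views.py", "checkout")]},
    {"type": "TimeoutError", "frames": [("client.py", "fetch")]},
]
groups = group_occurrences(occurrences)
```

Sorting groups by count is one simple way to prioritize; a real service would likely also weigh recency and the number of affected users.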
"Monitoring for many apps (databases, web servers, etc)", "Easy setup" and "Powerful ui" are the key factors why developers consider Datadog; whereas "Consolidates similar errors by impact", "Slack integration" and "Centralize error management" are the primary reasons why Rollbar is favored.
According to the StackShare community, Datadog has broader approval, being mentioned in 540 company stacks and 223 developer stacks, compared to Rollbar, which is listed in 479 company stacks and 116 developer stacks.
Which #APM / #Infrastructure #Monitoring solution to use?
The 2 major players in that space are New Relic and Datadog. Both are very comparable in terms of pricing and capabilities (Datadog recently introduced APM as well).
In our use case, keeping the number of tools minimal was a major selection criterion.
As we were already using #NewRelic, my recommendation was to move to the pro tier so we would benefit from advanced APM features, synthetics, and mobile & infrastructure monitoring, and gain a 360-degree view of our infrastructure.
A few things I liked about New Relic:
- Mobile app and push notifications
- Ease of setting up new alerts
- Being notified via email and push notifications without requiring another third-party alerting solution
I've certainly seen use cases where NewRelic can also be used as an input data source for Datadog. Therefore depending on your use case, it might also be worth evaluating a joint usage of both solutions.
Our primary source of monitoring and alerting is Datadog. We’ve got prebuilt dashboards for every scenario and integration with PagerDuty to manage routing any alerts. We’ve definitely scaled past the point where managing dashboards is easy, but we haven’t had time to invest in using features like Anomaly Detection. We’ve started using Honeycomb for some targeted debugging of complex production issues and we are liking what we’ve seen. We capture any unhandled exceptions with Rollbar and, if we realize one will keep happening, we quickly convert the metrics to point back to Datadog, to keep Rollbar as clean as possible.
We use Segment to consolidate all of our trackers, the most important of which goes to Amplitude to analyze user patterns. However, if we need a more consolidated view, we push all of our data to our own data warehouse running PostgreSQL; this is available for analytics and dashboard creation through Looker.
Team Rollbar--I LOVE you guys and your wonderful service! This review is far too long overdue.
Let me save you a bunch of time and make the decision for you. If you're not already using an error tracking platform, you must. If you're deciding between which services to use, just go with Rollbar, and stop deliberating.
Rollbar is hands-down, THE BEST full-stack application error and exception monitoring/tracking system.
I was an early user and first started using Rollbar in early 2012 (back when it was still called Ratchet.io). Suffice it to say, it has completely transformed and leveled-up the way I build and write applications.
Rollbar is now a must-have for any application I build. TDD? Yeah, could do that, or you can just be more lean and start building, and Rollbar will catch all of your exceptions for you. Large team? Even more so that you need Rollbar, so that you can detect and fix errors before they inconvenience your users.
Here are the reasons for why I think Rollbar is great:
Rollbar has an exceptional, world-class team. Rollbar is built by engineers, for engineers. I know because I've worked with a few folks at team Rollbar, including the founder/CEO Brian Rue, who has been a mentor and advisor to me at a few startups, and some of my elite former colleagues who were hand-picked to join the Rollbar team. They are extremely talented hackers and engineers.
Rollbar scales, and is extremely reliable. We're not talking Mickey Mouse pretend scale; they actually scale. Guess what? Unlike most of the rest of the web, they're actually NOT primarily an AWS-based stack (yeah, because AWS outages can cause large chunks of the web to fail). Rollbar is in multiple data centers across the world for improved latency, durability/reliability of data.
Rollbar is extremely easy to integrate and very well documented. There are modules/agents for just about every stack and programming language. A basic setup takes 5-10 minutes.
Rollbar is thoughtful. In the early days of Rollbar, data wasn't scrubbed so potentially you could see sensitive information in the notifications you got. Now, they've significantly improved the reporting agents and UI so that sensitive information can be scrubbed before sending notifications, and additional sensitive/PII fields can be configured in the interface.
Rollbar beats the pants off of their competitors. The primary competitors I'm thinking of are Sentry and New Relic. New Relic is more for infrastructure than application, and often, infrastructure smells and problems are code problems. In terms of budgeting or planning IT spend, I would maximize spending on Rollbar and maybe some more basic infrastructure monitoring like hand-rolled Nagios or even Datadog (which also totally rocks, btw) instead of New Relic. As for comparison with Sentry? See next bullet point.
Rollbar is an adult, whereas Sentry is a kid. Sentry came out of Disqus, and was built by designer-engineers. Don't get me wrong--they have good engineers, but not as good as Rollbar's. I don't care if Sentry is more popular atm or if the UI looks better; I want to know that I can have absolute confidence in my error tracking platform and sleep better at night. If error tracking services were facial hair, Rollbar would be a full, lush beard, and Sentry would be the teenage kid with sporadic prickly hairs here and there and some peach fuzz on the side. Rollbar doesn't rate-limit by default, which means you get all of your exception occurrences notified and tracked. It is 4K Ultra HD, if you will. (Though, to help manage costs and temper noise, you can set custom rate limits per API key--this is so powerful!) Sentry rate-limits by default, resulting in "sampling" error tracking which isn't full coverage and leaves you erroneously thinking that your app is in better health than it actually is.
Rollbar is "multi-tenant" (similar to GitHub) in the sense that you can have one user account affiliated with multiple organizations and projects. This is a nice added convenience.
Rollbar is enterprise-ready and has on-premise deployments.
As of this review, it's 2016--why aren't you using Rollbar yet? If you're still trying to hand-roll your own error logging system, I would seriously question you or your company's technical competence.
We're a real-time financial services messaging company, so being able to monitor our servers and applications in real-time is important to us. We also like a good deal, so $15/server seemed a bargain.
What were we looking for?
We wanted to monitor our MS infrastructure (servers, SQL) and apps (C#) to understand performance issues and be able to rectify. We also want to be able to do long-term trending. And we wanted to go from nothing to live in a short time.
Installing the Datadog agent on the servers was a breeze and enabling the integrations for SQL and Windows trivial.
Using the StatsD based API was also very easy - no worrying about JSON or UDP calls. The ability to add tags to all metrics is also a key benefit. We run multiple (100+) instances of a single application and being able to distinguish events from each one via tagging, or to see aggregates, is extremely useful.
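For readers curious what the client library hides, here is a minimal sketch of a tagged metric in the DogStatsD-style text format (`name:value|type|#tag:value,...`), sent over UDP to a local agent. The metric name, tag values, and helper functions are illustrative assumptions, not taken from the review, and the format should be checked against the current Datadog documentation before relying on it.

```python
import socket


def format_metric(name, value, metric_type="c", tags=None):
    """Build a DogStatsD-style datagram: name:value|type|#tag1:v1,tag2:v2."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        # Sort tags so the output is deterministic.
        datagram += "|#" + ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; 8125 is the conventional StatsD port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# One counter increment, tagged with the (hypothetical) instance that emitted it.
line = format_metric(
    "orders.processed", 1, "c", {"instance": "app-042", "env": "prod"}
)
```

Calling `send_metric(line)` would hand the datagram to an agent listening locally. Tagging each of the 100+ instances this way is what lets a dashboard either split a metric per instance or aggregate across all of them.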
In all it took 2 days R&D to instrument our key applications sufficiently for production deployment. Deploying the agent to our production servers took 30 mins, giving our Ops team complete visibility for the 1st time.
What have we learned?
Since we've been live Datadog has given us numerous insights into the way our system behaves, from uneven server loadings and sporadic memory usage to performance tuning a key application that resulted in a 50% increase in throughput. Knowing what's taking the time has been a boon.
The other nice surprise has been the evolving nature of Datadog. It seems like every couple of weeks there's a new feature on the site.
- I like the transparent pricing. Services that won't show me the price without having to talk to a sales person are really annoying.
- Support has been good. We've contacted them several times with questions and always had a quick response (time zone considered...we're in London) and a helpful answer.
So What's bad?
Probably the weakest aspect at the moment is the long-term trending of data. Whilst you can wind the time bar back to see what happened last week, you can't ask questions like "show me the peak period each day for the last x months". The "get data" API is also fairly weak. Neither is a concern at the moment, and I'm sure they're on the to-do list.
Our team has been testing Rollbar as a possible replacement for Honeybadger and has liked it enough to decide to make the switch! Some of the features that stood out to us:
Better UI/UX in general that makes the app feel more comfortable to use
Pricing is more affordable (assuming a reasonable number of events per month)
Powerful notification settings, letting us set up different issue severity and only receive notifications on what matters most (more signal, less noise)
Mute option (another way to cut down on the noise)
More powerful options for auto-resolving and cleaning up old errors
Nice "person tracking" functionality to see who is affected by various errors
Good asynchronous error reporting options in the Ruby adapter
RQL query language for powerful searching (although it's still a little rough around the edges)
We ran into a few minor annoyances:
The gem providing Resque support is developed by a third-party and doesn't seem to get much use (1 watcher and 4 stars as of the time of this review). Honeybadger provides native Resque support in the official Ruby gem.
There doesn't seem to be any search functionality on the documentation page.
No documentation and a slightly unintuitive workflow for one of the integrations we use.
Overall, Rollbar looks like a solid service that was easy to set up, easy to use, and has some powerful features for searching through past error data. We're looking forward to making the switch.
Rollbar became a must for me in post-deployment exception handling and debugging production-related problems.
Here's a list of features that made Rollbar a lifesaver for me:
You can manually report anything you want; this feature helps a lot in debugging production-related bugs. You can also attach a custom payload to be inspected in the dashboard later.
You can use Rollbar-agent internally in your server to prevent Rollbar from blocking your business logic and routines. You can also use common background job processors like Sidekiq, Resque, etc.
You have a very rich documentation to read, but actually you'll just need the first 2 paragraphs to go ahead with it!
Wide set of helpers
As a Rails developer, I found a lot of Rails-related helpers, such as Capistrano deployment hooks, ActiveJob integration, async error reporting, and Rails boot error handling.
You don't need to refresh the page each time to see new exceptions; everything is real-time!
You can try Rollbar for free without any financial commitment, and you can keep using it for free if your rate stays under 5,000 reports per month.
I had a very good experience with their support, they respond in a short time and give flexible solutions.
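The manual reporting with a custom payload mentioned in this review can be pictured as a small JSON item posted to Rollbar's item endpoint. The sketch below only builds that item; the field names follow my understanding of the Rollbar item API (`access_token`, `data.environment`, `data.level`, `data.body.message.body`, `data.custom`) and should be verified against the official reference. The token, message, and custom fields are placeholders.

```python
import json


def build_item(access_token, message, level="info",
               environment="production", custom=None):
    """Assemble a message-type item with an optional custom payload."""
    data = {
        "environment": environment,
        "level": level,
        "body": {"message": {"body": message}},
    }
    if custom:
        # Arbitrary key/value context to inspect in the dashboard later.
        data["custom"] = custom
    return {"access_token": access_token, "data": data}


item = build_item(
    "POST_SERVER_ITEM_TOKEN",  # placeholder, not a real token
    "Checkout total mismatch",
    level="warning",
    custom={"order_id": 1234, "expected": "19.99", "actual": "18.99"},
)
body = json.dumps(item)  # what would be sent as the HTTP request body
```

In a real application the official SDK for your language would normally build and send this for you, ideally from a background worker so reporting never blocks a request.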
I've been a systems administrator most of my career. Everywhere I went, I'd have to rebuild the same monitoring + graphing system. And then make sure that every machine wrote to that system and every application handed up the proper metrics through whatever mechanism seemed good at the time.
Then, as CTO of SimpleReach, single-handedly managing over 200 servers in addition to everything else, I found Datadog. We were already using statsd to instrument our applications, now it was just a matter of getting that data to Datadog. We use Chef, so I installed the Datadog agent on every machine in about 10 minutes and we were up and running.
The best part was that we had a deploy problem the next day with one of our main applications and troubleshooting took minutes instead of hours (and Datadog immediately paid for itself). Now no new features go out without instrumentation and no machine gets created without being monitored.
Datadog just scales with us. Great service and I highly recommend it to anyone not looking to reinvent the wheel with monitoring and instrumentation.
The service has been easy to use and reliable. It's very well documented, and was easy to integrate into our stack. Obviously, you only get out of the error messages what you put in. They will log the person's IP, user agent, etc, but you will need to use descriptive log messages to know what's going on and where.
We are interested in converting some of our APIs over to Rollbar as well to try to get a more holistic view of errors as they happen. This is a very good service and I recommend them.
The guys from Rollbar made a very useful error monitoring service.
We have implemented it in our startup weblium.com. This allowed us to see all errors centrally in one place and now our site has become significantly more stable than it was previously.
The main advantages of Rollbar:
1. Reporting all exceptions and errors. We just add simple Rollbar code and it catches all unhandled exceptions. A very useful and simple feature that we use.
2. Custom error reporting. We add some asserts on the most complex part of the code, to catch potential logic errors at the very beginning.
3. Real-time log. When you have to test in production, instant error logging helps save time.
4. Assign owner. Each error can be assigned to a developer and marked as resolved when done. Very useful.
5. RQL console. This new tool allows us to create detailed data queries. A very powerful tool for analyzing problem areas in our code.
We have had a very good experience with Rollbar. I highly recommend this service.
Datadog makes running a service with 800,000 unique users a month possible as a single developer/maintainer. I bought a separate monitor just to keep my Datadog dashboards always visible and rely on triggers to keep watch over 20+ servers.
We use Datadog to monitor our servers and some application metrics. Easy to get started and scale to many servers. Datadog support engineers are always quick to respond to bugs and other challenges.
We've only just added Rollbar but it's shaping up well. Being able to collate all of our errors with stack traces along with the current user and time is going to save more time than you could ever imagine, as we currently have to trawl through Docker logs one after another just to find a single error, whereas with Rollbar we can easily search and even be alerted when something goes wrong.
We just started looking into Datadog, but from what we see, it's like New Relic meets Loggly. It's really easy to plug in different services (like the ones on this list) and get detailed analysis of what is happening on your servers and services. It makes tracking down sparse and difficult-to-understand problems possible.
Rollbar handles any unhandled exception, Express route error, and any log entry of level warning or error. It notifies our Slack ops channel and is integrated with GitHub to allow us to create issues directly from reported errors.
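The "any log entry of level warning or error" rule can be sketched with a standard-library logging handler that forwards only records at WARNING or above. The setup described above is an Express (Node.js) app; this Python version is an analogous illustration, and `ErrorTrackerHandler` with its in-memory `reported` list is a hypothetical stand-in for a real error-tracking client.

```python
import logging


class ErrorTrackerHandler(logging.Handler):
    """Forward WARNING-and-above log records to an error-tracking sink."""

    def __init__(self, sink):
        # The handler-level threshold drops DEBUG and INFO records.
        super().__init__(level=logging.WARNING)
        self.sink = sink

    def emit(self, record):
        self.sink.append({
            "level": record.levelname.lower(),
            "message": record.getMessage(),
            "logger": record.name,
        })


reported = []  # stand-in for calls to a real reporting client
log = logging.getLogger("app.example")
log.setLevel(logging.DEBUG)
log.propagate = False  # keep the example's output self-contained
log.addHandler(ErrorTrackerHandler(reported))

log.info("cache warmed")       # below threshold, not reported
log.warning("slow query")      # reported
log.error("payment failed")    # reported
```

Keeping the threshold on the handler rather than the logger means other handlers (console, file) can still receive lower-level records.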
We use Rollbar for exception tracking. It’s fantastic. I've used other things, but Rollbar is just really, really fast. Their speed at development is amazing. The features, you can tell it’s developers building it.
Monitoring day-to-day operations of multiple high-performance computing assets distributed across several networks. Monitoring vendor provided data and setting up alerts when things do not show up on time.
We are in love with Rollbar. Its deep integration into our Slack channel keeps us updated at all times. We are able to push bug fixes in less than an hour. The pricing is very suitable for us as a startup.
Datadog was used as a monitoring agent and for its included StatsD daemon. This way we are able to have automated system stats and include whatever other metrics we want to track.
Datadog is used because it has a great free tier and it provides us with great insights and integrations into our infrastructure and tools.