What is Datadog?
What is New Relic?
We're a real-time financial services messaging company, so being able to monitor our servers and applications in real-time is important to us. We also like a good deal, so $15/server seemed a bargain.
What were we looking for?
We wanted to monitor our MS infrastructure (servers, SQL) and apps (C#) to understand performance issues and be able to rectify them. We also wanted to be able to do long-term trending. And we wanted to go from nothing to live in a short time.
Installing the Datadog agent on the servers was a breeze and enabling the integrations for SQL and Windows trivial.
Using the StatsD-based API was also very easy - no worrying about JSON or UDP calls. The ability to add tags to all metrics is also a key benefit. We run multiple (100+) instances of a single application, and being able to distinguish events from each one via tagging, or to see aggregates, is extremely useful.
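To illustrate what per-instance tagging looks like on the wire, here is a minimal sketch in Python (not the C# client we actually used). It builds a DogStatsD-style datagram by hand; the metric name, tag names, and the agent address `127.0.0.1:8125` are assumptions for illustration, and a real client library hides all of this.

```python
import socket


def format_metric(name, value, metric_type, tags=None):
    """Build a DogStatsD-style datagram: name:value|type|#tag1,tag2."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send to a locally running agent (assumed address)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


# Hypothetical metric and tag names: one counter increment from instance 42.
msg = format_metric("messages.processed", 1, "c", tags=["instance:42", "env:prod"])
print(msg)  # messages.processed:1|c|#instance:42,env:prod
# send_metric(msg)  # uncomment on a host where an agent is listening
```

Because every instance emits the same metric name and differs only in its `instance` tag, the same series can be viewed per instance or aggregated across all of them.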
In all it took 2 days R&D to instrument our key applications sufficiently for production deployment. Deploying the agent to our production servers took 30 mins, giving our Ops team complete visibility for the 1st time.
What have we learned
Since we've been live Datadog has given us numerous insights into the way our system behaves, from uneven server loadings and sporadic memory usage to performance tuning a key application that resulted in a 50% increase in throughput. Knowing what's taking the time has been a boon.
The other nice surprise has been the evolving nature of Datadog. It seems like every couple of weeks there's a new feature on the site.
- I like the transparent pricing. Services that won't show me the price without having to talk to a sales person are really annoying.
- Support has been good. We've contacted them several times with questions and always had a quick response (time zone considered...we're in London) and a helpful answer.
So What's bad?
Probably the weakest aspect at the moment is the long-term trending of data. Whilst you can wind the time bar back to see what happened last week, you can't ask questions like "show me the peak period each day for the last x months". The "get data" API is also fairly weak. Neither is a concern at the moment, and I'm sure they're on the to-do list.
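To make the "peak period each day" question concrete, here is a rough sketch of the calculation in Python, assuming the raw points have somehow already been exported as `(unix_seconds, value)` pairs. The function name and sample data are hypothetical, and this is not something Datadog provides.

```python
from datetime import datetime, timezone


def daily_peaks(points):
    """Return {date: (peak_time, peak_value)} for (unix_seconds, value) pairs."""
    peaks = {}
    for ts, value in points:
        moment = datetime.fromtimestamp(ts, tz=timezone.utc)
        day = moment.date()
        # Keep the highest value seen so far for each UTC calendar day.
        if day not in peaks or value > peaks[day][1]:
            peaks[day] = (moment, value)
    return peaks


# Hypothetical sample: two points on each of two consecutive days.
sample = [(0, 10), (3600, 25), (86400, 40), (90000, 5)]
for day, (when, value) in sorted(daily_peaks(sample).items()):
    print(day, when.strftime("%H:%M"), value)
```

Days are bucketed in UTC here; a real report would probably want the local trading day instead.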
We love Scout at Rollbar. Here's how we use it.
Zero configuration monitoring for new hosts
We have added Scout to our Ansible configuration for new host setup. So, when we provision a new machine, we get basic monitoring without any extra configuration. Once the host is up and running, we add it to the appropriate role in Scout and all of our monitoring plugins are magically deployed and enabled on the new host.
Monitoring HTTP response codes
One of the best things about Scout is how beautiful and therefore usable their graphs are. We have a Scout dashboard which shows all of our response codes which allows us to quickly see connections between different hosts when problems occur.
Scout's plugin model makes it really easy to extend. We have implemented our own log monitoring plugin which reports metrics like the 90th percentile of slow queries on our site. These types of plugins allow us to see issues at a glance during deploys, maintenance and load spikes.
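As a rough idea of the calculation behind a "90th percentile of slow queries" metric, here is a minimal Python sketch using the nearest-rank method. It is not the plugin itself and uses no Scout API; the function name and sample durations are made up for illustration.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of the data at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical query durations in milliseconds, e.g. parsed from a slow-query log.
durations_ms = [120, 95, 310, 2200, 180, 150, 90, 4100, 130, 105]
p90 = percentile(durations_ms, 90)
print(p90)  # 2200
```

A percentile is more useful than an average here because a handful of very slow queries shows up clearly instead of being diluted by the fast majority.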
Slowly taking over Nagios
Nagios is amazing, but let's be real... Anyone who has used it knows how painful it is to set up, administer and extend. We are in the process of cutting over from Nagios to Scout to handle more of our infrastructure monitoring and soon, alerting.
I've been a systems administrator most of my career. Everywhere I went, I'd have to rebuild the same monitoring + graphing system. And then make sure that every machine wrote to that system and every application handed up the proper metrics through whatever mechanism seemed good at the time.
Then, as CTO of SimpleReach, single-handedly managing over 200 servers in addition to everything else, I found Datadog. We were already using statsd to instrument our applications, now it was just a matter of getting that data to Datadog. We use Chef, so I installed the Datadog agent on every machine in about 10 minutes and we were up and running.
The best part was that we had a deploy problem the next day with one of our main applications and troubleshooting took minutes instead of hours (and Datadog immediately paid for itself). Now no new features go out without instrumentation and no machine gets created without being monitored.
Datadog just scales with us. Great service and I highly recommend it to anyone not looking to reinvent the wheel with monitoring and instrumentation.
I'm a freelance developer with a handful of servers that needed insightful monitoring and alerts. I searched high and low across both hosted and self-hosted solutions... paid and open source. While many are quite capable, the self-hosted solutions were clunky and overkill. The cost structures of the few self-hosted, for-pay solutions were completely outside a freelancer's budget. ScoutApp is the first that had an easy-to-use setup, amazing plugins for specific app monitoring, and a price that was actually affordable. Setup couldn't be easier. Plugins are handled amazingly, with a single click that prompts the agent to install them remotely. The interface is minimal and easy to read. Triggers are so well done and easy to set up, with clear human language detailing the alert criteria. Real-time graphing is just icing on the cake.
ScoutApp is great not just for small infrastructures but for enterprise-level ones as well. Added features such as roles, multi-user accounts, environments and even an API make growing with it a no-brainer.
Very well done and highly recommended.
I used to have NewRelic on https://doorbell.io for my monitoring. It worked pretty well for the basic things, and the basic plan is free.
However, as https://doorbell.io's stack got increasingly complicated, the plugins of NewRelic didn't work as well as I needed, in order to reliably monitor all aspects of the platform.
I decided to try out Scout as an alternative, since even though it doesn't have a free plan, the basic plan is only $8/month (compared to $149 for NewRelic).
I found the interface to be really good, and they have great documentation. I found plugins for every single part of my stack, and they all worked very easily "out of the box". And best of all, added practically no overhead to the server!
So overall, I'd say it's a service that's well worth paying for. It's a steal at $8/month!
We migrated our infrastructure monitoring to Scout about six months ago when our previous monitoring solution became unreliable and cumbersome to maintain. We were pleasantly surprised at the ease of implementation and the library of plugins already available.
The fine-grained polling frequency and long-term metric logging helped us maintain the high level of uptime our application requires. Moreover, due to the nature of the Scout protocol, changes to our specific application monitoring can be configured at a high level in their interface with a few clicks.
The few times we have been in communication with their support team to sort out our questions or clarify details, we have been thoroughly impressed by their response time and personalized attention to our needs.
We highly recommend using Scout.
We are a very small nonprofit with a very simple server setup. Our two developers do not have any special training as sysadmins. But it was very easy to get set up with Scout and start some simple monitoring of our servers. Most of what we do is check that some key processes are running and that our URLs are up. It was easy to do all of that with Scout. That said, we're interested in learning more about Scout's capabilities and doing more sophisticated server monitoring down the line.
Datadog makes running a service with 800,000 unique users a month possible as a single developer/maintainer. I bought a separate monitor just to keep my datadog dashboards always visible and rely on triggers to keep watch over 20+ servers.
We use datadog to monitor our servers and some application metrics. Easy to get started and scale to many servers. Datadog support engineers are always quick to respond to bugs and other challenges.
We are a very small nonprofit with a very simple server setup. Our two developers do not have any special training as sysadmins. That said, it was very simple to get set up with Scout and start some simple monitoring of our servers. Most of what we do is check that some key processes are running and that our URLs are up. But we're interested in learning more about Scout's capabilities and doing more sophisticated server monitoring down the line.
Free Heroku add-on. Not particularly useful for us. Rails profilers tend to do a better job at the app level. And I can never really figure out what’s going on with Heroku by looking at New Relic. I don’t know if we’re just not using New Relic correctly or if it really does just suck for our use case. But I guess some insight is better than none.
How do you know what parts of the workflow need improvement? Measure it. With New Relic in place, we have graphs of our API performance and can directly see if a server or zone is causing trouble, and the impact of our changes. There’s no comparison between a real-time performance graph and “Strange, the site seems slow, I should tail the logs”.
We just started looking into Datadog, but from what we see, it's like New Relic meets Loggly. It's really easy to plug in different services (like the one on this list) and get detailed analysis of what is happening on your servers and services. It makes it possible to track down sparse, difficult-to-understand problems.
We monitor and troubleshoot our app's performance using New Relic, which gives us a great view into each type of request that hits our servers. It also gives us a nice weekly summary of error rates and response times so that we know how well we've done in the past week.
Monitoring day-to-day operations of multiple high-performance computing assets distributed across several networks. Monitoring vendor provided data and setting up alerts when things do not show up on time.
Datadog was used as a monitoring agent and for the statsd daemon it includes. This way we are able to have automated system stats and include whatever other metrics we want to track.
I'm trying to wring more instrumentation out of New Relic as it pertains to Rack, but for the time being, New Relic is monitoring/alerting uptime and some basic performance metrics.
Just like we care about errors, we care about metrics - especially around performance. You'd be crazy not to use it - and not surprisingly, it's a one-click add-on in Heroku.
Datadog is used because it has a great free tier and it provides us with great insights and integrations into our infrastructure and tools.