Datadog vs Skylight: What are the differences?
What is Datadog? Unify logs, metrics, and traces from across your distributed infrastructure. Datadog is the leading service for cloud-scale monitoring. It is used by IT, operations, and development teams who build and operate applications that run on dynamic or hybrid cloud infrastructure. Start monitoring in minutes with Datadog!.
What is Skylight? The smart profiler for your Rails apps. Skylight is a smart profiler for your Rails apps that visualizes request performance across all of your servers.
Datadog and Skylight belong to "Performance Monitoring" category of the tech stack.
Some of the features offered by Datadog are:
- 14-day Free Trial for an unlimited number of hosts
- 200+ turn-key integrations for data aggregation
- Clean graphs of StatsD and other integrations
On the other hand, Skylight provides the following key features:
- Skylight looks at how your code is behaving in production, alerting you to improvements you can make before they become showstoppers.
- Skylight tells you exactly how your app is spending its time where it matters most—in your production environment
- Pricing starts at $20 for the first million requests, with automatic discounts for high-volume customers
"Monitoring for many apps (databases, web servers, etc)" is the top reason why over 118 developers like Datadog, while over 10 developers mention "Beautiful UI" as the leading cause for choosing Skylight.
Spotify, 9GAG, and Asana are some of the popular companies that use Datadog, whereas Skylight is used by GrowthHackers, StackShare, and Comptoir Français du Disque Indépendant. Datadog has a broader approval, being mentioned in 532 company stacks & 213 developers stacks; compared to Skylight, which is listed in 39 company stacks and 5 developer stacks.
What is Datadog?
What is Skylight?
Need advice about which tool to choose?Ask the StackShare community!
Sign up to add, upvote and see more prosMake informed product decisions
Sign up to get full access to all the companiesMake informed product decisions
What tools integrate with Skylight?
Sign up to get full access to all the tool integrationsMake informed product decisions
We're a real-time financial services messaging company, so being able to monitor our servers and applications in real-time is important to us. We also like a good deal, so $15/server seemed a bargain.
What were we looking for?
We wanted to monitor our MS infrastructure (servers, SQL) and apps (C#) to understand performance issues and be able to rectify. We also want to be able to do long-term trending. And we wanted to go from nothing to live in a short time.
Installing the Datadog agent on the servers was a breeze and enabling the integrations for SQL and Windows trivial.
Using the StatsD based API was also very easy - no worrying about JSON or UDP calls. The ability to add tags to all metrics is also a key benefit. We run multiple (100+) instances of a single application and being able to distinguish events from each one via tagging, or to see aggregates, is extremely useful.
In all it took 2 days R&D to instrument our key applications sufficiently for production deployment. Deploying the agent to our production servers took 30 mins, giving our Ops team complete visibility for the 1st time.
What have we learned
Since we've been live Datadog has given us numerous insights into the way our system behaves, from uneven server loadings and sporadic memory usage to performance tuning a key application that resulted in a 50% increase in throughput. Knowing what's taking the time has been a boon.
The other nice surprise has been the evolving nature of Datadog. It seems like every couple of weeks there's a new feature on the site.
- I like the transparent pricing. Services that won't show me the price without having to talk to a sales person are really annoying.
- Support has been good. We've contacted them several times with questions and always had a quick response (time zone considered...we're in London) and a helpful answer.
So What's bad?
Probably the weakest aspect at the moment is the long term trending of data. Whilst you can wind the time bar back to see what happened last week you can't ask questions like "show me the peak period each day for the last x months". The "get data" API is also fairly weak. Neither are concerns at the moment, and I'm sure they're on the to-do list.
I've been a systems administrator most of my career. Everywhere I went, I'd have to rebuild the same monitoring + graphing system. And then make sure that every machine wrote to that system and every application handed up the proper metrics through whatever mechanism seemed good at the time.
Then, as CTO of SimpleReach, single-handedly managing over 200 servers in addition to everything else, I found Datadog. We were already using statsd to instrument our applications, now it was just a matter of getting that data to Datadog. We use Chef, so I installed the Datadog agent on every machine in about 10 minutes and we were up and running.
The best part was that we had a deploy problem the next day with one of our main applications and troubleshooting took minutes instead of hours (and Datadog immediately paid for itself). Now no new features go out without instrumentation and no machine gets created without being monitored.
Datadog just scales with us. Great service and I highly recommend it to anyone not looking to reinvent the wheel with monitoring and instrumentation.
If you follow the registration flow you end up with running analytics virtually in a minute. Awesome first experience.
I don't have my application in production so I needed to enable skylight in development, but Skylight navigated me nicely to the exact paragraph of a documentation, which helped.
Datadog makes running a service with 800,000 unique users a month possible as a single developer/maintainer. I bought a separate monitor just to keep my datadog dashboards always visible and rely on triggers to keep watch over 20+ servers.
We use datadog to monitor our servers and some application metrics. Easy to get started and scale to many servers. Datadog support engineers are always quick to respond to bugs and other challenges.
When we were facing performance issues with the new StackShare app. We originally thought it was a server issue. So we did quite a bit of research to see how many dynos we should be using for the sort of application we have and traffic profile. We couldn’t find anything useful online so I ended up asking my buddy Alain over at BlockScore. After a quick convo with him, I knew we should be totally fine with just 2 dynos.
We also tested the theory by increasing the number of dynos and running the load tests. They had little to no effect on error rate, so this also confirmed that it wasn’t a server issue.
So that meant it was an application issue. New Relic wasn’t any help. I spoke with another friend who suggested we use a profiler. We totally should have been using one all along. We added mini-profiler, which was great for identifying slow queries and overall page load times. We also had the Rails Chrome extension so we could see how long view rendering was taking. So we cleaned up the slowest queries.
We tried to use mini-profiler in production on the new StackShare app and for some reason, we couldn’t get it to work. We were in a time crunch so I asked Alain what they used and he said that they use Skylight in production. Funny enough, I remembered the name Skylight because we listed it on the site a while back. So we did that, and at first we couldn’t really see how it was useful. Then we realized what we were seeing were a ton of repeat queries on some of the pages we load tested.
Skylight is cool because it sort of gives you the full MVC profile. We were able to pinpoint specific db queries that being repeated. So we cleaned those up pretty quickly. But then we noticed the views were taking up all the load time, so we start implementing caching more aggressively. After we cleaned up the db queries and added more caching, our pages went from this: to this:
Skylight ended up being super useful. We use it in production now.
We just started looking into Datadog, but from what we see, it's like New Relic meets Loggly. It's really easy to plugin different services (like the one on this list) and get detailed analysis of what is happening on your servers and services. It makes tracking down sparse and difficult to understand problems possible.
Monitoring day-to-day operations of multiple high-performance computing assets distributed across several networks. Monitoring vendor provided data and setting up alerts when things do not show up on time.
Datadog was used as an agent for monitoring and as for the statsd daemon included. This way we are able to have automated system stats and include whatever other metrics we want to track.
An essential for determining production performance. It provides us with clear insights into what might be slowing down responses so we know exactly what to fix.
Datadog is used because it has a great free tier and it provides us with great insights and integrations into our infrastructure and tools.