Datadog · Stacks 9.5K · Followers 8.1K · Votes 861

Prometheus · Stacks 4.3K · Followers 3.8K · Votes 239

Datadog vs Prometheus: What are the differences?

Introduction

In this article, we will explore the key differences between Datadog and Prometheus, two popular monitoring and observability tools.

Data Collection: Datadog collects and visualizes metrics, traces, and logs from various sources, including custom applications, cloud providers, and infrastructure. It supports a wide range of integrations and provides an agent-based or agentless data collection approach. Prometheus, on the other hand, focuses on monitoring infrastructure and application metrics. It uses a pull-based model, where a Prometheus server scrapes metrics from configured targets at regular intervals.
Scalability and Performance: Datadog is a cloud-based platform that offers scalability and ease of use. It can handle large-scale deployments and scales its resources automatically. Prometheus is typically self-hosted, so the team running it is responsible for managing and scaling it. A single Prometheus server suits smaller environments; larger deployments may need additional pieces such as federation, Alertmanager, and a long-term storage backend.
Alerting and Notification: Datadog provides a comprehensive alerting and notification system. It offers flexible alerting rules, integrates with notification channels such as email, SMS, and PagerDuty, and supports advanced features like anomaly detection and machine learning-based alerting. Prometheus lets users define alerting rules based on metrics; notifications are routed by the separate Alertmanager component to receivers such as email and webhooks. However, it lacks some of the advanced features available in Datadog.
Data Storage and Retention: Datadog stores metrics and logs on its own platform and retains them for a limited period, with query and visualization capabilities for historical data. Prometheus stores data on local disk with a limited retention window (15 days by default) and does not provide long-term retention out of the box. To keep metrics for longer, users need to set up additional components such as remote storage.
Query Language and Analysis: Datadog uses its own proprietary query syntax, which lets users perform analytics and aggregations on their data with built-in functions, operators, and visualizations. Prometheus uses its own query language, PromQL (Prometheus Query Language), for data retrieval and analysis. It provides powerful features such as label filtering, aggregation, and time series functions, but may have a steeper learning curve for beginners.
Ecosystem and Integrations: Datadog offers a rich ecosystem of integrations with popular services, frameworks, and platforms like AWS, Kubernetes, and Datadog's APM and Log Management solutions. It provides pre-built dashboards and out-of-the-box integrations with third-party tools. Prometheus also has a strong ecosystem with exporters available for various services and applications. However, it may require additional configuration and customization to integrate with certain systems.
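
To make the PromQL "time series functions" point more concrete, here is a rough Python sketch of the kind of calculation a function such as `rate()` performs: the per-second increase of a counter over a window. This is a simplification under stated assumptions. The name `simple_rate` and the sample data are hypothetical, and the real PromQL function also extrapolates to the window boundaries, which is omitted here.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Approximate per-second increase of a counter across the samples.

    Simplified: handles counter resets (a drop in value is treated as a
    restart from zero) but does not extrapolate to the window edges.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A counter only goes up; a decrease means the process restarted.
        increase += (curr - prev) if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    return increase / elapsed


# Hypothetical scrape results taken 15 seconds apart, with one reset.
window = [(0.0, 100.0), (15.0, 130.0), (30.0, 160.0), (45.0, 10.0), (60.0, 40.0)]
print(simple_rate(window))
```

The reset handling is the part beginners tend to miss: subtracting the first value from the last would give a negative rate for this window.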

In summary, the key differences between Datadog and Prometheus lie in their data collection approaches, scalability, alerting capabilities, data storage and retention, query languages, and ecosystem and integration options. The choice between these tools depends on specific monitoring and observability requirements, infrastructure size, and resources available for management and scaling.

Advice on Datadog and Prometheus

Susmita Meher

Senior SRE at African Bank · Jul 28, 2020 | 4 upvotes · 852K views

Needs advice on Grafana, Graphite, and Prometheus

Looking for a tool to be used mainly for dashboards. The main requirements are:

Must be able to get custom data from AS400,
Able to display automation test results,
System monitoring / Nginx API,
Able to get data from third-party DBs.

Grafana solves almost all of these; the exceptions are AS400 and the lack of a database from which to get automation test results.

Replies (1)

Sakti Behera

Technical Specialist, Software Engineering at AT&T · Oct 3, 2020 | 3 upvotes · 637.7K views

Recommends Grafana and Prometheus

You can look at Prometheus instrumentation (https://prometheus.io/docs/practices/instrumentation/): client libraries are available in various languages (https://prometheus.io/docs/instrumenting/clientlibs/). Use one to create the custom metric you need for AS400; Grafana can then query the newly instrumented metric and show it on the dashboard.
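
As a rough illustration of what such a custom metric looks like once exposed, the sketch below hand-renders one gauge in the Prometheus text exposition format using only the Python standard library. In practice a client library would do this for you. The metric name `as400_job_queue_depth`, its label, and the helper names are hypothetical.

```python
from typing import Dict


def _escape_label(value: str) -> str:
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, value: float, labels: Dict[str, str]) -> str:
    """Render one gauge sample in the Prometheus text exposition format."""
    pairs = [f'{key}="{_escape_label(val)}"' for key, val in sorted(labels.items())]
    label_str = ",".join(pairs)
    series = f"{name}{{{label_str}}}" if label_str else name
    return (
        f"# HELP {name} {help_text}\n"
        f"# TYPE {name} gauge\n"
        f"{series} {value}\n"
    )


# Hypothetical value read from an AS400 system by your own collector code.
print(render_gauge("as400_job_queue_depth", "Jobs waiting in the queue.", 12, {"system": "prod"}))
```

Serving this text over HTTP gives Prometheus something to scrape, and Grafana can then query the resulting series.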

Farzeem Diamond Jiwani

Software Engineer at IVP · Jul 21, 2020 | 8 upvotes · 1.6M views

Needs advice on AppDynamics, Datadog, and Dynatrace

Hey there! We are looking at Datadog, Dynatrace, AppDynamics, and New Relic as options for our web application monitoring.

Current Environment: .NET Core Web app hosted on Microsoft IIS

Future Environment: Web app will be hosted on Microsoft Azure

Tech Stacks: IIS, RabbitMQ, Redis, Microsoft SQL Server

Requirements: Infra Monitoring, APM, Real User Monitoring (user activity monitoring, i.e., time spent on a page, most active page, etc.), Service Tracing, Root Cause Analysis, and Centralized Log Management.

Please advise on the above. Thanks!

Medeti Vamsi Krishna

Jun 27, 2020 | 7 upvotes · 1.5M views

Needs advice on Datadog, New Relic, and Sysdig

We are looking for a centralised monitoring solution for our application deployed on Amazon EKS. We would like to monitor using metrics from Kubernetes, AWS services (NeptuneDB, AWS Elastic Load Balancing (ELB), Amazon EBS, Amazon S3, etc) and application microservice's custom metrics.

We expect to use around 80 microservices (not counting replicas). I think there will be a total of 200-250 microservices in the system, running on 10-12 slave nodes.

We tried Prometheus, but maintenance looks like a big issue. We would need to manage scaling, maintain the storage, and deal with multiple exporters and Grafana. I feel this alone needs a few dedicated people (at least 2-3) to manage. I am not sure if I am thinking in the right direction. Please confirm.

You mentioned that Datadog and Sysdig charge per host. Do they charge per slave node?

Replies (3)

Jens Günther

CTO · Jun 30, 2020 | 10 upvotes · 427.6K views

Recommends Datadog

I can't say anything about Sysdig. I clearly prefer Datadog because:

they provide plenty of easy to "switch-on" plugins for various technologies (incl. most of AWS)
easy-to-code (Python) agent plugins / API for your own metrics
brilliant dashboarding / alarms with many customization options
pricing is OK; there are cheaper options for specific use cases, but if you want superior dashboarding / alarms I haven't seen a good competitor (despite your own Prometheus / Grafana / Kibana dog food)
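
One common route for sending your own metrics to Datadog is DogStatsD, a StatsD-style text protocol that the Datadog Agent accepts over UDP. The sketch below builds a gauge datagram by hand to show the wire format. It assumes an Agent listening on the default port 8125 on localhost; the metric name, tag, and helper names are hypothetical, and this is not necessarily the plugin mechanism meant above.

```python
import socket
from typing import Dict, Optional


def dogstatsd_gauge(name: str, value: float, tags: Optional[Dict[str, str]] = None) -> bytes:
    """Build a DogStatsD gauge datagram: name:value|g|#tag:value,..."""
    payload = f"{name}:{value}|g"
    if tags:
        payload += "|#" + ",".join(f"{k}:{v}" for k, v in sorted(tags.items()))
    return payload.encode("utf-8")


def send(datagram: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Fire-and-forget UDP send; assumes a local Agent on the default port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))


packet = dogstatsd_gauge("checkout.queue_depth", 42, {"env": "staging"})
print(packet)

if __name__ == "__main__":
    send(packet)  # UDP is connectionless, so this does not wait for a reply
```

Because the send is fire-and-forget, the application is not slowed down if the Agent is unavailable; the trade-off is that dropped datagrams go unnoticed.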

IMHO NewRelic is "promising since years" ;) good ideas but bad integration between their products. Their Dashboard query language is really nice but lacks critical functions like multiple data sets or advanced calculations. Needless to say you get all of that with Datadog.

Need help setting up a monitoring / logging / alarm infrastructure? Send me a message!

Maik Schröder

CIO at Instana · Jun 30, 2020 | 8 upvotes · 427.5K views

Recommends Instana

Hi Medeti,

You are right. Building something on open source for your stack is heavy lifting. A lot of people I know start with such a setup but quickly run into frustration, as they need to dedicate their best people to building monitoring that does the job in a professional way.

As you are microservice focussed and are looking for 'low implementation and maintenance effort', you might want to have a look at INSTANA, which was built with modern tool stacks in mind. https://www.instana.com/apm-for-microservices/

We have a public sand-box available if you just want to have a look at the product once and of course also a free-trial: https://www.instana.com/getting-started-with-apm/

Let me know if you need anything on top.

Attila Fulop

Management Advisor at artkonekt · Feb 11, 2021 | 2 upvotes · 349.5K views

Recommends Datadog and New Relic · artkonekt

I have hands on production experience both with New Relic and Datadog. I personally prefer Datadog over NewRelic because of the UI, the Documentation and the overall user/developer experience.

New Relic, however, can do basically the same things as Datadog, and some features, such as alerting, have been present in New Relic for longer than in Datadog. The cool thing about New Relic is the pricing they updated last summer: you no longer pay per host but for the data you send to New Relic. This can be a huge cost saver depending on your particular setup.

https://docs.newrelic.com/docs/accounts/accounts-billing/new-relic-one-pricing-billing/new-relic-one-pricing-billing

I'd go for Datadog, but given you have lots of containers I would also make a cost calculation. If the price difference is significant and there's a budget constraint NewRelic might be the better choice.

Sunil Chaudhari

Team Lead at XYZ · Jun 15, 2020 | 2 upvotes · 584.4K views

Needs advice on Metricbeat and Prometheus

Hi, we have a situation where we are using Prometheus to get system metrics from the PCF (Pivotal Cloud Foundry) platform. We send them as time-series data to Cortex via a Prometheus server and have built a dashboard using Grafana. There is another pipeline where we need to read CPU, memory, and disk metrics from a Linux server using Metricbeat. Those will be sent to Elasticsearch, and Grafana will pull the data and show it in a dashboard.

Is it OK to use Metricbeat for Linux server or can we use Prometheus?

What is the difference in system metrics sent by Metricbeat and Prometheus node exporters?

Regards, Sunil.

Replies (2)

Matthew Rothstein

CTO at Final · Jul 16, 2020 | 5 upvotes · 358.1K views

Recommends Prometheus

If you're already using Prometheus for your system metrics, then standing up Elasticsearch just for Linux host monitoring seems excessive. The node_exporter is probably sufficient if you're looking for standard system metrics.

Another thing to consider is that Metricbeat / ELK use a push model for metrics delivery, whereas Prometheus pulls metrics from each node it is monitoring. Depending on how you manage your network security, opting for one solution over two may make things simpler.
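
The pull model described above can be sketched with the Python standard library alone: the monitored node runs a small HTTP endpoint and only answers when asked, while the monitoring server initiates every request. This matters for network security because, with pull, the server must be able to reach a port on each node, whereas with push each node must be able to reach the central store. The handler, the sample metric value, and the function names below are hypothetical.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MetricsHandler(BaseHTTPRequestHandler):
    """Plays the role of an exporter: it only answers when asked."""

    def do_GET(self) -> None:
        body = b"node_load1 0.42\n"  # hypothetical sample value
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args) -> None:  # keep the demo output quiet
        pass


def scrape(url: str) -> str:
    """Plays the role of the Prometheus server: it initiates the request."""
    # Empty ProxyHandler so a configured HTTP proxy is not used for localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


def demo() -> str:
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return scrape(f"http://127.0.0.1:{server.server_port}/metrics")
    finally:
        server.shutdown()
        server.server_close()


print(demo())
```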

talaverant

Jul 2, 2020 | 2 upvotes · 358.3K views

Recommends Instana

Hi Sunil! Unfortunately, I don't have much experience with Metricbeat, so I can't advise on the differences from Prometheus. For the Linux server, I encourage you to use the Prometheus node exporter, and for PCF I would recommend the Instana tile (https://www.instana.com/supported-technologies/pivotal-cloud-foundry/). Let me know if you have further questions! Regards, Jose

Keno Zakesy

Feb 17, 2020 | 0 upvotes · 14.9K views

Needs advice on Datadog, InfluxDB, and Prometheus

So, I am working in a big company that runs many different microservices written in Golang. I am currently searching for a technology that can give me all the metric data from the microservices. What time-series databases would you recommend? Or which databases would you recommend investigating further? I appreciate any input.

Replies (3)

Alfi Delacruz

May 1, 2020 | 4 upvotes · 14.4K views

Recommends InfluxDB

Each of these tools can help you with micro service workload and work well. I will try to go through some good, bad and ugly of each.

Datadog has an easy setup and a short time to getting something tangible out of it. The cost model is per host, so consider how that will affect your use case. Also, as a large organization, at some point you will probably want control over some or all of your telemetry data to run your own ML or AI processes. With Datadog this can be difficult, as you will need to create processes outside of its closed ecosystem to get raw metrics.

Prometheus is a great tool. It also has a fairly straightforward setup, especially with Kubernetes. If you are running your microservices in k8s, then it is going to get used one way or another; it is a first-class citizen there, with heavy utilization of the K8s API. I also like the fact that Kubernetes architecture is easy to understand and that it utilizes Grafana for the visualization engine. Prometheus at scale can be done, but it is a pain, especially with a distributed infrastructure across multiple workloads.

InfluxDB (the TICK stack in v1) is known for its scalability and flexibility as a time series database. Telegraf is the main input/data forwarder of the architecture and is completely decoupled from the database, as are the other three components of the stack. Influx has made it very easy to use just one component on its own. I have worked on stacks that used only Telegraf for ingestion into Kinesis or another data stream. I have also worked on stacks that used the Influx database but a different ETL process for analyzing the data in real time, instead of the v1 architecture's Kapacitor engine. The Influx database is a well-performing time series database that in version 2 runs within Kubernetes and uses Flux as the query language. Flux is a nice query language that is fairly easy to learn and has a lot of flexibility. As a last positive note, Telegraf is written in Go, so it would fit well with your current team.
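
For a feel of what flows between a forwarder and the database in such a stack, here is a minimal sketch that formats one data point as InfluxDB line protocol (`measurement,tags fields timestamp`). It is simplified on purpose: it handles float fields only and escapes just commas, spaces, and equals signs. The function names and the example point are hypothetical.

```python
from typing import Dict


def _escape(text: str, chars: str) -> str:
    """Backslash-escape each of the given characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement: str, tags: Dict[str, str],
                     fields: Dict[str, float], timestamp_ns: int) -> str:
    """Format one point as InfluxDB line protocol (float fields only)."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key, value in sorted(tags.items()):
        head += f",{_escape(key, ',= ')}={_escape(value, ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={float(value)}" for key, value in sorted(fields.items())
    )
    return f"{head} {body} {timestamp_ns}"


# Hypothetical request-latency point from one microservice.
print(to_line_protocol("http_requests", {"service": "checkout"}, {"latency_ms": 12.5}, 1700000000000000000))
```

Because the format is plain text, the same line can be sent to the database directly or to another stream, which is the decoupling described above.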

The difficulty with Influx is that it is hard to get something really tangible out of it. The initial time to see something is short, but all the other work involved is a lot. You also have to understand the architecture well. Managing Influx can be cumbersome, but it can scale up better than the other two once Datadog's cost is taken into consideration. They have a lot of API hooks in their v1 enterprise edition to wire and configure it. They do offer a managed service to offload this cost until later.

My overall choice here is probably to go with some of the influx as you can rip and/or add components as needed into the flow. Eventually you will probably want to run an ML process within there (can be done within Kapacitor but of course can also use your cloud provider here too) and this gives you the flexibility to do it anywhere. I would still go through prometheus because you will most likely use it also, but it does have forwarders to Influxdb so still fits.

Lars Van Casteren

Jul 16, 2020 | 3 upvotes · 14.3K views

Recommends Grafana

We're running Prometheus/Alertmanager/Grafana across our whole company for every monitoring and metrics requirement, from the infrastructure layer all the way up to Spring Boot endpoint services. The Prometheus exporter / scraping approach works pretty well for us. It's really easy to set up and, more importantly, to maintain without much effort: all the Prometheus configs are created automatically through Terraform outputs and Ansible jobs. Combine it with Grafana and you're smiling.
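
One way such automation can work, sketched here under assumptions rather than taken from the setup above, is to turn an inventory of hosts into a target file for Prometheus file-based service discovery, which reads JSON lists of target groups. The inventory shape and function name are hypothetical; port 9100 is the node_exporter default.

```python
import json
from typing import Dict, List


def build_file_sd(hosts_by_role: Dict[str, List[str]], port: int = 9100) -> str:
    """Turn {role: [hostnames]} into Prometheus file_sd JSON target groups."""
    groups = [
        {
            "targets": [f"{host}:{port}" for host in sorted(hosts)],
            "labels": {"role": role},
        }
        for role, hosts in sorted(hosts_by_role.items())
    ]
    return json.dumps(groups, indent=2)


# Hypothetical shape of a value parsed from `terraform output -json`.
inventory = {"web": ["web-2.internal", "web-1.internal"], "db": ["db-1.internal"]}
print(build_file_sd(inventory))
```

Sorting hosts and roles keeps the generated file stable between runs, so a configuration management job only reports a change when the inventory really changed.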

Dmitry Mukhin

Engineer at Uploadcare · Apr 7, 2020 | 2 upvotes · 13.6K views

Recommends Datadog · Uploadcare

We're moving from Datadog towards Prometheus at the moment. The main driving force is TCO.

Datadog is great until it becomes too expensive.

Mat Jovanovic

Head of Cloud at Mats Cloud · Oct 30, 2019 | 3 upvotes · 778.9K views

Needs advice on Datadog, Grafana, and Prometheus

We're looking for a monitoring and logging tool. It has to support AWS (almost 100% serverless: Lambdas, SNS, SQS, API GW, CloudFront, Aurora, etc.), as well as Azure and GCP (for now mostly used as pure IaaS, with a lot of cognitive services and mostly managed DBs). Hopefully something not as expensive as Datadog or New Relic, as our SRE team could support the tool in-house. At the moment, we primarily use CloudWatch for AWS and Pandora for most on-prem.

Replies (2)

Jorge Arias

Nov 8, 2019 | 3 upvotes · 778.8K views

Recommends Datadog

I worked with Datadog for at least one year, and my position is that commercial tools like Datadog are the best option to consolidate and analyze your metrics. Obviously, if you can't pay for the tool, the best free option is the combination of Prometheus with its Alertmanager and Grafana for visualization (they are complementary, not substitutes). But I think that not using a good tool ends up being more expensive, because you get a mediocre implementation of free tools and you also pay to maintain it.

Lucas Rincon

Tech Evangelist · Jul 2, 2020 | 3 upvotes · 778.6K views

Recommends Instana

This is quite affordable and provides what you seem to be looking for. You can see an overview of the APM space here: https://www.apmexperts.com/observability/ranking-the-observability-offerings/

Decisions about Datadog and Prometheus

Leonardo Henrique da Paixão

Pleno QA Engineer at SolarMarket · Dec 8, 2020 | 15 upvotes · 393.7K views

Chose

over

The objective of this work was to develop a system to monitor the materials of a production line using IoT technology. Currently, the process of monitoring and replacing parts depends on manual work. For this, load cells, a microcontroller, an MQTT broker, Telegraf, InfluxDB, and Grafana were used. They were combined in a workflow that collects sensor data, stores it in a database, and visualizes it in the form of weight and quantity. With these solutions, we hope to contribute to the logistics area, in the replacement and control of materials.
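
The step from weight to quantity can be sketched as simple arithmetic: subtract the empty container's weight from the load-cell reading and divide by the weight of one part. This is only an illustrative assumption about how such a conversion might be done, not the method used in the project; the function name, parameters, and numbers are hypothetical.

```python
def estimate_quantity(gross_weight_g: float, tare_weight_g: float, unit_weight_g: float) -> int:
    """Estimate how many parts are in a bin from a load-cell reading."""
    if unit_weight_g <= 0:
        raise ValueError("unit weight must be positive")
    # Clamp at zero so sensor noise on an empty bin never yields a negative count.
    net_weight_g = max(gross_weight_g - tare_weight_g, 0.0)
    return round(net_weight_g / unit_weight_g)


# Hypothetical reading: 1250 g on the cell, 250 g empty bin, 20 g per part.
print(estimate_quantity(1250.0, 250.0, 20.0))
```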

Attila Fulop

Founder at Vanilo · Mar 24, 2020 | 4 upvotes · 461.3K views

Chose Datadog over New Relic

I hadn't heard much about Datadog until about a year ago. Ironically, the New Relic salesperson with whom I had a series of trainings was trash-talking Datadog a lot. That drew my attention to Datadog, and I gave it a try on another client project where we needed log handling, dashboards, and alerting.

In 2019, Datadog was already offering log management and from that perspective, it was ahead of NewRelic. Other than that, from my perspective, the two tools are offering a very-very similar set of tools. Therefore I wouldn't say there's a significant difference between the two, the decision is likely a matter of taste. The pricing is also very similar.

The reasons why we chose Datadog over NewRelic were:

The presence of a log handling feature (since then, logging has become GA at New Relic as well, as of fall 2019).
The setup was easier, even though I already had experience with New Relic, including participation in New Relic trainings.
The UI of Datadog is more compact and my experience is smoother.
The New Relic UI is very fragmented, and New Relic One only reinforces this impression for me.
The log feature of Datadog is very well designed; I find tagging logs with services very useful. The log filtering is also awesome.

The bottom line is that both tools are great, and it makes sense to explore both and make the decision based on your use case. In our case, Datadog was the clear winner due to its UI, ease of setup, and the awesome logging and alerting features.

Benoit Larroque

Principal Engineer at Sqreen · Sep 17, 2019 | 4 upvotes · 447.3K views

Chose Datadog over New Relic

I chose Datadog APM because of the much better APM insights it provides (flamegraph, percentiles by default).

The drawback of this decision is that we had to move our production monitoring to TimescaleDB + Telegraf instead of NR Insight.

New Relic is definitely easier when starting out. The agent is only a library and doesn't require a daemon.

Pros of Datadog

140 · Monitoring for many apps (databases, web servers, etc)
107 · Easy setup
87 · Powerful ui
84 · Powerful integrations
70 · Great value
54 · Great visualization
46 · Events + metrics = clarity
41 · Notifications
41 · Custom metrics
39 · Flexibility
19 · Free & paid plans
16 · Great customer support
15 · Makes my life easier
10 · Adapts automatically as i scale up
9 · Easy setup and plugins
8 · Super easy and powerful
7 · In-context collaboration
7 · AWS support
6 · Rich in features
5 · Docker support
4 · Cute logo
4 · Simple, powerful, great for infra
4 · Monitor almost everything
4 · Full visibility of applications
4 · Easy to Analyze
4 · Cost
4 · Source control and bug tracking
4 · Best than others
4 · Automation tools
3 · Best in the field
3 · Expensive
3 · Good for Startups
3 · Free setup
2 · APM

Pros of Prometheus

47 · Powerful easy to use monitoring
38 · Flexible query language
32 · Dimensional data model
27 · Alerts
23 · Active and responsive community
22 · Extensive integrations
19 · Easy to setup
12 · Beautiful Model and Query language
7 · Easy to extend
6 · Nice
3 · Written in Go
2 · Good for experimentation
1 · Easy for monitoring

Cons of Datadog

20 · Expensive
4 · No errors exception tracking
2 · External Network Goes Down You Wont Be Logging
1 · Complicated

Cons of Prometheus

12 · Just for metrics
6 · Bad UI
6 · Needs monitoring to access metrics endpoints
4 · Not easy to configure and use
3 · Supports only active agents
2 · Written in Go
2 · TLS is quite difficult to understand
2 · Requires multiple applications and tools
1 · Single point of failure

What is Datadog?

Datadog is the leading service for cloud-scale monitoring. It is used by IT, operations, and development teams who build and operate applications that run on dynamic or hybrid cloud infrastructure. Start monitoring in minutes with Datadog!

What is Prometheus?

Prometheus is a systems and service monitoring system. It collects metrics from configured targets at given intervals, evaluates rule expressions, displays the results, and can trigger alerts if some condition is observed to be true.
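
The "evaluates rule expressions ... and can trigger alerts" part of this description can be illustrated with a small state machine. Prometheus alerting rules move between inactive, pending, and firing, with an optional `for` duration that a condition must hold before the alert fires. The sketch below models only that behaviour; the class, threshold, and sample values are hypothetical and this is not Prometheus code.

```python
from typing import Callable, Optional


class AlertRule:
    """Minimal model of an alerting rule with a `for` hold duration."""

    def __init__(self, condition: Callable[[float], bool], hold_seconds: float) -> None:
        self.condition = condition
        self.hold_seconds = hold_seconds
        self.active_since: Optional[float] = None

    def evaluate(self, now: float, value: float) -> str:
        """Return the rule state after seeing `value` at time `now`."""
        if not self.condition(value):
            self.active_since = None  # condition cleared, start over
            return "inactive"
        if self.active_since is None:
            self.active_since = now  # first evaluation where the condition holds
        if now - self.active_since >= self.hold_seconds:
            return "firing"
        return "pending"


# Hypothetical rule: disk usage above 90% for at least 60 seconds.
rule = AlertRule(condition=lambda v: v > 0.9, hold_seconds=60)
samples = [(0, 0.95), (30, 0.97), (60, 0.99), (90, 0.50)]
states = [rule.evaluate(t, v) for t, v in samples]
print(states)
```

The hold duration is what keeps a brief spike from paging anyone: the alert stays pending until the condition has been true for the whole period.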

Blog Posts

Building Component Based Apps · Dec 8 2020 at 5:50PM · DigitalOcean
3 Ways to Run Kubernetes on AWS · May 21 2020 at 12:02AM · Rancher Labs
AI/ML Pipelines Using Open Data Hub and Kubeflow on Red Hat Op... · Jan 29 2020 at 2:08PM · Red Hat, Inc.
Monitoring Node.js Applications on OpenShift with Prometheus · Sep 4 2019 at 3:07AM · Red Hat, Inc.
Update: How CircleCI Processes Over 30 Million Builds Per Mont... · Jul 23 2019 at 10:44PM · CircleCI
How Sentry Receives 20 Billion Events Per Month While Preparin... · Nov 8 2017 at 5:09PM · Sentry
How Stitch Consolidates A Billion Records Per Day · Sep 28 2017 at 4:58AM · Stitch
How Uploadcare Built a Stack That Handles 350M File API Reques... · Jul 28 2017 at 7:41AM · Uploadcare

What are some alternatives to Datadog and Prometheus?

New Relic

The world’s best software and DevOps teams rely on New Relic to move faster, make better decisions and create best-in-class digital experiences. If you run software, you need to run New Relic. More than 50% of the Fortune 100 do too.

Splunk

Splunk provides the leading platform for Operational Intelligence. Customers use it to search, monitor, analyze, and visualize machine data.

Grafana

Grafana is a general purpose dashboard and graph composer. It's focused on providing rich ways to visualize time series metrics, mainly through graphs, but supports other ways to visualize data through a pluggable panel architecture. It currently has rich support for Graphite, InfluxDB, and OpenTSDB, and supports other data sources via plugins.

AppDynamics

AppDynamics develops application performance management (APM) solutions that deliver problem resolution for highly distributed applications through transaction flow monitoring and deep diagnostics.

Sentry

Sentry’s Application Monitoring platform helps developers see performance issues, fix errors faster, and optimize their code health.
