Datadog

Software Engineer at IVP · Needs advice on Dynatrace, Datadog, and AppDynamics

Hey there! We are looking at Datadog, Dynatrace, AppDynamics, and New Relic as options for our web application monitoring.

Current Environment: .NET Core Web app hosted on Microsoft IIS

Future Environment: Web app will be hosted on Microsoft Azure

Tech Stacks: IIS, RabbitMQ, Redis, Microsoft SQL Server

Requirements: Infra Monitoring, APM, Real User Monitoring (user activity monitoring, i.e., time spent on a page, most active page, etc.), Service Tracing, Root Cause Analysis, and Centralized Log Management.

Please advise on the above. Thanks!

5 upvotes·794.8K views
Needs advice on Zabbix and Centreon

My team is divided on using Centreon or Zabbix for enterprise monitoring and alert automation. Can someone let us know which one is better? There is one more tool called Datadog that we are using for cloud assets. Of course, Datadog presents us with huge bills. So we want to have a comparative study. Suggestions and advice are welcome. Thanks!

6 upvotes·484.8K views
Replies (4)
Systems Engineer at Simac · Recommends Zabbix

I work at Volvo Car Corporation as a consultant Project Manager. We have deployed Zabbix in all of our factories for factory monitoring because after thorough investigation we saw that Zabbix supports the wide variety of Operating Systems, hardware peripherals and devices a Car Manufacturer has.

No other tool had the same amount of built-in support for our production environment, and we didn't want to end up using a different tool for each area. That is the major strong point of Zabbix, and it's free, of course. Another strong point is the documentation, which is widely available: the Zabbix YouTube channel with tutorial videos, Zabbix Share (which holds free templates), the Zabbix online documentation, and the Zabbix forum all helped us out quite a bit. Deployment is quite easy since it uses templates, so almost all configuration can be done on the server side.

To conclude, we are really pleased with the tool so far; it has helped us detect several causes of issues that were a pain to solve in the past.

6 upvotes·217K views
Recommends Centreon

Centreon is part of the Nagios ecosystem, meaning there is a huge number of resources available in the community (plugins, skills, add-ons). Zabbix's monitoring paradigm is totally different from Centreon's: Centreon plugins carry some intelligence of their own when they are launched, whereas Zabbix monitoring rules are configured centrally on top of the raw data collected. Testing both will help you understand :) Users often say Centreon may be faster to set up and deploy. In the end, both are full of monitoring features.

Centreon has, out of the box, a full catalog of probes from cloud to the edge: https://www.centreon.com/en/plugins-pack-list/ As soon as you have defined your monitoring policies and templates, you can deploy them fast through the command-line API or the REST API. Centreon plays well in the ITSM, automation, and AIOps spaces, with many connectors for Prometheus, ServiceNow, GLPI, Ansible, Chef, Splunk, ...

The polling server mode is one of Centreon's differentiators. You set up remote server(s) and choose between multiple information-exchange mechanisms. It is powerful and resilient for remote, VPN, DMZ, and satellite networks.

Centreon is good value for the price for data collection (availability, performance, fault) across a wide range of technologies (physical, legacy, cloud). There is pro support and an enterprise version with dashboards and reporting. IT Central Station gathers a lot of user feedback on both Centreon and Zabbix that you can rely on: https://www.itcentralstation.com/products/centreon-reviews

4 upvotes·216.5K views
Needs advice on Sysdig, New Relic, and Datadog

We are looking for a centralised monitoring solution for our application deployed on Amazon EKS. We would like to monitor using metrics from Kubernetes, AWS services (NeptuneDB, AWS Elastic Load Balancing (ELB), Amazon EBS, Amazon S3, etc.), and our application microservices' custom metrics.

We expect to use around 80 microservices (not counting replicas). I think there will be a total of 200-250 microservices in the system, with 10-12 slave nodes.

We tried Prometheus, but maintenance looks like a big issue. We need to manage scaling, maintain the storage, and deal with multiple exporters and Grafana. I feel this alone needs a few dedicated people (at least 2-3) to manage. I'm not sure if I am thinking in the right direction. Please confirm.

You mentioned that Datadog and Sysdig charge per host. Do they charge per slave node?

7 upvotes·889.9K views
Replies (3)
Recommends Datadog

I can't say anything about Sysdig. I clearly prefer Datadog because:

  • they provide plenty of easy to "switch-on" plugins for various technologies (incl. most of AWS)
  • agent plugins and the API for your own metrics are easy to code (Python)
  • brilliant dashboarding / alarms with many customization options
  • pricing is OK; there are cheaper options for specific use cases, but if you want superior dashboarding / alarms I haven't seen a good competitor (apart from your own Prometheus / Grafana / Kibana dog food)

IMHO New Relic has been "promising for years" ;) Good ideas, but bad integration between their products. Their dashboard query language is really nice but lacks critical functions like multiple data sets or advanced calculations. Needless to say, you get all of that with Datadog.

Need help setting up a monitoring / logging / alarm infrastructure? Send me a message!

10 upvotes·2 comments·228.4K views
Medeti Vamsi Krishna · June 30th 2020 at 11:52AM

Thanks for the reply. I am working with the Datadog trial version now, and I am able to see my container/pod/VM metrics in Datadog.

I am trying to set up the JMX integration with Autodiscovery now, but I am not able to see the JVM metrics in Datadog. Can you please help with this?

Here is my deployment yaml:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: datadog
  annotations:
    ad.datadoghq.com/myapp.check_names: >-
      '["myapp"]'
    ad.datadoghq.com/myapp.init_configs: >-
      '[{"is_jmx": true, "collect_default_metrics": true}]'
    ad.datadoghq.com/tomcat.instances: >-
      '[{"host": "%%host%%","port":"5000"}]'
  labels:
    app: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: nexus.nslhub.com/sample-java-app:2.0
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 5000
              name: jmx
      imagePullSecrets:
        - name: myappsecret
      nodeSelector:
        kubernetes.io/hostname: ip-10-5-7-173.ap-south-1.compute.internal
```

Jens Günther · June 30th 2020 at 11:57AM

I would like to help, but there could be hundreds of reasons why the incoming and outgoing JMX ports are not accessible from the agent.
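
One thing that can be ruled out offline, before debugging ports, is whether the Autodiscovery annotations line up with the pod spec. As far as the Datadog Autodiscovery documentation describes it, the part between `ad.datadoghq.com/` and the suffix has to equal a container name, and the annotations belong on the pod template rather than on the Deployment itself. The manifest above uses both `myapp` and `tomcat` as identifiers while the only container is `myapp`. Below is a minimal sketch of that check in plain Python; the helper name `unmatched_identifiers` is made up for illustration, and the annotations are hard-coded from the post rather than parsed from the YAML.

```python
import re

# Matches annotation keys such as "ad.datadoghq.com/myapp.check_names"
# and captures the container identifier ("myapp").
AD_KEY = re.compile(
    r"^ad\.datadoghq\.com/([^.]+)\.(check_names|init_configs|instances)$"
)


def unmatched_identifiers(annotations, container_names):
    """Return annotation identifiers that match no container name.

    Hypothetical helper for illustration; not part of any Datadog library.
    """
    found = set()
    for key in annotations:
        match = AD_KEY.match(key)
        if match:
            found.add(match.group(1))
    return found - set(container_names)


# Annotation keys and container names copied from the deployment above.
annotations = {
    "ad.datadoghq.com/myapp.check_names": '["myapp"]',
    "ad.datadoghq.com/myapp.init_configs": '[{"is_jmx": true, "collect_default_metrics": true}]',
    "ad.datadoghq.com/tomcat.instances": '[{"host": "%%host%%","port":"5000"}]',
}
containers = ["myapp"]

# Any identifier printed here has no container of the same name.
print(unmatched_identifiers(annotations, containers))
```

This only checks naming; it says nothing about whether the JMX port is actually reachable from the agent.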

Recommends Instana

Hi Medeti,

You are right. Building something based on your stack with open source is heavy lifting. A lot of people I know start with such a setup but quickly run into frustration, as they need to dedicate their best people to building monitoring that does the job in a professional way.

As you are microservice focussed and are looking for 'low implementation and maintenance effort', you might want to have a look at INSTANA, which was built with modern tool stacks in mind. https://www.instana.com/apm-for-microservices/

We have a public sand-box available if you just want to have a look at the product once and of course also a free-trial: https://www.instana.com/getting-started-with-apm/

Let me know if you need anything on top.

8 upvotes·228.4K views
CTO and Software Architect at Medstrat · Needs advice on Datadog and AppOptics

We use AppOptics. I am curious who the current APM leaders are for small companies (50 employees) that use Python, MariaDB, RabbitMQ, and Google Cloud Storage. We run both Celery and Gunicorn services. We are considering Datadog or some other deep code profiling tool that can spot I/O, DB, or other response time/request rate issues.

3 upvotes·68.1K views
Replies (1)
Recommends Instana

If you want to get deep insights and fast issue resolution have a look at INSTANA.

There is a public sandbox where you can get first insights and a feel for the tool. If you like it, you can also run a free trial.

Instana - Getting Started with APM (instana.com)
2 upvotes·2 comments·3.6K views
Gal Cohen · February 2nd 2021 at 7:47AM

We are running Python & Celery, our stack is based on AWS ECS. We are using NewRelic. This tool is just amazing, both for API and Offline workers. It would provide any metric I was looking for, including a profiler, SLA / SLO dashboards, infrastructure metrics. It has alerting capability that is easily integrated with Pingdom / PagerDuty / Webhooks.

Greg Smethells · September 19th 2020 at 8:29PM

Thanks, I’ll take a look.

Software Engineer · Needs advice on StatsD and Datadog

I see StatsD is commonly used in conjunction with Datadog. In fact, Datadog even has their own StatsD daemon (called DogStatsD) embedded in the DataDog agent. Can someone explain to me what it is that StatsD gives you which you don't already have with Datadog's APM and distributed tracing functionality?

4 upvotes·23.8K views
Replies (1)
Core Developer at OSInet·

The Datadog StatsD agent is not really a normal StatsD client: it implements a large subset of the original (Etsy) features, plus some Datadog-specific ones (around histograms). It is used to send metrics to Datadog. Its big advantage is the developer experience: it is as familiar and easy to use as any StatsD client, which makes it trivial to replace an existing StatsD metrics client in an application with the Datadog version.
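
To make the difference concrete: a StatsD metric is just a short text datagram sent over UDP, so the application itself decides what to count or time, whereas APM traces come from instrumenting requests. Below is a minimal sketch using only the Python standard library. The metric name, the tags, and the helper names `format_metric` and `send_metric` are made up for illustration; port 8125 is the conventional StatsD port, and the `|#tag` suffix is the DogStatsD extension rather than part of the original Etsy protocol.

```python
import socket


def format_metric(name, value, metric_type, tags=None):
    """Build one StatsD datagram: <name>:<value>|<type>[|#tag1,tag2].

    metric_type is e.g. "c" (counter), "g" (gauge), "ms" (timer),
    or "h" (histogram, a DogStatsD addition).
    """
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        # Tag suffix understood by DogStatsD; plain StatsD servers may reject it.
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(payload, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; no reply is expected from the daemon."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric: count one checkout request, tagged by environment.
    datagram = format_metric("checkout.requests", 1, "c", tags=["env:dev"])
    print(datagram)
    send_metric(datagram)
```

In practice you would use a maintained client library instead of raw sockets; the point is only that these are application-defined metrics, which complements rather than duplicates tracing.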

1 upvote·1.7K views
Software Engineer · Needs advice on Quart, connexion, and Flask

I'm considering moving from Flask to Quart, does anyone have some experience with this migration?

I expect possible problems with connexion, which we use for our OpenAPI specification.

It would be good if someone could point out the downsides of moving to the Quart framework so I can double-check whether my plan is worth doing.

Other libs and tools used in the project: SQLAlchemy, alembic, PostgreSQL, Datadog

cons for now:

  • Refactoring uncertainty (not sure how big a task it is)
  • Connexion might not work with Quart (moving to another library)
  • ...
7 upvotes·3.3K views
Lead Architect at Fresha·

Coming from a Ruby background, we've been users of New Relic for quite some time. When we adopted Elixir, the New Relic integration was young and missing essential features, so we gave AppSignal a try. It worked for quite some time; we even implemented a :telemetry reporter for AppSignal. But it was difficult to correlate data across two monitoring solutions, New Relic was undergoing a UI overhaul that made it difficult to use, and AppSignal was missing the flexibility we needed. We had some fans of Datadog, so we gave it a try and it worked out perfectly. Datadog works great with Ruby, Elixir, and JavaScript, and has powerful features our engineers love to use (notebooks, dashboards, very flexible alerting). Cherry on top: thanks to the Datadog Terraform provider, everything is written as code, allowing us to collaborate on our Datadog setup.

3 upvotes·82.5K views
Cloud Architect ·

We build everything in AWS around microservices and are looking at Amazon CloudWatch, Datadog, and New Relic. Which one would work best for our situation?

3 upvotes·6K views
Replies (3)
Recommends Splunk

Via acquisitions and internal product developments over the last year+, Splunk provides really differentiated APM and monitoring for microservices and AWS. I'd recommend giving it a peek if you haven't yet! For some validation, a recent Cloud Observability vendor report by GigaOm ranked Splunk as the "top performer" in the space. Hope this helps in your search.

DevOps Observability, Analysis and Insight | Splunk (splunk.com)
3 upvotes·271 views
Principal Software Engineer at Accurate Background·

We use a combination of Java and C# microservices on AWS. We started off with CloudWatch but found it severely lacking - even for basic logging functionality; mostly because the way it sets up log groups is not very useful for distributed applications. It gets hard to find the right logs for the right instance; the interface is rather lacking, etc.

We looked into several alternatives. Our final decision fell upon:

  • Datadog, bundled with every Docker image during the CI/CD build process. Datadog's agents hook easily into our existing processes. No real code had to be changed other than the build script and the Dockerfile (https://docs.datadoghq.com/tracing/setup_overview/setup/java/?tab=containers). Datadog has been very good at providing insights on many different levels (performance, errors, infrastructure load) and can be set up to send automated alerts when unexpected behavior happens.
  • Kibana, to centralize our logging into a more easily searchable and filterable configuration.

I would also recommend considering Dynatrace. I believe it comes at a higher price, but I fondly look back to my time working with the tool in the past. Dynatrace is remarkably deep and smart; it ended up being very good at helping us find tricky issues like memory leaks, it helped us monitor performance, trace user paths throughout our apps and so much more. I understand they've evolved quite a bit since I last used them, investing heavily into AI components to improve the experience. Worth the consideration.

2 upvotes·215 views
Needs advice on New Relic and Datadog

We are migrating from New Relic to Datadog... Is there a way I can export all existing alerts in an easy way from New Relic to Datadog?

3 upvotes·160 views