Amazon EKS

Needs advice on Kubernetes, Lens, and Octant

Dear Community Members,

I hope this message finds you well.

I am reaching out to seek guidance and recommendations regarding tools that are best suited for managing Amazon EKS cluster resources. Specifically, I am exploring options that enable effective deployment and customization of resources within an EKS environment.

My objective is to provide my team with the necessary access and capabilities to deploy and customize resources within the AWS EKS cluster. I am keen to learn from the community's expertise and experiences in this area.

Could you kindly share your insights, suggestions, and experiences with tools or platforms that have proven effective for managing AWS EKS cluster resources? Any recommendations or best practices regarding access control and resource management within EKS would be greatly appreciated.

Your valuable input will not only assist in streamlining our resource management processes but will also contribute to our team's efficiency and effectiveness within the EKS environment.

Thank you in advance for your contributions and support.

7 upvotes·10.2K views
Needs advice on Datadog, New Relic, and Sysdig

We are looking for a centralised monitoring solution for our application deployed on Amazon EKS. We would like to monitor it using metrics from Kubernetes, from AWS services (NeptuneDB, AWS Elastic Load Balancing (ELB), Amazon EBS, Amazon S3, etc.), and from our microservices' custom application metrics.

We expect to run around 80 microservices (not counting replicas). I think there will be 200-250 microservices in total in the system, running on 10-12 worker nodes.

We tried Prometheus, but maintenance looks like a big issue. We would need to manage scaling, maintain the storage, and deal with multiple exporters and Grafana. I feel this alone needs a few dedicated people (at least 2-3) to manage. I am not sure whether I am thinking in the right direction. Please confirm.

You mentioned that Datadog and Sysdig charge per host. Do they charge per worker node?

7 upvotes·1.5M views
Replies (3)
Recommends
on
Datadog

I can't say anything about Sysdig. I clearly prefer Datadog because

  • they provide plenty of easy-to-"switch-on" plugins for various technologies (incl. most of AWS)
  • agent plugins and the API for your own metrics are easy to code (Python)
  • brilliant dashboarding / alarms with many customization options
  • pricing is OK; there are cheaper options for specific use cases, but if you want superior dashboarding / alarms I haven't seen a good competitor (other than rolling your own Prometheus / Grafana / Kibana stack)

IMHO New Relic has been "promising for years" ;) Good ideas, but bad integration between their products. Their dashboard query language is really nice but lacks critical functions like multiple data sets or advanced calculations. Needless to say, you get all of that with Datadog.

Need help setting up a monitoring / logging / alarm infrastructure? Send me a message!

10 upvotes·2 comments·416.1K views
Medeti Vamsi Krishna · June 30th 2020 at 11:52AM

Thanks for the reply. I am working with the Datadog trial version now, and I am able to see my container/pod/VM metrics in Datadog.

I am now trying to set up the JMX integration with Autodiscovery, but I am not able to see the JVM metrics in Datadog. Can you please help with this?

Here is my deployment yaml:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: datadog
  annotations:
    ad.datadoghq.com/myapp.check_names: >-
      '["myapp"]'
    ad.datadoghq.com/myapp.init_configs: >-
      '[{"is_jmx": true, "collect_default_metrics": true}]'
    ad.datadoghq.com/tomcat.instances: >-
      '[{"host": "%%host%%","port":"5000"}]'
  labels:
    app: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: nexus.nslhub.com/sample-java-app:2.0
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 5000
              name: jmx
      imagePullSecrets:
        - name: myappsecret
      nodeSelector:
        kubernetes.io/hostname: ip-10-5-7-173.ap-south-1.compute.internal
```

Jens Günther · June 30th 2020 at 11:57AM

Would like to help, but there could be hundreds of reasons why the incoming and outgoing jmx ports are not accessible from the agent.
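
That said, three things in the posted manifest are worth ruling out first. What follows is a sketch under assumptions, not a confirmed fix. Datadog Autodiscovery reads annotations from the pod, so they belong under spec.template.metadata rather than on the Deployment itself. The key segment after the slash has to match the container name, and the posted instances key uses "tomcat" while the container is named "myapp". Finally, with a ">-" block scalar the surrounding single quotes become part of the value, so the Agent would receive text that is not valid JSON. The variant below assumes the JVM really exposes remote JMX on port 5000 and that the Agent image in use includes JMX support (depending on the Agent version, this may mean an image tag ending in -jmx).

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: datadog
  labels:
    app: myapp
spec:
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
      annotations:
        # Pod-level annotations: this is where Autodiscovery looks.
        # "myapp" after the slash must equal the container name below.
        # Plain single-quoted scalars keep the value as bare JSON text.
        ad.datadoghq.com/myapp.check_names: '["myapp"]'
        ad.datadoghq.com/myapp.init_configs: '[{"is_jmx": true, "collect_default_metrics": true}]'
        # Assumption: the application serves remote JMX on port 5000.
        ad.datadoghq.com/myapp.instances: '[{"host": "%%host%%", "port": "5000"}]'
    spec:
      containers:
        - name: myapp
          image: nexus.nslhub.com/sample-java-app:2.0
          imagePullPolicy: Always
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 5000
              name: jmx
      # Unchanged from the posted manifest.
      imagePullSecrets:
        - name: myappsecret
      nodeSelector:
        kubernetes.io/hostname: ip-10-5-7-173.ap-south-1.compute.internal
```

If metrics still do not appear after that, the Agent's own status output and network reachability of port 5000 from the Agent pod would be the next things to check.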

Recommends
on
Instana

Hi Medeti,

You are right. Building something for your stack out of open-source tools is heavy lifting. A lot of people I know start with such a setup but quickly run into frustration, because they need to dedicate their best people to building a monitoring system that does the job in a professional way.

As you are microservice focussed and are looking for 'low implementation and maintenance effort', you might want to have a look at INSTANA, which was built with modern tool stacks in mind. https://www.instana.com/apm-for-microservices/

We have a public sandbox available if you just want to have a look at the product, and of course also a free trial: https://www.instana.com/getting-started-with-apm/

Let me know if you need anything on top.

8 upvotes·416K views
Lead Engineer at StackShare

We began our hosting journey, as many do, on Heroku, because they make it easy to deploy your application and automate some of the routine tasks associated with deployments. However, as our team grew and our product matured, our needs outgrew Heroku. I will dive into the history and reasons for this in a future blog post.

We decided to migrate our infrastructure to Kubernetes running on Amazon EKS. Although Google Kubernetes Engine has a slightly more mature Kubernetes offering and is more user-friendly, we decided to go with EKS because we were already using other AWS services (including a previous migration from Heroku Postgres to AWS RDS). We are still in the process of moving our main website workloads to EKS, but we have successfully migrated all our staging and testing PR apps to run in a staging cluster.

We developed a Slack chatops application (also running in the cluster) which automates all the common tasks of spinning up and managing a production-like environment for a pull request. This allows our engineering team to iterate quickly and safely test code in a full production-like environment. Helm plays a central role when deploying our staging apps into the cluster. We use CircleCI to build Docker images for each PR push, which are then published to Amazon Elastic Container Registry (ECR). An upgrade-operator process watches the ECR repository for new images and then uses Helm to roll out updates to the staging environments. All this happens automatically and makes it really easy for developers to get code onto servers quickly. The immutable and isolated nature of our staging environments means that we can do anything we want in an environment and quickly re-create or restore it to start over.

The next step in our journey is to migrate our production workloads to an EKS cluster and build out the CD workflows to get our containers promoted to that cluster after our QA testing is complete in our staging environments.

8 upvotes·619.4K views