At Bestow we engineer a life insurance platform with ease of use as the main focus. We want the process to be so easy that anybody can explore and purchase life insurance through a simple online portal. We knew that security, scalability, and reliability would be critical for the product, and we quickly landed on Kubernetes as our deployment platform. K8s gives us the flexibility to manage and monitor our platform relatively easily without sacrificing configurability. GKE took those values a step further, helping us offload the burden of running a control plane and letting us create clusters with no scaling limits.
It was easy to get going with core services such as DNS, certificate management, Nginx, and monitoring systems. We were able to quickly migrate our platform from simple containers running on an instance into Kubernetes deployments. In the early stages our engineers scaled deployments manually based on resource usage. GKE provides great dashboards for resource utilization, and eventually we implemented custom autoscaling based on GCP metrics and RabbitMQ metrics.
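As a rough sketch, autoscaling on an external RabbitMQ metric can look like the HorizontalPodAutoscaler below. The deployment name `queue-worker`, the metric path, and the target value are hypothetical, not our actual configuration; the exact metric name depends on how RabbitMQ metrics are exported into Cloud Monitoring (for example, via the Custom Metrics Stackdriver Adapter).

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-worker          # hypothetical worker deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          # Illustrative metric path; depends on how RabbitMQ metrics
          # are shipped into Cloud Monitoring.
          name: external.googleapis.com|prometheus|rabbitmq_queue_messages_ready
        target:
          type: AverageValue
          averageValue: "100"  # scale up when backlog per pod exceeds 100
```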
GKE has seamless integration between GCP and the Kubernetes clusters, giving us flexibility to manage workloads across our cloud infrastructure. Most of our engineers had prior Kubernetes experience, both self-hosted (K8s in K8s) and AWS flavors. That familiarity made it easy to move between cloud providers and migrate workloads without disruption. We migrated from AWS to GCP (K8s in K8s -> GKE) on a Sunday morning, and early Monday morning everyone came to work as usual, with no impact on developer productivity or policy sales.
The “datacenter as a service” capabilities provided by GKE are a major selling point for us to use Google Cloud Platform. They let us focus on delivering a platform with 100% uptime, without getting lost in the details of backups, parallelism, load balancing, security patches, kernel patches, software upgrades, and the other problems most data centers and home-grown K8s clusters must consider.
The Terraform documentation for provisioning GKE is very solid, and the docs provided by Google are just as robust. Some of the challenges we faced involved the maturity of the GKE platform alongside the existing virtual infrastructure we had already provisioned. The extra layers of complexity introduced by Kubernetes sometimes make troubleshooting infrastructure more difficult, so we rely on our monitoring and metrics platform to keep the clusters behind our insurance platform reliable.
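For readers unfamiliar with that workflow, a minimal GKE cluster in Terraform looks roughly like the sketch below. All names, the region, node counts, and the machine type are illustrative assumptions, not our actual setup.

```hcl
# Minimal sketch of a GKE cluster with an autoscaling node pool.
# Cluster name, location, and sizes are hypothetical.
resource "google_container_cluster" "platform" {
  name     = "platform-cluster"
  location = "us-central1"

  # Manage nodes in a separate pool rather than the default one.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary" {
  name     = "primary-pool"
  location = google_container_cluster.platform.location
  cluster  = google_container_cluster.platform.name

  autoscaling {
    min_node_count = 3
    max_node_count = 10
  }

  node_config {
    machine_type = "e2-standard-4"
    oauth_scopes = ["https://www.googleapis.com/auth/cloud-platform"]
  }
}
```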
Possibly the largest benefit of GKE is the integration between K8s RBAC and GCP IAM. Provisioning users on the cluster with Terraform and the Google Terraform provider gives us a streamlined, simple way to grant people access to the cluster where they need it, without limiting our ability to use RBAC in all its glory and grant additional permissions on the cluster where required. The same principle applies to service accounts. IAM service accounts, created in Terraform via pull request and provisioned automatically, create a record of which services can perform which actions both in the GCP environment and on the cluster. This leaves a very clear audit trail and helps with troubleshooting permission-related woes without over-permissioning for ease of use.
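A sketch of that pattern is below: an IAM service account and its GCP role grant live in the same Terraform alongside a plain K8s RoleBinding for the extra in-cluster permissions. Every account ID, role, group, and namespace here is a hypothetical placeholder.

```hcl
# Hypothetical workload service account; the PR that creates it
# also records exactly what it is allowed to do.
resource "google_service_account" "policy_worker" {
  account_id   = "policy-worker"
  display_name = "Policy worker service"
}

# GCP-side permission for that service account.
resource "google_project_iam_member" "policy_worker_pubsub" {
  project = var.project_id
  role    = "roles/pubsub.publisher"
  member  = "serviceAccount:${google_service_account.policy_worker.email}"
}

# Engineers get baseline cluster access through IAM...
resource "google_project_iam_member" "engineers_cluster" {
  project = var.project_id
  role    = "roles/container.developer"
  member  = "group:engineering@example.com"
}

# ...and additional in-cluster permissions through ordinary K8s RBAC.
resource "kubernetes_role_binding" "policy_worker_extra" {
  metadata {
    name      = "policy-worker-extra"
    namespace = "policies"
  }
  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = "config-reader"
  }
  subject {
    kind      = "User"
    name      = google_service_account.policy_worker.email
    api_group = "rbac.authorization.k8s.io"
  }
}
```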
We’ve also updated our deployment pipeline to include testing and promotion through our different environment clusters. Developers can commit changes without worrying about how to get them deployed into Kubernetes. At the same time, GKE plus kubectl makes it easy for all engineers to access cluster information such as logs, pods, and deployments. Our usage is always evolving as we find new ways to take advantage of Kubernetes and our container infrastructure.