How to Disable Code: The Developer’s Production Kill Switch

CloudBees
CloudBees, the enterprise software delivery company, provides the industry’s leading DevOps technology platform. CloudBees enables developers to focus on what they do best: Build stuff that matters, while providing peace of mind to management with powerful risk mitigation, compliance and governance tools.

The following is a guest post written by Carlos Schults.

Being able to disable code in production is a power that many developers aren’t aware of. And that’s a shame. The ability to switch off portions of the codebase, or even complete features, can dramatically improve the software development process by enabling practices that shorten feedback cycles and raise overall quality.

So, that’s what this post will cover: the mechanisms you can use to perform this switching off, why they’re useful and how to get started. Let’s dig in.

Why Would You Want to Disable Code?

Before we take a deep dive into feature flags, explaining what they are and how they’re implemented, you might be asking: Why would people want to switch off some parts of their codebase? What’s the benefit of doing that?

To answer these questions, we need to go back in time to take a look at how software was developed a couple of decades ago. Time for a history lesson!

The Dark Ages: Integration Hell

Historically, integration has been one of the toughest challenges for teams trying to develop software together.

Picture several teams inside an organization, working separately for several months, each one developing its own feature. While the teams were working in complete isolation, their versions of the application were evolving in different directions. Now they need to converge again into a single, non-conflicting version. This is a Herculean task.

That’s what “integration hell” means: the struggle to merge versions of the same application that have been allowed to diverge for too long.

Enter the Solution: Continuous Integration

“If it hurts, do it more often.” What this saying means is that there are problems we postpone solving because doing so is hard. What you often find with these kinds of problems is that solving them more frequently, before they accumulate, is way less painful—or even trivial.

So, how can you make integrations less painful? Integrate more often.

That’s continuous integration (CI) in a nutshell: Have your developers integrate their work into a shared repository (the mainline) at the very least once a day. Have a server trigger a build and run the automated test suite every time someone integrates their work. That way, if there are problems, they’re exposed sooner rather than later.

How to Handle Partially Completed Features

One challenge that many teams struggle with in CI is how to deal with features that aren’t complete. If developers merge their code to the mainline at least once a day, any work that takes more than one day to complete has to be split into several parts, and some of those parts will reach the mainline before the feature is finished.

How can you keep customers from accessing unfinished functionality? There are some trivial scenarios with similarly trivial solutions, but harder scenarios call for a different approach: the ability to switch off a part of the code completely.

Feature Flags to the Rescue

Defining Feature Flags

There are many names for the mechanisms that allow developers to switch a portion of their code off and on. Some call them “feature toggles” or “kill switches.” But “feature flags” is the most popular name, so that’s what we’ll use for the remainder of this post. So, what are feature flags?

Put simply, feature flags are techniques that allow teams to change the behavior of an application without modifying the code. In general, flags are used to keep users away from the changes introduced by some piece of code because, for any number of reasons, those changes aren’t ready for production yet.

Disable Code: What Are the Use Cases?

We’ll now cover some of the most common use cases for disabling code in production.

Switching Off Unfinished Features

As you’ve seen, one of the main use cases for feature flags is preventing users from accessing features that aren’t ready for use yet.

That way, programmers working on complex features that take a long time to complete can still integrate their work often and benefit from doing so.

Enabling A/B Testing

The adoption of feature flags enables the use of several valuable practices in the software development process, one of which is A/B testing.

A/B testing is a user experience research technique that consists of comparing two versions of a website or application to decide which one to keep. It entails randomly splitting users into two groups, A and B, and then delivering a different version of the application to each group. One group might receive the current production version, which we call the “control,” whereas the second group would receive the candidate for the new version, called the “treatment.”

The testers then monitor the behavior of both groups and determine which of the versions achieved better results.

Feature flags are a practical way to enable A/B testing because they allow you to quickly and conveniently change between the control and treatment versions of your application.
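As a rough illustration, here is a minimal Python sketch of how a flag could place users in the two groups. It swaps a purely random split for a hash of the user ID, so the same user always sees the same version. The names assign_variant, checkout_page and the "new-checkout" experiment are hypothetical, made up for this example.

```python
import hashlib


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically place a user in the control or treatment group."""
    # Hash the experiment name together with the user ID so that the same
    # user can land in different groups for different experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # The first byte is close to uniformly distributed, so its parity gives
    # a roughly even split between the two groups.
    return "treatment" if digest[0] % 2 == 0 else "control"


def checkout_page(user_id: str) -> str:
    """Serve the version of the page that matches the user's group."""
    if assign_variant(user_id, "new-checkout") == "treatment":
        return "new checkout page"
    return "current checkout page"
```

Because the assignment depends only on the user ID and the experiment name, no per-user state needs to be stored to keep the experience consistent between visits.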

Enabling Canary Releases

If you deliver the new version of your app to your entire user base at once, 100 percent of your users will be impacted if the release is bad in some way. What if you could gradually roll out the new version instead? You’d first deploy to a small subset of users, monitoring that group to detect issues. If something went wrong, you could roll it back. If everything looked fine, you could then gradually release the version to larger groups. That’s a canary release in a nutshell. It’s another powerful technique that feature flags might help with.
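One common way to get this gradual behavior out of a flag is a percentage rollout. The Python sketch below is only an illustration under that assumption: each user is mapped to a stable bucket between 0 and 99, and the flag is on for users whose bucket falls below the current percentage. The function names and the idea of storing a rollout_percent value per flag are hypothetical.

```python
import hashlib


def rollout_bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    # Interpret the first four bytes as an integer and reduce it to 0-99.
    return int.from_bytes(digest[:4], "big") % 100


def is_enabled(user_id: str, flag_name: str, rollout_percent: int) -> bool:
    """Return True when the user falls inside the current rollout percentage."""
    return rollout_bucket(user_id, flag_name) < rollout_percent
```

Raising rollout_percent from, say, 5 to 25 keeps every user from the first group enabled and adds new ones, and setting it back to 0 acts as the rollback.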

Customizing Features According to Users’ Preferences

It’s not uncommon to have to customize your application according to the needs of specific users, and there are several ways in which software teams can accomplish that—some more efficient, and others less so (companies that create separate branches or entire repositories for each client come to mind).

This is another area where feature flags could help, allowing teams to dynamically switch between different versions of the same functionality.
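To make that concrete, here is a small Python sketch, assuming per-client settings are stored as overrides on top of shared defaults. The flag names and client IDs are invented for the example, and a real system would likely load them from a database or configuration service rather than hard-code them.

```python
# Shared defaults that apply to every client (hypothetical flags).
DEFAULT_FLAGS = {"dark_mode": False, "export_format": "csv"}

# Hypothetical per-client overrides; hard-coded here only for illustration.
CLIENT_OVERRIDES = {
    "acme": {"export_format": "xlsx"},
    "globex": {"dark_mode": True},
}


def flags_for(client_id: str) -> dict:
    """Merge a client's overrides on top of the shared defaults."""
    # Later entries win, so overrides replace defaults with the same key.
    return {**DEFAULT_FLAGS, **CLIENT_OVERRIDES.get(client_id, {})}
```

Every client runs the same codebase, and only the merged dictionary differs, which avoids the separate-branch-per-client approach mentioned above.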

Disable Code in Production 101

How do you go about disabling code? That’s what we’re going to see now, in three increasingly sophisticated phases.

First Stage: The Most Basic Approach

We start with an approach that’s so primitive, it arguably shouldn’t be considered a feature flag at all. Consider the pseudocode below:

calculateAdditionalWorkHours(Employee employee, Date start, Date end) {
    // return calculateAdditionalWorkHoursSameOldWay(employee, start, end);
    return calculateAdditionalWorkHoursImproved(employee, start, end);
}

In the code above, we're just commenting out the old version of some method and replacing it with a new version. When we want the older version to be used, we just do the opposite. (Well, I said it was primitive.) This approach lacks one of the most fundamental properties of a feature flag—the ability to change how the application behaves without changing its code.

However, it plants the seed for more sophisticated approaches.

Second Stage: Taking the Decision Out of the Code

The previous approach didn't allow developers to select the desired version of the feature without changing the code. Fortunately, that's not so hard to do. First, we introduce a Boolean variable to determine which version we're going to use:

calculateAdditionalWorkHours(Employee employee, Date start, Date end) {

    var result = useNewCalculation
        ? calculateAdditionalWorkHoursImproved(employee, start, end)
        : calculateAdditionalWorkHoursSameOldWay(employee, start, end);

    return result;
}

Then we assign the variable's value from an external source. We could use a configuration file:

var useNewCalculation = config["useNewCalculation"];

Passing arguments to the application might be another option. What matters is that we now have the ability to modify how the app behaves from the outside, which is a great step toward "true" feature flagging.
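For readers who want something runnable, here is a minimal Python sketch of the same idea. It assumes the flags live in a JSON file and that an environment variable can override them; the environment variable stands in for "arguments passed to the application." The file name flags.json, the FLAG_ prefix and the load_flag function are all hypothetical choices for this example.

```python
import json
import os


def load_flag(name: str, config_path: str = "flags.json", default: bool = False) -> bool:
    """Read a boolean flag from an environment variable or a JSON config file."""
    # An environment variable such as FLAG_USE_NEW_CALCULATION=true takes
    # precedence over the file, so the flag can be flipped per process.
    env_value = os.environ.get(f"FLAG_{name.upper()}")
    if env_value is not None:
        return env_value.strip().lower() in ("1", "true", "yes", "on")
    try:
        with open(config_path, encoding="utf-8") as config_file:
            config = json.load(config_file)
    except FileNotFoundError:
        # No config file: fall back to the safe default.
        return default
    return bool(config.get(name, default))
```

With a flags.json file containing {"use_new_calculation": true}, calling load_flag("use_new_calculation") returns True without any change to the application code.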

Keep in mind that the code examples you see are all pseudocode. Using your favorite programming language, there's nothing stopping you from starting with this approach and taking it up a notch. You could, for instance, use classes to represent the alternative versions of a feature, and design patterns such as factories to keep if statements from spreading through the codebase.
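Here is one way the class-and-factory idea could look in Python. Treat it as a sketch: the class names are invented, and the two calculations are simple stand-ins (plain extra hours versus extra hours rounded to the nearest half hour), not the real business rules behind the pseudocode above.

```python
from abc import ABC, abstractmethod


class AdditionalHoursCalculator(ABC):
    """Common interface shared by both versions of the feature."""

    @abstractmethod
    def calculate(self, hours_worked: float, contracted_hours: float) -> float:
        ...


class SameOldWayCalculator(AdditionalHoursCalculator):
    def calculate(self, hours_worked: float, contracted_hours: float) -> float:
        # Stand-in for the existing behavior: every extra hour counts as is.
        return max(0.0, hours_worked - contracted_hours)


class ImprovedCalculator(AdditionalHoursCalculator):
    def calculate(self, hours_worked: float, contracted_hours: float) -> float:
        # Stand-in for the new behavior: round extra time to the half hour.
        extra = max(0.0, hours_worked - contracted_hours)
        return round(extra * 2) / 2


def make_calculator(use_new_calculation: bool) -> AdditionalHoursCalculator:
    """Factory: the only place in the code that looks at the flag."""
    return ImprovedCalculator() if use_new_calculation else SameOldWayCalculator()
```

Callers ask the factory for a calculator once and then call calculate on it, so the conditional lives in a single function instead of at every call site.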

Third Stage: Full-Fledged Feature Flag Management

The previous approach might be enough when your application has only a small number of flags. But as that number grows, things start to become messy.

First, you have the issue of technical debt. Manually implemented feature flags can create terribly confusing conditional flows in your codebase. That only grows worse with new flags being introduced each day. Additionally, they might make the code harder to understand and navigate, especially for more junior developers, which is an invitation for bugs.

Another problem is that as the number of flags grows, it becomes more and more common to forget to delete old, obsolete ones.
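In a homegrown setup, one possible mitigation is to record an owner and a removal date for every flag and check those dates regularly. The Python sketch below illustrates the idea; the FlagDefinition fields and the stale_flags function are assumptions made for this example, not part of any particular tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class FlagDefinition:
    name: str
    owner: str
    remove_by: date  # date after which the flag should have been deleted


def stale_flags(flags: List[FlagDefinition], today: date) -> List[str]:
    """Return the names of flags whose removal date has already passed."""
    return sorted(flag.name for flag in flags if flag.remove_by < today)
```

A check like this could run as part of the build and warn, or even fail, when the returned list isn't empty.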

The main problem with a homegrown approach is that it doesn't give you an easy way to see and manage all of your flags at once. That's why our third and final stage is a single piece of advice: Instead of rolling your own feature flag approach, adopt a third-party feature flag management system.

Feature Flags Are a CI/CD Enabler

We've covered the mechanisms developers can use to disable portions of their codebase in production without having to touch the code. This capability is powerful and enables techniques such as A/B testing and canary releases, both hallmarks of a modern, agile software development process.

The names for the techniques might vary: feature flags, feature toggles, feature flippers, and so on. The way in which the techniques are implemented can also vary, from a humble if statement to sophisticated cloud-based solutions.

But no matter what you call them, you can't overstate the benefit these mechanisms offer. They're an enabler of continuous integration, which is essential for any modern software organization that wants to stay afloat.

CloudBees
CloudBees, the enterprise software delivery company, provides the industry’s leading DevOps technology platform. CloudBees enables developers to focus on what they do best: Build stuff that matters, while providing peace of mind to management with powerful risk mitigation, compliance and governance tools.
Tools mentioned in article
Open jobs at CloudBees
Principal Site Reliability Engineer-C...
EMEA

OUR CUSTOMERS DEVELOP SOFTWARE AT THE SPEED OF IDEAS

CloudBees, the enterprise software delivery company, provides the industry’s leading DevOps technology platform. CloudBees enables developers to focus on what they do best: Build stuff that matters while providing peace of mind to management with powerful risk mitigation, compliance, and governance tools. Used by many of the Fortune 100, CloudBees is helping thousands of companies harness the power of continuous everything and gets them on the fastest path from a great idea, to great software, to amazing customer experiences, to being a business that changes lives.

Backed by Matrix Partners, Lightspeed Venture Partners, Verizon Ventures, Delta-v Capital, Golub Capital, and Unusual Ventures, CloudBees was founded in 2010 by former JBoss CTO Sacha Labourey and an elite team of continuous integration, continuous delivery, and DevOps professionals.

The CloudBees Community team is seeking a Principal Site Reliability Engineer. In this role, you will support the Community team and the Jenkins community by maintaining, updating, refining, and improving the Jenkins project infrastructure.  The Jenkins infrastructure team powers the Jenkins project.  We provide services to more than 40000 users every day and 300000 Jenkins instances worldwide. This project is special in the way that almost everything is public, sponsored, and orchestrated by the Jenkins community.

You will manage multiple Kubernetes clusters and the applications running on those clusters.  You will interact with an international community of open source contributors both inside and outside of CloudBees.  Your work will be widely visible and deeply appreciated by users around the world.

WHAT YOU'LL DO

  • Key contributor to Jenkins infrastructure
  • Lead, design, implement, and support Jenkins infrastructure initiatives
  • Meet budget objectives and improve service reliability
  • Produce necessary documentation, reports, and presentations
  • Communicate effectively with others inside and outside CloudBees

WHAT THE ROLE REQUIRES

  • Kubernetes production experience
  • Cloud experience with at least one major public provider (AWS, Azure, Google Cloud)
  • Linux system administration experience
  • Jenkins administration experience
  • Automation experience with languages like Bash, Groovy, Python, or Go
  • Excellent organization and planning skills
  • Good English language communications skills, both written and verbal
  • Good attention to detail; ability to monitor, manage, and track impending deadlines, and present data accurately to support decision making
  • Willing to mentor others

DESIRED SKILLS

  • Windows experience
  • Java ecosystems experience
  • Infrastructure as code experience with tools like Puppet, Terraform, Packer, etc.
  • Willing to advocate for Jenkins, possibly including travel to conferences and participation in contributor events
  • Blogging or other advocacy experience

WHAT YOU’LL GET

  • Highly competitive benefits and vacation package
  • Ability to work for one of the fastest growing companies with some of the most talented people in the industry
  • Team outings
  • Fun, Hardworking, and Casual Environment
  • Endless Growth Opportunities

We have a culture of movers and shakers and are leading the way for everyone else with a vision to transform the industry. We are authentic in who we are. We believe in our abilities and strengths to change the world for the better. Being inclusive and working together is at the heart of everything we do. We are naturally curious. We ask the right questions, challenge what can be done differently and come up with intelligent solutions to the problems we find. If that’s you, get ready to bee impactful and join the hive.

At CloudBees, we truly believe that the more diverse we are, the better we serve our customers. A global community like Jenkins demands a global focus from CloudBees. Organizations with greater diversity—gender, racial, ethnic, and global—are stronger partners to their customers. Whether by creating more innovative products, or better understanding our worldwide customers, or establishing a stronger cross-section of cultural leadership skills, diversity strengthens all aspects of the CloudBees organization.

In the technology industry, diversity creates a competitive advantage. CloudBees customers demand technologies from us that solve their software development, and therefore their business problems, so that they can better serve their own customers. CloudBees attributes much of its success to its worldwide workforce and commitment to global diversity, which opens our proprietary software to innovative ideas from anywhere. Along the way, we have witnessed firsthand how employees, partners, and customers with diverse perspectives and experiences contribute to creative problem solving and better solutions for our customers and their business.

Development Support Engineer
US or Canada

Our Customers Develop Software at the Speed of Ideas

CloudBees, the enterprise software delivery company, provides the industry’s leading DevOps technology platform. CloudBees enables developers to focus on what they do best: Build stuff that matters while providing peace of mind to management with powerful risk mitigation, compliance, and governance tools. Used by many of the Fortune 100, CloudBees is helping thousands of companies harness the power of continuous everything and gets them on the fastest path from a great idea, to great software, to amazing customer experiences, to being a business that changes lives.

Backed by Matrix Partners, Lightspeed Venture Partners, Verizon Ventures, Delta-v Capital, Golub Capital, and Unusual Ventures, CloudBees was founded in 2010 by former JBoss CTO Sacha Labourey and an elite team of continuous integration, continuous delivery, and DevOps professionals.

Team description

CloudBees customers rely on our Support team to help them be successful in the use of our products. Our team is uniquely positioned to help sustain the company’s growth by providing a customer support experience that surpasses expectations. These positive customer experiences help drive annual renewals and business expansion. A successful Development Support Engineer (DSE) will use their skills and experience to accurately diagnose customer issues and get them resolved in a timely way, to the customer’s satisfaction. In addition, motivated individuals who want to contribute in other ways will have opportunities to work on our collection of internal tools that automate the diagnosis of issues, making the entire team more efficient by reducing manual work.

A typical day in our Support team starts with a scrum meeting where we review open and unassigned cases and help each other with issues we’re stuck on. Working on active cases, we answer basic questions and also troubleshoot problems that range from the mundane to the fiendishly complicated (it helps if you enjoy a good challenge). We collaborate with each other throughout the day, via Slack or video calls. During down time, we build technical knowledge through training and tools development.

CloudBees has been a remote-work-first company since it was founded, and the majority of the DSE team works remotely. The existing team has a mix of backgrounds including system administrators, developers, support engineers, and devops engineers. We strive to provide everyone on the team with interesting challenges, opportunities for personal and professional growth, and a positive work/life balance.

What You'll Do 

  • Answer customer questions about product usage and best practices
  • Diagnose complex technical issues and provide solutions or workarounds
  • Communicate with customers through a ticketing system, with phone support sometimes required for complex or urgent issues
  • Collaborate frequently with members of the Support and Engineering teams
  • Contribute to documentation
  • Contribute to internal software tools to automate diagnosis of customer issues
  • Work a weekend on-call rotation every 4-8 weeks (daytime hours only)

What The Role Requires 

A successful candidate will have:

  • Basic Linux system administration knowledge
  • Good communication skills (English language fluency required)
  • The ability to work independently
  • The ability to build knowledge of new technologies easily
  • A sense of empathy with our customers

As previously mentioned, members of our team have a variety of past work experience, and each bring a different mix of skills to our team. The following are some examples of these skills, but by no means do we expect candidates to have all of them. If any of these fit with your experience, we would love to hear from you!

  • System administration knowledge, especially Linux, storage, and/or networking
  • Good working knowledge of popular DevOps tools and services such as: Jenkins, Docker, Artifactory/Nexus, Kubernetes, git & GitHub
  • Knowledge of common enterprise environments & technologies such as LDAP & databases
  • Knowledge of common web application architectures, SSL, REST API concepts, etc.
  • Understanding of Continuous Integration and Continuous Deployment concepts and practices
  • Experience with cloud computing environments
  • Programming experience, anything from shell scripting to Java development
  • Open source community contributions, especially Jenkins
  • Previous experience in customer-facing roles
  • Computer Science / IT degree or equivalent work experience
  • Certifications: Cloud computing providers, Kubernetes, etc.

What You'll Get 

  • Gain experience working with and troubleshooting a variety of tools used widely in the tech industry
  • Enhance your career by completing industry-recognized technical certifications
  • Manage projects and initiatives within the team, contributing to team goals
  • Potential future opportunities to grow into management, engineering, or other field roles
  • Play a key role in maintaining and growing company revenue over time

We have a culture of movers and shakers and are leading the way for everyone else with a vision to transform the industry. We are authentic in who we are. We believe in our abilities and strengths to change the world for the better. Being inclusive and working together is at the heart of everything we do. We are naturally curious. We ask the right questions, challenge what can be done differently and come up with intelligent solutions to the problems we find. If that’s you, get ready to bee impactful and join the hive.

At CloudBees, we truly believe that the more diverse we are, the better we serve our customers. A global community like Jenkins demands a global focus from CloudBees. Organizations with greater diversity—gender, racial, ethnic, and global—are stronger partners to their customers. Whether by creating more innovative products, or better understanding our worldwide customers, or establishing a stronger cross-section of cultural leadership skills, diversity strengthens all aspects of the CloudBees organization.

In the technology industry, diversity creates a competitive advantage. CloudBees customers demand technologies from us that solve their software development, and therefore their business problems, so that they can better serve their own customers. CloudBees attributes much of its success to its worldwide work force and commitment to global diversity, which opens our proprietary software to innovative ideas from anywhere. Along the way, we have witnessed firsthand how employees, partners, and customers with diverse perspectives and experiences contribute to creative problem solving and better solutions for our customers and their businesses.

Development Support Engineer
, Spain

Our Customers Develop Software at the Speed of Ideas

CloudBees is powering the continuous economy by offering the world’s first end-to-end continuous software delivery management system (SDM). For millions of developers and product teams driving innovation for businesses large or small, SDM builds on continuous integration (CI) and continuous delivery (CD) to enable all functions and teams within and around the software delivery organization to best work together to amplify value creation.
 
CloudBees is the continuous integration (CI), continuous delivery (CD) and application release automation (ARA) powerhouse built from the commercial success of its products and its open source leadership as the largest contributor to Jenkins and a founding member of the Continuous Delivery Foundation (CDF). With a globally distributed workforce of more than 500 employees, the company reflects the global nature of the DevOps movement. We believe in walking the talk! From startups with full-stack developers practicing NoOps to large Fortune 100 companies, CloudBees enables all software-driven organizations to intelligently deploy the right capabilities at the right time.
  
Over 3,500 of the world’s best known brands and over 50% of the Fortune 500, invest in CloudBees because of its ability to work across any cloud, in any development environment and to balance corporate governance and control with developer flexibility and freedom.
 
CloudBees is home to the world’s leading DevOps experts helping thousands of companies harness the power of “continuous everything” and putting them on the fastest path from great idea, to great software, to great business value.

 

CloudBees customers rely on our Support team to help them be successful in the use of our products. Our team is uniquely positioned to help sustain the company’s growth by providing a customer support experience that surpasses expectations. These positive customer experiences help drive annual renewals and business expansion. A successful Development Support Engineer (DSE) will use their skills and experience to accurately diagnose customer issues and get them resolved in a timely way, to the customer’s satisfaction. In addition, motivated individuals who want to contribute in other ways will have opportunities to work on our collection of internal tools that automate the diagnosis of issues, making the entire team more efficient by reducing manual work.

A typical day in our Support team starts with a scrum meeting where we review open and unassigned cases and help each other with issues we’re stuck on. Working on active cases, we answer basic questions and also troubleshoot problems that range from the mundane to the fiendishly complicated (it helps if you enjoy a good challenge). We collaborate with each other throughout the day, via Slack or video calls. During down time, we build technical knowledge through training and tools development.

CloudBees has been a remote-work-first company since it was founded, and the majority of the DSE team works remotely. The existing team has a mix of backgrounds including system administrators, developers, support engineers, and devops engineers. We strive to provide everyone on the team with interesting challenges, opportunities for personal and professional growth, and a positive work/life balance.

WHAT YOU'LL DO

  • Answer customer questions about product usage and best practices
  • Diagnose complex technical issues and provide solutions or workarounds
  • Communicate with customers through a ticketing system, with phone support sometimes required for complex or urgent issues
  • Collaborate frequently with members of the Support and Engineering teams
  • Contribute to documentation
  • Contribute to internal software tools to automate diagnosis of customer issues
  • Work a weekend on-call rotation every 4-8 weeks (daytime hours only)
  • In this role, you will be responsible for working our North America shift hours (2:00pm - 11:00pm CET)

WHO YOU ARE

A successful candidate will have:

  • Basic Linux system administration knowledge
  • Good communication skills (English language fluency required)
  • The ability to work independently
  • The ability to build knowledge of new technologies easily
  • A sense of empathy with our customers

As previously mentioned, members of our team have a variety of past work experience, and each bring a different mix of skills to our team. The following are some examples of these skills, but by no means do we expect candidates to have all of them. If any of these fit with your experience, we would love to hear from you!

  • System administration knowledge, especially Linux, storage, and/or networking
  • Good working knowledge of popular DevOps tools and services such as: Jenkins, Docker, Artifactory/Nexus, Kubernetes, git & GitHub
  • Knowledge of common enterprise environments & technologies such as LDAP & databases
  • Knowledge of common web application architectures, SSL, REST API concepts, etc.
  • Understanding of Continuous Integration and Continuous Deployment concepts and practices
  • Experience with cloud computing environments
  • Programming experience, anything from shell scripting to Java development
  • Open source community contributions, especially Jenkins
  • Previous experience in customer-facing roles
  • Computer Science / IT degree or equivalent work experience
  • Certifications: Cloud computing providers, Kubernetes, etc.

HOW YOU WILL GROW

  • Gain experience working with and troubleshooting a variety of tools used widely in the tech industry
  • Enhance your career by completing industry-recognized technical certifications
  • Manage projects and initiatives within the team, contributing to team goals
  • Potential future opportunities to grow into management, engineering, or other field roles
  • Play a key role in maintaining and growing company revenue over time

WHAT YOU'LL GET

  • Highly competitive benefits and vacation package
  • Ability to work for one of the fastest growing companies with some of the most talented people in the industry
  • Team outings
  • Fun, Hardworking, and Casual Environment
  • Endless Growth Opportunities

At CloudBees, we truly believe that the more diverse we are, the better we serve our customers.  A global community like Jenkins demands a global focus from CloudBees. Organizations with greater diversity—gender, racial, ethnic, and global—are stronger partners to their customers.  Whether by creating more innovative products, or better understanding our worldwide customers, or establishing a stronger cross-section of cultural leadership skills, diversity strengthens all aspects of the CloudBees organization.

In the technology industry, diversity creates a competitive advantage.  CloudBees customers demand technologies from us that solve their software development, and therefore their business problems, so that they can better serve their own customers.  CloudBees attributes much of its success to its worldwide work force and commitment to global diversity, which opens our proprietary software to innovative ideas from anywhere. Along the way, we have witnessed firsthand how employees, partners, and customers with diverse perspectives and experiences contribute to creative problem solving and better solutions for our customers and their businesses.

Senior DevOps Engineer
EMEA

OUR CUSTOMERS DEVELOP SOFTWARE AT THE SPEED OF IDEAS

CloudBees, the enterprise software delivery company, provides the industry’s leading DevOps technology platform. CloudBees enables developers to focus on what they do best: Build stuff that matters while providing peace of mind to management with powerful risk mitigation, compliance, and governance tools. Used by many of the Fortune 100, CloudBees is helping thousands of companies harness the power of continuous everything and gets them on the fastest path from a great idea, to great software, to amazing customer experiences, to being a business that changes lives.

Backed by Matrix Partners, Lightspeed Venture Partners, Verizon Ventures, Delta-v Capital, Golub Capital, and Unusual Ventures, CloudBees was founded in 2010 by former JBoss CTO Sacha Labourey and an elite team of continuous integration, continuous delivery, and DevOps professionals.

It’s an exciting time to be part of the CloudBees team. That’s because thousands of development and deployment teams around the world are using CloudBees products that enhance and optimize the way their teams build and deliver software using continuous delivery.

To support the delivery of CloudBees products (both internal and external), CloudBees has an Operations team that designs, deploys, secures and manages a variety of software systems and the related GCP / AWS infrastructure that underpins these software engineering objectives.

The Operations team guiding principle is simple - “Make Engineering Faster”.

Underpinning this are the challenges of security, process change, technical change that all must be met to varying degrees.

Further, we are passionate about reducing the manual work that plagues IT teams - Engineering / Ops / Support / Security - and are empowered to re-engineer processes (more easily said than done) and technology to achieve those objectives.

Location

Our preferred candidate will be located in a European time zone to provide the best working hour coverage for our Engineering and Operational workloads. Please visit our website for a list of our approved hiring locations: https://www.cloudbees.com/careers

What you’ll be doing

Team-building - inside the team and across teams - you’ll be proposing new ideas and helping implement them

Thinking - coming up with new ways of solving problems - and working with the team to prove them out and then implement them

Documenting - describing problems, how your proposed solution solves those problems, and how your implemented solutions are operated.

Optimizing - working with Engineering teams to optimize their build systems - even our monster jobs (lots of parallelism - lots of bottlenecks - lots of technical challenges)

Defining - writing / modifying Terraform to handle our infrastructure, and helping teams define their infrastructure using our modules.

Coding - you’ll be writing code in Golang - we occasionally write glue in Python / Groovy (but not very often)

Securing 

  • working with our security team to drive operational change in engineering teams - “Supply Chain Integrity” - you’ve heard about it - it’s the new hot topic
  • working with our security team to drive organizational security changes (logging, auditing, monitoring, alerting)

Observing - adding the right monitoring so that we are alerted before our customers notice, and not alerted when the system is able to heal itself

Alerting - getting alerts and working with team members to solve the initial problem, documenting the problem, and then working out how to stop it happening again

DevOps / DevSecOps - We’re not “DevOps” engineers, but we do help our teams become more proficient in doing Dev and Ops (and embracing Total Ownership). We build the guard rails to help them do it safely and align tech-stacks across the company.

What you’ll bring to the team

  • New ideas from where you’ve worked in the past - what worked well, what didn’t work, what CloudBees could do better?
  • “the knack” - an uncanny ability to uncover the root cause of problems based on limited information because “it feels like something you’ve seen before”
  • The courage to say - “I’m not sure” - and to ask your colleagues for feedback on how to complete a task in our tech-stack
  • No fear of saying - “Have you considered doing it this way?” - and of giving constructive feedback to colleagues on alternative ways of doing things

What you’ll work on

  • Corporate tools - GSuite, GitHub, Jira, Confluence, Slack
  • Operations tools - PagerDuty, Datadog
  • Language tools - Golang
  • Engineering tools - Jenkins, Vault, CodeShip, Auth0
  • Platform tools - Kubernetes, Terraform, Helm, Docker
  • IaaS - GCP, AWS, a little Azure

There are a lot more tools - you should ask during your interview!

How you’ll be part of the team

You are:

  • self-motivated and enjoy solving problems
  • excited by the opportunity to automate yourself out of recurring work
  • able to keep tickets up to date, so we don’t need too many status meetings (we’re async first due to our distributed nature)
  • ready to liaise directly with software engineers across all teams, ensuring that decisions are agreed both inside and outside Operations and that they meet our technical and non-technical objectives

You have experience in:

  • programming in various languages and domains
  • the "modern Ops stack" (e.g. monitoring, alerting, cloud-based provisioning, Docker, Kubernetes)
  • Linux systems administration
  • cloud-based operations (GCP, AWS, or Azure)
  • cluster orchestration and management tools (e.g. k8s / ECS / Terraform)
  • continuous integration / continuous delivery tools (Jenkins / CodeShip etc)
  • modern software engineering practices: code reviews, unit / acceptance testing, source control, etc.

How you’ll work

  • You work in a geographically distributed team (APAC, US, EMEA) of peers
  • You choose your own work in tandem with the team, team-leader, and manager
  • You report directly to the Operations Manager (who reports to the VP of Engineering)
  • Your working hours are flexible - while there will be some core hours required for meeting with Engineering and Operations teams - much latitude is given in getting work done. This includes determining your own start and finish times to accommodate family life.
  • Travel is limited (especially at the moment) - but once things return to normal (whatever that looks like) we’ll have an annual offsite somewhere on the planet

On-Call

We prefer to avoid out-of-working-hours on-call where possible (and are geographically distributed to achieve this). However, you are expected to be willing to respond to alerts out of hours on a best-effort basis. You don’t take your laptop anywhere unless you want to.

On-call during working hours is just known as “work” - if it blows up, you fix it or find someone who can fix it.

Engineering teams operate under a “Total Ownership” model - they own their entire stack and are empowered to fix it.  They occasionally need help out of hours, and we provide best-effort support.

As an example:

  • No one has been woken up for an outage in over 8 years
  • There hasn’t been a weekend outage in the last 6 months
  • We have had to triage security vulnerability reports (on weekends), but those have not required work beyond the initial fault analysis.
  • There are occasional false positives, which are easily silenced and then resolved at a system level during normal working hours.

We have a culture of movers and shakers and are leading the way for everyone else with a vision to transform the industry. We are authentic in who we are. We believe in our abilities and strengths to change the world for the better. Being inclusive and working together is at the heart of everything we do. We are naturally curious. We ask the right questions, challenge what can be done differently and come up with intelligent solutions to the problems we find. If that’s you, get ready to bee impactful and join the hive.

At CloudBees, we truly believe that the more diverse we are, the better we serve our customers. A global community like Jenkins demands a global focus from CloudBees. Organizations with greater diversity—gender, racial, ethnic, and global—are stronger partners to their customers. Whether by creating more innovative products, or better understanding our worldwide customers, or establishing a stronger cross-section of cultural leadership skills, diversity strengthens all aspects of the CloudBees organization.
