The Growth Stacks of 2019

Segment
Segment provides the customer data infrastructure that businesses use to put their customers first. With Segment, companies can collect, unify, and connect their first-party data to over 200 marketing, analytics, and data warehousing tools.

This post is by Calvin French-Owen of Segment


Over the past seven years, we’ve helped thousands of companies collect data from their websites and mobile apps, and federate that data to over 250 different SaaS tools. We give them a code snippet, instructions for sending data to our API, and then we allow them to turn on different integrations with ‘the flip of a switch.’

By analyzing the data of how our user base enables these tools on the Segment platform, we’ve been able to generate an in-depth view of how the SaaS market is evolving for analytics, marketing, and growth.

Today, we’d like to share these insights with you, comparing each tool along with its competitors, and highlighting who’s growing the fastest. We’ll include some light commentary, including where we see the highest potential, but for the most part, these graphs speak for themselves.

We’re also doing our small part to help drive the growth of the 2019 category winners and beyond. This year, we’re opening up the Segment platform to partners, and our Developer Center is available in beta for you to start building integrations today.

The categories

For some quick background, Segment helps businesses collect and manage their own data about who their users are and what they are doing. We then send this data to a variety of tools our customers choose to use.

Marketing teams might use this data to power MailChimp, and product teams might use the same data to power Google Analytics or Amplitude. Each category we present solves a different problem or appeals to the different job functions in a business.

We’ll cover the following categories of tools:

  • Analytics
  • Raw Data
  • Email
  • Mobile Attribution
  • Warehouses
  • CRMs
  • Live Chat
  • Performance Monitoring
  • SMS / Push Notifications
  • Session Replay

Let’s dig in.

Understanding the graphs

Before you get started reviewing our category analysis, it’s important to understand what the graphs mean.

Here are a few helpful tips.

How information is presented:

  • Each graph represents how many Segment customers enabled a destination in a given quarter—meaning the graphs represent growth, rather than total number of users per tool.
  • Interpreting lines on the graph:
    • Upward line = tool is growing exponentially (colored and highlighted)
    • Flat line = tool is growing linearly
    • Downward line = tool is growing, but growth is slowing (grey lines)
  • Each graph highlights exponentially growing tools with colored lines. Other tools in the category are shown for scale in light grey lines.
  • Any graph that starts at zero indicates the tool was added to Segment within the past 3 years, so you should discount the zeros.
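To make the reading of the lines concrete, here is a minimal sketch of the idea. The numbers are invented for illustration and are not Segment data, and the function names and the 5% tolerance are assumptions. Each series is "new accounts enabling a tool per quarter"; the running total keeps climbing in all three cases, and only the shape of the per-quarter line differs.

```python
from typing import List


def cumulative(new_per_quarter: List[int]) -> List[int]:
    """Running total of accounts, given new enablements per quarter."""
    total, out = 0, []
    for n in new_per_quarter:
        total += n
        out.append(total)
    return out


def classify_trend(new_per_quarter: List[int], tolerance: float = 0.05) -> str:
    """Label a per-quarter line as 'accelerating', 'steady', or 'slowing'.

    Compares the average of the later half of the series with the average
    of the earlier half. The tolerance is an arbitrary illustrative choice.
    """
    if len(new_per_quarter) < 2:
        raise ValueError("need at least two quarters")
    mid = len(new_per_quarter) // 2
    first = sum(new_per_quarter[:mid]) / mid
    second = sum(new_per_quarter[mid:]) / (len(new_per_quarter) - mid)
    if first == 0:
        return "accelerating" if second > 0 else "steady"
    change = (second - first) / first
    if change > tolerance:
        return "accelerating"
    if change < -tolerance:
        return "slowing"
    return "steady"


# Made-up series, one value per quarter.
upward = [10, 20, 40, 80]
flat = [50, 50, 50, 50]
downward = [80, 60, 40, 20]

for series in (upward, flat, downward):
    print(classify_trend(series), cumulative(series))
```

Note that the downward series still ends with more total accounts each quarter; it is the rate of new adoption that is falling.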

Caveats:

  • These graphs do not account for customer value. A multi-billion dollar public company and a developer’s side project will each count as a single customer. As such, this analysis skews towards self-service tools.
  • This data is sourced from internal usage behavior. This means we know the data is extremely high fidelity, but also that it generally skews towards users who know about Segment and are thinking critically about their data in the first place. We haven’t included tools that are not on our platform.

There’s a definite selection bias here, but we think it’s an incredibly interesting dataset to model pure account growth as we’ve seen it.


Analytics

The background: Analytics was one of the earliest categories we had at Segment. Of the six integrations we launched with, three fit into the category of ‘analytics.’

Today, there are three major players we’d like to highlight in the category, which doesn’t seem to be growing as actively as it once was.

Our data:

As you might expect, there is the clear dominance of Google Analytics, the big line at the top. It’s a free product, able to take in large amounts of data, and give you a pretty wide range of insights about where your web traffic is coming from.

Notably, Google Analytics is also the first integration that our users tend to enable. We see about 60% of customers enabling Google Analytics before enabling another integration.

However, it seems that there is still room in this space to expand. Amplitude is growing at an astonishing rate, and Mixpanel continues to innovate as well. While Amplitude started out as the easy-to-install alternative to the first wave of behavioral analytics tools, namely Mixpanel, both companies continue to push the envelope on product development.

Both show climbing adoption and are iterating quickly as they expand the notion of what’s possible when it comes to behavioral analytics. The lively competition means it’s a great time to be a customer in the space, as providers push each other to innovate at a much faster pace.

The bottom line: Analytics as a category seems to be slowing its growth, but that doesn’t mean there isn’t room for innovation. Self-service tools targeting all types of companies—from startups to enterprise—continue to grow quickly.

Fastest growing: Google Analytics, Amplitude


Raw Data

The background: Some raw data tools on Segment’s platform, like Webhooks and Zapier, enable even more tools we don’t support today. Others provide a means for companies to run their own custom data pipelines, like Amazon S3 and Kinesis.

Our data:

Overall, we continue to see strong growth across the board in terms of adoption of these tools.

As cloud tooling is making it easier and easier to run complex analysis on terabytes of data, we see more data pipelines consuming from Kinesis and PubSub to power custom ML pipelines.

We also see strong adoption of Amazon S3. Customers putting data in S3 report it is a cost-effective, long-term place to store their data and run analysis on it via EMR or Athena. Many of our customers even run event-based data pipelines based upon S3 events as part of a broader movement towards purely ‘serverless’ solutions.
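As a rough sketch of what an S3-event-driven step can look like, the function below pulls the bucket and object key out of an S3 event notification, in the shape a Python AWS Lambda handler receives it. The bucket name, key, and the `handler` body are hypothetical; a real pipeline would go on to read and process each object.

```python
from typing import List, Tuple
from urllib.parse import unquote_plus


def objects_from_s3_event(event: dict) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for each record in an S3 event notification."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in the notification.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    """Lambda-style entry point: list the new objects to process."""
    new_objects = objects_from_s3_event(event)
    for bucket, key in new_objects:
        print(f"new object: s3://{bucket}/{key}")
    return {"processed": len(new_objects)}


# Hypothetical notification for a single newly written object.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-event-archive"},
                "object": {"key": "raw/2019/part+1.gz"},
            }
        }
    ]
}
```

Calling `handler(sample_event, None)` locally is enough to exercise the parsing without any AWS resources.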

Contrary to projections that we’ve seen elsewhere, Google Cloud PubSub seems to be gaining market share versus its competitors, though it still lags behind Kinesis. This is a notable data point given that many outlets report that AWS owns 70% of the market.

The bottom line: Raw data tools are in. Segment’s easy collection, coupled with cheap storage and serverless workers, means that there is a ton of growth in this space. Cloud providers continue to build more useful primitives for developers to plug into.

Fastest growing: Webhooks, Kinesis, Amazon S3, Google Cloud PubSub


Email

The background: Email is a crowded space mostly because there are so many different niche opportunities in this market.

Marketing automation tools try to differentiate on any number of axes. Some target consumer businesses (Iterable); others target B2B (Marketo). Some focus mostly on advanced segmentation, others differentiate in terms of customizable drip campaigns.

Our data:

Regardless of the differences between competitors in this category, two things are clear:

  1. The category in aggregate continues to grow (10 of these tools are growing exponentially).
  2. There is upheaval where existing players are being driven out.

Looking at the top line, we have Customer.io, an email tool targeting small and mid-sized businesses. They currently lead the space in terms of Segment installs per month, and appear to have grown linearly over the past 18 months.

Moving to the next three lines in these graphs, the big existing players from 2016 seem to have growth rates that are starting to flatten out. Instead, they are being replaced by upstarts who are growing actively.

In particular, Braze and Iterable stand out as new tools on the market that are actively gaining market share, while their larger competitors seem to be slowing.

The bottom line: Email as a whole continues to grow, with different tools finding their own niches. Most email tools are branching out beyond email to also offer push notifications and text message options. At least on the three-year timescale, email is an area that seems ripe for disruption.

Fastest growing: Braze, Iterable, Customer.io


Mobile Attribution

The background: Mobile attribution tools attempt to… well… attribute the actions users take to the various channels where they came from.

In most cases, these tools are trying to help mobile app developers understand how users install their apps. Unlike on the web, you don’t get a referrer or UTM parameters passed along from the browser, so you typically need some other sort of tool to analyze where the user came from (app store, an advertisement, a web page, etc).
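For contrast, here is a minimal sketch of how simple the web case is: campaign context travels in the landing URL itself, so it can be read straight from the query string. The URLs and the `utm_params` helper are hypothetical. An app install that passes through an app store has no equivalent URL to inspect, which is the gap attribution tools fill.

```python
from urllib.parse import parse_qs, urlparse


def utm_params(url: str) -> dict:
    """Extract utm_* campaign parameters from a landing-page URL."""
    query = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key; keep the first value of each utm_* key.
    return {k: v[0] for k, v in query.items() if k.startswith("utm_")}


web_visit = "https://example.com/signup?utm_source=newsletter&utm_medium=email&ref=abc"
print(utm_params(web_visit))
```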

Our data:

In terms of Mobile Attribution tools, we see two clear leaders by number of accounts: AppsFlyer and Branch Metrics, and also solid sustained growth from Adjust.

What’s most interesting about this whole category is that it seems to be continuing to grow very fast. Branch, AppsFlyer, and Adjust all are growing exponentially by number of accounts.

It’s no wonder that this has been one of the most actively funded areas of SaaS in the past few years. Between just the top three players, over $350 million has been invested in mobile attribution. That said, it looks like we are starting to see more ‘breakaway’ winners from the mobile attribution category.

The bottom line: Mobile attribution continues to be a growing category, with more funding flowing to the winners and interest in the space still rising.

Fastest growing: AppsFlyer, Adjust, Branch


Warehouses

The background: Warehouses are databases that empower our customers to use SQL to run very custom analysis that they might not get from an out-of-the-box tool.

We didn’t launch our warehouses offering until mid-2016, which is why the graphs don’t pick up until then.
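To give a flavor of the custom SQL analysis mentioned above, here is a small sketch. An in-memory SQLite database stands in for a cloud warehouse, and the `tracks` table, its columns, and the rows are all hypothetical; the point is only that raw events plus SQL answer questions an out-of-the-box dashboard may not.

```python
import sqlite3

# In-memory SQLite stands in for a cloud warehouse; the table and column
# names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tracks (user_id TEXT, event TEXT, received_at TEXT)")
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [
        ("u1", "Order Completed", "2019-01-03"),
        ("u1", "Order Completed", "2019-02-11"),
        ("u2", "Order Completed", "2019-02-12"),
        ("u2", "Signed Up", "2019-02-01"),
    ],
)


def events_per_user(conn: sqlite3.Connection, event: str) -> list:
    """Count how many times each user triggered a given event."""
    return conn.execute(
        """
        SELECT user_id, COUNT(*) AS n
        FROM tracks
        WHERE event = ?
        GROUP BY user_id
        ORDER BY n DESC, user_id
        """,
        (event,),
    ).fetchall()


print(events_per_user(conn, "Order Completed"))
```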

Our data:

Of the cloud-hosted data warehouses, BigQuery seems to be growing most quickly.

One of the new entrants, Snowflake, is also starting to accelerate its growth, though not yet at the levels of the earlier competitors. Snowflake tends to target more enterprise buyers and therefore benefits less from the self-service motion that Google and Amazon are able to achieve.

The bottom line: This entire market is still growing very actively. As companies become more data conscious and the infrastructure primitives get cheaper and more powerful, using the raw events is more and more appealing.

Fastest growing: BigQuery, Snowflake, AWS (really, all are growing)


CRMs

The background: CRMs present an interesting beast of a category. Salesforce has been the longtime, consistent leader in CRM, and is often seen as the defining brand for the category.

Our data:

The first aspect that jumps out of our analysis is that Salesforce doesn’t dominate the CRM category in the way you might think.

We have a number of theories for this.

The leading theory is that Salesforce is a heavily customizable tool. Companies usually build custom workflows and schemas in Salesforce that reflect their use cases.

Segment has historically focused on fully “turnkey” integrations. Customers who want unique integration setups that fall outside of the defaults we set are more likely to use intermediary “glue” partners like Tray.io and Zapier to get their Segment data into Salesforce. Therefore, we see a subset of data biased towards the turnkey CRMs.

Our other take is that HubSpot primarily targets SMBs, rather than more traditional enterprises. HubSpot continues to grow actively because its market is much larger, and still growing, when judged by number of businesses, given that each business is much smaller in size.

The bottom line: CRM as a category is growing slightly, with more accelerating growth curves from the players who target SMBs.

Fastest growing: HubSpot


Live Chat

The background: Live Chat tools have been around since 2009. Olark, part of the YC S09 batch, pioneered the idea of talking directly with your users on your website. Since then, the category has become more crowded with new entrants.

But the number of entrants stalled by 2014, when it seemed like the Live Chat category was relatively set.

Our data:

The Live Chat market changed drastically in 2016 when Drift, the fastest accelerating player, entered the market.

It’s interesting to see Drift so quickly capture market share here, in what seemed to be a relatively stable market. In terms of go-to-market, Drift has done an incredible job showcasing end use cases for their customers.

The bottom line: Live Chat as a category is relatively stable in terms of growth, but Drift is proving that new entrants can still take market share. Having a clear set of ‘recipes’ for their users has helped accelerate their growth substantially.

Fastest growing: Drift



Performance Monitoring

The background: Performance monitoring and error reporting is a fairly unique category compared to the rest of the tools we support. It’s focused mostly on developer workflows rather than driving growth or retention.

As a whole, we’ve seen the bulk of the players in this market stay relatively flat in terms of growth rate, with one exception that is growing rapidly: Sentry.

Our data:

Most of the Segment performance monitoring tools take advantage of the ability to collect various forms of crash and error data from a page.

Sentry takes this a step further. Sentry also sends data back into Segment, so not only can you see your analytics events as “breadcrumbs” in your error tools (as in the others), but you can also see your customer errors and crashes alongside the full customer journey from Segment!

This superpower lets you measure how performance and responsiveness affect the overall customer experience.

The bottom line: Sentry is the clear leader in the performance monitoring space. Their top-notch product quality and clear surfacing of user events in the context of errors is likely driving their continued growth.

Fastest growing: Sentry


SMS and Push Notifications

The background: SMS and push notification tools help companies notify their users at the right time with the right message. While Apple and Google provide APIs to message users, this set of tools helps coordinate those pushes.

Our data:

Here, there are a bunch of players who are growing actively, but there are two that stand out in terms of trajectory: Braze and UserEngage.

Braze has been growing actively and is working toward becoming one of the most advanced push notification and SMS tools on the market. Braze allows you to model the full user journey and gives you extremely customizable tools to create what they call a canvas.

UserEngage (now known as User.com) is a player that you might be less familiar with. They initially started out of Poland, raised 2.7 million as part of their Series A in late 2018, and have been growing quickly since then.

The bottom line: SMS and push notifications still seem like a growing market, but one that is rapidly starting to consolidate around a handful of winners. That said, it is not yet ‘set’, as we can see from recent growth of upstarts like UserEngage.

Fastest growing: Braze, UserEngage


Session Replay

The background: Session replay gives users the ability to follow along with various browser sessions, as if they were doing in-person user testing.

It’s a helpful tool for understanding areas of your app that are hard to navigate, or what is stopping your users from completing your setup flows.

Our data:

This category has two companies, FullStory and Hotjar, showing strong market share growth that can be attributed to incredible product execution.

As can be seen from the graph, once these companies went live in the Segment catalog they were able to see significant adoption ramp very quickly. In fact, they were each able to create a sustained lead flow of customers using their product.

This is another reason we’re incredibly excited about opening up our platform. Our Developer Center allows companies like Hotjar to build an integration right away, and start building a sustainable source of customers that find value in their product when combined with Segment data.

The bottom line: FullStory and Hotjar are breakouts in the space. It’s been hard for their competitors to compete with their growth velocity.

Fastest growing: FullStory, Hotjar



Emerging Trend 1: Customer Success is ripe for disruption

Interestingly, when we look at the Customer Success category in the catalog, we don’t see nearly the explosive exponential growth of integrations that we do elsewhere.

On the whole, we see the category continuing to flatline by number of enabled accounts, without a clear winner.

This analysis puzzled us.

If anything, it seems like customer success should be a growing category. Given the general explosion of SaaS businesses, the growth of direct-to-consumer, and big IPOs of B2B startups like New Relic, Zendesk, Okta, and Twilio, we’d expect customer success to be a huge focus.

Instead, we think there’s a selection bias happening. Customers must be going somewhere for their customer success tooling, but it is possible that this is not reflected in the Segment catalog today.

Interestingly, of the 18 new integrations that have been added via the Developer Center, seven are focused on Customer Success: Vitally, Kustomer, Savio, ChurnZero, ScopeAI, Unwaffle, and UserBot.

Many of these tools are trying to leverage new advancements in ML/AI to better score users who might be likely to churn. As Segment is one of the best sources of the behavioral data required to make a good score, it seems only natural that we help power this next generation of tools.

Emerging Trend 2: Beyond users to accounts

When Mixpanel launched in 2009, web analytics weren’t exactly a new market. Website owners had been using everything from Google Analytics to Hit Counters for years to understand their users.

But Mixpanel took it a step further. Instead of measuring pageviews alone (a proxy for value), Mixpanel sought to model the new types of behavior happening online with web apps. It differentiated itself by being the first product designed to track user events rather than pageviews.

Today, nearly all tools have followed Mixpanel’s lead. Instead of tracking simple pageviews, they allow developers to send in all sorts of semantic events which are more closely tied to business-critical metrics.

In our newest cohort of tools launched via our open platform, we’re seeing a similar paradigm shift emerging. This time, it’s focused entirely around B2B businesses who sell into separate ‘accounts.’

If you run a B2B business, you typically have to make a hard choice. You can track data on the individual user level, but then you may have a hard time combining it later (how do I understand what a 2,000 person organization is doing?). On the other hand, you can track the health and actions of overall ‘accounts’, but possibly miss data from individual users (what if you have one user who loves you but another who hates you?).

With this new class of tools, we’ve specifically seen a stronger focus towards tracking individual user actions, but combining those into the idea of an account.
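The idea of keeping user-level detail while still reasoning at the account level can be sketched in a few lines. This is an illustration only: the `Event` shape, the `account_id` field, and the sample names are assumptions, not how any particular tool models it.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple


class Event(NamedTuple):
    user_id: str
    account_id: str  # hypothetical field tying a user to a B2B account
    name: str


def rollup_by_account(events: Iterable[Event]) -> Dict[str, dict]:
    """Summarize each account while keeping the per-user breakdown."""
    per_account = defaultdict(Counter)
    for e in events:
        per_account[e.account_id][e.user_id] += 1
    return {
        account_id: {
            "active_users": len(user_counts),
            "total_events": sum(user_counts.values()),
            "events_by_user": dict(user_counts),
        }
        for account_id, user_counts in per_account.items()
    }


events = [
    Event("alice", "acme", "Report Viewed"),
    Event("alice", "acme", "Report Shared"),
    Event("bob", "acme", "Report Viewed"),
    Event("carol", "globex", "Report Viewed"),
]
print(rollup_by_account(events))
```

The account summary answers "what is this organization doing?", while `events_by_user` still shows which individuals are driving (or not driving) that activity.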

We even have a few examples below of companies that are leading the charge in this space.


Segment: The Next Generation

In closing, we’d like to highlight a few of the new tools which have been integrated using the Developer Center (now in beta for accepted partners).

We see these tools as the next generation of entrants into the customer data space. Almost all of them have existing Segment customers who have leveraged home-grown connectors to get integrated.

Today, we’re excited that each of these integrations is in our official catalog. With every new tool built on our platform, our customers are finding new ways to use their data to improve their customer experiences.

Kustomer

Kustomer is a customer success tool designed to help your support team quickly respond to customer issues with an unparalleled level of service.

Unlike other help desks or account scoring tools, Kustomer tries to bring the full view of the customer all into one place. They focus on combining ‘custom objects,’ things like orders and products, with user behavior and questions all in one single place. It’s sort of like a help desk on steroids.

For months, many of our users (like Glossier and StickerMule) have used both Segment and Kustomer to help treat their customers to world-class service. But each of these companies has had to build their own connection from scratch.

Today, that changes.

“Now, users can instantly unlock a number of use cases leveraging their first-party data to proactively engage with their customers.” - Peter Johnson, VP of Product at Kustomer

Mutiny

Mutiny helps SaaS companies easily personalize their website content for each visitor.

Unlike traditional personalization solutions, Mutiny requires minimal data integration and engineering work to set up. They leverage aggregate learnings across B2B customers to help each customer launch the most impactful personalization experiences and get to results faster.

Customers who use Mutiny and Segment can access near real-time data to inform personalized content. Website conversion events such as a visitor “booking a venue” are tracked by Segment and sent to Mutiny in real-time.

To understand how powerful this is, we asked Peerspace, a Segment customer currently sending data to Mutiny:

“We use Segment to manage our data and Mutiny to give each of our website visitors a tailored experience that’s right for them. Website conversion events, such as when a visitor books a venue, is tracked by Segment and sent to Mutiny in real-time enabling us to see how each personalized experience is performing. We have seen up to 98% increase in conversion -- with the Segment Mutiny integration we can feel confident in the results.” - Arndt Voges, Head of Growth at Peerspace

Vitally

Vitally is a customer success platform that takes in all the data about your users to give you better onboarding and retention tooling. They were also one of the top integrations we saw users adding via Webhooks.

In particular, Vitally specializes in high-growth companies. Vitally understands most customer success teams are undersized and need tools that help two CSMs seem like 200.

They seamlessly organize all your customer data–Segment traits and events, conversations, subscriptions, and NPS scores–into 360° profiles. They then layer that data with automated workflows that help auto-detect and engage with customers in need.

As you acquire more trials and customers, Vitally handles that scale seamlessly with powerful segmentation and analytics that help you continuously optimize every customer stage, from self-service trials to churn.

“At Gorgias, we help Shopify stores provide the best customer service, so it’s only natural that our own support should stand out. Segment + Vitally helps us do just that. Vitally enables us to analyze our customers’ interactions with our product, received via Segment, thus helping us identify and predict their needs through built-in indicators and success metrics. By pushing that over to Segment and propagating the enriched data to our entire marketing stack, we can be proactive in the way we serve our customers while improving our automation process.” - Axelle Heems, Growth Ops at Gorgias

Split

Split is a feature flagging and experimentation tool. It helps product and engineering teams to safely release new functionality while understanding the impact on customer experience.

Split helps you:

  • create different variations and feature flags for your users
  • group users into different variations depending on arbitrary rules
  • understand the impact of one variant against another

The superpower of Split is that it lets you combine data about your users with which variant they see. If you see a new feature performing badly, you can immediately find out and disable it, all in Split’s interface!
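As a generic illustration of the pattern (this is not Split’s API; every name below is hypothetical), here is a sketch that assigns users to variants deterministically by hashing, then joins those assignments with a set of users who converted to compare variants.

```python
import hashlib
from collections import Counter
from typing import Dict, Sequence, Set


def assign_variant(
    user_id: str,
    flag: str,
    variants: Sequence[str] = ("control", "treatment"),
) -> str:
    """Deterministically bucket a user: same user and flag, same variant."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def conversion_rate_by_variant(
    exposures: Dict[str, str], converted: Set[str]
) -> Dict[str, float]:
    """Share of exposed users in each variant who went on to convert."""
    seen = Counter(exposures.values())
    wins = Counter(v for user, v in exposures.items() if user in converted)
    return {variant: wins[variant] / seen[variant] for variant in seen}


# Hand-written exposures so the arithmetic is easy to follow.
exposures = {"u1": "control", "u2": "control", "u3": "treatment", "u4": "treatment"}
converted = {"u1", "u3", "u4"}  # e.g. users who sent an "added item to cart" event
print(conversion_rate_by_variant(exposures, converted))
```

Hash-based bucketing is one common way to keep assignments stable without storing them; rule-based targeting would sit in front of it.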

We asked one of our customers, Imperfect Produce, for a little more background on how they used the Segment <> Split connection. Here’s what they said:

“Sending Segment event data to Split, such as ‘added item to cart’, will help us to innovate faster than ever before. We already send a rich set of custom events to Segment for understanding user behavior. Having Split tie those measurements to feature flags and experiments gives us a powerfully-integrated system for finding out exactly what new features make our customers the happiest.” - Patti Chan, Director of Product at Imperfect Produce

ClearBrain

ClearBrain helps growth marketers create predictions and target their users based on intent. It’s not quite magic, but it’s close.

If you’re already using Segment, setting up ClearBrain is incredibly straightforward:

  • You send your Segment data to ClearBrain
  • You tell ClearBrain which actions and traits you want to predict (paying users, net promoters, etc.)
  • ClearBrain automatically groups those users into audiences using its AI/ML framework

With ClearBrain, you can predict any action or trait with a simple self-serve interface. This allows you to configure hundreds of predictive audiences by likelihood to convert, churn, purchase, or engage (or literally any event you’ve tracked in Segment) within minutes.

“Using Segment enabled us to get started on predictive analytics tools like ClearBrain a lot faster. With historical data replay, we can send years of Segment data to power predictive insights in ClearBrain and gain automated audience insights on the most important actions that lead to upgrade or churn.” - Kyle Gesuelli, Head of Growth at Frame.io

An open Segment platform

Needless to say, we’re incredibly excited to open the Segment platform to our technology partners and to start giving early access to our Developer Center beta. In the next few months, we are committing to:

More integrations on the Segment catalog

In one month, over 18 partners used our open platform to build a destination and 20 more are coming soon. If you’re a partner who wants to power your tool with rich customer data, now is the perfect time to request access to build on Segment.

More partner discovery

As we expand our integration catalog, we want to make updates that help customers find what they’re looking for. That means helping partners get discovered when they solve customer problems, as well as making recommendations for what customers can integrate next.

More product innovation

By opening the Segment platform, we are helping new entrants, new categories, and established tools to better onboard customer data into their products. Not only does this mean our partners can get customers to value faster, it also means that they can focus on the product areas that can help them achieve ‘best in breed’ status.

If you’re interested in building on our platform and adding your product to the Segment catalog, you can start by requesting access to our Developer Center beta here.

If you’re a customer who wants your vendors to add an integration with Segment, we’ve made it easy to share the reasons why they should build an integration.

Segment
Segment provides the customer data infrastructure that businesses use to put their customers first. With Segment, companies can collect, unify, and connect their first-party data to over 200 marketing, analytics, and data warehousing tools.
Tools mentioned in article
Open jobs at Segment
DevOps Engineer
San Francisco

At Segment, we believe companies should be able to send their data wherever they want, whenever they want, with no fuss. We make this easy with a single platform that collects, stores and sends data to hundreds of business tools with the flip of a switch. Our goal is to make using data easy, and we’re looking for people to join us on the journey. We are excited about building toward a world where engineers at other companies spend their time working on their core product, rather than spending nights and weekends tweaking their customer data into various formats for 3rd party tools

Site Reliability Engineers (SRE) at Segment are members of the engineering team whose primary goal is to ensure the reliability, flexibility, and cost effectiveness of our production infrastructure. 
 
While these responsibilities are shared with the entire engineering team, SREs build and maintain the portions of our stack that ensure the entire engineering team can confidently ship software day in and day out. They complement other engineers with their deeper knowledge of the fundamental pieces of technology that underpin our production infrastructure. The SRE team are our in-house experts on building reliable, maintainable systems and they are responsible for setting the direction that determines how we go about constructing and deploying our production environment.
 

Core Responsibilities: 

  • Build software that improves the reliability, performance, and efficiency of Segment’s high-throughput, large-scale SaaS platform.
  • Collaborate with the entire engineering team on projects as the expert on reliability, performance, and efficiency.
  • Automate away the process of managing capacity, safely deploying software, and mitigating failures.
  • Troubleshoot and mitigate the thorniest problems in our most mission-critical systems. Advise the team during postmortems on effectively avoiding repeated incidents.
  • Share a 24x7 on-call rotation with the other engineers in your focus area.
  • Work with cutting edge technology, share with others through open source, and spread your expertise through contributions to our engineering blog.

Requirements: 

  • CS Degree and/or a demonstrable, solid understanding of CS fundamentals.
  • Proficient coder: strong with at least one programming language.
  • Solid grasp of Linux systems and networking concepts
  • Drive to dig into problems and burrow until the solution is found.
  • Excellent communicator; writes great documentation.

Bonus: 

  • Experience operating large-scale, distributed systems on top of cloud infrastructure such as Amazon Web Services or Google Compute Platform.
  • Broad understanding of the OS and of networking protocols with demonstrated ability to apply this understanding to solve real problems.
  • Strong proficiency with OS tuning and expertise at the application of debugging tools.
  • Strong sense of urgency and ownership over critical problem areas.
  • Demonstrable experience effectively coordinating response for outages and incidents.
  • Rare ability to inspire engineering teams to up their reliability game.

 

Site Reliability Engineer/ Systems En...
San Francisco

At Segment, we believe companies should be able to send their data wherever they want, whenever they want, with no fuss. We make this easy with a single platform that collects, stores and sends data to hundreds of business tools with the flip of a switch. Our goal is to make using data easy, and we’re looking for people to join us on the journey. We are excited about building toward a world where engineers at other companies spend their time working on their core product, rather than spending nights and weekends tweaking their customer data into various formats for 3rd party tools

Site Reliability Engineers (SRE) at Segment are members of the engineering team whose primary goal is to ensure the reliability, flexibility, and cost effectiveness of our production infrastructure. 
 
While these responsibilities are shared with the entire engineering team, SREs build and maintain the portions of our stack that ensure the entire engineering team can confidently ship software day in and day out. They complement other engineers with their deeper knowledge of the fundamental pieces of technology that underpin our production infrastructure. The SRE team are our in-house experts on building reliable, maintainable systems and they are responsible for setting the direction that determines how we go about constructing and deploying our production environment.
 

Core Responsibilities: 

  • Build software that improves the reliability, performance, and efficiency of Segment’s high-throughput, large-scale SaaS platform.
  • Collaborate with the entire engineering team on projects as the expert on reliability, performance, and efficiency.
  • Automate away the process of managing capacity, safely deploying software, and mitigating failures.
  • Troubleshoot and mitigate the thorniest problems in our most mission-critical systems. Advise the team during postmortems on effectively avoiding repeated incidents.
  • Share a 24x7 on-call rotation with the other engineers in your focus area.
  • Work with cutting edge technology, share with others through open source, and spread your expertise through contributions to our engineering blog.

Requirements: 

  • CS Degree and/or a demonstrable, solid understanding of CS fundamentals.
  • Proficient coder: strong with at least one programming language.
  • Solid grasp of Linux systems and networking concepts.
  • Drive to dig into problems and burrow until the solution is found.
  • Excellent communicator; writes great documentation.

Bonus: 

  • Experience operating large-scale, distributed systems on top of cloud infrastructure such as Amazon Web Services or Google Cloud Platform.
  • Broad understanding of the OS and of networking protocols with demonstrated ability to apply this understanding to solve real problems.
  • Strong proficiency with OS tuning and expertise at the application of debugging tools.
  • Strong sense of urgency and ownership over critical problem areas.
  • Demonstrable experience effectively coordinating response for outages and incidents.
  • Rare ability to inspire engineering teams to up their reliability game.

 

Infrastructure Engineer
San Francisco
At Segment, we believe companies should be able to send their data wherever they want, whenever they want, with no fuss. We make this easy with a single platform that collects, stores and sends data to hundreds of business tools with the flip of a switch. Our goal is to make using data easy, and we’re looking for people to join us on the journey. We are excited about building toward a world where engineers at other companies spend their time working on their core product, rather than spending nights and weekends tweaking their customer data into various formats for third-party tools.
 
Our infrastructure is mostly written in Go (we’re huge fans!), uses Docker containers for our 70 different microservices, and generally uses the latest and greatest from AWS.  Our small team is providing the data infrastructure for thousands of companies, and as a result we’re already processing terabytes of data each day.  We’re rapidly scaling our systems to keep up with our dramatic growth, and we’re looking for folks who love Kafka, NSQ, NoSQL databases, and distributed systems of every flavor.
 
Our customer data hub is helping companies achieve data nirvana, the blissful state you enter when all of your customer data is clean, complete, and accessible in your data warehouse and various analytics tools. Integrating with the Segment platform enables our customers and partners to build a new class of analytics models and marketing automation experiences. Though thousands of companies already build on top of our analytics platform, we’ve penetrated less than 1% of the market. We are building toward a world where all customer data in the world is flowing through Segment.
 

Projects you can dive into:

 
Real-time schema:
 
Segment's API pipeline processes billions of messages per day. The incoming messages are simple JSON objects, which must be tracked, parsed, and stored with a structured schema. Our API layer needs to allow schemas to be completely dynamic: when our customers issue a track call, each JSON object can introduce us to new properties we haven't seen before. Dealing with flexible data can be incredibly challenging, because we must adjust these schemas on the fly, in real time.
 
Doing this at scale is hard because the infrastructure has to absorb extreme spikes in reads and writes: our API promises customers the ability to send historical data in batches. Even during these spikes, we need to deliver query speeds under 100ms.
 
To make things even more interesting, sometimes data contradicts itself. A column might come in for a long time as a string and then start coming in as a number. How should we handle cases where we can't cast from one type to another? How do we propagate type changes in the downstream tools? How do we make sure that the system remains idempotent?
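
As a rough illustration of the type-conflict question above, here is a minimal Go sketch of a schema that grows as new properties arrive and reports a property whose type changes instead of overwriting it. The names `Schema`, `Observe`, `Conflict`, and the four-value `ColumnType` are hypothetical and invented for this example; they are not Segment's implementation, and a real system would also need to handle nested objects, nulls, and persistence.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ColumnType is a hypothetical, simplified set of column types.
type ColumnType string

const (
	TypeString  ColumnType = "string"
	TypeNumber  ColumnType = "number"
	TypeBoolean ColumnType = "boolean"
	TypeOther   ColumnType = "other" // nested objects, arrays, null
)

// Schema maps a property name to the type first observed for it.
type Schema map[string]ColumnType

// Conflict records a property whose incoming type disagrees with the schema.
type Conflict struct {
	Property string
	Have     ColumnType
	Got      ColumnType
}

// typeOf classifies a value decoded by encoding/json.
func typeOf(v interface{}) ColumnType {
	switch v.(type) {
	case string:
		return TypeString
	case float64: // encoding/json decodes every JSON number as float64
		return TypeNumber
	case bool:
		return TypeBoolean
	default:
		return TypeOther
	}
}

// Observe merges one JSON message into the schema. New properties are
// added; properties whose type changed are reported, not overwritten.
// Observing the same message twice leaves the schema unchanged.
func (s Schema) Observe(raw []byte) ([]Conflict, error) {
	var props map[string]interface{}
	if err := json.Unmarshal(raw, &props); err != nil {
		return nil, err
	}
	var conflicts []Conflict
	for name, value := range props {
		got := typeOf(value)
		have, seen := s[name]
		switch {
		case !seen:
			s[name] = got
		case have != got:
			conflicts = append(conflicts, Conflict{name, have, got})
		}
	}
	return conflicts, nil
}

func main() {
	schema := Schema{}
	schema.Observe([]byte(`{"plan": "pro", "seats": 5}`))
	// "plan" now arrives as a number; "trial" is a brand-new property.
	conflicts, _ := schema.Observe([]byte(`{"plan": 2, "trial": true}`))
	fmt.Println(schema, conflicts)
}
```

What to do with a reported conflict (cast, split into a second column, or reject) is exactly the open question the paragraph raises; the sketch only surfaces it.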
 
Queuing Topology

Acting as the middleman for billions of events isn’t easy. We essentially have to rebuild what the L4 layer provides, reliably delivering messages from clients in order, but at the L7 layer in the network stack. Things get complicated because clients expect us to queue instead of backing off. Messages get re-ordered, and we need to be prepared for any integration we send data to failing at any time.
 
We want to build a queueing topology to handle all of these cases gracefully, and scalably. If an integration’s endpoint goes down, it shouldn’t affect other destinations for that data. And if a customer suddenly batches a ton of data for an integration, they shouldn’t starve message delivery for their neighbors.
 
Most queueing systems don’t handle this case well. They’re severely limited in terms of partitions, topics, or whatever logical separation they use to provide isolation. We need a system that scales well, but also provides the same sorts of ordering and delivery guarantees that we’d get from a system like Kafka.
 
It’s a big, challenging piece of core infrastructure, but we’re feeling the pain more as customers exhaust the pipes for individual integrations.
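
To make the "noisy neighbor" requirement concrete, here is a small Go sketch of one possible building block: a queue that keeps a FIFO per customer and drains them round-robin, so one customer's large batch cannot starve the others while each customer's own messages stay in order. `FairQueue`, `Message`, and their methods are hypothetical names for this example. It is an in-memory illustration only, under the assumption of a single goroutine, and offers none of the durability or delivery guarantees discussed above.

```go
package main

import "fmt"

// Message is a hypothetical event bound for one destination.
type Message struct {
	Source string // the customer that sent it
	Body   string
}

// FairQueue keeps one FIFO per source and drains them round-robin, so a
// source that enqueues a large batch cannot starve the others.
// It is not safe for concurrent use.
type FairQueue struct {
	queues map[string][]Message
	order  []string // sources with pending messages, in service order
}

func NewFairQueue() *FairQueue {
	return &FairQueue{queues: make(map[string][]Message)}
}

// Push appends a message to its source's queue, preserving per-source order.
func (q *FairQueue) Push(m Message) {
	if len(q.queues[m.Source]) == 0 {
		q.order = append(q.order, m.Source) // source becomes active
	}
	q.queues[m.Source] = append(q.queues[m.Source], m)
}

// Pop returns the next message, taking one from each active source in turn.
func (q *FairQueue) Pop() (Message, bool) {
	if len(q.order) == 0 {
		return Message{}, false
	}
	src := q.order[0]
	q.order = q.order[1:]
	m := q.queues[src][0]
	q.queues[src] = q.queues[src][1:]
	if len(q.queues[src]) > 0 {
		q.order = append(q.order, src) // back of the line
	} else {
		delete(q.queues, src)
	}
	return m, true
}

func main() {
	q := NewFairQueue()
	for _, body := range []string{"a1", "a2", "a3"} {
		q.Push(Message{Source: "customer-a", Body: body}) // a large batch
	}
	q.Push(Message{Source: "customer-b", Body: "b1"}) // a single event
	for m, ok := q.Pop(); ok; m, ok = q.Pop() {
		fmt.Println(m.Source, m.Body)
	}
}
```

Isolating a failing integration would need a similar separation keyed by destination as well as by customer, which is where the partition and topic limits mentioned above start to bite.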

Query API

We take in hundreds of thousands of events every single minute. It’s roughly the number of new tweets and new snapchats combined. By agreeing to process events, pageviews, and click data, we’re effectively shouldering the scalability of all of our customers at once.
 
To date, we’ve scaled by making the system stateless. When needed, we can boot up more routing nodes, and more workers.
 
But being stateless limits us. We can’t enrich data as it’s passing through our processing pipeline, join user ids together, or perform other types of advanced analysis based upon the user’s prior actions. We want to be smarter, but combining all this data into a single database comes with serious scaling challenges.
 
We’d like to expose all of that user data, first through an internal query API and eventually to customers. That way our users can build their own pieces of custom tooling on top of it.
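
"Joining user ids together" is one concrete example of the state a stateless pipeline cannot keep. A minimal way to picture it is a union-find structure that maps every identifier seen for a user to one canonical ID. The Go sketch below is an assumption-laden illustration: `IdentityGraph`, `Alias`, and `Find` are hypothetical names, the data lives in a single in-memory map, and nothing here reflects how Segment actually stores identities.

```go
package main

import "fmt"

// IdentityGraph is a hypothetical union-find over user identifiers.
type IdentityGraph struct {
	parent map[string]string
}

func NewIdentityGraph() *IdentityGraph {
	return &IdentityGraph{parent: make(map[string]string)}
}

// Find returns the canonical ID for id, registering it on first sight.
func (g *IdentityGraph) Find(id string) string {
	p, ok := g.parent[id]
	if !ok {
		g.parent[id] = id
		return id
	}
	if p == id {
		return id
	}
	root := g.Find(p)
	g.parent[id] = root // path compression keeps later lookups short
	return root
}

// Alias records that two IDs belong to the same user.
func (g *IdentityGraph) Alias(a, b string) {
	ra, rb := g.Find(a), g.Find(b)
	if ra != rb {
		g.parent[rb] = ra
	}
}

func main() {
	g := NewIdentityGraph()
	g.Alias("anon-123", "user-42")   // an anonymous visitor logs in
	g.Alias("user-42", "device-abc") // the same user on a second device
	fmt.Println(g.Find("device-abc") == g.Find("anon-123"))
}
```

Keeping a structure like this consistent across many stateless workers is the kind of scaling challenge the paragraph above describes.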

Custom transforms

Today, Segment hosts 180 different transforms for matching data from our API to our partners’. But most of those integrations are code we maintain and develop. 
 
In recent years, there’s been an explosion of customer data tools, and there are thousands more we’d like to support.
 
It’s obvious the current system won’t scale to thousands of tools, or the millions of unique use cases that our customers have. But, we’d like to find a way to connect those tools to our hub, so customers can ‘free’ their data without having to run a complex data pipeline themselves. 
 
That might involve customers submitting a lambda-esque function, or a container for us to run. We’d love to let our customers and partners supply custom transforms for us to run. It’s a fairly complex isolation and sandboxing challenge: how do we make sure functions don’t misbehave, or affect one another? In some ways we’d almost be running ‘remote code execution as a service’. We’d love your help in getting us to the point where we can scale to support any customer use case.

Responsibilities

  • Build infrastructure to process terabytes of data per day and thousands of API calls per second
  • Use cutting-edge technologies such as AWS, Go, Docker, and Terraform to continue to scale our infrastructure
  • Relentlessly measure and optimize as Segment builds the highest-scale and most advanced analytics platform in the world

Requirements

  • CS Degree or equivalent knowledge of data structures and algorithms
  • 2+ years of industry experience building and owning large-scale distributed infrastructure
  • Expert knowledge developing and debugging in C/C++, Java, or Go

Full Stack Engineer
San Francisco

Segment’s mission is to make it super easy for companies to use their customer data to build incredible products. We’re building towards a future where all customer data in the world flows through Segment. As a Full Stack Engineer, you are essential to that future. You will work closely with designers to build the user interface that sits in front of the infrastructure that receives billions of API calls every day. We use cutting-edge tools like React, Webpack, Redux, and ES6. Some of our team specializes in CSS, some of us specialize in Go, but all of us are JavaScript experts and full stack engineers.

 
We are looking for a Full Stack Engineer who can:
  • Improve the reliability of our Go and Node.js services by cutting our 5xx rate to zero and implementing good failover/monitoring
  • Build well-tested client libraries with clear APIs to communicate with our Go services, and update services as needed when bugs are discovered
  • Move our frontend codebase from Deku to React
  • Build an auto-tracking feature into Analytics.js to allow codeless tracking and ex-post filtering of the tracking data
  • Break down ambiguous business goals into actionable technical projects
 
What are examples of work that Full Stack Engineers have done at Segment?
  • Built a UI to display a live stream of incoming API calls (sometimes hundreds of calls per second) as a debugging tool for our customers
  • Brought our site response time down by ~98% by caching templates server-side and optimizing database calls
  • Revamped our pricing, which increased revenue by roughly 25%
 
Requirements
  • Expert knowledge of JavaScript
  • Minimum of 4 years of industry experience in engineering
  • Deep understanding of the complexities involved in writing large single-page applications
  • Evidence of exposure to architectural patterns of high-scale web applications (e.g., well-designed APIs, high volume data pipelines, efficient algorithms)
  • Experience with web best practices such as A/B testing and test coverage