How Heap Built an Analytics Platform that Auto-Tracks Every User Event

Heap
Heap automatically captures every user action in your app and lets you measure it all. Clicks, taps, swipes, form submissions, page views, and more. Track events and segment users instantly. No pushing code. No waiting for data to trickle in.

Written by Paul Jaworski and adapted from an interview with Dan Robinson, CTO of Heap




Dan Robinson, CTO of Heap


When Dan Robinson joined Heap as the company's first engineer, it was unclear whether the product could even be built to work at scale. And that's exactly why he joined. Most startups, he says, face a significant risk in finding product-market fit. With Heap, he was absolutely confident that a need for the product existed. The real challenge would be a technical one.

Most analytics platforms require the user to choose the events they want tracked ahead of time. This requires significant developer time and the foresight to know which analytics events you'll care about later. Heap instead tracks everything up front and lets the user define events with a visual tool afterward.
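To make the "capture first, define later" idea concrete, here is a minimal sketch. The `RawEvent` fields, the selector strings, and the `define_click` helper are all hypothetical; the article does not describe Heap's actual schema or how its visual tool represents a definition.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class RawEvent:
    user_id: str
    kind: str       # e.g. "click", "pageview" (hypothetical values)
    target: str     # hypothetical selector or path
    timestamp: int  # seconds, arbitrary origin


# Everything is captured up front, before anyone decides what matters.
RAW_EVENTS = [
    RawEvent("u1", "pageview", "/pricing", 100),
    RawEvent("u1", "click", "button#signup", 130),
    RawEvent("u2", "pageview", "/pricing", 140),
    RawEvent("u2", "click", "a.docs-link", 150),
    RawEvent("u3", "click", "button#signup", 200),
]

# An event definition is just a named predicate applied after the fact.
EventDefinition = Callable[[RawEvent], bool]


def define_click(selector: str) -> EventDefinition:
    """Build a definition that matches clicks on one selector."""
    return lambda e: e.kind == "click" and e.target == selector


def matching(definition: EventDefinition, events: List[RawEvent]) -> List[RawEvent]:
    """Apply a definition retroactively to already-captured events."""
    return [e for e in events if definition(e)]


signup_clicked = define_click("button#signup")
hits = matching(signup_clicked, RAW_EVENTS)
print(len(hits), sorted({e.user_id for e in hits}))
```

Because the definition is created after the data exists, it immediately covers historical events; no tracking code had to be shipped in advance.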


Humble Beginnings

Heap was founded by Matin Movassate, a former Product Manager at Facebook, and Ravi Parikh. They entered the Winter '11 class of Y Combinator with a simple MVP of the product: a single Node.js server running on EC2 with PostgreSQL. All of the persistent data had to be mirrored in memory for the queries to be fast. Dan joined soon after, and his first project was to rebuild the infrastructure so that it could handle more than ~200 GB of data.

The early version of the product was a JavaScript snippet that customers could install on their site to track UI events. This was accompanied by a customer-facing dashboard written in jQuery and D3.js where users could graph their data, create conversion funnels, or do things like click on a recent user and see every single event Heap had tracked for them. These analyses were transformed into SQL queries for Postgres on the backend. Much of the data still had to be stored in memory for the queries to be fast enough to satisfy customers, though the team knew this was not a long-term solution.
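As an illustration of how a dashboard analysis might become a SQL query, the sketch below expresses a two-step conversion funnel as a self-join. The `events` table layout and the `two_step_funnel` helper are assumptions made for this example, and Python's built-in `sqlite3` stands in for PostgreSQL so the snippet runs anywhere; it is not Heap's actual query generator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, kind TEXT, target TEXT, ts INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("u1", "pageview", "/pricing", 100),
        ("u1", "click", "button#signup", 130),
        ("u2", "pageview", "/pricing", 140),
        ("u2", "click", "a.docs-link", 150),
        ("u3", "click", "button#signup", 200),  # never viewed pricing
        ("u4", "click", "button#signup", 50),   # clicked before viewing
        ("u4", "pageview", "/pricing", 60),
    ],
)

# Step 1 rows are "a"; step 2 rows are "b" and must come later in time
# for the same user. COUNT(DISTINCT ...) ignores the NULLs produced by
# the LEFT JOIN for users who never reached step 2.
FUNNEL_SQL = """
SELECT COUNT(DISTINCT a.user_id) AS reached_step1,
       COUNT(DISTINCT b.user_id) AS reached_step2
FROM events AS a
LEFT JOIN events AS b
       ON b.user_id = a.user_id
      AND b.kind = :kind2 AND b.target = :target2
      AND b.ts > a.ts
WHERE a.kind = :kind1 AND a.target = :target1
"""


def two_step_funnel(conn, step1, step2):
    """Each step is a (kind, target) pair; returns (step1_users, step2_users)."""
    params = {
        "kind1": step1[0], "target1": step1[1],
        "kind2": step2[0], "target2": step2[1],
    }
    return conn.execute(FUNNEL_SQL, params).fetchone()


result = two_step_funnel(conn, ("pageview", "/pricing"), ("click", "button#signup"))
print(result)  # (users who viewed pricing, users who then clicked signup)
```

With this sample data, three users view the pricing page and one of them clicks the signup button afterward.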


CEO Matin Movassate cutting Dan's "welcome cake"


PostgreSQL was an easy early decision for the founding team. The relational data model fit the types of analyses they would be doing (filtering, grouping, joining, etc.), and it was the database they knew best. Shortly after adopting Postgres, they discovered Citus, a tool that makes it easy to distribute queries. Although it was a young project and a fork of Postgres at that point, Dan says the Citus team was very available and highly expert, and that it wouldn’t have been very difficult to move back to plain Postgres if they needed to:

The stuff they forked was in query execution. You could treat the worker nodes like regular PG instances.

Citus also gave them a ton of flexibility to make queries fast, and again, they felt the data model was the best fit for their application.


Growing Pains

In early 2014, Heap released an event visualizer tool that allowed non-technical people to use the product, which Dan believes was the key piece in achieving product-market fit. As a result, the company grew to the point where they had users who were processing millions of events per month. They started to hit the limits of the initial Citus infrastructure. As larger customers began signing up, the large datasets they brought with them became difficult to handle. Eventually, the analyses became too slow to be viable, and simply “throwing more machines” at the problem was cost-prohibitive.

The early version of Citus that Heap was using didn’t have much distributed systems functionality, so the team had rolled their own solutions for things like recovering from a failed node, splitting data into different sharding schemes, and moving data between machines. That homegrown functionality was starting to have issues at scale.

The major breakthrough came when they found a way to cheaply index the event definitions users were creating. These were the only points of data that users were querying, and each event definition represented far less than 1% of the overall data Heap was collecting. It became clear that they could achieve substantial performance gains if they could build an infrastructure around indexing these events.
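The article does not say which indexing mechanism Heap used. One way to index only the rows that match an event definition is a partial index (`CREATE INDEX ... WHERE ...`), which PostgreSQL supports. The sketch below assumes that approach and uses Python's built-in `sqlite3`, which also supports partial indexes, as a stand-in; the table, index name, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, kind TEXT, target TEXT, ts INTEGER)")

# 1,000 raw pageviews plus a handful of rows matching one event definition.
rows = [("u%d" % i, "pageview", "/page/%d" % (i % 50), i) for i in range(1000)]
rows += [("u%d" % i, "click", "button#signup", 1000 + i) for i in range(5)]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", rows)

# The index covers only rows that satisfy the definition's predicate,
# so its size tracks the defined event, not the whole table.
conn.execute(
    """
    CREATE INDEX idx_signup_clicked ON events (ts)
    WHERE kind = 'click' AND target = 'button#signup'
    """
)

# A query that repeats the same predicate is eligible to use the index.
QUERY = """
SELECT COUNT(*) FROM events
WHERE kind = 'click' AND target = 'button#signup' AND ts >= 1000
"""
count = conn.execute(QUERY).fetchone()[0]
matching_share = count / len(rows)

# Print the planner's description so the chosen access path can be inspected.
for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY):
    print(row[-1])
print(count, round(matching_share, 4))
```

Here the defined event accounts for 5 of 1,005 rows, which mirrors the article's point that each definition touches well under 1% of the collected data.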

Heap searched for an existing tool that was a mature, natively distributed system and that would allow them to express the full range of analyses they needed and index the event definitions that made up those analyses. After coming up empty on this search, they decided to compromise on the “maturity” requirement and build their own distributed system around Citus and sharded PostgreSQL. It was at this point that they also introduced Kafka as a queueing layer between the Node.js application servers and Postgres.

The front end had also begun to grow unwieldy. The original jQuery pieces became difficult to maintain and scale, and a decision was made to introduce Backbone, Marionette, and TypeScript. Ultimately this ended up being a “detour” in the search for a scalable and maintainable front-end solution. The system did allow for developers to reuse components efficiently, but adding features was a difficult process, and it eventually became a bottleneck in advancing the product.


Reducing Cost and Improving Performance

Because of the massive amounts of data that Heap is ingesting, it’s taken a great deal of work to get to a cost-viable product. One of the major projects in reducing cost involved switching to ZFS, which allows compression at the file system level. That switch alone allowed them to compress their data by a factor of 2. Dan says they’re currently experimenting with further improvements that could increase this compression to 3-3.5x. Additional gains have come from doing some “low-level CPU profiling” to determine where their resources were being used on the EC2 instances.
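A quick back-of-the-envelope calculation shows what those ratios mean for disk footprint. The 100 TB raw volume below is a made-up figure chosen for easy arithmetic, not a number from the interview; only the 2x, 3x, and 3.5x ratios come from the text.

```python
def stored_tb(raw_tb: float, ratio: float) -> float:
    """Disk space needed for raw_tb of data at a given compression ratio."""
    return raw_tb / ratio


def extra_saving(old_ratio: float, new_ratio: float) -> float:
    """Fraction of the currently used disk freed by moving to new_ratio."""
    return 1 - old_ratio / new_ratio


RAW_TB = 100.0  # hypothetical raw volume, for illustration only

for ratio in (1.0, 2.0, 3.0, 3.5):
    print("%.1fx -> %.1f TB on disk" % (ratio, stored_tb(RAW_TB, ratio)))

# Going from 2x to 3x or 3.5x frees roughly a third to two fifths
# of the space used at 2x.
print(round(extra_saving(2.0, 3.0), 3), round(extra_saving(2.0, 3.5), 3))
```

Note the diminishing returns: the first 2x halves the footprint, while moving from 2x to 3.5x frees a smaller share of the original raw volume.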

Today, they’re doubling query speed each quarter and constantly seeking even more improvements. As Dan points out, the size of the customer they can support is directly correlated to the performance of the application.

If we can make queries 3x faster, we can support a customer who is 3x larger.


DevOps and Organizational Structure

Engineering teams at Heap are broken into groups of 3-5 developers, and about half of them are working on infrastructure. This is mostly related to business priorities, since again, the primary challenge behind the product is not “What new features should we add?” but “How do we scale to customers who are 100x larger?”


Dan with members of the engineering team


All of the code at Heap lives on GitHub, and they use CircleCI, which in turn kicks off Ansible scripts for deployment. They use Salt for managing machine configuration, and Terraform to manage all of their AWS configuration. Currently, everything is running on AWS for Heap, so Terraform was an easy choice, as they loved the modularity and “great dev workflows” it provides.


Lessons Learned

After 5 years of building one of the fastest growing tools for analytics and ingesting billions of events, Dan Robinson has some valuable advice for budding CTOs:

There are decisions you make that are hard to reverse and decisions that are easy to reverse - like one-way doors and two-way doors. Most things are two-way doors and it's better to just go fast and learn something.

He advises that if you’re building a serious distributed system where the performance is critical to the success of your application, one-way doors include things like selecting your data model and data system:

If you want to change from PostgreSQL to MySQL, that's going to be a rewrite.

If he could go back in time, Dan probably would have started using Kafka on day one. He’s learned that it’s a very good fit for an analytics tool, since you can handle a huge number of incoming writes with relatively low latency. Kafka also gives you the ability to “replay” the data flow: “It’s like a commit log for your whole infrastructure.”
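The "commit log" idea can be sketched with a small in-memory stand-in. The `CommitLog` class below is hypothetical and deliberately not the Kafka client API; it only illustrates the two properties mentioned above: writes are cheap appends, and any consumer can rebuild its state by re-reading from an earlier offset.

```python
class CommitLog:
    """Append-only list of records addressed by offset (toy stand-in for a topic)."""

    def __init__(self):
        self._records = []

    def append(self, record) -> int:
        """Store a record and return the offset it was written at."""
        self._records.append(record)
        return len(self._records) - 1

    def read_from(self, offset: int):
        """Yield (offset, record) pairs starting at the given offset."""
        for i in range(offset, len(self._records)):
            yield i, self._records[i]


def count_by_kind(log: CommitLog, start_offset: int = 0) -> dict:
    """A downstream consumer: derive counts purely from the log contents."""
    counts = {}
    for _, record in log.read_from(start_offset):
        counts[record["kind"]] = counts.get(record["kind"], 0) + 1
    return counts


log = CommitLog()
for kind in ["click", "pageview", "click"]:
    log.append({"kind": kind})

first_pass = count_by_kind(log)

# Suppose the derived counts are lost or a bug is fixed in the consumer:
# replaying from offset 0 rebuilds the same state from the log alone.
rebuilt = count_by_kind(log, start_offset=0)
print(first_pass, rebuilt)
```

Because the log, not the consumer, is the source of truth, downstream stores can be rebuilt or re-derived without asking producers to resend anything.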


Drawing out the current infrastructure


One of the biggest benefits in adopting Kafka has been the peace of mind that it brings. In an analytics infrastructure, it’s often possible to make data ingestion idempotent. In Heap’s case, that means that, if anything downstream from Kafka goes down, they won’t lose any data – it’s just going to take a bit longer to get to its destination. He’s also learned that you want the path between data hitting your servers and your initial persistence layer (in this case, Kafka) to be as short and simple as possible, since that is the surface area where a failure means you can lose customer data.
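Idempotent ingestion means that processing the same event twice leaves the store in the same state as processing it once, so a consumer can safely restart and re-read after a failure. The sketch below assumes each event carries a unique `event_id`, which the article does not state; `EventStore` is a hypothetical in-memory stand-in for the downstream database.

```python
class EventStore:
    """Toy downstream store that deduplicates on event_id."""

    def __init__(self):
        self._by_id = {}

    def ingest(self, event: dict) -> bool:
        """Insert the event unless it was already stored; return True if new."""
        if event["event_id"] in self._by_id:
            return False
        self._by_id[event["event_id"]] = event
        return True

    def __len__(self) -> int:
        return len(self._by_id)


# Events as they sit in the queue (assumed to carry unique ids).
queued = [
    {"event_id": "e1", "kind": "click"},
    {"event_id": "e2", "kind": "pageview"},
    {"event_id": "e3", "kind": "click"},
]

store = EventStore()

# First attempt: the consumer handles two events, then "crashes"
# before recording how far it got.
for event in queued[:2]:
    store.ingest(event)

# After restart it re-reads from the beginning. The duplicates are
# ignored, so nothing is lost and nothing is double counted.
newly_stored = [store.ingest(event) for event in queued]
print(len(store), newly_stored)
```

The cost of a downstream outage is then only delay: the queue holds the events until the consumer catches up.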

Dan also says he’s been “continuously shocked” at how often YAGNI has been true:

I remember writing our exports feature in 2014. It was a simple feature that let you get a nightly dump of your Heap data on S3. The code was littered with TODOs that I was completely sure we were going to need to resolve within the next few weeks – minor extensions of the feature, configurability options, operability improvements, or known technical debt items. A lot of those TODOs didn't come up for years, and some of them are still there.

Instead of focusing on writing perfect software, he believes it’s much more important to get something in front of users. This is something you may have heard by now from a product perspective, but it’s equally important for code:

You’ll learn what areas of your technical debt actually matter, and that learning is a lot more important than getting decisions right if the decisions are easily reversible.


The Present and Future of Heap

Today, the Heap product consists primarily of a customer-facing dashboard powered by React, MobX, and TypeScript on the front end. Data is sent up to a Node.js server, passed on to Kafka, and eventually ends up in PostgreSQL. All of the data customers perform analyses on in the dashboard comes, of course, from the JavaScript snippet installed on users’ websites.

Most recently, Heap has released a feature that also pulls in data from third-party providers like MailChimp, Stripe, Optimizely, and Shopify. The team there realized that a large percentage of their customers’ data didn’t actually live on their own platform, but instead was scattered around these various vendors.

Dan points out that the additional data volume hasn’t been a significant challenge - especially since events like payments are very high value from an analytics perspective. The real challenge has been learning the “language” of these providers. Each one has a different API, a different way to format event data, and different semantics for retrieving it.

Once the data is in Heap, they also had to figure out how to correlate that data with their own. How do you attribute an email sent in one system to a button click in another? The answer was building out their own UI for Heap users to tie the events together.
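The two problems described above, translating each provider's format and then tying the result to an existing user, can be sketched as follows. The payload shapes are invented for illustration and are not the real MailChimp or Stripe formats; the email-to-user mapping stands in for whatever rule a Heap user would configure in the UI.

```python
# One adapter per provider turns its payload into a common shape.
# Both payload layouts here are hypothetical.
def from_email_provider(payload: dict) -> dict:
    return {
        "source": "email",
        "identity": payload["recipient"].lower(),
        "name": "email_sent",
        "ts": payload["sent_at"],
    }


def from_payment_provider(payload: dict) -> dict:
    return {
        "source": "payments",
        "identity": payload["customer_email"].lower(),
        "name": "payment",
        "ts": payload["created"],
    }


def attach_user(event: dict, identity_to_user: dict) -> dict:
    """Resolve the provider-side identity to a first-party user id (or None)."""
    return {**event, "user_id": identity_to_user.get(event["identity"])}


# First-party data: a known user and one captured click.
identity_to_user = {"ada@example.com": "u1"}
first_party = [{"source": "web", "name": "click signup", "ts": 130, "user_id": "u1"}]

external = [
    from_email_provider({"recipient": "Ada@Example.com", "sent_at": 100}),
    from_payment_provider({"customer_email": "ada@example.com", "created": 400}),
]
resolved = [attach_user(e, identity_to_user) for e in external]

# Once everything shares a user id, the events form one ordered timeline.
timeline = sorted(
    [e for e in first_party + resolved if e["user_id"] == "u1"],
    key=lambda e: e["ts"],
)
print([e["name"] for e in timeline])
```

Keeping provider-specific logic inside small adapters means a new vendor only requires one new function, while the correlation step stays the same.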

Beyond features, performance improvements continue to be a major focus today and in the future. Heap currently stores 1 petabyte of data, ingests 1 billion events per day, and performs over 250,000 analyses per week.




