Stream & Go: News Feeds for Over 300 Million End Users

Stream
Build scalable feeds, activity streams & chat in a few hours instead of months.

By Thierry Schellenbach, CEO, Stream.


Stream + Go


Stream is an API that enables developers to build news feeds and activity streams (try the API). We are used by over 500 companies and power the feeds of more than 300 million end users. Companies such as Product Hunt, Under Armour, Powerschool, Bandsintown, Dubsmash, Compass and Fabric (Google) rely on Stream to power their news feeds. In addition to the API, the founders of Stream also wrote the most widely used open source solution for building scalable feeds.

Here’s what Stream looks like today:

  • Number of servers: 180
  • Feed updates per month: 34 billion
  • Average API response time: 12ms
  • Average real-time response time: 2ms
  • Regions: 4 (US-East, EU-West, Tokyo and Singapore)
  • API requests per month: 1 billion (~20k/minute)

Given that most of our customers are engineers, we often talk about our stack. Here’s a high level overview:

  1. Go is our primary programming language.
  2. We use a custom solution built on top of RocksDB + Raft for our primary database (we started out on Cassandra but wanted more control over performance). PostgreSQL stores configs, API keys, etc.
  3. OpenTracing and Jaeger handle tracing, StatsD and Grafana provide detailed monitoring and we use the ELK stack for centralized logging.
  4. Python is our language of choice for machine learning, devops and our website https://getstream.io (Django, DRF & React).
  5. Stream uses a combination of fanout-on-write and fanout-on-read. This results in fast read performance when a user opens their feed, as well as fast propagation times when a famous user posts an update.

Tommaso Barragli and I (@tschellenbach) are the developers who started Stream nearly 3 years ago. We founded the company in Amsterdam, participated in Techstars NYC 2015 and opened up our Boulder, Colorado office in 2016. It’s been quite a crazy ride in a fairly short amount of time! With over 15 developers and a host of supporting roles, including sales and marketing, the team feels enormous compared to the early days.

The Challenge: News Feeds & Activity Streams

The nature of follow relationships makes it hard to scale feeds. Most of you will remember Facebook’s long load times, Twitter’s fail whale or Tumblr’s year of technical debt. Feeds are hard to scale since there is no clear way to shard the data. Follow relationships connect everyone to everyone else. This makes it difficult to split data across multiple machines. If you want to learn more about this problem, check out these papers:

At a very high level there are 3 different ways to scale your feeds:

  1. Fanout-on-write: basically precompute everything. It’s expensive, but it’s easy to shard the data.
  2. Fanout-on-read: hard to scale, but more affordable and new activities show up faster.
  3. Combination of the above two approaches: better performance and reduced latency, but increased code complexity.

Stream uses a combination of fanout-on-write and fanout-on-read. This allows us to effectively support both customers with highly connected graphs, as well as customers with a more sparse dataset. This is important since the ways in which our customers use Stream are very different. Have a look at these screenshots from Bandsintown, Unsplash, and Product Hunt:
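To make the hybrid approach concrete, here is a minimal sketch in Go. This is not Stream's actual implementation; the types, the in-memory maps and the `fanoutThreshold` cutoff are all simplified, hypothetical stand-ins. The idea is simply that "normal" authors are pushed to every follower's precomputed timeline at write time, while very popular authors are merged in at read time:

```go
package main

import "fmt"

// Activity is a single feed item (types are simplified, hypothetical).
type Activity struct {
	Actor, Verb, Object string
}

// Feeds keeps the follower graph, precomputed timelines and per-author posts.
type Feeds struct {
	followers map[string][]string   // author -> followers
	timeline  map[string][]Activity // fanout-on-write: precomputed feeds
	posts     map[string][]Activity // fanout-on-read: each author's own posts
}

func NewFeeds() *Feeds {
	return &Feeds{
		followers: make(map[string][]string),
		timeline:  make(map[string][]Activity),
		posts:     make(map[string][]Activity),
	}
}

func (f *Feeds) Follow(follower, author string) {
	f.followers[author] = append(f.followers[author], follower)
}

// fanoutThreshold is an illustrative cutoff: authors with more followers
// than this are merged at read time instead of being pushed to everyone.
const fanoutThreshold = 1000

// AddActivity stores the post and, for non-famous authors, fans it out
// to every follower's precomputed timeline.
func (f *Feeds) AddActivity(author string, a Activity) {
	f.posts[author] = append(f.posts[author], a)
	if len(f.followers[author]) <= fanoutThreshold {
		for _, follower := range f.followers[author] {
			f.timeline[follower] = append(f.timeline[follower], a)
		}
	}
}

// ReadFeed returns the precomputed timeline merged with the posts of any
// famous authors this user follows (those skipped the write-time fanout).
func (f *Feeds) ReadFeed(user string, famousFollowed []string) []Activity {
	feed := append([]Activity{}, f.timeline[user]...)
	for _, author := range famousFollowed {
		feed = append(feed, f.posts[author]...)
	}
	return feed
}

func main() {
	f := NewFeeds()
	f.Follow("bob", "alice") // bob follows alice
	f.AddActivity("alice", Activity{"alice", "post", "photo:1"})
	fmt.Println(len(f.ReadFeed("bob", nil))) // prints 1
}
```

The real system also has to handle ranking, aggregation and persistence, but the read/write trade-off is the same: write amplification for sparse graphs, read-time merging for highly connected ones.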

screenshot examples


Switching from Python to Go

After years of optimizing our existing feed technology, we decided to make a larger leap with Stream 2.0. While the first iteration of Stream was powered by Python and Cassandra, for Stream 2.0 we switched our infrastructure to Go. The main reason we switched from Python to Go is performance. Certain features of Stream, such as aggregation, ranking and serialization, were very difficult to speed up in Python.

We’ve been using Go since March 2017 and it’s been a great experience so far. Go has greatly increased the productivity of our development team. Not only has it improved the speed at which we develop, it’s also 30x faster for many components of Stream.

The performance of Go greatly influenced our architecture in a positive way. With Python we often found ourselves delegating logic to the database layer purely for performance reasons. The high performance of Go gave us more flexibility in terms of architecture. This led to a huge simplification of our infrastructure and a dramatic improvement of latency. For instance, we saw a 10 to 1 reduction in web-server count thanks to the lower memory and CPU usage for the same number of requests.

Initially we struggled a bit with package management for Go. However, using Dep together with VG has given us a great workflow.

Go Stream

If you’ve never tried Go, you’ll want to try this online tour: https://tour.golang.org/welcome/1

Go as a language is heavily focused on performance. The built-in pprof tool is amazing for finding performance issues. Uber’s go-torch library is great for visualizing pprof data, and its flame graph functionality will be bundled into pprof itself in Go 1.10.

Flame Graph


Switching from Cassandra to RocksDB & Raft

Stream 1.0 leveraged Cassandra for storing the feed. Cassandra is a common choice for building feeds. Instagram, for instance, started out with Redis but eventually switched to Cassandra to handle their rapid usage growth. Cassandra can handle write-heavy workloads very efficiently.

Cassandra is a great tool that allows you to scale write capacity simply by adding more nodes, though it is also very complex. This complexity made it hard to diagnose performance fluctuations. Even though we had years of experience with running Cassandra, it still felt like a bit of a black box. When building Stream 2.0 we decided to go for a different approach and build Keevo. Keevo is our in-house key-value store built upon RocksDB, gRPC and Raft.

RocksDB is a highly performant embeddable database library developed and maintained by Facebook’s data engineering team. RocksDB started as a fork of Google’s LevelDB that introduced several performance improvements for SSDs. Nowadays RocksDB is a project in its own right and is under active development. It is written in C++ and it’s fast. Have a look at how this benchmark handles 7 million QPS. In terms of technology it’s much simpler than Cassandra. This translates into reduced maintenance overhead, improved performance and, most importantly, more consistent performance. It’s interesting to note that LinkedIn also uses RocksDB for their feed.

Our infrastructure is hosted on AWS and is designed to survive entire availability zone outages. Unlike Cassandra, Keevo clusters organize nodes into leaders and followers. When a leader (master) node becomes unavailable, the other nodes in the same deployment start an election and pick a new leader. Electing a new leader is a fast operation and barely impacts live traffic.

To do this, Keevo implements the Raft consensus algorithm using Hashicorp’s Go implementation. This ensures that every bit stored in Keevo is stored on 3 different servers and operations are always consistent. This site does a great job of visualizing how Raft works: https://raft.github.io/
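The core trick that makes elections fast is Raft's randomized election timeout: whichever node times out first becomes the candidate, and randomization makes split votes unlikely. The toy simulation below illustrates just that one mechanism in plain Go; it is not Keevo's code and omits everything a real Raft implementation tracks (terms, vote counting, log indexes), all of which Hashicorp's library handles for you:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// node is a toy Raft-style participant: it waits for a randomized
// election timeout and, if no heartbeat arrives first, stands for election.
type node struct {
	id        int
	heartbeat chan struct{} // a live leader would ping this channel
}

// electLeader simulates one election round: the node whose timeout fires
// first becomes the candidate and wins. Real Raft additionally tracks
// terms, counts votes from a majority, and replicates a log.
func electLeader(nodes []*node) int {
	winner := make(chan int, len(nodes))
	for _, n := range nodes {
		go func(n *node) {
			// Randomized timeouts (Raft suggests 150-300ms) make it
			// unlikely that two nodes stand for election simultaneously.
			timeout := time.Duration(150+rand.Intn(150)) * time.Millisecond
			select {
			case <-n.heartbeat:
				return // the leader is alive; remain a follower
			case <-time.After(timeout):
				winner <- n.id // timed out first: become the new leader
			}
		}(n)
	}
	return <-winner
}

func main() {
	nodes := []*node{
		{id: 1, heartbeat: make(chan struct{})},
		{id: 2, heartbeat: make(chan struct{})},
		{id: 3, heartbeat: make(chan struct{})},
	}
	fmt.Println("new leader:", electLeader(nodes))
}
```

Because no heartbeat ever arrives in this simulation, some node always times out within ~300ms, which is roughly why failover barely impacts live traffic.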

raft


Not Quite Microservices

By leveraging Go and RocksDB we’re able to achieve great feed performance. The average response time is around 12ms. The architecture lies somewhere between a monolith and microservices. Stream runs on the following 7 services:

  1. Stream API
  2. Keevo
  3. Real time & Firehose
  4. Analytics
  5. Personalization & Machine learning
  6. Site & Dashboard
  7. Async Workers

To see all our services divided into stacks, head over here.

Personalization & Machine Learning

Almost all large apps with feeds use machine learning and personalization. For instance, LinkedIn prioritizes the items in your feed. Instagram’s explore feed displays pictures outside of the people you follow that you might be interested in. Etsy uses a similar approach to optimize ecommerce conversion. Stream supports the following 5 use cases for personalization:

Documentation for building personalized feeds.

All of these personalization use cases rely on combining feeds with analytics and machine learning. For the machine learning side we generate the models using Python. The models are different for each of our enterprise customers. Typically we’ll use one of these amazing libraries:

Analytics

Analytics data is collected using a tiny Go-based server. In the background, it will spawn go-routines to rollup the data as needed. The resulting metrics are stored in Elastic. In the past, we looked at Druid, which seems like a solid project. For now, we could get away with a simpler solution though.

Dashboard & Site

The dashboard is powered by React and Redux. We also use React and Redux for all of our example applications:

The site, as well as the API for the site, is powered by Python, Django and Django Rest Framework. Stream is sponsoring Django Rest Framework since it’s a pretty great open source project. If you need to build an API quickly there is no better tool than DRF and Python.

We use Imgix to resize the images on our site. For us, Imgix is cost efficient, fast and overall a great service. Thumbor is a good open source alternative.

Real time

Our real time infrastructure is based on Go, Redis and the excellent gorilla websocket library. It implements the Bayeux protocol. In terms of architecture it’s very similar to the node based Faye library.
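At the heart of any Bayeux-style system is a hub that fans published messages out to every subscriber of a channel. The sketch below shows that publish/subscribe core in plain Go; it is a deliberately stripped-down illustration, not our production code: each inbox channel stands in for a websocket connection, and a real hub would also need locking, unsubscribe handling and backpressure for slow clients.

```go
package main

import "fmt"

// hub fans messages out to all clients subscribed to a channel,
// mirroring the publish/subscribe model of the Bayeux protocol.
// Each inbox stands in for one websocket connection.
type hub struct {
	subs map[string][]chan string // channel name -> subscriber inboxes
}

func newHub() *hub { return &hub{subs: make(map[string][]chan string)} }

// Subscribe registers a new client on a channel and returns its inbox.
func (h *hub) Subscribe(channel string) chan string {
	inbox := make(chan string, 16) // buffered so Publish doesn't block
	h.subs[channel] = append(h.subs[channel], inbox)
	return inbox
}

// Publish delivers a message to every subscriber of the channel.
func (h *hub) Publish(channel, msg string) {
	for _, inbox := range h.subs[channel] {
		inbox <- msg
	}
}

func main() {
	h := newHub()
	a := h.Subscribe("feed:user:42")
	b := h.Subscribe("feed:user:42")
	h.Publish("feed:user:42", "new activity")
	fmt.Println(<-a, <-b) // both subscribers receive the message
}
```

In the real system a goroutine per connection drains its inbox and writes frames over the websocket, which is exactly the model the gorilla library is built around.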

It was interesting to read the “Ditching Go for Node.js” post on Hacker News. The author moves from Go to Node to improve performance. We actually did the exact opposite and moved from Node to Go for our real time system. The new Go-based infrastructure handles 8x the traffic per node.

Devops, Testing & Multiple Regions

In terms of devops the provisioning and configuration of instances is fully automated using a combination of:

Because our infrastructure is defined in code it has become trivial to launch new regions. We heavily use CloudFormation. Every single piece of our stack is defined in a CloudFormation template. If needed we are able to spawn a new dedicated shard in a few minutes. In addition, AWS Parameter Store is used to hold application settings. Our largest deployment is in US-East, but we also have regions in Tokyo, Singapore and Dublin.
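As a rough illustration of what "everything is a template" means, a CloudFormation fragment for one of our instance types has the following shape. Every name, size and parameter here is hypothetical, not copied from Stream's actual templates:

```yaml
# Hypothetical fragment -- resource names, sizes and parameters are
# illustrative only.
Parameters:
  BaseImageId:
    Type: AWS::EC2::Image::Id

Resources:
  ApiInstance:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: c4.large
      ImageId: !Ref BaseImageId
      Tags:
        - Key: role
          Value: api
```

Because the whole stack is expressed this way, spinning up a new region or a dedicated shard is a matter of instantiating templates with different parameters.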

A combination of Puppet and Cloud-init is used to configure our instances. We run our self-contained Go binaries directly on the EC2 instances without any additional containerization layer.

Releasing new versions of our services is done by Travis. Travis first runs our test suite. Once it passes, it publishes a new release binary to GitHub. Common tasks such as installing dependencies for the Go project, or building a binary are automated using plain old Makefiles. (We know, crazy old school, right?) Our binaries are compressed using UPX.
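A typical Makefile for one of our Go services looks roughly like the sketch below. The target names are illustrative (the real Makefiles differ per service), but the shape is the point: plain, dependency-ordered targets that both Travis and developers invoke the same way.

```makefile
# Illustrative targets only -- the real Makefiles differ per service.
build:
	go build -o bin/service ./cmd/service

test:
	go test ./...

release: test build
	upx bin/service   # compress the binary before publishing
```

Because `release` depends on `test`, a broken test suite fails the build before a binary is ever produced.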

Tool Highlight: Travis

Travis has come a long way over the past few years. I used to prefer Jenkins in some cases since it was easier to debug broken builds. With the addition of the aptly named “debug build” button, Travis is now the clear winner. It’s easy to use and free for open source, with no need to maintain anything.


Next, we use Fabric to do a rolling deploy to our AWS instances. If anything goes wrong during the deploy, Fabric halts it. We take stability very seriously:

  • A high level of test coverage is required.
  • Releases are created by Travis (making it hard to deploy without running tests).
  • Code is reviewed by at least 2 team members.
  • Our extensive QA integration test suite evaluates if all 7 components still work.

We’ve written about our experience with testing our Go codebase. When things do break we do our best to be transparent about the ongoing issue:

VictorOps is a recent addition to our support stack. It’s made it very easy to collaborate on ongoing issues.

Tool Highlight: VictorOps

The best part about VictorOps is how they use a timeline to collaborate amongst team members. VictorOps is an elegant way to keep our team in the loop about outages. It also integrates well with Slack. This setup enables us to quickly react to any problems that make it into production, work together and resolve them faster.


Victorops


The vast majority of our infrastructure runs on AWS:

The devops responsibilities are shared across our team. While we do have one dedicated devops engineer, all our developers have to understand and own the entire workflow.

Monitoring

Stream uses OpenTracing for tracing and Grafana for beautiful dashboards. The metrics behind Grafana are collected using StatsD. The end result is this beauty:


monitoring


We track our errors in Sentry and use the ELK stack to centralize our logs.

Tool Highlight: OpenTracing + Jaeger

One new addition to the stack is OpenTracing. In the past we used New Relic, which works like a charm for Python but isn’t able to automatically collect tracing information for Go. OpenTracing with Jaeger is a great solution that works very well for Stream. It also has, perhaps, the best logo for a tracing solution:

tool highlight: OpenTracing & Jaeger

Closing Thoughts

Go is an absolutely amazing language and has been a major win in terms of performance and developer productivity. For tracing we use OpenTracing and Jaeger. Our monitoring is running on StatsD and Graphite. Centralized logging is handled by the ELK stack.

Stream’s main database is a custom solution built on top of RocksDB and Raft. In the past we used Cassandra, which we found hard to maintain and which didn’t give us enough control over performance when compared to RocksDB.

We leverage external tools and solutions for everything that’s not a core competence. Redis hosting is handled by ElastiCache, Postgres by RDS, email by Mailgun, test builds by Travis and error reporting by Sentry.

Thank you for reading about our stack! If you’re a user of Stream, please be sure to add Stream to your stack here on StackShare. If you’re a talented individual, come work with us! And finally, if you haven’t tried out Stream yet, take a look at this quick tutorial for the API.

Open jobs at Stream
Backend Software Developer
Amsterdam

We are looking for a full-time Senior Software Engineer to join our development team. Job duties will include working on Stream's core API technology as well as designing and building high-performance software.

What you will be doing

Most of your day will be dedicated to software design, research, and coding. On typical projects, you will have a lot of freedom and you will be paired with another team member. Our team is made up of very experienced engineers, some with more than 10 years of experience. By working together you will learn from each other along the way. You will have an enormous impact on making our API service faster, more scalable and more flexible.

You will add new features to the service and find ways to make existing ones perform orders of magnitude faster. Our customers have millions of users; they use Stream for mission-critical features such as showing content and exposing core functionality of their application. Building stable and reliable software is not just an option: as a member of the development team, you will design and write state-of-the-art software, follow best practices, measure everything and be responsible for deployment to production. You will also spend part of your time talking to our customers and helping them use Stream in their app.

The challenges

  • Distributed databases: we built our own data store for feeds and for chat
  • Real-time messaging
  • High performance: our API responses are in the 10ms range
  • High scalability: we use sharding, master-master, and master-slave to ensure scalability
  • High availability: our entire infrastructure is designed and operated to survive entire datacenter crashes
  • Multi-region: we deploy our service on 4 different continents

You have

  • 5+ years of backend development experience
  • Experience with at least one (preferably a few) of the following languages: Go, Rust, Java, C, C++, Erlang
  • Proficiency in Go is strongly preferred
  • Experience with high traffic and high performance applications
  • Solid knowledge of relational databases
  • Experience with building HTTP APIs
  • Experience managing your own projects and working in a team

Our tech stack

At Stream we use a wide collection of technologies to offer highly optimized and available features to our customers. Over the years we have experimented with different programming languages, frameworks, databases, and libraries. Here is a short list of the technology that we currently use. Do not worry if you do not master them all or if you do not see your favorite tool or language, you will have the chance to be exposed to most and to convince us to expand the list:

  • Go, gRPC, RocksDB, Python
  • Postgresql, RabbitMQ
  • AWS, Puppet, CloudFormation
  • Grafana, Graphite, ELK, Jaeger
  • Redis, Memcached

What we have to offer you

Stream employees enjoy some of the best benefits in the industry:

  • A team of exceptional engineers
  • The chance to work on OSS projects
  • A competitive salary
  • 28 days paid time off plus paid Dutch holidays 
  • Company equity
  • A pension scheme
  • A generous Learning and Development budget
  • Commute expenses to Amsterdam covered or option to use a company bike within the city
  • Gym membership of choice covered and weekly pilates sessions in the office
  • MacBook Pro or another development setup
  • Healthy team lunches and plenty of snacks
  • A generous relocation package
  • An office in the heart of Amsterdam
  • The opportunity to attend or present to global conferences and meetups
  • The possibility to visit our office in Boulder, CO

Our culture

Stream has a casual social culture, our team is diverse and we all have different backgrounds.

Our talented developers are highly technical and collaborative, which makes Stream a great place to learn and improve your skills. When it comes to software engineering our culture is oriented towards ownership and quality: our goal is to deliver stable software.

If you are interested in becoming a part of what we do, apply now!

No recruiters/agencies please

VP of Engineering
Amsterdam, or remote

We are looking for a highly skilled VP of Engineering to lead our technical execution and objectives. Between our Feed and Chat products, Stream provides services for more than a billion end users.

What you will be doing

You will work directly with the CEO and CTO and be responsible for the execution and delivery of our technical objectives. Our engineering team operates a large infrastructure deployed in 10+ regions; we build SDKs for more than 10 different platforms, and the API serves traffic for over a billion end users.

Why Join Stream as VPE

We want to become the AWS of application components; we currently power Chat and Feeds for over a billion end users. Our API is used by applications with hundreds of millions of users.

  • Large engineering team: ~50% of the entire company is part of the engineering department
  • We build and release SDK on all major platforms
  • Modern tech stack: we use Golang, RocksDB, Raft
  • The engineering team is very solid

Responsibilities

  • Own execution and delivery of all technical objectives
  • Hiring and growing the best talent
  • Lead planning, prioritization, and design of new features for API, SDK, and other tech teams
  • Report directly to CTO and work closely with CEO
  • Budget and organize all hiring for the engineering department

About You

You are a VP of Engineering, CTO, or technical Co-Founder with more than 10 years of experience in software development.

As a manager, you maintain a hands-on approach and stay involved with technical challenges, you are passionate about building software and technical teams.

You have

  • Prior experience as VP of engineering, CTO, or as a technical Co-Founder
  • More than 8 years of experience managing software engineering teams
  • Experience managing 30+ developers
  • Have hired top talent and grown them under your management
  • CS degree or similar/relevant experience
  • Worked for more than 5 years as a software engineer

Bonus points

  • Startup experience
  • Saas & API products
  • Mobile development background

Our tech stack (for dev positions / engineering dept. roles)

At Stream we use a wide collection of technologies to offer highly optimized and available features to our customers. Here is a shortlist of the technology that we currently use:

  • Go, gRPC, RocksDB, Python
  • Postgresql, RabbitMQ
  • AWS, Puppet, CloudFormation
  • Grafana, Graphite, ELK, Jaeger
  • Redis, Memcached

What we have to offer you

Stream employees enjoy some of the best benefits in the industry:

  • A team of exceptional engineers (developer positions only)
  • The chance to work on OSS projects (developer positions only)
  • A competitive salary
  • 28 days of paid time off plus paid Dutch holidays
  • Company equity
  • A pension scheme
  • A generous Learning and Development budget
  • Commute expenses to Amsterdam covered or the option to use a company bike within the city
  • Gym membership of choice covered and weekly pilates sessions in the office
  • A MacBook Pro and, if necessary, a home office setup package
  • Healthy team lunches and plenty of snacks
  • A generous relocation package
  • An office in the heart of Amsterdam or the possibility to work fully remote
  • The opportunity to attend or present to global conferences and meetups
  • The possibility to visit our office in Boulder, CO

Our culture

Stream has a casual social culture, our team is diverse and we all have different backgrounds. Our talented developers are highly technical and collaborative, which makes Stream a great place to learn and improve your skills. When it comes to software engineering our culture is oriented towards ownership and quality: our goal is to deliver stable software.

If you are interested in becoming a part of what we do, apply now!

No recruiters/agencies please
