Stream & Go: News Feeds for Over 300 Million End Users

Stream
Build scalable feeds, activity streams & chat in a few hours instead of months.

By Thierry Schellenbach, CEO, Stream.


Stream + Go


Stream is an API that enables developers to build news feeds and activity streams (try the API). We are used by over 500 companies and power the feeds of more than 300 million end users. Companies such as Product Hunt, Under Armour, Powerschool, Bandsintown, Dubsmash, Compass and Fabric (Google) rely on Stream to power their news feeds. In addition to the API, the founders of Stream also wrote the most widely used open source solution for building scalable feeds.

Here’s what Stream looks like today:

  • Number of servers: 180
  • Feed updates per month: 34 billion
  • Average API response time: 12ms
  • Average real-time response time: 2ms
  • Regions: 4 (US-East, EU-West, Tokyo and Singapore)
  • 1B API requests per month (~20k/minute)

Given that most of our customers are engineers, we often talk about our stack. Here’s a high level overview:

  1. Go is our primary programming language.
  2. We use a custom solution built on top of RocksDB + Raft for our primary database (we started out on Cassandra but wanted more control over performance). PostgreSQL stores configs, API keys, etc.
  3. OpenTracing and Jaeger handle tracing, StatsD and Grafana provide detailed monitoring and we use the ELK stack for centralized logging.
  4. Python is our language of choice for machine learning, devops and our website https://getstream.io (Django, DRF & React).
  5. Stream uses a combination of fanout-on-write and fanout-on-read. This results in fast read performance when users open their feed, as well as fast propagation times when a famous user posts an update.

Tommaso Barbugli and I (@tschellenbach) are the developers who started Stream nearly 3 years ago. We founded the company in Amsterdam, participated in Techstars NYC 2015 and opened up our Boulder, Colorado office in 2016. It’s been quite a crazy ride in a fairly short amount of time! With over 15 developers and a host of supporting roles, including sales and marketing, the team feels enormous compared to the early days.

The Challenge: News Feeds & Activity Streams

The nature of follow relationships makes it hard to scale feeds. Most of you will remember Facebook’s long load times, Twitter’s fail whale or Tumblr’s year of technical debt. Feeds are hard to scale since there is no clear way to shard the data. Follow relationships connect everyone to everyone else. This makes it difficult to split data across multiple machines. If you want to learn more about this problem, check out these papers:

At a very high level there are 3 different ways to scale your feeds:

  1. Fanout-on-write: basically precompute everything. It’s expensive, but it’s easy to shard the data.
  2. Fanout-on-read: hard to scale, but more affordable and new activities show up faster.
  3. Combination of the above two approaches: better performance and reduced latency, but increased code complexity.
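To make the three options concrete, here is a minimal in-memory sketch of the combined approach. It is not Stream's implementation: the `Feeds` type, the follower `threshold` and all function names are hypothetical, and the sketch assumes follow relationships are set up before posting (it does not handle an actor crossing the threshold after earlier posts were already fanned out).

```go
package main

import (
	"fmt"
	"sort"
)

// Activity is one feed item. Time is a logical timestamp (larger is newer).
type Activity struct {
	Actor string
	Text  string
	Time  int
}

// Feeds is a hypothetical in-memory store combining both fanout strategies.
type Feeds struct {
	threshold int                   // actors with more followers than this use fanout-on-read
	followers map[string][]string   // actor -> users who follow the actor
	following map[string][]string   // user -> actors the user follows
	own       map[string][]Activity // actor -> activities posted by the actor
	timeline  map[string][]Activity // user -> precomputed (fanned-out) activities
}

func NewFeeds(threshold int) *Feeds {
	return &Feeds{
		threshold: threshold,
		followers: map[string][]string{},
		following: map[string][]string{},
		own:       map[string][]Activity{},
		timeline:  map[string][]Activity{},
	}
}

// Follow records that user follows actor.
func (f *Feeds) Follow(user, actor string) {
	f.followers[actor] = append(f.followers[actor], user)
	f.following[user] = append(f.following[user], actor)
}

// popular reports whether the actor has too many followers to fan out on write.
func (f *Feeds) popular(actor string) bool {
	return len(f.followers[actor]) > f.threshold
}

// Post stores the activity and, for actors at or below the threshold, copies
// it into every follower's timeline (fanout-on-write).
func (f *Feeds) Post(a Activity) {
	f.own[a.Actor] = append(f.own[a.Actor], a)
	if f.popular(a.Actor) {
		return // fanout-on-read: followers pull this activity when they read
	}
	for _, user := range f.followers[a.Actor] {
		f.timeline[user] = append(f.timeline[user], a)
	}
}

// Read returns the user's feed, newest first: the precomputed timeline plus
// activities pulled at read time from popular actors (fanout-on-read).
func (f *Feeds) Read(user string) []Activity {
	out := append([]Activity(nil), f.timeline[user]...)
	for _, actor := range f.following[user] {
		if f.popular(actor) {
			out = append(out, f.own[actor]...)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Time > out[j].Time })
	return out
}

func main() {
	f := NewFeeds(2)
	for _, u := range []string{"ann", "ben", "cat"} {
		f.Follow(u, "celebrity")
	}
	f.Follow("ann", "dave")
	f.Post(Activity{Actor: "dave", Text: "hello", Time: 1})
	f.Post(Activity{Actor: "celebrity", Text: "big news", Time: 2})
	for _, a := range f.Read("ann") {
		fmt.Println(a.Time, a.Actor, a.Text)
	}
}
```

In this sketch a post by a widely followed actor costs a single write, while ordinary posts are precomputed so that most reads only touch one timeline. The price is the extra branching in both `Post` and `Read`, which is the added code complexity mentioned in option 3.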

Stream uses a combination of fanout-on-write and fanout-on-read. This allows us to effectively support both customers with highly connected graphs, as well as customers with a more sparse dataset. This is important since the ways in which our customers use Stream are very different. Have a look at these screenshots from Bandsintown, Unsplash, and Product Hunt:

screenshot examples


Switching from Python to Go

After years of optimizing our existing feed technology, we decided to make a larger leap with Stream 2.0. While the first iteration of Stream was powered by Python and Cassandra, we switched to Go for version 2.0 of our infrastructure. The main reason we switched from Python to Go is performance. Certain features of Stream, such as aggregation, ranking and serialization, were very difficult to speed up using Python.

We’ve been using Go since March 2017 and it’s been a great experience so far. Go has greatly increased the productivity of our development team. Not only has it improved the speed at which we develop, it’s also 30x faster for many components of Stream.

The performance of Go greatly influenced our architecture in a positive way. With Python we often found ourselves delegating logic to the database layer purely for performance reasons. The high performance of Go gave us more flexibility in terms of architecture. This led to a huge simplification of our infrastructure and a dramatic improvement of latency. For instance, we saw a 10 to 1 reduction in web-server count thanks to the lower memory and CPU usage for the same number of requests.

Initially we struggled a bit with package management for Go. However, using Dep together with the VG package contributed to creating a great workflow.

Go Stream

If you’ve never tried Go, you’ll want to try this online tour: https://tour.golang.org/welcome/1

Go as a language is heavily focused on performance. The built-in pprof tool is amazing for finding performance issues. Uber’s go-torch library is great for visualizing data from pprof and will be bundled in pprof in Go 1.10.
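As a rough illustration of how little code a CPU profile needs, the sketch below wraps a function call with the standard library's `runtime/pprof` package. The `busyWork` and `profileCPU` helpers are hypothetical names made up for this example, not part of Stream's codebase.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// busyWork is a hypothetical stand-in for a hot code path worth profiling.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// profileCPU runs fn with CPU profiling enabled and returns the raw profile data.
func profileCPU(fn func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	fn()
	pprof.StopCPUProfile() // flushes the profile into buf
	return buf.Bytes(), nil
}

func main() {
	data, err := profileCPU(func() { busyWork(50000000) })
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("captured %d bytes of CPU profile data\n", len(data))
}
```

Writing those bytes to a file instead of a buffer gives you something you can open with `go tool pprof`.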

Flame Graph


Switching from Cassandra to RocksDB & Raft

1.0 of Stream leveraged Cassandra for storing the feed. Cassandra is a common choice for building feeds. Instagram, for instance, started out with Redis but eventually switched to Cassandra to handle their rapid usage growth. Cassandra can handle write-heavy workloads very efficiently.

Cassandra is a great tool that allows you to scale write capacity simply by adding more nodes, though it is also very complex. This complexity made it hard to diagnose performance fluctuations. Even though we had years of experience with running Cassandra, it still felt like a bit of a black box. When building Stream 2.0 we decided to go for a different approach and build Keevo. Keevo is our in-house key-value store built upon RocksDB, gRPC and Raft.

RocksDB is a highly performant embeddable database library developed and maintained by Facebook’s data engineering team. RocksDB started as a fork of Google’s LevelDB that introduced several performance improvements for SSD. Nowadays RocksDB is a project on its own and is under active development. It is written in C++ and it’s fast. Have a look at how this benchmark handles 7 million QPS. In terms of technology it’s much simpler than Cassandra. This translates into reduced maintenance overhead, improved performance and, most importantly, more consistent performance. It’s interesting to note that LinkedIn also uses RocksDB for their feed.

Our infrastructure is hosted on AWS and is designed to survive entire availability zone outages. Unlike Cassandra, Keevo clusters organize nodes into leaders and followers. When a leader (master) node becomes unavailable, the other nodes in the same deployment will start an election and pick a new leader. Electing a new leader is a fast operation and barely impacts live traffic.

To do this, Keevo implements the Raft consensus algorithm using Hashicorp’s Go implementation. This ensures that every bit stored in Keevo is stored on 3 different servers and operations are always consistent. This site does a great job of visualizing how Raft works: https://raft.github.io/

raft
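The majority rule at the heart of Raft is easy to express in a few lines. The sketch below is only the quorum arithmetic for a 3-node deployment like the one described above. It deliberately ignores terms, log matching and timeouts, and it does not use or represent the Hashicorp library's API; all function names are hypothetical.

```go
package main

import "fmt"

// quorum returns the smallest number of nodes that forms a strict majority.
func quorum(clusterSize int) int {
	return clusterSize/2 + 1
}

// tolerableFailures returns how many nodes can be down while a majority remains.
func tolerableFailures(clusterSize int) int {
	return clusterSize - quorum(clusterSize)
}

// committed reports whether a write acknowledged by acks nodes
// (the leader included) has reached a majority.
func committed(acks, clusterSize int) bool {
	return acks >= quorum(clusterSize)
}

// wonElection reports whether a candidate holding the given number of votes
// (its own included) has a majority and can become leader.
func wonElection(votes, clusterSize int) bool {
	return votes >= quorum(clusterSize)
}

func main() {
	const nodes = 3
	fmt.Println("quorum:", quorum(nodes))
	fmt.Println("tolerable failures:", tolerableFailures(nodes))
	fmt.Println("write with 2 acks committed:", committed(2, nodes))
	fmt.Println("candidate with 1 vote wins:", wonElection(1, nodes))
}
```

With 3 nodes the quorum is 2, so the cluster keeps accepting writes and can elect a new leader with one node down.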


Not Quite Microservices

By leveraging Go and RocksDB we’re able to achieve great feed performance. The average response time is around 12ms. The architecture lies somewhere between a monolith and a microservice. Stream runs on the following 7 services:

  1. Stream API
  2. Keevo
  3. Real time & Firehose
  4. Analytics
  5. Personalization & Machine learning
  6. Site & Dashboard
  7. Async Workers

To see all our services divided into stacks, head over here.

Personalization & Machine Learning

Almost all large apps with feeds use machine learning and personalization. For instance, LinkedIn prioritizes the items in your feed. Instagram’s explore feed displays pictures outside of the people you follow that you might be interested in. Etsy uses a similar approach to optimize ecommerce conversion. Stream supports the following 5 use cases for personalization:

Documentation for building personalized feeds.

All of these personalization use cases rely on combining feeds with analytics and machine learning. For the machine learning side we generate the models using Python. The models are different for each of our enterprise customers. Typically we’ll use one of these amazing libraries:

Analytics

Analytics data is collected using a tiny Go-based server. In the background, it will spawn goroutines to roll up the data as needed. The resulting metrics are stored in Elastic. In the past, we looked at Druid, which seems like a solid project. For now, we could get away with a simpler solution though.
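A background rollup of this kind might look roughly like the sketch below: a goroutine drains a channel of events and aggregates them into per-minute counts. The `Event` and `Rollup` types are assumptions made for illustration, and the sketch keeps the counts in memory rather than writing them to Elastic.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical analytics event; Unix is the event time in seconds.
type Event struct {
	Name string
	Unix int64
}

// bucket identifies one metric within one minute.
type bucket struct {
	Name   string
	Minute int64
}

// Rollup aggregates events into per-minute counts in a background goroutine.
type Rollup struct {
	events chan Event
	done   chan struct{}
	mu     sync.Mutex
	counts map[bucket]int
}

// NewRollup starts the background goroutine and returns the collector.
func NewRollup(buffer int) *Rollup {
	r := &Rollup{
		events: make(chan Event, buffer),
		done:   make(chan struct{}),
		counts: make(map[bucket]int),
	}
	go r.run()
	return r
}

// run drains the channel until it is closed, counting events per minute.
func (r *Rollup) run() {
	defer close(r.done)
	for e := range r.events {
		r.mu.Lock()
		r.counts[bucket{e.Name, e.Unix / 60}]++
		r.mu.Unlock()
	}
}

// Track queues an event for aggregation.
func (r *Rollup) Track(e Event) { r.events <- e }

// Close stops accepting events and waits until queued events are rolled up.
func (r *Rollup) Close() {
	close(r.events)
	<-r.done
}

// Count returns the number of events with this name in the minute containing unix.
func (r *Rollup) Count(name string, unix int64) int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.counts[bucket{name, unix / 60}]
}

func main() {
	r := NewRollup(16)
	base := int64(6000000) // arbitrary timestamp on a minute boundary
	r.Track(Event{"impression", base + 5})
	r.Track(Event{"impression", base + 30})
	r.Track(Event{"impression", base + 65})
	r.Track(Event{"click", base + 10})
	r.Close()
	fmt.Println("impressions, first minute:", r.Count("impression", base))
	fmt.Println("impressions, second minute:", r.Count("impression", base+60))
	fmt.Println("clicks, first minute:", r.Count("click", base))
}
```

The buffered channel keeps `Track` cheap for the request path, while the single consumer goroutine does the aggregation off to the side.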

Dashboard & Site

The dashboard is powered by React and Redux. We also use React and Redux for all of our example applications:

The site, as well as the API for the site, is powered by Python, Django and Django Rest Framework. Stream is sponsoring Django Rest Framework since it’s a pretty great open source project. If you need to build an API quickly there is no better tool than DRF and Python.

We use Imgix to resize the images on our site. For us, Imgix is cost efficient, fast and overall a great service. Thumbor is a good open source alternative.

Real time

Our real time infrastructure is based on Go, Redis and the excellent gorilla websocket library. It implements the Bayeux protocol. In terms of architecture it’s very similar to the Node-based Faye library.

It was interesting to read the “Ditching Go for Node.js” post on Hacker News. The author moves from Go to Node to improve performance. We actually did the exact opposite and moved from Node to Go for our real time system. The new Go-based infrastructure handles 8x the traffic per node.

Devops, Testing & Multiple Regions

In terms of devops the provisioning and configuration of instances is fully automated using a combination of:

Because our infrastructure is defined in code it has become trivial to launch new regions. We heavily use CloudFormation. Every single piece of our stack is defined in a CloudFormation template. If needed we are able to spawn a new dedicated shard in a few minutes. In addition, AWS Parameter Store is used to hold application settings. Our largest deployment is in US-East, but we also have regions in Tokyo, Singapore and Dublin.

A combination of Puppet and Cloud-init is used to configure our instances. We run our self-contained Go binaries directly on the EC2 instances without any additional containerization layer.

Releasing new versions of our services is done by Travis. Travis first runs our test suite. Once it passes, it publishes a new release binary to GitHub. Common tasks such as installing dependencies for the Go project, or building a binary are automated using plain old Makefiles. (We know, crazy old school, right?) Our binaries are compressed using UPX.

Tool highlight: Travis

Travis has come a long way over the past years. I used to prefer Jenkins in some cases since it was easier to debug broken builds. With the addition of the aptly named “debug build” button, Travis is now the clear winner. It’s easy to use and free for open source, with no need to maintain anything.


Next we use Fabric to do a rolling deploy to our AWS instances. If anything goes wrong during the deploy it will halt the deploy. We take stability very seriously:

  • A high level of test coverage is required.
  • Releases are created by Travis (making it hard to deploy without running tests).
  • Code is reviewed by at least 2 team members.
  • Our extensive QA integration test suite evaluates if all 7 components still work.

We’ve written about our experience with testing our Go codebase. When things do break we do our best to be transparent about the ongoing issue:

VictorOps is a recent addition to our support stack. It’s made it very easy to collaborate on ongoing issues.

Tool Highlight: VictorOps

The best part about VictorOps is how they use a timeline to collaborate amongst team members. VictorOps is an elegant way to keep our team in the loop about outages. It also integrates well with Slack. This setup enables us to quickly react to any problems that make it into production, work together and resolve them faster.


Victorops


The vast majority of our infrastructure runs on AWS:

The devops responsibilities are shared across our team. While we do have one dedicated devops engineer, all our developers have to understand and own the entire workflow.

Monitoring

Stream uses OpenTracing for tracing and Grafana for beautiful dashboards. The tracking for Grafana is done using StatsD. The end result is this beauty:


monitoring
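For readers unfamiliar with StatsD, the metrics behind such a dashboard are plain text lines sent over UDP. The sketch below formats a timing and a counter in the StatsD line format and sends one of them. The metric names and the `127.0.0.1:8125` address are assumptions for illustration; a production service would normally use an existing client library rather than this hand-rolled helper.

```go
package main

import (
	"fmt"
	"net"
)

// timingLine formats a StatsD timing metric: "<name>:<milliseconds>|ms".
func timingLine(name string, ms int64) string {
	return fmt.Sprintf("%s:%d|ms", name, ms)
}

// counterLine formats a StatsD counter metric: "<name>:<delta>|c".
func counterLine(name string, delta int64) string {
	return fmt.Sprintf("%s:%d|c", name, delta)
}

// send writes one metric line to a StatsD daemon over UDP.
func send(addr, line string) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	_, err = conn.Write([]byte(line))
	return err
}

func main() {
	// Hypothetical metric name; 12 matches the average API response time quoted earlier.
	line := timingLine("api.feed.read", 12)
	fmt.Println(line)
	fmt.Println(counterLine("api.requests", 1))

	// 8125 is the conventional StatsD port; the address is an assumption.
	if err := send("127.0.0.1:8125", line); err != nil {
		fmt.Println("could not send metric:", err)
	}
}
```

Because UDP is fire-and-forget, emitting a metric never blocks a request on the monitoring system being available.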


We track our errors in Sentry and use the ELK stack to centralize our logs.

Tool Highlight: OpenTracing + Jaeger

One new addition to the stack is OpenTracing. In the past we used New Relic, which works like a charm for Python, but isn’t able to automatically measure tracing information for Go. OpenTracing with Jaeger is a great solution that works very well for Stream. It also has, perhaps, the best logo for a tracing solution:

tool highlight opentracing & jaeger

Closing Thoughts

Go is an absolutely amazing language and has been a major win in terms of performance and developer productivity. For tracing we use OpenTracing and Jaeger. Our monitoring is running on StatsD and Graphite. Centralized logging is handled by the ELK stack.

Stream’s main database is a custom solution built on top of RocksDB and Raft. In the past we used Cassandra, which we found hard to maintain and which didn’t give us enough control over performance when compared to RocksDB.

We leverage external tools and solutions for everything that’s not a core competence. Redis hosting is handled by ElastiCache, Postgres by RDS, email by Mailgun, test builds by Travis and error reporting by Sentry.

Thank you for reading about our stack! If you’re a user of Stream, please be sure to add Stream to your stack here on StackShare. If you’re a talented individual, come work with us! And finally, if you haven’t tried out Stream yet, take a look at this quick tutorial for the API.
