
Decision at Stream about Go, Faye, Redis, Languages, InMemoryDatabases, ApplicationHosting, DataStores, RealtimeBackendApi

Avatar of tschellenbach

Our real-time infrastructure is based on Go, Redis, and the excellent gorilla websocket library. It implements the Bayeux protocol.

In terms of architecture, it’s very similar to the Node-based Faye library. It was interesting to read the “Ditching Go for Node.js” post on Hacker News. The author moved from Go to Node to improve performance. We actually did the exact opposite and moved from Node to Go for our real-time system. The new Go-based infrastructure handles 8x the traffic per node. #InMemoryDatabases #RealtimeBackendApi #ApplicationHosting #Languages #DataStores

22 upvotes·366 views

Decision at Sentry about Redis, PostgreSQL, Celery, Django, InMemoryDatabases, MessageQueue

Avatar of jtcunning
Operations Engineer at Sentry ·

Sentry started as (and remains) an open-source project, growing out of an error logging tool built in 2008. That original build nine years ago was Django and Celery (Python’s asynchronous task codebase), with PostgreSQL as the database and Redis as the power behind Celery.

We displayed a truly shrewd notion of branding even then, giving the project a catchy name that companies the world over remain jealous of to this day: django-db-log. For the longest time, Sentry’s subtitle on GitHub was “A simple Django app, built with love.” A slightly more accurate description probably would have included Starcraft and Soylent alongside love; regardless, this captured what Sentry was all about.

#MessageQueue #InMemoryDatabases

19 upvotes·13.5K views

Decision at Stream about RocksDB, Cassandra, Redis, Databases, DataStores, InMemoryDatabases

Avatar of tschellenbach

Version 1.0 of Stream leveraged Cassandra for storing the feed. Cassandra is a common choice for building feeds. Instagram, for instance, started out with Redis but eventually switched to Cassandra to handle their rapid usage growth. Cassandra can handle write-heavy workloads very efficiently.

Cassandra is a great tool that allows you to scale write capacity simply by adding more nodes, though it is also very complex. This complexity made it hard to diagnose performance fluctuations. Even though we had years of experience with running Cassandra, it still felt like a bit of a black box. When building Stream 2.0 we decided to go for a different approach and build Keevo. Keevo is our in-house key-value store built upon RocksDB, gRPC and Raft.

RocksDB is a highly performant embeddable database library developed and maintained by Facebook’s data engineering team. RocksDB started as a fork of Google’s LevelDB that introduced several performance improvements for SSDs. Nowadays RocksDB is a project of its own and is under active development. It is written in C++ and it’s fast. Have a look at how this benchmark handles 7 million QPS. In terms of technology, it’s much simpler than Cassandra.

This translates into reduced maintenance overhead, improved performance and, most importantly, more consistent performance. It’s interesting to note that LinkedIn also uses RocksDB for their feed.

#InMemoryDatabases #DataStores #Databases

17 upvotes·218 views

Decision at Uploadcare about PostgreSQL, Amazon DynamoDB, Amazon S3, Redis, Python, Google App Engine

Avatar of dmitry-mukhin

Uploadcare has built an infinitely scalable infrastructure by leveraging AWS. Building on top of AWS allows us to process 350M daily requests for file uploads, manipulations, and deliveries. When we started in 2011 the only cloud alternative to AWS was Google App Engine which was a no-go for a rather complex solution we wanted to build. We also didn’t want to buy any hardware or use co-locations.

Our stack handles receiving files, communicating with external file sources, managing file storage, managing user and file data, processing files, file caching and delivery, and managing user interface dashboards.

At its core, Uploadcare runs on Python. The EuroPython 2011 conference in Florence really inspired us, and Python was general enough to solve all of our challenges; together, these informed the decision. Additionally, we had prior experience working in Python.

We chose to build the main application with Django because of its feature completeness and large footprint within the Python ecosystem.

All communication within our ecosystem occurs via several HTTP APIs, Redis, Amazon S3, and Amazon DynamoDB. We decided on this architecture so that our system could be scalable in terms of storage and database throughput. This way we only need Django running on top of our database cluster. We use PostgreSQL as our database because it is considered an industry standard when it comes to clustering and scaling.

15 upvotes·634 views

Decision at Dubsmash about Amazon RDS for Aurora, Redis, Amazon DynamoDB, Amazon RDS, Heroku, PostgreSQL, Databases, PlatformAsAService, NosqlDatabaseAsAService, SqlDatabaseAsAService

Avatar of tspecht
‎Co-Founder and CTO at Dubsmash ·

Over the years we have added a wide variety of data stores to our stack, including PostgreSQL (some hosted by Heroku, some by Amazon RDS) for storing relational data, Amazon DynamoDB to store non-relational data like recommendations & user connections, and Redis to hold pre-aggregated data to speed up API endpoints.
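The Redis use described above, holding pre-aggregated data so an API endpoint can skip the expensive query, might look roughly like the sketch below. It is only an illustration: a dict-backed class stands in for Redis, and the names `AggregateCache`, `count_followers`, and the key `followers:42` are hypothetical, not taken from Dubsmash's code.

```python
import time
from typing import Callable, Dict, Tuple


class AggregateCache:
    """In-memory stand-in for Redis: holds pre-aggregated values with a TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, cached value)
        self._store: Dict[str, Tuple[float, int]] = {}

    def get_or_compute(self, key: str, compute: Callable[[], int]) -> int:
        """Return the cached value, recomputing only when missing or expired."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = compute()
        self._store[key] = (now + self._ttl, value)
        return value


calls = {"n": 0}


def count_followers() -> int:
    # Hypothetical expensive aggregation against the primary database.
    calls["n"] += 1
    return 1234


cache = AggregateCache(ttl_seconds=60)
first = cache.get_or_compute("followers:42", count_followers)
second = cache.get_or_compute("followers:42", count_followers)  # served from cache
```

The second call never touches the slow aggregation, which is the speed-up the post refers to. The TTL is an assumption; a real system could equally refresh the value whenever the underlying data changes.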

Since we started running Postgres ourselves on RDS instead of only using the managed offerings of Heroku, we've gained additional flexibility in scaling our application while reducing costs at the same time.

We are also heavily testing Amazon RDS for Aurora in its Postgres-compatible version and will also give the new release of Aurora Serverless a try!

#SqlDatabaseAsAService #NosqlDatabaseAsAService #Databases #PlatformAsAService

13 upvotes·412 views

Decision at Shopify about Redis, Memcached, MySQL, Rails

Avatar of kirs
Production Engineer at Shopify ·

As is common in the Rails stack, since the very beginning, we've stayed with MySQL as a relational database, Memcached for key/value storage and Redis for queues and background jobs.

In 2014, we could no longer store all our data in a single MySQL instance - even by buying better hardware. We decided to use sharding and split all of Shopify into dozens of database partitions.

Sharding played nicely for us because Shopify merchants are isolated from each other and we were able to put a subset of merchants on a single shard. It would have been harder if our business assumed shared data between customers.
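Because each merchant's data is isolated, routing a request to a shard only needs the merchant's ID. The sketch below shows one possible way to do that with a stable hash. It is an assumption for illustration only: the post does not say how Shopify assigns merchants to shards, and `NUM_SHARDS` and `shard_for_shop` are hypothetical names.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical; the post only says "dozens" of partitions


def shard_for_shop(shop_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a shop ID to a shard index using a hash that is stable across runs."""
    digest = hashlib.sha256(str(shop_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Every query for a given shop is routed to the same shard.
shard_a = shard_for_shop(1001)
shard_b = shard_for_shop(1001)
```

A lookup table from shop ID to shard would be an equally valid design and makes it easier to move a single shop between shards; the hash is used here only to keep the example short.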

The sharding project bought us some time regarding database capacity, but as we soon found out, there was a huge single point of failure in our infrastructure. All those shards were still using a single Redis. At one point, the outage of that Redis took down all of Shopify, causing a major disruption we later called “Redismageddon”. This taught us an important lesson to avoid any resources that are shared across all of Shopify.

Over the years, we moved from shards to the concept of "pods". A pod is a fully isolated instance of Shopify with its own datastores like MySQL, Redis, memcached. A pod can be spawned in any region. This approach has helped us eliminate global outages. As of today, we have more than a hundred pods, and since moving to this architecture we haven't had any major outages that affected all of Shopify. An outage today only affects a single pod or region.

10 upvotes·400 views

Decision at StackShare about Redis, CircleCI, Webpack, Amazon CloudFront, Amazon S3, GitHub, Heroku, Rails, Node.js, Apollo, Glamorous, React, Microservices, StackDecisionsLaunch, SSR, FrontEndRepoSplit

Avatar of ruswerner
Lead Engineer at StackShare ·

StackShare Feed is built entirely with React, Glamorous, and Apollo. One of our objectives with the public launch of the Feed was to enable a server-side rendered (SSR) experience for our organic search traffic. When you visit the StackShare Feed and you aren't logged in, you are delivered the Trending feed experience. We use an in-house Node.js rendering microservice to generate this HTML. This microservice needs to run and serve requests independently of our Rails web app. Up until recently, we had a monorepo with our Rails and React code living happily together and all served from the same web process. In order to deploy our SSR app into a Heroku environment, we needed to split out our front-end application into a separate repo in GitHub. This decision was driven mostly by limitations imposed by Heroku, specifically that processes can't communicate with each other. A new SSR app was created in Heroku and linked directly to the frontend repo so it stays in sync with changes.

Related to this, we needed a way to "deploy" our frontend changes to various server environments without building & releasing the entire Ruby application. We built a hybrid Amazon S3 and Amazon CloudFront solution to host our Webpack bundles. A new CircleCI script builds the bundles and uploads them to S3. The final step in our rollout is to update some keys in Redis so our Rails app knows which bundles to serve. The results of these efforts were significant. Our frontend team now moves independently of our backend team, our build & release process takes only a few minutes, we are now using an edge CDN to serve JS assets, and we have pre-rendered React pages!
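The last rollout step, updating keys in Redis so the web app knows which bundles to serve, can be sketched as below. This is a guess at the shape of the mechanism, not StackShare's code: a dict-backed class stands in for Redis, it is written in Python rather than Ruby, and the key format `bundles:<name>`, the CDN host, and the function names are all hypothetical.

```python
from typing import Dict, Optional

CDN_BASE = "https://cdn.example.com"  # hypothetical CDN host


class FakeKeyValueStore:
    """Dict-backed stand-in for the Redis GET/SET commands."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def publish_bundle(store: FakeKeyValueStore, name: str, content_hash: str) -> None:
    """CI step: after the upload, record which build of a bundle is live."""
    store.set(f"bundles:{name}", f"{name}.{content_hash}.js")


def bundle_url(store: FakeKeyValueStore, name: str, default: str) -> str:
    """Web app step: resolve the script URL for the currently live bundle."""
    filename = store.get(f"bundles:{name}") or default
    return f"{CDN_BASE}/{filename}"


store = FakeKeyValueStore()
before = bundle_url(store, "feed", default="feed.js")
publish_bundle(store, "feed", "abc123")
after = bundle_url(store, "feed", default="feed.js")
```

Switching the key is what makes a frontend release independent of a backend deploy: the web app picks up the new file name on its next read, with no rebuild.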

#StackDecisionsLaunch #SSR #Microservices #FrontEndRepoSplit

8 upvotes·4.7K views

Decision about Yarn, Redux.js, React, jQuery, vuex, Vue.js, MongoDB, Redis, PostgreSQL, Sidekiq, Rails, Font-awesome, Bulma.io

Avatar of cyrusstoller

I'm building a new process management tool. I decided to build with Rails as my backend, using Sidekiq for background jobs. I chose to work with these tools because I've worked with them before and know that they're able to get the job done. They may not be the sexiest tools, but they work and are reliable, which is what I was optimizing for. For data stores, I opted for PostgreSQL and Redis. Because I'm planning on offering dashboards, I wanted a SQL database instead of something like MongoDB that might work early on, but be difficult to use as soon as I want to facilitate aggregate queries.
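To make the point about aggregate queries concrete, here is the kind of dashboard query that a SQL database answers in a single statement. It is a sketch only: SQLite from the Python standard library stands in for PostgreSQL, and the `tasks` table and its columns are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    "id INTEGER PRIMARY KEY, process TEXT, status TEXT, minutes INTEGER)"
)
conn.executemany(
    "INSERT INTO tasks (process, status, minutes) VALUES (?, ?, ?)",
    [
        ("onboarding", "done", 30),
        ("onboarding", "done", 50),
        ("onboarding", "open", 10),
        ("billing", "done", 20),
    ],
)

# Dashboard-style aggregate: completed tasks and average duration per process.
rows = conn.execute(
    "SELECT process, COUNT(*), AVG(minutes) "
    "FROM tasks WHERE status = 'done' "
    "GROUP BY process ORDER BY process"
).fetchall()
# rows == [("billing", 1, 20.0), ("onboarding", 2, 40.0)]
```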

On the front-end I'm using Vue.js and vuex in combination with #Turbolinks. In effect, I want to render most pages on the server side without key interactions being managed by Vue.js. This is the first project I'm working on where I've explicitly decided not to include jQuery. I have found React and Redux.js more confusing to set up. I appreciate the opinionated approach from the Vue.js community and that things just work together the way that I'd expect. To manage my JavaScript dependencies, I'm using Yarn.

For CSS frameworks, I'm using #Bulma.io. I really appreciate its minimal nature and that there are no hard JavaScript dependencies. And to add a little spice, I'm using #font-awesome.

5 upvotes·1.2K views

Decision at Zulip about Redis, Python, RabbitMQ

Avatar of tabbott
Founder at Zulip ·

We've been using RabbitMQ as Zulip's queuing system since we needed a queuing system. What I like about it is that it scales really well and has good libraries for a wide range of platforms, including our own Python. So aside from getting it running, we've had to put basically 0 effort into making it scale for our needs.

However, there are several things that could be better about it:
* Its error messages are absolutely terrible; if one of our users ever gets an error with RabbitMQ (even for simple things like a misconfigured hostname), they always end up needing help from the Zulip team, because the error logs are just inscrutable. As an open source project, we've handled this issue by really carefully scripting the installation to be a failure-proof configuration (in this case, setting the RabbitMQ hostname to, so that no user-controlled configuration can break it). But it was a real pain to get there, and the process of determining we needed to do that caused a significant amount of pain to folks installing Zulip.
* The pika library for Python takes a lot of time to start up a RabbitMQ connection; this means that Zulip server restarts are more disruptive than would be ideal.
* It's annoying that you need to run the rabbitmqctl management commands as root.

But overall, I like that it has clean, clear semantics and high scalability, and I haven't been tempted to do the work to migrate to something like Redis (which has its own downsides).

4 upvotes·1.2K views

Decision at Codecov about Redis, Celery, Python

Avatar of hootener
CTO at Codecov ·

A major aspect of Codecov is the use of long-running asynchronous tasks to process large amounts of test coverage data uploaded by our users. Being a Python stack, Celery felt like a natural fit to manage Codecov's long-running tasks. We rely on Celery to manage all our background queues and asynchronous scheduling. Celery enables us to set timeouts for different tasks, which has been instrumental in maintaining our queue in production. Celery also interfaces easily with Redis as a backend store, which allowed it to slot neatly into our existing infrastructure.
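The idea of giving each task its own time budget can be illustrated without Celery. The sketch below uses only the Python standard library and is not Codecov's code or Celery's API; `run_with_timeout`, `quick_task`, and `slow_task` are hypothetical names. It also differs from Celery in one important way, noted in the comments: the caller stops waiting on timeout, but the overrunning work is not killed.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_with_timeout(task, timeout_seconds, *args):
    """Run task(*args) in a worker thread and wait at most timeout_seconds.

    Returns ("ok", result) or ("timeout", None). On timeout the caller stops
    waiting; the thread itself is not killed (unlike a Celery hard time limit,
    which terminates the worker process).
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(task, *args)
        return ("ok", future.result(timeout=timeout_seconds))
    except FutureTimeout:
        return ("timeout", None)
    finally:
        # Do not block on a task that overran its budget.
        pool.shutdown(wait=False)


def quick_task(x):
    return x * 2


def slow_task(x):
    time.sleep(0.5)  # simulates a coverage report that takes too long
    return x


# Different tasks get different budgets.
fast_result = run_with_timeout(quick_task, 1.0, 21)
slow_result = run_with_timeout(slow_task, 0.05, 1)
```

Here `fast_result` is `("ok", 42)` and `slow_result` is `("timeout", None)`. Bounding each task this way keeps one stuck job from holding up everything queued behind it.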

4 upvotes·895 views