Citus vs Memcached: What are the differences?
Developers describe Citus as "Worry-free Postgres for SaaS. Built to scale out". Citus is worry-free Postgres for SaaS. Made to scale out, Citus is an extension to Postgres that distributes queries across any number of servers. Citus is available as open source, as on-prem software, and as a fully-managed service. On the other hand, Memcached is detailed as "High-performance, distributed memory object caching system". Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering.
Citus and Memcached belong to "Databases" category of the tech stack.
"Multi-core Parallel Processing" is the primary reason why developers consider Citus over the competitors, whereas "Fast object cache" was stated as the key factor in picking Memcached.
Citus and Memcached are both open source tools. Memcached with 8.99K GitHub stars and 2.6K forks on GitHub appears to be more popular than Citus with 3.64K GitHub stars and 273 GitHub forks.
What is Citus?
What is Memcached?
Need advice about which tool to choose?Ask the StackShare community!
Sign up to add, upvote and see more prosMake informed product decisions
What are the cons of using Citus?
What are the cons of using Memcached?
Sign up to get full access to all the companiesMake informed product decisions
Sign up to get full access to all the tool integrationsMake informed product decisions
Around the time of their Series A, Pinterest’s stack included Python and Django, with Tornado and Node.js as web servers. Memcached / Membase and Redis handled caching, with RabbitMQ handling queueing. Nginx, HAproxy and Varnish managed static-delivery and load-balancing, with persistent data storage handled by MySQL.
As is common in the Rails stack, since the very beginning, we've stayed with MySQL as a relational database, Memcached for key/value storage and Redis for queues and background jobs.
In 2014, we could no longer store all our data in a single MySQL instance - even by buying better hardware. We decided to use sharding and split all of Shopify into dozens of database partitions.
Sharding played nicely for us because Shopify merchants are isolated from each other and we were able to put a subset of merchants on a single shard. It would have been harder if our business assumed shared data between customers.
The sharding project bought us some time regarding database capacity, but as we soon found out, there was a huge single point of failure in our infrastructure. All those shards were still using a single Redis. At one point, the outage of that Redis took down all of Shopify, causing a major disruption we later called “Redismageddon”. This taught us an important lesson to avoid any resources that are shared across all of Shopify.
Over the years, we moved from shards to the concept of "pods". A pod is a fully isolated instance of Shopify with its own datastores like MySQL, Redis, memcached. A pod can be spawned in any region. This approach has helped us eliminate global outages. As of today, we have more than a hundred pods, and since moving to this architecture we haven't had any major outages that affected all of Shopify. An outage today only affects a single pod or region.
At Shopify, over the years, we moved from shards to the concept of "pods". A pod is a fully isolated instance of Shopify with its own datastores like MySQL, Redis, Memcached. A pod can be spawned in any region. This approach has helped us eliminate global outages. As of today, we have more than a hundred pods, and since moving to this architecture we haven't had any major outages that affected all of Shopify. An outage today only affects a single pod or region.
As we grew into hundreds of shards and pods, it became clear that we needed a solution to orchestrate those deployments. Today, we use Docker, Kubernetes, and Google Kubernetes Engine to make it easy to bootstrap resources for new Shopify Pods.
PostgreSQL was an easy early decision for the founding team. The relational data model fit the types of analyses they would be doing: filtering, grouping, joining, etc., and it was the database they knew best.
Shortly after adopting PG, they discovered Citus, which is a tool that makes it easy to distribute queries. Although it was a young project and a fork of Postgres at that point, Dan says the team was very available, highly expert, and it wouldn’t be very difficult to move back to PG if they needed to.
The stuff they forked was in query execution. You could treat the worker nodes like regular PG instances. Citus also gave them a ton of flexibility to make queries fast, and again, they felt the data model was the best fit for their application.