What is Memcached?
Who uses Memcached?
Why developers like Memcached?
Here are some stack decisions, common use cases and reviews by companies and developers who chose Memcached in their tech stack.
Back in 2014, I was given an opportunity to re-architect SmartZip Analytics platform, and flagship product: SmartTargeting. This is a SaaS software helping real estate professionals keeping up with their prospects and leads in a given neighborhood/territory, finding out (thanks to predictive analytics) who's the most likely to list/sell their home, and running cross-channel marketing automation against them: direct mail, online ads, email... The company also does provide Data APIs to Enterprise customers.
I had inherited years and years of technical debt and I knew things had to change radically. The first enabler to this was to make use of the cloud and go with AWS, so we would stop re-inventing the wheel, and build around managed/scalable services.
For the SaaS product, we kept on working with Rails as this was what my team had the most knowledge in. We've however broken up the monolith and decoupled the front-end application from the backend thanks to the use of Rails API so we'd get independently scalable micro-services from now on.
Our various applications could now be deployed using AWS Elastic Beanstalk so we wouldn't waste any more efforts writing time-consuming Capistrano deployment scripts for instance. Combined with Docker so our application would run within its own container, independently from the underlying host configuration.
Storage-wise, we went with Amazon S3 and ditched any pre-existing local or network storage people used to deal with in our legacy systems. On the database side: Amazon RDS / MySQL initially. Ultimately migrated to Amazon RDS for Aurora / MySQL when it got released. Once again, here you need a managed service your cloud provider handles for you.
Future improvements / technology decisions included:
Caching: Amazon ElastiCache / Memcached CDN: Amazon CloudFront Systems Integration: Segment / Zapier Data-warehousing: Amazon Redshift BI: Amazon Quicksight / Superset Search: Elasticsearch / Amazon Elasticsearch Service / Algolia Monitoring: New Relic
As our usage grows, patterns changed, and/or our business needs evolved, my role as Engineering Manager then Director of Engineering was also to ensure my team kept on learning and innovating, while delivering on business value.
One of these innovations was to get ourselves into Serverless : Adopting AWS Lambda was a big step forward. At the time, only available for Node.js (Not Ruby ) but a great way to handle cost efficiency, unpredictable traffic, sudden bursts of traffic... Ultimately you want the whole chain of services involved in a call to be serverless, and that's when we've started leveraging Amazon DynamoDB on these projects so they'd be fully scalable.
Although we were using Elasticsearch in the beginning to power our in-app search, we moved this part of our processing over to Algolia a couple of months ago; this has proven to be a fantastic choice, letting us build search-related features with more confidence and speed.
Elasticsearch is only used for searching in internal tooling nowadays; hosting and running it reliably has been a task that took up too much time for us in the past and fine-tuning the results to reach a great user-experience was also never an easy task for us. With Algolia we can flexibly change ranking methods on the fly and can instead focus our time on fine-tuning the experience within our app.
Memcached is used in front of most of the API endpoints to cache responses in order to speed up response times and reduce server-costs on our side.
At Shopify, over the years, we moved from shards to the concept of "pods". A pod is a fully isolated instance of Shopify with its own datastores like MySQL, Redis, Memcached. A pod can be spawned in any region. This approach has helped us eliminate global outages. As of today, we have more than a hundred pods, and since moving to this architecture we haven't had any major outages that affected all of Shopify. An outage today only affects a single pod or region.
As we grew into hundreds of shards and pods, it became clear that we needed a solution to orchestrate those deployments. Today, we use Docker, Kubernetes, and Google Kubernetes Engine to make it easy to bootstrap resources for new Shopify Pods.
As is common in the Rails stack, since the very beginning, we've stayed with MySQL as a relational database, Memcached for key/value storage and Redis for queues and background jobs.
In 2014, we could no longer store all our data in a single MySQL instance - even by buying better hardware. We decided to use sharding and split all of Shopify into dozens of database partitions.
Sharding played nicely for us because Shopify merchants are isolated from each other and we were able to put a subset of merchants on a single shard. It would have been harder if our business assumed shared data between customers.
The sharding project bought us some time regarding database capacity, but as we soon found out, there was a huge single point of failure in our infrastructure. All those shards were still using a single Redis. At one point, the outage of that Redis took down all of Shopify, causing a major disruption we later called “Redismageddon”. This taught us an important lesson to avoid any resources that are shared across all of Shopify.
Over the years, we moved from shards to the concept of "pods". A pod is a fully isolated instance of Shopify with its own datastores like MySQL, Redis, memcached. A pod can be spawned in any region. This approach has helped us eliminate global outages. As of today, we have more than a hundred pods, and since moving to this architecture we haven't had any major outages that affected all of Shopify. An outage today only affects a single pod or region.
We decided to use MemCachier as our Memcached provider because we were seeing some serious PostgreSQL performance issues with query-heavy pages on the site. We use MemCachier for all Rails caching and pretty aggressively too for the logged out experience (fully cached pages for the most part). We really need to move to Amazon ElastiCache as soon as possible so we can stop paying so much. The only reason we're not moving is because there are some restrictions on the network side due to our main app being hosted on Heroku.
We initially started out with Heroku as our PaaS provider due to a desire to use it by our original developer for our Ruby on Rails application/website at the time. We were finding response times slow, it was painfully slow, sometimes taking 10 seconds to start loading the main page. Moving up to the next "compute" level was going to be very expensive.
We moved our site over to AWS Elastic Beanstalk , not only did response times on the site practically become instant, our cloud bill for the application was cut in half.
In database world we are currently using Amazon RDS for PostgreSQL also, we have both MariaDB and Microsoft SQL Server both hosted on Amazon RDS. The plan is to migrate to AWS Aurora Serverless for all 3 of those database systems.
Additional services we use for our public applications: AWS Lambda, Python, Redis, Memcached, AWS Elastic Load Balancing (ELB), Amazon Elasticsearch Service, Amazon ElastiCache