What is SQLAlchemy?
Who uses SQLAlchemy?
Here are some stack decisions, common use cases and reviews by companies and developers who chose SQLAlchemy in their tech stack.
Recently we were looking at a few robust and cost-effective ways of replicating the data that resides in our production MongoDB to a PostgreSQL database for data warehousing and business intelligence.
We set ourselves the following criteria for the optimal tool that would do this job: - The data replication must be near real-time, yet it should NOT impact the production database - The data replication must be horizontally scalable (based on the load), asynchronous & crash-resilient
Based on the above criteria, we selected the following tools to perform the end to end data replication:
We chose MongoDB Stitch for picking up the changes in the source database. It is the serverless platform from MongoDB. One of the services offered by MongoDB Stitch is Stitch Triggers. Using stitch triggers, you can execute a serverless function (in Node.js) in real time in response to changes in the database. When there are a lot of database changes, Stitch automatically "feeds forward" these changes through an asynchronous queue.
We chose Amazon SQS as the pipe / message backbone for communicating the changes from MongoDB to our own replication service. Interestingly enough, MongoDB stitch offers integration with AWS services.
In the Node.js function, we wrote minimal functionality to communicate the database changes (insert / update / delete / replace) to Amazon SQS.
Next we wrote a minimal micro-service in Python to listen to the message events on SQS, pickup the data payload & mirror the DB changes on to the target Data warehouse. We implemented source data to target data translation by modelling target table structures through SQLAlchemy . We deployed this micro-service as AWS Lambda with Zappa. With Zappa, deploying your services as event-driven & horizontally scalable Lambda service is dumb-easy.
In the end, we got to implement a highly scalable near realtime Change Data Replication service that "works" and deployed to production in a matter of few days!
In the early days, the Uber backend system was a monolithic software architecture with several app servers and a single database. The system was mainly written in Python and used SQLALchemy as the ORM-layer to the database.
In early 2014, Uber faced the fact that given their trip growth —roughly 20% per month—the solution for storing trips was going to run out of steam, both in terms of storage volume and IOPS (Input/Output Operations Per Second) by the end of the year.
To fix this problem, project Mezzanine was launched. The whole system, which Uber simply calls Schemaless in homage to its design, is written in Python. The initial version took about 5 months from idea to production deployment.
Unlike our frontend, we chose Flask, a microframework, for our backend. We use it with Python 3 and Gunicorn.
One of the reasons was that I have significant experience with this framework. However, it also was a rather straightforward choice given that our backend almost only serves REST APIs, and that most of the work is talking to the database with SQLAlchemy .
We could have gone with something like Hug but it is kind of early. We might revisit that decision for new services later on.