In the world of microservices hype, we've always strived to keep our application stack as simple as possible to reduce accidental operational complexity. We're strong believers in keeping bounded contexts aligned with large business units to prevent having to implement ROLLBACK over the network.
Following the KISS principle for Infrastructure is how we managed to scale to 4 million monthly visitors with just 2 engineers.
Durable storage layer
Different schemas assigned for separate DDD Bounded Contexts within our monolith
In memory caching layer for API requests
Message Queueing backend for Bull
Message Queue for domain event delivery across Bounded Contexts (Rock solid after 3 years! Literally never fails)
Used in place of a more complex Kafka setup
Object/functional composition has helped us keep a clean separation of units, which allows for mock-free unit tests. Side effects are composed near the composition root of each app, far away from business logic.
We started our Infrastructure on Heroku, utilizing their managed Postgres/Redis services to quickly get up and running. After costs became a problem, we cut over to a custom AWS setup, allowing for greater flexibility/cost optimization.
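As a rough illustration of that idea (not our actual code), here is a minimal sketch in TypeScript. Every name in it (`Player`, `decidePurchase`, `Deps`, `composeApp`) is hypothetical, and an in-memory `Map` stands in for a real database. The business rule is a pure function, and the side effects are only wired together in one place.

```typescript
// All names here are hypothetical and only illustrate the pattern.

type Player = { id: string; email: string; credits: number };

type Decision =
  | { kind: "purchased"; playerId: string; remaining: number }
  | { kind: "rejected"; playerId: string; reason: string };

// Pure business logic: data in, decision out. No I/O, so tests need no mocks.
function decidePurchase(player: Player, price: number): Decision {
  if (price <= 0) {
    return { kind: "rejected", playerId: player.id, reason: "invalid price" };
  }
  if (player.credits < price) {
    return { kind: "rejected", playerId: player.id, reason: "insufficient credits" };
  }
  return { kind: "purchased", playerId: player.id, remaining: player.credits - price };
}

// Side effects are expressed as plain function types.
type Deps = {
  loadPlayer: (id: string) => Promise<Player>;
  saveCredits: (id: string, credits: number) => Promise<void>;
};

// The handler combines the pure decision with whatever effects it is given.
function makePurchaseHandler(deps: Deps) {
  return async (playerId: string, price: number): Promise<Decision> => {
    const player = await deps.loadPlayer(playerId);
    const decision = decidePurchase(player, price);
    if (decision.kind === "purchased") {
      await deps.saveCredits(playerId, decision.remaining);
    }
    return decision;
  };
}

// Composition root: the only place that knows about concrete infrastructure.
// Assumption: an in-memory Map replaces the real storage layer.
function composeApp() {
  const store = new Map<string, Player>([
    ["p1", { id: "p1", email: "p1@example.com", credits: 10 }],
  ]);
  const deps: Deps = {
    loadPlayer: async (id) => {
      const player = store.get(id);
      if (!player) throw new Error(`unknown player ${id}`);
      return player;
    },
    saveCredits: async (id, credits) => {
      const player = store.get(id);
      if (player) store.set(id, { ...player, credits });
    },
  };
  return { purchase: makePurchaseHandler(deps), store };
}
```

A unit test for `decidePurchase` just passes in a `Player` object and inspects the returned `Decision`. Nothing needs to be stubbed because the function never touches infrastructure.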
Nowadays, we use Heroku for running load tests against our staging infrastructure.
The ability to quickly scale dynos allows us to utilize the Actor Model to emulate typical player behavior, finding crucial bottlenecks in our Infrastructure before they surface in production.
Applying DDD, CQRS & ES (Domain-driven Design, Command Query Responsibility Segregation & Event Sourcing) has by far been the most important decision we've made in terms of reducing complexity in our app.
We started simple, with an ORM, Active Record & Transaction Script. As time went on, we realized the core domain had heavy amounts of Reactive behavior and Invariants (when this happens, send an email; when that happens, do this to something else). The Anemic Domain Model (https://martinfowler.com/bliki/AnemicDomainModel.html) caused an explosion of cyclomatic complexity in our core domain command handlers, which slowed our turnaround time on business needs.
We use DDD to build a shared ubiquitous language (https://martinfowler.com/bliki/UbiquitousLanguage.html) with domain experts on our Miro board. With this language in place, we apply Event Modeling (https://eventmodeling.org/) to design workflows, complete with wireframed UIs for our team to implement. This allows us to find our bounded context boundaries early, before writing a single line of code, since we can see where all the data is flowing to and why.
We use CQRS (https://martinfowler.com/bliki/CQRS.html) to separate our command processing aggregates from our queries, as well as build a dimensional model asynchronously for analytical reporting.
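To make the split concrete, here is a small sketch in TypeScript under stated assumptions. The order domain and every name in it (`handlePlaceOrder`, `project`, `ReadModel`) are hypothetical. The write side validates a command and returns events, while the read side folds those events into a reporting view that could be updated asynchronously.

```typescript
// Hypothetical domain; names are illustrative, not from our codebase.

type OrderEvent =
  | { type: "OrderPlaced"; orderId: string; customerId: string; totalCents: number }
  | { type: "OrderCancelled"; orderId: string };

// ----- Write side: enforce invariants, emit events -----

type OrderState = { status: "none" | "placed" | "cancelled" };

type PlaceOrder = { orderId: string; customerId: string; totalCents: number };

function handlePlaceOrder(state: OrderState, cmd: PlaceOrder): OrderEvent[] {
  if (state.status !== "none") throw new Error("order already exists");
  if (cmd.totalCents <= 0) throw new Error("total must be positive");
  return [{ type: "OrderPlaced", ...cmd }];
}

function handleCancelOrder(state: OrderState, orderId: string): OrderEvent[] {
  if (state.status !== "placed") throw new Error("only placed orders can be cancelled");
  return [{ type: "OrderCancelled", orderId }];
}

// ----- Read side: a projection that builds a reporting view -----

type ReadModel = {
  orders: Map<string, { customerId: string; totalCents: number }>;
  revenueByCustomer: Map<string, number>;
};

function emptyReadModel(): ReadModel {
  return { orders: new Map(), revenueByCustomer: new Map() };
}

// Applies one event to the view. This can run later than the write,
// for example from a queue consumer, so the view is eventually consistent.
function project(model: ReadModel, event: OrderEvent): ReadModel {
  switch (event.type) {
    case "OrderPlaced": {
      model.orders.set(event.orderId, {
        customerId: event.customerId,
        totalCents: event.totalCents,
      });
      const current = model.revenueByCustomer.get(event.customerId) ?? 0;
      model.revenueByCustomer.set(event.customerId, current + event.totalCents);
      return model;
    }
    case "OrderCancelled": {
      const order = model.orders.get(event.orderId);
      if (!order) return model; // unknown order: nothing to undo
      model.orders.delete(event.orderId);
      const current = model.revenueByCustomer.get(order.customerId) ?? 0;
      model.revenueByCustomer.set(order.customerId, current - order.totalCents);
      return model;
    }
  }
}
```

The command handlers never read from `ReadModel`, and the projection never enforces business rules, which is the separation CQRS is after.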
We use Event Sourcing (https://martinfowler.com/eaaDev/EventSourcing.html), in the style of the Redux state container pattern, sparingly in our core aggregates with the most complex behavior, but do not persist JSON event streams. Instead, we map from a relational structure to events, fold the events into current state to make new decisions, then persist the new events + current state back as relational writes. We ensure that the columns we map from are always immutable.
This removes the common versioning problems that come with having to choose stream boundaries up-front, and gives us immediately consistent, queryable SQL tables as our source of truth.
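The flow described above (rows to events, fold, decide, write back) could look roughly like the following TypeScript sketch. The wallet domain, the row shapes, and all function names are assumptions made for illustration, and the database itself is left out: the function only returns a description of the writes to perform.

```typescript
// Hypothetical schema and names; this is a sketch, not our production code.

// Immutable rows: one row per thing that happened, never updated in place.
type LedgerRow = { walletId: string; amountCents: number; createdAt: string };

type WalletEvent =
  | { type: "Deposited"; amountCents: number; at: string }
  | { type: "Withdrawn"; amountCents: number; at: string };

// 1. Map relational rows to events, ordered by their ISO-8601 timestamp.
function rowsToEvents(deposits: LedgerRow[], withdrawals: LedgerRow[]): WalletEvent[] {
  const events: WalletEvent[] = [
    ...deposits.map((r): WalletEvent => ({ type: "Deposited", amountCents: r.amountCents, at: r.createdAt })),
    ...withdrawals.map((r): WalletEvent => ({ type: "Withdrawn", amountCents: r.amountCents, at: r.createdAt })),
  ];
  return events.sort((a, b) => (a.at < b.at ? -1 : a.at > b.at ? 1 : 0));
}

// 2. Fold events into current state with a reducer, as in Redux.
type WalletState = { balanceCents: number };
const initial: WalletState = { balanceCents: 0 };

function evolve(state: WalletState, event: WalletEvent): WalletState {
  switch (event.type) {
    case "Deposited":
      return { balanceCents: state.balanceCents + event.amountCents };
    case "Withdrawn":
      return { balanceCents: state.balanceCents - event.amountCents };
  }
}

// 3. Make a new decision against the folded state.
function decideWithdraw(state: WalletState, amountCents: number, at: string): WalletEvent[] {
  if (amountCents <= 0) throw new Error("amount must be positive");
  if (state.balanceCents < amountCents) throw new Error("insufficient funds");
  return [{ type: "Withdrawn", amountCents, at }];
}

// 4. Describe the relational writes: new immutable rows plus the current state.
type Writes = { newWithdrawals: LedgerRow[]; walletBalanceCents: number };

function withdraw(
  walletId: string,
  deposits: LedgerRow[],
  withdrawals: LedgerRow[],
  amountCents: number,
  at: string
): Writes {
  const current = rowsToEvents(deposits, withdrawals).reduce(evolve, initial);
  const newEvents = decideWithdraw(current, amountCents, at);
  const next = newEvents.reduce(evolve, current);
  return {
    newWithdrawals: newEvents.map((e) => ({ walletId, amountCents: e.amountCents, createdAt: e.at })),
    walletBalanceCents: next.balanceCents,
  };
}
```

In a real application both parts of `Writes` would presumably be persisted in a single SQL transaction, so the event rows and the current-state row never disagree.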