
FLO HEALTH, INC.

flo.health/careers?utm_source=linkmainpage&utm_id=stackshare

20 Tools · 10 Decisions · 0 Followers

Tech Stack

Application & Data (10 tools): Flyway, Presto, Akka, Apache Spark, Scala, Trino, Kotlin, Swift, PostgreSQL, Python

Utilities (1 tool): Slack

DevOps (2 tools): Prometheus, Grafana

Team Members

  • Pavel Adamovich
  • Roman Bugaev
  • Siarhei Zuyeu
  • Vladimir Burylov, iOS Developer
  • Ivan Sharamet, Data Engineer
  • Vladislav Ermolin, Android Engineer
  • Marina Skuryat, Data Quality Engineer
  • Povilas Marazas

Stack Decisions

Ivan Klimuk

Mar 17, 2022

We train and deploy various ML algorithms to personalize the user experience in every part of the Flo app. While our first models were trained and served in a custom way, it quickly became hard to manage all the complex datasets and entities we deal with.

Therefore, we adopted the Tecton Feature Store. Tecton is one of the most advanced feature stores on the market. It allows us to quickly explore and experiment with new features and datasets offline, while making it easy to serve the exact same data in real time in a high-load environment.

The key benefits of using the Tecton Feature Store:

  • A clear and unified feature definition framework - less room for bugs and for inconsistency between training on historical data and real-time serving

  • The feature definition code is covered with tests, and feature availability and freshness are monitored

  • Online data serving is easy to run and manage - every feature service becomes an API endpoint that is launched with a single command

  • Time-travel - we’re able to obtain the exact historical values of every feature we store and serve - this makes it easy to debug models retrospectively

  • Features are reusable in multiple problems - we can collaborate across different domains, while keeping the same standards for ML engineering for the whole company

Tecton runs on our own AWS infrastructure, using Spark and DynamoDB as the offline and online stores, respectively. Tecton was the first step on our way to scalable, reliable, and efficient ML infrastructure at Flo.
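
The time-travel benefit is easiest to see with a toy example. The sketch below is not Tecton's SDK; it only illustrates the point-in-time-correct join a feature store performs when building a training set, and all table and column names are made up.

```python
import pandas as pd

# Hypothetical "spine": the entities and timestamps we want to train on,
# e.g. one row per (user_id, prediction_time, label).
spine = pd.DataFrame({
    "user_id": [1, 1, 2],
    "ts": pd.to_datetime(["2022-03-01", "2022-03-10", "2022-03-05"]),
    "label": [0, 1, 0],
})

# Hypothetical feature log: every value a feature ever took, with its timestamp.
features = pd.DataFrame({
    "user_id": [1, 1, 2],
    "ts": pd.to_datetime(["2022-02-20", "2022-03-05", "2022-03-01"]),
    "sessions_last_7d": [3, 8, 5],
})

# Point-in-time join: for each spine row, take the latest feature value known
# at (or before) that row's timestamp, never a value from the future.
training_set = pd.merge_asof(
    spine.sort_values("ts"),
    features.sort_values("ts"),
    on="ts",
    by="user_id",
    direction="backward",
)
print(training_set)
```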

39.1k views
Marina Skuryat

Mar 10, 2022

Fast data growth and the importance of data-driven decisions make the data catalog one of the core components of data management. Flo Health adheres to high standards in engineering, including data solutions. The company itself has grown rapidly in recent years, and the accompanying increase in data made it obvious that we needed a solution to issues of data ownership, quality, and discoverability, as well as data governance.

So how did we resolve these issues?

Background and data-related issues:

Data ownership and responsibilities

Most data pipelines and datasets were owned by product teams and data analysts, but some pipelines with complex calculation logic were owned by data engineers. There were also cases that had multiple owners, or none. On top of that, responsibilities for data owners were not clearly defined. This meant that some data pipelines were implemented using best data engineering standards, covered with high-quality tests, and routinely validated, but because other data owners did not consider those things to be their responsibility, that wasn’t happening across the board.

Data discoverability and observability

With growing volumes of data, complex data pipelines, changing data sources, and an increasing number of people dealing with data, data discovery and observability became a challenge. Cases began to appear where it was difficult to determine the business context of data, to find the right data for analysis because similar data was stored in multiple tables, or to understand downstream processes.

Data trustability and governance

We didn’t have a simple entry point that would let data users both find proper data for analysis and know whether this data is trustable or not (i.e., if it had been tested, by what type of tests, and when it had last been successfully tested). There also wasn’t a centralized place to find all de-identified PII data in the storage or an automatic mechanism to identify potential data noncompliance in terms of privacy ahead of time.

What we needed

So, we needed a single entry point for working with the data that could resolve all our data issues. Of course, we also had high technical expectations:

  • Ability to integrate with various data sources: Glue, Databricks, Looker, etc.
  • Rich data lineage
  • REST API

In addition, we wanted the tool to have a clear and simple UI and UX so that it wouldn’t create constraints in data governance process adoption. Everyone in the company, regardless of technical skillset, needed to be able to easily gather insights from data.

There’s a multitude of solutions on the market:

  • Open-source solutions: Amundsen, DataHub, Magda, Atlas, etc.
  • Proprietary solutions: Alation, Atlan, Collibra, etc.
  • Mono-cloud solutions: Google Cloud Data Catalog, Azure Data Catalog
  • Data observability platforms: Datafold, Monte Carlo

However, not all of them could meet our needs and fit a reasonable budget. You can find a high-level overview of the available tools here: github.com/Alexkuva/awesome-data-catalogs. To make the decision, our data engineers performed a comprehensive analysis of the available open-source and proprietary solutions and agreed to go with Atlan.

What we can do with Atlan

  1. Collect and centralize all the company’s metadata in one place and add necessary technical and business information for each entity (e.g., table, column, dashboard).
  2. Get transparency in data pipelines with the help of automated data lineage across multiple sources.
  3. Achieve clarity on data ownership. All core tables contain the name of the responsible team owner in Atlan. People are also assigned as owners and experts.
  4. Develop a business glossary and connect data with appropriate glossary terms to help users understand the context of data assets.
  5. Integrate our data quality tests with Atlan metadata via REST API to automatically attach the status of test executions to a particular table (see the sketch after this list).
  6. Run data profiling functionality to collect base statistical information about the data.
  7. Query all tables directly from Atlan, save and share SQL queries, and collaborate on issues via integrated chat.
  8. Comply with security and confidentiality regulations for data user management via access policies, user groups, and roles.
  9. Auto-detect PII using provided column names and auto-glossary recommendation options.
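
As an illustration of point 5, here is a minimal sketch of how a test runner could push a quality-check result to a catalog entity over REST. The endpoint path, payload shape, and authentication details are hypothetical placeholders, not Atlan's actual API; they only show the shape of the integration.

```python
import os
import requests

# Hypothetical endpoint and payload; consult the catalog's API docs for the real contract.
ATLAN_BASE_URL = os.environ["ATLAN_BASE_URL"]    # e.g. https://<tenant>.atlan.com
ATLAN_API_TOKEN = os.environ["ATLAN_API_TOKEN"]

def report_test_status(table_qualified_name: str, test_name: str, passed: bool) -> None:
    """Attach the latest data-quality test result to a table's metadata."""
    response = requests.post(
        f"{ATLAN_BASE_URL}/api/metadata/test-status",  # hypothetical path
        headers={"Authorization": f"Bearer {ATLAN_API_TOKEN}"},
        json={
            "entity": table_qualified_name,
            "test": test_name,
            "status": "PASSED" if passed else "FAILED",
        },
        timeout=10,
    )
    response.raise_for_status()

# Example: called from the data-quality pipeline after a test run.
# report_test_status("analytics.core.users_daily", "row_count_not_zero", passed=True)
```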

This isn’t everything Atlan can do, just the functionality we use most at Flo right now. It should be mentioned that we’ve been using Atlan for a little less than a year, and we’re still in the process of driving data catalog adoption among users. So far, we haven’t hit any bottlenecks in the Atlan functionality related to our needs. I’m excited to see how it goes.

34.1k views
Vladislav Ermolin

Feb 23, 2022

As of 2024, we still believe that choosing the Android SDK as the basis for our Android application back in 2015 was the right call.

Nowadays, we could choose from plenty of cross-platform SDK options, which would’ve probably saved us resources at the beginning of the product’s development life cycle. However, engineering resource utilization isn’t the only consideration for making decisions. If you wanted to create the best women’s health solution on the market, you would need to care about performance and seamless integration with operating system features too. The modern cross-platform SDKs have just begun to get closer to the native development option in that regard. The Kotlin Multiplatform Project is a good example of such a framework. Unfortunately, because it hasn't been around for a long time, it still has plenty of issues, so it currently doesn't fit our needs. However, we might consider it in the future. All in all, I believe that we made the right choice.

Over time, Android engineering best practices, tools, and the operating system itself evolved, giving developers multiple ways to implement the same features more effectively, both in terms of engineering team performance and device resource utilization. Our team evolved as well: We’ve come a long way from a single Android developer to a dozen feature teams that need to work on the same codebase simultaneously without stepping on each other's toes. We began caring more about cycle time because one can’t successfully compete by delivering value slowly.

For our dev team, these changes prompted an effort to update the codebase in order to deliver value faster and adopt new Android features more quickly, while raising the overall level of quality at the same time.

We began with the modularization of our Android application. Using the power of the Gradle build system, we split our application into 70+ shared core modules and 30+ independent feature modules. Such a huge step required the revision of the application’s architecture. One could say that we moved to clean architecture; however, I would say that we use architecture driven by common software engineering principles like SOLID, DRY, KISS, etc. On the presentation layer, we switched from the MVP to the MVVM pattern. Implementation of this pattern, powered by the Jetpack Lifecycle components, simplifies Android component lifecycle management and increases the reusability of the code.

Supporting such a setup would be barely possible without a dependency injection (DI) implementation. We settled on Dagger 2. This DI framework offers compile-time graph validation, multibinding, and scoping. Apart from that, it offers two ways to wire up individual components into a single graph: subcomponents and component dependencies, each good for its purpose. At Flo, we prefer component dependencies, as they better isolate the features and positively impact the build speed, but we use subcomponents closer to the leaves of the dependency graph as well.

Though we still have Java code in the project, Kotlin has become our main programming language. Compared to Java, it has multiple advantages:

  • Improved type system, which, for example, makes it possible to avoid the “billion-dollar mistake” in the majority of cases
  • Rich and mature standard library, which provides solutions for many typical tasks out of the box and minimizes the need for extra utilities
  • Advanced features to better fit the open-closed principle (for example, extension functions and removal of checked exceptions let us improve the extendability of solutions)
  • The syntax sugar, which simply lets you write code faster (it’s hard to imagine modern Android development without data classes, sealed classes, delegates, etc.)

We attempt to use Kotlin wherever possible: our build scripts are written in it, and we are also migrating the good old bash scripts to KScript.

Another huge step in Kotlin adoption is the migration from RxJava to Kotlin coroutines. RxJava is a superb framework for event-based and asynchronous programming, but for purely asynchronous code it is not the best choice. In that regard, Kotlin coroutines seem like a much wiser option, offering more effective resource utilization, more readable error stack traces, finer control over the execution scope, and syntax that looks almost identical to synchronous code. In its main area of usage, event-based programming, RxJava has also begun to lose ground. Being written in Java, it does not support Kotlin’s type system well. Besides, many of its operators are synchronous by design, which can limit developers. Built on Kotlin coroutines, Flow addresses both of these drawbacks, and we found that it fits our needs perfectly.

Perhaps the most noticeable sign that the above changes were not taken in vain is that you can now use Flo on your smartwatch powered by Android Wear. This is the second Flo app for the Android platform, and it effectively reuses the codebase of the mobile app. One of the core advantages of the Flo Watch app lies in Wear Health Services. It allows us to effectively and securely collect health-related data from the user’s device, if a user so chooses, and utilize it to improve the precision of cycle estimation.

But we won't stop chasing innovation!

Even though we migrated to ViewBinding, enjoying the extra type safety and the reduced amount of boilerplate code, we couldn’t pass up the Jetpack Compose framework. It allows us to use the power of Kotlin to construct UI, reduces code duplication, increases the reusability of UI components, and unblocks building complex view hierarchies with a smaller performance penalty.

Finally, what about recent Android features support? Well, we continuously improve the app in that sense. Like most teams, we rely on different Jetpack, Firebase, and Play Services libraries to achieve that goal. We use them to move work to the background, implement push notifications, integrate billing, and many other little things, all of which improve the overall UX or let us reach out to users more effectively. However, we also invest in first-party tooling. In an effort to ensure secure and transparent management of user data, we implemented our own solutions for A/B testing, remote configuration management, logging, and analytics.

What about quality? Developers are responsible for the quality of the solutions they create. To ensure that, we use modern tools and approaches:

  • We chose Detekt and Android Lint for static code analysis. These frameworks prevent many issues from reaching production by analyzing the codebase at compile time. They are capable of finding the most common problems in Kotlin and Android-related code, ensuring the whole team follows the same code style. When those frameworks do not provide the necessary checks out of the box, we implement them ourselves.
  • The above two frameworks are used both locally and in the continuous integration pipelines. However, in the latter, we additionally use SonarCloud, which provides extended complexity, security, and potential-bug checks that run in the cloud.
  • To ensure that the code meets the requirements, we use multiple layers of automated testing. Our test pyramid includes unit tests, which use the JUnit 5 platform, and E2E tests powered by the Espresso framework. Together, these two approaches to testing allow developers to get feedback fast while at the same time ensuring that features work as expected end-to-end.

46.4k views
Ivan Sharamet

Feb 15, 2022

In mid-2019, we started searching for a better solution for all of our analytical needs. Our existing analytical infrastructure had started to fall apart; it was incapable of keeping up with the rapid growth of the data we were collecting and analyzing at the time. Trino (previously known as Presto SQL) became our prime candidate since it had the feature set we needed; was actively maintained by an active community (and also by commercial companies like Facebook and Starburst); had seamless integration with our BI tools; and, last but not least, was successfully used by the biggest companies in tech: Facebook, Uber, Netflix, Airbnb, and many others. It has remained at the heart of analytics at Flo ever since, helping us make truly data-driven decisions.

One particular aspect of Trino that distinguishes it from the competition is its ability to integrate a large variety of data sources thanks to its Connector API. With this feature, you can do things like expose data from Kafka topics as Trino tables or query your favorite SQL/NoSQL database. Trino's performance is also great: it lets us serve workloads that require relatively low latency and run long-running queries effectively.

A bit of trivia: Since all of our infrastructure is deployed in AWS, it was tempting for us at the time to deploy Presto as a part of Amazon EMR; however, we went with a different approach. There were two flavors of Presto at the time (and still are): the original Facebook version and a fork named PrestoSQL (later rebranded as Trino due to copyright claims) that was created by the original Presto authors after they departed from Facebook. After some extensive internal benchmarking, we chose the latter, since it had a better feature set and a much higher development pace. And since it wasn't available as a part of EMR, we ended up deploying PrestoSQL in a Kubernetes cluster, which allowed us to scale the cluster up and down better and faster depending on the current workload.

It's not a perfect tool, though: a fast development pace means that sometimes you find things broken, but that's OK.

Presto: SQL on Everything
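
To make the Connector API point above concrete, the sketch below uses the trino Python client to join data from two catalogs (a Kafka-backed one and a PostgreSQL one) in a single query. The host, catalog, schema, and table names are made-up placeholders, not Flo's actual setup.

```python
import trino  # pip install trino

# Placeholder coordinator host and catalog/schema names.
conn = trino.dbapi.connect(
    host="trino.internal.example.com",
    port=8080,
    user="analytics",
    catalog="hive",
    schema="default",
)

cur = conn.cursor()
# One query federates two connectors: a Kafka topic exposed as a table
# and a regular PostgreSQL table.
cur.execute("""
    SELECT p.country, count(*) AS events
    FROM kafka.events.app_events e
    JOIN postgresql.public.user_profiles p
      ON e.user_id = p.user_id
    GROUP BY p.country
    ORDER BY events DESC
    LIMIT 10
""")
for row in cur.fetchall():
    print(row)
```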

32k views
Dzmitry Aliashkevich

Feb 8, 2022

Business Context:

Flo Health has always been a data-driven company. All of our significant decisions are backed with some form of analytics: experiments, surveys, machine learning models, etc. For us to acquire and hold a leadership position in the women’s health market, this decision-making process has to be quick and precise.

A few years ago, Flo Health was a relatively small company with 200 employees, 3.5 million daily active users, and around 18 million daily analytical events. The data platform at that time had a simple architecture and was completely self-served: gathering insights took just a couple of SQL queries and some trivial visualizations, and maintenance efforts and costs were quite low.

Fortunately, our business ideas were successful, and we were able to become market leaders, which resulted in rapid growth. By the start of 2022, we had 400 employees and 250% more daily active users, and demand on the data platform had also increased significantly (1,700% more analytical events). Looking into the future, the growth trend looks exponential, so we have to ensure that we’ll be able to serve it without increasing the time-to-market for decisions.

Rapid growth always leads to data platform complications, and we found ourselves in a situation where employees were spending more and more time on data analysis.

In essence, the two most common ways to deal with this problem are either to invest in massive hiring of specialized engineers (which is quite hard due to labor market limitations) or to use an automated, self-served data platform infrastructure that lowers cognitive barriers and hides complexity inside it (preferable because it’s faster and more scalable in the future).

So our team’s primary goal for 2022 is to hide this complexity from people and serve them the simplest toolset possible.

Methodological solution:

We wanted to answer a couple of questions that we suspected might help us make the right decision about the central organizing idea for a new, improved data platform:

  • Should it be centralized or decentralized?
  • Who will be responsible for what?
  • What are the success criteria?

Because we had successfully resolved this type of problem before with application infrastructure by moving from monolithic architecture to microservices, and since organizational structure is already product-team-centric, we tried to figure out something similar for the data platform. And fortunately, this type of architecture already exists. It’s called data mesh — here’s a wonderful article describing it. So the next step for us was to choose a suitable technical solution to implement this architecture.

Technical solution:

Before considering solutions, one important question has to be answered: Should we look for a vendor or in-house solution? Each of these options has known pros and cons:

Vendor pros:

  • Typically significantly less time to adopt
  • Lower maintenance efforts
  • Often faster delivery of new features
  • No product work needed
  • Compliance with standards like ISO and HIPAA is supported by design, out of the box

Vendor cons:

  • Not really flexible in terms of fitting company-specific requirements
  • High annual costs
  • Sometimes hard to contribute to the solution from the company side

In-house pros are basically vendor cons, and vice versa.

Based on that knowledge, we decided to go with a vendor solution. To make our decision, we prepared some acceptance criteria, the most important of which were:

  • Migration period of less than 1 year
  • Unified platform for all data-related operations, from data ingestion all the way to BI and ML
  • ISO and HIPAA compliance
  • Good vendor reputation and solution improvement pace
  • Dedicated on-demand infrastructure for teams and simplified approach for working with data
  • Maintenance effort reduced as much as possible
  • Cloud native

From the solutions we considered, Databricks was the only one that satisfied all of the above criteria and required moderate migration efforts and monetary investment. So we picked it and started adoption.

27.6k views
Vladimir Burylov

Feb 8, 2022

  • We have been developing all new features in Swift since 2018, a choice that doesn’t need much explaining in 2023. The Objective-C part of the code is well isolated from the rest of the app and slowly but steadily declines in size.
  • In 2022, we considered SwiftUI mature enough and started using it for all new UI code instead of the Texture framework we had used since 2018. The transition went smoothly since layout in SwiftUI is based on principles similar to Texture's: it's declarative and relies on a container-based layout system. The Texture framework was preferred over UIKit for the same reason before (and for its superior performance), but SwiftUI now has all the benefits and is also a first-party tool, actively supported and developed.
  • In 2023, we switched to The Composable Architecture (TCA) for all our business-logic-related code. We have been using Redux-like architecture since 2019, when we decided to pursue this direction instead of the popular MVVM or VIPER. It took us some time to adjust to it initially. Still, we enjoyed its benefits immensely: full transparency and control over state mutation, convenient testability, and composability that allows for immense scalability with minimal overhead. Initially, we used our in-house solution inspired by TCA, based on RxSwift. But since then, TCA has evolved considerably and added many features that our solution was lacking, including support for modern first-party tools like async/await and Combine, so with the release of 1.0 we finally decided to make the switch. We keep it up to date, and as of Nov 2024 we use version 1.5 of TCA.
  • With the switch to SwiftUI and TCA in 2023, we adopted async/await and Combine instead of the previously used RxSwift. Naturally, it was replaced with first-party tools that provide the same or even superior functionality and integrate better with the rest of our stack.
  • We chose Swift Package Manager (SPM) over CocoaPods or Carthage back in 2020. Like the Swift language itself, it emerged somewhat limited. We watched patiently as it evolved until it received all the features that we needed, like the ability to have binary packages. With SPM, we could fully embrace modularization in our app, allowing us to easily create new internal modules and support any number of them.

Updated: Nov 2024

10.5k views
Siarhei Zuyeu

Dec 24, 2021

Scala was chosen for implementing mission-critical services and applications: Some were built from scratch, some during system evolution. As a fully object-oriented, multi-paradigm language with powerful extensions, Scala is now being used for building web services or web applications, streaming services, constructing data pipelines with ETL jobs, and a variety of utilities.

Scala has helped us to reinforce the backend of the Flo application in order to handle a rapidly increasing volume of users and process a massive amount of data every day. Depending on domain and application features, we organize and develop our backend using both Scala and Python. The current architecture consists of microservices built within health, community, commercial, communication, and other domains. Most of them have a reactive nature, with event-driven design. Each month, more than 75 million users access the app, and the core services manage more than 1,200 queries per second per single service instance — and here’s where the power of Scala helps us make it real and serve high-load — and sometimes high-pressure — API requests very quickly and without errors.

With the help of Scala’s strict type system, seamless interoperability with Java and the JVM runtime, and a huge world of libraries and tools, we’re ready to quickly start development of new services or update existing ones with minimal errors and high-quality results. Scala has strong debugging and monitoring tools that help mitigate code issues during development. The JVM platform offers a vast number of the libraries we need, while the Scala community often provides more efficient, native implementations, so we benefit from both worlds.

At the same time, JVM languages have evolved in different directions. If you want to more easily and reliably perform routine tasks, Scala may open up previously unknown horizons for you to conquer and explore. It’s a really scalable language. In turn, it helps develop your engineering skills, and this in itself makes it worthwhile to learn.

14.1k views
Siarhei Zuyeu

Dec 24, 2021

Python was first used at Flo because we needed quick prototyping and product idea validation. So, the very first backend architecture was built as a monolithic Python web service. Now we use it for ML-related projects, building web services for our core product, and a huge variety of utilities that help to support platforms or integrations with external tools.

We focus on getting the most beneficial aspects of the language while balancing development with Scala. Core application services include health-domain cycle predictions, user data management, chatbots, etc. Python helped us start development quickly and is still capable of managing high load.

29.4k views
Vladimir Kurlenya

Dec 22, 2021

It’s pretty common when you read a success story about migrating from a monolith to microservices to see that people have a clear idea of what they already have; what they want to attain overall; that they have looked at all the pros and cons; and out of the plethora of available candidates, they chose Kubernetes. They have been faced with insurmountable problems, and with an unbelievable superhuman effort they resolved these issues and finally found the kind of happy resolution that happens “a year and a half into production.”

Was that how it was for us? Definitely not.

We didn’t spend a lot of time considering the idea of migrating to microservices like that. One day we just decided “why not give it a try?” There was no need to choose from the orchestrators at that point: Most of the dinosaurs were already on their last legs, except for Nomad, which was still showing some signs of life. Kubernetes became the de facto standard, and I had experience working with it. So we decided to pluck up our courage and try to run something non-critical in Kubernetes.

Considering that at that time all our infrastructure was in AWS, we also didn’t spend much time deciding to use EKS.

I’m struggling to remember who we chose as the guinea pig for the run in EKS — it might have been Jenkins. Or Prometheus. It’s difficult to say, but gradually all the new services were launched in EKS, and on the whole, everyone liked the approach.

The only thing that we didn’t understand was how to organize CI/CD.

At that time, we had the heady mix of Ansible/Terraform/Bitbucket, and we were not entirely satisfied with the results. Besides, we tried to practice delivery engineering and didn’t have a dedicated DevOps team, and there were many other factors too.

What did we need?

  • Unification — despite the fact that we never needed our teams to use a strictly defined stack, in CI/CD, some certainty was desired.
  • Decentralization — as mentioned earlier, we did not have a dedicated DevOps team, nor the desire (or need) to start one.
  • Relevance — not bleeding edge, but we wanted a tech stack that was on trend.
  • We also wanted the obvious things like speed, convenience, flexibility, etc.

It was safe to say that Helm was the standard for installing and running applications in EKS, so we didn’t use Ansible or Terraform for the management and templating of Kubernetes objects, although this solution was offered. We only used Helm (although there were lots of questions and complaints about it).

We also didn’t use Ansible or Terraform to manage Helm charts. It didn’t fit with our desire for decentralization and wasn’t exactly convenient. Again, because we don’t have a DevOps team, a service can be deployed in EKS by any developer with the help of Helm, and we don’t need (or want) to be involved in this process. We therefore took the most controversial route: We made our own wrapper for Helm so that it would work like an automatic transmission; more specifically, it would reduce the user’s involvement in the go/no-go decision (in our case, to deploy or not to deploy). Later, we added a general Helm chart to this wrapper, so the developer needed only a few input values to deploy:

  • What to deploy (docker image)
  • Where to deploy (dev, stage, prod, etc.)
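
Conceptually, the wrapper boils down to something like the sketch below: a thin layer that turns those two inputs into a standardized Helm release. The chart reference, values-file layout, and flags are illustrative assumptions, not our actual tooling.

```python
import subprocess
import sys

# Hypothetical shared chart that every service is deployed from.
GENERAL_CHART = "oci://registry.example.com/charts/generic-service"

def deploy(service: str, image: str, env: str) -> None:
    """Install or upgrade a service release in the target environment (dev, stage, prod)."""
    subprocess.run(
        [
            "helm", "upgrade", "--install", service, GENERAL_CHART,
            "--namespace", env,
            "--create-namespace",
            "--values", f"deploy/values-{env}.yaml",  # where to deploy
            "--set", f"image={image}",                # what to deploy
            "--wait",                                 # fail the pipeline if the rollout fails
        ],
        check=True,
    )

if __name__ == "__main__":
    # e.g. python deploy.py my-service registry.example.com/my-service:1.2.3 prod
    deploy(*sys.argv[1:4])
```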

So in all, the service deployment process was run from the repository of the same service by the same developer, exactly when and how the developer needed it. Our participation in this process was reduced to minimal consultation on some borderline cases and occasionally eliminating errors (where would we be without them?) in the wrapper.

And then we lived happily ever after. But our story isn’t about that at all.

In fact, I was asked to talk about why we use Kubernetes, not how it went. If I am honest (and as you can surely tell), I don’t have a clear answer. Maybe it would be better if I told you why we are continuing to use Kubernetes.

With Kubernetes, we were able to:

  • Better utilize EC2 instances
  • Obtain a better mix of decentralization (all the services arrive in Kubernetes from authorized repositories, we are not involved in the process) and centralization (we always see when, how, and from where a service arrives to us, whether it is a log, audit or event)
  • Conveniently scale a cluster (we use the combination cluster autoscaler and horizontal pod autoscaler)
  • Get a convenient infrastructure debug (not forgetting that Kubernetes is only one level of abstraction over several others, and even in the worst case scenario it is under the hood of standard RHEL … well, at the very least we have it)
  • Get high levels of fault tolerance and self-healing for the infrastructure
  • Get a single (well, almost) and understandable CI/CD
  • Significantly shorten TTM
  • Have an excellent reason to write this post

And although we didn’t get anything new, we like what we got.

34.1k views
Igor Rybakov

Dec 21, 2021

#1 women's health app

Flo exists to empower women and everyone who gets a period by giving them a space where they can access the knowledge and support they need to prioritize their health and well-being. The app has more than 43M monthly active users and 200M downloads worldwide.

Flo started as a period tracking app, which is still a big part of the application. Eventually, Flo evolved into a holistic AI-Powered Super App that includes multiple domains: health, social media, content creation, marketing, etc.

Our Data Platform, built on top of the AWS Cloud, processes 500 TB daily, approximately 1.5B events.

Asynchronous communications and streaming

To be able to effectively integrate domain and process data at that scale, we frequently use asynchronous contracts and streaming at Flo engineering.

Since we need reliable and efficient communication, we have chosen a commit log approach as the main tool for the job and have used both AWS Kinesis and Kafka (AWS MSK).

Kafka (MSK) vs. Kinesis

We had been using AWS Kinesis for years to stream events, and it provided us with a high level of reliability and performance. Fortunately for Flo, the volume of events and data kept growing, but we started to notice unpleasant latency spikes of up to 500 ms.

For cases when we serve user requests, this is quite painful: According to research, our response time should be lower than 200 ms (p95) and lower than 500 ms (p99), so we made this our performance “North star.”

For edge, customer-facing services, 500 ms just for transport means we are out of SLO, even at p99. So we started to search for alternatives, and Kafka was the best option.

Since we prefer using managed services, we decided to test AWS MSK. We ran several load tests and got tremendous results: Kafka latency is ~70 ms (p95) and ~140 ms (p99).
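
For illustration, a stripped-down version of such a produce-latency check might look like the sketch below, using the kafka-python client. The broker address and topic are placeholders, and a real load test would also control payload size, concurrency, and partitioning.

```python
import time
import statistics
from kafka import KafkaProducer  # pip install kafka-python

# Placeholder bootstrap server; with MSK this comes from the cluster's bootstrap string.
producer = KafkaProducer(bootstrap_servers=["broker-1.example.com:9092"], acks="all")

latencies_ms = []
for i in range(1_000):
    start = time.perf_counter()
    # send() is asynchronous; .get() blocks until the broker acknowledges the record.
    producer.send("latency-test", value=f"event-{i}".encode()).get(timeout=10)
    latencies_ms.append((time.perf_counter() - start) * 1000)

producer.flush()
cuts = statistics.quantiles(latencies_ms, n=100)
print(f"p95={cuts[94]:.1f} ms  p99={cuts[98]:.1f} ms")
```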

We have since started migration to Kafka.

237 views