Distributing storage and improving search with Nebula

By 2015, Airbnb had grown to the point where it needed a scalable, distributed storage system for some of its applications, especially search. The main drivers of the new architecture were support for low-latency personalized search and no longer having to index directly against the main Rails and MySQL application.

For this purpose, Airbnb created Nebula, which supports both real-time and batch access. The real-time part is powered by DynamoDB, and the batch part by HFileService, which was developed in-house at Airbnb.

Spark is used to merge all historical data with the batch updates, and the resulting snapshots are stored on S3. Nebula can also stream updates through Kinesis and Kafka, keeping other applications aware of the latest changes.
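
The merge step described above can be illustrated with a small sketch. This is not Airbnb's code: it uses plain Python instead of Spark, and the `Record` type, its `timestamp` field, the `merge_snapshot` function, and the "newest version wins" rule are all assumptions made for illustration only.

```python
from itertools import chain
from typing import Dict, Iterable, NamedTuple


class Record(NamedTuple):
    # Hypothetical record shape; the real Nebula schema is not described here.
    key: str
    value: str
    timestamp: int  # assumed version marker used to pick the newest write


def merge_snapshot(
    snapshot: Iterable[Record], updates: Iterable[Record]
) -> Dict[str, Record]:
    """Combine a historical snapshot with a batch of updates.

    Assumed rule: for each key, keep the record with the highest timestamp.
    On a tie, the record seen later (the update) wins.
    """
    merged: Dict[str, Record] = {}
    for record in chain(snapshot, updates):
        current = merged.get(record.key)
        if current is None or record.timestamp >= current.timestamp:
            merged[record.key] = record
    return merged


# Illustrative data only.
snapshot = [
    Record("listing:1", "old title", 100),
    Record("listing:2", "cabin", 100),
]
updates = [
    Record("listing:1", "new title", 200),
    Record("listing:3", "loft", 150),
]
new_snapshot = merge_snapshot(snapshot, updates)
```

In a Spark job the same idea would typically be expressed as a keyed reduction over the union of the two datasets, with the merged result written out as the next snapshot.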
