AWS Lambda vs Kudu

Overview

AWS Lambda

Stacks: 26.0K
Followers: 18.8K
Votes: 432

Apache Kudu

Stacks: 71
Followers: 259
Votes: 10
GitHub Stars: 828
Forks: 282

AWS Lambda vs Kudu: What are the differences?

What is AWS Lambda? Automatically run code in response to modifications to objects in Amazon S3 buckets, messages in Kinesis streams, or updates in DynamoDB. AWS Lambda is a compute service that runs your code in response to events and automatically manages the underlying compute resources for you. You can use AWS Lambda to extend other AWS services with custom logic, or create your own back-end services that operate at AWS scale, performance, and security.
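
As a rough illustration of the event-driven model described above, here is a minimal Python sketch of a function that reacts to S3 object notifications. The names `extract_s3_objects` and `handler` are hypothetical choices for this example. The `Records` / `s3` / `bucket` / `object` layout follows the S3 event notification format, and connecting the function to a bucket is assumed to be configured separately.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event):
    """Return (bucket, key) pairs found in an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record.get("s3")
        if s3 is None:
            continue  # skip records that did not come from S3
        bucket = s3["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(s3["object"]["key"])
        pairs.append((bucket, key))
    return pairs


def handler(event, context):
    """Entry point in the (event, context) shape Lambda uses for Python."""
    objects = extract_s3_objects(event)
    for bucket, key in objects:
        # Custom logic would go here; this sketch only logs the object.
        print(f"object changed: s3://{bucket}/{key}")
    return {"processed": len(objects)}
```

The parsing is kept in its own function so it can be exercised locally with a hand-built event dictionary, without any AWS resources.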

What is Kudu? Fast analytics on fast data. Kudu is a columnar storage manager developed for the Hadoop platform. A new addition to the open source Apache Hadoop ecosystem, it completes Hadoop's storage layer to enable fast analytics on fast data.

AWS Lambda and Kudu are primarily classified as "Serverless / Task Processing" and "Big Data" tools respectively.

"No infrastructure" is the primary reason why developers consider AWS Lambda over the competitors, whereas "Realtime Analytics" was stated as the key factor in picking Kudu.

Kudu is an open source tool with 789 GitHub stars and 263 GitHub forks.

Advice on AWS Lambda, Apache Kudu

Tim

CTO at Checkly Inc.

Sep 18, 2019

Needs advice on Heroku, AWS Lambda

When adding a new feature to Checkly or rearchitecting some older piece, I tend to pick Heroku for rolling it out. But not always, because sometimes I pick AWS Lambda. The short story:

Developer Experience trumps everything.
AWS Lambda is cheap. Up to a limit though. This impacts not only your wallet.
If you need geographic spread, AWS is lonely at the top.
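
To make the "cheap, up to a limit" point concrete, here is a minimal sketch of a break-even calculation between pay-per-use and a flat monthly plan. It assumes the usual Lambda billing shape (a per-request charge plus a charge per GB-second of compute). All function names are hypothetical, and the rates in the usage lines are placeholders, not actual AWS or Heroku prices.

```python
def cost_per_invocation(avg_seconds, memory_gb, price_per_gb_second, price_per_request):
    """Cost of one invocation: compute time (GB-seconds) plus the request fee."""
    return avg_seconds * memory_gb * price_per_gb_second + price_per_request


def break_even_invocations(flat_monthly_cost, avg_seconds, memory_gb,
                           price_per_gb_second, price_per_request):
    """Monthly invocation count at which pay-per-use equals the flat plan."""
    per_call = cost_per_invocation(
        avg_seconds, memory_gb, price_per_gb_second, price_per_request
    )
    return flat_monthly_cost / per_call


# Placeholder numbers for illustration only (not real pricing).
threshold = break_even_invocations(
    flat_monthly_cost=51.0,
    avg_seconds=0.5,
    memory_gb=1.0,
    price_per_gb_second=0.00002,
    price_per_request=0.0000002,
)
print(f"pay-per-use stays cheaper below about {threshold:,.0f} invocations per month")
```

Below the threshold the pay-per-use model wins; above it, the flat plan does. Real workloads would also need to account for free tiers and data transfer, which this sketch ignores.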

The setup

Recently, I was doing a brainstorm at a startup here in Berlin on the future of their infrastructure. They were ready to move on from their initial, almost 100% EC2 + Chef-based setup. Everything was on the table. But we crossed out a lot quite quickly:

Pure, uncut, self hosted Kubernetes — way too much complexity
Managed Kubernetes in various flavors — still too much complexity
Zeit — Maybe, but no Docker support
Elastic Beanstalk — Maybe, bit old but does the job
Heroku
Lambda

It became clear a mix of PaaS and FaaS was the way to go. What a surprise! That is exactly what I use for Checkly! But when do you pick which model?

I chopped that question up into the following categories:

Developer Experience / DX 🤓
Ops Experience / OX 🐂 (?)
Cost 💵
Lock in 🔐

Read the full post linked below for all details

357k views

Detailed Comparison

Description
AWS Lambda: AWS Lambda is a compute service that runs your code in response to events and automatically manages the underlying compute resources for you. You can use AWS Lambda to extend other AWS services with custom logic, or create your own back-end services that operate at AWS scale, performance, and security.
Apache Kudu: A new addition to the open source Apache Hadoop ecosystem, Kudu completes Hadoop's storage layer to enable fast analytics on fast data.

Features
AWS Lambda: Extend other AWS services with custom logic; Build custom back-end services; Completely Automated Administration; Built-in Fault Tolerance; Automatic Scaling; Integrated Security Model; Bring Your Own Code; Pay Per Use; Flexible Resource Model
Apache Kudu: -

Statistics
	AWS Lambda	Apache Kudu
GitHub Stars	-	828
GitHub Forks	-	282
Stacks	26.0K	71
Followers	18.8K	259
Votes	432	10

Pros & Cons
AWS Lambda pros: 129 No infrastructure; 83 Cheap; 70 Quick; 59 Stateless; 47 No deploy, no server, great sleep
AWS Lambda cons: 7 Can't execute Ruby or Go; 3 Compute time limited; 1 Can't execute PHP w/o significant effort
Apache Kudu pros: 10 Realtime Analytics
Apache Kudu cons: 1 Restart time

Integrations
AWS Lambda: No integrations available
Apache Kudu: Hadoop

What are some alternatives to AWS Lambda, Apache Kudu?

Apache Spark

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

Presto

Distributed SQL Query Engine for Big Data

Azure Functions

Azure Functions is an event driven, compute-on-demand experience that extends the existing Azure application platform with capabilities to implement code triggered by events occurring in virtually any Azure or 3rd party service as well as on-premises systems.

Google Cloud Run

A managed compute platform that enables you to run stateless containers that are invocable via HTTP requests. It's serverless by abstracting away all infrastructure management.

Amazon Athena

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

Apache Flink

Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports batch and streaming analytics, in one system. Analytical programs can be written in concise and elegant APIs in Java and Scala.

lakeFS

It is an open-source data version control system for data lakes. It provides a “Git for data” platform enabling you to implement best practices from software engineering on your data lake, including branching and merging, CI/CD, and production-like dev/test environments.

Druid

Druid is a distributed, column-oriented, real-time analytics data store that is commonly used to power exploratory dashboards in multi-tenant environments. Druid excels as a data warehousing solution for fast aggregate queries on petabyte sized data sets. Druid supports a variety of flexible filters, exact calculations, approximate algorithms, and other useful calculations.

Serverless

Build applications comprised of microservices that run in response to events, auto-scale for you, and only charge you when they run. This lowers the total cost of maintaining your apps, enabling you to build more logic, faster. The Framework uses new event-driven compute services, like AWS Lambda, Google Cloud Functions, and more.

Google Cloud Functions

Construct applications from bite-sized business logic billed to the nearest 100 milliseconds, only while your code is running.
