Apache Spark vs Apex Comparison | StackShare

Apex vs Apache Spark: What are the differences?

Apex: Serverless Architecture with AWS Lambda. Apex is a small tool for deploying and managing AWS Lambda functions. With shims for languages not yet supported by Lambda, you can use Golang out of the box.

Apache Spark: Fast and general engine for large-scale data processing. Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

Apex belongs to the "Serverless / Task Processing" category of the tech stack, while Apache Spark is primarily classified under "Big Data Tools".

Some of the features offered by Apex are:

Supports languages that Lambda does not natively support, such as Go, via a shim
Binary install (useful for continuous deployment in CI, etc.)
Project-level function and resource management
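For orientation, the unit that Apex deploys and manages is an AWS Lambda function. The sketch below shows roughly what such a function looks like in Python, assuming the standard Lambda convention of a handler that receives an event and a context. The handler name `handle` and the event shape are hypothetical, and nothing here is specific to Apex itself.

```python
import json


def handle(event, context):
    """Hypothetical Lambda-style handler: returns a greeting for the given name."""
    # "name" is an assumed field in the event, chosen only for this sketch.
    name = event.get("name", "world")
    return {"message": f"Hello, {name}!"}


if __name__ == "__main__":
    # Local stand-in for an invocation with a JSON event; no AWS services are called.
    event = json.loads('{"name": "Apex"}')
    print(json.dumps(handle(event, None)))
```

Running the file locally only exercises the handler logic. Deployment, rollback, and log tailing remain the job of the tool.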

On the other hand, Apache Spark provides the following key features:

Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk
Write applications quickly in Java, Scala or Python
Combine SQL, streaming, and complex analytics
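The batch model that these features are compared against (MapReduce) can be illustrated with a small word count. The sketch below is plain Python and does not use Spark's API. The function names `map_phase`, `reduce_by_key`, and `word_count` are hypothetical and chosen only to mirror the two stages of the model.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reduce_by_key(pairs):
    """Sum the values for each key, as a per-key reduce step would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def word_count(lines):
    """Chain the map and reduce stages over an in-memory list of lines."""
    return reduce_by_key(map_phase(lines))


if __name__ == "__main__":
    sample = ["spark runs batch jobs", "spark runs streaming jobs"]
    print(word_count(sample))
```

A distributed engine runs the same two stages across many machines and partitions of data. This single-process version only shows the shape of the computation.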

Apex and Apache Spark are both open source tools. Apache Spark, with 22.3K GitHub stars and 19.3K forks, appears to be more popular than Apex, with 7.82K GitHub stars and 567 forks.

Apache Spark features:

Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk
Write applications quickly in Java, Scala or Python
Combine SQL, streaming, and complex analytics
Spark runs on Hadoop, Mesos, standalone, or in the cloud. It can access diverse data sources including HDFS, Cassandra, HBase, S3

Apex features:

Supports languages Lambda does not natively support via shim, such as Go
Binary install (useful for continuous deployment in CI etc)
Project level function and resource management
Configuration inheritance and overrides
Command-line function invocation with JSON streams
Transparently generates a zip for your deploy
Function rollback support
Tail function CloudWatchLogs
Concurrency for quick deploys
Dry-run to preview changes

Statistics

	Apache Spark	Apex
GitHub Stars	42.2K	33
GitHub Forks	28.9K	56
Stacks	3.0K	336
Followers	3.5K	117
Votes	140	0

Pros & Cons

Apache Spark pros (with vote counts):

61 Open-source
48 Fast and Flexible
8 One platform for every big data problem
8 Great for distributed SQL like applications
6 Easy to install and to use

Apache Spark cons (with vote counts):

4 Speed

Apex: No community feedback yet

Integrations

Apache Spark: No integrations available
Apex: AWS Lambda, Golang


What are some alternatives to Apache Spark, Apex?

AWS Lambda

Presto

Azure Functions

Google Cloud Run

Amazon Athena

Apache Flink

lakeFS

Druid

Serverless

Google Cloud Functions
