AWS Lambda vs. Apache Spark


What is AWS Lambda?

AWS Lambda is a compute service that runs your code in response to events and automatically manages the underlying compute resources for you. You can use AWS Lambda to extend other AWS services with custom logic, or create your own back-end services that operate at AWS scale, performance, and security.
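To make "runs your code in response to events" concrete, here is a minimal sketch of a Python Lambda handler. The `(event, context)` signature is the standard one for the Python runtime. The `name` field in the event and the shape of the response are assumptions chosen for illustration.

```python
import json


def handler(event, context):
    """Entry point that Lambda calls once per event.

    `event` is the decoded payload; `context` carries runtime metadata
    and is unused here.
    """
    # `name` is a hypothetical field chosen for this example.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local invocation with a hand-built event; no AWS account is needed.
    print(handler({"name": "Lambda"}, None))
```

Because the handler is a plain function, it can be exercised locally with a hand-built event before it is deployed.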

What is Apache Spark?

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

What companies use AWS Lambda?
1234 companies on StackShare use AWS Lambda
What companies use Apache Spark?
338 companies on StackShare use Apache Spark
What tools integrate with AWS Lambda?
42 tools on StackShare integrate with AWS Lambda
What tools integrate with Apache Spark?
15 tools on StackShare integrate with Apache Spark

What are some alternatives to AWS Lambda and Apache Spark?

  • Serverless - The most widely-adopted toolkit for building serverless applications
  • AWS Elastic Beanstalk - Quickly deploy and manage applications in the AWS cloud.
  • Azure Functions - Listen and react to events across your stack
  • AWS Step Functions - Build Distributed Applications Using Visual Workflows


Related Stack Decisions
Jeyabalaji Subramanian
CTO at FundsCorner | 12 upvotes 27.7K views
at FundsCorner
Amazon SQS
Sentry
GitLab CI
Slack
Google Compute Engine
Netlify
AWS Lambda
Zappa
vuex
Vuetify
Vue.js
Swagger UI
MongoDB
Flask
Python

At FundsCorner, we are on a mission to enable fast, accessible credit for India's Kirana stores. We are an early-stage startup with an ultra-small engineering team. All the tech decisions we have made until now are based on our core philosophy: "Build usable products fast".

Based on the above fundamentals, we chose Python as our base language for all our APIs and micro-services. It is ultra easy to start with, yet provides great libraries even for the most complex of use cases. Our entire backend stack runs on Python and we could not be happier with it! If you are looking to deploy your API as serverless, Python has among the shortest cold start times.

We build our APIs with Flask. For the backend database, our natural choice was MongoDB. It frees us from complex database specifications; we spend that time on sensible data modelling instead, and once we finalize the data model, we integrate it into Flask using Swagger UI. Mongo supports complex queries for pulling out hard-to-reach data through its aggregation framework, and we have even built an internal framework called "Poetry" for aggregation queries.
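As an illustration only (this is not FundsCorner's "Poetry" framework), a MongoDB aggregation pipeline is an ordered list of stage documents. The sketch below builds one with the standard `$match`, `$group`, and `$sort` stages. The collection name `loans` and the fields `status`, `amount`, and `store_id` are hypothetical.

```python
def build_pipeline(status, min_amount):
    """Return aggregation stages: filter, group per store, sort by total."""
    return [
        # Keep only documents with the given status and a large enough amount.
        {"$match": {"status": status, "amount": {"$gte": min_amount}}},
        # Sum the amounts and count the documents for each store.
        {
            "$group": {
                "_id": "$store_id",
                "total": {"$sum": "$amount"},
                "count": {"$sum": 1},
            }
        },
        # Largest totals first.
        {"$sort": {"total": -1}},
    ]


pipeline = build_pipeline("approved", 1000)
# With pymongo installed and a live database, this would run as:
#     results = list(db.loans.aggregate(pipeline))
```

Keeping the pipeline as plain data makes it easy to build and inspect without a database connection.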

Our web apps are built on Vue.js, Vuetify and vuex. Initially we debated a lot about choosing between Vue.js and React, but finally settled on Vue.js, mainly because of its ease of use, fast development cycles and the awesome set of libraries and utilities backing Vue.

You simply cannot go wrong with Vue.js. The documentation is great, and the library is ultra compact and blazing fast. Choosing Vue.js was one of our critical decisions: it enabled us to launch our web app in under a month (which would otherwise easily have taken 3 months). For those folks who are looking for big names, Adobe, Alibaba and GitLab are using Vue.

By choosing Vuetify, we saved thousands of person-hours in designing the CSS files. Vuetify contains all the key Material components for designing a smooth user experience, and it just works! It's an awesome framework. All of us at FundsCorner are now lifelong fanboys of Vue.js and Vuetify.

On the infrastructure side, all our API services and backend services are deployed as serverless micro-services through Zappa. Zappa makes your life super easy by packaging everything that is required to deploy your code to AWS Lambda. We are now addicted to the single-click deploys and updates through Zappa. Try it out and you will convert!

Also, if you are using Zappa, you can greatly simplify your CI/CD pipelines. Do try it! It's just awesome, and you will be astonished by the savings on your AWS bill at the end of the month.

Our CI/CD pipelines are built using GitLab CI. The documentation is very good, and it enables you to go from concept to production in minimal time.

We use Sentry for all crash reporting and resolution. Pro tip: they have handlers for AWS Lambda, which made our integration super easy.

All our micro-services, including APIs, are event-driven. Our background micro-services are message-oriented, and we use Amazon SQS as our message pipe. We have our own in-house workflow manager to orchestrate across micro-services.
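A message-oriented background service on Lambda typically receives SQS messages in batches. The sketch below is not FundsCorner's code. It assumes the standard SQS event shape (`Records`, each with `messageId` and `body`) and returns a partial batch response, which only takes effect when `ReportBatchItemFailures` is enabled on the event source mapping. The `task` field and the `process` function are hypothetical.

```python
import json


def process(message):
    """Hypothetical business logic; raises on a malformed message."""
    if "task" not in message:
        raise ValueError("missing 'task'")
    return message["task"]


def sqs_handler(event, context):
    """Handle one batch of SQS messages delivered to Lambda."""
    failures = []
    for record in event.get("Records", []):
        try:
            # The message payload arrives as a string in `body`.
            process(json.loads(record["body"]))
        except (ValueError, KeyError):
            # Report only this message as failed, so that the rest of the
            # batch is not retried along with it.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Returning the failed message IDs, rather than raising, lets the successfully processed messages leave the queue while the failed ones become visible again for retry.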

We host our static websites on Netlify. One of the cool things about Netlify is the automated CI/CD on git push. You just do a git push to deploy! Again, it is super simple to use and it just works. We were dogmatic about going serverless even on static websites, and you can go serverless on Netlify in a few minutes. It's just a few clicks away.

We use Google Compute Engine, especially Google Vision for our AI experiments.

For Ops automation, we use Slack. Slack provides a super-rich API (through Slack App) through which you can weave magical automation on boring ops tasks.

Julien DeFrance
Full Stack Engineering Manager at ValiMail | 2 upvotes 4.2K views
at SmartZip
Amazon SageMaker
Amazon Machine Learning
AWS Lambda
Serverless
#FaaS
#GCP
#PaaS

Which #IaaS / #PaaS to choose? Not all #Cloud providers are created equal. As you start to use one or the other, you'll build around very specific services that don't have their equivalent elsewhere.

Back in 2014/2015, this decision I made for SmartZip was a no-brainer and #AWS won. AWS has been a leader and, over the years, has demonstrated its capacity to innovate and reduce toil like no other.

Year after year, this kept being confirmed as they rolled out new (managed) services, got into Serverless with AWS Lambda / FaaS, and put domains such as #AI / #MachineLearning into the hands of every developer, thanks to Amazon Machine Learning or Amazon SageMaker for instance.

Should you compare with #GCP for instance, it's not quite there yet. Building around these managed services, #AWS allowed me to take my developers to a whole new level. Where they know what's under the hood. Where they know they have these services available and can build around them. Where they care about and are responsible for the operations, security and deployment of what they've worked on.

Aviad Mor
CTO & Co-Founder at Lumigo | 5 upvotes 3K views
at Lumigo
Serverless
CircleCI
AWS Lambda

Our backend is serverless, built on many AWS Lambda functions, with CI/CD using CircleCI and Serverless. This allows us to develop with awesome agility and move fast. Since we update our lambdas daily, we needed a way to make sure we did not run into AWS's maximum number of versions per lambda. We use the open-source tool in the link below to clear them out and stay clear of the limit.
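The tool the post refers to is not shown here. As a rough sketch of the general idea only, the code below keeps the newest few numbered versions of a function and deletes the rest. It assumes a boto3 Lambda client is passed in, for example `boto3.client("lambda")`, and uses its `list_versions_by_function` paginator, `list_aliases`, and `delete_function` calls. The default of keeping five versions is an arbitrary choice.

```python
def versions_to_delete(versions, keep=5, protected=()):
    """Pick numbered versions to delete, keeping the newest `keep`.

    `$LATEST` and any version in `protected` are never selected.
    """
    numbered = sorted(
        (v for v in versions if v != "$LATEST" and v not in protected),
        key=int,
    )
    return numbered[:-keep] if keep > 0 else numbered


def prune(client, function_name, keep=5):
    """Delete old versions of one function; returns the deleted versions."""
    versions = []
    paginator = client.get_paginator("list_versions_by_function")
    for page in paginator.paginate(FunctionName=function_name):
        versions += [v["Version"] for v in page["Versions"]]

    # Skip versions that an alias still points to.
    # Simplification: only the first page of aliases is read.
    aliases = client.list_aliases(FunctionName=function_name)["Aliases"]
    protected = {a["FunctionVersion"] for a in aliases}

    doomed = versions_to_delete(versions, keep, protected)
    for version in doomed:
        client.delete_function(FunctionName=function_name, Qualifier=version)
    return doomed
```

Separating the selection logic from the API calls means the choice of what to delete can be checked on plain lists before anything is actually removed.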
