What are some alternatives to Azure Databricks?

What is Azure Databricks and what are its top alternatives?

Accelerate big data analytics and artificial intelligence (AI) solutions with Azure Databricks, a fast, easy and collaborative Apache Spark–based analytics service.

Azure Databricks is a tool in the General Analytics category of a tech stack.

Top Alternatives to Azure Databricks

Databricks
Databricks Unified Analytics Platform, from the original creators of Apache Spark™, unifies data science and engineering across the Machine Learning lifecycle from data preparation to experimentation and deployment of ML applications. ...
Azure Machine Learning
Azure Machine Learning is a fully-managed cloud service that enables data scientists and developers to efficiently embed predictive analytics into their applications, helping organizations use massive data sets and bring all the benefits of the cloud to machine learning. ...
Azure HDInsight
It is a cloud-based service from Microsoft for big data analytics that helps organizations process large amounts of streaming or historical data. ...
Apache Spark
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning. ...
Snowflake
Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS)—no infrastructure to manage and no knobs to turn. ...
Azure Data Factory
It is a service designed to allow developers to integrate disparate data sources. It is a platform somewhat like SSIS in the cloud to manage the data you have both on-prem and in the cloud. ...
Azure Functions
Azure Functions is an event driven, compute-on-demand experience that extends the existing Azure application platform with capabilities to implement code triggered by events occurring in virtually any Azure or 3rd party service as well as on-premises systems. ...
Google Analytics
Google Analytics lets you measure your advertising ROI as well as track your Flash, video, and social networking sites and applications. ...

Azure Databricks alternatives & related posts

Databricks

A unified analytics platform, powered by Apache Spark

Stacks: 507

Votes: 8

PROS OF DATABRICKS

Best performance on large datasets (1)
True lakehouse architecture (1)
Scalability (1)
Databricks doesn't get access to your data (1)
Usage-based billing (1)
Security (1)
Data stays in your cloud account (1)
Multicloud (1)

Azure Machine Learning

A fully-managed cloud service for predictive analytics

Stacks: 244

Votes: 0

Azure HDInsight

A cloud-based service from Microsoft for big data analytics

Stacks: 31

Votes: 0

Apache Spark

Fast and general engine for large-scale data processing

Stacks: 3K

Votes: 140

PROS OF APACHE SPARK

Open-source (61)
Fast and flexible (48)
One platform for every big data problem (8)
Great for distributed SQL-like applications (8)
Easy to install and to use (6)
Works well for most data science use cases (3)
Interactive query (2)
Machine learning library, streaming in real time (2)
In-memory computation (2)

CONS OF APACHE SPARK

Speed (4)

Snowflake

The data warehouse built for the cloud

Stacks: 1.1K

Votes: 27

PROS OF SNOWFLAKE

Public and private data sharing (7)
Multicloud (4)
Good performance (4)
User friendly (4)
Great documentation (3)
Serverless (2)
Economical (1)
Usage-based billing (1)
Innovative (1)

Azure Data Factory

Hybrid data integration service that simplifies ETL at scale

Stacks: 252

Votes: 0

Azure Functions

Listen and react to events across your stack

Stacks: 681

Votes: 62

PROS OF AZURE FUNCTIONS

Pay only when invoked (14)
Great developer experience for C# (11)
Multiple languages supported (9)
Great debugging support (7)
Can be used as a lightweight HTTPS service (5)
Easy scalability (4)
WebHooks (3)
Cost (3)
Event driven (2)
Azure component events for Storage, services, etc. (2)
Poor developer experience for C# (2)

CONS OF AZURE FUNCTIONS

No persistent (writable) file system available (1)
Poor support for Linux environments (1)
Sporadic server & language runtime issues (1)
Not suited for long-running applications (1)

Related Azure Functions posts

Kestas Barzdaitis

Entrepreneur & Engineer · Dec 3, 2018 | 16 upvotes · 784.5K views

Shared insights

CodeFactor

Because CodeFactor is a #SAAS product, our goal was to run on cloud-native infrastructure from day one. We wanted to stay product focused, rather than having to work on the infrastructure that supports the application. We needed a cloud-hosting provider that would be reliable, economical and efficient for our product.

CodeFactor.io aims to provide an automated and frictionless code review service for software developers. That requires agility, instant provisioning, autoscaling, security, availability and compliance management features. We looked at the top three #IAAS providers that hold the majority of market share: Amazon (Amazon EC2), Microsoft (Microsoft Azure), and Google (Google Compute Engine).

AWS has been available since 2006 and has developed the most extensive variety of services and tools at a massive scale. Azure and GCP are about half the age of AWS, but they also satisfied our technical requirements.

It is worth noting that even though all three providers support Docker containerization services, GCP has the most robust offering due to its investments in Kubernetes. Also, if you are a Microsoft shop and develop in .NET with Visual Studio, Azure shines at integration there, and all your existing .NET code works seamlessly on Azure. All three providers have serverless computing offerings (AWS Lambda, Azure Functions, and Google Cloud Functions). Additionally, all three providers have machine learning tools, but GCP appears to be the most developer-friendly, intuitive and complete when it comes to #Machinelearning and #AI.

Prices are competitive across the board. For our requirements, AWS would have been the most expensive, GCP the least expensive, and Azure in the middle. Plus, if you #Autoscale frequently with large deltas, note that Azure and GCP have per-minute billing, whereas AWS bills you per hour. We also applied for the #Startup programs with all three providers, and this is where Azure shined. While AWS and GCP for startups would have covered us for about one year of infrastructure costs, Azure Sponsorship would cover about two years of CodeFactor's hosting costs. Moreover, the Azure team was terrific: I felt that they wanted to work with us, whereas for AWS and GCP we were just another startup.
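To illustrate why billing granularity matters when autoscaling in short bursts, here is a minimal Python sketch. It assumes usage is rounded up to the next billing unit; the function name, the hourly rate, and the workload are all hypothetical and are not taken from any provider's price list.

```python
import math


def billed_cost(run_minutes, granularity_minutes, hourly_rate):
    """Cost of one instance run when usage is rounded up to the billing unit."""
    billed_units = math.ceil(run_minutes / granularity_minutes)
    billed_minutes = billed_units * granularity_minutes
    return billed_minutes * hourly_rate / 60


# Hypothetical workload: 20 autoscaling bursts of 10 minutes each,
# at a made-up rate of $0.12 per instance-hour.
runs = [10] * 20
rate = 0.12

per_minute = sum(billed_cost(m, 1, rate) for m in runs)   # 1-minute granularity
per_hour = sum(billed_cost(m, 60, rate) for m in runs)    # 60-minute granularity

print(f"per-minute billing: ${per_minute:.2f}")
print(f"per-hour billing:   ${per_hour:.2f}")
```

Under these assumptions each 10-minute burst is billed as a full hour with hourly granularity, so the same workload costs six times as much as with per-minute billing.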

In summary, we were leaning towards GCP. GCP's advantages in containerization, automation toolset, #Devops mindset, and pricing were the driving factors there. Nevertheless, we could not say no to Azure's financial incentives and a strong sense of partnership and support throughout the process.

Bottom line is, IAAS offerings with AWS, Azure, and GCP are evolving fast. At CodeFactor, we aim to be platform agnostic where it is practical and retain the flexibility to cherry-pick the best products across providers.

Michal Nowak

Co-founder at Evojam · Dec 5, 2018 | 8 upvotes · 561.2K views

Shared insights

In a couple of recent projects we had an opportunity to try out the new serverless approach to building web applications. The question wasn't necessarily which vendor we should use, but rather whether we could consider serverless a viable option for building apps at all. Obviously, our goal was also to get a feel for this technology and gain some hands-on experience.

We considered AWS Lambda, Firebase from Google, and Azure Functions. Eventually we went with AWS Lambda.

PROS

No servers to manage (obviously!)
Limited fixed costs – you pay only for used time
Automated scaling and balancing
Automatic failover (or, at this level of abstraction, no failover problem at all)
Security easier to provide and audit
Low overhead at the start (with a certain level of knowledge)
Short time to market
Easy handover - deployment coupled with code
Perfect choice for lean startups with fast-paced iterations
Augmentation for the classic cloud, server(full) approach

CONS

Little know-how and few best practices available on structuring code and projects
Not suitable for complex business logic due to the risk of producing highly coupled code
Cost difficult to estimate (helpful tools: serverlesscalc.com)
Difficulty in migrating to other platforms (vendor lock-in ⚠️)
Few engineers with serverless experience on the job market
Steep learning curve for engineers without any cloud experience
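As a rough illustration of why pay-per-use costs are hard to estimate up front, the sketch below models one common serverless pricing shape: a fee per request plus a charge per GB-second of execution. The function name and both price values are assumptions chosen for illustration, not quotes from any vendor; check the provider's current price list before relying on numbers like these.

```python
def monthly_cost(invocations, avg_duration_s, memory_gb,
                 price_per_million_requests, price_per_gb_second):
    """Estimate a monthly bill as request fees plus compute (GB-second) fees."""
    request_cost = invocations / 1_000_000 * price_per_million_requests
    compute_cost = invocations * avg_duration_s * memory_gb * price_per_gb_second
    return request_cost + compute_cost


# Illustrative prices only (assumptions, not vendor quotes).
REQUEST_PRICE = 0.20          # per one million requests
GB_SECOND_PRICE = 0.00001667  # per GB-second of execution

# The bill scales with traffic, duration, and memory, none of which
# are usually known precisely before launch.
for invocations in (100_000, 1_000_000, 50_000_000):
    cost = monthly_cost(invocations, avg_duration_s=0.2, memory_gb=0.5,
                        price_per_million_requests=REQUEST_PRICE,
                        price_per_gb_second=GB_SECOND_PRICE)
    print(f"{invocations:>11,} invocations -> ${cost:,.2f}")
```

Because every input is an estimate, running the calculation over a range of traffic and duration values gives a more useful picture than a single figure.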

More details are on our blog: https://evojam.com/blog/2018/12/5/should-you-go-serverless-meet-the-benefits-and-flaws-of-new-wave-of-cloud-solutions I hope it helps 🙌 and I'm curious about your experiences.