Amazon Redshift vs Qubole: What are the differences?
Amazon Redshift: Fast, fully managed, petabyte-scale data warehouse service. Redshift makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions.
Qubole: Prepare, integrate and explore Big Data in the cloud (Hive, MapReduce, Pig, Presto, Spark and Sqoop). Qubole is a cloud-based service that makes big data easy for analysts and data engineers.
Amazon Redshift and Qubole both belong to the "Big Data as a Service" category of the tech stack.
Some of the features offered by Amazon Redshift are:
- Optimized for Data Warehousing: It uses columnar storage, data compression, and zone maps to reduce the amount of I/O needed to perform queries. Redshift has a massively parallel processing (MPP) architecture, parallelizing and distributing SQL operations to take advantage of all available resources.
- Scalable: With a few clicks of the AWS Management Console or a simple API call, you can easily scale the number of nodes in your data warehouse up or down as your performance or capacity needs change.
- No Up-Front Costs: You pay only for the resources you provision. You can choose On-Demand pricing with no up-front costs or long-term commitments, or obtain significantly discounted rates with Reserved Instance pricing.
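To make the zone-map idea in the first feature more concrete, here is a minimal Python sketch. It is not Redshift's implementation: the `Block` class, the `build_blocks` and `scan_range` helpers, and the tiny block size are all hypothetical, chosen only to show how per-block minimum and maximum values let a range query skip blocks that cannot contain matching rows.

```python
from dataclasses import dataclass
from typing import List, Tuple

BLOCK_SIZE = 4  # hypothetical and deliberately tiny, for illustration only


@dataclass
class Block:
    """One storage block of a single column, plus its zone-map entry."""
    values: List[int]
    min_value: int
    max_value: int


def build_blocks(column: List[int], block_size: int = BLOCK_SIZE) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max per block."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def scan_range(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Zone-map check: skip the block if its value range cannot overlap the query.
        if block.max_value < low or block.min_value > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


# A sorted column (values 1..16) stored in four blocks of four values each.
order_days = list(range(1, 17))
blocks = build_blocks(order_days)
matches, blocks_read = scan_range(blocks, 6, 7)
print(matches, f"read {blocks_read} of {len(blocks)} blocks")
```

The skipping only pays off when the column is stored in roughly sorted order; on randomly ordered data most blocks would span the whole value range and few could be skipped.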
On the other hand, Qubole provides the following key features:
- Intuitive GUI
- Optimized Hive
- Improved S3 Performance
"Data Warehousing" is the top reason why over 27 developers like Amazon Redshift, while over 9 developers mention "Simple UI and autoscaling clusters" as the leading reason for choosing Qubole.
We ultimately migrated our Hadoop jobs to Qubole, a rising player in the Hadoop as a Service space. Given that EMR had become unstable at our scale, we had to quickly move to a provider that played well with AWS (specifically, spot instances) and S3. Qubole supported AWS/S3 and was relatively easy to get started on. After vetting Qubole and comparing its performance against alternatives (including managed clusters), we decided to go with Qubole.
We aggressively archive historical data to keep the production database as small as possible, using our in-house, soon-to-be-open-sourced ETL library, SharpShifter.
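The archiving pattern described here can be sketched in a few lines of Python. This is not SharpShifter, whose API is not shown in the source; it uses the standard-library `sqlite3` module as a stand-in for the production database, and the `events` and `events_archive` tables and the `archive_older_than` helper are hypothetical names introduced for illustration.

```python
import sqlite3


def archive_older_than(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move rows older than `cutoff` (ISO date string) into the archive table.

    Copy and delete run in one transaction, so a failure leaves both tables unchanged.
    """
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO events_archive SELECT * FROM events WHERE created_at < ?",
            (cutoff,),
        )
        cursor = conn.execute("DELETE FROM events WHERE created_at < ?", (cutoff,))
        return cursor.rowcount


# Hypothetical schema and sample data, held in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
conn.execute("CREATE TABLE events_archive (id INTEGER PRIMARY KEY, created_at TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, "2014-01-15", "a"), (2, "2014-06-01", "b"), (3, "2015-02-10", "c")],
)
conn.commit()

moved = archive_older_than(conn, "2015-01-01")
live = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
archived = conn.execute("SELECT COUNT(*) FROM events_archive").fetchone()[0]
print(f"moved={moved} live={live} archived={archived}")
```

In a real setup the archive would more likely live in a separate warehouse or object store than in a sibling table; the sibling table just keeps the sketch self-contained.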