Amazon Redshift vs Amazon Redshift Spectrum: What are the differences?
Developers describe Amazon Redshift as "Fast, fully managed, petabyte-scale data warehouse service". Redshift makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. It is optimized for datasets ranging from a few hundred gigabytes to a petabyte or more, and costs less than $1,000 per terabyte per year, about a tenth the cost of most traditional data warehousing solutions. Amazon Redshift Spectrum, on the other hand, is described as "Exabyte-Scale In-Place Queries of S3 Data". With Redshift Spectrum, you can extend the analytic power of Amazon Redshift beyond the data stored on local disks in your data warehouse to query vast amounts of unstructured data in your Amazon S3 "data lake", without having to load or transform any data.
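In practice, Spectrum works by registering an external schema over a data catalog and defining external tables that point at files in S3; you then query them with ordinary SQL from the Redshift cluster. A minimal sketch of the DDL involved, with helper functions that just build the statements (the schema, database, IAM role, and bucket names are hypothetical placeholders, not a real deployment):

```python
# Sketch of typical Redshift Spectrum setup DDL. All identifiers below
# (schema, catalog database, role ARN, S3 prefix) are made-up examples.

def external_schema_ddl(schema: str, glue_db: str, iam_role_arn: str) -> str:
    """Register a data-catalog database as an external schema in Redshift."""
    return (
        f"CREATE EXTERNAL SCHEMA {schema} "
        f"FROM DATA CATALOG DATABASE '{glue_db}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "CREATE EXTERNAL DATABASE IF NOT EXISTS;"
    )

def external_table_ddl(schema: str, table: str,
                       columns: dict, s3_prefix: str) -> str:
    """Define a table over raw files in S3; no data is loaded into Redshift."""
    cols = ", ".join(f"{name} {dtype}" for name, dtype in columns.items())
    return (
        f"CREATE EXTERNAL TABLE {schema}.{table} ({cols}) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        "STORED AS TEXTFILE "
        f"LOCATION '{s3_prefix}';"
    )

print(external_schema_ddl(
    "spectrum", "clickstream",
    "arn:aws:iam::123456789012:role/spectrum-role"))
print(external_table_ddl(
    "spectrum", "events",
    {"event_id": "bigint", "user_id": "bigint", "event_time": "timestamp"},
    "s3://example-bucket/events/"))
```

Once the external table exists, a query like `SELECT count(*) FROM spectrum.events` scans the S3 files directly, and external tables can even be joined against regular Redshift tables in the same statement.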
Amazon Redshift can be classified as a tool in the "Big Data as a Service" category, while Amazon Redshift Spectrum is grouped under "Big Data Tools".
Lyft, Coursera, and 9GAG are some of the popular companies that use Amazon Redshift, whereas Amazon Redshift Spectrum is used by VSCO, CommonBond, and intermix.io. Amazon Redshift has broader adoption, being mentioned in 270 company stacks and 68 developer stacks, compared to Amazon Redshift Spectrum, which is listed in 5 company stacks and 4 developer stacks.
Looker, Stitch, Amazon Redshift, dbt
We recently moved our Data Analytics and Business Intelligence tooling to Looker. It's already helping us create a solid process for reusable SQL-based data modeling, with consistent definitions across the entire organization. Looker allows us to collaboratively build these version-controlled models and, with a lean team, push the limits of what we've traditionally been able to accomplish with analytics.
For Data Engineering, we're in the process of moving from maintaining our own ETL pipelines on AWS to a managed ELT system on Stitch. We're also evaluating the command-line tool dbt to manage data transformations. Our hope is that Stitch + dbt will streamline the ELT side, allowing us to focus our energies on analyzing data rather than managing it.
Aggressive archiving of historical data to keep the production database as small as possible. Using our in-house soon-to-be-open-sourced ETL library, SharpShifter.