Amazon Redshift Spectrum vs Pig: What are the differences?
What is Amazon Redshift Spectrum? Exabyte-Scale In-Place Queries of S3 Data. With Redshift Spectrum, you can extend the analytic power of Amazon Redshift beyond data stored on local disks in your data warehouse to query vast amounts of unstructured data in your Amazon S3 “data lake” -- without having to load or transform any data.
What is Pig? Platform for analyzing large data sets. Pig is a dataflow programming environment for processing very large files. Pig's language is called Pig Latin. A Pig Latin program consists of a directed acyclic graph where each node represents an operation that transforms data. Operations are of two flavors: (1) relational-algebra style operations such as join, filter, project; (2) functional-programming style operators such as map, reduce.
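To make the dataflow idea concrete, here is a minimal sketch in plain Python (not Pig Latin, and not how Pig executes anything). It chains the two flavors of operation named above: relational-style filter, project, and join steps, followed by functional-style map and reduce steps. The sample records and the helper names `filter_rows`, `project`, and `join` are hypothetical and exist only for illustration.

```python
from functools import reduce

# Hypothetical sample data standing in for two large input files.
users = [
    {"user_id": 1, "name": "ana", "country": "BR"},
    {"user_id": 2, "name": "bo", "country": "US"},
    {"user_id": 3, "name": "cy", "country": "US"},
]
orders = [
    {"user_id": 1, "amount": 20.0},
    {"user_id": 2, "amount": 35.5},
    {"user_id": 2, "amount": 4.5},
    {"user_id": 3, "amount": 10.0},
]


def filter_rows(rows, predicate):
    """Relational-style filter: keep rows for which predicate is true."""
    return [row for row in rows if predicate(row)]


def project(rows, fields):
    """Relational-style projection: keep only the named fields."""
    return [{field: row[field] for field in fields} for row in rows]


def join(left, right, key):
    """Relational-style inner join on a shared key (simple hash join)."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Each assignment is one node in the dataflow graph; every node consumes
# the output of an earlier node, so the graph has no cycles.
us_users = filter_rows(users, lambda row: row["country"] == "US")
us_names = project(us_users, ["user_id", "name"])
joined = join(us_names, orders, "user_id")

# Functional-style steps: map each row to its amount, then reduce to a sum.
amounts = list(map(lambda row: row["amount"], joined))
total = reduce(lambda acc, amount: acc + amount, amounts, 0.0)

print(total)
```

In this sketch every step runs eagerly on in-memory lists. A system built for very large files would instead plan the whole graph and distribute the work, but the shape of the program, a chain of named transformations, is the same.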
Amazon Redshift Spectrum and Pig are both primarily classified as "Big Data" tools.
Pig is an open source tool with 583 GitHub stars and 449 GitHub forks.
According to the StackShare community, Pig has broader approval: it is mentioned in 9 company stacks and 4 developer stacks, compared to Amazon Redshift Spectrum, which is listed in 5 company stacks and 4 developer stacks.