Snowflake vs Treasure Data: What are the differences?
Storage Architecture: Snowflake and Treasure Data have different storage architectures. Snowflake uses a multi-cluster, shared data architecture: data lives in a central storage layer, while compute runs in separate virtual warehouses that can be scaled independently of storage. Treasure Data, on the other hand, uses a distributed data storage architecture, which allows for reliable and scalable storage of large volumes of data.
Query Processing: Snowflake and Treasure Data have different approaches to query processing. Snowflake separates query compilation and optimization, which are handled by its cloud services layer, from query execution, which runs on virtual warehouses. This lets queries be planned and optimized before any compute is used to run them. Treasure Data, on the other hand, uses a distributed query processing model that executes queries in parallel across multiple nodes, resulting in high query performance.
Data Transformation: Snowflake and Treasure Data have different capabilities for data transformation. Snowflake provides a wide range of built-in functions and operators for data transformation, including support for complex data types and data manipulation language (DML) operations. Treasure Data, on the other hand, provides a rich set of data transformation features, including support for user-defined functions, data aggregation, and data cleansing.
Data Integration: Snowflake and Treasure Data have different approaches to data integration. Snowflake provides native connectors to popular data integration tools, such as Informatica and Talend, allowing for seamless data ingestion and integration. Treasure Data, on the other hand, offers a data collection platform that supports data integration from various sources, including web and mobile devices, IoT devices, and third-party applications.
Security: Snowflake and Treasure Data have different security features. Snowflake provides advanced security features, such as granular access controls, encryption at rest and in transit, and integration with external authentication providers. Treasure Data, on the other hand, offers secure data processing and storage, with features such as data anonymization, access controls, and encryption.
Scalability: Snowflake and Treasure Data have different scalability capabilities. Snowflake is designed to scale horizontally, allowing for seamless scaling of compute and storage resources as data volume and query workload increase. Treasure Data, on the other hand, is built on a distributed architecture that enables horizontal scaling of data processing and storage, allowing for high scalability and performance.
In summary, Snowflake and Treasure Data differ in their storage architecture, query processing, data transformation capabilities, data integration approaches, security features, and scalability capabilities.
A cloud data warehouse is the centerpiece of a modern data platform, so choosing the most suitable solution is fundamental.
Our benchmark covered BigQuery and Snowflake. Both solutions seem to match our goals, but they take very different approaches.
BigQuery is notably the only 100% serverless cloud data warehouse, and it requires absolutely no maintenance: no re-clustering, no compression, no index optimization, no storage management, no performance management. Snowflake requires setting up (paid) reclustering processes, managing the performance allocated to each profile, and so on. We also considered Redshift, but eliminated it because it requires even more operational work.
BigQuery can therefore be set up at almost zero cost in human resources. Its on-demand pricing is particularly well suited to small workloads: there is no cost when the solution is not in use, and you pay only for the queries you run. As usage grows, however, switching to slots (with a monthly or per-minute commitment) drastically reduces the cost. We cut the cost of our nightly batches by a factor of 10 by using flex slots.
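The trade-off between on-demand and slot pricing can be sketched as a small calculation. This is only an illustration: the `Pricing` class, the function names, and every price and workload figure below are hypothetical placeholders, not current BigQuery prices, and should be replaced with the figures for your own region and contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical prices; replace with the real figures for your region."""
    on_demand_per_tib: float  # cost per TiB scanned under on-demand pricing
    slot_per_hour: float      # cost of one slot for one hour under a commitment


def on_demand_cost(tib_scanned: float, pricing: Pricing) -> float:
    """On-demand: pay only for the data each query scans."""
    return tib_scanned * pricing.on_demand_per_tib


def slot_cost(slots: int, hours: float, pricing: Pricing) -> float:
    """Slots: pay for reserved capacity, regardless of data scanned."""
    return slots * hours * pricing.slot_per_hour


def cheaper_option(tib_scanned: float, slots: int, hours: float,
                   pricing: Pricing) -> str:
    """Return which pricing model costs less for one workload."""
    if slot_cost(slots, hours, pricing) < on_demand_cost(tib_scanned, pricing):
        return "slots"
    return "on-demand"


if __name__ == "__main__":
    # Placeholder numbers chosen only to illustrate the comparison.
    pricing = Pricing(on_demand_per_tib=5.0, slot_per_hour=0.04)
    # A heavy nightly batch: lots of data scanned in a short window.
    print(cheaper_option(tib_scanned=20, slots=100, hours=2, pricing=pricing))
    # A small ad hoc workload: little data scanned.
    print(cheaper_option(tib_scanned=0.1, slots=100, hours=1, pricing=pricing))
```

With assumptions like these, a scan-heavy batch tends to favor slots and a light workload tends to favor on-demand, which matches the pattern described above. The actual break-even point depends entirely on real prices and on how many slots the workload needs.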
Finally, a major advantage of BigQuery is its almost perfect integration with Google Cloud Platform services: Cloud Functions, Dataflow, Data Studio, etc.
BigQuery is still evolving very quickly. The next milestone, BigQuery Omni, will make it possible to run queries over data stored on an external cloud platform (Amazon S3, for example). It will be a major breakthrough in the history of cloud data warehouses. Omni will compensate for a weakness of BigQuery: transferring data in near real time from S3 to BQ is not easy today. That was actually simpler to implement with Snowflake's Snowpipe.
We also plan to use the machine learning features built into BigQuery to accelerate the deployment of our data-science projects, an opportunity offered only by BigQuery.
Pros of Snowflake
- Public and Private Data Sharing (7)
- Multicloud (4)
- Good Performance (4)
- User Friendly (4)
- Great Documentation (3)
- Serverless (2)
- Economical (1)
- Usage based billing (1)
- Innovative (1)
Pros of Treasure Data
- Scalability, less overhead (2)
- Makes it easy to ingest all data from different inputs (2)
- Responsive to our business requirements, great support (1)