What is Alooma?
What is Altiscale?
Need advice about which tool to choose?Ask the StackShare community!
Why do developers choose Alooma?
Why do developers choose Altiscale?
Sign up to add, upvote and see more prosMake informed product decisions
What are the cons of using Alooma?
What are the cons of using Altiscale?
Sign up to get full access to all the companiesMake informed product decisions
What tools integrate with Altiscale?
Sign up to get full access to all the tool integrationsMake informed product decisions
Looker , Stitch , Amazon Redshift , dbt
We recently moved our Data Analytics and Business Intelligence tooling to Looker . It's already helping us create a solid process for reusable SQL-based data modeling, with consistent definitions across the entire organizations. Looker allows us to collaboratively build these version-controlled models and push the limits of what we've traditionally been able to accomplish with analytics with a lean team.
For Data Engineering, we're in the process of moving from maintaining our own ETL pipelines on AWS to a managed ELT system on Stitch. We're also evaluating the command line tool, dbt to manage data transformations. Our hope is that Stitch + dbt will streamline the ELT bit, allowing us to focus our energies on analyzing data, rather than managing it.
Currently, we need to ingest the data from Amazon S3 to DB either Amazon Athena or Amazon Redshift. But the problem with the data is, it is in .PSV (pipe separated values) format and the size is also above 200 GB. The query performance of the timeout in Athena/Redshift is not up to the mark, too slow while compared to Google BigQuery. How would I optimize the performance and query result time? Can anyone please help me out?
Aggressive archiving of historical data to keep the production database as small as possible. Using our in-house soon-to-be-open-sourced ETL library, SharpShifter.