
Alternatives to Treasure Data

Fluentd, Segment, Splunk, Google BigQuery, and Amazon Redshift are the most popular alternatives and competitors to Treasure Data.

What is Treasure Data and what are its top alternatives?

Treasure Data's Big-Data-as-a-Service cloud platform enables data-driven businesses to focus their precious development resources on their applications, not on mundane, time-consuming integration and operational tasks. The Treasure Data Cloud Data Warehouse service offers an affordable, quick-to-implement, and easy-to-use big data option that does not require specialized IT resources, making big data analytics available to the mass market.
Treasure Data is a tool in the Big Data as a Service category of a tech stack.
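
For a concrete sense of the developer workflow, here is a minimal sketch of running a query against Treasure Data from Python. It assumes the td-client package (pip install td-client); the API key is a placeholder and the database/table names are Treasure Data's public sample dataset.

# A minimal sketch, assuming the td-client Python package.
# The API key is a placeholder; sample_datasets/www_access is TD's sample data.
import tdclient

with tdclient.Client(apikey="TD_API_KEY") as td:
    job = td.query("sample_datasets", "SELECT COUNT(1) FROM www_access", type="presto")
    job.wait()                 # block until the query finishes
    for row in job.result():   # iterate result rows
        print(row)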

Top Alternatives to Treasure Data

  • Fluentd

    Fluentd collects events from various data sources and writes them to files, RDBMS, NoSQL, IaaS, SaaS, Hadoop and so on. Fluentd helps you unify your logging infrastructure. ...

  • Segment

    Segment is a single hub for customer data. Collect your data in one place, then send it to more than 100 third-party tools, internal systems, or Amazon Redshift with the flip of a switch. ...

  • Splunk

    It provides the leading platform for Operational Intelligence. Customers use it to search, monitor, analyze and visualize machine data. ...

  • Google BigQuery

    Run super-fast, SQL-like queries against terabytes of data in seconds, using the processing power of Google's infrastructure. Load data with ease. Bulk load your data using Google Cloud Storage or stream it in. Easy access. Access BigQuery by using a browser tool, a command-line tool, or by making calls to the BigQuery REST API with client libraries such as Java, PHP or Python. ...

  • Amazon Redshift

    It is optimized for data sets ranging from a few hundred gigabytes to a petabyte or more and costs less than $1,000 per terabyte per year, a tenth the cost of most traditional data warehousing solutions. ...

  • Snowflake

    Snowflake eliminates the administration and management demands of traditional data warehouses and big data platforms. Snowflake is a true data warehouse as a service running on Amazon Web Services (AWS)—no infrastructure to manage and no knobs to turn. ...

  • Amazon EMR

    It is used in a variety of applications, including log analysis, data warehousing, machine learning, financial analysis, scientific simulation, and bioinformatics. ...

  • Stitch

    Stitch is a simple, powerful ETL service built for software developers. Stitch evolved out of RJMetrics, a widely used business intelligence platform. When RJMetrics was acquired by Magento in 2016, Stitch was launched as its own company. ...

Treasure Data alternatives & related posts


Fluentd

Unified logging layer
PROS OF FLUENTD
  • Open-source (11)
  • Great for Kubernetes node container log forwarding (9)
  • Lightweight (9)
  • Easy (8)

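To illustrate the "unified logging layer" idea above, here is a minimal sketch of emitting one event to a local Fluentd agent from Python. It assumes the fluent-logger package and the default forward port (24224); the tag and record fields are placeholders.

# A minimal sketch, assuming the fluent-logger package and a local Fluentd agent.
from fluent import sender

logger = sender.FluentSender("myapp", host="localhost", port=24224)
logger.emit("user.follow", {"from": "userA", "to": "userB"})  # ships one event
logger.close()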


    Segment

    A single hub to collect, translate and send your data with the flip of a switch.
    PROS OF SEGMENT
    • Easy to scale and maintain 3rd party services (86)
    • One API (49)
    • Simple (39)
    • Multiple integrations (25)
    • Cleanest API (19)
    • Easy (10)
    • Free (9)
    • Mixpanel Integration (8)
    • Segment SQL (7)
    • Flexible (6)
    • Google Analytics Integration (4)
    • Salesforce Integration (2)
    • SQL Access (2)
    • Clean Integration with Application (2)
    • Own all your tracking data (1)
    • Quick setup (1)
    • Clearbit integration (1)
    • Beautiful UI (1)
    • Integrates with Apptimize (1)
    • Escort (1)
    • Woopra Integration (1)
    CONS OF SEGMENT
    • Not clear which events/options are integration-specific (2)
    • Limitations with integration-specific configurations (1)
    • Client-side events are separated from server-side (1)

    related Segment posts

    Robert Zuber

    Our primary source of monitoring and alerting is Datadog. We’ve got prebuilt dashboards for every scenario and integration with PagerDuty to manage routing any alerts. We’ve definitely scaled past the point where managing dashboards is easy, but we haven’t had time to invest in using features like Anomaly Detection. We’ve started using Honeycomb for some targeted debugging of complex production issues and we are liking what we’ve seen. We capture any unhandled exceptions with Rollbar and, if we realize one will keep happening, we quickly convert the metrics to point back to Datadog, to keep Rollbar as clean as possible.

    We use Segment to consolidate all of our trackers, the most important of which goes to Amplitude to analyze user patterns. However, if we need a more consolidated view, we push all of our data to our own data warehouse running PostgreSQL; this is available for analytics and dashboard creation through Looker.

    Max Musing
    Founder & CEO at BaseDash

    Functionally, Amplitude and Mixpanel are incredibly similar. They both offer almost all the same functionality around tracking and visualizing user actions for analytics. You can track A/B test results in both. We ended up going with Amplitude at BaseDash because it has a more generous free tier for our uses (10 million actions per month, versus Mixpanel's 1000 monthly tracked users).

    Segment isn't meant to compete with these tools, but instead acts as an API to send actions to them, and other analytics tools. If you're just sending event data to one of these tools, you probably don't need Segment. If you're using other analytics tools like Google Analytics and FullStory, Segment makes it easy to send events to all your tools at once.
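
As a rough illustration of that "track once, fan out to every tool" model, here is a minimal sketch using Segment's analytics-python library; the write key, user ID, event name, and properties are placeholders.

# A minimal sketch, assuming the analytics-python library.
import analytics

analytics.write_key = "YOUR_WRITE_KEY"  # placeholder

# One track() call; Segment forwards the event to every enabled destination.
analytics.track("user_123", "Signed Up", {"plan": "pro"})
analytics.flush()  # force delivery before the script exits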


    Splunk

    Search, monitor, analyze and visualize machine data
    PROS OF SPLUNK
    • Ability to style search results into reports (2)
    • Alert system based on custom query results (2)
    • API for searching logs, running reports (2)
    • Query engine supports joining, aggregation, stats, etc. (2)
    • Query any log as key-value pairs (1)
    • Splunk language supports string, date manipulation, math, etc. (1)
    • Granular scheduling and time window support (1)
    • Custom log parsing as well as automatic parsing (1)
    • Dashboarding on any log contents (1)
    • Rich GUI for searching live logs (1)
    CONS OF SPLUNK
    • Splunk query language is rich, so there is a lot to learn (1)
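
To illustrate the search API mentioned in the pros above, here is a minimal sketch using the splunk-sdk Python package (splunklib) to run a one-shot search; host, credentials, and the search string are placeholders, and newer SDK versions prefer JSONResultsReader.

# A minimal sketch, assuming the splunk-sdk package and a reachable management port.
import splunklib.client as client
import splunklib.results as results

service = client.connect(host="localhost", port=8089,
                         username="admin", password="changeme")

# Run a blocking one-shot search and print each result row.
rr = service.jobs.oneshot("search index=_internal error | head 10")
for item in results.ResultsReader(rr):   # JSONResultsReader in newer SDKs
    print(item)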

    related Splunk posts

    Shared insights on Kibana, Splunk, and Grafana

    I use Kibana because it ships with the ELK stack. I don't find it as powerful as Splunk however it is light years above grepping through log files. We previously used Grafana but found it to be annoying to maintain a separate tool outside of the ELK stack. We were able to get everything we needed from Kibana.

    Shared insights on Splunk and Elasticsearch

    We are currently exploring Elasticsearch and Splunk for our centralized logging solution. I need some feedback about these two tools. We expect upwards of 10 TB of logging data.


    Google BigQuery

    Analyze terabytes of data in seconds
    PROS OF GOOGLE BIGQUERY
    • High Performance (27)
    • Easy to use (24)
    • Fully managed service (21)
    • Cheap Pricing (19)
    • Process hundreds of GB in seconds (16)
    • Full table scans in seconds, no indexes needed (11)
    • Big Data (11)
    • Always on, no per-hour costs (8)
    • Good combination with Fluentd (6)
    • Machine learning (4)
    CONS OF GOOGLE BIGQUERY
    • You can't unit test changes in BQ data (1)

    related Google BigQuery posts

    Context: I wanted to create an end-to-end IoT data pipeline simulation in Google Cloud IoT Core and other GCP services. I never touched Terraform meaningfully until working on this project, and it's one of the best explorations in my development career. The documentation and syntax are incredibly human-readable and friendly. I'm used to building infrastructure through the Google APIs via Python, but I'm so glad past Sung did not make that decision. I was tempted to use Google Cloud Deployment Manager, but the templates were a bit convoluted by first impression. I'm glad past Sung did not make this decision either.

    Solution: Leveraging Google Cloud Build, Google Cloud Run, Google Cloud Bigtable, Google BigQuery, Google Cloud Storage, and Google Compute Engine, along with some other fun tools, I can deploy over 40 GCP resources using Terraform!

    Check Out My Architecture: CLICK ME

    Check out the GitHub repo attached

    Tim Specht
    Co-Founder and CTO at Dubsmash

    In order to accurately measure and track user behaviour on our platform, we quickly moved from our initial Google Analytics solution to a custom-built one due to resource and pricing concerns.

    While this does sound complicated, it’s as easy as clients sending JSON blobs of events to Amazon Kinesis from where we use AWS Lambda & Amazon SQS to batch and process incoming events and then ingest them into Google BigQuery. Once events are stored in BigQuery (which usually only takes a second from the time the client sends the data until it’s available), we can use almost-standard-SQL to simply query for data while Google makes sure that, even with terabytes of data being scanned, query times stay in the range of seconds rather than hours. Before ingesting their data into the pipeline, our mobile clients are aggregating events internally and, once a certain threshold is reached or the app is going to the background, sending the events as a JSON blob into the stream.

    In the past we had workers running that continuously read from the stream and would validate and post-process the data and then enqueue them for other workers to write them to BigQuery. We went ahead and implemented the Lambda-based approach in such a way that Lambda functions would automatically be triggered for incoming records, pre-aggregate events, and write them back to SQS, from which we then read them, and persist the events to BigQuery. While this approach had a couple of bumps on the road, like re-triggering functions asynchronously to keep up with the stream and proper batch sizes, we finally managed to get it running in a reliable way and are very happy with this solution today.

    #ServerlessTaskProcessing #GeneralAnalytics #RealTimeDataProcessing #BigDataAsAService
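
As a minimal sketch of the "almost-standard-SQL" querying described above, here is a query through the google-cloud-bigquery client; the project, dataset, table, and column names are placeholders, and credentials come from the environment.

# A minimal sketch with the google-cloud-bigquery client; names are placeholders.
from google.cloud import bigquery

client = bigquery.Client()  # uses application default credentials

query = """
    SELECT event_name, COUNT(*) AS events
    FROM `my-project.analytics.events`
    WHERE event_date >= '2024-01-01'
    GROUP BY event_name
    ORDER BY events DESC
    LIMIT 10
"""
for row in client.query(query).result():
    print(row.event_name, row.events)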


    Amazon Redshift

    Fast, fully managed, petabyte-scale data warehouse service
    PROS OF AMAZON REDSHIFT
    • Data Warehousing (40)
    • Scalable (27)
    • SQL (17)
    • Backed by Amazon (14)
    • Encryption (5)
    • Cheap and reliable (1)
    • Isolation (1)
    • Best Cloud DW Performance (1)
    • Fast columnar storage (1)
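Because Redshift speaks the PostgreSQL wire protocol, a minimal sketch of querying it from Python can use psycopg2 (Amazon's redshift_connector package works similarly); the cluster endpoint, credentials, and table name below are placeholders.

# A minimal sketch: Redshift is queried over the PostgreSQL protocol.
import psycopg2

conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder endpoint
    port=5439, dbname="analytics", user="awsuser", password="...")
with conn.cursor() as cur:
    cur.execute("SELECT date_trunc('day', created_at) AS day, count(*) "
                "FROM events GROUP BY 1 ORDER BY 1")
    for day, n in cur.fetchall():
        print(day, n)
conn.close()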

      related Amazon Redshift posts

      Julien DeFrance
      Principal Software Engineer at Tophatter

      Back in 2014, I was given an opportunity to re-architect the SmartZip Analytics platform and its flagship product, SmartTargeting. This is a SaaS product that helps real estate professionals keep up with their prospects and leads in a given neighborhood/territory, find out (thanks to predictive analytics) who is most likely to list/sell their home, and run cross-channel marketing automation against them: direct mail, online ads, email... The company also provides Data APIs to Enterprise customers.

      I had inherited years and years of technical debt and I knew things had to change radically. The first enabler to this was to make use of the cloud and go with AWS, so we would stop re-inventing the wheel, and build around managed/scalable services.

      For the SaaS product, we kept on working with Rails as this was what my team had the most knowledge in. We've however broken up the monolith and decoupled the front-end application from the backend thanks to the use of Rails API so we'd get independently scalable micro-services from now on.

      Our various applications could now be deployed using AWS Elastic Beanstalk so we wouldn't waste any more efforts writing time-consuming Capistrano deployment scripts for instance. Combined with Docker so our application would run within its own container, independently from the underlying host configuration.

      Storage-wise, we went with Amazon S3 and ditched any pre-existing local or network storage people used to deal with in our legacy systems. On the database side: Amazon RDS / MySQL initially. Ultimately migrated to Amazon RDS for Aurora / MySQL when it got released. Once again, here you need a managed service your cloud provider handles for you.

      Future improvements / technology decisions included:

      • Caching: Amazon ElastiCache / Memcached
      • CDN: Amazon CloudFront
      • Systems integration: Segment / Zapier
      • Data warehousing: Amazon Redshift
      • BI: Amazon Quicksight / Superset
      • Search: Elasticsearch / Amazon Elasticsearch Service / Algolia
      • Monitoring: New Relic

      As our usage grows, patterns changed, and/or our business needs evolved, my role as Engineering Manager then Director of Engineering was also to ensure my team kept on learning and innovating, while delivering on business value.

      One of these innovations was to get ourselves into serverless: adopting AWS Lambda was a big step forward. At the time it was only available for Node.js (not Ruby), but it was a great way to handle cost efficiency, unpredictable traffic, and sudden bursts of traffic... Ultimately you want the whole chain of services involved in a call to be serverless, and that's when we started leveraging Amazon DynamoDB on these projects so they'd be fully scalable.

      Ankit Sobti

      Looker, Stitch, Amazon Redshift, dbt

      We recently moved our Data Analytics and Business Intelligence tooling to Looker. It's already helping us create a solid process for reusable SQL-based data modeling, with consistent definitions across the entire organization. Looker allows us to collaboratively build these version-controlled models and push the limits of what we've traditionally been able to accomplish with analytics with a lean team.

      For Data Engineering, we're in the process of moving from maintaining our own ETL pipelines on AWS to a managed ELT system on Stitch. We're also evaluating the command-line tool dbt to manage data transformations. Our hope is that Stitch + dbt will streamline the ELT bit, allowing us to focus our energies on analyzing data, rather than managing it.


      Snowflake

      The data warehouse built for the cloud
      PROS OF SNOWFLAKE
      • Public and Private Data Sharing (4)
      • Good Performance (3)
      • Multicloud (3)
      • Great Documentation (2)
      • User Friendly (2)
      • Serverless (2)
      • Innovative (1)
      • Usage based billing (1)
      • Economical (1)
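As a minimal sketch of the "no infrastructure to manage" workflow described above, here is a connection and query using the snowflake-connector-python package; the account identifier, credentials, warehouse, and database are placeholders.

# A minimal sketch with the snowflake-connector-python package; names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345", user="ANALYST", password="...",
    warehouse="COMPUTE_WH", database="ANALYTICS", schema="PUBLIC")
cur = conn.cursor()
try:
    cur.execute("SELECT CURRENT_VERSION()")
    print(cur.fetchone())
finally:
    cur.close()
    conn.close()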

        related Snowflake posts

        I'm wondering if any Cloud Firestore users might be open to sharing some input and challenges encountered when trying to create a low-cost, low-latency data pipeline to their Analytics warehouse (e.g. Google BigQuery, Snowflake, etc...)

        I'm working with a platform named Estuary.dev, an ETL/ELT tool, and we are conducting some research on the pain points here to see if there are drawbacks to the Firestore->BQ extension and/or if users are seeking easier ways to get NoSQL data into fine-grained tabular form.

        Please feel free to drop some knowledge/wish list stuff on me for a better pipeline here!

        Shared insights on Google BigQuery and Snowflake

        I use Google BigQuery because it makes it super easy to query and store data for analytics workloads. If you're using GCP, you're likely using BigQuery. However, running data viz tools directly connected to BigQuery will run pretty slow. They recently announced BI Engine, which will hopefully compete well against big players like Snowflake when it comes to concurrency.

        What's nice too is that it has SQL-based ML tools, and it has great GIS support!
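
The SQL-based ML the post mentions is BigQuery ML. As a minimal sketch (the project, dataset, table, and column names are placeholders), a model can be trained with a single SQL statement issued through the same Python client used above.

# A minimal sketch of BigQuery ML via the google-cloud-bigquery client.
from google.cloud import bigquery

client = bigquery.Client()
client.query("""
    CREATE OR REPLACE MODEL `my-project.analytics.churn_model`
    OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
    SELECT plan, country, sessions_last_30d, churned
    FROM `my-project.analytics.users`
""").result()  # wait for training to finish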


        Amazon EMR

        Distribute your data and processing across Amazon EC2 instances using Hadoop
        PROS OF AMAZON EMR
        • On demand processing power (15)
        • Don't need to maintain Hadoop Cluster yourself (12)
        • Hadoop Tools (7)
        • Elastic (6)
        • Backed by Amazon (4)
        • Flexible (3)
        • Economic - pay as you go, easy to use CLI and SDKs (3)
        • Don't need a dedicated Ops group (2)
        • Massive data handling (1)
        • Great support (1)
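As a minimal sketch of the "on demand processing power" pro above, here is a boto3 call that launches a transient EMR cluster to run one Spark step; the release label, instance types, S3 path, and IAM role names are placeholders and may not match your account.

# A minimal sketch of launching a transient EMR cluster with boto3; names are placeholders.
import boto3

emr = boto3.client("emr", region_name="us-east-1")
resp = emr.run_job_flow(
    Name="nightly-spark-etl",
    ReleaseLabel="emr-6.15.0",
    Applications=[{"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
            {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
        ],
        "KeepJobFlowAliveWhenNoSteps": False,  # terminate when the step finishes
    },
    Steps=[{
        "Name": "spark-step",
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "s3://my-bucket/jobs/etl.py"],
        },
    }],
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(resp["JobFlowId"])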



          Stitch

          All your data. In your data warehouse. In minutes.
          PROS OF STITCH
          • 3 minutes to set up (8)
          • Super simple, great support (4)

            related Stitch posts

            Cyril Duchon-Doris

            Hello! For security and strategic reasons, we are migrating our apps from AWS/Google to Outscale, a cloud provider with more security certifications but fewer features. So far we have been using Google BigQuery as our data warehouse with ELT workflows (using Stitch and dbt), and we need to migrate our data ecosystem to this new cloud provider.

            We are setting up a Kubernetes cluster in our new cloud provider for our apps. Regarding the data warehouse, it's not clear whether there are advantages or drawbacks to setting it up on Kubernetes (apart from having to create node groups and tolerations with more RAM/CPU). Also, we are not sure what the best open-source or on-premise tool would be. The main requirement is that data must remain in the secure cluster, and no external entity (especially US-based) can have access to it. We have a dev cluster/environment and a production cluster/environment on this cloud.

            Regarding the actual DWH usage:
            • Today we have ~1.5 TB in BigQuery in production. We're going to run our initial tests with ~50-100 GB of data on our test cluster.
            • Most of our data comes from other databases, so in most cases we already have replicated sources somewhere; there are only a handful of collections whose source is directly in the DWH (such as snapshots, some external data we've fetched at some point, Google Analytics, etc.), and these need an appropriate level of replication.
            • We are a team of 30-ish people, we do not have critical needs regarding analytics speed, and we do not need real time. We rebuild our dbt models 2-3 times a day and this usually proves enough.

            Apart from PostgreSQL, I haven't really found open-source or on-premise alternatives for setting up a data warehouse and running transformations with dbt. There is also the question of data ingestion: I've selected Airbyte and @meltano, and I have trouble understanding whether one of the two is better, but Airbyte seems to have a bigger community.

            What do you suggest regarding the data warehouse and the ELT workflows?
            • Kubernetes or not Kubernetes?
            • PostgreSQL or something else? If PostgreSQL, what are the important configs you'd have in mind?
            • Airbyte/dbt or something else?
