Alternatives to Qubole

Databricks, Snowflake, Amazon Redshift, Google BigQuery, and Amazon EMR are the most popular alternatives and competitors to Qubole.

What is Qubole and what are its top alternatives?

Qubole is a cloud-based service that makes big data easy for analysts and data engineers.
Qubole is a tool in the Big Data as a Service category of a tech stack.

Top Alternatives to Qubole

Qubole alternatives & related posts

Databricks

134
205
0
A unified analytics platform, powered by Apache Spark

      related Databricks posts

      Snowflake

      307
      294
      1
      The data warehouse built for the cloud

        related Snowflake posts

        Shared insights on Google BigQuery and Snowflake

        I use Google BigQuery because it makes it super easy to query and store data for analytics workloads. If you're using GCP, you're likely using BigQuery. However, data viz tools connected directly to BigQuery run pretty slowly. They recently announced BI Engine, which will hopefully compete well against big players like Snowflake when it comes to concurrency.

        What's nice too is that it has SQL-based ML tools, and it has great GIS support!

        Amazon Redshift

        1K
        778
        94
        Fast, fully managed, petabyte-scale data warehouse service

        related Amazon Redshift posts

        Julien DeFrance
        Principal Software Engineer at Tophatter · 16 upvotes · 1.9M views

        Back in 2014, I was given an opportunity to re-architect the SmartZip Analytics platform and its flagship product, SmartTargeting. This is a SaaS product that helps real estate professionals keep up with their prospects and leads in a given neighborhood/territory, find out (thanks to predictive analytics) who is most likely to list/sell their home, and run cross-channel marketing automation against them: direct mail, online ads, email... The company also provides Data APIs to Enterprise customers.

        I had inherited years and years of technical debt and I knew things had to change radically. The first enabler to this was to make use of the cloud and go with AWS, so we would stop re-inventing the wheel, and build around managed/scalable services.

        For the SaaS product, we kept working with Rails, as this was what my team knew best. We did, however, break up the monolith and decouple the front-end application from the backend by using Rails API, so we'd get independently scalable micro-services from then on.

        Our various applications could now be deployed using AWS Elastic Beanstalk, so we wouldn't waste any more effort writing time-consuming Capistrano deployment scripts, for instance. We combined this with Docker so each application would run within its own container, independently of the underlying host configuration.

        Storage-wise, we went with Amazon S3 and ditched any pre-existing local or network storage people used to deal with in our legacy systems. On the database side, we started with Amazon RDS / MySQL and ultimately migrated to Amazon RDS for Aurora / MySQL when it was released. Once again, this is where you want a managed service that your cloud provider handles for you.

        Future improvements / technology decisions included:

        Caching: Amazon ElastiCache / Memcached
        CDN: Amazon CloudFront
        Systems Integration: Segment / Zapier
        Data-warehousing: Amazon Redshift
        BI: Amazon Quicksight / Superset
        Search: Elasticsearch / Amazon Elasticsearch Service / Algolia
        Monitoring: New Relic

        As our usage grew, patterns changed, and our business needs evolved, my role as Engineering Manager and then Director of Engineering was also to ensure my team kept on learning and innovating while delivering business value.

        One of these innovations was getting into serverless: adopting AWS Lambda was a big step forward. At the time it was only available for Node.js (not Ruby), but it was a great way to handle cost efficiency, unpredictable traffic, and sudden bursts of traffic. Ultimately you want the whole chain of services involved in a call to be serverless, and that's when we started leveraging Amazon DynamoDB on these projects so they'd be fully scalable.

        Ankit Sobti

        Looker, Stitch, Amazon Redshift, dbt

        We recently moved our Data Analytics and Business Intelligence tooling to Looker. It's already helping us create a solid process for reusable SQL-based data modeling, with consistent definitions across the entire organization. Looker allows us to collaboratively build these version-controlled models and push the limits of what we've traditionally been able to accomplish with analytics with a lean team.

        For Data Engineering, we're in the process of moving from maintaining our own ETL pipelines on AWS to a managed ELT system on Stitch. We're also evaluating the command-line tool dbt to manage data transformations. Our hope is that Stitch + dbt will streamline the ELT bit, allowing us to focus our energies on analyzing data rather than managing it.

        Google BigQuery

        835
        655
        116
        Analyze terabytes of data in seconds

        related Google BigQuery posts

        Context: I wanted to create an end-to-end IoT data pipeline simulation in Google Cloud IoT Core and other GCP services. I had never touched Terraform meaningfully until working on this project, and it's one of the best explorations in my development career. The documentation and syntax are incredibly human-readable and friendly. I'm used to building infrastructure through the Google APIs via Python, but I'm so glad past Sung did not make that decision. I was tempted to use Google Cloud Deployment Manager, but the templates were a bit convoluted at first impression. I'm glad past Sung did not make this decision either.

        Solution: Leveraging Google Cloud Build, Google Cloud Run, Google Cloud Bigtable, Google BigQuery, Google Cloud Storage, and Google Compute Engine, along with some other fun tools, I can deploy over 40 GCP resources using Terraform!

        Check out the GitHub repo attached

        Tim Specht
        Co-Founder and CTO at Dubsmash · 14 upvotes · 504.7K views

        In order to accurately measure & track user behaviour on our platform we moved over quickly from the initial solution using Google Analytics to a custom-built one due to resource & pricing concerns we had.

        While this does sound complicated, it’s as easy as clients sending JSON blobs of events to Amazon Kinesis from where we use AWS Lambda & Amazon SQS to batch and process incoming events and then ingest them into Google BigQuery. Once events are stored in BigQuery (which usually only takes a second from the time the client sends the data until it’s available), we can use almost-standard-SQL to simply query for data while Google makes sure that, even with terabytes of data being scanned, query times stay in the range of seconds rather than hours. Before ingesting their data into the pipeline, our mobile clients are aggregating events internally and, once a certain threshold is reached or the app is going to the background, sending the events as a JSON blob into the stream.
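The client-side buffering described above (collect events, then send them as one JSON blob once a threshold is reached or the app goes to the background) might look roughly like the sketch below. It is an illustration only, not the actual client code: the `EventBuffer` class, the `send` callable, and the threshold of 3 are assumptions, and a real client would pass a function that writes to the stream.

```python
import json
from typing import Callable, Dict, List


class EventBuffer:
    """Collects events in memory and flushes them as a single JSON blob."""

    def __init__(self, send: Callable[[str], None], threshold: int = 20) -> None:
        self._send = send            # hypothetical transport, e.g. a put to the stream
        self._threshold = threshold  # assumed value; the post does not give a number
        self._events: List[Dict] = []

    def track(self, name: str, **properties) -> None:
        """Record one event and flush if the threshold has been reached."""
        self._events.append({"name": name, "properties": properties})
        if len(self._events) >= self._threshold:
            self.flush()

    def on_background(self) -> None:
        """Call when the app is about to go to the background."""
        self.flush()

    def flush(self) -> None:
        """Send all buffered events as one JSON array, then clear the buffer."""
        if not self._events:
            return
        blob = json.dumps(self._events)
        self._events = []
        self._send(blob)


# Usage: a plain list stands in for the stream.
sent: List[str] = []
buffer = EventBuffer(send=sent.append, threshold=3)
for i in range(4):
    buffer.track("video_viewed", index=i)
buffer.on_background()  # flushes the one event still buffered
```

Injecting `send` keeps the buffering logic independent of the transport, which also makes it easy to exercise without network access.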

        In the past we had workers running that continuously read from the stream and would validate and post-process the data and then enqueue them for other workers to write them to BigQuery. We went ahead and implemented the Lambda-based approach in such a way that Lambda functions would automatically be triggered for incoming records, pre-aggregate events, and write them back to SQS, from which we then read them, and persist the events to BigQuery. While this approach had a couple of bumps on the road, like re-triggering functions asynchronously to keep up with the stream and proper batch sizes, we finally managed to get it running in a reliable way and are very happy with this solution today.
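A minimal sketch of the Lambda step described above, under stated assumptions: each Kinesis record is taken to carry a JSON array of client events, "pre-aggregate" is interpreted here as counting events by name (the post does not say what the real aggregation is), and `send_batch` is a hypothetical callable standing in for the SQS write. The batch size of 10 reflects the SQS limit on messages per batch request.

```python
import base64
import json
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, List

SQS_MAX_BATCH = 10  # SQS accepts at most 10 messages per batch request


def decode_records(event: Dict) -> Iterator[Dict]:
    """Yield client events from a Kinesis-triggered Lambda payload."""
    for record in event["Records"]:
        payload = base64.b64decode(record["kinesis"]["data"])
        # Assumption: every record holds a JSON array of client events.
        yield from json.loads(payload)


def pre_aggregate(events: Iterable[Dict]) -> List[Dict]:
    """Assumed aggregation: count events per name, sorted by name."""
    counts = Counter(e["name"] for e in events)
    return [{"name": name, "count": count} for name, count in sorted(counts.items())]


def chunked(items: List[Dict], size: int) -> Iterator[List[Dict]]:
    """Split a list into consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def handler(event: Dict, send_batch: Callable[[List[str]], None]) -> int:
    """Pre-aggregate incoming records and hand them on in SQS-sized batches."""
    rows = pre_aggregate(decode_records(event))
    batches = 0
    for batch in chunked(rows, SQS_MAX_BATCH):
        send_batch([json.dumps(row) for row in batch])
        batches += 1
    return batches


def _record(events: List[Dict]) -> Dict:
    """Build a fake Kinesis record for local testing."""
    data = base64.b64encode(json.dumps(events).encode()).decode()
    return {"kinesis": {"data": data}}


# Usage: a plain list stands in for the queue.
queued: List[List[str]] = []
example_event = {
    "Records": [
        _record([{"name": "open"}, {"name": "view"}]),
        _record([{"name": "view"}]),
    ]
}
batch_count = handler(example_event, queued.append)
```

In a deployed function, `send_batch` would wrap the queue client, and a separate consumer would read from the queue and persist the rows to BigQuery.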

        #ServerlessTaskProcessing #GeneralAnalytics #RealTimeDataProcessing #BigDataAsAService

        Amazon EMR

        378
        355
        53
        Distribute your data and processing across Amazon EC2 instances using Hadoop

        related Amazon EMR posts

        Stitch

        81
        74
        9
        All your data. In your data warehouse. In minutes.

          related Stitch posts

          Cloudera Enterprise

          80
          97
          0
          Enterprise Platform for Big Data

              related Cloudera Enterprise posts

              Dremio

              48
              123
              4
              The data lake engine

              related Dremio posts