Alternatives to Scrapy

Selenium, import.io, BeautifulSoup, Puppeteer, and Apify are the most popular alternatives and competitors to Scrapy.

What is Scrapy and what are its top alternatives?

Scrapy is the most popular web scraping framework in Python: an open source, collaborative framework for extracting the data you need from websites in a fast, simple, yet extensible way.
Scrapy is a tool in the Web Scraping API category of a tech stack.
Scrapy is an open source tool with 50.6K GitHub stars and 10.3K GitHub forks.
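
To make "extracting the data you need from websites" concrete, here is a minimal sketch of the kind of work Scrapy and its alternatives automate. It deliberately uses only Python's standard-library `html.parser`, not Scrapy's own API, and the names `LinkExtractor`, `extract_links`, and `SAMPLE` are hypothetical, chosen for illustration only.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a href=...> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample markup standing in for a downloaded page.
SAMPLE = (
    '<ul><li><a href="/a">First</a></li>'
    '<li><a href="/b">Second <b>item</b></a></li></ul>'
)

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

A full framework adds what this sketch leaves out: downloading pages, following links, scheduling requests, and exporting the results.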

Top Alternatives to Scrapy

  • Selenium

    Selenium automates browsers. That's it! What you do with that power is entirely up to you. Primarily, it is for automating web applications for testing purposes, but it is certainly not limited to that. Boring web-based administration tasks can (and should!) be automated as well. ...

  • import.io

    import.io is a free web-based platform that puts the power of the machine-readable web in your hands. Using our tools, you can create an API or crawl an entire website in a fraction of the time of traditional methods, with no coding required. ...

  • BeautifulSoup

    BeautifulSoup works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree. It commonly saves programmers hours or days of work. ...

  • Puppeteer

    Puppeteer is a Node library which provides a high-level API to control headless Chrome over the DevTools Protocol. It can also be configured to use full (non-headless) Chrome. ...

  • Apify

    Apify is a platform that enables developers to create, customize and run cloud-based programs called actors that can, among other things, be used to extract data from any website using a few lines of JavaScript. ...

  • ParseHub

    ParseHub is a free and powerful web scraping tool. With our advanced web scraper, extracting data is as easy as clicking on the data you need. ParseHub lets you turn any website into a spreadsheet or API w ...

  • Octoparse

    Octoparse is free client-side Windows web scraping software that turns unstructured or semi-structured data from websites into structured data sets, with no coding necessary. Extracted data can be accessed through an API, exported as CSV or Excel files, or loaded into a database. ...

  • Portia

    Portia is an open source tool that lets you get data from websites. It facilitates and automates the process of data extraction. This visual web scraper works straight from your browser, so you don't need to download or install anything. ...

Scrapy alternatives & related posts

Selenium

15.7K
12.3K
525
Web Browser Automation
PROS OF SELENIUM
  • 175
    Automates browsers
  • 154
    Testing
  • 101
    Essential tool for running test automation
  • 24
    Record-Playback
  • 24
    Remote Control
  • 8
    Data crawling
  • 7
    Supports end to end testing
  • 6
    Easy set up
  • 6
    Functional testing
  • 4
    The most flexible monitoring system
  • 3
    End to End Testing
  • 3
    Easy to integrate with build tools
  • 2
    Faster than jasm in performance comparisons
  • 2
    Record and playback
  • 2
    Compatible with Python
  • 2
    Easy to scale
  • 2
    Integration Tests
  • 0
    Integrated into Selenium-Jupiter framework
CONS OF SELENIUM
  • 8
    Flaky tests
  • 4
    Slow, since it needs to launch a browser (even with no GUI)
  • 2
    Browser drivers need updating

related Selenium posts

Kamil Kowalski
Lead Architect at Fresha · 28 upvotes · 3.8M views

When you think about test automation, it’s crucial to make it everyone’s responsibility (not just QA Engineers'). We started with Selenium and Java, but with our platform revolving around Ruby, Elixir and JavaScript, QA Engineers were left alone to automate tests. Cypress was the answer, as we could switch to JS and simply involve more people from day one. There's a downside too, as it meant testing on Chrome only, but that was "good enough" for us, and if really needed we can always cover specific cases in a different way.

Benjamin Poon
QA Manager - Engineering at HBC Digital · 8 upvotes · 1.9M views

For our digital QA organization to support a complex hybrid monolith/microservice architecture, our team took on the lofty goal of building out a common UI test automation framework. One of the primary requirements was a minimal technical threshold, so that an engineer or analyst with fundamental knowledge of JavaScript could automate their tests with greater ease. Just to list a few of the tools involved:
- Nightwatchjs
- Selenium
- Cucumber
- GitHub
- Go.CD
- Docker
- ExpressJS
- React
- PostgreSQL

With this structure, we're able to combine the automation efforts of each team member into a centralized repository while also providing new relevant metrics to business owners.

import.io

39
89
24
Extract data from the web
PROS OF IMPORT.IO
  • 8
    Easy setup
  • 5
    Native desktop app
  • 5
    Free lead generation tool
  • 3
    Continuous updates
  • 3
    Features based on user suggestions

BeautifulSoup

86
89
4
A Python library for pulling data out of HTML and XML files
PROS OF BEAUTIFULSOUP
  • 3
    Parses HTML even when poorly formed
  • 1
    It just works

related BeautifulSoup posts

Shared insights on ParseHub and BeautifulSoup

Which tool is best for web scraping, BeautifulSoup or ParseHub?
Puppeteer

887
567
26
Headless Chrome Node API
PROS OF PUPPETEER
  • 10
    Very well documented
  • 10
    Scriptable web browser
  • 6
    Promise based
CONS OF PUPPETEER
  • 10
    Chrome only

related Puppeteer posts

Raziel Alron
Automation Engineer at Tipalti · 7 upvotes · 2M views

Currently, we are using Protractor in our project. Since Protractor isn't updated anymore, we are looking for a new tool. The strongest suggestions are WebdriverIO or Puppeteer. Please help me figure out what tool would make the transition fastest and easiest. Please note that Protractor uses its own locator system, and we want the switch to be as simple as possible. Thank you!

I work in a company building web apps with AngularJS. I started using Selenium for test automation, as I am more familiar with Python. However, I ran into some difficulties, such as not being able to rely on IDs or fixed lists of classes, so I ended up using mostly XPaths, which unfortunately can break when the code is fixed or modified.

So, I started using Puppeteer, but I am still learning. It seems easier to find elements on the webpage, although creating and managing arrays of elements seems a little more complicated than in Selenium; that could also be due to my limited knowledge of JavaScript.

Any comments on this comparison, and on comparisons with similar tools, are welcome! :)
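
The locator fragility described in the post above can be sketched in a few lines. This is an illustration only: it uses Python's standard-library `xml.etree.ElementTree` on two hand-written page versions rather than Selenium or Puppeteer, and the `data-testid` attribute, the page snippets, and the helper `text_at` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Two versions of the same (hypothetical) page. In V2 a banner <div> was
# added above the existing content, shifting every element's position.
PAGE_V1 = """<html><body>
<div><span>Header</span></div>
<div><span data-testid="price">42</span></div>
</body></html>"""

PAGE_V2 = """<html><body>
<div><span>Banner</span></div>
<div><span>Header</span></div>
<div><span data-testid="price">42</span></div>
</body></html>"""

# A positional path depends on the exact layout of the page.
POSITIONAL = "./body/div[2]/span"
# An attribute-based path depends only on a marker attached to the element.
BY_ATTRIBUTE = ".//span[@data-testid='price']"


def text_at(page, path):
    """Return the text of the first element matching path, or None."""
    node = ET.fromstring(page).find(path)
    return None if node is None else node.text


if __name__ == "__main__":
    for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
        print(name, text_at(page, POSITIONAL), text_at(page, BY_ATTRIBUTE))
```

The positional path returns the price on the first version but silently picks up the header on the second, while the attribute-based path keeps working. Whether a stable attribute can be added depends on how much control the test author has over the application's markup.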

Apify

35
71
4
Cloud-based web scraping tool for developers
PROS OF APIFY
  • 4
    Perfect for JavaScript-heavy websites

ParseHub

32
89
19
Turn dynamic websites into APIs
PROS OF PARSEHUB
  • 6
    Great support
  • 5
    Easy setup
  • 5
    Complex websites
  • 3
    Native desktop app

Octoparse

31
79
12
A cloud-based web data extraction solution that helps users extract relevant information
PROS OF OCTOPARSE
  • 3
    Cloud extraction
  • 3
    Easy to use
  • 2
    API
  • 1
    Great support
  • 1
    Web Scraping Template
  • 1
    Auto-detection

Portia

26
66
0
Visual web scraping tool that lets you extract data without writing a single line of code