Need advice about which tool to choose? Ask the StackShare community!

Docparser vs OpenPDF: What are the differences?

Docparser: Extract Data From PDF Files & Automate Your Business. Docparser is a cloud-based document processing solution and workflow automation software. Docparser makes it easy to convert PDF documents into structured data and automate document-based workflows.

OpenPDF: A free Java library for creating and editing PDF files. OpenPDF is a free Java library for creating and editing PDF files, released under the LGPL and MPL open source licenses. OpenPDF is based on a fork of iText.

Docparser and OpenPDF can be categorized as "File Conversion" tools.

OpenPDF is an open source tool with 1.12K GitHub stars and 156 GitHub forks.

Advice on Docparser and OpenPDF
Needs advice on Ghostscript, OpenPDF, and PDF.js

Users are uploading huge PDF files of more than 100MB to our platform. We are creating several tools to manage those files, but keeping the raw files will eat up space, as we are handling many of them. After upload, they will mainly be kept in storage for future use.

I am looking for a tool to compress and optimize those PDFs, like a library or an external API that can process that for us.

Thanks

Replies (2)
Andres Montalban recommends Ghostscript

I have been using Ghostscript and Python to render PDF pages as JPEG images, and that has reduced our PDF sizes. But if your files average 100MB, they probably contain hi-res images, and I am not sure your users will accept a quality reduction.
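The reply does not include its commands, so the following is only a sketch of how Ghostscript could be driven from Python. It assumes the Ghostscript executable is installed and named gs (on Windows it is usually gswin64c), and the function names are hypothetical. The first builder rewrites a PDF through the pdfwrite device with a quality preset, which is a different technique from the JPEG rendering the reply describes; the second renders each page to a JPEG file.

```python
import shutil
import subprocess


def build_compress_command(src, dst, preset="/ebook"):
    """Build a Ghostscript command that rewrites src as a (usually smaller) PDF.

    preset is one of Ghostscript's PDFSETTINGS values:
    /screen, /ebook, /printer, /prepress or /default.
    """
    return [
        "gs",                          # assumed executable name
        "-sDEVICE=pdfwrite",           # write a new PDF
        f"-dPDFSETTINGS={preset}",     # controls image downsampling
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        str(src),
    ]


def build_jpeg_command(src, out_pattern, dpi=150, quality=80):
    """Build a Ghostscript command that renders each page of src to a JPEG.

    out_pattern should contain a page counter such as page-%03d.jpg.
    """
    return [
        "gs",
        "-sDEVICE=jpeg",
        f"-r{dpi}",                    # render resolution in dots per inch
        f"-dJPEGQ={quality}",          # JPEG quality, 0 to 100
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={out_pattern}",
        str(src),
    ]


def run(command):
    """Run a prepared command, failing early if Ghostscript is missing."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]!r} was not found on PATH")
    subprocess.run(command, check=True)


# Example (needs Ghostscript and a real input file):
# run(build_compress_command("upload.pdf", "upload-small.pdf", preset="/screen"))
```

Lower presets such as /screen shrink files the most but downsample images hard, which is the quality trade-off the reply warns about, so it is worth comparing output on a few representative uploads before picking one.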

Recommends BunnyCDN

You can store the raw files on a CDN service like BunnyCDN. If users want to work with the raw files, you can fetch them from the CDN. Compression is not a lasting solution to a storage-space problem. A CDN is also safer, because CDN providers replicate your files across more than one server.

Modern CDN solutions support FTP and SSH, so you can easily send files to them.
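As a rough illustration of the FTP route, here is a minimal sketch using Python's standard ftplib. It assumes the storage provider accepts FTP over TLS; the host, credentials, and function names are placeholders, not details of any particular CDN.

```python
from ftplib import FTP_TLS
from pathlib import Path


def connect(host, user, password):
    """Open an FTP-over-TLS session (assumes the provider supports FTPS)."""
    ftp = FTP_TLS(host)
    ftp.login(user, password)
    ftp.prot_p()  # encrypt the data channel as well as the control channel
    return ftp


def upload_file(ftp, local_path, remote_name=None):
    """Upload one file in binary mode over an already-connected session."""
    local_path = Path(local_path)
    remote_name = remote_name or local_path.name
    with local_path.open("rb") as fh:
        ftp.storbinary(f"STOR {remote_name}", fh)
    return remote_name


# Example with placeholder values (requires a real server):
# ftp = connect("storage.example.com", "my-zone", "my-password")
# upload_file(ftp, "uploads/report.pdf")
# ftp.quit()
```

Keeping the upload step separate from the connection makes it easy to reuse one session for many files, which matters when each file is 100MB or more.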


What is Docparser?

Docparser is a cloud-based document processing solution and workflow automation software. Docparser makes it easy to convert PDF documents into structured data and automate document-based workflows.

What is OpenPDF?

OpenPDF is a free Java library for creating and editing PDF files, released under the LGPL and MPL open source licenses. OpenPDF is based on a fork of iText.

What are some alternatives to Docparser and OpenPDF?
Pandoc
It is a free and open-source document converter, widely used as a writing tool and as a basis for publishing workflows. It converts files from one markup format into another. It can convert documents in (several dialects of) Markdown, reStructuredText, textile, HTML, DocBook, LaTeX, MediaWiki markup, TWiki and many more.
PDF.js
It is a Portable Document Format (PDF) viewer that is built with HTML5. It is community-driven and supported by Mozilla Labs. The goal is to create a general-purpose, web standards-based platform for parsing and rendering PDFs.
Typ
It is a simple typesetting application. Turn plain Markdown into a formatted PDF, ready for print. Focus on content, not formatting.
pdfmake
pdfmake, client/server side PDF printing in pure JavaScript.
wkhtmltopdf
wkhtmltopdf and wkhtmltoimage are command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service.