Docparser vs OpenPDF: What are the differences?
Docparser: Extract Data From PDF Files & Automate Your Business. Docparser is a cloud-based document processing solution and workflow automation software. Docparser makes it easy to convert PDF documents into structured data and automate document-based workflows.
OpenPDF: A free Java library for creating and editing PDF files. OpenPDF is a free Java library for creating and editing PDF files with an LGPL and MPL open source license. OpenPDF is based on a fork of iText.
Docparser and OpenPDF can be categorized as "File Conversion" tools.
OpenPDF is an open source tool with 1.12K GitHub stars and 156 GitHub forks.
Users are uploading huge PDF files of more than 100MB to our platform. We are creating several tools to manage those files, but keeping the raw files will eat up space, as we are handling a lot of them. After upload, they will mainly be kept in storage for future use.
I am looking for a tool to compress and optimize those PDFs, like a library or an external API that can process that for us.
Thanks
You can store the raw files on a CDN service like bunnyCdn. If users want to work with the raw files, you can fetch them from the CDN service. Compression is not a lasting solution to the space problem. A CDN is also the safer option, because CDN providers replicate your files across more than one server.
Modern CDN solutions support FTP / SSH, so you can easily send files to them.
I have been using Ghostscript and Python to get JPEG images from PDF files, and that way we have reduced PDF size. But if your average is 100MB, those are probably hi-res images, and I'm not sure your users will accept a quality reduction.
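To make that concrete, here is a minimal sketch of one way to read the Ghostscript + Python approach: rendering each PDF page to a JPEG at a lower resolution. It assumes the `gs` executable is on the PATH (on Windows it is usually named `gswin64c`). The function names, the 150 DPI resolution, and the JPEG quality of 80 are illustrative assumptions, not values from the answer above.

```python
import subprocess
from pathlib import Path


def build_gs_jpeg_command(pdf_path, out_dir, dpi=150, quality=80, gs_bin="gs"):
    """Build a Ghostscript command that renders each PDF page to a JPEG.

    gs_bin, dpi and quality are assumptions; tune them for your files.
    """
    # %03d is expanded by Ghostscript to the page number (001, 002, ...).
    output_pattern = str(Path(out_dir) / "page-%03d.jpg")
    return [
        gs_bin,
        "-dNOPAUSE",           # do not prompt between pages
        "-dBATCH",             # exit when the input has been processed
        "-dSAFER",             # restrict file access from the PDF
        "-sDEVICE=jpeg",       # render pages as JPEG images
        f"-r{dpi}",            # output resolution in DPI
        f"-dJPEGQ={quality}",  # JPEG quality, 0 to 100
        f"-sOutputFile={output_pattern}",
        str(pdf_path),
    ]


def render_pdf_to_jpegs(pdf_path, out_dir, dpi=150, quality=80):
    """Run Ghostscript and return the generated JPEG paths in page order."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_gs_jpeg_command(pdf_path, out_dir, dpi, quality)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return sorted(Path(out_dir).glob("page-*.jpg"))


# Example call (hypothetical file names):
# pages = render_pdf_to_jpegs("upload.pdf", "upload_pages", dpi=150, quality=80)
```

Lowering `dpi` or `quality` shrinks the output further, at the cost of the quality reduction mentioned above, so it is worth checking a few sample files with your users first.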