Our APIs use computer vision, machine learning and natural language processing to help developers extract and understand objects from any Web page. We've determined that the entire Web can be classified into approximately 18 structural page types. From this basic understanding of common page layouts, Diffbot then uses computer vision, natural language processing and other machine learning algorithms to identify and extract the important items from within these pages. | Python is a general-purpose programming language created by Guido van Rossum. It is most praised for its elegant syntax and readable code, which also makes it a good fit if you are just beginning your programming career. |
The Article API is used to extract clean article text from news article web pages.;The Follow API allows you to subscribe to the changes of any web page.;The Frontpage API takes in a multifaceted “homepage” and returns individual page elements.;[Limited Alpha] The Page Classifier API takes any web link and automatically determines what type of page it is.;Accurate: We utilize state-of-the-art computer vision and NLP algorithms, have the largest collection of tagged pages, and update our model several times per week.;Easy: Pass in a URL and we'll do the rest. Stop spending time building custom scrapers and -- even worse -- maintaining them.;Stable: Diffbot is built and run by Web veterans in a multi-tiered environment with redundancy, monitoring and scalability built in. Our scale lets us operate the service more cheaply than running it yourself.;Open: We use open standards (schema.org) and allow for endless configurability via our customization tool. | - |
Statistics | |
GitHub Stars - | GitHub Stars 69.7K |
GitHub Forks - | GitHub Forks 33.3K |
Stacks 16 | Stacks 262.9K |
Followers 30 | Followers 205.4K |
Votes 0 | Votes 6.9K |
Pros & Cons | |
No community feedback yet | Pros
Cons
|
Integrations | |

JavaScript is best known as the scripting language for Web pages, but it is also used in many non-browser environments, such as Node.js or Apache CouchDB. It is a prototype-based, multi-paradigm scripting language that is dynamic and supports object-oriented, imperative, and functional programming styles.

Fast, flexible and pragmatic, PHP powers everything from your blog to the most popular websites in the world.

Ruby is a language of careful balance. Its creator, Yukihiro “Matz” Matsumoto, blended parts of his favorite languages (Perl, Smalltalk, Eiffel, Ada, and Lisp) to form a new language that balanced functional programming with imperative programming.

Java is a programming language and computing platform first released by Sun Microsystems in 1995. There are lots of applications and websites that will not work unless you have Java installed, and more are created every day. Java is fast, secure, and reliable. From laptops to datacenters, game consoles to scientific supercomputers, cell phones to the Internet, Java is everywhere!

Go is expressive, concise, clean, and efficient. Its concurrency mechanisms make it easy to write programs that get the most out of multicore and networked machines, while its novel type system enables flexible and modular program construction. Go compiles quickly to machine code yet has the convenience of garbage collection and the power of run-time reflection. It's a fast, statically typed, compiled language that feels like a dynamically typed, interpreted language.

HTML5 is a core markup language of the Internet, used for structuring and presenting content for the World Wide Web. As of October 2014 this is the final and complete fifth revision of the HTML standard of the World Wide Web Consortium (W3C). The previous version, HTML 4, was standardised in 1997.

C# (pronounced "See Sharp") is a simple, modern, object-oriented, and type-safe programming language. C# has its roots in the C family of languages and will be immediately familiar to C, C++, Java, and JavaScript programmers.

The name Scala is short for “Scalable Language”. This means that Scala grows with you. You can play with it by typing one-line expressions and observing the results. But you can also rely on it for large mission-critical systems, as many companies, including Twitter, LinkedIn, and Intel, do. To some, Scala feels like a scripting language. Its syntax is concise and low ceremony; its types get out of the way because the compiler can infer them.

Elixir leverages the Erlang VM, known for running low-latency, distributed and fault-tolerant systems, while also being successfully used in web development and the embedded software domain.

Writing code is interactive and fun, the syntax is concise yet expressive, and apps run lightning-fast. Swift is ready for your next iOS and OS X project — or for addition into your current app — because Swift code works side-by-side with Objective-C.
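
The Diffbot description above says you "pass in a URL" and the service does the rest. As a rough illustration of what that might look like from Python, the sketch below only builds the request URL for the Article API. The endpoint `https://api.diffbot.com/v3/article` and the `token` and `url` query parameters are assumptions not stated on this page, and `build_article_request` is a hypothetical helper name, so check Diffbot's own documentation before relying on any of it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint, not taken from this page; verify against Diffbot's docs.
ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article"


def build_article_request(page_url: str, token: str) -> str:
    """Return a GET URL asking the Article API to process page_url.

    Hypothetical helper: the parameter names `token` and `url` are assumptions.
    """
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("page_url must be an absolute http(s) URL")
    # urlencode percent-encodes the target URL so its own query string
    # does not get mixed up with the API's parameters.
    query = urlencode({"token": token, "url": page_url})
    return f"{ARTICLE_ENDPOINT}?{query}"


if __name__ == "__main__":
    request_url = build_article_request("https://example.com/news?id=1", "YOUR_TOKEN")
    print(request_url)
    # Round-trip check: the target URL survives encoding intact.
    print(parse_qs(urlsplit(request_url).query)["url"][0])
```

The resulting URL could then be fetched with any HTTP client, for example `urllib.request.urlopen`. Keeping URL construction separate from the network call makes the encoding step easy to check without an API token.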