Help developers discover the tools you use. Get visibility for your team's tech choices and contribute to the community's knowledge.
Lucene Core, our flagship sub-project, provides Java-based indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities. | rasa NLU (Natural Language Understanding) is a tool for intent classification and entity extraction. You can think of rasa NLU as a set of high level APIs for building your own language parser using existing NLP and ML libraries. |
over 150GB/hour on modern hardware;small RAM requirements -- only 1MB heap;incremental indexing as fast as batch indexing;index size roughly 20-30% the size of text indexed;ranked searching -- best results returned first;many powerful query types: phrase queries, wildcard queries, proximity queries, range queries;fielded searching (e.g. title, author, contents);sorting by any field;multiple-index searching with merged results;allows simultaneous update and searching;flexible faceting, highlighting, joins and result grouping;fast, memory-efficient and typo-tolerant suggesters;pluggable ranking models, including the Vector Space Model and Okapi BM25;configurable storage engine (codecs) | Open source; NLP; Machine learning |
Statistics | |
Stacks 175 | Stacks 120 |
Followers 230 | Followers 282 |
Votes 2 | Votes 25 |
Pros & Cons | |
Pros
| Pros
Cons
|
Integrations | |

It lets you either batch index and search data stored in an SQL database, NoSQL storage, or just files quickly and easily — or index and search data on the fly, working with it pretty much as with a database server.

It is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. It comes with pre-trained statistical models and word vectors, and currently supports tokenization for 49+ languages.

It builds completely static HTML sites that you can host on GitHub pages, Amazon S3, or anywhere else you choose. There's a stack of good looking themes available. The built-in dev-server allows you to preview your documentation as you're writing it. It will even auto-reload and refresh your browser whenever you save your changes.

It can be used to complement any regular touch user interface with a real time voice user interface. It offers real time feedback for faster and more intuitive experience that enables end user to recover from possible errors quickly and with no interruptions.

Turn emails, tweets, surveys or any text into actionable data. Automate business workflows and saveExtract and classify information from text. Integrate with your App within minutes. Get started for free.

It is geared towards building search systems for any kind of data, including text, images, audio, video and many more. With the modular design & multi-layer abstraction, you can leverage the efficient patterns to build the system by parts, or chaining them into a Flow for an end-to-end experience.

Search the world's information, including webpages, images, videos and more. Google has many special features to help you find exactly what you're looking for.

It provides an easy method to compute dense vector representations for sentences, paragraphs, and images. The models are based on transformer networks like BERT / RoBERTa / XLM-RoBERTa etc. and achieve state-of-the-art performance in various tasks.

An open-source, high-performance, distributed SQL database built for resilience and scale. Re-uses the upper half of PostgreSQL to offer advanced RDBMS features, architected to be fully distributed like Google Spanner.

It is an open-source, free, lightweight library that allows users to learn text representations and text classifiers. It works on standard, generic hardware. Models can later be reduced in size to even fit on mobile devices.