It is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. The toolkit is designed to be parallel among more than 70 languages, using the Universal Dependencies formalism. | Haystack is an open source NLP framework to interact with your data using Transformer models and LLMs (GPT-4, ChatGPT, etc.). It offers production-ready tools to build NLP backend services, e.g., question answering or semantic search. |
Native Python implementation requiring minimal efforts to set up;
Full neural network pipeline for robust text analytics, including tokenization, multi-word token (MWT) expansion, lemmatization, part-of-speech (POS) and morphological features tagging, dependency parsing, and named entity recognition;
Pretrained neural models supporting 66 (human) languages;
A stable, officially maintained Python interface to CoreNLP | Question answering;
Semantic document search;
Latest models;
Vector databases;
Scalable pipelines; |
Statistics | |
GitHub Stars 7.6K | GitHub Stars 23.2K |
GitHub Forks 926 | GitHub Forks 2.5K |
Stacks 9 | Stacks 6 |
Followers 34 | Followers 12 |
Votes 0 | Votes 0 |
Integrations | |

rasa NLU (Natural Language Understanding) is a tool for intent classification and entity extraction. You can think of rasa NLU as a set of high level APIs for building your own language parser using existing NLP and ML libraries.

It is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. It comes with pre-trained statistical models and word vectors, and currently supports tokenization for 49+ languages.

It can be used to complement any regular touch user interface with a real time voice user interface. It offers real time feedback for faster and more intuitive experience that enables end user to recover from possible errors quickly and with no interruptions.

Turn emails, tweets, surveys or any text into actionable data. Automate business workflows and saveExtract and classify information from text. Integrate with your App within minutes. Get started for free.

It is geared towards building search systems for any kind of data, including text, images, audio, video and many more. With the modular design & multi-layer abstraction, you can leverage the efficient patterns to build the system by parts, or chaining them into a Flow for an end-to-end experience.

That transforms AI-generated content into natural, undetectable human-like writing. Bypass AI detection systems with intelligent text humanization technology

It is a framework built around LLMs. It can be used for chatbots, generative question-answering, summarization, and much more. The core idea of the library is that we can “chain” together different components to create more advanced use cases around LLMs.

It allows you to run open-source large language models, such as Llama 2, locally.

It provides an easy method to compute dense vector representations for sentences, paragraphs, and images. The models are based on transformer networks like BERT / RoBERTa / XLM-RoBERTa etc. and achieve state-of-the-art performance in various tasks.

It is a project that provides a central interface to connect your LLMs with external data. It offers you a comprehensive toolset trading off cost and performance.