What is Spark NLP?
It is a Natural Language Processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. It comes with 160+ pretrained pipelines and models in 20+ languages.
Spark NLP is a tool in the NLP / Sentiment Analysis category of a tech stack.
Spark NLP is an open source tool with 1.7K GitHub stars and 355 GitHub forks.
Who uses Spark NLP?
8 developers on StackShare have stated that they use Spark NLP.
Spark NLP's Features
- Stop Words Removal
- Regex Matching
- Text Matching
- Date Matcher
- Part-of-speech tagging
- Sentence Detector
- Dependency parsing (Labeled/unlabeled)
- Sentiment Detection (ML models)
- Spell Checker (ML and DL models)
- Word Embeddings (GloVe and Word2Vec)
- BERT Embeddings
- ELMo Embeddings
- Universal Sentence Encoder Sentence Embeddings
- Chunk Embeddings
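To make the idea of chained annotations concrete, here is a minimal sketch in plain Python of how three of the features above (sentence detection, stop words removal, regex matching) can run as stages over the same text. It uses only the standard library and is not Spark NLP's API: the function names, the tiny stop-word list, and the date pattern are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "on", "in", "of", "and", "to", "was"}


def detect_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def tokenize(sentence):
    """Return lowercase alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9']+", sentence.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def match_regex(text, pattern):
    """Return every non-overlapping match of pattern in text."""
    return re.findall(pattern, text)


def annotate(text):
    """Run each stage and collect its output under a named key."""
    sentences = detect_sentences(text)
    return {
        "sentences": sentences,
        "tokens": [remove_stop_words(tokenize(s)) for s in sentences],
        # Assumed ISO-style date pattern, e.g. 2020-01-15.
        "dates": match_regex(text, r"\b\d{4}-\d{2}-\d{2}\b"),
    }


result = annotate("The release was on 2020-01-15. Spark NLP is a library!")
print(result)
```

Each stage adds its own named output rather than overwriting the input, which mirrors the general annotation-pipeline idea. A distributed library would apply the same kind of stages to many documents in parallel.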
Spark NLP Alternatives & Comparisons
What are some alternatives to Spark NLP?
spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. It comes with pre-trained statistical models and word vectors, and currently supports tokenization for 49+ languages.
rasa NLU (Natural Language Understanding) is a tool for intent classification and entity extraction. You can think of rasa NLU as a set of high-level APIs for building your own language parser using existing NLP and ML libraries.
Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to discover insights from text. Amazon Comprehend provides Keyphrase Extraction, Sentiment Analysis, Entity Recognition, Topic Modeling, and Language Detection APIs so you can easily integrate natural language processing into your applications.
Google Cloud Natural Language API
You can use it to extract information about people, places, events and much more, mentioned in text documents, news articles or blog posts. You can use it to understand sentiment about your product on social media or parse intent from customer conversations happening in a call center or a messaging app. You can analyze text uploaded in your request or integrate with your document storage on Google Cloud Storage.
Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Its target audience is the natural language processing (NLP) and information retrieval (IR) community.