Spark NLP
By Dia

#18 in Text & Language Models
Discussions: 0
Followers: 38

What is Spark NLP?

Spark NLP is a Natural Language Processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. It comes with 160+ pretrained pipelines and models in more than 20 languages.

Spark NLP is a tool in the Text & Language Models category of a tech stack.

Key Features

  • Tokenization
  • Stop Words Removal
  • Normalizer
  • Stemmer
  • Lemmatizer
  • NGrams
  • Regex Matching
  • Text Matching
  • Chunking
  • Date Matcher
  • Part-of-speech tagging
  • Sentence Detector
  • Dependency parsing (labeled/unlabeled)
  • Sentiment Detection (ML models)
  • Spell Checker (ML and DL models)
  • Word Embeddings (GloVe and Word2Vec)
  • BERT Embeddings
  • ELMo Embeddings
  • Universal Sentence Encoder
  • Sentence Embeddings
  • Chunk Embeddings
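
To make the first few features concrete, here is a minimal plain-Python sketch of what tokenization, stop-word removal, and n-gram generation produce. It deliberately does not use Spark NLP's API: the function names, the regular expression, and the tiny `STOP_WORDS` set are all assumptions made up for illustration, and a real annotator handles far more cases (punctuation rules, languages, distributed execution).

```python
import re

# Hypothetical, very small stop-word list used only for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "on", "of", "and", "to", "in"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = tokenize("Spark NLP is built on top of Apache Spark ML.")
content = remove_stop_words(tokens)
bigrams = ngrams(content, 2)

print(tokens)
print(content)
print(bigrams)
```

Each step consumes the output of the previous one, which mirrors the pipeline idea in the description above: annotations are produced in stages and later stages build on earlier ones.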

Spark NLP Alternatives & Comparisons

What are some alternatives to Spark NLP?

Transformers

It provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.

SpaCy

It is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products. It comes with pre-trained statistical models and word vectors, and currently supports tokenization for 49+ languages.

rasa NLU

rasa NLU (Natural Language Understanding) is a tool for intent classification and entity extraction. You can think of rasa NLU as a set of high-level APIs for building your own language parser using existing NLP and ML libraries.

Gensim

It is a Python library for topic modelling, document indexing, and similarity retrieval with large corpora. Its target audience is the natural language processing (NLP) and information retrieval (IR) community.

Sentence Transformers

It provides an easy method to compute dense vector representations for sentences, paragraphs, and images. The models are based on transformer networks like BERT / RoBERTa / XLM-RoBERTa etc. and achieve state-of-the-art performance in various tasks.

Amazon Comprehend

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to discover insights from text. Amazon Comprehend provides Keyphrase Extraction, Sentiment Analysis, Entity Recognition, Topic Modeling, and Language Detection APIs so you can easily integrate natural language processing into your applications.

Adoption

On StackShare

Spark NLP Integrations

Python, Java, Scala, and TensorFlow are some of the popular tools that integrate with Spark NLP. Here's a list of all 4 tools that integrate with Spark NLP.

  • Python
  • Java
  • Scala
  • TensorFlow

Companies: 6

Developers: 22