What is Spark NLP?
It is a Natural Language Processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment. It comes with more than 160 pretrained pipelines and models in more than 20 languages.
Spark NLP is a tool in the NLP / Sentiment Analysis category of a tech stack.
Spark NLP is an open source tool with 3.9K GitHub stars and 719 GitHub forks. Its source repository is hosted on GitHub.
Who uses Spark NLP?
Companies
5 companies reportedly use Spark NLP in their tech stacks, including Newzera, Ukuli Data, and Multivac DSL.
Developers
22 developers on StackShare have stated that they use Spark NLP.
Spark NLP's Features
- Tokenization
- Stop Words Removal
- Normalizer
- Stemmer
- Lemmatizer
- NGrams
- Regex Matching
- Text Matching
- Chunking
- Date Matcher
- Part-of-speech tagging
- Sentence Detector
- Dependency parsing (labeled/unlabeled)
- Sentiment Detection (ML models)
- Spell Checker (ML and DL models)
- Word Embeddings (GloVe and Word2Vec)
- BERT Embeddings
- ELMO Embeddings
- Universal Sentence Encoder
- Sentence Embeddings
- Chunk Embeddings
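To make a few of the listed features concrete, here is a minimal plain-Python sketch of what tokenization, normalization, stop words removal, and n-gram generation do when chained together. This is not the Spark NLP API and does not use Spark at all. The function names, the tiny `STOP_WORDS` set, and the output keys are assumptions made up for illustration only.

```python
import re
from typing import Dict, List

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"a", "an", "the", "is", "on", "of", "and", "to", "in"}


def tokenize(text: str) -> List[str]:
    """Split text into tokens: runs of letters, digits, or apostrophes."""
    return re.findall(r"[A-Za-z0-9']+", text)


def normalize(tokens: List[str]) -> List[str]:
    """Lowercase each token and drop any non-alphanumeric characters."""
    cleaned = (re.sub(r"[^a-z0-9]", "", t.lower()) for t in tokens)
    return [t for t in cleaned if t]


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Filter out tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def ngrams(tokens: List[str], n: int = 2) -> List[str]:
    """Build space-joined n-grams from consecutive tokens."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def run_pipeline(text: str, n: int = 2) -> Dict[str, List[str]]:
    """Chain the stages, keeping each stage's output under its own key."""
    tokens = tokenize(text)
    normalized = normalize(tokens)
    clean = remove_stop_words(normalized)
    return {
        "token": tokens,
        "normalized": normalized,
        "clean": clean,
        "ngram": ngrams(clean, n),
    }


if __name__ == "__main__":
    result = run_pipeline("Spark NLP is built on top of Apache Spark.")
    for stage, output in result.items():
        print(stage, output)
```

Each stage consumes the previous stage's output and stores its own result separately, which loosely mirrors how annotation pipelines add one output column per step. A real distributed pipeline would run these steps over a DataFrame rather than a single string.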
Spark NLP Alternatives & Comparisons
What are some alternatives to Spark NLP?
Postman
It is the only complete API development environment, used by nearly five million developers and more than 100,000 companies worldwide.
Stack Overflow
Stack Overflow is a question and answer site for professional and enthusiast programmers. It's built and run by you as part of the Stack Exchange network of Q&A sites. With your help, we're working together to build a library of detailed answers to every question about programming.
Google Maps
Create rich applications and stunning visualisations of your data, leveraging the comprehensiveness, accuracy, and usability of Google Maps and a modern web platform that scales as you grow.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of storing data and searching it in near real time. Elasticsearch, Kibana, Beats and Logstash are the Elastic Stack (sometimes called the ELK Stack).