Amazon CloudSearch vs Groonga: What are the differences?
Developers describe Amazon CloudSearch as "Set up, manage, and scale a search solution for your website or application". Amazon CloudSearch enables you to search large collections of data such as web pages, document files, forum posts, or product information. With a few clicks in the AWS Management Console, you can create a search domain and upload the data you want to make searchable; the service then automatically provisions the required resources and deploys a highly tuned search index. On the other hand, Groonga is detailed as "An open-source full-text search engine and column store". It is an embeddable, very fast full-text search engine that can be embedded into MySQL; Mroonga is a MySQL storage engine based on it.
Amazon CloudSearch and Groonga can be primarily classified as "Search as a Service" tools.
Some of the features offered by Amazon CloudSearch are:
- Simple to Configure – You can make your data searchable using the AWS Management Console, API calls, or command line tools. Simply point to a sample set of data, and Amazon CloudSearch automatically proposes a list of index fields and a suggested configuration.
- Automatic Scaling For Data & Traffic – Amazon CloudSearch scales up and down seamlessly as the amount of data or query volume changes.
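As a rough illustration of the "API calls" route mentioned above, the sketch below builds a CloudSearch-style JSON document batch, in which each entry is an "add" operation (with an id and fields) or a "delete" operation (with an id). The helper name `build_document_batch` and the sample field names are hypothetical, and the actual upload step (for example through an AWS SDK) is deliberately left out.

```python
import json


def build_document_batch(adds, deletes=()):
    """Return a JSON document batch as a string.

    `adds` maps document id -> dict of index fields (hypothetical field names);
    `deletes` is an iterable of document ids to remove from the index.
    """
    # One "add" operation per document to index.
    ops = [
        {"type": "add", "id": doc_id, "fields": fields}
        for doc_id, fields in adds.items()
    ]
    # One "delete" operation per document to remove.
    ops += [{"type": "delete", "id": doc_id} for doc_id in deletes]
    return json.dumps(ops)


# Example: index one forum post and remove another.
batch = build_document_batch(
    {"post-1": {"title": "Hello", "body": "First forum post"}},
    deletes=["post-0"],
)
```

The resulting string is what would be sent to the search domain's document endpoint; the field names must match the index fields configured for the domain.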
On the other hand, Groonga provides the following key features:
- Storage Engine
- Easy to use
We send over 20 billion emails a month on behalf of our customers. As a result, we manage hundreds of millions of "suppression" records that track when an email address is invalid, as well as when a user unsubscribes or flags an email as spam. This way we can help ensure our customers only send email that their recipients want, which boosts overall delivery rates and engagement.
We need to support two primary use cases: (1) fast and reliable real-time lookups against the list when sending email, and (2) letting customers search, edit, and bulk upload/download their list via the API and in the UI. A single enterprise customer's list can contain well over 100 million records. This data started small and has grown steadily over the years, and along the way we tried multiple approaches that didn't scale very well.
In the recent past we used Amazon DynamoDB as the system of record, a cache in Amazon ElastiCache (Redis) for the fast lookups, and Amazon CloudSearch for the search function. This architecture was overly complicated and expensive. We were able to eliminate Redis, replacing it with direct lookups against DynamoDB, fronted by a stripped-down Node.js API that consistently responds in around 10ms. The new dynamic bursting of DynamoDB has helped ensure reliable and consistent performance for real-time lookups.
We also moved off the clunky and expensive CloudSearch to Amazon Elasticsearch Service for the search functionality. Beyond its high price tag, CloudSearch also had severe limits on streaming updates from DynamoDB, which forced us to batch them, adding extra complexity and CX challenges. We love the fact that DynamoDB can stream directly to Elasticsearch and believe using these two technologies together will handle our scaling needs in an economical way for the foreseeable future.
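The batching workaround described above is not shown in the original write-up; a minimal sketch of the general idea, assuming updates arrive as JSON-serializable operations and that each upload must stay under a per-batch size limit, might look like this. The function name `chunk_operations` and the 5 MB default are assumptions for illustration, not the authors' implementation.

```python
import json

# Assumed per-batch size limit; check the service's current quota before relying on it.
MAX_BATCH_BYTES = 5 * 1024 * 1024


def chunk_operations(operations, max_bytes=MAX_BATCH_BYTES):
    """Yield lists of operations whose JSON encoding stays within max_bytes.

    The size estimate is slightly conservative: it counts a ", " separator
    for every operation, including the first one in a batch.
    A single operation larger than max_bytes is still yielded on its own.
    """
    batch, size = [], 2  # 2 bytes for the enclosing "[" and "]"
    for op in operations:
        op_size = len(json.dumps(op).encode("utf-8")) + 2
        # Close the current batch if adding this operation would overflow it.
        if batch and size + op_size > max_bytes:
            yield batch
            batch, size = [], 2
        batch.append(op)
        size += op_size
    if batch:
        yield batch
```

Each yielded list can then be serialized with `json.dumps` and uploaded as one request, which is the extra moving part that a direct stream from the database to the search index avoids.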