What is Gearman and what are its top alternatives?
Top Alternatives to Gearman
- RabbitMQ
RabbitMQ gives your applications a common platform to send and receive messages, and your messages a safe place to live until received. ...
- Kafka
Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. ...
- Celery
Celery is an asynchronous task queue/job queue based on distributed message passing. It is focused on real-time operation, but supports scheduling as well. ...
- Redis
Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache, and message broker. Redis provides data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes, and streams. ...
- Beanstalkd
Beanstalks's interface is generic, but was originally designed for reducing the latency of page views in high-volume web applications by running time-consuming tasks asynchronously. ...
- Laravel
It is a web application framework with expressive, elegant syntax. It attempts to take the pain out of development by easing common tasks used in the majority of web projects, such as authentication, routing, sessions, and caching. ...
- ZeroMQ
The 0MQ lightweight messaging kernel is a library which extends the standard socket interfaces with features traditionally provided by specialised messaging middleware products. 0MQ sockets provide an abstraction of asynchronous message queues, multiple messaging patterns, message filtering (subscriptions), seamless access to multiple transport protocols and more. ...
- Amazon SQS
Transmit any volume of data, at any level of throughput, without losing messages or requiring other services to be always available. With SQS, you can offload the administrative burden of operating and scaling a highly available messaging cluster, while paying a low price for only what you use. ...
Gearman alternatives & related posts
- It's fast and it works with good metrics/monitoring234
- Ease of configuration79
- I like the admin interface59
- Easy to set-up and start with50
- Durable21
- Intuitive work through python18
- Standard protocols18
- Written primarily in Erlang10
- Simply superb8
- Completeness of messaging patterns6
- Scales to 1 million messages per second3
- Reliable3
- Distributed2
- Supports MQTT2
- Better than most traditional queue based message broker2
- Supports AMQP2
- Clusterable1
- Clear documentation with different scripting language1
- Great ui1
- Inubit Integration1
- Better routing system1
- High performance1
- Runs on Open Telecom Platform1
- Delayed messages1
- Reliability1
- Open-source1
- Too complicated cluster/HA config and management9
- Needs Erlang runtime. Need ops good with Erlang runtime6
- Configuration must be done first, not by your code5
- Slow4
related RabbitMQ posts
As Sentry runs throughout the day, there are about 50 different offline tasks that we execute—anything from “process this event, pretty please” to “send all of these cool people some emails.” There are some that we execute once a day and some that execute thousands per second.
Managing this variety requires a reliably high-throughput message-passing technology. We use Celery's RabbitMQ implementation, and we stumbled upon a great feature called Federation that allows us to partition our task queue across any number of RabbitMQ servers and gives us the confidence that, if any single server gets backlogged, others will pitch in and distribute some of the backlogged tasks to their consumers.
#MessageQueue
Around the time of their Series A, Pinterest’s stack included Python and Django, with Tornado and Node.js as web servers. Memcached / Membase and Redis handled caching, with RabbitMQ handling queueing. Nginx, HAproxy and Varnish managed static-delivery and load-balancing, with persistent data storage handled by MySQL.
Kafka
- High-throughput126
- Distributed119
- Scalable92
- High-Performance86
- Durable66
- Publish-Subscribe38
- Simple-to-use19
- Open source18
- Written in Scala and java. Runs on JVM12
- Message broker + Streaming system9
- KSQL4
- Avro schema integration4
- Robust4
- Suport Multiple clients3
- Extremely good parallelism constructs2
- Partioned, replayable log2
- Simple publisher / multi-subscriber model1
- Fun1
- Flexible1
- Non-Java clients are second-class citizens32
- Needs Zookeeper29
- Operational difficulties9
- Terrible Packaging5
related Kafka posts
The algorithms and data infrastructure at Stitch Fix is housed in #AWS. Data acquisition is split between events flowing through Kafka, and periodic snapshots of PostgreSQL DBs. We store data in an Amazon S3 based data warehouse. Apache Spark on Yarn is our tool of choice for data movement and #ETL. Because our storage layer (s3) is decoupled from our processing layer, we are able to scale our compute environment very elastically. We have several semi-permanent, autoscaling Yarn clusters running to serve our data processing needs. While the bulk of our compute infrastructure is dedicated to algorithmic processing, we also implemented Presto for adhoc queries and dashboards.
Beyond data movement and ETL, most #ML centric jobs (e.g. model training and execution) run in a similarly elastic environment as containers running Python and R code on Amazon EC2 Container Service clusters. The execution of batch jobs on top of ECS is managed by Flotilla, a service we built in house and open sourced (see https://github.com/stitchfix/flotilla-os).
At Stitch Fix, algorithmic integrations are pervasive across the business. We have dozens of data products actively integrated systems. That requires serving layer that is robust, agile, flexible, and allows for self-service. Models produced on Flotilla are packaged for deployment in production using Khan, another framework we've developed internally. Khan provides our data scientists the ability to quickly productionize those models they've developed with open source frameworks in Python 3 (e.g. PyTorch, sklearn), by automatically packaging them as Docker containers and deploying to Amazon ECS. This provides our data scientist a one-click method of getting from their algorithms to production. We then integrate those deployments into a service mesh, which allows us to A/B test various implementations in our product.
For more info:
- Our Algorithms Tour: https://algorithms-tour.stitchfix.com/
- Our blog: https://multithreaded.stitchfix.com/blog/
- Careers: https://multithreaded.stitchfix.com/careers/
#DataScience #DataStack #Data
As we've evolved or added additional infrastructure to our stack, we've biased towards managed services. Most new backing stores are Amazon RDS instances now. We do use self-managed PostgreSQL with TimescaleDB for time-series data—this is made HA with the use of Patroni and Consul.
We also use managed Amazon ElastiCache instances instead of spinning up Amazon EC2 instances to run Redis workloads, as well as shifting to Amazon Kinesis instead of Kafka.
- Task queue99
- Python integration63
- Django integration40
- Scheduled Task30
- Publish/subsribe19
- Various backend broker8
- Easy to use6
- Great community5
- Workflow5
- Free4
- Dynamic1
- Sometimes loses tasks4
- Depends on broker1
related Celery posts
As Sentry runs throughout the day, there are about 50 different offline tasks that we execute—anything from “process this event, pretty please” to “send all of these cool people some emails.” There are some that we execute once a day and some that execute thousands per second.
Managing this variety requires a reliably high-throughput message-passing technology. We use Celery's RabbitMQ implementation, and we stumbled upon a great feature called Federation that allows us to partition our task queue across any number of RabbitMQ servers and gives us the confidence that, if any single server gets backlogged, others will pitch in and distribute some of the backlogged tasks to their consumers.
#MessageQueue
Automations are what makes a CRM powerful. With Celery and RabbitMQ we've been able to make powerful automations that truly works for our clients. Such as for example, automatic daily reports, reminders for their activities, important notifications regarding their client activities and actions on the website and more.
We use Celery basically for everything that needs to be scheduled for the future, and using RabbitMQ as our Queue-broker is amazing since it fully integrates with Django and Celery storing on our database results of the tasks done so we can see if anything fails immediately.
- Performance886
- Super fast542
- Ease of use513
- In-memory cache444
- Advanced key-value cache324
- Open source194
- Easy to deploy182
- Stable164
- Free155
- Fast121
- High-Performance42
- High Availability40
- Data Structures35
- Very Scalable32
- Replication24
- Great community22
- Pub/Sub22
- "NoSQL" key-value data store19
- Hashes16
- Sets13
- Sorted Sets11
- NoSQL10
- Lists10
- Async replication9
- BSD licensed9
- Bitmaps8
- Integrates super easy with Sidekiq for Rails background8
- Keys with a limited time-to-live7
- Open Source7
- Lua scripting6
- Strings6
- Awesomeness for Free5
- Hyperloglogs5
- Transactions4
- Outstanding performance4
- Runs server side LUA4
- LRU eviction of keys4
- Feature Rich4
- Written in ANSI C4
- Networked4
- Data structure server3
- Performance & ease of use3
- Dont save data if no subscribers are found2
- Automatic failover2
- Easy to use2
- Temporarily kept on disk2
- Scalable2
- Existing Laravel Integration2
- Channels concept2
- Object [key/value] size each 500 MB2
- Simple2
- Cannot query objects directly15
- No secondary indexes for non-numeric data types3
- No WAL1
related Redis posts
We use MongoDB as our primary #datastore. Mongo's approach to replica sets enables some fantastic patterns for operations like maintenance, backups, and #ETL.
As we pull #microservices from our #monolith, we are taking the opportunity to build them with their own datastores using PostgreSQL. We also use Redis to cache data we’d never store permanently, and to rate-limit our requests to partners’ APIs (like GitHub).
When we’re dealing with large blobs of immutable data (logs, artifacts, and test results), we store them in Amazon S3. We handle any side-effects of S3’s eventual consistency model within our own code. This ensures that we deal with user requests correctly while writes are in process.
I'm working as one of the engineering leads in RunaHR. As our platform is a Saas, we thought It'd be good to have an API (We chose Ruby and Rails for this) and a SPA (built with React and Redux ) connected. We started the SPA with Create React App since It's pretty easy to start.
We use Jest as the testing framework and react-testing-library to test React components. In Rails we make tests using RSpec.
Our main database is PostgreSQL, but we also use MongoDB to store some type of data. We started to use Redis for cache and other time sensitive operations.
We have a couple of extra projects: One is an Employee app built with React Native and the other is an internal back office dashboard built with Next.js for the client and Python in the backend side.
Since we have different frontend apps we have found useful to have Bit to document visual components and utils in JavaScript.
- Fast23
- Free12
- Does one thing well12
- Scalability9
- Simplicity8
- External admin UI developer friendly3
- Job delay3
- Job prioritization2
- External admin UI2
related Beanstalkd posts
I used Kafka originally because it was mandated as part of the top-level IT requirements at a Fortune 500 client. What I found was that it was orders of magnitude more complex ...and powerful than my daily Beanstalkd , and far more flexible, resilient, and manageable than RabbitMQ.
So for any case where utmost flexibility and resilience are part of the deal, I would use Kafka again. But due to the complexities involved, for any time where this level of scalability is not required, I would probably just use Beanstalkd for its simplicity.
I tend to find RabbitMQ to be in an uncomfortable middle place between these two extremities.
- Clean architecture553
- Growing community392
- Composer friendly370
- Open source344
- The only framework to consider for php324
- Mvc220
- Quickly develop210
- Dependency injection168
- Application architecture156
- Embraces good community packages143
- Write less, do more73
- Orm (eloquent)71
- Restful routing66
- Database migrations & seeds57
- Artisan scaffolding and migrations55
- Great documentation41
- Awesome40
- Awsome, Powerfull, Fast and Rapid30
- Build Apps faster, easier and better29
- Eloquent ORM28
- Promotes elegant coding26
- Modern PHP26
- JSON friendly26
- Most easy for me25
- Easy to learn, scalability24
- Beautiful23
- Blade Template22
- Test-Driven21
- Security15
- Based on SOLID15
- Clean Documentation13
- Easy to attach Middleware13
- Cool13
- Simple12
- Convention over Configuration12
- Easy Request Validatin11
- Simpler10
- Fast10
- Easy to use10
- Get going quickly straight out of the box. BYOKDM9
- Its just wow9
- Laravel + Cassandra = Killer Framework8
- Simplistic , easy and faster8
- Friendly API8
- Less dependencies7
- Super easy and powerful7
- Great customer support6
- Its beautiful to code in6
- Speed5
- Eloquent5
- Composer5
- Minimum system requirements5
- Laravel Mix5
- Easy5
- The only "cons" is wrong! No static method just Facades5
- Fast and Clarify framework5
- Active Record5
- Php75
- Ease of use4
- Laragon4
- Laravel casher4
- Easy views handling and great ORM4
- Laravel Forge and Envoy4
- Cashier with Braintree and Stripe4
- Laravel Passport3
- Laravel Spark3
- Intuitive usage3
- Laravel Horizon and Telescope3
- Laravel Nova3
- Rapid development3
- Laravel Vite2
- Scout2
- Deployment2
- Succint sintax1
- PHP53
- Too many dependency33
- Slower than the other two23
- A lot of static method calls for convenience17
- Too many include15
- Heavy13
- Bloated9
- Laravel8
- Confusing7
- Too underrated5
- Not fast with MongoDB4
- Not using SOLID principles1
- Slow and too much big1
- Difficult to learn1
related Laravel posts
I need to build a web application plus android and IOS apps for an enterprise, like an e-commerce portal. It will have intensive use of MySQL to display thousands (40-50k) of live product information in an interactive table (searchable, filterable), live delivery tracking. It has to be secure, as it will handle information on customers, sales, inventory. Here is the technology stack: Backend: Laravel 7 Frondend: Vue.js, React or AngularJS?
Need help deciding technology stack. Thanks.
Back at the start of 2017, we decided to create a web-based tool for the SEO OnPage analysis of our clients' websites. We had over 2.000 websites to analyze, so we had to perform thousands of requests to get every single page from those websites, process the information and save the big amounts of data somewhere.
Very soon we realized that the initial chosen script language and database, PHP, Laravel and MySQL, was not going to be able to cope efficiently with such a task.
By that time, we were doing some experiments for other projects with a language we had recently get to know, Go , so we decided to get a try and code the crawler using it. It was fantastic, we could process much more data with way less CPU power and in less time. By using the concurrency abilites that the language has to offers, we could also do more Http requests in less time.
Unfortunately, I have no comparison numbers to show about the performance differences between Go and PHP since the difference was so clear from the beginning and that we didn't feel the need to do further comparison tests nor document it. We just switched fully to Go.
There was still a problem: despite the big amount of Data we were generating, MySQL was performing very well, but as we were adding more and more features to the software and with those features more and more different type of data to save, it was a nightmare for the database architects to structure everything correctly on the database, so it was clear what we had to do next: switch to a NoSQL database. So we switched to MongoDB, and it was also fantastic: we were expending almost zero time in thinking how to structure the Database and the performance also seemed to be better, but again, I have no comparison numbers to show due to the lack of time.
We also decided to switch the website from PHP and Laravel to JavaScript and Node.js and ExpressJS since working with the JSON Data that we were saving now in the Database would be easier.
As of now, we don't only use the tool intern but we also opened it for everyone to use for free: https://tool-seo.com
ZeroMQ
- Fast23
- Lightweight20
- Transport agnostic11
- No broker required7
- Low level APIs are in C4
- Low latency4
- Open source1
- Publish-Subscribe1
- No message durability5
- Not a very reliable system - message delivery wise3
- M x N problem with M producers and N consumers1
related ZeroMQ posts
Hi, we are in a ZMQ set up in a push/pull pattern, and we currently start to have more traffic and cases that the service is unavailable or stuck. We want to: * Not loose messages in services outages * Safely restart service without losing messages (ZeroMQ seems to need to close the socket in the receiver before restart manually)
Do you have experience with this setup with ZeroMQ? Would you suggest RabbitMQ or Amazon SQS (we are in AWS setup) instead? Something else?
Thank you for your time
- Easy to use, reliable62
- Low cost40
- Simple28
- Doesn't need to maintain it14
- It is Serverless8
- Has a max message size (currently 256K)4
- Triggers Lambda3
- Easy to configure with Terraform3
- Delayed delivery upto 15 mins only3
- Delayed delivery upto 12 hours3
- JMS compliant1
- Support for retry and dead letter queue1
- D1
- Has a max message size (currently 256K)2
- Proprietary2
- Difficult to configure2
- Has a maximum 15 minutes of delayed messages only1
related Amazon SQS posts
We are in the process of building a modern content platform to deliver our content through various channels. We decided to go with Microservices architecture as we wanted scale. Microservice architecture style is an approach to developing an application as a suite of small independently deployable services built around specific business capabilities. You can gain modularity, extensive parallelism and cost-effective scaling by deploying services across many distributed servers. Microservices modularity facilitates independent updates/deployments, and helps to avoid single point of failure, which can help prevent large-scale outages. We also decided to use Event Driven Architecture pattern which is a popular distributed asynchronous architecture pattern used to produce highly scalable applications. The event-driven architecture is made up of highly decoupled, single-purpose event processing components that asynchronously receive and process events.
To build our #Backend capabilities we decided to use the following: 1. #Microservices - Java with Spring Boot , Node.js with ExpressJS and Python with Flask 2. #Eventsourcingframework - Amazon Kinesis , Amazon Kinesis Firehose , Amazon SNS , Amazon SQS, AWS Lambda 3. #Data - Amazon RDS , Amazon DynamoDB , Amazon S3 , MongoDB Atlas
To build #Webapps we decided to use Angular 2 with RxJS
#Devops - GitHub , Travis CI , Terraform , Docker , Serverless
In order to accurately measure & track user behaviour on our platform we moved over quickly from the initial solution using Google Analytics to a custom-built one due to resource & pricing concerns we had.
While this does sound complicated, it’s as easy as clients sending JSON blobs of events to Amazon Kinesis from where we use AWS Lambda & Amazon SQS to batch and process incoming events and then ingest them into Google BigQuery. Once events are stored in BigQuery (which usually only takes a second from the time the client sends the data until it’s available), we can use almost-standard-SQL to simply query for data while Google makes sure that, even with terabytes of data being scanned, query times stay in the range of seconds rather than hours. Before ingesting their data into the pipeline, our mobile clients are aggregating events internally and, once a certain threshold is reached or the app is going to the background, sending the events as a JSON blob into the stream.
In the past we had workers running that continuously read from the stream and would validate and post-process the data and then enqueue them for other workers to write them to BigQuery. We went ahead and implemented the Lambda-based approach in such a way that Lambda functions would automatically be triggered for incoming records, pre-aggregate events, and write them back to SQS, from which we then read them, and persist the events to BigQuery. While this approach had a couple of bumps on the road, like re-triggering functions asynchronously to keep up with the stream and proper batch sizes, we finally managed to get it running in a reliable way and are very happy with this solution today.
#ServerlessTaskProcessing #GeneralAnalytics #RealTimeDataProcessing #BigDataAsAService