Airflow vs Kissflow: What are the differences?
Airflow and Kissflow are both workflow automation tools, but they differ in several key aspects.
1. **Workflow Type**: Airflow is primarily designed for data pipeline orchestration, allowing users to schedule and monitor complex data workflows. On the other hand, Kissflow is focused on business process management, allowing users to create, automate, and track business workflows.
2. **Customization and Extensibility**: Airflow offers a high level of customization and extensibility through its Python-based scripting abilities, which allow users to define complex workflows using code. In contrast, Kissflow offers a more user-friendly interface with pre-built templates and drag-and-drop features, making it easier for non-technical users to create workflows.
3. **Deployment and Scalability**: Airflow requires users to set up and manage their own infrastructure, making it more suitable for organizations with in-house technical expertise. Kissflow, on the other hand, is a cloud-based platform that handles deployment and scalability automatically, making it more accessible to smaller organizations or teams without dedicated IT resources.
4. **Integration with Third-Party Tools**: Airflow has strong integration capabilities with a wide range of third-party tools and services, allowing users to easily connect their workflows with other systems. While Kissflow also offers integrations, its focus is more on providing a comprehensive solution within the platform, reducing the need for external integrations.
5. **Cost Structure**: Airflow is an open-source tool with no licensing fees, but users incur costs for infrastructure management and maintenance. In contrast, Kissflow operates on a subscription-based model, with pricing tiers based on the number of users and features required, making it easier to predict and budget for costs.
6. **User Audience**: Airflow is more suited for technical users who are comfortable with scripting and working with code, making it ideal for data engineers and developers. Kissflow, on the other hand, is designed for business users and process owners who need to streamline and automate workflows without relying on technical expertise.
In summary, Airflow and Kissflow differ in their workflow types, customization levels, deployment options, integration capabilities, cost structures, and target user audiences.
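The customization contrast in point 2 is easiest to see in code. Airflow workflows are plain Python, where tasks and their dependencies form a directed acyclic graph. As a rough stdlib-only sketch of that code-first model (a toy dependency graph executed in topological order, not Airflow's actual API; the task names are illustrative):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# A toy "DAG" in the Airflow spirit: each task maps to the
# set of upstream tasks it depends on.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}

def run_task(name: str) -> str:
    # A real pipeline task would fetch, clean, or write data;
    # here we just record that the task executed.
    return f"ran {name}"

def run_dag(dag: dict[str, set[str]]) -> list[str]:
    # Execute tasks in dependency order, as a scheduler would.
    order = TopologicalSorter(dag).static_order()
    return [run_task(name) for name in order]

print(run_dag(dag))  # each task runs only after its upstreams
```

In Kissflow, an equivalent workflow would be assembled in the drag-and-drop designer rather than expressed as code.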
I am so confused. I need a tool that will let me go to about 10 different URLs to get a list of objects. Those object lists will be hundreds or thousands in length. I then need to get detailed data about each object; those detailed lists can have hundreds of elements that could be map/reduced somehow. My batch process sometimes dies halfway through, which means hours of processing gone, i.e. time wasted. I need something like a directed graph that will keep the results of successful data collection and let me, either programmatically or manually, retry the failed ones some number (0 - forever) of times. I then want it to process all the ones that have succeeded or been deliberately ignored and load the data store with the aggregation of some couple thousand data points. I know hitting this many endpoints is not good practice, but I can't put collectors on the endpoints or anything like that. It is pretty much the only way to get the data.
For a non-streaming approach:
You could consider using more checkpoints throughout your Spark jobs. You could also separate the workload into multiple jobs with an intermediate data store (Cassandra is one option; choose based on your needs and availability) to store results, perform aggregations, and store the aggregated output.
- Spark Job 1 - fetch data from the 10 URLs and store data and metadata in a data store (Cassandra)
- Spark Jobs 2..n - check the data store for unprocessed items and continue the aggregation
Alternatively, for a streaming approach: treating your data as a stream might also be useful. Spark Streaming lets you set a checkpoint interval - https://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing
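The checkpoint-and-retry idea above (persist each successful fetch, retry only the failures) can also be sketched without Spark at all. A minimal stdlib-Python sketch, where `fetch`, the retry limit, and the checkpoint path are placeholders you would swap for your own:

```python
import json
from pathlib import Path

CHECKPOINT = Path("checkpoint.json")  # survives process crashes

def load_checkpoint() -> dict:
    return json.loads(CHECKPOINT.read_text()) if CHECKPOINT.exists() else {}

def save_checkpoint(done: dict) -> None:
    CHECKPOINT.write_text(json.dumps(done))

def collect(urls, fetch, max_retries=3):
    """Fetch every URL, skipping those already checkpointed,
    retrying each failure up to max_retries times."""
    done = load_checkpoint()
    failed = []
    for url in urls:
        if url in done:
            continue  # already collected in a previous run
        for _attempt in range(max_retries):
            try:
                done[url] = fetch(url)
                save_checkpoint(done)  # persist after each success
                break
            except Exception:
                continue  # transient error: retry
        else:
            failed.append(url)  # exhausted retries; retry later
    return done, failed
```

On restart, `collect` resumes from the checkpoint file instead of refetching everything, so a crash halfway through no longer wastes the completed work; the aggregation step then runs only over `done`.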
Pros of Airflow
- Features (51)
- Task Dependency Management (14)
- Beautiful UI (12)
- Cluster of workers (12)
- Extensibility (10)
- Open source (6)
- Complex workflows (5)
- Python (5)
- Good API (3)
- Apache project (3)
- Custom operators (3)
- Dashboard (2)
Pros of Kissflow
Cons of Airflow
- Observability is not great when the DAGs exceed 250 (2)
- Running it on a Kubernetes cluster is relatively complex (2)
- Open source - provides minimal or no support (2)
- Logical separation of DAGs is not straightforward (1)