Apache Flume vs Splunk Cloud

Apache Flume vs Splunk Cloud: What are the differences?

What is Apache Flume? A distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant, with tunable reliability mechanisms and many failover and recovery mechanisms. It uses a simple, extensible data model that allows for online analytic applications.

What is Splunk Cloud? An easy and fast way to analyze valuable machine data with the convenience of software as a service (SaaS). It offers the benefits of Splunk® Enterprise delivered as a hosted service. Splunk Cloud is backed by a 100% uptime SLA, scales to over 10TB/day, and offers a highly secure environment.

Apache Flume and Splunk Cloud both belong to the "Log Management" category of the tech stack.

Advice on Apache Flume and Splunk Cloud
Jigar Shah
Security Software Engineer at Pinterest · 8 upvotes · 78.3K views

We would like to detect unusual config changes that could potentially cause a production outage.

Examples include a new SecurityGroup allow/deny rule, an AuthZ policy change, a secret key/certificate rotation, or an IP subnet add/drop. The problem is that these activities all come from different sources, e.g., AWS IAM, Amazon EC2, internal prod services, the Envoy sidecar, etc.

Which technology would be best suited to detect only important events (not all activity) from these various sources, with all workloads running on AWS and also Splunk Cloud?

Replies (5)
Nati Abebe
Recommends
AWS Config

For continuous monitoring and detecting unusual configuration changes, I would suggest you look into AWS Config.

AWS Config enables you to assess, audit, and evaluate the configurations of your AWS resources. Config continuously monitors and records your AWS resource configurations and allows you to automate the evaluation of recorded configurations against desired configurations. Here is a list of the AWS resource types and resource relationships that AWS Config supports: https://docs.aws.amazon.com/config/latest/developerguide/resource-config-reference.html

Also, as of November 2019, AWS Config supports third-party resources. You can now publish the configuration of third-party resources, such as GitHub repositories, Microsoft Active Directory resources, or any on-premises server, into AWS Config using the new API. Here is more detail: https://docs.aws.amazon.com/config/latest/developerguide/customresources.html

If you have multiple AWS accounts in your organization and want to detect changes across them, see: https://docs.aws.amazon.com/config/latest/developerguide/aggregate-data.html

Lastly, if you already use Splunk Cloud in your enterprise and are looking for a consolidated view, AWS Config is supported by Splunk Cloud as per their documentation too: https://aws.amazon.com/marketplace/pp/Splunk-Inc-Splunk-Cloud/B06XK299KV

Isaac Povey
Casual Software Engineer at Skedulo · 6 upvotes · 51.9K views
Recommends
Terraform

While it won't detect events as they happen, a good stopgap would be to define your infrastructure config using Terraform. You can then periodically run the Terraform config against your environment and alert if it reports any changes.
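
One way to sketch this periodic check in Python is to rely on `terraform plan -detailed-exitcode`, which exits with 0 when nothing would change, 1 on error, and 2 when the plan contains changes. The function names (`run_plan`, `classify_drift`, `check_drift`) and the `alert` callback are hypothetical, and the sketch assumes Terraform is installed and the working directory has already been initialized.

```python
import subprocess
from typing import Callable


def run_plan(workdir: str) -> int:
    """Run `terraform plan` in workdir and return its exit code.

    Assumes the terraform binary is on PATH and `terraform init` was run.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return result.returncode


def classify_drift(exit_code: int) -> str:
    """Map the -detailed-exitcode result to a status label."""
    if exit_code == 0:
        return "in_sync"  # live environment matches the config
    if exit_code == 2:
        return "drift"    # the plan would change something
    return "error"        # the plan itself failed


def check_drift(
    workdir: str,
    runner: Callable[[str], int] = run_plan,
    alert: Callable[[str], None] = print,
) -> str:
    """Run one drift check and call alert() unless everything is in sync."""
    status = classify_drift(runner(workdir))
    if status != "in_sync":
        alert(f"terraform drift check for {workdir}: {status}")
    return status
```

Scheduling `check_drift` from cron or a CI job and pointing `alert` at a pager or chat hook would give the periodic alerting described above. Keeping the runner injectable means the logic can be exercised without a real Terraform installation.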

Matthew Rothstein
Recommends
Security Monkey

Consider using a combination of Netflix Security Monkey and Amazon GuardDuty.

With these tools you can achieve automated detection and alerting, as well as automated, policy-based recovery.

For instance, you could detect SecurityGroup rule changes that allow unrestricted egress from EC2 instances and then revert those changes automatically.
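
As a rough illustration of the detection half of that example, the sketch below scans one security group description for egress rules open to any address. It is not how Security Monkey or GuardDuty work internally. It assumes the input dict has the shape of one entry from the `SecurityGroups` list returned by boto3's `describe_security_groups`, and the function name `unrestricted_egress_rules` is hypothetical.

```python
from typing import Any, Dict, List

# CIDR blocks that mean "anywhere" for IPv4 and IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def unrestricted_egress_rules(group: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the egress rules of a security group that allow any destination."""
    open_rules = []
    for rule in group.get("IpPermissionsEgress", []):
        v4 = [r.get("CidrIp") for r in rule.get("IpRanges", [])]
        v6 = [r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])]
        if OPEN_CIDRS.intersection(v4 + v6):
            open_rules.append(rule)
    return open_rules
```

The rules returned here are the candidates for an automatic revert, for example by passing them to the EC2 `revoke_security_group_egress` call. Whether to revert automatically or only alert is a policy decision.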

It's unclear from your post whether you want to detect events within the Splunk Cloud infrastructure or events indicated in data going to Splunk, using Splunk's own capabilities. If the latter, Splunk has extremely rich capabilities in its query language and integrated alerting functions. With Splunk you can also run arbitrary Python scripts in response to certain events, so whatever you can't analyze and alert on with native functionality or plugins, you could write code to handle.

Vijayanand Narayanasharma
DevOps/TechOps Consultant at Qantas Loyalty · 3 upvotes · 43.4K views
Recommends
AWS CloudTrail

There are clear advantages to either tool; it all boils down to what exactly you are trying to achieve, i.e., do you want proactive monitoring, or do you want to debug an incident/issue? Splunk is definitely superior for proactively monitoring your logs for unusual events, but getting the CloudTrail logs across to Splunk requires a not-so-straightforward setup (Splunk has a blueprint for this setup that uses AWS Kinesis/Firehose). CloudTrail, on the other hand, is available out of the box from AWS, and its setup is simple and straightforward. But analysing the logs could require you to set up Glue crawlers, and you might have to use Amazon Athena to run SQL-like queries.

Refer: https://docs.aws.amazon.com/athena/latest/ug/cloudtrail-logs.html
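
For a small workload, the "important events only" filtering can also be prototyped directly on CloudTrail log files before committing to Athena or Splunk. The sketch below assumes the standard CloudTrail log file layout (a JSON object with a `Records` list whose entries carry `eventSource` and `eventName`). The watch list and the function name `important_events` are illustrative choices, not a complete or recommended set.

```python
import json
from typing import Any, Dict, List

# Hypothetical watch list: event source -> API calls treated as important.
WATCHED_EVENTS = {
    "ec2.amazonaws.com": {
        "AuthorizeSecurityGroupIngress",
        "AuthorizeSecurityGroupEgress",
        "RevokeSecurityGroupIngress",
        "RevokeSecurityGroupEgress",
        "CreateSubnet",
        "DeleteSubnet",
    },
    "iam.amazonaws.com": {
        "PutRolePolicy",
        "AttachRolePolicy",
        "DetachRolePolicy",
        "DeleteRolePolicy",
    },
}


def important_events(log_text: str) -> List[Dict[str, Any]]:
    """Return only the CloudTrail records whose source and name are watched."""
    records = json.loads(log_text).get("Records", [])
    return [
        record
        for record in records
        if record.get("eventName")
        in WATCHED_EVENTS.get(record.get("eventSource"), set())
    ]
```

The same source and name pairs translate naturally into a `WHERE` clause once the logs are queried through Athena.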

In my personal experience, the cost and effort involved in setting up Splunk are not worth it for smaller workloads, whereas the AWS CloudTrail/Glue/Athena route would be a comparatively less expensive setup.

Alternatively, you could look at something like Sumo Logic, which has better integration with CloudTrail than Splunk does. Hope that helps.

Recommends
AWS CloudTrail

I'd recommend using CloudTrail; it helped me a lot. But depending on your situation, I'd recommend building a custom solution (like AWS's amazon-ssm-agent) that makes an API call on each configuration change and logs it in Grafana or Kibana.

Pros of Apache Flume
    No pros listed

Pros of Splunk Cloud
    • More powerful & integrates with on-prem & off-prem (7 upvotes)
    • Free (3 upvotes)
    • Powerful log analytics (3 upvotes)
    • PCI compliance (1 upvote)
    • Production debugger (1 upvote)

      What are some alternatives to Apache Flume and Splunk Cloud?
      Apache Spark
      Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
      Logstash
      Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (like, for searching). If you store them in Elasticsearch, you can view and analyze them with Kibana.
      Apache Storm
      Apache Storm is a free and open source distributed realtime computation system. Storm makes it easy to reliably process unbounded streams of data, doing for realtime processing what Hadoop did for batch processing. Storm has many use cases: realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Storm is fast: a benchmark clocked it at over a million tuples processed per second per node. It is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
      Kafka
      Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design.
      Apache Flink
      Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports batch and streaming analytics, in one system. Analytical programs can be written in concise and elegant APIs in Java and Scala.