Stroom


A scalable data storage, processing and analysis platform

What is Stroom?

Stroom is a data processing, storage and analysis platform. It scales horizontally: adding more CPUs or servers gives greater throughput. It is suited to processing high-volume data such as system logs, providing insight into IT performance and usage.
Stroom is a tool in the Big Data Tools category of a tech stack.
Stroom is an open source tool with 377 GitHub stars and 47 GitHub forks.

Stroom's Features

  • Receive and store large volumes of data such as native format logs. Ingested data is always available in its raw form
  • Create sequences of XSL and text operations, in order to normalise or export data in any format. It is possible to enrich data using lookups and reference data
  • Easily add new data formats and debug the transformations if they don't work as expected
  • Create multiple indexes with different retention periods. These can be sharded across your cluster
  • Run queries against your indexes or statistics and view the results within custom visualisations
  • Record counts or values of items over time
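
The features above describe a flow of ingesting raw data, normalising and enriching it, then recording counts over time. The following Python sketch illustrates that idea only; it does not use Stroom's API or pipeline format, and the log layout, the `HOST_LOOKUP` reference table, and the function names are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical reference data: maps an IP address to a host name.
HOST_LOOKUP = {"10.0.0.1": "web-01", "10.0.0.2": "db-01"}


def normalise(line):
    """Parse an assumed 'ISO-timestamp ip event' line into a dict,
    enriched with a host name looked up from the reference data."""
    timestamp, ip, event = line.split(" ", 2)
    return {
        "time": datetime.fromisoformat(timestamp),
        "ip": ip,
        "host": HOST_LOOKUP.get(ip, "unknown"),
        "event": event,
    }


def count_per_hour(records):
    """Count events per (hour, host) bucket."""
    counts = Counter()
    for record in records:
        hour = record["time"].replace(minute=0, second=0, microsecond=0)
        counts[(hour, record["host"])] += 1
    return counts


# The raw lines are kept untouched; normalised records are derived from them.
raw_lines = [
    "2024-01-01T10:05:00 10.0.0.1 login",
    "2024-01-01T10:40:00 10.0.0.1 logout",
    "2024-01-01T11:02:00 10.0.0.2 query",
]
records = [normalise(line) for line in raw_lines]
hourly_counts = count_per_hour(records)
```

Keeping `raw_lines` separate from the derived `records` mirrors the point that ingested data stays available in its raw form, so a transformation can be corrected and re-run later.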

Stroom Alternatives & Comparisons

What are some alternatives to Stroom?
Logstash
Logstash is a tool for managing events and logs. You can use it to collect logs, parse them, and store them for later use (for example, for searching). If you store them in Elasticsearch, you can view and analyze them with Kibana.
Apache Spark
Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
ELK
ELK is the acronym for three open source projects: Elasticsearch, Logstash, and Kibana. Elasticsearch is a search and analytics engine. Logstash is a server-side data processing pipeline that ingests data from multiple sources simultaneously, transforms it, and then sends it to a "stash" like Elasticsearch. Kibana lets users visualize data with charts and graphs in Elasticsearch.
Papertrail
Papertrail helps detect, resolve, and avoid infrastructure problems using log messages. Its practicality comes from its creators' own experience as sysadmins, developers, and entrepreneurs.
Fluentd
Fluentd collects events from various data sources and writes them to files, RDBMS, NoSQL, IaaS, SaaS, Hadoop and so on. Fluentd helps you unify your logging infrastructure.
