StackShare

Discover and share technology stacks from companies around the world.

© 2025 StackShare. All rights reserved.

Pachyderm
By Pachyderm

#175 in Databases
Discussions: 0
Followers: 95

What is Pachyderm?

Pachyderm is an open source MapReduce engine that uses Docker containers for distributed computations.

Pachyderm is a tool in the Databases category of a tech stack.

Key Features

  • Git-like File System
  • Dockerized MapReduce
  • Microservice Architecture
  • Deployed with CoreOS

Pachyderm Pros & Cons

Pros of Pachyderm

  • ✓ Containers
  • ✓ Can run on GCP or AWS
  • ✓ Versioning

Cons of Pachyderm

  • ✗ Recently acquired by HPE; uncertain future.

Pachyderm Alternatives & Comparisons

What are some alternatives to Pachyderm?

Apache Spark

Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark's standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.

Splunk

Splunk provides the leading platform for Operational Intelligence. Customers use it to search, monitor, analyze, and visualize machine data.

Apache Flink

Apache Flink is an open source system for fast and versatile data analytics in clusters. Flink supports batch and streaming analytics in one system. Analytical programs can be written with concise APIs in Java and Scala.

Amazon Athena

Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run.

Apache Hive

Hive facilitates reading, writing, and managing large datasets residing in distributed storage using SQL. Structure can be projected onto data already in storage.

AWS Glue

AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics.

Adoption

On StackShare

Pachyderm Integrations

Docker, Amazon EC2, Google Compute Engine, and Vagrant are some of the popular tools that integrate with Pachyderm. Here's a list of all 4 tools that integrate with Pachyderm.

  • Docker
  • Amazon EC2
  • Google Compute Engine
  • Vagrant

Companies: 6
Developers: 18