Scaling a web infrastructure requires services, and building a service-oriented infrastructure is hard. Make it EASY, with SmartStack’s automated, transparent service discovery and registration: cruise control for your distributed infrastructure. | It is a next-generation data discovery and observability tool for enterprises and startups that helps democratize data efficiently, powers collaboration between data science and data engineering teams, significantly reduces time to data discovery, cuts data downtime, and offers a modern, easy-to-use environment with quick time-to-value. It makes all your data entities reliable, observable, and easily discoverable. |
Within a health check interval’s delay of a backend becoming healthy, it is made available in Zookeeper; this makes it instantly available to consumers via Synapse’s Zookeeper watches.; We detect problems within a health check interval, and take backends out of rotation. A mechanism which allows services to notify Nerve that they’re not healthy is planned, to reduce the interval further. In the meantime, deploys can stop Nerve when they start, and then restart it at the end.; Synapse acts on information the moment it’s published in Zookeeper, and reconfiguring HAProxy is very fast most of the time. Because we use HAProxy’s stats socket for many changes, we don’t even restart the process unless we have to add new backends.; Because our infrastructure is distributed, we cannot do centralized planning. But HAProxy provides very configurable queueing semantics. For our biggest clients, we set up intelligent queueing at the HAProxy layer; for others, we at least guarantee round-robin.; Doing debugging or maintenance on a backend is as simple as stopping the Nerve process on the machine; nothing else is affected!; You can see exactly which backends are available simply by looking at the HAProxy status page. Because of HAProxy’s excellent log output, you also get amazing aggregate and per-request information, including statistics on the number and behavior of requests, right in rsyslog.; The infrastructure is completely distributed. The most critical nodes are the Zookeeper nodes, and Zookeeper is specifically designed to be distributed and robust against failure. | Data discovery; Data observability; DataSet search; Data lineage; Popularity search; Alerting; Data governance |
Statistics | |
GitHub Stars 245 | GitHub Stars 1.4K |
GitHub Forks 44 | GitHub Forks 129 |
Stacks 7 | Stacks 3 |
Followers 51 | Followers 7 |
Votes 1 | Votes 0 |
Pros & Cons | |
Pros | No community feedback yet |

Consul is a tool for service discovery and configuration. Consul is distributed, highly available, and extremely scalable.

Eureka is a REST (Representational State Transfer) based service that is primarily used in the AWS cloud for locating services for the purpose of load balancing and failover of middle-tier servers.

A centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications.

etcd is a distributed key-value store that provides a reliable way to store data across a cluster of machines. It’s open source and available on GitHub. etcd gracefully handles master elections during network partitions and tolerates machine failure, including failure of the master.

The main goal of this project is to provide simple and robust facilities for load balancing and high availability to Linux systems and Linux-based infrastructures.

SkyDNS is a distributed service for announcement and discovery of services. It leverages Raft for high-availability and consensus, and utilizes DNS queries to discover available services. This is done by leveraging SRV records in DNS, with special meaning given to subdomains, priorities and weights (more info here: http://blog.gopheracademy.com/skydns).
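
To make the role of SRV priorities and weights concrete, here is a minimal sketch of how a client might choose a backend from a set of SRV answers. It is not SkyDNS code: the `SrvRecord` type, the `pick_target` helper, and the host names are hypothetical, and the selection rule is a simplified reading of the usual SRV convention (lowest priority value first, then a weighted random choice among records sharing that priority).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """Hypothetical container for the fields of one SRV answer."""
    priority: int  # lower value is tried first
    weight: int    # relative share of traffic within one priority level
    port: int
    target: str


def pick_target(records, rng=random):
    """Return one record: lowest priority wins, ties use weighted random choice."""
    if not records:
        raise ValueError("no SRV records to choose from")
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    # Simplification: a zero-weight record is never chosen here when
    # another record at the same priority has a positive weight.
    return rng.choices(candidates, weights=weights, k=1)[0]


# Illustrative answers; the names below are made up for this example.
answers = [
    SrvRecord(priority=10, weight=80, port=8080, target="api-1.example.local"),
    SrvRecord(priority=10, weight=20, port=8080, target="api-2.example.local"),
    SrvRecord(priority=20, weight=100, port=8080, target="api-backup.example.local"),
]
chosen = pick_target(answers)
print(f"{chosen.target}:{chosen.port}")
```

With these sample answers the backup host is only eligible once both priority-10 records are removed from the set, and the two primary hosts receive traffic in roughly an 80/20 split over many calls.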

Serf is a service discovery and orchestration tool that is decentralized, highly available, and fault tolerant. Serf runs on every major platform: Linux, Mac OS X, and Windows. It is extremely lightweight: it uses 5 to 10 MB of resident memory and primarily communicates using infrequent UDP messages.

It is an easy-to-use dynamic service discovery, configuration and service management platform for building cloud native applications.

It is an open source web service that lists software development project dependencies and alerts developers to new versions of the software libraries they are using.

Baker Street is an HAProxy-based client side load balancer that simplifies scaling, testing, and upgrading microservices.