We ultimately migrated our Hadoop jobs to Qubole, a rising player in the Hadoop as a Service space. Given that EMR had become unstable at our scale, we had to quickly move to a provider that played well with AWS (specifically, spot instances) and S3. Qubole supported AWS/S3 and was relatively easy to get started on. After vetting Qubole and comparing its performance against alternatives (including managed clusters), we decided to go with Qubole
Like many large scale web sites, Pinterest’s infrastructure consists of servers that communicate with backend services composed of a number of individual servers for managing load and fault tolerance. Ideally, we’d like the configuration to reflect only the active hosts, so clients don’t need to deal with bad hosts as often. ZooKeeper provides a well known pattern to solve this problem.