Shared insights

Initially, Stitch only supported real-time updates and addressed this problem with a MapReduce job named The Restorator that performed the following actions:

  • Calculated the expected totals
  • Queried Cassandra to get the values it had for each counter
  • Calculated the increments needed to apply to fix the counters
  • Applied the increments
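
The four steps above can be sketched as a small reconciliation routine. This is only an illustration under stated assumptions: plain dictionaries stand in for the MapReduce output and for the Cassandra counters, and the names `compute_increments`, `apply_increments`, and the counter keys are hypothetical, not taken from Stitch.

```python
from typing import Dict


def compute_increments(expected: Dict[str, int], stored: Dict[str, int]) -> Dict[str, int]:
    """Return the delta needed per counter so that stored matches expected."""
    increments = {}
    for key, want in expected.items():
        delta = want - stored.get(key, 0)  # a counter that is missing counts as 0
        if delta != 0:
            increments[key] = delta
    return increments


def apply_increments(stored: Dict[str, int], increments: Dict[str, int]) -> None:
    """Apply each delta as an increment, mimicking a counter update."""
    for key, delta in increments.items():
        stored[key] = stored.get(key, 0) + delta


# Hypothetical data: expected totals from the batch job, and the values
# currently held in the counter store (a dict standing in for Cassandra).
expected = {"plays:track1": 10, "plays:track2": 4}
stored = {"plays:track1": 7, "plays:track2": 4}

increments = compute_increments(expected, stored)
apply_increments(stored, increments)
```

Applying a delta rather than overwriting the value matches the description above, where the fix is expressed as increments to existing counters.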

Meanwhile, to stop the sand shifting under its feet, The Restorator needed to coordinate a locking system with the real-time processors, so that both did not try to apply increments to the same counter at the same time, which would result in a race condition. It used ZooKeeper for this.
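
The effect of that coordination can be sketched with a per-counter lock. This is a rough illustration, not the original design: an in-process `threading.Lock` stands in for the ZooKeeper-based lock, a dictionary stands in for Cassandra, and the lock granularity (one lock per counter) and the names `realtime_increment` and `restore` are assumptions.

```python
import threading
from collections import defaultdict

counters = defaultdict(int)            # stand-in for the Cassandra counters
locks = defaultdict(threading.Lock)    # one lock per counter key (assumed granularity)
registry_guard = threading.Lock()      # protects creation of per-counter locks


def lock_for(key: str) -> threading.Lock:
    """Return the lock guarding one counter, creating it on first use."""
    with registry_guard:
        return locks[key]


def realtime_increment(key: str, amount: int) -> None:
    """What a real-time processor would do: increment under the counter's lock."""
    with lock_for(key):
        counters[key] += amount


def restore(key: str, expected_total: int) -> int:
    """What the batch job would do: read, compute the delta, and apply it
    while holding the same lock, so no increment lands in between."""
    with lock_for(key):
        delta = expected_total - counters[key]
        counters[key] += delta
        return delta


# Four hypothetical real-time processors, each adding 100 single increments.
workers = [
    threading.Thread(target=lambda: [realtime_increment("plays", 1) for _ in range(100)])
    for _ in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

applied = restore("plays", 405)
```

Without the shared lock, an increment arriving between the read and the write in `restore` would make the computed delta stale, which is the race the paragraph above describes.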
