One thing that's generalizable (though maybe obvious) is to explicitly define the SLAs for each microservice. For a few weeks we were paging ourselves every time a smaller service deployed or went down because of unimportant errors.
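To make that concrete, here's a rough sketch of what "explicit SLAs" could look like in code. Everything in it is hypothetical (the `Tier` names, the `ServiceSLA` fields, the thresholds, the service names); it isn't our actual setup, just one way to keep low-priority services from paging anyone:

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    # Hypothetical tiers; pick whatever matches your on-call policy.
    CRITICAL = "critical"
    BEST_EFFORT = "best_effort"


@dataclass(frozen=True)
class ServiceSLA:
    name: str
    tier: Tier
    max_error_rate: float  # tolerated fraction of failed requests, 0.0 to 1.0


def alert_action(sla: ServiceSLA, error_rate: float, deploying: bool = False) -> str:
    """Return "page", "ticket", or "ignore" for an observed error rate."""
    if error_rate <= sla.max_error_rate:
        return "ignore"
    if sla.tier is Tier.CRITICAL:
        # Critical services page whenever they breach their SLA.
        return "page"
    # Best-effort services never page; expected blips during a deploy are dropped.
    return "ignore" if deploying else "ticket"


# Example services (names are made up for illustration).
checkout = ServiceSLA("checkout", Tier.CRITICAL, max_error_rate=0.01)
thumbnails = ServiceSLA("thumbnails", Tier.BEST_EFFORT, max_error_rate=0.05)
```

With this, a `thumbnails` outage during a deploy is ignored, the same outage outside a deploy becomes a ticket, and only `checkout` breaching its threshold wakes someone up.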
A current discussion for us is how we'll standardize the interchange format between various services (something like Protobufs). That's probably worth figuring out early.
Looks great, thanks for the recommendation!