Each piece of our infrastructure is monitored with Nagios, which alerts us immediately if anything goes wrong (hopefully before anyone else notices) and gives us a level of granularity that really helps us resolve problems quickly when things are on fire. Nagios