

It was with a bit of a laugh that I read this note from Technorati today. The system had been off for about an hour. When it returned I went to the message they had highlighted at the top of the screen...
Technorati fell victim to a swarm of power outages beginning at about 1:50 p.m. PDT Tuesday, July 24, 2007 that affected all of San Francisco and as far south as Daly City and South San Francisco, according to news reports. Our offices at Third and Townsend and one of our two co-location facilities were affected. We are working with our co-location facility managers to assess why it is back-up power generators failed to provide the necessary back-up power to prevent our site going down. We apologize for any inconvenience caused by our site being unavailable this afternoon.
It would appear that at least one of their co-location facilities was not far enough away to escape the power outage that hit their offices. Since California has experienced so many power failures in the past few years, it would seem like a good idea to use a co-location facility in, say, Seattle, Chicago, or Boston. The point here is not to berate Technorati. The point is that when you consider a backup system, be sure to think about what you want from it. What are you looking to prevent? Then test your system on occasion to see that it really does support the goal you intended.
One company I ran put the backup tapes in a drawer by the computer. It wasn't until a consultant pointed out that they would be lost along with everything else in a fire that we started to store them offsite. Later, when another company I was at started using a co-location facility, we chose one that was far away, so that we could take control from our internal servers if that system went down. Our key customers were also given the IP addresses needed to reach the backup system, in case they needed access before the updated domain information had propagated.
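For readers who want to automate the idea of handing customers a backup address, here is a minimal sketch in Python. It is only an illustration, not what either company actually ran: the function names `tcp_reachable` and `pick_endpoint` are made up for this example, and the check is a simple "can I open a TCP connection" test, which is an assumption about what counts as "up".

```python
import socket
from typing import Callable, Optional, Sequence


def tcp_reachable(host: str, port: int = 80, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def pick_endpoint(
    endpoints: Sequence[str],
    is_up: Callable[[str], bool] = tcp_reachable,
) -> Optional[str]:
    """Return the first endpoint that answers, trying them in priority order.

    Returns None if nothing in the list responds.
    """
    for endpoint in endpoints:
        if is_up(endpoint):
            return endpoint
    return None
```

A customer-side script could call `pick_endpoint(["www.example.com", "192.0.2.10"])`, where both values are placeholders: the normal hostname first, then the backup system's IP address. Running a check like this on a schedule is also one cheap way to test, on occasion, that the backup really answers.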
Redundant systems perform the role of service insurance. They are often ignored and considered a waste of money and effort until the day they are needed. If they fail, then everyone wonders why more care was not taken. If they succeed, then everyone is more than grateful for the time and effort spent.
(To be fair to the folks in Redmond - here is the Mac Black Screen of Death)
Do you have backup systems for your critical components? Do you have backups for your data? Do you have a contingency plan for your supply line or critical parts? What happens if you lose your phone, computer, or building? What backup do you have for your key people should they become incapacitated, fall ill, or leave?


