"They said it couldn't happen"

Published: 2010-06-16
One of our data centers has suffered repeated incidents over the last few months, far more than expected and far more than our other data centers. To sort things out, we spent the better part of the day running a full audit: infrastructure, procedures, etc.

Here are a few random thoughts about what we learned. Nothing extraordinary, but does it ever hurt to go back to basics? I think not.
The list goes on. This may sound harsh or paranoid... and maybe it is.

However, can you afford NOT to consider the worst scenarios and see how well the data center would survive them?

What will you tell your CEO and your customers when power fails at the busiest time of the year? Or when planned maintenance goes wrong? Or when public construction cuts your "redundant" fibers?

"They said it couldn't happen" ? That just doesn't work.

Remember: the more you sweat in training, the less you bleed in combat. Make sure you and your team sweat a lot.

PS: kudos to the audit team today (you know who you are). You made me proud :)

About the Author

Julien Simon is the Chief Evangelist at Arcee AI, specializing in Small Language Models and enterprise AI solutions. Recognized as the #1 AI Evangelist globally by AI Magazine in 2021, he brings over 30 years of technology leadership experience to his role.

With 650+ speaking engagements worldwide and 350+ technical blog posts, Julien is a leading voice in practical AI implementation, cost-effective AI solutions, and the democratization of artificial intelligence. His expertise spans open-source AI, Small Language Models, enterprise AI strategy, and edge computing optimization.

Previously serving as Principal Evangelist at Amazon Web Services and Chief Evangelist at Hugging Face, Julien has helped thousands of organizations implement AI solutions that deliver real business value. He is the author of "Learn Amazon SageMaker," the first book ever published on AWS's flagship machine learning service.

Julien's mission is to make AI accessible, understandable, and controllable for enterprises through transparent, open-weights models that organizations can deploy, customize, and trust.