Partial systems outage 26/11

Posted by Karanbir Singh on 26/Nov/2009  ~  Posted in: Linux

At approximately 16:45 the Coreix data center suffered a power loss. The UPS took over the load for a short while, but the generators did not come online in time, and power was not restored until 7:35. Details are posted on the Coreix status page at http://status.coreix.net/

Although power was restored to the entire DC, only one of the machines came back online on its own. All the rest have needed some level of manual intervention. I'm working with the support people (who are a really good and effective bunch of guys) to get the other machines online.

Services affected are:

  • RPMForge svn repo
  • RPMForge master mirror
  • RPMForge mailing lists
  • Karan.org Build services
  • Karan.org testing services
  • CentOS.org ipv6 test / qa setups
  • CentOS.org Package and Automated testing development machine

Once all services are restored, I'll update this blog post with details. Apologies for this completely unplanned and avoidable outage.

Update: as of 23:45 26th Nov, all services are now restored.