  #1  
2005-06-09, 14:16
Mitch
ServerBeach Forum Admin
Join Date: 2003 Jul
Posts: 78
URGENT: Virginia datacenter customers 2005-06-09

Dear ServerBeach Customers,

We are experiencing power problems in our Virginia Datacenter. There is a power outage affecting the entire building.

We are running on backup power, but the generators have not taken the power load from the UPS. Therefore, we have taken the rather extreme action of shutting down servers before we experience irreparable hardware failure.

We are in constant communication with the engineers and expect to have the situation rectified ASAP.

We apologize for this inconvenience and will update you as soon as possible.

Robert Miggins
COO - ServerBeach
  #2  
2005-06-09, 15:04
knightfoo
Code Ninja
Join Date: 2003 Jul
Location: San Antonio, TX
Posts: 2,568
More information on the power failure:

- The building where our datacenter is located lost power.
- The UPS systems did function properly and carried the power load.
- The generators did start; however, the transfer switch that moves load from commercial power to generator power did not function properly.
- The UPS batteries were drained because the generator power could not be transferred.
- The UPS system does not power the HVAC units, so the temperature inside the datacenter climbed very quickly, forcing us to shut down ALL servers to prevent heat from damaging hardware and data.
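
The failure sequence in the list above can be modeled as a small decision function. This is only an illustrative sketch: the `PowerState` fields and both function names are invented for the example and do not describe the building's actual control logic.

```python
from dataclasses import dataclass


@dataclass
class PowerState:
    """Hypothetical snapshot of the power chain described in the post."""
    utility_up: bool            # commercial feed available
    generator_running: bool     # generators have started
    transfer_switch_ok: bool    # switch can move load to the generators
    ups_minutes_left: float     # assumed remaining battery runtime


def active_source(state: PowerState) -> str:
    """Return which source is carrying the server load."""
    if state.utility_up:
        return "utility"
    # A running generator is useless unless the transfer switch works.
    if state.generator_running and state.transfer_switch_ok:
        return "generator"
    if state.ups_minutes_left > 0:
        return "ups"
    return "none"


def hvac_powered(state: PowerState) -> bool:
    """Per the post, HVAC is not on the UPS, so it needs utility or generator power."""
    return active_source(state) in ("utility", "generator")
```

With the utility feed down, the generators running, and the transfer switch failed, `active_source` returns `"ups"` and `hvac_powered` returns `False`: servers keep running on battery while the room heats up, which is the situation described.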

The generators and UPS systems are the responsibility of the owners of the building in which we lease space; they had been tested and were expected to work properly. It has not yet been determined why the load was not transferred to generator power.

Again, this outage only affects servers located in our Virginia datacenter (69.44.x.x, 69.45.x.x, 64.34.x.x). MyServerBeach logins have been temporarily suspended due to the increased load from ticket submissions. We are working diligently with the building engineers and our crew in Virginia to make sure the power is brought back online as soon as possible.
__________________
I am not a ServerBeach employee, but I used to play one at work.
Real admins run Debian!
Recursive; adj. See Recursive
  #3  
2005-06-09, 16:07
knightfoo
Quick Update

The building engineer notified us that the commercial power feed is "flickering", which indicates that the utility company is working on the issue outside of the building. We are bringing in all available employees so that as many people as possible are on hand when power is restored and all servers can be powered on as quickly as possible.
  #4  
2005-06-09, 16:43
knightfoo
Power On!

We just received word from our datacenter team that power has been restored to the network room and some customer servers. Not all rows have power, and we are working on getting internal systems and network gear turned on first. Power will be restored in phases until all servers are powered on.

Thanks for hanging in there with us! We still have a lot of work to do, but things are looking up now.

-knightfoo
  #5  
2005-06-09, 17:22
knightfoo
The building engineers are working on some residual power issues and are trying to make sure power is stable within the datacenter. Most customer servers have been turned on, but some of our network gear is still not powered. We are working to get power restored to essential equipment as soon as possible.

-knightfoo
  #6  
2005-06-09, 17:50
knightfoo
Power has been restored to our edge routers, but we are still working on powering up the distribution routers. This means that even though servers are powered on, network connectivity has not yet been restored. The situation is being handled by our very capable network engineers and the distribution routers should be online shortly.

-knightfoo
  #7  
2005-06-09, 18:10
knightfoo
Network mostly online

Network connectivity has been restored to some areas of the datacenter. We do not yet have an accurate count of exactly how many servers have network connectivity and how many may still be down for other reasons. We are going to start checking servers (ping sweeps and manual checks) to make sure everything comes back online properly.
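
For readers curious what a ping sweep involves, here is a minimal sketch in Python. It assumes a Linux `ping` that accepts `-c` (packet count) and `-W` (timeout in seconds). The function names are hypothetical, the range in the comment is a documentation range rather than a ServerBeach subnet, and this is not necessarily the tooling the datacenter team uses.

```python
import ipaddress
import subprocess
from typing import Callable, List


def ping(host: str) -> bool:
    """Send one ICMP echo request; return True if the host answered.

    Assumes Linux iputils flags: -c 1 (one packet), -W 1 (1 second timeout).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def sweep(cidr: str, probe: Callable[[str], bool] = ping) -> List[str]:
    """Probe every usable host address in `cidr`; return those that did not answer."""
    network = ipaddress.ip_network(cidr)
    return [str(host) for host in network.hosts() if not probe(str(host))]


# Example with a documentation range (not a real customer subnet):
# needs_manual_check = sweep("192.0.2.0/30")
```

The `probe` parameter exists so the sweep can be exercised with a stand-in function, or swapped for a different kind of check, without touching the loop.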

-knightfoo
  #8  
2005-06-09, 18:44
knightfoo
As of 6:30 PM CDT, network connectivity has been restored to 95% of our network equipment. Datacenter technicians are still checking servers and we are trying to hammer out any final issues. The following IP subnets are still without connectivity, so if your server is inside one of these ranges, it may not be accessible yet:

64.34.166.0/26
64.34.166.64/26
64.34.168.192/26
64.34.171.0/26
64.34.171.64/26
64.34.172.0/26
64.34.172.64/26
64.34.174.0/26
64.34.178.0/26
64.34.178.64/26
66.135.35.0/26
66.135.37.0/26
66.135.37.128/26
66.135.39.128/26
66.139.76.192/26
66.139.77.0/26
66.139.77.64/26
69.44.57.0/26
69.44.57.64/26
69.44.58.192/26
69.44.60.0/26
69.44.60.64/26
69.44.62.128/26
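
To check whether a particular server address falls inside one of the ranges listed above, Python's standard `ipaddress` module is enough. This is a small sketch; the function name `affected_subnet` is made up for the example, and the subnets are copied from the list in this post.

```python
import ipaddress

# Subnets reported without connectivity as of 6:30 (copied from the post).
AFFECTED = [ipaddress.ip_network(cidr) for cidr in """
64.34.166.0/26 64.34.166.64/26 64.34.168.192/26 64.34.171.0/26
64.34.171.64/26 64.34.172.0/26 64.34.172.64/26 64.34.174.0/26
64.34.178.0/26 64.34.178.64/26 66.135.35.0/26 66.135.37.0/26
66.135.37.128/26 66.135.39.128/26 66.139.76.192/26 66.139.77.0/26
66.139.77.64/26 69.44.57.0/26 69.44.57.64/26 69.44.58.192/26
69.44.60.0/26 69.44.60.64/26 69.44.62.128/26
""".split()]


def affected_subnet(ip: str):
    """Return the listed subnet containing `ip`, or None if it is not listed."""
    address = ipaddress.ip_address(ip)
    for network in AFFECTED:
        if address in network:
            return network
    return None
```

Each /26 covers 64 addresses, so `64.34.166.64/26` spans 64.34.166.64 through 64.34.166.127. An address such as 64.34.166.130 sits in the next block, which is not on the list.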


-knightfoo
  #9  
2005-06-09, 18:48
knightfoo
Ok, new list as of 6:45 PM CDT:

64.34.171.0/26
64.34.171.64/26
64.34.172.0/26
64.34.172.64/26
64.34.174.0/26
64.34.178.0/26
64.34.178.64/26
66.135.35.0/26
66.135.37.0/26
66.135.37.128/26
66.135.39.128/26
69.44.62.128/26
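
Comparing this list with the 6:30 list shows which subnets came back in between. The sketch below does the comparison with set arithmetic; the variable names are arbitrary and both lists are copied from the two posts.

```python
import ipaddress

LIST_0630 = """
64.34.166.0/26 64.34.166.64/26 64.34.168.192/26 64.34.171.0/26
64.34.171.64/26 64.34.172.0/26 64.34.172.64/26 64.34.174.0/26
64.34.178.0/26 64.34.178.64/26 66.135.35.0/26 66.135.37.0/26
66.135.37.128/26 66.135.39.128/26 66.139.76.192/26 66.139.77.0/26
66.139.77.64/26 69.44.57.0/26 69.44.57.64/26 69.44.58.192/26
69.44.60.0/26 69.44.60.64/26 69.44.62.128/26
"""

LIST_0645 = """
64.34.171.0/26 64.34.171.64/26 64.34.172.0/26 64.34.172.64/26
64.34.174.0/26 64.34.178.0/26 64.34.178.64/26 66.135.35.0/26
66.135.37.0/26 66.135.37.128/26 66.135.39.128/26 69.44.62.128/26
"""


def parse(block: str) -> set:
    """Turn a whitespace-separated block of CIDR strings into a set of networks."""
    return {ipaddress.ip_network(cidr) for cidr in block.split()}


before = parse(LIST_0630)
after = parse(LIST_0645)

restored = sorted(before - after)       # listed at 6:30, gone by 6:45
still_down = sorted(after)              # everything on the 6:45 list
newly_listed = sorted(after - before)   # would flag a subnet that newly dropped

for network in restored:
    print("restored:", network)
```

Going by the two lists, 11 of the 23 subnets were restored between the two updates and 12 remained down.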


-knightfoo
  #10  
2005-06-09, 19:48
knightfoo
OK, as of now all of our network infrastructure is online and should be functioning properly. There are still servers that are down and we are running ping sweeps and manual checks to get them all back up and running. If your server is firewalling ICMP ping, it will still be checked. Due to the abrupt shutdown, there may be many servers requiring manual filesystem checks.

If your server is still down, rest assured that we are checking every single server that is not responding to ping sweeps. If your server is pinging but you cannot log in, feel free to submit a support ticket. Our datacenter crew will be working through the night or as long as it takes to get everyone back up and running.
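
The post does not say how servers that firewall ICMP are checked. One common approach, offered here purely as an assumption and not as ServerBeach's procedure, is to try a TCP connection to a port the server is expected to serve. The function name, the default port 22 (SSH), and the timeout are all illustrative choices.

```python
import socket


def tcp_reachable(host: str, port: int = 22, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port completes within `timeout`.

    A refused connection also shows the host's network stack is up, but this
    sketch only treats a completed handshake as "reachable".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False
```

A customer could run the same kind of check from outside, for example against their web or SSH port, to tell "server is up but not answering ping" apart from "server is down".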

-knightfoo
  #11  
2005-06-09, 22:54
knightfoo
I just received an update from Virginia that the commercial power feed has failed again and the datacenter is currently running on UPS power. The power company out there has over 5,000 customers without power, and we are in the affected area.

We are in the same situation as we were earlier, with the temperature quickly rising because the HVAC is not working. Several rows of servers in the hottest part of the datacenter were shut down before they suffered heat damage or failed hard.

We are all going to be working through the night to make sure this situation doesn't get out of hand and to get it repaired as soon as possible.

-knightfoo
  #12  
2005-06-09, 23:20
knightfoo
Yo-yo

The power is back online (again) and HVAC is running. The temperature is dropping below 85F now and the datacenter team is turning back on the servers that had to be powered off about 30 minutes ago. The power glitched for about 3 seconds, which caused all servers to reboot, but they should be coming back up now. Again, we are sweeping the datacenter and checking out servers that do not respond to the network.

-knightfoo
  #13  
2005-06-10, 01:39
knightfoo
Good News

Hopefully this will be my last update tonight .. things are starting to calm down and get back to normal.

The datacenter is back online, running on utility and UPS power. Our San Antonio crew landed, spare hard drives and power supplies in hand, just in time to hear the news about the second power outage. The building engineers reported that they are not comfortable with the ATS (automatic transfer switch) yet, so guess who gets to spend the night next to the manual transfer switch? There will be an engineer on-site, ready to transfer power to the generator should there be any further power interruptions. We will have more engineers on hand tomorrow, as well as our Virginia operations staff, to go over the systems and investigate what happened.

Most servers are online at this point, but there are some servers that are still down due to hardware failures or filesystem errors. With the extra staff from San Antonio, we've doubled the manpower available to troubleshoot and fix down servers. They are going through down server tickets in the order they were received and we are still performing network checks to verify all servers are online. I don't think anyone in Virginia will be sleeping tonight until every server is online or on its way to being online.

I would like to thank you all for being patient throughout this ordeal.

-knightfoo

All times are GMT -5.