Server Downtime. Yeah, I know. No one noticed. I’m going to explain it here anyway… I won’t go into detail here about my actual server setup, that’s a post for another time, but I’ll outline the basics. Starting my day on Saturday I was running three physical servers, four more virtual servers, plus a 2TB NAS array. One of my servers mostly handles email, and that one is running fine. The other two were storage/virtual server hosts. Those two were in a bit of trouble and needed to be given the once-over and combined into one server that I thought would be enough to get me through another few months. Now, I will mention that I use the word server a bit loosely here; none of my hardware is really server grade at this point, but those machines are not desktops and sit in a closet and serve all day, every day.

I took two of my servers offline with the hope of taking the better bits of hardware from both and making one really effective server. I had two RAID arrays running in critical states, OS installs that were quite crippled due to some of my past mistakes, and some messy configuration hacks that needed to be cleaned up and streamlined. I consolidated the hardware from the older, slower machine into the faster machine and got ready to power back up. I had created a couple of new RAID arrays, doubled my memory, etc., and headed back out to the server closet to get started on a fresh install of my favorite version of Linux.
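Since I mention the arrays were in critical (degraded) states: on Linux software RAID you can spot that in /proc/mdstat, where an underscore inside the status brackets marks a missing member. The snippet below is just a sketch with a made-up mdstat sample standing in so it runs anywhere; on a live box you’d read /proc/mdstat itself.

```shell
# Sketch: detecting a degraded md array. The mdstat text below is a
# hypothetical sample; on a real host, use: cat /proc/mdstat
mdstat_sample='md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks [2/1] [U_]'

# "[2/1]" means 2 members expected, 1 present; "_" marks the missing disk.
if printf '%s\n' "$mdstat_sample" | grep -q '\[.*_.*\]'; then
    echo "array degraded"
else
    echo "array healthy"
fi
```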
Everything seemed to go OK, but the system was painfully slow and nearly unusable. This was odd, as the machine was under no real load. I hadn’t even installed VMware Server yet, let alone powered on any virtual machines. I found performance to be unacceptable, though, and I headed out to take a look around and see what the trouble was. After some time poking and prodding I assumed I had made some kind of mistake and got ready to redo the server again. I got everything up and running, and this time for the net install I chose different mirrors to pull files from and tried again. This time performance was better and I thought I was in the clear. I got the system running, installed VMware Server, and then it happened: the system locked hard. I went out and restarted it, and it did it again, but I saw no errors in any of the system logs. That told me, ‘You have a hardware issue.’ I poked around considering what it could be. I mean, earlier this week this server was in thermal shutdown even though the server closet was down in the 30s, temperature-wise. I pulled out the extra RAM I had added from the other server and hoped for the best. That processor has been tortured over time, unfortunately, due to many heat issues, and I was really hoping it had not died. I powered the old gal back on and hoped for the best.
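A hard lock with clean system logs is a classic hardware tell, though before pulling parts it can be worth grepping the kernel ring buffer too, since machine-check or memory errors sometimes land there even when nothing shows in the regular logs. This is just a sketch with a made-up sample line so it’s self-contained; on a live system you’d pipe real `dmesg` output through the same grep.

```shell
# Sketch: scanning kernel messages for hardware-error hints. The sample
# line below is hypothetical; on a real host:
#   dmesg | grep -iE 'mce|ecc|hardware error'
dmesg_sample='[ 1234.567890] mce: [Hardware Error]: Machine check events logged'

if printf '%s\n' "$dmesg_sample" | grep -qiE 'mce|ecc|hardware error'; then
    echo "possible hardware fault, suspect RAM/CPU"
else
    echo "no obvious hardware errors logged"
fi
```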
It’s been on ever since. Turns out it was the RAM. That’s quite unfortunate, though, because now I’m running the host OS and four virtual servers on 1GB of RAM. Sorry if the site is a bit slow for a few more months… I’m doing a bit of paging. I’m quite impressed that the processor still lives on, though. It had been overheating all summer and regularly overheated when I was using it as my audio recording frontend. Intel knows how to make a killer processor, that is for sure. Next time I just hope that when I apply the thermal grease I do it correctly.
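If you want to see how much paging a box like this is actually doing, vmstat’s si/so columns (swap-in/swap-out) are the quick check; nonzero values under steady load mean the machine is actively swapping. The sample data line below is made up so the sketch runs anywhere; live, you’d run `vmstat 5` and watch those two columns.

```shell
# Sketch: parsing vmstat's si/so columns to detect active paging.
# The data line is a hypothetical sample; live usage: vmstat 5
# fields:      r  b   swpd   free  buff  cache  si   so  bi bo us sy
vmstat_line=' 1  0 524288  10240  2048  65536  120  340  10 22 99 88'

si=$(printf '%s\n' "$vmstat_line" | awk '{print $7}')
so=$(printf '%s\n' "$vmstat_line" | awk '{print $8}')

if [ "$si" -gt 0 ] || [ "$so" -gt 0 ]; then
    echo "actively paging (si=$si so=$so)"
else
    echo "no paging right now"
fi
```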
So, that’s why the site was down. I hope to get a new server in the next few months, but we will see how that plays out…