Did we get another DoS false positive overnight? I think I've got all the servers restarted now.
Down earlier due to a semi-planned move being aborted halfway and the network cable being plugged back into the wrong port (gg triple nic motherboard)
Packet loss about 2 hours ago due to a massive (40+ Gbit) DDoS on another customer. Probably a lot more where that came from too after the recent NTP shite. But yeah, DDoS detection system did its job very effectively (perhaps too effectively)
Down for an hour or so this morning due to a massive mail queue fuck up freezing php-fpm and eventually somehow killing nginx entirely. Entirely unrelated to the move, which has not yet occurred (pushed to an unknown later date)
Seems to be connectivity/upstream issues, extremely high pings etc. Gameservers have been disabled twice now by the DDoS detection thingy, so probably best they stay off until this is resolved
Intermittent downtime for the past 3 hours, and my sleeping pattern is fucked beyond belief thanks to the incompetents behind LVM2
prophunt stats DDoS actually managed to bring down murmur for about 30 mins, conveniently while i was afk
Server went down for unknown reasons at approx 08:45 BST and came back rebooted at approx 09:30 BST. For some reason the web shit didn't start automatically until I noticed it was fucked (about an hour ago), so all websites were down for about 7 hours
Have now added a fix so the second bit shouldn't be able to happen again. Can't diagnose or do anything about the spontaneous reboot, and hopefully it's not indicative of anything bad
late x9000 but I've determined this was a power failure of some sort, as other bandwidth.co.uk servers have an almost identical uptime. So gg, nothing to worry about, still stable as fuck
Site was down for the past hour due to a kernel bug (the first bug I've seen in the mainline kernel in quite a long time) combined with the excellent logic of grsecurity, which decided to ban the user that web services run under
Code:
[Mon Jun 30 19:41:37 2014] grsec: banning user with uid 500 until system restart for suspicious kernel crash
Starbound servers will be down for the next ~12 hours because disk space is about to run out and I'm juggling partitions
Edit: Maintenance complete, everything is now gravy
As suspected it was a routing issue, caused by the internet as a whole hitting 512k BGP routes for the first time ever. In other words, someone literally broke the internet
http://www.reddit.com/r/sysadmin/comments/2dcol3/the_internet_hit_512k_bgp_routes_today_causing/
"Tldr. BGP is the backbone of the internet and the internets just got fat enough for the backbone to start cracking."