Server Downtime log

Discussion in 'Server/Website Chat' started by Spykodemon, 19 Dec 2009.

  1. Something identical to the 512k BGP issue is happening again, killed all gameservers as a precaution
  2. And again, killed all gameservers
  3. TF2 servers in segfault loop after latest update, will investigate soon
  4. Servers are now working again
  5. Site was down from 08:00 to 14:30 UTC+1 for unknown software reasons

    TF2 servers remain broken, investigating now
  6. Site was down from approx 08:00 to 11:50 UTC+1 due to nginx crashing and me being asleep. I've installed updates and added an auto-restart system that should prevent this happening again.

    Edit: Very spooky that this started at the same time as above
  7. Prophunt continually crashing resulted in an important partition filling up with coredumps, which then killed MySQL and caused minor corruption

    Issues began at 5am from what I can see, but it likely took several more hours to reach the point where shit started becoming unavailable. All resolved now :)
  8. Down for the past hour due to a botched system update
  9. Virgin Media has been massively saturated at LINX since like 10am or so this morning, so anyone on Virgin will be getting awful connectivity to GM and to many other sites worldwide today
  10. After having just about everything that could possibly go wrong end up going wrong in the latest system update, the TF2 servers are now back up

    The above Virgin issue is also resolved now
  11. nlspeed:

    I see nothing in the server list on the main page here, but assuming they work again, yay! Good job!
  12. Really not sure what is going on here - the way they are being turned off can only be performed manually by a manager

    Edit: Oh derp, it's the thing that shuts servers down to stop embarrassingly bad ping when we get DDoSed. It's being triggered by the internet being generally totally fucked atm
  14. I wish I knew what happened last night, just about managed to drunkenly type reboot in SSH and thankfully everything started working again
  15. Starbound was filling /tmp with shit (which was bringing down many services). I've now made it so this is all removed during the restart that happens every 3 hours
  16. Finished applying updates, all is gravy
  18. Was down for a quick update there to try to alleviate some issues I've been seeing over the past couple of weeks
  19. Server crashed and rebooted around 5-6am

    On the bright side, it's managed to stay stable 24/7 for nearly the entire time I've been working at Jagex - ~183 days uptime. It looks like the crash was a one-off issue with memory management from the huge amount of uptime

    Taking this opportunity to do a long overdue system update, expect things to break...
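
For anyone curious what the cleanup for the coredump and /tmp problems above could look like, here is a minimal sketch in Python. This is not the actual script running on the server; the function names `disk_usage_fraction` and `remove_stale_files`, the example directory and the age threshold are all made up for illustration. The idea is simply to check how full a partition is and delete old files from a directory before it fills up.

```python
import os
import shutil
import time


def disk_usage_fraction(path):
    """Return the used fraction (0.0 to 1.0) of the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def remove_stale_files(directory, max_age_seconds, now=None):
    """Delete regular files directly inside directory older than max_age_seconds.

    Returns the list of removed paths. Subdirectories and symlinks are left alone.
    """
    now = time.time() if now is None else now
    # Snapshot the directory listing first so we are not deleting while iterating.
    with os.scandir(directory) as it:
        entries = list(it)
    removed = []
    for entry in entries:
        if not entry.is_file(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.path)
    return removed
```

Something like `remove_stale_files("/tmp/starbound", 3 * 3600)` (hypothetical path) could then be called from whatever performs the periodic restart, with `disk_usage_fraction` used to raise an alert well before the partition is actually full.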

