Site Outage

Sorry everyone, the site has been down since about 3 am PST. I’m not exactly sure what happened. Something in the database was corrupted and I had to restore from yesterday’s backup. I’m going to try to restore today’s posts as soon as possible.

-Charlie

2 comments
  1. jayessell says: March 26, 2008 10:05 am

    Was this your first outage, Charlie?
    I was expecting the “Car Door Dog Bag” comments
    to crash your system, not “Life in 2008”.
    It seems ok now.

  2. Charlie says: March 26, 2008 10:26 am

    No, it used to be that every time something from the site got on the front page of Digg, the site would go down. In the early days, when I was hosting with DreamHost, it would take about 2 minutes of that volume and blammo, dead site. Then I moved to a dedicated machine. It would last longer, but eventually it would start thrashing, the CPU usage would go to 100%, and it would become unusable, to the point that I couldn’t even log in through telnet or ssh. Then I reconfigured Apache to run fewer processes. After that it wouldn’t max out the CPU or thrash the disk, but everything would slow to a crawl as all the requests queued up. Finally, I added another gig of RAM about a month ago.

    I still had to tune the Apache settings, but I found that I couldn’t really do that accurately unless the site was under extremely high load. I was lucky with the Car Dog Bag article in that I could see it coming, and I was ready and waiting when it got to the front page of Digg. I kept adjusting the settings and restarting Apache until it was just about maxing out the CPU and memory but not hitting swap, and everything calmed down. Now it seems to be able to handle the load just fine. The site feels really responsive even when it’s getting pounded.

    I’m not sure what happened yesterday. I haven’t gotten a chance to go through the logs or the db, but something in the WordPress MySQL db got corrupted. This was bad because the service I use to send me a text message when the site is unreachable never fired: the site wasn’t unreachable, it just served blank pages. When I went to the admin site it kept telling me it needed to upgrade the database, but the upgrade wouldn’t work. So I restored from backup and everything seems alright.

    I’m actually pretty amazed at how well the site has held up over the last few days. Since Monday morning the Life in 2008 article has been linked to from the front pages of Digg, Reddit, Slashdot, Delicious, Boing Boing and even TechCrunch. During that time we’ve served over 260,000 pages and, with the exception of that outage, everything has been smooth. For comparison, this site served about 530,000 page views for the entire month of November. This month (for the first time) we’re over 1,000,000 page views.
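
The blank-page problem described above is a common gap in uptime monitoring: a check that only asks "did the server answer?" passes even when the pages are empty. Below is a minimal sketch of a check that also looks at the content. It is not the monitoring service mentioned in the comment. The function names, the 512-byte threshold, and the idea of searching for a marker string (for example the closing `</html>` tag) are all assumptions made for illustration.

```python
import urllib.error
import urllib.request


def classify_response(status, body, marker, min_bytes=512):
    """Return "ok", or a short reason why the page looks broken.

    status    -- HTTP status code of the response
    body      -- response body as text
    marker    -- string a healthy page should contain (hypothetical choice)
    min_bytes -- pages shorter than this are treated as suspect (arbitrary)
    """
    if status != 200:
        return f"bad status {status}"
    text = body.strip()
    if not text:
        # The server answered, but with nothing: the failure mode above.
        return "blank page"
    if len(text) < min_bytes:
        return "suspiciously short page"
    if marker not in text:
        return "expected marker missing"
    return "ok"


def check_site(url, marker, timeout=10):
    """Fetch url and classify the result; never raises on network errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return classify_response(resp.status, body, marker)
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx/5xx responses.
        return f"bad status {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # DNS failure, refused connection, timeout, and similar.
        return f"unreachable: {exc}"
```

Calling `check_site("https://example.com/", "</html>")` from a cron job and sending an alert whenever the result is not `"ok"` would catch both an unreachable site and one that is up but serving empty pages. The URL here is a placeholder.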
