GoogleBot goes berserk! The apocalypse is nigh!
by ZetaGecko | 5 Comments | Internet, Issues/Problems
Question: What would cause Google to load one page (http://www.geckotribe.com/rss/) over 700 times in 24 hours? Answer: I'll tell you when I find out.
Over the last few weeks, an ever-increasing number of requests have been coming in from a few of Google's IP addresses, loading the same page over and over and over... The page in question is updated at most once per hour. Every time they load it, the Referer header lists "http://www.mouken.com/rss/", which is the old address of the page from before I changed my company name over a year ago. Today, the number of page loads skyrocketed from about 300/day, the level it had slowly risen to, to over 700 in the last 24 hours!
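For anyone who wants to measure this kind of thing on their own server, here is a minimal sketch of counting requests per client IP for one page and one Referer over a 24-hour window. It assumes the server writes Apache's "combined" log format, which may not match your setup, and the function name `count_hits` is made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumed layout: Apache "combined" log format, i.e.
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def count_hits(lines, path, referer, now=None, window=timedelta(hours=24)):
    """Count requests per client IP for `path` sent with `referer`
    during the `window` that ends at `now` (a timezone-aware datetime)."""
    if now is None:
        now = datetime.now(timezone.utc)
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        in_window = now - window <= when <= now
        if in_window and match.group("path") == path and match.group("referer") == referer:
            hits[match.group("host")] += 1
    return hits
```

Called as `count_hits(open("access.log"), "/rss/", "http://www.mouken.com/rss/")`, it returns a `Counter` keyed by IP address, so `sum(hits.values())` gives the 24-hour total. The log file name here is a placeholder.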
I've contacted Google about the problem. The first time, they didn't quite seem to get the message and replied that they had slowed the crawling of my website. "Thanks, but that's not the problem," I replied. I've gotten no response to that (not yet anyway, perhaps I will within a few days), so today (or was it yesterday?) I posted a brand new comment that was hopefully worded more clearly (not that the other one wasn't clear--just easier to gloss over if you handle too many messages like this in one day). At that time, I was up to about 400 requests in 24 hours. Now it's just getting ridiculous.
My first response, before the latest burst, was to add a little code to the page that sends a very small response when the page is loaded with the Referer header that Google is sending, and requires another click to see the real page. (And no, I don't think that's what caused the new burst. Otherwise I'd expect a different Referer header when they followed the link... but then who knows? If they're doing something screwy, maybe they're doing something even screwier too.)
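The Referer check described above can be sketched roughly like this. This is not the code actually running on the page. It is a hypothetical Python WSGI wrapper, and the names `OLD_REFERER` and `stub_for_old_referer` are invented for the example. The idea is simply: if the request carries the old address as its Referer, send back a tiny page with a link, otherwise serve the real thing.

```python
# The Referer value Google keeps sending (the page's old address).
OLD_REFERER = "http://www.mouken.com/rss/"


def stub_for_old_referer(real_app):
    """Wrap a WSGI app so that requests carrying OLD_REFERER get a very
    small page with a link, instead of the full (expensive) page."""
    def wrapper(environ, start_response):
        if environ.get("HTTP_REFERER", "") == OLD_REFERER:
            # A click on this link is sent with the current page as its
            # Referer, so the follow-up request falls through to real_app.
            body = b'<a href="/rss/">Continue to the page</a>'
            start_response("200 OK", [
                ("Content-Type", "text/html; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        return real_app(environ, start_response)
    return wrapper
```

A real visitor arriving from an old link pays one extra click, while a crawler stuck on the old Referer only ever costs a few dozen bytes per request.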
The next step, which I'm holding off on for just a bit in case leaving things as they are helps them fix the problem, will be to update robots.txt to block them from that page completely. Here's hoping the machines haven't already taken control in Mountain View.
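For reference, a robots.txt rule along those lines might look like the `ROBOTS_TXT` string below. This is only a sketch: it uses Python's standard `urllib.robotparser` to check what the rule would do before deploying it, and the helper name `is_allowed` is made up. Note that `Disallow: /rss/` is a prefix match, so it blocks everything under /rss/, not just the one page.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rule set: keep Googlebot away from /rss/ and everything
# beneath it, and leave all other crawlers unaffected.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /rss/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

For example, `is_allowed(ROBOTS_TXT, "Googlebot/2.1", "http://www.geckotribe.com/rss/")` reports whether the rule blocks the page in question. Whether a misbehaving crawler actually honors the file is another matter.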
December 2nd, 2005 at 10:45 am
I got a response from Google just now indicating that they believe they've fixed the problem. They didn't indicate what the cause was, but I wouldn't expect them to. I'll comment again if the problem hasn't been solved. Otherwise, assume that they've taken care of it.
Whew!
December 8th, 2005 at 5:09 pm
No such luck. In the last 24 hours, they've hit the page over 1000 times--it just keeps getting worse! We're still in communication though, and I'm sure we'll get it fixed...eventually.
December 10th, 2005 at 6:42 pm
I think they've fixed it now. The number of hits in the last 24 hours has plummeted from about 1150 to 93, and I'm guessing that those 93 are just the tail end of the flood, which will disappear in a few hours! At last!
December 12th, 2005 at 7:35 am
"We're baaack!" Yep, back over 400 again. This is beginning to feel like some kind of sick social experiment. I'll post again after Google has left the page alone for a week.
December 18th, 2005 at 7:10 am
I have the same problem. It even crashed my Apache.