Apologies for the following post. It’s really quite dull.
Internet Seer, on the face of it, offer a pretty useful service. What they do, according to the sales pitch, is monitor the availability of your website. They’ll ‘ping’ your domain on a regular basis to check whether it’s online or not, and if they don’t get a response, you’ll receive an e-mail letting you know. Very useful.
In reality, however, I suspect that Internet Seer are not as sweet and lovely as they might appear at first glance, a theory backed up by a little bit of googling – there’s this example, for instance. Or this sorry tale. Or this little nugget. Or this healthy dose of vitriol. Or this… you get the picture. And now I’m joining in.
Let’s start here. On this page Internet Seer claim that they adhere to the robots exclusion standard, which is a way for webmasters to declare that they don’t want particular parts of their sites to be indexed, or that they don’t want to allow certain robots access at all. A random sample of five recent days’ log files reveals that the Internet Seer robot requested my robots.txt file exactly ZERO times, completely contradicting their own claims.
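For anyone unfamiliar with the standard, here is a rough sketch of what a compliant robot is supposed to do before fetching a page, using Python’s standard `urllib.robotparser`. The robots.txt contents below are made up for illustration, and the `InternetSeer` user-agent token is an assumption (I haven’t confirmed what token their robot would actually honour).

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. The "InternetSeer" token and the /private/
# directory are assumptions for illustration, not taken from a real site.
ROBOTS_TXT = """\
User-agent: InternetSeer
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a well-behaved robot would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant robot asks this question before every request.
print(parser.can_fetch("InternetSeer", "/old/000760.html"))
print(parser.can_fetch("SomeOtherBot", "/old/000760.html"))
print(parser.can_fetch("SomeOtherBot", "/private/notes.html"))
```

The point is that the check depends on the robot actually fetching robots.txt in the first place. A robot that never requests the file cannot be following it.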
But does this matter? If they’re only monitoring my site to check its availability, then it’s no big deal, right? Well, let’s choose a day at random, say November 19th. On this day the Internet Seer robot visited on a number of different occasions, examining 67 different pages as it passed. Now wait a minute… if all Internet Seer are doing is checking to see if my site is online, why are they visiting 67 different pages? What use could they possibly have?
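If you want to run the same sort of check on your own logs, something along these lines should do it. This is only a sketch: it assumes Apache-style combined log format, and the sample lines, IP addresses, the `InternetSeer` user-agent substring and the helper name `summarise_robot` are all invented for illustration.

```python
import re

# Apache "combined" log format is assumed; adjust the pattern for other formats.
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarise_robot(lines, agent_token, day):
    """Return (robots.txt request count, set of other paths requested)
    for one user agent on one day, where day looks like '19/Nov/2003'."""
    robots_hits = 0
    pages = set()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        if agent_token.lower() not in match["agent"].lower():
            continue  # a different visitor
        if not match["when"].startswith(day + ":"):
            continue  # a different day
        if match["path"] == "/robots.txt":
            robots_hits += 1
        else:
            pages.add(match["path"])
    return robots_hits, pages


# Invented sample data, not real log entries.
SAMPLE = [
    '192.0.2.10 - - [19/Nov/2003:03:14:07 +0000] "GET /old/000760.html HTTP/1.0" 200 5120 "-" "InternetSeer.com"',
    '192.0.2.10 - - [19/Nov/2003:03:14:09 +0000] "GET /old/000761.html HTTP/1.0" 200 4096 "-" "InternetSeer.com"',
    '192.0.2.10 - - [19/Nov/2003:09:30:00 +0000] "GET /old/000760.html HTTP/1.0" 200 5120 "-" "InternetSeer.com"',
    '198.51.100.7 - - [19/Nov/2003:10:00:00 +0000] "GET /robots.txt HTTP/1.0" 200 64 "-" "SomeOtherBot/1.0"',
    '192.0.2.10 - - [20/Nov/2003:03:14:07 +0000] "GET /index.html HTTP/1.0" 200 2048 "-" "InternetSeer.com"',
]

robots_hits, pages = summarise_robot(SAMPLE, "InternetSeer", "19/Nov/2003")
print(robots_hits, len(pages))
```

Using a set means repeat visits to the same page are only counted once, which is what you want when the question is how many different pages were examined rather than how many requests were made.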
I dunno, let’s play devil’s advocate here. What’s the worst they could be doing? Harvesting e-mail addresses, perhaps? Surely not. But then again… two people have e-mailed me recently after receiving an e-mail from the company, notifying them of the following or similar:
On Fri Dec 05, 2003 at 10:55:01 PM EST we were unable to reach your website: http://www.blogjam.com/old/000760.html due to the following reason: Host Not Found. As of Mon Dec 08, 2003 at 11:11:59 AM EST we were able to access your website again.
How strange. Why are other people being notified when my website is supposedly down and, more pertinently, where did Internet Seer obtain their addresses from? The first part I can’t answer, but the second seems obvious – from my website, where both people in question had left comments on the pages Internet Seer had claimed were down in their respective e-mails. I guess the obvious question at this point is this: how could Internet Seer possibly grab a couple of e-mail addresses if my site was actually down? Playing Devil’s advocate once more, I rather suspect that the company aren’t monitoring my site at all in the way that they claim. I’m sure Internet Seer will deny all this, claiming that the two people in question are subscribers that signed up for their service (unlikely, since neither has their own website), or that they were picked up during one of Internet Seer’s extraordinarily vague “studies on the connectivity of the internet and the related web sites that were involved in such studies.”
Suffice to say I’ll be preparing a friendly e-mail to the company to question their actions and, in the meantime, if anyone else reading this has had mails from Internet Seer regarding my site, please let me know. If you’ve still got the original mails, even better – please forward, with full headers. Thanks.
Update: I heard back from Internet Seer. They’ve promised to remove me from their database, and offered the following explanation of how they came to be e-mailing people who left comments on my site:
I can only assume that we must have noted the emails on those pages during a prior visit, stored them as a contact in the event we found the site down.
Which appears to confirm in no uncertain terms my theory that they’re scraping sites for e-mail addresses. Whether they’re doing anything with these addresses other than annoying the owners with irrelevant monitoring mail is a different matter altogether – I’ve no reason to believe that they are, but as their current business model seems at best chaotic and at worst unethical, nothing would surprise me.