Friday, March 19, 2010

On Robots.txt and META Robot

A lot has been said about the two. Both are used to keep bots away from pages you do not want indexed. One popular reason for doing this is that the pages are duplicates of other pages on your website. This can happen if your website generates session IDs. You can tell it's a session ID when there is a pretty long string of characters after the domain, something like yourdomain.com/ecnjnkjenffingjalkne2iu48u=?u2ijn.

Anyway, so much for session IDs. I started wondering what the difference is between the two things mentioned in the title, so I did a little research and tried to make sense of it. As I understand it, when a bot comes to your website, it looks for the robots.txt file and reads what it is told to do, such as "hey, do not crawl these pages", and a well-behaved bot will do as it is told. So technically the bot does find your site, because if it didn't, it would never get the chance to read your sign telling it to stay out.
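To make the robots.txt check a bit more concrete, here is a minimal sketch of how a polite bot might decide whether it may fetch a URL, using Python's standard urllib.robotparser module. The robots.txt rules, the bot name "ExampleBot" and the /private/ path are all made up for illustration; they are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, assumed for this example only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://yourdomain.com/private/page.html",
        "http://yourdomain.com/public/page.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

The point of the sketch is that the rules live in one file for the whole site, and it is up to the bot to read and respect them before it requests a page.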

OK, my page will not be indexed. Yippee! But what if some fan of your website saved a link to the page with the session ID, the very page you were hoping would not be indexed, and a bot follows that link straight to the page without consulting robots.txt? What should I do in that case?

So here comes the META robots tag. Put a tag such as <meta name="robots" content="noindex"> in the <head> part of your pages so bots will not index them.
Sounds simple, huh?
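Here is a rough sketch of what a bot might do once it has the page in hand: read the HTML, look for a robots META tag, and check whether "noindex" is among its directives. It uses Python's standard html.parser module. The class name RobotsMetaParser, the helper page_blocks_indexing and the sample page are my own inventions for illustration, not how any particular search engine actually does it.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser already lowercases tag and attribute names.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list, e.g. "noindex, follow".
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def page_blocks_indexing(html: str) -> bool:
    """Return True if the page carries a robots META tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical page, assumed for this example only.
PAGE = (
    "<html><head>"
    '<meta name="robots" content="noindex, follow">'
    "</head><body>Session page</body></html>"
)

if __name__ == "__main__":
    print(page_blocks_indexing(PAGE))
```

Unlike robots.txt, the instruction here travels with the page itself, so it applies no matter how the bot arrived at that URL.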

So that's the main difference I found. I hope this gives you an idea of which one to use on your website. If you think anything stated here is incorrect, feel free to leave a comment.

