The robots.txt File
Search engines make use of a file called robots.txt to determine
which web pages (and other files) are excluded from indexing
when they visit your site. A sample robots.txt file has been
included in your public_html directory.
Not all search robots honour this file. Never keep any sensitive
information in your public_html directory as it could be viewed
by anyone or indexed by search engine robots.
Some services now index images on the web. If you don't want your images to be accessed in this manner, add a Disallow rule to your robots.txt file for the directory that holds them.
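As a rough illustration, the sketch below uses Python's standard urllib.robotparser module to check that a rule covering an image directory has the intended effect. The directory name /images/, the file names, and the helper is_allowed are assumptions made for this example; substitute the paths your own site uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; "/images/" is an assumed directory name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under the given rules."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/images/logo.png", "/index.html"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot-Image", path))
```

This only shows how one parser reads the rules. As noted above, a robot that ignores robots.txt can still fetch the files.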
If your robots.txt file is damaged or missing, you can rebuild it from the samples below. Simply copy and paste the lines you need.
The file tells robots which areas of your site they may access and which they may not.
To stop all robots from reading the cgi-bin directory, use the following:
User-agent: *
Disallow: /cgi-bin/
To stop only Google's robot from reading a particular area of your site,
use the following:
User-agent: googlebot
Disallow: /cgi-bin/
To stop all search engines from reading any part of your site, use the following:
User-agent: *
Disallow: /
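Rule groups like the ones above can also sit together in one file, separated by a blank line. The sketch below is a minimal check, using Python's standard urllib.robotparser module, of which group applies to which robot. The robot name "otherbot", the test paths, and the helper names are hypothetical, and real crawlers may interpret the file somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Two of the sample groups combined into one file for illustration.
COMBINED_RULES = """\
User-agent: googlebot
Disallow: /cgi-bin/

User-agent: *
Disallow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def report(robots_txt, agents, paths):
    """Map each (agent, path) pair to True (allowed) or False (blocked)."""
    parser = build_parser(robots_txt)
    return {(a, p): parser.can_fetch(a, p) for a in agents for p in paths}


if __name__ == "__main__":
    results = report(
        COMBINED_RULES,
        ["googlebot", "otherbot"],  # "otherbot" is a made-up robot name
        ["/index.html", "/cgi-bin/test.cgi"],
    )
    for (agent, path), allowed in results.items():
        print(agent, path, "allowed" if allowed else "blocked")
```

In this parser, a robot that matches a named group follows that group only. Every other robot falls back to the "*" group.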
You can also stop search engines from indexing an individual page by adding
the following meta tag to the head section of that page:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
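To confirm that a page really carries this tag, you could scan its HTML for it. The following is a minimal sketch using Python's standard html.parser module. The class name RobotsMetaParser and the function robots_directives are made up for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...>
        # arrives here as tag "meta" with attribute "name".
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW"></head></html>'
    print(sorted(robots_directives(page)))
```

An empty result means the page has no robots meta tag, so robots fall back on whatever robots.txt allows.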
To stop robots from reading a particular file, add a Disallow line for it.
The path must begin with a slash:
Disallow: /email.htm
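The path in a Disallow line is matched against the start of the requested URL path, so it should begin with a slash. The sketch below shows how Python's standard urllib.robotparser module treats the two spellings. The helper name blocked is hypothetical, and other robots may handle a missing slash differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def blocked(rule_path: str, requested_path: str) -> bool:
    """Return True if 'Disallow: rule_path' blocks requested_path for all robots."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {rule_path}"])
    return not parser.can_fetch("*", requested_path)


if __name__ == "__main__":
    # With the leading slash, the rule matches the file.
    print(blocked("/email.htm", "/email.htm"))
    # Without it, this parser finds no match and the file stays readable.
    print(blocked("email.htm", "/email.htm"))
```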