Prevent search engines from indexing your websites

  • Applies to: (gs), All (dv)

  • Difficulty: Easy

  • Time needed: 15 minutes

  • Tools needed: FTP client, plain text editor

 

Overview

Web robots, also known as web wanderers, crawlers, or spiders, are programs that traverse the web automatically. Search engines such as Google and Yahoo use them to index the content of your site. However, they can also be used inappropriately; spammers, for example, use them to scan for email addresses. A robots.txt file tells the robots that visit your sites how you would like them to behave. Compliance is voluntary: well-behaved robots follow the file, but it cannot force a misbehaving robot to stay out.
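
To make the idea concrete, here is a minimal sketch of how a well-behaved robot might consult the rules before fetching a page. It uses Python's standard urllib.robotparser module; the robot name ExampleBot and the helper is_allowed are hypothetical names chosen for illustration, not part of any (mt) Media Temple tooling.

```python
from urllib.robotparser import RobotFileParser

# Rules a site owner might publish to keep every robot out of the whole site.
RULES = """\
User-agent: *
Disallow: /
"""


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if the robots.txt text in `rules` lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())  # parse the text directly, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" is a made-up robot name used only for this sketch.
    print(is_allowed(RULES, "ExampleBot", "http://example.com/index.html"))  # False
```

The check happens inside the robot itself, which is why the file only works for robots that choose to honor it.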

Instructions

First, create a file named robots.txt with a plain text editor. Then upload it to the webroot of your site with your FTP client; robots only look for the file at the top level of the site. For details on all the rules you can create, please visit: http://www.robotstxt.org/

The following is an example robots.txt file which you are free to use. You will need to upload this file to your webroot, such as /home/00000/domains/example.com/html/ on the (gs) or /var/www/vhosts/example.com/httpdocs/ on the (dv). Remember to remove the # sign from any command you wish the robots to follow, but be sure not to uncomment the command's description.


# Example robots.txt from (mt) Media Temple
# Learn more at http://wiki.mediatemple.net
# (mt) Forums - http://wiki.mediatemple.net/w/MT:Join_User_Forums
# (mt) System Status - http://status.mediatemple.net
# (mt) Statement of Support - http://mediatemple.net/support/statement/
 
# How do I check that my robots.txt file is working as expected?
# http://www.google.com/support/webmasters/bin/answer.py?answer=35237
 
# For a list of Robots please visit: http://www.robotstxt.org/db.html
 
# Instructions
# Remove the "#" to uncomment any line that you wish to use, but be sure not to uncomment the Description.
 
# Grant Robots Access
#######################################################################################
 
# This example allows all robots to visit all files because the wildcard "*" specifies all robots:
#User-agent: *
#Disallow:
 
# To allow a single robot (here Googlebot, Google's crawler) and keep all others out, you would use the following:
#User-agent: Googlebot
#Disallow:
 
#User-agent: *
#Disallow: /
 
# Deny Robots Access
#######################################################################################
 
# This example keeps all robots out:
#User-agent: *
#Disallow: /
 
# The next example tells all crawlers not to enter four directories of a website:
#User-agent: *
#Disallow: /cgi-bin/
#Disallow: /images/
#Disallow: /tmp/
#Disallow: /private/

# Example that tells a specific crawler not to enter one specific directory:
#User-agent: BadBot
#Disallow: /private/

# Example that tells all crawlers not to fetch one specific file called foo.html.
# The path is relative to the site root, not a filesystem path; foo.html is assumed to sit directly in the webroot:
#User-agent: *
#Disallow: /foo.html
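
Before uploading, it can help to test the rules you have uncommented. The following is a rough sketch, again using Python's standard urllib.robotparser module, that checks two of the examples from the file above: the four-directory rule and the BadBot rule. The helper names parser_for and check, the robot names AnyBot and OtherBot, and the sample paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The "four directories" example with the leading "#" removed from each command.
FOUR_DIRS = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /tmp/
Disallow: /private/
"""

# The "specific crawler" example: only BadBot is kept out of /private/.
BAD_BOT = """\
User-agent: BadBot
Disallow: /private/
"""


def parser_for(rules: str) -> RobotFileParser:
    """Build a parser from robots.txt text without touching the network."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


def check(rules: str, user_agent: str, path: str) -> bool:
    """Return True if `user_agent` may fetch `path` on example.com under `rules`."""
    return parser_for(rules).can_fetch(user_agent, "http://example.com" + path)


if __name__ == "__main__":
    print(check(FOUR_DIRS, "AnyBot", "/images/logo.png"))     # False: /images/ is disallowed
    print(check(FOUR_DIRS, "AnyBot", "/index.html"))          # True: not under a listed directory
    print(check(BAD_BOT, "BadBot", "/private/notes.html"))    # False: rule names BadBot
    print(check(BAD_BOT, "OtherBot", "/private/notes.html"))  # True: no rule applies to OtherBot
```

This only shows how one parser interprets the rules; individual search engines may read edge cases differently, so the webmaster tools linked in the example file remain the better final check.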

