
Thread: Robots.txt

  1. #1
    Flash M0nkey
    Join Date
    Sep 2001


    Ok just a quick note for those not aware of what a robots.txt file is.....

    Basically, robots.txt is a text file a webmaster places in the root directory of a site to tell search engine spiders which areas of the site they may index. Compliance is voluntary: well-behaved spiders honour it, but nothing forces them to.

    Q. Do I have to have a robots.txt file to get ranked on a search engine?
    A. No, you do not. If no robots.txt file is present, the search engine or spider will simply assume it is OK to index your whole site.

    Q. Will a robots.txt file help my site get looked at by more search engines?
    A. No. To read the file, a spider must already be at your site.

    Q. So what do I need it for then?
    A. It lets you designate areas that you do not want indexed by the search engine. Perhaps you have a test folder where you try out new things that you don't want the public to find through a search.

    Q. Ok so how do I make one then?
    A. Easy. Just open a text editor like Notepad and type the following:

    User-agent: *
    Disallow: /test/

    You can add as many directories to this as you wish, one Disallow line per directory.
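
    If you have a lot of directories to list, a few lines of Python can write the file for you. This is only a rough sketch: the function name build_robots_txt and the directory names other than test are made up for illustration, and it assumes you want the same rules for every spider.

```python
def build_robots_txt(disallowed_dirs, user_agent="*"):
    """Return robots.txt text with one Disallow line per directory."""
    lines = [f"User-agent: {user_agent}"]
    for directory in disallowed_dirs:
        name = directory.strip("/")
        if not name:
            continue  # skip empty entries rather than emit "Disallow: //"
        lines.append(f"Disallow: /{name}/")
    return "\n".join(lines) + "\n"


# "drafts" and "old-site" are hypothetical directory names.
robots_txt = build_robots_txt(["test", "drafts", "old-site"])
print(robots_txt)
```

    Save the output as robots.txt and upload it to the root directory of the site.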

    Q. OK, I have written my file, but how do I know it is working?
    A. Simply upload it into the root directory of your site, and that should be it. If you wish to check the syntax and make sure everything is correct, you can visit here,
    or for more info on robots.txt try here.
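
    You can also check the rules locally before uploading. The sketch below uses Python's standard urllib.robotparser module on the two-line example from this post; the helper name is_allowed and the page paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The example file from this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
"""


def is_allowed(robots_txt, path, user_agent="*"):
    """Return True if a rule-following spider may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical page paths, only to exercise the rules.
print(is_allowed(ROBOTS_TXT, "/test/new-layout.html"))
print(is_allowed(ROBOTS_TXT, "/index.html"))
```

    The first call should report that the page under /test/ is blocked and the second that the home page is allowed. To test the file already on your server, RobotFileParser also has set_url() and read() to fetch it over HTTP.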



  2. #2
    AO French Antique News Whore
    Join Date
    Aug 2001
    Thank you for that info!
    -Simon "SDK"

  3. #3
    Senior Member
    Join Date
    Nov 2003
    Nice. I'll be sure to do that. Thanks Val!
    There is a ghost in the machine, and he is my friend.

  4. #4
    @ΜĮЙǐЅŦГǻţΩЯ D0pp139an93r's Avatar
    Join Date
    May 2003
    St. Petersburg, FL
    w00t. Good info. I will use that later on when I start my site.
    Real security doesn't come with an installer.

  5. #5
    AO Ancient: Team Leader
    Join Date
    Oct 2002
    Er... Be careful with this..... let's think about it for a minute.....

    You have a set of subfolders on your website that you don't want accessed as a result of a simple web search.... So you pop them in the old robots.txt and sit back fat, dumb and happy. Well, if Mr. H4x0|2 wants to know whether you have anything to "hide", all he does is request robots.txt and bingo, there are those closely held secrets for him to see...... robots.txt is a public file and it protects nothing; it only asks polite spiders to stay away. It's a two-edged sword...

    The Moral: Don't place stuff on a web site that you can't have _anyone_ see and still sleep at night......
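
    To see how little effort that takes, here is a rough sketch that pulls every Disallow path out of a robots.txt file. The function name listed_paths and the /private-admin/ entry are made up for illustration; the text it parses is whatever anyone can download from the root of a site.

```python
def listed_paths(robots_txt):
    """Return every path named in a Disallow line of the given robots.txt text."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical file contents; /private-admin/ is an invented example.
sample = """User-agent: *
Disallow: /test/
Disallow: /private-admin/   # not linked from anywhere
"""

print(listed_paths(sample))
```

    Every folder you list is advertised to anyone who asks for the file, so treat it as a hint to spiders and not as access control.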
    Don't SYN us.... We'll SYN you.....
    "A nation that draws too broad a difference between its scholars and its warriors will have its thinking done by cowards, and its fighting done by fools." - Thucydides

  6. #6
    Join Date
    Sep 2002
    Hey, that was a great little tute mate. Thanks for posting. I've been doing a lot of web stuff for years, so I knew what it was, but my girlfriend has had a page of her site (about 200,000 hits per day) where she didn't have a robot command and some of the members found out some secrets as to what she was planning to do with the site update-wise. Again, thanks mate.

  7. #7
    Hey Val, does this help on Geocities or Angelfire sites, where the txt file is not present originally?
    If not, tell me something to improve such free sites.

  8. #8
    Flash M0nkey
    Join Date
    Sep 2001
    This should work on any site that is spidered. It will not bring more search engines to your site, but once they are there it tells them what to index.


  9. #9
    AO Antique pwaring's Avatar
    Join Date
    Aug 2001
    Nice tutorial Val.

    BTW, even if you're not bothered about giving instructions to search engines, you might want to include a blank robots.txt file anyway. I found that every search engine visiting my site tried to request robots.txt, which left a load of 404 errors in my logs (because the file didn't exist) and made me think I had a broken link somewhere. Same goes for a favicon.ico file when people try to bookmark your site.
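
    A blank file should not change what spiders are allowed to fetch. As a quick sanity check, the sketch below feeds an empty robots.txt to Python's standard urllib.robotparser; the helper name blank_file_allows and the paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def blank_file_allows(path, user_agent="*"):
    """Return True if an empty robots.txt lets a spider fetch the path."""
    parser = RobotFileParser()
    parser.parse([])  # an empty file has no lines, so no rules
    return parser.can_fetch(user_agent, path)


# Hypothetical paths; with no rules, both should be allowed.
print(blank_file_allows("/index.html"))
print(blank_file_allows("/test/anything.html"))
```

    So the empty file only stops the 404 entries in the logs and leaves indexing as it was.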
    Paul Waring - Web site design and development.

  10. #10
    Senior Member
    Join Date
    May 2002
    Interesting thing: I did a search on Google for robots.txt and found that whitehouse.gov has a robots.txt file that disallows some folders from being spidered....
    Sex is like "Social Security". You get a little each month, but it's not enough to live on.
