Thread: Robots.txt

  1. #1

    Robots.txt

    OK, just a quick note for those not aware of what a robots.txt file is.

    Basically, robots.txt is a text file that a webmaster places in the root directory of a site. It tells the search engines that choose to honour it which areas of the site they are allowed to index.

    Q. Do I have to have a robots.txt file to get ranked on a search engine?
    A. No, you do not. If no robots.txt file is present, the search engine or spider will just assume that it is OK to index your whole site.

    Q. Will a robots.txt file help my site get looked at by more search engines?
    A. No. In order to read the file they must already be at your site.

    Q. So what do I need it for then?
    A. It allows you to designate certain areas which you do not wish to be indexed by the search engine. Perhaps you have a test folder where you simply try out new things that you don't want the public to be aware of, etc.

    Q. OK, so how do I make one then?
    A. Easy. Just open a text editor like Notepad and type the following:

    User-agent: *
    Disallow: /test/

    You can add as many directories to this as you wish, one Disallow line per directory.

    Q. OK, I have written my file, but how do I know it is working?
    A. Simply upload it into the root directory of your site and that should be it. If you wish to check the syntax and make sure everything is correct, you can visit here,
    or for more info on robots.txt try here.
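
    If you would rather check the rules locally before uploading, here is a minimal sketch using Python's standard urllib.robotparser module. The /drafts/ directory, the agent name "ExampleBot" and the helper is_allowed are made up for illustration; only the /test/ rule comes from the example above. It shows how a well-behaved spider would read the file, not what any particular search engine actually does.

```python
from urllib.robotparser import RobotFileParser

# The example file from this post, plus one extra (hypothetical) directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /test/
Disallow: /drafts/
"""

# Parse the rules from text instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if a compliant spider named `agent` may fetch `path`."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/test/page.html", "/drafts/"):
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

    Anything under a disallowed directory is matched by prefix, so /test/page.html is covered by the single /test/ line.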

    thanks

    v_Ln

  2. #2
    AO French Antique News Whore
    Join Date
    Aug 2001
    Posts
    2,126
    Thank you for that info!
    -Simon "SDK"

  3. #3
    Senior Member
    Join Date
    Nov 2003
    Posts
    247
    Nice. I'll be sure to do that. Thanks Val!
    www.ADigitalPimp.com
    There is a ghost in the machine, and he is my friend.

  4. #4
    @ÞΜĮЙǐЅŦГǻţΩЯ D0pp139an93r's Avatar
    Join Date
    May 2003
    Location
    St. Petersburg, FL
    Posts
    1,705
    w00t. Good info. I will use that later on when I start my site.
    Real security doesn't come with an installer.

  5. #5
    AO Ancient: Team Leader
    Join Date
    Oct 2002
    Posts
    5,197
    Er... Be careful with this..... let's think about it for a minute.....

    You have a set of subfolders on your website that you don't want accessed as a result of a simple web search, so you pop them in the old robots.txt and sit back fat, dumb and happy. Well, if Mr. H4x0|2 wants to know whether you have anything to "hide", all he does is request robots.txt and bingo, there is the list of folders holding those closely held secrets for him to see. It's a two-edged sword...

    The Moral: Don't place stuff on a web site that you can't have _anyone_ see and still sleep at night......
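
    To make the point concrete, here is a rough sketch of how little effort it takes to pull the "hidden" folders out of a robots.txt file once it has been downloaded. The function name disallowed_paths and the sample paths are hypothetical, and the parsing is deliberately simplistic (it ignores which User-agent each rule belongs to).

```python
def disallowed_paths(robots_txt):
    """Collect every non-empty Disallow path from the text of a robots.txt file."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # An empty "Disallow:" means "nothing is disallowed", so skip it.
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


if __name__ == "__main__":
    sample = "User-agent: *\nDisallow: /test/\nDisallow: /private/\n"
    print(disallowed_paths(sample))
```

    Anyone can request the file, so every path listed in it should be treated as public knowledge.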
    Don't SYN us.... We'll SYN you.....
    "A nation that draws too broad a difference between its scholars and its warriors will have its thinking done by cowards, and its fighting done by fools." - Thucydides

  6. #6
    Hey, that was a great little tute mate. Thanks for posting. I've been doing a lot of web stuff for years, so I knew what it was, but my girlfriend has had a page of her site (about 200,000 hits per day) where she didn't have a robot command and some of the members found out some secrets as to what she was planning to do with the site update-wise. Again, thanks mate.

  7. #7
    Hey Val, does this help on GeoCities or Angelfire sites, where the txt is not present originally?
    If not, tell me something to improve such free sites.
    Thanks

  8. #8
    This should work on any site that is spidered. It will not bring more search engines to your site, but once they are there it instructs them as to what to index.

    v_Ln

  9. #9
    AO Antique pwaring's Avatar
    Join Date
    Aug 2001
    Posts
    1,409
    Nice tutorial Val.

    BTW, even if you're not bothered about giving instructions to search engines, you might want to include a blank robots.txt file anyway. I found that every search engine visiting my site tried to request robots.txt, which left a load of 404 errors in my logs (because the file didn't exist) and made me think I had a broken link somewhere. Same goes for a favicon.ico file when people try to bookmark your site.
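
    As a quick sanity check that a blank file really does leave the whole site open, here is a small sketch using Python's standard urllib.robotparser. The helper allows_everything, the sample paths and the agent name "ExampleBot" are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def allows_everything(robots_txt, sample_paths, agent="ExampleBot"):
    """Return True if a compliant spider may fetch every path in sample_paths."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return all(parser.can_fetch(agent, path) for path in sample_paths)


if __name__ == "__main__":
    paths = ["/", "/test/", "/a/b.html"]
    # A blank robots.txt contains no rules, so nothing is disallowed.
    print(allows_everything("", paths))
```

    So the blank file only quiets the 404s in the logs; it does not change what gets indexed.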
    Paul Waring - Web site design and development.

  10. #10
    Senior Member
    Join Date
    May 2002
    Posts
    256
    Interesting thing: I did a search on Google for robots.txt and found that whitehouse.gov has a robots.txt file disallowing some folders from being spidered....
    Sex is like "Social Security". You get a little each month, but it's not enough to live on.
