Removal of site from web archive

Thread: Removal of site from web archive

  1. #1
    pwaring (AO Antique), joined Aug 2001, 1,409 posts

    Removal of site from web archive

    One of my sites is in the Internet Archive, which basically keeps a copy of web sites in a huge database for people to view when a site is down or has changed (a bit like Google cache, except it's permanent).

    Usually I wouldn't mind the archive crawling my site, but in this instance I don't want the site in question to be archived. Unfortunately, it's on a shared host and I can't use robots.txt to block the bot that crawls the site.
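
    For anyone reading who does control their site root, here is a rough sketch of the kind of robots.txt rule being talked about, checked with Python's standard urllib.robotparser. It assumes the archive's crawler identifies itself as ia_archiver; that user-agent string, the helper name is_allowed and the example.com URL are assumptions for illustration, not something confirmed in this thread.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the archive's crawler announces itself as "ia_archiver".
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical blog URL, used only to exercise the rule.
    page = "http://example.com/blog/"
    print("ia_archiver allowed:", is_allowed(ROBOTS_TXT, "ia_archiver", page))
    print("other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

    The rule only names one crawler, so any other bot is unaffected. It also has to sit at the root of the host, which is exactly what a shared blog host does not allow.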

    I've emailed the web archive half a dozen times at different addresses but I've never received a reply and my site is still in the archive.

    Has anyone on AO had this problem with their site or have any suggestions as to dealing with the web archive? If they don't read their emails there doesn't seem to be a lot I can do. I would go and see them in person except they're on the other side of the Atlantic to me.

    Thanks in advance.
    Paul Waring - Web site design and development.

  2. #2
    Zonewalker (Senior Member), joined Jul 2002, 949 posts
    are you still using the site at all? I mean do you still actually want it online or is it a dead site? If it's a dead site why not just remove it from the webserver (archive it offline if you want to keep it for posterity)? If it is still active why don't you want it archived (just out of curiosity)?

    I have to admit I've not had any dealing with the Internet archive but I'll have a look around see if I can find something that will help.

    Z
    Quis Custodiet Ipsos Custodes

  3. #3
    pwaring (AO Antique), joined Aug 2001, 1,409 posts
    It's one of my blogs - I had deleted some of the comments/posts for various reasons so I didn't want them ending up in the archive where I can't delete them. What is really annoying is that in the FAQs the archive says 'email us at this address...', so I did (three times!) and they didn't reply or take any action.

    Thanks for offering to take a look about for me though.
    Paul Waring - Web site design and development.

  4. #4
    Zonewalker (Senior Member), joined Jul 2002, 949 posts
    umm... you say you can't use robots.txt... a thought struck me: does that include adding the following meta tag to every page?

    <META NAME="ROBOTS" CONTENT="NOARCHIVE">

    I actually use this meta tag to ensure robots check my site every month... but I do recall that the noarchive content should have the same effect as robots.txt. Not sure if that would work retroactively but... just a thought off the cuff
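
    To check whether a given page actually carries the tag, something like the sketch below would do, using Python's standard html.parser. The class and function names are made up for illustration, and whether the archive's crawler honours NOARCHIVE at all is not established in this thread; the code only reports what the page declares.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for directive in attr.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the lowercased robots directives declared in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = '<html><head><META NAME="ROBOTS" CONTENT="NOARCHIVE"></head></html>'
    print("noarchive" in robots_directives(page))
```

    Matching is case-insensitive on purpose, since the tag is often written in capitals as above.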

    Z
    Quis Custodiet Ipsos Custodes

  5. #5
    pwaring (AO Antique), joined Aug 2001, 1,409 posts
    I did think about that, but my site is based on templates provided by my blogging hoster, so I can't change things in the <head> tags either.
    Paul Waring - Web site design and development.

  6. #6
    Zonewalker (Senior Member), joined Jul 2002, 949 posts
    I have to admit I haven't been able to find anything yet... I'm going out for drinks in half an hour so I'll have to stop looking for now.... but I might have a really horrible workaround for you.....

    I assume your site template adds the head tags.. yes? I have never worked with templates but there shouldn't be anything to stop you from adding a set of head tags at the start of your document before you add body tags should there?
    If there isn't anything to stop you, then add your own head tags surrounding the above meta tag before you add the main body. It means that you will have two sets of head tags in your HTML (which is why this is such a horrible workaround - definitely not W3C compliant here!) but it should (I think) stop your site from being archived any further.
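
    A rough picture of what that workaround produces, again with Python's standard html.parser. The PAGE markup is a hypothetical stand-in for a template-generated page, and HeadScan and scan are names invented for this sketch. html.parser only tokenizes, so it reports both head tags as written; a real crawler or browser may handle the second head differently.

```python
from html.parser import HTMLParser

# Hypothetical page: the first <head> comes from the host's template,
# the second is the one added by hand to carry the meta tag.
PAGE = """<html>
<head><title>Template head</title></head>
<head><meta name="robots" content="noarchive"></head>
<body><p>Post text</p></body>
</html>"""


class HeadScan(HTMLParser):
    """Counts <head> start tags and notes any robots noarchive meta tag."""

    def __init__(self):
        super().__init__()
        self.head_count = 0
        self.noarchive_seen = False

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.head_count += 1
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            content = (attr.get("content") or "").lower()
            if name == "robots" and "noarchive" in content:
                self.noarchive_seen = True


def scan(html: str) -> HeadScan:
    """Feed the HTML text through HeadScan and return the scanner."""
    scanner = HeadScan()
    scanner.feed(html)
    return scanner


if __name__ == "__main__":
    result = scan(PAGE)
    print("head tags:", result.head_count)
    print("noarchive present:", result.noarchive_seen)
```

    The tag is there for a simple tag scanner to find, but the page no longer validates, which is the trade-off described above.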

    One other thing I noticed from the FAQ - they take up to 12 months to actually add a site to the archive... maybe they take that long to remove it??

    I'll have a look around again when I get a chance

    Z
    Quis Custodiet Ipsos Custodes

  7. #7
    pwaring (AO Antique), joined Aug 2001, 1,409 posts
    Hmm, it's one way of doing it I suppose, although I'm loath to do it because it breaks every XHTML-compliance bone in my body.

    I just don't understand why they don't reply to my emails, perhaps I will try again this weekend.
    Paul Waring - Web site design and development.

  8. #8
    Zonewalker (Senior Member), joined Jul 2002, 949 posts
    "although I'm loath to do it because it breaks every XHTML-compliance bone in my body."
    LOL - I know how you feel - how do you think I felt even suggesting it??? My own site is XHTML and CSS compliant... I felt lower than a sewer rat

    as for answering emails - cos they're lazy bastards??

    anyway - bit drunk now so am going to bed. I'll have a look again tomorrow.

    and its taken me about 30 mins to type this with relatively few spelling mistkaes

    Z
    Quis Custodiet Ipsos Custodes

  9. #9
    pwaring (AO Antique), joined Aug 2001, 1,409 posts
    Hmm, I guess I'll have to try some of the standard email addresses (webmaster@ etc.) until someone gets their backside into gear and sorts this for me.

    Thanks for your help ZW, enjoy your beer!
    Paul Waring - Web site design and development.

  10. #10
    Zonewalker (Senior Member), joined Jul 2002, 949 posts
    wotcha... still no joy I'm afraid. I think you may have to throw an email into the abyss and hope for something to respond

    the beer was... good by the way..... very nice, thank you for asking - the only fly in the ointment was a 'mate' (I use the word loosely as of last night) who was supposed to come out but decided that he couldn't because he 'didn't realise it was so close to Christmas' (I kid you not, that was his excuse) - a new one on me but there you have it.

    On that note hope you have a happy Xmas too!


    Z
    Quis Custodiet Ipsos Custodes
