Did the Wayback Machine die and nobody noticed?

The Wayback Machine at archive.org served a good purpose. In its early years it tried to keep a copy of many pages from websites great and small. People who inadvertently deleted their website were able to recover some of the content through it. More recently (five years ago, not five weeks), it couldn’t cope with the quantity of pages, and people complained when it hadn’t indexed their pages. Many SEOs blocked its spider from their sites.

When I checked some well-known sites, I was surprised that they hadn’t been archived for some years:

  • nab.com.au = 21 Apr 2008
  • netmagellan.com = 23 Jun 2008 Ouch!
  • whitehouse.gov = 27 Oct 2009
  • nike.com = 27 Oct 2009
  • microsoft.com = 23 Feb 2010
  • smh.com.au = 3 Aug 2010
  • cnn.com = 9 Aug 2010
  • nytimes.com = 10 Dec 2010
  • ubank.com.au = not there

The site mentions a “new” beta site, waybackmachine.org, but even there the last copy of microsoft.com is dated 27 July 2010. netmagellan.com was last indexed on 3 Jan 2010 and whitehouse.gov on 24 Jul 2010.

Looks like the Wayback Machine died in 2010. I can’t find any article about its demise. OK, someone might find a page archived in 2011, but it effectively died when it stopped archiving important sites.

4 Replies to “Did the Wayback Machine die and nobody noticed?”

  1. It is possible that they are still crawling, but not updating the index on the web. I have seen some of the scale issues of crawling the web and storing a history first hand: the crawl and the data retrieval are quite different problems.

    Stalling for a year in this case could possibly increase their value. This might then give them a premium service at some point in 2012. But if they did, I would have thought they would have made some noise.

  2. Hi,
    The Wayback hasn’t died, we just released the new beta version at waybackmachine.org last Thursday. Internet Archive is still archiving sites, we just have a 6 month embargo on results (and newer results are only available in the beta). In other words, if we crawled something today, you wouldn’t see it in the Wayback Machine until late July. We might make that 6 month embargo shorter in the future, but that’s the way it works for now.

    Thanks for taking a look!

    Alexis

    1. Yes, the 6-month embargo explains the latest pages being from August 2010 and the odd one for NY Times from December 2010, but for humbler sites such as this one, the beta site stopped archiving 12 months ago.

      There is a problem for some of the archived pages at the beta site. A page from a bank is blocked by Norton SafeWeb:

      Suspicious Web Page Blocked

      You attempted to access:

      http://replay.waybackmachine.org/20090929133943/http://www.ubank.com.au/ub/web/whoWeAre/backed-by-nab

      For your protection, this web page has been blocked and submitted for review. Visit Symantec to learn more about phishing and internet security.

      It is recommended that you do NOT visit this page, however if you know that this web page is safe, you may choose to visit this web page anyway.
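
The six-month embargo described in the second reply is simple date arithmetic, and a rough sketch may make it easier to check against the snapshot dates listed in the post. The function names and the sample crawl date below are assumptions made for illustration; the only figure taken from the thread is the six-month delay, and nothing here reflects how the Internet Archive actually implements it.

```python
from datetime import date
import calendar

EMBARGO_MONTHS = 6  # figure quoted in the reply above; it may have changed since


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def visible_from(crawled):
    """Earliest date a page crawled on `crawled` would appear, under the embargo."""
    return add_months(crawled, EMBARGO_MONTHS)


def newest_visible_crawl(today):
    """Most recent crawl date that could already be visible on `today`."""
    return add_months(today, -EMBARGO_MONTHS)


# Hypothetical date, chosen only to match the "late July" example in the reply.
crawl_day = date(2011, 1, 27)
print(visible_from(crawl_day))          # 2011-07-27
print(newest_visible_crawl(crawl_day))  # 2010-07-27
```

Under these assumptions, a visitor in late January 2011 would expect the newest visible snapshots to date from around late July 2010, which is roughly in line with the August 2010 dates for the larger sites. It does not account for the sites whose last snapshot is a year or more older.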
