Ash Nallawalla's blog

Did the Wayback Machine die and nobody noticed?

Ash Nallawalla

24 January 2011

Other

The Wayback Machine at archive.org served a good purpose. In its early years it tried to keep a copy of many pages from websites great and small, and people who inadvertently deleted their website were able to recover some of the content through it. More recently (about five years ago, not five weeks), it couldn’t cope with the quantity of pages, and people complained when it hadn’t archived their pages. Many SEOs blocked its spider from their sites.

When I checked some well-known sites, I was surprised that they hadn’t been archived for months or, in some cases, years:

  • nab.com.au = 21 Apr 2008
  • netmagellan.com = 23 Jun 2008 Ouch!
  • whitehouse.gov = 27 Oct 2009
  • nike.com = 27 Oct 2009
  • microsoft.com = 23 Feb 2010
  • smh.com.au = 3 Aug 2010
  • cnn.com = 9 Aug 2010
  • nytimes.com = 10 Dec 2010
  • ubank.com.au = not there
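To put the gaps in perspective, here is a minimal sketch that counts the days between each last snapshot listed above and the date of this post. The dates are copied from the list; the names `POST_DATE`, `LAST_SNAPSHOT` and `days_stale` are made up for illustration, and ubank.com.au is left out because it has no snapshot at all.

```python
from datetime import date

# Date of this post and the last-snapshot dates from the list above.
POST_DATE = date(2011, 1, 24)
LAST_SNAPSHOT = {
    "nab.com.au": date(2008, 4, 21),
    "netmagellan.com": date(2008, 6, 23),
    "whitehouse.gov": date(2009, 10, 27),
    "nike.com": date(2009, 10, 27),
    "microsoft.com": date(2010, 2, 23),
    "smh.com.au": date(2010, 8, 3),
    "cnn.com": date(2010, 8, 9),
    "nytimes.com": date(2010, 12, 10),
}


def days_stale(last_seen, as_of=POST_DATE):
    """Return the number of days between a snapshot date and `as_of`."""
    return (as_of - last_seen).days


# Print the sites from the oldest snapshot to the newest.
for site, last_seen in sorted(LAST_SNAPSHOT.items(), key=lambda kv: kv[1]):
    print(f"{site:18} {days_stale(last_seen):5d} days")
```

Sorting by snapshot date puts the most neglected sites at the top of the output.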

The site mentions a “new” beta site, waybackmachine.org, but even there the last copy of microsoft.com is dated 27 July 2010. netmagellan.com was last indexed on 3 Jan 2010 and whitehouse.gov on 24 Jul 2010.


Looks like the Wayback Machine died in 2010. I can’t find any article about its demise. OK, someone might find a page archived in 2011, but it effectively died when it stopped archiving important sites.

Ash Nallawalla

Search strategist experienced in large, complex websites.


4 Comments

  • Dixon Jones on 24 January 2011

    It is possible that they are still crawling, but not updating the index on the web. I have seen some of the scale issues of crawling the web and storing a history first hand, and the crawl and the data retrieval are quite different problems.

    Stalling for a year in this case could possibly increase their value. This might then give them a premium service at some point in 2012. But if they did, I would have thought they would have made some noise.

  • Alexis on 25 January 2011

    Hi,
    The Wayback hasn’t died, we just released the new beta version at waybackmachine.org last Thursday. Internet Archive is still archiving sites, we just have a 6 month embargo on results (and newer results are only available in the beta). In other words, if we crawled something today, you wouldn’t see it in the Wayback Machine until late July. We might make that 6 month embargo shorter in the future, but that’s the way it works for now.

    Thanks for taking a look!

    Alexis
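The embargo arithmetic in the comment above can be sketched in a few lines. This is only an illustration under the stated assumption of a flat six-month delay; `EMBARGO_MONTHS`, `add_months` and `visible_from` are hypothetical names, not anything the Internet Archive publishes.

```python
import calendar
from datetime import date

EMBARGO_MONTHS = 6  # the figure quoted in the comment above


def add_months(day, months):
    """Shift `day` forward by `months`, clamping to the end of the month."""
    index = day.month - 1 + months
    year = day.year + index // 12
    month = index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def visible_from(crawled):
    """Earliest date a crawl would show up, assuming a flat embargo."""
    return add_months(crawled, EMBARGO_MONTHS)


# A crawl made on the day of the comment, 25 January 2011.
print(visible_from(date(2011, 1, 25)))
```

For a crawl on 25 January 2011 this gives 25 July 2011, which matches the "late July" in the comment.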

    • Ash on 25 January 2011

      Yes, the 6-month embargo explains the latest pages being from August 2010 and the odd one for NY Times from December 2010, but for humbler sites such as this one, the beta site stopped archiving 12 months ago.

      There is a problem for some of the archived pages at the beta site. A page from a bank is blocked by Norton SafeWeb:

      Suspicious Web Page Blocked

      You attempted to access:

      http://replay.waybackmachine.org/20090929133943/http://www.ubank.com.au/ub/web/whoWeAre/backed-by-nab

      For your protection, this web page has been blocked and submitted for review. Visit Symantec to learn more about phishing and internet security.

      It is recommended that you do NOT visit this page, however if you know that this web page is safe, you may choose to visit this web page anyway.

  • Richard on 4 January 2013

    Their new beta service is available here:
    http://web-beta.archive.org/web/*/http://amazon.com
