Computer Science Faculty Research and Publications

Why Web Sites Are Lost (And How They're Sometimes Found)

Document Type

Article

Publication Title

Communications of the ACM

Publication Date

11-2009

Volume

52

Issue

11

First Page

141

Last Page

145

Abstract

The authors describe Warrick, a web-repository crawler they created that restores lost websites from the Internet Archive, Google, Live Search (now known as Bing), and Yahoo, collectively known as the Web Infrastructure (WI). They present the results of an online survey about lost websites and their recovery after the loss. Respondents had either personally lost a website or had recovered someone else's website. The authors found that esoteric sites were being restored. They suggest that technology to preserve digital materials will become more inclusive and seamless.

Copyright held by

ACM
