Canadian Heritage - Patrimoine canadien Canada

Creating and Managing Digital Content

Preservation of Digital Information

Web sites - Archiving

Web sites present different, more complex preservation problems than individual files do. It is not enough to preserve each file: the relationships and links between the files, and the structured indices built over them, must be preserved as well.

The following article provides a helpful summary of the preservation issues for Web sites, including:

  • instability of URLs
  • viruses
  • the need for change documentation

Preservation Risk Management for Web Resources
Virtual Remote Control in Cornell's Project Prism
D-Lib Magazine
January 2002

The article also describes projects that have tried to address the problem of preserving Web sites:

In 2001, the Internet Archive introduced the Wayback Machine, where users can view snapshots of Web sites as they appeared at various points in the past.

Although this project provides useful snapshots, it does not record:

  • Changes in the structure of a Web site
  • Material held in databases
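
As an illustration of how structural changes between two locally stored copies of a site might be documented, the following is a minimal sketch in Python. It is not drawn from any of the projects cited here. The function names `manifest` and `diff_snapshots` are hypothetical, and the sketch assumes each copy of the site is an ordinary directory of files on disk.

```python
import hashlib
from pathlib import Path


def manifest(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_snapshots(old_root, new_root):
    """Report files added, removed, or changed between two copies of a site."""
    old, new = manifest(old_root), manifest(new_root)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

A report of this kind records only what changed on disk. It does not capture material generated from a database at request time, which would still need to be exported and preserved separately.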

The following paper describes a survey of Web sites at the Smithsonian, and makes some key recommendations for preserving Web sites:

Archival Preservation of Smithsonian Web Resources: Strategies, Principles, and Best Practices
July 20, 2001

Smithsonian recommendations for design and authoring of Web sites and HTML pages for long-term access

  • Pages should carry good Dublin Core metatags
  • HTML markup should be XML compliant
  • Avoid the use of a third-party proprietary search engine not under control of the site's Webmaster
  • All links within the site should be relative
  • Copies of Web site should be periodically created and maintained
  • Major revisions to a Web site should be fully documented
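
Two of these recommendations, Dublin Core metatags and relative internal links, lend themselves to an automated check. The following is a minimal sketch in Python using only the standard library. It is not part of the Smithsonian paper. The names `PageAudit` and `audit_page` are hypothetical, and the sketch assumes Dublin Core metadata appears in `<meta>` elements whose `name` attribute begins with `DC.`, and that an internal link counts as non-relative when its URL names the site's own host.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageAudit(HTMLParser):
    """Collects Dublin Core meta tags and non-relative internal links from one page."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host.lower()
        self.dc_tags = {}            # meta name -> content, e.g. {"DC.title": "Home"}
        self.absolute_internal = []  # URLs that name the site's own host

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name = attrs.get("name") or ""
            if name.lower().startswith("dc."):
                self.dc_tags[name] = attrs.get("content") or ""
        elif tag in ("a", "link"):
            self._check(attrs.get("href"))
        elif tag in ("img", "script"):
            self._check(attrs.get("src"))

    def _check(self, url):
        # A relative URL has no host part; flag only URLs naming this site's host.
        if url and urlparse(url).netloc.lower() == self.site_host:
            self.absolute_internal.append(url)


def audit_page(html, site_host):
    """Return (Dublin Core tags found, internal links that are not relative)."""
    parser = PageAudit(site_host)
    parser.feed(html)
    parser.close()
    return parser.dc_tags, parser.absolute_internal
```

For example, `audit_page(page_text, "www.example.org")` would return the Dublin Core tags found on the page together with any links written as full URLs to `www.example.org`. The host comparison is deliberately simple: it does not treat `example.org` and `www.example.org` as the same site, nor does it handle explicit port numbers.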

An Approach to Managing Internet and Intranet Information for Long Term Access and Accountability
A Paper Prepared by the IM Forum and Intranet Working Group

This guide was produced to provide government-wide guidance on managing records and publications on the Internet and on departmental intranets and extranets.

Listed here are some international projects that have tackled the issues related to preserving the complex information objects created for the Web.

Prototype Web Archiving Projects

Minerva -- Mapping the Internet Electronic Resources Virtual Archive, a project of the Library of Congress

This project is described in Collecting and Preserving the Web: The Minerva Prototype (RLG Diginews, Vol 5, No 2) William Y. Arms, et al., Cornell University

Pandora - Preserving and Accessing Networked Online Resources of Australia
National Library of Australia
Archive of selected Australian online publications, including Web sites. The project has developed a strategy for long-term preservation.

Collecting and Preserving the Web: Developing and Testing the NEDLIB Harvester, National Library of Finland (RLG Diginews, Vol 5, No 2)


Date Published: 2002-04-27
Last Modified: 2006-06-15
© CHIN 2006. All Rights Reserved