Although it makes sense to me to produce a CD-ROM (or similar permanent record) for a particular event such as an electronic conference, in general web pages can change rapidly and irregularly. Software often changes in this way. Perhaps we can borrow practices from the software industry by using version numbers and recording changes from one version of a page to the next. Then you just have to take archive copies at major version changes, recording the minor changes within comments in the HTML of the page.
Maybe as an alternative, archivists (with sponsorship from computer companies?) could try to get a snapshot of various sites around the world; a "day in the life of the Web"?
By the way, isn't it the case that the useful material on the Web _does_ stick around, albeit in modified form?
regards, Graham
Graham Mullier, computational chemistry group, Zeneca Agrochemicals, Bracknell, UK
Graham.Mullier@zeneca.co.uk
Next year will see the start of the 4th year of operation of our Web site. It struck me that we have no permanent record of the evolution of the site, and would have great difficulty tracking the history of the various files (since the dump tapes are recycled after a few months).
I wonder if anyone operates any sort of archival policy? For example, at the end of each year, it should be possible to burn e.g. a CD ROM of the Web hierarchy of a server (assuming it is less than 660 Mbytes!). Such a CD ROM might be held locally, or could be sent to a custodian such as a learned society. It would be used to resolve disputes, establish precedence and priority, etc., and even provide copies of documents which might have been cited as URLs in journal articles (for example, in our paper of June 1994, we cited around 40 URLs, most of which probably no longer work!).
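As a rough illustration of the size check mentioned above, the following is a minimal sketch, not a description of any existing procedure. It adds up the sizes of the files under a server's document root and compares the total with the 660 Mbyte figure. The function names and the choice of binary megabytes (660 * 1024 * 1024 bytes) are assumptions made for the example.

```python
import os

# Capacity taken from the 660 Mbyte figure above; treating a Mbyte as
# 1024 * 1024 bytes is an assumption made for this sketch.
CD_CAPACITY_BYTES = 660 * 1024 * 1024


def tree_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symbolic links so linked files are not counted twice.
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_on_cd(root, capacity=CD_CAPACITY_BYTES):
    """Return True if the file tree under root fits within capacity bytes."""
    return tree_size_bytes(root) <= capacity
```

Calling fits_on_cd() on the document root (whatever path the server uses) before each yearly snapshot would show whether a single disc is still enough. The sketch ignores file system overhead on the disc, so a tree close to the limit may still not fit.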
I am mindful of the fact that a historian of the early development of the Web is going to have a very thin time of it, unless we take some sort of pro-active steps to preserve snapshots of the system at periodic intervals. I am also mindful that CAS is abstracting only a tiny proportion of Web-based materials, almost entirely deriving from e.g. conferences and journals which have well-characterised "punctuation points" and are not in a constant state of flux.
Is this the sort of thing a learned society should be doing? Perhaps we should devise a code of good practice to encourage this sort of thing? Or do we accept that a very high proportion of all Web-based chemistry deservedly should expire without trace within 1-2 years?
Dr Henry Rzepa, Dept. Chemistry, Imperial College, LONDON SW7 2AY; rzepa@ic.ac.uk; Tel (44) 171 594 5774; Fax: (44) 171 594 5804. URL: http://www.ch.ic.ac.uk/rzepa/ (Eudora Pro 3.0)
----- chemweb: A list for Chemical Applications of the Internet. Archived as: http://www.ch.ic.ac.uk/hypermail/chemweb/ To unsubscribe, send to listserver@ic.ac.uk the following message; unsubscribe chemweb List coordinator, Henry Rzepa (rzepa@ic.ac.uk)