ChemInfo/MetaChem: efficient resource discovery for the chemical community
Colleagues,

Henry's "Needle in a Chemical Haystack" Webmasters4 paper http://www.ch.ic.ac.uk/talks/wm4/ and recent discussions here about search engine performance have prompted me to introduce myself and a project that I hope may be of interest to many of you. Among other things, I am principal investigator for an Australian Research Council (ARC) project, "ChemInfo/MetaChem", whose primary aim is to create an internationally accessible electronic gateway to chemistry information resources of all kinds.

The project grew out of widespread concern in the Australian academic chemical community about the high cost of a number of the major sources of chemistry-related information, both print and electronic, and the effect of that cost on users' ability to have convenient and reliable access to these resources. Representatives of the Royal Australian Chemical Institute (RACI), Professors and Heads of Departments of Chemistry (PHoDs), CSIRO and CAUL (Council of Australian University Librarians) met some time ago to discuss ways of improving the chemical information research infrastructure in Australia. The outcome of this meeting was a range of strategies designed to improve access to both print and electronic information sources. One of the strategies identified was the development of a chemistry gateway on the web which would provide access to a diverse range of evaluated information sources and services.

The ChemInfo/MetaChem gateway will provide access to Internet information such as electronic chemistry publications and databases, research projects, data sources, software, online teaching modules, directories, conferences etc. In addition, the gateway will provide links, through library catalogues and document delivery services, to print information.

Similar gateway initiatives have emerged from similarly concerned chemical communities in the US, the UK and elsewhere.
Some excellent ones that come to mind are ChemPort http://www.chemport.org/, ChemCenter http://www.Chemcenter.org/, ChemSoc http://www.chemsoc.org, ChemDex http://www.shef.ac.uk/chemistry/chemdex/ and its 'commercial' development ChemDex+ http://chemweb.com/databases/chemdex/chemdex.exe, and the ChemWeb http://www.chemweb.com/ projects. We do not wish to reinvent these excellent wheels. However, we believe that a distinguishing feature of the ChemInfo/MetaChem gateway will be that each resource will be evaluated, described, classified and indexed by subject specialists - thus providing the element of trust that is crucial to efficient resource discovery in these times. The final result will be a database of metadata records providing information about, and links to, the resources and also their level of validity and authority as a basis for research and teaching activity. Discussions are underway to ensure that the metadata database will be mirrored in at least the US and UK.

At the moment, the project is a collaborative effort involving nine Australian university libraries, and we expect that this number will increase as the utility of the gateway is recognised. Subject specialist librarians currently donate their time as an "in-kind" contribution to the funding provided by the ARC. Over the next 6 months, our librarians plan to produce metadata for up to 10000 records, using the MetaWeb suite of tools http://purl.nla.gov.au/metaweb/home developed by the Distributed Systems Technology Centre (DSTC) - another of the project partners.

Because our primary aim is efficient resource discovery, Dublin Core http://128.253.70.110/DC5/UserGuide5.html has been chosen as our primary metadata schema. Many of the resources out there are educational, however, and these can also be well described by the EdNA (Education Network of Australia) metadata schema http://www.edna.edu.au/edna/owa/info.getpage?sp=auto&pagecode=5210.
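As a rough sketch of what one such record might look like when embedded in a web page, the Python fragment below renders a handful of Dublin Core elements as HTML meta tags. The function name, the choice of elements and the sample values are illustrative assumptions only; they are not the output of the MetaWeb tools or an agreed element set for the gateway.

```python
from html import escape


def dublin_core_meta(record):
    """Render a dict of Dublin Core elements as HTML <meta> tags.

    `record` maps an element name (e.g. "Title") to a string or a list
    of strings. Both the function and the "DC." prefix convention used
    here are assumptions made for illustration.
    """
    lines = []
    for element, values in record.items():
        if isinstance(values, str):
            values = [values]  # treat a single value as a one-item list
        for value in values:
            lines.append(
                '<meta name="DC.%s" content="%s">'
                % (escape(element), escape(value))
            )
    return "\n".join(lines)


# Hypothetical record for one of the sites mentioned above; the subject
# terms and type are invented placeholders, not evaluated metadata.
example_record = {
    "Title": "ChemDex",
    "Identifier": "http://www.shef.ac.uk/chemistry/chemdex/",
    "Subject": ["chemistry", "web directories"],
    "Type": "index",
}

print(dublin_core_meta(example_record))
```

A repeated element such as Subject simply produces one meta tag per value, which keeps each term separately indexable.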
We are also very interested in the development of metadata that can be used to describe resources peculiar to the chemical community. Henry's prototype DC-Chem schema http://www.ch.ic.ac.uk/talks/wm4/9.html looks like it would be a good place to start.

At the moment I have to confess that we do not have a fully developed web site for the gateway, but some background info can be found at http://www.ch.adfa.oz.au/metachem/ A more comprehensive and usable site is on the virtual drawing board.

I would very much appreciate your comments, particularly in the areas of evaluation criteria, chemically-specific metadata elements, whose metadata you will trust, identifying our initial 10000 resources, etc. Please also do not hesitate to let me know if you can see anything we can do down-under to create productive international alliances and provide better access to information for the global chemical community.

___________________________________________________________________
Alan Arnold, School of Chemistry and Director of Flexible Education,
University College (UNSW)
Australian Defence Force Academy, CANBERRA ACT 2600 Australia
voice: +61 2 6268 8080
fax: +61 2 6268 8002
web: http://www.ch.adfa.oz.au/apa/

chemweb: A list for Chemical Applications of the Internet.
To post to list: mailto:chemweb@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/chemweb/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message; (un)subscribe chemweb
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
Dear Alan,

Re: your posting about the ChemInfo/MetaChem gateway
> we believe that a distinguishing feature of the ChemInfo/MetaChem gateway will be that each resource will be evaluated, described, classified and indexed by subject specialists
My comments here are not intended to give any answers, just to raise a few questions with reference to my experiences with "WWW Links for Chemists". http://www.liv.ac.uk/Chemistry/Links/links.html IMO, of the four tasks you outline, "evaluated and described" are not as important as "classified and indexed". For our site we try to concentrate on "classified and indexed".
> providing the element of trust that is crucial to efficient resource discovery in these times. The final result will be a database of metadata records providing information about, and links to, the resources and also their level of validity and authority as a basis for research and teaching activity.
I refer to a mail sent to chemweb some time ago by Paul Deards with reference to indexes of chemistry resources http://www.ch.ic.ac.uk/hypermail/chemweb/0293.html Quote - "settling on a single centralised site as the 'authoritative list' is not the answer - in an increasingly fragmented scientific community, why should we expect a (say) spectroscopist to be able to categorise (say) organometallic links any better than Yahoo? Today's Chemistry is too big for a single list."
> Over the next 6 months, our librarians plan to produce metadata for up to 10000 records
A few points -

The Reviewers - unless you can provide a specialist reviewer for each particular branch of chemistry, reviews remain little more than the opinions of an individual. Are subject librarians well enough informed to give useful reviews? Are they the right people for the task? I would point out that the commercial version, ChemDex+, is maintained by a Ph.D. level chemist, as are most indexes of this type.

Download times - am I the only person who hates waiting for a review or a list of reviews to download? More often than not, they consist of waffle which separates me from the document I am after. You are dealing with a highly educated audience, which generally will have an idea of the information it requires. Our index provides little more than the title of the document (sometimes an abridged version at that!). We have received no requests for reviews.

How many? - I guess the 10000 figure was arrived at arbitrarily. But are there that many relevant chemistry documents out there? Our site has just under 3700 resources categorised. The collaborative "International University Chemistry Departments" section points to about another 1000 resources on other sites, and we have perhaps 400 resources waiting to be reviewed; OK, say 5000 resources at the most. ChemDex has about 3700, the commercialised ChemDex+ even fewer. Unless you start reviewing individual sub-pages on sites, e.g. individual chemists within a department, will your project manage 10000? If you were to take that approach, it would become fiendishly difficult to keep your database current. Also, ours is an established index (on-line for 3 years) that receives between two and four requests for additions per day; this contribution from users is effectively the lifeblood of the index. If you are starting from scratch, it will take time to build an audience.
"Similar gateway initiatives have emerged from similarly concerned > chemical communities in the US and UK and elsewhere."
I cannot see how you can avoid "reinventing the wheel" somewhere along the line. When we set out, we carefully avoided duplicating other indexes by targeting specific areas, e.g. indexing corporate chemical websites. We also chose to index just our own country's university chemistry department websites and to point to other sites for different countries, thus avoiding the duplication problem. With the exception of a few sites, most of the original indexes have either fallen out of date (e.g. UCLA and WebChemistry) or fallen behind due to lack of maintenance.
> At the moment I have to confess that we do not have a fully developed web site for the gateway, but some background info can be found at http://www.ch.adfa.oz.au/metachem/ A more comprehensive and usable site is on the virtual drawing board. I would very much appreciate your comments, particularly in the areas of evaluation criteria, chemically-specific metadata elements, whose metadata you will trust, identifying our initial 10000 resources, etc.
WRT metadata, although "WWW Links for Chemists" has metadata in its .html, we take little notice of the .html code/content of the documents we index. Until someone comes up with a chemistry-specific *spidering* search engine, i.e. a chemistry-specific version of AltaVista, the metadata isn't going to be used effectively.

Perhaps it is worth noting that there are discrepancies even within individual search engines. A search of AltaVista doesn't show our site in the top 50, but their review site "LookSmart" ranks our site pretty highly. NB "Best of the Web 95" is also up there. http://altavista.looksmart.com/e53712/e263953/e263954/r?lu&izf& So perhaps we do get some recognition for our meta tags after all. Some search engines, such as Excite (and the associated WebCrawler), actually ignore metadata, as they claim it can be misleading. http://www.excite.com/Info/listing.html#meta
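For anyone wondering what a spidering engine would actually have to do with those meta tags, here is a minimal sketch in Python using only the standard library's HTML parser. The class and function names and the sample page are made up for illustration; this is not how AltaVista, Excite or any other engine is known to work.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            # Keep every value, since a name may legitimately repeat.
            self.meta.setdefault(name.lower(), []).append(content)


def extract_meta(html_text):
    """Return a dict mapping lower-cased meta names to lists of contents."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.meta


# Hypothetical page head, invented for this example.
sample_page = """
<html><head>
<title>Example index</title>
<META NAME="keywords" CONTENT="chemistry, links, index">
<meta name="description" content="An example chemistry index page">
</head><body></body></html>
"""

print(extract_meta(sample_page))
```

Reading the tags is the easy part; whether an engine then trusts what it reads is a policy decision, which is the point about Excite above.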
> Please also do not hesitate to let me know if you can see anything we can do down-under to create productive international alliances and provide better access to information for the global chemical community.
Well, maybe you could contribute to http://www.liv.ac.uk/Chemistry/Links/international.html Several visitors to our site from your country have asked me to approach you about covering Australia, as you already have OzChemNet in place. If you provide the page, we just point to it, and you get the hits. IMO this is the way forward.

Unfortunately (or fortunately?) this decentralised approach does not lend itself well to commercialisation. In fact, it could be seen as undermining the commercial efforts of others by providing for free what they would charge for (or at least make a profit from by some means such as advertising).

I hope I have not sounded too discouraging, as all such efforts should be supported. All the best for your project.

Regards, Michael Barker.

--
Michael H. Barker GRSC
The University of Liverpool, Department of Chemistry
Oxford Street, Liverpool L69 7ZD, United Kingdom
E-mail: mhbarker@liv.ac.uk
Tel: +44 151 794 2274
Fax: +44 151 794 3588
http://www.liv.ac.uk/Chemistry/Links/links.html
"Chemistry: WWW Links for Chemists"
Michael H. Barker writes,
> WRT metadata, although "WWW Links for Chemists" has metadata in its .html, we take little notice of the .html code/content of the documents we index. Until someone comes up with a chemistry-specific *spidering* search engine, i.e. a chemistry-specific version of AltaVista, the metadata isn't going to be used effectively.
I agree entirely that the community itself will have to solve this problem, although it should make good use of generic technologies developed by people who know what they are doing. This is why initiatives such as the Dublin Core, Dublin Chem, XML and RDF are so important to us, but it is we who will have to implement them for chemists.

For what it's worth, two years ago I inserted a tiny little document into the root directory of our server called chembot.txt (named after robots.txt, the control file that most generic search engines are supposed to inspect prior to indexing your site). If every chemistry-related site were to adopt such a convention as a global chemical standard, then the creation of a "chemistry-specific version of AltaVista" would be quite trivial! chembot.txt, at its simplest, represents a single bit of information: if it's present, the site will have chemical content; if it's not, it probably will not. chembot.txt was patterned after the MIME content of the site. Arguably, nowadays it should also include more explicit metadata search terms.

Dr Henry Rzepa, Dept. Chemistry, Imperial College, LONDON SW7 2AY;
mailto:rzepa@ic.ac.uk; Tel (44) 171 594 5774; Fax: (44) 171 594 5804.
URL: http://www.ch.ic.ac.uk/rzepa/
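To make the idea concrete, the presence test described above could be probed along the following lines in Python. This is only a sketch under stated assumptions: it treats an HTTP 200 response for /chembot.txt at the server root as "present", it ignores the file's contents entirely, and all function names and the example host names are hypothetical.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import urlopen


def chembot_url(site_url):
    """Return the chembot.txt location at the root of the site's server."""
    return urljoin(site_url, "/chembot.txt")


def http_status(url, timeout=10):
    """Fetch `url` and return its HTTP status code, or None on failure."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # e.g. 404 when the file is absent
    except (URLError, OSError):
        return None  # host unreachable, DNS failure, timeout, ...


def has_chemical_content(site_url, status_of=http_status):
    """Assumed convention: the site is chemical if chembot.txt exists."""
    return status_of(chembot_url(site_url)) == 200


def chemistry_sites(candidates, status_of=http_status):
    """Filter a list of candidate URLs down to those exposing chembot.txt."""
    return [url for url in candidates if has_chemical_content(url, status_of)]


if __name__ == "__main__":
    # Offline demonstration with a fake status lookup, so that nothing is
    # fetched; the host names are placeholders, not real chemistry sites.
    fake_statuses = {
        "http://chem.example/chembot.txt": 200,
        "http://other.example/chembot.txt": 404,
    }
    candidates = ["http://chem.example/dept/", "http://other.example/"]
    print(chemistry_sites(candidates, fake_statuses.get))
```

A spider would run a check like this once per server before deciding whether to index it, in the same way that generic engines consult robots.txt.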