Dear Colleagues,

Recently, I have been playing with the CS ChemDraw Net plug-in, with the aim of constructing tutorial pages to help 1st year chemistry students convert systematic names into stick diagrams. I had the idea that I could use the SMILES string available from the Java applet supplied by CambridgeSoft, along with a little bit of JavaScript, to test whether the structure drawn by the student corresponds to the name supplied.

However, after a few tests I realised the major flaw in my idea: a given structure does not have a unique SMILES string. (I would like to add in my defence that I knew this but had forgotten about it in my excitement about the possibility of creating the tutorials.) I am aware that methods of creating "unique" strings have been discussed, but I have not investigated them yet. However, with the CS plug-in (or the Office Pro suite) it is possible to create different SMILES strings depending on how the structure is drawn, so the availability of these methods is somewhat academic.

So, I am wondering if anyone on the List has attempted to do anything similar and/or if they have been successful... and would like to share their experiences with the List.

Of course, for simple structures it is relatively straightforward (if a little dull) to work out all the SMILES strings (and then test all the possibilities with the JavaScript), but for more complex "simple" branched structures it would be considerably more time consuming. This also makes me wonder if it is possible to calculate the total number of SMILES strings for a structure?

Thanks for any comments...

Anthony Lewis

C. Anthony Lewis
Petroleum & Environmental Geochemistry Group,
Department of Environmental Sciences,
University of Plymouth,
Plymouth, Devon PL4 8AA, U.K.
tel: +44 (0)1752 233000 ext. 2988
FAX: +44 (0)1752 233035
e-mail: calewis@plymouth.ac.uk

chemweb: A list for Chemical Applications of the Internet.
To post to list: mailto:chemweb@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/chemweb/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe chemweb
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
So, I am wondering if anyone on the List has attempted to do anything similar and/or if they have been successful... and would like to share their experiences with the List.
Sure. When we first created the ChemDraw Plug-In this was one of our demos. For the reasons you discuss, you cannot count on a specific SMILES string to be entered by the user, so you need to do a full atom-by-atom comparison of the user's structure against the "correct" structure. This is most conveniently done on the server, and is pretty straightforward assuming you already have experience writing Windows-based CGI scripts.
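A minimal sketch of what such an atom-by-atom comparison could look like, written in Python for illustration. This is not the code used in the CambridgeSoft demo. The function name `same_structure` and the input layout (a list of element symbols plus a list of `(i, j, order)` bond tuples) are assumptions made for this example, and getting that data out of the plug-in is not shown. The sketch ignores stereochemistry, charges, and aromatic versus Kekulé drawings, and assumes hydrogens are implicit.

```python
from collections import defaultdict


def same_structure(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return True if two molecular graphs match atom by atom.

    atoms_*: list of element symbols, e.g. ["C", "C", "O"]
    bonds_*: list of (i, j, order) tuples indexing into atoms_*
    (Hypothetical input format chosen for this sketch.)
    """
    # Cheap rejections first: same atoms overall, same number of bonds.
    if sorted(atoms_a) != sorted(atoms_b) or len(bonds_a) != len(bonds_b):
        return False

    def adjacency(bonds):
        adj = defaultdict(dict)
        for i, j, order in bonds:
            adj[i][j] = order
            adj[j][i] = order
        return adj

    adj_a, adj_b = adjacency(bonds_a), adjacency(bonds_b)
    n = len(atoms_a)
    mapping = {}  # atom index in A -> atom index in B
    used = set()  # atoms of B already assigned

    def extend(i):
        # Try to assign atom i of A to some unused atom of B (backtracking).
        if i == n:
            return True
        for cand in range(n):
            if cand in used or atoms_a[i] != atoms_b[cand]:
                continue
            if len(adj_a[i]) != len(adj_b[cand]):
                continue
            # Every already-mapped neighbour of i must be bonded to cand
            # with the same bond order.
            if all(adj_b[cand].get(mapping[nb]) == order
                   for nb, order in adj_a[i].items() if nb in mapping):
                mapping[i] = cand
                used.add(cand)
                if extend(i + 1):
                    return True
                del mapping[i]
                used.discard(cand)
        return False

    return extend(0)


# 2-methylpropane drawn with two different atom numberings, and butane.
isobutane_1 = (["C"] * 4, [(0, 1, 1), (1, 2, 1), (1, 3, 1)])
isobutane_2 = (["C"] * 4, [(0, 3, 1), (1, 3, 1), (2, 3, 1)])
butane = (["C"] * 4, [(0, 1, 1), (1, 2, 1), (2, 3, 1)])

print(same_structure(*isobutane_1, *isobutane_2))  # same structure
print(same_structure(*isobutane_1, *butane))       # different structure
```

Because the comparison works on atoms and bonds rather than on strings, it does not matter in which order the student drew the structure. Plain backtracking like this is adequate for tutorial-sized molecules but would be slow for large, highly symmetric ones.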
Of course, for simple structures it is relatively straightforward (if a little dull) to work out all the SMILES strings (and then test all the possibilities with the JavaScript), but for more complex "simple" branched structures it would be considerably more time consuming. This also makes me wonder if it is possible to calculate the total number of SMILES strings for a structure?
Sure, but you won't like the answer. For any structure, there are an infinite number of possible SMILES strings. Maybe divided by two, but something close to that. The problem is that you can put disconnects anywhere you like, and ring closers don't have to be consecutive. So while ethane is traditionally CC, there is nothing officially *illegal* about any of:

C1.C1
C2.C2
C3.C3
C%19.C%19

etc.

The number of SMILES strings without disconnects is quite a bit smaller, but still huge. Something along the order of magnitude of (Total Number Of Atoms) * (SUM), where SUM is calculated by adding (n-2) for every n-coordinate atom in the structure. Or something like that...

Jonathan Brecher
CambridgeSoft Corporation
jsb@camsoft.com
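To get a feel for how quickly the strings multiply even without disconnects, here is a small Python sketch that enumerates SMILES strings for a restricted case: acyclic structures with single bonds only, written with the usual convention that every branch except the last goes in parentheses. It is an illustration under those assumptions, not a check of the estimate above. The function name `tree_smiles` and the input layout (element symbols plus `(i, j)` bond pairs) are made up for this example, and other legal variants (explicit `-` bonds, bracket atoms, ring-closure tricks) are deliberately left out.

```python
from itertools import permutations


def tree_smiles(atoms, bonds):
    """Enumerate SMILES strings for an acyclic, single-bonded structure.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs (hypothetical format for this sketch)
    Returns the set of distinct strings obtained from every choice of
    starting atom and every ordering of the branches at each atom.
    """
    adj = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)

    def write(atom, parent):
        children = [nb for nb in adj[atom] if nb != parent]
        if not children:
            return {atoms[atom]}
        results = set()
        for order in permutations(children):
            # Combine every sub-string of every child, in this branch order.
            partial = {atoms[atom]}
            for pos, child in enumerate(order):
                last = pos == len(order) - 1
                subs = write(child, atom)
                partial = {p + (s if last else "(" + s + ")")
                           for p in partial for s in subs}
            results |= partial
        return results

    strings = set()
    for start in range(len(atoms)):
        strings |= write(start, None)
    return strings


ethanol = tree_smiles(["C", "C", "O"], [(0, 1), (1, 2)])
isobutane = tree_smiles(["C"] * 4, [(0, 1), (1, 2), (1, 3)])
print(sorted(ethanol))
print(sorted(isobutane))
```

Symmetry collapses many traversals into the same string, so the number of distinct strings can be much smaller than the number of ways of walking the structure. Rings and disconnects add further strings on top of whatever this enumeration finds.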