So, I am wondering if anyone on the List has attempted anything similar, whether they were successful, and whether they would like to share their experiences with the List.
Sure. When we first created the ChemDraw Plug-In this was one of our demos. For the reasons you discuss, you cannot count on a specific SMILES string to be entered by the user, so you need to do a full atom-by-atom comparison of the user's structure against the "correct" structure. This is most conveniently done on the server, and is pretty straightforward assuming you already have experience writing Windows-based CGI scripts.
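To make the idea of an atom-by-atom comparison concrete, here is a minimal sketch in Python. It is not the code from the original demo, and it does not use ChemDraw or any real cheminformatics library. The functions parse_smiles and same_structure are hypothetical names invented for this illustration. The parser only understands a small SMILES subset (uppercase organic atoms, -, =, #, branches, ring closure numbers including the %nn form, and the . disconnect) and ignores charges, isotopes, aromatic lowercase atoms and stereochemistry. The comparison is a plain backtracking graph match on element symbols and bond orders.

```python
def parse_smiles(smiles):
    """Parse a tiny SMILES subset into (atoms, bonds). Illustrative only."""
    atoms = []        # element symbol for each atom index
    bonds = {}        # frozenset({i, j}) -> bond order
    prev = None       # atom that the next atom will bond to
    order = 1         # pending bond order
    stack = []        # atoms to return to when a branch closes
    open_rings = {}   # ring closure number -> (atom index, bond order)
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch in "-=#":
            order = {"-": 1, "=": 2, "#": 3}[ch]
            i += 1
        elif ch == "(":
            stack.append(prev)
            i += 1
        elif ch == ")":
            prev = stack.pop()
            i += 1
        elif ch == ".":
            prev = None   # disconnect: the next atom gets no chain bond
            i += 1
        elif ch.isdigit() or ch == "%":
            if ch == "%":
                num = int(smiles[i + 1:i + 3])   # two-digit form, e.g. %19
                i += 3
            else:
                num = int(ch)
                i += 1
            if prev is None:
                raise ValueError("ring closure without a preceding atom")
            if num in open_rings:
                other, other_order = open_rings.pop(num)
                bonds[frozenset((prev, other))] = max(order, other_order)
            else:
                open_rings[num] = (prev, order)
            order = 1
        elif ch.isupper():
            sym = smiles[i:i + 2] if smiles[i:i + 2] in ("Cl", "Br") else ch
            i += len(sym)
            atoms.append(sym)
            idx = len(atoms) - 1
            if prev is not None:
                bonds[frozenset((prev, idx))] = order
            prev = idx
            order = 1
        else:
            raise ValueError("unsupported character: " + ch)
    if open_rings:
        raise ValueError("unclosed ring closure number")
    return atoms, bonds


def same_structure(a, b):
    """True if two parsed structures match atom by atom and bond by bond."""
    atoms_a, bonds_a = a
    atoms_b, bonds_b = b
    if sorted(atoms_a) != sorted(atoms_b) or len(bonds_a) != len(bonds_b):
        return False
    mapping = {}   # atom index in a -> atom index in b
    used = set()   # atoms of b that are already matched

    def extend(k):
        if k == len(atoms_a):
            return True
        for cand in range(len(atoms_b)):
            if cand in used or atoms_b[cand] != atoms_a[k]:
                continue
            # Bonds (and absent bonds) to every already-matched atom must agree.
            if all(bonds_a.get(frozenset((k, j)))
                   == bonds_b.get(frozenset((cand, mapping[j])))
                   for j in range(k)):
                mapping[k] = cand
                used.add(cand)
                if extend(k + 1):
                    return True
                del mapping[k]
                used.discard(cand)
        return False

    return extend(0)


correct = parse_smiles("CC(C)O")   # 2-propanol as the "correct" answer
print(same_structure(correct, parse_smiles("OC(C)C")))   # True
print(same_structure(correct, parse_smiles("CCCO")))     # False
```

The backtracking search is exponential in the worst case, which should not matter for the small structures in a teaching exercise. Anything larger or more general (aromatics, stereochemistry, charges) would be better served by a real toolkit's canonicalization or substructure matching.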
Of course, for simple structures it is relatively straightforward (if a little dull) to work out all the SMILES strings (and then test all the possibilities with the JavaScript), but for more complex "simple" branched structures it would be considerably more time-consuming. This also makes me wonder if it is possible to calculate the total number of SMILES strings for a structure?
Sure, but you won't like the answer. For any structure, there are an infinite number of possible SMILES strings. Maybe divided by two, but something close to that. The problem is that you can put disconnects anywhere you like, and ring closure numbers don't have to be consecutive. So while ethane is traditionally CC, there is nothing officially *illegal* about any of:

C1.C1
C2.C2
C3.C3
C%19.C%19
etc.

The number of SMILES strings without disconnects is quite a bit smaller, but still huge. Something on the order of (Total Number Of Atoms) * (SUM), where SUM is calculated by adding (n-2) for every n-coordinate atom in the structure. Or something like that...

Jonathan Brecher
CambridgeSoft Corporation
jsb@camsoft.com

chemweb: A list for Chemical Applications of the Internet.
To post to list: mailto:chemweb@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/chemweb/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe chemweb
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)