Sunday, March 01, 2009

PubChem takes liberties with hydrogens

The submitted structure (a) is C3H5O5P, but PubChem shows C3H4O5P+ (b). How did that happen? Why did the deposited molecule lose a hydride (H−)?

The structure C16H36MoN6O4P2 (c), presumably submitted by NIST, has acquired two hydrons (H+) in PubChem to become [C16H38MoN6O4P2]2+ (d).
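The bookkeeping behind both observations is simple enough to check mechanically. Here is a minimal sketch in Python that subtracts one molecular formula from another and reports the change in atom counts and net charge. The helper names `parse_formula` and `formula_difference` are my own, not part of any PubChem tool, and the parser assumes flat formulas written as above (no nested parentheses, no isotopes).

```python
import re
from collections import Counter


def parse_formula(text):
    """Return (element counts, net charge) for a flat molecular formula.

    Accepts forms such as 'C3H5O5P', 'C3H4O5P+' and '[C16H38MoN6O4P2]2+'.
    """
    m = re.fullmatch(r"\[?([A-Za-z0-9]+)\]?(\d*)([+-])?", text.strip())
    if m is None:
        raise ValueError(f"unsupported formula: {text!r}")
    body, magnitude, sign = m.groups()
    if magnitude and sign is None:
        raise ValueError(f"charge magnitude without a sign: {text!r}")

    # Split the body into (element symbol, optional count) pairs.
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", body)
    if "".join(sym + num for sym, num in tokens) != body:
        raise ValueError(f"could not parse element symbols in: {text!r}")

    counts = Counter()
    for sym, num in tokens:
        counts[sym] += int(num) if num else 1

    charge = 0
    if sign is not None:
        charge = int(magnitude) if magnitude else 1
        if sign == "-":
            charge = -charge
    return counts, charge


def formula_difference(submitted, shown):
    """Return (changed atom counts, change in charge) going from submitted to shown."""
    a, charge_a = parse_formula(submitted)
    b, charge_b = parse_formula(shown)
    delta = {el: b[el] - a[el] for el in sorted(set(a) | set(b)) if b[el] != a[el]}
    return delta, charge_b - charge_a


if __name__ == "__main__":
    print(formula_difference("C3H5O5P", "C3H4O5P+"))
    # ({'H': -1}, 1)
    print(formula_difference("C16H36MoN6O4P2", "[C16H38MoN6O4P2]2+"))
    # ({'H': 2}, 2)
```

One hydrogen fewer together with one more positive charge is a net loss of H−, and two hydrogens more together with two more positive charges is a net gain of two H+. This only compares formulas; it says nothing about where in PubChem's processing the change happened.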

Egon Willighagen said...

I have seen a lot of such problems. I have no clue how the hydrogens were lost, but I did observe inconsistent behaviour between the ASN and MDL molfiles you can download for entries, the former looking 'chemically better'. File formats are a known source of problems, and it is not uncommon for cheminformatics tools to guess wrongly about information that is missing from those formats.

Regarding hydrogens, as several other Blue Obelisk members advocate too: always use explicit hydrogens! Those are much harder to lose :)

Kirill said...

Sure, I always try to draw hydrogens, especially in non-standard cases. However, the second example shows added hydrogens, and this is difficult to prevent!