Sunday, March 01, 2009

PubChem takes liberties with hydrogens

The submitted structure (a) is C3H5O5P, but PubChem shows C3H4O5P+ (b). How did that happen? Why did the deposited molecule lose a hydride (H−)?

3-[hydroxy(oxido)phosphoranyl]pyruvic acid
(a)

(b)

The structure C16H36MoN6O4P2 (c), presumably submitted by NIST, has acquired two hydrons in PubChem to become [C16H38MoN6O4P2]2+ (d).

(c)

(d)
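
The bookkeeping behind both cases can be checked with a few lines of Python. This is only a rough sketch: parse_formula and hydrogen_change are hypothetical helpers written for this post, not part of PubChem or any cheminformatics toolkit, and they assume flat formulas (no nested parentheses, no isotopes) with an optional trailing charge.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Parse a flat formula such as 'C3H4O5P+' or '[C16H38MoN6O4P2]2+'.

    Returns (element counts, net charge). Hypothetical helper: it does not
    handle nested parentheses, isotopes, or hydrates.
    """
    match = re.fullmatch(r"\[?([A-Za-z0-9]+)\]?(\d*)([+-]?)", formula)
    if match is None:
        raise ValueError(f"unsupported formula: {formula!r}")
    body, magnitude, sign = match.groups()

    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", body):
        counts[element] += int(number) if number else 1

    charge = 0
    if sign:
        charge = int(magnitude or 1) * (1 if sign == "+" else -1)
    return counts, charge


def hydrogen_change(submitted, shown):
    """Return (change in H count, change in charge) between two formulas.

    Raises ValueError if anything other than hydrogen differs.
    """
    before, charge_before = parse_formula(submitted)
    after, charge_after = parse_formula(shown)

    heavy_before = {e: n for e, n in before.items() if e != "H"}
    heavy_after = {e: n for e, n in after.items() if e != "H"}
    if heavy_before != heavy_after:
        raise ValueError("heavy atoms differ; not a pure hydrogen change")

    return after["H"] - before["H"], charge_after - charge_before


# (a) -> (b): one H fewer, charge up by one, i.e. a hydride (H-) was lost
print(hydrogen_change("C3H5O5P", "C3H4O5P+"))  # (-1, 1)

# (c) -> (d): two H more, charge up by two, i.e. two hydrons (H+) were added
print(hydrogen_change("C16H36MoN6O4P2", "[C16H38MoN6O4P2]2+"))  # (2, 2)
```

Comparing only the formulas says what changed, not where in the processing it changed, but it does make the hydride-versus-hydron distinction explicit.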

2 comments:

Egon Willighagen said...

I have seen a lot of such problems and have no clue how the hydrogens were lost, but I did observe inconsistent behaviour between the ASN and MDL molfiles you can download for entries, the former looking 'chemically better'. File formats are a known source of problems, and it is not uncommon for cheminformatics tools to guess wrongly about information missing from those formats.

Regarding hydrogens, as several other Blue Obelisk members advocate too: always use explicit hydrogens! Those are much harder to lose :)

Kirill said...

Sure, I always try to draw hydrogens, especially in non-standard cases. However, the second example shows added hydrogens, and this is difficult to prevent!