Tuesday, December 06, 2011

Collaborative Computational Technologies for Biomedical Research

It’s been a while since I read a science/technology book from cover to cover. And was it worth it? Definitely.

The book is about collaboration and is a collaboration. Ironically, the best-written chapters almost invariably are those by single authors. Which confirms my own theory that writing (including scientific writing) is not exactly a collaborative activity. The contributions by Robert Porter Lynch [1], Robin W. Spencer [2], Victor J. Hruby [3], Edward D. Zanders [4], Brian Pratt [5] and Keith T. Taylor [6] are especially worth noting; I wish the whole book were written at the level of these chapters. Then again, collaboration is always a compromise. The material presented here is diverse and heterogeneous: what did you expect?

I am sure there are people who do all sorts of stuff using their smartphones, including scientific database browsing and chemical structure drawing [7]. This latter activity does not strike me as especially productive or convenient. (It also makes me glad that the use of mobile phones while driving is outlawed in most of Europe.) In my view, for the purposes of computer graphics bigger is better: if I had a choice, I’d go for HIPerWall (25,600 × 8000 pixels) or, better still, HIPerSpace (35,840 × 8000 pixels) display walls [8]. Then I could draw some really large (in many senses) molecules.

As much as I enjoy reading the real (hardcopy) book, it would be nice to see it online, preferably in open access. For instance, Chapter 25 [9] has 196 references, all of which are URLs, and some of them are rather long ones. I’d love to be able to click on them rather than type!

Will the wikis, virtual communities and cloud computing replace the behemoth pharma companies and NCBI? A man can dream. Ekins et al. write [10]:
As a result of the recent recession there is a lot of drug discovery and development talent available now due to company lay-offs. If the software or other tools to enable this workforce to be productive and collaborate were available and they participated in the existing scientific collaboration networks, then there may be potential for enormous breakthroughs.
I wish I could share the authors’ optimism. Yes, there is potential, but it is highly unlikely that unemployed researchers are in the mood to collaborate. In case you wonder why: being unemployed is a full-time occupation, which leaves precious little spare time. I am rather inclined to agree with Robin W. Spencer [2]:
Especially for cutting-edge scientific challenges, the participants you need are probably well paid and not particularly enthused by another tee shirt, coffee cup, or $100 voucher.
More quotes from this book can be found here.

I use this opportunity to lament the decline of old-fashioned copy editing [11]. I have got used to the lack of any such luxury in open access publications: if the paper is accepted, the publisher tends to keep all your typos intact. But when you buy a book from John Wiley & Sons for a hundred something bucks, you’d expect some editorial intervention. (To be honest, I did not buy it. I can’t afford to buy books at such prices anyway.) The major and minor irritations include:
  • Typos: “chpater” instead of “chapter” (p. 281) — I thought by now the text editing software should take care of these.
  • Tautologies: ‘The institutes of the national Institutes of Health’ (p. 496); ‘... we need to consider standards specifically for chemistry and biology. In chemistry specifically...’ (p. 202).
  • Impenetrable sentences, e.g. ‘Many aspects should be considered, such as a regulatory path for filing, potential market size, differentiability of the therapeutic and experience with and difficulty to carry out clinical trials in the disease of interest’ (p. 252) or ‘This will only be done by drawing from the mental resources of an extended scientific community in an innovative and complex, yet “daily practice”, manner that promises a profound impact on our ability to use existing data to generate new knowledge with the maximum conceivable serendipity’ (p. 454). You what?
  • Overabundance of acronyms (have a look at p. 497 and you’ll see what I mean).
  • Overabundance of buzz-words of yesteryear: crowdsourcing (see below), integration, leveraging, paradigm, stakeholder and so on. The worst offenders, however, are clear and clearly. Clearly, when these words are used too often, it is clear that something is not quite clear.
Now for “crowdsourcing”: I find the term not only ugly but offensive. As a scientist (once a scientist, always a scientist), I am open to collaboration. Also, as a scientist, I detest being part of a crowd. Period.

Don’t get me wrong: it is a good book. I wouldn’t hesitate to recommend it to any decent scientific library. But it could have been a great book.

  1. Lynch, R.P. Collaborative innovation: essential foundation of scientific discovery. In: Ekins, S., Hupcey, M.A.Z. and Williams, A.J. (eds.) Collaborative Computational Technologies for Biomedical Research. John Wiley & Sons, Hoboken, 2011, pp. 19—37.
  2. Spencer, R.W. Consistent patterns in large-scale collaboration. Ibid., pp. 99—111.
  3. Hruby, V.J. Collaborations between chemists and biologists. Ibid., pp. 113—120.
  4. Zanders, E.D. Scientific networking and collaborations. Ibid., pp. 149—160.
  5. Pratt, B. Collaborative systems biology: open source, open data, and cloud computing. Ibid., pp. 209—220.
  6. Taylor, K.T. Evolution of electronic laboratory notebooks. Ibid., pp. 303—320.
  7. Williams, A.J., Arnold, R.J.G., Neylon, C., Spencer, R.W., Schürer, S. and Ekins, S. Current and future challenges for collaborative computational technologies for the life sciences. Ibid., pp. 491—517.
  8. He, Z., Ponto, K. and Kuester, F. Collaborative visual analytics environment for imaging genetics. Ibid., pp. 467—490.
  9. Bradley, J.-C., Lang, A.S.I.D., Koch, S. and Neylon, C. Collaboration using open notebook science in academia. Ibid., pp. 425—452.
  10. Ekins, S., Williams, A.J. and Hupcey, M.A.Z. Standards for collaborative computational technologies for biomedical research. Ibid., pp. 201—208.
  11. Clark, A. The lost art of editing. Guardian, 11 February 2011.

Wednesday, November 23, 2011

Kilogram, pterin, selenium

I really enjoyed the latest issue of Chemistry International. Did you know that pterin is called “pterin” because it was first isolated from butterfly wings, and folic acid is “folic” because it was first found in leafy vegetables (from Latin folium)? I just learned that from Edward Taylor’s illuminating article on Alimta [1].

Next, two papers on the kilogram in the “New SI”. Currently, the kilogram is defined as a unit of mass equal to the mass of the international prototype kilogram (IPK), which is a cylinder made of an alloy of 90% platinum and 10% iridium, kept at the International Bureau of Weights and Measures in France. The problem is, the IPK is losing mass! But even if it were not, it is still not good that one of the SI base units is linked to an artifact rather than to something more fundamental. The chemist in me prefers the definition of the kilogram based on the mass of carbon-12 [2] to the one based on the Planck constant [3].

Finally, an essay by Jan Trofast on the discovery of selenium [4]. I didn’t know that Swedes discovered so many elements!
  1. Taylor, E.C. (2011) From the wings of butterflies: The discovery and synthesis of Alimta. Chemistry International 33, 4—8.
  2. Censullo, A.C., Hill, T.P. and Miller, J. (2011) Part I — From the current “kilogram problem” to a proposed definition. Chemistry International 33, 9—12.
  3. Mills, I. (2011) Part II — Explicit-constant definitions for the kilogram and for the mole. Chemistry International 33, 12—15.
  4. Trofast, J. (2011) Berzelius’ discovery of selenium. Chemistry International 33, 16—19.

Friday, October 07, 2011


In 1982, Dan Shechtman observed an unusual diffraction pattern in an aluminium–manganese alloy [1, 2]. Almost 30 years later, he was awarded the Nobel Prize in Chemistry 2011 “for the discovery of quasicrystals”.

Earlier this year, the first naturally occurring quasicrystal was described. Icosahedrite Al63Cu24Fe13 is a new mineral found in southeastern Chukotka, Russia. It is named “for the icosahedral symmetry of its internal atomic structure, as observed in its diffraction pattern” [3].

  1. Shechtman, D., Blech, I., Gratias, D. and Cahn, J. (1984) Metallic phase with long-range orientational order and no translational symmetry. Physical Review Letters 53, 1951—1953.
  2. Fernholm, A. (2011) Crystals of golden proportions. Nobelprize.org.
  3. Bindi, L., Steinhardt, P.J., Yao, N. and Lu, P.J. (2011) Icosahedrite, Al63Cu24Fe13, the first natural quasicrystal. American Mineralogist 96, 928—931.

Sunday, September 25, 2011

Do we need the terminal e?

Chemical English, after all, is just a subset of English. As such, it suffers from the same problem as English in general: the pronunciation of the words is far from obvious. What makes it worse for chemistry is the absence of any authoritative pronunciation guide. (Since last year’s post on this topic, the audio guide “Pronunciation of Chemical Terms”, originally hosted by Hong Kong Cyber Campus, has disappeared from the web.)

You’d think that chemical terminology was developed after the Great Vowel Shift and therefore there must be less of a gap between the spoken and written word. You’d be wrong. The gap is there, a-gaping.

For instance, the effect of silent terminal e on pronunciation of English words, including chemical terms, is simply unpredictable. Sometimes the terminal e makes no difference: both thiamine and thiamin are pronounced and mean the same. (Cf. “win” and “wine”.) In some other cases, it makes a lot of difference: chlorine (chemical element number 17) and chlorin (tetrapyrrole), or silicon (chemical element number 14) and silicone (a class of silicon-containing polymers).

Protein vs cysteine; cisplatin vs astatine; krypton vs ketone; phenol vs pyrrole: what is the point of terminal es? Wouldn’t we all be better off without them? That would spare us a few rules about elision of terminal vowels, for example.

Saturday, August 13, 2011

Sometimes metal just plain rusts

Our stainless steel forks and knives, which in England were literally stainless, even spotless, for years, here on Fuerteventura developed rust stains in a matter of days. What’s the matter?

I found this lovely quote from Brion Toss’s book [1]:
Sometimes metal just plain rusts. Stainless steel rusts more slowly, but tropical climates will get to it in just a few years. Galvanized steel left untended can dissolve in a matter of months.
Well said, but what exactly is wrong with “tropical climates”? High humidity and high temperature, that’s what.

But wait. Humidity in Fuerteventura is not higher than in England, right? We hardly have any rain on this island. But the temperature is definitely higher. As is the case with most chemical reactions, the corrosion rate increases with increasing temperature. Add to this the salt air. (Salt speeds up rusting by making the moisture on the metal a better electrolyte.) No wonder cars rust quickly here.
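To put a rough number on the temperature effect, here is a minimal sketch using the Arrhenius equation, k = A·exp(−Ea/RT). The activation energy (50 kJ/mol) and the two temperatures (10 °C for an English kitchen, 25 °C for a Canarian one) are assumptions chosen only for illustration; they are not measured values for stainless steel cutlery.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def arrhenius_ratio(ea_j_per_mol, t_cold_c, t_warm_c):
    """Ratio k(warm)/k(cold) from the Arrhenius equation.

    The pre-exponential factor A cancels out, so only the
    activation energy and the two temperatures are needed.
    """
    t_cold = t_cold_c + 273.15  # convert Celsius to kelvin
    t_warm = t_warm_c + 273.15
    return math.exp(ea_j_per_mol / R * (1.0 / t_cold - 1.0 / t_warm))

# Assumed, illustrative inputs (not experimental data):
EA = 50_000.0  # hypothetical activation energy, J/mol
ratio = arrhenius_ratio(EA, t_cold_c=10.0, t_warm_c=25.0)
print(f"Rate at 25 °C relative to 10 °C: {ratio:.1f}x")
```

With these assumed inputs the rate comes out roughly three times higher. That is a noticeable speed-up, but temperature alone would hardly turn “years” into “days”, which is why the salt matters too.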

Ah well, we can always use chopsticks.

  1. Toss, B. (1998) The Complete Rigger's Apprentice: Tools and Techniques for Modern and Traditional Rigging. International Marine/Ragged Mountain Press, Camden, Maine.

Friday, July 29, 2011

Thursday, July 28, 2011

Phe—Val crosslink in symerythrin

The crystal structure of diiron protein symerythrin from Cyanophora paradoxa reveals a novel C—C cross-link between valine and phenylalanine residues [1].

  1. Cooley, R.B., Rhoads, T.W., Arp, D.J. and Karplus, P.A. (2011) A diiron protein autogenerates a valine-phenylalanine cross-link. Science 332, 929.

Saturday, June 18, 2011

Open and closed P450 2B4

The crystal structures of rabbit P450 2B4 covalently bound to the mechanism-based inactivator 4-tert-butylphenylacetylene in closed (a) and open (b) conformations have been solved [1].



  1. Gay, S.C., Zhang, H., Wilderman, P.R., Roberts, A.G., Liu, T., Li, S., Lin, H.-l., Zhang, Q., Woods, V.L., Jr., Stout, C.D., Hollenberg, P.F. and Halpert, J.R. (2011) Structural analysis of mammalian cytochrome P450 2B4 covalently bound to the mechanism-based inactivator tert-butylphenylacetylene: insight into partial enzymatic activity. Biochemistry 50, 4903—4911.

Tuesday, April 26, 2011

Importance of being obsessive-compulsive

Never underestimate the importance of naming. For instance, I would probably never have read the excellent paper by Kuhn and Wahl-Jensen [1] if not for its title. (Seriously, read it. Although the note appears in Bionomina, it is relevant to scientific nomenclature and terminology in general, not just biological taxonomy.) They write:
When terms get renamed just for the sake of renaming them then outrage at nomenclature experts is justified. But it is a two-way street. Nomenclature without discourse with the scientific community working in laboratories is useless — but science without nomenclature cannot be performed, either.
Conversely, Roderic Page argues that “quite a lot” of biology can be performed without “proper” taxonomic names [2], even though his
definition of “proper” name is a little loose: anything that had two words, second one starting with a lower case letter, was treated as a proper name.
Just imagine the fury of those who are “obsessive-compulsive about terminology” on reading that! Surely not every binomial name is “proper”? However, that is beside the point. Linnaean names are just labels. They may be preferable to NCBI tax_id codes because of aesthetic considerations but ultimately they are dispensable. We only cling to them because we believe that these labels have, as Robert M. Pirsig put it, “an intrinsic sacredness” of their own [3]:
One finds that in the Judeo-Christian culture in which the Old Testament ‘Word’ had an intrinsic sacredness of its own, men are willing to sacrifice and live by and die for words.
But what about chemistry? On the one hand, chemistry appears to be in a better position because of the superiority of chemical nomenclature over, well, any other known nomenclature. A name constructed according to the rules of systematic chemical nomenclature holds the key to the structure of the entity in question. It does not mean that there could or should be only one “proper” name for one structure. For example, “tetrafluoridolead” (additive nomenclature) and “tetrafluoroplumbane” (substitutive nomenclature) correspond to the same entity, PbF4. You don’t have to know it, because you can figure it out. Compare this with the situation in biology: there is no way to deduce that, say, Prunus dulcis and Amygdalus communis are synonyms.

On the other hand, we chemists often fall into the same trap as anyone else: we tend to believe that “proper” naming of a compound (of known structure) automatically improves our knowledge of it. But why? The terms can change. The nomenclature rules are changing. For a structure of certain complexity, the application of the same rules by different chemists (or different naming software) may result in different systematic names. Some structures as yet cannot be named by any software. So what? It is highly unlikely that a name which takes more than 100 characters will be used in any discourse. I distinctly remember thinking about it a few years ago while reading the draft of IUPAC Recommendations for rotaxane nomenclature [4]. Why not use the (equally unpronounceable but more useful) InChI string instead?

Still, I wouldn’t dismiss the aesthetics that easily. For me, concise, clear, elegant is good; long, ambiguous, ugly is bad. And coming back to the title of [1]: “being obsessive-compulsive about terminology and nomenclature” is neither a vice nor a virtue. It is a mental condition that some people (myself included) have, for better or worse.
  1. Kuhn, J.H. and Wahl-Jensen, V. (2010) Being obsessive-compulsive about terminology and nomenclature is not a vice, but a virtue. Bionomina 1: 11—14.
  2. Page, R. (2011) Dark taxa: GenBank in a post-taxonomic world.
  3. Pirsig, R.M. (1974) Zen and the Art of Motorcycle Maintenance.
  4. Yerin, A., Wilks, E.S., Moss, G.P. and Harada, A. (2008) Nomenclature for rotaxanes and pseudorotaxanes (IUPAC Recommendations 2008). Pure Appl. Chem. 80, 2041—2068.

Friday, April 08, 2011


Last summer, we bought this crystal-growing kit in Oxfam. It contains ingredients and instructions to grow several types of crystals. Those nicknamed “quartz” and “emerald” in fact are monoammonium phosphate, NH4H2PO4, while “amethyst” and “fluorite” are grown from a solution of potassium aluminium sulphate, KAl(SO4)2. Our latest experiment was to grow an “amethyst” crystal cluster. Timur and I prepared the solution, poured it over two stones from our garden and left it to grow for a week. Here’s the result.

Tuesday, March 22, 2011

Chlorite dismutase

Photosynthesis is not the only dioxygen-evolving biological process. For instance, chlorite dismutase (Cld) catalyses the production of O2 from chlorite (1):

ClO2⁻ → Cl⁻ + O2    (1)

The reaction (1) is not really disproportionation, and NC-IUBMB made a valid point that the term “chlorite dismutase” is “misleading”. Even so, the NC-IUBMB-approved, sorry, “accepted” name “chlorite O2-lyase” for an oxidoreductase is equally absurd; I am going to ignore it.

Chlorite dismutase from Azospira oryzae exists as a homohexamer [1] while Cld from Dechloromonas aromatica [2] and the enzyme from Candidatus Nitrospira defluvii [3] are homopentamers.

The active site contains a single haem group [Fe(ppIX)] coordinated by a proximal histidine residue. Goblirsch et al. [2] propose a mechanism in which the reaction of chlorite within the distal pocket of Cld generates hypochlorite (ClO⁻) and a compound I intermediate [Fe(ppIX)O] (1a). Then ClO⁻ rebounds with compound I, forming chloride and dioxygen (1b):

ClO2⁻ + [Fe(ppIX)] → ClO⁻ + [Fe(ppIX)O]    (1a)
ClO⁻ + [Fe(ppIX)O] → Cl⁻ + O2 + [Fe(ppIX)]    (1b)
  1. de Geus, D.C., Thomassen, E.A.J., Hagedoorn, P.-L., Pannu, N.S., van Duijn, E. and Abrahams, J.P. (2009) Crystal structure of chlorite dismutase, a detoxifying enzyme producing molecular oxygen. J. Mol. Biol. 387, 192—206.
  2. Goblirsch, B.R., Streit, B.R., DuBois, J.L. and Wilmot, C.W. (2010) Structural features promoting dioxygen production by Dechloromonas aromatica chlorite dismutase. J. Biol. Inorg. Chem. 15, 879—888.
  3. Kostan, J., Sjöblom, B., Maixner, F., Mlynek, G., Furtmüller, P.G., Obinger, C., Wagner, M., Daims, H. and Djinović-Carugo, K. (2010) Structural and functional characterisation of the chlorite dismutase from the nitrite-oxidizing bacterium “Candidatus Nitrospira defluvii”: Identification of a catalytically important amino acid residue. J. Struct. Biol. 172, 331—342.

Monday, March 21, 2011

Femtosecond X-ray nanocrystallography of PSI

It is a well-known fact (to those who know it well) that in order to obtain a decent diffraction pattern one has to grow a decent-size crystal first. Well, that is about to change. The PDB entry enigmatically named “femtosecond X-ray protein nanocrystallography” [1] in fact contains the structure of photosystem I (PSI) from Thermosynechococcus elongatus solved by this new method [2]. In this work, more than 3 million diffraction patterns were collected from really small PSI crystals (from ~200 nm to 2 μm in size) illuminated by the new femtosecond X-ray laser, the Linac Coherent Light Source in Stanford. According to the authors,

We mitigate the problem of radiation damage in crystallography by using pulses briefer than the timescale of most damage processes.

  1. PDB:3PCQ
  2. Chapman, H.N., Fromme, P., Barty, A. et al. (2011) Femtosecond X-ray protein nanocrystallography. Nature 470, 73—77.

Sunday, March 20, 2011

P450 from Actinoplanes teichomyceticus

The crystal structure of the P450 monooxygenase from Actinoplanes teichomyceticus (CYP165D3) involved in biosynthesis of antibiotic teicoplanin has been solved [1].

  1. Li, Z., Rupasinghe, S., Schuler, M. and Nair, S.K. (2011) Crystal structure of a phenol-coupling P450 monooxygenase involved in teicoplanin biosynthesis. Proteins: Structure, Function, and Bioinformatics 79, 1728—1738.

Thursday, March 17, 2011

P450 monooxygenase AurH from S. thioluteus

The first crystal structures of the unique P450 monooxygenase AurH from Streptomyces thioluteus have been solved [1].

  1. Zocher, G., Richter, M.E.A., Mueller, U. and Hertweck, C. (2011) Structural fine-tuning of a multifunctional cytochrome P450 monooxygenase. J. Am. Chem. Soc. 133, 2292—2302.

Thursday, January 27, 2011

Natural products

Do a Google search and you’ll find all sorts of stuff claimed to be “natural products” — amazingly, some of them even are “chemical-free”! Now seriously. To quote IUPAC’s 1999 recommendations [1],
The nomenclature of natural products has suffered from much confusion.
That does not surprise me. What is surprising, however, is that neither these nor the previous recommendations [2] tell us what the “natural products” are. Ditto the Gold Book. It may define terpenoids as “natural products and related compounds formally derived from isoprene units” but the natural products go without explanation. (Nor is it clear what “related compounds” are.) The very same Gold Book says that natural graphite is “a mineral found in nature”. Therefore, “natural” means “found in nature”. Right? Right. Is natural graphite a natural product? I am not sure.

Let us look in Webster then:
A chemical substance produced by a living organism; — a term used commonly in reference to chemical substances found in nature that have distinctive pharmacological effects. Such a substance is considered a natural product even if it can be prepared by total synthesis.
Is that any better? Both dioxygen and nitric oxide are produced by living organisms and have rather distinctive pharmacological effects, yet most chemists would hesitate to call them natural products. In The Concise Oxford Dictionary, the first definition of “natural” is
existing in or caused by nature; not artificial.
Of course human beings are parts of nature, but maybe this negative definition, “not artificial”, is indeed most useful?

In ChEBI, natural product (no definition so far) is an organic molecular entity (so no O2 or NO here) and includes the following classes:
At first glance, nothing looks particularly disturbing here. But I see a bit of a problem with the ontology. First, all “is a” children of natural product have to be natural products. What if we have, say, (artificially) fluorinated carbohydrates? Are they still carbohydrates? If not, then the True path rule is broken. If yes, then some “unnatural” compounds will be considered natural products. I don’t mind that: perhaps that will cover “related compounds” (whatever they are) nicely.
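The True path rule argument can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: the class names and “is a” links below are made up for the example and are not copied from ChEBI, and the helper functions are hypothetical, not part of any ontology library.

```python
# Toy ontology: every name and link here is illustrative, not taken from ChEBI.
IS_A = {
    "organic molecular entity": [],
    "natural product": ["organic molecular entity"],
    "carbohydrate": ["natural product"],
    "fluorinated carbohydrate": ["carbohydrate"],
}

def ancestors(term, is_a=IS_A):
    """Return every class reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen

def true_path_violations(term, unacceptable, is_a=IS_A):
    """Inherited ancestors of `term` that we consider wrong for it."""
    return sorted(ancestors(term, is_a) & set(unacceptable))

# If we insist that an artificially fluorinated sugar is NOT a natural product,
# the path up the hierarchy says otherwise:
violations = true_path_violations("fluorinated carbohydrate", {"natural product"})
print(violations)
```

The point of the sketch is that the label is inherited whether we like it or not: once “carbohydrate” sits under “natural product”, anything we file under “carbohydrate” becomes a natural product too.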

Second, CHEBI:33243 belongs to chemical entity ontology, which is (or at least it should be) purely structure-based. The origin does not enter here. There is no such thing as intrinsic “naturalness” in a natural product molecule: natural product remains a natural product even if (artificially) prepared by total synthesis.

Natural products often are equated with secondary metabolites. This does not seem right. In ChEBI, secondary metabolite (“A metabolite that is not directly involved in the normal growth, development or reproduction of an organism” — another negative definition?) belongs to role ontology. (The role ontology sounded such a good idea at the time... no, don’t get me started.) At best, one can say some natural products have role “secondary metabolite”. Yuck.

To summarise: “natural product” appears to be a rather useless top-level term. Let us look at the sources of natural products: plants, fungi, bacteria, animals. What if, instead of saying “fungal natural product”, we say “fungi-specific compound”? In this case, we discard primary metabolites, other simple compounds found just about everywhere in the universe and are left with exactly what we want: molecules isolated from and specific for fungi.

Or are we? Antibiotics, naturally synthesised by fungi, are not naturally found in humans. But when we take them, they are naturally metabolised in our liver and eventually excreted with urine. Are these metabolites the natural products? If yes, are they fungal or animal or neither?
  1. Revised Section F: Natural products and related compounds (IUPAC Recommendations 1999). Pure Appl. Chem. 71, 587—643 (1999).
  2. Nomenclature of Organic Chemistry. Section F: Natural Products and Related Compounds. Recommendations 1976. Eur. J. Biochem. 86, 1—8 (1978).