Wednesday, June 10, 2015

There is no perfect language

From The Information: A History, A Theory, A Flood by James Gleick:

It was once thought that a perfect language should have an exact one-to-one correspondence between words and their meanings. There should be no ambiguity, no vagueness, no confusion. Our earthly Babel is a falling off from the lost speech of Eden: a catastrophe and a punishment. “I imagine,” writes the novelist Dexter Palmer, “that the entries of the dictionary that lies on the desk in God’s study must have one-to-one correspondences between the words and their definitions, so that when God sends directives to his angels, they are completely free from ambiguity. Each sentence that He speaks or writes must be perfect, and therefore a miracle.” We know better now. With or without God, there is no perfect language.

Leibniz thought that if natural language could not be perfect, at least the calculus could: a language of symbols rigorously assigned. “All human thoughts might be entirely resolvable into a small number of thoughts considered as primitive.” These could then be combined and dissected mechanically, as it were. “Once this had been done, whoever uses such characters would either never make an error, or, at least, would have the possibility of immediately recognizing his mistakes, by using the simplest of tests.” Gödel ended that dream.

On the contrary, the idea of perfection is contrary to the nature of language. Information theory has helped us understand that — or, if you are a pessimist, forced us to understand it.

Monday, June 01, 2015

Periodic Videos

It’s been a while since I posted anything on this blog, but now I’m back.

This is a very cool collection of videos, “a lesson about every single element on the periodic table”. Featuring Professor and a really awesome reaction, here’s one about one of my favourite elements. Yes, iron is in my blood! (In yours too.)

Wednesday, October 15, 2014


From A Dictionary of Symbols by Juan Eduardo Cirlot (translated by Jack Sage):
In astrology they are called ‘terrestrial’ or ‘subterranean planets’, because of the analogous correspondences between the planets and the metals. For this reason astrologers consider that there are only seven metals (influenced by the same number of spheres), which does not mean that mankind during the astrobiological period did not recognize more. As Piobb has pointed out, some engineers have noted that the seven planetary metals make up a series which is applicable to the system of the twelve polygons. But, apart from the theory of correspondences, the metals symbolize cosmic energy in solidified form and, in consequence, the libido. On this basis, Jung has asserted that the base metals are the desires and the lusts of the flesh. Extracting the quintessence from these metals, or transmuting them into higher metals, is equivalent to setting creative energy free from the fetters of the sense world, a process identical with what esoteric tradition and astrology regard as liberation from the ‘planetary influences’. The metals can be grouped within a progressive ‘series’ in which each metal displays its hierarchical superiority over the one preceding it, with gold as the culminating point of the progression. This is why, in certain rites, the neophyte is required to divest himself of his ‘metals’ — coins, keys, trinkets — because they are symbolic of his habits, prejudices and characteristics, etc. We, for our part, however, are inclined to believe that in each particular pairing of planet with metal (as Mars with iron) there is an essential element of the ambitendent, in that its positive quality tends one way and its negative defect tends the other. Molten metal is an alchemic symbol expressing the coniunctio oppositorum (the conjunction of fire and water), related to mercury, Mercury and Plato’s primordial, androgynous being. 
And at the same time, the solid or ‘closed’ properties of matter emphasize its symbolism as a liberator — hence the connexion with Hermes the psychopomp <...> . The correspondences between the planets and the metals, from inferior to superior, are: Saturn — lead, Jupiter — tin, Mars — iron, Venus — copper, Mercury — mercury, Moon — silver, Sun — gold.

Tuesday, September 23, 2014

Pseudomonas fluorescens PhoX

Alkaline phosphatases occur widely in nature and are found in all three domains of life [1]. The Escherichia coli PhoA enzyme has been extensively studied, whereas the PhoX family of alkaline phosphatases is only minimally characterised and shows no sequence similarity to other phosphotransfer enzymes. Yong et al. [2] determined high-resolution crystal structures for native PhoX from Pseudomonas fluorescens [3] and for its complexes with phosphate [4], the nonhydrolysable ATP analogue adenosine-5′-[β,γ-methylene]triphosphate (AMP-PCP) [5], and the putative transition-state mimic vanadate [6]. The active site contains two antiferromagnetically coupled ferric ions (Fe3+), three calcium ions (Ca2+), and an oxo group bridging one Ca2+ and two Fe3+ ions.

Cartoon representation of P. fluorescens PhoX crystal structure.
The PhoX active site containing bound phosphate [2, Fig. 2c].
A model for the catalytic mechanism of PhoX [2, Fig. 3d].
The transition state is indicated with the double dagger (‡) symbol.
  1. Millán, J.L. (2006) Alkaline Phosphatases: Structure, substrate specificity and functional relatedness to other members of a large superfamily of enzymes. Purinergic Signalling 2, 335–341.
  2. Yong, S.C., Roversi, P., Lillington, J., Rodriguez, F., Krehenbrink, M., Zeldin, O.B., Garman, E.F., Lea, S.M. and Berks, B.C. (2014) A complex iron-calcium cofactor catalyzing phosphotransfer chemistry. Science 345, 1170–1173.
  3. PDB:4A9V
  4. PDB:4ALF
  5. PDB:4AMF
  6. PDB:3ZWU

Tuesday, August 26, 2014

Ogres are not like cakes

I was intrigued by the article in New Scientist which starts with the question, “Do you speak chemistry?” [1]. So much so that I asked a friend to send me the original paper [2] from Bartosz Grzybowski’s group at Northwestern University in Evanston, Illinois. It is a curious read.

Don’t get me wrong. I have nothing against analogies. I love analogies. If the linguistic analogy works for chemistry, it’s fine by me. As long as everybody understands that it is just an analogy.

The authors try to “demonstrate that a natural language such as English and organic chemistry have the same structure in terms of the frequency of, respectively, text fragments and molecular fragments”. How do they do that? They start by looking at the maximum common substrings (MCS) found in 100 sentences randomly chosen from English Wikipedia.

Perhaps not surprisingly, the most common fragment of the sentences is “e”, followed by “a” and “o”.
That is surprising to me though, considering that only “a” is a word in English. I wouldn’t be surprised if it were Spanish Wikipedia. Are the authors talking about letter frequency, perchance? But the “top three” letters in English (from most to least common) are known to be E, T, A, while in Spanish they are E, A, O. Anyway, they show that the distribution of the fragments, whatever they are, follows a power law. Then they show that the distribution of the common molecular fragments, derived from a corpus of organic molecules, also follows a power law. Big deal: so do earthquake magnitudes, city populations and stock market crashes [3]. Cadeddu et al. do not seem to be bothered by that at all:
We have just shown that there exists a set of molecular fragments with which organic molecules can be described akin to a language.
So far so bad; whether you are a linguist, a computational chemist or an organic chemist, both the methodology and the conclusions of this paper are bound to make you cringe. So, my immediate reaction was to dismiss it altogether. Ogres are not like cakes. Organic molecules are not like a language. End of story.
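
To see how little a power-law-looking distribution tells you, here is a minimal Python sketch. It is not the MCS procedure from the paper: as a simplified stand-in, it counts every contiguous letter fragment of up to three letters within each word and fits a straight line to log(frequency) against log(rank). The function names `fragment_counts` and `loglog_slope` and the three toy sentences are my own assumptions for illustration, and a sample this small proves nothing either way.

```python
from collections import Counter
import math


def fragment_counts(sentences, max_len=3):
    """Count contiguous letter fragments of length 1..max_len inside each word."""
    counts = Counter()
    for sentence in sentences:
        for word in sentence.lower().split():
            # Keep letters only, so punctuation never ends up in a fragment.
            letters = "".join(ch for ch in word if ch.isalpha())
            for n in range(1, max_len + 1):
                for i in range(len(letters) - n + 1):
                    counts[letters[i:i + n]] += 1
    return counts


def loglog_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    Needs at least two distinct fragments, otherwise the fit is undefined.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy corpus (an assumption for illustration, not the Wikipedia sample).
sentences = [
    "The cat sat on the mat.",
    "Ogres are not like cakes.",
    "Organic molecules are not like a language.",
]
counts = fragment_counts(sentences)
print(counts.most_common(5))
print(round(loglog_slope(counts), 2))
```

Any ranked list of counts with a few frequent items and a long tail of rare ones gives a negative slope here, whatever the counts describe. That is the point: a roughly straight line on a log-log plot does not, on its own, show that the underlying objects behave like a language.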

But could it be that I am missing something? On the one hand, the language of chemistry (whether we are talking trivial names, systematic names, or graphical diagrams) is very much like any other language: a system of communication. On the other hand, the molecules themselves are not, unless they are informational macromolecules. The message encoded in a single DNA molecule can be very much abstracted from its chemical structure. Without any doubt, the genetic code is a communication system and therefore a language, although not a man-made one.

It’s interesting that the authors view organic molecules as “sentences” rather than “words”; the latter would be the nomenclaturist’s approach. I guess it depends on your taste, or language preferences. Most systematic chemical names look alien in English but would fit rather nicely in German or Finnish. I personally view any chemical name as a noun phrase describing the corresponding molecular entity; the molecular entity itself is not a noun phrase. However, in natural languages, there is rarely any confusion regarding the boundaries of a word:

a word is the smallest element that may be uttered in isolation with semantic or pragmatic content (with literal or practical meaning).
In contrast, Grzybowski’s “words” are molecular fragments which do not exist in isolation. It is also worth noting that in the world of biopolymers, say nucleic acids, each monomer (as complex as any of Grzybowski’s “sentences”) is often represented as a letter, while an entire bacterial genome (still a single DNA molecule) could be considered a War and Peace (or Crime and Punishment).

Cadeddu et al. further claim that the linguistic approach identifies the symmetry/repeat units in molecules such as α-cyclodextrin and porphyrin:

We emphasize that this is not a small feat given we have not even considered any (x, y, z) coordinates of the atoms making up these molecules and performed no linear-algebra analyses to find symmetries which, incidentally, can be a computationally intensive procedure involving manipulation of matrices.
I find this modest remark regarding the size of the “feat” within the body of a scientific article in a respectable journal really cute. Are the authors even aware that there are chemical similarity/substructure search engines? You don’t need atomic coordinates to identify the fragments with the same connectivity.
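
To illustrate that connectivity alone goes a long way, here is a small Python sketch that groups the atoms of a molecular graph into classes which cannot be told apart by their bonding environment. No (x, y, z) coordinates and no matrices are involved. It uses iterative label refinement, in the spirit of Morgan-style atom ranking; this is my own choice for illustration and not the method of the paper or of any particular search engine. The function name `equivalence_classes` and the toluene example are assumptions, bond orders and stereochemistry are ignored, and refinement of this kind can occasionally lump together atoms that are not truly symmetry-equivalent.

```python
def equivalence_classes(atoms, bonds):
    """Group atoms that are indistinguishable by connectivity alone.

    atoms: dict mapping atom index -> element symbol
    bonds: iterable of (i, j) pairs of atom indices
    Returns a sorted list of sorted lists of atom indices.
    """
    neighbours = {i: [] for i in atoms}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    # Start from element symbols, then refine each label using the sorted
    # labels of the neighbouring atoms until no class splits any further.
    labels = dict(atoms)
    n_classes = len(set(labels.values()))
    while True:
        signatures = {
            i: (labels[i], tuple(sorted(labels[j] for j in neighbours[i])))
            for i in atoms
        }
        # Replace each distinct signature with a small integer label.
        ranking = {sig: k for k, sig in enumerate(sorted(set(signatures.values())))}
        labels = {i: ranking[signatures[i]] for i in atoms}
        new_n_classes = len(set(labels.values()))
        if new_n_classes == n_classes:
            break  # the partition is stable
        n_classes = new_n_classes

    classes = {}
    for i, label in labels.items():
        classes.setdefault(label, []).append(i)
    return sorted(sorted(group) for group in classes.values())


# Toluene, heavy atoms only: atom 0 is the methyl carbon, atoms 1-6 the ring.
toluene_atoms = {i: "C" for i in range(7)}
toluene_bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
print(equivalence_classes(toluene_atoms, toluene_bonds))
```

For a plain six-membered ring, all atoms fall into a single class, which is the repeat unit showing up without any geometry. The same idea scales to larger cyclic molecules, although a real cheminformatics toolkit would be the sensible choice for anything beyond a toy case.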

Which brings me to the final point. What is “chemical linguistics” anyway? If the “words” of chemistry, as postulated in [2], are nothing but molecular fragments, or substructures, then chemoinformaticians have been doing substructure searches of chemical databases for donkey’s years without knowing that it is called chemical linguistics. I am aware of a completely different use of this term, in the sense of “mining of natural language texts for chemical information” [4, 5]. This latter use is well established, and I think applying the name “chemical linguistics” to an unrelated area will only confuse everybody.

  1. Aron, J. (2014) Language of chemistry is unveiled by molecular make-up. New Scientist no. 2981, p. 8.
  2. Cadeddu, A., Wylie, E.K., Jurczak, J., Wampler-Doty, M. and Grzybowski, B.A. (2014) Organic chemistry as a language and the implications of chemical linguistics for structural and retrosynthetic analyses. Angewandte Chemie 126, 8246–8250.
  3. Buchanan, M. (2000) Ubiquity, Weidenfeld & Nicolson, London.
  4. Goebels, L., Grotz, H., Lawson, A.L., Roller, S. and Wisniewski, J. (2005) Method and software for extracting chemical data. Patent DE 102005020083 A1.
  5. Day, N.E., Corbett, P.T. and Murray-Rust, P. (2007) Semantic chemical publishing. ACS National Meeting #233, Chicago.

Thursday, July 31, 2014

F420-reducing [NiFe]-hydrogenase at 1.7 Å

The F420-reducing [NiFe]-hydrogenase (FrhABG) catalyses the reversible redox reaction between coenzyme F420 and H2. FrhABG is a group 3 [NiFe]-hydrogenase with a dodecameric quaternary structure recently revealed by high-resolution cryo-electron microscopy [1]. Vitt et al. report the crystal structure of FrhABG from Methanothermobacter marburgensis at 1.7 Å resolution [2, 3] and compare it with the structures of group 1 [NiFe]-hydrogenases, the only previously structurally characterised group.

  1. Allegretti, M., Mills, D.J., McMullan, G., Kühlbrandt, W. and Vonck, J. (2014) Atomic model of the F420-reducing [NiFe] hydrogenase by electron cryo-microscopy using a direct electron detector. eLife 3, e01963.
  2. Vitt, S., Ma, K., Warkentin, E., Moll, J., Pierik, A.J., Shima, S. and Ermler, U. (2014) The F420-reducing [NiFe]-hydrogenase complex from Methanothermobacter marburgensis, the first X-ray structure of a group 3 family member. J. Mol. Biol. 426, 2813–2826.
  3. PDB:4OMF

Monday, June 30, 2014

Phycocyanin against Alzheimer’s?

Could the light-harvesting protein phycocyanin be used as a novel drug against Alzheimer’s disease (AD) [1, 2]?

In the present study, intact hexameric phycocyanin was isolated and crystallized from the cyanobacterium Leptolyngbya sp. N62DM, and the structure was solved to a resolution of 2.6 Å. Molecular docking studies show that the phycocyanin αβ-dimer interacts with the enzyme β-secretase, which catalyzes the proteolysis of the amyloid precursor protein to form plaques. The molecular docking studies suggest that the interaction between phycocyanin and β-secretase is energetically more favorable than previously reported inhibitor-β-secretase interactions. Transgenic Caenorhabditis elegans worms, with a genotype to serve as an AD-model, were significantly protected by phycocyanin. Therefore, the present study provides a novel structure-based molecular mechanism of phycocyanin-mediated therapy against AD.
  1. Singh, N.K., Hasan, S.S., Kumar, J., Raj, I., Pathan, A.A., Parmar, A., Shakil, S., Gourinath, S. and Madamwar, D. (2014) Crystal structure and interaction of phycocyanin with β-secretase: A putative therapy for Alzheimer's disease. CNS Neurol. Disord. Drug Targets 13, 691–698.
  2. PDB:4L1E