Saturday, January 09, 2021

Irregularity and suppletion

Now that we’ve established that all chemical names consist of content words and each content word includes at least one base, we can rephrase our original statement

  1. New chemical names are formed by combining existing content morphemes with functional morphemes or adding new content morphemes

as

  1. New chemical names are formed by combining existing bases with functional morphemes or adding new bases.

When we say “combining”, we mean that the parts of our construction set themselves are not changing. Right? In this way, the chemical name-building (out of standardised blocks, like names of atoms, groups, etc.) reflects the actual molecule-building (out of standard blocks, like atoms, groups, etc.).

On the other hand, if we agree that chemical names form part of a natural language, we also have to accept that sometimes they behave in a not quite regular fashion. For example, we can figure out that the substituent group called ethenyl is derived from ethene because they share the base ‘ethen’. However, we cannot deduce in a similar fashion that the phenyl group is derived from benzene. What’s going on here?

This is a case of suppletion, which can be defined as the use of etymologically unrelated words within the same paradigm. Suppletion in the strict sense refers to different inflections of a word having unrelated stems, as can be observed in the conjugation of some irregular verbs (cf. English go and went, Spanish ir/va/fue, Russian идти/шёл and so on) or in the comparative and superlative forms of adjectives such as good and bad (cf. English good/better and bad/worse, Spanish bueno/mejor and malo/peor, Russian хороший/лучший and плохой/худший, etc.). In a broader sense, suppletion encompasses similar processes within a word family. For instance, English adjectives of frequency can be derived from the corresponding nouns, as in every hour → hourly, every day → daily, every week → weekly, every month → monthly, every two months → bimonthly... And then the pattern breaks down and gets replaced by a different one: every three months → quarterly, twice a year → biannual, every year → annual, every two years → biennial, every four years → quadrennial, etc. In this new pattern, the English roots are replaced by their Latin counterparts.

Let’s have a look now at some paradigms in chemical terminology, starting with element names and terms directly derived from them. In a perfectly regular paradigm,

  • the English name is exactly the same as the Latin name, or at least they share the base;
  • the derived terms for anions, parent hydrides, groups, ligands etc. share the same base;
  • the element symbol shares the first letter with the element name; in the case of two-letter symbols, the second letter is also found in the element name.

Like here [1, pp. 337–339]:

Element symbol  Latin name  English name  Monoatomic anion  Heteroatomic anion    ‘a’ term  ‘y’ term  Hydride
Cr              chromium    chromium      chromide          chromate              chroma    chromy
Ga              gallium     gallium       gallide           gallate               galla     gally     gallane
I               iodum       iodine        iodide            iodate                ioda      iody      iodane
P               phosphorus  phosphorus    phosphide         phosphate; phosphite  phospha   phosphy   phosphane
Xe              xenonium    xenon         xenonide          xenonate              xenona    xenony
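
One way to see what “sharing the same base” means in practice is to take the longest common prefix of every form in a row. The snippet below is only a sketch: the rows are copied from the table above, `shared_base` is a made-up helper name, and a common prefix is a crude stand-in for a real morphological analysis.

```python
from os.path import commonprefix  # character-wise longest common prefix

# Latin name, English name and derived terms, copied from the table above.
PARADIGMS = {
    "Cr": ["chromium", "chromium", "chromide", "chromate", "chroma", "chromy"],
    "Ga": ["gallium", "gallium", "gallide", "gallate", "galla", "gally", "gallane"],
    "I": ["iodum", "iodine", "iodide", "iodate", "ioda", "iody", "iodane"],
    "P": ["phosphorus", "phosphorus", "phosphide", "phosphate", "phosphite",
          "phospha", "phosphy", "phosphane"],
    "Xe": ["xenonium", "xenon", "xenonide", "xenonate", "xenona", "xenony"],
}


def shared_base(forms):
    """Return the longest prefix common to all forms (hypothetical helper)."""
    return commonprefix(forms)


for symbol, forms in PARADIGMS.items():
    print(symbol, shared_base(forms))
```

For these five rows the common prefix comes out as ‘chrom’, ‘gall’, ‘iod’, ‘phosph’ and ‘xenon’, which is exactly what one would pick as the base by eye.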

Now and then, however, you’ll see deviations from this paradigm. In the case of silicon and germanium, the corresponding bases, ‘silic’ and ‘german’, are conserved in the names of anions but get shortened to ‘sil’ and ‘germ’ in the names of hydrides and in the ‘a’ and ‘y’ terms. Carbon keeps the base ‘carbon’ in the name of the heteroatomic anion, carbonate; elsewhere it is shortened to ‘carb’. Aluminium, selenium, tellurium and polonium conserve their bases ‘alumin’, ‘selen’, ‘tellur’ and ‘polon’ throughout, except in the names of hydrides, where they get truncated to ‘alum’, ‘sel’, ‘tell’ and ‘pol’, respectively. Indium tells the opposite story: all the derived terms contain the root ‘ind’, which gets an extension in the hydride name indigane*. Hydrogen and oxygen contain two roots each, ‘hydr’/‘ox’ and ‘gen’. The second root is lost everywhere except in the heteroatomic anion names, hydrogenate and oxygenate. The origin of the suffix ‘on’ in ‘hydrony’ is unclear (hydron is the general name of the ion H+). The name oxidane, uniquely among inorganic hydrides, contains the suffix ‘id’.

Element symbol  Latin name   English name  Monoatomic anion  Heteroatomic anion    ‘a’ term  ‘y’ term  Hydride
Al              aluminium    aluminium     aluminide         aluminate             alumina   aluminy   alumane
In              indium       indium        indide            indate                inda      indy      indigane
C               carbonium    carbon        carbide           carbonate             carba     carby     carbane
Si              silicium     silicon       silicide          silicate              sila      sily      silane
Ge              germanium    germanium     germanide         germanate             germa     germy     germane
H               hydrogenium  hydrogen      hydride           hydrogenate                     hydrony
O               oxygenium    oxygen        oxide             oxygenate             oxa       oxy       oxidane
Se              selenium     selenium      selenide          selenate; selenite    selena    seleny    selane
Te              tellurium    tellurium     telluride         tellurate             tellura   tellury   tellane
Po              polonium     polonium      polonide          polonate              polona    polony    polane
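
These shortened and extended bases can be modelled as per-term overrides on top of a default stem. The following is a rough sketch rather than anything IUPAC prescribes: the slot labels, the `STEMS` dictionary and the `derive` function are all invented for illustration, and only four rows of the table are encoded.

```python
# Ending attached in each column of the table (slot labels are made up here).
ENDINGS = {"mono": "ide", "hetero": "ate", "a": "a", "y": "y", "hydride": "ane"}

# Default stem plus the slots where the table shows a different stem.
STEMS = {
    "silicon": {"default": "silic", "a": "sil", "y": "sil", "hydride": "sil"},
    "germanium": {"default": "german", "a": "germ", "y": "germ", "hydride": "germ"},
    "aluminium": {"default": "alumin", "hydride": "alum"},
    "indium": {"default": "ind", "hydride": "indig"},
}


def derive(element, slot):
    """Build a derived term from the slot-specific stem, else the default stem."""
    stems = STEMS[element]
    return stems.get(slot, stems["default"]) + ENDINGS[slot]


print(derive("silicon", "mono"), derive("silicon", "hydride"))  # silicide silane
print(derive("indium", "a"), derive("indium", "hydride"))       # inda indigane
```

The point of the sketch is that a regular element would need nothing but its default stem, while every irregular one has to carry extra entries that cannot be predicted from the rest of the paradigm.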

So what, you might say. In spite of these irregularities, one can still deduce the meaning of the derived terms. But wait. For iron, copper, silver, gold, lead and tin (six of the seven classical metals), both the element symbols and all the derived terms come from Latin, including the hydride names for lead and tin. At the other extreme, hydrargyrum, the Latin name for mercury, has left its trace only in the element symbol, Hg. All the derived terms contain the root ‘mercur’, which unfortunately results in the ‘y’ term being identical to the element name. In the case of antimony, another element whose name ends in ‘-y’, we have an intermediate situation: the anion names, antimonide and antimonate, keep the English root ‘antimon’, whereas the ‘a’ and ‘y’ terms as well as the element symbol Sb are based on the Latin name stibium. With nitrogen, the anion names are nitride, nitrate and nitrite, whereas the ‘a’ and ‘y’ terms are derived from the French azote. Sulfur is almost regular except for the ‘a’ term, ‘thia’, from the Greek θεῖον.

IUPAC was pushing to make the situation with sodium, potassium and tungsten the same as with mercury: completely regular except for the element symbols. The 2005 edition of the Red Book discards ‘kalide’ and ‘natride’ as obsolete and recommends ‘potasside’ and ‘sodide’, respectively. It also gets rid of all words with the root ‘wolfram’, although earlier publications recommended the terms ‘wolframate’ [2] and ‘wolframy’ [3]. Given that in Latin, Spanish, Nordic and Slavic languages the name of this element is a variation on the German Wolfram, it’s little wonder that chemists expressed their concern about the disappearance of ‘wolfram’ as an acceptable alternative to tungsten [4].

Element symbol  Latin name              English name  Monoatomic anion    Heteroatomic anion      ‘a’ term  ‘y’ term            Hydride
Ag              argentum                silver        argentide           argentate               argenta   argenty
Au              aurum                   gold          auride              aurate                  aura      aury
Cu              cuprum                  copper        cupride             cuprate                 cupra     cupry
Fe              ferrum                  iron          ferride             ferrate                 ferra     ferry
Hg              hydrargyrum             mercury       mercuride           mercurate               mercura   mercury
K               kalium                  potassium     kalide / potasside  potassate               potassa   kaly / potassy
Na              natrium                 sodium        natride / sodide    sodate                  soda      natry / sody
Pb              plumbum                 lead          plumbide            plumbate                plumba    plumby              plumbane
Sb              stibium                 antimony      antimonide          antimonate              stiba     stiby               stibane
Sn              stannum                 tin           stannide            stannate                stanna    stanny              stannane
W               wolframium              tungsten      tungstide           wolframate / tungstate  tungsta   wolframy / tungsty
N               azote (French)          nitrogen      nitride             nitrate; nitrite        aza       azy                 azane
S               θεῖον (theion; Greek)   sulfur        sulfide             sulfate; sulfite        thia      sulfy               sulfane
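
The third criterion of the regular paradigm (first letter of the symbol matches the first letter of the name, second letter is found somewhere in the name) is easy to test mechanically. Here is a small sketch, with the Latin and English names copied from the table above; `symbol_fits` is a hypothetical helper, and “found in the name” is interpreted here as “found anywhere after the first letter”.

```python
def symbol_fits(symbol, name):
    """True if the name starts with the symbol's first letter and contains
    any remaining symbol letter somewhere after its own first letter."""
    s, n = symbol.lower(), name.lower()
    return n.startswith(s[0]) and all(ch in n[1:] for ch in s[1:])


# symbol: (Latin name, English name), as listed in the table.
NAMES = {
    "Ag": ("argentum", "silver"),
    "Au": ("aurum", "gold"),
    "Cu": ("cuprum", "copper"),
    "Fe": ("ferrum", "iron"),
    "Hg": ("hydrargyrum", "mercury"),
    "K": ("kalium", "potassium"),
    "Na": ("natrium", "sodium"),
    "Pb": ("plumbum", "lead"),
    "Sb": ("stibium", "antimony"),
    "Sn": ("stannum", "tin"),
    "W": ("wolframium", "tungsten"),
}

for symbol, (latin, english) in NAMES.items():
    print(symbol, symbol_fits(symbol, latin), symbol_fits(symbol, english))
```

For all eleven symbols the check succeeds against the Latin name and fails against the English one, including copper, whose English name does start with ‘c’ but contains no ‘u’.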

Another paradigm is at the core of substitutive nomenclature. The names of the substituents can either precede the name of the parent structure, in which case they are (incorrectly) referred to as “prefixes”, or follow it, as “suffixes”. In fact, the names of the substituents are combining forms which contain at least one root. In any given structure, one (and only one) of the substituents can be chosen as the principal group, i.e. the group that gives the name to the class. This substituent will occupy the place of the “suffix”, while the rest can only be “prefixes”. You’d be right to expect the names of the substituents to share their roots/bases no matter where they are positioned. Indeed, here’s how the “regular” substituents behave [5]:

Formula       Class               “Suffix”            “Prefix”
–COO⁻         carboxylates        -carboxylate        carboxylato-
–COOH         carboxylic acids    -carboxylic acid    carboxy-
–NH2          amines              -amine              amino-
=NH           imines              -imine              imino-
–P(O)(OH)2    phosphonic acids    -phosphonic acid    phosphono-
–S(O)2        sulfones            -sulfone            sulfonyl-

In many cases, however, the “prefix” and “suffix” names for the same substituent have no common base. In the case of the –SH group, there are even three suppletive names: the perfectly regular “prefix” ‘sulfanyl-’, derived from the parent hydride name sulfane; another “prefix”, ‘mercapto-’, introduced by the Danish chemist William Christopher Zeise, from the Latin mercurium captāns, “capturing mercury” (deemed “no longer acceptable” by IUPAC but still in wide use); and the “suffix” ‘-thiol’, which consists of the Greek-derived root ‘thi’ plus ‘ol’ from ‘alcohol’.

Formula       Class               “Suffix”            “Prefix”
–CHO          aldehydes           -carbaldehyde       formyl-
–{C}HO                            -al                 oxo-
–C≡N          nitriles            -carbonitrile       cyano-
–{C}≡N                            -nitrile
=O            ketones             -one                oxo-
–OH           alcohols            -ol                 hydroxy-
–SH           thiols              -thiol              mercapto- / sulfanyl-
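
The rule that exactly one substituent takes the “suffix” slot while all the others become “prefixes” amounts to a lookup with two forms per group. The sketch below is a toy illustration, not a naming algorithm: `FORMS` holds a few pairs copied from the two tables above, `choose_forms` is a hypothetical function, the principal group is simply passed in by the caller instead of being worked out by seniority, and the alphabetical ordering of prefixes is an assumption made for simplicity.

```python
# class name: ("suffix" form, "prefix" form), copied from the tables above.
FORMS = {
    "amine": ("amine", "amino"),
    "carboxylic acid": ("carboxylic acid", "carboxy"),
    "nitrile": ("carbonitrile", "cyano"),
    "aldehyde": ("carbaldehyde", "formyl"),
    "thiol": ("thiol", "sulfanyl"),
}


def choose_forms(groups, principal):
    """Return (prefixes, suffix): the principal group contributes its suffix
    form, every other group its prefix form."""
    if principal not in groups:
        raise ValueError("principal group must be one of the groups present")
    rest = list(groups)
    rest.remove(principal)  # only one occurrence becomes the suffix
    prefixes = sorted(FORMS[group][1] for group in rest)
    return prefixes, FORMS[principal][0]


print(choose_forms(["thiol", "amine"], principal="amine"))  # (['sulfanyl'], 'amine')
print(choose_forms(["thiol", "amine"], principal="thiol"))  # (['amino'], 'thiol')
```

With regular pairs such as amine/amino, swapping the principal group barely changes the surface form; with thiol/sulfanyl or carbaldehyde/formyl, the same group shows up under an unrelated name.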

Nevertheless, the “suffix”, if present, often shares the root, or a part thereof, with the corresponding class name. For instance, ‘alcohol’ gets shortened to ‘ol’ and ‘aldehyde’ to ‘al’.

Just like in natural languages, it is the most common words that tend to be irregular. Consider the molecule (a) known under its Germanic name ‘water’. The Latin word ‘aqua’ is used to designate water ligands, as in pentaaquanitrosyliron(2+) (b), while ‘hydrate’, as in (c), and subtractive ‘anhydro’ names contain the root ‘hydr’ (from Greek ὕδωρ).

  (a) H2O
    water (trivial)
    oxidane (parent hydride)
    dihydrogen oxide (binary)
    dihydridooxygen (additive)
  (b) [Fe(NO)(OH2)5]2+
    pentaaquanitrosyliron(2+) (additive)
  (c) Na2[Fe(CN)5(NO)]·2H2O
    sodium pentacyanidonitrosylferrate(2−) dihydrate
    disodium pentacyanidonitrosylferrate—water (1/2)

* The name indane is “well established as the name of the hydrocarbon 2,3-dihydroindene” [1, p. 85].
Oxane is a Hantzsch-Widman name of 2H-tetrahydropyran [1, p. 85].
{C} indicates the carbon atom implicit in the parent name.

References

  1. Connelly, N.G., Hartshorn, R.M., Damhus, T. and Hutton, A.T. Nomenclature of Inorganic Chemistry: IUPAC Recommendations 2005. Royal Society of Chemistry, Cambridge, 2005.
  2. International Union of Pure and Applied Chemistry. Nomenclature of Inorganic Chemistry: Definitive Rules 1970. Butterworths, London, 1971.
  3. Fluck, E.O. and Laitinen, R.S. (1997) Nomenclature of inorganic chains and ring compounds (IUPAC Recommendations 1997). Pure and Applied Chemistry 69, 1659–1692.
  4. Goya, P. and Román, P. (2005) Wolfram vs. Tungsten. Chemistry International 27, 26–27.
  5. Hellwich, K.-H., Hartshorn, R.M., Yerin, A., Damhus, T. and Hutton, A.T. (2020) Brief guide to the nomenclature of organic chemistry (IUPAC Technical Report). Pure and Applied Chemistry 92, 527–539.
