Enzymes are the stars of the molecular world. Often proteins, sometimes protein/RNA complexes, and sometimes RNA alone*, these molecules mediate and speed up molecular reactions – everything from joining together nucleotides to make strands of RNA to breaking down sugars for energy to moving molecules around.
*Although I say “alone,” enzymes of all sorts often have other “cofactors” tightly bound to them – things like metals or vitamins which help them do what they do.
How they do what they do (their reaction mechanism) varies from enzyme to enzyme, but it often involves holding the reacting molecules together in the right orientation, providing an optimal environment, supplying electrons and/or protons, etc. Enzymes are picky about what reacting molecules (substrates) they will act on and what reactions they’ll speed up (catalyze). And I mean, really picky – in fact, it’s this specificity that sets enzymes apart from more “generic” chemical catalysts. This specificity allows organisms to tightly regulate and orchestrate what goes on where in their bodies. But it has the consequence that we have a LOT of enzymes.
And there are additional enzymes that other organisms have that we don’t: enzymes for doing things like using sunlight to convert the carbon dioxide in the air into sugar through photosynthesis, or breaking down agricultural “waste” to produce biofuels. Each of those processes actually requires multiple enzymes. So the numbers keep racking up. There are over 6,600 enzymes currently officially listed in the Enzyme Database, with more in the validation process.
Keeping track of these is basically impossible for a single person – enter the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and their internationally agreed upon enzyme classification system, the Enzyme Commission (EC) numbers, which are accessible through the ExplorEnz database. https://www.enzyme-database.org/
These numbers are crucial for a lot of reasons. It’s not just that there are so many enzymes to keep track of – it’s also that people often call the same enzyme different things. Or really weird/non-helpful things. For instance, many times enzymes are named based on what they were initially discovered to do even if that’s not what they normally do or not what we traditionally think of them as doing. Sometimes different people discover the same enzyme and name it different things, etc. It’s a mess…. The catalytic cats may be out of the bag a bit, but it’s important to have some sort of agreed-upon nomenclature for talking about them and comparing them (and seeing if the “yellow enzyme” you’re reading one paper about is really the same enzyme as the “yellow enzyme” another paper is talking about).
One of the key things to keep in mind when it comes to enzymes is that they can’t “make” a reaction happen, they can only make it easier for an already-possible reaction to happen. It’s the biochemical equivalent of the saying “you can lead a horse to water but you can’t make it drink.” Say you have 2 molecules you want to join together. If they’re each wandering around aimlessly, they’re not likely to meet (and once they meet they still might not join together). But what if you were to grab the molecules, stick them right next to each other and slap some glue on their hands? Now they’re more likely to join. But they can also “un-join” and the enzyme can help facilitate that too – often the tipping point between joining and un-joining is an unstable intermediate that can easily go either way and the enzyme doesn’t tip the scale. So, would you call this enzyme a joiner or an un-joiner? Starting to see where some of the naming trouble comes in?
A great example of a name that always confused me is pyruvate kinase, a key enzyme in glycolysis (the breakdown of glucose). In glycolysis, it catalyzes the transfer of a phosphoryl group from phosphoenolpyruvate (PEP) (a breakdown intermediate) to ADP to give you pyruvate and ATP. The reaction in this direction is so energetically favorable in vivo that it’s considered “irreversible.” In fact, if you want to make glucose (gluconeogenesis), this is one of the steps you can’t just “reverse” – you have to go an alternative way to get PEP from pyruvate (and it’s going to cost you an ATP and a GTP!). However, pyruvate kinase’s name makes it seem like it’s typically adding a phosphoryl group to pyruvate.
I went looking for some more “here’s why some centralized, systematic, nomenclature is needed” examples and found this great article https://febs.onlinelibrary.wiley.com/doi/full/10.1111/febs.12530
And, oh man, some of these examples are wild! I especially love Old Yellow Enzyme (OYE) (aka NADPH dehydrogenase) and New Yellow Enzyme (NYE) (aka D‐Amino acid oxidase)
In addition to those, uh… visually descriptive, names, people often named enzymes pretty “generic” names, especially when they discovered the first of that type of enzyme. For example, Ilona Banga discovered an enzyme that broke down a protein called elastin, so she named it elastase. But then a whole bunch of other elastase enzymes were discovered, making it difficult when I was writing her Wikipedia article!
Other times, enzymes were named after where they were discovered, which is really unhelpful – like the ribonuclease (RNase) (RNA cutter) from the bacterium Bacillus amyloliquefaciens, which is named barnase.
And I can’t help sharing this one, aerobic 5,6-dimethylbenzimidazole synthase (BluB), an enzyme that some microbes use to break down one cofactor, flavin, to form another one, 5,6-dimethylbenzimidazole (DMB), in the process of vitamin B12 (cobalamin) synthesis. In a 2007 Nature paper titled “BluB cannibalizes flavin to form the lower ligand of vitamin B12,” the authors (Taga et al.) propose the enzyme name “flavin destructase” ([PMID: 17377583]). Although that would make an awesome superhero name, we clearly need some sort of a systematic naming system…
Similar to how there are “gene families” that make it possible to talk about similar genes from different organisms, or within an organism that do different things, there are enzyme families, and sub-families, and sub-sub families. And instead of naming them the “Does” and “Smiths,” these groups are given a string of 4 numbers, starting with a number representing one of 7 main enzyme classes, with subsequent numbers representing sub-classifications based on things like what substrates they act on and what cosubstrates (helper molecules) they use.
note: In addition to numbers, the EC gives an “accepted name” (usually the one that’s most commonly used) and a systematic name, which consists of the substrate(s) followed by a term specifying the reaction that the enzyme catalyzes.
The numbering system was developed by the precursor to the International Union of Biochemistry and Molecular Biology (IUBMB), and for years there were 6 classes. And then, in 2018, they stunned the world and added a 7th class, the translocases.
Here’s a brief overview, then in coming weeks, time-permitting, I plan to give a little more info about what enzymes in these classes do! And after the overview of classes, in today’s post I will give some practical advice for interpreting the classification numbering and finding more information in enzyme databases.
- oxidoreductases: catalyze redox reactions (electron transfers)
- transferases: catalyze group transfer
- hydrolases: use water to break bonds
- lyases: add or remove things to make (or break) bonds
- isomerases: catalyze changes within a molecule (such as shifting things around or flipping which direction something’s pointing)
- ligases: use ATP to join together 2 things (like that DNA ligase I mentioned earlier)
- translocases: help move things from one side of something (often a membrane) to another
This is a great article for people who want to get way into the nitty gritty details (but in a fun-to-read presentation of those details): McDonald, A.G. and Tipton, K.F. (2022), Enzyme nomenclature and classification: the state of the art. FEBS J. https://doi.org/10.1111/febs.16274
It explains the history as well as how things are organized in the official database, accessible at the IUBMB ExplorEnz website.
We’ll get WAY more into these numbers in a minute, but in addition to the numbering, they give names – I mean, you need to know what enzyme an enzyme number is referring to. So they list all the names that are and/or have been used to refer to it (you can find these under “other names,” occasionally with warnings that those names are misleading/incorrect). And then they give one that’s like “THE Accepted name.” But even this name isn’t agreed upon by everyone…
Initially it was called the “Trivial name,” which was later changed to “Recommended name” which drew controversy when people didn’t think a recommended name was really recommendation-worthy. So it was changed to “Accepted name.” And this “accepted name” is the one in common use, even if it’s not very accurate or helpful in terms of what the enzyme actually does… More on this in that great article which may sound like it would be boring but is super entertaining!
Since enzymes catalyze reactions in both directions (e.g. a stick breaker is also a stick maker if you give it those broken stick pieces and the optimal conditions to go that way), sometimes the accepted name indicates the direction the reaction “normally” goes. But this isn’t always the case* – a lot of times it indicates what it was first found to do, not what it normally does (and sometimes the thing it was first found to do has nothing to do with what its main function is). And sometimes, enzymes catalyze both directions “normally” – you see this with a lot of enzymes in metabolic pathways that can be used in both catabolism (breaking down molecules) & anabolism (building new molecules).
*note: in EC entries, the reactions and systematic names (which we’ll get into below) for all enzymes in the same sub-subclass are always written in the same direction, regardless of the physiological direction. Therefore, if most enzymes in the sub-subclass catalyze one direction but one of them physiologically runs in the opposite direction, that odd one out will still be written in the same direction as all the others.
Sometimes the accepted names include some baggage… In cases where the same name is commonly used to talk about enzymes that really are not the same, “qualifying additions” are added (in parentheses). These qualifiers include things like substrate name, cosubstrate/cofactor dependence, or catalysis mechanism.
And we’re not done! Continuing on with our name game, enzymes also get unambiguous “Systematic names” which give you lots of info, but sometimes more than you want when you’re just trying to chat with people or even share research findings. You think journal articles are long and dense now? Imagine if you had to write “3-phospho-D-glycerate carboxy-lyase (dimerizing; D-ribulose-1,5-bisphosphate-forming)” every time you wanted to talk about the world’s most abundant enzyme, RuBisCO! (whose accepted name is actually ribulose-bisphosphate carboxylase, and EC number is 4.1.1.39).
For anyone wondering, rubisco, um I mean, 3-phospho-D-glycerate carboxy-lyase (dimerizing; D-ribulose-1,5-bisphosphate-forming), is so abundant because it plays a key role in photosynthesis. In fact, it catalyzes the first major step of carbon fixation, taking CO2 and using it to carboxylate ribulose bisphosphate & make 2 molecules of 3-phosphoglycerate. More on it in this great PDB-101 Molecule of the Month article https://pdb101.rcsb.org/motm/11
As you might have surmised from that example, the systematic name includes info about what substrate(s) the enzyme acts on and some -ase term telling you “how/what” it does (often its class name).
Speaking of what specific enzymes do, let’s get into that a bit – and how the numbers tell us key information about what they do.
Warning: There’s gonna be a lot of jargon and I’m not going to give as much background as normal because I figure that if you aren’t at least somewhat familiar with enzymes already, you’re not gonna care about how we classify them and have probably already stopped reading this. But, in case people do care but get confused, I will give links to more information (and sorry for confusing you!) The key background you need to at least follow along with the jargon is…
Atoms (like individual C’s and H’s and O’s) are made up of smaller parts called subatomic particles; namely positively-charged protons and neutral neutrons which hang out in a dense central nucleus and negatively-charged electrons which whizz around them in a diffuse “electron cloud.” The number of protons an atom has is fixed for a given element (e.g. carbon has 6 and nitrogen has 7). But the number and location of electrons can vary and atoms often have “ideal” numbers of electrons, so they can give, take, and share them to meet their desires. It is this sharing that forms the basis of the strong covalent bonds that link together the atoms in molecules (everything from individual waters, single amino acids, giant proteins, and DNA).
Atoms can share 1 pair of electrons to form a single bond, 2 pairs for a double, or 3 pairs for a triple. The more they share, the stronger & shorter the bond. Additionally, some molecules can take electrons (become reduced) and others can give up electrons (become oxidized) and this will come into play with some of the enzyme mechanisms. Remember OIL RIG: Oxidation Is Loss (of electrons); Reduction Is Gain (of electrons). More on redox reactions: https://bit.ly/redoxbiochem
We see redox-y enzymes in class 1. For example, ribonucleoside reductases, which remove the 2’ -OH from RNA letters (ribonucleotides) to form DNA letters (deoxyribonucleotides); and glutathione-disulfide reductase (EC 1.8.1.7), which reduces oxidized glutathione (where you have 2 of them stuck together) to 2 reduced glutathiones, which can then be used to reduce other stuff, like oxidized cystine.
Here are some other examples of types of things you’ll find in the various classes. Another warning: the same reactions can sometimes be catalyzed in different (sometimes radically different) ways by enzymes in different classes. Now that you’ve been warned…
- 1. oxidoreductases: catalyze redox reactions (electron transfers)
- this class includes things with names like thingamabob- “dehydrogenase”, “reductase,” or “oxidase”
- 2. transferases: catalyze group transfer
- this class includes things with names like thingamabob-transferase, as well as kinases
- 3. hydrolases: use water to break bonds
- this class includes things with names like thingamabob-ase, where “thingamabob” is the substrate that’s getting broken. Examples include nucleases, proteases
- 4. lyases: add or remove things to make (or break) bonds (often double bonds, but not always)
- this class includes things with names like thingamabob- “decarboxylase,” “aldolase,” “dehydratase,” and “synthase”
- 5. isomerases: modify a single molecule (like moving an oxygen 1 carbon over or flipping a bond that points up to one that points down)
- this class includes many thingamabob isomerases or epimerases, racemases, tautomerases…
- 6. ligases: use ATP or a similar molecule to join together 2 things (like that DNA ligase I mentioned earlier)
- this class includes many things with thingamabob ligase names, but also some “carboxylases,” “synthases,” and “synthetases” sprinkled in there
- 7. translocases: help move things from one side of something (often a membrane) to another
- this class includes a bunch of thingamabob “transporters,” but if something doesn’t have “transporter” in its name that doesn’t mean it isn’t a translocase! For example, there are a number of oxidases, reductases, and decarboxylases in this class, which couple those other activities to the substrate movement.
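If you want to let a computer do the first step of the decoding for you, here is a minimal Python sketch of the class list above. The dictionary just restates the seven classes from this post; the helper name `ec_class_name` is made up for illustration and isn't part of any official tool.

```python
# The seven top-level EC classes, keyed by the first number of an EC number.
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
    7: "translocases",
}


def ec_class_name(ec_number: str) -> str:
    """Return the class name for an EC number such as 'EC 6.5.1.1'.

    Hypothetical helper: it only looks at the first number and does not
    check that the rest of the EC number is valid.
    """
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()  # drop an optional "EC" prefix
    first_number = int(text.split(".")[0])
    return EC_CLASSES[first_number]


print(ec_class_name("EC 6.5.1.1"))  # ligases
```

Keep in mind the warning above: the class tells you the type of reaction, not the name, so a "decarboxylase" can turn up in class 4 or class 7.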
Now, let’s step back and explain that numbering! Well, first, we need to step wayyyy back – to 1958. This was when the first real published attempt at systematically classifying and numbering enzymes started. It was then that a couple of scientists – Dixon and Webb – published a list of the 659 known-at-the-time enzymes. They split enzymes up into 3 main categories: hydrolyzing enzymes, transferring enzymes, and other enzymes and then they subdivided those categories.
In 1961, the first draft of the current numbering system was published as the NC‐IUBMB Enzyme List. NC stands for “Nomenclature Committee” and it came out of an “International Commission on Enzymes” established in 1956 by the IUBMB (then called just the International Union of Biochemistry because molecular biology was not invented yet!).
They only had 6 classes then, but they already had the sub-class scheme. So let’s get into that, with an example to follow along. Warning! The sub-classes & sub-sub-classes vary for each main group, so it’s not like a 3 in one EC number will mean the same thing as a 3 in the same position of an EC number from a different class. To know what it’s actually telling you, you need to look to the classification guidelines. You can find the official definitions of each subclass and sub-subclass on the ExplorEnz database, under “Enzymes by Class”: https://www.enzyme-database.org/class.php?c=1&sc=2&ssc=*.
For this example, let’s use an enzyme you might be familiar with, a common restriction enzyme (site-specific DNA cutter) you might have used to cut DNA in the lab, BamHI.
The name BamHI isn’t very helpful (except for shorthand) – it just tells us it was the first restriction enzyme discovered in the “H” strain of a bacterium called Bacillus amyloliquefaciens.
Its EC number is EC 3.1.21.4
And, actually, this is the same EC number as all similar “type II site-specific deoxyribonucleases” – so the EC numbering doesn’t always take you to the absolute “there is no other enzyme quite like you” point. For example, “isoenzymes” which catalyze the same reaction but have different makeups will have the same number because the number is based on the reaction catalyzed. note: sometimes, if you look in other unofficial database sources there’s a letter added to the 4th number (e.g. 1.1.1.n4), but these are not official EC numbers. EC numbers have only digits in the 4th number. And using such numbers in any scientific publication should be avoided since they haven’t been officially verified. Huge thanks to Drs. Keith Tipton and Ron Caspi of the Nomenclature Committee for helping explain this to me when I was confused! (and for helping fact-check and provide guidance on this whole post!)
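As a rough illustration of that "digits only" rule, here is a small Python sketch that checks whether a string has the shape of an official EC number. It is only a format check I put together for this post (the name `looks_official` is hypothetical); passing it does not mean the number actually exists in ExplorEnz.

```python
import re

# Four dot-separated whole numbers, optionally prefixed with "EC".
# Provisional identifiers such as 1.1.1.n4 fail because of the letter.
OFFICIAL_EC_SHAPE = re.compile(r"^(?:EC\s*)?\d+\.\d+\.\d+\.\d+$")


def looks_official(ec_number: str) -> bool:
    """Return True if the text has the shape of an official EC number."""
    return OFFICIAL_EC_SHAPE.match(ec_number.strip()) is not None


print(looks_official("EC 3.1.21.4"))  # True
print(looks_official("1.1.1.n4"))     # False
```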
Let’s go through this (3.1.21.4) number by number
- first number: this tells you the enzyme class
- here, it’s 3, which tells you it’s a hydrolase (it uses water to help cleave bonds, or in the reverse reaction, it makes bonds, kicking out water in the process)
- second number: this tells you more info, often what type of compound it acts on, here it tells you what type of bonds it acts on
- here, it’s 1, which tells you it cuts ester bonds (an ester is a C double-bonded to 1 O and single-bonded to another O, with all of that sandwiched between 2 other carbons, so, -C-(C=O)-O-C-). DNA & RNA’s nucleotides are hooked together through phosphodiester bonds
- third number: this tells you more info, here about what it’s cutting & how
- here it’s 21, which tells you it’s an “Endodeoxyribonuclease producing 5′-phosphomonoesters” meaning it’s a DNA cutter that cuts in the middle of DNA (i.e. not at an end) and when it cuts it leaves one of the nucleotides with a phosphate in the 5’ position (some cutters cut in a way that leaves the phosphate on a 3’ end and an OH on the 5’ end – those ones get the number 22)
- fourth number: this further classifies them
- here it’s 4, which tells you it’s a type II site-specific deoxyribonuclease
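The number-by-number walk-through above can be sketched in Python like this. The field names (`enzyme_class`, `subclass`, `sub_subclass`, `serial`) follow the terms used in this post, while `ECNumber` and `parse_ec` are names I made up for illustration. The sketch only splits the number apart; what each part means still has to be looked up in the official class definitions.

```python
from typing import NamedTuple


class ECNumber(NamedTuple):
    enzyme_class: int   # first number: one of the 7 main classes
    subclass: int       # second number
    sub_subclass: int   # third number
    serial: int         # fourth number: the individual entry


def parse_ec(ec_number: str) -> ECNumber:
    """Split text like 'EC 3.1.21.4' into its four numbers."""
    text = ec_number.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()  # drop an optional "EC" prefix
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 numbers, got {len(parts)}: {ec_number!r}")
    # int() raises ValueError for non-numeric parts such as 'n4'
    return ECNumber(*(int(part) for part in parts))


bamhi = parse_ec("EC 3.1.21.4")
print(bamhi.enzyme_class, bamhi.subclass, bamhi.sub_subclass, bamhi.serial)
# 3 1 21 4
```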
How ‘bout another example, shall we? I say we shall… so…
Let’s go back to that DNA ligase we were talking about – the DNA joiner (which seems appropriate since we often use it in the lab to join together pieces of DNA we’ve cut up). As the name gives away, it’s a ligase, which means it gets a class number of 6. So we know our first number is 6, but what about the rest? The full EC number for “DNA ligase” (the “classical,” ATP-requiring, kind we use in the lab) turns out to be 6.5.1.1. I found this by going to, well, Google (seriously, a big part of grad school is Googling things!) Google took me to a Wikipedia article with links to the protein in various databases – including BRENDA, which shows a nice “EC Tree” for each entry with the numbering hierarchy and what, briefly, the different numbers mean (the EC title of the class/subclass, etc.). BRENDA isn’t the official EC list (and thus the Nomenclature Committee, with its tight standards, doesn’t vouch for its accuracy) but BRENDA does use a lot of data from the official list. And the BRENDA entry provides a direct link to the ExplorEnz entry, where that official data is (just scroll down to “External Links.”) Alternatively, to get more detailed (and strictly validated) info, what I like to do is, once I have that EC number, I go to ExplorEnz and go to the “Enzymes by Class” tab, then click to expand all of the categories leading to it. Hopefully this link takes you where I hope: https://www.enzyme-database.org/class.php?c=6&sc=5&ssc=*&sh=1
I like doing it this way instead of taking the link directly to the entry so that I get to see what else is in the class, subclass, etc. (and go to their entries as well when I inevitably get curious). And if I click on the class, subclass, and sub-subclass names I get the official definitions of who’s included.
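If you end up doing this a lot, here is a small Python sketch that rebuilds that “Enzymes by Class” link from an EC number. It simply copies the pattern of the ExplorEnz links in this post: I’m assuming `c` is the class, `sc` the subclass, and `ssc=*` means “all sub-subclasses,” and I’ve kept `sh=1` only because it appears in the link above. None of this is a documented API, so the site could change it at any time. The function name is made up.

```python
from urllib.parse import urlencode

# Base address taken from the ExplorEnz links in this post.
CLASS_PAGE = "https://www.enzyme-database.org/class.php"


def explorenz_class_url(ec_number: str) -> str:
    """Build an 'Enzymes by Class' link for the subclass of an EC number.

    Assumption: the query parameters mirror the links shown in the post;
    they are not taken from any official documentation.
    """
    enzyme_class, subclass = ec_number.split(".")[:2]
    query = {"c": enzyme_class, "sc": subclass, "ssc": "*", "sh": "1"}
    # safe="*" keeps the asterisk readable instead of encoding it as %2A
    return CLASS_PAGE + "?" + urlencode(query, safe="*")


print(explorenz_class_url("6.5.1.1"))
# https://www.enzyme-database.org/class.php?c=6&sc=5&ssc=*&sh=1
```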
Let’s look closer at that EC number. With our restriction enzyme, we had a 3 as our first number (the class number), telling you it was a hydrolase. But here, we have a 6, telling us it is a ligase – a joiner that uses ATP or a similar molecule to make new bonds between molecules or parts of molecules. But what type of bond? For this, we look to the second number (the subclass number) and see a 5. Coming after the 6, this 5 is telling us that it’s “forming phosphoric-ester bonds.” These are the type of bonds you find in DNA and RNA backbones. How about the sub-subclass? Let’s look to the third number – the 1. This one’s a little anti-climactic – the description says “Ligases that form phosphoric-ester bonds (only sub-subclass identified to date)” – so in this case we’re not further separating things. But that final 1, the fourth number (the serial number) does provide further (and final) identifying information. Coming after the 6.5.1, this 1 identifies our enzyme as DNA ligase (ATP).
If I click on EC 6.5.1.1 I get taken to its entry. It tells us that the Accepted name is DNA ligase (ATP). This is an example of what we talked about before, how there’s sometimes a cosubstrate mentioned in parentheses. In this case, it tells us that this entry is for DNA ligase that uses ATP, differentiating it from DNA ligases that use a different cosubstrate, such as NAD+ (DNA ligase (NAD+) is EC 6.5.1.2 in case you were curious). It also lists 17!! mostly ambiguous “Other names” and then the Systematic name – as unambiguous as you can get: poly(deoxyribonucleotide)-3′-hydroxyl:5′-phospho-poly(deoxyribonucleotide) ligase (ATP). Yowza – that name definitely provides useful information which is needed in some cases, but I think I will stick to its Accepted name (and number) in most circumstances!
As always with ExplorEnz entries, the entry also provides the reaction, comments, links to other databases, and references.
I’ve shown you a couple enzymes. But there are, quite literally, thousands more out there for you to explore yourself! Just go to the ExplorEnz database and dive in! Thanks to the IUBMB Nomenclature Committee, an unambiguous look at the whole world of known, validated, enzymes is just a URL away! https://www.enzyme-database.org/index.php