I scream, you scream, we all scream for glycine! It’s the smallest protein letter (amino acid) ever seen! Though it wasn’t the 1st ever seen, it was the first (or second) seen and recognized to be part of protein. Speaking of which, today, in addition to telling you about glycine I want to give a bit of history of protein research – so if you’ve ever wanted to know where the name “protein” comes from, this post is for you!
Today I’m kicking off the bumbling biochemist’s version of an advent calendar – #20DaysOfAminoAcids – with the “opening up” of GLYCINE! Each day I’m going to bring you the story of one of the 20 (common) amino acids – what we know about it and how we know about it, where it comes from, where it goes, and outstanding questions nobody knows the answers to.
Proteins are our body’s main biochemical workhorses and they’re written in “letters” called amino acids. Amino acids have generic “amino” (NH₃⁺/NH₂) & “carboxyl” (COOH/COO⁻) groups that let them link up together through peptide bonds (N links to C, H₂O lost, and the remaining “residual” parts are called residues). The reason for the “2 options” in parentheses is that, as we’ll go into more below, these groups’ protonation state (how many protons (H⁺) they have) depends on the pH (which is a measure of how many free H⁺ are around to take).
Those generic parts are attached to a central “alpha carbon” (Cα), which is also attached to one of 20 unique side chains (“R groups”) which stick off the peptide backbone (kinda like charms sticking off a charm bracelet). The different “charms” have different properties (big, small, hydrophilic (water-loving), hydrophobic (water-avoiding), etc.) & proteins have different combos of them, so the proteins have different properties. Each amino acid has a backstory – a current story – and a future story. And I want to help tell these stories and maybe even make them part of the holiday “classics.”
So today we’re going to start from the very beginning (I’ve heard it’s a very good place to start…)
The first “what we would later call proteins” were described by French chemist Antoine Fourcroy, who distinguished between 3 varieties of these things in 1789: albumin, fibrin, & gelatin.
In 1837, Dutch chemist Gerardus Johannes Mulder was the first person to describe proteins chemically. He was pretty surprised by their massive size – most biochemicals early scientists were working on were pretty small – but he measured the empirical formula for the proteins fibrin & egg albumin to be C₄₀₀H₆₂₀N₁₀₀O₁₂₀P₁S₁. He erroneously thought that all proteins had the same core substance, stuff which he called Grundstoff, but he correctly thought that these guys are massive – and massively important.
And a Swedish chemist & doctor named Jöns Jacob Berzelius agreed – he suggested Mulder call these molecules “protein” which comes from the Greek word “proteios” meaning “primary” or “in the lead” or “standing in front” because it “seems to be the primitive or principal substance of animal nutrition.” Seemed fitting. So Mulder went with it, and the word protein first appears published in Mulder’s 1838 paper, “On the composition of some animal substances.” Now these things had a name, but there was still a lot more to learn (and still is).
Speaking of names, the “peptide” name comes from organic chemist Emil Fischer – inspired by the term “pepsis” meaning digestion or peptones, referring to the digestion products of proteins. Both he and the physical chemist Franz Hofmeister independently discovered the peptide bond and coincidentally presented their findings on the same day at the same meeting in Carlsbad in 1902, without knowing in advance.
There was a debate for some time about whether proteins were really super long chains of amino acids or whether they were just shorter chains of amino acids hanging out together as aggregates or “colloids.” Their bigness seemed too big to be true – but, sure enough, proteins are macromolecules, as was shown in a number of ways – like by showing that different proteins have distinct molecular weights (if they were just loose groupings you’d expect to have a range of sizes for each protein).
You can view some of the other early protein milestones in the figures and I will get into more as we go through the amino acids one by one.
Glycine was the first amino acid to be isolated from a protein through acid hydrolysis (or second after leucine depending on where you look). It was discovered by Henri Braconnot in 1820. He wasn’t looking for what proteins were made of. Instead, he was trying to see if he could get sugar from animal products. He knew that if he treated plant products like wood or straw with acid, he’d get sugar, and he wanted to see if the same thing would happen with animal products. So he took gelatin, boiled it with sulfuric acid for 5 hours, added calcium carbonate to neutralize the acid, filtered it, and left it be for about a month. He wasn’t procrastinating – instead he was giving it time to crystallize. And crystallize it did. Crystals grew, sticking to the walls of the glass, and he tasted them. This is a major lab no-no, but he did it. And it tasted sweet, about as sweet as the sugar glucose. Aha – he thought – I’ve found sugar!
Turns out he was wrong, but he didn’t know that yet. So he named it sucre de gélatine, which was translated into German as “Leimzucker,” did some characterization of its solubility and stuff, but didn’t do much more. He didn’t even discover it contained nitrogen. (This isn’t meant to sound judgmental or anything – he did a ton!)
Other scientists, including Mulder, followed up on this work and studied its chemical composition. One of the other chemists working on it was Eben Norton Horsford – he didn’t like the original name because they now knew that it wasn’t a sugar. So Horsford suggested the name glycocoll (sweet glue) in 1846 and then, 2 years later, another scientist, Berzelius, pooh-poohed that name, saying it didn’t sound pretty and didn’t jibe with the other amino acid names – so he suggested the shorter name that finally stuck – glycine. (Huge thanks to Julischka (_sandwichkind on Twitter) & Dr. Anita Corbitt (acorbe2 on Twitter) who answered my call for translation help (of the German to English, not the mRNA to amino acid kind!))
Lots of research has been done on glycine over the years, and here’s some of what we know about what makes glycine special.
The peptide bond in all proteins has restricted movement because the C, N, & O share electrons between the 3 of them (usually you just have electron-sharing between 2 atoms). This communal sharing is called electron delocalization or “resonance stabilization.” As the “stabilization” part suggests, this electronic orgy of sorts makes the atoms happy. So they “want” to share like this, but they can only do so if the 3 of them are all on the same plane. So movement along the peptide bond is restricted to twisting between planes in a chain. You can learn a lot more about them here: http://bit.ly/2P0pJrB
But even that twisting is restricted, depending on the nature of the side chain because of “steric hindrance” – that’s basically a fancy way of saying 2 things can’t be in the same place at once (even if they’re super super small). Bulky things need more space, leaving them with fewer available ways to move without hitting something – like the atoms of the peptide backbone. So bulkier side chain -> more steric hindrance
The thing about glycine is that its side chain is just an H – which is pretty damn small – movement-wise it’s like there’s barely anything there at all! As a result glycine has very low steric hindrance, so glycine residues are very flexible. (Remember, “residue” is just what we call an amino acid when it’s in a peptide chain and so has lost that water-equivalent. I don’t mean to harp on about this; I was just confused about it for a really long time but embarrassed to ask!)
So glycine’s smallness lets its backbone take on awkward angles that would be major no-nos for other amino acids. To see what I mean, take a look at a Ramachandran plot, which shows the pairs of backbone twist angles (the dihedral angles phi (φ) & psi (ψ) on either side of each residue’s alpha carbon) found in a protein – usually colored heat-map style to show you the most common and least common angles. When we’re solving a crystal structure we often check that the angles are geometrically solid & one of the things you’ll see in the “report card” for a structure is “Ramachandran outliers.” And usually it’ll be reported as “non-glycine” Ramachandran outliers – basically glycine can “break the normal rules” so it’s ok to find it at weird angles. Here’s the link for the paper in the figure: https://doi.org/10.1002/prot.10286
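If you’re curious where those backbone angles come from, here’s a minimal Python sketch (my own illustration, not taken from the paper in the figure) of how a twist (dihedral) angle can be calculated from the positions of 4 atoms in a row. For a residue, phi (φ) would use the atoms C(previous residue), N, Cα, C and psi (ψ) would use N, Cα, C, N(next residue). The function names and the test coordinates are made up for the example; they are not from a real structure.

```python
import math

def sub(a, b):
    """Vector a minus vector b."""
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed twist angle (degrees) about the p1-p2 bond."""
    b0 = sub(p1, p0)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    n1 = cross(b0, b1)  # normal to the plane holding p0, p1, p2
    n2 = cross(b1, b2)  # normal to the plane holding p1, p2, p3
    b1_len = math.sqrt(dot(b1, b1))
    y = dot(cross(n1, n2), b1) / b1_len
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the first 3 atoms stay put, only the 4th moves.
p0, p1, p2 = (1, 0, 0), (0, 0, 0), (0, 1, 0)
print(dihedral(p0, p1, p2, (1, 1, 0)))   # 4th atom on the same side as the 1st (cis)
print(dihedral(p0, p1, p2, (-1, 1, 0)))  # 4th atom on the opposite side (trans)
```

A Ramachandran plot is basically a scatter of (φ, ψ) pairs like these, one point per residue.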
Glycine’s “loosey-goosey-ness” makes it good for flexible regions of proteins BUT bad for places you need strong structure. So it’s often found in sharp turns leading into or out of more orderly structures like helices & sheets.
It’s typically only found in small amounts in protein, though it is the most abundant amino acid in the weird triple helices of the protein collagen, which helps make our skin stretchy but sturdy. But it’s found a lot of other places too. In its free form it acts as a neurotransmitter – a chemical messenger relaying news throughout the brain. And it is a member of the antioxidant tripeptide glutathione, which helps control oxidation status in our bodies. More on that here: http://bit.ly/2Yiya50
It’s also useful outside the body. Our lab has a lot of it because it can be used as a buffer (pH-stabilizer). To understand why, we need to look back at that whole (NH₃⁺/NH₂) & (COOH/COO⁻) thing. pH is a measure of how many protons (H⁺) are hanging out in a solution. If there are a lot of protons, the pH is low (because it’s a negative log scale) and we call it acidic. Fewer protons leads to higher pH which we call “basic” or “alkaline.” Some molecules can give protons (act as an acid) and/or take protons (act as a base). Amino acids can do both (we call such molecules amphoteric) and, as a result, they can “sop up” excess H⁺ OR OH⁻ to maintain the pH you want. More on buffers & protein charge here: http://bit.ly/30qzHH6
technical note: For those curious (and cuz it gives me the excuse to use one of my favorite words), glycine has a pKa1 of 2.34 and a pKa2 of 9.6. Take the average & you get a pI of 5.97, which means that at a pH of 5.97, glycine will be in its zwitterionic form (NH₃⁺ & COO⁻ cancel out to give you a neutral molecule).
Less-jargony – glycine’s useful for keeping pH steady around 6ish. Which makes it good for using as a buffer in Tris-Glycine SDS-PAGE gels (used to separate proteins by size) & it’s used in many cosmetics & toiletries. About 15,000 tons of it are made commercially each year. You can even find glycine in outer space. NASA has found it in samples taken from comets.
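To make the “negative log scale” part concrete, here’s a tiny Python sketch. It treats the proton concentration (in mol/L) as a stand-in for what is strictly the proton activity, and the function names are just ones I made up for the example.

```python
import math

def ph_from_protons(h_molar):
    """pH from the free proton concentration in mol/L (negative log10)."""
    return -math.log10(h_molar)

def protons_from_ph(ph):
    """Free proton concentration in mol/L for a given pH."""
    return 10 ** (-ph)

# More protons -> lower pH (more acidic); fewer protons -> higher pH (more basic).
for h in (1e-3, 1e-7, 1e-10):
    print(h, ph_from_protons(h))
```

Because it’s a log scale, each drop of 1 pH unit means 10 times more free protons.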
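Here’s that arithmetic as a short Python sketch, plus a rough estimate of glycine’s average net charge at different pHs using the Henderson-Hasselbalch relationship. The pKa values are the ones quoted above (pKa1 belongs to the carboxyl group, pKa2 to the amino group); the function names are made up for the example, and treating the 2 groups as independent is a simplifying assumption.

```python
PKA_CARBOXYL = 2.34  # pKa1: COOH <-> COO-
PKA_AMINO = 9.60     # pKa2: NH3+ <-> NH2

def isoelectric_point(pka1, pka2):
    """pI of an amino acid with no ionizable side chain: the average of its 2 pKas."""
    return (pka1 + pka2) / 2

def net_charge(ph, pka_carboxyl=PKA_CARBOXYL, pka_amino=PKA_AMINO):
    """Average net charge per molecule at a given pH (Henderson-Hasselbalch)."""
    # Fraction of amino groups still holding their extra proton (+1 each).
    amino_protonated = 1 / (1 + 10 ** (ph - pka_amino))
    # Fraction of carboxyl groups that have given up their proton (-1 each).
    carboxyl_deprotonated = 1 / (1 + 10 ** (pka_carboxyl - ph))
    return amino_protonated - carboxyl_deprotonated

pI = isoelectric_point(PKA_CARBOXYL, PKA_AMINO)
print(round(pI, 2))
for ph in (1.0, pI, 12.0):
    print(round(ph, 2), round(net_charge(ph), 3))
```

At low pH the molecule is mostly positive, at high pH mostly negative, and at the pI the 2 charges balance out.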
But where does it come from when our cells use it?
Glycine is characterized as a “nonessential” amino acid – this doesn’t mean we don’t need it – we certainly do! – it just means that we don’t need to get it “pre-made” in our food because our bodies have other ways of making it.
It can be biosynthesized from another amino acid we’ll get to later this month, serine (which itself can be made from 3-phosphoglycerate which is formed when sugar is broken down). More on that here: http://bit.ly/2qKgLpW
Basically the enzyme serine hydroxymethyltransferase (SHMT) (aka glycine hydroxymethyltransferase cuz the reaction can go both ways) can remove the “extra” hydroxymethyl group from serine, transfer part to the cofactor tetrahydrofolate (THF) and part to a proton to give you glycine & N5,N10-Methylene THF & water:
serine + THF → glycine + N5,N10-Methylene THF + H₂O
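As a sanity check that nothing goes missing in that reaction, here’s a minimal Python sketch that counts up the atoms on each side. The molecular formulas are the neutral-form ones as I understand them from standard references (they are not given in this post, so treat them as assumptions), and the helper names are made up for the example.

```python
import re
from collections import Counter

# Neutral-form molecular formulas (assumed from standard references).
FORMULAS = {
    "serine": "C3H7NO3",
    "glycine": "C2H5NO2",
    "THF": "C19H23N7O6",
    "methylene-THF": "C20H23N7O6",
    "water": "H2O",
}

def parse_formula(formula):
    """Count atoms in a simple formula like 'C3H7NO3' (no parentheses or charges)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts

def total_atoms(names):
    """Add up the atoms of every named molecule on one side of the reaction."""
    total = Counter()
    for name in names:
        total += parse_formula(FORMULAS[name])
    return total

left = total_atoms(["serine", "THF"])
right = total_atoms(["glycine", "methylene-THF", "water"])
print(dict(left))
print(dict(right))
print(left == right)
```

The carbon serine loses ends up bridging 2 nitrogens of THF, and the leftover O and H atoms leave as water, so both sides add up to the same atom counts.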
Where does it go?
The main pathway for glycine breakdown (catabolism) is the “glycine cleavage system” (GCS). Glycine decarboxylase cuts off ammonia (as ammonium, NH₄⁺) & carbon dioxide (CO₂) and transfers the leftover CH₂ to THF (that same cofactor we saw earlier when we were making glycine from serine (anabolism)).
glycine + tetrahydrofolate + NAD⁺ → CO₂ + NH₄⁺ + NADH + H⁺ + N5,N10-Methylene THF
Instead of just breaking it down, you can build from it – glycine serves as a precursor to other molecules including DNA (it provides the central C2N subunit of the purines (adenine & guanine)) & porphyrins (things like heme – the iron & oxygen holding helper that lets the red blood cell protein hemoglobin transport oxygen throughout your body).
But its main function is “proteinogenic” meaning it gets used for protein-making. Amino acids are “coded for” in messenger RNA (that is copied from DNA genes), with 3 RNA letters (nucleotides) acting as a “codon” that “spells” 1 amino acid. And, coincidentally, glycine is spelled by all the codons starting with the RNA letters GG (G for guanine) – so GGU, GGC, GGA, & GGG (it’s got the bottom right of the codon table cornered).
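Here’s a minimal Python sketch of that codon rule: it builds glycine’s 4 codons (“GG” plus any base) and counts how many times glycine is spelled in an mRNA read 3 letters at a time. The example sequence and the function names are made up for illustration, and the sketch assumes the reading frame starts at the very first letter.

```python
# The 4 RNA bases, in the order most codon tables list them.
BASES = "UCAG"

# Glycine's codons: "GG" followed by any of the 4 bases.
GLYCINE_CODONS = {"GG" + base for base in BASES}

def split_codons(mrna):
    """Split an mRNA string into complete 3-letter codons, starting at position 0."""
    mrna = mrna.upper().replace("T", "U")  # tolerate DNA-style input
    usable_length = len(mrna) - len(mrna) % 3  # drop any incomplete codon at the end
    return [mrna[i:i + 3] for i in range(0, usable_length, 3)]

def count_glycines(mrna):
    """Count how many codons in the message spell glycine."""
    return sum(codon in GLYCINE_CODONS for codon in split_codons(mrna))

example = "AUGGGUGGCGGAGGGUAA"  # made-up message: start, 4 glycine codons, stop
print(sorted(GLYCINE_CODONS))
print(split_codons(example))
print(count_glycines(example))
```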
How does it measure up?
chemical formula: C₂H₅NO₂
molar mass: 75.067 g/mol
systematic IUPAC name: 2-aminoethanoic acid
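If you want to check that molar mass yourself, here’s a minimal Python sketch that adds up standard atomic masses for the formula above (typed with plain digits instead of subscripts). It also shows the mass of a glycine residue, meaning glycine minus the water lost when it joins a peptide chain. The function name and the rounded atomic masses are my own choices for the example.

```python
import re

# Standard atomic masses in g/mol, rounded to 3 decimal places.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def molar_mass(formula):
    """Molar mass (g/mol) of a simple formula like 'C2H5NO2' (no parentheses)."""
    total = 0.0
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(number) if number else 1)
    return total

glycine = molar_mass("C2H5NO2")
water = molar_mass("H2O")
print(round(glycine, 3))          # free glycine
print(round(glycine - water, 3))  # glycine residue inside a peptide chain
```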