What’s a side chain that’s curvy gotta do with scurvy? Proline isn’t *pro*-lines & it’s definitely *anti* α-helix! But it’s a pro at being structurally awkward! Proline (Pro, P) may just be the wackiest protein letter (amino acid)… Although if you want to get technical about it, it’s actually an “imino” acid because it has a secondary instead of a primary amine group… But really, in functional terms it is an amino acid – it gets genetically coded for & inserted into growing protein chains just like any other. It’s just… special…
It’s Day 10 of #20DaysOfAminoAcids – the bumbling biochemist’s version of an advent calendar. Amino acids are the building blocks of proteins. There are 20 (common) genetically-specified ones, each with a generic backbone that allows for linking up through peptide bonds to form chains (polypeptides) that fold up into functional proteins, as well as unique side chains (aka “R groups”) that stick off like charms from a charm bracelet. Each day I’m going to bring you the story of one of these “charms” – what we know about it and how we know about it, where it comes from, where it goes, and outstanding questions nobody knows the answers to.
More on amino acids in general here http://bit.ly/aminoacidstoproteins, but the basic overview is:⠀
amino acids have generic “amino” (NH₃⁺/NH₂) & “carboxyl” (COOH/COO⁻) groups that let them link up together through peptide bonds (N links to C, H₂O lost, and the remaining “residual” parts are called residues). The reason for the “2 options” in parentheses is that these groups’ protonation state (how many protons (H⁺ ) they have) depends on the pH (which is a measure of how many free H⁺ are around to take).⠀
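If you’d like to see that pH dependence as numbers, here’s a minimal Python sketch using the Henderson-Hasselbalch relationship. The pKa values (about 2 for the carboxyl group and about 9.5 for the amino group) are rough, illustrative assumptions rather than numbers from this post, and the function name is made up for the example.

```python
def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a group still holding its H+ at a given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Assumed, rounded pKa values for a generic free amino acid (illustrative only)
PKA_CARBOXYL = 2.0  # COOH <-> COO-
PKA_AMINO = 9.5     # NH3+ <-> NH2

for pH in (1.0, 7.0, 12.0):
    cooh = fraction_protonated(pH, PKA_CARBOXYL)
    nh3 = fraction_protonated(pH, PKA_AMINO)
    print(f"pH {pH:>4}: COOH form {cooh:.3f}, NH3+ form {nh3:.3f}")
```

With these assumed values, around neutral pH the carboxyl has mostly given up its proton while the amino group is mostly still holding on to its extra one.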
Those generic parts are attached to a central “alpha carbon” (Cα), which is also attached to one of 20 unique side chains (“R groups”) which have different properties (big, small, hydrophilic (water-loving), hydrophobic (water-avoiding), etc.) & proteins have different combos of them, so the proteins have different properties. And we can get a better appreciation and understanding of proteins if we look at those letters. So, today let’s look at Proline (Pro, P).
Technically-speaking, proline is not an “a”mino acid – it’s an “i”mino acid – but this is really just a naming thing. Biochemically and functionally, it’s an amino acid – it serves as a protein letter (albeit a quirky one), getting coded for by 3-letter RNA “codons” (in the case of Pro, CCU, CCA, CCC, & CCG) and added to a growing chain during protein making, just like any other. The reason for the letter difference when you get all linguistically technical (the a versus i) is that, because the N is hooked up to 2 carbons in the free form, it’s called a “secondary amine” instead of a “primary amine” (where the N’s only attached to one non-H) and that gives it the chemical name α-imino acid. Note: the “α” just refers to the fact that the N is hooked to that central alpha carbon (Cα) – this part doesn’t change.
So, the naming thing isn’t that big of a deal and I don’t think I’ve ever actually seen proline referred to as an “imino acid” outside of textbooks/totally in-the-weed-y websites. But the imino business does have biochemical consequences! All the other amino acids’ side chains (R groups) only connect to the “generic” backbone at Cα. BUT proline’s side chain comes off Cα, loops around & connects to the backbone’s nitrogen, “kicking out” an H – a really important one! Normally when amino acids link together (through peptide bonds), this N still has that H so it can act as an H-donor in noncovalent hydrogen bonds with the O’s of other amino acids’ backbones to form secondary structures like α-helices & β-strands. Oh jargon-dy-gook… Let me explain…
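To make the codon part concrete, here’s a small Python sketch that spots proline codons in an RNA message. The function names and the toy mRNA are made up for illustration; only the four codons come from the text above.

```python
# Proline's four codons, as listed in the text above
PRO_CODONS = {"CCU", "CCA", "CCC", "CCG"}

def is_proline_codon(codon: str) -> bool:
    """Return True if an RNA codon codes for proline."""
    codon = codon.upper().replace("T", "U")  # accept DNA-style spelling too
    return codon in PRO_CODONS

def count_prolines(mrna: str) -> int:
    """Count proline codons, reading in-frame from the first letter."""
    usable = len(mrna) - len(mrna) % 3  # ignore any leftover 1-2 letters
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return sum(is_proline_codon(c) for c in codons)

# Hypothetical toy message: AUG, CCU, GGC, CCG, UAA
example = "AUGCCUGGCCCGUAA"
print(count_prolines(example))  # 2
```

Notice that all four codons start with “CC”, so the third letter doesn’t matter for proline.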
Basically, atoms (like individual C’s, H’s, O’s, & N’s) are really tiny, but they’re made up of even tinier parts called “subatomic particles,” which include electrons, protons, and neutrons. Electrons are negatively-charged subatomic particles that whizz around in “electron clouds” around a dense central core called the atomic nucleus where positively-charged protons (with some gluing together help from neutral neutrons) are tasked with reining them in. Next-door-neighbor atoms join together in strong “covalent bonds” when they share pairs of electrons. This strong gluing is how molecules are formed. But atoms can also interact through weaker attractions that don’t involve electron housing rearrangements, they just involve charge-based attractions.
In a neutral molecule, the # of protons = the # of electrons. If there’s a “full” imbalance, meaning you have more electrons than protons or vice versa, you get a formal charge and we call such charged particles ions. More electrons than protons gives you a negatively-charged ANION & more protons gives you a positively-charged CATION.
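That bookkeeping is just subtraction, so here’s a tiny Python sketch of it. The function names are made up for the example, and the sodium and chloride ions are my own examples rather than ones from this post.

```python
def net_charge(protons: int, electrons: int) -> int:
    """Each proton counts +1 and each electron counts -1."""
    return protons - electrons

def describe(protons: int, electrons: int) -> str:
    """Name the particle type based on its net charge."""
    charge = net_charge(protons, electrons)
    if charge > 0:
        return "cation"
    if charge < 0:
        return "anion"
    return "neutral"

print(describe(6, 6))    # carbon atom: neutral
print(describe(11, 10))  # sodium ion (Na+): cation
print(describe(17, 18))  # chloride ion (Cl-): anion
```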
The # of protons an element has is fixed – it’s what defines an element (e.g. carbon has 6 and will always have 6 and hydrogen has 1 and will always only have 1 – if it had 2 it’d be helium). But the electrons, especially the more loosely-held outermost ones called valence electrons can roam around a little and even get lured away. They have to get fully lured away to give you a full charge, but they can get lured a little farther away from their “owning” nucleus without leaving when they’re in an unfairly-shared covalent bond. In such “polar covalent bonds,” one atom is electron-hogging (electronegative) and it pulls their shared electrons closer to it, leaving its neighbor partly positive, and making itself partly negative.
As we’ve revisited over and over (and will continue to revisit over and over because nature does too cuz it’s that fundamental), opposite charges attract – even partial ones. So the oppositely-charged regions of polar molecules can get attracted to one another.
Some atoms, like oxygen (O) & nitrogen (N), are really electron-hogging (electronegative), but poor little hydrogen only has a single proton to try to persuade electrons to hang out with it. So when H is in a covalent bond with an N or an O, it usually has its electron partly pulled away, leaving it partly positive – and thus happy to interact with a partly negative part of another molecule. Oxygen & Nitrogen often have “lone pairs” of electrons – pairs of electrons that aren’t shared with another atom and thus are kinda like little negative charge bundles. So partly-positive H’s can get attracted to them and hang out, forming a “hydrogen bond” (aka H-bond). In H-bonds, the atoms don’t share the electrons (no housing rearrangements) but the charge-based attraction is stronger than most other noncovalent attractions. http://bit.ly/frizzandmolecularattractions
So, I’m going through all this work telling you about H-bonds… just to tell you that proline’s backbone N can’t form them! Normally, the generic backbone offers 2 locations for H bonding. The carbonyl (C=O) provides an H-bond acceptor in the form of the O and the amino group provides an H-bond donor in the form of the N-H. Therefore, backbones can interact through H-bonds to give a protein its “secondary structure” (common “structural motifs” like helixes, sheets, etc) http://bit.ly/insulindiabetes
BUT Proline’s N doesn’t have this H because it’s “been replaced” by a bond to the side chain. So it can’t act as a donor. Thus, it doesn’t want to form α-helixes, and if it’s in them it’ll make them kinky. Proline can also make other places kinky because its side chain contortion “locks” the N-Cα bond in place, leading to limited backbone flexibility – even limited-er than usual!
All protein backbones have limited flexibility because the peptide bonds linking them together get stabilized by resonance (electron delocalization where “extra” electrons are shared among more than just 2 atoms), which can only happen if the C, O, & N (and their attached Cα’s) are in the same plane, so you end up with a chain of planes where you can only rotate at certain places in the backbone (C-Cα (psi) & Cα-N (phi)). And even those rotations are restricted by steric hindrance (you can’t have atoms colliding with one another so bulky side chains restrict movement more). http://bit.ly/aminoacidstoproteins
You can see this if you look at a Ramachandran Plot, which shows the backbone angles (phi & psi) taken by the amino acids in a protein – usually colored heat-map style to show you the most common and least common angles. When we’re solving a crystal structure (more later) we often check that the angles are geometrically solid & one of the things you’ll see in the “report card” for a structural model is “Ramachandran outliers” – residues in the model that have suspicious angles. Usually it’ll be reported as “non-glycine,” “non-proline” Ramachandran outliers – basically glycine can “break the normal rules” because it’s so small (its side chain’s just an H) so it’s ok to find it at weird angles. Proline can also break the rules, but instead of being able to move lots more ways, like glycine, it just has “different rules” – it’s restricted to different angles. Here’s the link for the paper in the figure: https://doi.org/10.1002/prot.10286
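If you’re curious how those backbone angles get calculated, here’s a minimal Python sketch of a torsion (dihedral) angle computed from four atom positions. Phi is the torsion along C(previous)-N-Cα-C and psi is the torsion along N-Cα-C-N(next). The function names are hypothetical and the coordinates are made-up toy points, not atoms from a real structure.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle (degrees) around the p1-p2 bond."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals to the two planes
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone angles for one residue from five backbone atom positions."""
    phi = dihedral_deg(c_prev, n, ca, c)
    psi = dihedral_deg(n, ca, c, n_next)
    return phi, psi

# Toy coordinates (assumed, for illustration only)
print(phi_psi((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1), (1, 1, 1)))
```

A Ramachandran plot is then just a scatter of (phi, psi) pairs, one point per residue.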
Despite being kinda opposites, these two amino acid outlaws (proline & glycine) often work as partners in crime (or more like partners in cool chemistry). They’re often found together in sharp turns, where Pro helps reinforce the kink & G’s small enough to squeeze in.
I’ve been talking about how Pro doesn’t like to form the *common* α-helix, but chains of it can form a special kind of helix called a polyproline helix. You find this kind of helix in collagen, which consists of 3 really long (>1400 amino acids) polypeptide chains (in the most common type, two copies of α1 & one copy of α2) wound together in a cool triple helix, with repeating sequences of Gly-X-Y, where X or Y can be proline or hydroxyproline (proline can be in either place but hydroxyproline can just be in Y).
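Here’s a minimal Python sketch of how you might check a stretch of sequence for that Gly-X-Y pattern using one-letter codes (G = Gly, P = Pro). The function name, the minimum-repeat cutoff, and the test sequences are all made up for illustration.

```python
import re

def is_gly_x_y_repeat(seq: str, min_repeats: int = 3) -> bool:
    """True if seq is built entirely from G-X-Y triplets (at least min_repeats)."""
    pattern = r"(?:G[A-Z]{2}){%d,}" % min_repeats
    return re.fullmatch(pattern, seq.upper()) is not None

print(is_gly_x_y_repeat("GPPGPAGPP"))  # True: G at every third position
print(is_gly_x_y_repeat("GPPAGPGPP"))  # False: the second triplet starts with A
```

This only checks the spacing of the glycines; it says nothing about whether a sequence actually folds into a triple helix.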
“Hydroxyproline” is just proline that gets a hydroxyl (-OH) group added on “after the fact,” (post-translationally) during protein-making in the endoplasmic reticulum (ER) – a membrane bound “room” in your cells where some proteins get extra help getting made if they require modifications and/or shipping. The genetic instructions tell the protein-making ribosomes to add proline to the growing chain during the process of “translation.” Then the protein enzymes (reaction mediators) collagen prolyl 4-hydroxylase or prolyl 3-hydroxylase add the -OH. This hydroxyl can get added to the “4” or the “3” position, to give you 4-hydroxyproline or 3-hydroxyproline (with 4 dominating ~ 100:1, so it’s usually what people are talking about if they don’t specify a number).
Enzymes, such as those hydroxylases, are able to speed up reactions without getting used up themselves because they “just” bring the reactants together in the right orientation, environment, etc.; they only lead the horse to water, they don’t make it drink. http://bit.ly/enzymecatalysis
In the case of proline hydroxylation, the oxygen comes from molecular oxygen (O₂) and the enzyme needs some help getting it to go on… Vitamin C (ascorbic acid) to the rescue! (As well as iron and α-ketoglutarate, which loses CO₂ and takes one of the O₂’s O’s to become succinate.) Vitamin C acts as a cofactor (non-protein helper molecule) for the hydroxylases. So if you want to make strong collagen, you need vitamin C, and this is why scurvy (vitamin C deficiency) causes defective collagen leading to your teeth falling out and your skin bruising easily. Therefore, even though we can make proline (it’s non-essential), since we can’t make vitamin C, we can’t make proper collagen without getting this vitamin in our diet.
There are different kinds of collagen and the RCSB Protein Data Bank (PDB) has a great article about it as one of their “Molecule of the Month” pieces. https://pdb101.rcsb.org/motm/4 In the pics you’ll find a picture from that page showing you what the chains look like. Those “pictures” were obtained using a technique called x-ray crystallography. This is a technique where we look at protein structures by bouncing x-rays off of molecules’ electrons (e⁻), then capturing those bounced-off x-rays on a detector and working backwards from the pattern of spots (diffraction pattern) to find the bounce-off points. http://bit.ly/xraycrystallography2
I’m mentioning this here, because of another weird thing about proline that can be discovered (or falsely discovered) using this technique. My first experience with crystallography was so cool – actually getting to “see” the bonds you only hear about (although what we see is really just a “meshy” thing of electron density that we’ve computed based off of a pattern of dots and then built a model into). Even when you have a hard time making out side chains, the backbone can still be visible. So you can trace those characteristic peptide bond angles, and what you usually see is a “zig-zag.” This zig-zag comes from how amino acids link together; they almost always attach so that neighboring Cα are on OPPOSITE sides (TRANS conformation), so you get a zig-zag (/-/), instead of having the Cα on the SAME side, which we call CIS conformation and which gives you a “staple” (/-\) shape.
Most amino acids are in TRANS nearly all the time because this leads to less steric clashing with their neighbors. But for Pro, it’s clash-y either way – and pretty much to the same extent. So, unlike the other amino acids, Proline can also (though less commonly) adopt a CIS conformation. This requires some help though because the ribosome puts it in trans when it’s linking up the amino acids. And switching is hard because it requires disrupting that partial-double-bond-like peptide bond. An enzyme called prolyl isomerase thus helps convert between trans (which it’s formed as) & cis. This need for enzymatic help in switching can be a holdup point in protein folding, but most Pro doesn’t need to be swapped.
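People usually put a number on cis vs. trans using the omega (ω) torsion angle, measured along Cα-C-N-Cα across the peptide bond: around 180° means trans and around 0° means cis. Here’s a minimal Python sketch of that labeling. The function name and the 30° tolerance are arbitrary assumptions for illustration, not a standard cutoff taken from this post.

```python
def peptide_bond_conformation(omega_deg: float, tolerance: float = 30.0) -> str:
    """Label a peptide bond from its omega torsion angle (degrees)."""
    # Wrap the angle into the range [-180, 180)
    omega = (omega_deg + 180.0) % 360.0 - 180.0
    if abs(omega) <= tolerance:
        return "cis"      # neighboring C-alphas on the same side
    if abs(omega) >= 180.0 - tolerance:
        return "trans"    # neighboring C-alphas on opposite sides
    return "twisted"      # neither; worth a second look at the model

for omega in (178.0, -175.0, 3.0, 90.0):
    print(omega, peptide_bond_conformation(omega))
```

Anything that lands in the “twisted” bin is the sort of thing you’d want to double check against the electron density rather than just accept.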
Studying at Cold Spring Harbor Laboratory (CSHL), there are a lot of great perks – like sitting in on the legendary x-ray crystallography course. I will never forget a lecture by the great Jane Richardson (of ribbon diagram & MolProbity fame) about how you can’t just chalk modeling problems up to such “cis-Pro.” Once scientists first discovered that cis-Pro exists, it became common for weird regions of density to be explained by this strange conformation (e.g. it’s not a model problem – it’s just a cis-Pro!). Big crystallography no-no! But these cis-Pro really do exist (they’re more likely to be legit if the preceding letter is Gly or an aromatic amino acid) – and when they do they can have great functional significance.
Collagen is the main protein in your skin (and it’s also in tendons, cartilage, bones, blood vessels, etc.) and your skin’s your biggest organ, so you have a LOT of collagen (~1/4 of your body’s proteins), and roughly 1/5 to 1/4 of collagen is proline or hydroxyproline, so your body uses a LOT of proline. Thankfully, our body CAN make proline, so we call it “non-essential” in the dietary sense.
Proline can be made from the amino acid glutamate in a pathway which involves a couple steps which require enzymatic assistance from pyrroline-5-carboxylate synthase (P-5-C synthase) to first form glutamate-5-semialdehyde. And then (since that enzyme did its job in making a really awkward molecule) it spontaneously ring-i-fies to give you pyrroline-5-carboxylate (P5C), which can gain electrons (and H) to become proline with the help of P5C reductase.
The reaction can also go the other way with the help of dehydrogenases, allowing you to make glutamate from proline (dehydrogenating it with the help of proline dehydrogenase 1 (PRODH) letting you go back through that P-5-C). And if you want, you can then remove the amino group, to convert that to α-ketoglutarate, which can enter the citric acid cycle (aka TCA or Krebs cycle) and, by way of oxaloacetate, get used to make glucose (blood sugar) – so we refer to proline as glucogenic.
Proline is classified as NONPOLAR – not surprising since its side chain just has C’s & H’s, which share electrons pretty fairly. But, although nonpolar amino acids typically avoid water and hang out in the core of proteins, thanks to that turn-making helpfulness, unlike most other nonpolar amino acids, you do find it on the surface of proteins (proline takes one for the team).
It was first isolated in 1900 by Richard Willstätter & gets its name from “pyrrolidine” (chemical name for its side chain ring)🔬
how does it measure up?
coded for by: CCU, CCA, CCG, CCC
chemical formula: C5H9NO2
molar mass: 115.132 g·mol−1
systematic IUPAC name: Pyrrolidine-2-carboxylic acid
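As a quick sanity check on those numbers, here’s a minimal Python sketch that adds up the molar mass from the chemical formula. The atomic weights are standard values rounded to 3 decimal places, and the little parser is a hypothetical helper that only handles simple formulas like this one (no parentheses or charges).

```python
import re

# Standard atomic weights, rounded to 3 decimals (g/mol)
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def molar_mass(formula: str) -> float:
    """Sum atomic weights for a simple formula such as 'C5H9NO2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHTS[element] * (int(count) if count else 1)
    return total

print(round(molar_mass("C5H9NO2"), 3))  # 115.132
```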