Wait a Sec! I thought U said this was #20DaysOfAminoAcids?! I know this is supposed to be #20DaysOfAminoAcids, because there are 20 (common) proteinogenic amino acids (protein letters) that can be genetically encoded (as opposed to changed after the fact), but there’s actually a 21st – selenocysteine. It looks like cysteine, but with selenium instead of sulfur. Sulfur vs. selenium may not seem like a big deal, but it matters a lot when there are electrons to steal! And it might be rare, but Selenium (Se)’s chemical properties can imbue electron give-take-ing abilities to proteins, allowing our body’s natural antioxidant system to protect us from high-energy reactive oxygen species (ROS). And selenocysteine is special in another way – it’s able to “fool” our “foolproof” protein-making machinery into adding it instead of stopping.
Proteins (molecular machines) are made up of chains of building block “letters” called amino acids that are like charm bracelets. Amino acids have a generic backbone (chain link) that allows any amino acid to connect to any other amino acid as well as a unique side chain “charm” that sticks out. These charms have different chemical properties that allow them to interact in different ways w/one another (important for the protein to fold properly) & with other molecules (important for intracellular interactions).
Over the past few weeks, we saw the 20 “usual” amino acid charms. We saw how there could be some modifications made to them after they’re already in proteins. Since the process of protein-making is called translation, we call such modifications “post-translational modifications” and they include things like: phosphorylation (adding a negatively-charged phosphate group (phosphorus surrounded by 4 oxygens) – typically to serine (Ser, S), threonine (Thr, T), or tyrosine (Tyr, Y)); methylation (adding a -CH₃ group, often to lysine (Lys, K)); hydroxylation (adding an -OH group, often to proline (Pro, P) or Lys); and glycosylation (adding a sugar chain – often to asparagine (Asn, N)).
But all of those “special” amino acids come from changing the charms *after* the bracelet’s made – they’re not “spelled for” in the mRNA instructions, where amino acids are “spelled” in 3-letter RNA words called codons and, during translation, transfer RNAs (tRNAs) with matching anticodons on one part and the corresponding amino acid hooked up to another part bring them to be added to a growing peptide chain in a protein-making complex called the ribosome.
There are multiple quality control steps to make sure 1) the right amino acid is attached to the right tRNA and 2) the right tRNA binds to the right codon in the mRNA. Yet, somehow, selenocysteine is able to trick this “foolproof” system… Unlike the “special” amino acids we talked about before, like phosphorylated serine (phosphoserine), SELENOCYSTEINE (abbreviated Sec or U) is “original” – it’s added as the bracelet’s being made (translated) BUT it doesn’t have its own word. Instead, it hides a “secret message,” a stem-loop structure (named a SECIS element) in the “footnotes” (3’ untranslated region (UTR)) of the mRNA which tells the protein-making machinery that, in this particular protein, one of the words for stop (UGA) really spells selenocysteine. Thus, when the ribosome reaches that UGA, Sec-incorporating machinery is recruited and, instead of binding a release factor and cutting off the peptide, the ribosome adds Sec & keeps going along.
Selenocysteine is able to sneak in – but it sure isn’t easy! It takes a lot of different molecular players working together. We’ll talk more about how it pulls it off later, but first I want to talk about *why* our bodies invest all that effort and energy – especially given that selenocysteine-containing proteins (selenoproteins) are rare in numbers terms – humans only have 25 of them as far as we know, compared to at least 20,000 different total proteins. So why are we celebrating selenocysteine?
Selenocysteine (Sec, U) & cysteine (Cys, C) – sounds like they should be similar… Indeed they are *similar* BUT they’re also subtly, importantly, different. And that difference comes from the difference between sulfur (S) and selenium (Se). That’s the only thing that’s different between them atom-wise. Cys’ side chain is a -CH₂-SH group, whereas Sec has a -CH₂-SeH group. -SH (what Cys has) is called a THIOL (it’s alcohol’s (-OH) sulfur CYSter). And -SeH (what Sec has) is called a SELENOL. We’ll talk more about the bigger picture of this later, but first I want to talk about the much smaller picture… yup, I’m talking the subatomic picture.
The difference between Cys and Sec is a single atom (an atom of the element sulfur vs an atom of the element selenium). Atoms contain a central nucleus containing positively-charged protons & neutral neutrons surrounded by a “cloud” of negatively-charged electrons (e⁻ ). Atoms interact with one another through their outermost (valence) electrons – these ones are the most energetic and furthest from the pull of the nucleus, so they’re most reactive and atoms can share to form strong covalent bonds (like the ones linking amino acids together). The # of valence e⁻ an atom “has to spend” influences how many of these bonds to form.
In a neutral atom, the # of protons = the # of electrons. The number of electrons can change – leading to non-balanced charge and thus atoms or molecules that have a net charge (we call such charged things “ions”). But the number of protons is what defines an element – so if an atom has 6 protons it’s *always* carbon and if it has 8 it’s *always* oxygen. Within the periodic table of elements, elements in the same column (group or family) have the same # of valence e⁻, so they often react similarly. But as you go down the group, more layers of e⁻ are added in between those outer e⁻ & the protons & there are also more protons, so the atoms get heavier, but their positive pull is shielded from the outer e⁻ by all those extra layers in between, so the outer e⁻ are more “free.”
The elements Oxygen (O), Sulfur (S), & Selenium (Se) are all in the same column (group 16 – sometimes called “oxygen family”; also sometimes called chalcogens (“ore-formers” because they commonly bind metals)) so they have same # of valence e⁻ (6) & can react similarly in many cases. BUT, these physical chemical differences lead to biochemical effects, & biological consequences, as readily seen in the case of serine vs cysteine vs selenocysteine!
Let’s start with Cys as it’s MUCH more common (it’s in most proteins). Remember, it has that -CH₂-SH group, with -SH called a THIOL. 2 Cys (either within the same protein or between 2 proteins) can link together (protein)-SH + HS-(protein) -> (protein)-S-S-(protein) to give you a disulfide bond (aka a disulfide bridge or cross-link) & now you call the Cys’s cystIne. http://bit.ly/cysteinecrosslinks
Unlike other charm-charm interactions, which are just *attractions* based on charge (or partial charge) differences, this is a covalent bond, so it’s strong (and good for sturdying-up secreted proteins that have to live outside the comfort of the cell), BUT it’s not quite as strong as the covalent bonds in the protein’s backbone (linking the chain links) so you can split them back up without splitting up the chain links themselves.
This splitting & unsplitting involves REDuction & OXidation (REDOX) reactions. More here ( http://bit.ly/dttreducingagents) but basically they involve thing 1 (the reductant) giving e⁻ to thing 2 (the oxidant), reducing thing 2 & oxidizing thing 1 in the process. Remember OIL RIG: Oxidation Is Loss (of e⁻), Reduction Is Gain (of e⁻).
When cysteines link up as cystines, they share their “extra” e⁻ so they’re “losing” e⁻, thus being OXIDIZED (lose the “e” in the name (and also “e⁻”)). When they split up, they get the e⁻ they’d been sharing “back” – that is, they’re REDUCED.
They’re usually in the “reduced,” unbridged state inside your cells, but they can bridge up when they get oxidized by reactive oxygen species (ROS) which have high-energy electrons. Low, CONTROLLED, levels of ROS are important for things like signaling but high levels can be damaging because, as their name suggests, they’re reactive and they can react with things like DNA & proteins, messing them up. So your cells have antioxidant proteins like Thioredoxin (Trx) with Cys’s standing by to “take the oxidative hit” – they sop up the ROS, but then they get stuck in the bridged form. To regenerate the antioxidant-ness, you need to reverse the OXIDATION by REDUCING them, & this is one of the places where Sec (or another Cys) can come in handy by breaking the disulfide & forming a new bond to one of the S’s.
In order to do this, Cys or Sec has to give up their H (deprotonate). Ring a bell? Giving up a proton (H⁺) is the Bronsted definition of an ACID! S can do this, & it does so more readily than O (as we saw with serine yesterday, the alcohol -OH group is really reluctant to deprotonate) but still only grudgingly. http://bit.ly/serineserineproteases
How about Sec? Its “charm” is a -CH₂-SeH group. -SeH is called a SELENOL & it’s more generous with its H⁺, as evidenced by its lower pKa. pKa is a measure of acid strength (more here http://bit.ly/phacidbase). The lower the pKa, the stronger the acid (it will give up an H⁺ more readily). pH is a measure of “free” H⁺ (the higher the pH, the fewer free H⁺). At a pH above a molecule’s pKa, there are fewer extra H⁺, so even a stingier acid will start to feel “guilty” and donate to the cause – it’s more likely to be deprotonated at a pH above the pKa (more basic/alkaline conditions) & below the pKa (more acidic conditions) it’s more likely to be protonated (there are plenty of free H⁺ around, why should I share?).
Se gives up an H⁺ more easily than S (in science-y terms we say it’s more acidic & has a lower pKa; the pKa of Sec is ~5.2 vs. ~8.3 for Cys). The pH of our cells is ~7.4, which is above Sec’s pKa but below Cys’, so Cys is likely in its protonated thiol form (protein)-SH, BUT Sec is likely in its proton-less selenolate form (protein)-Se⁻.
Atoms are more stable when they’re neutral, but H leaves as a proton (H⁺) leaving behind its e⁻ and unbalancing the charge. Sec’s Se⁻ doesn’t like being charged – it only gave up an H⁺ because it saw that others needed it (well, not technically – it’s more like it just always loses it more easily and now there are fewer new ones to find when it does). So it will jump at the chance to get some positivity to even out its negativity. And where can you find positivity? In a nucleus (where the positive protons are) – so we call selenolate a NUCLEOPHILE. http://bit.ly/nucleophilefiles
As a nucleophile, it will seek out & attack ELECTROPHILES (partially or fully positively-charged things that want more e⁻) & form new bonds with them. BUT, it’s also more likely than S to lose those new bonds it forms. Why such a fair-weather friend?
S “needs” a partner more than Se does. Because Se’s bigger, it’s better able to spread out the extra charge that comes w/splitting up (think of a drop of food coloring in a pool vs a cup). It feels the loss of a partner less so is better able to cope without it. Additionally, just like it’s awkward for someone with a giant hand to shake a little kid’s hand, it’s awkward for a big atom like Se to form bonds w/smaller things, so the bonds it forms are weaker. Therefore, you wouldn’t want to rely on Sec-Sec bridges to hold a protein together outside the cell or Sec to hold on tight to an important metal ion (things Cys is good for) BUT the reversibility makes Sec a great sensor & protects proteins from “permanent” oxidative damage.
Going back to breaking up a disulfide bridge – you need a reducing agent & selenolate (the -Se⁻ form) can step up to the plate! Remember, it has more e⁻ than it wants & oxidation is loss of e⁻. So it wants to be oxidized (which means something else has to be reduced, since you can’t have one without the other!). Therefore, it breaks up the disulfide bond by latching onto one of the S’s. But now you have a Se stuck to the S (a selenenylsulfide (1 Se & 1 S) instead of a disulfide (2 S’s)). So instead of the protein-SH & HS-protein you’re after, you have protein-S-Se-protein & HS-protein. Thankfully, that Se-S bond’s weaker than the S-S bond was so it can break off easier, giving you 2 reduced cysteines.
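To put rough numbers on that “likely,” here’s a minimal Python sketch using the Henderson-Hasselbalch relationship (ratio of deprotonated to protonated = 10^(pH − pKa)). The function name `fraction_deprotonated` is just something I made up for illustration, and the pKa values are the approximate free-amino-acid ones quoted above; inside a real protein the neighbors can shift them quite a bit.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of an acid in its deprotonated form (Henderson-Hasselbalch)."""
    ratio = 10 ** (pH - pKa)   # [deprotonated] / [protonated]
    return ratio / (1 + ratio)


CELL_PH = 7.4                   # approximate pH inside our cells
APPROX_PKA = {                  # approximate side chain pKa values from the text
    "Sec (selenol, -SeH)": 5.2,
    "Cys (thiol, -SH)": 8.3,
}

for name, pKa in APPROX_PKA.items():
    frac = fraction_deprotonated(CELL_PH, pKa)
    print(f"{name}: ~{frac:.0%} deprotonated at pH {CELL_PH}")
```

Under those assumptions Sec comes out around 99% deprotonated while Cys is only around 11%, which is the “selenolate vs. thiol” difference in a nutshell.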
Glutathione peroxidases (GPx’s) were the first identified selenoproteins & they’re one of the first lines of defense against oxidative stress. They catalyze (mediate & speed up) reduction of hydrogen peroxide (H₂O₂) &/or lipid peroxides to water & lipid alcohols with the help of glutathione. GPx’s selenol (GPx–SeH) reacts with the peroxide, neutralizing that threat by grabbing one of its O’s and turning into selenenic acid (GPx–SeOH) in the process. To get back to where it started, it latches onto glutathione’s sulfur, trapping it in a glutathionylated intermediate (GPx–Se–S–G) which then reacts with another glutathione & the glutathiones, now in the oxidized form (GSSG), leave together, giving you the selenol form again.
Other important selenoproteins are thioredoxin reductases (TrxRs), which work with little proteins called thioredoxin (Trx) to serve as an intracellular disulfide reduction squad to regenerate antioxidants and help regulate signaling pathways. TrxR’s Sec helps replenish Trx’s Cys’s. And iodothyronine deiodinases are involved in activating (and inactivating) thyroid hormones.
Protein crystallographers also like selenium for a different reason – since selenium is heavy it will scatter x-rays differently, so we can incorporate it into proteins as selenomethionine (SeMet) which you can do by growing the cells making the protein in food with SeMet instead of Methionine (Met). Met is the other S-containing charm (it has a -CH₂-CH₂-S-CH₃ side chain). Replace that S with an Se & you get selenomethionine. That different scattering lets us find where the SeMet sites are and use them as anchor points to figure out what proteins look like. More on crystallography here http://bit.ly/xraycrystallography2
So, time for the magician to reveal her secrets – how *does* Sec sneak in?!!! Sec doesn’t speak English, so let me try to *translate* the story for you… let’s go back to the mRNA to see what it has to say! I mentioned some of this briefly before, but now I want to tell you some more (so apologies for redundancy or, as I like to think of it here, review…)
mRNA stands for messenger RNA, and it’s an RNA copy of the DNA recipe for a protein (a gene). Mature mRNA is not a letter-for-letter copy – firstly because RNA swaps the letter “U” for the letter “T” and RNA has an extra oxygen in its sugar (Ribose instead of Deoxyribose) – but more significantly because the immature mRNA gets edited to remove most of the regulatory information – at least the “margin notes” called introns that interrupt the parts that have instructions for expressing protein letters (exons). There is, however, regulatory info left on the ends in untranslated regions (UTRs) – the 5’ UTR at the start and the 3’ UTR at the end (‘ is pronounced “prime” and it refers to whether the end sugar has a free left arm (5’) or left leg (3’) available for linking).
The 3’ UTR is going to come into play later so keep it in mind… But for now let’s look at what’s in the exons, the “protein-coding” parts, which spell out what amino acids to add in what order. Amino acids are spelled for in 3-letter messenger RNA (mRNA) “words” (copied from DNA words) called codons (e.g. UGU spells cysteine & CCU spells proline). Multiple words can mean the same thing (e.g. UGU & UGC both mean cysteine, they’re just alternate spellings), so there’s redundancy (aka degeneracy), BUT *usually* each word only means 1 thing (e.g. UGU never spells proline) – that is to say, there’s *no* ambiguity, because translation is an incredibly efficient, accurate, evolutionarily-honed process. Let’s look a little closer at this process so we can understand how Sec hacks it.
Translation involves a protein-making complex made up of protein & RNA called the ribosome traveling along the mRNA (or the mRNA traveling through it) and joining together amino acids based on the sequence of RNA letters it encounters. It reads codons non-overlappingly (e.g. UGUCCU is read as UGU CCU, leading to the addition of Cys, then Pro) and it adds those amino acids because transfer RNAs (tRNAs) with a complementary 3-letter anticodon on one part and the corresponding amino acid hooked onto another part come pass it off to the growing chain while the ribosome holds it in place and facilitates the transfer. http://bit.ly/proteintranslation
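If it helps to see the “read 3 letters at a time” idea spelled out, here’s a toy Python sketch. Big caveats: the codon table is a tiny hand-picked subset (the real one has 64 entries), the `translate` function is my own made-up helper, and a real ribosome obviously does a lot more than a dictionary lookup.

```python
# Tiny, deliberately incomplete codon table (the real one has 64 entries).
CODON_TABLE = {
    "AUG": "Met",
    "UGU": "Cys", "UGC": "Cys",   # two spellings, same amino acid (redundancy)
    "CCU": "Pro",
    "UCU": "Ser",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def translate(mrna: str) -> list[str]:
    """Read codons non-overlappingly (3 letters at a time) until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        amino_acid = CODON_TABLE[codon]   # KeyError for codons not in the toy table
        if amino_acid == "STOP":
            break
        peptide.append(amino_acid)
    return peptide


print(translate("UGUCCU"))     # ['Cys', 'Pro']
print(translate("UGCCCUUGA"))  # ['Cys', 'Pro'] (alternate Cys spelling, then stop)
```

Note that each codon maps to exactly one entry: that’s the “each word only means 1 thing” rule that Sec manages to bend.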
The tRNAs don’t come alone – proteins called elongation factors bring the tRNAs and make sure the codon & anticodon match before they spend energy money and leave – they hydrolyze (use water to split) GTP into GDP (this is the same concept as using ATP for energy (it takes energy to hold the negatively-charged and thus repelling phosphate groups together, so you can think of phosphate-phosphate bond hydrolysis as unclamping a biochemical spring) – GTP just has a different nucleobase). https://bit.ly/atpenergymoney
It’s the anticodon that matters here, not the actual amino acid, so you need more safety checks upstream – when the tRNA is getting “charged” (attached to an amino acid) the charger makes sure it’s a match. It even “invests” in it, this time spending “energy money” in the more commonly-encountered form of ATP – in the amino acid activation step, AMP gets transferred onto the amino acid from ATP. And then in the tRNA charging step, the amino acid binds to the matching tRNA – AMP gets released, and you’re left with aminoacylated tRNA http://bit.ly/2KNe00D
Once the first elongation factor splits GTP, it “splits” -> falls off. The peptide bond forms, with the growing strand transferred to the newest tRNA, leading to an awkward transition stage that gets helped along by another elongation factor coming & spending another GTP. This “translocation” step moves the growing chain (attached to the tRNA) from the ribosome’s “A” (Aminoacyl) room, where tRNA Arrives, to the “P” (Peptidyl) room where the Peptide chain is now held, freeing up the entryway (“A” spot) for the next tRNA to come and pushing the “old,” spent tRNA out the Exit “E” room http://bit.ly/2GhbFps
Something special *usually* happens when a “stop codon” shows up there (UAA, UAG, or UGA). This *usually* signals the end of the chain. Instead of a tRNA binding it, a protein TERMINATION FACTOR (aka release factor) binds and cleaves the chain off. http://bit.ly/2IFmmEZ
Normally, when the ribosome gets to a stop codon, it has to pause for a sec (while a release factor gets brought in to help break things off) – and with a little trickery you can get it to pause for a Sec! With selenocysteine, before there’s even time to bring one of those in, the tRNA for selenocysteine sneaks in there – and it’s able to do this in part because of a structure in the 3’ UTR. Yep – mRNA can have shapes too! In this case it’s a stem loop (kinda like a hairpin, but the human one has an extra bubble). This special structure is called a selenocysteine insertion sequence (SECIS) element and, since it’s located on the same strand of mRNA as the UGA to be ignored, we call it a “cis-acting element” (cis for same). There are also a couple “trans-acting elements” – trans for different because they’re totally separate things – in this case “accessory” proteins that help the magic happen and bind the Sec-charged tRNA, keeping it nearby and ready to sneak in before a release factor can.
But it’s not just the inserting into a peptide chain that’s hard – making Sec is a challenge too. The terminology gets kinda confusing, so bear with me… the tRNA for Sec is often “named” tRNA[Ser]Sec because it initially gets charged with serine (Ser, S – the alcohol version of Cys/Sec), but then that Ser gets converted into Sec once it’s attached to the tRNA (but not in the protein yet!). Don’t let the tRNA[Ser]Sec name fool you – the tRNA is specific to Sec – it has an anticodon complementary to UGA. So I’m just going to refer to it as tRNASec. But it gets charged with Serine by the same helper that puts Serine on serine tRNA, seryl-tRNA synthetase.
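Here’s a toy Python sketch of that recoding trick. It’s a big simplification and several things in it are my own assumptions for illustration: the `has_secis` flag stands in for the whole SECIS + accessory protein machinery, the codon table is a tiny subset, the example message is invented, and in real cells it’s a competition with release factors (so recoding isn’t all-or-nothing like it is here).

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Tiny, deliberately incomplete codon table, just enough for the example below.
CODON_TABLE = {"AUG": "Met", "UGU": "Cys", "CCU": "Pro", "UCU": "Ser"}


def translate(coding_region: str, has_secis: bool = False) -> list[str]:
    """Toy translation: UGA is read as Sec only if the mRNA carries a SECIS element."""
    peptide = []
    for i in range(0, len(coding_region) - 2, 3):
        codon = coding_region[i:i + 3]
        if codon == "UGA" and has_secis:
            peptide.append("Sec")   # Sec-tRNASec sneaks in; the ribosome keeps going
            continue
        if codon in STOP_CODONS:
            break                   # release factor binds; the chain gets cut off
        peptide.append(CODON_TABLE[codon])
    return peptide


message = "AUGUGUUGACCUUAA"  # hypothetical message: AUG UGU UGA CCU UAA

print(translate(message))                  # ['Met', 'Cys']
print(translate(message, has_secis=True))  # ['Met', 'Cys', 'Sec', 'Pro']
```

Same letters, different outcome: without the “secret message” the chain stops at UGA, and with it the chain reads through until the next stop word (UAA here).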
So, we take tRNASec and charge it with Serine (which, as we talked about before, requires spending ATP). So at this point we have Ser-tRNASec. The Ser to Sec conversion takes a couple of steps. First, its seryl group gets phosphorylated to phosphoseryl-tRNA (aka (Sep)-tRNASec) by O-phosphoseryl-tRNASec kinase (PSTK). Then, finally, you get to Sec by swapping out the phosphate for selenium with the help of O-phosphoseryl-tRNASec:selenocysteinyl-tRNASec synthase. That’s a mouthful of a name, isn’t it – thankfully it has a nickname, SepSecS. SepSecS gets the selenium it transfers from selenophosphate (SePO₃) and gets help from the co-factor pyridoxal phosphate (PLP) (yup – the very same one we’ve seen as a helper in transaminases). http://bit.ly/lysineanalysis
But the selenophosphate has to be “made fresh” because selenium is one of those elements that your body works hard to control – it’s always gotta be supervised (copper is another example of an element in need of chaperoning). So instead of just having free selenium floating around, our bodies actually store it in protein form. Selenoproteins typically have a single Sec residue, but selenoprotein P (SelP) has 10!, making it well suited for transporting selenium safely. SelP is home to 40-50% of selenium in blood plasma (the cell-less part of blood), and it can be broken down as needed. The final breakdown product from selenocysteine degradation by selenocysteine lyase is selenide (Se²⁻), which selenophosphate synthetase (SPS2) combines with ATP to make selenophosphate as needed. SepSecS uses it and you get your charged Sec-tRNASec.
All the other amino acids get along fine using the same elongation factor (chaperone to the ribosome) but tRNASec needs a special one, EFSec. And, even with this specialized escort, it needs more help, because it still has to convince the ribosome to add it. Another “trans-acting element” is there to help – SECIS-binding protein 2 (SBP2) helps EFSec and its Sec-tRNASec passenger bind to the SECIS loop, and it also interacts with the ribosome to help sneak Sec in when it’s time.
Selenium itself was first discovered in 1817 – by the Swedish chemist Jöns Jacob Berzelius who found it in a sulfuric acid factory. That place must have smelled awful, but he named the element selenium after the Greek moon goddess Selene. While selenocysteine itself isn’t essential in the dietary sense since our bodies can make it ourselves, the selenium part is – this makes Sec the only amino acid that requires an essential dietary micronutrient.
Selenocysteine in proteins was discovered in the 1970s by the late biochemist Thressa Stadtman, and much of what we know about selenium, selenocysteine, and selenoproteins is thanks to her. She was born in upstate New York in 1920. After graduating as valedictorian from her high school, she worked 4 hours a day as a waitress to supplement a scholarship to Cornell. She graduated with a BS in microbiology and, after trying industry work, realized that her passions aligned more with basic research, so she continued her studies at Cornell, obtaining a Master’s degree in bacteriology and nutrition. She went on to complete doctoral work at UC Berkeley, where she studied biochemical reactions in bacteria she found in local mud. Berkeley was also where she met her future husband, and fellow biochemist, Earl. When Thressa was offered a job at the NIH, Earl followed her there, where the two spent the rest of their long careers working, often in collaboration.
Thressa is best known for her discovery of selenocysteine, but she also made major contributions to knowledge about amino acid synthesis and B12-dependent enzymes. Her career spanned over half a century and involved the publishing of 212 peer-reviewed articles. Wanting to enable others to have similarly successful careers, she established scholarships for women majoring in science at Cornell, stipulating that, once women had achieved equity, the scholarship should be opened up to people in other marginalized groups. Thressa died in December 2016, but her legacy lives on, as does her name – her rigorous but supportive mentorship style is lovingly referred to as “The Stadtman Way,” there are numerous awards given in her honor, and she even has a microorganism named after her: Methanosphaera stadtmanae. https://bit.ly/3p9Lshs
how does it measure up?
systematic name: 2-Amino-3-selanylpropanoic acid
coded for by: UGA (with help)
chemical formula: C₃H₇NO₂Se
molar mass: 168.065 g·mol⁻¹
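Want to check that molar mass yourself? Here’s a quick Python sketch that adds up standard atomic masses for the formula above, with cysteine (same formula but S instead of Se) thrown in for comparison. The atomic masses are rounded standard values and the `molar_mass` helper is just my own illustration.

```python
# Rounded standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06, "Se": 78.971}

# Atom counts: the two amino acids differ by a single atom (Se vs. S).
SELENOCYSTEINE = {"C": 3, "H": 7, "N": 1, "O": 2, "Se": 1}
CYSTEINE = {"C": 3, "H": 7, "N": 1, "O": 2, "S": 1}


def molar_mass(formula: dict[str, int]) -> float:
    """Sum of (atomic mass x count) over every element in the formula."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


print(f"Sec: {molar_mass(SELENOCYSTEINE):.3f} g/mol")  # Sec: 168.065 g/mol
print(f"Cys: {molar_mass(CYSTEINE):.3f} g/mol")        # Cys: 121.154 g/mol
```

That one-atom swap accounts for the whole ~47 g/mol difference, which is part of why selenium is “heavy” enough to stand out in crystallography.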