We now often take the idea of “genetic diseases” for granted, but even once scientists knew that hereditary info was stored and passed on in the form of DNA, it still wasn’t clear how “faulty” DNA could lead to any sort of symptoms. Sickle cell anemia (SCA) would provide the answer – by showing that a single DNA letter swap could lead to a single protein letter swap that enfeebled a protein, Pauling, Ingram, and colleagues showed the direct link between the language of nucleic acids (DNA & RNA) and the language of proteins, validating the existence of a “genetic code.”
It started in 1949, when Linus Pauling showed that the oxygen-carrying protein hemoglobin in the blood of patients with SCA differed from that of healthy people at the molecular level – it was less negatively-charged. And then in the late 1950s, Ingram pinpointed a single protein letter difference between them. Might not seem like that big of a deal – but it is a big, ginormous, colossal deal – it was the first demonstration that a single, inheritable, genetic difference in DNA letters could cause an isolated, specific, difference in protein letters – a change that could have profound functional consequences. Their papers describing their work are things of beauty and I feel they deserve more attention – so I’m gonna give them some!
Sickle cell anemia (SCA) as a disease has gotten a lot of attention – for a lot of years. In this blood disorder, red blood cells tasked with carrying oxygen throughout the body adopt moon-like sickle shapes and clog blood vessels, leading to painful “crises” and potentially organ damage. Long before people knew about DNA, they knew about SCA – and it was pretty clear to see that the disease was hereditary – it ran in families. With the invention of microscopes, etc., they could gather more clues as to what was going on…
When they looked at an SCA patient’s blood under the microscope it looked weird. Especially under conditions of low oxygen, SCA blood cells would sickle up, and if they looked even closer, they could see clumpy stuff inside those cells – they were looking at (but didn’t yet know) mutated hemoglobin.
Almost all of the protein in red blood cells (RBCs) is hemoglobin. Mature red blood cells, even normal ones, are always “weird” – unlike the rest of your cells, they don’t have DNA – immature ones do, but they ditch it to focus on oxygen carrying. And for that they don’t need DNA, but they do need hemoglobin. And lots of it. The “heme” in the name “hemoglobin” comes from the fact that “globin” proteins bind to a ring-y molecule called heme, which holds onto an iron atom, and that iron atom is able to hold onto oxygen. Under conditions of high oxygen (like in your lungs), hemoglobin binds to oxygen strongly and then, under low-oxygen conditions (like in your toe), it lets go – and then goes back to your lungs’ “oxygen store” to pick up more.
In normal RBCs, the hemoglobin molecules float around on their own, but in SCA patient cells they would clump together into aggregates and long chains (polymers), especially under low-oxygen conditions. Not very conducive for blood flow, let alone oxygen transport… But what was causing this? And how could this protein error be passed on?
Pauling decided to figure out what was different about the proteins using technology available, including the newly-developed method of electrophoresis. Electrophoresis uses electric fields to make molecules move due to the whole opposites attract thing. If you have a negatively-charged molecule and you generate an electric gradient from negative to positive charge, the molecule will move towards the positive charge. The more opposite the charges, the more motivated it will be to move. So you can use electrophoresis to separate molecules with different charges – even if they have the same size.
Walk into any modern biochemistry or molecular biology lab and you’re likely to see electrophoresis taking place – in gel form – for example, we use agarose gels to separate DNA & SDS-PAGE gels to separate proteins (yesterday I ran my 512th one since I started counting….) In those cases, we just use charge as a motivating force, with the real separating power coming from the gel, whose mesh-like pores slow down bigger things more, allowing molecules to separate by size. more here: http://bit.ly/2Ik1g0s
But, there are other formats of electrophoresis, and if you don’t have a gel with tiny holes holding things up, you can separate based on differences in charge instead of size.
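To put rough numbers on the “same size, different charge” idea, here’s a minimal sketch. It assumes a very simplified picture that isn’t from the original experiments: the molecule is a little sphere, the electric pull on it is balanced by Stokes drag from the liquid, and everything else (counter-ions, buffer effects, shape) is ignored. The function name, the radius, the charges, and the field strength are all made up for illustration.

```python
import math

ELEMENTARY_CHARGE = 1.602e-19  # coulombs per unit charge

def drift_velocity(net_charge, radius_m, field_v_per_m, viscosity_pa_s=1.0e-3):
    """Steady drift speed (m/s) where the electric pull balances Stokes drag.

    Positive values mean drifting toward the negative electrode,
    negative values mean drifting toward the positive electrode.
    Simplified model: ignores counter-ion screening and molecular shape.
    """
    force = net_charge * ELEMENTARY_CHARGE * field_v_per_m
    drag_coefficient = 6 * math.pi * viscosity_pa_s * radius_m
    return force / drag_coefficient

# Two hypothetical molecules of identical size but different net charge
same_radius = 3e-9   # metres (illustrative)
field = 1000.0       # volts per metre (illustrative)
more_negative = drift_velocity(-10, same_radius, field)
less_negative = drift_velocity(-8, same_radius, field)
```

Both values come out negative (heading for the positive electrode), but the more negatively-charged molecule moves faster, so the two pull apart even though they’re the same size.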
For DNA or RNA this wouldn’t be that helpful, because the only charged part of those guys is their generic sugar-phosphate backbone. The part that differentiates the letters (i.e. A vs T vs C vs G) is the unique nucleobases (“bases”) which are neutral. So even if you had 2 DNA pieces of different sequences but the same length, they would have the same charge, so travel together. more on nucleic acids here: http://bit.ly/35Yspx6
BUT – with proteins, now you’re talking! Proteins are made up of letters called amino acids. There are 20 (common) ones, and they have a generic peptide backbone they link through and unique side chains (aka R groups) that stick off like charms on a charm bracelet. Unlike DNA & RNA, their backbone is neutral – but some of the side chains sometimes are not! more on amino acids here: http://bit.ly/37Aym43
There are 5 amino acids that are sometimes charged. I say sometimes because it’s pH-dependent. pH refers to acidity, which refers to the concentration of protons (H⁺). pH is an inverse log scale, so when the pH is low, it means there are a lot of protons around – and some molecules will act as “bases” and take one – which can make something neutral positive or something negative neutral. Conversely, when the pH is high, it means there are fewer protons around, so some molecules will act as “acids” and donate a proton, becoming negative or neutral in the process.
Different molecules have different willingnesses to give and take protons, and the pKa is the pH at which half of a molecule will be protonated – the lower the pKa, the stronger the acid. There are 5 amino acids that have proton-giving/taking groups that are commonly charged at bodily pH (~7.4). Aspartic acid (Asp, D) & Glutamic acid (Glu, E) have carboxylic acid (-(C=O)-OH) groups with low pKas (much lower than bodily pH) so they’re usually in their deprotonated, negatively-charged, states (which we refer to as aspartate & glutamate). On the positively-charged side of things protein-wise you have Lysine (Lys, K), Arginine (Arg, R), & Histidine (His, H), which have nitrogens that will grab onto protons, picking up their + charge with them. more here: http://bit.ly/2NuWkGJ
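If you want to see the pH/pKa relationship in numbers, here’s a small sketch using the Henderson-Hasselbalch relationship. The pKa values are approximate textbook side-chain values that I’m assuming for illustration (real values shift depending on where the residue sits in a protein), and the function names are just made up for this example.

```python
# Approximate textbook side-chain pKa values (assumed; they vary with environment)
SIDE_CHAIN_PKA = {"Asp": 3.9, "Glu": 4.1, "His": 6.0, "Lys": 10.5, "Arg": 12.5}
ACIDIC = {"Asp", "Glu"}  # neutral when protonated, negative when deprotonated

def fraction_deprotonated(pH, pKa):
    """Henderson-Hasselbalch: fraction of a group that has given up its proton."""
    return 1.0 / (1.0 + 10 ** (pKa - pH))

def average_side_chain_charge(residue, pH=7.4):
    """Average charge of one side chain across a big population of molecules."""
    f = fraction_deprotonated(pH, SIDE_CHAIN_PKA[residue])
    if residue in ACIDIC:
        return -f          # deprotonated form carries the -1
    return 1.0 - f         # protonated form carries the +1

for name in SIDE_CHAIN_PKA:
    print(name, round(average_side_chain_charge(name), 3))
```

With these assumed pKas, Asp & Glu come out very close to -1 and Lys & Arg very close to +1 at pH 7.4, while His is only slightly positive on average, which is why it’s the “sometimes” one.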
Different proteins have different combinations of amino acid letters – including different numbers and positions of charged ones. So different proteins will (usually) have different overall charges, so if you run electrophoresis on them without messing with the charge, they’ll run at different speeds, and thus separate (don’t get confused with SDS-PAGE where the SDS detergent coats all the proteins with a uniform negative charge to avoid the charge difference “issue” which we *want* here!)
So, Pauling applied electrophoresis to normal and sickled hemoglobin. Instead of a gel format, he used this weird U-shaped thing called a Tiselius apparatus which has a buffer-filled u-shaped tube you put your sample in, then apply a charge gradient and things in the sample will move “up or down” to get to the charge they like. And you can use optical scanning to figure out how much stuff is where on the tube’s different arms.
When he used this, he found that the normal version traveled further towards the positive charge, indicating that it was more negatively-charged. He didn’t know where the change was, just that there was one – or at least it really seemed like there was (he couldn’t rule out the possibility that the SCA protein was just folded up differently so that the “missing charge” was hidden). Furthermore, he showed that patients who were carriers of the SCA mutation but didn’t have full-on SCA because they still had a normal version able to pick up the slack, had about 1/2 (60%) normal hemoglobin and about 1/2 (40%) of the less-negatively-charged hemoglobin.
And the difference in charge appeared to be a difference of about 2 to 4 “charges” – since they knew negative charges on proteins come from the carboxylate groups of Glu or Asp, they surmised that the SCA hemoglobin had fewer of these.
We now know that hemoglobin is made up of 4 subunits (it’s a tetramer). The main form of adult hemoglobin is hemoglobin A, which has 2 copies of the β subunit and 2 copies of the α subunit. It’s the β subunit that’s mutated in SCA. And each hemoglobin A has 2 copies of this, so a single charge difference in each copy would make sense – at the time, they knew based on x-ray crystal structures that there were “2 identical halves,” which were later revealed to be dimers of α and β.
Pauling’s work was a major breakthrough – as he writes in his classic-but-not-classic-enough Science paper, “Sickle Cell Anemia, a Molecular Disease” http://bit.ly/36r5slV
“This investigation reveals, therefore, a clear case of a change produced in a protein molecule by an allelic change in a single gene involved in synthesis.”
But what was that change in protein? Ingram took things further, trying to pinpoint the exact difference. He used “protein scissors” (protein enzymes (reaction mediator/speed-uppers) called proteases) to cut the hemoglobin into smaller pieces and carried out electrophoresis on these peptide pieces.
Tiselius’ electrophoretic contraption was a masterpiece of an accomplishment, and it won Arne Tiselius the 1948 Nobel Prize in Chemistry, but it wasn’t exactly user-friendly, and it didn’t give clean separation – instead you got broad bands corresponding to “moving boundaries” of regions with different components in your sample. So instead of using that finicky Tiselius thing, Ingram used wet paper sandwiched between 2 pieces of glass with oppositely-charged electrodes at either end.
He didn’t want the pieces to be too small or too big, but thankfully the enzyme trypsin was well-matched for his “partial digestion” task. This protease cuts next to the amino acids lysine (Lys, K) & arginine (Arg, R), which are pretty well spaced out in hemoglobin, giving him a couple dozen peptide pieces. But a lot of the pieces had similar charges and thus didn’t separate well with electrophoresis. So he added a second separation step – paper chromatography.
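Here’s a minimal sketch of what a trypsin digest does, done “in silico.” Assumptions to flag: the sequence is the first 30 amino acids of the mature human β-globin chain as listed in modern sequence databases (not something available at the time, and not quoted in the papers discussed here), the cutting rule is simplified to “cut after every K or R” (ignoring real-world exceptions like cuts blocked by a following proline), and the function name is just for illustration.

```python
# First 30 residues of mature human beta-globin (from modern sequence records)
BETA_GLOBIN_START = "VHLTPEEKSAVTALWGKVNVDEVGGEALGR"

def trypsin_digest(sequence):
    """Split a protein after every Lys (K) or Arg (R). Simplified cutting rule."""
    peptides = []
    current = ""
    for residue in sequence:
        current += residue
        if residue in "KR":       # trypsin cuts on the C-terminal side of K/R
            peptides.append(current)
            current = ""
    if current:                    # whatever is left after the last cut site
        peptides.append(current)
    return peptides

peptides = trypsin_digest(BETA_GLOBIN_START)
print(peptides)
```

Each piece ends in a K or an R, and because those letters are spread out, you get a handful of medium-sized peptides instead of confetti or one giant chunk.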
Electrophoresis exploits differences in molecules’ charge – chromatography exploits differences in a molecule’s desire to hang out with a movable phase (like a liquid solvent wicking through a piece of paper) compared to a stationary phase (like the fibers in that piece of paper). The more a molecule likes the liquid, the faster/further it will travel with the liquid without getting sidetracked by the paper. Molecules can have the same charge but different solubilities in different solvents, and thus, if you take that band of peptides that ran together in electrophoresis and subject them to chromatography, you can get them to separate.
So that’s what Ingram did (and named this technique “peptide fingerprinting”). After running paper electrophoresis in one direction (e.g. horizontally), he let the paper dry and then ran chromatography in the perpendicular direction (e.g. vertically). He did the exact same thing with normal and sickle hemoglobin digests. And then he compared the spots.
Well, first he had to make them visible. To do this he used a chemical called ninhydrin, which reacts with amine (nitrogen-hydrogen) groups. This staining method had an added advantage – all the peptides will stain because they have an amiNe group at their N-terminus. But some side chains have additional amine groups, so they’ll react more strongly and thus give a darker color which gives a sort of qualitative idea about what’s in them and makes dot to dot comparing easier.
When he did this, only one peptide moved differently than in the normal. Now he “just” had to figure out why – what was different between them?
He isolated this peptide from the normal and SCA proteins, cutting out their spots and extracting them from the paper. And then he further characterized them. This time he split them all the way into individual letters via acid hydrolysis and figured out what letters were in each by doing some more chromatography. The normal peptide 4 had one more Glu & 1 fewer Valine (Val, V) than the SCA peptide 4. Later, with advances in technology & methodology, he and other groups would sequence these peptides (and those of the entire protein) to figure out the exact order of the amino acids (an order that, thanks in part to these experiments, we now know is specified by the order of DNA letters in the protein’s gene!)
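The composition comparison can be sketched in a few lines. Assumptions to flag: the two 8-letter sequences are the normal and sickle versions of the β-globin tryptic peptide as known from later sequencing (Ingram only had the letter counts at this stage, not the order), the charge tally is a crude side-chain-only count (D/E = -1, K/R = +1, with His and the peptide ends ignored), and the function names are made up for illustration.

```python
from collections import Counter

NORMAL_PEPTIDE = "VHLTPEEK"   # normal beta-globin tryptic peptide (later sequencing)
SICKLE_PEPTIDE = "VHLTPVEK"   # same peptide with the Glu -> Val swap

def composition_difference(a, b):
    """Return (letters a has extra of, letters b has extra of), ignoring order."""
    count_a, count_b = Counter(a), Counter(b)
    return dict(count_a - count_b), dict(count_b - count_a)

def rough_net_charge(peptide):
    """Crude side-chain charge near neutral pH: D/E = -1, K/R = +1, rest = 0."""
    return sum(-1 if r in "DE" else 1 if r in "KR" else 0 for r in peptide)

extra_in_normal, extra_in_sickle = composition_difference(NORMAL_PEPTIDE, SICKLE_PEPTIDE)
charge_shift_per_chain = rough_net_charge(SICKLE_PEPTIDE) - rough_net_charge(NORMAL_PEPTIDE)
charge_shift_per_tetramer = 2 * charge_shift_per_chain  # 2 beta chains per hemoglobin

print(extra_in_normal, extra_in_sickle, charge_shift_per_tetramer)
```

The normal peptide has one extra E and the sickle peptide has one extra V, and losing one negative charge per β chain adds up to about 2 fewer negative charges per hemoglobin, which sits at the low end of Pauling’s estimate of 2 to 4.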
But Ingram’s original “peptide 4 is the only difference” experiment didn’t rule out the possibility of there being additional mutations. Because, while peptide 4 was the only difference in the peptides he analyzed, there was a whole part of the protein he hadn’t been able to compare at the nitty gritty level – when he did the trypsin digestion there was a big part of the protein that was resistant to cutting. This part was dark-colored, so he suspected it contained the heme group (in addition to binding iron, heme’s “porphyrin” rings absorb light such that they (and your blood) appear colored).
To cut it up, he turned to a second protease, chymotrypsin, which has different cutting preferences – it likes to cut next to aromatic amino acids like Tyrosine (Tyr, Y), Phenylalanine (Phe, F), and Tryptophan (Trp, W). When he cut the trypsin-resistant core with chymotrypsin and did his fingerprinting, he couldn’t detect any differences between the peptides from normal or sickle hemoglobin, further pointing to the existence of a single change.
But was it the same change in all the patients? All the time? All of this was comparing 2 samples, right? Well, he did the same thing with blood from multiple patients and got the same results, further cementing the relationship.
In his 1957 Nature paper, he writes a beautiful declaration of success: “While there may also be changes in folding, it has now been definitely established that the amino-acid sequences of the two proteins differ, and differ at only one point. Thus it can be seen that an alteration in a Mendelian gene causes an alteration in the amino-acid sequence of the corresponding polypeptide chain. In the case of sickle cell anemia hemoglobin, this is the smallest alteration possible – only one amino-acid is affected – reflecting, presumably, a change in a very small portion of the hemoglobin gene. It is not known, but it may well be that this involves a replacement of no more than a single base-pair in the chain of the deoxyribonucleic acid of the gene.” https://go.nature.com/2NY6YFA
Electrophoresis became a useful clinical tool for diagnosing SCA. But there are other forms of “hemoglobinopathies” (blood disorders characterized by defective hemoglobin proteins) as well as “thalassemias” (blood disorders characterized by not enough hemoglobin proteins being made). There are over 1000 identified β-globin mutations – some mutations do not cause charge differences, so they won’t get detected by electrophoresis alone.
Doctors these days often use HPLC for diagnosing various hemoglobin-related disorders. HPLC stands for High-Performance Liquid Chromatography. It’s a form of chromatography which, instead of using paper as a stationary phase, uses an “analytical cartridge” – a column containing different types of materials. Different cartridges have different properties, which can be used to separate peptides based on their different properties. So, for example, by using a negatively-charged silica cartridge, gradually increasing the salt concentration, and seeing how much salt is required to compete off the stuck peptide, you can separate peptides based on charge without having to use electricity.
HPLC can also tell you about the relative amounts of different hemoglobin subunits based on their different characteristic “retention times” (corresponding to where in the salt gradient they come out of (elute from) the column & pass through a detector). You know how I said the main adult form of hemoglobin, HbA, has 2 α & 2 β chains? (making it ααββ) Well, there’s also δ & γ chains that can take β’s place. The γ chain is found in fetal hemoglobin (HbF), instead of the β chain (so ααγγ). It has a higher affinity for oxygen, making it great for fetuses who need to eke out the small amounts of oxygen from their moms’ “spent” blood, but a bit of an overkill for babies after they can breathe on their own, so it stops being made shortly after birth (<1% of normal adult Hb). But you still have the genetic instructions for making it.
And this whole post was in part inspired by the post I did on a clinical trial using gene editing to get the blood of patients with SCA or β thalassemia to make this γ chain to compensate for problems with their β chain. And I was wondering how they could tell if it was working. These chains run differently on HPLC, so doctors/scientists can tell what proportion of hemoglobin in a patient’s blood is what form. If the treatment’s working, they’ll start to see higher proportions of γ chain compared to β chain, hence the encouraging news I heard delivered in those news segments. more here: http://bit.ly/33foda8
Wondering about the δ subunit? You know I was! Turns out that the main form of normal adult Hb, HbA (ααββ) makes up >96% of a healthy person’s Hb, with the δ-containing HbA2 (ααδδ) only about 2.5%. Higher levels of it indicate β thalassemia – if less β globin is made, δ steps in, but it’s less stable and isn’t made as much normally.