You’ve gotta make a mess to get proteins clean! I do a ton of these “cleanings” (protein purifications) because as a protein biochemist, I’m a “molecular mechanic” of sorts – I study how proteins work (or don’t work). And, thanks to RECOMBINANT PROTEIN EXPRESSION I can “tweak” the genetic recipes for them (using site-directed mutagenesis), stick those recipes into cells to make them for me, purify them out using PROTEIN CHROMATOGRAPHY and test them to see if the tweaks tweaked the tested thing.
Since I can make lots of tweaks, I can make lots of different modified versions or “constructs” of the same protein. But testing them requires me to have lots of pure protein, so I do a LOT of protein purification.
Today I did the first steps of purifying 2 such proteins (it took me a long time to work up the courage & confidence to do multiple at once – especially since the proteins are just really similar versions of one another so you can’t easily see if you swapped them! careful labeling is key!) So I thought I’d walk you all through what the purification process is like. A major caveat is that there’s no real “typical” protein purification because it varies based on the protein’s preferences.
Each protein’s different, so the process has to get tweaked & optimized for different proteins. Here I’m just going to take you through the process of producing and purifying a simple, cytoplasmic protein (i.e. a water-soluble protein that lives in the general “main part” of cells & isn’t embedded in a membrane) through recombinant expression (if these words aren’t familiar, don’t worry, I’ll explain all these terms).
We can break up the workflow into 3 main parts:
- MOLECULAR CLONING – this is where we take the gene* (DNA instructions for making the protein) from its original home and put it into a VECTOR (a manipulatable piece of DNA that can serve as a “vehicle” for getting those instructions into cells).
- EXPRESSION – this is where we have the bacteria make the protein for us (we can also express trickier proteins in insect cells or mammalian cells – I express most of my proteins in insect cells, but bacteria are simpler and simpler to explain!)
- PURIFICATION – this is where we break open the cells & isolate the protein we want from all the other stuff. It usually involves several types of PROTEIN CHROMATOGRAPHY, where we flow our sample through columns filled with special little beads (resin) that separate proteins based on how their different properties (charge, size, etc.) influence how they interact with those beads.
Each of these main parts has multiple sub-parts. I don’t have time/space to go into them all in detail here, so I’m going to provide links to more detailed posts. But here’s a bit more detail…
MOLECULAR CLONING: The word “cloning” might evoke thoughts of Dolly the cloned sheep, but that’s definitely not the sort of thing I do! Instead of copying whole organisms, I’m just copying genetic instructions for single proteins and sticking them into vectors to work with. Since we’re recombining pieces of DNA (the protein instructions and the vector) we call this recombinant DNA and the protein we make from it will be called recombinant protein.
*instead of the gene, we’re actually putting in an “edited” version of the gene called cDNA (short for complementary DNA). This is a DNA copy of the mature mRNA copy of the gene. Basically, a gene is a stretch of DNA in a chromosome which contains instructions for making some sort of product, such as a protein or a functional RNA. To actually make a protein based off of those genetic instructions, the cell first makes RNA copies of them, and then removes the non-coding interrupting regions (introns) in a process called splicing. With some further processing (e.g. capping & tailing) you get mature messenger RNA (mRNA), which is what the protein-making machinery (ribosomes) uses as instructions for making proteins. For recombinant protein expression, we want to give cells these edited versions, but we need them to be in DNA form so that we can recombine them with our DNA vector. So we use cDNA.
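To make the “edited version” idea concrete, here’s a tiny Python sketch of splicing followed by copying the mRNA back into DNA. The sequence and exon coordinates are made up for illustration, and the function names are my own. Real cells do this with a lot more machinery.

```python
# Toy illustration only: the sequence and exon coordinates below are made up.
COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}  # RNA base -> paired DNA base


def splice(pre_mrna, exons):
    """Join the exon slices (0-based, end-exclusive), dropping the introns between them."""
    return "".join(pre_mrna[start:end] for start, end in exons)


def first_strand_cdna(mrna):
    """DNA strand complementary to the mRNA, written 5'->3' (its reverse complement)."""
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


def coding_strand(mrna):
    """DNA version of the mRNA sequence itself (U replaced by T)."""
    return mrna.replace("U", "T")


pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"  # exon 1 + intron + exon 2
exons = [(0, 6), (18, 24)]                       # hypothetical exon positions

mrna = splice(pre_mrna, exons)
print(mrna)                     # intron removed
print(first_strand_cdna(mrna))  # what reverse transcription would copy
print(coding_strand(mrna))      # the strand that reads like the mRNA
```

The double-stranded cDNA you clone into a vector has both of those DNA strands paired up.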
For expression in bacteria we typically use plasmid vectors. Plasmids are small, manipulatable, circular pieces of DNA. They usually come from bacteria or from bacteria-infecting viruses called phages, and bacteria can host them as “extrachromosomal DNA” – basically the bacterium still has its own DNA but it lets these circular pieces hang out inside and get replicated and passed on to future cells. http://bit.ly/bacoverexpression
For expression in insect cells (which is what I typically do), we use something called a baculovirus expression vector system (BEVS). It starts similarly to the bacteria way, but instead of cloning our cDNA into a normal plasmid, we clone it into a bacmid. A bacmid is a plasmid that can replicate in both bacteria and insect cells, and it can cause insect cells to make baculovirus (a coated virus containing that bacmid). So we can grow it up in bacteria, getting the bacteria to make lots and lots of copies of the bacmid, and then we can purify that bacmid and stick it into insect cells. Those insect cells will make and secrete baculovirus from the bacmid instructions, and that baculovirus can then be collected & used to infect large quantities of insect cells, which will then start making the protein you want. http://bit.ly/bevsinsect
For expression in mammalian cells, you can use various modified vectors including viral vectors, which use harmless viruses, such as some modified adenoviruses, to get the cDNA inside of cells. https://bit.ly/adenoviralvectors http://bit.ly/transfectionmethods
So, you’ve chosen your molecular cloning strategy, you’ve expressed your protein, and you now have liters of liquid with a ton of “random” stuff and a little bit of your protein. Now what? It’s harvest time!
HARVESTING – this is where you remove the liquid media (cell food) from the cells after they’ve made your protein, but leave the cells intact. You do this by pouring the cell-filled media from the flasks you’ve grown them in into bottles and centrifuging them (spinning them really fast) – the cells are heavy so they sink to the bottom, forming a pellet. Then you can pour off the liquid (supernatant) and resuspend the cells in a (much smaller amount of) cleaner liquid – add your resuspension buffer and vortex and/or pipet up and down a lot to get the cells un-clumped and evenly distributed. Now flash freeze these in liquid nitrogen & store them in the -80C freezer until you’re ready for the next step.
LYSIS – this is where we actually break open the cells – we thaw the pellet, sometimes add lots of salt to disrupt the membranes, and ultrasonicate them to use waves of energy to break up the DNA so it doesn’t gunk stuff up. Insect cells are more fragile and easier to lyse, but bacteria are hardy (they have cell walls that are resistant to your efforts to break them) so with bacteria I usually use multiple freeze-thaw cycles with ultrasonications in between. https://bit.ly/ultrasonicshearing
ULTRACENTRIFUGATION – this is where we separate the membrane pieces from the soluble stuff. It uses much higher speeds than the centrifugation we did before (I usually spin them at 35 thousand rpm!). We need these higher speeds because the membrane pieces are much lighter than the whole cells were. But the membrane pieces are still heavier than the dissolved proteins so the membrane bits pellet while your protein remains in the liquid part (supernatant). So here you want to keep that supernatant.
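A quick aside on “rpm”: the force your sample actually feels depends on the rotor’s radius as well as the speed, which is why protocols often quote relative centrifugal force (RCF, in multiples of g) instead. Here’s a rough sketch of the standard conversion in Python. The 8 cm radius is a made-up example value, not my actual rotor, so look up the real number for yours.

```python
def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (multiples of g) at a given rotor radius.

    Uses the standard conversion RCF = 1.118e-5 * r(cm) * rpm^2.
    """
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Inverse conversion: the speed needed to reach a target RCF."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


ASSUMED_RADIUS_CM = 8.0  # hypothetical rotor radius; check your rotor's manual

print(round(rcf_from_rpm(35_000, ASSUMED_RADIUS_CM)))
```

Because the force scales with the square of the speed, doubling the rpm quadruples the g-force, which is part of why ultracentrifuges can pellet things ordinary centrifuges can’t.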
At this point, we call the supernatant the LYSATE. And it has your protein, but also a lot of other proteins. You can see this if you run an SDS-PAGE gel, which separates proteins by their size. more here: http://bit.ly/2GZc3tG
As the purification proceeds, if you take samples of your work-in-progress and run them on a gel, you should see all those extra bands start to disappear since you’re removing those unwanted proteins.
And the way you remove those unwanted proteins is with PROTEIN CHROMATOGRAPHY. More here: http://bit.ly/30LklxG
The basic idea with this is that you can flow a solution containing the protein through columns filled with little beads (resin) and the proteins will get separated based on how they interact with the resin on their journey. Different proteins have different properties, so they will travel faster or slower (in SEC) or “stick” or “not stick” (in affinity chromatography). You can use different resins to separate based on different properties, and you can flow different buffers (pH-stabilized salt waters) through the column to selectively unstick stuck ones based on how stuck they are.
I start with an AFFINITY CHROMATOGRAPHY step. This uses resin that recognizes something really specific – usually an affinity tag we’ve added onto the end of the protein (by putting the genetic instructions for it before or after the instructions for our gene in the plasmid). Because affinity chromatography is recognizing something “unnatural” and highly specific, it can (hopefully) remove most contaminating proteins – but not all.
A common affinity tag is a His tag, which is just 6 or 8 Histidines (a specific amino acid), which will bind to a Nickel (Ni) or Cobalt (Co) coated column (Immobilized Metal Affinity Chromatography; IMAC). Other proteins have Histidines too. But not that many in a row, so your protein will bind preferentially. It’ll hog the column and the other proteins will flow through. http://bit.ly/histidineimac
Then you need something that will outcompete the His tag to get your protein off. Bring on the imidazole. It looks like His, so you can flood the column with imidazole to push the His-tagged proteins off. But before you flood it, you wash it with low levels of imidazole to remove non-specific binders that are just binding cuz they happen to have a lot of Hises.
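If you want to see the “not that many in a row” point for yourself, here’s a small Python sketch that finds the longest stretch of consecutive histidines (H in one-letter code) in a protein sequence. Both sequences are invented for illustration, and the longest run is only a crude proxy, since real binding also depends on which histidines are exposed on the folded protein’s surface.

```python
import re


def longest_his_run(seq):
    """Length of the longest stretch of consecutive histidines (H) in a sequence."""
    runs = re.findall(r"H+", seq.upper())
    return max((len(run) for run in runs), default=0)


# Made-up sequences for illustration only.
tagged = "M" + "H" * 6 + "SSGLVPRGS" + "MKTAYIAKQR"  # 6xHis tag + linker + protein
untagged = "MKHTAYHHAKQRHLH"                         # scattered histidines

for name, seq in [("tagged", tagged), ("untagged", untagged)]:
    print(name, longest_his_run(seq))
```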
His-tags work well for bacterially-expressed proteins, but the proteins I’m purifying today were expressed in insect cells. Insect cells and cells from other eukaryotes (things with membrane-bound rooms inside their cells to house their DNA, etc. – so basically most stuff except bacteria) have a lot of proteins that naturally have a lot of Hises. So instead of the His tag I’m using a “Strep-tag.” more here: http://bit.ly/streptag
But basically it mimics biotin & binds to a column that mimics streptavidin – the biotin/streptavidin interaction is one of the strongest non-covalent (non-electron-sharing) interactions known so these mimics are weakened a little so the protein doesn’t get stuck to the column forever & we can elute it off with another biotin mimic, desthiobiotin. Bacteria naturally have a lot of biotinylated stuff so strep tags don’t work as well for them, but this isn’t as much of a problem with insect cells.
Once you’ve eluted your protein (gotten it to come off the column) you have the option to cut off that tag using an endoprotease (protein scissors) that recognizes a sequence between the tag & the start of the protein. After you’ve removed the tag there’s nothing “artificially super specific” about your protein, so you now have to exploit natural differences between your protein and any remaining contaminating proteins (which now includes that protease you added).
The first other property we’ll exploit is charge. Proteins have different charges because they’re made up of different combinations of amino acid letters, some of which are charged.
We’ll take advantage of this using Ion Exchange Chromatography (IEX), where we bind proteins to resins that are oppositely-charged. Ions are charged things and basically you “exchange” ions from salts (like the Na⁺ or Cl⁻ of NaCl (table salt)) with protein ions. Then you can gradually increase the salt concentrations so that those salt ions outcompete the protein and you get another exchange. Or you can change the pH to change the protein’s overall charge (the lower the pH, the more free H⁺ for the protein to latch onto -> become more positive & vice versa) http://bit.ly/ionexchangechromatography
In cation exchange chromatography you have negatively charged resin & you’re binding & exchanging positively-charged (cationic) proteins & salt ions.
Anion exchange chromatography is the opposite – you have positively charged resin & you’re binding & exchanging negatively-charged (anionic) proteins & salt ions.
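Here’s a rough Python sketch of how pH shifts a protein’s overall charge, and which type of ion exchange that points you toward. It treats every charged group independently using the Henderson-Hasselbalch relationship, and the pKa values are approximate textbook-style numbers I’m assuming for illustration (real ones shift depending on where the residue sits in the folded protein). The sequences are made up too, so treat the output as a ballpark, not a prediction.

```python
# Approximate, textbook-style pKa values (assumptions; real values shift with
# the residue's local environment, so the result is only a rough guide).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0


def net_charge(seq, ph):
    """Estimate net charge at a given pH by summing each group's fractional charge."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))   # free amino end (positive)
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))  # free carboxyl end (negative)
    for residue in seq.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def suggest_iex(seq, ph):
    """Positive proteins bind negatively charged (cation exchange) resin, and vice versa."""
    return "cation exchange" if net_charge(seq, ph) > 0 else "anion exchange"


example = "MKHDEYCRKKDE"  # made-up sequence
for ph in (4.0, 7.0, 10.0):
    print(ph, round(net_charge(example, ph), 2), suggest_iex(example, ph))
```

The thing to notice is the trend: as the pH goes up, the estimated charge goes down, so the same protein can be a candidate for cation exchange at low pH and anion exchange at high pH.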
At this point your protein is hopefully pretty darn pure. But proteins can have similar charges, so there are likely still small levels of lingerers. They might have similar charges but chances are (hopefully) they’ll have different sizes. So next we can use Size Exclusion Chromatography (SEC) (aka gel filtration) to separate the remaining proteins by their size. In this type of chromatography the resin’s “boring” to the proteins so the proteins don’t interact with it. But the resin’s also “bored” in the sense that it has “secret tunnels” “bored” into it. The tunnels have different diameters so proteins have to be small enough to fit in order to go through them. So the smaller the protein, the more tunnels it will go through and the longer it will take to go through the column. In this way, the proteins get separated by size, with bigger things coming out sooner, smaller things later.
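To put rough numbers on “bigger things come out sooner,” here’s a toy Python model. It uses the standard relationship that a protein’s elution volume equals the void volume (the liquid outside the beads) plus whatever fraction of the in-bead volume it can access. How I estimate that fraction (a straight line in log of molecular weight between two cutoffs) is a simplifying assumption, and the column volumes and cutoffs are invented, so calibrate with real standards for your own column.

```python
import math


def accessible_fraction(mw_kda, low_kda=10.0, high_kda=600.0):
    """Toy estimate of the fraction of the tunnels a protein can explore.

    1.0 at or below low_kda, 0.0 at or above high_kda, and linear in
    log10(molecular weight) in between. Both cutoffs are hypothetical.
    """
    span = math.log10(high_kda) - math.log10(low_kda)
    frac = (math.log10(high_kda) - math.log10(mw_kda)) / span
    return min(1.0, max(0.0, frac))


def elution_volume(mw_kda, void_ml=8.0, total_ml=24.0):
    """Elution volume = void volume + accessible fraction * (total - void)."""
    return void_ml + accessible_fraction(mw_kda) * (total_ml - void_ml)


# Made-up molecular weights in kDa, printed from earliest to latest eluting.
for mw in sorted([1000, 150, 45, 15, 5], reverse=True):
    print(mw, "kDa ->", round(elution_volume(mw), 1), "mL")
```

Anything too big to enter any tunnel comes out together at the void volume, which is why aggregates show up as an early peak.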
Often, SEC is considered a “polishing step” because your protein’s usually mostly pure going in. But even if it doesn’t have enough contaminating proteins to cause problems, it’s likely still swimming in a lot of salt. So another reason SEC is useful is that it acts as a buffer exchanger: when your protein comes out, it’s in the buffer you’ve been running through the column.
Now you (hopefully) have really pure protein! You should see one nice band on your SDS-PAGE gel. http://bit.ly/sdspageruler
For long-term storage of the protein, you want to keep it at -80C, usually with a cryoprotectant like glycerol to prevent harmful ice crystals from forming. Proteins don’t like ice forming and they also don’t like being frozen and woken up and frozen and woken up and… You want to avoid multiple freeze-thaws, so you freeze “single-use” aliquots. This is probably my least favorite part of the purification process because if you have a lot of protein it can be really tedious.
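The aliquoting math is simple but easy to fumble at the end of a long day, so here’s a little Python sketch of it. It works out how much glycerol stock to add to hit a target final percentage, what that does to the protein concentration, and how many tubes to label. The 50% stock, 10% final, and all the volumes are example numbers I picked, and it assumes volumes simply add up when mixed.

```python
def glycerol_stock_to_add(protein_ul, stock_pct, final_pct):
    """Volume of glycerol stock (uL) to add so the mix ends up at final_pct glycerol.

    Solves stock_pct * v = final_pct * (protein_ul + v) for v, assuming the
    protein sample starts with no glycerol and that volumes are additive.
    """
    if not 0 <= final_pct < stock_pct:
        raise ValueError("final_pct must be at least 0 and below stock_pct")
    return protein_ul * final_pct / (stock_pct - final_pct)


def plan_aliquots(protein_ul, protein_mg_per_ml, aliquot_ul,
                  stock_pct=50.0, final_pct=10.0):
    """Summarize glycerol to add, diluted concentration, and number of full tubes."""
    glycerol_ul = glycerol_stock_to_add(protein_ul, stock_pct, final_pct)
    total_ul = protein_ul + glycerol_ul
    n_tubes, leftover_ul = divmod(total_ul, aliquot_ul)
    return {
        "glycerol_stock_ul": glycerol_ul,
        "final_mg_per_ml": protein_mg_per_ml * protein_ul / total_ul,
        "tubes": int(n_tubes),
        "leftover_ul": leftover_ul,
    }


# Example numbers only: 900 uL of 2 mg/mL protein, frozen as 50 uL aliquots.
plan = plan_aliquots(protein_ul=900.0, protein_mg_per_ml=2.0, aliquot_ul=50.0)
print(plan)
```

Remember to write the diluted concentration on the tubes, not the pre-glycerol one.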
My hand was getting super sore from all that tiny writing on the top of the tube (I like to label the tubes with the construct number, concentration, & date), so I got some printable cryodot labels. That helped, but now I keep finding random cryodots on my clothes in the laundry! And the labels tend to unstick leaving unlabeled tubes so I’ve gone back to cramped handwriting.
After aliquoting, we flash freeze the aliquots in liquid nitrogen (flash freezing gets the water in and around them to freeze in place without time to link up into ice crystals) more here: http://bit.ly/flashfreezingdance
Now you’re ready to test their activity!