Want a lot of protein made on demand? And you don’t just want it expressed – you want it over-expressed? Take a page out of the book of T7 phage! In recombinant protein over-expression in bacteria, I can stick the genetic instructions for a protein I’m interested in into a plasmid vector (like pET28) and get the bacteria to devote almost all their time & resources to making that one specific protein. I get to tell them exactly what protein I want made and, through IPTG induction of T7 polymerase, when I want it made.
The basic principle behind recombinant protein expression is that we can stick the genetic instructions for a protein we want made into cells from a different organism and they’ll make the protein for us. Yesterday, we looked at how we express proteins in insect cells – it’s a good option for more complex proteins that need more “human-like” machinery http://bit.ly/38APYwz
But since the genetic code is universal, “any” type of cell could theoretically serve as a custom protein-making factory. And insect cells are typically a “backup” option to the easiest, cheapest, and fastest recombinant expression system out there, the O.G. of the expression world: bacteria. And it’s not just their cheapness and fast growth that make bacteria a popular choice. With some clever phagey foolery we can use bacteria to make lots of proteins of our choosing on demand!
The “molecular biology revolution” of the mid/late 1900s enabled scientists to mix and match (recombine) pieces of DNA to better control & manipulate them. And this was a major breakthrough because
- 1) DNA holds the instructions for making proteins, in the form of genes: stretches of DNA that get copied into RNA (in a process called transcription) and edited to form mature messenger RNA (mRNA) “recipes” that get read by protein-making machinery called ribosomes (in a process called translation), and
- 2) DNA/RNA is a “universal language” meaning that all organisms can read it – 3 RNA-letter mRNA “words” called “codons” spell one of the 20 unique protein letters (amino acids), and the corresponding amino acid is brought to the ribosome by a transfer RNA (tRNA) with a complementary 3-letter “anticodon” -> the ribosome links it onto the growing chain and it’s onto the next letter
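To make the codon-reading idea concrete, here’s a minimal Python sketch of a “ribosome” walking along an mRNA three letters at a time. The codon table is deliberately partial (just a handful of entries from the standard genetic code), and the function name `translate` and the example sequence are made up for illustration. Real translation also involves finding the right start site, which this toy skips.

```python
# A few entries from the standard genetic code (a deliberately partial table).
CODON_TABLE = {
    "AUG": "M",                          # methionine, also the usual start codon
    "CAU": "H", "CAC": "H",              # histidine
    "GGU": "G", "GGC": "G",              # glycine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def translate(mrna):
    """Read an mRNA string codon by codon and return one-letter amino acids.

    Assumes the sequence starts in-frame and only uses codons in CODON_TABLE
    (anything else raises KeyError, since this table is incomplete).
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the ribosome lets go
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example message: Met-His-His-Gly followed by a stop codon.
print(translate("AUGCAUCACGGUUAA"))  # MHHG
```

The same lookup works no matter which organism is doing the reading, which is the whole reason a bacterium can make a protein from someone else’s gene.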
A gene’s “natural home” is in a chromosome, which is a really long strand of DNA that holds a lot of genes. We’re only interested in one (and we want the edited version – the DNA copy of the mature mRNA, which we call complementary DNA (cDNA)). So, using methods like “cutting and pasting” with restriction enzymes and DNA ligase or “copying and stapling” using PCR-based methods like SLIC, we can stick the cDNA for the protein we want made into a smaller piece of DNA that’s easier to work with and which has some special features. We call this carrier DNA a “vector” and for bacteria, the vector is usually a small circular piece of DNA called a plasmid.
A plasmid that I commonly use for bacterial expression is pET28a(+). pET stands for *p*lasmid *E*xpression vector under *T*7 control, and it’s a family of “phagemid” vectors that mix and match features of plasmids (small, circular pieces of DNA that can survive and get copied even though they’re “extrachromosomal”, meaning not part of the bacteria’s own chromosome) and bacteriophages or “phages” (viruses that infect bacteria). One such phage is T7, and, like other phages, it has a simple goal: reproduce!
T7 has a small, linear DNA genome (the complete genetic blueprint for making identical copies of itself). It coats itself in a protein shell when it travels, then latches onto a bacterium & injects its DNA into it. The bacterium copies this DNA & uses it to make the proteins the phage needs to coat itself & inject its DNA into other bacteria.
Phages have to convince the bacteria to make *their* proteins instead of the bacteria’s own proteins. And, for recombinant expression, this is what we want to do too! So we learn from (*steal from*) the masters. So how do they do it? It’s all about bypassing the holdups. There are 2 main “holdups” that can get in their way: 1) making mRNA copies of the gene (transcription) & 2) making protein from those mRNA copies (translation).
Transcription (DNA -> RNA) requires an RNA Polymerase (RNA Pol). Bacteria have their own, but it’s busy copying bacterial genes, so rather than rely on the bacterial RNA Pol, T7 makes its own RNA Pol (T7 Pol). And this one is specific for T7’s own promoter (the start site on the DNA that the Pol latches onto to start making recipe copies). So T7 gets this one all to itself – the bacteria can’t use it.
The T7 promoter tells T7 Pol where to start but how does it know where to stop? The T7 terminator! This is a sequence that, when copied, folds into a hairpin which causes the mRNA to fall off & frees T7 Pol to make more copies!
And you want to make LOTS of copies of the mRNA because this does have to compete w/bacterial mRNA for the bacteria’s protein-making machinery (ribosomes). But because T7’s so active & exclusive it can easily swamp out the bacterial mRNA.
Similarly, if we put a T7 promoter before our gene, a T7 terminator after it & give it some T7 Pol, we can get bacteria to overexpress our protein. With bacterial overexpression, you get the bacteria to devote almost all their resources to expressing your gene – after just a few hours over ½ of all protein in the cell could be yours
But, because the bacterial cells are devoting themselves to making our protein, they’re neglecting their own needs – including reproduction. The reason why bacteria are so useful in the lab (well, one of many reasons) is that their population booms rapidly, because it doesn’t take them long to copy all their DNA (DNA replication) then split in half, giving each new cell a copy. That takes a lot of energy and resources, which the bacteria don’t have if they’re devoting themselves to T7 protein-making. T7 doesn’t care about this, but we *do*, because we need to grow enough cells to express lots of our protein.
One way to do this is to just not give the cells T7 Pol – that special polymerase that makes the RNA copies (which ribosomes use to make proteins) of the T7 genes, or of anything that “looks” like a T7 gene because it’s under the control of a T7 promoter (like the gene we want to express). And in fact, if you look at a pET vector, you’ll see it does NOT have the T7 Pol gene. So how does our protein get made? Wasn’t the whole point of using the T7 promoter to make a lot of it?! Don’t worry – we still have the T7 Pol gene – we just keep it separate so we can activate it “on command”.
We rely on the bacterial host DNA and NOT the plasmid DNA to provide T7 Pol. Bacteria don’t normally have this gene (it’s from a virus that wants to sabotage it, remember), but specific strains of bacteria have been designed so they DO. If we’re still in the cloning phase & only want to make more copies of the plasmid ⭕️ -> ⭕️⭕️⭕️… we can stick the plasmid in bacteria that don’t have the T7 Pol gene (strains like DH5α). And then, when we want to express it, we stick it into bacteria that DO have it (like BL21(DE3))
BUT we still want more control – we want to be able to control when those bacteria that *have* the T7 Pol gene actually *make* T7 Pol. So we steal from another clever biological setup – the LAC OPERON, to be able to control *when* we express the protein
Bacteria use the Lac operon to control when they make the machinery for breaking down the sugar lactose. More here: http://bit.ly/2MxNPs2
They only want to make that machinery if there’s lactose present, so when there isn’t, a repressor protein (Lac repressor) sits on the Lac promoter (site where RNA Pol needs to bind) & “hides it”. Then, when lactose is available, some of that lactose gets converted to allolactose, which binds the repressor. This causes the repressor to change shape & fall off, freeing the promoter for RNA Pol binding.
If we stick a lac promoter in front of the T7 Pol gene & don’t give the bacteria lactose (they’d rather eat glucose anyway), that lac promoter will stay hidden, so no T7 Pol will be made. I say “no” but the promoter can “leak” if the repressor falls off on its own and RNA Pol sneaks in before it rebinds. So, for tighter control, we can put a second lac repressor binding site right at the T7 promoter in front of our gene as well, giving us a “T7lac” promoter.
When we add the allolactose mimic IPTG (Isopropyl β-D-1-thiogalactopyranoside), it binds the repressor ⏩ repressor falls off ⏩ bacteria make T7 Pol ⏩ T7 Pol binds the T7 promoter in front of our gene ⏩ T7 Pol copies the DNA into RNA until it reaches the T7 terminator & they come apart ⏩ it does this over & over 🔁 making lots of mRNA copies that swamp out the bacterial mRNA & outcompete it for the limited ribosomes ⏩ ribosomes make our protein from the mRNA instructions ⏩ we celebrate!
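If it helps to see that chain of “ifs” written down, here’s a toy Python model of the control logic. It’s only a sketch: it treats everything as on/off, so it ignores the promoter “leakiness” mentioned above, and the function and variable names are invented for illustration.

```python
# Whether each host strain carries the T7 Pol gene (per the strains named above).
HOST_HAS_T7_POL_GENE = {
    "DH5α": False,      # cloning strain: copies the plasmid, no T7 Pol gene
    "BL21(DE3)": True,  # expression strain: T7 Pol gene behind a lac promoter
}


def t7_pol_made(host_has_t7_pol_gene, iptg_added):
    """T7 Pol only appears if the host has the gene AND IPTG has pulled the
    Lac repressor off the lac promoter in front of it."""
    return host_has_t7_pol_gene and iptg_added


def target_gene_expressed(host_has_t7_pol_gene, iptg_added):
    """Our gene (behind a T7lac promoter) needs T7 Pol present and the
    repressor gone from its own promoter; IPTG handles both."""
    pol_available = t7_pol_made(host_has_t7_pol_gene, iptg_added)
    t7lac_promoter_free = iptg_added
    return pol_available and t7lac_promoter_free


for strain, has_gene in HOST_HAS_T7_POL_GENE.items():
    for iptg in (False, True):
        print(strain, "IPTG" if iptg else "no IPTG",
              "->", target_gene_expressed(has_gene, iptg))
```

Only one of the four combinations switches the factory on: a strain with the T7 Pol gene, plus IPTG.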
Well, sometimes we celebrate. But sometimes we’re not so cheery, because sometimes the cells make too much for them to handle, so they can’t fold our protein properly & the protein forms clumps of aggregates called inclusion bodies. Then, when we break open the cells (lyse them) to get out our protein and spin the lysate down (centrifuge it) to pellet out the insoluble stuff like membrane bits, we expect our protein to be in the liquid part, but it’s actually with a bunch of crud in the pellet. BUT, all hope’s not lost – we can try again & lower expression by reducing inducer concentration (add less IPTG) and/or growing at a lower temperature.
But sometimes that’s not enough to get you the protein you want. It’s easiest to explain recombinant protein expression in terms of bacterial expression systems, and a lot of proteins are expressed this way (probably most of them). But some proteins don’t express well (or at least they don’t survive the expression process well) in bacteria. Even though bacteria have all the protein-making machinery, they don’t have the same folding helpers and post-translational modifiers our cells do, so the proteins can misfold & clump up, or have different phosphorylation (added phosphates) & glycosylation (added sugar chains) patterns.
So for these trickier proteins we can express them in cells more like ours – mammalian cells are harder (but doable), but insect cells like Sf9 aren’t too bad. I express a lot of my proteins using those.
But when I can use bacteria, I do because it’s way cheaper & easier – and – when it works – you can get a lot more protein per liter. They have really simple growth conditions – they grow fastest at ~37°C, so we set the shaker incubator thermostat to this nice warm temp when we want them to grow and multiply lots. The shaking is important because it makes sure the cells stay aerated – each cell gets a chance to be closer to the oxygen, and CO₂ doesn’t build up – for proper aeration you need to leave a lot of empty space in the flask (like at least 3/4 of what the flask says it holds). I do small “starter cultures” overnight (50ml) so I can get a lot of cells to start with. Then I add some (usually ~5ml) to 1L portions of media in 4L flasks.
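Here’s the arithmetic behind those volumes as a small Python sketch, using the numbers from my setup above (1 L of media in a 4 L flask, ~5 ml of starter culture). The function names are just made up for this example.

```python
def headspace_fraction(culture_volume_l, flask_volume_l):
    """Fraction of the flask's nominal volume left empty for aeration."""
    return 1 - culture_volume_l / flask_volume_l


def dilution_factor(inoculum_ml, media_ml):
    """How many-fold the starter culture gets diluted into the fresh media."""
    return (inoculum_ml + media_ml) / inoculum_ml


# 1 L of media in a 4 L flask leaves 3/4 of the flask empty.
print(headspace_fraction(1.0, 4.0))  # 0.75

# 5 ml of starter into 1 L (1000 ml) of media is roughly a 200-fold dilution.
print(dilution_factor(5.0, 1000.0))  # 201.0
```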
Now I have to start monitoring their growth – I want them to grow enough that I get lots of cells (my “factories”) but I need to make sure each of these factories gets enough supplies & doesn’t have to compete with the others for resources. So I periodically check the OD600, which tells me how dense the culture is. The more cells there are, the harder it is for light to pass through, and we can measure this cloudiness as the “Optical Density”. It’s measured by a spectrophotometer that shines light (in this case light with a wavelength of 600nm) through a sample in a little square “tube” with clear walls called a cuvette and measures how much of the light makes it through.
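The number the spectrophotometer reports comes from the standard absorbance definition: the negative log (base 10) of the fraction of light that makes it through. Here’s a minimal Python sketch; the function name and the example intensities are hypothetical, and a real instrument also subtracts a “blank” reading of plain media, which this leaves out.

```python
import math


def optical_density(incident_light, transmitted_light):
    """Optical density = -log10(fraction of light that gets through)."""
    return -math.log10(transmitted_light / incident_light)


# If only a quarter of the 600 nm light makes it through the cuvette:
print(round(optical_density(100.0, 25.0), 2))  # 0.6

# If only a tenth makes it through:
print(round(optical_density(100.0, 10.0), 2))  # 1.0
```

Because it’s a log scale, going from OD 1 to OD 2 means ten times less light gets through, not half as much.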
What’s the optimal optical density for induction? It’s protein- and media-dependent. For LB (Lysogeny Broth) I normally aim for an OD600 of ~0.6-0.8. TB media is more nutrient rich, so it can support denser cell growth – I usually aim for an OD600 of ~1.4-1.8. Once I see it getting close, I move the flasks to the cold room and decrease the incubator temperature to 16 or 18°C.
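As a quick sketch of that decision, here’s a tiny Python helper that checks a reading against the windows I just gave for LB and TB. These ranges are only my usual targets (the right window depends on the protein), and the function name and status strings are invented for illustration.

```python
# My usual OD600 target windows per medium (protein-dependent in practice).
INDUCTION_WINDOWS = {
    "LB": (0.6, 0.8),
    "TB": (1.4, 1.8),
}


def induction_status(od600, medium):
    """Say whether a culture is below, inside, or above its target window."""
    low, high = INDUCTION_WINDOWS[medium]
    if od600 < low:
        return "keep growing"
    if od600 <= high:
        return "in window"
    return "past window"


print(induction_status(0.7, "LB"))  # in window
print(induction_status(0.7, "TB"))  # keep growing
```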
I typically add IPTG to 1mM but the optimal amount is protein-dependent once again. When I add IPTG, T7 Pol gets made. So my T7-promoter-controlled gene gets copied into mRNA. And then the ribosomes start making protein from it. I let them make protein overnight at that 18°C temp – at this lower temp protein making’s slower which gives proteins more time to fold the right way and hopefully prevent aggregation.
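How much IPTG is that in practice? It’s the usual C1·V1 = C2·V2 dilution math. Here’s a minimal Python sketch, assuming a 1 M IPTG stock (that stock concentration is my assumption for the example, so swap in whatever yours is) and ignoring the tiny volume the stock itself adds.

```python
def stock_volume_ml(final_mM, culture_ml, stock_mM):
    """Volume of stock to add so the culture ends up at final_mM.

    From C1 * V1 = C2 * V2, neglecting the added volume of the stock.
    """
    return final_mM * culture_ml / stock_mM


# 1 mM final in a 1 L (1000 ml) culture from an assumed 1 M (1000 mM) stock:
print(stock_volume_ml(final_mM=1.0, culture_ml=1000.0, stock_mM=1000.0))  # 1.0
```

So with those assumptions it works out to 1 ml of stock per liter of culture, and lowering the inducer concentration just means scaling that volume down.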
In the morning, I can “harvest” the cells by pouring the liquid holding them into bottles, centrifuging them (spinning really fast to pellet them out cuz they’re heavier than the liquid), re-suspending them in a bath of nice clean buffer (pH-stabilized salt water), then breaking them open (lysis) and purifying out my protein – which is made “easy” because I’ve used DNA Pol to help me redesign the gene to add a little tag onto the end that will specifically bind little beads (resin) in affinity chromatography.
For bacterially-expressed proteins, I like to use a His-SUMO-TEV tag. The “His” is 8 Histidine amino acids in a row – it binds to nickel-coated resin. http://bit.ly/2RRkUUE
SUMO is a little protein that’s there as a “fusion partner” to help the protein stay soluble and stuff while it’s getting made. http://bit.ly/2SDBlUn
And by “TEV” I mean a recognition site for an endoprotease (“site-specific protein scissors”) TEV which allows me to cut the tag off when I don’t need it anymore. http://bit.ly/2P8HINE