If you want to study a protein, you want to manipulate it as much as possible, so you want to be able to manipulate its genetic instructions as much as possible. RECOMBINANT PROTEIN EXPRESSION is where we use MOLECULAR CLONING to take a gene* from one place and (most commonly) stick it into a small circular piece of DNA called a PLASMID VECTOR** -> stick that into expression cells to use the gene’s instructions to make that protein -> get lots of protein you can purify & study.
Before you can get that plasmid into the cells you choose, you have to get your gene into that plasmid. The “classic” way to do this is the “cut & paste” with RESTRICTION CLONING. Bacteria have DNA-specific “scissors” called RESTRICTION ENZYMES (aka restriction endonucleases, or REases) that recognize & cut specific “code words” (RESTRICTION SITES aka recognition sequences) written in DNA, which serve as “dotted lines.” With RESTRICTION CLONING, we use these DNA “scissors” to cut an “insert piece” with the gene we want to stick in and a “vector piece” with the vector we want to stick it in with the same pairs of scissors so they have complementary cuts. And then we purify the matching pieces and mix them together, adding a “stitcher” called DNA ligase to seal them up tight.
I use a different method, SEQUENCE & LIGATION INDEPENDENT CLONING (SLIC). With SLIC we use the Polymerase Chain Reaction (PCR) to make lots of copies of (amplify) “insert pieces” & “plasmid pieces.” When we do this copying we add on extra matching bits to the ends of the pieces. We then let an enzyme (reaction mediator/speed-upper) chew 1 strand of each of these ends back a little to leave sticky parts. And then we put them into bacteria, which piece them together and fix the damage. Similar PCR-based methods include Gibson Assembly & Golden Gate Assembly.
I want to tell you about these methods, both the theory and some practical stuff, but first let’s clear up those asterisked terms. For people who just want the gist, that summary should hopefully suffice to skip past these notes, but for people who are actually studying and/or using these techniques, I think it’s important to understand the nuances in some of these terms because they can get confusing and confused.
*for proteins, a gene is actually kinda like a “pre-recipe” – proteins are actually made from RNA copies of the DNA recipe. These RNA copies are called messenger RNA (mRNA) and they’re edited (and temporary) copies of the gene – editing involves cutting out non-coding regions (introns) and stitching together the “expressed” regions (exons) in a process called RNA splicing. Our cells do this (and can do it in multiple ways to give you alternative splice products (splice isoforms) – kinda like purposefully skipping a step in a triple-decker cake recipe in order to make a double-decker cake). But bacterial cells don’t – and even if they did, they wouldn’t know which splice isoform to make to know what cake to bake! So, instead of inserting the gene like it occurs in our DNA, we insert a version of the gene that is complementary to the edited form we want – the mature mRNA for the specific isoform of interest. We call this complementary DNA (cDNA). To try to avoid confusion, I’ll use the term “insert” to refer to the cDNA we put in (this is actually a more relevant term anyway because you can clone in *any* DNA – it doesn’t have to be cDNA unless you want to make protein from it).
**your vector doesn’t have to be a plasmid (and sorry in advance for going back-and-forth between “plasmid” & “vector”) – a vector is a “vehicle” for transporting your gene into the cells (analogously to how mosquitos can be vectors for malaria, “driving” the malaria parasite into you when they go to suck your blood). Instead of mosquitos, we use things like plasmids, “artificial chromosomes,” and weakened viruses such as adenoviruses (a kind of cold-causer). What kind of vector you use depends largely on the type of cells or organisms you want to get the DNA into and how much DNA you want to stick in. For example, artificial chromosomes can hold a lot more DNA than plasmids. And “bacmids” are a sort of modified plasmid that can live and replicate (copy itself) in bacteria *and* insect cells.
Plasmids are circular pieces of DNA that bacteria will host alongside their own DNA. They occur naturally in bacteria, and some lab vectors also borrow parts from viruses that infect bacteria (bacteriophages or “phages”). Because they’re specialized for living in bacterial cells, they’re great for getting the DNA we want to live in bacterial cells; for example, they have a bacterial “Origin of Replication” (ORI), which tells the bacterial DNA polymerase to make a copy of the plasmid before the cell divides so you don’t dilute it out. Another benefit of plasmids is that a circle has no free ends, so it’s protected from DNA-end-chewers (exonucleases) and, as we’ll see, the circular format makes copying easy. Note: when talking about plasmids, we often think of the vector as the “generic part” of the plasmid, and when you add in your gene you get a unique “plasmid,” but I tend to use the terms interchangeably, so apologies in advance for any confusion!
We can modify plasmids to live in other kinds of cells, such as by putting in an insect ORI so the plasmid can replicate in insect cells, but in both cases you’re dealing with “naked” DNA (it doesn’t have any sort of coat or anything). Why care? Getting DNA into cells is called “transformation” (although we commonly call it transfection when referring to getting DNA into animal cells so that it doesn’t get confused with tumor transformation). We can use methods such as heat shock, electroporation, and cationic carriers to get “naked” DNA into cells in a dish. But it’s harder to get DNA into animals, so weakened viral vectors are often used for that, taking advantage of the virus’ “knowledge” of how to use its packaging to get in.
I’m going to focus on molecular cloning of plasmids, because that’s probably the most common (and easiest to visualize and stuff). But one of the beautiful biochemical things about molecular cloning is that you can use the same sort of techniques to stick DNA “anywhere.” – so let’s talk about those techniques.
As I mentioned before the sidetrackedness, there are numerous options including variations & mixtures of:
🔹restriction-enzyme-based: cut & paste
🔹 PCR-based methods: copy just the parts you want and staple together
regardless of what method you choose you’ll need 2 things:
1️⃣ plasmid vector you want to stick your gene into (destination)
2️⃣ something containing the gene or “insert” you want to stick into that vector – these days, this insert is usually already inserted into a different plasmid vector just not the one you want so what you need to do is SUBCLONE it -> move it from one plasmid to another instead of “traditional” cloning where you’d be moving it from its original location (such as chromosomes inside human cells)
For SLIC we always use PCR (our DNA-amplifying technique) to get the insert, and we often use it to get the insert for restriction cloning as well. You *can* just cut the DNA you’re interested in out directly from where it is, but then you need a lot more of the original DNA than you do if you make copies of it first. Another benefit of PCR-based insert production is that it allows you to introduce extra letters to serve as “overhangs” – and you can make these overhangs contain cut sites for restriction enzymes, so you don’t need to rely on the DNA you want happening to have cut sites right next to it (though if you order a cDNA clone where someone has already stuck the DNA you want into a sort of generic plasmid, that plasmid often has MULTIPLE CLONING SITES (MCSs) or POLYLINKERS – synthetic sequences designed to have cut sites for lots of different restriction enzymes so that you can choose one that’s also present flanking your gene).
So let’s talk a bit about how PCR works. Starting with how DNA “works.” There are 4 DNA letters – A, G, C, & T. They each have generic deoxyribose sugar-phosphate backbone and then the different letters have different “nitrogenous bases” or “nucleobases” which stick off from the generic part. The generic backbone allows letters to link up in any order to form strands – but the unique bases make between-strand pairing more picky. Because of how their atoms (individual oxygens (O’s), nitrogens (N’s), hydrogens (H’s), etc.) are situated, A & T can form weak, reversible, “hydrogen bonds” (H-bonds) with one another and similarly for C & G. So, we can say A is complementary to T and G is complementary to C.
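To make the pairing rules concrete, here’s a minimal Python sketch (the function names are just mine, for illustration). It relies on one detail I haven’t spelled out above: the 2 strands of DNA run in opposite directions, so to write the partner strand in the usual 5’-to-3’ direction you complement each letter and then reverse the whole thing.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Swap each base for its pairing partner, keeping the same left-to-right order."""
    return "".join(PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the partner strand written 5'->3' (complement, then reverse)."""
    return complement(seq)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```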
That allows for sequence specificity and it comes in handy a lot – both in your cells and in the lab – you can use one strand as a template for making the other, and you can get short “complementary” stretches to specifically bind specific regions of interest. In PCR, we use short complementary DNA pieces called primers to bookend regions of DNA we want copied – those primers bind (one per strand) and we use DNA Polymerase (DNA Pol) to “write” the complementary strand based on the template they’re on, resulting in a copy being made from each template each cycle. Do a lot of cycles and you get an exponential increase in copies. More here: http://bit.ly/pcrtrain
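Here’s a rough sketch of what “exponential” means in numbers. It assumes idealized PCR; the `efficiency` parameter is my own addition (1.0 means every template gets copied every cycle, and real reactions fall short of that and eventually plateau).

```python
def pcr_copies(start_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` rounds of PCR.

    efficiency = 1.0 is perfect doubling; lower values mean only that
    fraction of templates gets copied in each cycle (a simplification).
    """
    return start_copies * (1 + efficiency) ** cycles


for n in (1, 10, 30):
    print(n, "cycles ->", pcr_copies(1, n), "copies per starting molecule")
```

With perfect doubling, 10 cycles turns 1 starting molecule into 1024 and 30 cycles gives roughly a billion, which is why you can start from so little template.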
The primers basically tell DNA Polymerase (DNA Pol) where to start laying down nucleotide “tracks.” If you put some extra DNA letters at the start of those primers, those letters will get stuck onto the thing you’re copying (kinda like adding a generic letterhead & footer) – and if those letters you add complement letters on a different piece of DNA they can stick together. The letters of DNA like to bind their complementary letters, regardless of where they come from – it can be the second strand of the same “insert piece” or the opposite strand of the “vector piece.” That’s what you take advantage of with SLIC. With restriction cloning, instead of putting a bit of the vector sequence on the ends of the insert, you can put in restriction enzyme cut sites.
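As a sketch of the “extra letters at the start of the primer” idea, here’s how you might build primers that tack restriction sites onto a PCR product. Everything here is illustrative: the template sequence is made up, `tailed_primers` is a hypothetical helper, and the choice of EcoRI (GAATTC) and XhoI (CTCGAG) sites is just an example. Real primers usually also get a few spare bases in front of the site so the enzyme has room to grab on, and you’d check melting temperatures, neither of which this sketch does.

```python
def reverse_complement(seq: str) -> str:
    """Partner strand written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tailed_primers(template: str, tail_fwd: str, tail_rev: str, anneal_len: int = 18):
    """Build a primer pair whose 5' tails add extra letters to each end.

    The 3' part of each primer matches the template (so it can bind);
    the 5' tail does not bind at first but gets copied into the product.
    """
    fwd = tail_fwd + template[:anneal_len]
    rev = tail_rev + reverse_complement(template[-anneal_len:])
    return fwd, rev


# Made-up insert sequence, for illustration only.
TEMPLATE = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTTAA"

fwd, rev = tailed_primers(TEMPLATE, tail_fwd="GAATTC", tail_rev="CTCGAG")
print("forward:", fwd)
print("reverse:", rev)
```

For SLIC the same helper would work if you passed vector sequence as the tails instead of cut sites.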
Now we’re getting more into the details, so I’m going to focus on one method at a time. I will start with restriction cloning because it’s, at least historically, the most common.
experimental overview: Take the DNA where it currently is (such as in a plasmid or from a PCR reaction) ⏩ add restriction enzyme(s) (and a buffer containing salts, pH stabilizers, Mg2+, etc. to keep the enzyme happy) ⏩ heat it up to give the enzymes energy to work & give it time to cut ⏩ purify the pieces ⏩ mix them together ⏩ add ligase to stitch them up ⏩ stick them in bacteria ⏩ bacteria host it and make protein from it
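‘nother nerdy nomenclature note: the enzymes are “numbered” not “lettered” (e.g. EcoRV isn’t an all-electric RV model, it’s EcoR FIVE (learned this the embarrassing way, true story)). The name tells you where the enzyme came from: “Eco” for E. coli, then a strain designation (the “R” in EcoRI comes from the “RY13” strain), then a Roman numeral for the order in which enzymes from that source were named.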
The restriction enzymes we use for cloning are usually of the “IIP subtype” – the sequences they recognize are usually fairly short (4-6 basepairs (bp) long) PALINDROMES (think kayak, racecar…). Since DNA’s 2 strands complement each other, this “palindromnicity?” means that both strands of the DNA have the cut site. So, usually working in pairs of identical copies (homodimers), the enzyme cleaves all the way through the DNA (both strands) instead of just “nicking” it (cutting a single strand).
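A DNA “palindrome” is a bit different from a word palindrome: the site reads the same on both strands when each is read 5’-to-3’, which means it equals its own reverse complement. A small sketch to check that (the helper names are mine):

```python
def reverse_complement(seq: str) -> str:
    """Partner strand written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_palindromic(site: str) -> bool:
    """True if the site reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)


print(is_palindromic("GAATTC"))  # EcoRI site -> True
print(is_palindromic("GATATC"))  # EcoRV site -> True
print(is_palindromic("GAATTA"))  # not a palindrome -> False
```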
When they cut, they can make “staggered cuts” that result in STICKY ENDS (2-4 unpaired nucleotides “overhanging” on each end, useful if you want to then stick it to something else…) or BLUNT ENDS (cut straight across – no overhangs).
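Whether you get sticky or blunt ends depends on where inside the site the enzyme cuts. Here’s a sketch that works it out for a palindromic site, assuming the bottom strand is cut at the mirror-image position of the top strand (true for these symmetric cutters). The function is hypothetical; the cut positions used are the standard ones for EcoRI (G^AATTC), EcoRV (GAT^ATC) and PstI (CTGCA^G).

```python
def describe_cut(site: str, top_cut: int):
    """Classify the ends left by a palindromic cutter.

    `top_cut` is how many bases of the site sit before the cut on the top
    strand (e.g. 1 for G^AATTC). Because the site is palindromic, the
    bottom strand is cut at the mirror-image position.
    """
    bottom_cut = len(site) - top_cut
    lo, hi = sorted((top_cut, bottom_cut))
    overhang = site[lo:hi]  # bases left unpaired at each end
    if not overhang:
        kind = "blunt"
    elif top_cut < bottom_cut:
        kind = "5' overhang"
    else:
        kind = "3' overhang"
    return kind, overhang


for name, site, top_cut in [("EcoRI", "GAATTC", 1),
                            ("EcoRV", "GATATC", 3),
                            ("PstI", "CTGCAG", 5)]:
    print(name, describe_cut(site, top_cut))
```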
If you cut 2 things with the same 2 enzymes (and they have cut sites in the same orientation) you can remix and match them. So, for example, you can cut a plasmid vector and your gene of interest with the same enzymes -> generates matching sticky ends -> purify the pieces and mix them together. Some REases have different recognition sequences but make the same overhang, so their cut ends are “compatible” (kinda like one recognizes “ace” and one recognizes “racecar” but both cut after the ac, leaving you with the same overhang (ac e and rac ecar)). You want to make sure you’re using “unique cutters” so you only get the right pieces. If a cutter cuts in multiple places you’ll get multiple matching pieces. You can use a free web tool such as NEBcutter to find potential conflicts of cutting.
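A quick way to sanity-check for “unique cutters” is to count how many times each site shows up in your vector and your insert. This is a bare-bones sketch (dedicated tools do far more); `count_sites` is a hypothetical helper, and it only searches one strand, which is enough for palindromic sites because they read the same on both strands.

```python
def count_sites(seq: str, site: str, circular: bool = False) -> int:
    """Count occurrences of a recognition site, including overlapping ones.

    For a circular plasmid the sequence is extended by the first
    len(site) - 1 bases so a site spanning the arbitrary start/end
    of the written sequence is not missed.
    """
    seq, site = seq.upper(), site.upper()
    if circular:
        seq += seq[:len(site) - 1]
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


# Toy sequences, for illustration only.
print(count_sites("AAGAATTCAAGAATTCAA", "GAATTC"))           # 2 -> not a unique cutter here
print(count_sites("TTCAAAAGAA", "GAATTC", circular=True))    # 1 -> site spans the "join"
```

For cloning you’d want the count to be exactly 1 in the vector (at the spot you intend to open up) and 0 inside the insert itself.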
If you’ve cut with a blunt end cutter, any 2 pieces can match, but the orientation might “flip,” so if possible we use sticky ends. I say “if possible” because this only works if the “dotted lines” are there – for vectors designed for this sort of thing, this is less of a problem because they’re often designed with “multiple cloning sites” (MCSs) that have several options to choose from.
A downside of restriction cloning as compared to SLIC is that, since you aren’t making copies of the vector, you have to start with a lot of parent plasmid – and each time you make an insert piece you make a “TheRest” piece. So you have to separate the insert from the rest so they don’t just bind back together.
One thing you have going for you is that the 2 strands might “stick” together on their own because of basepair complementarity, but bases in (what you want to be) the same strand can’t “sew themselves” together. When DNA Pol links up DNA letters it does so using nucleoside triphosphates – the forms of the letters with 3 phosphate groups. Those groups are negatively-charged and like charges repel, so holding 3 of them in a row is like clamping a stiff spring, and “letting go” by breaking phosphate-phosphate bonds releases energy – and this can be used to offset the energetic cost of letter linkage (it’s expensive because molecules like to have freedom to run around & don’t like to be tied down). So when DNA Pol links together nucleoside triphosphates, it breaks off 2 of the phosphates to pay, leaving you with just a monophosphate in the backbone. Thus, when DNA gets cut, there’s only a monophosphate at the end, so the DNA can’t pay its own way. A different enzyme, DNA ligase, has to come to the rescue and uses energy from ATP to stitch the gap. http://bit.ly/3a2ojWm
We often use T4 DNA ligase for this. note of caution – this is DIFFERENT from T4 DNA Polymerase which, as we’ll see, we use for SLIC – and different from T4 polynucleotide kinase we use for radiolabeling – so check the tube labels carefully! The “only” thing they have in common (other than being sold mainly by NEB) is that they come from the T4 phage – a bacteria-infecting virus that has a rich history in biochemistry.
But back to our story of binding back up – speaking of “binding back” there’s a type of back-binding we don’t want – self-circularization – where the plasmid ends stick to themselves instead of the other piece. This is important to watch out for if you’re using the same enzyme to cut on both sides (if you’re using 2 different cutters you don’t need to worry). To prevent it, you can add a phosphatase (phosphate remover) like calf intestinal phosphatase (CIP) or antarctic phosphatase (AP). The DNA ligase doesn’t need a *tri*phosphate, but it does need a phosphate – in order to do the strand-stitching, the ligase needs the 5’ ends to be phosphorylated. If you take that phosphate away, the ligase cannot play!
But you still need it to play with your insert! So that insert needs to provide a 5’ phosphate (if your insert but not the vector has phosphates you’ll be left with 2 nicks, since you can only make 2 “stitches” – but the bacteria can fix them once you stick it in).
Note: If you’re using PCR-generated pieces there are a couple of situations – one is that you make a lot of copies and then you cut the copies. In this case, end-wise, it’s just like you’d cut it out directly. But if you’re using PCR-generated pieces without cutting them (e.g. you used primers that “act as overhangs”), those primers usually are NOT phosphorylated, so you have to phosphorylate them, such as with T4 Polynucleotide Kinase.
A lot of the time when we’re cloning we’re doing “subcloning” where we’re moving a gene from one plasmid to another (like moving it from the generic plasmid it came in when we ordered it) and putting it into a plasmid that’s ideal for protein-making. And we’re often swapping out a gene that was in the new one so our “TheRest” pieces are big – too big to use PCR purification kits which can remove small things like primers. more on those here: http://bit.ly/2yO5BBt
So usually you purify them with gel purification – you start by running an agarose gel to separate the DNA pieces by size (you should do this in any case just to check that the pieces look ok and it all got cut). When you’re going to purify it out of the gel you add it all (not just a little bit to look) and you use a wide comb so you have plenty of room to cut around. And you also can use low melting point agarose, which makes it easier to get your DNA out. more here: http://bit.ly/2ASXipa
Agarose gel electrophoresis separates DNA pieces by their length (DNA’s naturally negative, so you can use positive charge to motivate it through a gel mesh made of the sugar agarose – longer DNA pieces will get slowed down more because they’ll get tangled up in the mesh more (think of trying to drag a jumprope through a net) so they’ll travel slower). more here: http://bit.ly/2lPCUR8
You use a DNA-binding stain that glows (fluoresces) under UV light, so you use UV to tell where your band of interest is – cut out that chunk of gel – extract out the DNA (basically chemically melt away the surrounding mesh) and purify it. You purify the vector and the insert & mix them (usually at a ~1:2-3 ratio of vector to insert, but it might take some trial-and-error-ing) and add ligase.
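Those ratios are conventionally counted in molecules rather than in mass (I’m assuming that’s what the ~1:2-3 figure means), so you have to correct for the fact that a short insert has more molecules per nanogram than a long vector. A rough sketch of the arithmetic, with a made-up example and a hypothetical function name:

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   insert_to_vector: float = 3.0) -> float:
    """Mass of insert needed for a given insert:vector molar ratio.

    Mass scales with length for double-stranded DNA, so
    ng insert = ng vector * (insert length / vector length) * molar ratio.
    """
    return vector_ng * insert_bp * insert_to_vector / vector_bp


# Example numbers are invented: 50 ng of a 5000 bp vector, 1000 bp insert.
print(insert_mass_ng(50, 5000, 1000, insert_to_vector=3))  # 30.0 ng
print(insert_mass_ng(50, 5000, 1000, insert_to_vector=2))  # 20.0 ng
```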
After you give it time to work, you stick it into bacteria. As noted in the beginning, we call this transformation and a common way we do this is “heat shock” where we mix the DNA we want to put in with chemically-weakened bacteria in a tube on ice and then briefly dunk it in a warm water bath then stick it back on ice. The heat shock opens up pores in the bacteria so the DNA can rush in. more here: http://bit.ly/2Jj7L47
After you let it recover a bit you plate it on an agar plate in your classic Petri dish. Agar is related to but different from agarose and it makes a nice solid gel matrix to hold bacteria food. We only want it to be a nice “B&B” for bacteria that have our plasmid (not all the cells will have taken it in and other bacteria could have snuck in at various times) so we need a way to prevent the others from staying here. Usually we do this using antibiotic selection. We use a plasmid that has an antibiotic resistance gene and then we spike the food with that antibiotic so that only bacteria with the plasmid can survive. more here: http://bit.ly/2tcW4ky
Since the vector has the antibiotic resistance gene you’re using for selection, even if it doesn’t have your gene, it’s important to make sure that there’s none of the original or the cut-but-re-self-sealed. So you can run a couple of controls.
In 1 control you just transform the cut vector alone – if you get colonies on this plate it indicates that not all of the vector got cut (only the circularized vector can survive & replicate inside of bacteria)
If you had one of those cases where you had to dephosphorylate, you do a second control. in control 2 you transform the cut vector + ligase (but NO insert) -> if you get colonies on this plate, but not the first it indicates that you got self-circularization, suggesting you didn’t dephosphorylate it sufficiently so the ligase *could* play
If you get colonies on both of those it could be a mix of both or just the non-cuttedness or it could be contamination or something.
You’ll probably have a few colonies on the control plates but you should have way more on the “real plate” (the one with vector + insert + ligase). You then take a few of those colonies and grow them in some liquid broth, isolate their plasmid DNA and check to make sure it actually has your insert – you can do a quick check with colony PCR or an analytical digest, then verify with sequencing. more here: http://bit.ly/2TGAvo5
I used such restriction cloning in undergrad but I’ve now switched to SLIC because it’s a lot more versatile – and the piece-purifying is a lot easier. With PCR you get lots of copies of your insert and vector pieces and no copies of “TheRest” – so you just have to purify out the primers. Here are some more details about this method
The basic idea with SLIC is that we design the insert piece to have bits of the vector piece at the ends, so that when DNA Pol starts copying our gene, it adds on a bit of the vector at the beginning (kinda like adding a few words from the page you want to come before it)- specifically it adds the part of the plasmid that’s flanking where your gene will go. The PCR reaction gives us double-stranded DNA, so the complementarity is “hidden” by the second strand. You therefore have to chew back one of the strands a bit (with an exonuclease) to generate single-stranded “sticky ends” – only the end is chewed & this is the part that matches the vector, so it exposes vector-matching single-stranded DNA that can stick to the vector DNA.
The exonuclease chewing is much less precise than the restriction enzyme cleavage, so you’ll get overhangs of different lengths and when you combine them & they stick together, they’ll leave gaps – but this is ok because the bacteria can fill them in.
But in order for the bacteria to fill it in properly, they need the right template sequence – the original gene doesn’t “know” the sequence of the plasmid – so if you chew off some of the sequence you lose those instructions & the bacteria don’t know how to fill it in -> BUT if you add enough of the plasmid sequence to the ends, that’s what gets chewed back – that part will match the other piece so you’ll get sticking, & your gene will be there to provide complete template info. So you design primers so that 1: the ends match when you chew them back & 2: they’re long enough that you don’t lose the “unique” information (you want your primers to have ~20bp overlap between the end of thing 1 & the beginning of thing 2).
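Here’s a sketch of what that primer design looks like in code, under some loud assumptions: the sequences are made up, `slic_insert_primers` is a hypothetical helper (not a tool our lab uses), and the 20-base annealing length is just a placeholder since real designs also check things like melting temperature.

```python
def reverse_complement(seq: str) -> str:
    """Partner strand written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def slic_insert_primers(upstream: str, insert: str, downstream: str,
                        overlap: int = 20, anneal: int = 20):
    """Design hybrid vector-end/insert-start primers for the insert PCR.

    `upstream` and `downstream` are the vector sequences flanking the spot
    where the insert should go (written 5'->3' on the top strand).
    Each primer = ~20 bases of vector (the tail) + ~20 bases of insert.
    """
    fwd = upstream[-overlap:] + insert[:anneal]
    rev = reverse_complement(insert[-anneal:] + downstream[:overlap])
    return fwd, rev


# Made-up sequences, for illustration only.
UPSTREAM = "TTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGC"
INSERT = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTTAA"
DOWNSTREAM = "GGATCCGAATTCGAGCTCCGTCGACAAGCTTGCGG"

fwd, rev = slic_insert_primers(UPSTREAM, INSERT, DOWNSTREAM)
product = UPSTREAM[-20:] + INSERT + DOWNSTREAM[:20]  # what the PCR should give
print("forward primer:", fwd)
print("reverse primer:", rev)
print("product ends match vector:",
      product.startswith(UPSTREAM[-20:]) and product.endswith(DOWNSTREAM[:20]))
```

The point of the design is visible in `product`: the copied insert now starts and ends with 20 letters of vector, so those are the letters that get chewed back and exposed.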
Our lab uses 2 major “go-to” expression systems – bacteria & insect cells, & they use different vectors (a plasmid for the bacteria and a bacmid for the insect cells). We can’t get around having to do 2 sets of molecular cloning (since we need to stick it into 2 different vectors), but we can save some time with clever primer design. We’ve adapted the plasmids so that they have the same sequences flanking where we want the gene to go, so we can use the same “plasmid bit” overhangs & the same primers to amplify either plasmid.
We use vector-specific (but specific to the generic part) primers to amplify either vector & we use vector-insert chimera primers to amplify the insert. So we only have to design 1 set (of 2) SLIC primers (the hybrid vectorend-insertstart ones) per clone. We use these to get the “gene piece” that we can put in either one. To get the different plasmid vectors, we use the same “vector primers” but a different plasmid template. Most of the vector’s different but that primer binding site is all that matters here, and it’s the same. Even with this time-savingness, they still add up to lots and lots of tubes & boxes. But most of the SLIC-ing I do is actually site-directed mutagenesis. My gene’s already where I want it, I just want to change the sequence. More on this here: http://bit.ly/sitedirectedmutagenesis
And how does it work in practice? Here’s an overview:
- choose a vector
- design primers to copy just the regions of the vector and the insert you want to combine – instead of primers that just match what’s already there, you make “Special” primers – have ~20bp overlap – basically you design the primers to start with the end of the other piece, so that when you make a copy you also copy a few of the words that you want to come before it when you staple them together
- do PCR – you end up with double-stranded DNA (dsDNA) fragments whose ends are hybrids between the vector & the insert – but you still have 2 pieces… and those pieces are pretty content as is… need to chew their ends up a little to create sticky ends like those you’d get from restriction enzymes (now, instead of cleaving in the middle of DNA, which you want ENDOnucleases for, you want to chew from the end so you need EXOnucleases)
- add another endonuclease, DpnI to selectively degrade any of the original parent vector, as that vector would give you false positives in the selection step because it has the antibiotic resistance gene. DpnI only degrades methylated DNA and methylation (addition of a methyl (-CH₃) group) occurs in bacteria but not PCR, so the PCR-generated DNA is *not* methylated and thus is safe, whereas the parent DNA *is* methylated and thus is at risk so DpnI is able to selectively degrade the parent. more here: http://bit.ly/reasesvsmtases
- purify the products (easy to do with a spin column kit) – this is important – want to remove unused nucleotides because you’re about to add a polymerase again!
- add T4 polymerase – polymerase? isn’t that what adds the nucleotides? yes – but proofreading polymerases can also remove them – the 3’-5’ exonuclease activity is important for proofreading because it allows the polymerase to “backtrack.” But since you took away the nucleotide train track pieces, the polymerase can’t build track to go forward, so it gets “bored” and starts going the direction it can go – “backwards,” chewing off the ends as it does so and creating single-stranded (ssDNA) overhangs at the 5’ end. You don’t want it to erase all of your (ok, the thermal cycler’s) hard work – you just want it to chew off enough to give you sticky ends, so you only give it ~10 min to do this
- then you stop it by adding a dNTP (DNA letter), but just 1 of the letters – it can add this letter, so it switches to its track-laying (polymerization) mode, but then gets stuck and stalls when it needs to add a different letter
- at this point you have sticky ends that can stick together, but when they do so there may be gaps because the different strands have gotten chewed different amounts – if you stitched it “as is” you’d be deleting letters. Instead, you want to fill those letters back in, which will require more than just the stitcher, so instead of adding DNA ligase to stitch up the ends in a test tube, we just stick it in bacteria and have the bacteria stitch it up for us using its own machinery – the bacteria’s homologous repair machinery (part of its DNA maintenance crew) is well equipped to handle cases like these
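To illustrate the DpnI step from the overview above, here’s a toy model of why the parent plasmid gets shredded while the PCR product survives. It leans on 2 things I haven’t spelled out: DpnI’s recognition site is GATC with a methylated A, and the parent plasmid is assumed to have been grown in a bacterial strain that methylates those sites (most common lab strains do, but check yours). The function names are hypothetical and the search treats the sequence as linear.

```python
def gatc_positions(seq: str):
    """Return the start index of every GATC in the sequence."""
    seq = seq.upper()
    positions = []
    start = seq.find("GATC")
    while start != -1:
        positions.append(start)
        start = seq.find("GATC", start + 1)
    return positions


def dpni_cut_sites(seq: str, methylated: bool):
    """Toy model: DpnI cuts at GATC only if the DNA is methylated."""
    return gatc_positions(seq) if methylated else []


toy = "AAGATCTTGATCAA"  # made-up sequence with 2 GATC sites
print("parent plasmid (methylated):", dpni_cut_sites(toy, methylated=True))
print("PCR product (unmethylated): ", dpni_cut_sites(toy, methylated=False))
```

Same letters, different outcome: the only thing DpnI “sees” differently is the methyl tag the bacteria added.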
A similar method is GIBSON ASSEMBLY – the PCR part’s basically the same but then things change a bit – instead of relying on one enzyme (T4 DNA polymerase) to do both the chewing & the polymerizing, it uses 2 separate enzymes – T5 exonuclease & Phusion polymerase. It also uses DNA ligase. Gibson is much more expensive because you’re supplying all these products “in vitro” (in a test tube) in their pure form. We don’t need them to be “pure” – we just stick the partial product into bacteria and have them finish the work – their “impure” forms work great! Gibson can be more efficient, but we’ve had great success with SLIC.
Don’t confuse “Gibson” with “Golden Gate” which uses PCR to add on restriction enzyme sites then uses those restriction enzyme sites to cut & paste (like copy but add cut sites when you copy -> then cut & paste) and don’t confuse “Golden Gate” with “Gateway” which uses λ integrase – a whole different mechanism
Whatever method you choose, you still need to make sure it worked, same as with restriction cloning. We can use the same techniques to check if a gene we’re interested in got inserted into the plasmid – techniques like diagnostic digest, blue-white screening, and sequencing. http://bit.ly/cloningcheck