Restriction cloning – a time it’s a good thing to be in a sticky situation! Bacteria have DNA-specific “scissors” called RESTRICTION ENZYMES (aka restriction endonucleases, or REases) that recognize & cut specific “code words” (RESTRICTION SITES aka recognition sequences) written in DNA, which serve as “dotted lines.” Different REases recognize different sequences and bacteria use them as a defense against invaders like bacteria-infecting viruses (phages). Biochemists found out about these scissors – and found out that you could stick DNA into bacteria and have bacteria make copies of it and/or make protein from the genetic recipes the inserted DNA contains. Cut to today (pun not originally intended) and REases are frequently used to cut and paste pieces of DNA together in RESTRICTION CLONING.
Genes are stretches of DNA that serve as recipes for making products – like proteins and functional RNAs. In MOLECULAR CLONING, we stick a gene we want to study into a PLASMID VECTOR (circular piece of DNA) that’s easier to work with. We often stick this plasmid in bacteria cells to have them “host” it for us. One way, the “classic” way, is RESTRICTION CLONING. In this method, we use DNA “scissors” – RESTRICTION ENZYMES – to cut an “insert piece” (with the gene we want to stick in) and a “vector piece” (with the vector we want to stick it in) with the same pairs of scissors so they have complementary cuts. And then we purify the matching pieces and mix them together, adding a “stitcher” called DNA ligase to seal them up tight.
nerdy nomenclature note: for proteins, a gene is actually kinda like a “pre-recipe” – proteins are actually made from RNA copies of the DNA recipe. These RNA copies are called messenger RNA (mRNA) and they’re edited (and temporary) copies of the gene – editing involves cutting out non-coding “intervening” regions (introns) and stitching together the “expressed” regions (exons) in a process called RNA splicing. Our cells do this (and can do it in multiple ways to give you alternative splice products (splice isoforms) – kinda like purposefully skipping a step in a triple-decker cake recipe in order to make a double-decker cake). But bacterial cells don’t – and even if they did, they wouldn’t know which splice isoform to make to know what cake to bake! So, instead of inserting the gene like it occurs in our DNA, we insert a version of the gene that is complementary to the edited form we want – mature mRNA for the specific isoform of interest. We call this complementary DNA (cDNA).
nerdy nomenclature note number 2: Complementarity refers to base pair complementarity. “Base” here is short for “nitrogenous base” or “nucleobase” and it’s the unique ring-y part that sticks off from the generic sugar-phosphate backbone of nucleotides (DNA & RNA letters). There are 4 DNA letters and 4 RNA letters – both DNA & RNA have A, G, & C. Then DNA has T and RNA has U (RNA also has an “extra” oxygen in its sugar (ribose) compared to the sugar in DNA (deoxyribose)). The generic backbone allows letters to link up in any order to form strands – but the unique bases make between-strand pairing more picky. Because of how their atoms (individual oxygens (O’s), nitrogens (N’s), hydrogens (H’s), etc.) are situated, A & T (or U) can form weak, reversible, “hydrogen bonds” (H-bonds) with one another and similarly for C & G.
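If it helps to see the pairing rule spelled out, here’s a minimal Python sketch. The names (DNA_PAIRS, reverse_complement) are just ones I made up for illustration, and it only handles the 4 plain DNA letters. It also bakes in the fact that the 2 strands run in opposite directions, so the partner strand gets read backwards.

```python
# Watson-Crick pairing for DNA letters: A pairs with T, C pairs with G
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(strand):
    """Return the partner strand, written 5' to 3'."""
    # The two strands are antiparallel, so we walk the input backwards
    return "".join(DNA_PAIRS[base] for base in reversed(strand.upper()))

print(reverse_complement("ATGC"))  # GCAT
```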
That allows for sequence specificity and it comes in handy a lot – both in your cells and in the lab – you can use one strand as a template for making the other, and you can get short “complementary” stretches to specifically bind specific regions of interest. Both of these frequently come into play in test tubes in the lab in a technique called Polymerase Chain Reaction (PCR), where we use short complementary DNA pieces called primers to bookend regions of DNA we want copied – those primers bind (one per strand) and we use DNA Polymerase (DNA Pol) to “write” the complementary strand based on the template they’re on, resulting in a copy being made from each template each cycle. Do a lot of cycles and you get an exponential increase in copies. More here: http://bit.ly/396uIQ0
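To put a number on “exponential,” here’s a quick back-of-the-envelope sketch. It assumes idealized PCR where every template gets copied every cycle. The efficiency parameter is my own addition (1.0 means perfect doubling); real reactions fall short of that and eventually plateau.

```python
def pcr_copies(starting_copies, cycles, efficiency=1.0):
    """Idealized copy number after a given number of PCR cycles.

    efficiency is a hypothetical knob: 1.0 = every template copied each cycle.
    """
    return starting_copies * (1 + efficiency) ** cycles

# One starting molecule, 30 perfect cycles
print(pcr_copies(1, 30))  # 1073741824.0
```

So a single molecule could, on paper, turn into about a billion copies in 30 cycles.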
So you can get a lot of copies of some DNA you’re interested in. And then you can stick it where you want to using restriction cloning (alternatively, you can just cut the DNA you’re interested in out directly from where it is, but then you need a lot more of the original DNA than you do if you make copies of it first). Another benefit of PCR-based insert production is that it allows you to introduce extra letters to serve as “overhangs” – and you can make these overhangs contain cut sites for restriction enzymes, so you don’t need to rely on the DNA you want happening to have cut sites right next to it (though, as I’ll talk a bit more about later, if you order a cDNA clone where someone has already stuck the DNA you want into a sort of generic plasmid, that plasmid often has multiple cut sites next to the DNA you want).
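Here’s a rough sketch of what “adding a cut site as an overhang” looks like at the letter level. Everything named here is hypothetical: the function, the example annealing region, and the 4-letter “padding” on the far end. The padding is my assumption, based on the general rule of thumb that many enzymes cut poorly when their site sits right at the very end of a piece of DNA; how many letters you need depends on the enzyme.

```python
def add_site_to_primer(annealing_region, site, padding="GCGC"):
    """Build a primer: 5' padding + restriction site + template-binding part."""
    # Only the annealing region matches the template; the rest hangs off the 5' end
    return padding.upper() + site.upper() + annealing_region.upper()

ECORI_SITE = "GAATTC"
# Made-up 18-letter annealing region, just for illustration
primer = add_site_to_primer("ATGGCTAGCAAAGGAGAA", ECORI_SITE)
print(primer)  # GCGCGAATTCATGGCTAGCAAAGGAGAA
```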
A quick overview and then the deets: Take the DNA where it currently is (such as in a plasmid or from a PCR reaction) ⏩ add restriction enzyme(s) (and a buffer containing salts, pH stabilizers, Mg2+, etc. to keep the enzyme happy) ⏩ heat it up to give the enzymes energy to work & give it time to cut ⏩ purify the pieces ⏩ mix them together ⏩ add ligase to stitch them up ⏩ stick them in bacteria ⏩ bacteria host it and make protein from it
‘nother nerdy nomenclature note: the enzymes are “numbered” not “lettered” (e.g. EcoRV isn’t an all-electric RV model, it’s EcoR FIVE (learned this the embarrassing way, true story)). The Roman numeral tells you the order of discovery: EcoRV was the 5th enzyme in the “EcoR” series (the first, EcoRI, came from the “RY13” strain of E. coli).
The restriction enzymes we use for cloning are usually of the “IIP subtype” (more on this at the end) – the sequences they recognize are usually fairly short (4-6 basepairs (bp) long) PALINDROMES (think kayak, racecar…). The DNA version of a palindrome reads the same on both strands when you read each one in its own 5’ to 3’ direction (like GAATTC). Since DNA’s 2 strands complement each other, this “palindromicity?” means that both strands of the DNA have the cut site. So, usually working in pairs of identical copies (homodimers), the enzyme cleaves all the way through the DNA (both strands) instead of just “nicking” it (cutting a single strand).
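A DNA palindrome isn’t quite a “racecar” palindrome: it’s a sequence that equals its own reverse complement. Here’s a small sketch of that check (the function name is mine, and it assumes plain A/C/G/T input).

```python
# Translation table that swaps each DNA letter for its pairing partner
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def is_palindromic_site(site):
    """True if the site reads the same 5' to 3' on both strands."""
    site = site.upper()
    # Complement every letter, then reverse to read the other strand 5' to 3'
    return site == site.translate(COMPLEMENT)[::-1]

print(is_palindromic_site("GAATTC"))  # True  (EcoRI site)
print(is_palindromic_site("GATATC"))  # True  (EcoRV site)
print(is_palindromic_site("GAATTA"))  # False
```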
When they cut, they can make “staggered cuts” that result in STICKY ENDS (2-4 unpaired nucleotides “overhanging” on each end (useful if you want to then stick it to something else…)) or BLUNT ENDS (cut straight across – no overhangs).
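To make sticky vs. blunt concrete, here’s a sketch that cuts the top strand and works out how big the stagger is. The function names and the “cut_offset” convention (how many letters into the site the top strand gets cut) are my own. It assumes a palindromic site, so the bottom strand is cut at the mirror-image position. EcoRI cuts G^AATTC (offset 1) and EcoRV cuts GAT^ATC (offset 3).

```python
def cut_top_strand(dna, site, cut_offset):
    """Split dna at the first occurrence of site, cut_offset letters into it."""
    start = dna.find(site)
    if start == -1:
        return None  # no site, no cut
    cut = start + cut_offset
    return dna[:cut], dna[cut:]

def overhang_length(site, cut_offset):
    """Unpaired letters left on each end (0 means blunt).

    Assumes a palindromic site, so the bottom-strand cut mirrors the top one.
    """
    return abs(len(site) - 2 * cut_offset)

print(cut_top_strand("AAAGAATTCTTT", "GAATTC", 1))  # ('AAAG', 'AATTCTTT')
print(overhang_length("GAATTC", 1))  # 4 -> sticky (EcoRI)
print(overhang_length("GATATC", 3))  # 0 -> blunt (EcoRV)
```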
If you cut 2 things with the same 2 enzymes (and they have cut sites in the same orientation) you can remix and match them. So, for example, you can cut a plasmid vector and your gene of interest with the same enzymes -> generates matching sticky ends -> purify the pieces and mix them together. Some REases have different recognition sequences but leave the same overhang, so their ends are “compatible” (kinda like one recognizes “ace” and one recognizes “racecar” but both cut after the ac, leaving you with the same overhang (ac e and rac ecar)). You want to make sure you’re using “unique cutters” so you only get the right pieces. If a cutter cuts multiple places you’ll get multiple matching pieces. You can use a free web tool like NEBcutter to find potential conflicts of cutting.
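The “unique cutter” idea is easy to sketch: count how many times each recognition site shows up and keep the enzymes that hit exactly once. This is a toy version, not a replacement for a real site-finding tool. It treats the DNA as linear (so it would miss a site spanning the “start/end” of a circular plasmid), ignores methylation, and relies on the sites being palindromic so that searching one strand is enough. The example sequence is made up.

```python
# Recognition sites for a few common enzymes
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

def count_sites(dna, site):
    """Count occurrences of site in dna, allowing overlapping matches."""
    count = 0
    start = dna.find(site)
    while start != -1:
        count += 1
        start = dna.find(site, start + 1)
    return count

def unique_cutters(dna, enzymes):
    """Names of enzymes whose site occurs exactly once in dna."""
    return sorted(name for name, site in enzymes.items()
                  if count_sites(dna, site) == 1)

example = "TTGAATTCAAGGATCCTTGGATCCAA"  # hypothetical sequence
print(unique_cutters(example, ENZYMES))  # ['EcoRI']
```

Here BamHI gets ruled out because it would cut twice, and HindIII because it wouldn’t cut at all.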
If you’ve cut with a blunt end cutter, any 2 pieces can match, but the orientation might “flip,” so if possible we use sticky ends. I say “if possible” because this only works if the “dotted lines” are there – for vectors designed for this sort of thing, this is less of a problem because they’re often designed with “multiple cloning sites” (MCSes) that have several options to choose from.
I used such restriction cloning in undergrad but I’ve now switched to the PCR-based method SLIC (more here: http://bit.ly/2oiw6wL ) because it’s a lot more versatile – and the piece-purifying is a lot easier. With PCR you get lots of copies of your insert and no copies of “TheRest” – so you just have to purify out the primers and cut up the parent plasmid.
But with restriction cloning you’re not making copies – so you have to start with a lot of parent plasmid – and each time you make an insert piece you make a “TheRest” piece. So you have to separate the insert from the rest so they don’t just bind back together.
One thing you have going for you is that the 2 strands might “stick” together on their own because of basepair complementarity, but bases in (what you want to be) the same strand can’t “sew themselves” together. When DNA Pol links up DNA letters it does so using nucleotide triphosphates – the forms of the letters with 3 phosphate groups. Those groups are negatively-charged and like charges repel, so holding 3 of them in a row is like clamping a stiff spring, and “letting go” by breaking phosphate-phosphate bonds releases energy – and this can be used to offset the energetic cost of letter linkage (it’s expensive because molecules like to have freedom to run around & don’t like to be tied down).
So when DNA Pol links together nucleotide triphosphates, it breaks off 2 of the phosphates to pay, leaving you with just a monophosphate. So when DNA gets cut, there’s a monophosphate at the end. So the DNA can’t pay its own way. So a different enzyme, DNA ligase, comes to the rescue and uses energy from ATP to stitch the gap. http://bit.ly/3a2ojWm
We often use T4 DNA ligase. Note of caution – this is DIFFERENT from T4 DNA Polymerase, which we use for SLIC – and T4 polynucleotide kinase, which we use for radiolabeling – so check the tube labels carefully! The “only” thing they have in common (other than being sold mainly by NEB) is that they come from the T4 phage – a bacteria-infecting virus that has a rich history in biochemistry.
But back to our story of binding back up – speaking of “binding back” there’s a type of back-binding we don’t want – self-circularization – where the plasmid ends stick to themselves instead of to the insert. This is important to watch out for if you’re using the same enzyme to cut on both sides (if you’re using 2 different cutters you don’t need to worry). To prevent it, you can add a phosphatase (phosphate remover) like calf intestinal phosphatase (CIP) or antarctic phosphatase (AP). The DNA ligase doesn’t need a *tri*phosphate, but it does need a phosphate – in order to do the strand-stitching, the ligase needs the 5’ ends to be phosphorylated. If you take that phosphate away, the ligase cannot play!
But you still need it to play with your insert! So that insert needs to provide a 5’ phosphate (if your insert but not the vector has phosphates you’ll be left with 2 nicks since you only can make 2 “stitches” – but the bacteria can fix it once you stick it in).
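The phosphate bookkeeping can be sketched as simple counting. When one insert joins one vector there are 2 junctions and each junction has a gap on both strands, so 4 spots to stitch. At each junction, one of those spots has a vector 5’ end and the other has an insert 5’ end, and ligase can only seal the ones with a 5’ phosphate. The function names below are hypothetical, and this ignores everything except the phosphates.

```python
def sealed_nicks(vector_has_5p_phosphate, insert_has_5p_phosphate):
    """How many of the 4 vector-insert gaps ligase can stitch."""
    # 2 gaps depend on the vector's 5' ends, 2 on the insert's 5' ends
    return 2 * int(vector_has_5p_phosphate) + 2 * int(insert_has_5p_phosphate)

def vector_can_self_circularize(vector_has_5p_phosphate, ends_compatible):
    """Vector can only re-seal on itself if its ends match AND keep their phosphates."""
    return vector_has_5p_phosphate and ends_compatible

# Dephosphorylated vector + phosphorylated insert: 2 stitches, 2 nicks left over
print(sealed_nicks(False, True))                 # 2
print(vector_can_self_circularize(False, True))  # False
```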
Note: If you’re using PCR-generated pieces there are a couple of situations – one is that you make a lot of copies and then you cut the copies. In this case, end-wise, it’s just like you’d cut it out directly. But if you’re using PCR-generated pieces without cutting them (e.g. you used primers that “act as overhangs”), those primers usually are NOT phosphorylated, so you have to phosphorylate them, such as with T4 Polynucleotide Kinase.
A lot of the time when we’re cloning we’re doing “subcloning” where we’re moving a gene from one plasmid to another (like moving it from the generic plasmid it came in when we ordered it and putting it into a plasmid that’s ideal for protein-making). And we’re often swapping out a gene that was in the new one so our “TheRest” pieces are big – too big to use PCR purification kits which can remove small things like primers. more on those here: http://bit.ly/2yO5BBt
So usually you purify them by gel purification – you start by running an agarose gel to separate the DNA pieces by size (you should do this in any case just to check that the pieces look ok and it all got cut). When you’re going to purify it out of the gel you add it all (not just a little bit to look) and you use a wide comb so you have plenty of room to cut around. And you also can use low melting point agarose, which makes it easier to get your DNA out. more here: http://bit.ly/2ASXipa
Agarose gel electrophoresis separates DNA pieces by their length (DNA’s naturally negative, so you can use positive charge to motivate it through a gel mesh made of the sugar agarose – longer DNA pieces will get slowed down more because they’ll get tangled up in the mesh more (think of trying to drag a jumprope through a net) so they’ll travel slower). more here: http://bit.ly/2lPCUR8
You use a DNA-binding stain that absorbs UV light, so you use UV to tell where your band of interest is – cut out that chunk of gel – extract out the DNA (basically chemically melt away the surrounding mesh) and purify it. So you purify the vector and the insert & mix them (usually at ~1:2-3 ratio of vector to insert but it might take some trial-and-error-ing) and add ligase.
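For the mixing ratio, I’m assuming the ratio is counted in molecules (a molar ratio), not in nanograms, which is how it’s usually meant. Since a shorter piece of DNA weighs less per molecule, you have to scale by length. Here’s a minimal sketch with made-up numbers; the function and parameter names are mine.

```python
def insert_ng_needed(vector_ng, vector_bp, insert_bp, insert_per_vector=3):
    """Mass of insert (ng) giving the chosen insert:vector molar ratio.

    Assumes mass per molecule scales with length in bp for both pieces.
    """
    return vector_ng * insert_bp * insert_per_vector / vector_bp

# Hypothetical example: 50 ng of a 5000 bp vector, 1000 bp insert, 3 inserts per vector
print(insert_ng_needed(50, 5000, 1000, 3))  # 30.0
```

So even though you want 3 times as many insert molecules, you’d add less insert than vector by mass because the insert is 5 times shorter.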
You give it time to work and then you stick it into bacteria (we call this transformation and a common way we do this is “heat shock” where we mix the DNA we want to put in with chemically-weakened bacteria in a tube on ice and then briefly dunk it in a warm water bath then stick it back on ice). The heat shock opens up pores in the bacteria so the DNA can rush in. more here: http://bit.ly/2Jj7L47
After you let it recover a bit you plate it on an agar plate in your classic Petri dish. Agar is related to but different from agarose and it makes a nice solid gel matrix to hold bacteria food. We only want it to be a nice “B&B” for bacteria that have our plasmid (not all the cells will have taken it in and other bacteria could have snuck in at various times) so we need a way to prevent them from staying here. Usually we do this using antibiotic selection. We use a plasmid that has an antibiotic resistance gene and then we spike the food with that antibiotic so that only bacteria with the plasmid can survive. more here: http://bit.ly/2tcW4ky
Since the vector has the antibiotic resistance gene you’re using for selection, even if it doesn’t have your gene, it’s important to make sure that there’s none of the original or the cut-but-re-self-sealed. So you can run a couple of controls.
In control 1, you just transform the cut vector alone – if you get colonies on this plate it indicates that not all of the vector got cut (only the circularized vector can survive & replicate inside of bacteria).
If you had one of those cases where you had to dephosphorylate, you do a second control. In control 2, you transform the cut vector + ligase (but NO insert) -> if you get colonies on this plate, but not the first, it indicates that you got self-circularization, suggesting you didn’t dephosphorylate it sufficiently so the ligase *could* play.
If you get colonies on both of those it could be a mix of both or just the non-cuttedness or it could be contamination or something.
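If you like your troubleshooting as a decision table, the logic of those 2 controls can be sketched like this. The function name and the True/False inputs are my simplification: “True” here means “noticeably more than the few background colonies you’d expect,” and where you draw that line is a judgment call. The case with colonies on control 1 but not control 2 isn’t covered above, so flagging it as “unexpected” is my own guess.

```python
def interpret_controls(colonies_cut_vector_only, colonies_vector_plus_ligase):
    """Rough read-out of the two no-insert control plates."""
    if colonies_cut_vector_only and colonies_vector_plus_ligase:
        return "uncut vector, possibly plus self-circularization or contamination"
    if colonies_vector_plus_ligase:
        return "self-circularization: vector probably not dephosphorylated enough"
    if colonies_cut_vector_only:
        return "unexpected combination: maybe contamination, repeat the controls"
    return "controls look clean"

print(interpret_controls(False, True))
```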
You’ll probably have a few colonies on the control plates but you should have way more on the “real plate” (the one with vector + insert + ligase). You then take a few of those colonies and grow them in some liquid broth, isolate their plasmid DNA and check to make sure it actually has your insert – you can do a quick check with colony PCR or analytical digest then verify with sequencing. more here: http://bit.ly/2TGAvo5
Where do these super useful tools come from? As I mentioned briefly before, restriction enzymes come from bacteria themselves! (like many molecular biology tools). Restriction enzymes are an important natural protection mechanism for bacteria. If a virus infects them, the restriction enzymes will recognize specific sequences in the foreign DNA & cut it so that DNA gets chewed up & does no harm.
To make sure that the bacteria don’t cut their own DNA, sites where that recognition site occurs in the bacterial DNA are “hidden” by a modification called methylation (which adds a methyl (-CH3) group to the DNA). Yesterday we saw how methylation can be used to tell apart original “parent” plasmid from copies of it & selectively chew up the methylated parent with an enzyme called DpnI http://bit.ly/39X28ky
DpnI is a restriction enzyme too, but a different type than the ones we use for cut-and-paste or cut-and-look-ing. There are several different “types” of restriction enzymes, some more well-suited than others for what we need. TYPE II restriction enzymes are the useful ones for these purposes. TYPE I & TYPE III are bigger & have multiple pieces (in part because they also carry out methylation to protect their host DNA). They also don’t cut where you think they “should”. TYPE I cuts at random sites that can be over 1000 base pairs (bp) away from the recognition site & TYPE III cuts ~25 bp away. And, if that weren’t bad enough, they both require energy (in the form of ATP not just heat) to cut.
So TYPE II’s the one we use! But within this type we still have hundreds of enzymes to choose from (a market which New England Biolabs (NEB) has gladly cornered)! Some of the type II can be complicated too but we typically use the less complicated ones! (the IIP subtype).
Even within TYPE II, some restriction enzymes are more promiscuous than others… Evolution-wise this makes sense – too short a sequence & the viral DNA will definitely get cut up (yay!) BUT the bacterial DNA likely will as well because it’ll be hard for the methylators to keep up! (boo!) And it will also require a lot of unnecessary energy since you don’t need to make tons of cuts – just cut it once & EXOnucleases will chew from the ends, a much “cheaper” process
As for length – longer sequences are less likely to occur by chance (yay!) BUT too long & they’d have very limited usefulness for the bacteria since it’s really unlikely that sequence would occur in the DNA of the viral invader (boo!) So bacteria found that 4-8’s a good balance
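You can put rough numbers on that trade-off. If you pretend DNA is a random string with all 4 letters equally likely (a simplifying assumption; real genomes aren’t that even), a specific site of length n is expected about once every 4^n letters. The function name below is mine.

```python
def expected_spacing(site_length):
    """Average number of bp between chance occurrences of one specific site.

    Assumes random sequence with A, C, G, T equally likely.
    """
    return 4 ** site_length

for n in (4, 6, 8):
    print(n, expected_spacing(n))
# 4 256
# 6 4096
# 8 65536
```

So a 4-letter site shows up every few hundred letters, while an 8-letter site might not appear even once in a small viral genome.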
⚠️Don’t confuse restriction enzymes (which are endoNUCLEASES & cut DNA) with endoPROTEASES which cut PROTEINS. Both have their uses in molecular biology & biochemistry but at different points in the process! (for example, we can use endoproteases to cut a protein tag off of a protein) more here: http://bit.ly/2P8HINE