If you want to copy some DNA, Polymerase Chain Reaction (PCR) provides a way. And my reaction when I see a full Pol tube is Phew! We’re not out of Pfu! Though we might want to Taq a few more tubes onto our next order… We don’t get a choice of what DNA POLYMERASE (DNA Pol) our cells use to copy DNA, but we *do* get to choose which DNA Pol to use when we copy specific stretches of DNA in a PCR tube. Sso7 what? Efficiency, processivity, fidelity, thermostability – that’s what! We can use “designer” copiers that best suit our DNA xeroxing needs to ensure speedy, accurate copying!

If you’ve been following me (thanks by the way – I hope it’s been helpful!) you’re probably getting tired of hearing about this “For the Love of Enzymes” book I’m reading by Arthur Kornberg – but it’s so good! I’m through the part where he won the Nobel Prize for discovering an enzyme (reaction speeder-upper) that could copy DNA. It uses a single strand of DNA letters (deoxynucleotides) as a template for linking up (polymerizing) “opposite” letters into a complementary chain, which can then be used as a template for recreating that original template sequence thanks to the 1:1 “oppositeness” of DNA letters (more below).

He called this enzyme DNA Polymerase (DNA Pol), &  I’m now at the part where he’s describing how, with the help of his son, Tom, he found out that there are actually multiple versions of DNA Pol – even within a single cell. If you look to different organisms you can find even more diversity – and scientists can take advantage of the different properties of different ones to fit our in vitro (in a test tube) DNA copying needs!

DNA (DeoxyriboNucleic Acid) is the biochemical language our genetic info’s written in & its alphabet consists of 4 deoxynucleotide (dNT) “letters”: A, T, C, & G. Each letter has a “generic” part made up of a deoxyribose sugar with phosphate(s) hooked up on the “left arm” (5’ position) & a hydroxyl (-OH) group as a “left leg” (3’ position). Then the different letters have different “nitrogenous bases” (bases) that stick off as a “right arm” – these bases are the single- or double-ringed parts.

dNTs use their generic parts to link together through phosphodiester bonds to form long single stranded DNA (ssDNA) & 2 complementary single strands “zip together” using their unique base parts (A across from T, C across from G) to form double-stranded DNA (dsDNA). This double-strandedness protects it from damage (the bases are facing in) and allows for easy copying since if you unzip it one strand can be used as a template for making another. 

This unzip and copy is what happens in replication – before a cell divides, it needs to copy all its DNA (its entire genome) so that it can pass on a full set to each daughter cell. It does this with the help of DNA Pol, which brings together the freely-roaming nucleotides, holding the right ones together & helping them link up, while rejecting the wrong ones (ones that don’t complement each other, e.g. don’t let an A bind a C!).

PCR (Polymerase Chain Reaction) is a way to carry out this process in a (really tiny) test tube & copy only a small section of DNA, which we specify by using PRIMERS. Primers are short pieces of DNA (oligonucleotides, or “oligos”) we design to “bookend” our region of interest (the AMPLICON).

We need these primers because DNA Pol can’t start chain-building from scratch – it needs to start from a short double-stranded stretch. This is just one of its limitations (but sometimes limitations can be good! You don’t want cells copying DNA randomly!). Another limitation is that it can only copy DNA in One Direction (5’ to 3’). Before we get too far, let’s make sure we’re NSYNC… (90s girl, sorry!)

The letter-linking is “generic” because it only involves the “backbone” parts that all the letters have (phosphodiester bonds involve the phosphate & hydroxyl merging) so you can link letters in any order (e.g. ATTACA or CAAATT). But the strand-zipping is specific because it occurs through interactions of the unique bases. So the “opposite” of ATTACA is TAATGT, which is different from GTTTAA (the “opposite” of CAAATT). But writing the opposites like this is a bit misleading because of the direction you should be reading them in.

If you have ATTACA, and you stick the complementary letters across from it, you get this:

ATTACA

TAATGT

BUT – dsDNA is ANTIPARALLEL – this means that the strands are running in opposite directions (one is 5’->3’ and the other is 3’->5’, with the ’ pronounced “prime” and referring to whether the “left arm” (5’) or “left leg” (3’) of that end’s sugar is free). So:

5’ ATTACA 3’

3’ TAATGT 5’

And we usually write sequences 5’ to 3’, so the “complementary sequence” to 5’ ATTACA 3’ is 5’ TGTAAT 3’.
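
If you’d rather let a computer do the flipping, here’s a minimal Python sketch of the two steps (swap each letter for its partner, then reverse so it reads 5’ to 3’). The function names are just my own labels for illustration, not from any particular library.

```python
# Base-pairing rules from the text: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Partner letters in the same left-to-right order (so this strand reads 3'->5')."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Partner strand flipped around so it reads 5'->3', the usual way to write it."""
    return complement(seq)[::-1]


top = "ATTACA"                   # written 5'->3'
print(complement(top))           # TAATGT (reads 3'->5')
print(reverse_complement(top))   # TGTAAT (same strand, written 5'->3')
```

A handy sanity check: taking the reverse complement twice should hand you back the sequence you started with.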

This may seem like a mere technicality, but it’s really important in practice! DNA Pol can only copy DNA in one direction, 5’ to 3’, so you always have to keep in mind which way the “train tracks run.”

Train tracks? This is another weird analogy of mine – I like to think of nucleotides as train tracks and DNA Pol as a train. This train can only travel on double-stranded track, so it has to lay the track down ahead of it as it goes. It knows what track to lay down by making it “match” the other side of the track (e.g. if the next track across from it, on the other strand, is a T, lay down an A). In PCR, primers provide the starting stations for the train: since DNA Pol needs double-strandedness to start, it’ll only start where you make it double-stranded (but shorter than the other strand so there’s still stuff to copy). You want something like this:


to make this


PCR is run in cycles 🔁 of 1️⃣ MELT (heat up dsDNA to unzip strands) 2️⃣ ANNEAL (cool down to allow primers to bind) 3️⃣ EXTEND (starting where primers leave off, add nucleotides complementary to the template strand until you reach the end of the template strand). In the 1st cycle, Pol goes till it runs out of steam or out of time. After that, the end is determined by the other strand’s primer: DNA Pol can only copy, it can’t “compose,” so it’ll run off the track at the position where that template strand started being copied in an earlier round. (easier to explain in pics)
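
Since the “runs off the end of the track” part is tricky to picture, here’s a toy Python sketch of the EXTEND step. Everything in it is made up for illustration: the 13-letter template, the 3-letter primers (real primers are much longer), and the function names. It also assumes Pol always makes it to the end of the template and that each primer has exactly one place to bind.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Partner strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def extend(template: str, primer: str) -> str:
    """Return the new strand (5'->3') made by extending `primer` along `template`.

    `template` is written 5'->3'. The primer binds the stretch of template that
    matches its reverse complement, then letters are added to the primer's 3' end
    until Pol runs off the template's 5' end.
    """
    site = reverse_complement(primer)
    start = template.find(site)
    if start == -1:
        raise ValueError("primer has nowhere to bind, and Pol can't start from scratch")
    # Everything from the template's 5' end through the binding site gets copied.
    copied_region = template[: start + len(site)]
    return reverse_complement(copied_region)


# Hypothetical template with extra letters flanking the region we care about.
top_strand = "TTATTACAGGCAA"
forward_primer = "ATT"   # toy primer; binds the strand made in round 1
reverse_primer = "GCC"   # toy primer; binds the original top strand

round_1 = extend(top_strand, reverse_primer)
print(round_1)   # GCCTGTAATAA (starts at the primer, runs to the end of the template)

round_2 = extend(round_1, forward_primer)
print(round_2)   # ATTACAGGC (now bookended by both primers)
```

Notice that the round-1 product still carries the extra “AA” overhang, but the round-2 product stops exactly where the round-1 strand began, which is the other primer’s position.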

You do this over & over 🔁 (30 or so times) to get lots of copies (each time you get 2X as many copies bc each new strand becomes another template strand).
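
To see how fast that doubling adds up, here’s a tiny Python sketch. It assumes the ideal case where every single strand gets copied in every cycle; real reactions fall short of perfect doubling, especially in later cycles, so treat the numbers as an upper limit. The function name is just mine.

```python
def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Ideal-case copy number: every cycle doubles whatever was there."""
    return starting_copies * 2 ** cycles


for n in (1, 2, 10, 30):
    print(n, copies_after(n))

# 30 perfect cycles from a single molecule gives 2**30 = 1,073,741,824 copies.
```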

The reaction to link nucleotides together (nucleotide polymerization) is the same in your cells or in the tube. It’s also the same for RNA (but using a different Pol), but here we’ll speak in DNA terms. The building blocks come in as deoxynucleotide triphosphates (dNTPs), which have 3 phosphate (PO₄³⁻) groups linked together. Each time the reaction links an incoming dNTP onto the chain, it kicks out 2 of those phosphates as a molecule of pyrophosphate (PPi).

BUT in cells there are lots of helpers, whereas in PCR the process is stripped down to its bare necessities:

🔹TEMPLATE DNA: dsDNA containing the sequence you want to copy (amplify)

🔹PRIMERS: short ssDNA complementing the ends of amplicon (1 to serve as a start site for each strand)

🔹dNTPS: nucleotide building blocks (letters) to be added

🔹BUFFER: liquid combo of salts & pH stabilizers to keep everything happy

🔹Mg²⁺: magnesium cations (➕ charged ions) to act as “chaperones” that help shield the phosphates’ ➖ charge

🔹and, drumroll please… 🥁🥁🥁 DNA POLYMERASE (DNA Pol)

One of the hardest parts of carrying out a biochemical reaction is often bringing reactants together & keeping them together long enough to interact. Why’s this so hard? Molecules like to be free to move around (they want high ENTROPY) & they don’t like to be tied down. Entropy refers to how many different “states” a molecule can be in (this can refer to being in different places or in slightly different shapes, e.g. a bond rotated a bit). It’s sometimes described as “randomness” or “disorder” because the more ways something can move, the less likely you are to know exactly where it is at any time.

Getting nucleotides to link up is like getting a bunch of kids running around at recess to link up to form a really long 3-legged race. Firstly, you have to get them to come over, then you have to get them to stay still long enough to convince them to link up, and then once they’re linked up you have to make it “fun” for them even though they can’t run around anymore because their movement is restricted by being linked to the person next to them. And Pol has an even harder time because it has to get the kids to link up in a specified order!

So how can one little protein do all this? By giving the nucleotides something they want in return – ridding them of a couple of their phosphates. The phosphates stick off the 5’ end & they’re basically really concentrated negative charges clamped together like a spring. PHOSPHATE (PO₄³⁻) has a central phosphorus (P) atom connected to 4 oxygen (O) atoms. It has “extra” electrons (e⁻) so it’s ➖ charged. Like charges repel, so phosphates don’t like to be next to each other. So it takes effort (in the form of energy (E)) to bring & hold phosphates together (like compressing a spring) – so we call these bonds “high energy” and when they’re broken apart that E’s freed to be used for other things, like paying the cost of linking nucleotides together.

So, even though nucleotide polymerization is still energetically costly because you’re tying down molecules, ⬇️ their freedom (⬇️ entropy), this is compensated for by the large ⬆️ in entropy that occurs when PPi is released & hydrolyzed (split by water) to give you 2 individual orthophosphates (Pi). 2 little things moving around freely have even more freedom than the 1 medium-little thing (PPi) that’s initially released – so you get a double-boost.

BUT in order to get this benefit, you 1st need to get them to react & this is often where proteins &/or RNA CATALYSTS (reaction speeder-uppers) called ENZYMES come in. They act as a sort of “mediator” bringing the right reactants together, holding them in optimal positions to react, ✅ stabilizing reaction intermediates ✅ & providing a friendly environment ✅

As I mentioned before, Arthur Kornberg discovered some enzymes that help mediate DNA copying: DNA POLYMERASES (DNA Pol). DNA Pol helps hold an incoming nucleotide (of the triphosphate variety) close to the growing chain it needs to be added to & in the right position ✅ ⏩ The 3’ hydroxyl (OH) group at the end of the growing chain then goes in for the attack! It latches on to the phosphorus (P) atom of the incoming nucleotide’s 1st phosphate group (the one closest to the sugar) -> that P now has too many bonds, so it kicks out the other 2 phosphate groups as the inorganic phosphate molecule PYROPHOSPHATE (PPi) (energy boost 1). PYROPHOSPHATE is then hydrolyzed (broken by the addition of water) into 2 molecules of ORTHOPHOSPHATE (Pi) (energy boost 2).

DNA Pols have that in common, but different organisms have slightly different DNA Pols that vary in EFFICIENCY, PROCESSIVITY, FIDELITY, & THERMOSTABILITY:

🔹EFFICIENCY: how fast can it go?

🔹PROCESSIVITY: how many nucleotides can it add before it falls off the template?

🔹FIDELITY: how many typos does it make?

🔹THERMOSTABILITY: how much heat can it take?

You can get ⬆️ FIDELITY (fewer typos) if your Pol has a “proofreading” 3′→5′ exonuclease (DNA end-chewing) domain that can sense errors, “backspace” to remove them, & then put in the correct letter. This profreedng is important because errors will get copied… & copied… & copied… BUT it slows down the process, so you get ⬇️ EFFICIENCY.
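
To get a feel for why copied typos matter, here’s a rough back-of-the-envelope Python sketch. Big caveats: the two error rates below are made-up placeholders chosen only to show the trend (they are NOT measured values for any real Pol), the function name is mine, and the model is a deliberate oversimplification, as described in the comments.

```python
def fraction_error_free(length_bp: int, errors_per_base: float, cycles: int) -> float:
    """Rough chance that a final strand carries no typos at all.

    Simplifying assumption: a strand's ancestry includes one fresh copying
    event per cycle, and each copied letter independently goes wrong with
    probability `errors_per_base`. Real strands have fewer copying events in
    their ancestry on average, so this leans pessimistic.
    """
    return (1 - errors_per_base) ** (length_bp * cycles)


# Hypothetical error rates, chosen only to show the trend.
sloppy_pol = 1e-4    # assumed: 1 typo per 10,000 letters copied
careful_pol = 1e-6   # assumed: 1 typo per 1,000,000 letters copied

for name, rate in [("sloppy", sloppy_pol), ("careful", careful_pol)]:
    clean = fraction_error_free(length_bp=1000, errors_per_base=rate, cycles=30)
    print(f"{name}: {clean:.1%} of strands typo-free")
```

The takeaway is the direction, not the exact percentages: the longer the amplicon and the more cycles you run, the more a lower error rate pays off.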

Kornberg initially thought that the DNA chewing he saw in his experiments was due to a contaminating exonuclease. But no matter how pure he got his protein preps, the chewing continued. And he was able to show that indeed the same enzyme that was copying the DNA was also erasing it, using different parts (domains) of the protein!

Other parts of DNA Pol proteins can also help out. A way to ⬆️ EFFICIENCY is by ⬆️ PROCESSIVITY – keeping Pol on the template. Constantly falling off & hopping back on surely slows you down! Processivity-enhancing domains (protein “sections”) or separate processivity-enhancing “subunits” bind dsDNA to help latch Pol on. But importantly they don’t bind “too tightly” & they can bind any sequence – this allows Pol to stay on but slide along.

BUT before you can copy strands you have to unzip them & this too is energetically expensive (like peeling apart 2 pieces of stuck together tape). You have to put in energy to give the DNA molecules more energy so they wiggle around more & the strands come apart.

In your cells, enzyme helpers called HELICASES help unzip them using chemical energy from ATP, but our “bare-bones” PCR version doesn’t have these helpers. Instead, we get the needed energy from HEAT. In the MELT step, we physically heat up the dsDNA so the strands come apart. And we have to get it REALLY hot! (~95°C, or ~203°F). Human DNA Pol would be pretty useless at this temp bc the same heat that causes strands of DNA to come apart 👍 can also cause proteins to unfold 👎. (Just like DNA chains remain chains when you melt DNA (you don’t break the strong covalent bonds), heat denaturation of proteins leaves you w/chains of amino acids.)

Thankfully there are organisms called THERMOPHILES that have evolved to live in super hot environments (like near thermal vents in the ocean). They have super-strong proteins that can withstand the high temps needed for PCR. The “classic” PCR Pol is Taq, which gets its name because it comes from the thermophilic bacterium Thermus aquaticus. BUT Taq tends to make a lot of typos (⬇️ FIDELITY). Pfu DNA polymerase (from Pyrococcus furiosus) makes ⬇️ errors (⬆️ FIDELITY). BUT it has relatively ⬇️ EFFICIENCY. Could we do better?

It’s hard to get all 4, but that hasn’t stopped scientists from trying! Scientists can stitch together parts they like from different Pols to get Pol “chimeras” w/enhanced functions. Our lab uses a chimera called Phusion Polymerase (not a paid endorsement, just what we use!). It’s based on a Pfu-like DNA Pol (w/proofreading capability for ⬆️ fidelity) fused to a small dsDNA-binding protein called Sso7 (from Sulfolobus solfataricus) which serves as a processivity-enhancing domain. It can add 1000 nucleotides (1 kb) in only 15 seconds w/few errors! So we time out the extension step accordingly (e.g. if we want to copy a 4kb segment, we’ll set the extension step for 4×15=60s).
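
That extension-time arithmetic is simple enough to do in your head, but here’s a small Python sketch in case you want it in a script. It uses the 15 seconds per kb figure quoted above as its default; the function name and the choice to round up to a whole second are my own assumptions (rounding up just errs on the side of letting Pol finish).

```python
import math

SECONDS_PER_KB = 15  # the rate quoted above for our Pol


def extension_seconds(amplicon_bp: int, seconds_per_kb: float = SECONDS_PER_KB) -> int:
    """Extension time for an amplicon of the given length, rounded up to whole seconds."""
    kilobases = amplicon_bp / 1000
    return math.ceil(kilobases * seconds_per_kb)


print(extension_seconds(4000))   # 60, matching the 4 kb example above
print(extension_seconds(1500))   # 23 (22.5 rounded up)
```

If you’re using a different Pol, pass its rate as `seconds_per_kb` instead of relying on the default.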

We do a LOT of molecular cloning bc we make a lot of mutant proteins to test what different parts do &/or try to make versions that are less “camera-shy” ( http://bit.ly/2CWPRil ). So we go through a lot of Pol. So we *always* make sure to have excess in stock.  Once we ran out… needless to say, it was not good…

