The phage display gameplay: stick “random” foreign DNA in phage coat protein genes, let phage make those added-onto proteins and send the phages through screens! The added on protein bits will stick out so in binding tests you can find hits… take an interesting molecule & test if they bind, then you can “rewind” and look to see what inserted DNA you find!

“Phage display” might sound a bit like some fancy new window display retailers put out to entice shoppers. It’s definitely not that – but in a way it is kinda like molecular “window shopping…” Phage display is a technique in which bacteria-infecting viruses called bacteriophages (“phages”) are used to “display” bits of proteins on their surface. You can get lots of phages to display lots of different protein bits and then add this “phage display library” to some molecule you want to see if those protein bits bind to. You then select for the best binders and figure out what they’re showing.

It might seem like something that’s only useful for answering obscure biological questions out of curiosity, but it’s actually a big money-maker (and not just in terms of the chemistry Nobel Prizes awarded for its discovery and development (to George Smith and Sir Gregory Winter in 2018)). Pharmaceutical companies and researchers often turn to phage display if they want to develop lab-made antibodies or peptide-based drugs. In fact, one of the world’s biggest-selling drugs, Humira (adalimumab) was the first FDA-approved human monoclonal antibody therapy and it was developed using phage display. More about that and other specific applications later, but first let’s talk about how it works. 

Basically, “genes” are stretches of DNA with instructions for making functional products like proteins (which are made up of protein “letters” or “building blocks” called amino acids connected through peptide bonds, hence the name “peptides” for protein bits). So, for example, phages have genes with DNA instructions for making “coat proteins” to form a protective coat around them when they travel and infect new cells. 

If you stick foreign DNA into the genes of the phage coat proteins, an extra bit of protein (the amino acids corresponding to the inserted DNA) will get added into the phage coat protein. Kinda like if you have a recipe for a 3-layer cake and you sneak in an extra page with instructions for making a fourth layer. Since the coat protein sticks out from the surface of the phage, if you position the insert right you can get the extra bit “displayed” (sticking out to the environment). And then you can check to see if that displayed bit can bind something you’re interested in. 
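
To make the “extra page in the recipe” idea concrete, here’s a minimal Python sketch. The sequences are made up for illustration (they are NOT the real pIII gene), and `translate` and `insert_in_frame` are hypothetical helper names I’m using just for this example. It assumes the insert goes in at a codon boundary and has a length that’s a multiple of 3, so the reading frame is kept.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate DNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def insert_in_frame(gene, insert, codon_index):
    """Insert DNA after `codon_index` whole codons of the gene."""
    if len(insert) % 3 != 0:
        raise ValueError("insert length must be a multiple of 3 to stay in frame")
    pos = codon_index * 3
    return gene[:pos] + insert + gene[pos:]

# Toy "coat protein" gene (made up): M A E T V K stop
coat_gene = "ATGGCTGAAACTGTTAAATAA"
# Toy foreign insert (made up): H H W
foreign = "CATCACTGG"

fusion_gene = insert_in_frame(coat_gene, foreign, codon_index=2)
print(translate(coat_gene))    # coat protein alone
print(translate(fusion_gene))  # coat protein with the displayed bit added in
```

The first line printed is the toy coat protein by itself and the second has the three extra amino acids sitting in the middle, with the rest of the protein untouched. Try an insert whose length isn’t a multiple of 3 and the helper refuses, because that would shift the reading frame and scramble everything downstream.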

The foreign DNA you insert can either be DNA for something you know you’re interested in, like an antibody fragment, or random DNA if you want to unbiasedly search for peptides.

A common application, sometimes referred to as “biopanning,” is to coat wells of a plate with some binding partner (like maybe a protein you want to find an inhibitor for). And then you take a phage library and add it to the well. If the phage is displaying something that can bind that protein, that phage will stick. If not, the phage won’t, so you can wash the “losers” off. And then you can get the stuck phages to unstick and sequence them to see what DNA insert they have. Since you only have a few copies at that point, usually you first do an “amplification” step, where you let the phage infect some more bacteria and make more copies of themselves first before you try to go sequencing them. But you usually don’t even do the sequencing until you’ve gone through several cycles of “affinity selection,” where you make the wash conditions harsher and harsher each cycle. So the basic rundown is:

  • make phage library
    • stick foreign bits of DNA into phage coat proteins – these bits can be totally random or a “curated collection” such as instructions for pieces of antibodies
    • let those phages infect bacteria
      • the phages will grow and insert the modified proteins into their coats
  • add the phages to the sticking test (this can be wells of a plate, beads, even whole cells)
    • give them some time to stick
    • wash off the ones that didn’t stick (such as by increasing the salt concentration)
    • wash off the ones that did stick (such as by altering the pH)
  • amplify the ones that stuck – you start with a TON of phage, and only a small portion of them will bind (especially in the first round) so you need to make more if you want to test them more (and find out what they are)
    • let them infect more bacteria so they can make more copies of themselves
  • put those to the sticking test again, but this time use harsher wash conditions to keep only the stronger stickers
  • amplify them
  • do this again as many times as you want in order to find the strongest stickers – note: there will be some mutations naturally randomly introduced into the inserted DNA, so the bits might even get better than the bits you started with! (and you can promote this mutation by growing the phages in “mutator strains” of bacteria)
  • once you’re satisfied, sequence the phage coat protein insert
  • now that you know what it is, you can do different things with it. So, for example, if it was an antibody part you can insert that DNA into the rest of the antibody gene and get cells to make it for you. If it was a random peptide you found bound to a protein you want to drug, you can synthesize that peptide and then test to see if it inhibits the protein it bound to. 
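
If you want to see why repeating the stick-wash-amplify cycle works so well, here’s a toy Python model of the rundown above. Every number in it is made up, `pan` is a hypothetical function name, and it’s a big simplification: each clone just gets a fixed chance of surviving the wash, and amplification is treated as rescaling the survivors back up to a full pool. (It also keeps the wash the same each round rather than making it harsher.)

```python
def pan(library, retention, rounds):
    """Run rounds of select-then-amplify on a dict of {clone: fraction of pool}.

    retention[clone] is the (assumed) chance a phage of that clone survives
    the wash in one round. Amplification is modeled as simply rescaling the
    survivors so the fractions add up to 1 again.
    """
    pool = dict(library)
    history = [pool]
    for _ in range(rounds):
        survivors = {clone: frac * retention[clone] for clone, frac in pool.items()}
        total = sum(survivors.values())
        pool = {clone: amount / total for clone, amount in survivors.items()}
        history.append(pool)
    return history

# Made-up starting pool: binders are rare needles in a haystack.
library = {"strong binder": 1e-6, "weak binder": 1e-4, "non-binder": 1 - 1e-6 - 1e-4}
# Made-up per-round wash survival (non-binders still stick a little, nonspecifically).
retention = {"strong binder": 0.5, "weak binder": 0.05, "non-binder": 0.0005}

history = pan(library, retention, rounds=4)
for round_number, pool in enumerate(history):
    print(round_number, {clone: f"{frac:.2e}" for clone, frac in pool.items()})
```

With these made-up numbers, the strong binder starts as one in a million and ends up as most of the pool after a handful of rounds, which is the whole point of going around the loop more than once.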

There are a lot of variations on this scheme both in terms of the technology itself and its applications. 

So, let’s start with the technology. As I mentioned briefly, “phage” is short for “bacteriophage” and it’s a virus that infects bacteria. Literally, the term means bacteria-eater, but what really happens is even neater!

Phages have proven incredibly useful for biochemistry and molecular biology research because they grow really fast, only harm bacteria, have small genomes, and cool equipment. For instance, I recently told you about a phage called T7, and how its RNA polymerase protein is incredibly useful for things like making lots of copies of RNA (such as mRNA vaccines) in vitro from DNA templates. 

There we were just using a piece of a phage’s molecular machinery, but with phage display, we’re using the whole phage – and usually a different type of phage called a “filamentous phage” which gets its name because they’re long and skinny. There are different sub-types and the main phages used for phage display are M13 (not to be confused with MS13!), fd, and f1, which are super similar and therefore sometimes just grouped together and referred to as Ff phages because they bind to the bacterial F pilus (a part of the bacterium that sticks out and is involved with transferring plasmid DNA in bacteria sex). 

These phages have a circular, single-stranded DNA genome, which stretches their length and is surrounded by an outer coat of coat proteins. They dock onto bacteria through that F pilus, get the bacteria to retract their pilus, then inject their DNA, and get the bacteria to make oodles more phage particles which get extruded from the bacteria. 

One of the reasons phages have been helpful for science is their small genome size. It’s a lot easier to figure out what does what when you only have 11 genes you’re dealing with! Two of the phage genes are most commonly hijacked for phage display usage: the minor coat protein pIII (which has about 5 copies per phage at one end) and the major coat protein pVIII, which has about 2700 copies. 

You might think the more the better, so pVIII would be great! BUT, if you have a lot of copies of something, even if a single “something” isn’t a super strong binder on its own, lots of “something” can work together to bind strongly. You can think of it similarly to friction: you can interlace the pages of 2 phone books to get a super duper strong connection (I always think back to the MythBusters episode where they hooked 2 cars together with phone books!)

This sort of “stronger together” phenomenon can be referred to as “avidity” and it differs from “affinity” which refers to the stickiness of a single copy. Usually, when looking for binders, scientists are more interested in affinity than avidity, so you wouldn’t want 2700 copies, as avidity effects could mask poor affinity. In fact, often even 5 copies is too much. So, as I will tell you a little more about later, scientists often use techniques to get the phages to display a single copy of fused coat protein with the other copies being the normal “wild-type” coat protein. This is referred to as “monovalent display” and it’s a more recent advancement, so let’s go back and look to the start first… 
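
Here’s a back-of-the-envelope Python sketch of how avidity can hide poor affinity. It assumes a very simplified model (my assumption, not something from the papers): each displayed copy is independently bound with the same small probability, and a phage stays stuck as long as at least one copy is bound. Real avidity also involves things like rebinding, so treat this as a cartoon. `fraction_stuck` is a hypothetical name and the 0.01 is a made-up number.

```python
def fraction_stuck(p_single, copies):
    """Chance that at least one of `copies` independent displayed bits is bound."""
    return 1 - (1 - p_single) ** copies

weak = 0.01  # made-up chance that one copy of a weak binder is bound at any moment
for copies in (1, 5, 2700):
    print(copies, round(fraction_stuck(weak, copies), 4))
```

With 1 copy, a weak binder almost never holds on. With 2700 copies, the same weak binder practically always has at least one copy attached, so it looks like a great “sticker” even though each individual copy is lousy.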

The first “proof of concept” for phage display was published by George Smith (of the University of Missouri) in the journal Science in June 1985.

Filamentous fusion phage: novel expression vectors that display cloned antigens on the virion surface. G. P. Smith. Science, 14 Jun 1985: 1315-1317 

He describes how he stuck part of the gene for a restriction enzyme (a protein that bacteria use to cut foreign DNA) called EcoRI into the minor coat protein gene of filamentous phage and got the phage to display part of EcoRI on its surface. 

Smith used the coat protein pIII, and when it came to placement, he had to think strategically! He needed to stick the new stuff some place where 

  1. it wouldn’t interfere with the coat protein’s function and
  2. it would be on the “sticking out” part of the protein as opposed to somewhere in the membrane-spanning region or the “sticking in” side (we don’t want it to hide!)

He took advantage of what was known about pIII protein at the time. The “front end” (amino terminus) of pIII binds to the bacterial F pilus. That latching on of the pIII to pilus is a critical step in infection – so you don’t want to mess with that. And the “end end” (carboxyl terminus) of pIII is inside the phage particle and thus not “on display” (and it’s also important for virus-y stuff). 

Smith thought that you might be able to sneak in some foreign DNA in between those two ends without messing with the phage’s ability to infect and while allowing the peptide corresponding to the inserted DNA to be displayed outside the phage. At this point you might be wondering, why the heck was he trying this? (Although maybe you aren’t because in hindsight it’s proven so useful – but for much more than what he was originally thinking it could be useful for!) He envisaged it as a method for cloning genes if you had an antibody against its corresponding protein. 

Antibodies are little proteins made by immune systems that specifically bind to, and thus “recognize,” foreign molecules such as viral proteins. Antibodies have generic adapter parts called “constant regions” and unique parts called “variable regions.” The variable regions include different segments of both “heavy chains” and “light chains” – animals mix and match segments and test for binding to foreign things (and NOT self things). We call things they bind to “antigens,” and you can inject lab animals with antigen to have them make antibodies against some protein of interest.

So, say you have an antibody against a protein we’ll call X. You know that this antibody can bind X protein, but you have no idea where X is coming from because it’s “disconnected” from the DNA which has its recipe. If you cut up a genome into pieces and then stuck those pieces into phages, those phages would display bits of the corresponding protein, so some of them would display bits of X. You could then take your phages and put them on a plate coated with anti-X antibody. Some of the phages displaying bits of X would stick, but the other phages displaying all the other protein pieces wouldn’t. So you could wash all those non-stickers off and then change the conditions to get the stickers to unstick (such as by adding high salt or lowering the pH). You’ve now enriched for phages showing X. And they can show X because they contain part of the X gene. So now you can look to see what foreign DNA is in them. This will give you a sequence for part of the X gene. And you can use that as a probe to go fishing in the genome for the full X gene. Tada! 

EcoRI was useful for his proof of concept because the gene for it was known and available and antibodies against it were available too. So he could easily connect the 2. Then, once he had the system set up (which took some optimization of the exact insertion site, etc.) he could put it to work to use antibodies to fish out other proteins. And, he thought, that’s not all you could do with this antibody-on-plate setup. You could actually make a “peptide library” where you have lots and lots of short sequences of DNA you stick into the coat protein, have the phages display them, and then figure out what specific parts of a protein(s) bound to an antibody (i.e. the antibody’s “epitopes”). 

In his Nobel lecture, Smith gives a shout-out to some of the key players in his lab who did a lot of this work: Steve Parmley developed a practical phage display vector & affinity selection as a grad student (heck yes! grad students for the win!). Postdoc Jamie Scott showed you could do affinity selection of peptides from large random peptide libraries. And Robert Davis (chief manager & technician) sequenced the sequences of the hits. 

Speaking of that Nobel prize…Smith shared 1/2 of the 2018 Nobel Prize in Chemistry. The other 1/2 of that 1/2 was given to Sir Gregory Winter (and the other 1/2 of the whole thing was given to Frances Arnold for her work on enzyme evolution which is super cool). 

Winter got the award for adapting the phage display system to do the “opposite” of finding antibody-binders. He (and colleagues and peers) made it so that you could isolate *antibodies*. Instead of coating your plate with antibody, coat them with antigen. Then have the phages display parts of antibodies to see which ones bind. 

To understand why this is possible, it helps to have a little more background on antibodies. 

Even though antibodies are pretty small proteins, they still have multiple peptide chains & there are different choices for how to make the chains – at least the variable regions.

note: there are different “classes” of antibodies which differ based on their generic adapter parts. The type I’m talking about here are “IgG” antibodies. They’re the ones that are Y-shaped, with 2 chains (1 heavy, 1 light) per arm. So, they have 2 copies of a heavy chain  (~440 amino acids (protein letters) long) & 2 copies of a light chain (~ half that length). 

Both the heavy & the light chains have variable regions – the base of the Y and the bottom of the arms are the constant region and the tips of the arms are the variable regions. Those variable regions can be mixed and matched for a LOT LOT LOT LOT of possible combinations that the immune system experiments with & they both contribute to the “antigen-binding site” that forms at the tips of the arms when the chains link up through disulfide bonds (those strong bonds that can form between cysteine amino acids). 

“Antigen” is just a fancy word for a thing that an antibody binds to. Often these things are parts of invaders (e.g. viral proteins) – but they can also be made against not-usually-harmful things (like peanut proteins, hence peanut allergies). Antibodies are usually discussed in terms of evoking an immune response, but since antibodies bind to their target antigens, they also have the potential to block the target’s actions, prevent the target from binding other things, etc.

Your body usually works hard to not make antibodies against your own proteins, etc. (auto-antibodies) so you don’t get autoimmune diseases. But sometimes, our proteins can go rogue or get over-expressed or something. For instance, in some inflammatory diseases like rheumatoid arthritis, peoples’ bodies make too much of an inflammatory signaling molecule called TNFα (Tumor Necrosis Factor alpha) – so it’d be good to have antibodies that would bind to TNFα and prevent it from binding to its receptors and setting off inflammatory cascades. So an anti-TNFα antibody might be able to block all that excess signal so that you don’t get excess inflammation. But the person’s body isn’t going to make anti-TNFα antibodies because TNFα is a “self protein.” 

However, if you could get a mouse to make them… Scientists knew they could immunize mice with target antigens (inject the mice with something they wanted them to make antibodies against, like TNFα), let them make antibodies, isolate the antibody-making cells, and use them to make a lot more copies of that one “monoclonal antibody” (mAb) – thus called because it comes from a single B cell clone. In those days, the common way to do this was with “hybridomas” whereby they’d fuse the antibody-making B cell with a tumor cell to allow it to grow really well in the lab and make lots of antibody. 

But those were mouse antibodies. They have the mouse constant regions (generic parts). So, if you stuck them into people, the people’s immune systems would see them as foreign and destroy them and/or mount an immune response against them.

Once scientists became able to engineer proteins they tried to humanize mouse mAbs by taking the antigen-binding parts and sticking them onto a human generic part. So mostly human, but a little mouse-y. But could you make fully human ones? Winter thought you could – and proved it!

Winter had the idea (and so did others, but Winter was successful first!) of using phage display for a sort of “artificial evolution” of antibodies – using phages, with their vast numbers and ability to display different things, and their fast growth, to find great antibodies that could be used to treat diseases. 

You can kind of think of the phage as acting as the generic adapter part of the antibody. So instead of using B cells to display variable regions on constant regions, you could use phages to display variable regions on coat proteins. 

another quick terminology note: as introduced above, we have the 2 copies of 2 chains per antibody (at least the IgG antibodies we’re dealing with here) – there’s the heavy (H) chain and the light (L) chain. And then each of those have constant (C) regions (generic adapter parts) and variable (V) regions (unique parts). So the regions that can be mixed and matched are VH (variable heavy) & VL (variable light). And each of those have different gene sequences. note: in addition to the “germline” antibodies you can make using the original genes you inherit, they can also undergo something called “somatic hypermutation” whereby the variable regions’ DNA acquires mutations that allow it to evolve to bind even better. 

Winter wanted to isolate the variable regions, so he turned to PCR (Polymerase Chain Reaction), which is a technique that can be used to make lots of copies of (amplify) specific regions of DNA that you specify using short pieces of DNA called primers to bookend the region you want copied. 
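
To show what “bookending” with primers means, here’s a minimal “in silico PCR” sketch in Python. The template and primers are made up (and much shorter than real primers), the function names are hypothetical, and it only looks for exact matches, whereas real primers can tolerate some mismatch. It also includes the idealized “doubling every cycle” arithmetic, which real reactions only approximate.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def in_silico_pcr(template, forward_primer, reverse_primer):
    """Return the region of `template` bookended by the two primers, or None.

    The forward primer matches the top strand as written; the reverse primer
    anneals to the top strand, so we search for its reverse complement.
    """
    start = template.find(forward_primer)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]

def copies_after(cycles, starting_copies=1):
    """Ideal PCR doubles the target every cycle."""
    return starting_copies * 2 ** cycles

# Made-up template and primers, for illustration only.
template = "TTGACCATGGCGTACGATTTGCAGGCTAAGCTTCCA"
forward = "ATGGCGTA"
reverse = "AAGCTTAG"  # reverse complement of CTAAGCTT in the template

print(in_silico_pcr(template, forward, reverse))
print(copies_after(30))
```

Only the stretch between (and including) the two primer sites comes out, and the flanking letters on either side of the template get left behind. That’s how you can pull just the variable region genes out of all the other DNA.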

Winter developed primers bookending the VH & VL genes of mouse antibodies, and put in restriction sites (sequences recognized and cut by specific restriction enzymes like EcoRI) to allow them to cut and paste the genes into a vector (a piece of DNA which can serve as a “vehicle” or carrier for sticking into some sort of cell or organism). 

But what vector? What did they want to use to make the antibody parts? They tried bacteria with limited success and abandoned the project for a while. But, after seeing some success from others with random combinatorial libraries, they decided to give the project another go. 

Instead of getting cells to make secreted antibodies, they decided to try to get them to display antibodies from their surface. And they (and now you) know phages are good for this!

For their proof of concept, they took the gene for an antibody which they knew bound hen egg lysozyme, linked the VH & VL domains with a short flexible linker to make a single chain Fv fragment (scFv), fused that to pIII and… it worked! The phage bound to hen egg lysozyme. And they could enrich for this phage using antigen-affinity columns (columns filled with little beads coated with antigen – so basically like the plate thing Smith used, but on beads). 

Could they use this to identify binding antibodies without “cheating” (starting with something they knew would bind)? Yes! They used a random combinatorial library of about 20,000 phage antibody clones taken from a mouse that had been immunized against something called 2-phenyloxazol-5-one (phOx). They chose this because a lot of work had been done on phOx antibody-finding using traditional methods, so they’d have a sort of standard to compare against.

They showed that they could make a library of 10^7 phage clones & isolate antibodies that specifically bound to phOx. And, since they could trace the phage back to the genetic sequence, they could then just stick that sequence into the rest of the antibody gene and bypass hybridomas. 

Winter and others started making phage libraries of all the different possible human antibody combinations. But there are a lot so this was quite the undertaking! Random combinatorial libraries of human antibodies weren’t very large or diverse yet, but they didn’t want to wait for that to be done before putting the system to use to help people, so they started with a mouse mAb against TNFα (which they knew worked at least in mice) and swapped one chain at a time with random human versions of the other chain – so, they took a mouse’s VH & mixed it with a human VL library. And took a mouse’s VL and mixed it with a human VH library. They used affinity selection to find the best ones and then took the human parts of each of those and mixed them together and found the best ones of those. 

And, voila! Humira (adalimumab) was born! note: I’m not exactly sure of all the legal stuff – Winter cofounded a company called Cambridge Antibody Technology (CAT), and according to this article

“HUMIRA was isolated and optimized by Cambridge Antibody Technology, originally as D2E7, in collaboration with Knoll (a division of BASF) and subsequently developed by Knoll and Abbott (on acquisition of Knoll).” 

More in his Nobel Prize lecture, “Harnessing Evolution to Make Medicines”: 

Now we have huge phage antibody libraries of human antibody gene parts (like >10^10 combos huge). And, in  addition to the scFvs like Winter used, which have just the variable parts of the chains (tips of the Ys) linked together, you can use antigen binding fragments (Fabs), which have both chains of one arm of the Y (this includes a constant region of the heavy chain).

Once you have such a large phage antibody library, you can screen it against anything. The first round will probably be pretty crappy. But you can select the least-crappy and then do it again and again, allowing for selection for the very best. And, since some random mutations get introduced with all that copying going on, you get a sort of “artificial evolution” akin to what you’d see with that “somatic hypermutation” I mentioned earlier. 

Antibody pharmaceutical drugs have had tremendous success. All those drugs ending in “-mab” are monoclonal antibodies. And a lot of them were found using phage display. another terminology note: these drugs are a form of “biologic” because they’re proteins, not “small molecules” like conventional drugs.

Antibody-finding/development has been a major application, but phage display is used for a lot of other things too including mapping of binding sites.

A couple more technological tidbits on monovalent display as promised…

As I mentioned, a common phage coat protein hijacked for phage display is pIII. If you stuck a peptide fragment (or antibody part, etc.) in there, you’d expect 5 of those fragments displayed per phage (since there are typically 5 copies of pIII), but in practice the heads of pIII often get cut off by proteases (protein-cutting molecules). Nevertheless, you still would have multiple peptide fragment copies displayed and they could kinda team up and bind better once one’s bound, an effect called avidity. And it’s hard to distinguish high avidity from high affinity if you don’t know how many heads are still there. To get around this problem, scientists often turn to monovalent display (only one (mono) copy displayed), which can be achieved using “phagemid vectors.”

You know what a phage is, and you might have heard of a “plasmid,” which is a circular piece of DNA that bacteria can host in addition to their normal genome. But what’s a phagemid? It’s kind of a hybrid – it has parts of the phage genome (including the phage packaging signal and phage origin of replication (the signal for the phage DNA to get copied)) but it also has the plasmid origin of replication (the signal for the plasmid to get copied). This way, you can propagate it (keep it around) in bacteria just like you would a normal plasmid, but you can’t make shippable phage until you provide the other phage components through a helper phage. 

In addition to those requirements, the phagemid also usually has an antibiotic resistance gene so you can “select” for bacteria that contain it (grow the bacteria on food (media) spiked with the corresponding antibiotic and only bacteria with the phagemid (and thus the resistance gene) will survive). And it has whatever other gene(s) you want. 

So you can design a phagemid to have a fusion pIII (but not the rest of the phage genes) and then “superinfect” bacteria with both this mutant phagemid and a weakened “normal phage.” The normal phage will act as a “helper phage” providing normal coat protein as well as the other phage proteins. The normal pIII (coming from the helper) is incorporated into the phage much more easily (it doesn’t have a random protein piece stuck into it!) so you end up with most of the phage having all normal pIII and some having 1 fusion pIII, but practically none having more than one fusion pIII. (and don’t worry about all those “all normal” phage you made, they won’t stick and there are sooooo many phages being made it’s okay if you’re not very efficient in your mutant-making! Plus, the “normal phage” has a weaker origin of replication, so the phage genome that gets shipped out will be the mutant one)
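
Here’s a quick Python sketch of why “most have none, some have 1, practically none have more than 1” falls out naturally. It assumes (my simplification) that each of the 5 pIII slots independently ends up as a fusion with the same small probability, which makes the count per phage binomial. The 0.02 is a made-up number, and `fusion_count_distribution` is a hypothetical name.

```python
from math import comb

def fusion_count_distribution(p_fusion, slots=5):
    """Binomial chance of 0..slots fusion pIII copies on one phage."""
    return [comb(slots, k) * p_fusion**k * (1 - p_fusion)**(slots - k)
            for k in range(slots + 1)]

dist = fusion_count_distribution(p_fusion=0.02)  # 0.02 is a made-up number
for k, probability in enumerate(dist):
    print(f"{k} fusion copies: {probability:.4f}")
print(f"2 or more: {sum(dist[2:]):.4f}")
```

The lower the chance of a fusion getting in, the more “bald” phages you make, but the ones that do display something are overwhelmingly displaying just one copy, which is what you want for judging affinity rather than avidity.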

That’s just one strategy for monovalent display. There are actually quite a few. For example, there are methods where you basically add a fusion pIII into the normal phage genome but keep the normal pIII there too. Like that phagemid strategy but in one piece. So, why would you want to go with a phagemid? Reasons include some more technical notes I don’t have time to explain thoroughly but want to throw out there… An added advantage of phagemid vectors is they transfect bacteria with higher efficiencies than phage vectors (they’re easier to get into bacteria), so you can create larger libraries. They also simplify expression of soluble fragments: you put an amber stop codon between the fragment gene & the pIII gene. In amber suppressor strains, the amber codon gets read through (an amino acid gets put in instead of stopping), so the fragment gets made fused to pIII and displayed. But in non-suppressor strains, the stop signal will be obeyed and soluble fragment is made instead. 
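
That amber codon trick is easier to see with a toy example, so here’s a minimal Python sketch. The sequences are made up (not a real antibody fragment or the real pIII), `translate` is a hypothetical helper, and I’m assuming a supE-type suppressor strain, which puts in a glutamine (Q) at the amber codon (TAG). In real suppressor strains the read-through is, as far as I know, only partial, so you’d get a mix; the sketch pretends it’s all-or-nothing.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna, amber_suppressor=False):
    """Translate until a stop codon; optionally read amber (TAG) as glutamine."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == "TAG" and amber_suppressor:
            protein.append("Q")  # assumption: supE-type suppressor inserts Gln
            continue
        aa = CODON_TABLE[codon]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

fragment = "ATGGCTCATCAC"  # made-up "antibody fragment": M A H H
amber = "TAG"              # amber stop codon sitting between the two genes
coat = "GAAACTGTTAAATAA"   # made-up "pIII": E T V K, then a TAA stop
construct = fragment + amber + coat

print(translate(construct, amber_suppressor=True))   # fragment fused to coat protein
print(translate(construct, amber_suppressor=False))  # soluble fragment only
```

Same DNA, two different products: in the “suppressor” case you get one long fusion protein (for display), and in the “non-suppressor” case translation stops at the amber codon and you just get the free fragment.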

some good review article sources for more information:

Phage Display. George P. Smith and Valery A. Petrenko. Chemical Reviews 1997 97 (2), 391-410. DOI: 10.1021/cr960065d

Ledsgaard L, Kilstrup M, Karatt-Vellatt A, McCafferty J, Laustsen AH. Basics of Antibody Phage Display Technology. Toxins (Basel). 2018 Jun 9;10(6):236. doi: 10.3390/toxins10060236. PMID: 29890762; PMCID: PMC6024766. 

Sioud, M. Phage Display Libraries: From Binders to Targeted Drug Delivery and Human Therapeutics. Mol Biotechnol 61, 286–303 (2019).

Barderas, R., Benito-Peña, E. The 2018 Nobel Prize in Chemistry: phage display of peptides and antibodies. Anal Bioanal Chem 411, 2475–2479 (2019). 

I also encourage you to watch the Nobel lecture videos

Winter Nobel lecture: 

Smith Nobel lecture: 

I also encourage you to watch Frances Arnold’s Nobel lecture – I know she’s not involved in this story but she’s awesome! 

