CRISPR/Cas diagnostics – where collateral damage is a good thing! CRISPR/Cas is a bacterial immune system mechanism in which CRISPR RNA guides direct Cas proteins to cut specific DNA or RNA sequences (such as those of viruses). This system has been adapted for multiple lab uses; you probably most often hear about it in the context of gene editing (aka genetic engineering) but its potential uses go much further. And some of the most exciting uses can be found in the field of diagnostics. In May, the FDA granted an Emergency Use Authorization (EUA) to the first CRISPR-based diagnostic test – it uses CRISPR/Cas to detect coronavirus RNA. Other similar tests are also in the works, so I thought I’d tell you about how they work!
The (jargonny) gist and then I’ll explain further: these tests program promiscuous Cas proteins with guide RNAs complementing the viral genome of interest. Binding to the corresponding viral sequences (if present in a sample) acts as a sort of switch that activates Cas to start cleaving single-stranded reporter RNAs or DNAs which have a label on one end. Depending on the test method, this can produce fluorescence (give off light) or, in the rapid “lateral flow” model (pregnancy test style) the reporter pieces get separated on a test strip and the location of the colored lines tells you if it was cleaved. The basic idea is:
- extract viral RNA from patient sample (saliva, nose swab, etc.)
- make copies of it using an isothermal (single temp) amplification technique
- add reporter probes & CRISPR/Cas programmed with sequences matching viral genome
- detect products (e.g. apply to lateral flow strip or stick in fluorescence scanner)
Now for the dejargoning & the details, because that’s where the cool stuff is!
CRISPR/Cas is a bacterial immune mechanism – scientists didn’t invent it, these buggies did! Scientists just figured out how it works and how, if we purify the components and put them where we want them, we can get them to work for us. CRISPR stands for “Clustered Regularly Interspaced Short Palindromic Repeats” and Cas stands for “Crispr ASsociated proteins.” The idea with CRISPR/Cas is that a short piece of RNA (the crispr guide) directs a Cas protein to matching sequences of DNA (or RNA in some cases) and then Cas cuts it. It’s useful in the laboratory because, if a cell tries to repair the damage, it often leads to inactivating mutations, thus “knocking out” genes. And if you provide alternative DNA it can paste it in. (it’s definitely not this simple and more on its use in gene editing later)
On the diagnostics side, by using guide RNAs that complement the genetic information of the virus they’re looking for, scientists can use CRISPR/Cas systems to find and cleave that viral information if it’s present in a patient’s sample. So, for instance, they can set Cas off to search for coronavirus RNA. (note: this is happening in a sample from a patient’s nose or spit or something, it’s not being used to seek out the virus in the person’s body)
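If it helps to see the “guide complements the target” idea in code, here’s a minimal toy sketch. Everything in it is made up for illustration: the sequences are not real viral sequences, the function names are hypothetical, and real Cas proteins tolerate some mismatches and have extra requirements that this exact-match search ignores.

```python
from typing import List

# Watson-Crick pairing for RNA: A pairs with U, C pairs with G
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the strand that would base-pair (antiparallel) with `rna`."""
    return rna.translate(RNA_COMPLEMENT)[::-1]


def find_target_sites(guide: str, sample_rna: str) -> List[int]:
    """Return start positions in `sample_rna` that the guide pairs with perfectly."""
    site = reverse_complement(guide)  # the stretch of target the guide binds
    positions = []
    start = sample_rna.find(site)
    while start != -1:
        positions.append(start)
        start = sample_rna.find(site, start + 1)
    return positions


# Toy (invented) sequences, 5'->3'
toy_viral_rna = "GGAUCCAUGGCUAGCUUAGCAA"
toy_guide = reverse_complement("AUGGCUAGCU")  # guide designed against one stretch

print(find_target_sites(toy_guide, toy_viral_rna))   # site found -> Cas would be activated
print(find_target_sites(toy_guide, "AAAAAAAAAAAA"))  # no site -> Cas stays off
```

The point is just the switch-like logic: a non-empty result stands in for “the matching sequence was in the sample”.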
Problems are, 1) there’s not that much of it, so you want to amplify the signal – and 2) you need a way to see it! These problems are solution-ized with a couple of key modifications including using labeled “reporter RNAs or DNAs” and Cas proteins that are much more promiscuous than the ones used for gene editing.
There are multiple types of Cas proteins (our scissors) and scientists choose which to use based on what they’re trying to achieve. The type that’s historically been used the most for gene editing is Cas9. It cleaves dsDNA and it’s really specific (though not quite specific enough, so scientists are turning to alternatives and/or genetically tweaking Cas9 to make it even pickier).
When your goal is gene editing, this pickiness is crucial – you want your scissors to be as precise as possible to avoid “off-target effects” like messing up other genes. But, when diagnostics is your goal, “collateral damage” can be a major asset! Some Cas proteins, once they get activated by binding to specific matching DNA or RNA, go cut happy, cleaving any single-stranded RNA (or DNA, depending on the Cas) around, including labeled reporters. (This sort of indiscriminate cutting may seem kinda whacky, but if you think about it further, it’s actually pretty smart – bacteria with these Cas proteins only have to be able to recognize a little snippet of an invader and they can destroy all of it!)
So the idea with CRISPR-based diagnostic tests is to use viral genetic info to activate Cas to cleave reporter RNAs (or DNAs) and see if cutting occurred, which would indicate that Cas was activated. Since Cas is only activated if it finds the matching sequence, this tells you that the viral sequence was present in the sample.
Different tests differ in which Cas proteins they use and how the reporters give their readout. There are a couple of main players in the CRISPR diagnostics field: Sherlock & Mammoth (there are others but I’m going to focus on these 2 & their techniques). These companies are offshoots of academic research labs and were around before covid – they’ve just gotten a bit of a covid boost – but the technology can theoretically be used to detect “any” virus – or bacterium, or fungus or whatever – as long as you know its genetic sequence.
In addition to being headquartered on opposite sides of the country (Sherlock is based out of Boston and Mammoth is based out of San Francisco), the companies differ in their choice of Cas protein. Sherlock uses a Cas13 protein, which is activated by single-stranded RNA (ssRNA) and collaterally cleaves ssRNA. Mammoth uses a Cas12a protein, which is activated by dsDNA & collaterally cleaves ssDNA.
The groups have built workflows around these proteins and given them fun acronyms: the Cas13 one is SHERLOCK (Specific High Sensitivity Enzymatic Reporter UnLOCKing) (you don’t need to be Sherlock to figure out this is the method used by Sherlock!) And the Cas12a one is DNA Endonuclease-Targeted CRISPR Trans Reporter (DETECTR).
The tests share the same overall strategy: make copies of the sequence you want to detect (or at least attempt to make copies) -> add labeled reporter & Cas loaded with complementary guide -> the sequence (if present) will activate Cas -> Cas will cleave reporters -> reporters will report (so listen to them! or at least look for them :P)
Even with the signal amplification that comes with cleaving lots of copies of the reporter probes, you still need enough copies of the viral sequence in order to activate all the Cas proteins. So the first step of these tests (after extracting the RNA from spit or nose gunk) is to make lots of copies of the viral genetic information.
This might remind you of “conventional” diagnostic tests for “the coronavirus” (SARS-CoV-2, which is just one of many but it’s the one our world is currently in the midst of dealing with). The conventional tests are based on RT-PCR (Reverse Transcription – Polymerase Chain Reaction), which requires fancy PCR machines that go through cycles of temperature ups and downs. But these CRISPR tests typically use “isothermal amplification” techniques such as RPA (Recombinase Polymerase Amplification) or LAMP (Loop-mediated isothermal AMPlification). These isothermal techniques are quicker and are carried out at a single temperature, so they only require a water bath or a heating block or something similar.
Which amplification technique you use depends on the starting material and the Cas you’re using. SARS-CoV-2 holds its genetic information (genome) in a single strand of RNA. But other viruses have DNA genomes, as do bacteria and fungi. CRISPR tests can detect all of these, but they have to make sure that they make the copies in the form that will activate the Cas they’re using.
For SHERLOCK, this means making ssRNA copies, which can be done using an enzyme (reaction mediator/speed-upper) called T7 RNA polymerase which makes RNA copies of DNA (a process called transcription). So the basic workflows are:
if you start with RNA… reverse transcribe into DNA & make DNA copies -> use T7 polymerase to make ssRNA copies -> add loaded Cas13 & labeled ssRNA -> measure readout
if you start with DNA… make DNA copies -> use T7 polymerase to make ssRNA copies -> add loaded Cas13 & labeled ssRNA -> measure readout
We needed ssRNA to activate Cas13, but for Cas12a, you need dsDNA, so the process is slightly different. Instead of making ssRNA copies you make dsDNA copies. And instead of ssRNA reporters you use ssDNA reporters. So, for DETECTR, the basic workflows are:
if you start with RNA… reverse transcribe into DNA & make DNA copies -> add loaded Cas12a & labeled ssDNA -> measure readout
if you start with DNA… make DNA copies -> add loaded Cas12a & labeled ssDNA -> measure readout
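Those four workflows really only differ in two choices (what you start with and which Cas you use), which you can sketch as a little lookup function. This is just my own bookkeeping of the steps described above, with a hypothetical function name; it’s not either company’s actual protocol.

```python
from typing import List


def plan_workflow(starting_material: str, method: str) -> List[str]:
    """List the basic steps for a given starting material and detection method."""
    if starting_material not in ("RNA", "DNA"):
        raise ValueError("starting_material must be 'RNA' or 'DNA'")
    if method not in ("SHERLOCK", "DETECTR"):
        raise ValueError("method must be 'SHERLOCK' or 'DETECTR'")

    steps = []
    if starting_material == "RNA":
        # RNA has to be turned into DNA before it can be amplified
        steps.append("reverse transcribe RNA into DNA")
    steps.append("make DNA copies (isothermal amplification)")

    if method == "SHERLOCK":
        # Cas13 is activated by ssRNA, so transcribe the DNA copies back into RNA
        steps.append("use T7 RNA polymerase to make ssRNA copies")
        steps.append("add loaded Cas13 & labeled ssRNA reporters")
    else:
        # Cas12a is activated by dsDNA, so the DNA copies can be used directly
        steps.append("add loaded Cas12a & labeled ssDNA reporters")

    steps.append("measure readout")
    return steps


for step in plan_workflow("RNA", "SHERLOCK"):
    print("->", step)
```

Notice that SHERLOCK always picks up the extra T7 transcription step, while DETECTR starting from DNA is the shortest route.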
There are a couple of main ways to measure readout. First the pros & cons, and then what’s going on (at the molecular level). Fluorescence-based is more sensitive (because fluorescence gives a much stronger signal than “color”) but it requires special machines to detect (though you can adapt these tests to run in a “high-throughput” style where you test a lot of samples at once in different wells of a plate). Colorimetric assays (tests where the readout is something you can see with the naked eye) can be made fairly cheaply and potentially be used *anywhere.* They lose a bit of sensitivity, but should be good enough to detect if a person is in the contagious phase of coronavirus infection; super sensitive tests like RT-PCR often pick up harmless “naked” fragments of viral RNA that patients can shed for weeks (or even months) after infection. Those harmless pieces are typically shed at low levels which the colorimetric assays probably can’t detect, but if you just want to know whether someone’s contagious, that’s fine.
The Sherlock test that has the FDA Emergency Use Authorization (EUA) uses fluorescence. At one end of the reporter (which in this case is ssRNA) is a fluorophore (a molecule that will give off light of a specific wavelength if you shine light of a different specific wavelength on it). And at the other end is a quencher, which can “steal the light” before it’s given off. That stealing happens through something called FRET (Förster Resonance Energy Transfer) which involves the non-radiative transfer of energy – so basically the energy that would have been given off as light instead just gets passed to the quencher. The key to FRET is that it’s distance-dependent. The quencher can only steal the energy if it’s really close. It’s close enough for this in the uncleaved reporter RNA, but once that RNA is cleaved, the quencher half and the fluorophore half drift apart and you can see the fluorescence. So an increase in fluorescence indicates the presence of the sequence you’re looking for (e.g. the viral sequence).
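To get a feel for just how distance-dependent FRET is, here’s a quick sketch using the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the fluorophore-quencher distance and R0 (the “Förster radius”) is the distance at which half the energy gets transferred. The R0 of 5 nm below is an assumed, illustrative value (R0 depends on the specific dye pair), and the distances are made up, not measured from any real reporter.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float = 5.0) -> float:
    """Fraction of the fluorophore's energy handed to the quencher.

    forster_radius_nm is an assumed illustrative value, not one taken
    from the Sherlock test.
    """
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


# Intact reporter: quencher held close -> most energy stolen -> dark
print(f"2 nm apart:  {fret_efficiency(2.0):.3f}")
# At the Forster radius, half the energy is transferred
print(f"5 nm apart:  {fret_efficiency(5.0):.3f}")
# Cleaved reporter: halves drift apart -> almost no transfer -> fluorescence
print(f"15 nm apart: {fret_efficiency(15.0):.3f}")
```

Because of that sixth power, tripling the distance past R0 takes you from “half quenched” to “basically unquenched”, which is why cutting the reporter works so well as an on/off switch.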
Of course, as I mentioned above, you need a machine to be able to see this fluorescence, so this technology, although it doesn’t require as much equipment as PCR, is still neither home-friendly nor field-deployable to remote areas. Therefore, scientists are also working on versions with results you can see with your own eyes. These often involve the use of lateral flow strips. They look kinda like those pee-on-a-stick pregnancy tests. Mammoth has developed one of these tests for SARS-CoV-2 and published it, first as a preprint and then as a paper by Chiu et al. in Nature Biotechnology: https://go.nature.com/2FZXSXO
To refresh our memory, the Mammoth tests use DETECTR, which uses Cas12a, which recognizes dsDNA and cuts ssDNA. So our labeled reporters will be ssDNA. Stuck to one end of this ssDNA is a molecule called biotin and stuck to the other end is a molecule called FAM.
When you add a sample of your amplified DNA to the starting line of the strip it mixes with anti-FAM antibodies conjugated to (strongly attached to) gold nanoparticles (Au-NP). These nanoparticles have a reddish color and therefore whatever they’re attached to will be labeled red. In this case, the Au-NP-anti-FAM will bind to the FAM, which is attached to the reporter DNA – or at least part of the reporter DNA… If the reporter was cut, the biotin-labeled half of the DNA will be separated from the FAM-labeled half of the DNA. So the color is only labeling the FAM half and the biotin half remains uncolored.
uncleaved: color-FAM-ddddddddddd-Bi
cleaved: color-FAM-ddddd + dddddd-Bi
The test strips have 2 lines: C (control) and T (test). The Control line is coated with streptavidin, a molecule that binds strongly and specifically to biotin. The Test line is coated with antibodies that bind to the anti-FAM antibody (e.g. if your anti-FAM antibody is raised in goats you can use an anti-goat antibody at the second line). When you apply the sample to the strip it will get wicked towards the other side of the strip. Along the way, however, it will get caught at the lines.
Since the streptavidin (biotin-catching) line comes first, any biotin-labeled DNA will get caught here. Uncleaved DNA will still have the colored end attached, so this line will show as a colored strip. Note that even if you have the sequence of interest present, not *all* of the reporter will be cleaved, so you’ll “always” see a line here (though weakened if you have cleavage). The key is, if you do NOT have cleavage you will ONLY see this one line. If you do have cleavage, however, some of the gold-FAM-labeled DNA will be freed from the biotin and won’t get trapped until the second line.
So, for a NEGATIVE test:
C line: color-FAM-ddddddddddd-Bi
And for a POSITIVE test:
C line: color-FAM-ddddddddddd-Bi & dddddd-Bi
T line: color-FAM-ddddd
So, 1 line -> negative; 2 lines -> positive
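Here’s that strip-reading logic as a tiny toy model. It assumes (very simplistically) that line intensity just tracks the fraction of reporter that ends up at each line, and the 5% visibility threshold is a number I made up for illustration; real strips aren’t this tidy.

```python
def read_strip(cleaved_fraction: float, visible_threshold: float = 0.05) -> dict:
    """Toy model of the 2-line strip described above.

    cleaved_fraction: share of reporter molecules cut by activated Cas (0 to 1).
    visible_threshold: hypothetical minimum share needed to see a line by eye.
    """
    if not 0.0 <= cleaved_fraction <= 1.0:
        raise ValueError("cleaved_fraction must be between 0 and 1")

    # C line: colored, biotin-carrying (uncleaved) reporter caught by streptavidin
    c_signal = 1.0 - cleaved_fraction
    # T line: colored FAM halves freed from biotin, caught by the second antibody
    t_signal = cleaved_fraction

    c_visible = c_signal >= visible_threshold
    t_visible = t_signal >= visible_threshold
    return {
        "C": c_visible,
        "T": t_visible,
        "result": "positive" if t_visible else "negative",
    }


print(read_strip(0.0))  # no cleavage: only the C line
print(read_strip(0.4))  # some cleavage: both lines
```

So “is there a second line?” is really asking “did enough reporter get cut for the freed colored halves to pile up at T?”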
Here’s a good explainer on how the test strips (called HybriDetec) work if you want more details: https://bit.ly/2FS9gos
With this assay as originally reported, you still have to do the amplification step, but Chiu et al. show that the whole test can be done in ~30-40 min (20-30 min @ 62°C for RT-LAMP to convert the viral RNA to DNA & make DNA copies, and then 10 min at 37°C for the Cas12 detection reaction) before applying to the strip. In this paper, they extract the RNA first, which also takes time, and they use nasopharyngeal samples (those deep in the nose ones). These are a couple of the main holdups to making the test more widely available. But, as I talked about in my post on SalivaDirect, scientists are now finding that you can use spit instead of nose swabs, and you don’t even have to do a fancy-dancy extraction first. http://bit.ly/reallyrapidtests
Mammoth recently received a contract from the National Institutes of Health (NIH) as part of its USD 1.5 billion Rapid Acceleration of Diagnostics (RADx) program, although from what I read it sounds like the funding is for high-throughput testing where a machine can test a bunch of samples at once – similar to the conventional RT-PCR tests, but faster.
As I mentioned above, I’ve focused on the 2 main CRISPR/Cas detection systems but there are more. If you want to learn more about them, here’s a good (technical) review: CRISPR/Cas Systems towards Next-Generation Biosensing, Y Li et al., Trends in Biotechnology, 2019: https://doi.org/10.1016/j.tibtech.2018.12.005
That article came out before covid, which is a great reminder that SARS-CoV-2 isn’t the only microbe out there. It isn’t the first. And it won’t be the last. One of the most promising things about CRISPR/Cas is that it can be used to detect *any* of them as long as you know the sequence. And scientists are working on making “multiplexed” versions where they can screen for a wide range of pathogens at once. So stay tuned!
Finally, a note about gene editing. It’s important to realize that the Cas proteins being used in these diagnostic tests are much more promiscuous than the Cas proteins used for gene editing. The main Cas that has been used for gene editing purposes is Cas9. It *only* cuts its target dsDNA sequence – none of that indiscriminate cleavage of any ssRNA or ssDNA that happens to be around. But that’s not to say Cas9 is perfect. It has problems too.
One of the major hold-ups when it comes to the application of CRISPR technology – especially in medicine – is the potential for Cas to “mess up” and cut additional sequences. Scientists try to avoid this by making sure those guides are super specific – they don’t match anywhere else in the genome – but sometimes even a partial match can spur Cas to action, which can lead to additional genes being inactivated or otherwise altered. To make things worse, unless you go through and sequence the entire genome, you might not even know those alterations occurred. And, although many are probably harmless, a single harmful one can spell catastrophe for a patient (or could confuddle a researcher’s findings if they think some effect is due to knocking out one gene but the effect they see is actually coming from knocking out a gene they didn’t even know they knocked out!).
To try to get around these problems, scientists are turning to even pickier Cas proteins – either ones that occur naturally or ones that they’ve genetically tweaked. One strategy, for example, is to make it so that Cas only “nicks” target DNA (cuts a single strand) so that you need 2 separate guide RNAs to match (1 per strand), thus increasing specificity.
This post is part of my weekly “broadcasts from the bench” for The International Union of Biochemistry and Molecular Biology. Be sure to follow @the_IUBMB if you’re interested in biochemistry! They’re a really great international organization for biochemistry.
more on topics mentioned (& others) #365DaysOfScience All (with topics listed) 👉 http://bit.ly/2OllAB0