Do you ever hear scientists talking about genes being “expressed” and wonder what the heck they’re talking about? You’re not alone – even scientists often want clarification because “expression” can mean multiple things and can be measured multiple ways to tell you about different aspects of the “expression” and post-expression process. Are we talking about protein recipes being copied into RNA from their DNA gene (transcription)? Protein being made from those copies (translation)? Do you take into account how well those proteins survive (proteasomal degradation)? Here’s an overview of some of the techniques that are commonly used to measure those various aspects.
Proteins are cellular “workers” made up of amino acid building blocks. I like to think of them as “baked goods” like cookies & cakes. Cells are constantly dealing w/different demands, and they have to be able to adapt their supply to meet them. There are many different ways in which they do this & the further down the protein production pipeline, the more quickly effects can be seen, but the less efficient the process (like turning off a faucet vs cleaning up the mess). The original recipes are written in DNA in the form of genes, bound together into “cookbook volumes” called chromosomes housed in a membrane-bound room in your cells called the nucleus. To make a protein, the cells make a messenger RNA (mRNA) copy of the gene-encoded recipe in a process called transcription, then (if it passes the security check) the mRNA recipe gets sent out into the general part of the cell (cytoplasm) where “chefs” called ribosomes turn it into a protein in a “baking” process called translation. https://bit.ly/translationtimestwo
Each cell has the instructions for making every protein you could ever need. But you don’t need each actual protein all the time in each cell (e.g. a restaurant doesn’t need to make ravioli at breakfast time). So the cells regulate what proteins get made when, how many copies get made, and how long those proteins stick around. Going back to our bakery analogy, and talking in terms of cupcakes, the bakery can regulate
- how many copies of the cupcake recipe are made and delivered to the chefs (transcriptional regulation)
- how long those recipe copies are available to the chefs (post-transcriptional regulation)
- how many cupcakes are made from those recipe copies (translational regulation)
- how long those cupcakes stick around & whether they have cherries or mold added (post-translational regulation)
The bakery can do this for each recipe, separately controlling the amounts of ravioli, cupcakes, cookies, etc. And your cells can do this for each protein recipe.
When people refer to “expression,” what they usually really care about is a certain protein being present in a cell (does the bakery have cupcakes when you arrive). So, for example, if someone says that a certain receptor isn’t expressed in some cell type, that cell type doesn’t have that receptor present and therefore won’t respond to the corresponding ligand (binding partner). Therefore, one way to measure “expression” is to measure the amount of that protein present, which you can do using methods including western blotting and mass spectrometry.
Western blotting allows you to “go fishing” for specific proteins. For “bait” you use a detectable antibody – a protein that binds specifically to other things, in this case a part of your specific protein. You take some cells, break them open (lyse them), separate out the membrane bits by spinning that lysate really fast in a centrifuge so the heavy things pellet out, and then you take the liquid part (supernatant) to work with. There are lots of proteins in there but you want to find how much (if any) is the protein you’re interested in, so you’re gonna go fishing.
You start by separating the proteins by size by running them through an SDS-PAGE gel – the gel matrix is like a mesh that slows down bigger proteins more cuz they get tangled up in the mesh as they try to squeeze through. Once you’ve separated them you need to transfer them to something more stable before they diffuse out or you tear the gel. So you transfer them to a membrane (like a durable piece of paper that proteins like to stick to). You do this by applying charge in the horizontal direction (not vertical like the gel was).
You put the membrane in a bath filled with that antibody to let it bind, then see how much bound. Sometimes the protein-specific antibodies you use are labeled (e.g. with a fluorophore so they’ll give off light at a certain wavelength if you shine light of another specific wavelength on them). Other times, you have to use a second set of antibodies that bind to the first set of antibodies and are labeled. Either way there’s a lot of washing and blocking and stuff to make sure that you’re only detecting legit binding – they’re a real pain…
The stronger the signal you see, the more protein was present. But that’s assuming all antibodies are created equal, which they aren’t. Some antibodies kinda suck… So it’s hard to get absolute quantification from western blotting. What you can do is compare different situations, such as different cell types, or the same cell type but at different timepoints or with & without some drug. But in order to compare, you need to have a point of reference in common between the samples, so you can normalize the amounts if needed. Basically, you want to make sure you loaded the same amount of cell stuff in each lane so it’s a fair comparison. To do this, you typically do a second fishing step for a “loading control” protein like tubulin or actin, which is expressed at pretty stable levels in various cells. The bands you get when fishing for these should be similar if you want to be able to compare the bands you get for the other proteins. http://bit.ly/westernblotworkflow
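If you like seeing the normalization idea as arithmetic, here’s a minimal Python sketch. The band intensities are made-up numbers (not real data) and the function names are just for illustration. It divides each target band by the loading-control band from the same lane, then expresses every lane relative to the first one.

```python
def normalize_to_loading_control(target_bands, control_bands):
    """Divide each target band intensity by the loading-control band from the same lane."""
    if len(target_bands) != len(control_bands):
        raise ValueError("need one loading-control band per lane")
    return [t / c for t, c in zip(target_bands, control_bands)]


def fold_change_vs_first_lane(normalized):
    """Express each normalized value relative to the first lane (e.g. untreated cells)."""
    reference = normalized[0]
    return [value / reference for value in normalized]


# Hypothetical band intensities (arbitrary units), one value per lane
target = [1000.0, 3000.0, 500.0]   # protein of interest
actin = [2000.0, 4000.0, 2000.0]   # loading control

normalized = normalize_to_loading_control(target, actin)
relative = fold_change_vs_first_lane(normalized)
print(relative)
```

Note that lane 2 has 3x the raw signal of lane 1, but it also had 2x as much “cell stuff” loaded, so after normalizing it only comes out 1.5x higher.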
You can also look at “all” proteins, and quantify their levels – if you hear “all” think “-omics”! Hear “-omics” think “big data.” The -omics folks use computer wizardry to look at big data sets and find trends, etc. Genomics looks at DNA, transcriptomics looks at mRNA, proteomics looks at proteins. Mass spectrometry (mass-spec) splits proteins into small charged (ionic) fragments then uses software to match those fragments to the proteins they came from to figure out what proteins were in a mixture. The technique has some biases (certain proteins are more easily detected, etc.) but the more of those fragments are detected, the more of that protein was likely present.
But, with either of these techniques, the amount of protein present is also going to depend on how much the protein is being degraded, so measuring the current amount of protein present is only a proxy for the actual protein “expression” step.
What you’d see:
- protein expressed – ↑ protein levels
- expressed protein degraded – ↓ protein levels
The 2 processes might even cancel each other out, so you wouldn’t even know that a protein was technically being “expressed” if you just looked at protein levels.
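Here’s a toy Python sketch of why protein levels alone can hide what’s going on. It assumes a deliberately simple model that isn’t from any particular experiment: protein is made at a constant rate and each copy has a fixed chance of being degraded per unit time, so the level settles where making and shredding balance out (level = synthesis rate / degradation rate). The rates are invented numbers.

```python
def steady_state_level(synthesis_rate, degradation_rate):
    """Level where production balances decay in the toy model
    d(level)/dt = synthesis_rate - degradation_rate * level."""
    if degradation_rate <= 0:
        raise ValueError("degradation_rate must be positive")
    return synthesis_rate / degradation_rate


# Hypothetical rates: a "lazy" bakery vs a busy one with a busy shredder
slow_turnover = steady_state_level(synthesis_rate=50.0, degradation_rate=0.5)
fast_turnover = steady_state_level(synthesis_rate=200.0, degradation_rate=2.0)
print(slow_turnover, fast_turnover)
```

The second scenario makes protein 4 times faster but also degrades it 4 times faster, so under this model both end up at the same level and a single snapshot of protein abundance can’t tell them apart.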
What if the protein’s being made, it’s just not lasting long enough to detect? Unlike cakes, which are single-use, proteins usually are designed to be used multiple times. They’re used rather than consumed. But they can get “thrown out” if desired. Unwanted proteins are tagged with a chain of a little protein called ubiquitin, which flags them for a protein version of a paper shredder called the proteasome. The proteasome shreds them so you can recycle the parts. https://bit.ly/ubiquitinylation
To test for degradation, you can do your western blot and/or mass spec experiment with and without a proteasome inhibitor. If your protein is normally getting shredded lots, you should see a much stronger band when you add the inhibitor. Alternatively or additionally, you can test for ubiquitinylation (which you can do using a few methods, including immunoprecipitating (IP-ing) your protein using a protein-specific antibody to get it to stick to antibody-coated beads, and then doing a western blot with anti-ubiquitin antibodies). If you see that the protein’s being ubiquitinylated, its levels are likely being regulated post-translationally.
So, we’ve seen how to measure current levels, and degradation, but if you really want to know about *protein expression,* there are several techniques you can use to actually measure the protein-making process (translation) as it’s happening, before degradation can occur. These methods include polysome profiling, where you look at how many ribosomes are on that protein’s mRNA – more in a minute, but this is basically asking, how many protein chefs are making protein from each recipe copy? These chefs are protein/RNA complexes called ribosomes, and they travel along mRNAs, following the mRNA’s instructions to piece together amino acids (protein letters) to form the corresponding protein. A single ribosome on an mRNA is called a monosome. When multiple ribosomes are on an mRNA, we call it a polysome, and it’s indicative of protein being made. You can measure it by taking mRNAs from a cell and spinning them in a sugar gradient – the mRNAs will separate based on how heavy they are, and the heavier one is, the further it will sink. The more ribosomes on an mRNA, the heavier it will be and therefore you can tell apart non- or poorly-expressed mRNAs and highly-expressed mRNAs.
This technique is called polysome profiling and it can be used for multiple things. You can detect “global” differences if the ratios of monosomes vs polysomes are skewed, and/or you can look to see where certain recipes end up by doing something called a northern blot on the various fractions (more on northern blots in a minute). Different mRNAs are different lengths, and the longer the mRNA, the more ribosomes can be bound at a time (but the longer it will take each to finish). You can take this into account by looking at ribosomes per length unit when comparing. https://bit.ly/polysomeprofiling
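The “ribosomes per length unit” comparison is simple enough to sketch in Python. The ribosome counts and mRNA lengths below are hypothetical, and per-kilobase is just one convenient choice of length unit.

```python
def ribosomes_per_kb(ribosome_count, mrna_length_nt):
    """Ribosome density: ribosomes bound per 1000 nucleotides of mRNA."""
    if mrna_length_nt <= 0:
        raise ValueError("mrna_length_nt must be positive")
    return ribosome_count / (mrna_length_nt / 1000)


# Hypothetical mRNAs: the long one carries more ribosomes in total...
short_density = ribosomes_per_kb(ribosome_count=5, mrna_length_nt=1000)
long_density = ribosomes_per_kb(ribosome_count=10, mrna_length_nt=4000)
print(short_density, long_density)
```

The long mRNA would sink further in the gradient because it carries 10 ribosomes, but per length unit it’s actually loaded half as densely as the short one (2.5 vs 5 ribosomes per kb).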
So, how much protein gets made depends on how many ribosomes are on each mRNA. But you can only stuff a certain number of ribosomes on an mRNA at a time before you have the biochemical equivalent of “too many cooks in the kitchen.” So, another way cells can boost the amount of protein being made is by increasing the number of copies of that protein’s recipe (messenger RNA, aka mRNA) that are present in the cell, i.e. increasing the expression of the *gene*. The amount of such mRNA is going to depend on how much of that mRNA is made (in a process called transcription) and how quickly/to what extent that mRNA is degraded.
Transcription is regulated in large part by transcription factors, which are proteins that bind to the region in front of genes and influence (either positively or negatively) whether that gene gets transcribed. Different genes have binding sites for multiple different transcription factors. Some genes use some of the same transcription factors, so the same proteins can regulate multiple related genes.
Post-transcriptionally, mRNAs can be regulated by mechanisms including my favorite, RNA interference (RNAi)/micro-RNA mediated regulation. microRNAs (miRNAs) are short pieces of RNA (~22nt long; nt = nucleotides, aka RNA letters) that “match” (well, are complementary to) sequences in mRNAs, typically in their 3’UTR (untranslated region), which is the end part of the mRNA, past the protein-making instructions (coding region). miRNAs bind to a protein called Ago which uses them to find mRNAs to repress. There are thousands of miRNAs and different mRNAs have different combinations of different miRNA binding sites. Upon binding, Ago is able to recruit cofactors that help degrade the mRNA. Thus, by controlling the production of miRNAs, cells can help control the levels of various mRNAs. http://bit.ly/microRNARNAi
We can measure how much mRNA is present at any given time using a northern blot or RT-qPCR (which allows us to measure levels of specific mRNAs) and/or mRNA-seq (which looks at levels of “all” mRNAs).
Let’s start with qPCR. But before you start that, since RNA is less stable than DNA, you’ll want to convert the RNA into DNA form. We call this DNA copy cDNA. It differs from the original genomic DNA (gDNA) form of the gene in that it’s the edited form – it has the introns (regulatory, non-protein-instruction-containing regions) removed. Since you’re going in the reverse direction from transcription (which makes an RNA copy of a DNA gene) we call this reverse transcription.
There’s a lot of RNA – and DNA – but you only want to look for mRNA. So you need some generic mRNA feature. One thing they all have is a poly(A) tail. And what binds poly(A)? Poly(T)! So you often use a short stretch of Ts (but in the deoxyribose form cuz we want to make DNA) (oligo-dT) as a primer. This tells the reverse transcriptase where to start copying.
Quantitative PCR (qPCR) is a way to “count copies” of specific pieces of DNA. When those specific pieces are DNA copies of RNA (like the RNA copies of protein recipes) we call it Reverse Transcription (RT) qPCR. Confusingly, RT can also stand for Real Time, which refers to the measuring: the products are detected as they’re made because of the fluorescence they cause (either because of generic dyes or specific probes). Unlike traditional PCR (Polymerase Chain Reaction), which you use just to make copies of defined stretches of DNA, with the goal of making as many copies as possible, RT-qPCR counts the copies as they’re made to figure out how many copies you started with.
For primers here, you use specific sequences that are unique to specific recipes. The primers bind to the recipes and get DNA polymerase to make more copies. In the beginning there aren’t enough copies to measure directly, so you stabilize the recipe copies and make more copies of them to amplify the signal to a detectable level – the more copies you start with, the fewer amplification cycles you’ll need to do this – so you can compare how many cycles are required for different recipes, a number referred to as the quantification cycle (Cq) value – a lower Cq value means more recipes, and likely more protein. http://bit.ly/rtrtqpcrprimer
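To see how a Cq difference turns into a “how much more” number, here’s a small Python sketch. It assumes an idealized reaction where the number of copies exactly doubles every cycle (real reactions are usually a bit less efficient, and different primer sets can behave differently), so the function name, the `efficiency` parameter, and the Cq values are all illustrative assumptions.

```python
def fold_difference(cq_a, cq_b, efficiency=2.0):
    """Estimate how many times more starting template sample A had than sample B.

    Assumes the copy number is multiplied by `efficiency` every cycle
    (2.0 = perfect doubling, which is an idealization).
    """
    return efficiency ** (cq_b - cq_a)


# Hypothetical Cq values: sample A crosses the detection threshold 3 cycles earlier
ratio = fold_difference(cq_a=20.0, cq_b=23.0)
print(ratio)
```

Under the perfect-doubling assumption, needing 3 fewer cycles means sample A started with 2 x 2 x 2 = 8 times as many recipe copies.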
RT-qPCR is good if you only have a few recipes you’re interested in. If you have a lot you can use a microarray, which is like a plate with lots of wells and inside each is a miniature northern blot. What’s a northern blot? Glad you asked.
A northern blot is a way to go RNA fishing – similar in concept to what we saw with the western blot, except here you’re looking for RNA instead of protein. For “bait,” instead of antibodies you use a complementary piece of DNA that you label (usually with a radioactive phosphate). First, you separate the RNA fragments by size by running them through an agarose gel – like above, but different mesh and thicker, horizontal slab. Then you transfer them to a nylon membrane (like a durable piece of paper that nucleic acids like to stick to). You do this by applying charge in the vertical direction (not horizontal like the gel was). And then you see if the matching DNA sticks and how much. http://bit.ly/blotcompass
If you’re interested in “all” the recipes, you can use “next gen” RNA-seq. You isolate the RNA and use random primers to make lots of copies of whatever’s there so there’s enough to see.
But, with any of these methods, once again, we have the problem that by using these sorts of “steady-state” measurements we can’t tease apart the effects of transcription vs post-transcriptional decay. For example, if you find that an mRNA is present in low levels that could mean
- the mRNA’s gene isn’t being transcribed much, so you’re not making much of that mRNA
- the mRNA is getting made, but quickly degraded
- a combination of the 2
Thankfully, there are ways you can figure out which scenario you’re dealing with (but they’ll take extra work so most people don’t do them unless they really need to know).
To test how many mRNA copies are being made you can use a technique called Pol II ChIP-seq. Pol II is the RNA polymerase that acts as a sort of DNA to RNA Xerox machine, making (pre)mRNA recipe copies from genes (these copies then go through some processing to form mature mRNAs, but I’m not going to go into that here). If you see a lot of Pol II hanging out at a gene, there’s a pretty good chance that gene is getting transcribed. One way to be able to “see it” is to freeze Pol II in place, which you can do by cross-linking it to the DNA it’s bound to (cross-linking is where you use UV light and/or chemicals to get strong bonds to form between molecules that are only transiently bound). Then you isolate the Pol II using antibodies that are specific to it (this is where you get the name CHromatin ImmunoPrecipitation), use nucleases (DNA/RNA chewers) to cut away “excess” DNA and then sequence the DNA that the protein was bound to. This gives you a sense of the recipe’s popularity in terms of transcription.
Note: There’s a similar method called Ribo-Seq that does the same sort of thing but with ribosomes so you can see which mRNAs are actively being translated into protein. https://bit.ly/ribosomefootprinting
Bottom line, “expression” can mean multiple things. Strictest-definition-wise, you can talk about gene expression in reference to transcription and protein expression in reference to translation. But you have to remember that you’re also dealing with decay of the transcribed mRNAs and decay of the translated proteins. So the actual amount of “product” you have is the sum of multiple processes and levels of regulation. And each technique I’ve told you about gives you a different glimpse at aspect(s) of those processes, but nothing can give you the full picture in one go.
More on topics mentioned (& others): #365DaysOfScience All (with topics listed) 👉 http://bit.ly/2OllAB0