The protein biochemist’s toolbox almost definitely includes UniProt & ProtParam. So here are some of the basics of using these and other free online database tools to learn more about proteins. With UniProt as a launchpad you can search for proteins of interest, get their sequences, align them, find similar ones, look at their domains & where else those domains occur, figure out their likely charged-ness (from pI), their extinction coefficient (for measuring concentration with UV light), and way more. You can do way way more but I’m going to focus on the basic things I use a lot and give you an overview of what the way way more things include. 

I’m not gonna write much today, just a show-and-tell, but here are links to the tools and links to past posts about some of the things I talked about. Originally posted this 8/23/21, then added content & refreshed it 7/24-7/27/22.

UniProt

This is your basic starting point for finding & learning about a protein (it connects you to way more resources through like a bazillion links & lets you align sequences): https://www.uniprot.org/

Here’s a walk through entries (longer version on left)

Expasy ProtParam

Expasy ProtParam is the place to go if you want to know the nitty-gritty about a protein you’re studying. It lets you calculate pI, molecular weight, extinction coefficient, & more. You can access “wild-type” (normal version of a protein) info, or you can paste in your own sequence, which is super helpful if you’re working with recombinant protein expression and have modified the protein to add an affinity tag for purification or removed a floppy part or something. So it’s a protein biochemist’s friend!

Here’s a link to it: https://web.expasy.org/protparam/ 

It tells you a bunch of info. “Basic” things like how long the protein is (# of amino acids) and how “big” it is (molecular weight, in Daltons (Da)). It even tells you the number of atoms!

It also tells you about how “basic” the protein is – in terms of how many “basic” (i.e. usually positively-charged) amino acids a protein has. 
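To get a feel for what that count involves, here’s a minimal sketch that tallies the length and the charged residues of a pasted sequence, roughly the way ProtParam reports “Asp + Glu” and “Arg + Lys” totals. The function name and the example peptide are made up for illustration; this is not ProtParam’s code.

```python
from collections import Counter


def charged_residue_counts(sequence):
    """Count length and charged residues in a one-letter protein sequence."""
    # Drop spaces/newlines that sneak in when pasting, and normalize case
    seq = "".join(sequence.split()).upper()
    counts = Counter(seq)
    return {
        "length": len(seq),
        "negative (Asp + Glu)": counts["D"] + counts["E"],
        "positive (Arg + Lys)": counts["R"] + counts["K"],
    }


# Hypothetical His-tagged peptide, invented purely for illustration
example = "MKRDE HHHHHH"
print(charged_residue_counts(example))
```

Note that histidine isn’t counted as “positive” here, since it’s only partly protonated around neutral pH.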

This will impact the pI (isoelectric point), which is the pH at which the protein is neutral overall. Go below that pH and there are “excess” protons available, and the protein will be positively-charged on average. Go above that pH and there are “too few” protons available, so the protein will be negatively-charged on average. ProtParam tells you the theoretical pI, which is super useful for doing charge-based protein purification (i.e. ion exchange chromatography). If your protein has a low pI, we say it’s “acidic” and usually use anion exchange. If your protein has a high pI, we say it’s “basic” and usually use cation exchange. More on all this here: https://bit.ly/isoelectricpoint & https://youtu.be/CLgzYBm_ymk and more on ion exchange chromatography here: blog form: http://bit.ly/ionexchangechromatography ; YouTube: https://youtu.be/RGF1l572IZY
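Here’s a rough sketch of how a theoretical pI can be estimated: add up the average charge of each ionizable group at a given pH (Henderson-Hasselbalch style), then hunt for the pH where the total hits zero. The pKa values below are approximate textbook-type numbers I’m assuming for illustration; ProtParam uses its own set, so its answer will differ a bit. All the function names are hypothetical.

```python
# Approximate pKa values (an assumption for illustration, not ProtParam's set)
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA = 8.6
C_TERM_PKA = 3.6


def net_charge(sequence, ph):
    """Average net charge of the sequence at the given pH."""
    seq = sequence.upper()
    # Termini: the amino end is positive when protonated,
    # the carboxyl end is negative when deprotonated
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))
    for residue, pka in POSITIVE_PKA.items():
        charge += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    for residue, pka in NEGATIVE_PKA.items():
        charge -= seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


def estimate_pi(sequence, tolerance=1e-4):
    """Bisect between pH 0 and 14 for the pH where net charge is ~0."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positive, so the pI is at a higher pH
        else:
            high = mid  # negative (or zero), so the pI is at a lower pH
    return (low + high) / 2.0


def suggested_exchanger(pi, buffer_ph):
    """Buffer pH above the pI means a net-negative protein, and vice versa."""
    if buffer_ph > pi:
        return "anion exchange"
    if buffer_ph < pi:
        return "cation exchange"
    return "no net charge at this pH"


# Hypothetical peptide, invented for illustration
peptide = "MKRDEHHHHHH"
pi = estimate_pi(peptide)
print(round(pi, 2), suggested_exchanger(pi, buffer_ph=7.5))
```

Bisection works here because the net charge only ever goes down as the pH goes up, so there’s a single crossing point to find.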

It also gives you the extinction coefficient. The key thing is that it allows you to calculate a (pure) protein’s concentration based on how much UV light it absorbs. WAY more on this here: http://bit.ly/bradforduv & http://bit.ly/proteinmeasuring

The “Abs 0.1%” and “Abs 1%” forms of the extinction coefficient tell you the absorbance value (A) that corresponds to 1 mg/mL (0.1%) or 10 mg/mL (1%), whereas the molar extinction coefficient corresponds to 1 mol/L. Those percentages come from the weight/volume percentage convention that a 1% solution corresponds to 1 g/100 mL. More on why here: http://bit.ly/weightvolume & https://youtu.be/uo0Lx_OmKBA
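As a quick sketch of how the molar and percent versions relate: a 1 mg/mL solution is 1 g/L, which is (1 / molecular weight) mol/L, so dividing the molar extinction coefficient by the molecular weight gives the absorbance at 0.1%. The numbers below are hypothetical round values chosen to make the arithmetic easy, not a real protein.

```python
def abs_01_percent(molar_ext_coeff, molecular_weight_da):
    """Absorbance of a 1 mg/mL (0.1% w/v) solution in a 1 cm path."""
    # 1 g/L divided by MW (g/mol) is the molarity; A = epsilon * c * 1 cm
    return molar_ext_coeff / molecular_weight_da


def abs_1_percent(molar_ext_coeff, molecular_weight_da):
    """Absorbance of a 10 mg/mL (1% w/v) solution in a 1 cm path."""
    return 10 * abs_01_percent(molar_ext_coeff, molecular_weight_da)


# Hypothetical protein: epsilon = 50,000 M^-1 cm^-1, MW = 25,000 Da
print(abs_01_percent(50000, 25000))  # 1 mg/mL
print(abs_1_percent(50000, 25000))   # 10 mg/mL
```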

You can calculate the estimated extinction coefficient using free online software tools like Expasy ProtParam. I say estimated because context matters: the local environment around the absorbing part can influence how eager it is to absorb a photon.

When I do this for BSA I see this:

Extinction coefficients:

Extinction coefficients are in units of M⁻¹ cm⁻¹, at 280 nm measured in water.

Ext. coefficient    42925

Abs 0.1% (=1 g/l)   0.638, assuming all pairs of Cys residues form cystines

Ext. coefficient    40800

Abs 0.1% (=1 g/l)   0.607, assuming all Cys residues are reduced

First it tells me the values under oxidizing conditions and below that it tells me the values under reducing conditions (the intracellular environment is reducing, and we usually add reducing agents like DTT or β-mercaptoethanol to protein solutions to keep them happy outside the cell). It gives me both because cystines (pairs of cysteines crosslinked through a disulfide bond) also absorb a little at 280 nm, where applicable.
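If you’re curious where those two numbers come from, here’s a minimal sketch of that kind of sum: count the Trp, Tyr, and (paired) Cys residues and add up their contributions at 280 nm. The per-residue values (5500, 1490, 125) are the ones I understand ProtParam to use, and the residue counts (2 Trp, 20 Tyr, 35 Cys) are my assumption, picked because they reproduce the output above. Treat both as illustrative rather than official.

```python
# Per-residue contributions at 280 nm in M^-1 cm^-1 (assumed values, see above)
EXT_TRP = 5500
EXT_TYR = 1490
EXT_CYSTINE = 125  # per disulfide-bonded PAIR of Cys, not per Cys


def extinction_coefficients(n_trp, n_tyr, n_cys):
    """Return (oxidized, reduced) molar extinction coefficients."""
    reduced = n_trp * EXT_TRP + n_tyr * EXT_TYR
    # Oxidized case assumes every possible Cys pair forms a cystine;
    # an odd Cys is left over and contributes nothing
    oxidized = reduced + (n_cys // 2) * EXT_CYSTINE
    return oxidized, reduced


# Residue counts are an assumption chosen to match the output shown above
oxidized, reduced = extinction_coefficients(n_trp=2, n_tyr=20, n_cys=35)
print(oxidized, reduced)  # 42925 40800
```

Notice how small the cystine contribution is compared to Trp & Tyr, which is why the two values are so close.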

Then I can plug this into Beer’s law (or have the computer do it for me) if I measure the absorbance. 

You measure this absorbance using something called a spectrophotometer. Basically, it shines light through a solution and measures to what extent different wavelengths make it through (are transmitted) versus don’t make it through (are absorbed). This can be converted into the concentration of solute (dissolved molecules) using Beer’s Law.

The equation is: A = εcl

A = absorbance

ε = extinction coefficient (aka molar absorptivity coefficient) – specific for particular molecule & particular wavelength; units of L mol⁻¹ cm⁻¹

c = concentration (in mol/L) – this is molarity – a mole is just a chemist’s “baker’s dozen” – it’s Avogadro’s number (6.022 x 10^23) of something – solute molecules or donuts, it’s just a number http://bit.ly/2r4RnrX

l = path length (in cm)
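Rearranging Beer’s Law to solve for concentration gives c = A / (εl). Here’s a minimal sketch of that calculation. The absorbance reading of 0.50 is a made-up example, and the ε and Abs 0.1% values are the oxidized BSA numbers from the ProtParam output above; the function names are hypothetical.

```python
def molar_concentration(absorbance, epsilon, path_cm=1.0):
    """Beer's Law rearranged: c = A / (epsilon * l), in mol/L."""
    return absorbance / (epsilon * path_cm)


def mg_per_ml(absorbance, abs_01_percent, path_cm=1.0):
    """Same idea using the Abs 0.1% value (absorbance of 1 mg/mL)."""
    return absorbance / (abs_01_percent * path_cm)


a280 = 0.50  # hypothetical spectrophotometer reading

c_molar = molar_concentration(a280, epsilon=42925)
print(f"{c_molar * 1e6:.1f} uM")       # about 11.6 uM

c_mass = mg_per_ml(a280, abs_01_percent=0.638)
print(f"{c_mass:.2f} mg/mL")           # about 0.78 mg/mL
```

If you diluted your sample before measuring, remember to multiply the result by the dilution factor.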

For Beer’s Law, you only need the absorbance at a single wavelength – but there’s much more to learn if you look at the whole spectrum (or at least a couple key values). Because molecules have overlapping spectra (e.g. both DNA & RNA absorb light of 260nm wavelength) you look to ratios. 

In the near-UV, proteins have an absorbance peak at 280 nm (mostly thanks to the rings in tryptophan & tyrosine), and this is where we typically calculate from. Proteins also absorb at 230 nm, and that absorbance is from the generic backbone part: it corresponds to absorbance by the peptide bonds linking the amino acid “letters”. These peptide bonds also have resonance, but not as much as rings do, and they absorb ~190-230 nm.

Because DNA absorbs so strongly at 260 nm, where protein doesn’t, it’s relatively easy to see if you have DNA in your protein prep, but it’s harder to tell if you have protein in your DNA prep, since the 260 signal will dominate the 260/280 ratio.
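Here’s a tiny sketch of the ratio check. The reference values (around 0.57 for pure protein and around 1.8 for pure DNA) are commonly quoted rules of thumb that I’m assuming here; they aren’t from this post, and real samples vary. The readings are made up for illustration.

```python
# Rule-of-thumb reference ratios (assumptions, not exact constants)
PURE_PROTEIN_260_280 = 0.57
PURE_DNA_260_280 = 1.8


def ratio_260_280(a260, a280):
    """Ratio of absorbance at 260 nm to absorbance at 280 nm."""
    if a280 <= 0:
        raise ValueError("A280 must be positive to compute a ratio")
    return a260 / a280


# Hypothetical readings: (A260, A280)
readings = {
    "protein prep": (0.29, 0.50),
    "same prep with DNA carryover": (0.60, 0.50),
}

for label, (a260, a280) in readings.items():
    ratio = ratio_260_280(a260, a280)
    print(f"{label}: 260/280 = {ratio:.2f} "
          f"(pure protein ~{PURE_PROTEIN_260_280}, pure DNA ~{PURE_DNA_260_280})")
```

A little DNA pushes a protein prep’s ratio up noticeably, whereas a little protein barely nudges a DNA prep’s ratio down.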

Moral of the story: there’s no “one right way” to count your proteins & it’s important to carefully choose the one you use!

For more practical protein-purification posts (and background/theory), check out the new page on my blog where I’ve collected some of my protein purification posts. http://bit.ly/proteinpurificationtech

more about the PDB: https://bit.ly/pdbstructures    

more about using the extinction coefficient to find protein concentration based on UV absorbance: http://bit.ly/proteinmeasuring 

more about pI, protein charge, and ion exchange chromatography: http://bit.ly/ionexchangechromatography 

more about proteins & how they get their structure: https://bit.ly/proteinstructure 

more about PyMol: https://bit.ly/pymolintro & https://bit.ly/pymolmovies   

more about Ago: https://bit.ly/agostructurestuff  

more about x-ray crystallography: http://bit.ly/xraycrystallography2   

more about cryoEM: https://bit.ly/cryoEMbumblyintro

more about all sorts of things:  #365DaysOfScience All (with topics listed) 👉 http://bit.ly/2OllAB0 or search blog: https://thebumblingbiochemist.com     

