Showing posts with label proteomics. Show all posts

Friday, October 09, 2009

Nano Anglerfish Snag Orphan Enzymes

The new Science has an extremely impressive paper tackling the problem of orphan enzymes. Due primarily to Watson-Crick basepairing, our ability to sequence nucleic acids has shot far past our ability to characterize the proteins they may encode. If I want to measure an RNA's expression, I can generate an assay almost overnight by designing specific real-time PCR (aka RT-PCR aka TaqMan) probes. If I want to analyze any specific protein's expression, it generally involves a lot of teeth gnashing & frustration. If you're lucky, there is a good antibody for it -- but most times there is either no antibody or one of unknown (and probably poor) character. Mass spec based methods continue to improve, but still don't have an "analyze any protein in any biological sample anytime" character (yet?).

One result of this is that there are a lot of ORFs of unknown function in any sequenced genome. Bioinformatic approaches can make guesses for many of these, and those guesses often concern enzymatic activity, but a bioinformatic prediction is not proof and the predictions are often quite vague (such as "hydrolase"). Structural genomics efforts sometimes pull in additional proteins whose sequence didn't resemble anything of known function, but whose structure has enzymatic characteristics such as nucleotide binding pockets. One or two such structures have been de-orphaned by virtual screening, but these are a rarity.

Attempts have been made at high-throughput screening of enzyme activities. For example, several efforts have been published in which cloned libraries of proteins from a proteome were screened for enzyme activity. While these produced initial papers, they've never seemed to really catch fire.

The new paper is audacious in providing an approach to detecting enzyme activities and subsequently identifying the responsible proteins, all from protein extracts. The key trick is an array of golden nano anglerfish -- well, that's how I imagine it. Like an anglerfish, the gold nanoparticles dangle their chemical baits off long spacers (poly-A, of all things!). In reverse of an anglerfish, the bait complex glows after it has been taken by its prey, with a clever unquenching mechanism activating the fluorophore and marking that a reaction took place. But the real kicker is that, like an anglerfish, the nanoparticles seize their prey! Some clever chemistry around a bound cobalt ion (which I won't claim to understand) results in linking the enzyme to the nanoparticle, from which it can be cleaved, trypsinized and identified by mass spectrometry. 1676 known metabolites and 807 other compounds of interest were immobilized in this fashion.

As one test, the researchers separately applied extracts of the bacteria Pseudomonas putida and Streptomyces coelicolor to the arrays. Results were in quite strong agreement with the existing bioinformatic annotations of these organisms, in that the P. putida extract's pattern of metabolized and not metabolized substrates strongly coincided with what the informatics would predict and the same was true for S. coelicolor (with a P < 5.77 × 10^-177 for the latter!). But agreement was not perfect -- each species catalyzed additional reactions on the array which were absent from the databases. By identifying the bound proteins, numerous assignments were made which were either novel or significant refinements of the prior annotation. Out of 191 proteins identified in the P. putida set, 31 hypothetical proteins were assigned function, 47 proteins were assigned a different function and the previously ascribed function was confirmed for the remaining 113 proteins.
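To put those P. putida numbers in proportion, here is a minimal Python sketch that tallies the three outcomes quoted above. The counts come straight from this post; the variable names are my own, and nothing here reflects how the authors did their bookkeeping.

```python
# Counts quoted in the post for the P. putida protein set.
identified_total = 191
outcomes = {
    "hypothetical protein assigned a function": 31,
    "assigned a different function": 47,
    "prior annotation confirmed": 113,
}

# The three categories should account for every identified protein.
assert sum(outcomes.values()) == identified_total


def percent(count, total):
    """Share of the total as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {label: percent(n, identified_total) for label, n in outcomes.items()}

# Proteins whose annotation was either new or changed by the experiment.
new_or_revised = (
    outcomes["hypothetical protein assigned a function"]
    + outcomes["assigned a different function"]
)
new_or_revised_share = percent(new_or_revised, identified_total)

for label, share in shares.items():
    print(f"{label}: {share}%")
print(f"new or revised annotation: {new_or_revised_share}%")
```

So roughly two in five of the identified proteins came away with a new or revised annotation, which is a lot for genomes that had already been annotated.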

Further work was done with environmental samples. However, given the low protein abundance from such samples, these were converted into libraries cloned into E. coli and then the extracts from these E. coli strains were analyzed. Untransformed E. coli was used to estimate the background to subtract -- I must confess a certain disappointment that the paper doesn't report any novel activities for E. coli, though it isn't clear that they checked for them (but how could you not!). The samples came from three extreme environments -- one from a hot, heavy metal rich acidic pool, one from oil-contaminated seawater and a third from a deep sea hypersaline anoxic region. From each sample a plethora of enzyme activities were discovered.
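The idea of subtracting the untransformed host's background can be sketched in a few lines of Python. This is only an illustration of the general idea: the substrate names, signal values and the `min_net` cutoff are all invented, and the paper's actual normalization may well differ.

```python
def net_signals(sample, background):
    """Subtract the host-only signal per substrate, never going below zero."""
    return {
        substrate: max(signal - background.get(substrate, 0), 0)
        for substrate, signal in sample.items()
    }


def active_substrates(sample, background, min_net):
    """Substrates whose background-corrected signal reaches the cutoff."""
    net = net_signals(sample, background)
    return sorted(s for s, value in net.items() if value >= min_net)


# Invented fluorescence readings (arbitrary units) for three array spots.
library_extract = {"substrate_A": 950, "substrate_B": 400, "substrate_C": 120}
host_only = {"substrate_A": 100, "substrate_B": 380, "substrate_C": 150}

hits = active_substrates(library_extract, host_only, min_net=200)
print(hits)
```

In this toy data only `substrate_A` survives: `substrate_B` is bright, but almost all of that signal is also produced by the host alone.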

Of course, there are limits to this approach. The tethering mechanism may interfere with some enzymes acting on their substrates. It may, therefore, be desirable to place some compounds multiple times on the array but with the linker attached at different points. It is unlikely we know all possible metabolites (particularly for strange bugs from strange places), so some enzymes can't be deorphaned this way. And sensitivity issues may challenge finding some enzyme activities if very few copies of the enzyme are present.

On the other hand, as long as these issues are kept in mind this is an unprecedented & amazing haul of enzyme annotations. Application of this method to industrially important fungi & yeasts is another important area, and certainly only the bare surface of the bacterial world was scratched in this paper. Arrays with additional unnatural -- but industrially interesting -- substrates are hinted at in the paper. Finally, given the reawakened interest in small molecule metabolism in higher organisms & their diseases (such as cancer), application of this method to human samples can't be far behind.

ResearchBlogging.org
Ana Beloqui, María-Eugenia Guazzaroni, Florencio Pazos, José M. Vieites, Marta Godoy, Olga V. Golyshina, Tatyana N. Chernikova, Agnes Waliczek, Rafael Silva-Rocha, Yamal Al-ramahi, Violetta La Cono, Carmen Mendez, José A. Salas, Roberto Solano, Michail M. Yakimov, Kenneth N. Timmis, Peter N. Golyshin, & Manuel Ferrer (2009). Reactome array: Forging a link between metabolome and genome. Science, 326 (5950), 252-257. DOI: 10.1126/science.1174094

Tuesday, June 05, 2007

Tagging Up With Protein Microarrays

Molecular Systems Biology, an open access journal, has an impressive new functional protein microarray paper. The authors identified a large number of targets for a yeast ubiquitin transferase (enzymes which transfer a protein tag, ubiquitin, onto other proteins), and the data has a good ring to it.

Some background: protein microarrays are a much more complicated subject than nucleic acid microarrays. One way to split them is by intent. Capture arrays have some sort of affinity capture reagent, most likely antibodies, on the chip surface. If properly designed, built & calibrated they represent a very highly multiplexed set of protein assays. Reverse-phase protein arrays spot fractionated, but impure, proteins from biological samples on an array.

In contrast, functional protein microarrays attempt to represent a proteome on a chip as individually addressable spots in order to study aspects of that proteome. A number of groups have worked on functional protein microarrays, but there are a limited number of commercial sources, with perhaps the most successful being Invitrogen, which offers human and yeast arrays. If you'd like a great beach book on the subject, a new volume covers a wide array of topics, with Chapter 22 ("Evaluating Precision and Recall in Functional Protein Arrays") definitely my favorite.

Functional protein arrays present a huge challenge. In the ideal case the proteins would be produced, folded correctly and deposited on the slide in such a way that an assay can be run on every protein in parallel. This is a tall order, with lots of complications. Proteins may not fold correctly during expression or may unfold in the neighborhood of the slide surface, the post-translational state of the protein may be variable and is unlikely to capture all possible states of the protein, and the protein may not have key partners which are important for its function.

Despite these, and many other concerns, protein microarray experiments have been published describing various feats. Protein-protein interaction experiments to discover novel interactions (such as this one) or create comprehensive binding profiles (such as this one) are probably the most prevalent use, but the arrays can also be used to discover DNA binding proteins, identify novel enzymes, assay phenotypic differences of mutants, develop novel infectious disease diagnostic strategies, and identify the targets of protein kinases. (Links are a mix of open access & paid access; apologies.)

A wide variety of ingenious methods have been used to produce functional protein microarrays. The Invitrogen arrays are spotted from purified expressed protein and expected to bind randomly, but some other approaches ensure that the majority of protein molecules bind in a defined way. Some approaches actually synthesize the proteins in situ, and one group even deposited proteins on spots using a mass spectrometer!

Protein microarrays have had their growing pains. The amount of active protein found in a spot can vary widely. One study of protein-protein interactions failed to recover most of the known interactors of the bait protein. Since the bait is primarily a phosphoprotein binding protein, one possible explanation is that the insect-expressed human proteins were not in their correct phosphorylation state. However, poor recall of known substrates was also observed in protein kinase substrate searches run in both human and yeast (see Chapter 22 of the Predki book). Even without worrying about post-translational modification, coverage is an issue. While essentially the complete Saccharomyces proteome is available, the most extensive commercial human chip has less than 1/5th of the proteome and there are not (last I checked) commercial arrays for any other species.

The new publication wins on a bunch of scores. First, it is one of the handful of publications using such arrays which is not from one of the labs pioneering them, suggesting that they might work routinely. This publication uses the Invitrogen yeast arrays. Second, they did recover a lot of known substrates for their ubiquitinating enzyme, Rsp5. Third, the signals look very strong by eye, which has been the case for protein-protein interaction assays but much less so for protein kinase substrate discovery. Fourth, they batted 1.000 with novel positives from the array in an independent in vitro ubiquitination assay and were able to verify that at least some of these are ubiquitinated by Rsp5 in vivo (by comparing ubiquitination in wt and Rsp5 mutant strains). Fifth, they performed a protein-protein interaction microarray assay with Rsp5 and the interaction results and ubiquitination results strongly overlapped.

Of course, I used to work at Ubiquitin Proteasome Pathway Inc (which is now touting a new drug with a new target in the pathway), and there I would have been digesting this paper until arrays danced in my dreams. Such assays offer an interesting possibility for greatly expanding our understanding of UPP players and functions -- many Ub transferases or Ub-removing proteases have no known substrates. While they have a lot of issues, functional protein microarrays are starting to make a difference in proteomics.

Monday, January 08, 2007

Counting Proteins

I earlier wrote two pieces on a microfluidics chip for counting nucleic acids by limiting dilution PCR -- in one application it was used to count mRNAs and the other for counting bacteria. Last week's Science has a nice complement to this: microfluidic chips that count proteins.
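For the nucleic acid side, the counting logic behind limiting dilution is worth spelling out. A sample is split across many chambers at a dilution where some stay empty, and the fraction of empty chambers gives the average load per chamber through Poisson statistics. The sketch below is the textbook calculation under that Poisson assumption, not code from either chip paper, and the function name is my own.

```python
import math


def estimate_copies(positive_chambers, total_chambers):
    """Estimate total template copies from a limiting-dilution readout.

    Assumes copies are distributed across chambers at random (Poisson), so
    P(chamber is empty) = exp(-mean copies per chamber).
    """
    if total_chambers <= 0 or not 0 <= positive_chambers <= total_chambers:
        raise ValueError("need 0 <= positive_chambers <= total_chambers, total > 0")
    if positive_chambers == total_chambers:
        # No empty chambers means the mean cannot be estimated: dilute further.
        raise ValueError("every chamber is positive; sample is too concentrated")
    fraction_empty = 1 - positive_chambers / total_chambers
    mean_per_chamber = -math.log(fraction_empty)
    return mean_per_chamber * total_chambers


# With half of 100 chambers positive, the estimate exceeds 50 copies because
# some positive chambers started with more than one template.
print(round(estimate_copies(50, 100), 1))
```

The protein-counting chips described next sidestep this statistical step, since they tally individual fluorescent molecules directly as they pass the detector.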

A really cool aspect of the chip is its assembly line nature: it doesn't just count proteins, it performs the upstream steps as well. Starting with a sample of cells, it plucks out a single cell. The cell is rinsed and then lysed. The fluorescent antibodies are introduced (if necessary) and then an electrophoretic separation is performed. Finally, fluorescent molecules are counted as they pass through a chip region which is illuminated with a sliver of light of the correct excitation wavelength.

They actually demonstrated two variants of the basic scheme: one chip performed an immunoassay on eukaryotic cells as described above, while the other looked at naturally fluorescent proteins in cyanobacteria. The immunoassay chip analyzed a single cell, whereas the cyanobacterial chip ran three in parallel.

This is a really neat development and presuming the costs can be made reasonable, would be very interesting for many applications. But, there are very few naturally fluorescent proteins and so for most applications high quality antibodies (or equivalent specific binders) will be needed -- a nut that has yet to be cracked.

Friday, November 03, 2006

Phosphopallooza.

Protein phosphorylation is a hot topic in signal transduction research. Kinases can add phosphate groups to serines, threonines & tyrosines (and very rarely histidines), and phosphatases can take them off. These phosphorylations can shift the shape of the protein directly, or create (or destroy) binding sites for other proteins. Such bindings can in turn cause the assembly/disassembly of protein complexes, trigger the transport of a protein to another part of the cell, or lead to the protein being destroyed (or prevent such) by the proteasome. This is hardly a comprehensive list of what can happen.

Furthermore, a large share (by some estimates 1/5 to 1/4) of the pharmaceutical industry's efforts, including those at my (soon to be ex-) employer Millennium, target protein kinases. If you wish to drug kinases, you really want to know what the downstream biology is, and that starts with what your kinase phosphorylates, when it does so, and what events those phosphorylations trigger.

A large number of methods have been published for finding phosphorylation sites on proteins, but by far the most productive have been mass spectrometric ones (MS for short). Using various sample workup strategies, cleverer-and-cleverer instrument designs, and better software, the MS folks keep pushing the envelope in an impressive manner.

The latest Cell has the latest leap forward: a paper describing 6,600 phosphorylation sites (on 2,244 proteins). To put this in perspective, the total number of previously published human phosphorylation sites (by my count) was around 12,000 -- this paper has found more than half as many as were previously known! Some prior papers (such as these two examples) had found close to 2,000 sites.
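The scale is easy to check with a couple of divisions. The figures below are the ones quoted in this post (the 12,000 is my own rough count of prior sites, so treat the ratio as approximate); the variable names are just for illustration.

```python
# Figures quoted in the post.
new_sites = 6_600
proteins_with_sites = 2_244
previously_known_sites = 12_000  # the author's rough count, not an official tally

# New sites as a percentage of everything previously published.
relative_to_prior = round(100 * new_sites / previously_known_sites, 1)

# Average number of sites found per phosphoprotein in the new paper.
sites_per_protein = round(new_sites / proteins_with_sites, 1)

print(f"new sites vs. prior total: {relative_to_prior}%")
print(f"average sites per protein: {sites_per_protein}")
```

That works out to about 55% of the prior total in a single paper, at an average of nearly three sites per phosphoprotein.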

Now some of this depth came from many MS runs -- but that in itself illustrates how this task is getting simpler; otherwise so many runs wouldn't be practical. The multiple runs also were used to gather more data: looking at phosphorylation changes (quantitatively!) over a timecourse.

One thing this study wasn't designed to do is clearly assign the sites to kinases. Bioinformatic methods can be used to make guesses, but without some really painful work you can't really make a strong case. And if the site doesn't look like any pattern for a known kinase -- good luck! There really aren't great methods for solving this (not to say there aren't some really clever tries).
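To give a flavor of what such a bioinformatic guess looks like, here is a minimal Python sketch that checks a phosphosite against three heavily simplified textbook consensus patterns. The patterns are coarse approximations of kinase classes, the peptide is made up, and the function name is hypothetical; real predictors use position-specific scoring and much more context, and even then a match is only a hint.

```python
import re

# Simplified consensus patterns; the captured group is the phospho-acceptor.
# Lookaheads let overlapping motifs be found at every position.
MOTIFS = {
    "basophilic (PKA-like), R-R-x-S/T": re.compile(r"(?=RR.([ST]))"),
    "proline-directed (CDK/MAPK-like), S/T-P": re.compile(r"(?=([ST])P)"),
    "acidophilic (CK2-like), S/T-x-x-D/E": re.compile(r"(?=([ST])..[DE])"),
}


def candidate_kinase_classes(sequence, site):
    """Return motif labels whose acceptor residue sits at `site` (1-based)."""
    if not 1 <= site <= len(sequence):
        raise ValueError("site is outside the sequence")
    if sequence[site - 1] not in "STY":
        raise ValueError("site is not a Ser, Thr or Tyr")
    hits = []
    for label, pattern in MOTIFS.items():
        # Keep the label if any match places its acceptor at the queried site.
        if any(m.start(1) + 1 == site for m in pattern.finditer(sequence)):
            hits.append(label)
    return hits


peptide = "AKRRASLGGTPAAY"  # invented sequence, for illustration only
for position in (6, 10, 14):
    print(position, peptide[position - 1], candidate_kinase_classes(peptide, position))
```

The tyrosine at position 14 matches none of the patterns, which is exactly the "good luck!" situation: a real site with no recognizable motif.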

Also interesting in this study is the low degree of overlap with previous studies. While the reference set they used is probably quite a bit lower than the 12K estimate I give, it is still quite large -- and most sites in the new paper weren't found in the older ones. There are in excess of 20 million Ser/Thr/Tyr in the proteome and many are probably not phosphorylated, but certainly a reasonable estimate would be north of 20K are.

For drug discovery, the sort of timecourse data in this paper is another proof-of-concept of the idea of discovering biomarkers for your kinase using high-throughput MS approaches (another case can be found in another paper). By pushing for so many sites, the number of candidates goes up substantially, since many sites found aren't modulated in an interesting way, at least in terms of pursuing a biomarker. This is noted in Figure 3 -- for the same protein, the temporal dynamics of phosphorylation at different sites can be quite different.

However, it remains to be seen how far into the process these MS approaches can be pushed. Most likely, the sites of interest will need to be probed with immunologic assays, as previously discussed.