Roshan Rao
733 posts
Roshan Rao
@proteinrosh
Research scientist @Biohub. Foundation models for biology. Prev: Co-Founder/RS @EvoscaleAI, RS @MetaAI, PhD @berkeley_ai.
New York, NY
rmrao.github.io
Joined February 2019
589
Following
3,864
Followers
  • Roshan Rao
    @proteinrosh
    Feb 16, 2021
    Excited to release our new paper on unsupervised attention-based MSA protein language modeling! 🧵 (1/8) Paper: biorxiv.org/content/10.110… Code: github.com/facebookresear…
  • Roshan Rao
    @proteinrosh
    Nov 1, 2022
    We are thrilled to announce the ESM Metagenomic Atlas (esmatlas.com)! In this effort we folded the entirety of MGnify90 and are releasing all folded structures. This database contains >617 million structures, of which >225 million are predicted with high confidence.
    Screenshot of the dataset visualization available at esmatlas.com. A 2D representation is shown, with proteins clustered by similarity in a language model embedding. Each protein is colored by its sequence novelty. On the right, metadata for a particular protein in MGnify is shown. Closest PDB structure and closest UniRef90 sequence information is provided for the MGnify protein along with the predicted structure.
  • Roshan Rao
    @proteinrosh
    Dec 10, 2019
    Are you a computational biologist trying to embed some proteins? Jealous of NLP researchers for @huggingface's easy-to-use repository of models? Come check out our re-release of our TAPE code (github.com/songlab-cal/ta…), complete with a huggingface-style API for loading models!
  • Roshan Rao
    @proteinrosh
    Jun 25, 2024
    We have trained ESM3, a generative bidirectional masked language model that reasons over the sequence, structure, and function of proteins. ESM3 is trained at three model scales - 1.4B, 7B, and 98B. x.com/alexrives/stat…
    Alex Rives
    @alexrives
    Jun 25, 2024
    We have trained ESM3 and we're excited to introduce EvolutionaryScale. ESM3 is a generative language model for programming biology. In experiments, we found ESM3 can simulate 500M years of evolution to generate new fluorescent proteins. Read more: evolutionaryscale.ai/blog/esm3-rele…
  • Roshan Rao
    @proteinrosh
    Dec 16, 2020
    Our new paper on unsupervised contact prediction with protein LMs is up on bioRxiv! Examining Transformers trained with MLM on protein sequences, we find attention maps predict contacts *better* than Potts models trained on the corresponding MSA. 1/12
    biorxiv.org
    Transformer protein language models are unsupervised structure learners
    Unsupervised contact prediction is central to uncovering physical, structural, and functional constraints for protein structure determination and design. For decades, the predominant approach has...
  • Roshan Rao
    @proteinrosh
    Dec 9, 2021
    Here’s a video of my talk! I tried to make it relatively accessible to people without much background in either biology or ML. Definitely unbiased reviewers (my roommates) suggest I at least partially succeeded. youtu.be/hcJS9d09ECA
  • Roshan Rao
    @proteinrosh
    May 1, 2020
    TAPE v0.4 is released! In addition to a number of bugfixes, we've added the TRRosetta model for structure prediction so that you can play around with predicting structure in pytorch!
    from tape import TRRosetta

    model = TRRosetta.from_pretrained('xaa')  # xaa, xab, xac, xad, xae
    predictions = model(<input_msa>)
  • Roshan Rao
    @proteinrosh
    Dec 8, 2021
    Post defense drinks ❤️
  • Roshan Rao
    @proteinrosh
    Oct 3, 2021
    This is a helpful thread, but I want to point out that I and many others got into competitive PhD programs without having any publications. Strong letters of recommendation can and will outweigh publications.
    Chaitanya K. Joshi
    @chaitjo
    Oct 3, 2021
    Are you applying for a PhD in Machine Learning, Artificial Intelligence, and beyond? Here's a thread of high-quality resources that helped me understand the process + craft my application better. 👇
  • Roshan Rao
    @proteinrosh
    Jun 25, 2024
    Working on ESM3 has been the most challenging and the most rewarding part of my career. I am incredibly proud of the team we have built - y’all make it so fun to come in to work each day.
    Thomas Hayes
    @THayes427
    Jun 25, 2024
    Replying to @THayes427
    I’m incredibly grateful to work with this amazing team. This is the most dedicated and creative team I’ve ever worked with, and I’m so excited to continue building. Please don’t hesitate to reach out if you’re interested!
  • Roshan Rao
    @proteinrosh
    Jun 7, 2021
    Interesting paper! Arguably, this is exactly what our MSA Transformer is - a model that alternates between attention within a sequence and attention across different sequences. Big difference is they use random mini batching as opposed to an explicit search for related points.
    Jannik Kossen
    @janundnik
    Jun 7, 2021
    🗞New Paper🗞 🤖🧪Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning 🧪🤖 Huge thanks to @neilbband* as well as @clarelyle, @AidanNGomez, @tom_rainforth, @yaringal, and @OATML_Oxford ! Introducing 🚀Non-Parametric Transformers🚀 1/
    Overview of the Non-Parametric Transformer. (a) The input dataset and mask matrix are stacked and (b) linearly embedded for all datapoints independently. NPT then applies (c) Attention Between Datapoints across all n samples of hidden dimension h = d · e. (d) Attention Between Attributes then attends between the attributes for each datapoint independently. We repeat steps (c) and (d) and obtain a final prediction from a separate linear projection.
  • Roshan Rao
    @proteinrosh
    Dec 8, 2021
    Please call me Roshan, Dr. Rao is my mother’s name. #phdlife
  • Roshan Rao
    @proteinrosh
    Sep 22, 2021
    I don't fully agree here. Large deep network models can have emergent behavior, beyond what they are explicitly designed to do. I wouldn't have predicted that AlphaFold could model protein complexes, but it is clearly able to do so in some cases, even without paired MSAs. (1/5)
    Ewan Birney
    @ewanbirney
    Sep 21, 2021
    A reminder for bioinformaticians - AlphaFold works off the *real* multiple alignment, created by evolution. Flipping an amino acid in the target protein to model a mutation will not work in AlphaFold. Please don't do it. Please don't write papers about how it doesn't work.
  • Roshan Rao
    @proteinrosh
    Dec 7, 2022
    I figured the right approach to learning a model of protein stability was to wait for Gabe Rocklin to make a big enough dataset
    bioRxiv Biophysics
    @biorxiv_biophys
    Dec 7, 2022
    Mega-scale experimental analysis of protein folding stability in biology and protein design biorxiv.org/cgi/content/sh… #biorxiv_biophys
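
The Jun 7, 2021 post above describes the MSA Transformer as a model that alternates between attention within a sequence and attention across different sequences. A minimal sketch of that alternation, in plain Python, might look like the following. It is not the published implementation: it uses a single head with identity query/key/value projections, and the function names (`self_attention`, `alternating_block`) and the toy input shape are assumptions made for illustration only.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Single-head dot-product self-attention with a residual connection.

    `vectors` is a list of d-dimensional vectors. Queries, keys, and values
    are the inputs themselves (no learned projections), to keep this short.
    """
    d = len(vectors[0])
    out = []
    for q in vectors:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in vectors]
        weights = softmax(scores)
        mixed = [sum(w * v[j] for w, v in zip(weights, vectors)) for j in range(d)]
        out.append([x + y for x, y in zip(q, mixed)])
    return out


def alternating_block(msa):
    """One block of alternating attention over an MSA-shaped input.

    msa[r][c] is the embedding of sequence r at alignment position c.
    """
    # Step 1: attention within each sequence (over alignment positions).
    msa = [self_attention(row) for row in msa]
    # Step 2: attention across sequences, separately at each position.
    num_seqs, seq_len = len(msa), len(msa[0])
    columns = [
        self_attention([msa[r][c] for r in range(num_seqs)])
        for c in range(seq_len)
    ]
    # Transpose back to [sequence][position] layout.
    return [[columns[c][r] for c in range(seq_len)] for r in range(num_seqs)]


# Toy input: 3 sequences, 5 aligned positions, 4-dimensional embeddings.
random.seed(0)
msa = [[[random.gauss(0, 1) for _ in range(4)] for _ in range(5)] for _ in range(3)]
out = alternating_block(msa)
```

The output has the same shape as the input, so blocks like this can be stacked. Because of the second step, each sequence's output depends on the other sequences in the alignment, which is the property the post highlights.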