Introducing SimCLR: a Simple framework for Contrastive Learning of Representations. SimCLR advances previous SOTA in self-supervised and semi-supervised learning on ImageNet by 7-10% (see next).
arxiv.org/abs/2002.05709
Joint work with @skornblith, @mo_norouzi, @geoffreyhinton.
Have you wondered why object detection, unlike classification, has so many sophisticated algorithms?
With Pix2Seq (arxiv.org/abs/2109.10852), we simply cast object detection as a language modeling task conditioned on pixels!
(with @srbhsxn, Lala Li, @fleet_dj, @geoffreyhinton)
Woah! That was one of the most unexpectedly rewarding hours of my life.
Instead of passively listening to an audiobook about physics while doing errands as usual, I had an hour-long back-and-forth conversation with Ara from Grok 3 about a bunch of scientific topics.
We started
SimCLRv2: an improved self-supervised approach for semi-supervised learning. On ImageNet with 1% of the labels, it achieves 76.6% top-1, a 22% relative improvement over previous SOTA.
arxiv.org/abs/2006.10029
Joint work with @skornblith, @kswersk, @mo_norouzi, @geoffreyhinton
📢Introducing Pix2Seq-D, a generalist framework casting panoptic segmentation as a discrete data generation task conditioned on pixels. Works for both images and videos, with minimal task engineering.
arxiv.org/abs/2210.06366
work w/ Lala Li, @srbhsxn, @geoffreyhinton, @fleet_dj
Can we solve a (large) portion of vision tasks by simply formulating them as translating raw pixels into tokens/bits with higher-level abstraction? A question that took a 2-year journey to answer: oh sure, if you know how to train a really good generative model🥳
A few months ago, we decided to focus on autoregressive modeling for lots of good reasons, like reusing the scalable LLM training and inference stacks here at @xai, but we also met many challenges along the way. It is really amazing to see how far we have been able to push it!
Earlier today, we released a new model, code-named Aurora, that gives Grok the ability to generate extremely photorealistic images (and in the future, even edit them). It's free to use for all of 𝕏, try it out and send us what you're creating!
This model was trained entirely