New open-source language model from Google AI: Flan-T5 🍮
Flan-T5 is instruction-finetuned on 1,800+ language tasks, leading to dramatically improved prompting and multi-step reasoning abilities.
Public models: bit.ly/3sbNPDJ
Paper: arxiv.org/abs/2210.11416
Quoc Le
Google Fellow
- EfficientNets: a family of more efficient & accurate image classification models. Found by architecture search and scaled up by one weird trick. Link: arxiv.org/abs/1905.11946 Github: bit.ly/30UojnC Blog: bit.ly/2JKY3qt
- Fun AutoML-Zero experiments: Evolutionary search discovers fundamental ML algorithms from scratch, e.g., small neural nets with backprop. Can evolution be the “Master Algorithm”? ;) Paper: arxiv.org/abs/2003.03384 Code: git.io/JvKrZ
- XLNet: a new pretraining method for NLP that significantly improves upon BERT on 20 tasks (e.g., SQuAD, GLUE, RACE). arXiv: arxiv.org/abs/1906.08237 GitHub (code + pretrained models): github.com/zihangdai/xlnet With Zhilin Yang, @ZihangDai, Yiming Yang, Jaime Carbonell, @rsalakhu
- New paper: Towards a Human-like Open-Domain Chatbot. Key takeaways: 1. "Perplexity is all a chatbot needs" ;) 2. We're getting closer to a high-quality chatbot that can chat about anything Paper: arxiv.org/abs/2001.09977 Blog: ai.googleblog.com/2020/01/toward…
- Want to improve accuracy and robustness of your model? Use unlabeled data! Our new work uses self-training on unlabeled data to achieve 87.4% top-1 on ImageNet, 1% better than SOTA. Huge gains are seen on harder benchmarks (ImageNet-A, C and P). Link: arxiv.org/abs/1911.04252
- Some nice improvement on ImageNet: 90% top-1 accuracy has been achieved :-) This result was made possible by using Meta Pseudo Labels, a semi-supervised learning method, to train EfficientNet-L2. More details here: arxiv.org/abs/2003.10580
- A surprising result: We found that smooth activation functions are better than ReLU for adversarial training and can lead to substantial improvements in adversarial robustness. arxiv.org/abs/2006.14536
- Happy to announce that we've released a number of models trained with Noisy Student (a semi-supervised learning method). The best model achieves 88.4% top-1 accuracy on ImageNet (SOTA). Enjoy finetuning! Link: github.com/tensorflow/tpu… Paper: arxiv.org/abs/1911.04252
- Nice blog post titled "The Quiet Semi-Supervised Revolution" by Vincent Vanhoucke. It discusses two related works by the Google Brain team: Unsupervised Data Augmentation and MixMatch. towardsdatascience.com/the-quiet-semi…
- We researchers love pre-training. Our new paper shows that pre-training is unhelpful when we have a lot of labeled data. In contrast, self-training works well even when we have a lot of labeled data. SOTA on PASCAL segmentation & COCO detection. Link: arxiv.org/abs/2006.06882
- We used architecture search to find a better architecture for object detection. Results: Better and faster architectures than Mask-RCNN, FPN and SSD architectures. Architecture also looks unexpected and pretty funky. Link: arxiv.org/abs/1904.07392
- EfficientDet: a new family of efficient object detectors. It is based on EfficientNet, and many times more efficient than state-of-the-art models. Link: arxiv.org/abs/1911.09070 Code: coming soon
- We can teach a language model to solve complex problems by showing it how to break down a problem into subproblems and solve them sequentially. This “least-to-most prompting” method solves 99.7% of the SCAN benchmark, while other prompting methods solve ~16%. arxiv.org/abs/2205.10625
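
The least-to-most idea in the last item (split a problem into subproblems, then solve them in order while feeding earlier answers into later prompts) can be sketched roughly as below. This is only an illustration under stated assumptions, not the paper's implementation: `least_to_most`, `toy_decompose`, and `toy_model` are hypothetical names, the "model" is a deterministic stand-in rather than a real language model, and the prompt layout (`Q:` / `A:` lines) is an assumed format.

```python
from typing import Callable, Dict, List


def least_to_most(
    question: str,
    decompose: Callable[[str], List[str]],
    ask_model: Callable[[str], str],
) -> str:
    """Solve subproblems in order, appending each solved pair to the prompt.

    `decompose` and `ask_model` are hypothetical hooks: in practice both
    would be calls to a language model; here they are plain callables.
    """
    context = ""  # grows with every solved subproblem
    answer = ""
    for sub in decompose(question):
        prompt = f"{context}Q: {sub}\nA:"
        answer = ask_model(prompt)
        context += f"Q: {sub}\nA: {answer}\n"
    return answer  # answer to the last (hardest) subproblem


def toy_decompose(question: str) -> List[str]:
    # Assumption for the toy task: prefixes ordered from easiest to hardest,
    # e.g. "jump twice" -> ["jump", "jump twice"].
    words = question.split()
    return [" ".join(words[:i]) for i in range(1, len(words) + 1)]


def toy_model(prompt: str) -> str:
    # Stand-in for a language model: it reads earlier Q/A pairs from the
    # prompt and reuses them, which is the point of the technique.
    lines = prompt.strip().split("\n")
    current = lines[-2][len("Q: "):]  # the final line is the open "A:"
    solved: Dict[str, str] = {}
    for q_line, a_line in zip(lines[0:-2:2], lines[1:-2:2]):
        solved[q_line[len("Q: "):]] = a_line[len("A: "):]
    words = current.split()
    if words[-1] == "twice":
        base = " ".join(words[:-1])
        return solved[base] + " " + solved[base]
    return current.upper()


print(least_to_most("jump twice", toy_decompose, toy_model))  # JUMP JUMP
```

With these stand-ins, the second prompt already contains the solved pair for "jump", so the toy model only has to repeat an answer it was given instead of solving the full command from scratch.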

















