Jingfeng Yang (@JingfengY) / X

Jingfeng Yang

@JingfengY

Agent/RL/Research | prev @xai @amazon @google @microsoft @GeorgiaTech @PKU1898

jingfengyang.github.io

Joined April 2019

Pinned
Jingfeng Yang
@JingfengY
Feb 13, 2023
#ChatGPT and #GPT3 are hot. But let’s be practical when we want to reproduce GPT-3 or use it in our applications. Why did all of the public reproductions of GPT-3 fail? For which tasks should we use GPT-3.5/ChatGPT? I tried to answer these questions in a new blog post: jingfengyang.github.io/gpt .
Jingfeng Yang
@JingfengY
Mar 24, 2023
As an NLP researcher who has worked on semantic parsing for nearly 5 years, I have to say semantic parsing and grounding are probably also dead. FYI, semantic parsing transforms natural language into a formal language (code, self-defined functions, etc.) and executes it in the real world.
Sam Altman
@sama
Mar 23, 2023
we are starting our rollout of ChatGPT plugins. you can install plugins to help with a wide variety of tasks. we are excited to see what developers create! openai.com/blog/chatgpt-p…
Jingfeng Yang
@JingfengY
Apr 27, 2023
Should we choose to use LLMs or smaller finetuned models in practical use cases? Take a look at our survey arxiv.org/abs/2304.13712 , which covers NLU tasks, Generation tasks, Knowledge-intensive tasks, abilities regarding scaling, some miscellaneous and real-world tasks.
Jingfeng Yang
@JingfengY
Jul 10, 2025
Grok 4 Live Demo
Jingfeng Yang
@JingfengY
Aug 29, 2024
We are hiring full-time scientists and research interns at Amazon Rufus, covering LLM pretraining, post-training, evaluation, agents, alignment and AI safety.
Lihong Li
@LihongLi20
Aug 24, 2024
Amazon Rufus is an expert shopping assistant powered by GenAI. We’re hiring LLM/RL talent to work on an array of intellectually challenging science questions. Come join us in this exciting (and fun) adventure!
Jingfeng Yang
@JingfengY
Mar 17, 2023
These days many NLP researchers worry about future research directions now that GPT-4 has come out. Here are my suggestions, which I shared in several talks last month. Full slides: jingfengyang.github.io/resources/slid… . Many points are similar to what @_jasonwei suggested.
Jingfeng Yang
@JingfengY
Jul 19, 2023
I have been curious about this question for a year: are LLMs actually reasoning, or are they reciting/memorizing thought chains from the pretraining corpus? Not surprisingly, LLMs are still far from human-level reasoning in terms of task-level generalization.
arxiv.org
Reasoning or Reciting? Exploring the Capabilities and Limitations...
The impressive performance of recent language models across a wide range of tasks suggests that they possess a degree of abstract reasoning skills. Are these skills general and transferable, or...
Jingfeng Yang
@JingfengY
May 25, 2024
Thanks to Google I/O for highlighting our SelfExtend. Their notebook: colab.research.google.com/drive/1jtaOyPO… Our paper: arxiv.org/abs/2401.01325
François Chollet
@fchollet
May 19, 2024
If you missed it, this I/O session on LLMs with Keras 3 is a great tutorial on LLM training and fine-tuning best practices youtu.be/TV7qCk1dBWA
Jingfeng Yang
@JingfengY
Feb 18, 2024
New Blog Post: jingfengyang.github.io/alignment . Are #LLM #capabilities mostly coming from the base model? If so, how to elicit them during #alignment? Although it might be common sense among many people, I’m trying to clarify the controversial parts. In this post, I first define what’s
Jingfeng Yang
@JingfengY
Apr 24, 2024
New results on Llama-3's long-context abilities. Equipping Llama-3-8b/70b with SelfExtend (arxiv.org/pdf/2401.01325…), we test their in-context learning abilities on two long tasks, DialogRE and Few-NERD, from the LongICLBench benchmark (arxiv.org/pdf/2404.02060…) @WenhuChen @TianleLI123.
Jingfeng Yang
@JingfengY
Mar 14, 2023
Some takeaways from the GPT-4 main paper and blog:
Jingfeng Yang
@JingfengY
Mar 24, 2023
Replying to @yugu_nlp
Agree. But OpenAI’s speed is just surprising to me. Their grounding is probably better than any previous grounding, good enough for commercial use. Not sure whether grounding research in academia can still catch up.
Jingfeng Yang
@JingfengY
May 10, 2023
After checking 12 insightful AI safety issues in the #GPT 4 System Card and other papers, I wrote this blog post, jingfengyang.github.io/safety , trying to answer "WHY-WHAT-HOW" questions regarding AI safety. Many issues are more urgent given the recent boom of #LLMs that lack sufficient safety alignment.
Jingfeng Yang
@JingfengY
Dec 29, 2023
One advantage of PPO RLHF over DPO is its potential to create Superhuman AGI. If we believe a reward model would finally be superhuman, and human preference labels are just to elicit abilities from a reward base model, PPO RLHF would still work from the weak-to-strong