Jingfeng Yang
579 posts
Jingfeng Yang
@JingfengY
Agent/RL/Research | prev @xai @amazon @google @microsoft @GeorgiaTech @PKU1898
jingfengyang.github.io
Joined April 2019
732
Following
2,597
Followers

  • Pinned
    Jingfeng Yang
    @JingfengY
    Feb 13, 2023
    #ChatGPT and #GPT3 are hot. But let’s be practical when we want to reproduce GPT-3 or use it in our applications. Why did all of the public reproductions of GPT-3 fail? For which tasks should we use GPT-3.5/ChatGPT? I tried to answer these questions in a new blog post: jingfengyang.github.io/gpt .
    94K
  • Jingfeng Yang
    @JingfengY
    Mar 24, 2023
    As an NLP researcher who has worked on semantic parsing for nearly 5 years, I have to say semantic parsing and grounding are probably also dead. FYI, semantic parsing transforms natural language into a formal language (code, self-defined functions, etc.) and executes it in the real world.
    Sam Altman
    OpenAI
    @sama
    Mar 23, 2023
    we are starting our rollout of ChatGPT plugins. you can install plugins to help with a wide variety of tasks. we are excited to see what developers create! openai.com/blog/chatgpt-p…
    365K
  • Jingfeng Yang
    @JingfengY
    Apr 27, 2023
    Should we choose to use LLMs or smaller finetuned models in practical use cases? Take a look at our survey arxiv.org/abs/2304.13712 , which covers NLU tasks, Generation tasks, Knowledge-intensive tasks, abilities regarding scaling, some miscellaneous and real-world tasks.
    100K
  • Jingfeng Yang
    @JingfengY
    Jul 10, 2025
    Grok 4 Live Demo
    41K
  • Jingfeng Yang
    @JingfengY
    Aug 29, 2024
    We are hiring full-time scientists and research interns at Amazon Rufus, covering LLM pretraining, post-training, evaluation, agents, alignment and AI safety.
    Lihong Li
    @LihongLi20
    Aug 24, 2024
    Amazon Rufus is an expert shopping assistant powered by GenAI. We’re hiring LLM/RL talents to work on an array of intellectually challenging science questions. Come join us in this exciting (and fun) adventure!
    28K
  • Jingfeng Yang
    @JingfengY
    Mar 17, 2023
    These days many NLP researchers worry about future research directions now that GPT-4 is out. Here are my suggestions, which I shared in several talks last month. Full slides: jingfengyang.github.io/resources/slid… . Many points are similar to what @_jasonwei suggested.
    25K
  • Jingfeng Yang
    @JingfengY
    Jul 19, 2023
    I have been curious about this question for a year: are LLMs actually reasoning, or are they reciting/memorizing thought chains from the pretraining corpus? Not surprisingly, LLMs are still far from human-level reasoning in terms of task-level generalization.
    arxiv.org
    Reasoning or Reciting? Exploring the Capabilities and Limitations...
    The impressive performance of recent language models across a wide range of tasks suggests that they possess a degree of abstract reasoning skills. Are these skills general and transferable, or...
    17K
  • Jingfeng Yang
    @JingfengY
    May 25, 2024
    Thanks to Google I/O for highlighting our SelfExtend. Their notebook: colab.research.google.com/drive/1jtaOyPO… Our paper: arxiv.org/abs/2401.01325
    François Chollet
    @fchollet
    May 19, 2024
    If you missed it, this I/O session on LLMs with Keras 3 is a great tutorial on LLM training and fine-tuning best practices youtu.be/TV7qCk1dBWA
    14K
  • Jingfeng Yang
    @JingfengY
    Feb 18, 2024
    New Blog Post: jingfengyang.github.io/alignment . Are #LLM #capabilities mostly coming from the base model? If so, how to elicit them during #alignment? Although it might be common sense among many people, I’m trying to clarify the controversial parts. In this post, I first define what’s
    12K
  • Jingfeng Yang
    @JingfengY
    Apr 24, 2024
    New results on Llama-3's long-context abilities. Equipping Llama-3-8b/70b with SelfExtend (arxiv.org/pdf/2401.01325…), we test their in-context learning abilities on two long tasks: DialogRe and FewNerd from the LongCIL benchmark (arxiv.org/pdf/2404.02060…) @WenhuChen @TianleLI123.
    19K
  • Jingfeng Yang
    @JingfengY
    Mar 14, 2023
    Some takeaways from the GPT-4 main paper and blog:
    8.1K
  • Jingfeng Yang
    @JingfengY
    Mar 24, 2023
    Replying to @yugu_nlp
    Agree. But OpenAI’s speed is just surprising me. Their grounding is probably better than any previous grounding to reach a level of commercial usage. Not sure whether grounding research in academia could still catch up.
    6.4K
  • Jingfeng Yang
    @JingfengY
    May 10, 2023
    After checking 12 insightful AI safety issues in the #GPT 4 System Card and other papers, I wrote this blog post, jingfengyang.github.io/safety , trying to answer "WHY-WHAT-HOW" questions regarding AI safety. Many issues are more urgent given the recent boom of #LLMs without enough safety alignment.
    5.7K
  • Jingfeng Yang
    @JingfengY
    Dec 29, 2023
    One advantage of PPO RLHF over DPO is its potential to create Superhuman AGI. If we believe a reward model would finally be superhuman, and human preference labels are just to elicit abilities from a reward base model, PPO RLHF would still work from the weak-to-strong
    18K