7oponaut
5,582 posts
7oponaut
@7oponaut
sharpest knife in the drawer
Joined January 2024
145 Following
1,104 Followers
  • Pinned
    7oponaut
    @7oponaut
    Nov 21, 2024
    Consciousness is crazy, man. I'm this little guy in the pilot seat, making all the decisions
    Image
    2.5K
  • 7oponaut
    @7oponaut
    Aug 5, 2024
    Replying to @hourly_shitpost
    If nobody has 4 kids except Indians then there's no counterexample
    Image
    77K
  • 7oponaut
    @7oponaut
    Jul 24, 2024
    Replying to @ShitpostRock
    Vibes and fan service
    Image
    81K
  • 7oponaut
    @7oponaut
    Jul 3, 2024
    Replying to @TheERDoctor
    How does one reintroduce a disease into a vaccinated country?
    125K
  • 7oponaut
    @7oponaut
    May 18, 2024
    Replying to @Nimbopill
    👀
    Image
    14K
  • 7oponaut
    @7oponaut
    Apr 10, 2024
    New GPT-4 passes the magic elevator test
    Image
    63K
  • 7oponaut
    @7oponaut
    Jun 19, 2024
    Seems like the MCTSr authors did use ground truth information in the MCTS refinement process. They use the LLM for determining the rewards, but the search terminates when the output is equal to the GT. While a similar method could be used as an RL environment to train agents
    Image
    Image
    nano
    @nanulled
    Jun 16, 2024
    Replying to @teortaxesTex
    Nothing surprising tbh, I tried it 2 days ago, before the noise. The thing is that it's literally brute-forcing answers from the LLM. What if we don't know the ground truth? It will fail miserably at those tasks.
    178K
  • 7oponaut
    @7oponaut
    Jul 20, 2024
    Replying to @bryancsk
    The stick person would obviously slip to the bottom and then it looks exponential. Follow me for more math tips.
    17K
  • 7oponaut
    @7oponaut
    Jun 12, 2024
    Months of worldbuilding casually destroyed
    Tsarathustra
    @tsarnick
    Jun 12, 2024
    Mira Murati says the AI models that OpenAI have in their labs are not much more advanced than those which are publicly available
    Image
    31K
  • 7oponaut
    @7oponaut
    Sep 15, 2024
    Replying to @s_streichsbier
    It converted a natural language description of the code into Python. It could do this because the code already existed in the first place.
    13K
  • 7oponaut
    @7oponaut
    Feb 29, 2024
    I am told 350°F/175°C is close to the point of discomfort. Only issue: the method is imprecise
    Image
    15K
  • 7oponaut
    @7oponaut
    May 31, 2024
    I call it the Infinitripper
    Image
    34K
  • 7oponaut
    @7oponaut
    May 26, 2024
    I have done unspeakable things to get to this point
    Image
    139K
  • 7oponaut
    @7oponaut
    Aug 7, 2024
    Replying to @karpathy
    Another difference is that RLHF doesn't do proper exploration: it mostly learns to exploit a subset of the pretraining trajectories. In contrast, when doing proper RL the discrete action distribution is usually noised by adding an entropy term to the loss function.
    43K
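
For readers unfamiliar with the entropy term mentioned in the last post, here is a minimal sketch of a policy-gradient loss for a discrete action distribution with an entropy bonus. It is an illustration only, not code from any of the posts: the function names and the `entropy_coef` default are assumptions, and it uses plain Python rather than a real RL library.

```python
import math

def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def pg_loss_with_entropy(logits, action, advantage, entropy_coef=0.01):
    """Policy-gradient loss for one sampled action, minus an entropy bonus.

    entropy_coef is a hypothetical hyperparameter name chosen for this sketch.
    Subtracting the entropy means that minimizing the loss also pushes the
    policy toward higher entropy, which encourages exploration.
    """
    probs = softmax(logits)
    policy_loss = -advantage * math.log(probs[action])
    return policy_loss - entropy_coef * entropy(probs)
```

With `entropy_coef=0.0` this reduces to the plain policy-gradient term; a larger coefficient rewards policies that keep probability mass spread across actions.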
