Gilad
1,357 posts
Gilad
@giladturok
CS PhD student @cornell | diffusion language models | Prev: @Uber, @FlatironInst, @Columbia
NYC 🗽🇺🇸
giladturok.github.io
Joined November 2023
4,704 Following
1,371 Followers
  • Pinned
    Gilad
    @giladturok
    Mar 12
    1/ 🚨 New paper! DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking. We give masked diffusion models (MDMs) a proper likelihood, and therefore a proper perplexity, for the first time. Turns out MDMs are closer to autoregressive models than previously thought.
    23K
  • Gilad
    @giladturok
    Oct 27, 2024
    Amazingly *clear* monograph on Monte Carlo & sampling: arxiv.org/pdf/2405.16359. Highly recommend skimming if you want to learn these topics!
    74K
  • Gilad
    @giladturok
    Jun 29, 2025
    sometimes ppl forget: flow matching and diffusion are mathematically equivalent!
    44K
  • Gilad
    @giladturok
    Aug 14, 2024
    Cool blog post on convergence of the Unadjusted Langevin Algorithm (ULA): fa.bianp.net/blog/2023/ulaq/ If you're intimidated by complex convergence proofs, this one is excellently written! ULA underpins diffusion models and is related to the Markov chain Monte Carlo literature.
    13K
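For readers who have not seen ULA before, here is a minimal sketch of the update rule, not taken from the linked post. It assumes a one-dimensional standard normal target; the names `ula_sample`, `standard_normal_score`, and `mean_and_variance` are hypothetical helpers chosen for illustration.

```python
import math
import random


def ula_sample(grad_log_p, x0, step_size, n_steps, rng):
    """Run the Unadjusted Langevin Algorithm in one dimension.

    Update rule: x <- x + h * grad_log_p(x) + sqrt(2h) * xi, with xi ~ N(0, 1).
    There is no Metropolis accept/reject step, hence "unadjusted".
    """
    x = x0
    noise_scale = math.sqrt(2.0 * step_size)
    samples = []
    for _ in range(n_steps):
        x = x + step_size * grad_log_p(x) + noise_scale * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def standard_normal_score(x):
    # Assumed target: N(0, 1), so d/dx log p(x) = d/dx (-x^2 / 2) = -x.
    return -x


def mean_and_variance(xs):
    # Plain sample mean and (biased) sample variance.
    m = sum(xs) / len(xs)
    v = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, v


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = ula_sample(standard_normal_score, x0=3.0, step_size=0.1,
                   n_steps=50_000, rng=rng)
mean, var = mean_and_variance(draws[5_000:])  # discard burn-in
```

For this particular target the chain is a linear recursion whose stationary variance is 1 / (1 - h/2), about 1.05 for h = 0.1 rather than exactly 1. That small gap is the discretization bias that comes from skipping the accept/reject correction.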
  • Gilad
    @giladturok
    Nov 21, 2024
    What are SOTA algorithms for sampling from *discrete* distributions? I already know about Gibbs and Gibbs with Gradients.
    16K
  • Gilad
    @giladturok
    Nov 8, 2024
    "we don't know what to do with those chains yet, but I have the gut feeling that once we do, probabilistic programming will have another breakthrough" - @remilouf on parallelizing 100s of MCMC chains with JAX
    12K
  • Gilad
    @giladturok
    Dec 17, 2024
    Thinkin bout normalizing flows to generate samples from probability distributions. What methods meet these criteria:
    1. learn the entire trajectory, not just the final distribution p(x)
    2. unbiased
    3. don't back-propagate through the ODE solver
    LFIS (link below) is the only one I found.
    8.8K
  • Gilad
    @giladturok
    May 24, 2024
    Replying to @AnthropicAI
    Golden Gate Claude has a big secret. It might struggle on MMLU, but it excels on our new challenging benchmarks:
    - What is the best bridge (WITBB)
    - What precious metal is coolest (WPMIC)
    - Why do fences suck (WDFS)
    7.2K
  • Gilad
    @giladturok
    Jul 14, 2024
    Had a great time presenting a poster at @ISBA_events on dynamic step sizes for Markov chain Monte Carlo methods. Joint work with Bob Carpenter and Chirag Modi.
    2.9K
  • Gilad
    @giladturok
    Jun 20, 2025
    Replying to @giffmana
    I love JAX so freakin much. It's a shame that no one uses it in production besides Google. Guess I'll have to get a job at Google one day to make my JAX dreams come true 🤷‍♂️😎
    4.1K
  • Gilad
    @giladturok
    Sep 11, 2024
    Saying something *actually helpful* about Scientific Communication is pretty hard. Most articles out there don't actually tell you anything useful. The one exception is the amazing presentation from Trevor Campbell. Here's a summary, a short🧵:
    docs.google.com
    How to Explain Things
    Guidelines for Effective Scientific Communication. Trevor Campbell, Associate Professor, UBC Statistics
    2.6K
  • Gilad
    @giladturok
    Jul 28, 2024
    I'm currently in my JAX era:
    - Write my own JAX projects for ML + stats
    - Do serious deep learning with {model, tensor, data} parallelism
    - Write JAX-style code (functional) with grad, vmap, pmap, jit
    - Understand why JAX is fast (jit, SPMD, lax)
    - PyTorch vs JAX trade-offs
    10K
  • Gilad
    @giladturok
    May 1, 2025
    🚨Excited to present my 1st paper at @aistats_conf in Thailand! Joint work with Chirag Modi + Bob Carpenter. We develop an MCMC method that *efficiently* adapts step size for multiscale (high curvature) probability distributions: arxiv.org/abs/2406.02741 DM me to chat at AISTATS!
    arxiv.org
    Sampling From Multiscale Densities With Delayed Rejection...
    Hamiltonian Monte Carlo (HMC) is the mainstay of applied Bayesian inference for differentiable models. However, HMC still struggles to sample from hierarchical models that induce densities with...
    2.2K
  • Gilad
    @giladturok
    Jul 29, 2024
    Replying to @jeremyphoward @Railway and 2 others
    Seems amazing but also
    Image
    4K
