Dmitry Kobak
2,702 posts
Dmitry Kobak
@hippopedoid
Researcher at Ghent University and VIB-AI. Manifold learning, contrastive learning, scRNAseq data. Excess mortality. Born but to die and reas'ning but to err.
Ghent, Belgium
dkobak.github.io
Joined December 2019
233 Following · 7,373 Followers
  • Pinned
    Dmitry Kobak
    @hippopedoid
    May 14
    Logged in for the first time in over a year to say that I stopped using Twitter and moved to Bsky (same username). Wasn't using it much either until now but plan to revive. See you there! Also, I moved from Tübingen to Ghent and joined @ugent and @_VIB_AI! Exciting times.
    506
  • Dmitry Kobak
    @hippopedoid
    Mar 30, 2023
    We held a reading group on Transformers (watched videos / read blog posts / studied papers by @giffmana @karpathy @ch402 @amaarora @JayAlammar @srush_nlp et al.), and now I _finally_ roughly understand what attention does. Here is my take on it. A summary thread. 1/n
    462K
  • Dmitry Kobak
    @hippopedoid
    Jun 27, 2023
    The smallest p-value that I've ever seen reported is 😱 3.6 * 10^-2382 😱 Paper: science.org/doi/10.1126/sc… Sci-hub link: sci-hub.st/10.1126/scienc… If you ever see a smaller p-value, do let me know! I collect them. [1/3]
    761K
  • Dmitry Kobak
    @hippopedoid
    Jun 21, 2024
    How many academic papers are written with the help of ChatGPT? To answer this question, we analyzed 14 million PubMed abstracts from 2010 to 2024 and looked for excess words: ** Delving into ChatGPT usage in academic writing through excess vocabulary ** arxiv.org/abs/2406.07016 1/11
    490K
  • Dmitry Kobak
    @hippopedoid
    Apr 13, 2023
    Really excited to present new work by @ritagonmar: we visualized the entire PubMed library, 21 million biomedical and life science papers, and learned a lot about -- THE LANDSCAPE OF BIOMEDICAL RESEARCH biorxiv.org/content/10.110… Joint work with @CellTypist and @benmschmidt. 1/n
    797K
  • Dmitry Kobak
    @hippopedoid
    May 26, 2023
    Updated our preprint to add a map of retracted papers (11k) in PubMed (21m). There are clear clusters and we believe it's paper mill activity. E.g. there is a small island with 11% (!) retractions; we argue that the other 89% are all VERY suspicious and deserve further scrutiny.
    564K
  • Dmitry Kobak
    @hippopedoid
    Nov 2, 2020
    I am teaching Machine Learning I this semester at @uni_tue. Lectures will be posted online. Here is Lecture 1 with some introduction about ML vs statistics, and then a detailed treatment of a baby version of linear regression, inc. baby gradient descent. youtu.be/lWGdFeMsjzg
  • Dmitry Kobak
    @hippopedoid
    Sep 30, 2021
    So what's up with the Russian election two weeks ago? Was there fraud? Of course there was fraud. Widespread ballot stuffing was videotaped etc., but we can also prove fraud using statistics. See these *integer peaks* in the histograms of the polling station results? 🕵️‍♂️ [1/n]
  • Dmitry Kobak
    @hippopedoid
    Sep 13, 2021
    I am late to the party (was on holidays), but have now read @lpachter's "Specious Art" paper as well as ~300 quote tweets/threads, played with the code, and can add my two cents. Spoiler: I disagree with their conclusions. Some claims re t-SNE/UMAP are misleading. Thread. 🐘
    Lior Pachter
    @lpachter
    Aug 27, 2021
    It's time to stop making t-SNE & UMAP plots. In a new preprint w/ Tara Chari we show that while they display some correlation with the underlying high-dimension data, they don't preserve local or global structure & are misleading. They're also arbitrary.🧵biorxiv.org/content/10.110…
  • Dmitry Kobak
    @hippopedoid
    Sep 21, 2020
    Weirdly enough, I now have over 2^10 followers here on Twitter despite writing almost exclusively about t-SNE ;-) So here is a ❤️-shaped t-SNE embedding of MNIST. Can you guess how I made it? Explanation tomorrow.
  • Dmitry Kobak
    @hippopedoid
    Dec 20, 2019
    A year ago in Nature Biotechnology, Becht et al. argued that UMAP preserved global structure better than t-SNE. Now @GCLinderman and I wrote a comment saying that their results were entirely due to the different initialization choices: biorxiv.org/content/10.110…. Thread. (1/n)
    biorxiv.org
    UMAP does not preserve global structure any better than t-SNE when using the same initialization
    One of the most ubiquitous analysis tools employed in single-cell transcriptomics and cytometry is t-distributed stochastic neighbor embedding (t-SNE) [1], used to visualize individual cells as...
  • Dmitry Kobak
    @hippopedoid
    Feb 1, 2021
    My and @GCLinderman's comment on the Becht et al. 2018 (@EtienneBecht) paper has finally appeared in @NatureBiotech after over a year of editorial considerations. There is no response from the authors, so I assume we are all in agreement :-) nature.com/articles/s4158…
  • Dmitry Kobak
    @hippopedoid
    Apr 15, 2020
    Did you know that the optimal ridge penalty λ in linear regression can be *negative*? It's always strictly positive when n>p. Or when cov(x)=I. Or when true β is random. But here we argue that it can be zero or even negative when p>>n: arxiv.org/abs/1805.10939. HOW?! [1/n]
  • Dmitry Kobak
    @hippopedoid
    Dec 2, 2022
    A very long overdue thread: happy to share preprint led by Sebastian Damrich from @FredHamprecht's lab. *From t-SNE to UMAP with contrastive learning* arxiv.org/abs/2206.01816 I think we have finally understood the *real* difference between t-SNE and UMAP. It involves NCE! [1/n]
