surya
529 posts
surya
@suryasure05
inference @nvidia (prev. @groqinc) // ece @westernu
sf + toronto
suryasure.com
Joined October 2023
927 Following
3,490 Followers
  • Pinned
    user avatar
    surya
    @suryasure05
    Aug 18, 2025
    I spent my summer building TinyTPU : An open source ML inference and training chip. it can do end to end inference + training ENTIRELY on chip. here's how I did it👇:
    Image
    355K
  • user avatar
    surya
    @suryasure05
    Jul 2, 2025
    doing ML on hardware is hard man, we need more ppl working on this
    56K
  • user avatar
    surya
    @suryasure05
    Jun 1, 2025
    spent the last two weeks building an ML inference chip to solve the XOR problem
    Image
    70K
  • user avatar
    surya
    @suryasure05
    Apr 2, 2025
    built a glove controlled robotic hand cause why not
    Image
    9.3K
  • user avatar
    surya
    @suryasure05
    May 3, 2025
    i built a hardware accelerator to compute softmax more efficiently. used a pipelined approach to create a high-throughput module which performs the entire softmax computation in silicon. here's how I did it:
    Image
    7.4K
  • user avatar
    surya
    @suryasure05
    Aug 18, 2025
    Replying to @suryasure05
    I worked on this with @evanliin, @XanderChin, and @kennykgguo — incredibly smart people! check out our article to see how our chip works and how we went about developing it: tinytpu.com you can also find the code here and play with it yourself:
    12K
  • user avatar
    surya
    @suryasure05
    Mar 8, 2025
    over the past week, I implemented a systolic array, an architecture used in the TPU and many other chips, with no prior experience in verilog👇
    Image
    7.4K
  • user avatar
    surya
    @suryasure05
    Aug 18, 2025
    Replying to @suryasure05
    having no hardware knowledge or experience at all until just 6 months ago, this was a very ambitious project to work on. I had no idea how difficult this would be or if I could even complete it without the "prerequisites". but throughout the last 4 months, I developed a style
    Image
    5.5K
  • user avatar
    surya
    @suryasure05
    Aug 18, 2025
    Replying to @suryasure05
    before we got to designing any hardware, we started off with properly understanding the math behind MLPs. we worked out the math by hand for inference and training of our network.
    Image
    8.4K
  • user avatar
    surya
    @suryasure05
    Apr 22, 2025
    designed a state machine for my softmax module. have all I need to build it out now
    Image
    2.6K
  • user avatar
    surya
    @suryasure05
    Aug 19, 2025
    did NOT expect this at all😭
    user avatar
    evan
    @evanliin
    Aug 19, 2025
    tinytpu.com is third place on Hacker News!!
    Image
    2.7K
  • user avatar
    surya
    @suryasure05
    Aug 18, 2025
    Replying to @suryasure05
    our first step was to decide the scale of this project. we decided to target the simplest possible neural network — the XOR problem. however, we still wanted to make this scalable so a core design philosophy for us was to ensure all of our mechanisms could scale to larger
    Image
    11K
  • user avatar
    surya
    @suryasure05
    Jul 2, 2025
    Replying to @vaurenw
    telesens.co/2018/07/30/sys… will help you understand how matmuls can be done efficiently in hardware
    2.4K
  • user avatar
    surya
    @suryasure05
    Jun 1, 2025
    Replying to @suryasure05
    before we started working on the chip, we decided to work out the math behind a simple MLP by hand for both forward pass and backprop since this was the base of our entire project
    Image
    3.2K
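
The posts above mention working out the math for a simple MLP by hand, for both the forward pass and backprop, with XOR as the target problem. A minimal sketch of what that math can look like in plain Python follows. The 2-2-1 layer sizes, the sigmoid activation, the squared-error loss, and the starting weights are all assumptions made for illustration; the posts do not say which ones the project used, and this is not the project's code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Forward pass of a hypothetical 2-2-1 MLP; returns hidden activations and output."""
    W1, b1, W2, b2 = params
    h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + b2)
    return h, y

def loss(params, x, t):
    """Squared-error loss for one sample (an assumed choice of loss)."""
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2

def backward(params, x, t):
    """Gradients of the loss w.r.t. every parameter, via the chain rule."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    # dL/dz at the output neuron: (y - t) times the sigmoid derivative y(1 - y)
    delta_out = (y - t) * y * (1.0 - y)
    dW2 = [delta_out * h[j] for j in range(2)]
    db2 = delta_out
    # Push the error back through W2 and each hidden neuron's sigmoid derivative
    delta_hid = [delta_out * W2[j] * h[j] * (1.0 - h[j]) for j in range(2)]
    dW1 = [[delta_hid[j] * x[i] for i in range(2)] for j in range(2)]
    db1 = delta_hid
    return dW1, db1, dW2, db2

# The four XOR input/target pairs
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

# Arbitrary illustrative starting weights: (W1, b1, W2, b2)
params = ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2], [0.7, -0.6], 0.05)

if __name__ == "__main__":
    for x, t in XOR:
        _, y = forward(params, x)
        print(x, t, round(y, 4), backward(params, x, t)[2])
```

A quick way to check hand-derived gradients like these is to compare them against a finite-difference estimate of the loss, which is a reasonable sanity test before committing the same arithmetic to hardware.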
