surya (@suryasure05) / X

529 posts

surya

@suryasure05

inference @nvidia (prev. @groqinc) // ece @westernu

sf + toronto

Joined October 2023

Pinned
surya
@suryasure05
Aug 18, 2025
I spent my summer building TinyTPU: an open source ML inference and training chip. it can do end to end inference + training ENTIRELY on chip. here's how I did it👇:
355K
surya
@suryasure05
Jul 2, 2025
doing ML on hardware is hard man, we need more ppl working on this
56K
surya
@suryasure05
Jun 1, 2025
spent the last two weeks building an ML inference chip to solve the XOR problem
70K
surya
@suryasure05
Apr 2, 2025
built a glove controlled robotic hand cause why not
9.3K
surya
@suryasure05
May 3, 2025
i built a hardware accelerator to compute softmax more efficiently. used a pipelined approach to create a high-throughput module that performs the entire softmax computation in silicon. here's how I did it:
7.4K
surya
@suryasure05
Aug 18, 2025
Replying to @suryasure05
I worked on this with @evanliin, @XanderChin, and @kennykgguo — incredibly smart people! check out our article to see how our chip works and how we went about developing it: tinytpu.com you can also find the code here and play with it yourself:
12K
surya
@suryasure05
Mar 8, 2025
over the past week, I implemented a systolic array, an architecture used in the TPU and many other chips, with no prior experience in verilog👇
7.4K
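
For readers new to the idea, here is a minimal software sketch of how a systolic array computes a matrix product. It is not the Verilog from the post: it assumes an output-stationary layout (each processing element keeps one output sum), and the name `systolic_matmul` is made up for illustration. Rows of A and columns of B are fed in skewed by one cycle per row or column, so the element at position (i, j) sees the s-th operand pair at cycle t = i + j + s.

```python
def systolic_matmul(A, B):
    """Simulate an output-stationary systolic array computing A @ B.

    Assumption: PE (i, j) holds the running sum for C[i][j]. Inputs are
    skewed, so at cycle t that PE multiplies A[i][s] and B[s][j] with
    s = t - i - j, whenever s is a valid index.
    """
    n, k = len(A), len(A[0])
    m = len(B[0])
    acc = [[0] * m for _ in range(n)]
    # The last PE (n-1, m-1) finishes its last product at cycle n + m + k - 3.
    total_cycles = n + m + k - 2
    for t in range(total_cycles):
        for i in range(n):
            for j in range(m):
                s = t - i - j
                if 0 <= s < k:
                    acc[i][j] += A[i][s] * B[s][j]
    return acc, total_cycles


if __name__ == "__main__":
    C, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(C, cycles)
```

In hardware, all processing elements work in the same clock cycle; the two inner loops here only stand in for that parallelism.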
surya
@suryasure05
Aug 18, 2025
Replying to @suryasure05
having no hardware knowledge or experience at all until just 6 months ago, this was a very ambitious project to work on. I had no idea how difficult this would be or if I could even complete it without the "prerequisites". but throughout the last 4 months, I developed a style
5.5K
surya
@suryasure05
Aug 18, 2025
Replying to @suryasure05
before we got to designing any hardware, we started off with properly understanding the math behind MLPs. we worked out the math by hand for inference and training of our network.
8.4K
surya
@suryasure05
Apr 22, 2025
designed a state machine for my softmax module. have all I need to build it out now
2.6K
surya
@suryasure05
Aug 19, 2025
did NOT expect this at all😭
evan
@evanliin
Aug 19, 2025
tinytpu.com is third place on Hacker News!!
2.7K
surya
@suryasure05
Aug 18, 2025
Replying to @suryasure05
our first step was to decide the scale of this project. we decided to target the simplest possible neural network — the XOR problem. however, we still wanted to make this scalable so a core design philosophy for us was to ensure all of our mechanisms could scale to larger
11K
surya
@suryasure05
Jul 2, 2025
Replying to @vaurenw
telesens.co/2018/07/30/sys… will help you understand how matmuls can be done efficiently in hardware
2.4K
surya
@suryasure05
Jun 1, 2025
Replying to @suryasure05
before we started working on the chip, we decided to work out the math behind a simple MLP by hand for both forward pass and backprop since this was the base of our entire project
3.2K
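
As a rough illustration of what "working out the math by hand" covers, the sketch below writes the forward pass and backprop of a 2-2-1 MLP as explicit chain-rule steps. The post does not give the network's details, so the sigmoid activation, the squared-error loss, and every name here (`forward`, `backward`, `W1`, `b1`, `W2`, `b2`) are assumptions made for this example, not the project's actual design.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, W1, b1, W2, b2):
    """Forward pass: 2 inputs -> 2 hidden units -> 1 output."""
    h = [sigmoid(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j]) for j in range(2)]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + b2)
    return h, y


def backward(x, target, W2, h, y):
    """Gradients of L = 0.5 * (y - target)**2, derived with the chain rule."""
    # Output layer: dL/dz2 = (y - target) * sigmoid'(z2), sigmoid' = y * (1 - y)
    dz2 = (y - target) * y * (1.0 - y)
    dW2 = [dz2 * h[j] for j in range(2)]
    db2 = dz2
    # Hidden layer: push dz2 back through W2, then through sigmoid'
    dz1 = [dz2 * W2[j] * h[j] * (1.0 - h[j]) for j in range(2)]
    dW1 = [[dz1[j] * x[i] for i in range(2)] for j in range(2)]
    db1 = dz1
    return dW1, db1, dW2, db2


if __name__ == "__main__":
    # Arbitrary example weights, one XOR sample: input (1, 0) -> target 1
    W1, b1 = [[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1]
    W2, b2 = [0.7, -0.4], 0.05
    h, y = forward([1.0, 0.0], W1, b1, W2, b2)
    print(backward([1.0, 0.0], 1.0, W2, h, y))
```

A quick way to check a hand derivation like this is to compare each gradient against a finite-difference estimate of the loss.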