I spent many long nights preparing this material for a visual introduction to optimization in Deep Learning, ranging from 1st-order methods to 2nd-order methods and Natural Gradient (and approximations of it, such as K-FAC). Sharing the PDF (easier to download): drive.google.com/file/d/1e_9W8q…
Christian S. Perone
Machine Learning, Computer Science, Math. Working with robots that drive.
- In the data manifold, the shortest path between two points is a geodesic that passes through high-density regions of data. Just as mass curves the geometry of space, data also curves the space. In the example below we can see a geodesic being optimized between two points. 1/3
- The Ministry of Health in Brazil is adopting a stupid visual strategy: reducing the font size of COVID-19 deaths and increasing the font size of recovered cases. Every day there is a new visual strategy to disguise the death numbers.
- 99% of recent models: "We trained this new cool model that is ChatGPT-like!" Reality: 1) Meta trained it, and you don't have a license for it; 2) The dataset you used is distilled from OpenAI; 3) You did a fine-tune of the model, and you don't have to change its name because of this; 4)
- I'm working on an ML forecast system for flooding in the south of Brazil. The blue mask shows the simulated flooding, which matches surprisingly well the regions that actually flooded (the satellite image). The next step is to train the stream gauge prediction model with precipitation
- Just sharing ~100 slides about PyTorch 2 internals focusing on recent innovations (Dynamo, Inductor, and ExecuTorch). I had a lot of fun preparing this and hope you'll enjoy it. I'm planning to record it soon. PDF: drive.google.com/file/d/1XBox0G… Slideshare: slideshare.net/perone/pytorch…
- New article: "The geometry of data: the missing metric tensor and the Stein score" (blog.christianperone.com/2024/11/the-ge…). I show how you can derive an (efficient to compute) data manifold metric tensor with the Stein score alone! Deep connections to diffusion, score-based models and physics.
- This is the most terrifying image I have seen. I lived in this city in Brazil for around 8 years. This is a true-color image from the Sentinel-2 satellite showing today and a few days ago; it shows flooding across basically an entire part of the city. This *never* happened before.
- Everyone is talking about GPT-2, but nobody asked the model if he wants to be released. Since @Thom_Wolf just released the interface on pytorch_pretrained_bert, I decided to ask the model what he thinks.
- Just decided to make a thread to show how many decisions and pre-processing steps you need to train large language models (LLMs) such as LLaMA. This is mainly based on the CommonCrawl (CC) dataset and the pipeline used in LLaMA to generate its dataset. 1/n
- Social irresponsibility: even after being asked to remove the tweet, @bollembd refused. Now I would like to share the answer from the original authors of the study DEBUNKING the statement made by @bollemdb. Be responsible and READ studies the way they should be read, folks.












