A deep neural network implementation from scratch for handwritten digit recognition, achieving 95.24% accuracy on the MNIST dataset.
This project demonstrates how to build and train a neural network without using any deep learning frameworks. It learns to recognize handwritten digits (0-9) from the MNIST dataset, which contains 70,000 grayscale images of digits.
- Clone this repository
- Ensure you have the required dependencies:
pip install numpy pandas matplotlib
- Run the main script:
python main.py
Input Layer (784 neurons) → Hidden Layer 1 (128 neurons) → Hidden Layer 2 (64 neurons) → Output Layer (10 neurons)
- Input Layer: 784 neurons (28x28 pixel images flattened)
- Hidden Layer 1: 128 neurons with ReLU activation
- Hidden Layer 2: 64 neurons with ReLU activation
- Output Layer: 10 neurons with Softmax activation (one for each digit)
- Images are normalized from 0-255 to 0-1 scale
- Labels are converted to one-hot encoded vectors
- Data is split into training and development sets
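The preprocessing steps above could look roughly like the sketch below. The function name, array shapes, and the development-set size are assumptions for illustration, not taken from `main.py`:

```python
import numpy as np

def preprocess(X, Y, num_classes=10, dev_size=1000):
    """X: (784, m) raw pixel columns, Y: (m,) integer labels.
    Names, shapes, and the dev split size are assumptions."""
    # Normalize pixel values from 0-255 to 0-1
    X = X.astype(np.float32) / 255.0
    # One-hot encode labels into a (10, m) matrix
    one_hot_Y = np.zeros((num_classes, Y.size))
    one_hot_Y[Y, np.arange(Y.size)] = 1
    # Split into development and training sets
    X_dev, Y_dev = X[:, :dev_size], one_hot_Y[:, :dev_size]
    X_train, Y_train = X[:, dev_size:], one_hot_Y[:, dev_size:]
    return X_train, Y_train, X_dev, Y_dev
```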
W1 = np.random.randn(128, 784) * np.sqrt(1. / 784)
b1 = np.zeros((128, 1))
... similar for W2, b2, W3, b3
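Spelled out for all three layers, the initialization could be sketched as follows, assuming the same scaled-random scheme is applied to every weight matrix (the function name is illustrative):

```python
import numpy as np

def init_params():
    # Scaled random initialization: weights drawn from N(0, 1/n_in), biases start at zero
    W1 = np.random.randn(128, 784) * np.sqrt(1. / 784)
    b1 = np.zeros((128, 1))
    W2 = np.random.randn(64, 128) * np.sqrt(1. / 128)
    b2 = np.zeros((64, 1))
    W3 = np.random.randn(10, 64) * np.sqrt(1. / 64)
    b3 = np.zeros((10, 1))
    return W1, b1, W2, b2, W3, b3
```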
Performs forward propagation through the network:
Z1 = W1.dot(X) + b1
A1 = ReLU(Z1)
... similar for other layers
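Filling in the elided layers, a full forward pass could look roughly like this. The numerically stable softmax and the exact return values are assumptions, not necessarily the project's implementation:

```python
import numpy as np

def ReLU(Z):
    return np.maximum(Z, 0)

def softmax(Z):
    # Subtract the column-wise max for numerical stability
    expZ = np.exp(Z - Z.max(axis=0, keepdims=True))
    return expZ / expZ.sum(axis=0, keepdims=True)

def forward_prop(W1, b1, W2, b2, W3, b3, X):
    Z1 = W1.dot(X) + b1
    A1 = ReLU(Z1)
    Z2 = W2.dot(A1) + b2
    A2 = ReLU(Z2)
    Z3 = W3.dot(A2) + b3
    A3 = softmax(Z3)
    return Z1, A1, Z2, A2, Z3, A3
```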
Computes gradients using backpropagation:
dZ3 = A3 - one_hot_Y
dW3 = 1 / m * dZ3.dot(A2.T)
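Extending the output-layer gradient back through the hidden layers, the backward pass could be sketched as follows. The ReLU derivative and the cached activations from forward propagation are the only extra pieces; names and signatures are illustrative:

```python
import numpy as np

def ReLU_deriv(Z):
    return (Z > 0).astype(float)

def backward_prop(Z1, A1, Z2, A2, A3, W2, W3, X, one_hot_Y):
    m = X.shape[1]
    # Output layer: softmax + cross-entropy gives dZ3 = A3 - Y
    dZ3 = A3 - one_hot_Y
    dW3 = 1 / m * dZ3.dot(A2.T)
    db3 = 1 / m * np.sum(dZ3, axis=1, keepdims=True)
    # Hidden layer 2
    dZ2 = W3.T.dot(dZ3) * ReLU_deriv(Z2)
    dW2 = 1 / m * dZ2.dot(A1.T)
    db2 = 1 / m * np.sum(dZ2, axis=1, keepdims=True)
    # Hidden layer 1
    dZ1 = W2.T.dot(dZ2) * ReLU_deriv(Z1)
    dW1 = 1 / m * dZ1.dot(X.T)
    db1 = 1 / m * np.sum(dZ1, axis=1, keepdims=True)
    return dW1, db1, dW2, db2, dW3, db3
```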
Updates parameters using momentum optimization:
vW1 = beta * vW1 + (1 - beta) * dW1
W1 = W1 - alpha * vW1
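A minimal sketch of the momentum update for one layer; the same pattern repeats for layers 2 and 3. The default values of `alpha` and `beta` shown here are illustrative, not necessarily the project's settings:

```python
def update_params_momentum(W1, b1, dW1, db1, vW1, vb1, alpha=0.1, beta=0.9):
    # Exponentially weighted average of past gradients
    vW1 = beta * vW1 + (1 - beta) * dW1
    vb1 = beta * vb1 + (1 - beta) * db1
    # Step in the direction of the smoothed gradient
    W1 = W1 - alpha * vW1
    b1 = b1 - alpha * vb1
    return W1, b1, vW1, vb1
```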
- Z₁ = W₁X + b₁
- A₁ = ReLU(Z₁)
- Z₂ = W₂A₁ + b₂
- A₂ = ReLU(Z₂)
- Z₃ = W₃A₂ + b₃
- A₃ = Softmax(Z₃)
- dZ₃ = A₃ - Y
- dW₃ = 1/m * dZ₃A₂ᵀ
- db₃ = 1/m * Σ(dZ₃)
- Similar computations for other layers
- Add Convolutional Layers
  - Would better capture spatial relationships
  - Could improve accuracy to ~99%
- Implement Regularization
  - L2 regularization
  - Dropout layers
  - Batch normalization
- Modern Optimizers
  - Adam optimizer
  - RMSprop
  - AdaGrad
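As an illustration of what the first of these would involve, a minimal Adam update for a single parameter might look like this (standard defaults beta1 = 0.9, beta2 = 0.999, eps = 1e-8; this is a sketch of a possible future addition, not part of the current code):

```python
import numpy as np

def adam_update(W, dW, m_t, v_t, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # t is the 1-based step count; m_t and v_t start as zero arrays shaped like W
    m_t = beta1 * m_t + (1 - beta1) * dW          # first-moment estimate
    v_t = beta2 * v_t + (1 - beta2) * dW ** 2     # second-moment estimate
    m_hat = m_t / (1 - beta1 ** t)                # bias correction
    v_hat = v_t / (1 - beta2 ** t)
    W = W - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return W, m_t, v_t
```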
- Data Augmentation
  - Random rotations
  - Small shifts
  - Elastic deformations
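For example, the small-shift augmentation could be prototyped in a few lines of NumPy (the ±2 pixel range is an assumed choice):

```python
import numpy as np

def random_shift(image, max_shift=2):
    # image: 28x28 array; pad, then crop back at a random offset
    dx, dy = np.random.randint(-max_shift, max_shift + 1, size=2)
    padded = np.pad(image, max_shift, mode="constant")
    return padded[max_shift + dy : max_shift + dy + 28,
                  max_shift + dx : max_shift + dx + 28]
```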
- MNIST Dataset: http://yann.lecun.com/exdb/mnist/
- Deep Learning Book (Goodfellow et al.)
- Neural Networks and Deep Learning (Michael Nielsen)
Feel free to submit issues and enhancement requests!
This project is licensed under the MIT License - see the LICENSE file for details.