{"id":61042,"date":"2024-03-29T13:08:45","date_gmt":"2024-03-29T13:08:45","guid":{"rendered":"https:\/\/www.askpython.com\/?p=61042"},"modified":"2025-04-10T20:34:05","modified_gmt":"2025-04-10T20:34:05","slug":"torch-optim-pytorch","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python-modules\/torch-optim-pytorch","title":{"rendered":"Optimizing Neural Networks with torch.optim in PyTorch"},"content":{"rendered":"\n<p>PyTorch is a widely used machine learning library for Python. Its <strong>torch.optim<\/strong> module provides the optimization algorithms used to train neural networks.<\/p>\n\n\n\n<p>In this article, we take a closer look at the <strong>torch.optim<\/strong> module, cover its key components, and work through Python examples.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p><em>The torch.optim module in PyTorch provides various optimization algorithms commonly used for training neural networks. These algorithms minimize the loss function by adjusting the weights and biases of the network, ultimately improving the model&#8217;s performance.<\/em><\/p>\n<\/blockquote>\n\n\n\n<p><strong><em>Recommended: <a href=\"https:\/\/www.askpython.com\/python-modules\/pytorch-to-numpy-conversion\">Converting Between Pytorch Tensors and Numpy Arrays in Python<\/a><\/em><\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is torch.optim?<\/h2>\n\n\n\n<p>The <strong>torch.optim<\/strong> module, as mentioned above, provides multiple optimization algorithms that are commonly used to minimize the loss function during the training of neural networks. 
In short, these algorithms adjust the weights and biases of the neural network to improve the performance of the model.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Key Components of torch.optim<\/h3>\n\n\n\n<p><strong>1. Optimizer Classes<\/strong><\/p>\n\n\n\n<p><strong>torch.optim<\/strong> provides classes that each implement a specific optimization algorithm. Popular optimizers include SGD (stochastic gradient descent, which updates parameters in the direction of the negative gradient, optionally with momentum), Adam (which combines momentum with RMSprop-style adaptive learning rates), Adagrad (which adapts each parameter&#8217;s learning rate based on its accumulated past squared gradients), and RMSprop (which scales each update by a running average of recent squared gradients).<\/p>\n\n\n\n<p><strong>2. Parameter Groups<\/strong><\/p>\n\n\n\n<p>An optimizer in PyTorch can handle multiple parameter groups. A parameter group is a dictionary that holds a list of parameters together with the optimization options (such as learning rate and weight decay) applied to them. This lets you use different learning rates or weight decay for different parts of the model.<\/p>\n\n\n\n<p><strong>3. Learning Rate Schedulers<\/strong><\/p>\n\n\n\n<p><strong>torch.optim<\/strong> also includes learning rate schedulers, in the <strong>torch.optim.lr_scheduler<\/strong> submodule, that adjust the learning rate during training. Common schedulers include StepLR and MultiStepLR.<\/p>\n\n\n\n<p>Let us now explore <strong>torch.optim<\/strong> further with examples in Python.<\/p>\n\n\n\n<p><strong><em>Recommended: <a href=\"https:\/\/www.askpython.com\/python-modules\/pretrained-pytorch-models-computer-vision\">What Are the Pre-trained Models Available in PyTorch?<\/a><\/em><\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Example: SGD Optimizer<\/h2>\n\n\n\n<p>In this example, we will create a simple neural network and train it on a synthetic dataset using the SGD optimizer. 
Let us look at the code.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport torch\nimport torch.nn as nn\nimport torch.optim as optim\n\n# Define a simple neural network class\nclass SimpleNN(nn.Module):\n    def __init__(self, input_size, hidden_size, output_size):\n        super(SimpleNN, self).__init__()\n        self.fc1 = nn.Linear(input_size, hidden_size)\n        self.relu = nn.ReLU()\n        self.fc2 = nn.Linear(hidden_size, output_size)\n\n    def forward(self, x):\n        x = self.fc1(x)\n        x = self.relu(x)\n        x = self.fc2(x)\n        return x\n\n# Set random seed for reproducibility\ntorch.manual_seed(42)\n\n# Define input size, hidden size, and output size\ninput_size = 10\nhidden_size = 20\noutput_size = 5\n\n# Create an instance of the SimpleNN class\nmodel = SimpleNN(input_size, hidden_size, output_size)\n\n# Define a synthetic dataset\ninput_data = torch.randn(100, input_size)\ntarget = torch.randn(100, output_size)\n\n# Define a loss function\ncriterion = nn.MSELoss()\n\n# Define the SGD optimizer\noptimizer = optim.SGD(model.parameters(), lr=0.01)\n\n# Training loop\nepochs = 100\nfor epoch in range(epochs):\n    # Forward pass\n    output = model(input_data)\n\n    # Compute the loss\n    loss = criterion(output, target)\n\n    # Backward pass and optimization\n    optimizer.zero_grad()\n    loss.backward()\n    optimizer.step()\n\n    # Print the loss every 10 epochs\n    if (epoch + 1) % 10 == 0:\n        print(f&#039;Epoch &#x5B;{epoch+1}\/{epochs}], Loss: {loss.item():.4f}&#039;)\n\n<\/pre><\/div>\n\n\n<p>Let us look at the output below.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"750\" height=\"268\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/SDG-optimizer-output.png\" alt=\"SGD Optimizer Output\" class=\"wp-image-61055\" 
srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/SDG-optimizer-output.png 750w, https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/SDG-optimizer-output-300x107.png 300w\" sizes=\"auto, (max-width: 750px) 100vw, 750px\" \/><figcaption class=\"wp-element-caption\"><strong><em>SGD Optimizer Output<\/em><\/strong><\/figcaption><\/figure>\n\n\n\n<p>Here we used the SGD optimizer to minimize the mean squared error loss. The learning rate is set to 0.01 and the model is trained for 100 epochs.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Example: Adam Optimizer<\/h2>\n\n\n\n<p>Let us look at another example, this time using the Adam optimizer.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport torch\nimport torch.nn as nn\nimport torch.optim as optim\nimport matplotlib.pyplot as plt\n\n# Generate synthetic dataset\ntorch.manual_seed(42)  # For reproducibility\n\n# Generate noisy linear data: y = 3x + 1 plus Gaussian noise\nX = torch.unsqueeze(torch.linspace(-1, 1, 100), dim=1)\ny = 3 * X + 1 + 0.2 * torch.randn(X.size())\n\n# Define a simple linear regression model\nclass LinearRegression(nn.Module):\n    def __init__(self):\n        super(LinearRegression, self).__init__()\n        self.linear = nn.Linear(1, 1)\n\n    def forward(self, x):\n        return self.linear(x)\n\n# Instantiate the model\nmodel = LinearRegression()\n\n# Define the Mean Squared Error (MSE) loss\ncriterion = nn.MSELoss()\n\n# Define the Adam optimizer\noptimizer = optim.Adam(model.parameters(), lr=0.01)\n\n# Training loop\nnum_epochs = 1000\nlosses = &#x5B;]\n\nfor epoch in range(num_epochs):\n    # Forward pass\n    predictions = model(X)\n    loss = criterion(predictions, y)\n\n    # Backward pass and optimization\n    optimizer.zero_grad()\n    loss.backward()\n    optimizer.step()\n\n    # Save the loss for plotting\n    losses.append(loss.item())\n\n    # Print the loss every 100 epochs\n    if 
(epoch + 1) % 100 == 0:\n        print(f&#039;Epoch &#x5B;{epoch+1}\/{num_epochs}], Loss: {loss.item():.4f}&#039;)\n\n# Plot the training progress\nplt.plot(range(1, num_epochs+1), losses, label=&#039;Training Loss&#039;)\nplt.xlabel(&#039;Epoch&#039;)\nplt.ylabel(&#039;Loss&#039;)\nplt.title(&#039;Training Loss over Epochs&#039;)\nplt.legend()\nplt.show()\n\n# Make predictions using the trained model\nwith torch.no_grad():\n    predicted_y = model(X)\n\n# Plot the original data and the predicted values\nplt.scatter(X.numpy(), y.numpy(), label=&#039;Original Data&#039;)\nplt.plot(X.numpy(), predicted_y.numpy(), &#039;r-&#039;, label=&#039;Predicted Line&#039;)\nplt.xlabel(&#039;X&#039;)\nplt.ylabel(&#039;y&#039;)\nplt.title(&#039;Linear Regression with Adam Optimizer&#039;)\nplt.legend()\nplt.show()\n<\/pre><\/div>\n\n\n<p>In the code above, we used the Adam optimizer to train a simple linear regression model. The training loop runs for 1000 epochs. Let us look at the printed output and the plot.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"483\" height=\"270\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/Adam-optimizer-output.png\" alt=\"Adam Optimizer Output\" class=\"wp-image-61056\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/Adam-optimizer-output.png 483w, https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/Adam-optimizer-output-300x168.png 300w\" sizes=\"auto, (max-width: 483px) 100vw, 483px\" \/><figcaption class=\"wp-element-caption\"><strong><em>Adam Optimizer Output<\/em><\/strong><\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"554\" height=\"455\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/Adam-optimizer-plot.png\" alt=\"Adam Optimizer Plot\" class=\"wp-image-61057\" 
srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/Adam-optimizer-plot.png 554w, https:\/\/www.askpython.com\/wp-content\/uploads\/2024\/03\/Adam-optimizer-plot-300x246.png 300w\" sizes=\"auto, (max-width: 554px) 100vw, 554px\" \/><figcaption class=\"wp-element-caption\"><strong><em>Adam Optimizer Plot<\/em><\/strong><\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>torch.optim is a powerful module in PyTorch that simplifies the optimization process for training neural networks. With a wide range of optimization algorithms and useful features like parameter groups and learning rate schedulers, torch.optim helps developers train models and achieve better performance efficiently. As you continue using PyTorch, keep working with torch.optim to build and optimize your neural networks. Which optimizer will you choose for your next project?<\/p>\n\n\n\n<p><strong><em>Recommended: <a href=\"https:\/\/www.askpython.com\/python\/examples\/pytorch-loss-functions\">A Quick Guide to Pytorch Loss Functions<\/a><\/em><\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Pytorch is a prevalent machine learning library in Python programming language. Pytorch is a handy tool in neural networks and torch.optim module is used in various neural network models for training. This module provides us with multiple optimization algorithms for training neural networks. 
In this article, we will understand in depth about the torch.optim module [&hellip;]<\/p>\n","protected":false},"author":80,"featured_media":63909,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-61042","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python-modules"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/61042","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/80"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=61042"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/61042\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/63909"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=61042"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=61042"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=61042"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}