Day 5: PyTorch Basics - Saving and Loading Models
Previously, we learned how to train a neural network model. But how do we actually make predictions with it? And how can we use open-source models that other people trained? In this post we learn how to store and load the weights that the model learned during training, and how to use open-source models and weights.
100 Days Of PyTorch
Author
Liam Groen
Published
May 12, 2025
Keywords
Save PyTorch model, Load PyTorch model, Open-Source Models, Pre Trained Model
This post is part of a series on deploying models, other posts include:
The learned weights are available through the model’s state_dict() method, which returns a dictionary mapping each parameter name to its tensor. We can save these weights to a file for later reuse by passing that dictionary to torch.save().
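As a minimal sketch of the saving step: the NeuralNetwork class below is a hypothetical stand-in for the model trained in the previous post. Its layer sizes are assumptions, chosen only so that the parameter names match the ones shown later in this post. The file name model_weights.pth is also just an illustration.

```python
import os
import tempfile

import torch
from torch import nn


class NeuralNetwork(nn.Module):
    """Hypothetical stand-in for the model trained in the previous post."""

    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        # Layer sizes are assumptions, not taken from the original model
        self.linear_relu_chain = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        return self.linear_relu_chain(self.flatten(x))


model = NeuralNetwork()  # in practice, this would be the trained model

# Hypothetical file name; a temporary directory avoids cluttering the project
weights_path = os.path.join(tempfile.mkdtemp(), "model_weights.pth")

# state_dict() returns a dictionary that maps parameter names to tensors
torch.save(model.state_dict(), weights_path)
print("Saved", len(model.state_dict()), "tensors to", weights_path)
```

Saving only the state_dict keeps the file independent of how the class is defined in code, which is why a matching model class is needed again when loading.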
To load the saved weights, we first need to create a new instance of the same model class. The weights alone won’t do us any good; they need to correspond to the correct model structure. After instantiating a new model, we can copy the weights into it with load_state_dict().
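The loading step could look roughly like this. The NeuralNetwork class and the file name are again hypothetical stand-ins, and the sketch saves a freshly initialized model first so that it runs on its own.

```python
import os
import tempfile

import torch
from torch import nn


class NeuralNetwork(nn.Module):
    """Hypothetical stand-in for the model trained in the previous post."""

    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        # Layer sizes are assumptions, not taken from the original model
        self.linear_relu_chain = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10),
        )

    def forward(self, x):
        return self.linear_relu_chain(self.flatten(x))


# Stand-in for the trained model and its saved weights
trained_model = NeuralNetwork()
weights_path = os.path.join(tempfile.mkdtemp(), "model_weights.pth")
torch.save(trained_model.state_dict(), weights_path)

# 1. Create a new instance of the same model class
new_model = NeuralNetwork()

# 2. Read the saved dictionary; weights_only=True restricts loading to tensor data
state_dict = torch.load(weights_path, weights_only=True)

# 3. Copy the weights into the new instance
new_model.load_state_dict(state_dict)
new_model.eval()  # inference mode

# Both models should now give identical outputs for the same input
sample = torch.rand(1, 28, 28)
with torch.no_grad():
    same = torch.equal(trained_model(sample), new_model(sample))
print("Outputs identical:", same)
```

If the class definition does not match the saved weights (for example, a layer has a different size), load_state_dict() raises an error instead of silently loading mismatched parameters.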
We now have an exact copy of the neural network in a new variable!
If we want, we can inspect all the parameters (e.g., to create visualizations that explain the model) through the model’s state_dict():
layer_2_bias = model.state_dict()['linear_relu_chain.2.bias']  # We can access stored parameters through the state_dict keys
print("stored parameters in the form of 'layer_num.type': ", model.state_dict().keys(), '\n')
print("Amount of values in the layer 2 bias:", layer_2_bias.shape, '\n')
print("First 3 biases in layer 2:", layer_2_bias[:3])
stored parameters in the form of 'layer_num.type':
[
'linear_relu_chain.0.weight',
'linear_relu_chain.0.bias',
'linear_relu_chain.2.weight',
'linear_relu_chain.2.bias',
'linear_relu_chain.4.weight',
'linear_relu_chain.4.bias',
]
Amount of values in the layer 2 bias: torch.Size([512])
First 3 biases in layer 2: tensor([0.0234, 0.0045, 0.0241])
How to Use Pre-Trained Models?
We don’t have to train every model that we want to use ourselves. Often, a better model, trained on more data for longer, is freely available online. The torchvision package provides pre-built model architectures together with pre-trained weights for them.
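from torchvision.models import resnet50, ResNet50_Weights

weights = ResNet50_Weights.DEFAULT  # the best available pre-trained weights
model = resnet50(weights=weights)  # weights is passed as a keyword argument
model.eval()  # switch to inference mode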
We can now use the ResNet50 model with the best available weights in our code.
Warmstarting / Transfer Learning
In a scenario where we have a dataset with domain-specific images and we want to train an image recognition model on the data, we don’t have to start from scratch. We can define the model structure that we want and initialize it with imported weights before training. This way the model does not have to learn general image features again: that knowledge is already embedded in the downloaded weights, so the model only needs to learn to recognize the domain-specific images. This PyTorch article explains how to do this.