How do I initialize weights in PyTorch?
Single layer
To initialize the weights of a single layer, use a function from torch.nn.init. For instance:
conv1 = torch.nn.Conv2d(...)
# xavier_uniform (no trailing underscore) is deprecated; use the in-place xavier_uniform_
torch.nn.init.xavier_uniform_(conv1.weight)
Alternatively, you can modify the parameters by writing to conv1.weight.data (which is a torch.Tensor). Example:
conv1.weight.data.fill_(0.01)
The same applies for biases:
conv1.bias.data.fill_(0.01)
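As a rough end-to-end sketch of the single-layer case, the snippet below initializes one convolution with torch.nn.init and checks that the weights fall inside the Xavier uniform range. The layer sizes are hypothetical and chosen only for illustration; the bound formula is the standard Xavier one, sqrt(6 / (fan_in + fan_out)).

```python
import math
import torch
import torch.nn as nn

# Hypothetical layer sizes, chosen only for illustration.
conv1 = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)

# In-place initializers end with an underscore and do not record gradients.
nn.init.xavier_uniform_(conv1.weight)
nn.init.constant_(conv1.bias, 0.01)

# Xavier uniform samples from U(-bound, bound), bound = sqrt(6 / (fan_in + fan_out)).
fan_in = 3 * 3 * 3    # in_channels * kernel_h * kernel_w
fan_out = 8 * 3 * 3   # out_channels * kernel_h * kernel_w
bound = math.sqrt(6.0 / (fan_in + fan_out))

weights_in_range = conv1.weight.abs().max().item() <= bound + 1e-6
print("weights within Xavier bound:", weights_in_range)
```

Using the torch.nn.init functions instead of writing to .data is usually the safer choice, since they operate on the parameter itself without autograd tracking.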
nn.Sequential or custom nn.Module
Pass an initialization function to torch.nn.Module.apply. It will initialize the weights in the entire nn.Module recursively.
apply(fn): Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also torch-nn-init).
Example:
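import torch
import torch.nn as nn

def init_weights(m):
    # apply() calls this on every submodule; only Linear layers are touched
    if isinstance(m, nn.Linear):
        torch.nn.init.xavier_uniform_(m.weight)
        m.bias.data.fill_(0.01)

net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
net.apply(init_weights)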
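To see that apply really does recurse, here is a small sketch with a nested nn.Sequential. The architecture is hypothetical; the point is only that Linear layers inside the inner container are reached too.

```python
import torch
import torch.nn as nn

def init_weights(m):
    # Called once per submodule (and once for the root module itself).
    if isinstance(m, nn.Linear):
        nn.init.xavier_uniform_(m.weight)
        nn.init.constant_(m.bias, 0.01)

# Hypothetical nested model: one Linear at the top level, two inside a child container.
net = nn.Sequential(
    nn.Linear(2, 4),
    nn.ReLU(),
    nn.Sequential(nn.Linear(4, 4), nn.ReLU(), nn.Linear(4, 1)),
)
net.apply(init_weights)

# Collect every Linear layer at any depth and confirm its bias was set.
linear_layers = [m for m in net.modules() if isinstance(m, nn.Linear)]
biases_ok = all(
    torch.allclose(m.bias, torch.full_like(m.bias, 0.01)) for m in linear_layers
)
print(len(linear_layers), "Linear layers initialized:", biases_ok)
```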
How to initialize weights in a PyTorch model
You are deciding how to initialise the weights by checking whether the class name includes Conv, using classname.find('Conv'). Your class has the name upConv, which includes Conv, so you try to initialise its attribute .weight, but that doesn't exist.
Either rename your class or make the condition stricter, such as classname.find('Conv2d'). The strictest approach would be to check whether it's an instance of nn.Conv2d, instead of looking at the name of the class.
def weights_init(m):
    if isinstance(m, nn.Conv2d):
        m.weight.data.normal_(0.0, 0.02)
    elif isinstance(m, nn.BatchNorm2d):
        m.weight.data.normal_(1.0, 0.02)
        m.bias.data.fill_(0)
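The failure mode described above can be reproduced with a small sketch. The upConv class here is a hypothetical stand-in for the asker's class (its layers are assumptions): its name contains "Conv", but it has no .weight of its own, so a name-based check would try to initialise an attribute that does not exist, while the isinstance check only touches the real nn.Conv2d inside it.

```python
import torch.nn as nn

class upConv(nn.Module):
    """Hypothetical wrapper: the class name contains 'Conv', but it owns no .weight."""
    def __init__(self):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2)
        self.conv = nn.Conv2d(4, 4, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(self.up(x))

def weights_init(m):
    # Type-based check: matches real layers, not wrappers with similar names.
    if isinstance(m, nn.Conv2d):
        nn.init.normal_(m.weight, 0.0, 0.02)
    elif isinstance(m, nn.BatchNorm2d):
        nn.init.normal_(m.weight, 1.0, 0.02)
        nn.init.zeros_(m.bias)

block = upConv()
name_matches = type(block).__name__.find('Conv') != -1  # the name-based test passes
has_weight = hasattr(block, 'weight')                   # but the wrapper has no .weight
block.apply(weights_init)                               # isinstance version runs cleanly
print(name_matches, has_weight)
```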
How to initialise (and check sanity) weights efficiently of layers within complex (nested) modules in PyTorch?
You can simply iterate over all submodules at the end of your __init__ method:
class Generator(nn.Module):
    def __init__(self, ....):
        # all code here
        # ...
        # init weights, at the very bottom of __init__
        for sm in self.modules():
            if isinstance(sm, nn.Conv2d):
                # only conv2d will be initialized in this way
                torch.nn.init.normal_(sm.weight.data, 0.0, 0.02)
done.
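For the "check sanity" half of the question, one possible approach is to print per-layer weight statistics after construction. The sketch below assumes a hypothetical Generator body (the real layers were elided in the answer) and a hypothetical helper named conv_weight_stats; neither comes from the original code.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        # Hypothetical layers, including a nested container, for illustration only.
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.Sequential(nn.Conv2d(16, 3, 3, padding=1)),
        )
        # Init weights at the very bottom of __init__; modules() walks all depths.
        for sm in self.modules():
            if isinstance(sm, nn.Conv2d):
                nn.init.normal_(sm.weight, 0.0, 0.02)

def conv_weight_stats(model):
    """Hypothetical helper: map each Conv2d's qualified name to (mean, std) of its weights."""
    stats = {}
    for name, sm in model.named_modules():
        if isinstance(sm, nn.Conv2d):
            w = sm.weight.detach()
            stats[name] = (w.mean().item(), w.std().item())
    return stats

gen = Generator()
for name, (mean, std) in conv_weight_stats(gen).items():
    print(f"{name}: mean={mean:+.4f} std={std:.4f}")
```

With normal_(0.0, 0.02), each reported mean should sit near 0 and each std near 0.02; a layer that was missed would show the noticeably wider default distribution.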
Follow-up to In PyTorch how are layer weights and biases initialized by default?
Convolutional modules such as nn.Conv1d, nn.Conv2d, and nn.Conv3d inherit from the _ConvNd class. This class has a reset_parameters function implemented just like nn.Linear:
def reset_parameters(self) -> None:
    # Setting a=sqrt(5) in kaiming_uniform is the same as initializing with
    # uniform(-1/sqrt(k), 1/sqrt(k)), where k = weight.size(1) * prod(*kernel_size)
    # For more details see:
    # https://github.com/pytorch/pytorch/issues/15314#issuecomment-477448573
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
    if self.bias is not None:
        fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
        bound = 1 / math.sqrt(fan_in)
        init.uniform_(self.bias, -bound, bound)
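The comment in that source says the default amounts to uniform(-1/sqrt(k), 1/sqrt(k)) with k = weight.size(1) * prod(kernel_size). A quick way to check this empirically, as a sketch with hypothetical layer sizes, is to build a fresh layer and compare its largest weight and bias against that bound:

```python
import math
import torch
import torch.nn as nn

# Hypothetical sizes; any Conv2d would do.
conv = nn.Conv2d(3, 8, kernel_size=3)

# k = in_channels * kernel_h * kernel_w, i.e. the fan-in of the layer.
k = conv.weight.size(1) * conv.kernel_size[0] * conv.kernel_size[1]
bound = 1.0 / math.sqrt(k)

# Small tolerance for float32 rounding.
weight_in_range = conv.weight.abs().max().item() <= bound + 1e-6
bias_in_range = conv.bias.abs().max().item() <= bound + 1e-6
print(f"k={k}, bound={bound:.4f}, weight ok={weight_in_range}, bias ok={bias_in_range}")
```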
As for nn.BatchNorm2d, it has both a reset_parameters and a reset_running_stats function:
def reset_parameters(self) -> None:
    self.reset_running_stats()
    if self.affine:
        init.ones_(self.weight)
        init.zeros_(self.bias)

def reset_running_stats(self) -> None:
    if self.track_running_stats:
        # running_mean/running_var/num_batches... are registered at runtime depending
        # if self.track_running_stats is on
        self.running_mean.zero_()  # type: ignore[operator]
        self.running_var.fill_(1)  # type: ignore[operator]
        self.num_batches_tracked.zero_()  # type: ignore[operator]
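A short sketch of what these two functions do in practice: perturb a BatchNorm2d layer (run a batch through it, overwrite its affine parameters), then call reset_parameters and confirm everything returns to the defaults. The channel count and input shape are arbitrary assumptions.

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(4)
bn.train()

# One forward pass in training mode updates the running statistics.
bn(torch.randn(8, 4, 5, 5) * 3 + 2)
with torch.no_grad():
    bn.weight.fill_(0.5)
    bn.bias.fill_(-1.0)
was_updated = bn.num_batches_tracked.item() == 1

# reset_parameters() also calls reset_running_stats().
bn.reset_parameters()
print("weight:", bn.weight.data, "running_mean:", bn.running_mean)
```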
When does PyTorch initialize parameters?
For the basic layers (e.g., nn.Conv2d, nn.Linear, etc.), the parameters are initialized by the layer's __init__ method.
For example, look at the source code of class _ConvNd(Module) (the class from which all other convolution layers are derived). At the bottom of its __init__, it calls self.reset_parameters(), which initializes the weights.
Therefore, if your nn.Module does not have any "independent" nn.Parameters of its own, only trainable parameters inside sub-nn.Modules, then all weights of the submodules are initialized as the submodules are constructed when you build your network.
That is, once you call h_model = H_model(), the weights of h_model are already initialized to their default values. Calling h_model.load_state_dict(...) overrides these values with the desired pre-trained weights.
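The order of events can be illustrated with a minimal sketch. The H_model architecture below is hypothetical (the original does not show it), and a second freshly built model stands in for weights that would normally come from a checkpoint file.

```python
import torch
import torch.nn as nn

class H_model(nn.Module):
    # Hypothetical architecture; the real one is not shown in the question.
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)  # weights are initialized here, during construction

    def forward(self, x):
        return self.fc(x)

# Stand-in for pre-trained weights that would usually be read from disk.
pretrained = H_model()
state = pretrained.state_dict()

h_model = H_model()  # already holds default-initialized weights
differs_before = not torch.equal(h_model.fc.weight, pretrained.fc.weight)

h_model.load_state_dict(state)  # overwrites the default values
matches_after = torch.equal(h_model.fc.weight, pretrained.fc.weight)
print(differs_before, matches_after)
```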
Initialize weight in pytorch neural net
type(param) only ever returns torch.nn.parameter.Parameter, whatever kind of weight or bias the parameter holds, so it cannot tell you which layer a parameter belongs to. The names from named_parameters() are not much help on an nn.Sequential-based model either, since they are just positional indices. Instead, look at the modules and use isinstance to find the layers that are nn.Conv2d:
for layer in D.modules():
    if isinstance(layer, nn.Conv2d):
        layer.weight.data.normal_(...)
Or, the way that is recommended by Soumith Chintala himself, actually just loop through your main module itself:
# iterating an nn.Sequential yields its layers directly
for layer in D.main:
    if isinstance(layer, nn.Conv2d):
        layer.weight.data.normal_(...)
I actually prefer the first because you don't have to name the specific nn.Sequential module, and it searches every module in the model, but either one should do the job for you.
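The difference between inspecting parameters and inspecting modules can be seen in a small sketch. The Discriminator below is a hypothetical DCGAN-style stand-in with a main attribute; its layers and the normal_(0.0, 0.02) values are assumptions for illustration.

```python
import torch.nn as nn

class Discriminator(nn.Module):
    # Hypothetical model exposing an nn.Sequential as `main`.
    def __init__(self):
        super().__init__()
        self.main = nn.Sequential(
            nn.Conv2d(3, 8, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(8, 1, 4),
        )

D = Discriminator()

# Every parameter has the same type, so type(param) says nothing about the layer.
param_types = {type(p).__name__ for p in D.parameters()}

# Names in a Sequential are positional, e.g. 'main.0.weight', not 'conv1.weight'.
param_names = [name for name, _ in D.named_parameters()]

# Modules, on the other hand, carry their layer type.
conv_layers = [m for m in D.modules() if isinstance(m, nn.Conv2d)]
for layer in conv_layers:
    nn.init.normal_(layer.weight, 0.0, 0.02)

print(param_types, param_names, len(conv_layers))
```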
Create a new model in pytorch with custom initial value for the weights
You can simply use torch.nn.Parameter() to assign a custom weight to a layer of your network.
In your case:
model.fc1.weight = torch.nn.Parameter(custom_weight)
torch.nn.Parameter: A kind of Tensor that is to be considered a module parameter.
For example:
# Classifier model
model = Classifier()
# your custom weight, here taken at random
custom_weight = torch.rand(model.fc1.weight.shape)
custom_weight.shape
torch.Size([128, 784])
# before assign custom weight
print(model.fc1.weight)
Parameter containing:
tensor([[ 1.6920e-02, 4.6515e-03, -1.0214e-02, ..., -7.6517e-03,
2.3892e-02, -8.8965e-03],
...,
[-2.3137e-02, 5.8483e-03, 4.4392e-03, ..., -1.6159e-02,
7.9369e-03, -7.7326e-03]])
# assign custom weight to first layer
model.fc1.weight = torch.nn.Parameter(custom_weight)
# after assign custom weight
model.fc1.weight
Parameter containing:
tensor([[ 0.1724, 0.7513, 0.8454, ..., 0.8780, 0.5330, 0.5847],
[ 0.8500, 0.7687, 0.3371, ..., 0.7464, 0.1503, 0.7720],
[ 0.8514, 0.6530, 0.6261, ..., 0.7867, 0.9312, 0.3890],
...,
[ 0.5426, 0.7655, 0.1191, ..., 0.4343, 0.2500, 0.6207],
[ 0.2310, 0.4260, 0.4138, ..., 0.1168, 0.5946, 0.2505],
[ 0.4220, 0.5500, 0.6282, ..., 0.5921, 0.7953, 0.9997]])
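The session above can be condensed into a runnable sketch. The Classifier definition is an assumption (only fc1 with a [128, 784] weight is implied by the output); fc2 and the in-place copy_ variant are added here for comparison and are not part of the original answer.

```python
import torch
import torch.nn as nn

class Classifier(nn.Module):
    # Hypothetical model; only the fc1 shape (128 x 784) comes from the output above.
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = Classifier()
custom_weight = torch.rand(model.fc1.weight.shape)

# Option 1: replace the parameter object; the shape must match what the layer expects.
model.fc1.weight = nn.Parameter(custom_weight.clone())

# Option 2 (alternative, not from the original answer): overwrite values in place,
# keeping the same Parameter object.
with torch.no_grad():
    model.fc2.weight.copy_(torch.zeros_like(model.fc2.weight))

# The new parameter is still registered with the module and trainable.
is_registered = any(p is model.fc1.weight for p in model.parameters())
print(model.fc1.weight.shape, model.fc1.weight.requires_grad, is_registered)
```

One design consideration: option 1 creates a new Parameter object, so an optimizer built before the assignment would still hold the old one. Assigning weights before constructing the optimizer, or using the in-place variant, avoids that.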