Net surgery: how to reshape the convolution layers of a caffemodel file in caffe?
I'm trying to reshape the size of a convolution layer of a caffemodel (this is a follow-up question to this question). Although there is a tutorial on how to do net surgery, it only shows how to copy weight parameters from one caffemodel to another of the same size.
Instead, I need to add a new channel (all zeros) to my convolution filter, so that its size changes from the current (64 x 3 x 3 x 3) to (64 x 4 x 3 x 3).
Say the convolution layer is called 'conv1'. This is what I have tried so far:
# Load the original network and extract the convolution layer's parameters.
net = caffe.Net('../models/train.prototxt',
'../models/train.caffemodel',
caffe.TRAIN)
Now I can perform this:
net.blobs['conv1'].reshape(64,4,3,3);
net.save('myNewTrainModel.caffemodel');
But the saved model seems not to have changed. I've read that the actual weights of the convolution are stored in net.params['conv1'][0].data rather than in net.blobs, but I can't figure out how to reshape the net.params object. Does anyone have an idea?
As you well noted, net.blobs does not store the learned parameters/weights, but rather the result of applying the filters/activations to the net's input. The learned weights are stored in net.params. (See this for more details.)
AFAIK, you cannot directly reshape net.params and add a channel.
What you can do is have two nets, deploy_trained_net_with_3ch.prototxt and deploy_empty_net_with_4ch.prototxt. The two files can be almost identical, apart from the input shape definition and the first layer's name.
Then you can load both nets into Python and copy the relevant part:
net3ch = caffe.Net('deploy_trained_net_with_3ch.prototxt', 'train.caffemodel', caffe.TEST)
net4ch = caffe.Net('deploy_empty_net_with_4ch.prototxt', 'train.caffemodel', caffe.TEST)
Since all layer names are identical (apart from conv1), net4ch.params will have the weights of train.caffemodel. As for the first layer, you can now manually copy the relevant part:
net4ch.params['conv1_4ch'][0].data[:,:3,:,:] = net3ch.params['conv1'][0].data[...]
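The copy itself is plain NumPy slicing on the `data` arrays. A minimal sketch of the channel-padding idea, using dummy arrays in place of the real caffe params (the shapes and the explicit zero-fill of the 4th channel are assumptions, not caffe API):

```python
import numpy as np

# Stand-in for net3ch.params['conv1'][0].data:
# 64 filters, 3 input channels, 3x3 kernels.
w3 = np.random.randn(64, 3, 3, 3).astype(np.float32)

# Stand-in for net4ch.params['conv1_4ch'][0].data:
# same filters, but 4 input channels, initialized to zero.
w4 = np.zeros((64, 4, 3, 3), dtype=np.float32)

# Copy the trained weights into the first 3 channels;
# the 4th channel stays all zeros, as the question requires.
w4[:, :3, :, :] = w3
```

If the empty net's weights are not guaranteed to start at zero, you can zero the new channel explicitly the same way (`data[:, 3, :, :] = 0` on the real param blob) before saving.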
Finally:
net4ch.save('myNewTrainModel.caffemodel')