Net surgery: How to reshape a convolution layer of a caffemodel file in Caffe?

Question:

I'm trying to reshape a convolution layer of a caffemodel (this is a follow-up question to this question). Although there is a tutorial on how to do net surgery, it only shows how to copy weight parameters from one caffemodel to another of the same size.
Instead, I need to add a new channel (all zeros) to my convolution filters, so that their shape changes from the current (64 x 3 x 3 x 3) to (64 x 4 x 3 x 3).

Say the convolution layer is called 'conv1'. This is what I have tried so far:

import caffe

# Load the original network together with its trained weights.
net = caffe.Net('../models/train.prototxt',
                '../models/train.caffemodel',
                caffe.TRAIN)

Now I can do this:

net.blobs['conv1'].reshape(64, 4, 3, 3)
net.save('myNewTrainModel.caffemodel')

But the saved model does not seem to have changed. I've read that the actual weights of the convolution are stored in net.params['conv1'][0].data rather than in net.blobs, but I can't figure out how to reshape the net.params object. Does anyone have an idea?

As you correctly noted, net.blobs does not store the learned parameters/weights; it stores the results of applying the filters/activations to the net's input. The learned weights are stored in net.params. (see this for more details).
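To make the distinction concrete, here is a minimal sketch in plain Python (no Caffe required) of the shapes involved for a convolution layer. The helper name conv_shapes and the 224 x 224 input size are assumptions made for illustration only; the output-size formula is the standard one for a convolution without dilation.

```python
def conv_shapes(batch, in_channels, height, width, num_output, kernel, stride=1, pad=0):
    """Return (activation_shape, weight_shape, bias_shape) for a 2-D convolution layer."""
    out_h = (height + 2 * pad - kernel) // stride + 1
    out_w = (width + 2 * pad - kernel) // stride + 1
    # Shape of the layer's output, i.e. what net.blobs['conv1'] would hold.
    activation_shape = (batch, num_output, out_h, out_w)
    # Shape of the learned filters, i.e. net.params['conv1'][0].
    weight_shape = (num_output, in_channels, kernel, kernel)
    # Shape of the learned bias, i.e. net.params['conv1'][1] (if the layer has one).
    bias_shape = (num_output,)
    return activation_shape, weight_shape, bias_shape


# Hypothetical input: one 3-channel 224 x 224 image, 64 filters of size 3 x 3.
act, w, b = conv_shapes(batch=1, in_channels=3, height=224, width=224,
                        num_output=64, kernel=3)
print(act, w, b)
```

Only the weight shape depends on the number of input channels, which is why reshaping net.blobs['conv1'] resizes the activation buffer but leaves the saved filters untouched.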

AFAIK, you cannot directly reshape net.params and add a channel.
What you can do is have two nets, deploy_trained_net_with_3ch.prototxt and deploy_empty_net_with_4ch.prototxt. The two files can be almost identical apart from the input shape definition and the first layer's name.
Then you can load both nets in Python and copy the relevant part:

net3ch = caffe.Net('deploy_trained_net_with_3ch.prototxt', 'train.caffemodel', caffe.TEST) 
net4ch = caffe.Net('deploy_empty_net_with_4ch.prototxt', 'train.caffemodel', caffe.TEST) 

Since all layer names are identical (apart from conv1), net4ch.params will have the weights of train.caffemodel for every other layer. Caffe matches weights by layer name, so conv1_4ch, which has no counterpart in train.caffemodel, keeps its initial values. As for this first layer, you can now manually copy the relevant part:

net4ch.params['conv1_4ch'][0].data[:,:3,:,:] = net3ch.params['conv1'][0].data[...]
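The same channel padding can be sketched with NumPy alone, which makes it easy to check the result before touching the real nets. This is an illustrative sketch, not part of the original answer: pad_input_channels is a hypothetical helper, and the random array stands in for net3ch.params['conv1'][0].data.

```python
import numpy as np


def pad_input_channels(weights, new_in_channels):
    """Return a copy of conv weights with zero-filled input channels appended."""
    num_output, in_channels, kh, kw = weights.shape
    if new_in_channels < in_channels:
        raise ValueError("new_in_channels must be >= the current channel count")
    padded = np.zeros((num_output, new_in_channels, kh, kw), dtype=weights.dtype)
    # Keep the trained filters in the first channels; the rest stay zero.
    padded[:, :in_channels, :, :] = weights
    return padded


# Stand-in for the trained 3-channel filters (assumption: random values).
w3 = np.random.rand(64, 3, 3, 3).astype(np.float32)
w4 = pad_input_channels(w3, 4)
print(w4.shape)
```

With the real nets, the padded array could be written back with net4ch.params['conv1_4ch'][0].data[...] = w4, which also guarantees the fourth channel is zero even if the prototxt specifies a non-constant weight filler. Because the layer was renamed, its bias is not loaded from train.caffemodel either; if conv1 has a bias term, it would likely need to be copied as well, e.g. net4ch.params['conv1_4ch'][1].data[...] = net3ch.params['conv1'][1].data.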

Finally:

net4ch.save('myNewTrainModel.caffemodel')