Train multiple keras/tensorflow models on different GPUs simultaneously

Problem description:

I would like to train multiple models on multiple GPUs simultaneously from within a Jupyter notebook. I am working on a node with 4 GPUs. I would like to assign one GPU to each model and train 4 different models at the same time. Right now, I select a GPU for one notebook by (e.g.):

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'

model = ...  # build and compile a Keras model here
model.fit(...)

in four different notebooks. However, the results and the output of the fitting procedure are then scattered across four different notebooks, while running the models sequentially in a single notebook takes a lot of time. How do you assign GPUs to individual functions and run them in parallel?

I recommend using TensorFlow device scopes, like so:

with tf.device('/gpu:0'):
  model1.fit()
with tf.device('/gpu:1'):
  model2.fit()
with tf.device('/gpu:2'):
  model3.fit()