How to load a model saved in a joblib file from a Google Cloud Storage bucket

Question:

I want to load a model that is saved as a joblib file from a Google Cloud Storage bucket. When the file is on a local path, we can load it as follows (where model_file is the full path on the system):

loaded_model = joblib.load(model_file)

How can we do the same with a file stored in Google Cloud Storage?

Answer:

I don't think that's possible, at least not directly. I thought of a workaround, but it might not be as efficient as you want.

By using the Google Cloud Storage client libraries [1], you can download the model file first, load it, and delete it when your program ends. Of course, this means that you need to download the file every time you run the code. Here is a snippet:

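import os

# sklearn.externals.joblib was removed in scikit-learn 0.23;
# use the standalone joblib package instead
import joblib
from google.cloud import storage

storage_client = storage.Client()
bucket_name = '<bucket name>'  # replace with the name of your bucket
model_bucket = 'model.joblib'  # name of the object inside the bucket
model_local = 'local.joblib'   # local path to download to

bucket = storage_client.get_bucket(bucket_name)
# select the file (blob) in the bucket
blob = bucket.blob(model_bucket)
# download that file and save it as 'local.joblib'
blob.download_to_filename(model_local)
# load the model from the local file
loaded_model = joblib.load(model_local)
# delete the local copy once the model is loaded
os.remove(model_local)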
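
The workaround above can also be wrapped in small helpers. The following is only a sketch, not part of the original answer. It assumes a reasonably recent google-cloud-storage release (one that provides Blob.download_as_bytes) and relies on joblib.load accepting a file object as well as a path. The function names load_joblib_via_tempfile, load_joblib_in_memory and get_blob are hypothetical and chosen for illustration. The first helper downloads to a temporary file and always deletes it afterwards; the second skips the local file and loads from an in-memory buffer, which may be an option when the model fits comfortably in RAM.

```python
import io
import os
import tempfile

import joblib


def load_joblib_via_tempfile(blob):
    """Download the blob to a temporary file, load it, then delete the file."""
    # delete=False plus an explicit close lets the file be reopened by name
    # on every platform; the finally block below removes it.
    tmp = tempfile.NamedTemporaryFile(suffix=".joblib", delete=False)
    tmp.close()
    try:
        blob.download_to_filename(tmp.name)
        return joblib.load(tmp.name)
    finally:
        os.remove(tmp.name)


def load_joblib_in_memory(blob):
    """Load the blob from an in-memory buffer, without writing a local file."""
    return joblib.load(io.BytesIO(blob.download_as_bytes()))


def get_blob(bucket_name, blob_name):
    """Hypothetical helper: build a blob handle using default credentials."""
    # Imported lazily so the loaders above work with any blob-like object.
    from google.cloud import storage
    return storage.Client().bucket(bucket_name).blob(blob_name)
```

With credentials configured, a call might look like model = load_joblib_in_memory(get_blob('<bucket name>', 'model.joblib')). Either way the file is still downloaded on every run, so the trade-off described in the answer remains.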