Unable to unpickle an object when predicting with Gunicorn

Question

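I am currently serving a next-word prediction model through an API. The model runs fine under Flask, but unpickling the object fails when I deploy with Gunicorn. The pickled object depends on a class definition, which I provide explicitly wherever it is needed.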

class LanguageModel(nn.Module):
    def __init__(self, vocab_size, embedding_size, hidden_size, n_layers=1, dropout_p=0.5):
        # Defining layers
        super(LanguageModel, self).__init__()
        self.n_layers = n_layers
        self.hidden_size = hidden_size
        self.embed = nn.Embedding(vocab_size, embedding_size)
        self.rnn = nn.LSTM(embedding_size, hidden_size, n_layers, batch_first=True)
        self.linear = nn.Linear(hidden_size, vocab_size)
        self.dropout = nn.Dropout(dropout_p)

    def init_weight(self):
        # self.embed.weight = nn.init.xavier_uniform(self.embed.weight)
        self.embed.weight.data.copy_(torch.from_numpy(new_w))
        self.linear.weight = nn.init.xavier_uniform(self.linear.weight)
        self.linear.bias.data.fill_(0)

# importing word indexes
with open(w2i, "rb") as f1:
    word2index = pickle.load(f1)

with open(i2w, "rb") as f2:
    index2word = pickle.load(f2)
# loading model
model = torch.load(wordModel)


def getNextWords(words):
    results = []
    data = [words]
    data = flatten([co.strip().split() + ['</s>'] for co in data])
    x = prepare_sequence(data, word2index)
    x = x.unsqueeze(1)
    x = batchify(x, 1)

    with torch.no_grad():
        hidden = model.init_hidden(1)
        for batch in getBatch(x, 1):
            inputs, targets = batch
            output, hidden = model(inputs, hidden)
            prob = output.exp()

            word_id = torch.multinomial(prob, num_samples=1).item()
            # word_probs = torch.multinomial(prob, num_samples=1).probs()
            word = index2word[word_id]
            results.append(word)
    return [res for res in results if res.isalpha()][:4]  # return results

app = Flask(__name__)

@app.route('/')
def home():
    return "Home"

@app.route('/getPredictions', methods=["POST"])
def getPredictions():
    #...... code .........
    resultJSON = {'inputPhrase': inputPhrase,
                  'predictions': predictions}  # predictions [nextPhrase]
    print('result: ', predictions)
    return jsonify(resultJSON)


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=3001, debug=True)  # 10.2.1.29

Gunicorn wsgi.py file:

from m_api import app
import torch
import torch.nn as nn
from torch.autograd import Variable

if __name__ == "__main__":
    class LanguageModel(nn.Module):
        def __init__(self, vocab_size, embedding_size, hidden_size, n_layers=1, dropout_p=0.5):
            # Defining layers
            super(LanguageModel, self).__init__()
            self.n_layers = n_layers
            self.hidden_size = hidden_size
            self.embed = nn.Embedding(vocab_size, embedding_size)
            self.rnn = nn.LSTM(embedding_size, hidden_size, n_layers, batch_first=True)
            self.linear = nn.Linear(hidden_size, vocab_size)
            self.dropout = nn.Dropout(dropout_p)
        def init_weight(self):
            # self.embed.weight = nn.init.xavier_uniform(self.embed.weight)
            self.embed.weight.data.copy_(torch.from_numpy(new_w))
            self.linear.weight = nn.init.xavier_uniform(self.linear.weight)
            self.linear.bias.data.fill_(0)

    app.run()

The application works fine when served by Flask, but when I use Gunicorn it throws this error:

    model = torch.load(wordModel)
  File "/home/.conda/envs/sppy36/lib/python3.6/site-packages/torch/serialization.py", line 426, in load
    return _load(f, map_location, pickle_module, **pickle_load_args)
  File "/home/.conda/envs/sppy36/lib/python3.6/site-packages/torch/serialization.py", line 613, in _load
    result = unpickler.load()
AttributeError: Can't get attribute 'LanguageModel' on <module '__main__' from '/home/.conda/envs/sppy36/bin/gunicorn'>

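To resolve this, I also included the class definition in wsgi.py, but the class definition still cannot be found when the pickled file is loaded. I still do not know where the class definition needs to go.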

Answer 1

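The problem arises because the unpickler looks for the class definition in the __main__ module, which under Gunicorn is the gunicorn executable itself. That is why explicitly defining the class in both .py files did not help when running under Gunicorn, even though it worked under Flask. To work around this, I defined the class explicitly in the Gunicorn executable, and that worked. For now, this is the only workable solution I have found.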

gunicorn.py

#!/home/user/anaconda3/envs/envName/bin/python

import re
import sys

from gunicorn.app.wsgiapp import run

import torch
import torch.nn as nn
from torch.autograd import Variable

USE_CUDA = torch.cuda.is_available()

if __name__ == '__main__':
    # defining model class
    class LanguageModel(nn.Module):
        def __init__(self, vocab_size, embedding_size, hidden_size, n_layers=1, dropout_p=0.5):
            # Defining layers
            super(LanguageModel, self).__init__()
            self.n_layers = n_layers
            self.hidden_size = hidden_size
            self.embed = nn.Embedding(vocab_size, embedding_size)
            self.rnn = nn.LSTM(embedding_size, hidden_size, n_layers, batch_first=True)
            self.linear = nn.Linear(hidden_size, vocab_size)
            self.dropout = nn.Dropout(dropout_p)

        def init_weight(self):
            # self.embed.weight = nn.init.xavier_uniform(self.embed.weight)
            self.embed.weight.data.copy_(torch.from_numpy(new_w))
            self.linear.weight = nn.init.xavier_uniform(self.linear.weight)
            self.linear.bias.data.fill_(0)

        def init_hidden(self, batch_size):
            hidden = Variable(torch.zeros(self.n_layers, batch_size, self.hidden_size))
            context = Variable(torch.zeros(self.n_layers, batch_size, self.hidden_size))
            return (hidden.cuda(), context.cuda()) if USE_CUDA else (hidden, context)

        def detach_hidden(self, hiddens):
            return tuple([hidden.detach() for hidden in hiddens])

        def forward(self, inputs, hidden, is_training=False):
            embeds = self.embed(inputs)
            if is_training:
                embeds = self.dropout(embeds)
            out, hidden = self.rnn(embeds, hidden)
            return self.linear(out.contiguous().view(out.size(0) * out.size(1), -1)), hidden


    sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
    sys.exit(run())
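
The workaround above edits the gunicorn executable itself. As a less invasive alternative, here is a minimal sketch (not the author's method) that overrides pickle.Unpickler.find_class so that a class recorded under one module name is looked up in another. The class name RemappingUnpickler and the module names old_main and model_defs are hypothetical stand-ins; in the Gunicorn case the mapping would go from "__main__" to whichever importable module defines LanguageModel. How to hook this into torch.load is not shown here.

```python
import io
import pickle
import sys
import types


class RemappingUnpickler(pickle.Unpickler):
    """Look up classes recorded under one module name in another module."""

    def __init__(self, file, renames):
        super().__init__(file)
        self._renames = renames  # e.g. {"__main__": "m_api"}

    def find_class(self, module, name):
        # Swap the recorded module name for its replacement, if any.
        return super().find_class(self._renames.get(module, module), name)


# Stand-in for the training script: the class is pickled from "old_main".
old_mod = types.ModuleType("old_main")
sys.modules["old_main"] = old_mod
exec(
    "class LanguageModel:\n"
    "    def __init__(self, hidden_size):\n"
    "        self.hidden_size = hidden_size\n",
    old_mod.__dict__,
)
payload = pickle.dumps(old_mod.LanguageModel(64))

# Stand-in for the serving process: the class now lives in "model_defs"
# and is no longer reachable through "old_main".
new_mod = types.ModuleType("model_defs")
sys.modules["model_defs"] = new_mod
new_mod.LanguageModel = old_mod.LanguageModel
del old_mod.LanguageModel

unpickler = RemappingUnpickler(io.BytesIO(payload), {"old_main": "model_defs"})
restored = unpickler.load()
print(restored.hidden_size)
```

The remapping only changes where the class is looked up; the target module still has to define a class compatible with the pickled state.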

Answer 2

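This is late, but I hope someone finds it useful.

The reason it does not work is that when you pickle your own model, pickle stores the names of the module and of the object, not the class definition itself. Judging from your output, you most likely trained the model in a script that also defined the model class, so when the model was pickled its module was recorded as __main__. The problem is that when you run Gunicorn, __main__ is, as you would expect, the script that launches Gunicorn, so when you unpickle the object, pickle looks for the class in the gunicorn script (because the pickled file says it is there).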
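
The mechanism described above can be reproduced without Gunicorn or PyTorch. The following is a small sketch using only the standard library; the module name train_script is a hypothetical stand-in for the script that trained and pickled the model.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the script that trained and pickled the model.
train_mod = types.ModuleType("train_script")
sys.modules["train_script"] = train_mod
exec("class LanguageModel:\n    pass\n", train_mod.__dict__)

# Pickling records the module name and the class name, not the class body.
payload = pickle.dumps(train_mod.LanguageModel())
has_module_name = b"train_script" in payload
has_class_name = b"LanguageModel" in payload

# Simulate a process in which the module of that name lacks the class,
# as the gunicorn executable does when it is __main__.
saved_cls = train_mod.LanguageModel
del train_mod.LanguageModel
try:
    pickle.loads(payload)
    load_failed = False
except AttributeError as err:
    load_failed = True
    print("unpickling failed:", err)

# Once the class is reachable under the recorded name, loading works again.
train_mod.LanguageModel = saved_cls
restored = pickle.loads(payload)
print(type(restored).__module__, type(restored).__name__)
```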

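To overcome this, define a directory (a package) that holds the model before you train it, something like

directory/__init__.py
directory/model.py

In __init__.py, make sure there is something like

from .model import ModelClass

Now train with the model imported like this:

from directory import ModelClass

Train it and pickle it.

Then, under Gunicorn, the .py file that unpickles the model must be able to see the same directory/*.py structure, because pickle.load will try to find something like

directory.model....ModelClass

You can actually print the first line of the pickle file, reading it as binary, and you will see this structure.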
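
The layout above can be tried end to end in a temporary folder. This is a sketch under assumptions: the package and class names (directory, ModelClass) follow the answer, while the hidden_size attribute and the temporary location are made up for illustration.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Build the layout from the answer in a temporary folder.
root = Path(tempfile.mkdtemp())
pkg = root / "directory"
pkg.mkdir()
(pkg / "model.py").write_text(
    "class ModelClass:\n"
    "    def __init__(self, hidden_size):\n"
    "        self.hidden_size = hidden_size\n"
)
(pkg / "__init__.py").write_text("from .model import ModelClass\n")

# The package must be importable in both the training and serving processes.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
from directory import ModelClass

# The pickle now records "directory.model" instead of "__main__".
payload = pickle.dumps(ModelClass(64))
records_package_path = b"directory.model" in payload

restored = pickle.loads(payload)
print(records_package_path, restored.hidden_size)
```

Because the recorded module is directory.model rather than __main__, the pickle loads in any process that can import the package, whichever script happens to be the entry point.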

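Suggestion: the best option, and probably best practice, is not to pickle complex custom class objects. It is better to stick to pickling objects whose classes come from packages installed through pip, since those live in Python's standard package locations and you do not need to watch out for the directory structure when pickling them. So whenever possible, pickle objects of those classes, e.g. sklearn models, dictionaries, and lists, then load the model, unpickle all of those files, and send them as p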
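
One way to follow this suggestion is to pickle only built-in types and rebuild the custom object after loading. The sketch below assumes a hypothetical TinyModel class with made-up to_state and from_state helpers; none of these names come from the question.

```python
import pickle


class TinyModel:
    """Hypothetical stand-in for a custom model class."""

    def __init__(self, vocab_size, weights):
        self.vocab_size = vocab_size
        self.weights = weights

    def to_state(self):
        # Only built-in types (dict, int, list, float) go into the pickle.
        return {"vocab_size": self.vocab_size, "weights": list(self.weights)}

    @classmethod
    def from_state(cls, state):
        return cls(state["vocab_size"], state["weights"])


model = TinyModel(vocab_size=3, weights=[0.1, 0.2, 0.7])
payload = pickle.dumps(model.to_state())

# The payload holds no reference to TinyModel, so loading it does not
# depend on which module is __main__.
mentions_class = b"TinyModel" in payload
restored = TinyModel.from_state(pickle.loads(payload))
print(mentions_class, restored.weights)
```

In PyTorch, the analogous pattern is to save model.state_dict() with torch.save and restore it with load_state_dict on a freshly constructed model, so the class is resolved by a normal import rather than by the unpickler.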
