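I'm currently trying to get a seq2seq model working with TF Serving. I thought I had it right, but apparently I don't. I originally trained the model on input from local text files, read in as batches. Now I want to pass in a single sentence and have it return a summary.
I've successfully saved the model and can now see predictions on my front-end page, but the results are still drawn from my local text file rather than from the sentence I pass in as a query parameter.
My input is currently a single sentence sent as a query parameter, but the result shown still comes from my text file, even though I map batch_x to the value of arg[1], which I have verified is the correct, expected input.
Does anyone see what I'm doing wrong? Clearly I've misunderstood the process I'm supposed to follow.
One important note: if I change the value of the argument passed in and call the Python file directly, I get the correct result. However, when I make the same call against the frozen model being served, I always get the same prediction back no matter what I send.
Here is how I freeze the model (note the mapping of inputs_dict["X"] to batch_x ... I believe the problem is something I'm doing wrong here):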
import os
import pickle

import tensorflow as tf

# Model, build_dict, build_dataset and batch_iter are the project's own
# helpers; their imports are not shown in this post.

pickle_fn = 'args.pickle'
folder = os.path.dirname(os.path.abspath(__file__)) + '/pickle'
pickle_filepath = os.path.join(folder, pickle_fn)
with open(pickle_filepath, "rb") as f:
    args = pickle.load(f)

print("Loading dictionary...")
word_dict, reversed_dict, article_max_len, summary_max_len = build_dict("valid", args.toy)
print("Loading validation dataset...")
# The call below pulls from the argument passed in when "serve" is used
valid_x, valid_y = build_dataset("serve", word_dict, article_max_len, summary_max_len, args.toy)
valid_x_len = list(map(lambda x: len([y for y in x if y != 0]), valid_x))
with tf.Session() as sess:
    print("Loading saved model...")
    model = Model(reversed_dict, article_max_len, summary_max_len, args, forward_only=True)
    saver = tf.train.Saver(tf.global_variables())
    ckpt = tf.train.get_checkpoint_state("./saved_model/")
    saver.restore(sess, ckpt.model_checkpoint_path)

    batches = batch_iter(valid_x, valid_y, args.batch_size, 1)
    #print(valid_x, file=open("art_working_inp.txt", "a"))

    print("Writing summaries to 'result.txt'...")
    for batch_x, batch_y in batches:
        # Length of each article, ignoring padding (id 0)
        batch_x_len = list(map(lambda x: len([y for y in x if y != 0]), batch_x))

        valid_feed_dict = {
            model.batch_size: len(batch_x),
            model.X: batch_x,
            model.X_len: batch_x_len,
        }

        prediction = sess.run(model.prediction, feed_dict=valid_feed_dict)
        prediction_output = list(map(lambda x: [reversed_dict[y] for y in x], prediction[:, 0, :]))

        # Save out our model
        cwd = os.getcwd()
        path = os.path.join(cwd, 'simple')
        inputs_dict = {
            "X": tf.convert_to_tensor(batch_x)
        }
        outputs_dict = {
            "prediction": tf.convert_to_tensor(prediction_output)
        }
        tf.saved_model.simple_save(
            sess, path, inputs_dict, outputs_dict
        )
        print('Model Saved')
        # End save model code

        # Save results to file
        with open("result.txt", "a") as f:
            for line in prediction_output:
                summary = list()
                for word in line:
                    if word == "</s>":
                        break
                    if word not in summary:
                        summary.append(word)
                print(" ".join(summary), file=f)

    print('Summaries are saved to "result.txt"...')
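And this is where I call the server for inference. No matter what I put into data, it always spits out the same prediction, which is the one I originally passed in when exporting the model.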
import json
import sys

import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc


def do_inference(hostport):
    """Sends a single Predict request to the PredictionService.

    Args:
        hostport: Host:port address of the PredictionService.
    Returns:
        The PredictResponse returned by the server.
    """
    # connect to server
    channel = grpc.insecure_channel(hostport)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # prepare request object
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'saved_model'

    # Get the input data from our arg
    jsn_inp = sys.argv[1]
    data = json.loads(jsn_inp)['tokenized']
    data = np.array(data)
    request.inputs['X'].CopyFrom(
        tf.contrib.util.make_tensor_proto(data, shape=data.shape, dtype=tf.int64))

    #print(request)
    result = stub.Predict(request, 10.0)  # 10 seconds
    return result
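In case it helps, this is how the dataset is built. I modified the build_dataset function so that it only uses the argument passed in, but that didn't fix the problem either. I thought maybe something similar to a JavaScript closure was going on, so I figured I'd pull the data in this way.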
def build_dataset(step, word_dict, article_max_len, summary_max_len, toy=False):
    if step == "train":
        article_list = get_text_list(train_article_path, toy)
        title_list = get_text_list(train_title_path, toy)
    elif step == "valid":
        article_list = get_text_list(valid_article_path, toy)
        title_list = get_text_list(valid_title_path, toy)
    elif step == "serve":
        # The JSON payload may arrive as the first or the second argument
        arg_to_use = sys.argv[1] if ("tokenized" in sys.argv[1]) else sys.argv[2]
        article_list = [json.loads(arg_to_use)["tokenized"]]
    else:
        raise NotImplementedError

    if step != "serve":
        # Tokenize, map words to ids, then truncate and pad to article_max_len
        x = list(map(lambda d: word_tokenize(d), article_list))
        x = list(map(lambda d: list(map(lambda w: word_dict.get(w, word_dict["<unk>"]), d)), x))
        x = list(map(lambda d: d[:article_max_len], x))
        x = list(map(lambda d: d + (article_max_len - len(d)) * [word_dict["<padding>"]], x))
        with open("input_values.txt", "a") as f:
            print(x, file=f)

        y = list(map(lambda d: word_tokenize(d), title_list))
        y = list(map(lambda d: list(map(lambda w: word_dict.get(w, word_dict["<unk>"]), d)), y))
        y = list(map(lambda d: d[:(summary_max_len-1)], y))
    else:
        # The served input is already tokenized into ids; only truncate and pad
        x = article_list
        #x = list(map(lambda d: word_tokenize(d), article_list))
        #x = list(map(lambda d: list(map(lambda w: word_dict.get(w, word_dict["<unk>"]), d)), x))
        x = list(map(lambda d: d[:article_max_len], x))
        x = list(map(lambda d: d + (article_max_len - len(d)) * [word_dict["<padding>"]], x))
        y = list()

    return x, y
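The "serve" branch above picks the JSON payload by checking whether the string "tokenized" appears in sys.argv[1] and otherwise falls back to sys.argv[2], which fails with an IndexError when only one argument is given. A minimal sketch of a more defensive lookup follows; the helper name extract_tokenized is hypothetical and not part of the original code.

```python
import json


def extract_tokenized(argv):
    """Return the "tokenized" list from the first command-line argument
    that parses as a JSON object containing that key.

    Hypothetical helper: raises ValueError if no such argument exists.
    """
    for arg in argv[1:]:  # argv[0] is the script name
        try:
            payload = json.loads(arg)
        except (TypeError, ValueError):
            continue  # not JSON, e.g. a flag or a file name
        if isinstance(payload, dict) and "tokenized" in payload:
            return payload["tokenized"]
    raise ValueError('no argument with a "tokenized" key was found')
```

Inside build_dataset this could be used as article_list = [extract_tokenized(sys.argv)], keeping the rest of the branch unchanged.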
SignatureDef info (one thing that worries me a little is the Const below ... not sure what that is, going to look into it now):
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['X'] tensor_info:
        dtype: DT_INT64
        shape: (1, 50)
        name: Const:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['prediction'] tensor_info:
        dtype: DT_STRING
        shape: (1, 11)
        name: Const_1:0
  Method name is: tensorflow/serving/predict
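In TF 1.x graphs, an op created without an explicit name takes its op type as its name, so a tensor called Const:0 or Const_1:0 usually comes from a constant rather than from a placeholder. As a rough sanity check before serving, the tensor names printed in a SignatureDef can be scanned for that pattern. The sketch below is only a heuristic under that naming assumption (a constant given a custom name would slip through), and the function name is hypothetical.

```python
def constant_backed_entries(signature_entries):
    """Return the signature keys whose tensor name looks like a default-named
    constant op.

    signature_entries maps a signature key (e.g. "X") to the tensor name
    reported for it (e.g. "Const:0"). Heuristic only.
    """
    flagged = []
    for key, tensor_name in signature_entries.items():
        op_name = tensor_name.split(":")[0]   # "Const_1:0" -> "Const_1"
        base_name = op_name.split("/")[-1]    # drop any enclosing name scope
        if base_name == "Const" or base_name.startswith("Const_"):
            flagged.append(key)
    return sorted(flagged)


# Tensor names copied from the SignatureDef shown above
print(constant_backed_entries({"X": "Const:0", "prediction": "Const_1:0"}))
```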
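Okay... so it turns out the Const was my problem, or rather it pointed me to the real problem. The real cause was that I passed tf.convert_to_tensor of my values instead of the tf.placeholders themselves. Calling tf.convert_to_tensor on a Python list or NumPy array creates a constant in the graph, so both the input and the output of the exported signature were constants holding the values from export time, which is why the response never changed. By changing the following entries when saving the model, I got the correct response for the input I sent. As you can see, I also had to add the model's other original inputs, batch_size and X_len. Hope someone else finds this useful.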
inputs_dict = {
    "batch_size": tf.convert_to_tensor(model.batch_size),
    "X": tf.convert_to_tensor(model.X),
    "X_len": tf.convert_to_tensor(model.X_len),
}
outputs_dict = {
    "prediction": tf.convert_to_tensor(model.prediction)
}
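With batch_size and X_len exposed as signature inputs, the client has to send all three values instead of X alone. Below is a minimal sketch of how they could be derived from one tokenized sentence, mirroring the truncate/pad step in build_dataset and the non-zero length count used in the export script. The function name is hypothetical, and pad_id=0 is an assumption based on the export script treating id 0 as padding.

```python
def build_serving_inputs(token_ids, max_len, pad_id=0):
    """Truncate/pad one tokenized sentence and derive the three model inputs.

    Assumes pad_id is the id of "<padding>" (0 in the export script's
    length computation) and that the batch holds a single sentence.
    """
    row = list(token_ids)[:max_len]               # truncate to max_len
    row = row + [pad_id] * (max_len - len(row))   # right-pad to max_len
    x = [row]                                     # batch of one
    x_len = [sum(1 for t in r if t != pad_id) for r in x]
    return {"batch_size": len(x), "X": x, "X_len": x_len}


inputs = build_serving_inputs([5, 9, 2], max_len=5)
print(inputs)
```

Each entry can then be copied into request.inputs under the same key with make_tensor_proto, as do_inference already does for X. The dtype passed there should match what the SignatureDef reports for that input, which is DT_INT32 for X after this change rather than the int64 used earlier.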
The change to inputs_dict and outputs_dict produced a much better-looking SignatureDef:
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['X'] tensor_info:
        dtype: DT_INT32
        shape: (-1, 50)
        name: Placeholder:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['prediction'] tensor_info:
        dtype: DT_INT32
        shape: (-1, 10, -1)
        name: decoder/decoder/transpose_1:0
  Method name is: tensorflow/serving/predict
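Note that the served output is now a tensor of int32 ids rather than strings, so the mapping back to words that the export script did with reversed_dict has to happen on the client. A minimal sketch of that step is below. It assumes the response has already been converted to nested lists (or a NumPy array) of shape (batch, candidates, time) and that candidate 0 is the one to keep, as the prediction[:, 0, :] slice in the export script suggests. The function name is hypothetical.

```python
def decode_top_candidate(prediction, reversed_dict, end_token="</s>"):
    """Turn served id sequences into summary strings.

    Mirrors the post-processing in the export script: keep candidate 0,
    stop at the end token, and skip words already in the summary.
    """
    summaries = []
    for candidates in prediction:      # one entry per input sentence
        top = candidates[0]            # same as prediction[:, 0, :]
        words = []
        for token_id in top:
            word = reversed_dict[token_id]
            if word == end_token:
                break
            if word not in words:
                words.append(word)
        summaries.append(" ".join(words))
    return summaries


# Toy vocabulary for illustration only
toy_dict = {0: "<padding>", 1: "</s>", 2: "stocks", 3: "fall"}
print(decode_top_candidate([[[2, 2, 3, 1, 0], [3, 2, 1, 0, 0]]], toy_dict))
```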