TensorFlow Serving production deployment

Introduction to TF Serving

TensorFlow Serving is a production deployment solution provided by Google. Typically, once an algorithm has been trained, a model is exported and used directly in an application.

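The usual approach is to embed the TensorFlow model in a web service such as Flask or Tornado and expose it as a REST API. For concurrency and high availability this is normally deployed as multiple processes, i.e. several Flask processes run on one cloud server and each holds its own share of GPU resources, which is clearly wasteful.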

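Google offers a different approach for production: a service called TensorFlow Serving that automatically loads all models under a given path. Each model, through its predefined inputs, outputs, and computation graph, is exposed directly as an RPC or REST service.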

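First, it supports hot deployment of multiple versions: if version 1 of a model is serving in production and training produces version 2, TensorFlow Serving automatically loads the new model and unloads the previous one.
Second, TensorFlow Serving handles requests asynchronously internally to achieve high availability, and it can group incoming inputs into batches to save GPU compute.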

The overall call path for the model therefore becomes:

client ----> web service --gRPC or REST--> TensorFlow Serving

To replace a model or update its version, we only need to train the model and save the result into the fixed directory.
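
As a minimal sketch of the web-service side of this call path, the snippet below builds (but does not send) the HTTP request that a Flask or Tornado handler could forward to TensorFlow Serving's REST port. The host `localhost`, the port 8501, the model name `counter`, and the helper name `build_predict_request` are illustrative assumptions, not part of the original setup.

```python
import json
import urllib.request


def build_predict_request(host, model_name, instances, port=8501):
    """Build a POST request for TF Serving's REST predict endpoint.

    `host`, `port`, and `model_name` are placeholders for your deployment.
    """
    url = "http://{}:{}/v1/models/{}:predict".format(host, port, model_name)
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_predict_request("localhost", "counter", [[1.0, 2.0]])
    print(req.get_method(), req.full_url)
    # To actually call a running server:
    # with urllib.request.urlopen(req, timeout=5) as resp:
    #     print(json.load(resp))
```

Keeping the request construction separate from the network call makes the web layer easy to test without a running model server.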

Environment setup

First install nvidia-driver (the GPU driver) and Docker 19.03.

Then install nvidia-docker, NVIDIA's wrapper around Docker that lets containers use GPU resources. For installation details see: https://github.com/NVIDIA/nvidia-docker#quick-start

The installation commands (for a yum-based distribution) are as follows:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.repo | sudo tee /etc/yum.repos.d/nvidia-docker.repo

sudo yum install -y nvidia-container-toolkit

sudo systemctl restart docker

Pull the TF Serving GPU image
```
docker pull tensorflow/serving:latest-gpu
```

Building the model files

Low-level API version

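TF Serving needs the model as a pb file (the SavedModel format) rather than the usual ckpt checkpoint files, so the model has to be exported with the corresponding parameters.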

The following code generates a pb model; it is adapted from mnist_saved_model.py.

First decide on the model's output path. The model is written to a folder named with a number, and that number is the version.

output_dir = "counter"
if not os.path.exists(output_dir):
    os.mkdir(output_dir)
for i in range(100000, 9999999):
    cur = os.path.join(output_dir, str(i))
    if not tf.gfile.Exists(cur):
        output_dir = cur
        break

Create a SavedModelBuilder, the utility class that builds the model:

builder = tf.saved_model.builder.SavedModelBuilder(output_dir)

Build the signatures of the methods the model exposes, along with the types of their inputs and outputs. Here get_counter, incr_counter, incr_counter_by, and reset_counter are the signatures of the four methods.

    signature_def_map = signature({
        "get_counter": {"inputs": {"nothing": nothing},
                        "outputs": {"output": counter},
                        "method_name": method_name},
        "incr_counter": {"inputs": {"nothing": nothing},
                         "outputs": {"output": incr_counter},
                         "method_name": method_name},
        "incr_counter_by": {"inputs": {'delta': delta, },
                            "outputs": {'output': incr_counter_by},
                            "method_name": method_name},
        "reset_counter": {"inputs": {"nothing": nothing},
                          "outputs": {"output": reset_counter},
                          "method_name": method_name}
    })

To define the input and output types we factored out a helper function. tf.saved_model.utils.build_tensor_info builds a protocol buffer from the tensor object passed in; each method signature produces one object, and together they form a signature_dict. When calling a method later, request.model_spec.signature_name must be set to one of these signature keys. Note that each input also has its own key, which is the one specified in request.inputs['nothing']. The remaining parameter, method_name, indicates whether the method is prediction, classification, or regression.

def signature(function_dict):
    # Build one SignatureDef per method; the dict key becomes the signature name.
    signature_dict = {}
    for name, spec in function_dict.items():
        inputs = {key: tf.saved_model.utils.build_tensor_info(tensor)
                  for key, tensor in spec['inputs'].items()}
        outputs = {key: tf.saved_model.utils.build_tensor_info(tensor)
                   for key, tensor in spec['outputs'].items()}
        signature_dict[name] = tf.saved_model.build_signature_def(
            inputs=inputs, outputs=outputs, method_name=spec['method_name'])
    return signature_dict
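
To make the relationship between the signature keys and a request concrete, here is a TensorFlow-free sketch. The dictionary `SIGNATURES` is a hand-written stand-in that mirrors the counter model's signature map (an assumption for illustration, not something read from a SavedModel), and `check_request` is a hypothetical helper that mimics the two lookups: first the signature key, then the input keys.

```python
# Hand-written stand-in for the exported signatures:
# signature key -> expected input keys and output keys.
SIGNATURES = {
    "get_counter": {"inputs": {"nothing"}, "outputs": {"output"}},
    "incr_counter": {"inputs": {"nothing"}, "outputs": {"output"}},
    "incr_counter_by": {"inputs": {"delta"}, "outputs": {"output"}},
    "reset_counter": {"inputs": {"nothing"}, "outputs": {"output"}},
}


def check_request(signature_name, input_keys):
    """Return a list of problems; an empty list means the request matches."""
    if signature_name not in SIGNATURES:
        return ["unknown signature: {}".format(signature_name)]
    expected = SIGNATURES[signature_name]["inputs"]
    given = set(input_keys)
    problems = []
    for key in sorted(expected - given):
        problems.append("missing input: {}".format(key))
    for key in sorted(given - expected):
        problems.append("unexpected input: {}".format(key))
    return problems


if __name__ == "__main__":
    print(check_request("incr_counter_by", ["delta"]))    # matches
    print(check_request("incr_counter_by", ["nothing"]))  # wrong input key
```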

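Add the information to be stored. The tag must be [tf.compat.v1.saved_model.tag_constants.SERVING] to mark the graph as intended for serving. The two arguments main_op=tf.tables_initializer() and strip_default_attrs=True are used to initialize a lookup table and for cross-version compatibility, respectively.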

builder.add_meta_graph_and_variables(sess, tags=[tf.saved_model.tag_constants.SERVING],
                                     signature_def_map=signature_def_map,
                                     main_op=tf.tables_initializer(),
                                     strip_default_attrs=True)

Save the model

  builder.save()

The complete code is as follows:

from __future__ import division, absolute_import, print_function

import os

import tensorflow.compat.v1 as tf  # compatible with TF 2.1

tf.disable_v2_behavior()


def signature(function_dict):
    # Build one SignatureDef per method; the dict key becomes the signature name.
    signature_dict = {}
    for name, spec in function_dict.items():
        inputs = {key: tf.saved_model.utils.build_tensor_info(tensor)
                  for key, tensor in spec['inputs'].items()}
        outputs = {key: tf.saved_model.utils.build_tensor_info(tensor)
                   for key, tensor in spec['outputs'].items()}
        signature_dict[name] = tf.saved_model.build_signature_def(
            inputs=inputs, outputs=outputs, method_name=spec['method_name'])
    return signature_dict


output_dir = "counter"
if not os.path.exists(output_dir):
    os.mkdir(output_dir)
for i in range(100000, 9999999):
    cur = os.path.join(output_dir, str(i))
    if not tf.gfile.Exists(cur):
        output_dir = cur
        break
method_name = tf.saved_model.signature_constants.PREDICT_METHOD_NAME
builder = tf.saved_model.builder.SavedModelBuilder(output_dir)
print('outputdir', output_dir)
with tf.Graph().as_default(), tf.Session() as sess:
    counter = tf.Variable(0.0, dtype=tf.float32, name="counter")
    with tf.name_scope("incr_counter_op", values=[counter]):
        incr_counter = counter.assign_add(1.0)
    delta = tf.placeholder(dtype=tf.float32, name="delta")
    with tf.name_scope("incr_counter_by_op", values=[counter, delta]):
        incr_counter_by = counter.assign_add(delta)
    with tf.name_scope("reset_counter_op", values=[counter]):
        reset_counter = counter.assign(0.0)
    nothing = tf.placeholder(dtype=tf.int32, shape=(None,))
    sess.run(tf.global_variables_initializer())
    signature_def_map = signature({
        "get_counter": {"inputs": {"nothing": nothing},
                        "outputs": {"output": counter},
                        "method_name": method_name},
        "incr_counter": {"inputs": {"nothing": nothing},
                         "outputs": {"output": incr_counter},
                         "method_name": method_name},
        "incr_counter_by": {"inputs": {'delta': delta, },
                            "outputs": {'output': incr_counter_by},
                            "method_name": method_name},
        "reset_counter": {"inputs": {"nothing": nothing},
                          "outputs": {"output": reset_counter},
                          "method_name": method_name}
    })
    
    builder.add_meta_graph_and_variables(sess, tags=[tf.saved_model.tag_constants.SERVING],
                                         signature_def_map=signature_def_map, main_op=tf.tables_initializer(),
                                         strip_default_attrs=True)
    builder.save()
    print("over")

tf.estimator version

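If we export with tf.estimator, we also have to provide the inputs and outputs. The outputs are declared in the instance returned by the model's prediction code: as the code below shows, the export_outputs argument is passed when returning the prediction EstimatorSpec.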

if mode == tf.estimator.ModeKeys.PREDICT:
    predictions = {
      'probabilities': prop,
      'ctr_probabilities': ctr_predictions,
      'cvr_probabilities': cvr_predictions
    }
    export_outputs = {
      'prediction': tf.estimator.export.PredictOutput(predictions)
    }
    return tf.estimator.EstimatorSpec(mode, predictions=predictions, export_outputs=export_outputs)

The inputs, on the other hand, are added when the model is explicitly saved. First build a serving_input_receiver_fn, which tells the model what input to accept; the receiver_tensors here are the parameters TensorFlow Serving will ultimately accept.

The official recommendation is to pass in a tf.Example object and parse it into tensors, but that is a bit cumbersome, because the client then has to send that object as well.

feature_spec = {'foo': tf.FixedLenFeature(...),
                'bar': tf.VarLenFeature(...)}

def serving_input_receiver_fn():
  """An input receiver that expects a serialized tf.Example."""
  serialized_tf_example = tf.placeholder(dtype=tf.string,shape=[default_batch_size],name='input_example_tensor')
  receiver_tensors = {'examples': serialized_tf_example}
  features = tf.parse_example(serialized_tf_example, feature_spec)
  return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

If instead we want the client to pass tensors directly, the function can be rewritten as follows:

def serving_input_receiver_fn():
  
    tensor1 = tf.placeholder(dtype=tf.int32,shape=[None,20],name='tensor1')
    tensor2 = tf.placeholder(dtype=tf.int32,shape=[None,10],name='tensor2')
    receiver_tensors = {'tensor1': tensor1,'tensor2': tensor2}
    features = receiver_tensors
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

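A word on the difference between receiver_tensors and features: receiver_tensors are the tensors received from the client's request and correspond to the placeholders, while features are what is fed into model_fn. If we accepted tf.Example objects, we would first have to parse them into the corresponding tensors. Here that step is unnecessary, because the request parameters are already tensors that can go straight into model_fn without conversion. See the explanation in the linked question: TensorFlow Estimator ServingInputReceiver features vs receiver_tensors: when and why?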

Finally, use the estimator instance's export_savedmodel method to export the model to the export_dir_base folder.

estimator.export_savedmodel(export_dir_base,serving_input_receiver_fn,strip_default_attrs=True)

tf.keras version

Reference: Training and serving a model with REST. The signature is derived by default from the model's inputs and outputs.

tf.keras.models.save_model(
    model,
    export_path,
    overwrite=True,
    include_optimizer=True,
    save_format=None,
    signatures=None,
    options=None
)

# or
model.save("my_model_dir",save_format='tf')

Deploying the model with TensorFlow Serving

Suppose we have the following file structure:

tmp
└── counter
    ├── 100000
    │   ├── saved_model.pb
    │   └── variables
    │       ├── variables.data-00000-of-00001
    │       └── variables.index
    └── 100001
        ├── saved_model.pb
        └── variables
            ├── variables.data-00000-of-00001
            └── variables.index

We can then start the model with the command below, where 8500 is the gRPC port and 8501 is the REST API port; you can also expose just one of them. Adding -e NVIDIA_VISIBLE_DEVICES=0 selects which GPU runs the program.

docker run --runtime=nvidia -p 8500:8500 -p 8501:8501 --mount type=bind,source=/tmp/counter,target=/models/counter -e MODEL_NAME=counter -t tensorflow/serving:latest-gpu &

To deploy several models, the model files can be organized as follows:

multiModel
├── counter
│   └── 100000
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
├── counter1
│   └── 100000
│       ├── saved_model.pb
│       └── variables
│           ├── variables.data-00000-of-00001
│           └── variables.index
└── models.config

The directory must contain a models.config file, which tells the server which models to deploy. The file uses the ASCII (text-format) protocol buffer structure; for what that looks like, see the linked reference: What does the protobuf text format look like

model_config_list:{
    config:{
      name:"counter",
      base_path:"/models/multiModel/counter",
      model_platform:"tensorflow",
      model_version_policy:{
        # policy that loads all versions
        all:{}
      }
      version_labels:{
          key:"stable",
          value:100000
      }
    },
    config:{
      name:"counter1",
      base_path:"/models/multiModel/counter1",
      model_platform:"tensorflow",
      model_version_policy:{
        # policy that loads only the specified version(s)
        specific:{
            versions:100000
        }
      }
    },
}
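
When the list of models changes often, the config can be generated rather than edited by hand. The following is a minimal sketch that renders the same text-format structure from a Python list; `render_model_config` and its input dict keys (`name`, `base_path`, `version`) are hypothetical names chosen for illustration, and the sketch assumes names and paths contain no double quotes. It emits the `versions` field for the `specific` policy and omits `version_labels` for brevity.

```python
def render_model_config(models):
    """Render a model_config_list in protobuf text format.

    `models` is a list of dicts with keys `name`, `base_path`, and an
    optional integer `version`; without a version the `all` policy is used.
    """
    lines = ["model_config_list: {"]
    for m in models:
        lines.append("  config: {")
        lines.append('    name: "{}"'.format(m["name"]))
        lines.append('    base_path: "{}"'.format(m["base_path"]))
        lines.append('    model_platform: "tensorflow"')
        lines.append("    model_version_policy: {")
        if m.get("version") is None:
            lines.append("      all: {}")
        else:
            lines.append("      specific: {{ versions: {} }}".format(m["version"]))
        lines.append("    }")
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_model_config([
        {"name": "counter", "base_path": "/models/multiModel/counter"},
        {"name": "counter1", "base_path": "/models/multiModel/counter1",
         "version": 100000},
    ]))
```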

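Starting the TF Serving service looks like the command below. Note that the allow_version_labels_for_unavailable_models argument must be set to true: the model defined earlier in models.config uses the all:{} policy without naming a specific version to load, and its version label refers to a version that is not yet loaded at startup, so without this argument the container fails to start. The other argument, --model_config_file_poll_wait_seconds=60, makes the server re-check the config file periodically and update the served models dynamically. Both arguments need to be placed at the end of the command.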

docker run --runtime=nvidia -p 8500:8500 -p 8501:8501 --name tf_serving --mount type=bind,source=/home/node1/model/multiModel/,target=/models/multiModel -t tensorflow/serving:latest-gpu --model_config_file=/models/multiModel/models.config --model_config_file_poll_wait_seconds=60 --allow_version_labels_for_unavailable_models=true

Requesting predictions from TensorFlow Serving

The following command retrieves the method and parameter signature information of a served model:

curl http://host:8501/v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]/metadata

Requests through the REST API

(The request parameters here are unrelated to the model above; they are just an example.)

Reference: serving/api_rest

When calling the service in RESTful style, the request URL looks like this:

POST http://host:8501/v1/models/${MODEL_NAME}[/versions/${MODEL_VERSION}]:predict

The /versions/${MODEL_VERSION} part is optional. If it is omitted, the latest version of the model is used, i.e. the one with the largest MODEL_VERSION.
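
The URL patterns for prediction and metadata, with and without the optional version segment, can be sketched as a small helper. The function name `serving_url` and its parameters are illustrative assumptions; only the URL shapes come from the text above.

```python
def serving_url(host, model_name, version=None, action="predict", port=8501):
    """Build a TF Serving REST URL.

    action="predict" yields the ':predict' endpoint and action="metadata"
    yields the '/metadata' endpoint; version=None omits the optional
    '/versions/N' segment, so the server picks the latest version.
    """
    url = "http://{}:{}/v1/models/{}".format(host, port, model_name)
    if version is not None:
        url += "/versions/{}".format(int(version))
    if action == "metadata":
        return url + "/metadata"
    return url + ":" + action


if __name__ == "__main__":
    print(serving_url("host", "counter"))
    print(serving_url("host", "counter", version=100000))
    print(serving_url("host", "counter", action="metadata"))
```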

The request body is a JSON string, and it comes in two formats: row format (also called instance format) and columnar format.

Row format:

{
  "signature_name": <string>,
  "instances": <value>|<(nested)list>|<list-of-objects>
}

Columnar format:

{
  "signature_name": <string>,
  "inputs": <value>|<(nested)list>|<object>
}

The "signature_name" field is the model's method signature, i.e. one of the keys of the signature_def_map defined earlier. If omitted, it defaults to tf.compat.v1.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY, which is "serving_default".

For the row format, suppose we need to pass one instance like this:

{
     "tag": ["foo"],
     "signal": [1, 2, 3, 4, 5],
     "sensor": [1, 2, 3, 4]
   }

That is, we pass the values for the keys "tag", "signal", and "sensor". These three keys must match the input keys defined earlier in prediction_signature. Note that the three named inputs must share the same 0-th dimension; if they do not, use the columnar format.

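The example I use here differs from the official one, because the official description contradicts itself. After consulting the case described in "Tensorflow Serving-Docker RESTful API client access troubleshooting" (http://www.reibang.com/p/eca49219cb19), I am assuming for now that the official text description is the correct part. In fact, if the inputs do not share the same 0-th dimension, we can also wrap the model inputs in one extra outer dimension, which satisfies the same-0-th-dimension requirement.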

Note, each named input ("tag", "signal", "sensor") is implicitly assumed have same 0-th dimension (two in above example, as there are two objects in the instances list). If you have named inputs that have different 0-th dimension, use the columnar format described below.

To pass multiple instances:

{
 "instances": [
   {
     "tag": ["foo"],
     "signal": [1, 2, 3, 4, 5],
     "sensor": [1, 2, 3, 4]
   },
   {
     "tag": ["bar"],
     "signal": [3, 4, 1, 2, 5],
     "sensor": [4, 5, 6, 8]
   }
 ]
}

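The row-format response is a JSON string. If the model's output contains only one named tensor, the name and the predictions key map are omitted and a scalar or a list of values is used directly. If the model outputs multiple named tensors, a list of objects is returned, similar to the row-format input described above.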

{
  "predictions": <value>|<(nested)list>|<list-of-objects>
}

Columnar format

{
 "inputs": {
   "tag": ["foo", "bar"],
   "signal": [[1, 2, 3, 4, 5], [3, 4, 1, 2, 5]],
   "sensor": [[[1, 2], [3, 4]], [[4, 5], [6, 8]]]
 }
}

As you can see, each key is followed by multiple tensors that correspond one to one, and the named inputs do not need to share the same 0-th dimension here.
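
To show how the two formats relate, here is a small sketch that turns a row-format `instances` list into a columnar `inputs` object by stacking each key's values along a new outer dimension. The helper name `rows_to_columns` is hypothetical. Note that the result stacks the per-instance values as they are, so `"tag"` becomes `[["foo"], ["bar"]]` rather than the flattened form used in the columnar example above.

```python
def rows_to_columns(instances):
    """Convert row-format instances (a list of dicts) into a columnar dict.

    Every instance must carry the same keys; stacking them is what gives
    all named inputs the same 0-th dimension in row format.
    """
    if not instances:
        return {}
    keys = set(instances[0])
    columns = {key: [] for key in instances[0]}
    for i, inst in enumerate(instances):
        if set(inst) != keys:
            raise ValueError("instance {} has different keys".format(i))
        for key, value in inst.items():
            columns[key].append(value)
    return columns


if __name__ == "__main__":
    rows = [
        {"tag": ["foo"], "signal": [1, 2, 3, 4, 5], "sensor": [1, 2, 3, 4]},
        {"tag": ["bar"], "signal": [3, 4, 1, 2, 5], "sensor": [4, 5, 6, 8]},
    ]
    print({"inputs": rows_to_columns(rows)})
```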

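The columnar response is also a JSON string, with the key outputs. If the model's output contains only one named tensor, the name and the outputs key map are omitted and a scalar or a list of values is used directly. If the model outputs multiple named tensors, an object is returned whose keys correspond to the output tensor names, similar to the columnar input described above.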

{
  "outputs": <value>|<(nested)list>|<object>
}
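
Since the response key depends on which request format was used, a client may want one function that handles both. The sketch below is one way to do that; `extract_result` is a hypothetical helper, and the handling of an `"error"` key assumes the server reports failures as a JSON object with that key.

```python
import json


def extract_result(response_text):
    """Return the payload of a predict response in either format.

    Row-format responses use the key "predictions" and columnar ones use
    "outputs"; an "error" key is turned into an exception.
    """
    body = json.loads(response_text)
    if "error" in body:
        raise RuntimeError(body["error"])
    for key in ("predictions", "outputs"):
        if key in body:
            return body[key]
    raise KeyError("neither 'predictions' nor 'outputs' in response")


if __name__ == "__main__":
    print(extract_result('{"predictions": [1.0, 2.0]}'))
    print(extract_result('{"outputs": {"output": 3.0}}'))
```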

Requests through gRPC

gRPC calls from Python

A gRPC call needs the proto files in order to generate some supporting code; the relevant proto files are available at the link.

version1: calling through a prepackaged library

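Someone has, of course, already done the proto compilation step for us and published the result as a package named tensorflow-serving-api, from which we can get the generated modules directly. Below is how to build a request with it. First specify model_name and signature_name: model_name is the model's name and signature_name is the method signature. The input tensor has to be converted to protobuf form through the tf.contrib package, and the tensor's key is set to the key specified when the model was saved.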

Note: there are two ways to pin the model version here. One: if a version label was written in models.config, use request.model_spec.version_label = 'stable'. Two: otherwise, use the version number, e.g. request.model_spec.version.value = 100000. The proto file uses the oneof syntax for these two fields, so only one is accepted; if both are set, the one written last wins.

  request = predict_pb2.PredictRequest()
  request.model_spec.name = 'counter'
  request.model_spec.version_label = 'stable'
  request.model_spec.version.value = 100000  # if both version_label and version are set, only the later assignment takes effect
  request.model_spec.signature_name = 'incr_counter'

  # dummy input for the 'nothing' placeholder; the dtype must match its int32 type
  inputs = np.array([0], dtype=np.int32)

  # convert the numpy array to a TensorProto and attach it under the input key
  tensor = tf.contrib.util.make_tensor_proto(inputs, shape=list(inputs.shape))
  request.inputs['nothing'].CopyFrom(tensor)

The complete code is as follows:

from __future__ import print_function
import numpy as np
import time
tt = time.time()

import tensorflow as tf  # TF 1.x: the contrib package used below no longer exists in TF 2.x

import grpc
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc


def main():  
  # create prediction service client stub
  channel = grpc.insecure_channel("127.0.0.1:8500")  # gRPC port (8501 is the REST port); use your serving host
  stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

  # create request
  request = predict_pb2.PredictRequest()
  request.model_spec.name = 'counter'
  request.model_spec.version_label='stable'
  request.model_spec.version.value = 100000  # if both version_label and version are set, only the later assignment takes effect
  request.model_spec.signature_name = 'incr_counter'
  input = np.array([0], dtype=np.int32)  # dtype must match the int32 'nothing' placeholder

  # convert the numpy array to a TensorProto and attach it under the input key
  tensor = tf.contrib.util.make_tensor_proto(input, shape=list(input.shape))
  request.inputs['nothing'].CopyFrom(tensor)
  resp = stub.Predict(request, 30.0)

  print('total time: {}s'.format(time.time() - tt))

if __name__ == '__main__':
    main()

version2: compiling the proto files yourself to generate the dependencies

The two packages above pull in too many dependencies that we do not actually need, so we can compile the proto files ourselves and generate only what is required.

Reference link 1, reference link 2

First download the proto files from the tensorflow and tensorflow serving GitHub repositories:

tensorflow/serving/  
  tensorflow_serving/apis/model.proto
  tensorflow_serving/apis/predict.proto
  tensorflow_serving/apis/prediction_service.proto

tensorflow/tensorflow/  
  tensorflow/core/framework/resource_handle.proto
  tensorflow/core/framework/tensor_shape.proto
  tensorflow/core/framework/tensor.proto
  tensorflow/core/framework/types.proto

Save the files above into a protos folder:

protos/  
  tensorflow_serving/
    apis/
      *.proto
  tensorflow/
    core/
      framework/
        *.proto

For simplicity, prediction_service.proto (the prediction service) can be trimmed down to implement only the Predict RPC. This avoids pulling in the nested dependencies of the other RPCs defined in the service.

Compile with the protoc bundled in grpcio-tools:

PROTOC_OUT=protos/  
PROTOS=$(find . | grep "\.proto$")  
for p in $PROTOS; do  
  python -m grpc_tools.protoc -I . --python_out=$PROTOC_OUT --grpc_python_out=$PROTOC_OUT $p
done
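
For environments without a shell, the same loop can be sketched in Python. The helper below only builds the command lines and leaves running them to the caller; `protoc_commands` is a hypothetical name, and `grpc_tools.protoc` is the module name shipped by current grpcio-tools releases (an assumption about your installed version).

```python
import sys
from pathlib import Path


def protoc_commands(proto_root, out_dir):
    """Return one grpc_tools.protoc command line per .proto file.

    `proto_root` is searched recursively and also used as the include path.
    """
    root = Path(proto_root)
    commands = []
    for proto in sorted(root.rglob("*.proto")):
        commands.append([
            sys.executable, "-m", "grpc_tools.protoc",
            "-I", str(root),
            "--python_out={}".format(out_dir),
            "--grpc_python_out={}".format(out_dir),
            str(proto),
        ])
    return commands


if __name__ == "__main__":
    for cmd in protoc_commands("protos", "protos"):
        print(" ".join(cmd))
        # To run it (requires grpcio-tools to be installed):
        # import subprocess; subprocess.run(cmd, check=True)
```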

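After that the tensorflow-serving-api dependency can be removed. We can also use the modules generated from TensorFlow's own proto files to drop the tensorflow dependency: normally tf.contrib.util.make_tensor_proto is used to build the protocol buffer from a numpy array, but with the generated modules that import is no longer needed.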

I have already generated a copy here:

Link: https://pan.baidu.com/s/1ZcJplXwiGUxzNbLz5pYCzA
Extraction code: aomd

from __future__ import print_function, division, absolute_import

import time

import numpy as np

tt = time.time()

import grpc
from protos.tensorflow_serving.apis import predict_pb2
from protos.tensorflow_serving.apis import prediction_service_pb2_grpc
from protos.tensorflow.core.framework import tensor_pb2
from protos.tensorflow.core.framework import tensor_shape_pb2
from protos.tensorflow.core.framework import types_pb2


def incr_counter():
    # create prediction service client stub
    channel = grpc.insecure_channel("127.0.0.1:8500")  # gRPC port (8501 is the REST port); use your serving host
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    # # create request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'counter'
    request.model_spec.signature_name = 'incr_counter'

    input = np.array([0])
    tensor_shape = list(input.shape)
    dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=dim) for dim in tensor_shape]
    print("+++++++++++")
    print(dims)
    tensor_shape = tensor_shape_pb2.TensorShapeProto(dim=dims)
    tensor = tensor_pb2.TensorProto(
        dtype=types_pb2.DT_INT32,
        tensor_shape=tensor_shape,
        int_val=list(input.reshape(-1)))
    print("+++++++++++")
    print(tensor)
    request.inputs['nothing'].CopyFrom(tensor)
    resp = stub.Predict(request, 5.0)
    print(resp)
    print('total time: {}s'.format(time.time() - tt))


def get_counter():
    # create prediction service client stub
    channel = grpc.insecure_channel("127.0.0.1:8500")  # gRPC port (8501 is the REST port); use your serving host
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    #
    # # create request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'counter'
    request.model_spec.signature_name = 'get_counter'

    input = np.array([0])
    tensor_shape = list(input.shape)
    dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=dim) for dim in tensor_shape]
    print("+++++++++++")
    print(dims)
    tensor_shape = tensor_shape_pb2.TensorShapeProto(dim=dims)
    tensor = tensor_pb2.TensorProto(
        dtype=types_pb2.DT_INT32,
        tensor_shape=tensor_shape,
        int_val=list(input.reshape(-1)))
    print("+++++++++++")
    print(tensor)
    request.inputs['nothing'].CopyFrom(tensor)
    resp = stub.Predict(request, 5.0)
    print(resp)
    print('total time: {}s'.format(time.time() - tt))


def incr_counter_by():
    # create prediction service client stub
    channel = grpc.insecure_channel("127.0.0.1:8500")  # gRPC port (8501 is the REST port); use your serving host
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    #
    # # create request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'counter'
    request.model_spec.signature_name = 'incr_counter_by'

    input = 2
    # the input here must be a scalar, with no dimensions at all
    tensor_shape = tensor_shape_pb2.TensorShapeProto(dim=[])
    tensor = tensor_pb2.TensorProto(
        dtype=types_pb2.DT_FLOAT,
        tensor_shape=tensor_shape,
        float_val=[input])
    print("+++++++++++")
    print(tensor)
    request.inputs['delta'].CopyFrom(tensor)
    resp = stub.Predict(request, 5.0)
    print(resp)
    print('total time: {}s'.format(time.time() - tt))


def reset_counter():
    # create prediction service client stub
    channel = grpc.insecure_channel("127.0.0.1:8500")  # gRPC port (8501 is the REST port); use your serving host
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    # # create request
    request = predict_pb2.PredictRequest()
    request.model_spec.name = 'counter'
    request.model_spec.signature_name = 'reset_counter'

    input = np.array([0])
    tensor_shape = list(input.shape)
    dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=dim) for dim in tensor_shape]
    print("+++++++++++")
    print(dims)
    tensor_shape = tensor_shape_pb2.TensorShapeProto(dim=dims)
    tensor = tensor_pb2.TensorProto(
        dtype=types_pb2.DT_INT32,
        tensor_shape=tensor_shape,
        int_val=list(input.reshape(-1)))
    print("+++++++++++")
    print(tensor)
    request.inputs['nothing'].CopyFrom(tensor)
    resp = stub.Predict(request, 5.0)
    print(resp)
    print('total time: {}s'.format(time.time() - tt))


if __name__ == '__main__':
    incr_counter()
    get_counter()
    incr_counter_by()
    reset_counter()

Here is an explanation of how the message is built from the proto files:

input = np.array([0])
tensor_shape = list(input.shape)
# First build the protobuf dimension info from the shape of the input numpy array; in the metadata it ends up structured like dim: [{size:1,name:""},{size:2,name:""}]
dims = [tensor_shape_pb2.TensorShapeProto.Dim(size=dim) for dim in tensor_shape]
print("+++++++++++")
print(dims)
tensor_shape = tensor_shape_pb2.TensorShapeProto(dim=dims)
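# Then build the tensor protobuf. There are many dtypes to choose from as needed. Note that if the dtype is something like DT_INT32, the actual values must be passed through int_val rather than float_val or the like; otherwise the server will not report an error, but the result will be wrong. See the comments in tensor.proto for which field to use.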
tensor = tensor_pb2.TensorProto(
    dtype=types_pb2.DT_INT32,
    tensor_shape=tensor_shape,
    int_val=list(input.reshape(-1)))
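
Because a mismatched value field fails silently, it can help to pick the field from the dtype in one place. The sketch below uses plain Python lists and a dict in place of numpy and the generated `TensorProto` class, so it runs without any dependencies; `VALUE_FIELD`, `shape_of`, `flatten`, and `tensor_fields` are hypothetical names, and the mapping covers only a few common dtypes from tensor.proto.

```python
# Mapping from dtype name to the TensorProto field that carries its values
# (a subset of the fields documented in tensor.proto).
VALUE_FIELD = {
    "DT_FLOAT": "float_val",
    "DT_DOUBLE": "double_val",
    "DT_INT32": "int_val",
    "DT_INT64": "int64_val",
    "DT_BOOL": "bool_val",
    "DT_STRING": "string_val",
}


def shape_of(nested):
    """Shape of a regular nested list; a scalar has shape []."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return shape


def flatten(nested):
    """Flatten a nested list in row-major order."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


def tensor_fields(values, dtype):
    """Describe, as a plain dict, the TensorProto fields to fill for `values`."""
    return {
        "dtype": dtype,
        "tensor_shape": shape_of(values),
        VALUE_FIELD[dtype]: flatten(values),
    }


if __name__ == "__main__":
    print(tensor_fields([0], "DT_INT32"))   # the 'nothing' input
    print(tensor_fields(2.0, "DT_FLOAT"))   # the scalar 'delta' input
```

An unknown dtype raises a `KeyError` here, which is preferable to sending values in the wrong field.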

Finally, here is how to rebuild a numpy array from the tensor message; change the data types involved to match what the model actually returns.

result_dict = dict()
for key in resp.outputs:
    tensor_proto = resp.outputs[key]
    shape = [d.size for d in tensor_proto.tensor_shape.dim]
    values = np.fromiter(tensor_proto.float_val, dtype=np.float64)  # np.float was removed in NumPy 1.24
    result_dict[key] = values.reshape(shape)
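
If numpy is not available on the client either, the reshape step can be done on plain lists. This is a minimal sketch under the assumption that the values arrive as a flat, row-major sequence and that no dimension after the first is zero; `unflatten` is a hypothetical helper name.

```python
def unflatten(values, shape):
    """Rebuild a nested list of the given shape from a flat, row-major list."""
    values = list(values)
    expected = 1
    for dim in shape:
        expected *= dim
    if len(values) != expected:
        raise ValueError("got {} values for shape {}".format(len(values), shape))
    if not shape:
        return values[0]  # a scalar has an empty shape
    # Group from the innermost dimension outwards.
    for dim in reversed(shape[1:]):
        values = [values[i:i + dim] for i in range(0, len(values), dim)]
    return values


if __name__ == "__main__":
    print(unflatten([0, 1, 2, 3, 4, 5], [2, 3]))
    print(unflatten([7.0], []))
```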

gRPC calls from Java

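The Java client likewise needs the proto files to generate its dependencies first; the proto file layout is the same as before. Then write the gRPC client against the generated classes. Below is a test case for the counter model.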

package com.meituan.test;
 
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
 
import java.util.Arrays;
import java.util.List;
 
import com.google.protobuf.Int64Value;
import org.tensorflow.framework.DataType;
import org.tensorflow.framework.TensorProto;
import org.tensorflow.framework.TensorShapeProto;
 
import tensorflow.serving.Model;
import tensorflow.serving.Predict;
import tensorflow.serving.PredictionServiceGrpc;
public class App {
    public static void main(String[] args) {
        List<Integer> intList =Arrays.asList(1);
        ManagedChannel channel = ManagedChannelBuilder.forAddress("0.0.0.0", 8500).usePlaintext(true).build();
        // use the blocking stub for now
        PredictionServiceGrpc.PredictionServiceBlockingStub stub = PredictionServiceGrpc.newBlockingStub(channel);
        // create the request
        Predict.PredictRequest.Builder predictRequestBuilder = Predict.PredictRequest.newBuilder();
        // set the model name and the signature (method) name
        Model.ModelSpec.Builder modelSpecBuilder = Model.ModelSpec.newBuilder();
        modelSpecBuilder.setName("counter");
        modelSpecBuilder.setSignatureName("incr_counter");
        modelSpecBuilder.setVersion(Int64Value.newBuilder().setValue(100000).build());
        predictRequestBuilder.setModelSpec(modelSpecBuilder);
        // set the input; requests go to the latest version by default, and a specific version is selected with modelSpecBuilder.setVersion as above
        TensorProto.Builder tensorProtoBuilder = TensorProto.newBuilder();
        tensorProtoBuilder.setDtype(DataType.DT_INT32);
        TensorShapeProto.Builder tensorShapeBuilder = TensorShapeProto.newBuilder();
        
        tensorShapeBuilder.addDim(TensorShapeProto.Dim.newBuilder().setSize(1));
        tensorProtoBuilder.setTensorShape(tensorShapeBuilder.build());
        tensorProtoBuilder.addAllIntVal(intList);
        predictRequestBuilder.putInputs("nothing", tensorProtoBuilder.build());
        // send the request and read the result
        Predict.PredictResponse predictResponse = stub.predict(predictRequestBuilder.build());
        org.tensorflow.framework.TensorProto result=predictResponse.toBuilder().getOutputsOrThrow("output");
        System.out.println("Predicted value: "+result.getFloatValList());
    }
}

Last edited: 2020.03.16 12:33:02
