Getting a machine learning model from experiment to production simply and quickly has long been a challenge. The task is to take a trained model and expose it as a prediction service. In production, this process needs to be reproducible, isolated, and secure. Here we use TensorFlow Serving on Docker to accomplish it with minimal effort. TensorFlow has supported Docker deployment, for both CPU and GPU, since version 1.8, which makes this very convenient.
Getting a trained model

The first step is, of course, to train a model, but that is not the focus of this post, so we will use an already-trained model instead, such as ResNet. TensorFlow Serving loads models in the SavedModel format. SavedModel is a language-neutral, recoverable, hermetic serialization format that enables higher-level systems and tools to produce, consume, and transform TensorFlow models. Here we simply download a pretrained model:
$ mkdir /tmp/resnet
$ curl -s https://storage.googleapis.com/download.tensorflow.org/models/official/20181001_resnet/savedmodels/resnet_v2_fp32_savedmodel_NHWC_jpg.tar.gz | tar --strip-components=2 -C /tmp/resnet -xvz
If the model was built with another framework such as Keras, it needs to be converted to the SavedModel format first, for example:
from keras.models import Sequential
from keras import backend as K
import tensorflow as tf

model = Sequential()
# ... model construction omitted ...

# Convert the model to a SavedModel
signature = tf.saved_model.signature_def_utils.predict_signature_def(
    inputs={"input_param": model.input}, outputs={"type": model.output})
builder = tf.saved_model.builder.SavedModelBuilder("/tmp/output_model_path/1/")
builder.add_meta_graph_and_variables(
    sess=K.get_session(),
    tags=[tf.saved_model.tag_constants.SERVING],
    signature_def_map={
        tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature
    })
builder.save()
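The snippet above uses the TensorFlow 1.x builder API. For reference, in TensorFlow 2.x the same export becomes a single call; the following is only a minimal sketch with a placeholder model, not part of the original walkthrough:

import tensorflow as tf

# Placeholder Keras model, only so there is something to export.
model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# tf.saved_model.save writes the SavedModel directly; the trailing "1"
# is the version directory that TensorFlow Serving expects.
tf.saved_model.save(model, "/tmp/output_model_path/1/")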
After the download completes, the directory tree looks like this:
$ tree /tmp/resnet
/tmp/resnet
└── 1538687457
    ├── saved_model.pb
    └── variables
        ├── variables.data-00000-of-00001
        └── variables.index

Deploying the model
Deploy the model service with Docker:
$ docker pull tensorflow/serving
$ docker run -p 8500:8500 -p 8501:8501 --name tfserving_resnet --mount type=bind,source=/tmp/resnet,target=/models/resnet -e MODEL_NAME=resnet -t tensorflow/serving
Here, port 8500 is TensorFlow Serving's gRPC port, and port 8501 serves the REST API. The -e MODEL_NAME=resnet option tells TensorFlow Serving which model to load, in this case resnet. The command above produces the following output:
2019-03-04 02:52:26.610387: I tensorflow_serving/model_servers/server.cc:82] Building single TensorFlow model file config: model_name: resnet model_base_path: /models/resnet
2019-03-04 02:52:26.618200: I tensorflow_serving/model_servers/server_core.cc:461] Adding/updating models.
2019-03-04 02:52:26.618628: I tensorflow_serving/model_servers/server_core.cc:558] (Re-)adding model: resnet
2019-03-04 02:52:26.745813: I tensorflow_serving/core/basic_manager.cc:739] Successfully reserved resources to load servable {name: resnet version: 1538687457}
2019-03-04 02:52:26.745901: I tensorflow_serving/core/loader_harness.cc:66] Approving load for servable version {name: resnet version: 1538687457}
2019-03-04 02:52:26.745935: I tensorflow_serving/core/loader_harness.cc:74] Loading servable version {name: resnet version: 1538687457}
2019-03-04 02:52:26.747590: I external/org_tensorflow/tensorflow/contrib/session_bundle/bundle_shim.cc:363] Attempting to load native SavedModelBundle in bundle-shim from: /models/resnet/1538687457
2019-03-04 02:52:26.747705: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /models/resnet/1538687457
2019-03-04 02:52:26.795363: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { serve }
2019-03-04 02:52:26.828614: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-04 02:52:26.923902: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:162] Restoring SavedModel bundle.
2019-03-04 02:52:28.098479: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:138] Running MainOp with key saved_model_main_op on SavedModel bundle.
2019-03-04 02:52:28.144510: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:259] SavedModel load for tags { serve }; Status: success. Took 1396689 microseconds.
2019-03-04 02:52:28.146646: I tensorflow_serving/servables/tensorflow/saved_model_warmup.cc:83] No warmup data file found at /models/resnet/1538687457/assets.extra/tf_serving_warmup_requests
2019-03-04 02:52:28.168063: I tensorflow_serving/core/loader_harness.cc:86] Successfully loaded servable version {name: resnet version: 1538687457}
2019-03-04 02:52:28.174902: I tensorflow_serving/model_servers/server.cc:286] Running gRPC ModelServer at 0.0.0.0:8500 ...
[warn] getaddrinfo: address family for nodename not supported
2019-03-04 02:52:28.186724: I tensorflow_serving/model_servers/server.cc:302] Exporting HTTP/REST API at:localhost:8501 ...
[evhttp_server.cc : 237] RAW: Entering the event loop ...
As we can see, TensorFlow Serving uses 1538687457 as the model's version number. We can query the service status with curl, which also shows the serving model version and its state.
$ curl http://localhost:8501/v1/models/resnet
{
  "model_version_status": [
    {
      "version": "1538687457",
      "state": "AVAILABLE",
      "status": {
        "error_code": "OK",
        "error_message": ""
      }
    }
  ]
}

Inspecting the model's inputs and outputs
We often need to know the exact shape of a model's input and output parameters. TensorFlow provides the saved_model_cli command for inspecting them:
$ saved_model_cli show --dir /tmp/resnet/1538687457/ --all

MetaGraphDef with tag-set: "serve" contains the following SignatureDefs:

signature_def["predict"]:
  The given SavedModel SignatureDef contains the following input(s):
    inputs["image_bytes"] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs["classes"] tensor_info:
        dtype: DT_INT64
        shape: (-1)
        name: ArgMax:0
    outputs["probabilities"] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1001)
        name: softmax_tensor:0
  Method name is: tensorflow/serving/predict

signature_def["serving_default"]:
  The given SavedModel SignatureDef contains the following input(s):
    inputs["image_bytes"] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs["classes"] tensor_info:
        dtype: DT_INT64
        shape: (-1)
        name: ArgMax:0
    outputs["probabilities"] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1001)
        name: softmax_tensor:0
  Method name is: tensorflow/serving/predict
Note the signature_def names, the input names and types, and the outputs; these parameters are needed for the prediction requests that follow.
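The same signature information can also be fetched from the running server over REST. Here is a minimal sketch using the requests library; the /metadata endpoint is part of TensorFlow Serving's REST API, and the nested field access below assumes the response layout documented there:

import requests

# Model status, as with the earlier curl call.
status = requests.get("http://localhost:8501/v1/models/resnet").json()
print(status["model_version_status"][0]["state"])  # "AVAILABLE" once loaded

# Signature metadata, equivalent to what saved_model_cli prints.
metadata = requests.get("http://localhost:8501/v1/models/resnet/metadata").json()
print(list(metadata["metadata"]["signature_def"]["signature_def"].keys()))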
Making predictions through the model API: REST and gRPC

TensorFlow Serving accepts requests in two ways, a REST API and gRPC. We will walk through both below.
REST

First we download a client script. It fetches a picture of a cat and uses it to measure the request latency of the service.
$ curl -o /tmp/resnet/resnet_client.py https://raw.githubusercontent.com/tensorflow/serving/master/tensorflow_serving/example/resnet_client.py
The script uses the requests library to call the API, sending the base64-encoded image as the request body; it returns the image's predicted class and computes the average processing time.
from __future__ import print_function

import base64
import requests

# The server URL specifies the endpoint of your server running the ResNet
# model with the name "resnet" and using the predict interface.
SERVER_URL = "http://localhost:8501/v1/models/resnet:predict"

# The image URL is the location of the image we should send to the server
IMAGE_URL = "https://tensorflow.org/images/blogs/serving/cat.jpg"


def main():
    # Download the image
    dl_request = requests.get(IMAGE_URL, stream=True)
    dl_request.raise_for_status()

    # Compose a JSON Predict request (send JPEG image in base64).
    jpeg_bytes = base64.b64encode(dl_request.content).decode("utf-8")
    predict_request = '{"instances" : [{"b64": "%s"}]}' % jpeg_bytes

    # Send few requests to warm-up the model.
    for _ in range(3):
        response = requests.post(SERVER_URL, data=predict_request)
        response.raise_for_status()

    # Send few actual requests and report average latency.
    total_time = 0
    num_requests = 10
    for _ in range(num_requests):
        response = requests.post(SERVER_URL, data=predict_request)
        response.raise_for_status()
        total_time += response.elapsed.total_seconds()
        prediction = response.json()["predictions"][0]

    print("Prediction class: {}, avg latency: {} ms".format(
        prediction["classes"], (total_time * 1000) / num_requests))


if __name__ == "__main__":
    main()
The output is:
$ python resnet_client.py
Prediction class: 286, avg latency: 210.12310000000002 ms

gRPC
Now let's download another client script. This one talks to the service over gRPC, sending the image and retrieving the prediction. It requires the tensorflow-serving-api package.
$ curl -o /tmp/resnet/resnet_client_grpc.py https://raw.githubusercontent.com/tensorflow/serving/master/tensorflow_serving/example/resnet_client_grpc.py
$ pip install tensorflow-serving-api
The script's contents:
from __future__ import print_function

# This is a placeholder for a Google-internal import.

import grpc
import requests
import tensorflow as tf

from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

# The image URL is the location of the image we should send to the server
IMAGE_URL = "https://tensorflow.org/images/blogs/serving/cat.jpg"

tf.app.flags.DEFINE_string("server", "localhost:8500", "PredictionService host:port")
tf.app.flags.DEFINE_string("image", "", "path to image in JPEG format")
FLAGS = tf.app.flags.FLAGS


def main(_):
    if FLAGS.image:
        with open(FLAGS.image, "rb") as f:
            data = f.read()
    else:
        # Download the image since we weren't given one
        dl_request = requests.get(IMAGE_URL, stream=True)
        dl_request.raise_for_status()
        data = dl_request.content

    channel = grpc.insecure_channel(FLAGS.server)
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
    # Send request
    # See prediction_service.proto for gRPC request/response details.
    request = predict_pb2.PredictRequest()
    request.model_spec.name = "resnet"
    request.model_spec.signature_name = "serving_default"
    request.inputs["image_bytes"].CopyFrom(
        tf.contrib.util.make_tensor_proto(data, shape=[1]))
    result = stub.Predict(request, 10.0)  # 10 secs timeout
    print(result)


if __name__ == "__main__":
    tf.app.run()
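The script simply prints the raw PredictResponse. If you need the values programmatically, the TensorProto fields in the response can be converted to NumPy arrays; a minimal sketch (the helper name top_prediction is ours, not part of the original script):

import tensorflow as tf

def top_prediction(result):
    # Convert the TensorProtos in the PredictResponse into NumPy arrays.
    classes = tf.make_ndarray(result.outputs["classes"])
    probs = tf.make_ndarray(result.outputs["probabilities"])
    # Return the predicted class id and its probability.
    return int(classes[0]), float(probs[0].max())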
In the output you can see the image's predicted class, the probabilities, and information about the model used:
$ python resnet_client_grpc.py
outputs {
  key: "classes"
  value {
    dtype: DT_INT64
    tensor_shape {
      dim {
        size: 1
      }
    }
    int64_val: 286
  }
}
outputs {
  key: "probabilities"
  value {
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: 1
      }
      dim {
        size: 1001
      }
    }
    float_val: 2.4162832232832443e-06
    float_val: 1.9012182974620373e-06
    float_val: 2.7247710022493266e-05
    float_val: 4.426385658007348e-07
    ...(omitted)
    float_val: 1.4636580090154894e-05
    float_val: 5.812107133351674e-07
    float_val: 6.599806511076167e-05
    float_val: 0.0012952701654285192
  }
}
model_spec {
  name: "resnet"
  version {
    value: 1538687457
  }
  signature_name: "serving_default"
}

Performance

Improving performance by compiling an optimized TensorFlow Serving binary
TensorFlow Serving sometimes logs messages like the following:
Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
The published TensorFlow Serving Docker images are built to run on as many CPU architectures as possible, so some optimizations are omitted to maximize compatibility. If you do not see this message, your binary is probably already optimized for your CPU. Depending on the operations your model performs, these optimizations can have a significant effect on serving performance. Fortunately, compiling an optimized TensorFlow Serving binary is straightforward: the project provides automated build scripts, and it takes the following two steps:
# 1. Build the development image
$ docker build -t $USER/tensorflow-serving-devel -f Dockerfile.devel https://github.com/tensorflow/serving.git#:tensorflow_serving/tools/docker

# 2. Produce the optimized serving image
$ docker build -t $USER/tensorflow-serving --build-arg TF_SERVING_BUILD_IMAGE=$USER/tensorflow-serving-devel https://github.com/tensorflow/serving.git#:tensorflow_serving/tools/docker
Afterwards, simply restart the service using the newly built $USER/tensorflow-serving image.
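For example, reusing the same flags as the earlier docker run command, only with the optimized image (removing the old container first):

$ docker rm -f tfserving_resnet
$ docker run -p 8500:8500 -p 8501:8501 --name tfserving_resnet --mount type=bind,source=/tmp/resnet,target=/models/resnet -e MODEL_NAME=resnet -t $USER/tensorflow-serving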
Summary

Above, we quickly walked through deploying a machine learning service with TensorFlow Serving and Docker. As you can see, TensorFlow Serving provides convenient and efficient model management, and combined with Docker it lets you stand up a machine learning service very quickly.
References

Serving ML Quickly with TensorFlow Serving and Docker
Train and serve a TensorFlow model with TensorFlow Serving