What does google cloud ml-engine do when a JSON request contains "_bytes" or "b64"?

The Google Cloud documentation (see Binary data in prediction input) states:

Your encoded string must be formatted as a JSON object with a single key named b64. The following Python example encodes a buffer of raw JPEG data using the base64 library to make an instance:

{"image_bytes":{"b64": base64.b64encode(jpeg_data)}}

In your TensorFlow model code, you must name the aliases for your input and output tensors so that they end with '_bytes'.

I would like to understand more about how this process works on the Google Cloud side.

Does ml-engine automatically decode any content under the "b64" key into byte data?

When the request has this nested structure, does it pass only the "b64" section to the serving input function and remove the "image_bytes" key?

Is each request passed individually to the serving input function, or are they batched?

Do we define the input/output aliases in the ServingInputReceiver returned by the serving input function?

I have found no way to create a serving input function that uses this nested structure to define the feature placeholders. I only use "b64" in mine, and I am not sure what gcloud ml-engine does on receiving the requests.

Additionally, when predicting locally using gcloud ml-engine local predict, sending the request with the nested structure fails (unexpected key image_bytes, as it is not defined in the serving input function). But when predicting using gcloud ml-engine predict, sending requests with the nested structure works even when the serving input function contains no reference to "image_bytes". gcloud predict also works when leaving out "image_bytes" and passing in just "b64".

An example serving input function

def serving_input_fn():
    feature_placeholders = {
        'b64': tf.placeholder(dtype=tf.string, shape=[None], name='source')}
    single_image = tf.decode_raw(feature_placeholders['b64'], tf.float32)
    inputs = {'image': single_image}
    return tf.estimator.export.ServingInputReceiver(inputs, feature_placeholders)

I gave the example using images, but I assume the same should apply to all types of data sent as bytes and base64-encoded.

There are a lot of Stack Overflow questions that reference the need to include "_bytes", with snippets of information, but I would find it useful if someone could explain in a bit more detail what's going on, as then I wouldn't be so hit-and-miss when formatting requests.

Stackoverflow questions on this topic

how make correct predictions of jpeg image in cloud-ml

How convert a jpeg image into json file in Google machine learning

Base64 images with Keras and Google Cloud ML

How to read a utf-8 encoded binary string in tensorflow?

Accepted answer

To help clarify some of the questions you have, allow me to start with the basic anatomy of a prediction request:

{"instances": [<instance>, <instance>, ...]}

where each instance is a JSON object (dict/map; I'll use the Python term "dict" hereafter) whose attributes/keys are the names of the inputs, with values containing the data for those inputs.

What the cloud service does (and gcloud ml-engine local predict uses the same underlying libraries as the service) is take the list of dicts (which can be thought of as rows of data) and convert it into a dict of lists (which can be thought of as columnar data containing a batch of instances), using the same keys as in the original data. For example,

{"instances": [{"x": 1, "y": "a"}, {"x": 3, "y": "b"}, {"x": 5, "y": "c"}]}

becomes (internally)

{"x": [1, 3, 5], "y": ["a", "b", "c"]}

The keys in this dict (and hence, in each instance in the original request) must correspond to the keys of the dict passed to the ServingInputReceiver. It should be apparent from this example that the service "batches" all of the data, meaning all of the instances are fed into the graph as a single batch. That's why the outer dimension of the shape of the inputs must be None -- it is the batch dimension, and it is not known before a request is made (since each request may have a different number of instances). When exporting a graph to accept the above requests, you might define a function like this:

def serving_input_fn():
    inputs = {'x': tf.placeholder(dtype=tf.int32, shape=[None]),
              'y': tf.placeholder(dtype=tf.string, shape=[None])}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

Since JSON does not (directly) support binary data and since TensorFlow has no way of distinguishing "strings" from "bytes", we need to treat binary data specially. First of all, we need the name of said inputs to end in "_bytes" to help differentiate a text string from a byte string. Using the example above, suppose y contained binary data instead of text. We would declare the following:

def serving_input_fn():
    inputs = {'x': tf.placeholder(dtype=tf.int32, shape=[None]),
              'y_bytes': tf.placeholder(dtype=tf.string, shape=[None])}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

Notice that the only thing that changed was using y_bytes instead of y as the name of the input.

Next, we need to actually base64-encode the data; anywhere a string would be acceptable, we can instead use an object of the form {"b64": "<base64-encoded string>"}. Adapting the running example, a request might look like:

{ "instances": [ {"x": 1, "y_bytes": {"b64": "YQ=="}}, {"x": 3, "y_bytes": {"b64": "Yg=="}}, {"x": 5, "y_bytes": {"b64": "Yw=="}} ] }

In this case the service does exactly what it did before, but adds one step: it automatically base64-decodes the string (and "replaces" the {"b64": ...} object with the bytes) before sending it to TensorFlow. So TensorFlow actually ends up with a dict exactly like before:

{"x": [1, 3, 5], "y_bytes": ["a", "b", "c"]}

(Note that the name of the input has not changed.)
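You can verify the example values with the standard library; the service's decoding step is equivalent to this per-value operation:

import base64

assert base64.b64decode("YQ==") == b"a"
assert base64.b64decode("Yg==") == b"b"
assert base64.b64decode("Yw==") == b"c"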

Of course, base64-encoding textual data is kind of pointless; you'd usually do this, e.g., for image data, which can't be sent over JSON any other way, but I hope the above example is sufficient to illustrate the point anyway.

There's another important point to be made: the service supports a type of shorthand. When there is exactly one input to your TensorFlow model, there's no need to incessantly repeat the name of that input in every single object in your list of instances. To illustrate, imagine exporting a model with only x:

def serving_input_fn():
    inputs = {'x': tf.placeholder(dtype=tf.int32, shape=[None])}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

The "long form" request would look like this:

{"instances": [{"x": 1}, {"x": 3}, {"x": 5}]}

Instead, you can send a request in shorthand, like so:

{"instances": [1, 3, 5]}

Note that this applies even to base64-encoded data. So, for instance, if instead of exporting only x we had exported only y_bytes, we could simplify the requests from:

{ "instances": [ {"y_bytes": {"b64": "YQ=="}}, {"y_bytes": {"b64": "Yg=="}}, {"y_bytes": {"b64": "Yw=="}} ] }

To:

{ "instances": [ {"b64": "YQ=="}, {"b64": "Yg=="}, {"b64": "Yw=="} ] }

In many cases this is only a small win, but it definitely aids readability, e.g., when the inputs contain CSV data.
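For intuition, the normalization the service applies to shorthand could be sketched like this (an illustration under my own assumptions, not the actual implementation):

def normalize(instances, input_name):
    # Wrap bare values (including {"b64": ...} objects) in a dict keyed
    # by the model's single input name; leave long-form entries as-is.
    return [inst if isinstance(inst, dict) and input_name in inst
            else {input_name: inst}
            for inst in instances]

print(normalize([1, 3, 5], 'x'))
# [{'x': 1}, {'x': 3}, {'x': 5}]
print(normalize([{"b64": "YQ=="}], 'y_bytes'))
# [{'y_bytes': {'b64': 'YQ=='}}]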

So, putting it all together to adapt to your specific scenario, here's what your serving function should look like:

def serving_input_fn():
    feature_placeholders = {
        'image_bytes': tf.placeholder(dtype=tf.string, shape=[None], name='source')}
    single_image = tf.decode_raw(feature_placeholders['image_bytes'], tf.float32)
    return tf.estimator.export.ServingInputReceiver(feature_placeholders, feature_placeholders)

Notable differences from your current code:

- The name of the input is not b64 but image_bytes (it could be anything that ends in _bytes).
- feature_placeholders is used as both arguments to ServingInputReceiver (see the sketch just below).
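Regarding that second point: ServingInputReceiver(features, receiver_tensors) distinguishes the tensors handed to the model (first argument) from the tensors the request feeds (second argument). If your model graph expects the decoded float tensor rather than the raw string, a variant might look like the sketch below (assuming, hypothetically, that the model takes a feature named 'image'):

def serving_input_fn():
    receiver_tensors = {
        'image_bytes': tf.placeholder(dtype=tf.string, shape=[None], name='source')}
    # By the time this placeholder is fed, the service has already replaced
    # each {"b64": ...} object with the decoded bytes.
    single_image = tf.decode_raw(receiver_tensors['image_bytes'], tf.float32)
    features = {'image': single_image}
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)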

And a sample request might look like this:

{ "instances": [ {"image_bytes": {"b64": "YQ=="}}, {"image_bytes": {"b64": "Yg=="}}, {"image_bytes": {"b64": "Yw=="}} ] }

Or, optionally, in shorthand:

{ "instances": [ {"b64": "YQ=="}, {"b64": "Yg=="}, {"b64": "Yw=="} ] }

One final note. gcloud ml-engine local predict and gcloud ml-engine predict construct the request based on the contents of the file passed in. It is very important to note that the content of the file is currently not a full, valid request; rather, each line of the --json-instances file becomes one entry in the list of instances. Specifically, in your case, the file will look like this (newlines are meaningful here):

{"image_bytes": {"b64": "YQ=="}} {"image_bytes": {"b64": "Yg=="}} {"image_bytes": {"b64": "Yw=="}}

or the equivalent shorthand. gcloud will take each line and construct the actual request shown above.
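Roughly, and only as a simplified sketch of the CLI behavior just described (not gcloud's actual implementation; the file name here is hypothetical), that construction amounts to:

import json

# Each non-empty line of the --json-instances file is parsed as one instance.
with open('instances.json') as f:
    instances = [json.loads(line) for line in f if line.strip()]
request_body = json.dumps({"instances": instances})
print(request_body)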
