tf.parse_example()

tf.parse_example(serialized, features, name=None, example_names=None)

Parses Example protos into a dict of tensors.

Parses a number of serialized Example protos given in serialized.

example_names may contain descriptive names for the corresponding serialized protos. These may be useful for debugging purposes, but they have no effect on the output. If not None, example_names must be the same length as serialized.

This op parses serialized examples into a dictionary mapping keys to Tensor and SparseTensor objects. features is a dict from keys to VarLenFeature and FixedLenFeature objects. Each VarLenFeature is mapped to a SparseTensor, and each FixedLenFeature is mapped to a Tensor.

Each VarLenFeature maps to a SparseTensor of the specified type representing a ragged matrix. Its indices are [batch, index] where batch is the batch entry the value is from in serialized, and index is the value's index in the list of values associated with that feature and example.

Each FixedLenFeature df maps to a Tensor of the specified type (or tf.float32 if not specified) and shape (serialized.size(),) + df.shape.

FixedLenFeature entries with a default_value are optional. With no default value, we will fail if that Feature is missing from any example in serialized.

Examples:

For example, if one expects a tf.float32 sparse feature ft and three serialized Examples are provided:

serialized = [
  features
    { feature { key: "ft" value { float_list { value: [1.0, 2.0] } } } },
  features
    { feature []},
  features
    { feature { key: "ft" value { float_list { value: [3.0] } } }
]

then the output will look like:

{"ft": SparseTensor(indices=[[0, 0], [0, 1], [2, 0]],
                    values=[1.0, 2.0, 3.0],
                    shape=(3, 2)) }

Given two Example input protos in serialized:

[
  features {
    feature { key: "kw" value { bytes_list { value: [ "knit", "big" ] } } }
    feature { key: "gps" value { float_list { value: [] } } }
  },
  features {
    feature { key: "kw" value { bytes_list { value: [ "emmy" ] } } }
    feature { key: "dank" value { int64_list { value: [ 42 ] } } }
    feature { key: "gps" value { } }
  }
]

And arguments

example_names: ["input0", "input1"],
features: {
    "kw": VarLenFeature(tf.string),
    "dank": VarLenFeature(tf.int64),
    "gps": VarLenFeature(tf.float32),
}

Then the output is a dictionary:

{
  "kw": SparseTensor(
      indices=[[0, 0], [0, 1], [1, 0]],
      values=["knit", "big", "emmy"]
      shape=[2, 2]),
  "dank": SparseTensor(
      indices=[[1, 0]],
      values=[42],
      shape=[2, 1]),
  "gps": SparseTensor(
      indices=[],
      values=[],
      shape=[2, 0]),
}

For dense results in two serialized Examples:

[
  features {
    feature { key: "age" value { int64_list { value: [ 0 ] } } }
    feature { key: "gender" value { bytes_list { value: [ "f" ] } } }
   },
   features {
    feature { key: "age" value { int64_list { value: [] } } }
    feature { key: "gender" value { bytes_list { value: [ "f" ] } } }
  }
]

We can use arguments:

example_names: ["input0", "input1"],
features: {
    "age": FixedLenFeature([], dtype=tf.int64, default_value=-1),
    "gender": FixedLenFeature([], dtype=tf.string),
}

And the expected output is:

{
  "age": [[0], [-1]],
  "gender": [["f"], ["f"]],
}

Args:

serialized: A vector (1-D Tensor) of strings, a batch of binary serialized Example protos.
features: A dict mapping feature keys to FixedLenFeature or VarLenFeature values.
name: A name for this operation (optional).
example_names: A vector (1-D Tensor) of strings (optional), the names of the serialized protos in the batch.

Returns:

A dict mapping feature keys to Tensor and SparseTensor values.

Raises:

ValueError: if any feature is invalid.

Links:

https://www.tensorflow.org/versions/master/api_docs/python/io_ops.html#parse_example

doc_TensorFlow

2025-01-10 15:47:30

Comments