How do I print the elements of a _VariantDataset?

Question (0 votes, 2 answers)

I am formatting data for an LSTM model. This is what I am doing:

import pandas as pd
import tensorflow as tf

aa=pd.DataFrame()
aa["a"]=range(30)
aa["b"]=range(30,60)
aa["c"]=range(60,90)

bb=pd.DataFrame()
bb["r"]=range(90,120)

all=tf.data.Dataset.zip((
    tf.data.Dataset.from_tensor_slices(aa.values),
    tf.data.Dataset.from_tensor_slices(bb.values)))

history_len=3
batch_size=6

for i,j in all.batch(history_len, drop_remainder=True).window(batch_size, shift=1).take(1):
    print(i)

This prints:

<_VariantDataset element_spec=TensorSpec(shape=(3, 1), dtype=tf.int64, name=None)>

Instead, I would like to see the data it contains. The documentation seems to do something like

print(i.flat_map(lambda x:x))

For me, this fails with:

TypeError                                 Traceback (most recent call last)
Cell In [28], line 7
      4 batch_size=6
      6 for i,j in all.batch(history_len, drop_remainder=True).window(batch_size, shift=1).take(1):
----> 7     print(i.flat_map(lambda x:x))
      8     #print(j)

File /usr/lib/python3.10/site-packages/tensorflow/python/data/ops/dataset_ops.py:2245, in DatasetV2.flat_map(self, map_func, name)
   2212 def flat_map(self, map_func, name=None):
   2213   """Maps `map_func` across this dataset and flattens the result.
   2214 
   2215   The type signature is:
   (...)
   2243     Dataset: A `Dataset`.
   2244   """
-> 2245   return FlatMapDataset(self, map_func, name=name)

File /usr/lib/python3.10/site-packages/tensorflow/python/data/ops/dataset_ops.py:5489, in FlatMapDataset.__init__(self, input_dataset, map_func, name)
   5484 self._map_func = structured_function.StructuredFunctionWrapper(
   5485     map_func, self._transformation_name(), dataset=input_dataset)
   5486 if not isinstance(self._map_func.output_structure, DatasetSpec):
   5487   raise TypeError(
   5488       "The `map_func` argument must return a `Dataset` object. Got "
-> 5489       f"{_get_type(self._map_func.output_structure)!r}.")
   5490 self._structure = self._map_func.output_structure._element_spec  # pylint: disable=protected-access
   5491 self._name = name

File /usr/lib/python3.10/site-packages/tensorflow/python/data/ops/dataset_ops.py:132, in _get_type(value)
    129 """Returns the type of `value` if it is a TypeSpec."""
    131 if isinstance(value, type_spec.TypeSpec):
--> 132   return value.value_type()
    133 else:
    134   return type(value)

TypeError: Tensor.__init__() missing 3 required positional arguments: 'op', 'value_index', and 'dtype'

I also found and tried the following example:

with tf.Session() as sess:
    print(sess.run(i))

That does not work either:

module 'tensorflow' has no attribute 'Session'

How can I print the data inside the _VariantDataset to check that it is formatted correctly?

python tensorflow keras tensorflow-datasets
2 Answers

0 votes

You can use Dataset.flat_map to flatten the dataset of windows into a single dataset:

ds = all.batch(history_len, drop_remainder=True).window(batch_size, shift=1)
ds = ds.flat_map(lambda x, y:  tf.data.Dataset.zip((x.batch(batch_size), y.batch(batch_size))))

for i,j in ds:
    print(i)

tf.Tensor(
[[[ 0 30 60]
  [ 1 31 61]
...
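
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same flat_map approach. It assumes TensorFlow 2.x with eager execution. The names features, targets, windows and dense are illustrative, the arrays are built with NumPy instead of pandas purely to keep it short, and drop_remainder=True on window() is an addition not present in the question (it discards the shorter trailing windows).

```python
import numpy as np
import tensorflow as tf

# Same values as the question's DataFrames, built with NumPy (an assumption for brevity).
features = np.arange(90).reshape(3, 30).T     # columns a, b, c -> shape (30, 3)
targets = np.arange(90, 120).reshape(30, 1)   # column r -> shape (30, 1)

history_len = 3
batch_size = 6

dataset = tf.data.Dataset.zip((
    tf.data.Dataset.from_tensor_slices(features),
    tf.data.Dataset.from_tensor_slices(targets)))

# drop_remainder=True on window() is added here so that every window is full.
windows = dataset.batch(history_len, drop_remainder=True).window(
    batch_size, shift=1, drop_remainder=True)

# Each window is a pair of nested datasets; batching each one with the
# window size turns it into a single dense tensor.
dense = windows.flat_map(
    lambda x, y: tf.data.Dataset.zip((x.batch(batch_size), y.batch(batch_size))))

first_x, first_y = next(iter(dense))
print(first_x.shape, first_y.shape)   # (6, 3, 3) (6, 3, 1)
print(first_x[0].numpy())             # the first history block of the first window
```

After flat_map the elements are ordinary tensors, so print shows the values directly instead of a _VariantDataset.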

Or, more simply:

for i,j in all.batch(history_len, drop_remainder=True).window(batch_size, shift=1).take(1):
  print([value for value in i])
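
This works because in TensorFlow 2.x eager execution is on by default, so a Dataset (including each nested window) is a plain Python iterable and no tf.Session is needed. A small sketch of the same idea using as_numpy_iterator(), which yields NumPy arrays rather than tf.Tensor objects; the names features, batched and rows are illustrative, and only the feature columns are used to keep it short:

```python
import numpy as np
import tensorflow as tf

# Feature columns a, b, c from the question, built with NumPy (an assumption for brevity).
features = np.arange(90).reshape(3, 30).T
batched = tf.data.Dataset.from_tensor_slices(features).batch(3, drop_remainder=True)

for window in batched.window(6, shift=1).take(1):
    # `window` is itself a Dataset; materialize its elements as NumPy arrays.
    rows = list(window.as_numpy_iterator())

print(len(rows))   # number of history blocks in the first window
print(rows[0])     # first block, shape (3, 3)
```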

0 votes

import pandas as pd
import tensorflow as tf

aa=pd.DataFrame()
aa["a"]=range(30)
aa["b"]=range(30,60)
aa["c"]=range(60,90)

bb=pd.DataFrame()
bb["r"]=range(90,120)

all=tf.data.Dataset.zip((
    tf.data.Dataset.from_tensor_slices(aa.values),
    tf.data.Dataset.from_tensor_slices(bb.values)))

for element in all:
    print(element[0],element[1])

history_len=3
batch_size=6

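# Each element of the windowed dataset is a pair of nested datasets
all = all.batch(history_len, drop_remainder=True).window(batch_size, shift=1)

for element in all:
    # Materialize both nested datasets as lists of NumPy arrays before printing
    print(list(element[0].as_numpy_iterator()),list(element[1].as_numpy_iterator()))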
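
One thing to watch when inspecting the output of a loop like this: without drop_remainder=True on window(), the last few windows contain fewer than batch_size elements. A rough sketch on a toy range of 10 values (the names ds, sizes_default and sizes_dropped are illustrative, not from the question) that makes the difference visible:

```python
import numpy as np
import tensorflow as tf

# Toy input: 10 elements, standing in for the 10 batches in the question.
ds = tf.data.Dataset.from_tensor_slices(np.arange(10))

# Count how many elements each window holds, with and without drop_remainder.
sizes_default = [len(list(w.as_numpy_iterator()))
                 for w in ds.window(6, shift=1)]
sizes_dropped = [len(list(w.as_numpy_iterator()))
                 for w in ds.window(6, shift=1, drop_remainder=True)]

print(sizes_default)   # trailing windows shrink
print(sizes_dropped)   # only full windows remain
```

If the model expects a fixed number of time steps per sample, passing drop_remainder=True to window() is probably what you want.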