BigTable: 2 writes to the same key, but 3 versions

Question (votes: 0, answers: 1)

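Sometimes, when I write multiple versions to the same row key, with multiple column families spread across multiple batched mutations (each version is batched together with multiple writes), I end up with more versions than I wrote: 2 writes produce 3 versions.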


Is this expected behavior because of data compaction? Will the extra versions be removed over time?

bigtable google-cloud-bigtable
1 Answer (votes: 1)

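The problem here is that you put the two columns in two separate entries in the batch, which means they are not applied atomically even though they share the same row.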

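Batch entries can succeed or fail individually, and the client then retries only the failed entries. For example, if one entry succeeds and the other times out but later silently succeeds, retrying the "failed" entry can produce the partial-write result you are seeing.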
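
The retry scenario above can be sketched with a toy in-memory model. This is only an illustration, not the Bigtable client API: `ToyTable`, `send`, and `lost_ack_index` are hypothetical names, and the model assumes each applied entry gets a fresh timestamp at apply time, so a retried entry shows up as an extra version.

```python
import itertools


class ToyTable:
    """In-memory stand-in for a table. This is NOT the Bigtable client API."""

    def __init__(self):
        self.cells = {}  # (row_key, column) -> list of (timestamp, value)
        self._clock = itertools.count(1)  # assumption: timestamp set at apply time

    def apply_entry(self, row_key, set_cells):
        """Apply one entry atomically: every cell in it gets the same timestamp."""
        ts = next(self._clock)
        for column, value in set_cells:
            self.cells.setdefault((row_key, column), []).append((ts, value))

    def version_counts(self, row_key, columns=("col_a", "col_b")):
        """Return the number of stored versions for each column of a row."""
        return tuple(len(self.cells.get((row_key, c), [])) for c in columns)


def send(table, row_key, entries, lost_ack_index=None):
    """Send a batch; the entry at lost_ack_index is applied, then retried."""
    for i, set_cells in enumerate(entries):
        table.apply_entry(row_key, set_cells)
        if i == lost_ack_index:
            # The write landed but the client saw a timeout, so it retries.
            table.apply_entry(row_key, set_cells)


# Case 1: one entry per column; the col_b entry of the second write is retried.
split = ToyTable()
send(split, "row1", [[("col_a", "v1")], [("col_b", "v1")]])
send(split, "row1", [[("col_a", "v2")], [("col_b", "v2")]], lost_ack_index=1)
split_counts = split.version_counts("row1")

# Case 2: both columns in a single entry, and that entry is retried.
single = ToyTable()
send(single, "row1", [[("col_a", "v1"), ("col_b", "v1")]])
send(single, "row1", [[("col_a", "v2"), ("col_b", "v2")]], lost_ack_index=0)
single_counts = single.version_counts("row1")

print(split_counts)   # (2, 3)
print(single_counts)  # (3, 3)
```

In this model, splitting the columns across entries lets them drift apart (2 versions in one column, 3 in the other). With a single entry a retry can still add a version, but both columns stay in step.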

So in Python you should put both columns in a single entry, something like the following (adapted from cloud.google.com/bigtable/docs/samples-python-hello):

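import datetime

# Assumes `table` and `column_family_id` are already set up as in the
# hello-world sample linked above.
print('Writing some greetings to the table.')
greetings = ['Hello World!', 'Hello Cloud Bigtable!', 'Hello Python!']
rows = []
column1 = 'greeting1'.encode()
column2 = 'greeting2'.encode()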
for i, value in enumerate(greetings):
    # Note: This example uses sequential numeric IDs for simplicity,
    # but this can result in poor performance in a production
    # application.  Since rows are stored in sorted order by key,
    # sequential keys can result in poor distribution of operations
    # across nodes.
    #
    # For more information about how to design a Bigtable schema for
    # the best performance, see the documentation:
    #
    #     https://cloud.google.com/bigtable/docs/schema-design
    row_key = 'greeting{}'.format(i).encode()
    row = table.row(row_key)

    # Multiple calls to 'set_cell()' are allowed on the same batch
    # entry. Each entry will be applied atomically, but a separate
    # 'row' in the same batch will be applied separately even if it
    # shares its row key with another entry.
    row.set_cell(column_family_id,
                 column1,
                 value,
                 timestamp=datetime.datetime.utcnow())
    row.set_cell(column_family_id,
                 column2,
                 value,
                 timestamp=datetime.datetime.utcnow())
    rows.append(row)
table.mutate_rows(rows)
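
Because entries succeed or fail individually, it can help to inspect the per-entry statuses that `table.mutate_rows(rows)` returns (one status per entry, where code 0 means OK). The sketch below is a minimal illustration under assumptions: `failed_entries` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the real status objects so the example runs without a Bigtable connection.

```python
from types import SimpleNamespace


def failed_entries(row_keys, statuses):
    """Pair each row key with its status and keep only the non-OK ones."""
    return [
        (key, status.code, status.message)
        for key, status in zip(row_keys, statuses)
        if status.code != 0  # 0 is the OK status code
    ]


# Stand-in statuses (assumption: real ones expose .code and .message).
statuses = [
    SimpleNamespace(code=0, message=""),
    SimpleNamespace(code=4, message="deadline exceeded"),  # 4 = DEADLINE_EXCEEDED
]
failures = failed_entries([b"greeting0", b"greeting1"], statuses)

for key, code, message in failures:
    print("Error writing row {!r}: code {} ({})".format(key, code, message))
```

With the real client, the call would look like `failed_entries([r.row_key for r in rows], table.mutate_rows(rows))`. A timed-out entry may still have been applied, so a reported failure does not prove the write is absent.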