How do I remove rows with null values from a Hugging Face dataset?

Question (votes: 0, answers: 1)

After loading a Hugging Face dataset:

from datasets import DownloadConfig, load_dataset

download_config = DownloadConfig()
dataset = load_dataset(hf_dataset_name, download_config=download_config)
dataset_split = dataset["train"]

Suppose a row has None or "" in the "answer" column. How do I remove that row?

python huggingface-datasets
1 Answer
0 votes

Hugging Face datasets wrap PyArrow tables, which can be filtered with a PyArrow expression:

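import pyarrow.dataset as ds

# Keep rows whose "answer" is neither null nor an empty string
mask = ds.field("answer").is_valid() & (ds.field("answer") != "")
# dataset_split.data wraps a pyarrow.Table, exposed as .table
filtered = ds.dataset(dataset_split.data.table).to_table(filter=mask)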