dask cudf cannot access map_partitions

Question (votes: 0, answers: 1)

I am trying to create a dask_cudf DataFrame, but I get an error.

import pandas as pd
import cudf
import dask_cudf


# Example pandas DataFrame with a datetime string column
pdf = pd.DataFrame({'datetime_str': ['2024-03-19 12:00:00', '2024-03-19 10:00:00', '2024-03-19 11:00:00']})

# Convert the pandas DataFrame to a cuDF DataFrame
cdf = cudf.from_pandas(pdf)

# Convert the cuDF DataFrame to a Dask cuDF DataFrame
ddf = dask_cudf.from_cudf(cdf, npartitions=2)  # this line raises the error

I get this error:

AttributeError: DataFrame object has no attribute map_partitions

I found the following:

cudf.core.dataframe.DataFrame # no map_partitions
dask_cudf.DataFrame.map_partitions 
dask_cudf.core.map_partitions
dask_cudf.core.DataFrame.map_partitions
dask.dataframe.map_partitions

How can I make "dask_cudf.from_cudf" access map_partitions? Thanks

python pandas dask dask-dataframe cudf
1 Answer
0 votes

According to the documentation, dask_cudf.from_cudf is a thin wrapper around dask.dataframe.from_pandas().

The first argument, data, is expected to be a pandas.DataFrame or pandas.Series.

So the DataFrame should be created like this:

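import pandas as pd
import dask_cudf

# Example pandas DataFrame with a datetime string column
pdf = pd.DataFrame({
    'datetime_str': [
        '2024-03-19 12:00:00',
        '2024-03-19 10:00:00',
        '2024-03-19 11:00:00']})

# Pass the pandas DataFrame itself (not a cuDF DataFrame) as `data`
ddf = dask_cudf.from_cudf(pdf, npartitions=2)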