Add a column with random numbers within a range in PySpark


I want to generate a column of random numbers, like this:

df=df.withColumn("random_col",random.randint(100000, 1000000))

The above gives me an error:

AssertionError: col should be Column

python pyspark
1 Answer

First, I would make sure you have imported the right things.

Try this import:

from pyspark.sql.functions import rand

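withColumn needs a Column expression, and random.randint returns a plain Python int, which is what triggers the AssertionError. Build the value from rand() instead and try running the following code: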

df1 = df.withColumn("random_col", (rand() * (1000000 - 100000 + 1) + 100000).cast("int"))
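For reference, here is a minimal sketch of the same idea wrapped in a helper, assuming you want integers with both ends of the range included, the way random.randint does. The names scale_uniform and with_random_col are made up for illustration and are not part of PySpark. rand() produces doubles in [0, 1), so the draw is scaled by the width of the range and floored.

```python
import math

LOW, HIGH = 100_000, 1_000_000  # inclusive bounds taken from the question


def scale_uniform(u, low=LOW, high=HIGH):
    """Map a uniform draw u in [0, 1) to an integer in [low, high].

    Plain-Python version of the arithmetic, handy for checking the bounds.
    """
    return low + math.floor(u * (high - low + 1))


def with_random_col(df, name="random_col", low=LOW, high=HIGH, seed=None):
    """Hypothetical helper: add an integer column drawn uniformly from [low, high]."""
    # Imported inside the function so the arithmetic above can be used without Spark.
    from pyspark.sql import functions as F

    u = F.rand(seed)  # Column of doubles in [0, 1), one independent draw per row
    scaled = F.floor(u * (high - low + 1)) + low  # same formula as scale_uniform
    return df.withColumn(name, scaled.cast("int"))
```

With an existing DataFrame df, calling with_random_col(df, seed=42) should give a new DataFrame with the extra column. Passing a seed is optional and only matters if you want the draws to be repeatable.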

You could also check out this resource. It looks like it may be helpful for what you are doing.

Hope this helps!
