PySpark: "Column is not iterable" error when using withColumn

Question (votes: 0, answers: 1)

Why do I get a "Column is not iterable" error when running the following PySpark code?

cost_allocation_df = cost_allocation_df.withColumn('resource_tags_user_engagement',          
 f.when(
       (f.col('line_item_usage_account_id') == '123456789101', '1098765432101') &
       (f.col('resource_tags_user_engagement') == '' ) |
       (f.col('resource_tags_user_engagement').isNull()) |
       (f.col('resource_tags_user_engagement').rlike('^[a-zA-Z]')), '10546656565'
       ).otherwise(f.col('resource_tags_user_engagement')))
pyspark
1 Answer

0 votes

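You are comparing a column directly against values, and the comma in == '123456789101', '1098765432101' turns that condition into a Python tuple instead of a comparison, which will not work. Compare the column against each value separately, wrapping each value in lit().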
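The tuple problem in the question's first condition can be reproduced without Spark. The following is a minimal plain-Python sketch, not the asker's code: account_id is a hypothetical stand-in for a single column value, and `or` plays the role that `|` plays between Column expressions in PySpark.

```python
# Hypothetical stand-in for one value of line_item_usage_account_id.
account_id = "123456789101"

# The comma builds a 2-tuple: (result of ==, second string).
# It does not test the value against both strings.
broken = (account_id == "123456789101", "1098765432101")

# Two explicit comparisons joined together express the intended test.
fixed = (account_id == "123456789101") or (account_id == "1098765432101")

print(type(broken).__name__, broken)
print(fixed)
```

In PySpark the first element of that tuple is a Column, so combining the tuple with `&` hands a tuple to Spark where a single Column expression is expected.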

Try changing your code to:

cost_allocation_df = cost_allocation_df.withColumn('resource_tags_user_engagement',          
 f.when(
       ((f.col('line_item_usage_account_id') == f.lit('123456789101')) | 
       (f.col('line_item_usage_account_id') == f.lit('1098765432101'))) & 
       (f.col('resource_tags_user_engagement') == f.lit('') ) |
       (f.col('resource_tags_user_engagement').isNull()) |
       (f.col('resource_tags_user_engagement').rlike('^[a-zA-Z]')), '10546656565'
       ).otherwise(f.col('resource_tags_user_engagement')))