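I get the following error when writing to the target:
Job aborted due to stage failure: Task 18 in stage 15526.0 failed 4 times, most recent failure: Lost task 18.3 in stage 15526.0 (TID 3281950) (10.179.0.125 executor 1190): org.apache.spark.SparkDateTimeException: [CAST_INVALID_INPUT] The value '04/12/2024 01:37:07.000 AM' of the type "STRING" cannot be cast to "TIMESTAMP" because it is malformed. Correct the value as per the syntax, or change its target type.
What I have tried: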
from pyspark.sql.functions import col, expr, to_date, to_timestamp, unix_timestamp
df_new = df.withColumn("Date", to_date(to_timestamp("LastUpdateDate", "MM/dd/yyyy hh:mm:ss.SSS a")))
dfnew = df.withColumn("Date", expr("try_cast(LastUpdateDateNew AS DATE)"))
#Convert string to timestamp
df_new = df.withColumn("LastUpdateTimestamp", unix_timestamp("LastUpdateDate", "MM/dd/yyyy hh:mm:ss:SSS a").cast("timestamp"))
#Convert timestamp to date in mm/dd/YYYY format
#df_new_bill = df_new.withColumn("date", to_date((col("LastUpdateTimestamp")), "MM/dd/yyyy"))
You can try this:
df_new = df.withColumn(
    "LastUpdateTimestamp",
    unix_timestamp(col("LastUpdateDate"), "MM/dd/yyyy hh:mm:ss.SSS a").cast("timestamp"),
)
This should convert the input column LastUpdateDate into the desired output column LastUpdateTimestamp.
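The only difference between this pattern and the earlier `unix_timestamp` attempt is the separator before the milliseconds (`ss.SSS` versus `ss:SSS`). As a rough sanity check that needs no cluster, the sample value from the error message can be tested against both separators with Python's standard `datetime.strptime`. This is only a sketch: the `%`-style directives below are Python's, and treating them as equivalents of the Spark pattern letters is an assumption made for illustration, as is the helper name `try_parse`.

```python
from datetime import datetime

# Sample value copied from the error message.
SAMPLE = "04/12/2024 01:37:07.000 AM"

# Assumed Python equivalents of the two Spark patterns:
#   MM/dd/yyyy hh:mm:ss.SSS a  ->  %m/%d/%Y %I:%M:%S.%f %p
#   MM/dd/yyyy hh:mm:ss:SSS a  ->  %m/%d/%Y %I:%M:%S:%f %p
DOT_FORMAT = "%m/%d/%Y %I:%M:%S.%f %p"
COLON_FORMAT = "%m/%d/%Y %I:%M:%S:%f %p"


def try_parse(value, fmt):
    """Hypothetical helper: return the parsed datetime, or None on a mismatch."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


# The dot separator matches the data; the colon separator does not.
parsed_with_dot = try_parse(SAMPLE, DOT_FORMAT)
parsed_with_colon = try_parse(SAMPLE, COLON_FORMAT)

print(parsed_with_dot)
print(parsed_with_colon)
```

If the dot version parses and the colon version returns None, the separator in the pattern is the likely culprit rather than the data itself.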