This is my schema:
root
|-- DataPartition: string (nullable = true)
|-- TimeStamp: string (nullable = true)
|-- PeriodId: long (nullable = true)
|-- FinancialAsReportedLineItemName: struct (nullable = true)
|    |-- _VALUE: string (nullable = true)
|    |-- _languageId: long (nullable = true)
|-- FinancialLineItemSource: long (nullable = true)
|-- FinancialStatementLineItemSequence: long (nullable = true)
|-- FinancialStatementLineItemValue: double (nullable = true)
|-- FiscalYear: long (nullable = true)
|-- IsAnnual: boolean (nullable = true)
|-- IsAsReportedCurrencySetManually: boolean (nullable = true)
|-- IsCombinedItem: boolean (nullable = true)
|-- IsDerived: boolean (nullable = true)
|-- IsExcludedFromStandardization: boolean (nullable = true)
|-- IsFinal: boolean (nullable = true)
|-- IsTotal: boolean (nullable = true)
|-- ParentLineItemId: long (nullable = true)
|-- PeriodPermId: struct (nullable = true)
|    |-- _VALUE: long (nullable = true)
|    |-- _objectTypeId: long (nullable = true)
|-- ReportedCurrencyId: long (nullable = true)
Starting from the schema above, I am trying to do this:
val temp = tempNew1
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
.withColumn("PeriodPermId", $"PeriodPermId._VALUE")
.withColumn("PeriodPermId_objectTypeId", $"PeriodPermId._objectTypeId").drop($"AsReportedItem").drop($"AsReportedItem")
I don't know what I am missing here. I get the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from FinancialAsReportedLineItemName#2262: need struct type but got string;
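The problem is that you are trying to access FinancialAsReportedLineItemName._languageId after the FinancialAsReportedLineItemName column has already been replaced by FinancialAsReportedLineItemName._VALUE. The first withColumn overwrites the struct column with a plain string column, so by the time the second withColumn runs there is no struct left to extract _languageId from.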
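The overwrite can be seen in isolation with a small sketch. Everything here other than the column and field names is an assumption made for illustration: the local SparkSession, the object name ReplacedColumnDemo, and the sample values "Revenue" and 505074 are hypothetical and do not come from the question's data.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct
import org.apache.spark.sql.types.{StringType, StructType}

object ReplacedColumnDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this demonstration.
    val spark = SparkSession.builder().master("local[1]").appName("replaced-column-demo").getOrCreate()
    import spark.implicits._

    // Hypothetical one-row DataFrame with the same struct shape as in the question.
    val df = Seq(("Revenue", 505074L)).toDF("v", "lang")
      .select(struct($"v".as("_VALUE"), $"lang".as("_languageId")).as("FinancialAsReportedLineItemName"))

    // Before: the column is a struct.
    assert(df.schema("FinancialAsReportedLineItemName").dataType.isInstanceOf[StructType])

    // withColumn with an existing name replaces that column.
    val replaced = df.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")

    // After: the column is a plain string, so ._languageId can no longer be extracted from it.
    replaced.printSchema()
    assert(replaced.schema("FinancialAsReportedLineItemName").dataType == StringType)

    spark.stop()
  }
}
```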
You should change the following two lines
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
to
.withColumn("FinancialAsReportedLineItemName_value", $"FinancialAsReportedLineItemName._VALUE")
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
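If the FinancialAsReportedLineItemName_value column really has to be named FinancialAsReportedLineItemName, then swap the two withColumn calls so that _languageId is extracted while the column is still a struct. The PeriodPermId and PeriodPermId_objectTypeId lines in the question have the same ordering problem and need the same fix.
The swapped calls look like this: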
.withColumn("FinancialAsReportedLineItemName_languageId", $"FinancialAsReportedLineItemName._languageId")
.withColumn("FinancialAsReportedLineItemName", $"FinancialAsReportedLineItemName._VALUE")
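As an alternative to reordering, both structs can be flattened in a single select, where every expression is resolved against the input DataFrame and so the order no longer matters. This is only a sketch under assumptions: the local SparkSession, the object name FlattenWithSelectDemo, and the sample row are hypothetical. Note that select keeps only the columns it lists, so on the real DataFrame the remaining columns (DataPartition, TimeStamp, and so on) would have to be listed as well.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct
import org.apache.spark.sql.types.StructType

object FlattenWithSelectDemo {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session, only for this demonstration.
    val spark = SparkSession.builder().master("local[1]").appName("flatten-select-demo").getOrCreate()
    import spark.implicits._

    // Hypothetical one-row DataFrame containing only the two struct columns from the question.
    val df = Seq(("Revenue", 505074L, 100L, 2L)).toDF("name", "lang", "perm", "objType")
      .select(
        struct($"name".as("_VALUE"), $"lang".as("_languageId")).as("FinancialAsReportedLineItemName"),
        struct($"perm".as("_VALUE"), $"objType".as("_objectTypeId")).as("PeriodPermId"))

    // All four expressions read from df, so reusing the struct's name as an alias is safe.
    val flat = df.select(
      $"FinancialAsReportedLineItemName._VALUE".as("FinancialAsReportedLineItemName"),
      $"FinancialAsReportedLineItemName._languageId".as("FinancialAsReportedLineItemName_languageId"),
      $"PeriodPermId._VALUE".as("PeriodPermId"),
      $"PeriodPermId._objectTypeId".as("PeriodPermId_objectTypeId"))

    flat.printSchema()
    assert(flat.columns.toSeq == Seq(
      "FinancialAsReportedLineItemName",
      "FinancialAsReportedLineItemName_languageId",
      "PeriodPermId",
      "PeriodPermId_objectTypeId"))
    // No struct columns remain after flattening.
    assert(flat.schema.forall(f => !f.dataType.isInstanceOf[StructType]))

    spark.stop()
  }
}
```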