Input data in the JSON file:
[{
"Orders": {
"orderid": "order_id",
"customerId": "customers.customerId"
},
"Products": {
"productid": "product_id",
"productName": "products.productName"
}
}]
When I read it with spark.read.json("filepath"), this is the output I get:
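+--------------------------------+----------------------------------+
|Orders                          |Products                          |
+--------------------------------+----------------------------------+
|{customers.customerId, order_id}|{products.productName, product_id}|
+--------------------------------+----------------------------------+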
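One caveat on the read step: by default Spark's JSON reader expects one JSON document per line, so a file laid out across several lines like the one shown may need the multiLine option. A minimal sketch, assuming a spark-shell session where `spark` is already defined; `readMultiLineJson` is a hypothetical helper name, not part of Spark:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper: read a JSON file whose records span several lines.
// With multiLine enabled, each element of a top-level array becomes one row.
def readMultiLineJson(spark: SparkSession, path: String): DataFrame =
  spark.read
    .option("multiLine", true)
    .json(path)

// Usage (the path is the placeholder from the question):
// val df = readMultiLineJson(spark, "filepath")
// df.printSchema()
```

If the file is actually stored with the whole array on a single line, the plain spark.read.json call should work without this option.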
I want to pass Orders in as a variable and generate a concatenated list of its values, for example order_id,customers.customerId
Check the code below.
scala> df.show(false)
+--------------------------------+----------------------------------+
|Orders                          |Products                          |
+--------------------------------+----------------------------------+
|{customers.customerId, order_id}|{products.productName, product_id}|
+--------------------------------+----------------------------------+
scala> val inCol = "orders"
scala> df.selectExpr(s"concat_ws(',', array(${inCol}.*)) as ${inCol}").show(false)
+-----------------------------+
|orders                       |
+-----------------------------+
|customers.customerId,order_id|
+-----------------------------+
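Note that the struct's fields come back in alphabetical order (customerId before orderid), as the show output above indicates, so expanding with `.*` cannot put order_id first. If the order from the question matters, one option is to name the fields explicitly. The following is a rough sketch, assuming the `df` from the session above; `joinStructValues` is a hypothetical helper name and the field list is something the caller supplies:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, concat_ws}

// Hypothetical helper: join the chosen fields of a struct column into one
// string, in the order given by `fields`, separated by `sep`.
def joinStructValues(
    df: DataFrame,
    structCol: String,
    fields: Seq[String],
    sep: String = ","
): DataFrame = {
  // Address each nested field as "<struct>.<field>"
  val parts = fields.map(f => col(s"$structCol.$f"))
  df.select(concat_ws(sep, parts: _*).as(structCol))
}

// Usage with the sample data; for the row shown above this should give
// order_id,customers.customerId
// joinStructValues(df, "Orders", Seq("orderid", "customerId")).show(false)
```

The trade-off is that the field names have to be known up front, whereas the `.*` version works for any struct but always follows the schema's field order.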