SQL: Table to key-value table

Question  Votes: 0  Answers: 2

I have a table like this in Hive:

A    | B    | C    | value
key1 | NULL | NULL | v1
NULL | key2 | NULL | v2
NULL | NULL | key3 | v3
NULL | key4 | NULL | v4

What is the simplest way to convert it into a key-value table like this:

key_type | key_value | value
A | key1 | v1
B | key2 | v2
C | key3 | v3
B | key4 | v4

using Hive SQL or a Spark DataFrame transformation (PySpark)? Thanks for your help.

sql pyspark hql hiveql
2 Answers
0
votes

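You can use union all: select each key column separately, label it with its column name, and keep only the rows where the key is not null.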

select t.key_type, t.key_value, t.value
from (
    select 'A' as key_type, a as key_value, value from t
    union all
    select 'B' as key_type, b as key_value, value from t
    union all
    select 'C' as key_type, c as key_value, value from t
) t
-- filter on key_value: key_type is a literal and is never null
where t.key_value is not null
order by t.value;
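
To check the union all approach without a Hive cluster, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database. SQLite is only a stand-in for Hive here (an assumption for illustration), and the table name `t` is taken from the answer above. The filter is on key_value, so rows where that column is NULL are dropped.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Hive (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a TEXT, b TEXT, c TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        ("key1", None, None, "v1"),
        (None, "key2", None, "v2"),
        (None, None, "key3", "v3"),
        (None, "key4", None, "v4"),
    ],
)

# One SELECT per key column, labelled with the column name,
# then keep only the rows where that column holds a key.
query = """
SELECT key_type, key_value, value
FROM (
    SELECT 'A' AS key_type, a AS key_value, value FROM t
    UNION ALL
    SELECT 'B' AS key_type, b AS key_value, value FROM t
    UNION ALL
    SELECT 'C' AS key_type, c AS key_value, value FROM t
) AS u
WHERE key_value IS NOT NULL
ORDER BY value
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each source row contributes one output row per non-null key column, so the four sample rows produce four key-value rows.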

0
votes

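With PySpark, you can select the key columns, map each one to its column name when its value is not null, and then apply greatest, which skips nulls: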

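from pyspark.sql import functions as F

cols = [c for c in df.columns if c != 'value']  # ['A', 'B', 'C']

# greatest() skips nulls, so each call returns the one non-null entry per row
df.select(
    F.greatest(*[F.when(F.col(c).isNotNull(), F.lit(c)) for c in cols]).alias("key_type"),
    F.greatest(*[F.col(c) for c in cols]).alias("key_Value"),
    "value",
).show()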

+--------+---------+-----+
|key_type|key_Value|value|
+--------+---------+-----+
|       A|     key1|   v1|
|       B|     key2|   v2|
|       C|     key3|   v3|
|       B|     key4|   v4|
+--------+---------+-----+
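
The greatest approach relies on each row having exactly one non-null key column, as in the sample data. The following is a plain-Python model of that row-wise logic, not Spark code; the helper names `greatest_skip_nulls` and `to_key_value` are hypothetical and exist only for illustration.

```python
def greatest_skip_nulls(*values):
    """Model of a null-skipping greatest: max of the non-None values, or None."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def to_key_value(row, key_cols=("A", "B", "C")):
    # Column name where the column is non-null, None otherwise.
    key_type = greatest_skip_nulls(
        *[c if row[c] is not None else None for c in key_cols]
    )
    # The non-null key itself.
    key_value = greatest_skip_nulls(*[row[c] for c in key_cols])
    return key_type, key_value, row["value"]


rows = [
    {"A": "key1", "B": None, "C": None, "value": "v1"},
    {"A": None, "B": "key2", "C": None, "value": "v2"},
    {"A": None, "B": None, "C": "key3", "value": "v3"},
    {"A": None, "B": "key4", "C": None, "value": "v4"},
]
result = [to_key_value(r) for r in rows]
for r in result:
    print(r)
```

If a row had two non-null key columns, the two greatest calls would pick the largest column name and the largest key independently, and those may not belong to the same column. The union all approach would instead emit one row per non-null key.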