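I'm a beginner with PySpark. I'd like to know how to use ORDER BY and GROUP BY on the same column. My query is below. What I expect is for user_state to be displayed in alphabetical order. Once that works, I'll use pivot to separate the total number of males and females for each state. Thanks in advance for your help.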
If I understand correctly, you want to list the totals by state and gender:
state gender total
Utah male 241
Utah female 845
...
Maybe this query will help:
SELECT
    user_state,
    user_gender, -- I don't know if it's already a string or 1 = male, 0 = female
    COUNT(user_gender) AS total
FROM user_vw
WHERE user_phone_numbers IS NOT NULL
GROUP BY user_gender, user_state -- one row per gender and state combination
ORDER BY user_state
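If you want to check the logic before running it on Spark, here is a minimal sketch that runs the same kind of query against an in-memory SQLite table using Python's built-in sqlite3 module. The sample rows are made up, I'm assuming user_gender holds the strings 'male' and 'female', and the pivot is done with conditional aggregation in plain SQL rather than Spark's own pivot, so treat it as an illustration only.

```python
import sqlite3

# Hypothetical sample rows: (user_state, user_gender, user_phone_numbers)
rows = [
    ("Utah", "male", "555-0101"),
    ("Utah", "female", "555-0102"),
    ("Utah", "female", "555-0103"),
    ("Ohio", "male", "555-0104"),
    ("Ohio", "female", None),  # dropped by the WHERE clause
    ("Texas", "male", "555-0105"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_vw (user_state TEXT, user_gender TEXT, user_phone_numbers TEXT)"
)
conn.executemany("INSERT INTO user_vw VALUES (?, ?, ?)", rows)

# Group by state and gender, then sort on the same state column.
grouped = conn.execute("""
    SELECT user_state, user_gender, COUNT(user_gender) AS total
    FROM user_vw
    WHERE user_phone_numbers IS NOT NULL
    GROUP BY user_state, user_gender
    ORDER BY user_state, user_gender
""").fetchall()

# Pivot by hand: one row per state, one count column per gender.
pivoted = conn.execute("""
    SELECT user_state,
           SUM(CASE WHEN user_gender = 'male' THEN 1 ELSE 0 END) AS male,
           SUM(CASE WHEN user_gender = 'female' THEN 1 ELSE 0 END) AS female
    FROM user_vw
    WHERE user_phone_numbers IS NOT NULL
    GROUP BY user_state
    ORDER BY user_state
""").fetchall()

print(grouped)
print(pivoted)
```

In PySpark itself, if user_vw is registered as a view, the grouped query can be passed to spark.sql(...). For the pivot step, the DataFrame API has groupBy("user_state").pivot("user_gender").count().orderBy("user_state"), which should give one column per gender value.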