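I have the following dataset and want to count, at several grouping levels, how often values occur under each name.
Have (county is stored as a string):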
name state county
apple MD 1
apple DC 1
pear VA 1
pear VA 2
pear CA 5
peach CO 3
peach CO 3
peach CO 2
peach CO 2
Want:
name state county freq_name freq_state freq_county
apple MD 1 2 1 2
apple DC 1 2 1 2
pear VA 1 3 2 3
pear VA 2 3 2 3
pear CA 5 3 1 3
peach CO 3 4 4 2
peach CO 2 4 4 2
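I believe that in SQL, OVER (PARTITION BY ...) should allow counting at different levels, something like:
select name, state, county,
count(name) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name, state, county;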
For some reason, this code does not give me the correct count for freq_name.
Any help would be much appreciated!
For freq_name, use count(*) instead of count(name):
select name, state, county,
count(*) over (partition by name) as freq_name,
count(name) over (partition by state) as freq_state,
count(name) as freq_county
from have
group by name, state, county;
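One thing worth checking: as far as I can tell, window functions are evaluated after GROUP BY, so a COUNT over a grouped result counts groups rather than original rows. Below is a minimal sketch that runs the idea against the sample data. It assumes SQLite (3.25 or later) driven from Python's sqlite3 module, which is only a convenient stand-in because the question does not name the database. The derived table alias g and the column n are names made up for illustration, and freq_county is assumed to mean the number of distinct (state, county) combinations per name, since that reading matches the Want table.

```python
import sqlite3

# Sample rows copied from the question; county is kept as text.
rows = [
    ("apple", "MD", "1"), ("apple", "DC", "1"),
    ("pear", "VA", "1"), ("pear", "VA", "2"), ("pear", "CA", "5"),
    ("peach", "CO", "3"), ("peach", "CO", "3"),
    ("peach", "CO", "2"), ("peach", "CO", "2"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE have (name TEXT, state TEXT, county TEXT)")
conn.executemany("INSERT INTO have VALUES (?, ?, ?)", rows)

# Inner query: one row per (name, state, county) with its raw row count n.
# Outer query: window functions run over those grouped rows, so
#   SUM(n)   gives the number of original rows in the partition,
#   COUNT(*) gives the number of groups in the partition.
query = """
SELECT name, state, county,
       SUM(n)   OVER (PARTITION BY name)  AS freq_name,
       SUM(n)   OVER (PARTITION BY state) AS freq_state,
       COUNT(*) OVER (PARTITION BY name)  AS freq_county  -- assumed meaning
FROM (
    SELECT name, state, county, COUNT(*) AS n
    FROM have
    GROUP BY name, state, county
) AS g
ORDER BY name, state, county
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

On the sample data this gives the same values as the Want table, though in a different row order. It also shows why a plain count over the grouped result falls short for peach: the four peach rows collapse into two groups, so counting groups yields 2 where freq_name should be 4.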