Group by and count with multiple conditions

Question

Hi, it would be great to get some help with Python pandas. TIA. I have a DataFrame with 1M rows and the following columns:

PID       lurn_fls   locality   Comparision
ACT933    2        Kambah     mbn:match both non-empty
ACT934    3F       Charwood   xne:mismatch neither empty
ACT935    3R       Glenden    mbe:match both empty
.         .         .         .
.         .         .         .
ACT155    4        Glebe      xhe:mismatch h_empty

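I need to group by "lurn_fls" and, within each group, count how often each distinct value of the "Comparision" column occurs, so that the result looks like the table below. For example: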

lurn_fls  mbn:match both non-empty  xhe:mismatch h_empty  xne:mismatch neither empty  mbe:match both empty  Total
1         600                       12                    10                          15                    XXX
2         700                       10                    8                           14                    XXX
3F        800                       8                     6                           10                    XXX
3R        900                       6                     10                          12                    XXX
4         500                       4                     20                          10                    XXX
5         400                       2                     10                          14                    XXX

Answer 1

IIUC, you can use pd.crosstab:

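import pandas as pd

# Count how often each Comparision value occurs within each lurn_fls group
out = pd.crosstab(df["lurn_fls"], df["Comparision"])
# Summing across the count columns gives the per-group total
out["Total"] = out.sum(axis=1)

print(out)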

Prints:

Comparision  mbe:match both empty  mbn:match both non-empty  xhe:mismatch h_empty  xne:mismatch neither empty  Total
lurn_fls                                                                                                            
2                               0                         1                     0                           0      1
3F                              0                         0                     0                           1      1
3R                              1                         0                     0                           0      1
4                               0                         0                     1                           0      1
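
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is an assumption: it is rebuilt from the four rows visible in the question, not the real 1M-row data. It shows two variants that should give the same counts: pd.crosstab with margins=True (which adds a "Total" column and also a "Total" row, dropped here), and an equivalent groupby / size / unstack chain.

```python
import pandas as pd

# Hypothetical sample data rebuilt from the four rows shown in the question.
df = pd.DataFrame(
    {
        "PID": ["ACT933", "ACT934", "ACT935", "ACT155"],
        "lurn_fls": ["2", "3F", "3R", "4"],
        "locality": ["Kambah", "Charwood", "Glenden", "Glebe"],
        "Comparision": [
            "mbn:match both non-empty",
            "xne:mismatch neither empty",
            "mbe:match both empty",
            "xhe:mismatch h_empty",
        ],
    }
)

# Variant 1: let crosstab compute the totals.
# margins=True appends a "Total" column AND a "Total" row.
with_margins = pd.crosstab(
    df["lurn_fls"], df["Comparision"], margins=True, margins_name="Total"
)
# Drop the bottom "Total" row if only per-group totals are wanted.
per_group = with_margins.drop(index="Total")

# Variant 2: the same counts via groupby + size + unstack.
counts = (
    df.groupby(["lurn_fls", "Comparision"])
    .size()                  # one count per (lurn_fls, Comparision) pair
    .unstack(fill_value=0)   # pivot Comparision values into columns
)
counts["Total"] = counts.sum(axis=1)

print(per_group)
print(counts)
```

Both variants order the columns alphabetically by Comparision value; if the column order from the question matters, selecting the columns in the desired order afterwards (for example per_group[[...]]) should be enough.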
