Reshape a DataFrame based on criteria and count values


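I have the dataset below. I am trying to determine each customer's type by assigning a label. Excel crashed when I tried this because there is too much data, so I am trying to do it in Python instead.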

item  customer qty
------------------
ProdA CustA    1 
ProdA CustB    1
ProdA CustC    1
ProdA CustD    1
ProdB CustA    1
ProdB CustB    1

In Excel, I would:

1. Create new columns "ProdA", "ProdB", "Type"
2. Remove duplicates for column "customer"
3. For each customer, COUNTIF item = ProdA and COUNTIF item = ProdB
4. IF(AND(ProdA = 1, ProdB = 1), "Both", "One")


customer ProdA ProdB Type
--------------------------
CustA    1     1     Both
CustB    1     1     Both
CustC    1     0     One
CustD    1     0     One
python pandas
1 Answer

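We can use pd.crosstab, then sum the ProdA and ProdB columns and map the result to a label. This assumes each (customer, item) pair appears at most once, as in the sample data: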

import pandas as pd

dfn = pd.crosstab(df['customer'], df['item']).reset_index()  # row counts per customer and item
dfn['Type'] = dfn[['ProdA', 'ProdB']].sum(axis=1).map({2:'Both', 1:'One'})  # 2 = both products, 1 = one

item customer  ProdA  ProdB  Type
0       CustA      1      1  Both
1       CustB      1      1  Both
2       CustC      1      0   One
3       CustD      1      0   One
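
If a customer can buy the same product more than once, the crosstab counts can exceed 1, and a sum of 2 no longer guarantees that both products were bought. The following is a minimal sketch of a variant that counts distinct products instead. It rebuilds the sample data from the question so it runs on its own. The helper name label_customers and the extra repeat-purchase row for CustC are hypothetical additions for illustration, and the "Both" label assumes there are only two products.

```python
import pandas as pd

# Sample data from the question. The last row (a second ProdA purchase by
# CustC) is a hypothetical addition to demonstrate the repeat-purchase case.
df = pd.DataFrame({
    "item": ["ProdA", "ProdA", "ProdA", "ProdA", "ProdB", "ProdB", "ProdA"],
    "customer": ["CustA", "CustB", "CustC", "CustD", "CustA", "CustB", "CustC"],
    "qty": [1, 1, 1, 1, 1, 1, 1],
})


def label_customers(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: one row per customer with per-item counts and a Type label."""
    # Count rows for every (customer, item) combination.
    counts = pd.crosstab(frame["customer"], frame["item"])
    # Count how many distinct products each customer bought at least once.
    n_products = counts.gt(0).sum(axis=1)
    # Assumes two products in total, so 2 or more distinct products means "Both".
    counts["Type"] = n_products.map(lambda n: "Both" if n >= 2 else "One")
    return counts.reset_index()


result = label_customers(df)
print(result)
```

Here CustC has a ProdA count of 2 but is still labelled "One", whereas summing the raw counts and mapping 2 to "Both" would mislabel that customer.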