How do I find the cumulative sum of a column?

Question    Votes: 0    Answers: 2
import pandas as pd

dict = {"asset": ["S3", "S2", "E4", "E1", "A6", "A8"],
        "Rank": [1, 2, 3, 4, 5, 6],
        "number_of_attributes": [2, 1, 2, 2, 1, 1],
        "number_of_cards": [1, 2, 2, 1, 2, " "],
        "cards_plus1": [2, 3, 3, 2, 3, " "]}

dframe = pd.DataFrame(dict, index=[1, 2, 3, 4, 5, 6],
                      columns=["asset", "Rank", "number_of_attributes", "number_of_cards", "cards_plus1"])

I want to compute the cumulative sum of the "cards_plus1" column. How can I do that? The output of the cumsum column should be: 0 2 5 8 10 13

python dataframe cumsum
2 Answers
0 votes

Try this:

First, replace the blank values with NaN:

import pandas as pd
import numpy as np

dict = {"asset": ["S3", "S2", "E4", "E1", "A6", "A8"],
        "Rank": [1, 2, 3, 4, 5, 6],
        "number_of_attributes": [2, 1, 2, 2, 1, 1],
        "number_of_cards": [1, 2, 2, 1, 2, " "],
        "cards_plus1": [2, 3, 3, 2, 3, " "]}

dframe = pd.DataFrame(dict, index=[1, 2, 3, 4, 5, 6],
                      columns=["asset", "Rank", "number_of_attributes", "number_of_cards", "cards_plus1"])

## replace blank (empty or whitespace-only) values with NaN
## note: replace(..., inplace=True) returns None, so assign the result instead of printing it
dframe = dframe.replace(r'^\s*$', np.nan, regex=True)

print(dframe)
>>> asset  Rank  number_of_attributes  number_of_cards  cards_plus1
1    S3     1                     2              1.0          2.0
2    S2     2                     1              2.0          3.0
3    E4     3                     2              2.0          3.0
4    E1     4                     2              1.0          2.0
5    A6     5                     1              2.0          3.0
6    A8     6                     1              NaN          NaN

If the cards_plus1 column still has dtype object at this point, convert it to numeric:

### convert data type of the cards_plus1 to numeric 
dframe['cards_plus1'] = pd.to_numeric(dframe['cards_plus1'])

Now compute the cumulative sum:

### now we can calculate cumsum
dframe['cards_plus1_cumsum'] = dframe['cards_plus1'].cumsum()

print(dframe)
>>>
asset  Rank  number_of_attributes  number_of_cards  cards_plus1  \
1    S3     1                     2              1.0          2.0   
2    S2     2                     1              2.0          3.0   
3    E4     3                     2              2.0          3.0   
4    E1     4                     2              1.0          2.0   
5    A6     5                     1              2.0          3.0   
6    A8     6                     1              NaN          NaN   

   cards_plus1_cumsum  
1                 2.0  
2                 5.0  
3                 8.0  
4                10.0  
5                13.0  
6                 NaN 

Instead of replacing the blank values with NaN, you could replace them with zero, depending on what you want. Hope that helps.
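
A minimal sketch of the zero-filling variant, assuming a blank should count as 0 in the running total. It uses `pd.to_numeric` with `errors="coerce"` to turn non-numeric entries into NaN and `fillna(0)` to replace them; the reduced one-column frame is only for illustration.

```python
import pandas as pd

# Same cards_plus1 data as in the question; the last entry is a blank string.
dframe = pd.DataFrame({"cards_plus1": [2, 3, 3, 2, 3, " "]},
                      index=[1, 2, 3, 4, 5, 6])

# Coerce non-numeric entries to NaN, then treat them as 0 (an assumption).
numeric = pd.to_numeric(dframe["cards_plus1"], errors="coerce").fillna(0)
dframe["cards_plus1_cumsum"] = numeric.cumsum()

print(dframe["cards_plus1_cumsum"].tolist())
```

With zeros in place of NaN, the last row repeats the previous total rather than showing NaN.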


0 votes

Because the last element of the cards_plus1 column is a string (" "), you first need to extract the elements of type int; then you can sum them with np.cumsum.

import numpy as np

# keep only the integer entries (this drops the blank string), then accumulate
a = [x for x in dict['cards_plus1'] if isinstance(x, int)]
cumsum = np.cumsum(a)

0 votes

I want it to start with zero instead of 2, with the values shifted down accordingly: cards_plus1_cumsum 0 2 5 8 10 13
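
One possible way to get that output is sketched below. It assumes the running total should be shifted down by one row with 0 filled in at the top, using `Series.shift` with `fill_value`. The reduced one-column frame is only for illustration.

```python
import numpy as np
import pandas as pd

# cards_plus1 after the blank has been replaced with NaN.
dframe = pd.DataFrame({"cards_plus1": [2, 3, 3, 2, 3, np.nan]},
                      index=[1, 2, 3, 4, 5, 6])

# Running total, then move every value down one row; the first row becomes 0
# and the trailing NaN drops off the end.
shifted = dframe["cards_plus1"].cumsum().shift(1, fill_value=0)
dframe["cards_plus1_cumsum"] = shifted

print(shifted.tolist())
```

Each row then holds the sum of all previous rows, excluding the row itself.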
