Storing index values whose running sum stays within a given total

Question (votes: 0, answers: 3)

I have the following df:

County      TotPerson  
Wayne       148        
Oakland     125        
Macomb      63         
Washtenaw   30          
Ingham      30          
Monroe      28          
Hillsdale   15          
Livingstone 15          
Jackson     14          
Lenawee     12        
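
For anyone who wants to reproduce this without copying from the clipboard, here is a minimal sketch that builds the same frame. It assumes pandas is installed and that both columns are ordinary columns, not the index.

```python
import pandas as pd

# Same ten rows as the table above, in the same top-to-bottom order
df = pd.DataFrame({
    "County": ["Wayne", "Oakland", "Macomb", "Washtenaw", "Ingham",
               "Monroe", "Hillsdale", "Livingstone", "Jackson", "Lenawee"],
    "TotPerson": [148, 125, 63, 30, 30, 28, 15, 15, 14, 12],
})
```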

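I want to store, in separate lists or in a dictionary (either is fine), the counties whose running sum from top to bottom does not exceed 190.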

The result should look like this:

Group1
[Wayne]

Group2
[Oakland,Macomb]

Group3
[Washtenaw, Ingham, Monroe, Hillsdale, Livingstone, Jackson, Lenawee]  
python pandas list cumsum
3 Answers

1 vote
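groups = []
remaining = df.copy()  # work on a copy so the original df is not consumed

for _ in range(len(df)):
    if len(remaining) > 0:
        running = remaining.TotPerson.cumsum()
        # rows whose running total is still below 190 form the next group
        groups.append(remaining.loc[running.lt(190)].County.tolist())
        remaining = remaining.loc[running.ge(190)]

groups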

[['Wayne'],
 ['Oakland', 'Macomb'],
 ['Washtenaw',
  'Ingham',
  'Monroe',
  'Hillsdale',
  'Livingstone',
  'Jackson',
  'Lenawee']]

4 votes

The logic is roughly: restart the running sum whenever it would reach the limit of 190.

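import numpy as np
# running sum that restarts at the current value once it would reach 190; each restart marks a new group
sumlm = np.frompyfunc(lambda a, b: a + b if a + b < 190 else b, 2, 1)
# np.object was removed in NumPy 1.24, so the builtin object is used as the dtype
running = sumlm.accumulate(df.TotPerson.to_numpy(), dtype=object)
group_id = df.TotPerson.eq(running).cumsum()

l = df.County.groupby(group_id).agg(list)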
TotPerson
1                                              [Wayne]
2                                    [Oakland, Macomb]
3    [Washtenaw, Ingham, Monroe, Hillsdale, Livings...
Name: County, dtype: object
l.tolist()
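
To see what the resetting accumulation does step by step, here is a small sketch of the same idea in plain Python with `itertools.accumulate`. The values are copied from the question; the names `tot`, `limit`, `running`, and `group_ids` are only for illustration. It assumes all values are positive, so the running sum equals a row's own value only where a restart happened.

```python
from itertools import accumulate

tot = [148, 125, 63, 30, 30, 28, 15, 15, 14, 12]  # TotPerson column from the question
limit = 190

# Running sum that restarts from the current value once it would reach the limit
running = list(accumulate(tot, lambda a, b: a + b if a + b < limit else b))

# A row starts a new group when the running sum equals the row's own value
starts = [r == t for r, t in zip(running, tot)]

# Counting the starts so far gives a group number for every row
group_ids = list(accumulate(int(s) for s in starts))

print(running)    # [148, 125, 188, 30, 60, 88, 103, 118, 132, 144]
print(group_ids)  # [1, 2, 2, 3, 3, 3, 3, 3, 3, 3]
```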

Or try a for loop:

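l = []
c = 0
for i, y in enumerate(df.TotPerson):
    c += y
    if c >= 190:
        l.append(i)  # position where a new group starts
        c = y        # restart the running sum from the current row, not from 0
# assumes df has the default RangeIndex, so positions match index labels
df.County.groupby(df.index.isin(l).cumsum()).agg(list)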

0 votes

I could only solve this with a loop; numpy.cumsum was not much help for this problem. I hope it solves yours.

df = pd.read_clipboard()

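cumsum = 0
lst1 = []  # counties in the group currently being built
lst2 = []  # completed groups
for j, i in zip(df.County, df.TotPerson):
    cumsum += i
    if cumsum <= 190:
        lst1.append(j)
    else:
        # limit exceeded: close the current group and start a new one with this row
        lst2.append(lst1)
        cumsum = i
        lst1 = [j]
lst2.append(lst1)  # close the last group
lst2  # this is the desired list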
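
If this is needed more than once, the loop above could be wrapped in a small function. This is only a sketch: the name `group_by_running_total` and its parameters are made up for illustration, and it takes plain sequences so it works with `df.County` and `df.TotPerson` as well as with lists. It treats the limit as inclusive (a group may sum to exactly 190), and a single value larger than the limit ends up in a group of its own.

```python
def group_by_running_total(names, values, limit):
    """Split names into consecutive groups whose values sum to at most limit."""
    groups = []   # completed groups
    current = []  # group currently being built
    total = 0     # running sum of the current group
    for name, value in zip(names, values):
        # Close the current group if adding this value would exceed the limit
        if current and total + value > limit:
            groups.append(current)
            current, total = [], 0
        current.append(name)
        total += value
    if current:
        groups.append(current)  # close the last group
    return groups


counties = ["Wayne", "Oakland", "Macomb", "Washtenaw", "Ingham",
            "Monroe", "Hillsdale", "Livingstone", "Jackson", "Lenawee"]
tot_person = [148, 125, 63, 30, 30, 28, 15, 15, 14, 12]

groups = group_by_running_total(counties, tot_person, 190)
print(groups)
```

With a DataFrame, the call would be `group_by_running_total(df.County, df.TotPerson, 190)`.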