How do I filter multiple dataframes and append a string to the saved file names?

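Question  Votes: 0  Answers: 2
  • The reason I want to do this is to take many variable names and create many new variable names that contain the name of the original variable.
  • For example, I have several pandas dataframes of stock items, one per location.
    • I want to create new dataframes containing only the stock items with a negative quantity, with '_neg' appended to the original variable name (the stock location).
    • I'd like to be able to do this with a for loop, something like this: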
import pandas as pd

warehouse = pd.read_excel('warehouse.xls')
retail = pd.read_excel('retailonhand.xls')
shed3 = pd.read_excel('shed3onhand.xls')
tank1 = pd.read_excel('tank1onhand.xls')
tank2 = pd.read_excel('tank2onhand.xls')

all_stock_sites = [warehouse,retail,shed3,tank1,tank2]

all_neg_stock_sites = []
for site in all_stock_sites:
    # pseudocode: string_value_of_new_site = '<name of the site variable>' + '_neg'
    string_value_of_new_site = site[site.OnHand < 0]
    all_neg_stock_sites.append(string_value_of_new_site)
  • This would produce the equivalent of:
# create new dataframes for each stock site's negative 'OnHand' values
warehouse_neg = warehouse[warehouse.OnHand < 0]
retail_neg = retail[retail.OnHand < 0]
shed3_neg = shed3[shed3.OnHand < 0]
tank1_neg = tank1[tank1.OnHand < 0]
tank2_neg = tank2[tank2.OnHand < 0]
  • without having to type out all 500 different stock locations and append '_neg' to each by hand.
python string pandas variables rename
2 Answers
0
votes
from pathlib import Path
import pandas as pd

# set path to top file directory
d = Path(r'e:\PythonProjects\stack_overflow\stock_sites')

# get all xls files
files = list(d.rglob('*.xls'))

# create, filter and save dict of dataframe
df_dict = dict()
for file in files:
    # create dataframe
    df = pd.read_excel(file)
    try:
        # filter df and add to dict
        df = df[df.OnHand < 0]
    except AttributeError as e:
        print(f'{file} caused:\n{e}\n')
        continue 
    if not df.empty:
        df_dict[f'{file.stem}_neg'] = df
        # save to new file
        new_path = file.parent / f'{file.stem}_neg{file.suffix}'
        df.to_excel(new_path, index=False)

print(df_dict.keys())
# dict_keys(['retailonhand_neg', 'shed3onhand_neg', 'tank1onhand_neg', 'tank2onhand_neg', 'warehouse_neg'])

# access individual dataframes as you would any dict
df_dict['retailonhand_neg']
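
One thing to watch with the loop above: because the filtered files are saved next to the originals, a second run of `rglob('*.xls')` would also match the `*_neg.xls` files written by the first run. A minimal sketch of one way to handle the file naming, using only `pathlib`; the helper names `tagged_path` and `is_tagged` are hypothetical and not part of the answer:

```python
from pathlib import Path

TAG = "_neg"  # the string appended to each file name, as in the answer above


def tagged_path(path: Path, tag: str = TAG) -> Path:
    """Return a sibling path with `tag` inserted before the file extension."""
    return path.with_name(f"{path.stem}{tag}{path.suffix}")


def is_tagged(path: Path, tag: str = TAG) -> bool:
    """True if the file name (without extension) already ends in `tag`."""
    return path.stem.endswith(tag)


# Hypothetical relative path; the file name itself comes from the question.
source = Path("stock_sites") / "retailonhand.xls"

print(tagged_path(source).name)        # retailonhand_neg.xls
print(is_tagged(source))               # False
print(is_tagged(tagged_path(source)))  # True
```

Inside the loop, a check such as `if is_tagged(file): continue` before `pd.read_excel(file)` would skip output from earlier runs.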

1
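votes

My suggestion is not to use variable names as the "keys" to your data. Give each dataframe a proper name instead, using a tuple or a dict.

So instead of: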

warehouse = pd.read_excel('warehouse.xls')
retail = pd.read_excel('retailonhand.xls')
shed3 = pd.read_excel('shed3onhand.xls')

you could use:

sites = {}
sites['warehouse'] = pd.read_excel('warehouse.xls')
sites['retail'] = pd.read_excel('retailonhand.xls')
sites['shed3'] = pd.read_excel('shed3onhand.xls')
...etc
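
With 500 sites, the dict keys could be derived from the file names rather than typed out one by one. A rough sketch, assuming the files follow the naming pattern in the question; the rule that strips a trailing 'onhand' and the helper name `site_name` are assumptions, not something the question specifies:

```python
from pathlib import Path

# File names taken from the question.
file_names = [
    "warehouse.xls",
    "retailonhand.xls",
    "shed3onhand.xls",
    "tank1onhand.xls",
    "tank2onhand.xls",
]


def site_name(file_name: str) -> str:
    """Derive a dict key from a file name, e.g. 'retailonhand.xls' -> 'retail'."""
    stem = Path(file_name).stem
    # Assumed rule: drop a trailing 'onhand' if present ('warehouse' has none).
    return stem[: -len("onhand")] if stem.endswith("onhand") else stem


# Map each site name to the file it would be loaded from.
site_files = {site_name(f): f for f in file_names}
print(list(site_files))
# ['warehouse', 'retail', 'shed3', 'tank1', 'tank2']
```

The dataframes could then be loaded in one pass with `sites = {name: pd.read_excel(path) for name, path in site_files.items()}`.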

Then you can create the negative keys like this:

sites_neg = {}
for site_name, site in sites.items():
  neg_key = site_name + '_neg'
  sites_neg[neg_key] = site[site.OnHand < 0]
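
To see the idea end to end without any Excel files, here is a small self-contained sketch. The in-memory dataframes and the `Item` column are made up for illustration; only the `OnHand` column name comes from the question. It builds the same `sites_neg` dict with a dict comprehension:

```python
import pandas as pd

# Hypothetical stand-ins for the Excel files in the question.
sites = {
    "warehouse": pd.DataFrame({"Item": ["bolt", "nut", "washer"], "OnHand": [5, -2, 0]}),
    "retail": pd.DataFrame({"Item": ["bolt", "nut"], "OnHand": [-1, -4]}),
    "shed3": pd.DataFrame({"Item": ["bolt"], "OnHand": [7]}),
}

# Build '<name>_neg' keys in one pass, keeping only rows with negative stock.
sites_neg = {f"{name}_neg": df[df["OnHand"] < 0] for name, df in sites.items()}

for key, df in sites_neg.items():
    print(key, len(df))
# warehouse_neg 1
# retail_neg 2
# shed3_neg 0
```

A site with no negative stock still gets a key here, holding an empty dataframe; filtering on `not df.empty` would drop those entries if they are not wanted.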