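I have loaded an Excel workbook into a dictionary of DataFrames, and I want to write a loop that returns only the rows where the "favourite color" column is NaN.
My code so far: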
# Importing modules
import openpyxl as op
import pandas as pd
import numpy as np
import xlsxwriter
from openpyxl import Workbook, load_workbook
# Defining the file path
my_file_path = r'C:\Users\machukovich\Desktop\stack.xlsx'
# Loading the file into a dictionary of Dataframes
my_dict = pd.read_excel(my_file_path, sheet_name=None, skiprows=2)
Output of my_dict:
my_dict = {'Sheet_1': Name Surname Concatenation ID_ Grade_ favourite color
1 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN
2 Christina Rodwell Christina Rodwell 100.0 3.0 black
3 Ziggy Stardust Ziggy Stardust 40.0 7.0 red ,
'Sheet_2': Name Surname Concatenation ID_ Grade_ favourite color \
0 Lucy Diamonds Lucy Diamonds 22.0 9.0 brown
1 Grace Kelly Grace Kelly 50.0 7.0 white
2 Uma Thurman Uma Thurman 105.0 7.0 purple
3 Lola King Lola King NaN NaN NaN ,
'Sheet_3': Name Surname Concatenation ID_ Grade_ favourite color \
0 Eleanor Rigby Eleanor Rigby 104.0 6.0 blue
1 Barbara Ann Barbara Ann 168.0 8.0 pink
2 Polly Cracker Polly Cracker 450.0 7.0 black
3 Little Joe Little Joe NaN NaN NaN }
My desired output:
my_dict = {'Sheet_1': Name Surname Concatenation ID_ Grade_ favourite color
1 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN
'Sheet_2': Name Surname Concatenation ID_ Grade_ favourite color \
3 Lola King Lola King NaN NaN NaN
'Sheet_3': Name Surname Concatenation ID_ Grade_ favourite color \
3 Little Joe Little Joe NaN NaN NaN
Finally, I want to write the desired output to a new Excel file (one worksheet per dictionary key).
Any advice is appreciated. I am new to Python.
I would do it this way:
with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
    for sn, df in my_dict.items():
        (df.loc[df["favourite color"].isnull()]  # boolean indexing: keep rows where the colour is NaN
           .to_excel(writer, sheet_name=sn, index=False))  # with startrow, startcol ?

    # this is optional
    for ws in writer.sheets:
        writer.sheets[ws].autofit()  # xlsxwriter 3.0.6+
Output (Sheet_1 only): [image]
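To make the filtering step easier to follow, here is a minimal sketch of the boolean indexing on its own, without any Excel file. The small DataFrame is a hypothetical stand-in for one sheet of my_dict, with a few values copied from Sheet_2 in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one sheet of my_dict (values taken from Sheet_2)
df = pd.DataFrame({
    "Name": ["Lucy", "Grace", "Uma", "Lola"],
    "Surname": ["Diamonds", "Kelly", "Thurman", "King"],
    "favourite color": ["brown", "white", "purple", np.nan],
})

# isnull() returns a boolean Series: True where the value is missing
mask = df["favourite color"].isnull()

# .loc with a boolean mask keeps only the rows where the mask is True
missing = df.loc[mask]
print(missing)
```

Only the "Lola King" row survives, and it keeps its original row label (3), because boolean indexing does not renumber the rows.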
Update:
If you want to update my_dict first, you can use this:
for sn, df in my_dict.items():
    my_dict[sn] = df.loc[df["favourite color"].isnull()]
Output:
print(my_dict)
{'Sheet_1': Name Surname Concatenation ID_ Grade_ favourite color
0 Delilah Gonzalez Delilah Gonzalez NaN NaN NaN,
'Sheet_2': Name Surname Concatenation ID_ Grade_ favourite color
0 Lola King Lola King NaN NaN NaN,
'Sheet_3': Name Surname Concatenation ID_ Grade_ favourite color
0 Little Joe Little Joe NaN NaN NaN}
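Note that boolean indexing keeps the original row labels, so depending on how the sheets were read, the printed index may not start at 0. If a fresh 0-based index is wanted, one option is a dict comprehension combined with reset_index(drop=True). The sketch below is an optional variation, and the two-sheet dictionary is a hypothetical stand-in for the real my_dict. It builds a new dictionary named filtered (a name chosen here for illustration) instead of overwriting my_dict.

```python
import numpy as np
import pandas as pd

# Hypothetical two-sheet stand-in for my_dict
my_dict = {
    "Sheet_1": pd.DataFrame(
        {"Name": ["Delilah", "Christina", "Ziggy"],
         "favourite color": [np.nan, "black", "red"]},
        index=[1, 2, 3],  # mimics the 1-based labels shown in the question
    ),
    "Sheet_2": pd.DataFrame(
        {"Name": ["Lucy", "Grace", "Uma", "Lola"],
         "favourite color": ["brown", "white", "purple", np.nan]}
    ),
}

# Filter every sheet and renumber the surviving rows from 0
filtered = {
    sn: df.loc[df["favourite color"].isnull()].reset_index(drop=True)
    for sn, df in my_dict.items()
}

for sn, df in filtered.items():
    print(sn, df, sep="\n")
```

When the result is written with index=False, the row labels are not exported, so this step only matters for how the DataFrames look in Python.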
Then (if needed) you can loop over each filtered df and write it to its own worksheet:
with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
    for sn, df in my_dict.items():
        df.to_excel(writer, sheet_name=sn, index=False)