Loop to filter NaN rows in a dictionary of DataFrames

Question · Votes: 0 · Answers: 1

I have loaded an Excel workbook into a dictionary, and I want to write a loop that outputs only the rows where the `favourite color` column is `NaN`.

My code so far:

# Importing modules
import openpyxl as op
import pandas as pd
import numpy as np
import xlsxwriter
from openpyxl import Workbook, load_workbook

# Defining the file path
my_file_path = r'C:\Users\machukovich\Desktop\stack.xlsx'
# Loading every sheet into a dictionary of DataFrames (sheet_name=None)
my_dict = pd.read_excel(my_file_path, sheet_name=None, skiprows=2)

Output of `my_dict`:

my_dict = {'Sheet_1':         Name   Surname      Concatenation    ID_  Grade_ favourite color   
 1    Delilah  Gonzalez   Delilah Gonzalez    NaN     NaN             NaN   
 2  Christina   Rodwell  Christina Rodwell  100.0     3.0           black   
 3      Ziggy  Stardust     Ziggy Stardust   40.0     7.0             red    ,
 'Sheet_2':     Name   Surname  Concatenation    ID_  Grade_ favourite color  \
 0   Lucy  Diamonds  Lucy Diamonds   22.0     9.0           brown   
 1  Grace     Kelly    Grace Kelly   50.0     7.0           white   
 2    Uma   Thurman    Uma Thurman  105.0     7.0          purple   
 3   Lola      King      Lola King    NaN     NaN             NaN     ,
 'Sheet_3':        Name  Surname   Concatenation    ID_  Grade_ favourite color  \
 0  Eleanor     Rigby  Eleanor  Rigby  104.0     6.0            blue   
 1  Barbara       Ann    Barbara  Ann  168.0     8.0            pink   
 2    Polly   Cracker  Polly  Cracker  450.0     7.0           black   
 3   Little       Joe     Little  Joe    NaN     NaN             NaN    }

My desired output:

my_dict = {'Sheet_1':         Name   Surname      Concatenation    ID_  Grade_ favourite color  
 1    Delilah  Gonzalez   Delilah Gonzalez    NaN     NaN             NaN   
 'Sheet_2':     Name   Surname  Concatenation    ID_  Grade_ favourite color  \ 
 3   Lola      King      Lola King    NaN     NaN             NaN   
  'Sheet_3':        Name  Surname   Concatenation    ID_  Grade_ favourite color  \
 3   Little       Joe     Little  Joe    NaN     NaN             NaN   

Finally, I want to write the desired output to a new Excel file (with each entry in a separate sheet). Any advice is appreciated. I am new to Python.

python excel dataframe openpyxl xlsxwriter
1 Answer
Votes: 1

I would do it this way:

with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
    for sn, df in my_dict.items():
        (df.loc[df["favourite color"].isnull()]  # boolean indexing keeps only the NaN rows
             .to_excel(writer, sheet_name=sn, index=False))  # pass startrow/startcol here if needed
    # this is optional
    for ws in writer.sheets:
        writer.sheets[ws].autofit() # xlsxwriter 3.0.6+

Output (`Sheet_1`): [screenshot not preserved]
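
The filter in the loop above relies on boolean indexing: `isnull()` returns a boolean Series, and `df.loc[mask]` keeps only the rows where that Series is `True`. Here is a minimal sketch on an in-memory frame. The sample data is an assumption modelled on `Sheet_1` from the question, not read from the real file:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for one sheet of my_dict
# (a subset of the columns and values shown for Sheet_1 in the question).
df = pd.DataFrame(
    {
        "Name": ["Delilah", "Christina", "Ziggy"],
        "Surname": ["Gonzalez", "Rodwell", "Stardust"],
        "favourite color": [np.nan, "black", "red"],
    },
    index=[1, 2, 3],
)

# isnull() marks missing values with True (isna() is an alias).
mask = df["favourite color"].isnull()

# Boolean indexing keeps only the rows where the mask is True.
nan_rows = df.loc[mask]
print(nan_rows)
```

Note that a comparison such as `df["favourite color"] == np.nan` would not work here, because `NaN` never compares equal to anything, including itself. That is why `isnull()` is used.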

Update:

If you want to update `my_dict` first, you can use this:

for sn, df in my_dict.items():
    my_dict[sn] = df.loc[df["favourite color"].isnull()]

Output:

print(my_dict)

{'Sheet_1':       Name   Surname     Concatenation  ID_  Grade_  favourite color
 0  Delilah  Gonzalez  Delilah Gonzalez  NaN     NaN              NaN,
 'Sheet_2':    Name Surname Concatenation  ID_  Grade_  favourite color
 0  Lola    King     Lola King  NaN     NaN              NaN,
 'Sheet_3':      Name Surname Concatenation  ID_  Grade_  favourite color
 0  Little     Joe    Little Joe  NaN     NaN              NaN}
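
The same update could also be written as a dict comprehension that builds a new dictionary instead of reassigning values while iterating. The sketch below uses small hypothetical frames in place of the real workbook. The `reset_index(drop=True)` call is an optional assumption about the desired result: without it, each filtered row keeps its original row label.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the dictionary returned by
# pd.read_excel(..., sheet_name=None); only two columns are kept for brevity.
my_dict = {
    "Sheet_1": pd.DataFrame(
        {"Name": ["Delilah", "Christina"], "favourite color": [np.nan, "black"]}
    ),
    "Sheet_2": pd.DataFrame(
        {"Name": ["Lucy", "Lola"], "favourite color": ["brown", np.nan]}
    ),
}

# Build a new dict of filtered frames; the original frames are left untouched.
filtered = {
    sn: df.loc[df["favourite color"].isnull()].reset_index(drop=True)
    for sn, df in my_dict.items()
}

for sn, df in filtered.items():
    print(sn, df["Name"].tolist())
```

Keeping the result in a separate name such as `filtered` (a name chosen here for illustration) leaves the full data in `my_dict` available for later steps.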

Then (if needed) you can loop over each filtered `df` and write it to its own sheet in the spreadsheet:

with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
    for sn, df in my_dict.items():
        df.to_excel(writer, sheet_name=sn, index=False)