How do I leave a column blank or fill it with zeros depending on a cell's string value?

Question (0 votes, 1 answer)

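I am working with an Excel file that has three different worksheets. Ideally, the Dummy column should be filled with zeros by default, unless the student's name is "Roberto" or "Leonardo", in which case the Dummy column values should be left blank.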

  • My original DF looks like this:
{'Sheet_1':     ID      Name  Surname  Grade favourite color    favourite sport  Dummy
 0  104  Eleanor     Rigby      6            blue  American football    NaN
 1  168  Barbara       Ann      8            pink             Hockey    0.0
 2  450    Polly   Cracker      7           black      Skateboarding    NaN
 3   90   Little      Josy     10          orange            Cycling    NaN,
 'Sheet_2':     ID       Name   Surname  Grade favourite color favourite sport  Dummy
 0  106       Lucy       Sky      8          yellow          Tennis    NaN
 1  128    Delilah     Perez      5     light green      Basketball    0.0
 2  100  Christina   Rodwell      3           black       Badminton    NaN
 3   40      Ziggy  Stardust      7             red          Squash    NaN,
 'Sheet_3':     ID   Name   Surname  Grade favourite color favourite sport  Dummy
 0   22   Lucy  Diamonds      9           brown            Judo    NaN
 1   50  Grace     Kelly      7           white       Taekwondo    NaN
 2  105    Uma   Thurman      7          purple      videogames    NaN
 3   29   Lola   McQueen      3             red            Surf    0.0}
  • I would like to write a loop that ends up with something like this:
{'Sheet_1':     ID      Name  Surname  Grade favourite color    favourite sport  Dummy
 0  104  Eleanor     Rigby      6            blue  American football    NaN
 1  168  Barbara       Ann      8            pink             Hockey    NaN
 2  450    Polly   Cracker      7           black      Skateboarding    NaN
 3   90   Little      Josy     10          orange            Cycling    NaN,
 'Sheet_2':     ID       Name   Surname  Grade favourite color favourite sport  Dummy
 0  106       Lucy       Sky      8          yellow          Tennis      0
 1  128    Delilah     Perez      5     light green      Basketball      0
 2  100  Christina   Rodwell      3           black       Badminton      0
 3   40      Ziggy  Stardust      7             red          Squash      0,
 'Sheet_3':     ID   Name   Surname  Grade favourite color favourite sport  Dummy
 0   22   Lucy  Diamonds      9           brown            Judo    NaN
 1   50  Grace     Kelly      7           white       Taekwondo    NaN
 2  105    Uma   Thurman      7          purple      videogames    NaN
 3   29   Lola   McQueen      3             red            Surf    NaN}
  • I have not mentioned yet that the student's name is written in cell B2 of every sheet (all sheets share the same format).

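Any help is appreciated. I am finding it hard to work out how to make the program take into account the information written in each sheet, so that the Dummy column is left blank whenever "Roberto" or "Leonardo" is found in B2. The code below just fills the Dummy column with zeros in all sheets (which is not my expected output):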

# Importing modules
import pandas as pd
import openpyxl
from openpyxl import load_workbook

# Defining the file path
file_path = r'C:/Users/machukovich/stack_2.xlsx'

# Load workbook as openpyxl
reference_workbook = openpyxl.load_workbook(file_path)
wb = load_workbook(file_path)

# Keep only the active worksheet (this rebinds wb from the workbook to a single sheet)
wb = wb.active

# Loading the file into a dictionary of DataFrames
dict_of_df = pd.read_excel(file_path, sheet_name=None, skiprows=2)

# Read cell B2 of the active sheet for later use (not yet used in the loop below)
student_name = wb['B2'].value

# The loop itself (it fills the 'Dummy' column with zeros in every sheet)
for sheet_name, df in dict_of_df.items():
    df['Dummy'] = df['Dummy'].fillna(0)

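Edit: I have changed the code to make it reproducible, taking into account that cell B2 does not appear in the DataFrames shown above. That is why I skip the first two rows of each Excel sheet. Leonardo and Roberto appear in B2 of Sheet_1 and Sheet_3, respectively.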

pandas dataframe for-loop if-statement ordereddictionary
1 Answer (0 votes)

Try this. It reads B2 from each sheet separately instead of only from the active one:

import pandas as pd
import openpyxl

# Defining the file path
file_path = r'C:/Users/machukovich/stack_2.xlsx'

# Load workbook as openpyxl
wb = openpyxl.load_workbook(file_path)

# Loading the file into a dictionary of DataFrames
dict_of_df = pd.read_excel(file_path, sheet_name=None, skiprows=2)

# The loop itself
for sheet_name, df in dict_of_df.items():
    # Read the student's name from cell B2 of the current sheet
    student_name = wb[sheet_name]['B2'].value
    # If the name is 'Roberto' or 'Leonardo', blank the whole 'Dummy' column
    # (including any existing 0.0 values); otherwise fill it with 0
    df['Dummy'] = float('nan') if student_name in ('Roberto', 'Leonardo') else 0

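# Save the modified DataFrames back to the Excel file.
# Note: this overwrites the file, and the two rows skipped by skiprows=2
# (including cell B2) are not written back.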
with pd.ExcelWriter(file_path) as writer:
    for sheet_name, df in dict_of_df.items():
        df.to_excel(writer, sheet_name=sheet_name, index=False)
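
If you want to check the per-sheet rule without touching the Excel file, here is a minimal sketch that runs on in-memory DataFrames. It is only an illustration: `fill_dummy` and `names_by_sheet` are hypothetical names that are not in the question, the dictionary stands in for the B2 values, and "Maria" is a made-up name for Sheet_2 (the question does not say who is in that cell).

```python
import pandas as pd


def fill_dummy(df, student_name, blank_names=("Roberto", "Leonardo")):
    """Return a copy of df whose 'Dummy' column is all NaN for the listed names, else all 0."""
    out = df.copy()
    # Assigning a scalar broadcasts it to every row of the column.
    out["Dummy"] = float("nan") if student_name in blank_names else 0
    return out


# Hypothetical stand-ins for the B2 values ("Maria" is invented for illustration).
names_by_sheet = {"Sheet_1": "Leonardo", "Sheet_2": "Maria", "Sheet_3": "Roberto"}

# Cut-down versions of the sheets from the question (only two columns, two rows each).
dict_of_df = {
    "Sheet_1": pd.DataFrame({"ID": [104, 168], "Dummy": [float("nan"), 0.0]}),
    "Sheet_2": pd.DataFrame({"ID": [106, 128], "Dummy": [float("nan"), 0.0]}),
    "Sheet_3": pd.DataFrame({"ID": [22, 29], "Dummy": [float("nan"), 0.0]}),
}

# Apply the rule sheet by sheet; the original DataFrames are left unchanged.
result = {
    sheet: fill_dummy(df, names_by_sheet[sheet]) for sheet, df in dict_of_df.items()
}

for sheet, df in result.items():
    print(sheet)
    print(df)
```

With the real workbook, the name lookup would be `wb[sheet_name]['B2'].value` as in the loop above, in place of the dictionary.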
