How to iterate over a dictionary of DataFrames based on a string value

Question

I have an Excel file containing three sheets. Each sheet shares the same structure but holds different data.

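Initially, I managed to write 'success' in every row of the 'Dummy' column where the student name is 'Roberto' or 'Leonardo'. I now want to write 'failure' for any other name. So far I have tried an else branch, but it does not work as expected.

For clarity, below is a snapshot of each worksheet in my Excel file, along with the raw OrderedDict after parsing.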

        {'Sheet_1':
           ID   Name       Surname   Grade  favourite color  favourite sport    Dummy
        0  104  Eleanor    Rigby     6      blue             American football  NaN
        1  168  Barbara    Ann       8      pink             Hockey             NaN
        2  450  Polly      Cracker   7      black            Skateboarding      NaN
        3  90   Little     Josy      10     orange           Cycling            NaN,
         'Sheet_2':
           ID   Name       Surname   Grade  favourite color  favourite sport    Dummy
        0  106  Lucy       Sky       8      yellow           Tennis             NaN
        1  128  Delilah    Perez     5      light green      Basketball         NaN
        2  100  Christina  Rodwell   3      black            Badminton          NaN
        3  40   Ziggy      Stardust  7      red              Squash             NaN,
         'Sheet_3':
           ID   Name       Surname   Grade  favourite color  favourite sport    Dummy
        0  22   Lucy       Diamonds  9      brown            Judo               NaN
        1  50   Grace      Kelly     7      white            Taekwondo          NaN
        2  105  Uma        Thurman   7      purple           videogames         NaN
        3  29   Lola       McQueen   3      red              Surf               NaN}

My code (it does not write 'failure' for 'Miquel Angelo'):

        # Importing modules
        import openpyxl as op
        import pandas as pd
        import numpy as np
        import xlsxwriter
        import openpyxl
        from openpyxl import Workbook, load_workbook

        # Defining the file path
        file_path = r'C:/Users/machukovich/Desktop/stack_2.xlsx'

        # Load workbook as openpyxl
        reference_workbook = load_workbook(file_path)

        # We will maintain the workbook open
        wb = reference_workbook.active

        # Getting the sheetnames as a list using the sheetnames attribute
        sheet_names = reference_workbook.sheetnames
        print(sheet_names)

        names_list = []
        for sheet in reference_workbook.worksheets:
            student_name = sheet['B2'].value
            names_list.append(student_name)
            print(f"Student Name: {student_name}")
        print(f"List of Names: {names_list}")

        # Loading the file into a dictionary of Dataframes
        dict_of_df = pd.read_excel(file_path, sheet_name=None, skiprows=2)

        # Writing the loop itself (it fills all the 'Dummy' columns with zeros in all sheets):
        for sheet_name, df in dict_of_df.items():
            if student_name =='Roberto' or student_name =='Leonardo':
                df['Dummy'] = df['Dummy'].fillna('success')
            else:
                df['Dummy'] = df['Dummy'].fillna('failure')

Snapshots: (images of the three worksheets, not reproduced here)

Answer 1

moken

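The value of student_name is whatever was read last, from sheet 3, which is 'Leonardo'. Although each student_name value was appended to names_list, that list is never used. So when you iterate over the sheets, student_name is still 'Leonardo' every time the line

        if student_name =='Roberto' or  student_name =='Leonardo':

is evaluated, and so the code only ever writes 'success' to the 'Dummy' column. Perhaps you meant to use the values in names_list while iterating over the sheets? Something like the following works as expected for me and writes 'failure' on the second sheet: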

        index = 0
        for sheet_name, df in dict_of_df.items():
            student_name = names_list[index]
            index += 1
            print(f"Student Name in loop: {student_name}")

            if student_name =='Roberto' or  student_name =='Leonardo':
                df['Dummy'] = df['Dummy'].fillna('success')
            else:
                df['Dummy'] = df['Dummy'].fillna('failure')
            df[['Dummy']].apply(print)

Full log output:

        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - ['Sheet_1', 'Sheet_2', 'Sheet_3']
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name: Roberto
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name: Miquel Angelo
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name: Leonardo
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - List of Names: ['Roberto', 'Miquel Angelo', 'Leonardo']
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name in loop: Roberto
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - 0    success
        1    success
        2    success
        3    success
        Name: Dummy, dtype: object
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name in loop: Miquel Angelo
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - 0    failure
        1    failure
        2    failure
        3    failure
        Name: Dummy, dtype: object
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name in loop: Leonardo
        [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - 0    success
        1    success
        2    success
        3    success
        Name: Dummy, dtype: object
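
As a variation on the same idea, the manual index counter can probably be avoided by pairing each sheet with its name through zip. The sketch below is not the asker's actual workbook: the small in-memory DataFrames, the make_sheet helper, and the PASSING_NAMES set are hypothetical stand-ins so the example runs without an Excel file. It assumes names_list has one entry per sheet, in the same order as the sheets.

```python
import pandas as pd


def make_sheet():
    # Hypothetical stand-in for one parsed sheet with an empty 'Dummy' column.
    return pd.DataFrame({"ID": [1, 2], "Dummy": [None, None]})


# Stand-in for the dict returned by pd.read_excel(..., sheet_name=None).
dict_of_df = {
    "Sheet_1": make_sheet(),
    "Sheet_2": make_sheet(),
    "Sheet_3": make_sheet(),
}

# Names as read from cell B2 of each sheet, in sheet order (values taken from the log).
names_list = ["Roberto", "Miquel Angelo", "Leonardo"]

# Hypothetical helper set: names that should be marked 'success'.
PASSING_NAMES = {"Roberto", "Leonardo"}

# zip pairs every sheet with its own name, so no stale loop variable is reused.
for (sheet_name, df), student_name in zip(dict_of_df.items(), names_list):
    label = "success" if student_name in PASSING_NAMES else "failure"
    df["Dummy"] = df["Dummy"].fillna(label)

# Collect the filled columns per sheet for inspection.
results = {name: df["Dummy"].tolist() for name, df in dict_of_df.items()}
print(results)
```

Because dictionaries preserve insertion order, the sheets and names line up as long as both were built in sheet order. On Python 3.10 or later, zip(..., strict=True) would raise an error if the two lengths ever differ, which may be a useful safeguard.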
