How to iterate over a dictionary of DataFrames based on a string value

Question (votes: 0, answers: 1)

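I have an Excel file with three sheets. Every sheet shares the same structure but contains different data.

  • Initially, I managed to write 'success' in every row of the 'Dummy' column on the sheets where the 'Student Name' is 'Roberto' or 'Leonardo'.
  • Now I would like to write 'failure' in the same column on every sheet that has a different 'Student Name'.
  • So far I have tried an else branch, but it does not work as expected.

For clarity, below is a snapshot of each sheet in my Excel file, along with the original OrderedDict after parsing.

My original OrderedDict: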

    {'Sheet_1':
        ID     Name  Surname  Grade  favourite color    favourite sport  Dummy
    0  104  Eleanor    Rigby      6             blue  American football    NaN
    1  168  Barbara      Ann      8             pink             Hockey    NaN
    2  450    Polly  Cracker      7            black      Skateboarding    NaN
    3   90   Little     Josy     10           orange            Cycling    NaN,
     'Sheet_2':
        ID       Name   Surname  Grade  favourite color  favourite sport  Dummy
    0  106       Lucy       Sky      8           yellow           Tennis    NaN
    1  128    Delilah     Perez      5      light green       Basketball    NaN
    2  100  Christina   Rodwell      3            black        Badminton    NaN
    3   40      Ziggy  Stardust      7              red           Squash    NaN,
     'Sheet_3':
        ID   Name   Surname  Grade  favourite color  favourite sport  Dummy
    0   22   Lucy  Diamonds      9            brown             Judo    NaN
    1   50  Grace     Kelly      7            white        Taekwondo    NaN
    2  105    Uma   Thurman      7           purple       videogames    NaN
    3   29   Lola   McQueen      3              red             Surf    NaN}

My code (it does not write 'failure' for 'Miquel Angelo'):

    # Importing modules
    import openpyxl as op
    import pandas as pd
    import numpy as np
    import xlsxwriter
    import openpyxl
    from openpyxl import Workbook, load_workbook

    # Defining the file path
    file_path = r'C:/Users/machukovich/Desktop/stack_2.xlsx'

    # Load workbook as openpyxl
    reference_workbook = load_workbook(file_path)

    # We will maintain the workbook open
    wb = reference_workbook.active

    # Getting the sheetnames as a list using the sheetnames attribute
    sheet_names = reference_workbook.sheetnames
    print(sheet_names)

    names_list = []
    for sheet in reference_workbook.worksheets:
        student_name = sheet['B2'].value
        names_list.append(student_name)
        print(f"Student Name: {student_name}")
    print(f"List of Names: {names_list}")

    # Loading the file into a dictionary of Dataframes
    dict_of_df = pd.read_excel(file_path, sheet_name=None, skiprows=2)

    # Writing the loop itself (it fills all the 'Dummy' columns with zeros in all sheets):
    for sheet_name, df in dict_of_df.items():
        if student_name =='Roberto' or student_name =='Leonardo':
            df['Dummy'] = df['Dummy'].fillna('success')
        else:
            df['Dummy'] = df['Dummy'].fillna('failure')


pandas for-loop if-statement openpyxl ordereddictionary
1 answer
0 votes
moken

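After the first loop finishes, student_name holds the value that was read last, from Sheet 3, which is 'Leonardo'. Every value of student_name is appended to names_list, but that list is never used afterwards. So when you iterate over the sheets, student_name is still 'Leonardo' each time this line is evaluated:

    if student_name =='Roberto' or  student_name =='Leonardo':

As a result, the condition is always true and the loop only ever writes 'success' to the 'Dummy' column. Perhaps you intended to use the values in names_list while iterating over the sheets? Something like the following works as expected for me and writes 'failure' on the second sheet:
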
    # Walk the sheets in order, pairing each one with the name read from its cell B2
    index = 0
    for sheet_name, df in dict_of_df.items():
        student_name = names_list[index]
        index += 1
        print(f"Student Name in loop: {student_name}")

        if student_name =='Roberto' or  student_name =='Leonardo':
            df['Dummy'] = df['Dummy'].fillna('success')
        else:
            df['Dummy'] = df['Dummy'].fillna('failure')
        # Print the resulting 'Dummy' column for this sheet
        df[['Dummy']].apply(print)

Full log output:

    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - ['Sheet_1', 'Sheet_2', 'Sheet_3']
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name: Roberto
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name: Miquel Angelo
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name: Leonardo
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - List of Names: ['Roberto', 'Miquel Angelo', 'Leonardo']
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name in loop: Roberto
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - 0 success 1 success 2 success 3 success Name: Dummy, dtype: object
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name in loop: Miquel Angelo
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - 0 failure 1 failure 2 failure 3 failure Name: Dummy, dtype: object
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - Student Name in loop: Leonardo
    [2024-04-20, 22:31:09 UTC] {logging_mixin.py:151} INFO - 0 success 1 success 2 success 3 success Name: Dummy, dtype: object
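
As a possible variation on the loop in this answer, the manual index counter can be replaced with zip, which pairs each sheet with the name read from that same sheet. This is only a minimal sketch and not the code from the question or the answer: the small DataFrames are hypothetical stand-ins for the parsed sheets, the names are the ones shown in the log, and label_sheets and PASSING_NAMES are illustrative names that do not appear in the original. It also assumes that names_list and dict_of_df are in the same sheet order.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the parsed sheets; only the 'Dummy' column matters here.
dict_of_df = {
    "Sheet_1": pd.DataFrame({"ID": [104, 168], "Dummy": [np.nan, np.nan]}),
    "Sheet_2": pd.DataFrame({"ID": [106, 128], "Dummy": [np.nan, np.nan]}),
    "Sheet_3": pd.DataFrame({"ID": [22, 50], "Dummy": [np.nan, np.nan]}),
}

# Names as read from cell B2 of each sheet, in sheet order (values from the log).
names_list = ["Roberto", "Miquel Angelo", "Leonardo"]

# Illustrative constant: names whose sheets should be marked 'success'.
PASSING_NAMES = {"Roberto", "Leonardo"}


def label_sheets(frames, names, passing=PASSING_NAMES):
    """Return copies of the sheets with empty 'Dummy' cells filled per student name."""
    labelled = {}
    # zip pairs each (sheet_name, df) item with the name read from the same sheet,
    # so the condition is evaluated with a fresh name on every iteration.
    for (sheet_name, df), student_name in zip(frames.items(), names):
        label = "success" if student_name in passing else "failure"
        out = df.copy()
        # Cast to object first so that filling NaN with a string is unambiguous.
        out["Dummy"] = out["Dummy"].astype(object).fillna(label)
        labelled[sheet_name] = out
    return labelled


result = label_sheets(dict_of_df, names_list)
for sheet_name, df in result.items():
    print(sheet_name, df["Dummy"].tolist())
```

Returning copies leaves the original dictionary untouched, which makes it easier to compare the sheets before and after; mutating the DataFrames in place, as the loop above does, would work just as well.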
