Python Pandas DataFrame format() does not update df values


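When I try to update the format of columns that contain floats or strings, the column values are updated for some input files but not for others.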

Here is the code:

    try:
        print('{:.2e}'.format(cell_counts.iat[0,1]))
        cell_counts.iat[0,1] = '{:.2e}'.format(cell_counts.iat[0,1])
        print(cell_counts.iat[0,1])
    except ValueError:
        cell_counts.iat[0,1] = cell_counts.iat[0,1]

    for x in range(0,8):
        try:
            cell_counts.iat[x,2] = '{:.2e}'.format(cell_counts.iat[x,2])
        except ValueError:
            cell_counts.iat[x,2] = cell_counts.iat[x,2]
    
    for x in range(0,8):
        try:
            cell_counts.iat[x,5] = '{:.2e}'.format(cell_counts.iat[x,5])
        except ValueError:
            cell_counts.iat[x,5] = cell_counts.iat[x,5]

    try:
        cell_counts.at[0,'Average Cells (Dead or Live)'] = '{:.2e}'.format(cell_counts.at[0,'Average Cells (Dead or Live)'])
        cell_counts.at[4,'Average Cells (Dead or Live)'] = '{:.2e}'.format(cell_counts.at[4,'Average Cells (Dead or Live)'])
    except ValueError:
        cell_counts.at[0,'Average Cells (Dead or Live)'] = cell_counts.at[0,'Average Cells (Dead or Live)']
        cell_counts.at[4,'Average Cells (Dead or Live)'] = cell_counts.at[4,'Average Cells (Dead or Live)']


    try:
        cell_counts.at[0,'Standard Deviation'] = '{:.2e}'.format(cell_counts.at[0,'Standard Deviation'])
        cell_counts.at[4,'Standard Deviation'] = '{:.2e}'.format(cell_counts.at[4,'Standard Deviation'])
    except ValueError:
        cell_counts.at[0,'Standard Deviation'] = cell_counts.at[0,'Standard Deviation']
        cell_counts.at[4,'Standard Deviation'] = cell_counts.at[4,'Standard Deviation']

    try:
        cell_counts.at[0,'Calculated Cell Suspension'] = '{:.2e}'.format(cell_counts.at[0,'Calculated Cell Suspension'])
        cell_counts.at[4,'Calculated Cell Suspension'] = '{:.2e}'.format(cell_counts.at[4,'Calculated Cell Suspension'])
    except ValueError:
        cell_counts.at[0,'Calculated Cell Suspension'] = cell_counts.at[0,'Calculated Cell Suspension']
        cell_counts.at[4,'Calculated Cell Suspension'] = cell_counts.at[4,'Calculated Cell Suspension']

    try:
        cell_counts.at[0,'Cell Recovery'] = '{:.2e}'.format(cell_counts.at[0,'Cell Recovery'])
        cell_counts.at[4,'Cell Recovery'] = '{:.2e}'.format(cell_counts.at[4,'Cell Recovery'])
    except ValueError:
        cell_counts.at[0,'Cell Recovery'] = cell_counts.at[0,'Cell Recovery']
        cell_counts.at[4,'Cell Recovery'] = cell_counts.at[4,'Cell Recovery']

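The format string is correct: when I check it with a print statement the value is formatted properly, and it even works for some files. Here is one of the outputs: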

Here is the output (screenshot).

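The formatting is applied to some columns but not to others. At the top of the screenshot the printed value has the format we want, but after it is stored back in its cell the value is unchanged. I know I use iat in some places and at in others; I am doing the best I can. Is this a case of updating a view or a copy of the DataFrame?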

Here is the output using a different input data file. The formatting behaves as expected there. I also tried if isinstance instead of try/except and got the same result.

Any help would be appreciated. I should also point out that I am clearly no expert at this.

Edit: After trying Serge's suggestion below, I still get the same result:

def format_to_scientific(value):
    try:
        return '{:.2e}'.format(float(value))
    except (ValueError, TypeError):
        return value


for col_index in [1,2,5]:
    for row_index in range(0,8):
        cell_counts.iat[row_index,col_index] = format_to_scientific(cell_counts.iat[row_index,col_index])

target_cells = [
    (0, 'Average Cells (Dead or Live)'),
    (4, 'Average Cells (Dead or Live)'),
    (0, 'Standard Deviation'),
    (4, 'Standard Deviation'),
    (0, 'Calculated Cell Suspension'),
    (4, 'Calculated Cell Suspension'),
    (0, 'Cell Recovery'),
    (4, 'Cell Recovery')
]
for row_index, col_name in target_cells:
    cell_counts.at[row_index, col_name] = format_to_scientific(cell_counts.at[row_index, col_name])

The columns at the start are not formatted, while the columns at the end are.

Checking where it goes wrong:

    for col_index in [1,2,5]:
        for row_index in range(0,8):
            return_value = format_to_scientific(cell_counts.iat[row_index,col_index])
            print(f'Formatted value:{return_value}')
            cell_counts.iat[row_index,col_index] = return_value
            print(cell_counts.iat[row_index,col_index])

The formatting works, but the DataFrame value is still not updated after the assignment.
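
One possible explanation, offered as an assumption rather than something this thread confirms: when a column has dtype float64, some pandas versions may convert an assigned numeric-looking string back to a float, whereas an object column stores the string as-is. That would be consistent with the formatting sticking only for files whose columns mix text and numbers. A minimal sketch of checking the dtypes and casting a column to object before writing strings into it (the frame and its column names are made up for illustration):

```python
import pandas as pd


def format_to_scientific(value):
    """Return value in '1.23e+04' style, or unchanged if it is not numeric."""
    try:
        return '{:.2e}'.format(float(value))
    except (ValueError, TypeError):
        return value


# Hypothetical frame: 'count' is float64, 'note' holds mixed values (object).
demo = pd.DataFrame({'count': [1200.0, 34000.0], 'note': [5.6, 'abc']})
print(demo.dtypes)

# Cast the numeric column to object so each cell can hold a Python string.
demo['count'] = demo['count'].astype(object)

for row_index in range(len(demo)):
    demo.iat[row_index, 0] = format_to_scientific(demo.iat[row_index, 0])

print(demo['count'].tolist())
```

Printing the dtypes of the real cell_counts frame for a file that works and one that does not would show whether this assumption applies.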

Tags: python, pandas, dataframe, format

1 Answer

Why not define a function to do this? Here is an example based on your code. Adapt it to what you want to do.

import numpy as np  # needed for np.nan in the test data further down
import pandas as pd


def format_to_scientific(value):
    try:
        return '{:.2e}'.format(float(value))
    except (ValueError, TypeError):
        return value

for col_index in [2, 5]:  
    for row_index in range(8): 
        cell_counts.iat[row_index, col_index] = format_to_scientific(cell_counts.iat[row_index, col_index])

target_cells = [
    (0, 'Average Cells (Dead or Live)'),
    (4, 'Average Cells (Dead or Live)'),
    (0, 'Standard Deviation'),
    (4, 'Standard Deviation'),
    (0, 'Calculated Cell Suspension'),
    (4, 'Calculated Cell Suspension'),
    (0, 'Cell Recovery'),
    (4, 'Cell Recovery')
]

for row_index, col_name in target_cells:
    cell_counts.at[row_index, col_name] = format_to_scientific(cell_counts.at[row_index, col_name])
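
As a variation on the cell-by-cell loops above, a whole column can be formatted at once with Series.map and assigned back, which replaces the column instead of writing into individual cells. This is only a sketch under assumptions: the small frame below is illustrative, and it formats every row of each listed column rather than selected rows.

```python
import pandas as pd


def format_to_scientific(value):
    """Return value in '1.23e+04' style, or unchanged if it is not numeric."""
    try:
        return '{:.2e}'.format(float(value))
    except (ValueError, TypeError):
        return value


# Illustrative data only; the column names are borrowed from the question.
sample = pd.DataFrame({
    'Column3': [2.3e3, 4.5e4, 6.7e5],
    'Calculated Cell Suspension': [1.23, 'error', 5.67],
})

# Map the function over each column and assign the resulting Series back.
for col_name in ['Column3', 'Calculated Cell Suspension']:
    sample[col_name] = sample[col_name].map(format_to_scientific)

print(sample)
```

Non-numeric entries such as 'error' pass through unchanged because float() raises ValueError and the function returns the original value.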

Here is how to apply it. This is a test DataFrame:

data = {
    'ID': [1, 2, 3, 4, 5, 6, 7, 8],
    'Column2': [1.2e3, 3.4e4, 5.6e5, 7.8e6, 9.0e1, 1.1e2, 1.2e3, 1.3e4],
    'Column3': [2.3e3, 4.5e4, 6.7e5, 8.9e6, 1.0e2, 1.2e2, 1.3e3, 1.4e4],
    'Column5': [np.nan, 3.2, 'abc', 4.5e6, 5.6, 7.8, 'xyz', 9.0],
    'Average Cells (Dead or Live)': [2.1e5, 3.2e5, np.nan, 'text', 4.3e5, 5.4e5, 6.5e5, 7.6e5],
    'Standard Deviation': [1.1e2, 2.2e3, 3.3e4, 4.4e5, 5.5e6, 6.6e7, 7.7e8, 8.8e9],
    'Calculated Cell Suspension': [1.23, 2.34, 3.45, 'error', 5.67, 6.78, 7.89, 8.90],
    'Cell Recovery': [11.1, 22.2, 33.3, 44.4, 55.5, 66.6, 77.7, 88.8]
}

cell_counts = pd.DataFrame(data)

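Then apply the function. The output below also reflects the iat loop over column positions 2 and 5 shown earlier: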

for row_index, col_name in target_cells:
    cell_counts.at[row_index, col_name] = format_to_scientific(cell_counts.at[row_index, col_name])

This returns:

   ID    Column2   Column3    Column5 Average Cells (Dead or Live)  \
0   1     1200.0  2.30e+03        NaN                     2.10e+05   
1   2    34000.0  4.50e+04        3.2                     320000.0   
2   3   560000.0  6.70e+05        abc                          NaN   
3   4  7800000.0  8.90e+06  4500000.0                         text   
4   5       90.0  1.00e+02        5.6                     4.30e+05   
5   6      110.0  1.20e+02        7.8                     540000.0   
6   7     1200.0  1.30e+03        xyz                     650000.0   
7   8    13000.0  1.40e+04        9.0                     760000.0   

  Standard Deviation Calculated Cell Suspension Cell Recovery  
0           1.10e+02                   1.23e+00      1.11e+01  
1           2.20e+03                       2.34          22.2  
2           3.30e+04                       3.45          33.3  
3           4.40e+05                      error          44.4  
4           5.50e+06                   5.67e+00      5.55e+01  
5           6.60e+07                       6.78          66.6  
6           7.70e+08                       7.89          77.7  
7           8.80e+09                        8.9          88.8  
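
If the goal is only to show numbers in scientific notation, another option is to leave the stored values numeric and format them at output time. A minimal sketch using the float_format parameter of DataFrame.to_string; the one-column frame is an assumption for illustration, not the asker's data:

```python
import pandas as pd

# Hypothetical numeric frame; the column name is illustrative only.
report = pd.DataFrame({'Cell Recovery': [11.1, 22.2, 1200.0]})

# float_format takes a one-argument callable applied to each float on output.
text = report.to_string(float_format='{:.2e}'.format)
print(text)

# The stored values are untouched and the column keeps its float64 dtype.
print(report['Cell Recovery'].dtype)
```

This keeps the column usable for later arithmetic, which string-formatted cells are not.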
