Pandas - Insert columns at specific character index

Question

I have a df like the one below:

Account_num First_Name  Last_Name   Zipcode  Amount
AAA111      AAA         BBB         12345    784.23
AAA112      AAB         BBA         44546    2145.32
AAA113      AAC         BBC         75452    6563.24
AAA114      AAD         BBD         45484    9532.21

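I need to format this DataFrame so that each column sits at a specific position counted from the start of the line. For example:

I need account_num to start at character 5 rather than character 1 of the line. Since account_nums are always 6 characters long, I need first_name to start at the 13th character of the line (counting spaces), and so on.

I have an example in SAS that I want to rewrite with pandas. How can I do this?

input @5(acct_num) @13(first_name)($char32.-l)@20(last_Name) ($char32.-l)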

1234567891112131415161718192021222324252627282930
AAA111 AAA BBB 12345 784.23
AAA112 AAB BBA 44546 2145.32
AAA113 AAC BBC 75452 6563.24
AAA114 AAD BBD 45484 9532.21

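In the above, the first line shows the character count. In my output, the account_num column needs to start at character 5, First_Name at 13, last_Name at 20, and so on.

I tried this, but it did not work as expected: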

def format_columns(df):
    # Define column widths and positions
    widths = [6, 10, 20, 8, 10]
    positions = [5, 13, 33, 53, 63]
    for col, width, pos in zip(df.columns, widths, positions):
        df[col] = df[col].astype(str).apply(lambda x: x.ljust(width))
        df[col] = df[col].apply(lambda x: ' ' * (pos - len(x)) + x if len(x) < pos else x)
    return df

# Format DataFrame columns
formatted_df = format_columns(df)
print(formatted_df)

Can someone help me understand what is wrong with my code?

Answer 1

The pandas.DataFrame.to_string() method may help.

https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_string.html

Roughly:

df.to_string('export_path', 
             col_space = [11, 6, <etc> ],
             justify = 'right',
             index = False,
             )
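To make the call above concrete, here is a minimal runnable sketch. The sample frame mirrors the question's data. The widths in `col_widths` are hypothetical placeholders, not values taken from the answer, and the result is returned as a string instead of being written to a file. As far as I know, `col_space` sets a minimum width per column and `justify` controls how the column labels are aligned, so this approach may not reproduce exact left-aligned start positions.

```python
import pandas as pd

df = pd.DataFrame({
    "Account_num": ["AAA111", "AAA112", "AAA113", "AAA114"],
    "First_Name": ["AAA", "AAB", "AAC", "AAD"],
    "Last_Name": ["BBB", "BBA", "BBC", "BBD"],
    "Zipcode": [12345, 44546, 75452, 45484],
    "Amount": [784.23, 2145.32, 6563.24, 9532.21],
})

# Hypothetical minimum widths, one entry per column (assumption, tune as needed)
col_widths = [12, 12, 12, 8, 10]

# With no buffer argument, to_string returns the rendered text
text = df.to_string(col_space=col_widths, justify="right", index=False)
print(text)
```

Passing a file path as the first argument, as in the answer, would write the same text to that file instead of returning it.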

Answer 2

You can use ljust and apply it across the DataFrame row by row. Change the values in my just_list if you need to. Try this code:

just_list = [ 8, 20, 23, 11, 12 ]

def char_just(x):
    t = " "*4
    
    for e,j in enumerate(just_list):
        t += x.iloc[e].ljust(j)
    
    return t

df = df.astype(str)
df.apply(char_just, axis=1).to_csv('text.txt', index=False, header=False)
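A variation on the same idea, sketched under some assumptions: instead of choosing widths by hand, the line is built from 1-based start positions like the `@5`, `@13`, `@20` pointers in the SAS statement. The helper name `to_fixed_width` and the last two start positions (30 and 40) are hypothetical, since the question only gives the first three. This may also explain the problem with the original attempt: padding each cell to `pos` measures from the start of that cell rather than from the start of the line, and printing a DataFrame applies its own column alignment on top.

```python
import pandas as pd

def to_fixed_width(df, starts):
    """Render each row as one line, placing column i at 1-based position starts[i]."""
    if len(starts) != len(df.columns):
        raise ValueError("need one start position per column")
    lines = []
    for row in df.astype(str).itertuples(index=False):
        line = ""
        for value, start in zip(row, starts):
            # Pad the line with spaces up to the column start (1-based -> 0-based).
            # If the line is already longer, the value is appended with no padding.
            line = line.ljust(start - 1) + value
        lines.append(line)
    return lines

df = pd.DataFrame({
    "Account_num": ["AAA111", "AAA112", "AAA113", "AAA114"],
    "First_Name": ["AAA", "AAB", "AAC", "AAD"],
    "Last_Name": ["BBB", "BBA", "BBC", "BBD"],
    "Zipcode": [12345, 44546, 75452, 45484],
    "Amount": [784.23, 2145.32, 6563.24, 9532.21],
})

# 5, 13, 20 come from the question; 30 and 40 are assumed for illustration
lines = to_fixed_width(df, [5, 13, 20, 30, 40])
print("\n".join(lines))
```

Building plain strings and writing them out yourself sidesteps any quoting that a CSV writer might apply to the padded lines.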
