Suppose I have a DataFrame:
import pandas as pd
df = pd.DataFrame({"A1": [10, 20, 15, 30, 45],
                   "B1": [13, 23, 18, 33, 48],
                   "C1": [17, 27, 22, 37, 52],
                   "A2": [10, 20, 15, 30, 45],
                   "B2": [13, 23, 18, 33, 48],
                   "C2": [17, 27, 22, 37, 52]})
col1_names = ['A1', 'B1', 'C1']
col2_names = ['A2', 'B2', 'C2']
col_new = ['delA', 'delB', 'delC']
I want to add three new columns to df whose values are the differences between the col2_names columns and the corresponding col1_names columns.
for i in range(len(col1_names)):
    df[col_new[i]] = df[col2_names[i]] - df[col1_names[i]]
Is there a way to vectorize this and avoid the loop?
I tried this:
df[col_new] = df[col2_names] - df[col1_names]
I expected the same result as the loop above, but I get:
ValueError: Columns must be same length as key
Bonus: can this be generalized to other operations?
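When you perform operations between DataFrames, pandas aligns them on their labels: row labels are matched with row labels, and column labels with column labels. col1_names and col2_names have no labels in common, so every value in the result is NA. The result also carries the union of both sets of columns (six in total), which is why assigning it to the three names in col_new raises the error.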
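To see the alignment at work, a small check can inspect the result of the failed subtraction before it is assigned. This is only a sketch reusing the question's data; the variable name diff is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A1": [10, 20, 15, 30, 45],
                   "B1": [13, 23, 18, 33, 48],
                   "C1": [17, 27, 22, 37, 52],
                   "A2": [10, 20, 15, 30, 45],
                   "B2": [13, 23, 18, 33, 48],
                   "C2": [17, 27, 22, 37, 52]})
col1_names = ['A1', 'B1', 'C1']
col2_names = ['A2', 'B2', 'C2']

# Subtraction aligns on column labels. No label appears on both sides,
# so the result holds the union of the labels and nothing to subtract.
diff = df[col2_names] - df[col1_names]

print(diff.shape)               # (5, 6): six columns, not three
print(sorted(diff.columns))     # labels from both operands
print(diff.isna().all().all())  # True: every cell is NaN
```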
Try this:
df[col_new] = df[col2_names].values - df[col1_names].values
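On the bonus question: once both sides are plain NumPy arrays, labels no longer matter and columns are paired purely by position, so the same pattern should carry over to other elementwise operations. The following is a minimal sketch, assuming the two name lists are in matching order. The names ratio_new and max_new are hypothetical, added only for illustration, and to_numpy() is used in place of .values because the pandas documentation recommends it.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A1": [10, 20, 15, 30, 45],
                   "B1": [13, 23, 18, 33, 48],
                   "C1": [17, 27, 22, 37, 52],
                   "A2": [10, 20, 15, 30, 45],
                   "B2": [13, 23, 18, 33, 48],
                   "C2": [17, 27, 22, 37, 52]})
col1_names = ['A1', 'B1', 'C1']
col2_names = ['A2', 'B2', 'C2']
col_new = ['delA', 'delB', 'delC']
ratio_new = ['ratioA', 'ratioB', 'ratioC']  # hypothetical names
max_new = ['maxA', 'maxB', 'maxC']          # hypothetical names

# Extract both blocks once; from here on, pairing is by position only.
left = df[col2_names].to_numpy()
right = df[col1_names].to_numpy()

df[col_new] = left - right                # difference, as in the question
df[ratio_new] = left / right              # elementwise ratio
df[max_new] = np.maximum(left, right)     # any NumPy ufunc works the same way

print(df[col_new + ratio_new + max_new])
```

Because pairing is positional, a reordered or mismatched name list would silently combine the wrong columns, so it is worth checking that col1_names and col2_names line up before relying on this.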