Consider this simplified df:
import pandas as pd

data = {
    'Name_Type': ["Primary", "Primary", "AKA", "Primary"],
    'Name': ["John", "Daniel", "Dan", "Bob"],
    'Surname': ["Green", "Brown", "Brown", "White"],
    'Country Type': ["Origin", "Origin", None, "Origin"],
    'Country': ["UK", "UK", None, "UK"],
    'Other': ["Info", None, None, "Info"]
}
df = pd.DataFrame(data)
   Name_Type    Name  Surname  Country Type  Country  Other
0    Primary    John    Green        Origin       UK   Info
1    Primary  Daniel    Brown        Origin       UK   None
2        AKA     Dan    Brown          None     None   None
3    Primary     Bob    White        Origin       UK   Info
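I want to add new values under every row whose Country Type is Origin rather than None. If the row directly below already exists with None in those columns (like row 2 in the example), I want to set its Country Type to "Citizenship" and its Country to "UK". If there is no such row, I want to create a new row under the current one and add the same values. So the final output would look like this: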
   Name_Type    Name  Surname  Country Type  Country  Other
0    Primary    John    Green        Origin       UK   Info
1       None    None     None   Citizenship       UK   None
2    Primary  Daniel    Brown        Origin       UK   None
3        AKA  Daniel    Brown   Citizenship       UK   None
4    Primary     Bob    White        Origin       UK   Info
5       None    None     None   Citizenship       UK   None
One possible approach is the following:
import pandas as pd

data = {
    'Name_Type': ["Primary", "Primary", "AKA", "Primary"],
    'Name': ["John", "Daniel", "Dan", "Bob"],
    'Surname': ["Green", "Brown", "Brown", "White"],
    'Country Type': ["Origin", "Origin", None, "Origin"],
    'Country': ["UK", "UK", None, "UK"],
    'Other': ["Info", None, None, "Info"]
}
df = pd.DataFrame(data)

new_rows = []
for i in range(len(df)):
    row = df.iloc[i]
    if row['Country Type'] == 'Origin':
        # Look further down for an AKA row with the same Name and Surname
        # whose Country Type is still empty.
        existing_aka = None
        for j in range(i + 1, len(df)):
            candidate = df.iloc[j]
            if (candidate['Name_Type'] == 'AKA'
                    and candidate['Name'] == row['Name']
                    and candidate['Surname'] == row['Surname']
                    and candidate['Country Type'] is None):
                existing_aka = j
                break
        if existing_aka is not None:
            # Fill in the existing row.
            df.at[existing_aka, 'Country Type'] = 'Citizenship'
            df.at[existing_aka, 'Country'] = 'UK'
        else:
            # Remember where a new row has to be inserted (right below row i).
            new_row = {'Name_Type': None, 'Name': None, 'Surname': None,
                       'Country Type': 'Citizenship', 'Country': 'UK', 'Other': None}
            new_rows.append((i + 1, new_row))

# Insert from the bottom up so the earlier insert positions stay valid.
for index, new_row in reversed(new_rows):
    df = pd.concat([df.iloc[:index], pd.DataFrame([new_row]), df.iloc[index:]]).reset_index(drop=True)

print(df)
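This prints the output below. It is close to the expected output but not identical: the search for an existing AKA row compares Name, and "Dan" does not equal "Daniel", so the AKA row is left unfilled and a new Citizenship row is inserted under Daniel instead: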
   Name_Type    Name  Surname  Country Type  Country  Other
0    Primary    John    Green        Origin       UK   Info
1       None    None     None   Citizenship       UK   None
2    Primary  Daniel    Brown        Origin       UK   None
3       None    None     None   Citizenship       UK   None
4        AKA     Dan    Brown          None     None   None
5    Primary     Bob    White        Origin       UK   Info
6       None    None     None   Citizenship       UK   None
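The loop above matches AKA rows by Name and Surname. If the intent is instead to reuse whichever row sits directly below an Origin row when its Country Type is empty, regardless of the name, a minimal sketch could look like this. The helper name add_citizenship is hypothetical, the hard-coded "UK" follows the question's wording, and "directly below" is my reading of the requirement rather than something the question states precisely.

```python
import pandas as pd

data = {
    "Name_Type": ["Primary", "Primary", "AKA", "Primary"],
    "Name": ["John", "Daniel", "Dan", "Bob"],
    "Surname": ["Green", "Brown", "Brown", "White"],
    "Country Type": ["Origin", "Origin", None, "Origin"],
    "Country": ["UK", "UK", None, "UK"],
    "Other": ["Info", None, None, "Info"],
}
df = pd.DataFrame(data)


def add_citizenship(frame):
    """Hypothetical helper: pair every 'Origin' row with a 'Citizenship' row."""
    records = frame.to_dict("records")
    citizenship = {"Country Type": "Citizenship", "Country": "UK"}  # assumed values
    out = []
    i = 0
    while i < len(records):
        row = records[i]
        out.append(row)
        if row["Country Type"] == "Origin":
            nxt = records[i + 1] if i + 1 < len(records) else None
            if nxt is not None and pd.isna(nxt["Country Type"]):
                # The row directly below has no Country Type: fill it in
                # and skip it so it is not appended a second time.
                out.append({**nxt, **citizenship})
                i += 1
            else:
                # No reusable row below: add an otherwise empty row.
                out.append({**dict.fromkeys(frame.columns), **citizenship})
        i += 1
    return pd.DataFrame(out, columns=frame.columns)


result = add_citizenship(df)
print(result)
```

On the sample data this should give six rows, with the AKA row filled in rather than duplicated. It keeps the AKA row's Name as "Dan"; the "Daniel" shown in the expected output is not reproduced, since nothing in the question says how that value would be derived.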