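Following a question I asked here, I got a JSON response that looks similar to this:
(Note: the id values in my sample data below are numeric strings, but some real ids are alphanumeric.)
data =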
{
"state": "active",
"team_size": 20,
"team": {
"id": "12345679",
"name": "Good Guys",
"level": 10,
"attacks": 4,
"destruction_percentage": 22.6,
"members": [
{
"id": "1",
"name": "John",
"level": 12
},
{
"id": "2",
"name": "Tom",
"level": 11,
"attacks": [
{
"attacker_id": "2",
"defender_id": "4",
"damage": 64,
"order": 7
}
]
}
]
},
"opponent": {
"id": "987654321",
"name": "Bad Guys",
"level": 17,
"attacks": 5,
"damage": 20.95,
"members": [
{
"id": "3",
"name": "Betty",
"level": 17,
"attacks": [
{
"attacker_id": "3",
"defender_id": "1",
"damage": 70,
"order": 1
},
{
"attacker_id": "3",
"defender_id": "7",
"damage": 100,
"order": 11
}
],
"opponentAttacks": 0,
"some_useless_data": "Want to ignore, this doesn't show in every record"
},
{
"id": "4",
"name": "Fred",
"level": 9,
"attacks": [
{
"attacker_id": "4",
"defender_id": "9",
"damage": 70,
"order": 4
}
],
"opponentAttacks": 0
}
]
}
}
I load this using:
import pandas as pd

df = pd.json_normalize([data['team'], data['opponent']],
                       'members',
                       ['id', 'name'],
                       meta_prefix='team.',
                       errors='ignore')
print(df.iloc[3])
attacks [{'damage': 70, 'order': 4, 'defender_id': '9'...
id 4
level 9
name Fred
opponentAttacks 0
some_useless_data NaN
team.name Bad Guys
team.id 987654321
Name: 3, dtype: object
I essentially have a 3-part question.
member = df[df['id']=="1"].iloc[0]
#Now this works, but am I correctly doing this?
#It just feels weird is all.
df.where(df['tag']==df['attacks'].str.get('defender_id'), df['attacks'], axis=0)
#This is totally not working.. Where am I going wrong?
def get_new_attacks(old_data, new_data):
    '''params
    old_data: DataFrame loaded from JSON in database
    new_data: DataFrame loaded from JSON API response,
              hopefully having new attacks
    returns:
    iterator over the new attacks
    '''
    # calculate a DataFrame with the new attacks listed
    # (df is a placeholder for that result; this part is not implemented yet)
    return df.iterrows()
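I know the function above shows almost no effort beyond the docstring I gave (it is mainly there to show the input/output I need), but believe me, this is the part I have been racking my brain over the most. I have been looking into merging all the attacks and then calling reset_index(), but that just raises an error because attacks is a list. The map() function in the second question I linked above has me stumped.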
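Addressing your questions in order (code below):
Question 1: id looks like a unique index of the data, so you can use df.set_index('id'), which lets you access a row by player id, for example via df.loc['1'].
Question 2: as far as I understand your data, all the dictionaries listed under attacks are self-contained, in the sense that the corresponding player id is not needed (attacker_id or defender_id seems to be enough to identify the data). So instead of dealing with rows that contain lists, I suggest moving that data out into its own data frame, which makes it easy to access.
Question 3: once attacks is stored in its own data frame, you can simply compare indices to filter out the old data.
Here is some example code to illustrate the points: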
import pandas as pd

# Question 1.
df.set_index('id', inplace=True)
print(df.loc['1'])  # For example player id 1.
# Question 2 & 3.
# Build one frame from every attack list; members without attacks (NaN) are skipped.
attacks = pd.concat([
    pd.DataFrame(x).set_index('order')  # Is 'order' the right index?
    for x in df['attacks'].dropna()
])
# Question 2.
print(attacks[attacks['defender_id'] == '1']) # For example defender_id 1.
# Question 3.
old_attacks = attacks.iloc[:2] # For example.
new_attacks = attacks[~attacks.index.isin(old_attacks.index)]
print(new_attacks)
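To tie this back to the get_new_attacks stub from the question, here is a minimal, self-contained sketch. It rests on a few assumptions that are mine, not yours: the function takes the two parsed JSON dicts rather than already-normalized DataFrames, 'order' uniquely identifies an attack within a war, and the helper name flatten_attacks and the tiny old/new samples are made up for illustration.

```python
import pandas as pd

ATTACK_COLUMNS = ["attacker_id", "defender_id", "damage", "order"]


def flatten_attacks(data):
    """Collect every attack dict from both sides into one DataFrame.

    Hypothetical helper; assumes 'order' is unique per attack.
    """
    rows = [
        attack
        for side in ("team", "opponent")
        for member in data[side]["members"]
        # Members without an 'attacks' key contribute no rows.
        for attack in member.get("attacks", [])
    ]
    # Passing the columns explicitly keeps set_index working on empty input.
    return pd.DataFrame(rows, columns=ATTACK_COLUMNS).set_index("order")


def get_new_attacks(old_data, new_data):
    """Return an iterator of (order, row) pairs found only in new_data."""
    old_attacks = flatten_attacks(old_data)
    new_attacks = flatten_attacks(new_data)
    fresh = new_attacks[~new_attacks.index.isin(old_attacks.index)]
    return fresh.iterrows()


# Made-up sample: the new response contains one extra attack (order 11).
old = {
    "team": {"members": [{"id": "1", "name": "John"}]},
    "opponent": {"members": [{"id": "3", "name": "Betty", "attacks": [
        {"attacker_id": "3", "defender_id": "1", "damage": 70, "order": 1},
    ]}]},
}
new = {
    "team": {"members": [{"id": "1", "name": "John"}]},
    "opponent": {"members": [{"id": "3", "name": "Betty", "attacks": [
        {"attacker_id": "3", "defender_id": "1", "damage": 70, "order": 1},
        {"attacker_id": "3", "defender_id": "7", "damage": 100, "order": 11},
    ]}]},
}

for order, attack in get_new_attacks(old, new):
    print(order, attack["attacker_id"], attack["defender_id"], attack["damage"])
```

If 'order' turns out not to be unique in your real data, the same idea should still work with a MultiIndex on, say, attacker_id and defender_id, but I have not checked that against the real API.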