How to find rows in a DataFrame based on other rows and other DataFrames

Question (votes: 2, answers: 1)

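Following the question I asked here, I ended up with a JSON response that looks similar to this:

(Note: the ids in my sample data below are numeric strings, but some real ids are alphanumeric.)

data =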

{
  "state": "active",
  "team_size": 20,
  "teams": {
    "id": "12345679",
    "name": "Good Guys",
    "level": 10,
    "attacks": 4,
    "destruction_percentage": 22.6,
    "members": [
      {
        "id": "1",
        "name": "John",
        "level": 12
      },
      {
        "id": "2",
        "name": "Tom",
        "level": 11,
        "attacks": [
          {
            "attackerTag": "2",
            "defenderTag": "4",
            "damage": 64,
            "order": 7
          }
        ]
      }
    ]
  },
  "opponent": {
    "id": "987654321",
    "name": "Bad Guys",
    "level": 17,
    "attacks": 5,
    "damage": 20.95,
    "members": [
      {
        "id": "3",
        "name": "Betty",
        "level": 17,
        "attacks": [
          {
            "attacker_id": "3",
            "defender_id": "1",
            "damage": 70,
            "order": 1
          },
          {
            "attacker_id": "3",
            "defender_id": "7",
            "damage": 100,
            "order": 11
          }
        ],
        "opponentAttacks": 0,
        "some_useless_data": "Want to ignore, this doesn't show in every record"
      },
      {
        "id": "4",
        "name": "Fred",
        "level": 9,
        "attacks": [
          {
            "attacker_id": "4",
            "defender_id": "9",
            "damage": 70,
            "order": 4
          }
        ],
        "opponentAttacks": 0
      }
    ]
  }
}

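I loaded this using:

from pandas import json_normalize  # top-level import needs pandas >= 1.0

# The sample data above uses the key 'teams' (not 'team').
df = json_normalize([data['teams'], data['opponent']],
                     'members',
                     ['id', 'name'],
                     meta_prefix='team.',
                     errors='ignore')
print(df.iloc[3])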
attacks              [{'damage': 70, 'order': 4, 'defender_id': '9'...
id                                                                   4
level                                                                9
name                                                              Fred
opponentAttacks                                                      0
some_useless_data                                                  NaN
team.name                                                     Bad Guys
team.id                                                      987654321
Name: 3, dtype: object

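My question is essentially in three parts.

  1. How do I get a row like the one above using a member's id? I have tried:

     member = df[df['id']=="1"].iloc[0]
     #Now this works, but am I correctly doing this?
     #It just feels weird is all.

  2. How would I retrieve a member's defenses, given that only attacks are recorded and not defenses (even though defender_id is given)? I have tried:

     df.where(df['tag']==df['attacks'].str.get('defender_id'), df['attacks'], axis=0)
     #This is totally not working.. Where am I going wrong?

  3. Since I am fetching new data from an API, I need to check it against the old data in my database to see whether there are any new attacks. I can then loop through the new attacks and display the attack information to the user. This one I really cannot figure out. I have tried looking into this question and this one as well, though I am not sure how close either is to what I need, and I am still having trouble wrapping my brain around the concept. Essentially my logic is as follows:

     def get_new_attacks(old_data, new_data):
         '''
         params
             old_data: Dataframe loaded from JSON in database
             new_data: Dataframe loaded from JSON API response
                       hopefully having new attacks
         returns:
             iterator over the new attacks
         '''
         #calculate a dataframe with new attacks listed
         return df.iterrows()

I know the function above shows little effort beyond the docstring I wrote (it is mostly there to show the input/output I need), but trust me, this is the part I have been racking my brain over the most. I have looked into merging all the attacks and then calling reset_index(), but that just raises an error because attacks is a list. The map() function in the second question I linked above has me stumped.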

python pandas python-3.7
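1 answer
1 vote

Addressing your questions in order (code below):

  1. It looks like id is a unique index for the data, so you can use df.set_index('id'), which lets you access a player's data by id via df.loc['1'], for example.
  2. As far as I understand your data, all the dictionaries listed in each attacks entry are self-contained, in the sense that the corresponding player id is not needed (attacker_id and defender_id seem to be enough to identify the data). So instead of dealing with rows that contain lists, I recommend moving that data out into its own data frame, which makes it easy to access.
  3. Once you store attacks in its own data frame, you can simply compare the indices to filter out the old data.

Here is some example code to illustrate the points: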

# Question 1.
df.set_index('id', inplace=True)
print(df.loc['1'])  # For example player id 1.

# Question 2 & 3.
attacks = pd.concat(map(
    lambda x: pd.DataFrame.from_dict(x).set_index('order'),  # Is 'order' the right index?
    df['attacks'].dropna()
))

# Question 2.
print(attacks[attacks['defender_id'] == '1'])  # For example defender_id 1.

# Question 3.
old_attacks = attacks.iloc[:2]  # For example.
new_attacks = attacks[~attacks.index.isin(old_attacks.index)]
print(new_attacks)
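
For reference, here is a minimal, self-contained sketch of how points 2 and 3 might be wrapped into the get_new_attacks function from the question. It is only an illustration under a few assumptions: every attack dict uses the keys attacker_id, defender_id, damage and order, the value of order uniquely identifies an attack, and flatten_attacks is a hypothetical helper name that does not appear in the original posts. The sample frames are made up to mirror the shape of the question's data.

```python
import pandas as pd

# Assumed keys of every attack dict (hypothetical, based on the sample data).
ATTACK_COLUMNS = ["attacker_id", "defender_id", "damage", "order"]


def flatten_attacks(members):
    """Return one row per attack from a frame whose 'attacks' column holds lists of dicts."""
    records = [
        attack
        for attack_list in members["attacks"].dropna()  # skip members without attacks
        for attack in attack_list
    ]
    return pd.DataFrame(records, columns=ATTACK_COLUMNS)


def get_new_attacks(old_data, new_data):
    """Return the attacks in new_data whose 'order' value is absent from old_data."""
    old_attacks = flatten_attacks(old_data)
    new_attacks = flatten_attacks(new_data)
    # Assumption: 'order' uniquely identifies an attack.
    is_new = ~new_attacks["order"].isin(old_attacks["order"])
    return new_attacks[is_new].sort_values("order")


# Made-up sample frames shaped like the members data in the question.
old_data = pd.DataFrame({
    "id": ["3", "4"],
    "attacks": [
        [{"attacker_id": "3", "defender_id": "1", "damage": 70, "order": 1}],
        None,  # member 4 had not attacked yet
    ],
})
new_data = pd.DataFrame({
    "id": ["3", "4"],
    "attacks": [
        [{"attacker_id": "3", "defender_id": "1", "damage": 70, "order": 1},
         {"attacker_id": "3", "defender_id": "7", "damage": 100, "order": 11}],
        [{"attacker_id": "4", "defender_id": "9", "damage": 70, "order": 4}],
    ],
})

# Question 3: iterate over the attacks that were not in the old data.
new_attacks = get_new_attacks(old_data, new_data)
for attack in new_attacks.itertuples(index=False):
    print(attack.attacker_id, "attacked", attack.defender_id, "for", attack.damage)

# Question 2: defenses of a member are the attacks where they are the defender.
all_attacks = flatten_attacks(new_data)
defenses_of_1 = all_attacks[all_attacks["defender_id"] == "1"]
print(defenses_of_1)
```

Returning a DataFrame rather than df.iterrows() leaves the choice of iteration to the caller; itertuples() is used here because it gives attribute access to each field. If order turns out not to be unique, the comparison could be done on a combination of columns instead.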