DataFrame: if a value falls within a range, get the sum of that range's bounds


I'm a bit lost here and not sure how to proceed. Basically, I have two DataFrames that are read from CSV files.

import pandas as pd

data = {'A': [0,11,21,31,41,51,61],
        'B': [10,20,30,40,50,60,70]}

data2 = {'Point': [11.5, 18.3, 31.3, 41.2, 51.5, 66.6, 34.7, 12.1, 14.4, 56.8, 54.3]}

df = pd.DataFrame(data)
df2 = pd.DataFrame(data2)

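What I want to do is check whether each point in df2 falls within the range given by columns A and B of df and, if so, return (A+B), which should be added as another column next to that point. Taking the first point, 11.5, as an example: it falls between 11 and 20, so I should get 11+20 = 31 and put it in a new column for that value.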

So the output would look like this:

Point   :    Returned_Data
11.5             31
18.3             31
31.3             71
and so on

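The problem I'm running into is merging or combining the ranges, or two DataFrames with different columns and row counts. I know how to use np.where to match values, but I don't know how to apply it to the case above. I also tried using bins, but that gives me the ranges rather than the values.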

bins = [0,11,21,31,41,51,61]
df['Returned_Data'] = pd.cut(x=df['B'], bins=bins)
    A   B Returned_Data
0   0  10       (0, 11]
1  11  20      (11, 21]
2  21  30      (21, 31]

Any help would be greatly appreciated. Thank you.

python pandas dataframe numpy
1 Answer
In [5]: interval_ranges = [df['A'].iloc[0]] + df['B'].tolist()

In [6]: df.assign(interval=pd.cut(df['B'], interval_ranges))
Out[6]:
    A   B  interval
0   0  10   (0, 10]
1  11  20  (10, 20]
2  21  30  (20, 30]
3  31  40  (30, 40]
4  41  50  (40, 50]
5  51  60  (50, 60]
6  61  70  (60, 70]

In [7]: df2.assign(interval=pd.cut(df2['Point'], interval_ranges))
Out[7]:
    Point  interval
0    11.5  (10, 20]
1    18.3  (10, 20]
2    31.3  (30, 40]
3    41.2  (40, 50]
4    51.5  (50, 60]
5    66.6  (60, 70]
6    34.7  (30, 40]
7    12.1  (10, 20]
8    14.4  (10, 20]
9    56.8  (50, 60]
10   54.3  (50, 60]
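
The binning step above returns interval labels. As a minimal sketch, assuming the same `df` and `df2` as in the question, `pd.cut` can also be given a `labels=` list so that each bin maps straight to its `A + B` sum, with no merge needed. The names `edges`, `range_sums` and `df2_with_sum` are only illustrative. Like the session above, this uses bin edges 0, 10, 20, ..., so a point in a gap such as 10.5 would still be assigned to the range starting at 11.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 11, 21, 31, 41, 51, 61],
                   'B': [10, 20, 30, 40, 50, 60, 70]})
df2 = pd.DataFrame({'Point': [11.5, 18.3, 31.3, 41.2, 51.5, 66.6,
                              34.7, 12.1, 14.4, 56.8, 54.3]})

# Bin edges: the first A followed by every B -> [0, 10, 20, ..., 70]
edges = [df['A'].iloc[0]] + df['B'].tolist()

# One label per bin: the A + B sum of the corresponding row in df
range_sums = (df['A'] + df['B']).tolist()

# pd.cut returns a categorical whose categories are the sums;
# casting to float turns it into an ordinary numeric column
labelled = pd.cut(df2['Point'], bins=edges, labels=range_sums)
df2_with_sum = df2.assign(Returned_Data=labelled.astype(float))

print(df2_with_sum)
```

The labels have to be unique for this to work, which holds here because every row of `df` has a different sum.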

In [8]: df2.assign(interval=pd.cut(df2['Point'], interval_ranges)).merge(df.assign(interval=pd.cut(df['B'], interval_ranges)))
Out[8]:
    Point  interval   A   B
0    11.5  (10, 20]  11  20
1    18.3  (10, 20]  11  20
2    31.3  (30, 40]  31  40
3    41.2  (40, 50]  41  50
4    51.5  (50, 60]  51  60
5    66.6  (60, 70]  61  70
6    34.7  (30, 40]  31  40
7    12.1  (10, 20]  11  20
8    14.4  (10, 20]  11  20
9    56.8  (50, 60]  51  60
10   54.3  (50, 60]  51  60

In [9]: df2.assign(interval=pd.cut(df2['Point'], interval_ranges)).merge(df.assign(interval=pd.cut(df['B'], interval_ranges))).assign(Returned_Data=lambda x: x['A'] + x['B'])
Out[9]:
    Point  interval   A   B  Returned_Data
0    11.5  (10, 20]  11  20             31
1    18.3  (10, 20]  11  20             31
2    31.3  (30, 40]  31  40             71
3    41.2  (40, 50]  41  50             91
4    51.5  (50, 60]  51  60            111
5    66.6  (60, 70]  61  70            131
6    34.7  (30, 40]  31  40             71
7    12.1  (10, 20]  11  20             31
8    14.4  (10, 20]  11  20             31
9    56.8  (50, 60]  51  60            111
10   54.3  (50, 60]  51  60            111
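
The session above bins on the B values only, so the gaps between one row's B and the next row's A (for example 10 to 11) are treated as part of the next range. If those gaps should not match anything, one possible alternative is to build the ranges as closed intervals [A, B] with `pd.IntervalIndex` and look each point up with `get_indexer`. This is a sketch under the assumption that the ranges do not overlap (otherwise `get_indexer` raises an error); the helper name `sum_of_matching_range` is hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0, 11, 21, 31, 41, 51, 61],
                   'B': [10, 20, 30, 40, 50, 60, 70]})
df2 = pd.DataFrame({'Point': [11.5, 18.3, 31.3, 41.2, 51.5, 66.6,
                              34.7, 12.1, 14.4, 56.8, 54.3]})


def sum_of_matching_range(ranges: pd.DataFrame, points: pd.Series) -> pd.Series:
    """Return A + B of the closed range [A, B] containing each point, NaN if none."""
    # Closed on both sides so that A and B themselves count as inside the range
    intervals = pd.IntervalIndex.from_arrays(
        ranges['A'].astype(float), ranges['B'].astype(float), closed='both')
    # Position of the matching interval for every point, -1 where nothing matches
    pos = intervals.get_indexer(points.to_numpy(dtype=float))
    sums = (ranges['A'] + ranges['B']).to_numpy()
    # sums[pos] is only kept where pos is valid; unmatched points become NaN
    values = np.where(pos >= 0, sums[pos], np.nan)
    return pd.Series(values, index=points.index, name='Returned_Data')


df2['Returned_Data'] = sum_of_matching_range(df, df2['Point'])
print(df2)
```

With this version a point such as 10.5 gets NaN instead of being attached to the 11 to 20 range, which may or may not be what is wanted for the real CSV data.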