Create a new column from values I flagged with a conditional


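I created a conditional statement that compares IDs within a time window and flags them based on their vector length. The following code creates a column "intera":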

import numpy as np
import pandas as pd

time = np.array([1,1,1,1,2,2,2,2,2,2,3,3,3,3,3])
ids = np.array([3271,3229,4228,2778,4228,3271,3229,3229,4228,2778,4228,3271,4228,3229,3271])
# the first vec_len value was missing in the post; 0.9 is taken from the first row of the table below
vec_len = np.array([0.9,0.1,0.5,-0.0,0.0,0.1,-0.7,-0.3,-0.8,-0.6,0.2,0.1,-0.7,-0.3,-0.8])
quad = np.array([7,0,0,5,0,6,5,2,5,5,0,6,5,2,5])

df = pd.DataFrame({'time': time, 'id': ids, 'vec_len': vec_len, 'quadrant': quad})

# 25th and 75th percentile of vec_len for every (time, id) group
grp = df.groupby(['time', 'id'], as_index=False)
quant_25 = grp.vec_len.quantile(.25).rename(columns={'vec_len': 'quantile_25'})
quant_75 = grp.vec_len.quantile(.75).rename(columns={'vec_len': 'quantile_75'})

quantiles = pd.merge(quant_25, quant_75, on=['time', 'id'])
df = df.merge(quantiles, on=['time', 'id'])

# flag rows whose vec_len lies below the 25th or above the 75th percentile of their group
df['intera'] = ((df['vec_len'] < df['quantile_25']) | (df['vec_len'] > df['quantile_75'])).astype(int)

This creates the interaction column:

time   id     vec_len   quadrant   interaction
1      3271    0.9      7          0
1      3229    0.1      0          0
1      4228    0.5      0          0
1      2778   -0.3      5          0
2      4228    0.2      0          0
2      3271    0.1      6          0
2      3229   -0.7      5          1
2      3229   -0.3      2          0
2      4228   -0.8      5          1
2      2778   -0.6      5          1
3      4228    0.2      0          0
3      3271    0.1      6          0
3      4228   -0.7      5          1
3      3229   -0.3      2          0
3      3271   -0.8      5          1

How can I create another column that shows which pairs or groups of IDs are assigned a 1 together in the "intera" column?

Desired column ("Paired with"):

time   id     vec_len   quadrant   interaction   Paired with
1      3271    0.9      7          0
1      3229    0.1      0          0
1      4228    0.5      0          0
1      2778   -0.3      5          0
2      4228    0.2      0          0
2      3271    0.1      6          0
2      3229   -0.7      5          1             [2778, 4228]
2      3229   -0.3      2          0
2      4228   -0.8      5          1             [2778, 3229]
2      2778   -0.6      5          1             [4228, 3229]
3      4228    0.2      0          0
3      3271    0.1      6          0
3      4228   -0.7      5          1             [3271]
3      3229   -0.3      2          0
3      3271   -0.8      5          1             [4228]
pandas for-loop if-statement conditional-statements
1 Answer

You can use:

# 1 for rows that are flagged themselves and whose id is flagged more than once overall
df['new_column'] = (
    (df.groupby('id')['interaction'].transform('sum') > 1) & df['interaction'].ne(0)
).astype(int)
print(df)
    time    id  vec_len  quadrant  interaction  new_column
0      1  3271      0.9         7            0           0
1      1  3229      0.1         0            0           0
2      1  4228      0.5         0            0           0
3      1  2778     -0.3         5            0           0
4      2  4228      0.2         0            0           0
5      2  3271      0.1         6            0           0
6      2  3229     -0.7         5            1           0
7      2  3229     -0.3         2            0           0
8      2  4228     -0.8         5            1           1
9      2  2778     -0.6         5            1           0
10     3  4228      0.2         0            0           0
11     3  3271      0.1         6            0           0
12     3  4228     -0.7         5            1           1
13     3  3229     -0.3         2            0           0
14     3  3271     -0.8         5            1           0
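
Note that new_column is a 0/1 flag, not the lists shown in the question's "Paired with" column. If the lists themselves are needed, one possible approach is sketched below. It assumes a frame with the columns time, id and interaction as in the printed output, and it assumes that "paired with" means "the other ids flagged 1 in the same time window"; the names ids_per_time and paired_with are made up for this illustration.

```python
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame({
    'time': [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
    'id': [3271, 3229, 4228, 2778, 4228, 3271, 3229, 3229, 4228, 2778,
           4228, 3271, 4228, 3229, 3271],
    'interaction': [0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1],
})

# For every time window, collect the sorted ids that are flagged with 1.
flagged = df[df['interaction'].eq(1)]
ids_per_time = {
    t: sorted(int(i) for i in g.unique())
    for t, g in flagged.groupby('time')['id']
}

# For each flagged row, list the other flagged ids of the same time window.
# Unflagged rows get an empty list (assumption: the question leaves them blank).
pairs = [
    [i for i in ids_per_time.get(t, []) if i != row_id] if flag == 1 else []
    for t, row_id, flag in zip(df['time'], df['id'], df['interaction'])
]
df['paired_with'] = pd.Series(pairs, index=df.index)

print(df)
```

The lists come out sorted by id, so row 9 shows [3229, 4228] where the question has [4228, 3229]; the members are the same.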