Add multiple columns to a DataFrame and skip empty values

Question  Votes: 0  Answers: 2

I have a DataFrame like this:

from pandas import DataFrame

# Empty strings mark the missing B2/B3 values
s = {'B1': ['1C', '3A', '41A'], 'B2':['','1A','28A'], 'B3':['','','3A'],
     'B1_m':['2','2','2'], 'B2_m':['2','4','2'],'B3_m':['2','2','4'],
     'E':['0','0','0']}
s = DataFrame(s)
print(s)

    B1   B2  B3 B1_m B2_m B3_m  E
0   1C             2    2    2  0
1   3A   1A        2    4    2  0
2  41A  28A  3A    2    2    4  0

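I then combine these columns into a new column, Results, using the following format:

s['Results'] = s['B1']+s['B1_m']+'-'+s['B2']+s['B2_m']+'-'+s['B3']+s['B3_m']+'-'+s['E']
print(s)

    B1   B2  B3  B1_m  B2_m  B3_m   E            Results
0   1C              2     2     2   0          1C2-2-2-0
1   3A   1A         2     4     2   0        3A2-1A4-2-0
2  41A  28A  3A     2     2     4   0    41A2-28A2-3A4-0

However, what I want is to skip an item whenever its value in B1-B3 is empty, like this:

    B1   B2  B3 B1_m B2_m B3_m  E          Results
0   1C             2    2    2  0            1C2-0
1   3A   1A        2    4    2  0        3A2-1A4-0
2  41A  28A  3A    2    2    4  0  41A2-28A2-3A4-0

Is there a way to skip those empty values conditionally? Thanks in advance.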

python pandas numpy dataframe
2 Answers

2 votes

Using numpy.where is the most Pythonic way I can think of to solve this:

import numpy as np

s['Results'] = s['B1']+s['B1_m'] + \
                  np.where(s['B2'], '-'+s['B2']+s['B2_m'], "") + \
                  np.where(s['B3'], '-'+s['B3']+s['B3_m'], "") +'-'+s['E']

This gives the result you want:

print(s)
    B1   B2  B3 B1_m B2_m B3_m  E          Results
0   1C             2    2    2  0            1C2-0
1   3A   1A        2    4    2  0        3A2-1A4-0
2  41A  28A  3A    2    2    4  0  41A2-28A2-3A4-0

(Note that a trailing \ is needed to continue a long statement onto the next line.)
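If there are more than three B columns, the numpy.where idea above can be wrapped in a loop. The following is only a sketch under some assumptions: the helper name build_results and its parameters are hypothetical, every column in prefixes is assumed to have a matching _m column, and empty cells are assumed to be empty strings rather than NaN. Unlike the one-liner above, it also skips B1 when B1 is empty.

```python
import numpy as np
import pandas as pd

s = pd.DataFrame({
    'B1': ['1C', '3A', '41A'], 'B2': ['', '1A', '28A'], 'B3': ['', '', '3A'],
    'B1_m': ['2', '2', '2'], 'B2_m': ['2', '4', '2'], 'B3_m': ['2', '2', '4'],
    'E': ['0', '0', '0'],
})

def build_results(df, prefixes=('B1', 'B2', 'B3'), tail='E'):
    """Hypothetical helper: join each non-empty column with its '_m' partner."""
    # Start from an empty string in every row
    result = pd.Series('', index=df.index, dtype=object)
    for p in prefixes:
        # Candidate piece for this pair, e.g. '1A4-'
        piece = df[p] + df[p + '_m'] + '-'
        # Keep the piece only where the base column is not an empty string
        result = result + np.where(df[p] != '', piece, '')
    # The tail column is always appended
    return result + df[tail]

s['Results'] = build_results(s)
print(s)
```

The explicit comparison with '' is used here instead of relying on the truthiness of the strings, which makes the condition easier to read and to adapt if the marker for a missing value changes.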


2 votes

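One way is to use a regular expression with str.replace to remove the single digits that are left standing alone, then concatenate column E:

s['Results'] = s['Results'].str.replace(r'\b\-[0-9]\b', '', regex=True)+'-'+s['E']

Or:

s['Results'] = s['Results'].str.replace(r'\b\-\d\b', '', regex=True)+'-'+s['E']

(regex=True is required in pandas 2.0 and later, where str.replace treats the pattern as a literal string by default.)

If the number has more than one digit, the pattern needs a quantifier, for example \d+ in place of \d.

print(s)
    B1   B2  B3 B1_m B2_m B3_m  E          Results
0   1C             2    2    2  0            1C2-0
1   3A   1A        2    4    2  0        3A2-1A4-0
2  41A  28A  3A    2    2    4  0  41A2-28A2-3A4-0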
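As an illustration of the multi-digit case, here is a minimal sketch. The sample strings in joined are made up for this example (they are not from the question) and stand for B/_m pairs that were already concatenated without column E. The pattern \b-\d+\b removes any segment that consists only of digits, so it would also drop a legitimate pair whose B value happens to be purely numeric.

```python
import pandas as pd

# Hypothetical intermediate strings: the pairs joined with '-', E not yet appended
joined = pd.Series(['1C2-12-2', '3A2-1A14-2', '41A2-28A2-3A10'])
e = pd.Series(['0', '0', '0'])

# Remove '-<digits>' segments that stand alone between word boundaries;
# '-1A14' survives because the digit '1' is followed directly by a letter
cleaned = joined.str.replace(r'\b-\d+\b', '', regex=True)

results = cleaned + '-' + e
print(results.tolist())
```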