COUNTIF in pandas (Python) across multiple columns with a wildcard

Question

I have a dataset in Excel that I want to replicate.

My Python code looks like this:

from functools import reduce
import pandas as pd

data_frames = [df_mainstore, df_store_A, df_store_B]
df_merged = reduce(lambda left, right: pd.merge(left, right, on=["Id_number"], how='outer'), data_frames)
print(df_merged)

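Because I merge several data frames (which can have different numbers of columns and different column names), it is tedious to write out every column name, which is what this code does. Example: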

isY = lambda x:int(x=='Y')
countEmail= lambda row: isY(row['Store Contact A']) + isY(row['Store B Contact'])
df['Contact Email'] = df.apply(countEmail,axis=1)

I am also struggling with this expression: isY = lambda x:int(x=='@')

How can I add a "Contact has Email" column in a way similar to how it is done in Excel?

Answer 1

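You can use filter to select the columns whose names contain Contact, then use str.contains with a pattern for the email address format, and finally call any across each row.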

import pandas as pd

# data sample ('[email protected]' marks email addresses obscured on the source page)
df_merged = pd.DataFrame({'id': [0,1,2,3], 
                          'Store A': list('abcd'),
                          'Store Contact A':['[email protected]', '', 'e', 'f'], 
                          'Store B': list('ghij'),
                          'Store B Contact':['[email protected]', '', '[email protected]', '']})

# define the pattern as in the link
pat = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"
# create the column as wanted
# (axis must be passed by keyword: any(1) is rejected by pandas 2.0 and later)
df_merged['Contact has Email'] = df_merged.filter(like='Contact')\
                                          .apply(lambda x: x.str.contains(pat))\
                                          .any(axis=1)

print (df_merged)
   id Store A   Store Contact A Store B   Store B Contact  Contact has Email
0   0       a [email protected]       g [email protected]               True
1   1       b                         h                                False
2   2       c                 e       i [email protected]               True
3   3       d                 f       j                                False
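
If a per-row count is wanted instead of a True/False flag (which is closer to what Excel's COUNTIF returns), the same boolean frame can be summed rather than reduced with any. This is only a sketch under a few assumptions: every contact column has "Contact" in its name, the sample addresses are made-up placeholders, and the column name 'Contact Email Count' is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the addresses are placeholders, not taken from the post
df_merged = pd.DataFrame({
    'id': [0, 1, 2, 3],
    'Store Contact A': ['a@example.com', '', 'e', 'f'],
    'Store B Contact': ['b@example.com', '', 'c@example.org', None],
})

# Same email pattern as in the answer above
pat = r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$"

# Boolean frame: True where a contact cell looks like an email address.
# na=False makes missing cells (common after an outer merge) count as "no match".
is_email = df_merged.filter(like='Contact').apply(
    lambda col: col.str.contains(pat, na=False)
)

# Summing booleans across a row counts the matching cells, like COUNTIF over a row range
df_merged['Contact Email Count'] = is_email.sum(axis=1)
df_merged['Contact has Email'] = is_email.any(axis=1)

print(df_merged)
```

Both new column names contain "Contact" themselves, so the boolean frame is built before they are added; running filter(like='Contact') again afterwards would pick them up as well.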

Answer 2

You can use pandas.Series.str.contains:

df_merged['Contact has Email'] = df_merged['Store Contact A'].str.contains('@', na=False)|df_merged['Store B Contact'].str.contains('@', na=False)
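
The same '@' check can be applied without naming each column, which is what the question asks for after merging frames with varying columns. Note that x=='@' in the question tests whether the whole cell equals '@', whereas str.contains tests whether '@' appears anywhere in the cell, much like the "*@*" wildcard in Excel. A minimal sketch, assuming every contact column has "Contact" in its name; the sample frame and its addresses are illustrative only.

```python
import pandas as pd

# Hypothetical sample data; the addresses are placeholders
df_merged = pd.DataFrame({
    'id': [0, 1, 2, 3],
    'Store Contact A': ['a@example.com', '', 'e', 'f'],
    'Store B Contact': ['b@example.com', None, 'c@example.org', ''],
})

# Collect the contact columns by name instead of listing them by hand
contact_cols = [c for c in df_merged.columns if 'Contact' in c]

# regex=False treats '@' as a literal substring; na=False maps missing cells to False
has_at = df_merged[contact_cols].apply(
    lambda col: col.str.contains('@', na=False, regex=False)
)

# True if any contact cell in the row contains '@'
df_merged['Contact has Email'] = has_at.any(axis=1)

print(df_merged)
```

A plain substring test is looser than a full email pattern: it would also accept a cell such as "@" on its own, so the regular expression from Answer 1 is the stricter choice when that matters.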
