Error: "The truth value of a DataFrame is ambiguous" when splitting a string into two columns if two conditions are met

Question

I want to split the strings in column['first'] if both of the following conditions are met:

column['first'] contains the word 'floor' or 'floors'
column['second'] is empty

However, I get an error message:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Here is my code:

#boolean series for condition 1: when values in column['second'] are empty

only_first_token = pd.isna(results_threshold_50_split_ownership['second']) 
print (len(only_first_token)) 
print (type(only_first_token))

#boolean series for condition 2: when values in column['first'] contain string floor or floors

first_token_contain_floor = results_threshold_50_split_ownership['first'].str.contains('floors|floor',case=False)
print (len(first_token_contain_floor))
print (type(first_token_contain_floor))

#if both conditions are met, the string in column['first'] will be split into column['first'] and['second']

if results_threshold_50_split_ownership[(only_first_token) & (first_token_contain_floor)]:
    results_threshold_50_split_ownership.first.str.split('Floors|Floor', expand=True)

print(results_threshold_50_split_ownership['first'])

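I have looked at some answers here and modified the code several times. I made sure both boolean Series have the same length, 1016. If I remove the if, the same expression successfully finds the rows that satisfy both conditions. So I don't understand why there is any ambiguity.

Any help would be greatly appreciated. Thank you very much.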

Answer 1

Your conditions are entirely correct. The problem is the if statement, which has this form:

if boolean_array :
  ...

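But if takes exactly one boolean value, not an array of booleans. To reduce a boolean array to a single value you can use, for example, any() or all(), as the error message suggests: if all(boolean_array): and so on.

What you probably want to do is: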
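To make the point about if concrete, here is a minimal sketch on made-up data (the frame df and its values are hypothetical, not the asker's real data). It shows the ValueError being raised when a DataFrame is used as a condition, and the mask being reduced to a single boolean with any() and all().

```python
import pandas as pd

# Hypothetical toy data standing in for the asker's frame.
df = pd.DataFrame({"first": ["2 floors", "x"], "second": [None, None]})
mask = pd.isna(df["second"]) & df["first"].str.contains("floors|floor", case=False)

# Using a DataFrame as an `if` condition raises ValueError.
try:
    if df[mask]:
        pass
except ValueError as exc:
    error_message = str(exc)

# Reduce the boolean Series to one value before testing it.
any_match = bool(mask.any())  # at least one row satisfies both conditions
all_match = bool(mask.all())  # every row satisfies both conditions

print(error_message)
print(any_match, all_match)
```

Whether any() or all() is appropriate depends on the intent. For "run this only if some row matches", any() is the usual choice.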

results_threshold_50_split_ownership[(only_first_token) & (first_token_contain_floor)]['first'].str.split('Floors|Floor', expand=True)

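That is, use the boolean array for boolean indexing.

Update following the comments below: you can assign the result of the split back to the original columns by using results_threshold_50_split_ownership.loc[(only_first_token) & (first_token_contain_floor), ['first', 'second']]. In that case, however, you need to make sure that at most two columns are returned, by specifying n=1 (in case a value in your first column contains the word "floor" more than once). Example: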
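One thing worth noting about boolean indexing followed by str.split: the result is a new object, and the original frame is left unchanged unless the result is assigned back. A small sketch on hypothetical data (df and parts are illustrative names):

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({"first": ["all floors values", "x"], "second": [None, None]})
mask = pd.isna(df["second"]) & df["first"].str.contains("floors|floor", case=False)

# Select only the matching rows, then split them; this returns a new DataFrame.
parts = df.loc[mask, "first"].str.split("floors|floor", n=1, expand=True)

print(parts)        # one row, two columns holding the split pieces
print(df["first"])  # the original column is untouched
```

Here .loc[mask, "first"] is used in place of chained [mask]['first']; both select the same rows for reading.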

import pandas as pd

results_threshold_50_split_ownership = pd.DataFrame({'first': ['first floor value', 'all floors values', 'x'],
                                                     'second': ['y', None, None]})
print(results_threshold_50_split_ownership)
#               first second
#0  first floor value      y
#1  all floors values   None
#2                  x   None
only_first_token = pd.isna(results_threshold_50_split_ownership['second'])
first_token_contain_floor = results_threshold_50_split_ownership['first'].str.contains('floors|floor',case=False)
# n must be passed as a keyword: recent pandas versions no longer accept it positionally
results_threshold_50_split_ownership.loc[(only_first_token) & (first_token_contain_floor), ['first', 'second']] = results_threshold_50_split_ownership[(only_first_token) & (first_token_contain_floor)]['first'].str.split('floors|floor', n=1, expand=True).to_numpy()
print(results_threshold_50_split_ownership)
#               first   second
#0  first floor value        y
#1               all    values
#2                  x     None
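To illustrate why n=1 matters, here is a short sketch with a made-up value that contains "floor" twice. Without a limit the split yields three columns, which would not fit into the two target columns; with n=1 it yields exactly two.

```python
import pandas as pd

# Hypothetical value containing the separator word twice.
s = pd.Series(["ground floor and first floor"])

unlimited = s.str.split("floors|floor", expand=True)     # splits at every match
limited = s.str.split("floors|floor", n=1, expand=True)  # splits at the first match only

print(unlimited.shape)
print(limited.shape)
```

The pieces keep their surrounding whitespace; if that is unwanted, applying .str.strip() to each resulting column is one way to remove it.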
