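I have a problem finding the rows of a DataFrame where 2 columns contain a part of a string. The column values are strings (dtype object). What I mean is the reverse of str.contains or isin(): here the column value is the substring, and the given string is the text being searched. The string cannot be split cleanly, because the 3 values "Cityname", "Districtname" and "Streetname" may contain spaces.
Can you help me?
What I tried:
s = "Bad Testcity Teststr."
df_res = df.loc[(s.find(df['CITY']) != -1) & (s.find(df['DISTRICT']) != -1) & (s.find(df['STREET']) != -1)]
For this example the condition should be True.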
<bound method DataFrame.info of            ZIP          CITY    STREET  NUMBER  NUMBER_SFX      DISTRICT   ONKZ  ASB      ADSL      VDSL   VDSL_SV         VPSZ  OUTDOOR
ID
4025217  12345  Bad Testcity  Teststr.       6         NaN  Bad Testcity  12345    2  +017.696  +102.784       NaN  49/12345/30        O
4025219  12345  Bad Testcity  Teststr.       7         NaN  Bad Testcity  12345    2  +017.696  +102.784       NaN  49/12345/30        O
4025242  12345  Bad Testcity  Teststr.       8         NaN  Bad Testcity  12345    2  +017.696  +102.784  +185.824  49/12345/30        O
4025244  12345  Bad Testcity  Teststr.      10         NaN  Bad Testcity  12345    2  +017.696  +102.784       NaN  49/12345/30        O
4025245  12345  Bad Testcity  Teststr.      11         NaN  Bad Testcity  12345    2  +017.696  +051.392       NaN  49/12345/30        O
    ...    ...           ...       ...     ...         ...           ...    ...   ..       ...       ...       ...          ...      ...
[1569530 rows x 13 columns]>
Assuming this input:
           ZIP          CITY    STREET  NUMBER  NUMBER_SFX      DISTRICT   ONKZ  ASB    ADSL     VDSL  VDSL_SV         VPSZ  OUTDOOR
ID
4025217  12345  Bad Testcity  Teststr.       6         NaN  Bad Testcity  12345    2  17.696  102.784      NaN  49/12345/30        O
4025219  12345  Bad Testcity  Teststr.       7         NaN  Bad Testcity  12345    2  17.696  102.784      NaN  49/12345/30        O
4025242  12345  Bad Testcity  Teststr.       8         NaN  Bad Testcity  12345    2  17.696  102.784  185.824  49/12345/30        O
4025244  12345  Bad Testcity  Teststr.      10         NaN  Bad Testcity  12345    2  17.696  102.784      NaN  49/12345/30        O
4025245  12345  Bad Testcity  Teststr.      11         NaN  Bad Testcity  12345    2  17.696   51.392      NaN  49/12345/30        O
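You can concatenate the columns and use str.contains:
s = "Bad Testcity Teststr."
# regex=False treats s as literal text; otherwise the "." in "Teststr." would match any character
df_res = df.loc[(df['CITY']+' '+df['DISTRICT']+' '+df['STREET']).str.contains(s, regex=False)]
Output (unchanged here, since every row matches):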
           ZIP          CITY    STREET  NUMBER  NUMBER_SFX      DISTRICT   ONKZ  ASB    ADSL     VDSL  VDSL_SV         VPSZ  OUTDOOR
ID
4025217  12345  Bad Testcity  Teststr.       6         NaN  Bad Testcity  12345    2  17.696  102.784      NaN  49/12345/30        O
4025219  12345  Bad Testcity  Teststr.       7         NaN  Bad Testcity  12345    2  17.696  102.784      NaN  49/12345/30        O
4025242  12345  Bad Testcity  Teststr.       8         NaN  Bad Testcity  12345    2  17.696  102.784  185.824  49/12345/30        O
4025244  12345  Bad Testcity  Teststr.      10         NaN  Bad Testcity  12345    2  17.696  102.784      NaN  49/12345/30        O
4025245  12345  Bad Testcity  Teststr.      11         NaN  Bad Testcity  12345    2  17.696   51.392      NaN  49/12345/30        O
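The concatenation approach only matches when the columns appear in s in that exact order and with single spaces between them. If the goal is literally the reverse check from the question (each column value must occur somewhere inside s), one possible sketch is a row-wise test with Python's `in` operator. The small frame below is a made-up stand-in for the real data, and the names `cols` and `mask` are only illustrative:

```python
import pandas as pd

# Hypothetical miniature of the question's frame; only the three text columns matter here.
df = pd.DataFrame(
    {
        "CITY": ["Bad Testcity", "Bad Testcity", "Other Town"],
        "DISTRICT": ["Bad Testcity", "Bad Testcity", "Other Town"],
        "STREET": ["Teststr.", "Elsewhere Rd.", "Teststr."],
    },
    index=pd.Index([4025217, 4025219, 4025242], name="ID"),
)

s = "Bad Testcity Teststr."
cols = ["CITY", "DISTRICT", "STREET"]

# A row qualifies when every one of its column values is a literal substring of s.
# The isinstance guard makes NaN (a float) count as "no match" instead of raising.
mask = pd.Series(
    [
        all(isinstance(value, str) and value in s for value in row)
        for row in df[cols].itertuples(index=False, name=None)
    ],
    index=df.index,
)

df_res = df.loc[mask]
print(df_res)
```

This is a plain Python loop over the rows, so on a frame with about 1.5 million rows it will be noticeably slower than a vectorized string method; it is meant to show the logic rather than to be the fastest option.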