Hello, I'm trying to drop rows from a DataFrame that contain sequential digits, for example 1234 or 6789 ...
pattern = r'\{3,}'
mask = df1['prix'].str.contains(pattern)
df_filtered = df1[mask]
print(df_filtered)
I tried the code above, but it returns the same data. Earlier I had tried this code:
pattern = r'\d{3,}'
df1['prix'] = df1['prix'].astype(str)
mask = df1['prix'].str.contains(pattern)
if mask.any():
    print("There are rows with sequential digits.")
else:
    print("There are no rows with sequential digits.")
and it reported that the data contains sequential digits.
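The pattern \d{3,} matches any run of three or more digits, whether or not the digits are in sequence, so it flags every row. Instead, you can build a regular expression that lists every ascending and descending run of three digits explicitly: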
pattern = '|'.join([f'{i}{i+1}{i+2}' for i in range(0, 8, 1)]
+ [f'{i}{i-1}{i-2}' for i in range(9, 1, -1)])
mask = df['prix'].astype(str).str.contains(pattern)
out = df[~mask]
Output:
>>> out
   prix
3  1379

>>> pattern
'012|123|234|345|456|567|678|789|987|876|765|654|543|432|321|210'

>>> mask
0     True
1     True
2     True
3    False
Name: prix, dtype: bool

>>> df
   prix
0  1234  # drop (123)
1  6789  # drop (678)
2  4563  # drop (456)
3  1379  # keep
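
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The helper name sequential_pattern and its run_length parameter are my own additions for illustration, not part of the code above, and the sample DataFrame is simply rebuilt from the four rows shown. The alternatives come out in a different order than in the pattern printed above, but they are the same 16 runs.

```python
import pandas as pd


def sequential_pattern(run_length: int = 3) -> str:
    """Build a regex alternation of every ascending and descending digit run.

    Hypothetical helper: run_length=3 gives the same 16 alternatives
    as the pattern above (012 ... 789 and 210 ... 987).
    """
    digits = "0123456789"
    # Slide a window of run_length over "0123456789" for the ascending runs.
    ascending = [digits[i:i + run_length] for i in range(10 - run_length + 1)]
    # Reverse each ascending run to get the matching descending run.
    descending = [run[::-1] for run in ascending]
    return "|".join(ascending + descending)


# Sample data rebuilt from the rows shown above (assumed to be integers).
df = pd.DataFrame({"prix": [1234, 6789, 4563, 1379]})

pattern = sequential_pattern(3)
# Cast to str first: the .str accessor only works on string values.
mask = df["prix"].astype(str).str.contains(pattern)
out = df[~mask]  # keep only the rows without a sequential run
print(out)
```

Raising run_length makes the filter stricter about what counts as a sequence: with run_length=4, a value such as 4563 would be kept, because it contains 456 but no run of four sequential digits.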