How to drop rows containing sequential digits from a pandas DataFrame


Hello, I am trying to drop the rows of a DataFrame that contain sequential digits, for example 1234 or 6789 ...

# Define a regex pattern for sequential digits

pattern = r'\{3,}'
mask = df1['prix'].str.contains(pattern)
df_filtered = df1[mask]

print(df_filtered)

I tried this code, but it returns the same data. Before that, I tried this code:

pattern = r'\d{3,}'
df1['prix'] = df1['prix'].astype(str) 
mask = df1['prix'].str.contains(pattern)
if mask.any(): 
    print("There are rows with sequential digits.")
else:
    print("There are no rows with sequential digits.")

and it reports that the data contains sequential digits.

python r pandas dataframe machine-learning
1 Answer

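The pattern \d{3,} matches any three or more digits in a row, sequential or not, so it flags any value with at least three digits. Instead, you can use a regex that lists every ascending and descending run of three consecutive digits: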

pattern = '|'.join([f'{i}{i+1}{i+2}' for i in range(0, 8, 1)]
                 + [f'{i}{i-1}{i-2}' for i in range(9, 1, -1)])
mask = df['prix'].astype(str).str.contains(pattern)

out = df[~mask]

Output:

>>> out
   prix
3  1379

>>> pattern
'012|123|234|345|456|567|678|789|987|876|765|654|543|432|321|210'

>>> mask
0     True
1     True
2     True
3    False
Name: prix, dtype: bool

>>> df
   prix
0  1234  # drop (123)
1  6789  # drop (678)
2  4563  # drop (456)
3  1379  # keep
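
If you need a run length other than three, the same idea can be generalized. The following is only a minimal sketch: the helper name `sequential_pattern` and its `run_length` parameter are hypothetical (they are not part of pandas), and the sample DataFrame is copied from the example above. It slices the string "0123456789" to get the ascending runs and reverses each one to get the descending runs, so the alternatives come out in a different order than in the pattern shown above, but they are the same set.

```python
import pandas as pd


def sequential_pattern(run_length: int = 3) -> str:
    """Build a regex alternation of ascending and descending digit runs.

    Hypothetical helper for illustration; run_length must be between 2 and 10.
    """
    digits = "0123456789"
    if not 2 <= run_length <= len(digits):
        raise ValueError("run_length must be between 2 and 10")
    # Every window of `run_length` consecutive digits: '012', '123', ...
    ascending = [digits[i:i + run_length]
                 for i in range(len(digits) - run_length + 1)]
    # The same windows reversed: '210', '321', ...
    descending = [run[::-1] for run in ascending]
    return "|".join(ascending + descending)


# Sample data copied from the example above.
df = pd.DataFrame({"prix": [1234, 6789, 4563, 1379]})

pattern = sequential_pattern(3)
# Convert to str first so .str.contains works on a numeric column.
mask = df["prix"].astype(str).str.contains(pattern, regex=True)
out = df[~mask]  # keep only the rows WITHOUT a sequential run
print(out)
```

Note that str.contains searches for the run anywhere in the value, so 4563 is dropped because it contains 456. With `sequential_pattern(4)` that row would be kept, since it has no run of four consecutive digits.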