Converting date formats in pandas using regular expressions


I have a DataFrame:

print(df_test)

               Name Birth Date
0     Anna B Wilson   JUL 1861
1  Victor C Burnett   NOV 1847
2     Ausia Burnett   JUN 1898
3    Alfred Burnett   MAR 1896
4     Viola Burnett   AUG 1894

I would like the output to be:

               Name Birth Date
0     Anna B Wilson     7-1861
1  Victor C Burnett    11-1847
2     Ausia Burnett     6-1898
3    Alfred Burnett     3-1896
4     Viola Burnett     8-1894

Is there a concise way to do this without writing a separate regular expression for each month, i.e.

df_test = df_test.replace(to_replace=r'(MAR)\s(\d{4})', value=r'3-\2', regex=True)
df_test = df_test.replace(to_replace=r'(JUN)\s(\d{4})', value=r'6-\2', regex=True)
df_test = df_test.replace(to_replace=r'(JUL)\s(\d{4})', value=r'7-\2', regex=True)
df_test = df_test.replace(to_replace=r'(AUG)\s(\d{4})', value=r'8-\2', regex=True)
df_test = df_test.replace(to_replace=r'(NOV)\s(\d{4})', value=r'11-\2', regex=True)
print(df_test)

regex pandas date-conversion
1 Answer
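You don't actually need a regex here: parse the column with pd.to_datetime() and then call .dt.strftime() to specify the desired format (note that %m zero-pads the month), for example: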

import pandas as pd

test_df = pd.DataFrame({'Name': ['A', 'B', 'C', 'D', 'E'],
                        'Birthdate': ['JUL 1861', 'NOV 1847', 'JUN 1898', 'MAR 1896', 'AUG 1894']})
# Parse the "MON YYYY" strings into datetimes, then format them as "MM-YYYY"
test_df['Birthdate'] = pd.to_datetime(test_df['Birthdate'], infer_datetime_format=True).dt.strftime('%m-%Y')

Output:

  Name Birthdate
0    A   07-1861
1    B   11-1847
2    C   06-1898
3    D   03-1896
4    E   08-1894
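The output above has zero-padded months (07-1861), whereas the question asked for 7-1861. One possible way to get the unpadded form is sketched below. It is not part of the original answer and makes a few assumptions: it rebuilds the question's df_test by hand, it passes an explicit format="%b %Y" instead of infer_datetime_format (which has been deprecated since pandas 2.0), and it assumes an English locale so that %b matches abbreviations such as JUL.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's df_test).
df_test = pd.DataFrame({
    "Name": ["Anna B Wilson", "Victor C Burnett", "Ausia Burnett",
             "Alfred Burnett", "Viola Burnett"],
    "Birth Date": ["JUL 1861", "NOV 1847", "JUN 1898", "MAR 1896", "AUG 1894"],
})

# Parse with an explicit format: %b is the abbreviated month name
# (locale-dependent, English assumed here), %Y is the four-digit year.
parsed = pd.to_datetime(df_test["Birth Date"], format="%b %Y")

# Build "month-year" from the integer components, so the month is not zero-padded.
df_test["Birth Date"] = parsed.dt.month.astype(str) + "-" + parsed.dt.year.astype(str)

print(df_test)
```

With this data the Birth Date column should contain 7-1861, 11-1847, 6-1898, 3-1896 and 8-1894. Building the string from dt.month and dt.year avoids platform-specific strftime flags for removing the leading zero.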
