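I keep getting a ValueError here. What exactly am I doing wrong? The CSV I'm reading from and writing to doesn't have a timestamp on each date entry. The dates are simply entered in 'mm-dd-yyyy' format.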
Traceback (most recent call last):
  File "C:/Users/Matthew Olive/PycharmProjects/excel_scripts/contango_cleaner.py", line 17, in <module>
    dateS = datetime.strptime(row[2], "%m-%d-%Y")
  File "C:\Users\Matthew Olive\AppData\Local\Programs\Python\Python38\lib\_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "C:\Users\Matthew Olive\AppData\Local\Programs\Python\Python38\lib\_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '' does not match format '%m-%d-%Y'
Here is my code:
import csv
from datetime import datetime

path = "C:\\Users\\Matthew Olive\\Desktop\\Trading Stuff\\sample_for_python .csv"
file = open(path, newline='')
reader = csv.reader(file)
data = []
for row in reader:
    dateC = datetime.strptime(row[0], "%m-%d-%Y")
    dateS = datetime.strptime(row[2], "%m-%d-%Y")
    contango = float(row[1])
    open_price = float(row[3])
    close_price = float(row[4])
    data.append([dateC, contango, dateS, open_price, close_price])
print(data[0])
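The problem seems to arise when I try to convert the date strings into proper datetime objects. The end goal of the program is to say: "if dateC does not match dateS, delete the first and second entries of that row." It should then create a new CSV file that I can open in Excel. I'm trying to automate a process that I often do by hand. (For each row, if there is no matching date and float values in the third, fourth, and fifth columns, the first two columns are not needed.)
Here is my CSV structure (columns). dateC and dateS should be datetime objects, and the others are floats: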
[dateC, contango, dateS, open_price, close_price]
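The traceback reports `time data ''`, so at least one row has an empty string in the dateS column, and `strptime` cannot parse an empty string. Below is a minimal sketch of one way to tolerate that using only the standard library. It assumes empty fields should become `None`; the helper names `parse_date` and `parse_float` and the inline sample rows are hypothetical stand-ins for the real file.

```python
import csv
import io
from datetime import datetime

DATE_FORMAT = "%m-%d-%Y"

def parse_date(text):
    """Return a datetime, or None when the field is empty (hypothetical helper)."""
    text = text.strip()
    return datetime.strptime(text, DATE_FORMAT) if text else None

def parse_float(text):
    """Return a float, or None when the field is empty (hypothetical helper)."""
    text = text.strip()
    return float(text) if text else None

# Made-up sample rows standing in for the real CSV (no header row assumed).
sample = io.StringIO(
    "02-03-2020,1.5,02-03-2020,50,55\n"
    ",2.0,02-04-2020,51,56\n"
    "02-05-2020,3.0,,52,57\n"
)

data = []
for row in csv.reader(sample):
    date_c = parse_date(row[0])
    contango = parse_float(row[1])
    date_s = parse_date(row[2])
    open_price = parse_float(row[3])
    close_price = parse_float(row[4])
    data.append([date_c, contango, date_s, open_price, close_price])

print(data[1])
```

To run this against the real file, replace `sample` with `open(path, newline='')`, ideally inside a `with` block so the file gets closed.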
An example to go with my comment; here is how this could be done using pandas:
import io
import pandas as pd

# some example input with missing dates (empty fields in the csv)...
data = io.StringIO("""dateC,contango,dateS,open_price,close_price
02-03-2020,1,02-04-2020,50,55
,2,02-04-2020,51,56
02-03-2020,3,,52,57""")

# just replace 'data' with the path to your csv:
df = pd.read_csv(data)

# errors='coerce' turns empty or unparseable fields into NaT instead of raising;
# the explicit format pins the month-day-year order used in the file
df['dateC'] = pd.to_datetime(df['dateC'], format='%m-%d-%Y', errors='coerce')
df['dateS'] = pd.to_datetime(df['dateS'], format='%m-%d-%Y', errors='coerce')

# df
#        dateC  contango      dateS  open_price  close_price
# 0 2020-02-03         1 2020-02-04          50           55
# 1        NaT         2 2020-02-04          51           56
# 2 2020-02-03         3        NaT          52           57
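From there, the stated goal (drop the first two entries of a row when dateC does not match dateS) could be sketched roughly as follows. This is only one reading of the requirement: it assumes "does not match" means the two parsed dates differ or either one is missing. The sample rows are made up, and the output goes to an in-memory buffer rather than a real file.

```python
import io
import pandas as pd

# Made-up sample: row 0 matches, row 1 differs, row 2 has no dateC.
raw = io.StringIO("""dateC,contango,dateS,open_price,close_price
02-03-2020,1,02-03-2020,50,55
02-03-2020,2,02-04-2020,51,56
,3,02-05-2020,52,57""")

df = pd.read_csv(raw)
for col in ("dateC", "dateS"):
    df[col] = pd.to_datetime(df[col], format="%m-%d-%Y", errors="coerce")

# NaT never compares equal to anything, so rows with a missing date
# are treated as mismatches here (an assumption about the requirement).
mismatch = df["dateC"] != df["dateS"]

# Blank out the first two columns wherever the dates do not match.
df["dateC"] = df["dateC"].mask(mismatch)
df["contango"] = df["contango"].mask(mismatch)

# Write the result in the original date format; missing values become empty fields.
out = io.StringIO()
df.to_csv(out, index=False, date_format="%m-%d-%Y")
print(out.getvalue())
```

Passing a file path to `to_csv` instead of the buffer produces a CSV that Excel can open directly. `DataFrame.to_excel` is another option, but it needs an Excel writer engine such as openpyxl installed.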