The data fetched by this code is not formatted as proper CSV.
import requests
import csv

def Download_data():
    s = requests.Session()
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36'}
    s.headers.update(headers)
    resp = s.get('https://www.nseindia.com/market-data/live-equity-market')
    resp.raise_for_status()
    resp = s.get('https://www.nseindia.com/api/equity-stockIndices?csv=true&index=NIFTY%2050')
    resp.raise_for_status()
    data_79 = resp.text
    data_79 = resp.text.replace('","', '')
    with open('___N50__.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerows([line.split(',') for line in data_79.splitlines()])

if __name__ == "__main__":
    # Call the function defined above (the original called an undefined name).
    Download_data()
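The headers each end up on their own row instead of forming a single header row of columns.
I arrived at this code after a lot of learning, but I can tell my knowledge is still lacking here. Please help!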
If the URL you are accessing serves a valid CSV file, you can read it directly into a pandas DataFrame and then save it to your local machine, like this:
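import pandas as pd
import io
import requests

url = "https://raw.githubusercontent.com/cs109/2014_data/master/countries.csv"
# Download the raw bytes of the CSV file
s = requests.get(url).content
# Decode the bytes and let pandas parse the CSV text
c = pd.read_csv(io.StringIO(s.decode('utf-8')))
# Write the DataFrame back out as a CSV file
c.to_csv(r"C:\users\123\Downloads\countries.csv")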
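If pandas is not available, roughly the same read-then-resave round trip can be sketched with the standard library alone. This is only an illustration, not the approach from the answer above: the function name `rewrite_csv` and the sample bytes are made up, and it assumes the payload is UTF-8, possibly with a leading BOM, with header cells that contain line breaks inside the quotes.

```python
import csv
import io


def rewrite_csv(raw: bytes) -> str:
    """Parse CSV bytes and re-serialise them with one record per line."""
    # 'utf-8-sig' drops a leading BOM if one is present.
    text = raw.decode("utf-8-sig")
    # csv.reader understands quoted fields, including ones that contain
    # commas or line breaks, which splitlines() plus split(',') does not.
    reader = csv.reader(io.StringIO(text, newline=""))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        # Collapse whitespace (including embedded line breaks) in each cell.
        writer.writerow([" ".join(cell.split()) for cell in row])
    return out.getvalue()


# Hypothetical sample: a BOM, header cells ending in a line feed,
# and quoted numbers that contain a comma.
sample = b'\xef\xbb\xbf"SYMBOL \n","OPEN \n","HIGH \n"\n"ABC","1,234.50","1,250.00"\n'
cleaned = rewrite_csv(sample)
print(cleaned)
```

The point of the sketch is that a CSV parser keeps `"1,234.50"` as one cell and treats a line break inside quotes as part of the cell, so the header stays a single row.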
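Have you looked at the data in a text editor? resp.text appears to start with a BOM, and every field in the header row ends with a line feed. IMHO the data needs some cleanup: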
import re
# ...
# strip BOM:
data_79 = re.sub(r'\A[^"]+"', '"', resp.text, count=1)
# strip unwanted linefeeds:
data_79 = re.sub(r'([^"])\n', '\\1', data_79)
# save the data in a file
with open('___N50__.csv', 'w', newline='') as file:
    file.write(data_79)
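To see what the two substitutions do, here is a small check on a made-up string. The sample text and the helper name `clean_export` are assumptions for illustration; the real response may differ in detail.

```python
import csv
import io
import re


def clean_export(text: str) -> str:
    # Drop anything before the first double quote (for example a BOM).
    text = re.sub(r'\A[^"]+"', '"', text, count=1)
    # Remove line feeds that do not directly follow a double quote,
    # i.e. the ones sitting inside the quoted header cells.
    return re.sub(r'([^"])\n', r'\1', text)


# Hypothetical sample shaped like the described response.
sample = '\ufeff"SYMBOL \n","OPEN \n"\n"ABC","1,234.50"\n'
cleaned = clean_export(sample)

# Parse the cleaned text to confirm that the header is now a single row.
rows = list(csv.reader(io.StringIO(cleaned, newline="")))
print(rows)
```

Note that the header cells keep their trailing space after this cleanup; calling `.strip()` on each cell would remove it if that matters.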