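I am trying to add the stock name from each iteration to its corresponding batch of data as that batch is loaded into a pandas DataFrame.
This is what I have tried so far: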
from pandas_datareader import data as pdr
import requests
from bs4 import BeautifulSoup
import json
import pandas as pd
import re
import numpy as np
import pandas_datareader.data as web
import yfinance as yfin
from tqdm import tqdm
import datetime
from datetime import timedelta
################# fetch series names for sic ######################
sic_emisoras_df = pd.json_normalize(
    json.loads(
        requests.get('https://www.bmv.com.mx/es/Grupo_BMV/BmvJsonGeneric?idSitioPagina=6&mercado=CGEN_SCSOP&tipoValor=CGEN_CASEO&random=5845')
        .text
        .split(';(', 1)[-1]
        .split(')')[0]
    )['response']['resultado']
).dropna(axis=1, how='all')
####################################################################
# define time range:
start=datetime.date.today()-datetime.timedelta(days=14)
end=datetime.date.today()
# fetch data
# get all SIC names as list
stock_names = sic_emisoras_df["cveCorta"].values.tolist()
# append information per stock name
sic_market_df = pd.DataFrame([])
sic_market_df["stock_name"] = np.nan
for i in tqdm(stock_names):
    # fetch data per stock_name
    try:
        yfin.pdr_override()
        # append stock name
        sic_market_df["stock_name"]=i
        # fetch information by stock name
        data = web.DataReader(i,start,end)
        # append rows to empty dataframe
        sic_market_df = sic_market_df.append(data)
    except KeyError:
        pass
print("Fetched sic_market_df!")
The output only gets a name for the first batch; every other batch gets NaN in stock_name:
            stock_name       Open       High        Low      Close  Adj Close      Volume
2024-02-20          ZS  14.500000  14.950000  14.490000  14.700000  14.700000  30253100.0
2024-02-21          ZS  14.590000  14.860000  14.570000  14.790000  14.790000  23032400.0
2024-02-22          ZS  14.940000  15.280000  14.890000  15.240000  15.240000  35702500.0
2024-02-23          ZS  15.150000  15.290000  14.950000  15.130000  15.130000  22914900.0
2024-02-26          ZS  15.130000  15.480000  15.130000  15.280000  15.280000  23675800.0
I would like to get a DataFrame in which each batch is labelled with its own stock name, i.e. something like this:
            stock_name       Open       High        Low      Close  Adj Close      Volume
2024-02-20          ZS  14.500000  14.950000  14.490000  14.700000  14.700000  30253100.0
2024-02-21          ZS  14.590000  14.860000  14.570000  14.790000  14.790000  23032400.0
2024-02-22          ZS  14.940000  15.280000  14.890000  15.240000  15.240000  35702500.0
2024-02-23          ZS  15.150000  15.290000  14.950000  15.130000  15.130000  22914900.0
2024-02-26          ZS  15.130000  15.480000  15.130000  15.280000  15.280000  23675800.0
       ...         ...        ...        ...        ...        ...        ...         ...
2024-02-20        AAPL  14.500000  14.950000  14.490000  14.700000  14.700000  30253100.0
2024-02-21        AAPL  14.590000  14.860000  14.570000  14.790000  14.790000  23032400.0
2024-02-22        AAPL  14.940000  15.280000  14.890000  15.240000  15.240000  35702500.0
2024-02-23        AAPL  15.150000  15.290000  14.950000  15.130000  15.130000  22914900.0
2024-02-26        AAPL  15.130000  15.480000  15.130000  15.280000  15.280000  23675800.0
Package versions:
!pip show pandas             # 1.5.3
!pip show beautifulsoup4     # 4.12.3
!pip show pandas-datareader  # 0.10.0
Could you help me with this?
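You seem to be adding the stock name in the wrong place and at the wrong time. Assigning sic_market_df["stock_name"] = i writes the current name into every row collected so far, while the rows appended afterwards have no value in that column and end up as NaN. It is the DataFrame returned by web.DataReader that needs the stock name, e.g. data["stock_name"] = i, or you can use the DataFrame.insert method to place the new column on the left: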
...
# fetch information by stock name
data = web.DataReader(i,start,end)
# insert the stock name to the left of the data
data.insert(0, "stock_name", i)
# append rows to empty dataframe
sic_market_df = sic_market_df.append(data)
...
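To make the difference concrete, here is a small offline sketch. It does not call web.DataReader; the make_batch helper and the tickers "AAA" and "BBB" are made up for illustration and stand in for the frames the reader would return.

```python
import pandas as pd


def make_batch(first_close):
    """Hypothetical stand-in for the frame a price reader would return."""
    index = pd.date_range("2024-02-20", periods=2, freq="D")
    return pd.DataFrame({"Close": [first_close, first_close + 1.0]}, index=index)


# Roughly what the question's loop does on its second pass:
accumulated = make_batch(10.0)       # rows already collected for "AAA"
accumulated["stock_name"] = "BBB"    # the scalar is broadcast to ALL existing rows
broken = pd.concat([accumulated, make_batch(20.0)])  # the new rows get NaN

# Tagging each batch before combining keeps every label with its own rows.
batch_a = make_batch(10.0)
batch_a.insert(0, "stock_name", "AAA")
batch_b = make_batch(20.0)
batch_b.insert(0, "stock_name", "BBB")
fixed = pd.concat([batch_a, batch_b])

print(broken)
print(fixed)
```

In broken, the rows that belong to "AAA" are labelled "BBB" and the newest rows have no label, which matches the symptom in the question. In fixed, each batch carries its own name in the leftmost column.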
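Also, it seems reasonable to use pandas.concat instead of the append method, which was deprecated in pandas 1.4 and removed in pandas 2.0. For example: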
def get_data(stock_names, start, end):
    for stock in tqdm(stock_names):
        try:
            yfin.pdr_override()
            data = web.DataReader(stock, start, end)
            data.insert(0, "stock_name", stock)
            yield data
        except KeyError:
            pass

sic_market_df = pd.concat([*get_data(stock_names, start, end)])
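As a possible variation, pandas.concat can attach the labels itself if you pass it a dict of frames keyed by stock name. The sketch below runs offline: fake_reader and its tickers are hypothetical placeholders for web.DataReader, and the empty-result guard is my own addition, since pd.concat raises a ValueError when there is nothing to concatenate.

```python
import pandas as pd


def fake_reader(ticker):
    """Hypothetical reader: raises KeyError for tickers it does not know."""
    closes = {"AAA": [10.0, 11.0], "BBB": [20.0, 21.0]}[ticker]
    index = pd.date_range("2024-02-20", periods=len(closes), freq="D", name="Date")
    return pd.DataFrame({"Close": closes}, index=index)


def collect(tickers):
    frames = {}
    for ticker in tickers:
        try:
            frames[ticker] = fake_reader(ticker)
        except KeyError:
            continue  # skip tickers the reader cannot resolve
    if not frames:
        return pd.DataFrame()  # nothing fetched, so there is nothing to concatenate
    # The dict keys become an outer index level named "stock_name".
    combined = pd.concat(frames, names=["stock_name", "Date"])
    # Move that level back into an ordinary column on the left.
    return combined.reset_index(level="stock_name")


result = collect(["AAA", "MISSING", "BBB"])
print(result)
```

Whether this reads better than calling insert on each frame is a matter of taste; both approaches label every batch before the rows are combined.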