How do I append a different value for each iteration to a pandas DataFrame in a loop?


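I'm trying to add the stock name for each iteration to its corresponding batch of data as that batch is loaded into a pandas DataFrame.

Here is what I've tried so far: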

import datetime
import json
import re

import numpy as np
import pandas as pd
import pandas_datareader.data as web
import requests
import yfinance as yfin
from bs4 import BeautifulSoup
from tqdm import tqdm

################# fetch series names for sic ######################

sic_emisoras_df = pd.json_normalize(
    json.loads(
        requests.get('https://www.bmv.com.mx/es/Grupo_BMV/BmvJsonGeneric?idSitioPagina=6&mercado=CGEN_SCSOP&tipoValor=CGEN_CASEO&random=5845')
            .text
            .split(';(', 1)[-1]
            .split(')')[0]
        )['response']['resultado']
).dropna(axis=1, how='all')



####################################################################


# define time range:
start=datetime.date.today()-datetime.timedelta(days=14)

end=datetime.date.today()

# fetch data
# get all SIC names as list
stock_names = sic_emisoras_df["cveCorta"].values.tolist()


# append information per stock name

sic_market_df = pd.DataFrame([])
sic_market_df["stock_name"] = np.nan


for i in tqdm(stock_names):

     # fetch data per stock_name
    try:
       yfin.pdr_override()

       # append stock name
       sic_market_df["stock_name"]=i

       # fetch information by stock name
       data = web.DataReader(i,start,end)
       # append rows to empty dataframe
       sic_market_df = sic_market_df.append(data)
    except KeyError:
       pass



print("Fetched sic_market_df!")

The output only gets the name for the first iteration; every other batch gets NaN:

           stock_name       Open       High        Low      Close  Adj Close      Volume
2024-02-20         ZS  14.500000  14.950000  14.490000  14.700000  14.700000  30253100.0
2024-02-21         ZS  14.590000  14.860000  14.570000  14.790000  14.790000  23032400.0
2024-02-22         ZS  14.940000  15.280000  14.890000  15.240000  15.240000  35702500.0
2024-02-23         ZS  15.150000  15.290000  14.950000  15.130000  15.130000  22914900.0
2024-02-26         ZS  15.130000  15.480000  15.130000  15.280000  15.280000  23675800.0

I'd like to get a DataFrame in which each iteration's batch is identified by its own stock name, i.e. something like this:

           stock_name       Open       High        Low      Close  Adj Close      Volume
2024-02-20         ZS  14.500000  14.950000  14.490000  14.700000  14.700000  30253100.0
2024-02-21         ZS  14.590000  14.860000  14.570000  14.790000  14.790000  23032400.0
2024-02-22         ZS  14.940000  15.280000  14.890000  15.240000  15.240000  35702500.0
2024-02-23         ZS  15.150000  15.290000  14.950000  15.130000  15.130000  22914900.0
2024-02-26         ZS  15.130000  15.480000  15.130000  15.280000  15.280000  23675800.0
...               ...        ...        ...        ...        ...        ...         ...
2024-02-20       AAPL  14.500000  14.950000  14.490000  14.700000  14.700000  30253100.0
2024-02-21       AAPL  14.590000  14.860000  14.570000  14.790000  14.790000  23032400.0
2024-02-22       AAPL  14.940000  15.280000  14.890000  15.240000  15.240000  35702500.0
2024-02-23       AAPL  15.150000  15.290000  14.950000  15.130000  15.130000  22914900.0
2024-02-26       AAPL  15.130000  15.480000  15.130000  15.280000  15.280000  23675800.0

Package versions:

!pip show pandas  #1.5.3
!pip show beautifulsoup4  #4.12.3
!pip show pandas-datareader  #0.10.0

Could you help me with this?

python pandas sorting for-loop row
1 Answer

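It looks like you are adding the stock name in the wrong place and at the wrong time. Assigning sic_market_df["stock_name"] = i only touches rows that are already in the accumulated frame, so the batch fetched afterwards never receives a name. The stock name has to be added to the DataFrame returned by web.DataReader, for example with data["stock_name"] = i, or with the DataFrame.insert method if you want the new column placed on the left: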

       ...
       # fetch information by stock name
       data = web.DataReader(i,start,end)

       # insert the stock name to the left of the data
       data.insert(0, "stock_name", i)

       # append rows to empty dataframe
       sic_market_df = sic_market_df.append(data)
       ...
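
To see why the original loop mislabels rows, here is a minimal sketch that reproduces the pattern with small synthetic frames instead of real downloads. The frames batch_a and batch_b and their values are made up for illustration, and pd.concat stands in for the append call so the snippet also runs on newer pandas versions.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the frames web.DataReader would return (made-up data).
dates = pd.to_datetime(["2024-02-20", "2024-02-21"])
batch_a = pd.DataFrame({"Close": [1.0, 2.0]}, index=dates)
batch_b = pd.DataFrame({"Close": [3.0, 4.0]}, index=dates)

result = pd.DataFrame([])
result["stock_name"] = np.nan

for name, batch in [("ZS", batch_a), ("AAPL", batch_b)]:
    # Scalar assignment overwrites the column for every row already present;
    # it cannot label rows that have not been added yet.
    result["stock_name"] = name
    # The incoming batch has no stock_name column, so its rows get NaN.
    result = pd.concat([result, batch])

print(result["stock_name"].tolist())
```

In this sketch the rows from the first batch end up carrying the name of the last ticker processed, and the rows from the last batch stay NaN, which is why the label has to be set on each batch before it is combined.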

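It also seems reasonable to use pandas.concat instead of the append method, which was deprecated in pandas 1.4 and removed in pandas 2.0. For example: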

def get_data(stock_names, start, end):
    for stock in tqdm(stock_names):
        try:
            yfin.pdr_override()
            data = web.DataReader(stock, start, end)
            data.insert(0, "stock_name", stock)
            yield data
        except KeyError:
            pass
        
sic_market_df = pd.concat([*get_data(stock_names, start, end)])
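
As a variation on the same idea, pd.concat can also attach the labels itself when it is given a dict of frames: the dict keys become an outer index level, which can then be moved into a regular column. The sketch below is only an illustration under stated assumptions: fetch_prices is a hypothetical stand-in for web.DataReader that returns made-up numbers, so nothing is downloaded.

```python
import pandas as pd


def fetch_prices(ticker):
    """Hypothetical stand-in for web.DataReader(ticker, start, end)."""
    dates = pd.to_datetime(["2024-02-20", "2024-02-21"])
    base = float(len(ticker))  # made-up values, just to tell tickers apart
    frame = pd.DataFrame(
        {"Open": [base, base + 1.0], "Close": [base + 0.5, base + 1.5]},
        index=dates,
    )
    return frame.rename_axis("Date")


tickers = ["ZS", "AAPL"]
frames = {ticker: fetch_prices(ticker) for ticker in tickers}

if frames:
    # Dict keys become the outer index level, named "stock_name" here;
    # reset_index then turns that level into the leftmost column.
    combined = pd.concat(frames, names=["stock_name", "Date"]).reset_index(
        level="stock_name"
    )
else:
    # pd.concat raises ValueError when given nothing to concatenate.
    combined = pd.DataFrame()

print(combined)
```

The guard matters for the generator version as well: if every download fails, the list passed to pd.concat is empty and the call raises ValueError.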