Code returns an empty DataFrame; trouble understanding the logic


This code scrapes stock data from a website called cafef. The input is the site URL plus element locators taken from the page's HTML, and the expected output is a table of stock data: date, price, volume, and so on. However, the code does not work: it returns an empty DataFrame. I don't understand the second try/except block, so I can't debug this code. Could someone explain it to me?

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By  # By was missing, so the XPath lookups below raised NameError
from time import sleep
import pandas as pd

def crawl(stock):
    date = []
    price = []
    volume = []
    close = []
    stock_id = []
    # Selenium 4 dropped executable_path; the driver path goes through Service now
    browser = webdriver.Chrome(service=Service("./chromedriver"))
    browser.get("https://s.cafef.vn/Lich-su-giao-dich-" + stock + "-1.chn")
    sleep(5)
    for count in range(60):
        try:
            # find_elements() needs a locator strategy; the original bare-string call
            # raised an exception that the bare except silently swallowed, which is
            # the likely reason the lists (and the DataFrame) stayed empty
            date_data = browser.find_elements(By.CLASS_NAME, "Item_DateItem")
            for row in date_data:
                date.append(row.text)
                print(row.text)  # .text is a property, not a method
            # find_elements_by_class_name() was removed in Selenium 4
            price_data = browser.find_elements(By.CLASS_NAME, "Item_Price1")
            for row in price_data:
                price.append(row.text)
        except Exception:
            break
        # Pagination: on page 1 the "next" arrow is apparently the 21st cell of the
        # pager row; once a "previous" arrow appears it shifts to the 22nd (or 23rd)
        # cell, hence the nested fallback
        try:
            if count == 0:
                next_page = browser.find_element(By.XPATH, "/html/body/form/div[3]/div/div[2]/div[2]/div[1]/div[3]/div/div/div[2]/div[2]/div[2]/div/div/div/div/table/tbody/tr/td[21]/a")
            else:
                try:
                    next_page = browser.find_element(By.XPATH, "/html/body/form/div[3]/div/div[2]/div[2]/div[1]/div[3]/div/div/div[2]/div[2]/div[2]/div/div/div/div/table/tbody/tr/td[22]/a")
                except Exception:
                    next_page = browser.find_element(By.XPATH, "/html/body/form/div[3]/div/div[2]/div[2]/div[1]/div[3]/div/div/div[2]/div[2]/div[2]/div/div/div/div/table/tbody/tr/td[23]/a")
            next_page.click()
            sleep(5)
        except Exception:
            break  # no next-page link left: we reached the last page
    # Each data row contributes 10 Item_Price1 cells; cell 1 holds the close
    # price and cell 2 the volume
    for i in range(len(price) // 10):
        close.append(price[10 * i + 1].replace(",", ""))
        volume.append(price[10 * i + 2].replace(",", ""))
    for i in range(len(date)):
        stock_id.append(stock)
    d = {'Stock': stock_id, 'Date': date, 'Close': close, 'Volume': volume}
    df = pd.DataFrame(data=d)
    df.to_csv(stock + ".csv", index=False)
    return df

print(crawl('ABC'))
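
For context, the second try/except is the pagination step: it looks for the "next page" arrow in the pager row and clicks it. The nested try handles the arrow sitting in a different <td> after page 1, and the outer except breaks out of the loop when no arrow exists at all, i.e. on the last page. Absolute XPaths like these break whenever the layout shifts, so a sturdier approach is to match the arrow by its own markup. A minimal sketch, assuming the arrow is an <a> element reachable by a CSS selector (the selector below is a hypothetical placeholder; inspect the live pager on s.cafef.vn and adjust it):

from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException

def click_next(browser):
    # Advance the pager; return False when there is no next page.
    # "a[title='Next']" is an assumption, not the site's real markup --
    # substitute whatever attribute actually marks the "next" arrow.
    try:
        next_link = browser.find_element(By.CSS_SELECTOR, "a[title='Next']")
    except NoSuchElementException:
        return False
    next_link.click()
    return True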

I tried to locate the XPath elements, but I could not find them.
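
If the elements still can't be found, timing may be the culprit: the table is rendered after the page loads, and a fixed sleep(5) can fire too early, leaving find_elements() with nothing to return. An explicit wait is more reliable. A sketch using the Item_DateItem class name from the code above (the 10-second timeout is an arbitrary choice):

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Block for up to 10 seconds until the date cells are present in the DOM,
# instead of hoping a fixed sleep was long enough
wait = WebDriverWait(browser, 10)
date_data = wait.until(
    EC.presence_of_all_elements_located((By.CLASS_NAME, "Item_DateItem"))
)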

python dataframe selenium-webdriver web-crawler stock