Pulling td/HTML elements with the correct code - Selenium and Python

Question

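I have been trying to pull the Ticker (the stock's short name), Name of Stock, Price, Sector and Market Cap columns from https://www.tradingview.com/markets/stocks-turkey/market-movers-all-stocks/ but I am struggling to write code that pulls the right HTML elements. I tried using Selector Gadget to identify the XPath, but I am not very confident with the HTML tree and its rules. I noticed that the first 3 columns are treated as a single td in the page. My code is pasted below; at the moment it pulls the entire row. Thanks.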

from selenium import webdriver
from selenium.webdriver.common.by import By
import re
import pandas as pd

from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
import time
from selenium.common.exceptions import NoSuchElementException

driver = webdriver.Chrome()
website = 'https://www.tradingview.com/markets/stocks-turkey/market-movers-all-stocks/'
driver.get(website) #to open the website

while True:
    try:
        loadMoreButton = driver.find_element(By.XPATH,'//*[contains(concat( " ", @class, " " ), concat( " ", "content-D4RPB3ZC", " " ))]')
        time.sleep(2)
        loadMoreButton.click()
        time.sleep(5)
    except Exception as e:
        print (e)
        break
print ("Complete")
time.sleep(10)

matches = driver.find_elements(By.TAG_NAME,'tr')

ticker_symbol = []
ticker_name = []
ticker_price =[]
ticker_sector =[]
ticker_marketcap =[]

for match in matches:
    print(match.text)

driver.quit()

Answer 1

I fixed a few issues: I replaced the `.sleep()` calls with a proper `WebDriverWait`, and I updated the locators.

The working code is below.

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException, StaleElementReferenceException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait

url = 'https://www.tradingview.com/markets/stocks-turkey/market-movers-all-stocks/'
driver = webdriver.Chrome()
driver.maximize_window()
driver.get(url)

# Keep clicking "Load More" until the button is gone or has gone stale.
while True:
    try:
        driver.find_element(By.XPATH,'//span[text()="Load More"]').click()
    except (NoSuchElementException, StaleElementReferenceException):
        break

wait = WebDriverWait(driver, 10)
rows = wait.until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR,'table[class="table-Ngq2xrcG"] tr.listRow')))
for row in rows:
    ticker_symbol = row.find_element(By.XPATH, './td[1]//a').text
    ticker_name = row.find_element(By.XPATH, './td[1]//sup').text
    ticker_price = row.find_element(By.XPATH, './td[2]').text
    ticker_marketcap = row.find_element(By.XPATH, './td[6]').text
    try:
        ticker_sector = row.find_element(By.XPATH, './td[11]/a').text
    except NoSuchElementException:
        ticker_sector = "—"

    print(ticker_symbol, ticker_name, ticker_price, ticker_marketcap, ticker_sector)

driver.quit()

The working code prints:

A1CAP A1 CAPITAL YATIRIM 24.76 TRY 3.38B TRY Finance
ACSEL ACIPAYAM SELULOZ 99.7 TRY 1.104B TRY Process Industries
ADEL ADEL KALEMCILIK 322.50 TRY 7.69B TRY Consumer Durables
...
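
The price and market cap come back as strings such as `24.76 TRY` and `3.38B TRY`. If you want numbers you can sort or sum, a small helper along these lines might work. This is only a sketch: `parse_amount` is a hypothetical name, and the `K`, `M` and `T` suffixes are my assumption, since only `B` shows up in the output above.

```python
import re
from decimal import Decimal

# Assumed suffix multipliers; only "B" appears in the sample output.
_MULTIPLIERS = {
    "": Decimal(1),
    "K": Decimal(10) ** 3,
    "M": Decimal(10) ** 6,
    "B": Decimal(10) ** 9,
    "T": Decimal(10) ** 12,
}

# number, optional magnitude suffix, optional three-letter currency code
_AMOUNT_RE = re.compile(r"^\s*(\d[\d,]*(?:\.\d+)?)\s*([KMBT]?)\s*([A-Z]{3})?\s*$")


def parse_amount(text):
    """Return (value, currency) for strings like '3.38B TRY', or None if unparseable."""
    match = _AMOUNT_RE.match(text)
    if match is None:
        return None  # e.g. a placeholder cell with no number in it
    number, suffix, currency = match.groups()
    # Decimal avoids binary floating-point rounding on values like 3.38.
    value = Decimal(number.replace(",", "")) * _MULTIPLIERS[suffix]
    return value, currency


if __name__ == "__main__":
    for cell in ("24.76 TRY", "3.38B TRY", "1.104B TRY"):
        print(cell, "->", parse_amount(cell))
```

Inside the row loop you could call it as `parse_amount(ticker_marketcap)` before storing the value.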
