Send keys function is not working properly in Selenium Python

Problem description (0 votes, 1 answer)

Take a look at this site: https://www.arabam.com/ilan/sahibinden-satilik-mercedes-benz-cla-180-d-style/sahibinden-boyasiz-hasarsiz-cam-tavan-temiz-arac/14229201

I press the End key so it jumps to the bottom of the page, then press the Up key one step at a time until it finds this element (screenshot in the original post).

It used to work fine, but now it no longer seems to.

options.add_argument('window-size=1200x600')
prefs = {"profile.default_content_setting_values.geolocation": 2,
         "profile.default_content_setting_values.notifications": 2}
options.add_experimental_option("prefs", prefs)
d = webdriver.Chrome(chrome_options=options,
                     executable_path='./chromedriver')
d.get(features["ad_url"])
# Use send_keys(Keys.END) to scroll to the bottom of the page
d.find_element_by_tag_name('body').send_keys(Keys.END)
while True:
    d.find_element_by_tag_name('body').send_keys(Keys.UP)
    time.sleep(1)
    e = d.find_element_by_xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[3]/div")
    if e.text:
        break
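
For what it's worth, the same jump to the bottom can be done with JavaScript instead of key presses, which sidesteps focus problems when the body element does not actually receive the keystrokes. A minimal sketch, assuming d is the driver created above:

# Jump straight to the bottom of the document, equivalent to pressing End
d.execute_script("window.scrollTo(0, document.body.scrollHeight);")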

Here is the complete code if you want to try it:

import json
import scrapy
from scrapy.spiders import SitemapSpider
from scrapy.crawler import CrawlerProcess
from selenium import webdriver
from datetime import datetime
from selenium.webdriver.common.keys import Keys
import pickle
import time


class Myspider(SitemapSpider):
    name = 'spidername'
    sitemap_urls = ['https://www.arabam.com/sitemap/otomobil_1.xml','https://www.arabam.com/sitemap/otomobil_2.xml',
                    'https://www.arabam.com/sitemap/otomobil_3.xml','https://www.arabam.com/sitemap/otomobil_4.xml',
                    'https://www.arabam.com/sitemap/otomobil_5.xml','https://www.arabam.com/sitemap/otomobil_6.xml',
                    'https://www.arabam.com/sitemap/otomobil_7.xml','https://www.arabam.com/sitemap/otomobil_8.xml',
                    'https://www.arabam.com/sitemap/otomobil_9.xml','https://www.arabam.com/sitemap/otomobil_10.xml',
                    'https://www.arabam.com/sitemap/otomobil_11.xml','https://www.arabam.com/sitemap/otomobil_12.xml',
                    'https://www.arabam.com/sitemap/otomobil_13.xml']


    sitemap_rules = [
        ('/otomobil/', 'parse'),
    ]
    custom_settings = {'FEED_FORMAT': 'csv',
                       'FEED_URI': "arabam_" + str(datetime.today().strftime('%d%m%y')) + '.csv'}

    # Bookkeeping attributes used in parse(); these were missing from the
    # original snippet, so self.crawled / self.new_links raised AttributeError
    crawled = []
    new_links = 0

    def parse(self, response):
        for td in response.xpath("/html/body/div[3]/div[6]/div[4]/div/div[2]/table/tbody/tr/td[4]/div/a"):
            link = td.xpath("@href").extract()
            year = td.xpath("text()").extract()
            self.crawled.append(link[0])
            self.new_links += 1
            if int(year[0]) > 2010:
                url = "https://www.arabam.com/" + link[0]
                yield scrapy.Request(url, callback=self.parse_dir_contents)

    def parse_dir_contents(self, response):
        features = {}
        # The original snippet never set this key, so d.get() below raised
        # KeyError; the detail-page URL is the URL of this response
        features["ad_url"] = response.url

        options = webdriver.ChromeOptions()
        # options.add_argument('headless')
        options.add_argument('window-size=1200x600')
        prefs = {"profile.default_content_setting_values.geolocation": 2,
                 "profile.default_content_setting_values.notifications": 2}
        options.add_experimental_option("prefs", prefs)
        d = webdriver.Chrome(chrome_options=options,
                             executable_path='./chromedriver')
        d.get(features["ad_url"])
        # Use send_keys(Keys.END) to scroll to the bottom of the page
        d.find_element_by_tag_name('body').send_keys(Keys.END)
        while True:
            d.find_element_by_tag_name('body').send_keys(Keys.UP)
            time.sleep(1)
            e = d.find_element_by_xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[3]/div")
            if e.text:
                break

        overview1 = e.text.split("\n")
        d.quit()  # close the browser; the original never quit, leaking one Chrome per page

        yield features



process = CrawlerProcess({})

process.crawl(Myspider)
process.start()  # the script will block here until the crawling is finished

Edit: I commented things out and ran the code, and it turns out the keys are being sent. The problem is locating that particular div. I tried wrapping it in a try/except, but that does not seem to help.

while True:
    d.find_element_by_tag_name('body').send_keys(Keys.UP)
    time.sleep(1)
    try:
        e = d.find_element_by_xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[3]/div")
        if e.text:
            break
    except:
        pass
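
Instead of hand-rolling this polling loop, Selenium's WebDriverWait can poll for the same condition with a timeout. A minimal sketch, assuming d is the driver from the snippets above and reusing the same XPath:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait

target_xpath = "/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[3]/div"

# Poll every 0.5 s (the default) for up to 20 s until the div exists and has
# non-empty text; the lambda returns the element (truthy) or False (keep waiting)
e = WebDriverWait(d, 20).until(
    lambda drv: next(
        (el for el in drv.find_elements(By.XPATH, target_xpath) if el.text.strip()),
        False,
    )
)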

Edit:

This is what I did to scroll up, but unfortunately it does not work in most cases.

for i in range(0, 37):
    d.find_element_by_tag_name('body').send_keys(Keys.UP)
    time.sleep(1)

e = d.find_element_by_xpath("/html/body/div[3]/div[6]/div[3]/div/div[1]/div[3]/div/div[3]/div[2]/div")

overview1 = e.text.split("\n")
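
An alternative to a fixed count of UP key presses is to scroll in pixel steps with JavaScript, which does not depend on the body element receiving the keystrokes. A rough sketch, assuming d is the driver from above; the step size and iteration count are guesses that would need tuning per page:

import time

# Scroll towards the top in ~250 px steps (negative y scrolls up),
# pausing briefly so lazily-loaded content has a chance to render
for _ in range(40):
    d.execute_script("window.scrollBy(0, -250);")
    time.sleep(0.3)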

Edit: Tried the following. It scrolls the element into view, but still does not get the element's text.

e = d.find_element_by_xpath("//div[@id = 'js-hook-appendable-technicalPropertiesWrapper' and @class = 'cf' ]")

actions = ActionChains(d)
actions.move_to_element(e).perform()
wait = WebDriverWait(d, 20)
wait.until(EC.visibility_of_element_located((By.XPATH, "//div[@id = 'js-hook-appendable-technicalPropertiesWrapper' and @class = 'cf' ]")))
overview1 = e.text.split("\n")

Edit: screenshot of the HTML (image in the original post).

python selenium selenium-webdriver scrapy sendkeys
1 Answer

0 votes

Adding this as an answer, since it is a bit long for a comment.

First, you need to wait for the element to appear, and only then find it and extract the value. In your code the element lookup happens before the visibility check. The other thing you can try is scrolling to the specific element before extracting the value: this particular table only seems to load its values once it is in the viewport.

from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait for the table to become visible before looking it up
wait = WebDriverWait(d, 20)
wait.until(EC.visibility_of_element_located((By.XPATH, "//div[@id = 'js-hook-appendable-technicalPropertiesWrapper' and @class = 'cf' ]")))

e = d.find_element_by_xpath("//div[@id = 'js-hook-appendable-technicalPropertiesWrapper' and @class = 'cf' ]")

# Scroll to the element so the lazily-loaded values render
actions = ActionChains(d)
actions.move_to_element(e).perform()
d.execute_script("arguments[0].scrollIntoView(true);", e)

# Check what text value you actually get
print(e.text)
print(e.text.split("\n"))
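
If the table is visible but its text fills in late, one more option is to wait on the text itself rather than on visibility alone. A small sketch along the same lines, reusing the wait object and XPath from the snippet above (WebDriverWait ignores NoSuchElementException by default while polling):

# Keep polling until the element's text is non-empty, then extract it
wait.until(lambda drv: drv.find_element_by_xpath(
    "//div[@id = 'js-hook-appendable-technicalPropertiesWrapper' and @class = 'cf' ]"
).text.strip() != "")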