How to continue if an element is not found - scraping with Python

Question  Votes: 0  Answers: 1

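I am scraping a page that is basically a search engine. A worksheet containing client codes (called CPF) sends keys to the page, which then returns some information that I want to scrape back into that worksheet. The scraping code is nearly done, but I cannot handle wrong client codes.

The page works like this:

1 - If the client code is correct, the page redirects and shows some information that I can scrape;

2 - If the client code is not all digits, the "Search" button does nothing;

3 - If the client code is all digits but wrong, the page shows a pop-up.

In cases 2 and 3, I want to print something (CPF Invalido) and move on to the next client code. This is the code I have so far: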

    for cpf in self.cpfs:
        print(f"Procurando {cpf}.")

        self.driver.get(self.bot_url)

        cpf_input = self.driver.find_element_by_xpath('//*[@id="search"]/div/div[1]/input')
        cpf_input.send_keys(cpf)

        time.sleep(2)

        cpfButton = self.driver.find_element_by_xpath('//*[@id="search"]/div/div[2]/button')
        cpfButton.click()

        time.sleep(2)

        self.delay = 3  # seconds

        nome = self.driver.find_element_by_xpath("/html/body/main[1]/div[1]/div[1]/div[1]/div[1]/h2").text
        idade = self.driver.find_element_by_xpath("/html/body/main[1]/div[1]/div[1]/div[1]/div[1]/ul/li[2]").text
        age = re.search(r'\((.*?)Anos', idade).group(1)
        beneficio = self.driver.find_element_by_xpath(
            "/html/body/main[1]/div[1]/div[1]/div[1]/div[2]/div[5]/span/b").text
        concessao = self.driver.find_element_by_xpath("/html/body/main[1]/div[1]/div[1]/div[1]/div[2]/div[2]/span").text
        salario = self.driver.find_element_by_xpath(
            "/html/body/main[1]/div[1]/div[2]/div/div[3]/div[1]/div[1]/span").text
        bancos = self.driver.find_element_by_xpath('//*[@id="loans"]').text
        bancosw = re.findall(r'(?<=Banco )(\w+)', bancos)
        bankslist = ', '.join(bancosw)
        bancocard = self.driver.find_element_by_xpath('//*[@id="cards"]').text
        bcardw = re.findall(r'(?<=Banco )(\w+)', bancocard)
        bcardlist = ', '.join(bcardw)
        consig = self.driver.find_element_by_xpath("/html/body/main[1]/div[1]/div[1]/div[3]/div[2]/span").text
        card = self.driver.find_element_by_xpath("/html/body/main[1]/div[1]/div[1]/div[3]/div[3]/span").text

        try:
            WebDriverWait(self.driver, self.delay).until(
                EC.presence_of_element_located((By.XPATH, '//*[@id="main"]/div[1]/h2')))
            print('CPF Valido')

            print(nome, age, beneficio, concessao, salario, bankslist, bcardlist, consig, card)

        except NoSuchElementException:
            print('CPF Invalido')

        nomes.append(nome)
        idades.append(age)
        beneficios.append(beneficio)
        concessoes.append(concessao)
        salarios.append(salario)
        bancoss.append(bankslist)
        bancoscard.append(bcardlist)
        consigs.append(consig)
        cards.append(card)

    return nomes, idades, beneficios, concessoes, salarios, bancoss, bancoscard, consigs, cards

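I am trying to use an element of the page that only appears when the client code is correct, so that on NoSuchElementException the code prints CPF Invalido and carries on with the next client code.

In case 2, the error is: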

Traceback (most recent call last):
  File "C:/Users/MOISA/PycharmProjects/inss2/cpf_updater.py", line 47, in <module>
    cpf_updater.process_cpf_list()
  File "C:/Users/MOISA/PycharmProjects/inss2/cpf_updater.py", line 32, in process_cpf_list
    nomes, idades, beneficios, concessoes, salarios, bancoss, bancoscard, consigs, cards = bot_url.search_cpfs()
  File "C:\Users\MOISA\PycharmProjects\inss2\k_bot.py", line 66, in search_cpfs
    nome = self.driver.find_element_by_xpath("/html/body/main[1]/div[1]/div[1]/div[1]/div[1]/h2").text
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: Unable to locate element: /html/body/main[1]/div[1]/div[1]/div[1]/div[1]/h2

In case 3, it gives:

Traceback (most recent call last):
  File "C:/Users/MOISA/PycharmProjects/inss2/cpf_updater.py", line 47, in <module>
    cpf_updater.process_cpf_list()
  File "C:/Users/MOISA/PycharmProjects/inss2/cpf_updater.py", line 32, in process_cpf_list
    nomes, idades, beneficios, concessoes, salarios, bancoss, bancoscard, consigs, cards = bot_url.search_cpfs()
  File "C:\Users\MOISA\PycharmProjects\inss2\k_bot.py", line 66, in search_cpfs
    nome = self.driver.find_element_by_xpath("/html/body/main[1]/div[1]/div[1]/div[1]/div[1]/h2").text
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\MOISA\PycharmProjects\inss2\venv\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 241, in check_response
    raise exception_class(message, screen, stacktrace, alert_text)
selenium.common.exceptions.UnexpectedAlertPresentException: Alert Text: None
Message: Dismissed user prompt dialog: Nenhum benefício foi localizado para este CPF.

Here is cpf_updater:

    def process_cpf_list(self):
        cpfs = self.sheet.col_values(self.cpf_col)[1:]

        bot_url = BOT(cpfs)
        try:
            nomes, idades, beneficios, concessoes, salarios, bancoss, bancoscard, consigs, cards = bot_url.search_cpfs()
            print("Atualizando...")
            for i in range(len(nomes)):
                self.sheet.update_cell(i + 2, self.nome_col, nomes[i])
                self.sheet.update_cell(i + 2, self.age_col, idades[i])
                self.sheet.update_cell(i + 2, self.beneficio_col, beneficios[i])
                self.sheet.update_cell(i + 2, self.concessao_col, concessoes[i])
                self.sheet.update_cell(i + 2, self.salario_col, salarios[i])
                self.sheet.update_cell(i + 2, self.bancos_col, bancoss[i])
                self.sheet.update_cell(i + 2, self.bancocard_col, bancoscard[i])
                self.sheet.update_cell(i + 2, self.consig_col, consigs[i])
                self.sheet.update_cell(i + 2, self.card_col, cards[i])

        except NoSuchElementException:
            print('CPF Invalido')

cpf_updater = CpfSearch('TESTE')
cpf_updater.process_cpf_list()
python web-scraping try-catch except nosuchelementexception
1 answer
0 votes
You are getting a NoSuchElementException on line 47 of cpf_updater.py. You should wrap the relevant part in a try/except and handle NoSuchElementException.
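Case 2 (a code that is not all digits) could also be caught before the page is touched at all, which avoids relying on NoSuchElementException for it. A minimal sketch, assuming the codes arrive from the sheet as strings; `is_numeric_cpf` is a hypothetical helper name, not part of the original code:

```python
def is_numeric_cpf(cpf: str) -> bool:
    """Return True only for a non-empty string made of ASCII digits."""
    # str.isdigit() alone also accepts characters such as superscript digits,
    # so the ASCII check narrows it to 0-9.
    return cpf.isascii() and cpf.isdigit()


if __name__ == "__main__":
    for code in ["12345678901", "123.456.789-01", ""]:
        if is_numeric_cpf(code):
            print(f"Procurando {code}.")
        else:
            print(f"CPF Invalido: {code!r}")
```

Inside the `for cpf in self.cpfs:` loop this check would come first, followed by `continue` when it fails. It only covers case 2; an all-digit code that the site rejects still needs the exception handling.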

For case 3: at the same place, you should also handle UnexpectedAlertPresentException. This exception usually occurs when a modal, a pop-up, or an alert appears.
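Both tracebacks in the question show the exception being raised inside search_cpfs (k_bot.py, line 66), so for the remaining codes to be processed the handler most likely has to sit inside the per-CPF loop rather than around the whole call. A minimal sketch of that pattern, kept independent of Selenium so it can be run as is; `scrape_all`, `scrape_one`, and `skip_on` are hypothetical names introduced for illustration:

```python
from typing import Callable, Iterable, List, Optional, Tuple, Type


def scrape_all(
    cpfs: Iterable[str],
    scrape_one: Callable[[str], dict],
    skip_on: Tuple[Type[BaseException], ...],
) -> List[Tuple[str, Optional[dict]]]:
    """Run scrape_one on every code; a listed exception skips only that code."""
    results = []
    for cpf in cpfs:
        try:
            # Every element lookup for this code belongs inside the try;
            # a lookup placed before it would still abort the whole loop.
            data = scrape_one(cpf)
        except skip_on:
            print(f"CPF Invalido: {cpf}")
            data = None  # placeholder so later codes keep their position
        results.append((cpf, data))
    return results
```

With Selenium, `skip_on` would presumably be `(NoSuchElementException, UnexpectedAlertPresentException)` from `selenium.common.exceptions`. If the validity check goes through `WebDriverWait(...).until(...)`, as in the question, `TimeoutException` should be in the tuple as well, because that is what `until` raises when the condition is never met.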

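I am not entirely sure which line corresponds to line 47 of cpf_updater.py, but that is where the problem originates.

Edit: It looks like you need to wrap the following in a try, with excepts for the two exceptions above. I believe the error is caused by the function call on the first line, and the result variables depend on that call.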

    nomes, idades, beneficios, concessoes, salarios, bancoss, bancoscard, consigs, cards = bot_url.search_cpfs()
    print("Atualizando...")
    for i in range(len(nomes)):
        self.sheet.update_cell(i + 2, self.nome_col, nomes[i])
        self.sheet.update_cell(i + 2, self.age_col, idades[i])
        self.sheet.update_cell(i + 2, self.beneficio_col, beneficios[i])
        self.sheet.update_cell(i + 2, self.concessao_col, concessoes[i])
        self.sheet.update_cell(i + 2, self.salario_col, salarios[i])
        self.sheet.update_cell(i + 2, self.bancos_col, bancoss[i])
        self.sheet.update_cell(i + 2, self.bancocard_col, bancoscard[i])
        self.sheet.update_cell(i + 2, self.consig_col, consigs[i])
        self.sheet.update_cell(i + 2, self.card_col, cards[i])
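One thing to watch once invalid codes are skipped: the update loop writes list index i to sheet row i + 2, so if an invalid CPF simply adds nothing to the lists, every later result lands on the wrong row. A hedged sketch of one way to keep rows aligned, assuming one result per CPF in sheet order with `None` marking an invalid code; `rows_to_update` and the dict shape are illustrative assumptions, not the original code:

```python
from typing import Iterable, Iterator, Optional, Tuple


def rows_to_update(
    results: Iterable[Optional[dict]], first_row: int = 2
) -> Iterator[Tuple[int, dict]]:
    """Yield (sheet_row, data) for valid results, keeping row numbers aligned.

    `results` holds one entry per CPF in sheet order; None marks an invalid CPF.
    """
    for offset, data in enumerate(results):
        if data is None:
            continue  # invalid CPF: leave its row untouched
        yield first_row + offset, data


if __name__ == "__main__":
    scraped = [{"nome": "A"}, None, {"nome": "B"}]
    for row, data in rows_to_update(scraped):
        print(row, data["nome"])
```

Each yielded pair could then feed calls such as `self.sheet.update_cell(row, self.nome_col, data["nome"])`, where the `"nome"` key is an assumption about how the scraped fields are stored.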
