Selenium: searching for option elements by tag name

Question  Votes: 2  Answers: 3

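I'm trying to scrape all the data from a site called Correios. The site has several dropdown menus I need to work with, and I've run into a problem: my code returns a list padded with a bunch of empty strings.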

from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()

dropdownEstados = driver.find_elements_by_xpath("""//*[@id="estadoAgencia"]""")

optEstados = driver.find_elements_by_tag_name("option")

for valores in optEstados:
    print(valores.text.encode())

This is what I get:

b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''
b''

How do I get rid of the empty b'' entries?

python selenium selenium-webdriver drop-down-menu webdriver
3 Answers
0
votes

If I understand correctly, you want to find all the options in the state dropdown.


Try this XPath to locate the dropdown's elements:

//*[@id="estadoAgencia"]/option
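To see why scoping the search matters, here is a minimal sketch that does not need a browser. It uses the standard library's `xml.etree.ElementTree` as a stand-in for the page, and the markup, including the second dropdown's id `municipioAgencia`, is invented for illustration and not copied from the Correios site.

```python
import xml.etree.ElementTree as ET

# Invented markup: two dropdowns, the second one not yet populated.
PAGE = """
<form>
  <select id="estadoAgencia">
    <option></option>
    <option>ACRE</option>
    <option>ALAGOAS</option>
  </select>
  <select id="municipioAgencia">
    <option></option>
  </select>
</form>
"""

root = ET.fromstring(PAGE)

# Page-wide search: every <option>, whichever <select> it belongs to.
all_options = [opt.text or "" for opt in root.iter("option")]

# Scoped search: only <option> children of the element with this id.
scoped = [
    opt.text or ""
    for opt in root.findall(".//*[@id='estadoAgencia']/option")
]

print(all_options)
print(scoped)
```

The page-wide list picks up the blank option of the second dropdown as well, while the scoped list contains only the entries of `estadoAgencia`.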

Code example:

from selenium import webdriver

chrome_path = r"C:\Users\Gustavo\Desktop\geckodriver\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
lista_x = []
driver.get("http://www2.correios.com.br/sistemas/agencias/")
driver.maximize_window()

dropdownEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']")

# find elements in dropdown
optEstados = driver.find_elements_by_xpath("//*[@id='estadoAgencia']/option")

for valores in optEstados:
    print(valores.text.encode())

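With this XPath you get only the options that belong to this dropdown. One empty string remains, because it is part of the dropdown itself. Output: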

b''
b'ACRE'
b'ALAGOAS'
b'AMAP\xc3\x81'
b'AMAZONAS'
b'BAHIA'
b'CEAR\xc3\x81'
b'DISTRITO FEDERAL'
b'ESP\xc3\x8dRITO SANTO'
b'GOI\xc3\x81S'
b'MARANH\xc3\x83O'
b'MINAS GERAIS'
b'MATO GROSSO DO SUL'
b'MATO GROSSO'
b'PAR\xc3\x81'
b'PARA\xc3\x8dBA'
b'PERNAMBUCO'
b'PIAU\xc3\x8d'
b'PARAN\xc3\x81'
b'RIO DE JANEIRO'
b'RIO GRANDE DO NORTE'
b'ROND\xc3\x94NIA'
b'RORAIMA'
b'RIO GRANDE DO SUL'
b'SANTA CATARINA'
b'SERGIPE'
b'S\xc3\x83O PAULO'
b'TOCANTINS'

Note: the first element is an empty string because the first option in this dropdown is blank.
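If that remaining blank entry is unwanted too, it can be dropped after the lookup. A minimal sketch, assuming only that each element exposes a `.text` attribute the way Selenium's elements do; `option_texts` is a hypothetical helper name, not part of Selenium.

```python
def option_texts(elements):
    """Return the stripped, non-empty text of each element."""
    texts = []
    for element in elements:
        text = element.text.strip()
        if text:  # skip blank placeholder options
            texts.append(text)
    return texts
```

Calling `option_texts(optEstados)` would then give the state names as ordinary strings, without the `.encode()` step that produces the `b'...'` output.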


0
votes

Your code needs a few small changes:

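# Locate the single <select> element first...
dropdownEstados = driver.find_element_by_xpath("""//*[@id="estadoAgencia"]""")
# ...then search for <option> tags inside it, not across the whole page.
optEstados = dropdownEstados.find_elements_by_tag_name("option")

for valores in optEstados:
    print(valores.text.encode())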
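The `find_element_by_*` helpers used in this thread are no longer available in recent Selenium 4 releases, where lookups go through `find_element(by, value)` and `find_elements(by, value)`. Below is a rough sketch of the same scoped lookup in that style. To keep it runnable without Selenium installed, it passes the locator names as plain strings, which are the values that `By.ID` and `By.TAG_NAME` hold; in real code you would normally import `By` from `selenium.webdriver.common.by`. The function name `state_names` is hypothetical.

```python
# Locator strategy names; these are the same strings that
# selenium.webdriver.common.by.By.ID and By.TAG_NAME hold.
BY_ID = "id"
BY_TAG_NAME = "tag name"


def state_names(driver, select_id="estadoAgencia"):
    """Return the text of every <option> inside one <select>."""
    # Find the dropdown itself, then search only within it.
    dropdown = driver.find_element(BY_ID, select_id)
    options = dropdown.find_elements(BY_TAG_NAME, "option")
    return [option.text for option in options]
```

With a live driver this would be called as `state_names(driver)` once the page has loaded.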

0
votes

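To retrieve the text of all the <option> elements in the dropdown with ID estadoAgencia: since it is a <select> tag, it is easier and more effective to use the methods associated with the <select> tag. You can use the following solution: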

  • Code block:

      from selenium.webdriver.support.ui import Select

      estado_select = Select(driver.find_element_by_id('estadoAgencia'))
      for opt in estado_select.options:
          print(opt.get_attribute('innerHTML'))

  • Console output:

      ACRE
      ALAGOAS
      AMAPÁ
      AMAZONAS
      BAHIA
      CEARÁ
      DISTRITO FEDERAL
      ESPÍRITO SANTO
      GOIÁS
      MARANHÃO
      MINAS GERAIS
      MATO GROSSO DO SUL
      MATO GROSSO
      PARÁ
      PARAÍBA
      PERNAMBUCO
      PIAUÍ
      PARANÁ
      RIO DE JANEIRO
      RIO GRANDE DO NORTE
      RONDÔNIA
      RORAIMA
      RIO GRANDE DO SUL
      SANTA CATARINA
      SERGIPE
      SÃO PAULO
      TOCANTINS
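One caveat with `get_attribute('innerHTML')`: it returns markup rather than rendered text, so a character such as `&` comes back as the entity `&amp;`. A minimal sketch of cleaning such a string with the standard library; `clean_inner_html` is a hypothetical helper name, and the sample string in the usage note is made up, not taken from the Correios page.

```python
import html


def clean_inner_html(raw):
    """Decode HTML entities in an innerHTML string and trim whitespace."""
    return html.unescape(raw).strip()
```

For example, `clean_inner_html(" A &amp; B ")` gives `"A & B"`. For plain state names like the ones above this changes nothing, so it only matters if option labels contain reserved characters.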