TypeError: 'str' object is not callable, BeautifulSoup category and subcategory href pr

Question. Votes: 0, Answers: 1

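I am trying to scrape a website. I can get the product details, but I get an error when I try to collect all the category and subcategory links so that I can visit every page. The error seems to say that the link is a string, yet when I open the link manually in a browser I can reach the site. I have added the error below the code.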

import requests
from tqdm import tqdm
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.common.exceptions import *
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
import json
import pandas as pd
from unidecode import unidecode
from webdriver_manager.chrome import ChromeDriverManager
browser = webdriver.Chrome(ChromeDriverManager().install())

URL = 'https://www.tapwarehouse.com/'

def get_category_links(URL):
    category_links = []
    browser.get(URL)
    html = browser.page_source
    soup = BeautifulSoup(html,'html5lib')
    cat=soup.find_all("ul",{"class":"c-nav__list"})[0].find_all('a')
    for i in cat:
        try:
            link=i["href"]
            if link=='javascript:void(0)':
                pass
            else:
                category_links.append("https://www.tapwarehouse.com"+i["href"])
        except:
            pass
    return category_links

def get_sub_category_links(URL):
    sub_category_links=[]
    browser.get(URL)
    html = browser.page_source
    soup = BeautifulSoup(html,'html5lib')
    for link in soup.find_all('a', {'class': "m-categories__menu__link"}):
        sub_category_links.append("https://www.tapwarehouse.com/"+link["href"])
    return sub_category_links

response = []
sublist=[]
urllist=[]
for cat_link in get_category_links(URL = URL):
    for subcat_obj in get_sub_category_links(URL = cat_link):
        try:
            get_sub_category_links = subcat_obj
            print(f'sub category is {get_sub_category_links}')
            sublist.append(get_sub_category_links)
            sublist = list(set(sublist))
        except:
            pass

TypeError                                 Traceback (most recent call last)
<ipython-input-22-8873cbd974a9> in <module>
      4 for cat_link in get_category_links(URL = URL):
----> 5     for subcat_obj in get_sub_category_links(URL = cat_link):

TypeError: 'str' object is not callable
python beautifulsoup href
1 Answer
0 votes

You have bound the name get_sub_category_links twice: once as a function and once as a variable (inside the try/except):
get_sub_category_links = subcat_obj

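Once that assignment has run, get_sub_category_links refers to a string instead of the function, so the next call get_sub_category_links(URL = cat_link) raises the TypeError. The URL itself is not the problem.

Use a different name for the variable in the loop, for example sub_category_link. Replace the code inside the try/except with: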

sub_category_link = subcat_obj
print(f'sub category is {sub_category_link}')
sublist.append(sub_category_link)
sublist = list(set(sublist))
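
To see the effect in isolation, here is a minimal sketch that reproduces the shadowing without Selenium or a network connection. The names get_links, shadowing_demo and fixed_demo and the page strings are hypothetical stand-ins, not part of the original code; get_links only plays the role of get_sub_category_links.

```python
def get_links(url):
    # Hypothetical stand-in for get_sub_category_links: returns a list of strings.
    return [url + "/a", url + "/b"]


def shadowing_demo():
    # 'fetch' starts out bound to the function, like the original function name.
    fetch = get_links
    error = None
    try:
        for page in ["site1", "site2"]:
            for item in fetch(page):
                # Same mistake as in the question: the name that held the
                # function is rebound to a string inside the loop.
                fetch = item
    except TypeError as exc:
        # Raised on the second outer iteration, when fetch is already a str.
        error = str(exc)
    return error


def fixed_demo():
    # Fix: use a separate name for the loop value, so get_links stays a function.
    seen = set()
    for page in ["site1", "site2"]:
        for link in get_links(page):
            seen.add(link)
    return sorted(seen)


if __name__ == "__main__":
    print(shadowing_demo())
    print(fixed_demo())
```

The first outer iteration works because the function is called before the name is rebound; the failure only appears on the next call. In a notebook, re-running the cell fails immediately for the same reason, until the cell that defines the function is run again.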