BeautifulSoup only returns None [closed]


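I am trying to extract information from this link: https://wuzzuf.net/jobs/p/EVUpYcDnxix7-Odoo-Developer-Yodawy-Med-Giza-Egypt?o=2&l=sp&t=sj&a=search-v3|hpb. I want the job title and other details, and I have tried most of the elements I need, but find() always returns None. When I try to read the text, I get this error: AttributeError: 'NoneType' object has no attribute 'text'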

import requests
from bs4 import BeautifulSoup

jobTitleList = []
link = "https://wuzzuf.net/jobs/p/EVUpYcDnxix7-Odoo-Developer-Yodawy-Med-Giza-Egypt?o=2&l=sp&t=sj&a=search-v3|hpb"
result = requests.get(link)
print(link)
src = result.content
soup = BeautifulSoup(src, "lxml")
# find() returns None here, so the .text access below raises AttributeError
jobTitle = soup.find("h1", {"class": "css-f9uh36"})

jobTitleList.append(jobTitle.text)

print(jobTitleList)
python web-scraping web beautifulsoup information-extraction
1 Answer

Try setting the User-Agent HTTP header to get the correct response from the server:

import requests
from bs4 import BeautifulSoup

jobTitleList = []

link = "https://wuzzuf.net/jobs/p/EVUpYcDnxix7-Odoo-Developer-Yodawy-Med-Giza-Egypt?o=2&l=sp&t=sj&a=search-v3|hpb"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0"
}

result = requests.get(link, headers=headers)

soup = BeautifulSoup(result.content, "lxml")
jobTitle = soup.find("h1", class_="css-f9uh36")
jobTitleList.append(jobTitle.text)

print(jobTitleList)

Prints:

['Odoo Developer']
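
If the server still sends back a different page (for example a block page), find() returns None again and the .text access fails with the same AttributeError as in the question. A minimal sketch of a guard against that is below. It assumes the title sits in an h1 with the class from the question; extract_title and the two inline HTML strings are hypothetical names and stand-in data for illustration, not the site's real markup. It uses the built-in "html.parser" so it runs without lxml.

```python
from bs4 import BeautifulSoup


def extract_title(html, css_class="css-f9uh36"):
    """Return the stripped <h1> text, or None when the tag is missing.

    extract_title is a hypothetical helper; the default class name is
    the one used in the question and may change on the live site.
    """
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("h1", class_=css_class)
    # find() returns None when nothing matches, so check before reading text
    return tag.get_text(strip=True) if tag is not None else None


# Stand-in documents (assumptions): one with the expected tag, one without
good_html = '<html><body><h1 class="css-f9uh36">Odoo Developer</h1></body></html>'
blocked_html = "<html><body><p>Access denied</p></body></html>"

print(extract_title(good_html))     # Odoo Developer
print(extract_title(blocked_html))  # None
```

With the real page you would pass result.content to the helper and skip or log the entry when it returns None, rather than appending jobTitle.text unconditionally.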