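I am trying to extract information from this link: https://wuzzuf.net/jobs/p/EVUpYcDnxix7-Odoo-Developer-Yodawy-Med-Giza-Egypt?o=2&l=sp&t=sj&a=search-v3|hpb When trying to get the job title and other details, I tried most of the elements I want to extract, but the lookup always returns nothing, or, when I try to read the text, it raises this error: AttributeError: 'NoneType' object has no attribute 'text'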
import requests
from bs4 import BeautifulSoup
jobTitleList=[]
link="https://wuzzuf.net/jobs/p/EVUpYcDnxix7-Odoo-Developer-Yodawy-Med-Giza-Egypt?o=2&l=sp&t=sj&a=search-v3|hpb"
result=requests.get(link)
print(link)
src=result.content
soup=BeautifulSoup(src,"lxml")
jobTitle=soup.find("h1",{"class":"css-f9uh36"})
jobTitleList.append(jobTitle.text)
print(jobTitleList)
Try setting the User-Agent HTTP header so that the server returns the correct response:
import requests
from bs4 import BeautifulSoup
jobTitleList = []
link = "https://wuzzuf.net/jobs/p/EVUpYcDnxix7-Odoo-Developer-Yodawy-Med-Giza-Egypt?o=2&l=sp&t=sj&a=search-v3|hpb"
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:123.0) Gecko/20100101 Firefox/123.0"
}
result = requests.get(link, headers=headers)
soup = BeautifulSoup(result.content, "lxml")
jobTitle = soup.find("h1", class_="css-f9uh36")
jobTitleList.append(jobTitle.text)
print(jobTitleList)
Prints:
['Odoo Developer']
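The AttributeError in the question comes from calling .text on the None that soup.find returns when no element matches. As a rough sketch of how to guard against that, the snippet below checks the result before reading its text. The helper name extract_title and the two inline HTML strings are hypothetical stand-ins for real server responses, not anything taken from wuzzuf.net, and the built-in "html.parser" is used only so the example does not depend on lxml.

```python
from bs4 import BeautifulSoup


def extract_title(html, css_class="css-f9uh36"):
    """Return the text of the first <h1> with the given class, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    heading = soup.find("h1", class_=css_class)
    if heading is None:
        # find() found no match; return None instead of raising AttributeError
        return None
    return heading.get_text(strip=True)


# Hypothetical HTML snippets standing in for two possible server responses.
page_with_title = '<html><body><h1 class="css-f9uh36">Odoo Developer</h1></body></html>'
page_without_title = "<html><body><p>Access denied</p></body></html>"

print(extract_title(page_with_title))
print(extract_title(page_without_title))
```

With a guard like this, a response that lacks the expected element yields None instead of a crash, which makes it easier to tell that the server sent a different page than the browser shows.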