I am scraping an FAQ page and need to find which tag contains the answers

Question

import re

import numpy as np
import requests
from bs4 import BeautifulSoup

# Matches text that starts with a question word and ends with "?"
QUESTION_RE = re.compile(r'\s*((?:how|How|Can|can|what|What|where|Where|describe|Describe|Who|who|When|when|Why|why|Should|should|is|Is|I|Do|do|Are|are|Will|will)[^.<>?]*?\s*\?)')

req = requests.get('https://www.godrejproperties.com/nricorner/nri-faqs')
soup = BeautifulSoup(req.text, "html5lib")

# Collect the parent element of every text node that looks like a question
list1 = []
for elem in soup(string=QUESTION_RE):
    print(elem.parent)
    list1.append(elem.parent)

# Use the tag name of one matched parent as the "question tag"
# (raises IndexError if fewer than two questions were found)
tag = list1[1].name
print(tag)

# Extract the question text from every element with that tag
Ques = []
for header in soup.find_all(tag):
    ffff = QUESTION_RE.findall(str(header))
    if ffff:
        Ques.append(ffff)
# dtype=object because elements may contain different numbers of matches
Ques = np.array(Ques, dtype=object)
print(Ques)

Similarly, I need to find the answers on FAQ pages. I need an algorithm that detects which tag contains each answer, extracts its content, and saves it in a list. Later I need the questions and answers as pairs.

Answer 1

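You can use XPath to get the details. As you can see from the HTML structure, all the questions and answers sit inside an accordion, so we basically need to traverse it by attribute. For the answers themselves, we can use the following XPath: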

//*[@class="ui-accordion-content ui-helper-reset ui-widget-content ui-corner-bottom"]

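But you need to be careful, because this may pull other accordions into your captured data, so validate the data against the question ID, which is also reflected in the answer ID. The question headers can be matched with: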

//*[@class="ui-accordion-header ui-state-default ui-corner-all ui-accordion-icons"]
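The ID check described above could look roughly like the sketch below. It uses BeautifulSoup CSS selectors rather than XPath, and it runs on a made-up HTML sample instead of the live page. The ID pattern (`...-header-N` paired with `...-panel-N`), the sample questions, and the helper name `pair_questions_and_answers` are all assumptions for illustration; inspect the real page and adjust the mapping if its IDs differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating an accordion; the real page may differ.
SAMPLE_HTML = """
<div id="faq">
  <h3 class="ui-accordion-header ui-state-default" id="ui-accordion-1-header-0">First question?</h3>
  <div class="ui-accordion-content ui-widget-content" id="ui-accordion-1-panel-0"><p>First answer.</p></div>
  <h3 class="ui-accordion-header ui-state-default" id="ui-accordion-1-header-1">Second question?</h3>
  <div class="ui-accordion-content ui-widget-content" id="ui-accordion-1-panel-1"><p>Second answer.</p></div>
</div>
<div class="ui-accordion-content ui-widget-content" id="ui-accordion-2-panel-0">Unrelated accordion panel</div>
"""


def pair_questions_and_answers(html):
    """Return (question, answer) tuples, matching headers to panels by ID."""
    soup = BeautifulSoup(html, "html.parser")
    # Index every answer panel by its id attribute.
    panels = {p["id"]: p for p in soup.select(".ui-accordion-content[id]")}
    pairs = []
    for header in soup.select(".ui-accordion-header[id]"):
        # Assumed convention: the panel id mirrors the header id.
        panel_id = header["id"].replace("-header-", "-panel-")
        panel = panels.get(panel_id)
        if panel is not None:  # skip headers without a matching panel
            pairs.append((header.get_text(strip=True),
                          panel.get_text(" ", strip=True)))
    return pairs


for question, answer in pair_questions_and_answers(SAMPLE_HTML):
    print(question, "->", answer)
```

Because only panels whose ID corresponds to a question header are kept, the panel from the unrelated second accordion is not included in the result.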

You can also use other XPath expressions or CSS selectors, or even go through the article element.
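As a rough illustration of the CSS-selector route, the sketch below scopes the search to an `article` element and takes the next sibling panel of each header as its answer. The sample markup, the assumption that each answer panel is a following sibling of its header, and the helper name `faq_pairs_from_article` are hypothetical and need checking against the actual page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: headers and panels alternate as siblings inside <article>.
SAMPLE_HTML = """
<article>
  <h3 class="ui-accordion-header">First question?</h3>
  <div class="ui-accordion-content"><p>First answer.</p></div>
  <h3 class="ui-accordion-header">Second question?</h3>
  <div class="ui-accordion-content"><p>Second answer.</p></div>
</article>
"""


def faq_pairs_from_article(html):
    """Pair each accordion header inside <article> with its next sibling panel."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    # The descendant selector limits matches to headers inside <article>.
    for header in soup.select("article .ui-accordion-header"):
        answer = header.find_next_sibling(class_="ui-accordion-content")
        if answer is not None:
            pairs.append((header.get_text(strip=True),
                          answer.get_text(" ", strip=True)))
    return pairs


print(faq_pairs_from_article(SAMPLE_HTML))
```

This variant does not rely on element IDs at all, which is simpler but depends entirely on the document order of headers and panels.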
