Python BeautifulSoup find_all in a for loop over a CSV

Question

I'm having trouble with a project I'm working on.

I have a CSV file whose first column contains all of the URLs.

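My script below currently opens the file and iterates over each row, but as soon as it reaches find_all it throws the following error: IndexError: list index out of range.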

import requests
from bs4 import BeautifulSoup
import csv

with open('1.csv', "r", newline="") as inFile, open("1output.csv", "w", newline="") as outFile:
    next(inFile)
    reader = csv.reader(inFile)
    writer = csv.writer(outFile)
    for row in reader:
        subURL = row[0]

        # Parse the HTML from the website
        URL = 'https://www.example.com/{}'.format(subURL)
        page = requests.get(URL)
        soup = BeautifulSoup(page.content, 'html.parser')

        # find iframe on webpage and get the src of the iframe
        iframeDesc = soup.find_all('iframe')[0]
        pageDesc = requests.get(iframeDesc['src'])
        soupDesc = BeautifulSoup(pageDesc.content, 'html.parser')

        # Get Description from iframe Desc
        itemDesc = soupDesc.find_all('div', id="div_01")

The error occurs on this line:

iframeDesc = soup.find_all('iframe')[0]

Answer 1

There are several possible causes for your problem; let me go through the most likely ones.

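Wrong pattern: in this case the exception is expected, because you are asking BeautifulSoup to return something that does not exist in the document. find_all returns an empty list when nothing matches, so indexing it with [0] raises IndexError.
Typo: the simplest case. Perhaps a single wrong letter is preventing you from getting the node you want?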
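
To illustrate the first point, here is a minimal sketch that checks whether an iframe exists before using it. The HTML snippets are made up for illustration and the helper name first_iframe_src is hypothetical; it is one possible way to guard the lookup, not necessarily the right fix for your pages.

```python
from bs4 import BeautifulSoup


def first_iframe_src(html):
    """Return the src of the first <iframe> in html, or None if there is none.

    Hypothetical helper, not part of the original script.
    """
    soup = BeautifulSoup(html, "html.parser")
    # find() returns None when nothing matches, whereas
    # find_all(...)[0] raises IndexError on an empty result.
    iframe = soup.find("iframe")
    if iframe is None:
        return None
    # .get() avoids a KeyError if the iframe has no src attribute.
    return iframe.get("src")


# Made-up sample documents, for illustration only.
with_iframe = '<html><body><iframe src="https://www.example.com/desc"></iframe></body></html>'
without_iframe = "<html><body><p>No iframe here</p></body></html>"

print(first_iframe_src(with_iframe))
print(first_iframe_src(without_iframe))
```

In the loop from the question, a row whose page has no iframe could then be skipped (or logged) when the helper returns None, instead of stopping the whole run.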

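Also, I suspect you are looking for the wrong node in the tree. This happens a lot when using BS, because you are essentially navigating the DOM blind and it is easy to miss a tag. Just put a few print statements around your code to see what those lines actually contain.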
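
The print-based debugging suggested above could look something like this sketch. The function name describe_page and the sample HTML are assumptions made for illustration; the idea is only to show what the parser actually received for a given URL.

```python
from bs4 import BeautifulSoup


def describe_page(url, html):
    """Summarize a parsed page so a missing tag is easy to spot.

    Hypothetical debugging helper, not part of the original script.
    """
    soup = BeautifulSoup(html, "html.parser")
    iframes = soup.find_all("iframe")
    return {
        "url": url,
        "iframe_count": len(iframes),
        # find_all(True) matches every tag in the document.
        "tag_names": sorted({tag.name for tag in soup.find_all(True)}),
    }


# Made-up page without an iframe, for illustration only.
sample_html = "<html><body><div id='div_01'>text</div></body></html>"
info = describe_page("https://www.example.com/item-1", sample_html)
print(info)
```

In the script from the question, the same summary could be printed for URL and page.text right after requests.get, together with page.status_code, to see which rows return a page without any iframe.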
