Different URLs but the same content

Question

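I am writing a web scraper.

My goal: get data from a website.

My problem: I wrote an iterator that visits different pages of the site, for example https://www.kroger.com/pl/hair-care/21002?taxonomyId=21002&page=2&fulfillment=ais and https://www.kroger.com/pl/hair-care/21002?taxonomyId=21002&page=3&fulfillment=ais. However, every page returns the same data.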

    def download_with_wait(self, url: str, wait_elem_id: Optional[str] = None,
                           callback: Optional[typing.Callable] = None):
        logger.info(f"Fetching {url}")
        self.driver.get(url)

        if self.browser == self.Browser.CHROME:
            logs = self.driver.get_log("performance")
            http_status_code = self.get_status(logs)

            if http_status_code is not None and http_status_code >= 400:
                logger.warning(f"Failed to fetch {url} with status code: {http_status_code}")
                return None

        if callback is not None:
            callback(self.driver)

        logger.info("Waiting for page to load")

        if wait_elem_id is not None:
            timeout = 60
            try:
                element_present = ec.presence_of_element_located((By.ID, wait_elem_id))
                WebDriverWait(self.driver, timeout).until(element_present)
            except TimeoutException:
                logger.warning("Timed out waiting for page to load")
                return None
        else:
            sleep(3)

        inner_html = self.driver.page_source
        # self.driver.save_screenshot("ss.png")

        return str(inner_html).encode("utf-8")
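
One way to confirm that several fetches really returned the same page is to compare a hash of the bytes that `download_with_wait` returns. The sketch below is only an illustration: `find_duplicate_pages` and the shape of its input are hypothetical and are not part of the code above.

```python
import hashlib
from typing import Dict, List, Optional


def find_duplicate_pages(pages: Dict[str, Optional[bytes]]) -> Dict[str, List[str]]:
    """Group URLs whose downloaded bodies are byte-identical.

    `pages` maps each URL to the bytes fetched for it (hypothetical shape).
    A value of None marks a failed fetch and is skipped.
    """
    by_digest: Dict[str, List[str]] = {}
    for url, body in pages.items():
        if body is None:
            continue
        digest = hashlib.sha256(body).hexdigest()
        by_digest.setdefault(digest, []).append(url)
    # Keep only the digests that are shared by more than one URL.
    return {digest: urls for digest, urls in by_digest.items() if len(urls) > 1}
```

Dynamic pages often embed timestamps or tokens, so two pages that show the same products may still hash differently. An empty result therefore does not prove the pages differ in the data of interest, but a non-empty result does show that the responses were identical.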

I tried opening both URLs in Google Chrome and found that, no matter which of the two I enter, the site I actually end up on is "https://www.kroger.com/pl/hair-care/21002?taxonomyId=21002&page=3&fulfillment=ais", without the string "page=n".
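
The observation above suggests checking, inside the scraper, whether the browser ends up on a URL that differs from the one requested. The following is a minimal sketch under that assumption. `dropped_query_params` is a hypothetical helper that uses only the standard library. It reports the query parameters that were requested but are missing or changed in the final URL.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit


def dropped_query_params(requested_url: str, final_url: str) -> Dict[str, List[str]]:
    """Return requested query parameters that are absent or different in final_url."""
    requested = parse_qs(urlsplit(requested_url).query)
    final = parse_qs(urlsplit(final_url).query)
    return {
        name: values
        for name, values in requested.items()
        if final.get(name) != values
    }
```

In `download_with_wait`, this could be called after the wait step as `dropped_query_params(url, self.driver.current_url)`, logging a warning when the result is non-empty. Selenium's `current_url` property reflects the address after any redirect that has already happened, so checking it too early may miss a client-side redirect.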
