Problem after converting a .py file to .exe: "ModuleNotFoundError: No module named 'selenium'"

Problem description

Hear me out: I'm brand new to Python, so it's entirely possible I messed something up somewhere.

Here is the full error message:

Traceback (most recent call last):
  File "webScrapingTool.py", line 1, in <module>
    from selenium import webdriver
ModuleNotFoundError: No module named 'selenium'

I wrote the code on Ubuntu 22.04, where the default Python version is 3.10.4. I have a dual-boot system. I didn't realize that I apparently(?) need to build a Windows executable directly on Windows, so I moved the file over and tried it there. I downloaded Python for Windows, version 3.12.2. As I understand it, this may be part of the problem.
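For what it's worth, every Python installation keeps its own packages, so selenium has to be installed for the exact interpreter that builds the executable. A quick sanity check that can be run in the Windows Python (a minimal sketch; it assumes nothing beyond the imports already used in my script):

import sys

print(sys.executable)  # path of the interpreter that is actually running
print(sys.version)     # its version, e.g. 3.12.2 on the Windows install

# If the next line raises ModuleNotFoundError, selenium is not installed
# for THIS interpreter, and pyinstaller cannot bundle it either.
import selenium
print(selenium.__version__)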

Keep in mind that I have tried "pyinstaller" and "auto-py-to-exe" on Ubuntu, and "pyinstaller" on Windows as well. When I build the executable on Windows, running it shows the error message above.
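(For reference, the usual Windows workflow is something like the following; treat it as a sketch, with the package names inferred from the imports in my script:

py -m venv venv
venv\Scripts\activate
pip install selenium beautifulsoup4 pandas requests pyinstaller
pyinstaller --onefile webScrapingTool.py

The key point is that the pip installs and the pyinstaller run must happen in the same environment.)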

As mentioned, I'm almost completely new to Python. This is a very basic project, but I really need to know what is ultimately keeping my file from being executable/usable for an ordinary person.

Here is my code:

from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import time
import re
import requests
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import StaleElementReferenceException
from requests.exceptions import RequestException, Timeout, HTTPError, ConnectionError

filename = "data"
link = input("Please enter the Google Maps link for scraping: ")

browser = webdriver.Chrome()
record = []  # rows collected for the CSV output
e = []       # business names already seen, to avoid duplicates

def Selenium_extractor():
    action = ActionChains(browser)
    prev_length = 0
    le = 0  # consecutive scrolls with no new results; must be local, since `le += 1` on a module-level name raises UnboundLocalError
    a = browser.find_elements(By.CLASS_NAME, "hfpxzc")

    while len(a) < 1000:
        if not a:  # nothing loaded yet; avoids an IndexError on a[-1]
            break
        print(len(a))
        var = len(a)
        last_element = a[-1]
        try:
            # the stale-element error is raised by these interactions, so they belong inside the try
            action.move_to_element(last_element).perform()
            browser.execute_script("arguments[0].scrollIntoView();", last_element)
        except StaleElementReferenceException:
            a = browser.find_elements(By.CLASS_NAME, "hfpxzc")
            continue
        time.sleep(2)
        a = browser.find_elements(By.CLASS_NAME, "hfpxzc")

        if len(a) == var:
            le += 1
            if le > 20 or len(a) == prev_length:
                break
        else:
            le = 0
        prev_length = len(a)


    names_processed = False  # Flag to indicate if names are processed

    for i in range(len(a)):
        if names_processed:
            break  # If names are processed, break out of the loop
        action.move_to_element(a[i]).perform()
        time.sleep(2)
        source = browser.page_source
        soup = BeautifulSoup(source, 'html.parser')
        try:
            Item_Html = soup.findAll('div', {"class": "lI9IFe"})
            for item_html in Item_Html:
                Name_Html = item_html.find('div', {"class": "qBF1Pd fontHeadlineSmall"})
                name = Name_Html.text.strip()
                if name not in e:
                    e.append(name)
                    divs = item_html.findAll('div', {"class": "W4Efsd"})
                    email_scraped = False

                    phone = "Not available"
                    for div in divs:
                        phone_span = div.find('span', {"class": "UsdlK"})
                        if phone_span and phone_span.text.strip().startswith("+"):
                            phone = phone_span.text.strip()
                            break  # keep the first phone number found; later divs would otherwise reset it
                    if len(divs) > 2:  # guard against listings with fewer detail rows
                        address_text = divs[2].get_text().split(' · ')
                        address = address_text[1].strip() if len(address_text) > 1 else "Not available"
                    else:
                        address = "Not available"
                    if not email_scraped:
                        website = "Not available"
                        emails = "Not available"  # defaults, so a failed request cannot leave these unbound
                        Website_Html = item_html.find('a', {"class": "lcr4fd S9kvJb"})
                        if Website_Html:
                            website = Website_Html.get('href')
                            try:
                                # fetch the site once (the original looped over divs and re-fetched it each time)
                                website_source = requests.get(website, timeout=10).text
                                emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', website_source)
                                emails = [email for email in emails if not email.endswith('.wixpress.com')]
                                emails = list(set(emails))
                                if not emails:
                                    emails = "Not available"
                                else:
                                    email_scraped = True
                            except (Timeout, ConnectionError) as ex:
                                print("Error scraping emails from website due to network issues:", ex)
                            except HTTPError as ex:
                                print("HTTP error occurred while accessing the website:", ex)
                            except RequestException as ex:
                                print("An error occurred while accessing the website:", ex)
                    
                    print([name, phone, address, website, emails])
                    record.append([name, phone, address, website, emails])
            names_processed = True  # Set flag to indicate names are processed
        except Exception as ex:
            print("Error occurred:", ex)
            continue

    print(record)
    return record

browser.get(str(link))
time.sleep(10)
Selenium_extractor()

df = pd.DataFrame(record, columns=['Business Name', 'Phone', 'Street Address', 'Website', 'Email Addresses'])
df.to_csv(filename + '.csv', index=False, encoding='utf-8')  # writes the collected rows to data.csv

When I try to run "pip install cx_freeze" or install my "requirements.txt" with pip, I get an error message like this:

https://pastebin.com/KheA21nM

The one clue I can offer is that I'm almost certain this is related to where the program was written versus where the executable was built (two different Python versions). I've seen some pages mention "hiddenimports" in the .spec file, but the suggestions people gave didn't work for me. Hopefully someone knows exactly what I mean, because while there are similar questions here, none of them match my situation exactly. Please let me know what I can do to fix this. Thanks!

python selenium-webdriver web-scraping pip exe
1 Answer

Description:

Focus on this line in the error:

error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": [url here - Reddit doesn't like links, so I removed it]

This says you must install Microsoft C++ Build Tools (or upgrade an older Visual C++ installation). pip is trying to compile one of your dependencies from source, and that build step needs the MSVC compiler, which is why the install fails without it.
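If you would rather avoid the compiler entirely, you can often sidestep source builds by upgrading pip and asking it for prebuilt wheels only (a sketch; whether a wheel exists depends on the package and your Python version):

py -m pip install --upgrade pip setuptools wheel
py -m pip install --only-binary :all: -r requirements.txt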

Second, if you are familiar with declaring hidden imports manually for pyinstaller, do that for any libraries pyinstaller fails to pick up on its own.
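For example, hidden imports can be declared on the command line or in the .spec file (the module names below are just the ones from your script):

pyinstaller --onefile --hidden-import selenium --hidden-import bs4 webScrapingTool.py

# or, in webScrapingTool.spec, inside the Analysis(...) call:
#     hiddenimports=['selenium', 'bs4'],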
