Bear with me: I'm new to Python, and it's entirely possible that I messed something up somewhere.
Here is the full error message:
Traceback (most recent call last):
  File "webScrapingTool.py", line 1, in <module>
    from selenium import webdriver
ModuleNotFoundError: No module named 'selenium'
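I wrote the code on Ubuntu 22.04, whose default Python version is 3.10.4. I have a dual-boot system. I hadn't realized that I apparently (?) need to build a Windows executable directly on Windows, so I moved the files over and tried there. I downloaded Python for Windows, version 3.12.2. From what I understand, this may be part of the problem.
Keep in mind that I tried both "pyinstaller" and "auto-py-to-exe" on Ubuntu, and "pyinstaller" on Windows as well. When I create the executable on Windows, it shows the error message above.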
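One way to narrow this down is to check, from the same interpreter that is used to run PyInstaller, whether the third-party packages the script imports are actually installed there. A minimal sketch, assuming the module names from the import list in my script; `missing_modules` and `REQUIRED` are hypothetical names made up for this check, not part of any library:

```python
import importlib.util
import sys

# Top-level third-party modules the scraper imports (assumed from its import list).
REQUIRED = ["selenium", "bs4", "pandas", "requests"]


def missing_modules(names):
    """Return the names that the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which Python is in use; the build can only bundle what it can import.
    print("Interpreter:", sys.executable)
    print("Version:", sys.version.split()[0])
    missing = missing_modules(REQUIRED)
    if missing:
        # Note: bs4 is installed from the "beautifulsoup4" package on PyPI.
        print("Not installed for this interpreter:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If `selenium` is reported as missing on the Windows side, installing it with that same interpreter (`python -m pip install selenium`) before rebuilding would be the first thing to try.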
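As mentioned, I'm almost completely new to Python. This is a very basic project, but I really need to know what is ultimately preventing my file from being executable and usable by ordinary people.
Here is my code: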
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
import time
import re
import requests
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.common.exceptions import StaleElementReferenceException
from requests.exceptions import RequestException, Timeout, HTTPError, ConnectionError
filename = "data"
link = input("Please enter the Google Maps link for scraping: ")
browser = webdriver.Chrome()
record = []
e = []
le = 0
def Selenium_extractor():
    # le is assigned below, so it must be declared global to avoid UnboundLocalError
    global le
    action = ActionChains(browser)
    prev_length = 0
    a = browser.find_elements(By.CLASS_NAME, "hfpxzc")
    # Keep scrolling the results list until enough entries load or it stops growing
    while len(a) < 1000:
        print(len(a))
        var = len(a)
        last_element = a[-1]
        action.move_to_element(last_element).perform()
        browser.execute_script("arguments[0].scrollIntoView();", last_element)
        time.sleep(2)
        a = browser.find_elements(By.CLASS_NAME, "hfpxzc")
        try:
            if len(a) == var:
                le += 1
                if le > 20 or len(a) == prev_length:
                    break
            else:
                le = 0
            prev_length = len(a)
        except StaleElementReferenceException:
            continue
    names_processed = False  # Flag to indicate if names are processed
    for i in range(len(a)):
        if names_processed:
            break  # If names are processed, break out of the loop
        action.move_to_element(a[i]).perform()
        time.sleep(2)
        source = browser.page_source
        soup = BeautifulSoup(source, 'html.parser')
        try:
            Item_Html = soup.findAll('div', {"class": "lI9IFe"})
            for item_html in Item_Html:
                Name_Html = item_html.find('div', {"class": "qBF1Pd fontHeadlineSmall"})
                name = Name_Html.text.strip()
                if name not in e:
                    e.append(name)
                    divs = item_html.findAll('div', {"class": "W4Efsd"})
                    email_scraped = False
                    for div in divs:
                        phone_span = div.find('span', {"class": "UsdlK"})
                        if phone_span and phone_span.text.strip().startswith("+"):  # Check condition
                            phone = phone_span.text.strip()
                        else:
                            phone = "Not available"
                        Address_Html = divs[2]
                        address_text = Address_Html.get_text().split(' · ')
                        if len(address_text) > 1:
                            address = address_text[1].strip()
                        else:
                            address = "Not available"
                        if not email_scraped:
                            Website_Html = item_html.find('a', {"class": "lcr4fd S9kvJb"})
                            for j in range(len(divs)):
                                if Website_Html:
                                    website = Website_Html.get('href')
                                    try:
                                        website_source = requests.get(website, timeout=10).text
                                        # [A-Za-z] instead of [A-Z|a-z]: inside a character class, | is a literal pipe
                                        emails = re.findall(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', website_source)
                                        emails = [email for email in emails if not email.endswith('.wixpress.com')]
                                        emails = list(set(emails))
                                        if not emails:
                                            emails = "Not available"
                                        else:
                                            email_scraped = True
                                    except (Timeout, ConnectionError) as ex:
                                        print("Error scraping emails from website due to network issues:", ex)
                                    except HTTPError as ex:
                                        print("HTTP error occurred while accessing the website:", ex)
                                    except RequestException as ex:
                                        print("An error occurred while accessing the website:", ex)
                                else:
                                    website = "Not available"
                                    emails = "Not available"
                    print([name, phone, address, website, emails])
                    record.append([name, phone, address, website, emails])
            names_processed = True  # Set flag to indicate names are processed
        except Exception as ex:
            print("Error occurred:", ex)
            continue
    print(record)
    return record
browser.get(str(link))
time.sleep(10)
Selenium_extractor()
df=pd.DataFrame(record,columns=['Business Name', 'Phone', 'Street Address', 'Website', 'Email Addresses']) # writing data to the file
df.to_csv(filename + '.csv',index=False,encoding='utf-8')
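The one clue I can offer is that I'm almost certain it has to do with where the program was written versus where the executable was generated (two different Python versions). I've seen some pages mention "hiddenimports" in the .spec file, but the suggestions I found there did not work for me. I hope someone knows exactly what I mean, because although there are similar questions here, none of them matches my situation exactly. Please let me know what I can do to fix this. Thanks!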
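For reference, the "hiddenimports" entry in the .spec file has a command-line counterpart in PyInstaller, `--hidden-import`. Below is a rough sketch of how the build could be launched through the same interpreter that has the packages installed. `build_command` and `run_build` are hypothetical helper names, and whether a hidden import is actually needed in my case is an assumption I have not confirmed:

```python
import subprocess
import sys


def build_command(script, hidden_imports=(), onefile=True):
    """Build a PyInstaller command line that uses the current interpreter."""
    # "python -m PyInstaller" ties the build to this interpreter's installed packages.
    cmd = [sys.executable, "-m", "PyInstaller"]
    if onefile:
        cmd.append("--onefile")
    for module in hidden_imports:
        cmd += ["--hidden-import", module]
    cmd.append(script)
    return cmd


def run_build(script, hidden_imports=()):
    """Run the build; raises CalledProcessError if PyInstaller exits with an error."""
    subprocess.run(build_command(script, hidden_imports), check=True)


# Example call (not executed here):
# run_build("webScrapingTool.py", hidden_imports=["selenium"])
```

Launching the build this way, instead of calling a bare `pyinstaller` command, is meant to rule out the case where the command on PATH belongs to a different Python installation than the one where `selenium` was installed.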
"Good morning, how are you? Does your code run correctly in the IDE? Does the problem only appear after you create the executable?"