尝试使用
UnstructuredURLLoader
但收到“libmagic 不可用”。
我有:
代码:
from langchain.document_loaders import UnstructuredURLLoader
loader = UnstructuredURLLoader(
urls = [
"https://www.moneycontrol.com/news/business/banks/hdfc-bank-re-appoints-sanmoy-chakrabarti-as-chief-risk-officer-11259771.html",
"https://www.moneycontrol.com/news/business/markets/market-corrects-post-rbi-ups-inflation-forecast-icrr-bet-on-these-top-10-rate-sensitive-stocks-ideas-11142611.html"
]
)
data = loader.load()
len(data)
错误:
libmagic is unavailable but assists in filetype detection on file-like objects. Please consider installing libmagic for better results.
Error fetching or processing https://www.moneycontrol.com/news/business/banks/hdfc-bank-re-appoints-sanmoy-chakrabarti-as-chief-risk-officer-11259771.html, exception: Invalid file. The FileType.UNK file type is not supported in partition.
libmagic is unavailable but assists in filetype detection on file-like objects. Please consider installing libmagic for better results.
Error fetching or processing https://www.moneycontrol.com/news/business/markets/market-corrects-post-rbi-ups-inflation-forecast-icrr-bet-on-these-top-10-rate-sensitive-stocks-ideas-11142611.html, exception: Invalid file. The FileType.UNK file type is not supported in partition.
解决方案:venv中libmagic.dll文件夹的路径必须添加到系统变量中。
以我为例: D:\ds_project