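I am trying to use Azure Automation to run some Python jobs that I currently run locally every day. I am using Python 3.8 in Azure Automation because 3.10 had problems importing some packages. The problems seem to stem from the asynchronous functionality the script uses. The script makes a large number of API calls to Microsoft Graph to fetch events from the calendar of each ID, and Graph limits each call to 10 events. Graph then provides a URL for the next 10 events until there are none left. The script iterates over each ID, fetches the data for the first 10 events, stores the link to the next 10 events, and then calls that link, repeating until no more events are available. This is the closest I can get to a reproducible example; it is not a super simple process: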
import asyncio
import json
import platform

import aiohttp
import pandas as pd

# Added this after some debugging: without it, an error is thrown
# before the async functions are even reached.
if platform.system() == 'Windows':
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())

df = pd.DataFrame(columns=['a', 'b'])
next_url = []
id_list = ['1', '2', '3']

async def get_data(session, url):
    async with session.get(url) as resp:
        data = await resp.json()
        # Append one row per event in this page.
        for item in data['value']:
            df.loc[len(df.index)] = [item['a'], item['b']]
        # In this simplified example, a third top-level key means another page exists.
        if len(data) == 3:
            next_url.append(data['next_url'])

async def main():
    async with aiohttp.ClientSession() as session:
        tasks = []
        for user_id in id_list:
            url = "https://graph.microsoft.com/v1.0/users/" + user_id
            tasks.append(asyncio.ensure_future(get_data(session, url)))
        await asyncio.gather(*tasks)

asyncio.run(main())

# Keep fetching the stored "next" links until no page returns another one.
while len(next_url) > 0:
    new_urls = list(set(next_url))
    next_url = []

    async def fetch_next():
        async with aiohttp.ClientSession() as session:
            tasks = []
            for url in new_urls:
                tasks.append(asyncio.ensure_future(get_data(session, url)))
            await asyncio.gather(*tasks)

    asyncio.run(fetch_next())
Locally, this script runs fine in a 3.8 virtual environment, but when I try to run it from Azure, it gives this error:
Traceback (most recent call last):
  File "C:\Temp\e5tzl0sn.cg4\2be46d44-7c2c-4932-86a6-48cdbe662467", line 95, in <module>
    asyncio.run(main())
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\runners.py", line 39, in run
    loop = events.new_event_loop()
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\events.py", line 758, in new_event_loop
    return get_event_loop_policy().new_event_loop()
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\events.py", line 656, in new_event_loop
    return self._loop_factory()
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\selector_events.py", line 56, in __init__
    self._make_self_pipe()
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\selector_events.py", line 103, in _make_self_pipe
    self._ssock, self._csock = socket.socketpair()
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\socket.py", line 597, in socketpair
    lsock.listen()
OSError: [WinError 10050] A socket operation encountered a dead network
Exception ignored in: <function BaseEventLoop.__del__ at 0x00000008AAA6D670>
Traceback (most recent call last):
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\base_events.py", line 648, in __del__
    self.close()
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\selector_events.py", line 87, in close
    self._close_self_pipe()
  File "C:\WPy64-3800\python-3.8.0.amd64\lib\asyncio\selector_events.py", line 94, in _close_self_pipe
    self._remove_reader(self._ssock.fileno())
AttributeError: '_WindowsSelectorEventLoop' object has no attribute '_ssock'
sys:1: RuntimeWarning: coroutine 'main' was never awaited
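I have tried changing asyncio.run(main()) to await main(), because I know the script had a similar problem (the function never being awaited) when I ran it from an .ipynb file, but that did not help in this case.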
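The traceback fails inside socket.socketpair(), which the selector event loop calls while it is being constructed, so the "coroutine 'main' was never awaited" warning looks like a side effect rather than the cause. A minimal check along these lines might help confirm whether the Azure sandbox allows that call at all. This is only a diagnostic sketch; can_create_socketpair is a hypothetical helper name, not part of my script or of any library.

```python
import socket


def can_create_socketpair():
    """Return (True, None) if socket.socketpair() works here, else (False, error)."""
    try:
        left, right = socket.socketpair()
    except OSError as exc:
        # This is the same call that fails in the traceback above.
        return False, exc
    left.close()
    right.close()
    return True, None


if __name__ == "__main__":
    ok, error = can_create_socketpair()
    print("socketpair available:", ok, error or "")
```

If this reports False in the runbook but True locally, the problem would seem to lie in the environment's networking restrictions rather than in the async code itself.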
I am expecting a dataframe containing the data I extract from each record. This all works perfectly when I run it locally, but it fails in Azure Automation.
I have also installed all of the required packages into Azure Automation.
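For reference, the paging behaviour described at the top (fetch a page, store the link to the next page, repeat until there is none) can be sketched as a single coroutine. This is a simplified illustration, not my real script: fetch_all_pages, fetch_json and the fake pages are hypothetical, and the 'value' and 'next_url' keys are the placeholders from the example above (the real Graph response, as far as I know, names the link '@odata.nextLink').

```python
import asyncio


async def fetch_all_pages(fetch_json, first_url):
    """Follow 'next_url' links starting at first_url and collect every row."""
    rows = []
    url = first_url
    while url:
        page = await fetch_json(url)
        rows.extend((item["a"], item["b"]) for item in page["value"])
        # A missing 'next_url' key means this was the last page.
        url = page.get("next_url")
    return rows


async def demo():
    # Fake two-page response standing in for the real API (assumption).
    pages = {
        "page1": {"value": [{"a": 1, "b": 2}], "next_url": "page2"},
        "page2": {"value": [{"a": 3, "b": 4}]},
    }

    async def fake_fetch(url):
        return pages[url]

    return await fetch_all_pages(fake_fetch, "page1")


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With aiohttp, fetch_json would presumably be a small wrapper around session.get(url) and resp.json(), and the collected rows could be turned into the dataframe once at the end instead of appending row by row.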