"Permission denied on resource project None" in only one thread in tpu_cluster_resolver


I am running the BERT pretraining code on a Cloud TPU from a Compute Engine VM.

Every time I run it, I get the error below in one thread, but training still proceeds normally.

I ran the same code on a Google Colab TPU and it worked fine.

For tpu_cluster_resolver I am passing the IP address of the TPU instance. I also tried passing the zone and project name, with the same result.

Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/cluster_resolver/tpu_cluster_resolver.py", line 476, in _fetch_cloud_tpu_metadata
    return request.execute()
  File "/usr/local/lib/python3.5/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/googleapiclient/http.py", line 856, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://tpu.googleapis.com/v1/projects/None/locations/None/nodes/xxxxxx:8470?alt=json returned "Permission denied on resource project None.". Details: "[{'links': [{'url': 'https://console.developers.google.com/project/None/apiui/credential', 'description': 'Google developer console API key'}], '@type': 'type.googleapis.com/google.rpc.Help'}]">

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/tpu/preempted_hook.py", line 87, in run
    response = self._cluster._fetch_cloud_tpu_metadata()  # pylint: disable=protected-access
  File "/usr/local/lib/python3.5/dist-packages/tensorflow_core/python/distribute/cluster_resolver/tpu_cluster_resolver.py", line 480, in _fetch_cloud_tpu_metadata
    "constructor. Exception: %s" % (self._tpu, e))
ValueError: Could not lookup TPU metadata from name 'b'xxxxxxxx:8470''. Please doublecheck the tpu argument in the TPUClusterResolver constructor. Exception: <HttpError 403 when requesting https://tpu.googleapis.com/v1/projects/None/locations/None/nodes/xxxxxx:8470?alt=json returned "Permission denied on resource project None.". Details: "[{'links': [{'url': 'https://console.developers.google.com/project/None/apiui/credential', 'description': 'Google developer console API key'}], '@type': 'type.googleapis.com/google.rpc.Help'}]">

python tensorflow-estimator tpu
1 Answer

It is hard to say for sure without seeing the code.

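Judging by the error "Permission denied on resource project None.", I suggest passing the `project` argument, set to your Google Cloud project name, to TPUClusterResolver, because the request URL appears to be filled in with "None" for the project.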
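As a rough sketch of that suggestion, the snippet below passes `tpu`, `zone` and `project` explicitly and fails early if any of them is empty, so a `None` cannot silently end up in the request URL. The helper names `resolver_kwargs` and `make_resolver`, and the values `my-tpu`, `us-central1-b` and `my-project`, are placeholders invented for illustration. The sketch also assumes the TPU is referred to by its node name. It has not been checked against the asker's setup.

```python
def resolver_kwargs(tpu, zone, project):
    """Collect explicit TPUClusterResolver arguments (hypothetical helper).

    Raises ValueError if any value is empty, instead of letting a None
    reach the Cloud TPU API request.
    """
    args = {"tpu": tpu, "zone": zone, "project": project}
    missing = sorted(name for name, value in args.items() if not value)
    if missing:
        raise ValueError(
            "TPUClusterResolver arguments not set: " + ", ".join(missing))
    return args


def make_resolver(tpu, zone, project):
    """Build the resolver with all three arguments given explicitly."""
    # Imported here so the helper above can be used without TensorFlow.
    import tensorflow as tf

    return tf.distribute.cluster_resolver.TPUClusterResolver(
        **resolver_kwargs(tpu, zone, project))


if __name__ == "__main__":
    # Placeholder values; replace with your own node name, zone and project.
    print(resolver_kwargs("my-tpu", "us-central1-b", "my-project"))
```

In a training script, `make_resolver("my-tpu", "us-central1-b", "my-project")` would take the place of the existing TPUClusterResolver construction. The asker reports that passing the zone and project did not change the result, so this may not be the whole story.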
