Ansible: abort execution if a host is unreachable

Question  Votes: 6  Answers: 5

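Summary: looking for a better way to abort an Ansible playbook immediately if any host is unreachable.

Is there a way to abort an Ansible playbook if any one of the hosts is unreachable? I have found that when Ansible cannot reach a host, it still carries on and runs all the plays and tasks in the playbook.

All of my playbooks set max_fail_percentage to 0, but in this case Ansible does not complain, because every reachable host can run all the plays.

I currently have a simple but hacky solution, and I would like to know whether there is a better answer.

My current solution

As the first step of running a playbook, Ansible gathers facts for all hosts. If a host is unreachable, no facts are gathered for it. At the very beginning of my playbook I wrote a simple play that uses one of those facts. If a host is unreachable, the task fails with an "undefined variable" error. The task is only a dummy task, and it always passes when all hosts are reachable.

See my example below: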

- name: Check Ansible connectivity to all hosts
  hosts: host_all
  user: "{{ remote_user }}"
  sudo: "{{ sudo_required }}"
  sudo_user: root
  connection: ssh # or paramiko
  max_fail_percentage: 0
  tasks:
    - name: check connectivity to hosts (Dummy task)
      shell: echo " {{ hostvars[item]['ansible_hostname'] }}"
      with_items: "{{ groups['host_all'] }}"
      register: cmd_output

    - name: debug ...
      debug: var=cmd_output

If a host is unreachable, you get an error like this:

TASK: [c.. ***************************************************** 
fatal: [172.22.191.160] => One or more undefined variables: 'dict object'    has no attribute 'ansible_hostname' 
fatal: [172.22.191.162] => One or more undefined variables: 'dict object' has no attribute 'ansible_hostname'

FATAL: all hosts have already failed -- aborting
ansible ansible-playbook
5 Answers
2
votes

Alternatively, this looks simpler and more expressive:

- hosts: myservers
  become: true

  pre_tasks:
    - name: Check ALL hosts are reachable before doing the release
      assert:
        that:
          - ansible_play_hosts == groups.myservers
        fail_msg: 1 or more hosts are UNREACHABLE
        success_msg: ALL hosts are REACHABLE, go on
      run_once: yes

  roles:
    - deploy

https://github.com/ansible/ansible/issues/18782#issuecomment-319409529
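
One caveat with comparing against groups.myservers: as far as I know, the groups variable is not narrowed by -l, so a limited run would fail the assert even when every targeted host is reachable. Below is a minimal sketch of a variant that compares against ansible_play_hosts_all (the hosts the play targeted) instead. The task name and messages are my own, and the difference filter is only used to list the hosts that dropped out.

```yaml
- hosts: myservers
  become: true

  pre_tasks:
    # ansible_play_hosts excludes failed/unreachable hosts;
    # ansible_play_hosts_all lists every host the play targeted.
    - name: Check that no targeted host dropped out of the play
      assert:
        that:
          - ansible_play_hosts | length == ansible_play_hosts_all | length
        fail_msg: "Missing hosts: {{ ansible_play_hosts_all | difference(ansible_play_hosts) | join(', ') }}"
        success_msg: All targeted hosts are reachable
      run_once: true

  roles:
    - deploy
```

Like the version above, this relies on fact gathering having already removed unreachable hosts from the play by the time pre_tasks run.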


1
vote

You can make this check more explicit:

- name: Abort if hosts are unreachable
  fail:
    msg: "Host {{ item }} is unreachable"
  when: "'ansible_hostname' not in hostvars[item]"
  with_items: "{{ groups['all'] }}"

I think you could write a callback plugin to achieve this. Something like:

class CallbackModule(object):
    def runner_on_unreachable(self, host, res):
        raise Exception("Aborting due to unreachable host " + host)

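The catch is that I could not find any good way to abort the whole playbook from that callback: raising an exception does not work, return values are ignored, and while you could probably abuse self.playbook to stop things, I did not see a public API for it.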


1
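vote

You can combine any_errors_fatal: true or max_fail_percentage: 0 with gather_facts: false, and then run a task that fails if a host is offline. Something like this at the top of the playbook should do what you need: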

- hosts: all
  gather_facts: false
  max_fail_percentage: 0
  tasks:
    - action: ping

A bonus is that this also works with the -l SUBSET option, which limits the matched hosts.
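
The play above sets max_fail_percentage: 0. For comparison, here is a sketch of the same check with any_errors_fatal: true instead. The task name is my own, and I have not checked how every Ansible release treats unreachable hosts (as opposed to failed tasks) under this setting, so test it on the version you run.

```yaml
# Sketch: connectivity check using any_errors_fatal instead of max_fail_percentage.
- hosts: all
  gather_facts: false     # skip fact gathering so the ping task is the first contact
  any_errors_fatal: true  # a failure on any host is meant to end the play for all hosts
  tasks:
    - name: Verify connectivity to every targeted host
      ping:
```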


1
vote

I found a way to use a callback to abort the play as soon as gather_facts finishes.

Setting _play_hosts to an empty set leaves no hosts to continue the play.

class CallbackModule(object):

    def runner_on_unreachable(self, host, res):
        # Truncate the play_hosts to an empty set to fail quickly
        self.play._play_hosts = []

The result looks like this:

PLAY [test] *******************************************************************

GATHERING FACTS ***************************************************************
fatal: [haderp] => SSH Error: ssh: Could not resolve hostname haderp: nodename nor servname provided, or not known
It is sometimes useful to re-run the command using -vvvv, which prints SSH debug output to help diagnose the issue.
ok: [derp]

TASK: [set a fact] ************************************************************
FATAL: no hosts matched or all hosts have already failed -- aborting


PLAY RECAP ********************************************************************
       to retry, use: --limit @/Users/jkeating/foo.yaml.retry

derp                       : ok=1    changed=0    unreachable=0    failed=0
haderp                     : ok=0    changed=0    unreachable=1    failed=0

0
votes

Inspired by the other answers.

Using ansible-playbook 2.7.8.

Checking whether each required host has any ansible_facts feels more explicit to me.

# my-playbook.yml
- hosts: myservers
  tasks:
    - name: Check ALL hosts are reacheable before doing the release
      fail:
        msg: >
          [REQUIRED] ALL hosts to be reachable, so flagging {{ inventory_hostname }} as failed,
          because host {{ item }} has no facts, meaning it is UNREACHABLE.
      when: "hostvars[item].ansible_facts|list|length == 0"
      with_items: "{{ groups.myservers }}"

    - debug:
        msg: "Will only run if all hosts are reacheable"

$ ansible-playbook -i my-inventory.yml my-playbook.yml

PLAY [myservers] *************************************************************************************************************************************************************************************************************

TASK [Gathering Facts] *********************************************************************************************************************************************************************************************************
fatal: [my-host-03]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname my-host-03: Name or service not known", "unreachable": true}
fatal: [my-host-04]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname my-host-04: Name or service not known", "unreachable": true}
ok: [my-host-02]
ok: [my-host-01]

TASK [Check ALL hosts are reacheable before doing the release] ********************************************************************************************************************************************************************************************************************
failed: [my-host-01] (item=my-host-03) => {"changed": false, "item": "my-host-03", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-01 as failed, because host my-host-03 has no facts, meaning it is UNREACHABLE."}
failed: [my-host-01] (item=my-host-04) => {"changed": false, "item": "my-host-04", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-01 as failed, because host my-host-04 has no facts, meaning it is UNREACHABLE."}
failed: [my-host-02] (item=my-host-03) => {"changed": false, "item": "my-host-03", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-02 as failed, because host my-host-03 has no facts, meaning it is UNREACHABLE."}
failed: [my-host-02] (item=my-host-04) => {"changed": false, "item": "my-host-04", "msg": "[REQUIRED] ALL hosts to be reachable, so flagging my-host-02 as failed, because host my-host-04 has no facts, meaning it is UNREACHABLE."}
skipping: [my-host-01] => (item=my-host-01)
skipping: [my-host-01] => (item=my-host-02)
skipping: [my-host-02] => (item=my-host-01)
skipping: [my-host-02] => (item=my-host-02)
        to retry, use: --limit @./my-playbook.retry

PLAY RECAP *********************************************************************************************************************************************************************************************************************
my-host-01 : ok=1    changed=0    unreachable=0    failed=1
my-host-02 : ok=1    changed=0    unreachable=0    failed=1
my-host-03 : ok=0    changed=0    unreachable=1    failed=0
my-host-04 : ok=0    changed=0    unreachable=1    failed=0