I have the following scenario, which I chose to try to accomplish with Ansible.
See the output below:
ansible --version
ansible [core 2.14.3]
  config file = /home/user/.ansible.cfg
  configured module search path = ['/home/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  ansible collection location = /home/user/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.11.2 (main, Mar 13 2023, 12:18:29) [GCC 12.2.0] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True
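I need to actively monitor the "state" of a set of servers, and as soon as a server is found to be in the 'OK' state, that particular server needs to continue with the next task.
The problem I am facing is that I use a "role" to monitor the server "state", which I parse from a file, and I cannot trigger the role asynchronously so that the monitoring runs in the background.
I record each server's "state" in a dictionary variable that the tasks below monitor. As soon as one of the servers is found to have a "state" of 'OK', the next task should run for it, while monitoring continues for the remaining servers that are not yet 'OK'.
See the example below.
Ansible folder and file structure: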
"{{playbook_dir}}/testing/main.yaml"
"{{playbook_dir}}/roles/state/tasks/get.yaml"
/tmp/localhost.state
/tmp/127.0.0.1.state
/tmp/127.0.1.1.state
"{{playbook_dir}}/testing/main.yaml" is the "main" playbook that I use to run the server action tasks.
See its contents below:
---
- hosts: [localhost]
  gather_facts: false
  vars:
    servers_state_files: {
      "localhost": "/tmp/localhost.state",
      "127.0.0.1": "/tmp/127.0.0.1.state",
      "127.0.1.1": "/tmp/127.0.1.1.state",
    }
  tasks:
    - name: "task 1 - get and monitor server/s state"
      include_role:
        name: state
        tasks_from: get.yaml
      vars:
        - server_names: "{{ servers_state_files.keys() }}"
      # async: 100
      # poll: 0
      register: get_server_state_task

    - name: "task 2 - do some actions on server/s that have an 'OK' state"
      debug:
        msg: "some action on server - {{ item }}"
      loop: "{{ server_state_dict_ok.keys() }}"
      retries: 5
      delay: 10
      until: server_state_dict_ok[item].state == 'OK'
      when: server_state_dict_ok is defined
See the state files under "/tmp/":
/tmp/localhost.state
OK
/tmp/127.0.0.1.state
ERROR
/tmp/127.0.1.1.state
PENDING
See the contents of "{{playbook_dir}}/roles/state/tasks/get.yaml" below:
---
- name: "open state file/s"
  debug:
    msg:
      - "server state file - {{ servers_state_files[item] }}"
      - "state - {{ lookup('file', servers_state_files[item]) }}"
  loop: "{{ server_names }}"
  when:
    - server_names is defined
    - servers_state_files is defined

- name: "set server state dictionary fact"
  set_fact:
    server_state_dict: "{{ server_state_dict | default({}) | combine( {item: {'state': lookup('file', servers_state_files[item])}} ) }}"
  loop: "{{ server_names }}"
  when:
    - server_names is defined
    - servers_state_files is defined

- name: "show server_state_dict param"
  debug:
    msg:
      - "server_state_dict - {{ server_state_dict }}"
  when: server_state_dict is defined

- name: "create server_state_dict_ok"
  set_fact:
    server_state_dict_ok: "{{ server_state_dict_ok | default({}) | combine( {item: {'state': lookup('file', servers_state_files[item])}} ) }}"
  with_items: "{{ server_state_dict.keys() }}"
  when:
    - server_state_dict is defined
    - server_state_dict[item].state == 'OK'

- name: "Show server_state_dict_ok param"
  debug:
    msg:
      - "server_state_dict_ok - {{ server_state_dict_ok }}"
  when:
    - server_state_dict_ok is defined
    - server_state_dict_ok | length > 0

- name: "'Pop' keys from server_state_dict that have 'OK' state"
  set_fact:
    server_state_dict: "{{ server_state_dict | ansible.utils.remove_keys(target=[item]) }}"
  with_items: "{{ server_state_dict }}"
  when:
    - server_state_dict is defined
    - server_state_dict[item].state == 'OK'

- name: "Show server_state_dict param"
  debug:
    msg:
      - "server_state_dict - {{ server_state_dict }}"
  when:
    - server_state_dict is defined
    - server_state_dict | length > 0

- name: "Set and increment counter"
  set_fact:
    counter: "{{ counter | default(0) | int + 1 | int }}"
  when:
    - server_state_dict is defined
    - server_state_dict | length > 0

- name: "Show counter param"
  debug:
    msg:
      - "counter - {{ counter | int }}"
  when:
    - counter is defined

- name: "Set a 'sleep' of 10 seconds to give the server/s a chance to have it's state changed to 'OK'"
  command: "sleep 10"
  when:
    - server_state_dict is defined
    - server_state_dict | length > 0

- name: "Continue to monitor server/s state for servers whose state is not 'OK'"
  include_tasks: get.yaml
  vars:
    server_names: "{{ server_state_dict.keys() }}"
  when:
    - server_state_dict is defined
    - server_state_dict | length > 0
    - counter | int <= 5
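Because of the counter that I set and check, the get.yaml tasks will "loop" (see the last task, which includes get.yaml itself) at most 5 times.
This all works fine, but I want to be able to run get.yaml asynchronously, so that as soon as a particular server's "state" changes to 'OK', that server can proceed to the next step, i.e. "task 2 - do some actions on server/s that have an 'OK' state".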
This may not be the best approach, hence my question.
Many thanks for your time and help.
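You wrote: "However, I want to be able to run get.yaml asynchronously, so that as soon as a particular server's 'state' changes to 'OK', that server can proceed to the next step, i.e. 'task 2 - do some actions on server/s that have an 'OK' state'. This may not be the best approach, hence my question."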
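It looks like you are reinventing the built-in asynchronous execution mode, or the wait_for module, or execution strategies, or all of them together.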
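To illustrate the wait_for option, here is a minimal sketch, assuming the state file from the question lives on the controller and that a state of 'OK' appears at the start of the file. The timeout and sleep values are chosen only to roughly mirror the counter and "sleep 10" in get.yaml; they are assumptions, not recommendations.

```yaml
---
- hosts: localhost
  gather_facts: false
  tasks:
    # Re-checks the file every 10 seconds and fails after 60 seconds
    # if the pattern has still not appeared (values are assumptions).
    - name: "block until the state file reports OK"
      ansible.builtin.wait_for:
        path: /tmp/localhost.state
        search_regex: '^OK'
        sleep: 10
        timeout: 60

    # Runs only after the wait above has succeeded.
    - name: "do some action once the state is OK"
      ansible.builtin.debug:
        msg: "some action on server - localhost"
```

This replaces the self-including task file and the manual counter with a single task, at the cost of watching one file per task.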
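You also wrote: "The external factors are controlled by Ansible in a different role. For example, at runtime I want to be able to monitor the 'state' set in these files; as soon as one of the servers reports 'OK', it should proceed to the next task while the other servers continue to be monitored."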
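That role should handle the retries itself and set_fact for the respective hosts, rather than writing the state to custom files. In general, Ansible already works almost like this by default: it runs each task of a play in parallel across a batch of hosts (5 by default), continues with the next task once all hosts have finished, and then moves on to the next play. If you don't want hosts to be "synchronized" between tasks, you can simply use the free strategy: for each host, Ansible will run the next task as soon as the previous task has finished on that host.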
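As a rough sketch of that idea, the play below combines the free strategy with a per-host retry loop. The inventory group "servers", the registered variable "state_result" and the fact name "server_state" are hypothetical names introduced for illustration, and the sketch assumes the state files still live on the controller under /tmp, named after each inventory host.

```yaml
---
- hosts: servers            # hypothetical inventory group
  gather_facts: false
  strategy: free            # each host moves on without waiting for the others
  tasks:
    # Retries up to 5 times, 10 seconds apart, until the file reads OK.
    # A host that never reaches OK fails here and skips the later tasks.
    - name: "poll the state file until it reports OK"
      ansible.builtin.command: "cat /tmp/{{ inventory_hostname }}.state"
      register: state_result
      until: state_result.stdout | trim == 'OK'
      retries: 5
      delay: 10
      changed_when: false
      delegate_to: localhost  # assumption: state files are on the controller

    - name: "record the state as a fact of this host"
      ansible.builtin.set_fact:
        server_state: "{{ state_result.stdout | trim }}"

    - name: "task 2 - do some action on this server"
      ansible.builtin.debug:
        msg: "some action on server - {{ inventory_hostname }}"
```

With this layout, each server is an inventory host rather than a dictionary key, so the per-server bookkeeping in server_state_dict and server_state_dict_ok is no longer needed.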