Ansible allows you to extend the system by using plugins. Plugins are executed at various stages of a run and allow you to hook into the system and add your own logic. Plugins are written in Python.
Callback plugins respond to events Ansible sends and can be used to notify external systems. My use case was to notify Slack whenever an Ansible run failed. I run Ansible in pull mode as a cronjob on our instances, which means that there is no immediate feedback when an Ansible run fails.
I could implement this by wrapping my entire playbook in a block and using the rescue directive to send a Slack message with the Slack module when the playbook fails. However, with the rescue directive I wouldn't have enough data on why the playbook failed unless I used clever ways of registering failures in variables. Overall it doesn't feel like a sustainable solution, because it requires extensive logic in the playbook for the sole purpose of sending a Slack notification.
Callback plugins are a more elegant solution. A callback plugin is injected with data about the Ansible state when an event (such as a playbook error) occurs and you can extract the exact reason why a playbook failed. I send this data to Slack so that an engineer can quickly understand and debug an Ansible failure.
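As a rough illustration of what extracting the reason can look like, here is a minimal sketch that trims a task result dict down to the fields that usually explain a failure. The helper name failure_summary is hypothetical, and the keys msg, rc and stderr are simply the ones Ansible prints for the failing shell task at the end of this post; other modules may return different keys.

```python
import json


def failure_summary(result):
    """Pick the fields most useful for debugging out of a task result dict."""
    # Assumption: 'msg', 'rc' and 'stderr' are the interesting keys. They are
    # the ones returned by the failing shell task shown later in this post.
    wanted = ('msg', 'rc', 'stderr')
    return {key: result[key] for key in wanted if key in result}


# A hand-written result dict shaped like the shell task failure in this post.
example = {
    'changed': True,
    'cmd': 'exit 1',
    'msg': 'non-zero return code',
    'rc': 1,
    'stderr': '',
    'stdout': '',
}
print(json.dumps(failure_summary(example), indent=2, sort_keys=True))
```

The plugin in this post sends the whole result to Slack instead, which is noisier but never hides a field you might need.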
We're going to develop a simple callback plugin which sends us a Slack message any time a playbook fails. We count it as a failure when Ansible can't connect to a host machine or when a playbook task fails.
In order to notify Slack we'll need to create a Slack webhook. I'm not going to cover that here, but you can find the steps in the Slack API documentation.
I will create a class which inherits from Ansible's CallbackBase. The class will define three class properties:
CALLBACK_VERSION = 2.0 - This is not the version of our plugin, but rather the version of Ansible's plugin interface the plugin is written against. Ansible v1 and Ansible v2 define different interfaces for plugins. I'm running Ansible 2.5 so I'll use the v2 plugin interface.
CALLBACK_NEEDS_WHITELIST = True - Defines whether the plugin needs to be whitelisted in ansible.cfg in order to run. If this value is False I can simply drop the plugin into our project and it should always run. However, I prefer to be explicit and whitelist my plugins in ansible.cfg.
CALLBACK_NAME = 'slack' - I'm actually not sure why this variable is used, because when Ansible runs plugins it reads the name from the filename. However, we'll define it anyway because the Ansible documentation states that it's required.
Callback plugins work by implementing methods which are executed at different stages of a playbook run.
#set_options - is run when the callback plugin is set up and allows us to read environment variables and assign them to Python variables.
#v2_runner_on_failed - is run when a playbook task fails. We'll send a Slack message notifying us of the failure.
#v2_runner_on_unreachable - is run when Ansible can't connect to a specific host. We'll also notify Slack when this happens.
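To make the idea of methods being called at different stages more concrete, here is a simplified sketch of event dispatch. It is not Ansible's actual dispatch code: RecordingCallback and send_event are hypothetical names, and the stand-in plugin only records which hooks were called.

```python
class RecordingCallback(object):
    """Hypothetical stand-in plugin that records which hooks were called."""

    def __init__(self):
        self.events = []

    def v2_runner_on_failed(self, result, ignore_errors=False):
        self.events.append(('failed', result))

    def v2_runner_on_unreachable(self, result):
        self.events.append(('unreachable', result))


def send_event(plugins, method_name, *args, **kwargs):
    """Call method_name on every plugin that implements it, skip the rest."""
    for plugin in plugins:
        method = getattr(plugin, method_name, None)
        if callable(method):
            method(*args, **kwargs)


plugin = RecordingCallback()
send_event([plugin], 'v2_runner_on_failed', {'rc': 1})
send_event([plugin], 'v2_runner_on_unreachable', {'unreachable': True})
# The stand-in does not implement this hook, so the event is skipped.
send_event([plugin], 'v2_playbook_on_start')
print(plugin.events)
```

The point of the sketch is that a plugin only has to define the hooks it cares about. In this sketch, events without a matching method are skipped.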
Below is an example of a simple Slack callback plugin.
# These imports are defined for every callback plugin I've seen so far.
# If you don't import `absolute_import` standard library modules may be
# overridden by Ansible python modules with the same name. For example: I use the
# standard library `json` module, but Ansible has a callback plugin with the
# same name. When I excluded `absolute_import` and imported `json` the json
# module I got was Ansible's json module and not the standard library one.
# I'm not sure why `division` and `print_function` need to be imported.
from __future__ import (absolute_import, division, print_function)
from ansible.plugins.callback import CallbackBase
__metaclass__ = type
import json
import urllib2
import sys
import os
# Ansible documentation of the module. I'm also not sure why this is required,
# but other plugins add documentation so it seems to be a standard.
DOCUMENTATION = '''
    callback: slack
    options:
      slack_webhook_url:
        required: True
        env:
          - name: SLACK_WEBHOOK_URL
      slack_channel:
        required: False
        env:
          - name: SLACK_CHANNEL
'''
class CallbackModule(CallbackBase):
    CALLBACK_VERSION = 2.0
    CALLBACK_NAME = 'slack'
    CALLBACK_NEEDS_WHITELIST = True

    def __init__(self):
        super(CallbackModule, self).__init__()

    def set_options(self, task_keys=None, var_options=None, direct=None):
        super(CallbackModule, self).set_options(task_keys=task_keys, var_options=var_options, direct=direct)
        # Read and assign environment variables to memory so that we can use
        # them later.
        self.slack_webhook_url = os.environ.get('SLACK_WEBHOOK_URL')
        self.slack_channel = os.environ.get('SLACK_CHANNEL')
        if self.slack_webhook_url is None:
            self._display.display('Error: The slack callback plugin requires `SLACK_WEBHOOK_URL` to be defined in the environment')
            sys.exit(1)

    def v2_runner_on_failed(self, taskResult, ignore_errors=False):
        notify(self.slack_webhook_url, taskResult, self.slack_channel)

    def v2_runner_on_unreachable(self, taskResult):
        notify(self.slack_webhook_url, taskResult, self.slack_channel)
def notify(slack_webhook_url, taskResult, slack_channel=None):
    # Format the Slack message. We'll use message attachments
    # https://api.slack.com/docs/message-attachments
    payload = {
        'username': 'Ansible',
        'attachments': [
            {
                'title': 'Ansible run has failed. HOST: {} {}'.format(taskResult._host, taskResult._task),
                'color': '#FF0000',
                'text': '```{}```'.format(json.dumps(taskResult._result, indent=2))
            }
        ]
    }
    # The webhook has a default channel. If a channel is not configured,
    # we'll use the default.
    if slack_channel:
        payload['channel'] = slack_channel
    req = urllib2.Request(slack_webhook_url)
    urllib2.urlopen(req, data=json.dumps(payload))
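The plugin above uses urllib2, which only exists on Python 2. If you run Ansible under Python 3, the request could be built along these lines. This is a sketch under that assumption rather than a drop-in replacement: build_slack_request is a hypothetical helper, the URL is a placeholder, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_slack_request(webhook_url, payload):
    """Build (but do not send) a JSON POST request for a Slack webhook."""
    # urllib.request expects the body as bytes, so encode the JSON string.
    body = json.dumps(payload).encode('utf-8')
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={'Content-Type': 'application/json'},
    )


# Placeholder URL with the same shape as the one used later in this post.
req = build_slack_request(
    'https://hooks.slack.com/services/xxxxxxxxxx',
    {'username': 'Ansible', 'text': 'Ansible run has failed'},
)
# A Request that carries data defaults to the POST method.
print(req.get_method())
# To actually send it: urllib.request.urlopen(req, timeout=10)
```

Passing a timeout when sending is worth considering, so that a slow Slack response can't hang the Ansible run.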
To enable the plugin you'll need to whitelist it in ansible.cfg. Add the following lines to your ansible.cfg file.
[defaults]
callback_whitelist = slack
It's important to note that the Ansible source code does not care what's defined as the Python property CALLBACK_NAME in the plugin. Instead, the callback_whitelist entry needs to match the filename of your callback plugin, without the extension.
I spent an hour debugging why my callback plugin wasn't being called before
digging into the source code and finding this:
(callback_name, _) = os.path.splitext(os.path.basename(callback_plugin._original_path))
It takes the filename of your plugin and compares it to the whitelist. If it matches, your callback is run. I'm still not sure what CALLBACK_NAME is used for, but according to the documentation it's required.
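The comparison can be reproduced in a few lines. This is a sketch of the idea built around the splitext/basename expression quoted above, not a copy of Ansible's code. The helper names and the slack_notify.py file are made up for illustration.

```python
import os


def plugin_name_from_path(path):
    """Return the file name without its directory or extension."""
    name, _ = os.path.splitext(os.path.basename(path))
    return name


def is_whitelisted(path, whitelist):
    """True when the name derived from the file is listed in the whitelist."""
    return plugin_name_from_path(path) in whitelist


whitelist = ['slack']  # mirrors callback_whitelist = slack in ansible.cfg
print(is_whitelisted('callback_plugins/slack.py', whitelist))
print(is_whitelisted('callback_plugins/slack_notify.py', whitelist))
```

The second file would not run with this whitelist, whatever its CALLBACK_NAME says, which is the behaviour that cost me an hour.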
Note that the CALLBACK_VERSION and CALLBACK_NAME definitions are required for properly functioning plugins for Ansible >=2.0.
We'll also need to place the python file into a directory that is read by Ansible. Ansible documentation states:
You can activate a custom callback by either dropping it into a callback_plugins directory adjacent to your play, inside a role, or by putting it in one of the callback directory sources configured in ansible.cfg.
I have a per-project ansible.cfg that exists in the same directory as my main playbook. This means that I place my callback plugins in a callback_plugins directory adjacent to ansible.cfg.
$ tree ~/repos/myproject/ansible
/home/dani/repos/myproject/ansible
├── ansible.cfg
├── callback_plugins
│   ├── slack.py
├── inventory.yml
├── main.yml
Our Slack callback plugin should now run anytime we have a failing task or can't connect to a host. We can create an example task which is going to fail immediately.
Example playbook
$ cat slack-playbook.yml
- name: Slack example
  hosts: localhost
  tasks:
    - name: Fail
      shell: exit 1
Example inventory
$ cat inventory.yml
all:
  hosts:
    localhost:
      ansible_connection: local
Example ansible.cfg
$ cat ansible.cfg
[defaults]
callback_whitelist = slack
When running our playbook we need to define the environment variable SLACK_WEBHOOK_URL and optionally SLACK_CHANNEL (if the channel is not defined, the message goes to the default channel the webhook was configured for). The Ansible documentation states that you can pass variables from ansible.cfg to plugins, but I didn't find any instructions on how this is done. The standard way I've seen for most plugins is to use environment variables.
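If you want to test the environment handling without running Ansible, one option is to pull it into a small function. The sketch below is a variation on what set_options does in the plugin: read_slack_settings is a hypothetical helper, and it raises an error instead of calling sys.exit so that the caller decides how to react.

```python
import os


def read_slack_settings(environ=None):
    """Return (webhook_url, channel) read from the environment.

    Raises ValueError when SLACK_WEBHOOK_URL is missing or empty.
    channel is None when SLACK_CHANNEL is not set.
    """
    if environ is None:
        environ = os.environ
    webhook_url = environ.get('SLACK_WEBHOOK_URL')
    if not webhook_url:
        raise ValueError('SLACK_WEBHOOK_URL must be defined in the environment')
    return webhook_url, environ.get('SLACK_CHANNEL')


# A plain dict stands in for os.environ so the example is repeatable.
settings = read_slack_settings(
    {'SLACK_WEBHOOK_URL': 'https://hooks.slack.com/services/xxxxxxxxxx'}
)
print(settings)
```

With the variables in place, the playbook run looks like this: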
$ SLACK_WEBHOOK_URL=https://hooks.slack.com/services/xxxxxxxxxx ANSIBLE_CONFIG=./ansible.cfg ansible-playbook slack-playbook.yml -i inventory.yml
PLAY [Slack example] *****************************************************************
TASK [Gathering Facts] ***************************************************************
ok: [localhost]
TASK [Fail] **************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "exit 1", "delta": "0:00:00.001367", "end": "2018-03-22 05:54:31.217632", "msg": "non-zero return code", "rc": 1, "start": "2018-03-22 05:54:31.216265", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
to retry, use: --limit @/tmp/tmp.S1IR7jIb1p/slack.retry
PLAY RECAP ***************************************************************************
localhost : ok=1 changed=0 unreachable=0 failed=1
The Slack output
I wrote a simple callback plugin that implements three of the interface methods callback plugins provide. I didn't find any official documentation on which methods you can override, but you can find all of them in the source code.
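Besides reading the source, a quick way to see the available hooks is to introspect the base class. The sketch below assumes the hooks can be recognised by the v2_ prefix used in this post. list_v2_hooks is a hypothetical helper, and ExampleBase is a stand-in so the snippet runs without Ansible installed.

```python
def list_v2_hooks(cls):
    """Return the sorted names of callable attributes starting with 'v2_'."""
    return sorted(
        name for name in dir(cls)
        if name.startswith('v2_') and callable(getattr(cls, name))
    )


class ExampleBase(object):
    """Hypothetical stand-in; pass Ansible's CallbackBase in real use."""

    def v2_runner_on_failed(self, result, ignore_errors=False):
        pass

    def v2_runner_on_unreachable(self, result):
        pass

    def set_options(self, task_keys=None, var_options=None, direct=None):
        pass


print(list_v2_hooks(ExampleBase))
```

With Ansible installed, importing CallbackBase from ansible.plugins.callback and passing it to list_v2_hooks should list the hooks for your version.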