ANSIBLE 202

Best Practices from the field

 

 

 

Juan Manuel Parrilla Madrid

Index:

  • Whoami
  • Introduction to Ansible Development
  • Official Best Practices
  • Ansible Best Practices & Style-Guide
    • General
    • Git SCM
    • Roles
    • Tasks
    • Variables
  • Ansible Role Testing
    • Testing Methods: Molecule
    • Alternative testing methods
    • Ansible-Lint
    • Ansible-CI/CD
  • Re-using code on Ansible
  • Resources

Whoami

---
galaxy_info:
  Author: Juan Manuel Parrilla
  Description:
    - Automation fan
    - Python Lover
    - Geek!
  Company: Red Hat
  Job: Senior Systems Design Engineer
  Github: github.com/jparrill
  IRC: @jparrill
  Twitter: @Kerbeross
  Galaxy_tags:
    - 'Ansible'
    - 'Origin'
    - 'ManageIQ'
    - 'Foreman'
    - 'AWX'

Introduction to Ansible Development

Introduction...

These slides will introduce you to the world of advanced Ansible development. They cover a few things you need in order to speak Ansible more fluently ;)

  • This is neither an introduction to Ansible per se nor a Module/Plugin development guide
  • However you will learn about:
    • A Style Guide to develop all Roles/Playbooks
    • Best practices when you create Roles/Playbook on Ansible
    • Best practices to work with your Git repository
    • How to link your Ansible repositories to your CI/CD server and perform automated actions
    • Create re-usable code on Ansible
    • How to debug your Playbooks
    • Handling errors in the right way
    • Testing suite for Ansible
  • Ansible version used: >=2.3.1
  • OS: RHEL7/CentOS7/Fedora2X

Official Best Practices

Best Practices

The Ansible documentation has some official best practices that can help you start working in the right way:

 

Those entries are a good read for developing your Ansible code in the right way. Drawing on experience with big platforms, complex deployments, orchestration and many kinds of servers, these slides will show you how to create sustainable, re-usable and effective code, amongst other things.

Ansible

Best Practices & Style-Guide

General Topics


In this section we will review some important general topics around Ansible.

Here are some non-groupable best practices:

  • Debug Strategies
  • Literals: ' vs " (*)
  • Passwords on Vaults (*)
  • Ansible.cfg location and best possible approximation
  • Project layouts (Big vs Small)
  • Limit vs Tags
  • Callbacks
  • Serial

Debug Strategies

We can debug our playbooks following different strategies. My favorite one is using debug messages in the important parts of the code. Let's see some examples.


When the debugger prompts you, these actions are available:

  • EOF/(q)uit: Quit the debugger
  • (c)ontinue: Continue with the execution
    • This may seem useless, but you can change some key-values in the debugger and then continue, so the change applies on the fly.
  • (p)rint: Print or inspect resources such as:
    • The result of the task (p result)
    • The node facts
    • The task arguments, which you can also change (task.args['data'] = '{{ var1 }}')
    • etc...
  • (r)edo: Re-run the last failed task


PLAY ***************************************************************************

TASK [wrong variable] **********************************************************
fatal: [192.0.2.10]: FAILED! => {"failed": true, "msg": "ERROR! 'wrong_var' is undefined"}
Debugger invoked
(debug) p result
{'msg': u"ERROR! 'wrong_var' is undefined", 'failed': True}
(debug) p task.args
{u'data': u'{{ wrong_var }}'}
(debug) task.args['data'] = '{{ var1 }}'
(debug) p task.args
{u'data': '{{ var1 }}'}
(debug) redo
ok: [192.0.2.10]

PLAY RECAP *********************************************************************
192.0.2.10               : ok=1    changed=0    unreachable=0    failed=0
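A play along these lines would lead to a session like the one above. It is only a sketch modelled on the transcript: the group name 'test' and the value of var1 are assumptions, while the task name, the 'data' argument and the undefined wrong_var are taken from the output shown.

```yaml
---
- hosts: test                      # assumed group name
  strategy: debug                  # open the debugger when a task fails
  gather_facts: false
  vars:
    var1: 'value1'                 # assumed value
  tasks:
    - name: 'wrong variable'
      ping:
        data: "{{ wrong_var }}"    # undefined on purpose, so the task fails
```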

These are ways to debug your actions/variables and to hide the debug output when you do not need it:
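As a minimal sketch of that idea, the debug module can print a message every time, or only when Ansible is run with enough -v flags. The task names are hypothetical; 'verbosity' is a standard parameter of the debug module.

```yaml
- name: '[DEBUG] Always printed'
  debug:
    msg: "Working on {{ inventory_hostname }}"

- name: '[DEBUG] Printed only with -vv or higher'
  debug:
    var: group_names
    verbosity: 2      # hidden on a normal run
```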


Literals: ' vs "                      (*)

Quoting works as in the source code of other languages, but here you can use quotes in different ways. The goal is to standardize how we write literals:


Variables are written with double quotes (") and literals with single quotes (').
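A small sketch of this convention, with hypothetical variable names:

```yaml
---
- name: '[QUOTES] Show the quoting convention'
  hosts: localhost
  gather_facts: false
  vars:
    greeting: 'Hello'                  # plain literal: single quotes
    message: "{{ greeting }} world"    # contains a variable: double quotes
  tasks:
    - name: '[QUOTES] Print the message'
      debug:
        msg: "{{ message }}"
```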


This could be tricky sometimes, for instance:

Checking conditionals:

Lookups:
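The two tricky cases could look like this. It is a sketch with hypothetical task names: in both, the literal inside the expression keeps its single quotes, and the double quotes only wrap the whole Jinja2 expression. The first task assumes that facts have been gathered.

```yaml
- name: '[QUOTES] Conditional with a literal inside'
  debug:
    msg: 'This node belongs to the RedHat family'
  when: ansible_os_family == 'RedHat'

- name: '[QUOTES] Lookup with literal arguments'
  debug:
    msg: "{{ lookup('env', 'HOME') }}"
```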


The output:

Passwords                            (*)

If you are not yet using Vault to store passwords, you should start. It is easy to use:

  • Create a vault:
> ansible-vault create [filename]
New Vault password: ****
Confirm New Vault password: ****

  • Use a vault on playbooks:

Consider a vault file like another variable file.
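For instance, a play could load the vault through vars_files like any other variable file. This is a sketch: the path vault/secrets.yml, the variable vault_app_password_hash and the user name are assumptions, not part of the original slides.

```yaml
---
# Run with: ansible-playbook site.yml --ask-vault-pass
- name: '[VAULT] Use an encrypted variable'
  hosts: all
  become: True
  vars_files:
    - vault/secrets.yml        # hypothetical file created with ansible-vault create
  tasks:
    - name: '[VAULT] Create the application user'
      user:
        name: 'app'
        password: "{{ vault_app_password_hash }}"   # hypothetical variable stored in the vault
      no_log: True             # keep the secret out of the output
```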


  • Output:

 

 

 

 

 

 

 

  • How to store vaults:

We recommend storing all your vaults in a vault folder at the root of your project, so that they are easy to identify at a glance.

Ansible.cfg

The ansible.cfg file can live in more than one place, and Ansible uses the first one it finds, in the order below. It is recommended to keep an ansible.cfg in every repository that you have, adapted to its needs.

Locations (ordered):

  • Environment variables: ANSIBLE_CONFIG
  • ./ansible.cfg: local directory
  • ~/.ansible.cfg: home directory
  • /etc/ansible/ansible.cfg: system directory

Project Layout

This is a sample of a big project: 

A complex hierarchy:

  • Docs folder to generate HTML autodocumentation
  • Group Vars and OSP Vars for variables related with Groups and Openstack
  • OSP generic templates
  • Plugins and Modules (library) folder
  • Roles folder
  • End-2-End tests on root path
  • All the secrets on vault files inside a vault folder


This is a sample of a multipurpose medium project: 

We have some roles, and not all are related.

  • A folder with all modules
  • A folder that stores all plugins
  • A Roles folder for logic storage
  • Group_vars for variables and a Vault to store the encrypted ones
  • A documentation folder
  • A utilities folder

Limit vs tags

Depending on your project, this topic can be complicated. Here are a couple of things to note to ensure a comfortable experience.

  • How many nodes does your inventory have?

If the answer is more than 50, you will want to use --limit on your executions. The reason is that when you use tags, Ansible still parses the whole inventory and, on every task, has to skip the nodes that do not apply.

Tags are useful, but not at a large scale.
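To make the comparison concrete, here is a sketch of a tagged task together with the two ways of narrowing an execution. The playbook name site.yml, the host web01 and the package are assumptions.

```yaml
---
# Narrow by task: every host in the play is still processed
#   ansible-playbook site.yml --tags ntp
# Narrow by host: only the matching hosts are processed
#   ansible-playbook site.yml --limit 'web01'
- name: '[SITE] Base configuration'
  hosts: all
  become: True
  tasks:
    - name: '[NTP] Install chrony'
      package:
        name: chrony
        state: present
      tags:
        - ntp
```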

Callbacks

This feature is not used much, but it is really useful in most Ansible executions. A callback is an Ansible plugin that runs Python code when an event occurs at execution time. This means, for instance, that you could send an HTTP request to Google when a task has been executed, or when the Ansible playbook is launched... This is very flexible.

This is the list of available bindings on Ansible 2.4:


Example of use:

  • This callback traces all the executed tasks and shows you the whole execution, with the ten slowest tasks at the end:

This is very useful for finding bottlenecks on your source code.


Callback code:


How to make callbacks available:

  • Modify your ansible.cfg by adding this line
callback_plugins    = <PATH TO CALLBACK FOLDER>
callback_plugins    = ./plugins/callback
  • Just execute ansible

 

There are pre-existing OOTB callbacks for use:

Serial

Basic part

Serial limits how many nodes Ansible works on at the same time: the play runs on batches of the size that you put in serial.

E.g.: if you put serial: 1, the playbook goes through the hosts one by one.

 

Useful part

Pass a list of batch sizes:

- name: '[DUMMY PLAYBOOK]'
  hosts: all
  become: True
  serial: 
    - '10%'
    - '50%'
    - '100%'
  tasks:
    - name: '[DUMMY TASK] Debug message'
      debug:
        msg: 'Test Serial'

This means that your code will be executed first on 10 percent of the hosts in the play, then on a batch of 50 percent of the hosts, and finally on all the remaining hosts. With 20 hosts, that is 2 hosts, then 10, then the last 8.

Questions?

ANSIBLE

BEST PRACTICES & STYLE-GUIDE

Git SCM


While this topic is not specific to Ansible, it's good practice to use Git with your co-workers and take advantage of all its features.

Resources:

 

The links above explain how to work with a well-performing Git branching model within big teams. Our model is similar; just follow this short set of rules:

  • Master and Develop branches are push-immutable, which means that nobody can push directly to these branches
  • The Master branch is the production branch
  • The Develop branch is the future Master branch
  • All PR/MR are based on Gitlab events
  • Unit-Tests are triggered by pushes to the following branches:
    • feature/XXXXX - If we want to add a feature
    • bugfix/XXXXX  - If we want to solve a bug


Git Submodules are a good resource to ensure that you do not repeat code, variables, configuration, etc...

For instance, if you always work on the same platform and you are developing several separate multi-purpose roles, you could share repositories between them as resources:

  • Vault files
  • Inventories
  • Libraries (Plugins and Modules)
  • Any other useful resource

This is good practice, but it has an additional management cost. When you update a repository that is a submodule of other repos, you must also update the commit that those other repositories point to.

Questions?

Ansible

Best Practices & Style-Guide

Roles


In this section we will review some important Role related topics around Ansible.

  • Examples of role execution on playbooks
  • Validation tasks
  • Metadata utility

Execution Examples

While this is not mandatory, it is very useful when you work with many kinds of roles/playbooks. Just add a comment at the top of the Playbook that shows an example Ansible execution of that same Playbook:
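Such a header comment could look like the sketch below. The inventory path, playbook name, group and host are all hypothetical.

```yaml
---
# Example of execution:
#   ansible-playbook -i inventory/hosts webservers.yml --limit 'web01'
- name: '[WEB] Configure the web servers'
  hosts: webservers
  become: True
  tasks:
    - name: '[WEB] Install httpd'
      package:
        name: httpd
        state: present
```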

This is the output of the last playbook:

Validation role/playbook

This is a suggestion that could help you when you are using an ENC for variables, like ETCD or Consul.

The point is to protect the playbook execution with an additional layer on every Role, which we call the validation layer.

At this layer make sure that:

  • You have loaded your key-values
  • The variables loaded belong to the environment that you are using
  • The variables are consistent with each other
  • Anything else that could protect you from a failed execution


This is a code example of it:
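A validation layer of this kind can be sketched with the assert module. The variable names (env_name, ntp_servers) and the list of environments are assumptions made for illustration only.

```yaml
- name: '[VALIDATION] Check that the key-values are loaded and consistent'
  assert:
    that:
      - env_name is defined
      - env_name in ['dev', 'pre', 'pro']
      - ntp_servers is defined
      - ntp_servers | length > 0
    msg: 'Missing or inconsistent variables, refusing to continue'
```

Placed at the start of a role's tasks, a failed assertion stops the execution before anything is changed on the node.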

Metadata utility

Think about Ansible Galaxy as the Puppet Forge: an external collection of roles that people from the Community upload and maintain.

As you see in a usual Role, the meta folder is created by default. It is used for multiple purposes:

  • Introduce the Galaxy info
  • Declare dependencies - All deps will be executed before this role
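A meta/main.yml covering both purposes might look like this sketch. The role names, author and tags are placeholders.

```yaml
---
# roles/awesome_role/meta/main.yml
galaxy_info:
  author: 'Your Name'
  description: 'Installs and configures an example service'
  license: 'MIT'
  min_ansible_version: 2.3
  platforms:
    - name: EL
      versions:
        - 7
  galaxy_tags:
    - 'example'
dependencies:
  - role: common        # hypothetical role, executed before this one
```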

Questions?

Ansible

Best Practices & Style-Guide

Tasks


In this section we will review some important Task related topics around Ansible.

  • Dynamic Includes
  • Oneliner vs YAML default format (*)
  • Task naming
  • Block/Rescue/Always
  • Shell/Command

Dynamic includes

You can perform dynamic includes depending on the logic of your Roles/Playbooks. You decide which logic is best in each case.

This is an example of how to perform an include driven by a variable/fact/etc...:
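A minimal sketch, assuming the role ships one task file per OS family (redhat.yml, debian.yml, ...) and that facts have been gathered. include_tasks is available from Ansible 2.4; on 2.3 the plain include keyword plays the same role.

```yaml
- name: '[PKG] Load the tasks that match the OS family'
  include_tasks: "{{ ansible_os_family | lower }}.yml"
```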

This is the output:

Oneliner vs YAML               (*)

If you write tasks as one-liners they will work fine, but it's better to standardise your coding methodology.

The difference is visible at a glance ;) :
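The two forms side by side, as a sketch with a hypothetical package:

```yaml
- name: '[PKG] Install httpd (one-liner form)'
  yum: name=httpd state=present

- name: '[PKG] Install httpd (YAML form)'
  yum:
    name: httpd
    state: present
```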

This is the output. Both work, but it's better to develop the whole code in the same way ;) :

Task naming

Naming tasks is a common good practice in Ansible, but you can make the names more descriptive, as in the following examples:

You can see that I have put the name of the topic between brackets, followed by a small description. It works the same for tasks.

This makes more sense when you work with a bigger repository and multiple roles.

When you work with Roles and many tasks, it's better to be able to locate every task at a glance.

Task naming in a complex role

Root Playbooks:

Main file within Role:

Roles folder structure:

After this set_fact at main.yml, more includes, blocks, etc... will come.


This file is packages.yml inside of Keystone role:


Block, rescue, always...

Some details about this kind of task have been explained already, but I will try to explain in more depth how to take advantage of this kind of meta-task.

When to use it:

  • When your task could fail or sometimes fails
  • When you want to group executions based on flags
  • When you don't know the state of a service/file/etc... on a node

 

I will show you some code-examples:

Block, rescue, always...

  • When your task could fail or sometimes fail:

Block, rescue, always...

  • When you want to group executions based on flags

Block, rescue, always...

  • When you don't know the state of a service/file/etc... on a node
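The three cases can be sketched as follows. Every path, script, service name and the troubleshoot flag are hypothetical; only the block/rescue/always structure is the point.

```yaml
# 1) A task that could fail: recover in rescue, clean up in always
- block:
    - name: '[APP] Run the upgrade script'
      command: /usr/local/bin/upgrade.sh
  rescue:
    - name: '[APP] Roll back after a failed upgrade'
      command: /usr/local/bin/rollback.sh
  always:
    - name: '[APP] Make sure the service is running'
      service:
        name: myapp
        state: started

# 2) Group several tasks behind one flag
- block:
    - name: '[TROUBLESHOOT] Show the groups of this node'
      debug:
        var: group_names
    - name: '[TROUBLESHOOT] Show the inventory name'
      debug:
        var: inventory_hostname
  when: troubleshoot | default(False) | bool

# 3) Unknown state on the node: check first, then act
- name: '[APP] Check whether the config file exists'
  stat:
    path: /etc/myapp/myapp.conf
  register: app_conf

- block:
    - name: '[APP] Create the config directory'
      file:
        path: /etc/myapp
        state: directory
    - name: '[APP] Write a default config file'
      copy:
        content: "# default configuration\n"
        dest: /etc/myapp/myapp.conf
  when: not app_conf.stat.exists
```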

SHELL/Command

There is a law that says: "Always use a native module before a Shell or Command execution". There are many reasons, but the main one is idempotency.

Do you know the difference between Command and Shell? Command runs the program directly, without a shell, so pipes, redirections and shell variables do not work; Shell runs the command line through a shell.

Assuming that we don't have any module that could replace a command, like in the last pacemaker example, we must find a way to make it idempotent.


With the Shell/Command modules we have multiple ways to maintain idempotency:

  • We have the parameter 'creates': if the given file already exists, the command will not be executed. Usually this is the file that the command itself creates.
  • A shell task always reports 'changed'. To maintain the Ansible philosophy, you must use 'changed_when' with a condition that evaluates when the command really changed something.
  • Loops - take advantage of meta-parameters like:
    • Register
    • Until
    • Retries
    • Delay

Examples....


Creates:
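A sketch with a hypothetical script that leaves a marker file behind when it finishes. On the second run the file exists, so the command is skipped.

```yaml
- name: '[DB] Initialise the database only once'
  command: /usr/local/bin/init-db.sh
  args:
    creates: /var/lib/myapp/.db_initialised
```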


Changed_when/Failed_when with Register:
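A sketch of a read-only check, using a hypothetical user name. getent exits with 0 when the user exists and 2 when it is not found, so only other return codes are treated as a failure, and the task never reports a change.

```yaml
- name: '[USERS] Check whether the myapp user exists'
  command: getent passwd myapp
  register: myapp_user
  changed_when: False                          # a query never changes the node
  failed_when: myapp_user.rc not in [0, 2]     # 2 only means "not found"

- name: '[USERS] Report the result'
  debug:
    msg: "myapp user present: {{ myapp_user.rc == 0 }}"
```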


Loops:
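A sketch of a retry loop built with register, until, retries and delay. The URL is a hypothetical health endpoint.

```yaml
- name: '[APP] Wait until the service answers'
  command: curl --silent --fail http://localhost:8080/health
  register: health
  until: health.rc == 0
  retries: 10        # try up to 10 times
  delay: 5           # wait 5 seconds between attempts
  changed_when: False
```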

Questions?

Ansible

Best Practices & Style-Guide

Variables

In this section we will review some important topics related to variables in Ansible.

  • Default parameters, all the ways... (*)
  • Using variables as ansible-code flags
  • Standardising types of variables
  • Kind of variables
  • ETCD

Default parameters                           (*)

OK, let's see how to declare a default variable:

  • In the 'defaults/main.yml' file, like other variables
  • As entries of a playbook
  • As a declared fact

In defaults/main.yml:

---
# roles/role_name/defaults/main.yml
test01: 'default value1'
test02: 'default value2'
test03: 'default value3'
test04: 'default value4'

This is the simplest way to do it and the most traceable.

As entry of a playbook:

As you see above we have involved 2 variables:

  • The variable that you will use in the code: var_with_default
  • The entry variable: troubleshoot with default value as False
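The two variables described above could be wired together like this. It is a sketch: the play and task names are hypothetical, and the default filter supplies False when troubleshoot is not passed.

```yaml
---
# Override with: ansible-playbook site.yml -e 'troubleshoot=True'
- name: '[DEFAULTS] Default declared as a playbook entry'
  hosts: all
  vars:
    var_with_default: "{{ troubleshoot | default(False) }}"
  tasks:
    - name: '[DEFAULTS] Show the effective value'
      debug:
        var: var_with_default
```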


As a declared fact:
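The same default can be declared with set_fact inside the tasks. This is a sketch that reuses the variable names from the previous slide.

```yaml
- name: '[DEFAULTS] Declare the default as a fact'
  set_fact:
    var_with_default: "{{ troubleshoot | default(False) | bool }}"
```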


Using variables as flags

This is not very popular, but I think it is a great way to control the flow of your code execution while avoiding tags in the code. (I'm not a fan of tags)

The behaviour is the same as in the "Grouping Features" topic that we have discussed before. I declare a variable that is False by default, and I execute ansible-playbook with extra-vars to modify the flow without modifying the code.

As you see in the 'when:' field, this is the way to evaluate a Boolean as it is.
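A flag-driven task could be sketched like this, with a hypothetical flag called troubleshoot. The bool filter is an addition of this sketch: extra-vars passed as key=value arrive as strings, so the filter makes sure that 'False' is not treated as true.

```yaml
# Enable with: ansible-playbook site.yml -e 'troubleshoot=True'
- name: '[TROUBLESHOOT] Show the groups of this node'
  debug:
    var: group_names
  when: troubleshoot | default(False) | bool
```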

Kind of variables

In this section I will describe where to put your variables depending on their use:

  • group_vars/all
    • Here are the variables common to all groups and hosts, like NTP, DNS, etc...
  • group_vars/<group_name>
    • Here are the group related variables. For example, for a group called compute, a variable called 'nova_service' could exist
  • host_vars/<hostname>
    • Here are the variables related to a single host. This is not used very often
  • roles/<role name>/vars/main.yml
    • Here are all role related variables
  • roles/<role name>/defaults/main.yml
    • Here are all default role related variables

  • You can also create variables inside a playbook, as you have seen in other examples
  • Extra-vars
    • Passed to Ansible on the command line at execution
  • Include_vars
    • This method loads variable files from outside the standard locations at execution time

Variables precedence (from lowest to highest)

  • role defaults [1]
  • inventory file or script group vars [2]
  • inventory group_vars/all
  • playbook group_vars/all
  • inventory group_vars/*
  • playbook group_vars/*
  • inventory file or script host vars [2]
  • inventory host_vars/*
  • playbook host_vars/*
  • host facts
  • play vars
  • play vars_prompt
  • play vars_files
  • role vars (defined in role/vars/main.yml)
  • block vars (only for tasks in block)
  • task vars (only for the task)
  • role (and include_role) params
  • include params
  • include_vars
  • set_facts / registered vars
  • extra vars (always win precedence)

ETCD

Over time, I have developed a Lookup and a Module to upload/get/override key-values from/to ETCD, here.

It's like an ENC with a folder hierarchy (v2), which we are using on our executions to maintain the state between different playbooks. Also you can perform actions depending on the state. 

This is an example of usage:


After the pre-task, we already have a variable called VAR_MAP. This variable is a complex one with a complex hierarchy:

Questions?

Ansible role testing

This section is an important one. After reviewing it you will be able to create tests for your role/playbook and execute them on your PC, and again after pushing the code to the remote repository

  • Testing Methods: Molecule
  • Alternative testing methods
  • Ansible-Lint
  • Ansible-CI/CD

Testing Methods

There are many ways to test Ansible code: testing every role in isolation, executing the whole playbook on a remote machine, etc...

We will see a couple of ways to run tests from our own node without affecting your system.

  • Preference:
    • Molecule
  • Alternatives
    • First way: Docker involved
    • Second way: Vagrant + (Libvirt || VirtualBox)
    • Third way: Cloud (*)

Molecule

If you know Puppet-kitchen, this is the equivalent for Ansible: Molecule is a testing suite that includes many useful tools to be used with Ansible.

At first it can be a little complicated, but it is worth it.

To try it, just create a Virtualenv and install all the requirements.

After that, let's play:

 

We will use Docker as the driver for this example, which means that all roles and tests will run on containers.

> molecule init role --role-name awesome_role --driver-name docker
> molecule --help

As you see, you can execute molecule in many ways, but 'molecule test' runs the whole test suite. Let's create the first role.


This is a Molecule structure tree:

The .molecule folder is the temporary workspace where files are created/deleted during the molecule execution

The default folder is named after the scenario created for the test plan


Here we have all the Molecule tests (phases):

  • Dependency: Uses requirements.yml [To-Generate]
    • Works like the dependencies directive in meta/main.yml
  • Create: Uses create.yml [Auto-Generated]
    • Destroys all the previous containers in the same context
    • Generates a Dockerfile on the fly, based on a J2 template
    • Builds the container/s and runs them
  • Destroy: Uses destroy.yml [Auto-Generated]
    • Destroys the generated containers
  • Prepare: Uses prepare.yml
    • Runs before the Converge phase, once the containers are up
  • Converge: Uses playbook.yml [Auto-Generated][To-Complete]
    • Executes the Create phase
    • Executes the Prepare phase
    • Executes playbook.yml on the raised containers/vms/instances

  • Syntax: Executes a syntax check on playbook.yml
  • Idempotence: Executes idempotence tests
    • Uses playbook.yml as base
    • Executes and re-executes the same role on the destination hosts
    • The test fails when the re-execution still performs changes on the destination nodes
    • The test passes when the first execution performs changes and the re-execution returns ok
  • Lint: Executes lint tests
    • Yamllint: Verifies that the YAML is compliant
    • Ansible-lint: Runs ansible-lint with the OOTB rules; you can also add your own
    • Flake8: Validates the Python tests (Testinfra) located in molecule/<scenario>/tests
  • Side-Effect: Uses <name>.yml
  • Verify: Executes the Testinfra test suite
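These phases are driven by the scenario's molecule.yml. The sketch below assumes Molecule 2.x with the Docker driver and follows the layout that 'molecule init' generates for that version; the platform name and the centos:7 image are assumptions.

```yaml
---
# molecule/default/molecule.yml
dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint
platforms:
  - name: instance
    image: centos:7
provisioner:
  name: ansible
  lint:
    name: ansible-lint
scenario:
  name: default
verifier:
  name: testinfra
  lint:
    name: flake8
```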

  • Example:
    • Test multiple versions of Ansible with the same role
    • Test multiple versions of Ansible with the same role (the config file)

With this simple example we validate that our code works on multiple versions of Ansible.

Alternative Testing Methods

First Way: Docker

Here you have a great article about using Docker to execute your role tests. I will show you with some code:

Let's dig in a little:

  • We start with hosts pointing to localhost, to execute on your machine.
  • We use become to allow Ansible to start a Docker container
  • In the vars section we see our container definitions:
    • A name and an image for every container that you want to raise
    • All inside a complex variable called inventory
  • Now the Roles for testing:
    • Role: provision_docker --> this is the role that performs the container creation
    • Provider docker inventory: "{{ inventory }}" variable
    • Provider docker privileged: when you need to use a privileged resource
    • Provision docker use docker connection: Don't connect over ssh, use the socket.
  • After that we can execute whatever we want on the containers; the example shows a simple "Hello World"

 

Let's see the output...

Alternative Testing Methods

Vagrant is a utility to bring up a VM using your favorite hypervisor (Libvirt, VirtualBox, etc...). It uses Boxes as guest images to raise your laboratory environment.

Example: https://github.com/padajuan/vagrant-lab-environment

Follow the requirements on the repository itself.

The main file is the Vagrantfile, where you define the machine status and preferences. You can also define a provisioning method:

  • Bash
  • Puppet
  • Chef
  • CFengine
  • Ansible

Functional personal project: https://github.com/padajuan/vagrant-cockpit

This is an example of a provision method with Vagrant:

As you see, we first configure the VM with variables and, after defining the VM, we expose some ports from the VM to the host (like Docker)

Also, at the end, we see the provision step.

Alternative Testing Methods

This can sometimes be complex because you don't have the resources to do it, but you could, for example, use a service provider like Digital Ocean, which has modules in Ansible, or AWS if you have more money:

It's easy: first provision the VM on your cloud, and after that execute the playbooks on this host. Usually you can do this in one execution by creating an empty group and populating it at execution time. I will provide an example with Openstack.


Create the instance:

Get the FIP:

Populate the inventory file (yes, a file, because it persists between Ansible executions), but you could also do this in memory

Perform tests on VMs created (playbook):

Perform tests on VMs created (task), but first we must get the IP in this new execution:

We use the Raw module because this is a Cirros image, which doesn't have a Python interpreter
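The in-memory variant mentioned above can be sketched with add_host: the first play fills an empty group at execution time and the second play runs on it. The floating IP, the group name and the remote user are assumptions; in a real run the IP would come from the instance creation step.

```yaml
---
- name: '[CLOUD] Register the new instance in memory'
  hosts: localhost
  gather_facts: False
  vars:
    instance_fip: '192.0.2.20'        # assumed floating IP of the new instance
  tasks:
    - name: '[CLOUD] Add the instance to an empty group'
      add_host:
        name: "{{ instance_fip }}"
        groups: 'test_instances'      # hypothetical group name
        ansible_user: 'cirros'        # assumed remote user

- name: '[CLOUD] Test the new instance'
  hosts: test_instances
  gather_facts: False                 # no Python on the image, so no fact gathering
  tasks:
    - name: '[CLOUD] Check that the instance answers'
      raw: uname -a
      changed_when: False
```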


Ansible-Lint

Ansible-lint is a set of rule-based checks that tell you what is ok and what is not ok in your Ansible code.

It is written in Python and gives you an interface to test the methods used in your playbooks.

This is a repository that contains some rules that could be useful to you:

https://github.com/padajuan/ansible-lint-rules

These are a couple of examples of rules to check with Ansible-lint:

You need some prerequisites: download the rules and install the ansible-lint pip package.

This is an execution of ansible-lint:

Ansible-CI/CD

This topic ties all the pieces together in an automatic way. We need to integrate your versioning system (Gitlab, Github, etc...) with your CI using WebHooks or Githooks: when a push/comment/merge request/etc... event arrives, a job is launched on your CI/CD server.

I will use Gitlab as an example (follow these steps):

  • Go to your repository and open the integrations tab

Configure the webhook here:

The URL must point to your Job on the Jenkins server. Remember to use a token for auth.

When you finish, press Add Webhook.

This is the Jenkins part. Create a Job with this activated:

Add a dummy execution to test the job and, after that, save it.

Test the job from Console:

See the execution working

Now configure Gitlab to use this hook:

Now the rest is easy: just use the methods that I mentioned before to run the tests when Gitlab events are raised.

For instance, a good chain of methods to perform in the environment could be:

  • When a push event happens
    • Download the repository
    • Execute ansible-playbook .... --syntax-check
    • Execute ansible-lint using rules from other repository
    • Execute Role base testing
    • Create a Merge request against development branch

Rescue (We are in a hurry!):

  • Demo running all Molecule tests: DEMO
  • Demo Steps: DEMO
    • Create docker images
    • Test syntax, lint, etc..
    • Test with Testinfra (Must fail)
    • Provision roles
    • Test again with Testinfra
  • Demo of how to test your roles with multiple versions of ansible: DEMO

Demo: Molecule

Questions?

Re-using ansible code

Re-using code in Ansible

I think this topic is one of the most complex in Ansible, because there is no good way to do it other than the native one, roles. But we will see some pieces of code that show how to re-use code that you have developed over time.


Let's get started with some 'simple' things, create re-usable roles based on functions/actions:

The purpose of this method is to follow DRY ('Don't repeat yourself').

Nested Loops
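A sketch of a nested loop with with_nested, which runs the task once for every combination of the two lists. The user and database names are made up.

```yaml
- name: '[USERS] Show every user/database combination'
  debug:
    msg: "{{ item[0] }} gets access to {{ item[1] }}"
  with_nested:
    - ['alice', 'bob']
    - ['clientdb', 'employeedb']
```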


Fire & Forget
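A sketch of the pattern with async and poll: 0, which launches the command and moves on without waiting. The second task, which checks the job later with async_status, is optional. The command and the timings are placeholders.

```yaml
- name: '[JOB] Launch a long-running command without waiting'
  command: /bin/sleep 300
  async: 600          # allow the job to run for up to 600 seconds
  poll: 0             # do not wait for it
  register: long_job

- name: '[JOB] Check on the job later'
  async_status:
    jid: "{{ long_job.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 30
  delay: 10
```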


Questions?

Resources


  • Madrid Meetup Video (Spanish): https://vimeo.com/243802379
  • Slides: http://redhat.slides.com/jparrill/ansible-202/fullscreen?token=1hUEEPF4
  • Demos:
    • https://asciinema.org/a/tgLtkp8jor8LFsLT05uWfeQ0W
    • https://asciinema.org/a/hGWTnJ8wOMUiPlLfdY7l6m36d
    • https://asciinema.org/a/bYvXE4739hJUkG3eCCsNaFRli
  • Code: https://github.com/jparrill/devconf-demo-ansible202

Q&A?