Ansible
16 minute read
Introduction to Ansible
Ansible is an open-source IT automation DevOps engine allowing you to manage and configure many compute resources in a scalable, consistent and reliable way.
Ansible to automates the following tasks:
-
Provisioning: It sets up the servers that you will use as part of your infrastructure.
-
Configuration management: You can change the configuration of an application, OS, or device. You can implement security policies and other configuration tasks.
-
Service management: You can start and stop services, install updates
-
Application deployment: You can conduct application deployments in an automated fashion that integrate with your DevOps strategies.
Prerequisite
We assume you
-
can install Ubuntu 18.04 virtual machine on VirtualBox
-
can install software packages via ‘apt-get’ tool in Ubuntu virtual host
-
already reserved a virtual cluster (with at least 1 virtual machine in it) on some cloud. OR you can use VMs installed in VirtualBox instead.
-
have SSH credentials and can login to your virtual machines.
Setting up a playbook
Let us develop a sample from scratch, based on the paradigms that ansible supports. We are going to use Ansible to install Apache server on our virtual machines.
First, we install ansible on our machine and make sure we have an up to date OS:
$ sudo apt-get update
$ sudo apt-get install ansible
Next, we prepare a working environment for your Ansible example
$ mkdir ansible-apache
$ cd ansible-apache
To use ansible we will need a local configuration. When you execute
Ansible within this folder, this local configuration file is always
going to overwrite a system level Ansible configuration. It is in
general beneficial to keep custom configurations locally unless you
absolutely believe it should be applied system wide. Create a file
inventory.cfg
in this folder, add the following:
[defaults]
hostfile = hosts.txt
This local configuration file tells that the target machines' names
are given in a file named hosts.txt
. Next we will specify hosts in
the file.
You should have ssh login accesses to all VMs listed in this file as
part of our prerequisites. Now create and edit file hosts.txt
with
the following content:
[apache]
<server_ip> ansible_ssh_user=<server_username>
The name apache
in the brackets defines a server group name. We will
use this name to refer to all server items in this group. As we intend
to install and run apache on the server, the name choice seems quite
appropriate. Fill in the IP addresses of the virtual machines you
launched in your VirtualBox and fire up these VMs in you VirtualBox.
To deploy the service, we need to create a playbook. A playbook tells
Ansible what to do. it uses YAML Markup syntax. Create and edit a file
with a proper name e.g. apache.yml
as follow:
---
- hosts: apache #comment: apache is the group name we just defined
become: yes #comment: this operation needs privilege access
tasks:
- name: install apache2 # text description
apt: name=apache2 update_cache=yes state=latest
This block defines the target VMs and operations(tasks) need to apply.
We are using the apt
attribute to indicate all software packages that
need to be installed. Dependent on the distribution of the operating
system it will find the correct module installer without your
knowledge. Thus an ansible playbook could also work for multiple
different OSes.
Ansible relies on various kinds of modules to fulfil tasks on the remote
servers. These modules are developed for particular tasks and take in
related arguments. For instance, when we use apt
module, we
need to tell which package we intend to install. That is why we provide
a value for the name=
argument. The first -name
attribute is just
a comment that will be printed when this task is executed.
Run the playbook
In the same folder, execute
ansible-playbook apache.yml --ask-sudo-pass
After a successful run, open a browser and fill in your server IP. you should see an ‘It works!’ Apache2 Ubuntu default page. Make sure the security policy on your cloud opens port 80 to let the HTTP traffic go through.
Ansible playbook can have more complex and fancy structure and syntaxes. Go explore! This example is based on:
We are going to offer an advanced Ansible in next chapter.
Ansible Roles
Next we install the R package onto our cloud VMs. R is a useful statistic programing language commonly used in many scientific and statistics computing projects, maybe also the one you chose for this class. With this example we illustrate the concept of Ansible Roles, install source code through Github, and make use of variables. These are key features you will find useful in your project deployments.
We are going to use a top-down fashion in this example. We first start from a playbook that is already good to go. You can execute this playbook (do not do it yet, always read the entire section first) to get R installed in your remote hosts. We then further complicate this concise playbook by introducing functionalities to do the same tasks but in different ways. Although these different ways are not necessary they help you grasp the power of Ansible and ease your life when they are needed in your real projects.
Let us now create the following playbook with the name example.yml
:
---
- hosts: R_hosts
become: yes
tasks:
- name: install the R package
apt: name=r-base update_cache=yes state=latest
The hosts are defined in a file hosts.txt
, which we configured in
a file that we now call ansible.cfg
:
[R_hosts]
<cloud_server_ip> ansible_ssh_user=<cloud_server_username>
Certainly, this should get the installation job done. But we are going to extend it via new features called role next
Role is an important concept used often in large Ansible projects. You divide a series of tasks into different groups. Each group corresponds to certain role within the project.
For example, if your project is to deploy a web site, you may need to install the back end database, the web server that responses HTTP requests and the web application itself. They are three different roles and should carry out their own installation and configuration tasks.
Even though we only need to install the R package in this example, we
can still do it by defining a role ‘r’. Let us modify our example.yml
to be:
---
- hosts: R_hosts
roles:
- r
Now we create a directory structure in your top project directory as follows
$ mkdir -p roles/r/tasks
$ touch roles/r/tasks/main.yml
Next, we edit the main.yml
file and include the following content:
---
- name: install the R package
apt: name=r-base update_cache=yes state=latest
become: yes
You probably already get the point. We take the ‘tasks’ section out of
the earlier example.yml
and re-organize them into roles. Each role
specified in example.yml
should have its own directory under roles/ and
the tasks need be done by this role is listed in a file ‘tasks/main.yml’
as previous.
Using Variables
We demonstrate this feature by installing source code from Github. Although R can be installed through the OS package manager (apt-get etc.), the software used in your projects may not. Many research projects are available by Git instead. Here we are going to show you how to install packages from their Git repositories. Instead of directly executing the module ‘apt’, we pretend Ubuntu does not provide this package and you have to find it on Git. The source code of R can be found at https://github.com/wch/r-source.git. We are going to clone it to a remote VM’s hard drive, build the package and install the binary there.
To do so, we need a few new Ansible modules. You may remember from the
last example that Ansible modules assist us to do different tasks
based on the arguments we pass to it. It will come to no surprise that
Ansible has a module ‘git’ to take care of git-related works, and a
‘command’ module to run shell commands. Let us modify
roles/r/tasks/main.yml
to be:
---
- name: get R package source
git:
repo: https://github.com/wch/r-source.git
dest: /tmp/R
- name: build and install R
become: yes
command: chdir=/tmp/R "{{ item }}"
with_items:
- ./configure
- make
- make install
The role r
will now carry out two tasks. One to clone the R source
code into /tmp/R
, the other uses a series of shell commands to build and
install the packages.
Note that the commands executed by the second task may not be available on a fresh VM image. But the point of this example is to show an alternative way to install packages, so we conveniently assume the conditions are all met.
To achieve this we are using variables in a separate file.
We typed several string constants in our Ansible scripts so far. In
general, it is a good practice to give these values names and use them
by referring to their names. This way, you complex Ansible project can
be less error prone. Create a file in the same directory, and name it
vars.yml
:
---
repository: https://github.com/wch/r-source.git
tmp: /tmp/R
Accordingly, we will update our example.yml
:
---
- hosts: R_hosts
vars_files:
- vars.yml
roles:
- r
As shown, we specify a vars_files
telling the script that the file
vars.yml
is going to supply variable values, whose keys are denoted by
Double curly brackets like in roles/r/tasks/main.yml
:
---
- name: get R package source
git:
repo: "{{ repository }}"
dest: "{{ tmp }}"
- name: build and install R
become: yes
command: chdir="{{ tmp }}" "{{ item }}"
with_items:
- ./configure
- make
- make install
Now, just edit the hosts.txt
file with your target VMs' IP addresses and
execute the playbook.
You should be able to extend the Ansible playbook for your needs. Configuration tools like Ansible are important components to master the cloud environment.
Ansible Galaxy
Ansible Galaxy is a marketplace, where developers can share Ansible Roles to complete their system administration tasks. Roles exchanged in Ansible Galaxy community need to follow common conventions so that all participants know what to expect. We will illustrate details in this chapter.
It is good to follow the Ansible Galaxy standard during your development as much as possible.
Ansible Galaxy helloworld
Let us start with a simplest case: We will build an Ansible Galaxy project. This project will install the Emacs software package on your localhost as the target host. It is a helloworld project only meant to get us familiar with Ansible Galaxy project structures.
First you need to create a directory. Let us call it mongodb
:
$ mkdir mongodb
Go ahead and create files README.md
, playbook.yml
, inventory
and a
subdirectory roles/
then `playbook.yml is your project playbook. It
should perform the Emacs installation task by executing the
corresponding role you will develop in the folder ‘roles/’. The only
difference is that we will construct the role with the help of
ansible-galaxy this time.
Now, let ansible-galaxy initialize the directory structure for you:
$ cd roles
$ ansible-galaxy init <to-be-created-role-name>
The naming convention is to concatenate your name and the role name by a dot. @fig:ansible shows how it looks like.
{#fig:ansible}
Let us fill in information to our project. There are several main.yml
files in different folders, and we will illustrate their usages.
defaults and vars:
These folders should hold variables key-value pairs for your playbook scripts. We will leave them empty in this example.
files:
This folder is for files need to be copied to the target hosts. Data files or configuration files can be specified if needed. We will leave it empty too.
templates:
Similar missions to files/, templates is allocated for template files. Keep empty for a simple Emacs installation.
handlers:
This is reserved for services running on target hosts. For example, to restart a service under certain circumstance.
tasks:
This file is the actual script for all tasks. You can use the role you built previously for Emacs installation here:
--- - name: install Emacs on Ubuntu 16.04 become: yes package: name=emacs state=present
meta:
Provide necessary metadata for our Ansible Galaxy project for shipping:
---
galaxy_info:
author: <you name>
description: emacs installation on Ubuntu 16.04
license:
- MIT
min_ansible_version: 2.0
platforms:
- name: Ubuntu
versions:
- xenial
galaxy_tags:
- development
dependencies: []
Next let us test it out. You have your Ansible Galaxy role ready
now. To test it as a user, go to your directory and edit the other
two files inventory.txt
and playbook.yml
, which are already generated
for you in directory tests
by the script:
$ ansible-playbook -i ./hosts playbook.yml
After running this playbook, you should have Emacs installed on localhost.
A Complete Ansible Galaxy Project
We are going to use ansible-galaxy to setup a sample project. This sample project will:
- use a cloud cluster with multiple VMs
- deploy Apache Spark on this cluster
- install a particular HPC application
- prepare raw data for this cluster to process
- run the experiment and collect results
Ansible: Write a Playbooks for MongoDB
Ansible Playbooks are automated scripts written in YAML data format. Instead of using manual commands to setup multiple remote machines, you can utilize Ansible Playbooks to configure your entire systems. YAML syntax is easy to read and express the data structure of certain Ansible functions. You simply write some tasks, for example, installing software, configuring default settings, and starting the software, in a Ansible Playbook. With a few examples in this section, you will understand how it works and how to write your own Playbooks.
- There are also several examples of using Ansible Playbooks from the official site. It covers
-
from basic usage of Ansible Playbooks to advanced usage such as applying patches and updates with different roles and groups.
We are going to write a basic playbook of Ansible
software. Keep in mind that Ansible
is a main program and playbook
is a template that you would like to use. You may have several playbooks
in your Ansible.
First playbook for MongoDB Installation
As a first example, we are going to write a playbook which installs MongoDB server. It includes the following tasks:
- Import the public key used by the package management system
- Create a list file for MongoDB
- Reload local package database
- Install the MongoDB packages
- Start MongoDB
The material presented here is based on the manual installation of MongoDB from the official site:
We also assume that we install MongoDB on Ubuntu 15.10.
Enabling Root SSH Access
Some setups of managed nodes may not allow you to log in as root. As
this may be problematic later, let us create a playbook to resolve this.
Create a enable-root-access.yaml
file with the following contents:
---
- hosts: ansible-test
remote_user: ubuntu
tasks:
- name: Enable root login
shell: sudo cp ~/.ssh/authorized_keys /root/.ssh/
Explanation:
-
hosts
specifies the name of a group of machines in the inventory -
remote_user
specifies the username on the managed nodes to log in as -
tasks
is a list of tasks to accomplish having aname
(a description) and modules to execute. In this case we use theshell
module.
We can run this playbook like so:
$ ansible-playbook -i inventory.txt -c ssh enable-root-access.yaml
PLAY [ansible-test] ***********************************************************
GATHERING FACTS ***************************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
TASK: [Enable root login] *****************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]
PLAY RECAP ********************************************************************
10.23.2.104 : ok=2 changed=1 unreachable=0 failed=0
10.23.2.105 : ok=2 changed=1 unreachable=0 failed=0
Hosts and Users
First step is choosing hosts to install MongoDB and a user account to
run commands (tasks). We start with the following lines in the example
filename of mongodb.yaml
:
---
- hosts: ansible-test
remote_user: root
become: yes
In a previous section, we setup two machines with ansible-test
group
name. We use two machines for MongoDB installation.
Also, we use root
account to complete Ansible tasks.
- Indentation is important in YAML format. Do not ignore spaces start
-
with in each line.
Tasks
A list of tasks contains commands or configurations to be executed on
remote machines in a sequential order. Each task comes with a name
and
a module
to run your command or configuration. You provide a
description of your task in name
section and choose a module
for
your task. There are several modules that you can use, for example,
shell
module simply executes a command without considering a return
value. You may use apt
or yum
module which is one of the packaging
modules to install software. You can find an entire list of modules
here: http://docs.ansible.com/list_of_all_modules.html
Module apt_key: add repository keys
We need to import the MongoDB public GPG Key. This is going to be a first task in our playbook.:
tasks:
- name: Import the public key used by the package management system
apt_key: keyserver=hkp://keyserver.ubuntu.com:80 id=7F0CEB10 state=present
Module apt_repository: add repositories
Next add the MongoDB repository to apt:
- name: Add MongoDB repository
apt_repository: repo='deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' state=present
Module apt: install packages
We use apt
module to install mongodb-org
package. notify
action is
added to start mongod
after the completion of this task. Use the
update_cache=yes
option to reload the local package database.:
- name: install mongodb
apt: pkg=mongodb-org state=latest update_cache=yes
notify:
- start mongodb
Module service: manage services
We use handlers
here to start or restart services. It is similar to
tasks
but will run only once.:
handlers:
- name: start mongodb
service: name=mongod state=started
The Full Playbook
Our first playbook looks like this:
---
- hosts: ansible-test
remote_user: root
become: yes
tasks:
- name: Import the public key used by the package management system
apt_key: keyserver=hkp://keyserver.ubuntu.com:80 id=7F0CEB10 state=present
- name: Add MongoDB repository
apt_repository: repo='deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' state=present
- name: install mongodb
apt: pkg=mongodb-org state=latest update_cache=yes
notify:
- start mongodb
handlers:
- name: start mongodb
service: name=mongod state=started
Running a Playbook
We use ansible-playbook
command to run our playbook:
$ ansible-playbook -i inventory.txt -c ssh mongodb.yaml
PLAY [ansible-test] ***********************************************************
GATHERING FACTS ***************************************************************
ok: [10.23.2.104]
ok: [10.23.2.105]
TASK: [Import the public key used by the package management system] ***********
changed: [10.23.2.104]
changed: [10.23.2.105]
TASK: [Add MongoDB repository] ************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]
TASK: [install mongodb] *******************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]
NOTIFIED: [start mongodb] *****************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
PLAY RECAP ********************************************************************
10.23.2.104 : ok=5 changed=3 unreachable=0 failed=0
10.23.2.105 : ok=5 changed=3 unreachable=0 failed=0
If you rerun the playbook, you should see that nothing changed:
$ ansible-playbook -i inventory.txt -c ssh mongodb.yaml
PLAY [ansible-test] ***********************************************************
GATHERING FACTS ***************************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
TASK: [Import the public key used by the package management system] ***********
ok: [10.23.2.104]
ok: [10.23.2.105]
TASK: [Add MongoDB repository] ************************************************
ok: [10.23.2.104]
ok: [10.23.2.105]
TASK: [install mongodb] *******************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
PLAY RECAP ********************************************************************
10.23.2.104 : ok=4 changed=0 unreachable=0 failed=0
10.23.2.105 : ok=4 changed=0 unreachable=0 failed=0
Sanity Check: Test MongoDB
Let us try to run ‘mongo’ to enter mongodb shell.:
$ ssh ubuntu@$IP
$ mongo
MongoDB shell version: 2.6.9
connecting to: test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
http://docs.mongodb.org/
Questions? Try the support group
http://groups.google.com/group/mongodb-user
>
Terms
-
Module: Ansible library to run or manage services, packages, files or commands.
-
Handler: A task for notifier.
-
Task: Ansible job to run a command, check files, or update configurations.
-
Playbook: a list of tasks for Ansible nodes. YAML format used.
-
YAML: Human readable generic data serialization.
Reference
The main tutorial from Ansible is here: http://docs.ansible.com/playbooks_intro.html
You can also find an index of the ansible modules here: http://docs.ansible.com/modules_by_category.html
Exercise
We have shown a couple of examples of using Ansible tools. Before you apply it in you final project, we will practice it in this exercise.
- set up the project structure similar to Ansible Galaxy example
- install MongoDB from the package manager (apt in this class)
- configure your MongoDB installation to start the service automatically
- use default port and let it serve local client connections only