We present here a collection of information and tools related to DevOps.
DevOps
- 1: DevOps - Continuous Improvement
- 2: Infrastructure as Code (IaC)
- 3: Ansible
- 4: Puppet
- 5: Travis
- 6: DevOps with AWS
- 7: DevOps with Azure Monitor
1 - DevOps - Continuous Improvement
Deploying enterprise applications has always been challenging. Without consistent and reliable processes and practices, it would be impossible to track and measure the deployment artifacts: which code files and configuration data have been deployed to which servers, and what level of unit and integration testing has been done among the various components of the enterprise applications. Deploying software to the cloud is even more complex, given that DevOps teams do not have extensive access to the infrastructure and are forced to follow the guidelines and tools provided by the cloud vendors. In recent years, Continuous Integration (CI) and Continuous Deployment (CD) have become the DevOps mantra for delivering software reliably and consistently.
While the CI/CD process is difficult enough, monitoring the deployed applications is emerging as a new challenge, especially on infrastructure that is largely virtual, with VMs in combination with containers. Continuous Monitoring (CM) is a somewhat newer concept that is rapidly gaining popularity and becoming an integral part of the overall DevOps function. Depending on where the software has been deployed, continuous monitoring can be as simple as observing the behavior of the applications, or as complex as end-to-end visibility across the infrastructure: heartbeats and health checks of the deployed applications along with dynamic scaling based on their usage. To address this challenge, building a robust monitoring pipeline becomes a necessity. Continuous Monitoring is much easier to get right if it is considered as early as possible and baked into the software during development. We can provide much better tracking and analyze metrics much closer to the application's needs if these aspects are considered early in the process. Cloud vendors, aware of this necessity, provide various DevOps tools to make CI/CD and continuous monitoring as easy as possible. While some of these capabilities are provided by the cloud offerings, others must be planned and built into our software.
At a high level, we can think of a simple pipeline to achieve a consistent and scalable deployment process. CI/CD and Continuous Monitoring Pipeline:
- Step 1 - Continuous Development - Plan, Code, Build and Test:
Planning, coding, building the deployable artifacts (code, configuration, database, etc.) and letting them go through the various types of tests along all dimensions, from technical to business and from internal to external, as automated as possible. All these aspects come under Continuous Development.
- Step 2 - Continuous Improvement - Deploy, Operate and Monitor:
Once deployed to production, how these applications are operated: bug tracking and health checks, performance and scalability, along with monitoring of the infrastructure and of cold-start delays caused by on-demand VM/container instantiation, which the cloud offerings introduce through the dynamic scalability of the chosen deployment and hosting options. Making the necessary adjustments to improve the overall experience is essentially what is called Continuous Improvement.
2 - Infrastructure as Code (IaC)
Learning Objectives
- Introduction to IaC
- How IaC is related to DevOps
- How IaC differs from Configuration Management Tools, and how it is related
- Listing of IaC Tools
- Further Reading
Introduction to IaC
IaC (Infrastructure as Code) is the ability of code to generate, maintain and destroy application infrastructure such as servers, storage and networking, without requiring manual changes. The state of the infrastructure is maintained in files.
Cloud architectures and containers have forced the use of IaC, as the number of elements to manage at each layer is simply too large. It is impractical to keep up using the traditional method of raising tickets and having someone make the change for you. Scaling demands, elasticity during odd hours, and usage-based billing all require provisioning, managing and destroying infrastructure much more dynamically.
From the book “Amazon Web Services in Action” by Wittig and Wittig [1], using a script or a declarative description has the following advantages:
- Consistent usage
- Dependencies are handled
- Replicable
- Customizable
- Testable
- Can figure out updated state
- Minimizes human failure
- Documentation for your infrastructure
Sometimes IaC tools are also called orchestration tools, but that label is less accurate and often misleading.
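To make this concrete, the following is a minimal, hypothetical declarative description written for AWS CloudFormation, one of the IaC tools listed later in this section; it declares a single S3 bucket and nothing else, and the logical resource name is made up:

AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal illustrative IaC template that declares one S3 bucket
Resources:
  DemoBucket:              # logical name, chosen only for this example
    Type: AWS::S3::Bucket

Applying this template repeatedly leaves the bucket unchanged, which illustrates the replicable, predictable-end-state style of IaC listed above.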
How IaC is related to DevOps
DevOps has the following key practices
- Automated Infrastructure
- Automated Configuration Management, including Security
- Shared version control between Dev and Ops
- Continuous Build - Integrate - Test - Deploy
- Continuous Monitoring and Observability
The first practice, Automated Infrastructure, can be fulfilled by IaC tools. Keeping the code for IaC and Configuration Management in the same code repository as the application code ensures adherence to the practice of shared version control.
Typically, the workflow of the DevOps team includes running Configuration Management tool scripts after running IaC tools, for configurations, security, connectivity, and initializations.
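As a simplified, illustrative sequence (the tool choice and file names here are examples, not a prescription), such a workflow might look like:

$ terraform apply                          # IaC tool provisions servers, storage and networking
$ ansible-playbook -i inventory site.yml   # CM tool configures, secures and initializes the provisioned hosts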
How IaC tools differ from Configuration Management Tools, and how they are related
There are four broad categories of such tools [2]:
- Ad hoc scripts: Any shell, Python, Perl, Lua scripts that are written
- Configuration management tools: Chef, Puppet, Ansible, SaltStack
- Server templating tools: Docker, Packer, Vagrant
- Server provisioning tools: Terraform, Heat, CloudFormation, Cloud Deployment Manager, Azure Resource Manager
Configuration Management tools make use of scripts to achieve a state. IaC tools maintain the state and metadata created in the past.
However, the big difference is that the state achieved by running procedural code or scripts may differ from the state when it was first created, because
- Ordering of the scripts determines the state. If the order changes, the state will differ. Also, issues like the waiting time required for resources to be created, modified or destroyed have to be dealt with correctly.
- Version changes in procedural code are inevitable, and will lead to a different state.
Chef and Ansible are more procedural, while Terraform, CloudFormation, SaltStack, Puppet and Heat are more declarative.
IaC or declarative tools do suffer from some inflexibility, as their declarative languages are less expressive than general-purpose scripting languages.
Listing of IaC Tools
IaC tools that are cloud specific are
- Amazon AWS - AWS CloudFormation
- Google Cloud - Cloud Deployment Manager
- Microsoft Azure - Azure Resource Manager
- OpenStack - Heat
Terraform is not a cloud-specific tool; it is multi-vendor. It has good support for all the major clouds; however, Terraform scripts are not portable across clouds.
Advantages of IaC
IaC solves the problem of environment drift, which used to lead to the infamous “but it works on my machine” kind of errors that are difficult to trace.
IaC guarantees idempotence, a known and predictable end state, irrespective of the starting state. Idempotency is achieved either by automatically configuring an existing target or by discarding the existing target and recreating a fresh environment.
Further Reading
Please see books and resources such as “Terraform: Up and Running” [2] for more real-world advice on IaC, structuring Terraform code and good deployment practices.
A good resource for IaC is the book “Infrastructure as Code” [3].
References
[1] A. Wittig and M. Wittig, Amazon Web Services in Action, 1st ed. Manning Publications, 2015.
[2] Y. Brikman, Terraform: Up and running, 1st ed. O’Reilly Media Inc, 2017.
[3] K. Morris, Infrastructure as code, 1st ed. O’Reilly Media Inc, 2015.
3 - Ansible
Introduction to Ansible
Ansible is an open-source IT automation DevOps engine allowing you to manage and configure many compute resources in a scalable, consistent and reliable way.
Ansible automates the following tasks:
- Provisioning: It sets up the servers that you will use as part of your infrastructure.
- Configuration management: You can change the configuration of an application, OS, or device. You can implement security policies and other configuration tasks.
- Service management: You can start and stop services and install updates.
- Application deployment: You can conduct application deployments in an automated fashion that integrate with your DevOps strategies.
Prerequisite
We assume you
- can install an Ubuntu 18.04 virtual machine on VirtualBox
- can install software packages via the apt-get tool in an Ubuntu virtual host
- have already reserved a virtual cluster (with at least 1 virtual machine in it) on some cloud, OR can use VMs installed in VirtualBox instead
- have SSH credentials and can log in to your virtual machines
Setting up a playbook
Let us develop a sample from scratch, based on the paradigms that ansible supports. We are going to use Ansible to install Apache server on our virtual machines.
First, we install ansible on our machine and make sure we have an up to date OS:
$ sudo apt-get update
$ sudo apt-get install ansible
Next, we prepare a working environment for our Ansible example:
$ mkdir ansible-apache
$ cd ansible-apache
To use Ansible we will need a local configuration. When you execute Ansible within this folder, this local configuration file always overrides the system-level Ansible configuration. It is in general beneficial to keep custom configurations locally unless you absolutely believe they should be applied system wide. Create a file ansible.cfg in this folder (Ansible looks for a file with exactly this name in the current directory) and add the following:
[defaults]
inventory = hosts.txt
(Older Ansible versions used the key hostfile instead of inventory.)
This local configuration file tells Ansible that the names of the target machines are given in a file named hosts.txt. Next we will specify the hosts in that file.
You should have SSH login access to all VMs listed in this file as part of our prerequisites. Now create and edit the file hosts.txt with the following content:
[apache]
<server_ip> ansible_ssh_user=<server_username>
The name apache in the brackets defines a server group name. We will use this name to refer to all server items in this group. As we intend to install and run Apache on the servers, the name choice seems quite appropriate. Fill in the IP addresses of the virtual machines you created and fire up these VMs in your VirtualBox.
To deploy the service, we need to create a playbook. A playbook tells Ansible what to do; it uses YAML syntax. Create and edit a file with a proper name, e.g. apache.yml, as follows:
---
- hosts: apache   # apache is the group name we just defined
  become: yes     # this operation needs privilege access
  tasks:
    - name: install apache2   # text description
      apt: name=apache2 update_cache=yes state=latest
This block defines the target VMs and the operations (tasks) that need to be applied. We are using the apt module to indicate the software packages that need to be installed. Note that apt is specific to Debian-based systems such as Ubuntu; Ansible also offers a generic package module that picks the correct package manager for the target operating system, so a playbook can be written to work on multiple different OSes.
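As a sketch of such a distribution-agnostic variant (assuming the target hosts are Debian- or Ubuntu-based; note that the package name itself still differs per distribution):

---
- hosts: apache
  become: yes
  tasks:
    - name: install the Apache web server
      package:                 # generic module; selects apt, yum, dnf, etc. for us
        name: apache2          # on RedHat-family systems the package is called httpd
        state: present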
Ansible relies on various kinds of modules to fulfil tasks on the remote servers. These modules are developed for particular tasks and take related arguments. For instance, when we use the apt module, we need to tell it which package we intend to install. That is why we provide a value for the name= argument. The name attribute of the task itself is just a description that will be printed when the task is executed.
Run the playbook
In the same folder, execute:
$ ansible-playbook apache.yml --ask-become-pass
(On older Ansible versions the equivalent flag was --ask-sudo-pass.)
After a successful run, open a browser and fill in your server IP. You should see an ‘It works!’ Apache2 Ubuntu default page. Make sure the security policy on your cloud opens port 80 to let the HTTP traffic through.
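Alternatively, you can verify the result from the command line of your own machine (again assuming port 80 is open):

$ curl -I http://<server_ip>

A response beginning with HTTP/1.1 200 OK indicates the Apache default page is being served.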
Ansible playbooks can have a more complex and fancy structure and syntax. Go explore! We are going to offer a more advanced Ansible example in the next section.
Ansible Roles
Next we install the R package onto our cloud VMs. R is a useful statistical programming language commonly used in many scientific and statistics computing projects, maybe also the one you chose for this class. With this example we illustrate the concept of Ansible roles, install source code from GitHub, and make use of variables. These are key features you will find useful in your project deployments.
We are going to use a top-down fashion in this example. We first start from a playbook that is already good to go. You can execute this playbook (do not do it yet, always read the entire section first) to get R installed in your remote hosts. We then further complicate this concise playbook by introducing functionalities to do the same tasks but in different ways. Although these different ways are not necessary they help you grasp the power of Ansible and ease your life when they are needed in your real projects.
Let us now create the following playbook with the name example.yml:
---
- hosts: R_hosts
  become: yes
  tasks:
    - name: install the R package
      apt: name=r-base update_cache=yes state=latest
The hosts are defined in the file hosts.txt, which is referenced from the local Ansible configuration file ansible.cfg as before. For this example, hosts.txt contains:
[R_hosts]
<cloud_server_ip> ansible_ssh_user=<cloud_server_username>
Certainly, this should get the installation job done, but next we are going to extend it via a new feature called roles.
A role is an important concept used often in large Ansible projects. You divide a series of tasks into different groups; each group corresponds to a certain role within the project.
For example, if your project is to deploy a web site, you may need to install the back-end database, the web server that responds to HTTP requests, and the web application itself. These are three different roles and each should carry out its own installation and configuration tasks, as sketched below.
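For instance, a hypothetical top-level playbook for such a web site project might apply one role per concern to the appropriate host groups (all group and role names here are made up):

---
- hosts: db_servers
  become: yes
  roles:
    - database          # installs and configures the back-end database

- hosts: web_servers
  become: yes
  roles:
    - webserver         # installs the web server that answers HTTP requests
    - webapp            # deploys the web application itself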
Even though we only need to install the R package in this example, we can still do it by defining a role ‘r’. Let us modify our example.yml to be:
---
- hosts: R_hosts
  roles:
    - r
Now we create a directory structure in your top project directory as follows
$ mkdir -p roles/r/tasks
$ touch roles/r/tasks/main.yml
Next, we edit the main.yml file and include the following content:
---
- name: install the R package
  apt: name=r-base update_cache=yes state=latest
  become: yes
You probably already get the point. We take the ‘tasks’ section out of the earlier example.yml and re-organize it into roles. Each role specified in example.yml should have its own directory under roles/, and the tasks to be done by this role are listed in its file tasks/main.yml, as shown previously.
Using Variables
We demonstrate this feature by installing source code from GitHub. Although R can be installed through the OS package manager (apt-get etc.), the software used in your projects may not be. Many research projects are distributed via Git instead. Here we are going to show you how to install packages from their Git repositories. Instead of directly executing the module ‘apt’, we pretend Ubuntu does not provide this package and you have to find it on Git. The source code of R can be found at https://github.com/wch/r-source.git. We are going to clone it to a remote VM’s hard drive, build the package and install the binary there.
To do so, we need a few new Ansible modules. You may remember from the last example that Ansible modules assist us in doing different tasks based on the arguments we pass to them. It will come as no surprise that Ansible has a git module to take care of git-related work, and a command module to run shell commands. Let us modify roles/r/tasks/main.yml to be:
---
- name: get R package source
  git:
    repo: https://github.com/wch/r-source.git
    dest: /tmp/R

- name: build and install R
  become: yes
  command: chdir=/tmp/R "{{ item }}"
  with_items:
    - ./configure
    - make
    - make install
The role r will now carry out two tasks: one clones the R source code into /tmp/R, the other uses a series of shell commands to build and install the package.
Note that the commands executed by the second task may not be available on a fresh VM image. But the point of this example is to show an alternative way to install packages, so we conveniently assume the conditions are all met.
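If you do want the build to succeed on a fresh Ubuntu image, one option is to prepend a task like the following sketch to roles/r/tasks/main.yml; the package list is illustrative and not a complete set of R's build dependencies:

- name: install build dependencies (illustrative, incomplete list)
  become: yes
  apt:
    name:
      - build-essential
      - gfortran
      - libreadline-dev
    state: present
    update_cache: yes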
To achieve this we are using variables in a separate file.
We have typed several string constants in our Ansible scripts so far. In general, it is a good practice to give these values names and to use them by referring to their names. This way, your complex Ansible project can be less error prone. Create a file in the same directory and name it vars.yml:
---
repository: https://github.com/wch/r-source.git
tmp: /tmp/R
Accordingly, we will update our example.yml:
---
- hosts: R_hosts
  vars_files:
    - vars.yml
  roles:
    - r
As shown, we specify vars_files to tell the script that the file vars.yml is going to supply variable values, which are referenced by their keys in double curly brackets, as in roles/r/tasks/main.yml:
---
- name: get R package source
  git:
    repo: "{{ repository }}"
    dest: "{{ tmp }}"

- name: build and install R
  become: yes
  command: chdir="{{ tmp }}" "{{ item }}"
  with_items:
    - ./configure
    - make
    - make install
Now, just edit the hosts.txt file with your target VMs' IP addresses and execute the playbook.
You should be able to extend the Ansible playbook for your needs. Configuration tools like Ansible are important components to master the cloud environment.
Ansible Galaxy
Ansible Galaxy is a marketplace, where developers can share Ansible Roles to complete their system administration tasks. Roles exchanged in Ansible Galaxy community need to follow common conventions so that all participants know what to expect. We will illustrate details in this chapter.
It is good to follow the Ansible Galaxy standard during your development as much as possible.
Ansible Galaxy helloworld
Let us start with the simplest case: we will build an Ansible Galaxy project. This project will install the Emacs software package on your localhost as the target host. It is a helloworld project only meant to get us familiar with Ansible Galaxy project structures.
First you need to create a directory. Let us call it ansible-emacs:
$ mkdir ansible-emacs
Go ahead and create the files README.md, playbook.yml, inventory and a subdirectory roles/. playbook.yml is your project playbook. It should perform the Emacs installation task by executing the corresponding role you will develop in the folder roles/. The only difference is that we will construct the role with the help of ansible-galaxy this time.
Now, let ansible-galaxy initialize the directory structure for you:
$ cd roles
$ ansible-galaxy init <to-be-created-role-name>
The naming convention is to concatenate your name and the role name with a dot. The resulting directory layout is sketched below.
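Since the original figure is not reproduced here, the directory layout that ansible-galaxy init typically generates looks roughly like this (details vary slightly between Ansible versions):

roles/
└── <your-name>.<role-name>/
    ├── README.md
    ├── defaults/main.yml
    ├── files/
    ├── handlers/main.yml
    ├── meta/main.yml
    ├── tasks/main.yml
    ├── templates/
    ├── tests/
    │   ├── inventory
    │   └── test.yml
    └── vars/main.yml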
Let us fill in the information for our project. There are several main.yml files in different folders, and we will illustrate their usages.
defaults and vars:
These folders should hold variable key-value pairs for your playbook scripts. We will leave them empty in this example.
files:
This folder is for files that need to be copied to the target hosts. Data files or configuration files can be placed here if needed. We will leave it empty too.
templates:
Similar in purpose to files/, templates/ holds template files. Keep it empty for a simple Emacs installation.
handlers:
This is reserved for handlers of services running on the target hosts, for example to restart a service under certain circumstances.
tasks:
This folder holds the actual script for all tasks in its main.yml. You can use the role you built previously for the Emacs installation here:
---
- name: install Emacs on Ubuntu 16.04
  become: yes
  package: name=emacs state=present
meta:
Provide necessary metadata for our Ansible Galaxy project for shipping:
---
galaxy_info:
  author: <your name>
  description: emacs installation on Ubuntu 16.04
  license:
    - MIT
  min_ansible_version: 2.0
  platforms:
    - name: Ubuntu
      versions:
        - xenial
  galaxy_tags:
    - development
dependencies: []
Next let us test it out. You have your Ansible Galaxy role ready now. To test it as a user, go back to your project directory and edit the two files inventory and playbook.yml that you created earlier (ansible-galaxy also generates sample test files in the role's tests directory). Then execute:
$ ansible-playbook -i inventory playbook.yml
After running this playbook, you should have Emacs installed on localhost.
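For reference, a minimal playbook.yml for this local test could look like the following sketch; replace <your-name>.<role-name> with the role name you passed to ansible-galaxy init and adjust the target host to your setup:

---
- hosts: localhost
  connection: local
  roles:
    - <your-name>.<role-name>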
A Complete Ansible Galaxy Project
We are going to use ansible-galaxy to setup a sample project. This sample project will:
- use a cloud cluster with multiple VMs
- deploy Apache Spark on this cluster
- install a particular HPC application
- prepare raw data for this cluster to process
- run the experiment and collect results
Ansible: Write a Playbook for MongoDB
Ansible Playbooks are automated scripts written in the YAML data format. Instead of using manual commands to set up multiple remote machines, you can utilize Ansible Playbooks to configure your entire systems. YAML syntax is easy to read and expresses the data structure of certain Ansible functions. You simply write some tasks, for example installing software, configuring default settings, and starting the software, in an Ansible Playbook. With a few examples in this section, you will understand how it works and how to write your own Playbooks.
There are also several examples of using Ansible Playbooks on the official site, covering everything from basic usage of Ansible Playbooks to advanced usage such as applying patches and updates with different roles and groups.
We are going to write a basic playbook for the Ansible software. Keep in mind that Ansible is the main program and a playbook is a template that you would like to use. You may have several playbooks in your Ansible project.
First playbook for MongoDB Installation
As a first example, we are going to write a playbook which installs MongoDB server. It includes the following tasks:
- Import the public key used by the package management system
- Create a list file for MongoDB
- Reload local package database
- Install the MongoDB packages
- Start MongoDB
The material presented here is based on the manual installation instructions for MongoDB from the official site.
We also assume that we install MongoDB on Ubuntu 15.10.
Enabling Root SSH Access
Some setups of managed nodes may not allow you to log in as root. As this may be problematic later, let us create a playbook to resolve this. Create an enable-root-access.yaml file with the following contents:
---
- hosts: ansible-test
  remote_user: ubuntu
  tasks:
    - name: Enable root login
      shell: sudo cp ~/.ssh/authorized_keys /root/.ssh/
Explanation:
- hosts specifies the name of a group of machines in the inventory
- remote_user specifies the username on the managed nodes to log in as
- tasks is a list of tasks to accomplish, each having a name (a description) and a module to execute. In this case we use the shell module.
We can run this playbook like so:
$ ansible-playbook -i inventory.txt -c ssh enable-root-access.yaml
PLAY [ansible-test] ***********************************************************
GATHERING FACTS ***************************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
TASK: [Enable root login] *****************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]
PLAY RECAP ********************************************************************
10.23.2.104 : ok=2 changed=1 unreachable=0 failed=0
10.23.2.105 : ok=2 changed=1 unreachable=0 failed=0
Hosts and Users
The first step is choosing the hosts on which to install MongoDB and a user account to run commands (tasks). We start with the following lines in the example file mongodb.yaml:
---
- hosts: ansible-test
  remote_user: root
  become: yes
In a previous section, we set up two machines with the ansible-test group name. We use these two machines for the MongoDB installation. Also, we use the root account to complete the Ansible tasks.
Note: Indentation is important in the YAML format. Do not ignore the spaces at the start of each line.
Tasks
A list of tasks contains commands or configurations to be executed on the remote machines in sequential order. Each task comes with a name and a module to run your command or configuration. You provide a description of your task in the name section and choose a module for your task. There are several modules that you can use; for example, the shell module simply executes a command without considering its return value. You may use the apt or yum module, which are packaging modules, to install software. You can find an entire list of modules here: http://docs.ansible.com/list_of_all_modules.html
Module apt_key: add repository keys
We need to import the MongoDB public GPG key. This is going to be the first task in our playbook:
tasks:
  - name: Import the public key used by the package management system
    apt_key: keyserver=hkp://keyserver.ubuntu.com:80 id=7F0CEB10 state=present
Module apt_repository: add repositories
Next add the MongoDB repository to apt:
  - name: Add MongoDB repository
    apt_repository: repo='deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' state=present
Module apt: install packages
We use the apt module to install the mongodb-org package. A notify action is added to start mongod after the completion of this task. Use the update_cache=yes option to reload the local package database:
  - name: install mongodb
    apt: pkg=mongodb-org state=latest update_cache=yes
    notify:
      - start mongodb
Module service: manage services
We use handlers here to start or restart services. They are similar to tasks but run only once, when notified:
handlers:
  - name: start mongodb
    service: name=mongod state=started
The Full Playbook
Our first playbook looks like this:
---
- hosts: ansible-test
  remote_user: root
  become: yes
  tasks:
    - name: Import the public key used by the package management system
      apt_key: keyserver=hkp://keyserver.ubuntu.com:80 id=7F0CEB10 state=present
    - name: Add MongoDB repository
      apt_repository: repo='deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' state=present
    - name: install mongodb
      apt: pkg=mongodb-org state=latest update_cache=yes
      notify:
        - start mongodb
  handlers:
    - name: start mongodb
      service: name=mongod state=started
Running a Playbook
We use the ansible-playbook command to run our playbook:
$ ansible-playbook -i inventory.txt -c ssh mongodb.yaml
PLAY [ansible-test] ***********************************************************
GATHERING FACTS ***************************************************************
ok: [10.23.2.104]
ok: [10.23.2.105]
TASK: [Import the public key used by the package management system] ***********
changed: [10.23.2.104]
changed: [10.23.2.105]
TASK: [Add MongoDB repository] ************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]
TASK: [install mongodb] *******************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]
NOTIFIED: [start mongodb] *****************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
PLAY RECAP ********************************************************************
10.23.2.104 : ok=5 changed=3 unreachable=0 failed=0
10.23.2.105 : ok=5 changed=3 unreachable=0 failed=0
If you rerun the playbook, you should see that nothing changed:
$ ansible-playbook -i inventory.txt -c ssh mongodb.yaml
PLAY [ansible-test] ***********************************************************
GATHERING FACTS ***************************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
TASK: [Import the public key used by the package management system] ***********
ok: [10.23.2.104]
ok: [10.23.2.105]
TASK: [Add MongoDB repository] ************************************************
ok: [10.23.2.104]
ok: [10.23.2.105]
TASK: [install mongodb] *******************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]
PLAY RECAP ********************************************************************
10.23.2.104 : ok=4 changed=0 unreachable=0 failed=0
10.23.2.105 : ok=4 changed=0 unreachable=0 failed=0
Sanity Check: Test MongoDB
Let us try to run mongo to enter the MongoDB shell:
$ ssh ubuntu@$IP
$ mongo
MongoDB shell version: 2.6.9
connecting to: test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
http://docs.mongodb.org/
Questions? Try the support group
http://groups.google.com/group/mongodb-user
>
Terms
- Module: Ansible library to run or manage services, packages, files or commands.
- Handler: A task that is triggered by a notify action.
- Task: Ansible job to run a command, check files, or update configurations.
- Playbook: A list of tasks for Ansible nodes, written in YAML format.
- YAML: A human readable, generic data serialization format.
Reference
The main tutorial from Ansible is here: http://docs.ansible.com/playbooks_intro.html
You can also find an index of the ansible modules here: http://docs.ansible.com/modules_by_category.html
Exercise
We have shown a couple of examples of using Ansible tools. Before you apply it in your final project, we will practice it in this exercise.
- set up the project structure similar to Ansible Galaxy example
- install MongoDB from the package manager (apt in this class)
- configure your MongoDB installation to start the service automatically
- use default port and let it serve local client connections only
4 - Puppet
Overview
Configuration management is an important task of the IT department in any organization. It is the process of managing infrastructure changes in a structured and systematic way. Manually rolling back infrastructure to a previous version of the software is cumbersome, time consuming and error prone. Puppet is a configuration management tool that simplifies the complex tasks of deploying new software, applying software updates and rolling back software packages in a large cluster. Puppet does this through Infrastructure as Code (IaC): code is written for the infrastructure in one central location and is pushed to the nodes in all environments (Dev, Test, Production) using the Puppet tool. Configuration management tools follow two approaches for managing infrastructure: push and pull. In push configuration, infrastructure as code is pushed from a centralized server to the nodes, whereas in pull configuration the nodes pull the infrastructure as code from the central server, as shown in fig. 1.
Puppet uses both push and pull configuration in a centralized manner, as shown in fig. 2.
Another popular infrastructure tool is Ansible. It does not have master and client nodes; any node in Ansible can act as an executor. Any node containing the inventory list and SSH credentials can play the master role and connect to the other nodes, as opposed to the Puppet architecture, where server and agent software needs to be set up and installed. Configuring Ansible nodes is simple: it just requires Python version 2.5 or greater. Ansible uses a push architecture for configuration.
Master slave architecture
Puppet uses a master-slave architecture, as shown in fig. 3. The Puppet server is called the master node, and client nodes are called Puppet agents. Agents poll the server at regular intervals and pull the updated configuration from the master. The Puppet master can be made highly available: it supports a multi-master architecture, so if one master goes down, a backup master stands up to serve the infrastructure.
Workflow
- Nodes (Puppet agents) send information (e.g. IP address, hardware details, network, etc.) to the master. The master uses this information, together with the manifest files, to determine the desired configuration of each node.
- The master node compiles a catalog file containing the configuration information that needs to be implemented on the agent nodes.
- The master pushes the catalog to the Puppet agent nodes, which implement the configuration.
- The client nodes send an updated report back to the master, and the master updates its inventory.
- All exchanges between master and agents are secured through SSL encryption (see fig. 3).
fig. 4 shows the flow between master and slave.
fig. 5 shows the SSL workflow between master and slave.
Puppet comes in two forms: open source Puppet and Puppet Enterprise. In this tutorial we will showcase the installation steps of both forms.
Install Opensource Puppet on Ubuntu
We will demonstrate the installation of Puppet on Ubuntu.
Prerequisite - at least 4 GB RAM, an Ubuntu box (standalone or VM).
First, we need to make sure that the Puppet master and agent are able to communicate with each other. The agent should be able to connect to the master using its name.
Configure the Puppet server name and map it to its IP address:
$ sudo nano /etc/hosts
The contents of /etc/hosts should look like:
<ip_address> my-puppet-master
my-puppet-master is the name of the Puppet master to which the Puppet agent will try to connect.
Press <ctrl> + O to save and <ctrl> + X to exit.
Next, we will install Puppet on the Ubuntu server. We execute the following commands to pull from the official Puppet Labs repository:
$ curl -O https://apt.puppetlabs.com/puppetlabs-release-pc1-xenial.deb
$ sudo dpkg -i puppetlabs-release-pc1-xenial.deb
$ sudo apt-get update
Install the Puppet server:
$ sudo apt-get install puppetserver
The default installation of the Puppet server is configured to use 2 GB of RAM. However, we can customize this by opening the puppetserver configuration file:
$ sudo nano /etc/default/puppetserver
This will open the file in an editor. Look for the JAVA_ARGS line and change the value of the -Xms and -Xmx parameters to 3g if we wish to configure the Puppet server for 3 GB of RAM. Note that the default value of these parameters is 2g.
JAVA_ARGS="-Xms3g -Xmx3g -XX:MaxPermSize=256m"
Press <ctrl> + O to save and <ctrl> + X to exit.
By default the Puppet server is configured to use port 8140 to communicate with agents. We need to make sure that the firewall allows communication on this port:
$ sudo ufw allow 8140
Next, we start the Puppet server:
$ sudo systemctl start puppetserver
Verify that the server has started:
$ sudo systemctl status puppetserver
We should see “active (running)” if the server has started successfully:
$ sudo systemctl status puppetserver
● puppetserver.service - puppetserver Service
Loaded: loaded (/lib/systemd/system/puppetserver.service; disabled; vendor pr
Active: active (running) since Sun 2019-01-27 00:12:38 EST; 2min 29s ago
Process: 3262 ExecStart=/opt/puppetlabs/server/apps/puppetserver/bin/puppetser
Main PID: 3269 (java)
CGroup: /system.slice/puppetserver.service
└─3269 /usr/bin/java -Xms3g -Xmx3g -XX:MaxPermSize=256m -Djava.securi
Jan 27 00:11:34 ritesh-ubuntu1 systemd[1]: Starting puppetserver Service...
Jan 27 00:11:34 ritesh-ubuntu1 puppetserver[3262]: OpenJDK 64-Bit Server VM warn
Jan 27 00:12:38 ritesh-ubuntu1 systemd[1]: Started puppetserver Service.
lines 1-11/11 (END)
Configure the Puppet server to start at boot time:
$ sudo systemctl enable puppetserver
Next, we will install the Puppet agent:
$ sudo apt-get install puppet-agent
Start the Puppet agent:
$ sudo systemctl start puppet
Configure the Puppet agent to start at boot time:
$ sudo systemctl enable puppet
Next, we need to change the Puppet agent config file so that it can connect to the Puppet master and communicate:
$ sudo nano /etc/puppetlabs/puppet/puppet.conf
The configuration file will be opened in an editor. Add the following sections to the file:
[main]
certname = <puppet-agent>
server = <my-puppet-master>
[agent]
server = <my-puppet-master>
Note: my-puppet-master is the name that we set up in the /etc/hosts file while installing the Puppet server, and certname is the name of the agent's certificate.
The Puppet agent sends a certificate signing request to the Puppet server when it connects for the first time. After signing the request, the Puppet server trusts and identifies the agent for management.
Execute the following command on the Puppet master in order to see all incoming certificate signing requests:
$ sudo /opt/puppetlabs/bin/puppet cert list
We will see something like:
$ sudo /opt/puppetlabs/bin/puppet cert list
"puppet-agent" (SHA256) 7B:C1:FA:73:7A:35:00:93:AF:9F:42:05:77:9B:
05:09:2F:EA:15:A7:5C:C9:D7:2F:D7:4F:37:A8:6E:3C:FF:6B
Note that puppet-agent is the name that we configured for certname in the puppet.conf file.
After validating that the request is from a valid and trusted agent, we sign the request:
$ sudo /opt/puppetlabs/bin/puppet cert sign puppet-agent
We will see a message saying the certificate was signed if successful:
$ sudo /opt/puppetlabs/bin/puppet cert sign puppet-agent
Signing Certificate Request for:
"puppet-agent" (SHA256) 7B:C1:FA:73:7A:35:00:93:AF:9F:42:05:77:9B:05:09:2F:
EA:15:A7:5C:C9:D7:2F:D7:4F:37:A8:6E:3C:FF:6B
Notice: Signed certificate request for puppet-agent
Notice: Removing file Puppet::SSL::CertificateRequest puppet-agent
at '/etc/puppetlabs/puppet/ssl/ca/requests/puppet-agent.pem'
Next, we will verify the installation and make sure that the Puppet server is able to push configuration to the agent. Puppet uses domain-specific language code written in manifest (.pp) files.
Create the default manifest site.pp file:
$ sudo nano /etc/puppetlabs/code/environments/production/manifests/site.pp
This will open the file in edit mode. Make the following changes to this file:
file {'/tmp/it_works.txt':         # resource type file and filename
  ensure  => present,              # make sure it exists
  mode    => '0644',               # file permissions
  content => "It works!\n",        # the content of the file
}
The domain-specific language is used to create the it_works.txt file inside the /tmp directory on the agent node. The ensure directive makes sure that the file is present; it creates one if the file is removed. The mode directive specifies the file permissions (0644: owner read/write, group and others read). The content directive defines the content of the file.
Next, we test the installation on a single node:
$ sudo /opt/puppetlabs/bin/puppet agent --test
Successful verification will display:
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for puppet-agent
Info: Applying configuration version '1548305548'
Notice: /Stage[main]/Main/File[/tmp/it_works.txt]/content:
--- /tmp/it_works.txt 2019-01-27 02:32:49.810181594 +0000
+++ /tmp/puppet-file20190124-9628-1vy51gg 2019-01-27 02:52:28.717734377 +0000
@@ -0,0 +1 @@
+it works!
Info: Computing checksum on file /tmp/it_works.txt
Info: /Stage[main]/Main/File[/tmp/it_works.txt]: Filebucketed /tmp/it_works.txt
to puppet with sum d41d8cd98f00b204e9800998ecf8427e
Notice: /Stage[main]/Main/File[/tmp/it_works.txt]/content: content
changed '{md5}d41d8cd98f00b204e9800998ecf8427e' to '{md5}0375aad9b9f3905d3c545b500e871aca'
Info: Creating state file /opt/puppetlabs/puppet/cache/state/state.yaml
Notice: Applied catalog in 0.13 seconds
Installation of Puppet Enterprise
First, download the ubuntu-<version and arch>.tar.gz tarball and the GPG signature file to the Ubuntu VM.
Second, we import the Puppet public key:
$ wget -O - https://downloads.puppetlabs.com/puppet-gpg-signing-key.pub | gpg --import
We will see output such as:
--2019-02-03 14:02:54-- https://downloads.puppetlabs.com/puppet-gpg-signing-key.pub
Resolving downloads.puppetlabs.com
(downloads.puppetlabs.com)... 2600:9000:201a:b800:10:d91b:7380:93a1
, 2600:9000:201a:800:10:d91b:7380:93a1, 2600:9000:201a:be00:10:d91b:7380:93a1, ...
Connecting to downloads.puppetlabs.com (downloads.puppetlabs.com)
|2600:9000:201a:b800:10:d91b:7380:93a1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3139 (3.1K) [binary/octet-stream]
Saving to: ‘STDOUT’
- 100%[===================>] 3.07K --.-KB/s in 0s
2019-02-03 14:02:54 (618 MB/s) - written to stdout [3139/3139]
gpg: key 7F438280EF8D349F: "Puppet, Inc. Release Key
(Puppet, Inc. Release Key) <release@puppet.com>" not changed
gpg: Total number processed: 1
gpg: unchanged: 1
Third, we print the fingerprint of the key used:
$ gpg --fingerprint 0x7F438280EF8D349F
We will see successful output such as:
pub rsa4096 2016-08-18 [SC] [expires: 2021-08-17]
6F6B 1550 9CF8 E59E 6E46 9F32 7F43 8280 EF8D 349F
uid [ unknown] Puppet, Inc. Release Key
(Puppet, Inc. Release Key) <release@puppet.com>
sub rsa4096 2016-08-18 [E] [expires: 2021-08-17]
Fourth, we verify the release signature of the downloaded package:
$ gpg --verify puppet-enterprise-VERSION-PLATFORM.tar.gz.asc
Successful output will show as:
gpg: assuming signed data in 'puppet-enterprise-2019.0.2-ubuntu-18.04-amd64.tar.gz'
gpg: Signature made Fri 25 Jan 2019 02:03:23 PM EST
gpg: using RSA key 7F438280EF8D349F
gpg: Good signature from "Puppet, Inc. Release Key
(Puppet, Inc. Release Key) <release@puppet.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg: There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6F6B 1550 9CF8 E59E 6E46 9F32 7F43 8280 EF8D 349
Next, we need to unpack the installation tarball. Store the path of the tarball in the $TARBALL variable; this variable will be used in our installation.
$ export TARBALL=<path of tarball file>
Then, we extract the tarball:
$ tar -xf $TARBALL
Next, we run the installer from the installer directory:
$ sudo ./puppet-enterprise-installer
This will ask us to choose an installation option; we can choose a guided (graphical) installation or a text-based installation.
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
$ sudo ./puppet-enterprise-installer
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
=============================================================
Puppet Enterprise Installer
=============================================================
## Installer analytics are enabled by default.
## To disable, set the DISABLE_ANALYTICS environment variable and rerun
this script.
For example, "sudo DISABLE_ANALYTICS=1 ./puppet-enterprise-installer".
## If puppet_enterprise::send_analytics_data is set to false in your
existing pe.conf, this is not necessary and analytics will be disabled.
Puppet Enterprise offers three different methods of installation.
[1] Express Installation (Recommended)
This method will install PE and provide you with a link at the end
of the installation to reset your PE console admin password
Make sure to click on the link and reset your password before proceeding
to use PE
[2] Text-mode Install
This method will open your EDITOR (vi) with a PE config file (pe.conf)
for you to edit before you proceed with installation.
The pe.conf file is a HOCON formatted file that declares parameters
and values needed to install and configure PE.
We recommend that you review it carefully before proceeding.
[3] Graphical-mode Install
This method will install and configure a temporary webserver to walk
you through the various configuration options.
NOTE: This method requires you to be able to access port 3000 on this
machine from your desktop web browser.
=============================================================
How to proceed? [1]:
-------------------------------------------------------------------
Press 3 for the web-based graphical-mode install.
When successful, we will see output such as:
## We're preparing the Web Installer...
2019-02-02T20:01:39.677-05:00 Running command:
mkdir -p /opt/puppetlabs/puppet/share/installer/installer
2019-02-02T20:01:39.685-05:00 Running command:
cp -pR /home/ritesh/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64/*
/opt/puppetlabs/puppet/share/installer/installer/
## Go to https://<localhost>:3000 in your browser to continue installation.
By default the Puppet Enterprise web installer uses port 3000. Make sure that the firewall allows communication on port 3000:
$ sudo ufw allow 3000
Next, go to the https://localhost:3000 URL to complete the installation.
Click on the Get started button.
Choose Install on this server.
Enter <mypserver> as the DNS name. This is our Puppet server name; it can also be configured in the config file.
Enter the console admin password.
Click Continue.
We will get a Confirm the plan screen with the following information:
The Puppet master component
Hostname: ritesh-ubuntu-pe
DNS aliases: <mypserver>
Click Continue and verify the installer validation screen.
Click the Deploy Now button.
Puppet Enterprise will be installed and will display the message:
Puppet agent ran successfully
Log in to the console with the admin password that was set earlier and click on the nodes link to manage nodes.
Installing Puppet Enterprise as Text mode monolithic installation
$ sudo ./puppet-enterprise-installer
Enter 2 at the How to proceed prompt for a text-mode monolithic installation.
The following message will be displayed if successful:
2019-02-02T22:08:12.662-05:00 - [Notice]: Applied catalog in 339.28 seconds
2019-02-02T22:08:13.856-05:00 - [Notice]:
Sent analytics: pe_installer - install_finish - succeeded
* /opt/puppetlabs/puppet/bin/puppet infrastructure configure
--detailed-exitcodes --environmentpath /opt/puppetlabs/server/data/environments
--environment enterprise --no-noop --install=2019.0.2 --install-method='repair'
* returned: 2
## Puppet Enterprise configuration complete!
Documentation: https://puppet.com/docs/pe/2019.0/pe_user_guide.html
Release notes: https://puppet.com/docs/pe/2019.0/pe_release_notes.html
If this is a monolithic configuration, run 'puppet agent -t' to complete the
setup of this system.
If this is a split configuration, install or upgrade the remaining PE components,
and then run puppet agent -t on the Puppet master, PuppetDB, and PE console,
in that order.
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
2019-02-02T22:08:14.805-05:00 Running command: /opt/puppetlabs/puppet/bin/puppet
agent --enable
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64$
This is called a monolithic installation because all components of Puppet Enterprise, such as the Puppet master, PuppetDB and the console, are installed on a single node. This installation type is easy to install, and troubleshooting errors and upgrading the infrastructure are simple. It can easily support an infrastructure of up to 20,000 managed nodes, and compile masters can be added as the network grows. This is the recommended installation type for small to mid-size organizations [2].
The pe.conf configuration file will be opened in an editor to configure values. This file contains parameters and values for installing, upgrading and configuring Puppet. Some important parameters that can be specified in the pe.conf file are listed next, followed by a small example fragment:
console_admin_password
puppet_enterprise::console_host
puppet_enterprise::puppetdb_host
puppet_enterprise::puppetdb_database_name
puppet_enterprise::puppetdb_database_user
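As a small, illustrative fragment (pe.conf uses the HOCON format; the password and host names below are placeholders, and for a monolithic install the host parameters can usually be omitted because they default to the master):

{
  "console_admin_password": "<admin password>"
  "puppet_enterprise::console_host": "<console hostname>"
  "puppet_enterprise::puppetdb_host": "<puppetdb hostname>"
}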
Lastly, we run Puppet after the installation is complete:
$ puppet agent -t
A text-mode split installation is performed for large networks. Compared to the monolithic installation, the split installation type can manage a large infrastructure that requires more than 20,000 nodes. In this type of installation the different components of Puppet Enterprise (master, PuppetDB and console) are installed on different nodes. This installation type is recommended for organizations with large infrastructure needs [3].
In this type of installation, we need to install the components in a specific order: first the master, then PuppetDB, followed by the console.
Puppet Enterprise master and agent settings can be configured in the puppet.conf file. Most configuration settings of the Puppet Enterprise components, such as the master, the agent and the security certificates, are specified in this file.
Config section of Agent Node
[main]
certname = <your-domain-name.com>
server = puppetserver
environment = testing
runinterval = 4h
Config section of Master Node
[main]
certname = <your-domain-name.com>
server = puppetserver
environment = testing
runinterval = 4h
strict_variables = true
[master]
dns_alt_names = puppetserver,puppet,<your-domain-name.com>
reports = puppetdb
storeconfigs_backend = puppetdb
storeconfigs = true
environment_timeout = unlimited
Comment lines, setting lines and setting variables are the main components of a Puppet configuration file. Comments in config files are specified by prefixing them with the hash character. A setting line consists of the name of the setting followed by an equals sign and the value of the setting. A setting's value generally consists of one word, but multiple values can be specified in rare cases [4].
References
[1] Edureka, “Puppet tutorial – devops tool for configuration management.” Web Page, May-2017 [Online]. Available: https://www.edureka.co/blog/videos/puppet-tutorial/
[2] Puppet, “Text mode installation: Monolithic.” Web Page, Nov-2017 [Online]. Available: https://puppet.com/docs/pe/2017.1/install_text_mode_mono.html
[3] Puppet, “Text mode installation : Split.” Web Page, Nov-2017 [Online]. Available: https://puppet.com/docs/pe/2017.1/install_text_mode_split.html
[4] Puppet, “Config files: The main config files.” Web Page, Apr-2014 [Online]. Available: https://puppet.com/docs/puppet/5.3/config_file_main.html
5 - Travis
Travis CI is a continuous integration tool that is often used as part of DevOps development. It is a hosted service that enables users to test their projects on GitHub.
Once Travis is activated in a GitHub project, the developers can place a .travis.yml file in the project root. Upon check-in, the Travis configuration file will be interpreted and the commands indicated in it will be executed.
In fact this book also has a Travis file, located in the root of its repository. Please inspect it, as we will illustrate some of its concepts. Unfortunately Travis does not use an up-to-date operating system such as Ubuntu 18.04 and therefore contains outdated libraries. Although we would be able to use containers, we have elected to use a mechanism that updates the operating system as we need.
This is done in the install phase, which in our case installs a new version of pandoc, as well as some additional libraries that we use. In the env section we specify where to find our executables with the PATH variable.
The last portion in our example file specifies the script that is executed after the install phase has been completed. As our installation contains convenient and sophisticated makefiles, the script is very simple: it just executes the appropriate make command in the corresponding directories.
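Since the book's actual file is not reproduced here, a minimal, hypothetical .travis.yml following the structure just described might look like this; the distribution, package names and make target are illustrative only:

dist: xenial
language: generic

env:
  global:
    - PATH=$HOME/.local/bin:$PATH

install:
  - sudo apt-get -qq update
  - sudo apt-get install -y pandoc

script:
  - make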
Exercises
E.travis.1:
Develop an alternative Travis file that uses a preconfigured container for Ubuntu 18.04.
E.travis.2:
Develop a Travis file that checks our books on multiple operating systems such as macOS and Ubuntu 18.04.
6 - DevOps with AWS
The AWS cloud offering comes with end-to-end, scalable and highly performant support for DevOps, all the way from automatic deployment and monitoring of infrastructure-as-code to our cloud application code. AWS provides various DevOps tools to make deployment and support automation as simple as possible.
AWS DevOps Tools
The following is a list of DevOps tools for CI/CD workflows.
| AWS DevOps Tool | Description |
| --- | --- |
| CodeStar | AWS CodeStar provides a unified UI to enable simpler deployment automation. |
| CodePipeline | CI/CD service for faster and more reliable application and infrastructure updates. |
| CodeBuild | Fully managed build service that compiles, tests and creates software packages that are ready to deploy. |
| CodeDeploy | Deployment automation tool to deploy to on-premise and on-cloud EC2 instances with near-zero downtime during application deployments. |
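As one concrete illustration, CodeBuild reads its build instructions from a buildspec.yml file in the repository; a minimal, hypothetical example (the phase commands are placeholders) might be:

version: 0.2
phases:
  install:
    commands:
      - echo "Installing build dependencies"
  build:
    commands:
      - echo "Building and testing the project"
      - make test            # placeholder build/test command
artifacts:
  files:
    - '**/*'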
Infrastructure Automation
AWS provides services to make micro-services easily deployable onto containers and serverless platforms.
| AWS DevOps Infrastructure Tool | Description |
| --- | --- |
| Elastic Container Service | Highly scalable container management service. |
| CodePipeline | CI/CD service for faster and more reliable application and infrastructure updates. |
| AWS Lambda | Serverless computing using Function-as-a-Service (FaaS) methodologies. |
| AWS CloudFormation | Tool to create and manage related AWS resources. |
| AWS OpsWorks | Server configuration management tool. |
Monitoring and Logging
| AWS DevOps Monitoring Tool | Description |
| --- | --- |
| Amazon CloudWatch | Tool to monitor AWS resources and cloud applications, collect and track metrics and logs, and set alarms. |
| AWS X-Ray | Allows developers to analyze and troubleshoot performance issues of their cloud applications and micro-services. |
For more information, please visit Amazon AWS [1].
References
[1] Amazon AWS, DevOps and AWS. Amazon, 2019 [Online]. Available: https://aws.amazon.com/devops/
7 - DevOps with Azure Monitor
Microsoft provides a unified tool called Azure Monitor for end-to-end monitoring of the infrastructure and the deployed applications. Azure Monitor can greatly help DevOps teams by proactively and reactively monitoring the applications for bug tracking and health checks, and by providing metrics that can hint at various scalability aspects.
Azure Monitor accommodates applications developed in various programming languages: .NET, Java, Node.js, Python and various others. With the Azure Application Insights telemetry API incorporated into the applications, Azure Monitor can provide more detailed metrics and analytics around specific tracking needs such as usage and bugs.
Azure Monitor can help us track the health, performance and scalability issues of the infrastructure - VMs, Containers, Storage, Network and all Azure Services by automatically providing various platform metrics, activity and diagnostic logs.
Azure Monitor provides programmatic access through PowerShell scripts to the activity and diagnostic logs. It also allows querying them using powerful query tools for advanced in-depth analysis and reporting.
Azure Monitor proactively monitors and notifies us of critical conditions, such as reaching quota limits, abnormal usage and failed health checks, provides recommendations, and attempts to correct some of those issues automatically.
Azure Monitor dashboards allow visualizing various aspects of the data (metrics, logs, usage patterns) in tabular and graphical widgets.
Azure Monitor also facilitates closer monitoring of micro-services if they are provided through Azure's serverless Function-as-a-Service offering.
For more information, please visit Microsoft Azure Website [1].
References
[1] Microsoft Azure, Azure Monitor Overview. Microsoft, 2018 [Online]. Available: https://docs.microsoft.com/en-us/azure/azure-monitor/overview