DevOps

We present here a collection of information and tools related to DevOps.

1 - DevOps - Continuous Improvement

Introduction to DevOps and Continuous Integration

Deploying enterprise applications has always been challenging. Without consistent and reliable processes and practices, it is impossible to track and measure the deployment artifacts: which code files and configuration data have been deployed to which servers, and what level of unit and integration testing has been done among the various components of the enterprise applications. Deploying software to the cloud is more complex still, because DevOps teams do not have extensive access to the infrastructure and have to follow the guidelines and tools provided by the cloud companies. In recent years, Continuous Integration (CI) and Continuous Deployment (CD) have become the DevOps mantra for delivering software reliably and consistently.

While the CI/CD process is difficult enough, monitoring the deployed applications is emerging as a new challenge, especially on virtualized infrastructure that combines VMs with containers. Continuous Monitoring (CM) is a relatively new concept that is rapidly gaining popularity and becoming an integral part of the overall DevOps functionality. Depending on where the software has been deployed, continuous monitoring can be as simple as monitoring the behavior of the applications or as complex as end-to-end visibility across the infrastructure, heartbeat and health checks of the deployed applications, and dynamic scaling based on the usage of these applications. Addressing this challenge requires a robust monitoring pipeline. Continuous Monitoring is much easier to control if it is considered as early as possible and built into the software during development; tracking and metrics can then match the application's needs much more closely. Cloud companies, aware of this necessity, provide various DevOps tools to make CI/CD and continuous monitoring as easy as possible. While some of these tools and aspects are provided by the cloud offerings, others must be planned and built into our software.

At a high level, we can think of a simple pipeline to achieve a consistent and scalable deployment process. CI/CD and Continuous Monitoring Pipeline:

  • Step 1 - Continuous Development - Plan, Code, Build and Test:

    Planning, coding, and building the deployable artifacts (code, configuration, database, etc.) and putting them through the various types of tests across all dimensions, from technical to business and from internal to external, as automated as possible. All these aspects come under Continuous Development.

  • Step 2 - Continuous Improvement - Deploy, Operate and Monitor:

    Once the applications are deployed to production, this step covers how they are operated: bug and health checks, performance and scalability, and various kinds of monitoring, including infrastructure monitoring and cold-start delays caused by on-demand VM/container instantiation, which result from dynamic scaling and the selected hosting options. Making the necessary adjustments to improve the overall experience is what is called Continuous Improvement.

2 - Infrastructure as Code (IaC)

Infrastructure as Code is the ability of code to generate, maintain and destroy application infrastructure like server, storage and networking, without requiring manual changes.

Learning Objectives


  • Introduction to IaC
  • How IaC is related to DevOps
  • How IaC differs from Configuration Management Tools, and how it is related
  • Listing of IaC Tools
  • Further Reading

Introduction to IaC

IaC (Infrastructure as Code) is the ability of code to generate, maintain and destroy application infrastructure like server, storage and networking, without requiring manual changes. The state of the infrastructure is maintained in files.

Cloud architectures and containers have forced the use of IaC, as the number of elements to manage at each layer is simply too large. It is impractical to keep track of them with the traditional method of raising tickets and having someone do the work for you. Scaling demands, elasticity during odd hours, and usage-based billing all require provisioning, managing and destroying infrastructure much more dynamically.

According to the book “Amazon Web Services in Action” by Wittig [1], using a script or a declarative description has the following advantages:

  • Consistent usage
  • Dependencies are handled
  • Replicable
  • Customizable
  • Testable
  • Can figure out updated state
  • Minimizes human failure
  • Documentation for your infrastructure

Sometimes IaC tools are also called Orchestration tools, but that label is not as accurate, and often misleading.

DevOps has the following key practices

  • Automated Infrastructure
  • Automated Configuration Management, including Security
  • Shared version control between Dev and Ops
  • Continuous Build - Integrate - Test - Deploy
  • Continuous Monitoring and Observability

The first practice, Automated Infrastructure, can be fulfilled by IaC tools. Keeping the code for IaC and Configuration Management in the same code repository as the application code ensures adherence to the practice of shared version control.

Typically, the workflow of the DevOps team includes running Configuration Management tool scripts after running IaC tools, for configurations, security, connectivity, and initializations.

There are four broad categories of such tools [2]:

  • Ad hoc scripts: Any shell, Python, Perl, Lua scripts that are written
  • Configuration management tools: Chef, Puppet, Ansible, SaltStack
  • Server templating tools: Docker, Packer, Vagrant
  • Server provisioning tools: Terraform, Heat, CloudFormation, Cloud Deployment Manager, Azure Resource Manager

Configuration Management tools make use of scripts to achieve a state. IaC tools maintain state and metadata created in the past.

However, the key difference is that the state reached by running procedural code or scripts may differ from the state reached when the code was first written, because:

  • The ordering of the scripts determines the state. If the order changes, the state will differ. Also, issues like the waiting time required for resources to be created, modified or destroyed have to be dealt with correctly.
  • Version changes in procedural code are inevitable, and will lead to a different state.

Chef and Ansible are more procedural, while Terraform, CloudFormation, SaltStack, Puppet and Heat are more declarative.
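To make the procedural versus declarative distinction concrete, here is a minimal Python sketch. It is not tied to any real tool: the function names and the in-memory list of server names are hypothetical, chosen only for illustration. The procedural step adds servers every time it runs, while the declarative step converges on a stated target.

```python
def procedural_add(servers, count):
    """Procedural step: 'create `count` more servers'.

    The resulting state depends on how many times the step has been run.
    """
    start = len(servers)
    return servers + [f"server-{i}" for i in range(start, start + count)]


def declarative_ensure(servers, desired_count):
    """Declarative step: 'there should be exactly `desired_count` servers'."""
    if len(servers) < desired_count:
        missing = range(len(servers), desired_count)
        return servers + [f"server-{i}" for i in missing]
    return servers[:desired_count]


# Running each step twice shows the difference in the end state.
procedural = procedural_add(procedural_add([], 3), 3)
declarative = declarative_ensure(declarative_ensure([], 3), 3)
print(len(procedural), len(declarative))
```

Running the procedural step twice leaves six servers, while the declarative step still leaves three, because it describes the end state rather than the action.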

Declarative IaC tools do, however, lack the flexibility of an expressive scripting language.

Listing of IaC Tools

IaC tools that are cloud specific are

  • Amazon AWS - AWS CloudFormation
  • Google Cloud - Cloud Deployment Manager
  • Microsoft Azure - Azure Resource Manager
  • OpenStack - Heat

Terraform is not a cloud-specific tool; it is multi-vendor. It has good support for all of these clouds; however, Terraform scripts are not portable across clouds.

Advantages of IaC

IaC solves the problem of environment drift, which used to lead to the infamous “but it works on my machine” kind of errors that are difficult to trace.

IaC guarantees Idempotence – known/predictable end state – irrespective of starting state. Idempotency is achieved by either automatically configuring an existing target or by discarding the existing target and recreating a fresh environment.
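As an illustration of idempotence, the following Python sketch ensures that a configuration line is present in a file. It only mimics the idea and is not how any particular IaC tool is implemented; the function name `ensure_line`, the file name, and the setting are hypothetical.

```python
import os
import tempfile


def ensure_line(path, line):
    """Make sure `line` is present in the file at `path`.

    Returns True if the file had to be changed, False if it was already
    in the desired state.
    """
    existing = []
    if os.path.exists(path):
        with open(path) as handle:
            existing = handle.read().splitlines()
    if line in existing:
        return False  # already in the desired state, nothing to do
    with open(path, "a") as handle:
        handle.write(line + "\n")
    return True


workdir = tempfile.mkdtemp()
config_path = os.path.join(workdir, "example.conf")

first_run = ensure_line(config_path, "port = 8080")
second_run = ensure_line(config_path, "port = 8080")
print(first_run, second_run)
```

The first call changes the file and the second call does nothing, so the end state is the same no matter how often the step is applied.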

Further Reading

Please see books and resources like the “Terraform Up and Running” [2] for more real-world advice on IaC, structuring Terraform code and good deployment practices.

A good resource for IaC is the book “Infrastructure as Code” [3].

References

[1] A. Wittig and M. Wittig, Amazon Web Services in Action, 1st ed. Manning, 2015.

[2] Y. Brikman, Terraform: Up and Running, 1st ed. O’Reilly Media Inc, 2017.

[3] K. Morris, Infrastructure as Code, 1st ed. O’Reilly Media Inc, 2015.

3 - Ansible

Ansible is an open-source IT automation DevOps engine allowing you to manage and configure many compute resources in a scalable, consistent and reliable way.

Introduction to Ansible

Ansible is an open-source IT automation DevOps engine allowing you to manage and configure many compute resources in a scalable, consistent and reliable way.

Ansible automates the following tasks:

  • Provisioning: It sets up the servers that you will use as part of your infrastructure.

  • Configuration management: You can change the configuration of an application, OS, or device. You can implement security policies and other configuration tasks.

  • Service management: You can start and stop services, and install updates.

  • Application deployment: You can conduct application deployments in an automated fashion that integrate with your DevOps strategies.

Prerequisite

We assume you

  • can install an Ubuntu 18.04 virtual machine on VirtualBox

  • can install software packages via the ‘apt-get’ tool on an Ubuntu virtual host

  • have already reserved a virtual cluster (with at least 1 virtual machine in it) on some cloud, or can use VMs installed in VirtualBox instead.

  • have SSH credentials and can login to your virtual machines.

Setting up a playbook

Let us develop a sample from scratch, based on the paradigms that ansible supports. We are going to use Ansible to install Apache server on our virtual machines.

First, we install ansible on our machine and make sure we have an up to date OS:

$ sudo apt-get update
$ sudo apt-get install ansible

Next, we prepare a working environment for your Ansible example

$ mkdir ansible-apache
$ cd ansible-apache

To use Ansible we need a local configuration. When you execute Ansible within this folder, this local configuration file always overrides the system-level Ansible configuration. It is generally beneficial to keep custom configurations local unless you are sure they should apply system-wide. Create a file ansible.cfg in this folder and add the following:

[defaults]
inventory = hosts.txt

This local configuration file tells Ansible that the target machines are listed in a file named hosts.txt. Next we specify the hosts in that file.

As stated in the prerequisites, you should have SSH login access to all VMs listed in this file. Now create and edit the file hosts.txt with the following content:

[apache]
<server_ip> ansible_ssh_user=<server_username>

The name apache in the brackets defines a server group name. We will use this name to refer to all servers in this group. As we intend to install and run Apache on the servers, the name seems appropriate. Fill in the IP addresses of the virtual machines you launched in VirtualBox and start these VMs.
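To see how such an inventory file is structured, the following Python sketch reads the small subset of the format used here: group headers in brackets, followed by one host per line with optional key=value variables. This is not Ansible's own parser. The function name `parse_inventory`, the addresses and the user name are placeholders for illustration.

```python
def parse_inventory(text):
    """Return {group: [{"host": ..., "vars": {...}}, ...]}.

    Handles only group headers like [apache] and host lines of the form
    '<host> key=value key=value'.
    """
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups[current] = []
            continue
        if current is None:
            # hosts listed before any header go into a catch-all group
            current = "ungrouped"
            groups.setdefault(current, [])
        host, *pairs = line.split()
        variables = dict(pair.split("=", 1) for pair in pairs)
        groups[current].append({"host": host, "vars": variables})
    return groups


# Placeholder addresses and user name, not real machines.
sample = """
[apache]
192.0.2.10 ansible_ssh_user=ubuntu
192.0.2.11 ansible_ssh_user=ubuntu
"""
inventory = parse_inventory(sample)
print([entry["host"] for entry in inventory["apache"]])
```

A playbook that names the group apache in its hosts field would then apply to both of these entries.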

To deploy the service, we need to create a playbook. A playbook tells Ansible what to do. It uses YAML syntax. Create and edit a file with a suitable name, e.g. apache.yml, as follows:

---
- hosts: apache #comment: apache is the group name we just defined
  become: yes #comment: this operation needs privilege access
  tasks:
    - name: install apache2 # text description
      apt: name=apache2 update_cache=yes state=latest

This block defines the target VMs and the operations (tasks) to apply to them. We use the apt module to specify the software package to install. Note that apt works only on Debian-based distributions such as Ubuntu; Ansible also provides a generic package module that selects the appropriate package manager for the target operating system, which makes it possible to write playbooks that work across different OSes.

Ansible relies on various kinds of modules to fulfill tasks on the remote servers. These modules are developed for particular tasks and take related arguments. For instance, when we use the apt module, we need to tell it which package we intend to install. That is why we provide a value for the name= argument. The name attribute at the start of the task is just a description that is printed when the task is executed.

Run the playbook

In the same folder, execute

$ ansible-playbook apache.yml --ask-become-pass

After a successful run, open a browser and enter your server IP. You should see the ‘It works!’ Apache2 Ubuntu default page. Make sure the security policy on your cloud opens port 80 to let the HTTP traffic through.

Ansible playbooks can have more complex structure and syntax. Go explore! This example is based on:

We cover more advanced Ansible features in the following sections.

Ansible Roles

Next we install the R package onto our cloud VMs. R is a statistical programming language commonly used in many scientific and statistical computing projects, and maybe also the one you chose for this class. With this example we illustrate the concept of Ansible Roles, install source code from GitHub, and make use of variables. These are key features you will find useful in your project deployments.

We are going to use a top-down fashion in this example. We first start from a playbook that is already good to go. You can execute this playbook (do not do it yet, always read the entire section first) to get R installed in your remote hosts. We then further complicate this concise playbook by introducing functionalities to do the same tasks but in different ways. Although these different ways are not necessary they help you grasp the power of Ansible and ease your life when they are needed in your real projects.

Let us now create the following playbook with the name example.yml:

---
- hosts: R_hosts
  become: yes
  tasks:
    - name: install the R package
      apt: name=r-base update_cache=yes state=latest

The hosts are defined in the file hosts.txt, which our local Ansible configuration file points to:

[R_hosts]
<cloud_server_ip> ansible_ssh_user=<cloud_server_username>

This is enough to get the installation done. Next, however, we are going to extend it with a feature called roles.

A role is an important concept that is often used in large Ansible projects. You divide a series of tasks into different groups. Each group corresponds to a certain role within the project.

For example, if your project is to deploy a web site, you may need to install the back-end database, the web server that responds to HTTP requests, and the web application itself. These are three different roles, and each should carry out its own installation and configuration tasks.

Even though we only need to install the R package in this example, we can still do it by defining a role ‘r’. Let us modify our example.yml to be:

---
- hosts: R_hosts

  roles:
    - r

Now we create a directory structure in your top project directory as follows

$ mkdir -p roles/r/tasks
$ touch roles/r/tasks/main.yml

Next, we edit the main.yml file and include the following content:

---
- name: install the R package
  apt: name=r-base update_cache=yes state=latest
  become: yes

You probably already get the point. We take the ‘tasks’ section out of the earlier example.yml and reorganize it into roles. Each role specified in example.yml should have its own directory under roles/, and the tasks that need to be done by this role are listed in the file ‘tasks/main.yml’, as before.

Using Variables

We demonstrate this feature by installing source code from GitHub. Although R can be installed through the OS package manager (apt-get etc.), the software used in your projects may not be. Many research projects are distributed through Git repositories instead. Here we show how to install packages from their Git repositories. Instead of using the ‘apt’ module directly, we pretend Ubuntu does not provide this package and you have to get it from Git. The source code of R can be found at https://github.com/wch/r-source.git. We are going to clone it to the remote VM’s disk, build the package, and install the binary there.

To do so, we need a few new Ansible modules. You may remember from the last example that Ansible modules assist us to do different tasks based on the arguments we pass to it. It will come to no surprise that Ansible has a module ‘git’ to take care of git-related works, and a ‘command’ module to run shell commands. Let us modify roles/r/tasks/main.yml to be:

---
- name: get R package source
  git:
    repo: https://github.com/wch/r-source.git
    dest: /tmp/R

- name: build and install R
  become: yes
  # run each build step from the source directory
  command: "{{ item }}"
  args:
    chdir: /tmp/R
  with_items:
    - ./configure
    - make
    - make install

The role r will now carry out two tasks. One to clone the R source code into /tmp/R, the other uses a series of shell commands to build and install the packages.

Note that the commands executed by the second task may not be available on a fresh VM image. But the point of this example is to show an alternative way to install packages, so we conveniently assume the conditions are all met.

We have typed several string constants into our Ansible scripts so far. In general, it is good practice to give these values names and refer to them by name. This makes a complex Ansible project less error-prone. To do this, we keep the variables in a separate file. Create a file in the same directory as example.yml and name it vars.yml:

---
repository: https://github.com/wch/r-source.git
tmp: /tmp/R

Accordingly, we will update our example.yml:

---
- hosts: R_hosts
  vars_files:
    - vars.yml
  roles:
    - r

As shown, we specify vars_files to tell the playbook that the file vars.yml supplies the variable values. Variables are referenced by name inside double curly brackets, as in roles/r/tasks/main.yml:

---
- name: get R package source
  git:
    repo: "{{ repository }}"
    dest: "{{ tmp }}"

- name: build and install R
  become: yes
  # run each build step from the source directory
  command: "{{ item }}"
  args:
    chdir: "{{ tmp }}"
  with_items:
    - ./configure
    - make
    - make install
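Ansible resolves the double-curly-bracket placeholders with the Jinja2 templating engine. The following Python sketch shows the basic idea with a simple regular expression. It is a simplified stand-in, not Jinja2 and not Ansible's implementation; the `render` function and the command string being rendered are hypothetical.

```python
import re

# Matches placeholders such as {{ repository }} or {{tmp}}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, variables):
    """Replace each {{ name }} placeholder with its value from `variables`."""
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined variable: {name}")
        return str(variables[name])
    return PLACEHOLDER.sub(substitute, template)


# The same key-value pairs as in vars.yml.
variables = {
    "repository": "https://github.com/wch/r-source.git",
    "tmp": "/tmp/R",
}
print(render("git clone {{ repository }} {{ tmp }}", variables))
```

Changing a value in one place, here the dictionary and in the playbook the file vars.yml, updates every task that refers to it by name.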

Now, just edit the hosts.txt file with your target VMs' IP addresses and execute the playbook.

You should be able to extend the Ansible playbook for your needs. Configuration tools like Ansible are important components to master the cloud environment.

Ansible Galaxy

Ansible Galaxy is a marketplace, where developers can share Ansible Roles to complete their system administration tasks. Roles exchanged in Ansible Galaxy community need to follow common conventions so that all participants know what to expect. We will illustrate details in this chapter.

It is good to follow the Ansible Galaxy standard during your development as much as possible.

Ansible Galaxy helloworld

Let us start with the simplest case: we will build an Ansible Galaxy project. This project installs the Emacs software package with your localhost as the target host. It is a helloworld project meant only to familiarize us with the Ansible Galaxy project structure.

First you need to create a directory. Let us call it mongodb:

$ mkdir mongodb

Go ahead and create the files README.md, playbook.yml, and inventory, as well as a subdirectory roles/. The file playbook.yml is your project playbook. It should perform the Emacs installation task by executing the corresponding role you will develop in the folder ‘roles/’. The only difference is that we will construct the role with the help of ansible-galaxy this time.

Now, let ansible-galaxy initialize the directory structure for you:

$ cd roles
$ ansible-galaxy init <to-be-created-role-name>

The naming convention is to concatenate your name and the role name with a dot. The figure shows what the result looks like.

Figure: Directory structure created by ansible-galaxy init

Let us fill in information to our project. There are several main.yml files in different folders, and we will illustrate their usages.

defaults and vars:

These folders should hold variables key-value pairs for your playbook scripts. We will leave them empty in this example.

files:

This folder is for files that need to be copied to the target hosts. Data files or configuration files can be placed here if needed. We will leave it empty too.

templates:

Similar in purpose to files/, this folder holds template files. Keep it empty for a simple Emacs installation.

handlers:

This is reserved for services running on target hosts. For example, to restart a service under certain circumstance.

tasks:

This file is the actual script for all tasks. You can use the role you built previously for Emacs installation here:

---
- name: install Emacs on Ubuntu 16.04
  become: yes
  package: name=emacs state=present

meta:

Provide necessary metadata for our Ansible Galaxy project for shipping:

    ---
    galaxy_info:
      author: <your name>
      description: emacs installation on Ubuntu 16.04
      license:
        - MIT
      min_ansible_version: 2.0
      platforms:
        - name: Ubuntu
          versions:
            - xenial
      galaxy_tags:
        - development

    dependencies: []

Next let us test it out. You have your Ansible Galaxy role ready now. To test it as a user, go to your directory and edit the other two files inventory.txt and playbook.yml, which are already generated for you in directory tests by the script:

$ ansible-playbook -i ./hosts playbook.yml

After running this playbook, you should have Emacs installed on localhost.

A Complete Ansible Galaxy Project

We are going to use ansible-galaxy to setup a sample project. This sample project will:

  • use a cloud cluster with multiple VMs
  • deploy Apache Spark on this cluster
  • install a particular HPC application
  • prepare raw data for this cluster to process
  • run the experiment and collect results

Ansible: Write a Playbook for MongoDB

Ansible Playbooks are automated scripts written in the YAML data format. Instead of using manual commands to set up multiple remote machines, you can use Ansible Playbooks to configure your entire system. YAML syntax is easy to read and expresses the data structure of Ansible functions. You simply write some tasks in an Ansible Playbook, for example installing software, configuring default settings, and starting the software. With a few examples in this section, you will understand how it works and how to write your own Playbooks.

There are also several examples of using Ansible Playbooks on the official site. They range from basic usage of Ansible Playbooks to advanced usage such as applying patches and updates with different roles and groups.

We are going to write a basic Ansible playbook. Keep in mind that Ansible is the main program and a playbook is the template describing what you would like it to do. You may have several playbooks in your Ansible project.

First playbook for MongoDB Installation

As a first example, we are going to write a playbook which installs MongoDB server. It includes the following tasks:

  • Import the public key used by the package management system
  • Create a list file for MongoDB
  • Reload local package database
  • Install the MongoDB packages
  • Start MongoDB

The material presented here is based on the manual installation of MongoDB from the official site:

We also assume that we install MongoDB on Ubuntu 15.10.

Enabling Root SSH Access

Some setups of managed nodes may not allow you to log in as root. As this may be problematic later, let us create a playbook to resolve this. Create an enable-root-access.yaml file with the following contents:

---
- hosts: ansible-test
  remote_user: ubuntu
  tasks:
    - name: Enable root login
      shell: sudo cp ~/.ssh/authorized_keys /root/.ssh/

Explanation:

  • hosts specifies the name of a group of machines in the inventory

  • remote_user specifies the username on the managed nodes to log in as

  • tasks is a list of tasks to accomplish having a name (a description) and modules to execute. In this case we use the shell module.

We can run this playbook like so:

$ ansible-playbook -i inventory.txt -c ssh enable-root-access.yaml

PLAY [ansible-test] ***********************************************************

GATHERING FACTS ***************************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]

TASK: [Enable root login] *****************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]

PLAY RECAP ********************************************************************
10.23.2.104                : ok=2    changed=1    unreachable=0    failed=0
10.23.2.105                : ok=2    changed=1    unreachable=0    failed=0

Hosts and Users

The first step is choosing the hosts on which to install MongoDB and a user account to run commands (tasks). We start with the following lines in an example file named mongodb.yaml:

---
- hosts: ansible-test
  remote_user: root
  become: yes

In a previous section, we set up two machines in the ansible-test group. We use these two machines for the MongoDB installation, and we use the root account to run the Ansible tasks.

Indentation is important in YAML. Do not ignore the leading spaces on each line.

Tasks

A list of tasks contains commands or configurations to be executed on remote machines in sequential order. Each task comes with a name and a module that runs your command or configuration. You provide a description of your task in the name field and choose a module for the task. There are many modules to choose from. For example, the shell module simply executes a command and cannot tell whether the command actually changed anything on the machine. You may use the apt or yum modules, which are packaging modules, to install software. You can find the entire list of modules here: http://docs.ansible.com/list_of_all_modules.html

Module apt_key: add repository keys

We need to import the MongoDB public GPG key. This is the first task in our playbook:

tasks:
  - name: Import the public key used by the package management system
    apt_key: keyserver=hkp://keyserver.ubuntu.com:80 id=7F0CEB10 state=present

Module apt_repository: add repositories

Next add the MongoDB repository to apt:

- name: Add MongoDB repository
  apt_repository: repo='deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' state=present

Module apt: install packages

We use the apt module to install the mongodb-org package. A notify action is added to start mongod after this task completes. The update_cache=yes option reloads the local package database:

- name: install mongodb
  apt: pkg=mongodb-org state=latest update_cache=yes
  notify:
  - start mongodb

Module service: manage services

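We use handlers here to start or restart services. Handlers are similar to tasks, but they run only when notified by a task that reported a change, and then only once, after the tasks have completed: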

handlers:
  - name: start mongodb
    service: name=mongod state=started
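The notify and handler mechanism can be pictured with a short Python sketch. This is a simplified model of the behavior, not Ansible code; `run_play`, the tuple layout of the tasks, and the task names are assumptions made for illustration.

```python
def run_play(tasks, handlers):
    """Run tasks in order, then run each notified handler once.

    `tasks` is a list of (name, action, notify) tuples, where action()
    returns True if it changed something. `handlers` maps a handler name
    to a callable. Returns the names of the handlers that were executed.
    """
    notified = set()
    for _name, action, notify in tasks:
        if action() and notify:
            notified.add(notify)  # repeated notifications collapse into one
    executed = []
    for handler_name, handler in handlers.items():
        if handler_name in notified:
            handler()
            executed.append(handler_name)
    return executed


log = []
tasks = [
    ("install package", lambda: True, "start service"),
    ("update config", lambda: True, "start service"),
    ("check version", lambda: False, "start service"),
]
handlers = {"start service": lambda: log.append("service started")}

run_play(tasks, handlers)
print(log)
```

Although two tasks report a change and notify the same handler, the handler runs a single time after all tasks have finished. If no task reports a change, it does not run at all.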

The Full Playbook

Our first playbook looks like this:

---
- hosts: ansible-test
  remote_user: root
  become: yes
  tasks:
  - name: Import the public key used by the package management system
    apt_key: keyserver=hkp://keyserver.ubuntu.com:80 id=7F0CEB10 state=present
  - name: Add MongoDB repository
    apt_repository: repo='deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' state=present
  - name: install mongodb
    apt: pkg=mongodb-org state=latest update_cache=yes
    notify:
    - start mongodb
  handlers:
    - name: start mongodb
      service: name=mongod state=started

Running a Playbook

We use ansible-playbook command to run our playbook:

$ ansible-playbook -i inventory.txt -c ssh mongodb.yaml

PLAY [ansible-test] ***********************************************************

GATHERING FACTS ***************************************************************
ok: [10.23.2.104]
ok: [10.23.2.105]

TASK: [Import the public key used by the package management system] ***********
changed: [10.23.2.104]
changed: [10.23.2.105]

TASK: [Add MongoDB repository] ************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]

TASK: [install mongodb] *******************************************************
changed: [10.23.2.104]
changed: [10.23.2.105]

NOTIFIED: [start mongodb] *****************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]

PLAY RECAP ********************************************************************
10.23.2.104                : ok=5    changed=3    unreachable=0    failed=0
10.23.2.105                : ok=5    changed=3    unreachable=0    failed=0

If you rerun the playbook, you should see that nothing changed:

$ ansible-playbook -i inventory.txt -c ssh mongodb.yaml

PLAY [ansible-test] ***********************************************************

GATHERING FACTS ***************************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]

TASK: [Import the public key used by the package management system] ***********
ok: [10.23.2.104]
ok: [10.23.2.105]

TASK: [Add MongoDB repository] ************************************************
ok: [10.23.2.104]
ok: [10.23.2.105]

TASK: [install mongodb] *******************************************************
ok: [10.23.2.105]
ok: [10.23.2.104]

PLAY RECAP ********************************************************************
10.23.2.104                : ok=4    changed=0    unreachable=0    failed=0
10.23.2.105                : ok=4    changed=0    unreachable=0    failed=0
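If you want to check such a run automatically, for instance in a CI job, the PLAY RECAP lines can be parsed. The following Python sketch assumes the recap format shown in the output above; `parse_recap` is a hypothetical helper, not part of Ansible.

```python
import re

# A recap line looks like: "<host> : ok=4 changed=0 unreachable=0 failed=0"
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Return {host: {"ok": int, "changed": int, ...}} from PLAY RECAP lines."""
    results = {}
    for line in output.splitlines():
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (pair.split("=") for pair in match.group("counters").split())
            results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


recap = """
PLAY RECAP ********************************************************************
10.23.2.104                : ok=4    changed=0    unreachable=0    failed=0
10.23.2.105                : ok=4    changed=0    unreachable=0    failed=0
"""
hosts = parse_recap(recap)
print(all(stats["changed"] == 0 for stats in hosts.values()))
```

A second run that reports changed=0 for every host is a quick sign that the playbook is idempotent.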

Sanity Check: Test MongoDB

Let us run ‘mongo’ to enter the MongoDB shell:

$ ssh ubuntu@$IP
$ mongo
MongoDB shell version: 2.6.9
connecting to: test
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        http://docs.mongodb.org/
Questions? Try the support group
        http://groups.google.com/group/mongodb-user
>

Terms

  • Module: Ansible library to run or manage services, packages, files or commands.

  • Handler: A task that runs only when notified by another task.

  • Task: Ansible job to run a command, check files, or update configurations.

  • Playbook: A list of tasks for Ansible nodes, written in YAML.

  • YAML: Human readable generic data serialization.

Reference

The main tutorial from Ansible is here: http://docs.ansible.com/playbooks_intro.html

You can also find an index of the ansible modules here: http://docs.ansible.com/modules_by_category.html

Exercise

We have shown a couple of examples of using Ansible tools. Before you apply them in your final project, we will practice in this exercise.

  • set up the project structure similar to Ansible Galaxy example
  • install MongoDB from the package manager (apt in this class)
  • configure your MongoDB installation to start the service automatically
  • use default port and let it serve local client connections only

4 - Puppet

Puppet is a configuration management tool that simplifies the complex tasks of deploying new software, applying software updates, and rolling back software packages in a large cluster.

Overview

Configuration management is an important task of the IT department in any organization. It is the process of managing infrastructure changes in a structured and systematic way. Manually rolling back infrastructure to a previous software version is cumbersome, time consuming, and error prone. Puppet is a configuration management tool that simplifies the complex tasks of deploying new software, applying software updates, and rolling back software packages in a large cluster. Puppet does this through Infrastructure as Code (IaC). Code for the infrastructure is written in one central location and is pushed to the nodes in all environments (Dev, Test, Production) using the Puppet tool. Configuration management tools take one of two approaches to managing infrastructure: configuration push or configuration pull. In a push configuration, the infrastructure code is pushed from a centralized server to the nodes, whereas in a pull configuration the nodes pull the infrastructure code from a central server, as shown in fig. 1.

Figure 1: Infrastructure As Code [1]

Puppet uses push and pull configuration in centralized manner as shown in fig. 2.

Figure 2: push-pull-config Image [1]

Another popular infrastructure tool is Ansible. It does not have master and client nodes. Any node in Ansible can act as executor. Any node containing list of inventory and SSH credential can play master node role to connect with other nodes as opposed to puppet architecture where server and agent software needs to be setup and installed. Configuring Ansible nodes is simple, it just requires python version 2.5 or greater. Ansible uses push architecture for configuration.

Master slave architecture

Puppet uses a master-slave architecture, as shown in fig. 3. The Puppet server is called the master node, and the client nodes are called Puppet agents. Agents poll the server at regular intervals and pull updated configuration from the master. The Puppet master can be made highly available: Puppet supports a multi-master architecture in which a backup master takes over if one master goes down.

Workflow

  • Nodes (Puppet agents) send information about themselves (e.g., IP address, hardware details, network settings) to the master. The master uses this information, together with the manifest files, to determine each node's configuration.
  • The master node compiles a catalog containing the configuration that needs to be implemented on the agent nodes.
  • The master sends the catalog to the Puppet agent nodes, which implement the configuration.
  • Client nodes send an updated report back to the master. The master updates its inventory.
  • All exchanges between master and agent are secured through SSL encryption (see fig. 3).

Figure 3: Master and Slave Architecture [1]

Fig. 4 shows the flow between master and slave.

Figure 4: Master Slave Workflow 1 [1]

Fig. 5 shows the SSL workflow between master and slave.

Figure 5: Master Slave SSL Workflow [1]
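
The workflow steps listed above can be mimicked with a small Python sketch. It is only a conceptual model under stated assumptions: the function names compile_catalog and apply_catalog, the fact values, and the representation of a manifest as path/template pairs are all invented for illustration and do not correspond to Puppet's real data formats.

```python
def compile_catalog(facts, manifest):
    """Master side: turn the manifest into a node-specific catalog.

    The manifest is modeled as a list of (path, template) pairs; the catalog
    is a dict of path -> final content with the facts filled in.
    """
    return {path: template.format(**facts) for path, template in manifest}


def apply_catalog(catalog, node_state):
    """Agent side: enforce the catalog and report which resources changed."""
    changed = []
    for path, content in catalog.items():
        if node_state.get(path) != content:
            node_state[path] = content
            changed.append(path)
    return {"changed": changed, "total": len(catalog)}


# Step 1: the agent sends its facts (the values here are made up).
facts = {"hostname": "puppet-agent", "ipaddress": "192.0.2.10"}

# Step 2: the master compiles a catalog for this node.
manifest = [("/etc/motd", "Welcome to {hostname} ({ipaddress})\n")]
catalog = compile_catalog(facts, manifest)

# Step 3: the agent applies the catalog. Step 4: it sends back a report.
node_state = {}
first_report = apply_catalog(catalog, node_state)
second_report = apply_catalog(catalog, node_state)

print(first_report)   # {'changed': ['/etc/motd'], 'total': 1}
print(second_report)  # {'changed': [], 'total': 1}
```

The second run reports no changes because the node already matches the catalog, which is the behavior we expect from a configuration management tool.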

Puppet comes in two forms: open source Puppet and Puppet Enterprise. In this tutorial we will showcase the installation steps for both forms.

Install Open Source Puppet on Ubuntu

We will demonstrate the installation of Puppet on Ubuntu.

Prerequisite: at least 4 GB of RAM and an Ubuntu box (standalone or VM).

First, we need to make sure that the Puppet master and agent are able to communicate with each other. The agent should be able to connect to the master by name.

Configure the Puppet server name and map it to its IP address

$ sudo nano /etc/hosts

The contents of /etc/hosts should look like

<ip_address> my-puppet-master

my-puppet-master is the name of the Puppet master to which the Puppet agent will try to connect.

Press <ctrl> + O to save and <ctrl> + X to exit.
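
Before continuing, it can be useful to confirm that the entry is in place. The following Python sketch parses text in /etc/hosts format and looks up the master's name. The function name parse_hosts and the sample address 192.0.2.5 (taken from the address range reserved for documentation) are assumptions for illustration.

```python
def parse_hosts(text):
    """Return a dict mapping each host name to its IP address."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)        # keep the first entry for a name
    return mapping


sample = """\
127.0.0.1   localhost
192.0.2.5   my-puppet-master   # Puppet master
"""

hosts = parse_hosts(sample)
print(hosts.get("my-puppet-master"))  # 192.0.2.5
```

On a real machine the same function could be fed the contents of the actual file, for example parse_hosts(open("/etc/hosts").read()).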

Next, we will install Puppet on the Ubuntu server. We execute the following commands to set up the official Puppet Labs repository

$ curl -O https://apt.puppetlabs.com/puppetlabs-release-pc1-xenial.deb
$ sudo dpkg -i puppetlabs-release-pc1-xenial.deb
$ sudo apt-get update

Install the Puppet server

$ sudo apt-get install puppetserver

The default installation of the Puppet server is configured to use 2 GB of RAM. However, we can customize this by opening the puppetserver configuration file

$ sudo nano /etc/default/puppetserver

This will open the file in an editor. Look for the JAVA_ARGS line and change the values of the -Xms and -Xmx parameters to 3g if we wish to configure the Puppet server for 3 GB of RAM. Note that the default value of these parameters is 2g.

JAVA_ARGS="-Xms3g -Xmx3g -XX:MaxPermSize=256m"

Press <ctrl> + O to save and <ctrl> + X to exit.
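
If the change has to be repeated on several servers, the edit can be scripted instead of done by hand. The following Python sketch rewrites the two heap parameters in a JAVA_ARGS line. The function name set_heap is hypothetical, and the input line with 2g is an assumption based on the default value mentioned above.

```python
import re


def set_heap(java_args_line, size):
    """Return the line with both -Xms and -Xmx set to `size` (e.g. '3g')."""
    return re.sub(
        r"-(Xms|Xmx)\d+[kKmMgG]?",
        lambda match: f"-{match.group(1)}{size}",
        java_args_line,
    )


original = 'JAVA_ARGS="-Xms2g -Xmx2g -XX:MaxPermSize=256m"'
updated = set_heap(original, "3g")
print(updated)  # JAVA_ARGS="-Xms3g -Xmx3g -XX:MaxPermSize=256m"
```

Other options on the line, such as -XX:MaxPermSize, are left untouched.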

By default, the Puppet server uses port 8140 to communicate with agents. We need to make sure that the firewall allows communication on this port

$ sudo ufw allow 8140

Next, we start the Puppet server

$ sudo systemctl start puppetserver

Verify that the server has started

$ sudo systemctl status puppetserver

We will see “active (running)” if the server has started successfully

$ sudo systemctl status puppetserver
● puppetserver.service - puppetserver Service
   Loaded: loaded (/lib/systemd/system/puppetserver.service; disabled; vendor pr
   Active: active (running) since Sun 2019-01-27 00:12:38 EST; 2min 29s ago
  Process: 3262 ExecStart=/opt/puppetlabs/server/apps/puppetserver/bin/puppetser
 Main PID: 3269 (java)
   CGroup: /system.slice/puppetserver.service
           └─3269 /usr/bin/java -Xms3g -Xmx3g -XX:MaxPermSize=256m -Djava.securi

Jan 27 00:11:34 ritesh-ubuntu1 systemd[1]: Starting puppetserver Service...
Jan 27 00:11:34 ritesh-ubuntu1 puppetserver[3262]: OpenJDK 64-Bit Server VM warn
Jan 27 00:12:38 ritesh-ubuntu1 systemd[1]: Started puppetserver Service.
lines 1-11/11 (END)

Configure the Puppet server to start at boot time

$ sudo systemctl enable puppetserver

Next, we will install the Puppet agent

$ sudo apt-get install puppet-agent

Start the Puppet agent

$ sudo systemctl start puppet

Configure the Puppet agent to start at boot time

$ sudo systemctl enable puppet

Next, we need to change the Puppet agent config file so that the agent can connect to and communicate with the Puppet master

$ sudo nano /etc/puppetlabs/puppet/puppet.conf

The configuration file will be opened in an editor. Add the following sections to the file

[main]
certname = <puppet-agent>
server = <my-puppet-server>

[agent]
server = <my-puppet-server>

Note: <my-puppet-server> stands for the server name that we mapped in the /etc/hosts file while installing the Puppet server (my-puppet-master in the example above), and certname is the name under which the agent's certificate is issued.
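
Because puppet.conf uses an INI-like layout, a quick consistency check of the edited file can be sketched with Python's standard configparser module. The sample values puppet-agent and my-puppet-master are placeholders taken from this tutorial, and configparser is assumed to be sufficient only for simple files like this one.

```python
import configparser

sample = """\
[main]
certname = puppet-agent
server = my-puppet-master

[agent]
server = my-puppet-master
"""

conf = configparser.ConfigParser(interpolation=None)
conf.read_string(sample)

certname = conf.get("main", "certname")
main_server = conf.get("main", "server")
# Use the value from [agent] if present, otherwise the one from [main].
agent_server = conf.get("agent", "server", fallback=main_server)

print(certname, agent_server)  # puppet-agent my-puppet-master
```

For the real file, conf.read("/etc/puppetlabs/puppet/puppet.conf") could replace the read_string call.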

The Puppet agent sends a certificate signing request to the Puppet server when it connects for the first time. After the request has been signed, the Puppet server trusts the agent and identifies it as a node to manage.

Execute the following command on the Puppet master to see all incoming certificate signing requests

$ sudo /opt/puppetlabs/bin/puppet cert list

We will see something like

$ sudo /opt/puppetlabs/bin/puppet cert list
 "puppet-agent" (SHA256) 7B:C1:FA:73:7A:35:00:93:AF:9F:42:05:77:9B:
 05:09:2F:EA:15:A7:5C:C9:D7:2F:D7:4F:37:A8:6E:3C:FF:6B

Note that puppet-agent is the name that we have configured for certname in the puppet.conf file.

After validating that the request comes from a valid and trusted agent, we sign the request

$ sudo /opt/puppetlabs/bin/puppet cert sign puppet-agent

If successful, we will see a message saying that the certificate was signed

$ sudo /opt/puppetlabs/bin/puppet cert sign puppet-agent
Signing Certificate Request for:
  "puppet-agent" (SHA256) 7B:C1:FA:73:7A:35:00:93:AF:9F:42:05:77:9B:05:09:2F:
  EA:15:A7:5C:C9:D7:2F:D7:4F:37:A8:6E:3C:FF:6B
Notice: Signed certificate request for puppet-agent
Notice: Removing file Puppet::SSL::CertificateRequest puppet-agent
at '/etc/puppetlabs/puppet/ssl/ca/requests/puppet-agent.pem'

Next, we will verify the installation and make sure that the Puppet server is able to deliver configuration to the agent. Puppet uses code in a domain-specific language that is written in manifest ( .pp ) files.

Create the default manifest file site.pp

$ sudo nano /etc/puppetlabs/code/environments/production/manifests/site.pp

This will open the file in edit mode. Add the following content to the file

file {'/tmp/it_works.txt':                        # resource type file and filename
  ensure  => present,                             # make sure it exists
  mode    => '0644',                              # file permissions
  content => "It works!\n",                      # text written to the file
}

The domain-specific language is used here to create the file it_works.txt inside the /tmp directory on the agent node. The ensure attribute makes sure that the file is present; Puppet recreates it if it is removed. The mode attribute sets the file permissions: 0644 lets the owner read and write the file, while the group and all other users can only read it. The content attribute defines the content of the file [hid-sp18-523-open]
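
To illustrate what enforcing such a resource means, the following Python sketch brings a file to the desired state and reports whether anything had to change. It is not how Puppet is implemented. The function name ensure_file is hypothetical, a temporary directory is used instead of /tmp/it_works.txt, and POSIX file permissions are assumed.

```python
import os
import stat
import tempfile


def ensure_file(path, content, mode):
    """Bring `path` to the desired state; return True if anything changed."""
    changed = False
    current = None
    if os.path.exists(path):
        with open(path) as handle:
            current = handle.read()
    if current != content:
        # File is missing or has different content: (re)write it.
        with open(path, "w") as handle:
            handle.write(content)
        changed = True
    if stat.S_IMODE(os.stat(path).st_mode) != mode:
        os.chmod(path, mode)
        changed = True
    return changed


target = os.path.join(tempfile.mkdtemp(), "it_works.txt")
first = ensure_file(target, "It works!\n", 0o644)
second = ensure_file(target, "It works!\n", 0o644)
print(first, second)  # True False
```

The first call creates the file, while the second finds it already in the desired state and changes nothing. This idempotent behavior is what allows an agent to apply the same catalog repeatedly.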

Next, we test the installation on a single node

$ sudo /opt/puppetlabs/bin/puppet agent --test

A successful verification will display

Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Caching catalog for puppet-agent
Info: Applying configuration version '1548305548'
Notice: /Stage[main]/Main/File[/tmp/it_works.txt]/content:
--- /tmp/it_works.txt    2019-01-27 02:32:49.810181594 +0000
+++ /tmp/puppet-file20190124-9628-1vy51gg    2019-01-27 02:52:28.717734377 +0000
@@ -0,0 +1 @@
+it works!

Info: Computing checksum on file /tmp/it_works.txt
Info: /Stage[main]/Main/File[/tmp/it_works.txt]: Filebucketed /tmp/it_works.txt
to puppet with sum d41d8cd98f00b204e9800998ecf8427e
Notice: /Stage[main]/Main/File[/tmp/it_works.txt]/content: content
changed '{md5}d41d8cd98f00b204e9800998ecf8427e' to '{md5}0375aad9b9f3905d3c545b500e871aca'
Info: Creating state file /opt/puppetlabs/puppet/cache/state/state.yaml
Notice: Applied catalog in 0.13 seconds

Installation of Puppet Enterprise

First, download the ubuntu-<version and arch>.tar.gz tarball and its GPG signature file to the Ubuntu VM.

Second, we import the Puppet public key

$ wget -O - https://downloads.puppetlabs.com/puppet-gpg-signing-key.pub | gpg --import

We will see output such as

--2019-02-03 14:02:54--  https://downloads.puppetlabs.com/puppet-gpg-signing-key.pub
Resolving downloads.puppetlabs.com
(downloads.puppetlabs.com)... 2600:9000:201a:b800:10:d91b:7380:93a1
, 2600:9000:201a:800:10:d91b:7380:93a1, 2600:9000:201a:be00:10:d91b:7380:93a1, ...
Connecting to downloads.puppetlabs.com (downloads.puppetlabs.com)
|2600:9000:201a:b800:10:d91b:7380:93a1|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3139 (3.1K) [binary/octet-stream]
Saving to: ‘STDOUT’

-                   100%[===================>]   3.07K  --.-KB/s    in 0s

2019-02-03 14:02:54 (618 MB/s) - written to stdout [3139/3139]

gpg: key 7F438280EF8D349F: "Puppet, Inc. Release Key
(Puppet, Inc. Release Key) <release@puppet.com>" not changed
gpg: Total number processed: 1
gpg:              unchanged: 1

Third, we print the fingerprint of the key we use

$ gpg --fingerprint 0x7F438280EF8D349F

A successful run produces output such as

pub   rsa4096 2016-08-18 [SC] [expires: 2021-08-17]
      6F6B 1550 9CF8 E59E 6E46  9F32 7F43 8280 EF8D 349F
uid           [ unknown] Puppet, Inc. Release Key
(Puppet, Inc. Release Key) <release@puppet.com>
sub   rsa4096 2016-08-18 [E] [expires: 2021-08-17]
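
Printing the fingerprint is only useful if it is compared with a trusted value. The following Python sketch shows one way to make that comparison independent of spacing. The function name normalize is hypothetical, and the string in published simply repeats the fingerprint from the output above; in practice it should be copied from a source we trust.

```python
def normalize(fingerprint):
    """Remove all whitespace and upper-case a fingerprint for comparison."""
    return "".join(fingerprint.split()).upper()


# Fingerprint as printed by gpg in the output above.
printed = "6F6B 1550 9CF8 E59E 6E46  9F32 7F43 8280 EF8D 349F"
# Value to compare against (placeholder: copy it from a trusted source).
published = "6F6B15509CF8E59E6E469F327F438280EF8D349F"

matches = normalize(printed) == normalize(published)
key_id = normalize(printed)[-16:]   # the long key ID is the last 16 hex digits
print(matches, key_id)  # True 7F438280EF8D349F
```

The key ID derived here is the same value that was passed to gpg --fingerprint.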

Fourth, we verify the release signature of the downloaded package

$ gpg --verify puppet-enterprise-VERSION-PLATFORM.tar.gz.asc

A successful verification produces output such as

gpg: assuming signed data in 'puppet-enterprise-2019.0.2-ubuntu-18.04-amd64.tar.gz'
gpg: Signature made Fri 25 Jan 2019 02:03:23 PM EST
gpg:                using RSA key 7F438280EF8D349F
gpg: Good signature from "Puppet, Inc. Release Key
(Puppet, Inc. Release Key) <release@puppet.com>" [unknown]
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: 6F6B 1550 9CF8 E59E 6E46  9F32 7F43 8280 EF8D 349

Next, we need to unpack the installation tarball. Store the path of the tarball in the $TARBALL variable. This variable will be used in our installation.

$ export TARBALL=<path of tarball file>

Then, we extract the tarball

$ tar -xf $TARBALL

Next, we run the installer from the installer directory

$ sudo ./puppet-enterprise-installer

This will ask us to choose an installation method: express, text-mode, or graphical-mode installation

~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
$ sudo ./puppet-enterprise-installer
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
=============================================================
    Puppet Enterprise Installer
=============================================================

## Installer analytics are enabled by default.
## To disable, set the DISABLE_ANALYTICS environment variable and rerun
this script.
For example, "sudo DISABLE_ANALYTICS=1 ./puppet-enterprise-installer".
## If puppet_enterprise::send_analytics_data is set to false in your
existing pe.conf, this is not necessary and analytics will be disabled.

Puppet Enterprise offers three different methods of installation.

[1] Express Installation (Recommended)

This method will install PE and provide you with a link at the end
of the installation to reset your PE console admin password

Make sure to click on the link and reset your password before proceeding
to use PE

[2] Text-mode Install

This method will open your EDITOR (vi) with a PE config file (pe.conf)
for you to edit before you proceed with installation.

The pe.conf file is a HOCON formatted file that declares parameters
and values needed to install and configure PE.
We recommend that you review it carefully before proceeding.

[3] Graphical-mode Install

This method will install and configure a temporary webserver to walk
you through the various configuration options.

NOTE: This method requires you to be able to access port 3000 on this
machine from your desktop web browser.

=============================================================

 How to proceed? [1]:

-------------------------------------------------------------------

Press 3 for the web-based graphical-mode install.

When successful, we will see output such as

## We're preparing the Web Installer...

2019-02-02T20:01:39.677-05:00 Running command:
mkdir -p /opt/puppetlabs/puppet/share/installer/installer
2019-02-02T20:01:39.685-05:00 Running command:
cp -pR /home/ritesh/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64/*
/opt/puppetlabs/puppet/share/installer/installer/

## Go to https://<localhost>:3000 in your browser to continue installation.

By default, the Puppet Enterprise web installer uses port 3000. Make sure that the firewall allows communication on port 3000

$ sudo ufw allow 3000

Next, go to the URL https://localhost:3000 to complete the installation.

Click on the Get Started button.

Choose to install on this server.

Enter <mypserver> as the DNS name. This is our Puppet server name. It can also be configured in the config file.

Enter the console admin password.

Click Continue.

We will get the Confirm the Plan screen with the following information

The Puppet master component
Hostname
ritesh-ubuntu-pe
DNS aliases
<mypserver>

Click Continue and verify the installer validation screen.

Click the Deploy Now button.

Puppet Enterprise will be installed and will display a message on screen

Puppet agent ran successfully

Log in to the console with the admin password that was set earlier and click on the Nodes link to manage nodes.

Installing Puppet Enterprise as Text mode monolithic installation

$ sudo ./puppet-enterprise-installer

Enter 2 at the How to proceed prompt for the text-mode monolithic installation. The following message will be displayed if successful.

2019-02-02T22:08:12.662-05:00 - [Notice]: Applied catalog in 339.28 seconds
2019-02-02T22:08:13.856-05:00 - [Notice]:
Sent analytics: pe_installer - install_finish - succeeded
* /opt/puppetlabs/puppet/bin/puppet infrastructure configure
--detailed-exitcodes --environmentpath /opt/puppetlabs/server/data/environments
--environment enterprise --no-noop --install=2019.0.2 --install-method='repair'
* returned: 2

## Puppet Enterprise configuration complete!


Documentation: https://puppet.com/docs/pe/2019.0/pe_user_guide.html
Release notes: https://puppet.com/docs/pe/2019.0/pe_release_notes.html

If this is a monolithic configuration, run 'puppet agent -t' to complete the
setup of this system.

If this is a split configuration, install or upgrade the remaining PE components,
and then run puppet agent -t on the Puppet master, PuppetDB, and PE console,
in that order.
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64
2019-02-02T22:08:14.805-05:00 Running command: /opt/puppetlabs/puppet/bin/puppet
agent --enable
~/pe/puppet-enterprise-2019.0.2-ubuntu-18.04-amd64$

This is called a monolithic installation because all components of Puppet Enterprise, such as the Puppet master, PuppetDB, and the console, are installed on a single node. This installation type is easy to set up, and troubleshooting errors and upgrading the infrastructure are simple. It can support an infrastructure of up to 20,000 managed nodes, and compile master nodes can be added as the network grows. This is the recommended installation type for small to mid-size organizations [2].

In text mode, the pe.conf configuration file is opened in an editor so that we can set its values. This file contains the parameters and values for installing, upgrading, and configuring Puppet.

Some important parameters that can be specified in the pe.conf file are

console_admin_password
puppet_enterprise::console_host
puppet_enterprise::puppetdb_host
puppet_enterprise::puppetdb_database_name
puppet_enterprise::puppetdb_database_user

Lastly, we run Puppet after the installation is complete

$ puppet agent -t

A text-mode split installation is performed for large networks. Compared to a monolithic installation, the split installation type can manage a large infrastructure of more than 20,000 nodes. In this type of installation, the different components of Puppet Enterprise (master, PuppetDB, and console) are installed on different nodes. This installation type is recommended for organizations with large infrastructure needs [3].

In this type of installation, we need to install the components in a specific order: first the master, then PuppetDB, followed by the console.

Puppet Enterprise master and agent settings can be configured in the puppet.conf file. Most configuration settings of Puppet Enterprise components, such as the master, the agent, and the security certificates, are specified in this file.

Config section of Agent Node

[main]

certname = your-domain-name.com
server = puppetserver
environment = testing
runinterval = 4h

Config section of Master Node

[main]

certname = your-domain-name.com
server = puppetserver
environment = testing
runinterval = 4h
strict_variables = true

[master]

dns_alt_names = puppetserver,puppet,your-domain-name.com
reports = puppetdb
storeconfigs_backend = puppetdb
storeconfigs = true
environment_timeout = unlimited

Comment lines, setting lines, and setting variables are the main components of a Puppet configuration file. Comment lines start with a hash character (#). A setting line consists of the name of the setting, an equals sign, and the value of the setting. A setting value generally consists of one word, but multiple words can be specified in rare cases [4].
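
The line types described above can be made concrete with a short Python sketch that reads setting lines, skips comments, and converts a duration such as the runinterval of 4h into seconds. The helper names read_settings and to_seconds are hypothetical, only the unit suffixes s, m, h, and d are handled here, and section headers are ignored, so all settings end up in one flat dictionary.

```python
UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value):
    """Convert a duration such as '4h' or '1800' to seconds."""
    value = value.strip()
    if value[-1].isdigit():
        return int(value)                     # a bare number is taken as seconds
    return int(value[:-1]) * UNITS[value[-1]]


def read_settings(text):
    """Collect 'name = value' lines; skip comments, blanks, and section headers."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip()
    return settings


sample = """\
# agent settings
[main]
server = puppetserver
environment = testing
runinterval = 4h
"""

settings = read_settings(sample)
interval = to_seconds(settings["runinterval"])
print(settings["environment"], interval)  # testing 14400
```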

References

[1] Edureka, “Puppet tutorial – devops tool for configuration management.” Web Page, May-2017 [Online]. Available: https://www.edureka.co/blog/videos/puppet-tutorial/

[2] Puppet, “Text mode installation: Monolithic.” Web Page, Nov-2017 [Online]. Available: https://puppet.com/docs/pe/2017.1/install_text_mode_mono.html

[3] Puppet, “Text mode installation : Split.” Web Page, Nov-2017 [Online]. Available: https://puppet.com/docs/pe/2017.1/install_text_mode_split.html

[4] Puppet, “Config files: The main config files.” Web Page, Apr-2014 [Online]. Available: https://puppet.com/docs/puppet/5.3/config_file_main.html

5 - Travis

Travis CI is a continuous integration tool that is often used as part of DevOps development. It is a hosted service that enables users to test their projects on GitHub.

Once Travis is activated for a GitHub project, the developers can place a .travis.yml file in the project root. Upon check-in, the Travis configuration file is interpreted and the commands specified in it are executed.

In fact, this book also has a travis file that is located at

Please inspect it, as we will use it to illustrate some concepts. Unfortunately, Travis does not use an up-to-date operating system such as Ubuntu 18.04, so it contains outdated libraries. Although we could use containers, we have elected to update the operating system ourselves as needed.

This is done in the install phase, which in our case installs a new version of pandoc as well as some additional libraries that we use.

In the env section, we specify where our executables can be found by setting the PATH variable.

The last portion of our example file specifies the script that is executed after the install phase has completed. As our installation contains convenient and sophisticated makefiles, the script is very simple: it executes the appropriate make command in the corresponding directories.

Exercises

E.travis.1:

Develop an alternative travis file that uses a preconfigured container for Ubuntu 18.04.

E.travis.2:

Develop a travis file that checks our books on multiple operating systems, such as macOS and Ubuntu 18.04.

Resources

6 - DevOps with AWS

The AWS cloud offering comes with end-to-end, scalable, and high-performance support for DevOps

The AWS cloud offering comes with end-to-end, scalable, and high-performance support for DevOps, all the way from automatic deployment and monitoring of infrastructure-as-code to our cloud application code. AWS provides various DevOps tools to make deployment and support automation as simple as possible.

AWS DevOps Tools

The following is a list of DevOps tools for CI/CD workflows.

AWS DevOps Tool | Description
CodeStar | AWS CodeStar provides a unified UI to enable simpler deployment automation.
CodePipeline | CI/CD service for faster and more reliable application and infrastructure updates.
CodeBuild | Fully managed build service that compiles, tests, and creates software packages that are ready to deploy.
CodeDeploy | Deployment automation tool to deploy to on-premises servers and EC2 instances in the cloud with near-zero downtime during application deployments.

Infrastructure Automation

AWS provides services that make microservices easily deployable onto containers and serverless platforms.

AWS DevOps Infrastructure Tool | Description
Elastic Container Service | Highly scalable container management service.
CodePipeline | CI/CD service for faster and more reliable application and infrastructure updates.
AWS Lambda | Serverless computing using the Function-as-a-Service (FaaS) methodology.
AWS CloudFormation | Tool to create and manage related AWS resources.
AWS OpsWorks | Server configuration management tool.

Monitoring and Logging

AWS DevOps Monitoring Tool | Description
Amazon CloudWatch | Tool to monitor AWS resources and cloud applications, to collect and track metrics and logs, and to set alarms.
AWS X-Ray | Allows developers to analyze and troubleshoot performance issues of their cloud applications and microservices.

For more information, please visit Amazon AWS [1].

References

[1] Amazon AWS, DevOps and AWS. Amazon, 2019 [Online]. Available: https://aws.amazon.com/devops/

7 - DevOps with Azure Monitor

Microsoft provides a unified tool called Azure Monitor for end-to-end monitoring of the infrastructure and deployed applications.

Microsoft provides a unified tool called Azure Monitor for end-to-end monitoring of the infrastructure and deployed applications. Azure Monitor can greatly help DevOps teams by proactively and reactively monitoring applications for bug tracking and health checks, and by providing metrics that can hint at various scalability aspects.

Figure 1: Azure Monitor [1]

Azure Monitor accommodates applications developed in various programming languages, including .NET, Java, Node.js, Python, and others. With the Azure Application Insights telemetry API incorporated into the applications, Azure Monitor can provide more detailed metrics and analytics around specific tracking needs such as usage and bugs.

Azure Monitor can help us track the health, performance, and scalability issues of the infrastructure (VMs, containers, storage, network, and all Azure services) by automatically providing various platform metrics as well as activity and diagnostic logs.

Azure Monitor provides programmatic access to the activity and diagnostic logs through PowerShell scripts. It also allows querying them with powerful query tools for advanced in-depth analysis and reporting.

Azure Monitor proactively monitors and notifies us of critical conditions, such as reaching quota limits, abnormal usage, and failed health checks. It also gives recommendations and attempts to correct some of these conditions.

Azure Monitor dashboards allow us to visualize various aspects of the data, such as metrics, logs, and usage patterns, in tabular and graphical widgets.

Azure Monitor also facilitates closer monitoring of microservices if they are provided through the Azure serverless Function-as-a-Service offering.

For more information, please visit Microsoft Azure Website [1].

References

[1] Microsoft Azure, Azure Monitor Overview. Microsoft, 2018 [Online]. Available: https://docs.microsoft.com/en-us/azure/azure-monitor/overview