
Puppet

Configuration and use of the Puppet configuration management tool.

Introduction to Puppet
Author: lab42 - Last updated: 2009-03-24 15:05:53 - Created: 2009-03-24 15:01:07
Infobox type: DESCRIPTION - Skill: 4 - ADVANCED

In the Unix world it is often said that a good system administrator tends to automate every activity that has to be repeated more than once.
This is typically done with shell scripts and custom glue around available tools, but that approach does not scale well to large infrastructures with a distributed sysadmin force.
Different Linux distributions and Unix dialects ship various custom tools to administer systems; sometimes these tools provide automation facilities and management of a large number of servers.
Puppet is a systems management and configuration tool that can be used to manage and automate system administration on a variety of *nix servers (successful installations have currently been reported on RedHat/CentOS/Debian/SUSE/Gentoo Linux as well as Solaris, Darwin and FreeBSD).
In the Open Source world there are a few alternatives such as Cfengine, Bcfg2 and Chef (the latter a recent derivation of Puppet), and among commercial products you can somehow relate Puppet to BladeLogic or OpsWare Server Automation System (now HP); unlike Puppet they can also manage Windows servers and network equipment, at a considerable cost and with questionable efficiency.

Puppet's key features are:
- management of typical administration items (packages, services, users, cron jobs, configuration files...) on different systems, using a single generic, platform-independent language (see the sketch after this list)
- a highly scalable client/server architecture: you can manage from one to thousands of servers
- recreating the same system from its Puppet manifests is a repeatable and predictable activity
- written in Ruby, Puppet is Open Source and can easily be enhanced and adapted to custom needs
- a modular architecture that permits user enhancements
- logic and content of distributed files can be based on local, automatically gathered server "facts"
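
A minimal sketch of this platform independence (resource names and values are illustrative, not taken from a real setup): the same declaration is rendered in the proper native form on any supported system.

user { "monitor":
        ensure => present,
        shell  => "/bin/bash",
}

cron { "check_disk_space":
        command => "/usr/local/bin/check_disk.sh",
        user    => "root",
        hour    => 4,
        minute  => 0,
}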

Elements
On client machines you just install the client package (yum install puppet, apt-get install puppet); the package provides a service, /etc/init.d/puppetd, that works on the configuration file, usually /etc/puppet/puppet.conf (on Puppet versions before 0.24.x there were different configuration files according to roles: /etc/puppet/puppetd.conf, /etc/puppet/puppetmasterd.conf...).
By default the puppetd service polls the server configured in /etc/puppet/puppet.conf (with a line like server=10.42.0.10) every 30 minutes, or automatically looks for a host named puppet (you obtain the same result with a line like "10.42.0.10 puppet" in /etc/hosts): if it finds a configuration set defined for the local node, it pulls and applies it.
Puppetd also requires the facter package, which provides a customizable tool (facter itself) that collects local information (hostname, MAC addresses, operating system and so on) that can be used as variables in Puppet configurations.
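
A minimal client configuration can therefore be as small as the following sketch (the server address is illustrative):

# /etc/puppet/puppet.conf
[main]
        server = 10.42.0.10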

On the server system (which can be a client of itself) you have to install the puppet-server package. This provides the /etc/init.d/puppetmasterd service, which listens on TCP port 8140 and uses the same /etc/puppet/puppet.conf as its configuration file.
On the Puppet server you can begin to write your Puppet configurations (in "manifest" files) by editing /etc/puppet/manifests/site.pp (or whatever is configured as manifest=/path/to/first/manifest/file in /etc/puppet/puppet.conf).
From this file you can import other manifests that describe the whole Puppet configuration of your infrastructure.
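
A first site.pp can stay very small and just delegate to other manifests; a minimal sketch (file names are illustrative) could be:

# /etc/puppet/manifests/site.pp
import "nodes"          # loads /etc/puppet/manifests/nodes.pp with the node definitions
import "classes/*"      # loads every manifest under /etc/puppet/manifests/classes/
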
Communication between client and server is secured by SSL X.509 certificates. By default the puppetmaster is also the CA that signs the server's and the clients' certificates.
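
On its first connection each client submits a certificate request that has to be signed on the puppetmaster, typically with the puppetca tool (the hostname below is illustrative):

puppetca --list                      # show pending certificate requests
puppetca --sign www1.example42.com   # sign the certificate of the named client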

Puppet Infrastructure Design Guidelines
Author: lab42 - Last updated: 2009-03-24 14:50:56 - Created: 2009-03-24 14:34:10
Infobox type: WHITE PAPER - Skill: 5 - SENIOR

Puppet is a powerful configuration management tool. Here we define some good-practice guidelines useful for deploying a mid-to-large scale Puppet installation.

Designing a Puppet infrastructure is a matter of knowledge, method, contingency and a certain amount of imagination.
First of all you must know Puppet's logic and its main language features. Then you should define a general method to manage what is common and what differs among the configurations you apply to your hosts; this mostly depends on your own infrastructure and needs. Finally you can add a bit of creativity to handle particular situations and singularities.
As usual in the Unix world there are different ways to achieve the desired results, and there is no unique solution or recommendation valid for every case. Still, we try to define here different scenarios and the relevant "good practices", well aware that there might be totally different, and still good, practices to handle the same cases.

Some preliminary notes:    
- Here by "role" we mean the function of a host. Defining a role makes sense when there are at least 2 nodes having the same role.
It can be an arbitrary string, such as "webserver", and should be shared by all the hosts that run exactly the same services, where configurations generally tend to be similar and differ only in details such as the local hostname, IP and similar.
For example a battery of frontend web servers can share the same role (ie role: "webserver"); they can be balanced by a couple of load balancers in HA (ie role: "loadbalancer"), use a backend database cluster (ie role: "database"), be monitored by one or more monitoring hosts ("monitor"), send syslog messages to one or more syslog servers ("syslog") and so on.
It's worth underlining that if you use the concept of role it's better to always use roles, even in the cases where a role applies to a single host.
- A "zone" can generally be seen as a separate network. In different zones you can define variables for the parameters that change from zone to zone: for example the network IP/subnet and the default gateway, but also the DNS/NTP/syslog/whatever server that all nodes in the same zone share. A zone can also identify development / testing / staging / production environments, eventually divided into different sub-zones if each of them spans different networks.
- The general logic is that every node (host) inherits a more general node (more precisely a (sub)zone, which could in turn inherit a "wider" zone) and includes a single role (more precisely a class defining the role).
- The examples here are based on a module-based logic, as defined in Module Organisation.

The practices described here have been applied successfully in different companies, ranging from a few nodes to, in the biggest case, about 200 nodes spread over different roles (more than 20) and different zones (about 10). They should apply seamlessly to wider installations, where the number of nodes could reach several hundred, sharing dozens of roles and zones.
We will not address here the issues of planning a distributed and redundant puppetmaster infrastructure, the delegation of editing permissions to different groups, or how to cope with testing/production Puppet configurations (but we will cover the case of an infrastructure with development/testing/production nodes).
We'll start from simple cases and then move on to more complex scenarios.


Very simple infrastructure: Few nodes, no roles, no zones    

If you have few nodes to manage, all sharing the same network and with no need to define roles, the logic is simple and can be reduced to defining nodes in a way similar to this:

node basenode {    
        $puppet_server = "10.42.0.10"    
        $local_network = "10.42.0.0/24"    

        $syslog_server = "10.42.0.11"    
        $ntp_server = "10.42.0.12"    
}    

node 'www.example42.com' inherits basenode {    
        include general    
        include httpd::php    
        include mysql::server    
}
    

Note that in basenode you can define variables used in the templates of your classes; these variables can be overridden at the host node level to manage exceptions. For example:

node 'ntp.example42.com' inherits basenode {    
        $ntp_server = "0.pool.ntp.org"    

        include general    
}
    

Note that it is important to declare variables BEFORE including the classes that use them.
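
To see how such a variable reaches a configuration file, a hypothetical template MODULEDIR/ntp/templates/ntp.conf.erb used by the ntp class could reference $ntp_server like this (the surrounding directives are illustrative):

# ntp.conf - managed by Puppet
server <%= ntp_server %>
driftfile /var/lib/ntp/drift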

It's a good practice to define a class that provides general configurations applied to every node. This class should just include all the common classes. Something like:    

class general {    
        include yum    
        include hosts    
        include puppet    
        include iptables    
        include sysctl    
        include nrpe    
        include ntp    
        include syslog    
}
    

In a simple environment you may decide to prefer sourcing static files instead of templates, since their content is not likely to change within your infrastructure.
A syslog class, for example, can be:    

class syslog {    
        package {    
            "syslogd":    
                ensure  => present,    
                name    => $operatingsystem ? {    
                        default => "sysklogd",    
                        },    
        }    

        file {    
            "syslog.conf":    
                owner   => "root",    
                group    => "root",    
                mode    => "640",    
                require  => Package["syslogd"],    
                path     => $operatingsystem ? {    
                           default => "/etc/syslog.conf",    
                           },    
                ## If you want to use a template:    
                content => template("syslog/syslog.conf.erb"),    

                ## If you want to source a static file:    
                ## source => "puppet://$server/syslog/syslog.conf",    
        }

        service {    
            "syslog":    
                enable    => "true",    
                ensure    => "running",    
                hasstatus => "true",    
                require   => File["syslog.conf"],    
                subscribe => File["syslog.conf"],    
                name => $operatingsystem ? {    
                        default => "syslog",    
                        },    
        }    
}
    

In this case you can define the content of your syslog.conf either in the template MODULEDIR/syslog/templates/syslog.conf.erb or in the static file MODULEDIR/syslog/files/syslog.conf; the two options are of course mutually exclusive.
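
For reference, with the module organisation assumed here each module keeps its manifests, templates and files in a fixed layout, so for the syslog module:

MODULEDIR/syslog/manifests/init.pp          # the syslog class above
MODULEDIR/syslog/templates/syslog.conf.erb  # used by content => template(...)
MODULEDIR/syslog/files/syslog.conf          # used by source => puppet://$server/syslog/...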


Simple infrastructure with roles  

If you have various nodes with a similar function it's worth considering the use of roles (note that the concept of role is not intrinsic to Puppet but just an arbitrary way to summarize functions), shared by different nodes. Something like:

node 'www1.example42.com' inherits basenode {    
        include role_webserver # (the role_ prefix is arbitrary and not strictly necessary)    
}    
node 'www2.example42.com' inherits basenode {    
        include role_webserver    
}    
node 'www3.example42.com' inherits basenode {    
        include role_webserver    
}    
node 'lb1.example42.com' inherits basenode {    
        include role_loadbalancer    
}    
node 'lb2.example42.com' inherits basenode {    
        include role_loadbalancer    
}    
    
You then define roles as normal classes, with something like:

class role_webserver {    
        $role = "webserver"    
        include general    
        include httpd::php    
}    
class role_loadbalancer {    
        $role = "loadbalancer"    
        include general    
        include lvs    
}
    

Note the definition of the $role variable at the beginning of the class.
It's recommended to define such a variable because it can be useful in different situations where you must define totally different configurations according to the role of the host.
For example, iptables rules can be crafted to be the same for all the nodes with the same role:

class iptables {    
        service {    
            "iptables":    
                name => $operatingsystem ? {    
                        default => "iptables",    
                        },    
                ensure => running,    
                enable => true,    
                hasrestart => false,    
                restart => $operatingsystem ? {
                        default => "iptables-restore < /etc/sysconfig/iptables",
                        },
                hasstatus => true,
                subscribe => File["iptables"],
        }    

        file {        
            "iptables":    
                mode => 600, owner => root, group => root,    
                ensure => present,    
                path => $operatingsystem ? {
                            default => "/etc/sysconfig/iptables",    
                        },    
                source => [ "puppet://$server/iptables/iptables-$role" , "puppet://$server/iptables/iptables" ],    
        }    
}
    

Here you can define the rules for webservers in MODULEDIR/iptables/files/iptables-webserver, the rules for load balancers in MODULEDIR/iptables/files/iptables-loadbalancer and a default ruleset, applied if no role-specific file has been defined, in MODULEDIR/iptables/files/iptables.
You can easily manage host-based exceptions by changing the source definition into something like:

source => [ "puppet://$server/iptables/iptables-$hostname" , "puppet://$server/iptables/iptables-$role" , "puppet://$server/iptables/iptables" ],    

and then, where necessary, creating a file like MODULEDIR/iptables/files/iptables-lb1 to apply specific settings to the host lb1.

Another way to use a variable like $role is directly in templates. You can change the above line into:

content => template("iptables/iptables.erb"),    

and create a MODULEDIR/iptables/templates/iptables.erb with something like:    

*filter    
:INPUT DROP [0:0]    
:FORWARD DROP [0:0]    
:OUTPUT DROP [0:0]    
-A INPUT -i lo -j ACCEPT    
-A INPUT -p icmp -j ACCEPT    
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT    
# SSH allowed only from management console    
-A INPUT -s 10.42.0.200 -j ACCEPT    

# Role specific settings    
<% if role=="webserver" %>    
-A INPUT -p tcp --dport 80 -j ACCEPT    
-A INPUT -p tcp --dport 443 -j ACCEPT    
<% end %>    

<% if role=="dbserver" %>    
-A INPUT -s 10.42.0.0/24 -p tcp --dport 3306 -j ACCEPT    
<% end %>    

-A INPUT -m pkttype --pkt-type UNICAST -j LOG --log-prefix "[INPUT DROP] : "    
-A FORWARD -j LOG --log-prefix "[FORWARD DROP] : "    
-A OUTPUT -m state --state NEW,RELATED,ESTABLISHED -j ACCEPT    
-A OUTPUT -m pkttype --pkt-type UNICAST -j LOG --log-prefix "[OUTPUT DROP] : "    
COMMIT
    


Infrastructure with different roles and zones  

More complex scenarios can involve the presence of several nodes (scaling up to hundreds) using different roles and placed in different networks with different functions (ie: development/testing/production...).
In these cases it's recommended to work on node inheritance, managing the relevant variables at different levels according to custom needs. For example:

node basenode {    
        $puppet_server = "10.42.0.10"    
        $syslog_server = "10.42.0.11"    
        $ntp_server = "10.42.0.12"    
}    

node devel inherits basenode {    
        $local_network = "192.168.0.0/24"    
        $syslog_server = "192.168.0.11"    
        $zone = "devel"    
}    

node test inherits basenode {    
        $local_network = "10.42.1.0/24"    
        $syslog_server = "10.42.1.11"    
        $zone = "test"
}    

node prod inherits basenode {    
        $local_network = "10.42.0.0/24"    
        $zone = "prod"    
}    

node 'www1.example42.com' inherits prod {    
        include role_webserver    
}    

node 'www1.example42.devel' inherits devel {    
        include role_webserver    
}
    

A similar approach leaves you free to define per-zone settings while keeping the possibility to override them at more specific levels.
The inheritance tree can have more intermediate nodes, according to your own infrastructure, but to avoid headaches and overcomplexity it's important that each host has a single and linear inheritance chain (ie: node inherits subzone inherits zone inherits basenode), as sketched below.
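
A minimal sketch of such a chain (node names and values are illustrative):

node zone_prod inherits basenode {
        $zone = "prod"
}
node subzone_prod_dmz inherits zone_prod {
        $local_network = "10.42.2.0/24"
}
node 'www3.example42.com' inherits subzone_prod_dmz {
        include role_webserver
}
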
Note also that zones (like roles, these are not an internal Puppet concept) can be related to IP networks but also to functional levels (prod/test/devel...) or geographical locations (headquarters, branch office...). The use of a $zone variable has the same advantages as the $role variable: it can be used in many different places to manage differences between zones. Another example:

class general {    
        include yum    
        include hosts    
        include puppet    
        include iptables    
        include sysctl    
        include nrpe    
        include ntp    
        include syslog    

        case $zone  {    
            prod: { include hardening }    
            test: { include hardening }    
            default:  {  }    
        }    
}
    

So, for each node, you have 2 main characterizations:
- The zone (network or environment) where it lives (inherited from a higher level node)
- The role (function) it has (included as a class)
These should be enough to cover many different scenarios of varying complexity, meeting both the need for high-level standardization and for host-level characterization.

The guidelines defined here are being applied to the Example42 Puppet Infrastructure (a sample infrastructure that can be used as a starting point for customization) by Lab42. Regards and credits to Francesco Crippa of Byte-Code for the initial architectural approach.
