How does puppet send commands to the OS?

Question

I am new to Puppet, but understand the concepts quite well. Puppet Manifests call Puppet Modules and the Modules perform the actual task.

I am trying to understand what happens at the Puppet Module layer. How does the command actually execute? Taking the example of the following, what commands are actually passed on to the operating system? Also, where is that defined?

package { 'ntp':
  ensure => installed,
}

Answer 1:

Summary: Puppet determines the commands that need to be run, based on the facts of the system and the configuration within Puppet itself.

So, when Puppet compiles the catalog to run on a system, its reasoning looks like the following:

"I need to install a Package resource, called ntp. I am a CentOS system, of the RedHat family. By default on RedHat, I use the yum command. So I need to run yum install ntp"

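The decision in that summary can be sketched in a few lines of Ruby. This is an illustration only, not Puppet's code: the INSTALL_COMMANDS table and the install_command_for helper are hypothetical names, and the table only covers two OS families.

```ruby
# Hypothetical lookup table: OS family fact => install command prefix.
# Illustrative only; Puppet's real providers contain far more logic.
INSTALL_COMMANDS = {
  'RedHat' => %w[yum install],
  'Debian' => %w[apt-get install]
}.freeze

# Build the command line for a package from the system's facts.
def install_command_for(facts, package_name)
  family = facts['osfamily']
  prefix = INSTALL_COMMANDS[family]
  raise ArgumentError, "no known package tool for #{family}" unless prefix

  (prefix + [package_name]).join(' ')
end

facts = { 'operatingsystem' => 'CentOS', 'osfamily' => 'RedHat' }
puts install_command_for(facts, 'ntp')
```

The point of the sketch is that the manifest never names yum: the facts do.
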
Longer Explanation

The way that Puppet knows the commands to run and how to run them is known as the Resource Abstraction Layer.

When it's all boiled down, Puppet is not doing anything magical: the commands that are being run on the system are the same commands that would be run by a human operator.

Maybe Puppet has figured out a clever way to do it, and takes into account obscure bugs and gotchas for the platform you're on, or is raising an error because what you're trying to do contains a spelling mistake or similar.

But eventually, the action has to be performed using the system's own applications and tooling.

That's where the RAL actually comes in. It's the biggest layer of abstraction in Puppet: turning all interactions with the base system into a consistent interface.

In the example you give, packages are fairly simple. The concept of installing a package is (mostly) the same on pretty much every operating system of at least the last two decades:

packagesystemtool keywordforinstall packagename

Generally, the install keyword is install, but there are a few exceptions, such as BSD's pkg, which uses pkg add.

However: the actual attributes that can be managed in that package can vary a lot:

Can you specify the version?
Can you downgrade that version?
If the package is already installed, do you need to specify a different command to upgrade it?

There is also a huge swath of other optional parameters, such as proxy information and error logging level.

The RAL allows the user to define the characteristics of a resource regardless of the implementation in a consistent way:

type { 'title':  
  attribute => 'value',
}

Every resource follows the same syntax:

A resource type (eg. user, package, service, file)
Curly braces to define the resource block.
A title, separated from the body of the resource with a colon
A body consisting of attributes and value pairs

So our package declaration looks like this:

package {'tree':  
  ensure => 'present',
}

The RAL can handle that behavior on every platform that has been defined, and support different package features where available, all in a well-defined way, hidden from the user by default.

The best metaphor I've heard for the RAL is that it is the swan gliding along on the lake:

When you look at a swan on a body of water, it looks elegant and graceful, gliding along. It barely looks like it's working at all.

What's hidden from the eye is the activity going on beneath the water's surface. That swan is kicking its webbed feet, way less gracefully than it looks up top. The actual command Puppet is running is the kicking legs under the water.

Ok, enough background, you're probably asking...

How does it actually work? The RAL splits all resources on the system into two elements:

Types: High-level Models of the valid attributes for a resource
Providers: Platform-specific implementation of a type

This lets you describe resources in a way that can apply to any system. Each resource, regardless of what it is, has one or more providers. Providers are the interface between the underlying OS and the resource types.

Generally, there will be a default provider for a type, but you can specify a specific provider if required.

For a package, the default provider will be the default package provider for a system: yum for RHEL, apt for Debian, pkg for BSD etc. This is determined by a defaultfor declaration in the provider, which is matched against the facts from the system.

For example, the yum provider has the following:

defaultfor :osfamily => :redhat

But you might want to install a pip package, or a gem. For this you would specify the provider, so it would install it with a different command:

package {'tree':  
  ensure   => 'present',
  provider => 'pip', 
}

This would mean we're saying to the RAL: "Hey, I know yum is the default to install a package, but this is a python package I need, so I'm telling you to use pip instead"
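Default selection and explicit override can be sketched together as follows. This is a heavily simplified model, not Puppet's implementation: the Provider struct, the PROVIDERS list and select_provider are hypothetical names, and the only idea borrowed from the real code is that a provider declares the fact values it is the default for.

```ruby
# Hypothetical, simplified model of provider selection.
# Each provider records the fact values it is the default for,
# mirroring the idea behind `defaultfor :osfamily => :redhat`.
Provider = Struct.new(:name, :default_for)

PROVIDERS = [
  Provider.new(:yum, { osfamily: :redhat }),
  Provider.new(:apt, { osfamily: :debian }),
  Provider.new(:pip, {}) # never a default; must be requested explicitly
].freeze

# Return the explicitly requested provider, or else the provider whose
# declared default facts all match the facts of this system.
def select_provider(facts, requested = nil)
  if requested
    found = PROVIDERS.find { |p| p.name == requested.to_sym }
    raise ArgumentError, "unknown provider #{requested}" unless found

    return found
  end

  PROVIDERS.find do |p|
    !p.default_for.empty? && p.default_for.all? { |fact, value| facts[fact] == value }
  end
end

facts = { osfamily: :redhat }
puts select_provider(facts).name        # default for this system
puts select_provider(facts, 'pip').name # explicit override from the manifest
```
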

The most important attributes of a resource type are usually conceptually the same across operating systems, regardless of how the actual implementations differ.

Like we said, most packages will be installed with package installer install packagename

So, the description of a resource can be abstracted away from its implementation.

Puppet uses the RAL to both read and modify the state of resources on a system. Since it's a declarative system, Puppet starts with an understanding of what state a resource should have.

To sync the resource, it uses the RAL to query the current state, compare that against the desired state, and then use the RAL again to make any necessary changes. It uses the tooling to get the current state of the system and then figures out what it needs to do to change that state to the state defined by the resource.

When Puppet applies the catalog containing the resource, it will read the actual state of the resource on the target system, compare the actual state to the desired state, and, if necessary, change the system to enforce the desired state.
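The read, compare, enforce cycle can be sketched like this. FakePackageProvider and sync are hypothetical names used for illustration, and a Hash stands in for the real system so that the example runs anywhere.

```ruby
# Hypothetical stand-in for a provider: it can report the current state
# of a package and change it. A Hash plays the role of the real system.
class FakePackageProvider
  def initialize(installed)
    @installed = installed # e.g. { 'tree' => '1.6.0' }
  end

  # "Getter": what is the state right now?
  def current_state(name)
    @installed.key?(name) ? :present : :absent
  end

  # "Setters": change the system.
  def install(name)
    @installed[name] = 'unknown'
  end

  def uninstall(name)
    @installed.delete(name)
  end
end

# Read the current state, compare it with the desired state,
# and change the system only if the two differ.
def sync(provider, name, desired)
  return :in_sync if provider.current_state(name) == desired

  desired == :present ? provider.install(name) : provider.uninstall(name)
  :changed
end

provider = FakePackageProvider.new({})
puts sync(provider, 'ntp', :present) # first run changes the system
puts sync(provider, 'ntp', :present) # second run finds nothing to do
```

Running sync twice shows why a declarative resource is safe to apply repeatedly: the second pass reads the state, sees it already matches, and does nothing.
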

Let's look at how the RAL will manage this:

We've given the type as package.
The title/name of the package is ntp
I'm running this on a RHEL7 system, so the default provider is yum.
yum is a "child" provider of rpm: it uses the RPM command to check if the package is installed on the system. (This is a lot faster than running "yum info", as it doesn't make any internet calls, and won't fail if a yumrepo is failing)
The install command however, will be yum install

So previously we talked about how Puppet uses the RAL to both read and modify the state of resources on a system.

The "getter" of the RAL is the self.instances method in the provider.

Depending on the resource type, this is generally done in one of two ways:

Read a file on disk, iterate through the lines in a file and turn those into resources
Run a command on the terminal, break the stdout into lines, turn those into hashes which become resources
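The first approach, turning the lines of a file into resources, can be sketched as below. The hosts-style format, the HOSTS_TEXT sample and the parse_host_lines helper are assumptions chosen for illustration; a real provider would read the file from disk and has its own parsing rules.

```ruby
# Sample content in /etc/hosts style. Inlined so the example is
# self-contained; a real provider would read the file from disk.
HOSTS_TEXT = <<~HOSTS
  # comment lines are skipped
  127.0.0.1 localhost
  192.168.0.10 puppet puppetmaster
HOSTS

# Turn each meaningful line into a hash describing one resource.
def parse_host_lines(text)
  lines = text.each_line.map(&:strip)
  lines = lines.reject { |line| line.empty? || line.start_with?('#') }

  lines.map do |line|
    ip, name, *aliases = line.split
    { name: name, ip: ip, host_aliases: aliases }
  end
end

parse_host_lines(HOSTS_TEXT).each { |resource| p resource }
```
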

The rpm instances step goes with the latter. It runs rpm -qa with some given flags to check what packages are on the system:

  def self.instances
    packages = []

    # list out all of the packages
    begin
      execpipe("#{command(:rpm)} -qa #{nosignature} #{nodigest} --qf '#{self::NEVRA_FORMAT}'") { |process|
        # now turn each returned line into a package object
        process.each_line { |line|
          hash = nevra_to_hash(line)
          packages << new(hash) unless hash.empty?
        }
      }
    rescue Puppet::ExecutionFailure
      raise Puppet::Error, "Failed to list packages", $!.backtrace
    end

    packages
  end

So it's running /usr/bin/rpm -qa --nosignature --nodigest --qf '%{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n', then taking the stdout from that command, looping through each line of output, and using the nevra_to_hash method to turn each line into a hash.

self::NEVRA_REGEX  = %r{^(\S+) (\S+) (\S+) (\S+) (\S+)$}  
self::NEVRA_FIELDS = [:name, :epoch, :version, :release, :arch]

  private
  # @param line [String] one line of rpm package query information
  # @return [Hash] of NEVRA_FIELDS strings parsed from package info
  # or an empty hash if we failed to parse
  # @api private
  def self.nevra_to_hash(line)
    line.strip!
    hash = {}

    if match = self::NEVRA_REGEX.match(line)
      self::NEVRA_FIELDS.zip(match.captures) { |f, v| hash[f] = v }
      hash[:provider] = self.name
      hash[:ensure] = "#{hash[:version]}-#{hash[:release]}"
      hash[:ensure].prepend("#{hash[:epoch]}:") if hash[:epoch] != '0'
    else
      Puppet.debug("Failed to match rpm line #{line}")
    end

    return hash
  end

So basically it runs a regex over each line of output, then maps the captured groups onto the given fields.

These hashes become the current state of the resource.
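To see the parsing in isolation, here is a standalone sketch that reuses the regex and field list from the provider code above. The parse_nevra helper is a hypothetical simplification of nevra_to_hash, and the sample line is made up to match the query format.

```ruby
NEVRA_REGEX  = %r{^(\S+) (\S+) (\S+) (\S+) (\S+)$}
NEVRA_FIELDS = [:name, :epoch, :version, :release, :arch].freeze

# Simplified version of nevra_to_hash: returns a hash of package
# attributes, or an empty hash if the line does not match.
def parse_nevra(line)
  match = NEVRA_REGEX.match(line.strip)
  return {} unless match

  hash = NEVRA_FIELDS.zip(match.captures).to_h
  hash[:ensure] = "#{hash[:version]}-#{hash[:release]}"
  # An epoch other than 0 is prefixed to the version, as in the original.
  hash[:ensure] = "#{hash[:epoch]}:#{hash[:ensure]}" if hash[:epoch] != '0'
  hash
end

# Made-up sample line in the "name epoch version release arch" format.
p parse_nevra("ntp 0 4.2.6p5 25.el7.centos x86_64\n")
```
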

We can run --debug to see this in action:

Debug: Prefetching yum resources for package  
Debug: Executing: '/usr/bin/rpm --version'  
Debug: Executing '/usr/bin/rpm -qa --nosignature --nodigest --qf '%{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n''  
Debug: Executing: '/usr/bin/rpm -q ntp --nosignature --nodigest --qf %{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n'  
Debug: Executing: '/usr/bin/rpm -q ntp --nosignature --nodigest --qf %{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n --whatprovides'

So it uses the RAL to fetch the current state. Puppet is doing the following:

Hmm, this is a package resource titled 'ntp' on a RHEL system, so I should use RPM.
Let's get the current state of the RPM packages installed (i.e. the instances method).
ntp isn't here...
So we need ntp to be installed.
The yum provider then specifies the command required to install.

There's a lot of logic here:

  def install
    wanted = @resource[:name]
    error_level = self.class.error_level
    update_command = self.class.update_command
    # If not allowing virtual packages, do a query to ensure a real package exists
    unless @resource.allow_virtual?
      execute([command(:cmd), '-d', '0', '-e', error_level, '-y', install_options, :list, wanted].compact)
    end

    should = @resource.should(:ensure)
    self.debug "Ensuring => #{should}"
    operation = :install

    case should
    when :latest
      current_package = self.query
      if current_package && !current_package[:ensure].to_s.empty?
        operation = update_command
        self.debug "Ensuring latest, so using #{operation}"
      else
        self.debug "Ensuring latest, but package is absent, so using #{:install}"
        operation = :install
      end
      should = nil
    when true, false, Symbol
      # pass
      should = nil
    else
      # Add the package version
      wanted += "-#{should}"
      if wanted.scan(ARCH_REGEX)
        self.debug "Detected Arch argument in package! - Moving arch to end of version string"
        wanted.gsub!(/(.+)(#{ARCH_REGEX})(.+)/,'\1\3\2')
      end

      current_package = self.query
      if current_package
        if rpm_compareEVR(rpm_parse_evr(should), rpm_parse_evr(current_package[:ensure])) < 0
          self.debug "Downgrading package #{@resource[:name]} from version #{current_package[:ensure]} to #{should}"
          operation = :downgrade
        elsif rpm_compareEVR(rpm_parse_evr(should), rpm_parse_evr(current_package[:ensure])) > 0
          self.debug "Upgrading package #{@resource[:name]} from version #{current_package[:ensure]} to #{should}"
          operation = update_command
        end
      end
    end

    # Yum on el-4 and el-5 returns exit status 0 when trying to install a package it doesn't recognize;
    # ensure we capture output to check for errors.
    no_debug = if Facter.value(:operatingsystemmajrelease).to_i > 5 then ["-d", "0"] else [] end
    command = [command(:cmd)] + no_debug + ["-e", error_level, "-y", install_options, operation, wanted].compact
    output = execute(command)

    if output =~ /^No package #{wanted} available\.$/
      raise Puppet::Error, "Could not find package #{wanted}"
    end

    # If a version was specified, query again to see if it is a matching version
    if should
      is = self.query
      raise Puppet::Error, "Could not find package #{self.name}" unless is

      # FIXME: Should we raise an exception even if should == :latest
      # and yum updated us to a version other than @param_hash[:ensure] ?
      vercmp_result = rpm_compareEVR(rpm_parse_evr(should), rpm_parse_evr(is[:ensure]))
      raise Puppet::Error, "Failed to update to version #{should}, got version #{is[:ensure]} instead" if vercmp_result != 0
    end
  end

This is some serious swan leg kicking. There's a lot of logic here to handle the more complex use cases of a package on yum, while making sure it works on the various versions of yum available, including those on RHEL 4 and 5.
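Stripped of the yum specifics, the core decision in that method is which operation to run. The sketch below is a simplification under stated assumptions: compare_versions is a naive stand-in for rpm_compareEVR (it ignores epochs, releases and letters), choose_operation is a hypothetical name, and :update stands in for whatever update_command returns.

```ruby
# Naive dotted-number comparison, a stand-in for Puppet's rpm_compareEVR.
# Real RPM comparison also handles epochs, releases and letters.
def compare_versions(a, b)
  a.split('.').map(&:to_i) <=> b.split('.').map(&:to_i)
end

# Decide which yum operation to use.
# desired: :present, :latest, or a version string
# current: the installed version string, or nil if not installed
def choose_operation(desired, current)
  case desired
  when :latest
    current ? :update : :install
  when Symbol
    :install
  else
    return :install unless current

    case compare_versions(desired, current)
    when -1 then :downgrade
    when 1  then :update
    else :install
    end
  end
end

puts choose_operation(:present, nil)
puts choose_operation(:latest, '1.7.0')
puts choose_operation('1.6.0', '1.7.0')
```
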

The logic is broken down thusly: We haven't specified a version, so we don't need to check what version to install. Simply run yum install tree with the default options specified

Debug: Package[tree](provider=yum): Ensuring => present  
Debug: Executing: '/usr/bin/yum -d 0 -e 0 -y install tree'  
Notice: /Stage[main]/Main/Package[tree]/ensure: created

Ta-dah, installed.
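The command line in that debug output falls out of the argument array built near the end of the install method. A minimal sketch of that step, assuming no install_options were given; build_yum_command is a hypothetical helper and the yum path is taken from the debug output above:

```ruby
# Assemble the argument list the way the install method does:
# optional pieces that are nil disappear when the array is compacted.
def build_yum_command(operation, wanted, error_level: '0', install_options: nil)
  ['/usr/bin/yum', '-d', '0', '-e', error_level, '-y',
   install_options, operation, wanted].compact
end

puts build_yum_command(:install, 'tree').join(' ')
```

Building the command as an array rather than a string means optional arguments can simply be nil, and no shell quoting is needed when the command is executed.
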

Answer 2:

It depends on your flavour of linux.

First, Puppet checks whether the package ntp is installed. If it is not, it will be installed.

CentOS example:

yum list installed ntp

If not installed:

yum install ntp

Debian example:

dpkg -s ntp

If not installed:

apt-get install ntp

This is all handled by the package provider on your Linux of choice.

https://docs.puppet.com/puppet/latest/types/package.html

Source: https://stackoverflow.com/questions/41781030/how-does-puppet-send-commands-to-the-os

Tags

puppet