Change the AWS account to launch instances with Chef Test Kitchen

The Chef SDK contains Test Kitchen, that can launch server instances to test your Chef cookbooks. Test Kitchen uses the “chef_zero” provisioner to use your workstation as the virtual Chef server.

To switch Test Kitchen to launch instances in another AWS account

  1. In the .kitchen.yml file update the
    1. availability_zone
    2. subnet_id
    3. aws_ssh_key_id
    4. security_group_ids. Make sure the management ports are open:
      1. for Linux: port 22 for SSH, or
      2. for Windows: port 5985 – 5986 for WinRM and port 3389 for Remote Desktop
    5. ssh_key
  2. In the default.rb attribute file update the
    1. …[‘aws_profile’]
  3. In the ~/.aws/credentials file update the
    1. [default] section

Test Chef cookbooks locally on a virtual workstation

When you use a virtual machine to write your Chef cookbooks you may want to test them locally with Vagrant.

This nested virtual machine cannot use a 64 bit operating system, because to run a 64 bit virtual machine, the host computer’s CPU has to provide the CPU Extensions. Currently only physical CPUs can provide the CPU Extensions, no virtualization software can do that.

The Chef Test Kitchen uses Vagrant to launch test instances. In the background Vagrant uses the VirtualBox engine, so your virtual workstation needs to have Vagrant and VirtualBox installed.

To make sure TestKitchen uses the 32 bit operating system first we will set up the 32 bit virtual machine in Vagrant.

Install the necessary software

  1. Install the Chef Development Kit
  2. Install VirtualBox
  3. Install Vagrant

Create the 32 bit virtual machine

  1. Open a terminal window
  2. Initialize the vagrant virtual machine. Vagrant will download the OS image and start the virtual machine.
     vagrant init hashicorp/precise32
  3. Stop the virtual machine
    vagrant halt
  4. Make sure all Vagrant virtual machines are off
     vagrant global-status

    The result looks like this:

    id name provider state directory
    21590cd default virtualbox running C:/Users/interview/Documents
  5. Shut down the running virtual machines. Add the id of the virtual machine to any command to administer it from any directory.
    vagrant halt 21590cd

Set up the Vagrant connection in the Chef cookbook for Test Kitchen

  1. Navigate to cookbook folder
  2. Edit the .kitchen.yml file:Set the network port forwarding in the driver section:
      name: vagrant
        - ["forwarded_port", {guest: 2200, host: 2222}]
        - ["private_network", {ip: ""}]
      - name: hashicorp/precise32
  3. Test the cookbook
    kitchen converge hashicorp-precise32


Ruby tips and tricks

Bang methods

(Exclamation point at the end of the method name)

There are methods that have a permanent or dangerous version. The exclamation point designates to use the dangerous version of the method.

String manipulation

The bang versions of the string manipulation methods (with the exclamation point), modify the string variable in place. Some of these methods are

  • sub
  • gsub
  • reverse
  • sort

The original string ‘gsub’ substitution method:

original = 'My old cat'
new = original.gsub('old', 'new')

original = 'My old cat'
new ='My new cat'

The dangerous (bang) ‘gsub’ method with the exclamation point modifies the original variable:

original = 'My old cat'
original.gsub!('old', 'new')

original = 'My new cat'


The script exits with Kernel::exit, but Kernel::exit! causes an immediate exit, bypassing any exit handlers.


Migrating from Chef Client version 12 to 13

Chef is under heavy development, every new major version introduces new features, and many times changes, deprecates, or removes some commands or options.

Chef Client 13 introduced a new way of handling reboots and Windows scheduled tasks.

reboot resource

In Chef version 12, the “:reboot_now” action continued the execution of the Chef cookbook but after the specified “delay_mins” the instance was forcefully rebooted even if an application was still running. This could unexpectedly interrupt software installations that did not finish during the specified delay.

This is not a problem if the reboot resource was called with the “:request_reboot” action because in that case, the delay before the reboot starts when the cookbook execution has completed.

In Chef version 13, when the reboot command is ultimately sent to the operating system handling the”:reboot_now” or”:request_reboot” action, a Ruby error is raised to stop the execution of the Chef cookbooks. This prevents the execution of further commands, that would be interrupted by the imminent reboot but puts misleading error messages into the “C:/Chef/Cache/chef-stacktrace.out” file.

Generated at …
Chef::Exceptions::Reboot: Rebooting server at a recipe’s request. Details: {:delay_mins=>1, :reason=>”…”, :timestamp=>…, :requested_by=>”…”}

This is unnecessary in the case of the “:request_reboot” action because the instance is only rebooted when the Chef cookbook execution has already been completed. Chef still raises the error in both cases.

In Test Kitchen

In Chef 12 when the instance was rebooted the DevOps engineer had to wait for the instance to reboot and issue the “kitchen converge” command again to execute the cookbook again.

In Chef 13 Test Kitchen executes the “kitchen converge” command if the cookbook execution failed. This simulates the execution of the Chef Client when the instance starts after the reboot. The error raised by the reboot command helps to test cookbooks with multiple reboots without the need of paying attention to every reboot. This new behavior causes a timeout error when Test Kitchen tries to access the instance while it is still rebooting.

To avoid this we need to add lines to the provisioner section of the .kitchen.yml file

  name: chef_zero
  always_update_cookbooks: true
  retry_on_exit_code: # An array of exit codes that can indicate that kitchen should retry the converge command. Defaults to an empty array.
    - 35              # Reboot is scheduled
    - 20              # The Windows system cannot find the device specified.
    - 1               # Generic failure
  wait_for_retry: 120 # Number of seconds to wait between converge attempts. Defaults to 30.
                     # Chef starts another converge after reboot was requested.
                     # If the 'wait_for_retry' (seconds) is shorter than the 'delay_mins', the converge starts before the reboot happens.
                     # 'wait_for_retry' (set in seconds!) has to be greater than any 'delay_mins' passed to reboot
  max_retries: 10 # Number of times to retry the converge before passing along the failed status. Defaults to 1.

It is very important to set the value of the “wait_for_retry” option greater in seconds, than the longer “delay_mins” set in any “reboot” resources of the cookbook, for Test Kitchen to wait with the retry of the cookbook execution until the instance is rebooted. If Test Kitchen does not wait for the reboot, the next “converge” starts before the instance is rebooted, and the execution will be interrupted by the reboot. After the “wait_for_retry” delay Test Kitchen will try to connect to the instance for “max_retries” times waiting “wait_for_retry” seconds between the tries until the box starts to accept WINRM connections again.


windows_task resource

There are two major changes in the windows_task resource.

If the scheduled task already exists and the resource tries to create it again with the same values, an error is thrown

It looks like the error is in the step that tries to parse the start time of the task. Make sure

To change an existing scheduled task

Chef version 13 does not understand the “:change” action. To change an existing scheduled task, use the “:create” action. If you run your cookbooks on multiple Chef versions, you have to make sure those work on the old and the new versions of Chef.

The transition is not so simple, both versions of Chef throw errors when we switch to the :create action.

If you try to modify an existing scheduled task with the “:create” action in Chef 12:

NoMethodError: undefined method `tr’ for nil:NilClass


in Chef 13:

Invalid starttime value.


Use PowerShell to modify existing scheduled tasks.

To delay the execution of the chef-client scheduled task by two hours to give enough time for the application install between reboots:

$taskpath = "\\"
$taskname = "#{chef_client_task_name}"

$tsDelay = New-TimeSpan -Days 0 -Hours 2 -Minutes 0
$newStartTime = (get-date) + $tsDelay

$tsRepetition = New-TimeSpan -Days 0 -Hours 0 -Minutes 30
$tsDuration = ([timeSpan]::maxvalue)

$trigger = New-ScheduledTaskTrigger -Once -At $newStartTime -RepetitionInterval $tsRepetition -RepetitionDuration $tsDuration

$user = (schtasks.exe /query /s localhost /V /FO CSV | ConvertFrom-Csv | Where { $_.TaskName -eq ($taskpath + $taskname) })."Run As User"
$principal = New-ScheduledTaskPrincipal -UserID $user -LogonType S4U -RunLevel Highest

set-scheduledtask -TaskPath $taskpath -TaskName $taskname -Trigger $trigger -Principal $principal

To set the RunAs user of a Scheduled Task

$user = "NEW_USER_NAME"
$password = "NEW_USER_PASSWORD"

$taskpath = "\\"
$taskname = "TASK_NAME"

set-scheduledtask -TaskPath $taskpath -TaskName $taskname -User $user -Password $password




Calling a resource in the Chef recipe

During major Chef Client version upgrades, some instructions need to be changed based on the version of the Chef Client. For example, upgrading from version 12 to version 13, the “windows_task” resource requires a different action to make changes to existing scheduled tasks.

To call a Chef resource in your cookbook, you need another resource to call from. If you execute the notifies command by itself you get the error message:

undefined method `declared_key’ for “….[…]”:String

In this example, we call the “windows_task” resource to delay the execution of the chef-client by two hours to give enough time for the server configuration to complete. In Chef version 12 we need to set the action to “:change”, in Chef version 13, it needs to be “:create”.

if ( Chef::VERSION.to_f >= 13.0 )
  puts "Version 13.0 or higher"

  ruby_block "call delay chef-client" do
    block do
      # Do nothing, "block do" has to be here
    notifies :create, "windows_task[delay chef-client]", :immediately
  puts "Before version 13.0"

  ruby_block "call delay chef-client" do
    block do
      # Do nothing, "block do" has to be here
    notifies :change, "windows_task[delay chef-client]", :immediately

future_time = + 7200               # Offset is in seconds: 2 hour delay
new_start_day = future_time.strftime('%m/%d/%Y')
new_start_time = future_time.strftime('%H:%M')

windows_task 'delay chef-client' do
  task_name 'chef-client'
  start_day new_start_day
  start_time new_start_time
  action :nothing
  guard_interpreter :powershell_script
  only_if "(Get-ScheduledTask | Where-Object {$_.TaskName -eq 'chef-client'}).TaskName.Length -gt 0"


Make decisions in your Chef recipe based on the version of the Chef Client

There are times when you have to make a decision in your Chef recipe, based on the version of the Chef Client installed on the node.

There are two ways to get the version of the installed Chef Client:


To make a decision based on the installed Chef Client version

if ( Chef::VERSION.to_f >= 14.0 )
  puts "Version 14.0 or higher"
  puts "Before version 14.0"


Knife commands

Knife is a ChefDK command line tool to access the Chef server. We use it to upload our cookbooks, environment files, and data bags to the Chef server, and query the server for information on cookbooks and nodes.

Get the list of cookbooks used by a node with version information

knife node show <node-name> -a cookbooks


Test your cookbook in Chef Test Kitchen against multiple version of the Chef Client

In large environments, during the Chef Client version change, some older servers still run the prior version of the Chef Client, the newly created servers launch with the new version of the Chef Client.

It is very important to test your cookbooks with the old and the new versions of Chef Client.

To specify the Chef Client version for the Test Kitchen “converge” run, add the highlighted three lines to the provisioner section of the .kitchen.yml file:

  name: chef_zero
  product_name: chef          # Install the Chef Client
  product_version: 14.0.0   # Set the Chef Client version, default=latest
  install_strategy: always    # Forces the installation of the specified Chef Client version even if another version is installed

During “converge”, test kitchen will download and install the specified version of Chef Client.


If the installed Chef Client version is newer than the specified version, Test Kitchen downloads the specified version but executes the already installed newer version.


windows_task, ArgumentError: invalid date

When the windows_task resource is called to “create” a Windows Scheduled Task that already exists, an error message is returned. In the past, the “modify” action was responsible for the modification of the scheduled tasks, since Chef 13 the “create” action would update the task if exists.

ArgumentError: …[…] (…:… line …) had an error: ArgumentError: windows_task[…] (… line …) had an error: ArgumentError: invalid date

C:/opscode/chef/embedded/lib/ruby/gems/2.4.0/gems/chef-13.6.4-universal-mingw32/lib/chef/provider/windows_task.rb:507:in `strptime’
C:/opscode/chef/embedded/lib/ruby/gems/2.4.0/gems/chef-13.6.4-universal-mingw32/lib/chef/provider/windows_task.rb:507:in `parse_day’
C:/opscode/chef/embedded/lib/ruby/gems/2.4.0/gems/chef-13.6.4-universal-mingw32/lib/chef/provider/windows_task.rb:239:in `convert_user_date_to_system_date’
C:/opscode/chef/embedded/lib/ruby/gems/2.4.0/gems/chef-13.6.4-universal-mingw32/lib/chef/provider/windows_task.rb:99:in `action_create’

It looks like Chef tries to get a date from the existing scheduled task, and cannot parse it successfully.

As we currently cannot update existing tasks, we need to put a guard into the windows_task resource to prevent the execution if the task exists.

startup_task_name = 'chef-client-at_startup'

windows_task "create #{startup_task_name}" do
  task_name "#{startup_task_name}"
  guard_interpreter :powershell_script
  only_if "(Get-ScheduledTask | Where-Object {$_.TaskName -eq "#{startup_task_name}"}).TaskName.Length -eq 0"