Migrating from Chef Client version 12 to 13

Chef is under heavy development, every new major version introduces new features, and many times changes, deprecates, or removes some commands or options.

Chef Client 13 introduced a new way of handling reboots and Windows scheduled tasks.

reboot resource

In Chef version 12, the “:reboot_now” action continued the execution of the Chef cookbook but after the specified “delay_mins” the instance was forcefully rebooted even if an application was still running. This could unexpectedly interrupt software installations that did not finish during the specified delay.

This is not a problem if the reboot resource was called with the “:request_reboot” action because in that case, the delay before the reboot starts when the cookbook execution has completed.

In Chef version 13, when the reboot command is ultimately sent to the operating system handling the”:reboot_now” or”:request_reboot” action, a Ruby error is raised to stop the execution of the Chef cookbooks. This prevents the execution of further commands, that would be interrupted by the imminent reboot but puts misleading error messages into the “C:/Chef/Cache/chef-stacktrace.out” file.

Generated at …
Chef::Exceptions::Reboot: Rebooting server at a recipe’s request. Details: {:delay_mins=>1, :reason=>”…”, :timestamp=>…, :requested_by=>”…”}

This is unnecessary in the case of the “:request_reboot” action because the instance is only rebooted when the Chef cookbook execution has already been completed. Chef still raises the error in both cases.

In Test Kitchen

In Chef 12 when the instance was rebooted the DevOps engineer had to wait for the instance to reboot and issue the “kitchen converge” command again to execute the cookbook again.

In Chef 13 Test Kitchen executes the “kitchen converge” command if the cookbook execution failed. This simulates the execution of the Chef Client when the instance starts after the reboot. The error raised by the reboot command helps to test cookbooks with multiple reboots without the need of paying attention to every reboot. This new behavior causes a timeout error when Test Kitchen tries to access the instance while it is still rebooting.

To avoid this we need to add lines to the provisioner section of the .kitchen.yml file

provisioner:
  name: chef_zero
  always_update_cookbooks: true
  retry_on_exit_code: # An array of exit codes that can indicate that kitchen should retry the converge command. Defaults to an empty array.
    - 35              # Reboot is scheduled
    - 20              # The Windows system cannot find the device specified.
    - 1               # Generic failure
  wait_for_retry: 120 # Number of seconds to wait between converge attempts. Defaults to 30.
                     # Chef starts another converge after reboot was requested.
                     # If the 'wait_for_retry' (seconds) is shorter than the 'delay_mins', the converge starts before the reboot happens.
                     # 'wait_for_retry' (set in seconds!) has to be greater than any 'delay_mins' passed to reboot
  max_retries: 10 # Number of times to retry the converge before passing along the failed status. Defaults to 1.

It is very important to set the value of the “wait_for_retry” option greater in seconds, than the longer “delay_mins” set in any “reboot” resources of the cookbook, for Test Kitchen to wait with the retry of the cookbook execution until the instance is rebooted. If Test Kitchen does not wait for the reboot, the next “converge” starts before the instance is rebooted, and the execution will be interrupted by the reboot. After the “wait_for_retry” delay Test Kitchen will try to connect to the instance for “max_retries” times waiting “wait_for_retry” seconds between the tries until the box starts to accept WINRM connections again.

 

windows_task resource

There are two major changes in the windows_task resource.

If the scheduled task already exists and the resource tries to create it again with the same values, an error is thrown

It looks like the error is in the step that tries to parse the start time of the task. Make sure

To change an existing scheduled task

Chef version 13 does not understand the “:change” action. To change an existing scheduled task, use the “:create” action. If you run your cookbooks on multiple Chef versions, you have to make sure those work on the old and the new versions of Chef.

The transition is not so simple, both versions of Chef throw errors when we switch to the :create action.

If you try to modify an existing scheduled task with the “:create” action in Chef 12:

NoMethodError: undefined method `tr’ for nil:NilClass

 

in Chef 13:

Invalid starttime value.

 

Use PowerShell to modify existing scheduled tasks.

To delay the execution of the chef-client scheduled task by two hours to give enough time for the application install between reboots:

$taskpath = "\\"
$taskname = "#{chef_client_task_name}"

$tsDelay = New-TimeSpan -Days 0 -Hours 2 -Minutes 0
$newStartTime = (get-date) + $tsDelay

$tsRepetition = New-TimeSpan -Days 0 -Hours 0 -Minutes 30
$tsDuration = ([timeSpan]::maxvalue)

$trigger = New-ScheduledTaskTrigger -Once -At $newStartTime -RepetitionInterval $tsRepetition -RepetitionDuration $tsDuration

$user = (schtasks.exe /query /s localhost /V /FO CSV | ConvertFrom-Csv | Where { $_.TaskName -eq ($taskpath + $taskname) })."Run As User"
$principal = New-ScheduledTaskPrincipal -UserID $user -LogonType S4U -RunLevel Highest

set-scheduledtask -TaskPath $taskpath -TaskName $taskname -Trigger $trigger -Principal $principal

To set the RunAs user of a Scheduled Task

$user = "NEW_USER_NAME"
$password = "NEW_USER_PASSWORD"

$taskpath = "\\"
$taskname = "TASK_NAME"

set-scheduledtask -TaskPath $taskpath -TaskName $taskname -User $user -Password $password

 

 

 

Leave a comment

Your email address will not be published. Required fields are marked *