Migrating from Chef Client version 12 to 13

Chef is under heavy development, every new major version introduces new features, and many times changes, deprecates, or removes some commands or options.

Chef Client 13 introduced a new way of handling reboots and Windows scheduled tasks.

reboot resource

In Chef version 12, the “:reboot_now” action continued the execution of the Chef cookbook but after the specified “delay_mins” the instance was forcefully rebooted even if an application was still running. This could unexpectedly interrupt software installations that did not finish during the specified delay.

This is not a problem if the reboot resource was called with the “:request_reboot” action because in that case, the delay before the reboot starts when the cookbook execution has completed.

In Chef version 13, when the reboot command is ultimately sent to the operating system handling the”:reboot_now” or”:request_reboot” action, a Ruby error is raised to stop the execution of the Chef cookbooks. This prevents the execution of further commands, that would be interrupted by the imminent reboot but puts misleading error messages into the “C:/Chef/Cache/chef-stacktrace.out” file.

Generated at …
Chef::Exceptions::Reboot: Rebooting server at a recipe’s request. Details: {:delay_mins=>1, :reason=>”…”, :timestamp=>…, :requested_by=>”…”}

This is unnecessary in the case of the “:request_reboot” action because the instance is only rebooted when the Chef cookbook execution has already been completed. Chef still raises the error in both cases.

In Test Kitchen

In Chef 12 when the instance was rebooted the DevOps engineer had to wait for the instance to reboot and issue the “kitchen converge” command again to execute the cookbook again.

In Chef 13 Test Kitchen executes the “kitchen converge” command if the cookbook execution failed. This simulates the execution of the Chef Client when the instance starts after the reboot. The error raised by the reboot command helps to test cookbooks with multiple reboots without the need of paying attention to every reboot. This new behavior causes a timeout error when Test Kitchen tries to access the instance while it is still rebooting.

To avoid this we need to add lines to the provisioner section of the .kitchen.yml file

provisioner:
  name: chef_zero
  always_update_cookbooks: true
  retry_on_exit_code: # An array of exit codes that can indicate that kitchen should retry the converge command. Defaults to an empty array.
    - 35              # Reboot is scheduled
    - 20              # The Windows system cannot find the device specified.
    - 1               # Generic failure
  wait_for_retry: 120 # Number of seconds to wait between converge attempts. Defaults to 30.
                     # Chef starts another converge after reboot was requested.
                     # If the 'wait_for_retry' (seconds) is shorter than the 'delay_mins', the converge starts before the reboot happens.
                     # 'wait_for_retry' (set in seconds!) has to be greater than any 'delay_mins' passed to reboot
  max_retries: 10 # Number of times to retry the converge before passing along the failed status. Defaults to 1.

It is very important to set the value of the “wait_for_retry” option greater in seconds, than the longer “delay_mins” set in any “reboot” resources of the cookbook, for Test Kitchen to wait with the retry of the cookbook execution until the instance is rebooted. If Test Kitchen does not wait for the reboot, the next “converge” starts before the instance is rebooted, and the execution will be interrupted by the reboot. After the “wait_for_retry” delay Test Kitchen will try to connect to the instance for “max_retries” times waiting “wait_for_retry” seconds between the tries until the box starts to accept WINRM connections again.

 

windows_task resource

There are two major changes in the windows_task resource.

If the scheduled task already exists and the resource tries to create it again with the same values, an error is thrown

It looks like the error is in the step that tries to parse the start time of the task. Make sure

To change an existing scheduled task

Chef version 13 does not understand the “:change” action. To change an existing scheduled task, use the “:create” action. If you run your cookbooks on multiple Chef versions, you have to make sure those work on the old and the new versions of Chef.

The transition is not so simple, both versions of Chef throw errors when we switch to the :create action.

If you try to modify an existing scheduled task with the “:create” action in Chef 12:

NoMethodError: undefined method `tr’ for nil:NilClass

 

in Chef 13:

Invalid starttime value.

 

Use PowerShell to modify existing scheduled tasks.

To delay the execution of the chef-client scheduled task by two hours to give enough time for the application install between reboots:

$taskpath = "\\"
$taskname = "#{chef_client_task_name}"

$tsDelay = New-TimeSpan -Days 0 -Hours 2 -Minutes 0
$newStartTime = (get-date) + $tsDelay

$tsRepetition = New-TimeSpan -Days 0 -Hours 0 -Minutes 30
$tsDuration = ([timeSpan]::maxvalue)

$trigger = New-ScheduledTaskTrigger -Once -At $newStartTime -RepetitionInterval $tsRepetition -RepetitionDuration $tsDuration

$user = (schtasks.exe /query /s localhost /V /FO CSV | ConvertFrom-Csv | Where { $_.TaskName -eq ($taskpath + $taskname) })."Run As User"
$principal = New-ScheduledTaskPrincipal -UserID $user -LogonType S4U -RunLevel Highest

set-scheduledtask -TaskPath $taskpath -TaskName $taskname -Trigger $trigger -Principal $principal

To set the RunAs user of a Scheduled Task

$user = "NEW_USER_NAME"
$password = "NEW_USER_PASSWORD"

$taskpath = "\\"
$taskname = "TASK_NAME"

set-scheduledtask -TaskPath $taskpath -TaskName $taskname -User $user -Password $password

 

 

 

How to upgrade Jenkins

Jenkins regularly releases new versions. To upgrade Jenkins

On Linux

To install the Generic Java Package (.war)

  1. Write down the current version of Jenkins you have on your server
  2. Download the Jenkins “Generic Java Package (.war)” file from https://jenkins.io/download/to your workstation
  3. Upload the installer to Artifactory or other HTTP repository
  4. SSH into the Jenkins server
  5. Switch to sudo mode
  6. Stop the Jenkins service with
    service jenkins stop
  7. Navigate to the Jenkins directory
    cd /usr/lib/jenkins/
  8. Rename the current Jenkins .war file to include the version, so you can restore it if the new version does not work as expected
    mv jenkins.war jenkins_2.46.3.war
  9. Download the Jenkins war file from Artifactory
    url -O http://THE_IP_ADDRESS_OF_THE REPOSITORY/.../jenkins.war
  10. Start the Jenkins service
    service jenkins start

 

Calling a resource in the Chef recipe

During major Chef Client version upgrades, some instructions need to be changed based on the version of the Chef Client. For example, upgrading from version 12 to version 13, the “windows_task” resource requires a different action to make changes to existing scheduled tasks.

To call a Chef resource in your cookbook, you need another resource to call from. If you execute the notifies command by itself you get the error message:

NoMethodError
————-
undefined method `declared_key’ for “….[…]”:String

In this example, we call the “windows_task” resource to delay the execution of the chef-client by two hours to give enough time for the server configuration to complete. In Chef version 12 we need to set the action to “:change”, in Chef version 13, it needs to be “:create”.

if ( Chef::VERSION.to_f >= 13.0 )
  puts "Version 13.0 or higher"

  ruby_block "call delay chef-client" do
    block do
      # Do nothing, "block do" has to be here
    end
    notifies :create, "windows_task[delay chef-client]", :immediately
  end
else
  puts "Before version 13.0"

  ruby_block "call delay chef-client" do
    block do
      # Do nothing, "block do" has to be here
    end
    notifies :change, "windows_task[delay chef-client]", :immediately
  end
end

future_time = Time.now.utc + 7200               # Offset is in seconds: 2 hour delay
new_start_day = future_time.strftime('%m/%d/%Y')
new_start_time = future_time.strftime('%H:%M')

windows_task 'delay chef-client' do
  task_name 'chef-client'
  start_day new_start_day
  start_time new_start_time
  action :nothing
  guard_interpreter :powershell_script
  only_if "(Get-ScheduledTask | Where-Object {$_.TaskName -eq 'chef-client'}).TaskName.Length -gt 0"
end

 

Make decisions in your Chef recipe based on the version of the Chef Client

There are times when you have to make a decision in your Chef recipe, based on the version of the Chef Client installed on the node.

There are two ways to get the version of the installed Chef Client:

Chef::VERSION
node['chef_packages']['chef']['version']

To make a decision based on the installed Chef Client version

if ( Chef::VERSION.to_f >= 14.0 )
  puts "Version 14.0 or higher"
else
  puts "Before version 14.0"
end

 

Test your cookbook in Chef Test Kitchen against multiple versions of the Chef Client

In large environments, during the Chef Client version change, some older servers still run the prior version of the Chef Client, the newly created servers launch with the new version of the Chef Client.

It is very important to test your cookbooks with the old and the new versions of Chef Client.

To specify the Chef Client version for the Test Kitchen “converge” run, add the highlighted three lines to the provisioner section of the .kitchen.yml file:

provisioner:
  name: chef_zero
  product_name: chef          # Install the Chef Client
  product_version: 14.0.0   # Set the Chef Client version, default=latest
  install_strategy: always    # Forces the installation of the specified Chef Client version even if another version is installed

During “converge”, test kitchen will download and install the specified version of Chef Client.

Warning:

If the server image already has Chef Client installed, and the installed Chef Client version is newer than the specified version in the kitchen.yml file, Test Kitchen downloads the specified earlier version, but executes the already installed newer version.

Open the system drive of an AWS instance you cannot log into

If you cannot log into an AWS instance and want to inspect files on it, you can detach the volume from the lost instance and attach it to another instance as the secondary drive.

Create a new instance

  1. Create a new AWS instance and log into it,
  2. Make a note of the Instance ID of this new instance,
  3. In the AWS console find your lost instance, that you cannot log into, by IP address or server name,
  4. Using the AWS console stop the lost instance,
  5. Make a note of the Instance ID of the stopped instance.

Detach the root volume from the stopped lost instance

  1. On the Description tab click the name of the root device
  2.  In the popup click the EBS ID
  3. Make a note of the volume ID,
  4. Click the Actions button and select Detach Volume.

Attach the volume to the new instance

  1. On the same, detached root volume page, click the Actions button and select Attach Volume,
  2. Search for the new instance by Instance ID and enter a name for the device
  3. On the Description tab of the new instance, the second drive appears,

Assign a drive letter to the attached drive

  1. Log into the new instance,
  2. On the toolbar click the Server Manager icon,
  3. In the Tools menu select Computer Management,
  4. On the left side select Disk Management,
  5. Right-click the second disk and select Online,
  6. The partitions of the attached volume automatically get the drive letters. The primary partition received the letter E:
  7. The drives show up in Windows Explorer too.

Debugging a Packer build

Packer can configure a server instance and create an image of it.

If you need to examine what Packer is doing, you can run Packer in debug mode and RDP into the instance to view files and other settings.

Open the firewall

  1. In the “builders” element add a line to the Packer template to execute a configuration file:
    "user_data_file": "bootstrap-aws.txt",
  2. Create the bootstrap-aws.txt file in the same directory where your Packer template is located,
  3. Add lines to the bootstrap-aws.txt file to create an administrator user, enable the Remote Desktop connection, and open the firewall for the RDP port:
    # Administrator user
    cmd.exe /c net user /add MY_USERNAME MY_PASSWORD
    cmd.exe /c net localgroup administrators MY_USERNAME /add
    
    # RDP
    cmd.exe /c netsh advfirewall firewall add rule name="Open Port 3389" dir=in action=allow protocol=TCP localport=3389
    cmd.exe /c reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server" /v fDenyTSConnections /t REG_DWORD /d 0 /f

Build in debug mode

  1. Execute the build with the -debug option:
    packer build -debug win2012.json

Open the RDP port in the temporary security group

During the build, Packer will prompt you to press enter before major steps.

  1. When the IP address of the instance is displayed on the screen, search for the temporary security group’s name by
    Creating temporary security group for this instance
  2. In the AWS console search for the security group and add a new inbound rule to open port: 3389 for the Remote Desktop Connection:

Connect to the instance

  1. Find the IP address of the instance on the screen,
  2. Start a Remote Desktop connection to the instance,
  3. Use the administrator username and password specified in the bootstrap-aws.txt file.

Log locations

  1. The EC2 log file on the instance is at
    C:\Program Files\Amazon\Ec2ConfigService\Logs\Ec2ConfigLog.txt

 

Create a new server image for a RightScale server template

The RightScale server templates publish server images to launch. It is advisable to create your own server image because the cloud providers can remove their published images anytime. If you generate your own image, you control the lifecycle of those.

Create your own server image

  1. Use Packer to create a new server image.
    1. Install RightLink.
      1. On Windows
            1. Create a PowerShell script and save it as install-rightlink.ps1
              $wc = New-Object System.Net.Webclient
              $wc.DownloadFile("https://rightlink.rightscale.com/rll/10/rightlink.install.ps1",
                 "$pwd\rightlink.install.ps1")
              Powershell -ExecutionPolicy Unrestricted -File rightlink.install.ps1
            2. Execute it from the Packer template
              "provisioners": [
                {
                  "type": "powershell",
                  "pause_before":"10s",
                  "scripts": [
                  "install-rightlink.ps1"
                  ]
                }
              ]
    2. For more information see “Create Custom Images with RightLink” at  http://docs.rightscale.com/rl10/getting_started.html
  2. Make the image public, so RightScale can see it
    1. Open the AWS console
    2. On the EC2 Dashboard select AMIs
    3. Find the image by AMI resource ID
    4. On the Permission tab click the Edit button
    5. Keep the image private, but share it with the AWs account you create the server template in.

Attach the image to your RightScale server template

Make sure the image is visible in RightScale.

It can take 30 minutes for the new image to appear in the list.

  1. Open the RightScale Cloud Management console
  2. In the upper right corner select the account you want to work in,
  3. In the Clouds menu find the region where you created the image and click the Images link
  4. Search for the image

If you have an existing MultiCloud image and want to update the server image in it

  1. In the Design menu select the MultiCloud Images link
  2. Find the existing image by name
  3. Click the name of the image to open it
  4. On the Clouds tab select region, you are working in and click the pencil icon
  5. Click the Edit Image button
  6. Type a part of the name of the new server image you have just created to find it
  7. Click the name of the image to select it
  8. Click the Save button to add it to the MultiCloud image
  9. Click the Update button to save the MultiCloud image

Test the new MultiColud Image

Set the MultiCloud image version in the Server Template

Test the new MultiCloud image by TEMPORARILY using the HEAD revision of the Server Template and the MultiCloud image

To test the new image TEMPORARILY select the HEAD revision of the MultiCloud image to be used by the Server Template

  1. In the Design menu select the ServerTemplates link
  2. Find your Server Template
  3. Click the name of the template to open the info page
  4. Select the HEAD revision of the Server Template
  5. On the Images tab click the pencil to set the revision of the MultiCloud image to use
  6. TEMPORARILY select the HEAD revision for the MultiCloud Image and click the green check mark to save the selection.

In your CAT file set revision 0 to use the HEAD revision of the Server Template

Launch a new server using Self Service with the HEAD revision of the Server Template


Create the new version of the MultiCloud Image

When you have successfully tested the new MultiCloud Image create a new version of it.

  1. Find your MultiCloud Image in Desing, MultiCloud Images
  2. Click the Commit button and enter a description for the change

Create the new version of the Server Template

Save a new version of the Server Template

  1. Find your Server Template in Design, ServerTemplates
  2. Click the Commit button

Publish the Server Template

  1. Select the last revision and click the Publish to MultiCloud Marketplace button
  2. Keep it private
  3. Select the account group to share with and click the DESCRIPTIONS button
  4. Click the PREVIEW button
  5. Click the PUBLISH button

Import the Server Image to the other accounts you use

To be able to use the new version of the Server Image switch to the other accounts and import the template

  1. In the Design menu under MultiCloud Marketplace select the ServerTemplates link
  2. Enter a portion of the Server Template name in the Keyword box and click the GO button
  3. Click the name of the template
  4. Click the Import button

If you want to use a new MultiCloud image in the Server Template

To create a new MultiCloud image

 

Add the new MultiCloud image to the server template

  1. Open the RightScale Cloud Management console
  2. In the Design menu select ServerTemplates
  3. Find your server template
  4. On the Images tab click the Add MultiCloud Image button