Munki Client Scheduling
A question on the MacAdmins Slack asked how groups handle scheduling Munki. We built out a server-side solution that offers a lot of flexibility with a very simple API.
Our requirements:
- Allow users to run Managed Software Center manually at any time.
- Support flexible scheduling policies, such as daily maintenance windows or a delay until a future date (for example, for faculty traveling to low-connectivity areas like sub-Saharan Africa).
- Be able to change schedules without client code changes.
First, we have a step in the munki preflight: it checks to see if the current munki run is auto, and if it is, it runs our scheduling client:
if [[ "$1" == "auto" ]]; then
  # if the scheduler returns zero, that means run now;
  # anything non-zero means we want to stop running.
  if ! /usr/local/izzy/bin/izzy scheduler; then
    exit 1
  fi
fi
The izzy schedule command calls a JSON API and provides the client’s current timezone offset. The server returns either an “ask me again after” time or the token now. If it’s not now, that time is stored on the client as the modification time of a touch file, so the client can skip polling the server until then.
izzy schedule
let izzy = IzzySwift()
let nextActionSleepfile = "/var/tmp/next_izzy_action_at"
let fileManager = FileManager.default
let now = Date()
let zone = TimeZone.current

// if we have a sleep file, and it's got an mtime in the future, just bail now.
if fileManager.fileExists(atPath: nextActionSleepfile),
   let attribs = try? fileManager.attributesOfItem(atPath: nextActionSleepfile),
   let modifiedTime = attribs[FileAttributeKey.modificationDate] as? Date,
   modifiedTime > now {
    print("Scheduler: \(nextActionSleepfile) exists and has an mtime in the future; sleeping")
    throw ExitCode(1)
}

// Ask izzy what to do
let utcOffset = zone.secondsFromGMT()
if let response = izzy.getJson(path: "/api/v1/next_action_at",
                               params: [ "offset": String(utcOffset) ]) {
    // good response (a missing or non-string status falls through to the error branch)
    if response["status"] as? String == "success",
       let nextAction = response["next_action_at"] as? String {

        // now
        if nextAction == "now" {
            print("Scheduler: IzzyWeb says run now.")
            // delete touch file; a failed delete is harmless because its mtime is already in the past
            if fileManager.fileExists(atPath: nextActionSleepfile) {
                try? fileManager.removeItem(atPath: nextActionSleepfile)
            }
            throw ExitCode.success

        } else {
            // create a touchfile for the next action, so we don't re-hit the API over and over
            if let date = nextAction.toDate() {
                fileManager.createFile(atPath: nextActionSleepfile,
                                       contents: nil,
                                       attributes: [FileAttributeKey.modificationDate: date.date])
                print("Scheduler: updated \(nextActionSleepfile); deferring for now.")
                throw ExitCode(1)
            }
        }

    // failed
    } else {
        print("Scheduler: got an error: \(response)")
        throw ExitCode.failure
    }
}
The scheduling logic on the server is implemented in Rails, using a polymorphic ClientScheduler class built on single-table inheritance (STI), with a single required method, next_action_at().
class ClientScheduler < ApplicationRecord
  # All concrete schedulers implement this method.
  # It returns :now, or the time after which the client should ask again.
  def next_action_at(opts)
    raise NotImplementedError, "Abstract base class"
  end
end
The simplest schedulers are the HourlyScheduler, which always thinks it’s a good time to run, and the NeverScheduler, which always returns 6 hours from now.
class HourlyScheduler < ClientScheduler
  def next_action_at(opts)
    :now
  end
end

class NeverScheduler < ClientScheduler
  def next_action_at(opts)
    Time.now + 6.hours
  end
end
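The client reads two fields from the response, status and next_action_at, where the latter is either the string now or a timestamp. A minimal sketch of how a scheduler's return value could be turned into that payload follows. The helper name next_action_payload and the ISO 8601 timestamp format are assumptions for illustration; the original controller and the exact format the client parses are not shown here.

```ruby
require "json"
require "time"

# Hypothetical helper (not from the original app): converts a scheduler
# result into the JSON body the client reads.
# result is either :now or a Time.
def next_action_payload(result)
  # Assumption: timestamps are sent as ISO 8601 in UTC.
  value = result == :now ? "now" : result.utc.iso8601
  JSON.generate("status" => "success", "next_action_at" => value)
end

puts next_action_payload(:now)
puts next_action_payload(Time.utc(2024, 1, 15, 22, 0, 0))
```

Keeping the wire format this small is what lets every scheduler, simple or complicated, share the same client code.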
The most complicated is the MaintenanceWindowScheduler. The server uses the UTC offset provided by the client to decide whether the system is in or out of its window, rather than relying on the server’s local time.
MaintenanceWindowScheduler
class MaintenanceWindowScheduler < ClientScheduler
  def next_action_at(opts)
    if opts.nil? || opts[:offset].nil?
      raise "Need :offset specified"
    end

    offset = opts[:offset].to_i
    now_utc = ActiveSupport::TimeZone.new("UTC").now
    midnight_utc = ActiveSupport::TimeZone.new("UTC").parse("00:00")

    # Seconds since the client's local midnight. The modulo wraps both
    # negative values (clients west of UTC) and values past 24 hours
    # (clients east of UTC) back into a single day.
    client_offset_from_midnight =
      (now_utc.to_i - midnight_utc.to_i + offset) % 24.hours.to_i

    local_window_start_time = self.window_starts_as_offset
    local_window_end_time = self.window_ends_as_offset

    # Non over-midnight window (eg, 4 PM to 8 PM)
    if (local_window_end_time > local_window_start_time)

      # In the window?
      # _________S###^####E_____
      if (client_offset_from_midnight >= local_window_start_time &&
          client_offset_from_midnight <= local_window_end_time)
        return :now
      end

      # Compute next start time
      if (client_offset_from_midnight >= local_window_end_time)
        # Next start is tomorrow - window passed today
        next_start_time = local_window_start_time + 24.hours
      else
        # Next start is later today - window hasn't arrived yet
        next_start_time = local_window_start_time
      end

    # If the end time is earlier than the start time,
    # the window spans midnight eg 8 PM to 8 AM
    else
      # Treat it as two windows, one from midnight to the
      # end time, and another from the start time until midnight
      # ###t##E____________S###t##
      if (client_offset_from_midnight <= local_window_end_time ||
          client_offset_from_midnight >= local_window_start_time)
        return :now
      end

      # Next time is always later today
      next_start_time = local_window_start_time
    end

    # 6 hours or the next start time, whichever comes first
    time_until_next_start =
      [ next_start_time - client_offset_from_midnight, 6.hours ].min
    return now_utc + time_until_next_start
  end
end
If the client isn’t in its maintenance window, the server returns an “ask again” time of either the start of the window or six hours in the future, whichever comes first. Remember: the client doesn’t re-poll the server before this time; six hours is a reasonable compromise between reducing network traffic and staying responsive to server-side schedule changes.
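To make the window arithmetic concrete, here is a small worked sketch in plain Ruby with no Rails dependency. The function names and the sample window (8 PM to 8 AM) are assumptions chosen for illustration; all values are in seconds.

```ruby
DAY = 24 * 60 * 60
CAP = 6 * 60 * 60 # never defer longer than six hours

# Seconds since the client's local midnight, from UTC seconds since
# midnight plus the client's UTC offset, wrapped into one day.
def client_seconds_since_midnight(utc_seconds, offset)
  (utc_seconds + offset) % DAY
end

# Returns 0 inside the window, otherwise how long the client should
# wait before asking again (capped at six hours).
def seconds_to_wait(client_seconds, window_start, window_end)
  in_window =
    if window_end > window_start
      client_seconds.between?(window_start, window_end)
    else
      # Window spans midnight: two pieces, before end or after start.
      client_seconds <= window_end || client_seconds >= window_start
    end
  return 0 if in_window

  until_start = (window_start - client_seconds) % DAY
  [until_start, CAP].min
end

start_8pm = 20 * 3600
end_8am = 8 * 3600

# 14:00 UTC at offset -5h is 09:00 local: 11 hours to go, capped at 6.
morning = client_seconds_since_midnight(14 * 3600, -5 * 3600)
puts seconds_to_wait(morning, start_8pm, end_8am) # 21600

# 23:00 UTC at offset -5h is 18:00 local: 2 hours until the window.
evening = client_seconds_since_midnight(23 * 3600, -5 * 3600)
puts seconds_to_wait(evening, start_8pm, end_8am) # 7200
```

A client east of UTC shows why the wrap matters: 20:00 UTC at offset +10h is 06:00 local the next day, which falls inside the 8 PM to 8 AM window.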
We’re able to build a number of other complicated policies just by implementing a class with this single next_action_at method. Over the last five years, it’s been quite flexible and durable.
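As an illustration of how little a new policy needs, here is a hypothetical sketch of the "delay until a future date" requirement from the list at the top. DelayUntilScheduler, its resume_at value, and the injectable clock are assumptions made for this example, not the original implementation. In the Rails app resume_at would presumably be a column on the STI table; here a plain stand-in base class keeps the sketch runnable without Rails.

```ruby
# Stand-in for the Rails model so this sketch runs on its own.
class ClientScheduler
  def next_action_at(opts)
    raise NotImplementedError, "Abstract base class"
  end
end

# Hypothetical policy: hold off until a fixed date, then always run.
class DelayUntilScheduler < ClientScheduler
  SIX_HOURS = 6 * 60 * 60

  # clock is injectable so the behavior can be checked deterministically.
  def initialize(resume_at, clock: -> { Time.now.utc })
    @resume_at = resume_at
    @clock = clock
  end

  def next_action_at(_opts = nil)
    now = @clock.call
    return :now if now >= @resume_at

    # Cap the deferral at six hours so server-side schedule changes
    # still reach the client in reasonable time.
    [@resume_at, now + SIX_HOURS].min
  end
end

scheduler = DelayUntilScheduler.new(Time.utc(2030, 1, 1))
puts scheduler.next_action_at.inspect
```

The six-hour cap mirrors the maintenance window scheduler: even a client deferred for weeks checks in often enough to notice if its policy changes.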