Jim Zajkowski

Goodbye, Charles

In truth, I only knew Charles Edge a little, but I knew of him for quite a long time. He was a wonderfully delightful, easy-going spirit; the kind of person who instantly made you feel like a long lost friend.

Tom’s remembrance; Rich’s

Charles Edge, myself, and a few other Mac Admins at the corner of Apple and Orchard streets on the Penn State campus.
Apr 22, 2024

Keychain client https certificates and URLSession with Swift 5

We issue client identity certificates to our Mac fleet with MDM; we also want to use those identities with our own client-side tools and backends.

I didn’t find a good end-to-end example, just snippets of partial answers on Stack Overflow or the Apple Developer forums, so I hope this code is useful for someone else.

KeychainCertificateDelegate is a URLSession delegate that finds the correct identity by handling urlSession:didReceive:completionHandler:

KeychainCertificateDelegate
import Foundation
import Security

class KeychainCertificateDelegate: NSObject, URLSessionDelegate {
    func urlSession(_: URLSession, didReceive challenge: URLAuthenticationChallenge, completionHandler: @escaping (URLSession.AuthChallengeDisposition, URLCredential?) -> Void) {
        if challenge.protectionSpace.authenticationMethod == NSURLAuthenticationMethodClientCertificate {
            // Get the DNs the server will accept
            guard let expectedDNs = challenge.protectionSpace.distinguishedNames else {
                completionHandler(.cancelAuthenticationChallenge, nil)
                return
            }

            // Ask Keychain to search, based on the DNs the server presented on its cert
            var identityRefs: CFTypeRef? = nil
            let err = SecItemCopyMatching([
                kSecClass: kSecClassIdentity,
                kSecMatchLimit: kSecMatchLimitAll,
                kSecMatchIssuers: expectedDNs,
                kSecReturnRef: true,
            ] as NSDictionary, &identityRefs)

            // can't get keychain certs, get out
            if err != errSecSuccess {
                completionHandler(.cancelAuthenticationChallenge, nil)
                return
            }

            // just use the first identity
            guard let identities = identityRefs as? [SecIdentity],
                  let identity = identities.first
            else {
                completionHandler(.cancelAuthenticationChallenge, nil)
                return
            }

            // DEBUGGING -- obtain the cert details to print it out
            //              you can just delete this entire block
            var certificateRef: SecCertificate?
            let status = SecIdentityCopyCertificate(identity, &certificateRef)
            guard status == errSecSuccess, let certificateRef
            else {
                completionHandler(.cancelAuthenticationChallenge, nil)
                return
            }
            debugPrint(certificateRef)
            // END DEBUG

            let credential = URLCredential(identity: identity, certificates: nil, persistence: .forSession)
            completionHandler(.useCredential, credential)
            return

        } else {
            // Handle any other kinds of authentication challenge
            completionHandler(.performDefaultHandling, nil)
        }
    }
}

KeychainCertificateDelegate obtains the list of valid issuers from the server via the challenge.protectionSpace.distinguishedNames attribute. Then we ask Keychain for a matching SecIdentity using SecItemCopyMatching(). Finally, we wrap that identity in a URLCredential and call the completion handler.

Using this delegate from a command line tool is straightforward: create a new instance of KeychainCertificateDelegate, pass it as the delegate when creating a URLSession, and then use that session for your requests.

main
import Foundation

@main
struct HelloWorld {
    static func main() async throws {
        let url = URL(string: "https://some-server.goes.here")!

        let config = URLSessionConfiguration.default
        let keyDelegate = KeychainCertificateDelegate()
        let session = URLSession(configuration: config, delegate: keyDelegate, delegateQueue: nil)

        let (data, response) = try await session.data(from: url)

        guard let httpResponse = response as? HTTPURLResponse,
              (200 ... 299).contains(httpResponse.statusCode)
        else {
            throw URLError(.badServerResponse)
        }

        print(String(data: data, encoding: .utf8)!)
    }
}

Hope this helps someone else!

Apr 15, 2024

Setting Up Yubikey ECDSA SSH Keys

I’ve been using 1Password’s ssh agent for a while, but I wanted to revisit ssh keys on my iPhone. Prompt 3 came out recently with support for Yubikey hardware keys, so after two days of yak shaving, here’s what worked:

Yubikey Prep

Reset your Yubikey’s PIV module. This does not impact any YubiOTP, FIDO, U2F, or other parts of the Yubikey you might be using.

ykman piv reset

Your Yubikey has three “PIN”-like things for PIV functions: the access PIN, which you’ll use every day; the PIN-Unlocking Key (PUK); and the Management Key. My mnemonic: “oh, PUK!” is what you’ll say when you accidentally lock your PIN.

By default, the Yubikey will only let you make three mistakes before locking. It’s not a terrible idea to increase both retry counters, because typing a secret on an iPhone screen can be difficult. 8 for the PIN and 10 for the PUK seems sane:

ykman piv access set-retries 8 10

Enter 123456 (the default after the piv reset).

Change your management key next - this will store the new management key on the Yubikey, protected by the PIN:

ykman piv access change-management-key --touch --generate --protect

(press enter at the prompt, enter 123456 for the PIN)

Change the PUK. You do not need to keep it to digits; it’s easier to type a 6-8 character word.

ykman piv access change-puk

(enter 12345678 as the old PUK)

Change the PIN. Again, use a short word or similar. If you get an error, it’s too short or too long.

ykman piv change-pin

(enter 123456 as the old PIN).

Key Generation and Conversion

Create an ssh ecdsa keypair. If you already have a key deployed to servers hither and yon, you can convert it instead.

ssh-keygen -t ecdsa -b 384  # my key doesn't support ed25519, but newer Yubis may

By default that will be named ~/.ssh/id_ecdsa, but you can name it something else, just fix it below.

ykman won’t import a file in OPENSSH PRIVATE KEY form, so we need to first convert it to generic PEM form, and then to a named-curve format that ykman expects:

ssh-keygen -p -f ~/.ssh/id_ecdsa -m PEM # you could skip this with ssh-keygen -m PEM
openssl ec -in ~/.ssh/id_ecdsa -param_enc named_curve -out ~/for-ykman.pem -AES-128-CBC

Now we can (finally!) import the key to our Yubikey

ykman piv keys import --pin-policy once --touch-policy cached 9a ~/for-ykman.pem

Enter your PIN and touch the key when prompted. I’m not at all sure what pin-policy once is supposed to do, because I have to re-enter my PIN every time I use it. cached means you don’t need to re-tap the button within about 15 seconds, which can be helpful if you have some kind of nutty sftp client.

N.B.: Many Yubikey/SSH tutorials tell you that you also need to generate a certificate for the key. My experience so far is that this hasn’t been necessary with relatively new versions of OpenSSH and Prompt 3 - perhaps because it’s an ecdsa key. If I run into a situation where I need a certificate, I’ll update this post.

Okay, now what?

Copy the contents of the ~/.ssh/id_ecdsa.pub file into the authorized_keys file on the servers you’re trying to reach, same as any other SSH key.
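If the server still allows password auth for bootstrapping, ssh-copy-id saves the copy-paste (user@server here is a placeholder for your actual account and host):

```shell
# append the public half of the Yubikey-backed key to the
# server's authorized_keys file
ssh-copy-id -i ~/.ssh/id_ecdsa.pub user@server
```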

Prompt 3 on all platforms will work “out of the box” if you set it to use Yubikey PIV. I found it less fiddly to plug it into my iPhone than using NFC, but both work.

For macOS, you’ll need a PKCS11 provider. Apple ships one in ssh-keychain, but sadly that version doesn’t support ECDSA keys (FB13663068), only RSA - and my Yubikey only supports up to RSA2048.* That’s okay: Yubico publishes a provider in the Yubico PIV Tool download (alternate: brew install yubico-piv-tool and change the path below). I’ve heard that OpenSC also has a PKCS11 provider, but the Yubico dylib worked fine for me.

You’ll want to tell Sonoma’s ssh how to use this by adding the PKCS11Provider directive to your ~/.ssh/config, such as this:

Host *
    PKCS11Provider /usr/local/lib/libykcs11.dylib

Once that’s done, you should be able to ssh server. You’ll need to enter the PIN and then touch the key to activate - it will just hang until you do.
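If it hangs somewhere unexpected, you can confirm that ssh can actually see the key through the provider by asking ssh-keygen to enumerate the token (a quick sketch, assuming the Yubico dylib path from the config above):

```shell
# list the public keys readable from the PKCS11 token;
# you should see your ecdsa key's public line
ssh-keygen -D /usr/local/lib/libykcs11.dylib
```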

* Firmware 5.7 and later Yubikeys support RSA4096 and I don’t know if ssh-keychain supports RSA4096, even if the key does.

Why not use FIDO2?

FIDO2 sk type keys are probably The Future, as it’s a bit less fiddly than this, but there’s no way to make them work on iPadOS: the OS only allows NFC reading of the FIDO2 key, and no iPads have NFC readers. Since I wanted the ability to ssh from my iPad, this was a nonstarter.

Feb 28, 2024

Creating GSX SSL keys for Jamf with OpenSSL

GSX’s API requires authentication via an Apple-signed client SSL certificate. Jamf has an automatic CSR generator, but it’s valuable to do these steps with openssl instead: you get a backup of the key and get experience with generating certificate requests by hand.

Before you start, collect your Sold To number from GSX, including all the leading zeros.

Open Terminal and follow these steps to create a certificate request:

mkdir ~/gsx-certs; cd ~/gsx-certs

# generate a key pair
openssl genrsa -aes256 -out gsx-key.pem 2048
# enter a passphrase for the key when prompted

# now generate a certificate request with the key
openssl req -new -sha256 -key gsx-key.pem -out gsx.csr

# fill in the following fields:
# - Passphrase used in earlier step
# - Country code (2 letters), eg, US
# - State or Province (full name), eg, Michigan
# - City or locality (full name), eg, Ann Arbor
# - Company name (can not be Apple), eg, WidgetCorp
# - Common Name (fully qualified host name) in format:
#     AppleCare-Partner-XXXXXXXXXX.Prod.apple.com,
#     where XXXXXXXXXX is your company or organization's Apple-assigned
#     Sold To number, including leading zeros
# - Challenge password can be blank
# - Your email address
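Before mailing the CSR off, it’s worth double-checking what you typed, since a mistyped Sold To number means another round trip with Apple (a quick sanity check):

```shell
# print the CSR's subject line; verify the CN reads
# AppleCare-Partner-<your Sold To number>.Prod.apple.com
openssl req -in gsx.csr -noout -subject
```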

E-mail the gsx.csr file to GSX’s web services team, gsxws@apple.com. Also note that Apple needs to specifically allow-list your IP addresses, so provide any egress addresses your Jamf instance will use. In a day or so you will get back a pem file.

You now need to combine the pem file from Apple with the gsx-key.pem file. Drop it in the gsx-certs folder in your home directory, then open Terminal and run:

openssl pkcs12 -export \
   -in AppleCare-Partner-[digits].Prod.apple.com.cert.pem \
   -inkey gsx-key.pem \
   -out jamf-gsx.p12

Don’t use an overly complicated password; Jamf may hiccup.
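Before uploading, you can sanity-check the bundle (this prompts for the export password you just set):

```shell
# confirm the .p12 contains both the Apple-signed certificate
# and your private key
openssl pkcs12 -in jamf-gsx.p12 -info -noout
```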

You then just need to upload the new jamf-gsx.p12 file to Jamf’s GSX integration.

Hope that helps!

Aug 17, 2023

macOS Screenshots during Setup Assistant

Did you know you can take a screenshot (cmd+shift+3/4) during Setup Assistant?

After your first login, they magically land on the desktop… but where are they stored before then?

Using Console (cmd+control+option+c), I found out that screenshots taken during Setup are stored in /var/log/SetupUserScreenshots. Probably not where anyone expected, but if you have a weird need to get them off before the first user’s desktop, now you know!

Aug 04, 2023

macOS Screenshots mini-game easymode

The default timeout in macOS for a new screenshot to slide off and go live on my desktop is a half-second faster than it takes me to flip back to Slack. I don’t want to play a twitch game with my desktop.

Fortunately, defaults write com.apple.screencaptureui thumbnailExpiration -int 30 to the rescue.

Jul 05, 2023

GitHub Actions and AWS OIDC Roles

This morning, while adding a GitHub Action deployment to push to AWS, I took a quick sojourn into how to use GitHub’s short-lived OIDC session tokens, as opposed to creating yet another AWS access key. The documentation is in a few places and mostly ‘reference’ grade, but this is all you need to actually do:

Add the OIDC provider

  • Add an OIDC provider to AWS. Use these values:
    • Provider: token.actions.githubusercontent.com
    • Audience: sts.amazonaws.com
    • Fingerprints: no longer needed, Amazon and GitHub have these synced
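If you prefer the CLI over the console, the same provider can be created with aws iam (a sketch; depending on your CLI version it may still insist on a --thumbprint-list argument, which AWS now ignores):

```shell
aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com
```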

Create an AWS Role

Add a new role and configure the Trust relationships in IAM like this. GitHub has comprehensive documentation about the ways sub gets presented, but it can be hard to follow at first; environments seem like a sane option. You can’t use the other claims the GitHub OIDC includes, as AWS doesn’t import them.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::5555555555:oidc-provider/token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com"
                },
                "ForAnyValue:StringEquals": {
                    "token.actions.githubusercontent.com:sub": [
                        "repo:jamesez/my-lambda:environment:east-one",
                        "repo:jamesez/my-lambda:environment:west-two"
                    ]
                }
            }
        }
    ]
}

You then assign any permissions to the role, as appropriate.

Action changes

The Action needs write on id-token. Checkout needs contents: read, so you have to mention both:

permissions:
  id-token: write
  contents: read

Then add the configure-aws-credentials action to your Action workflow to obtain AWS keys. Something like this - where the role-to-assume is the role you created above. Remove any AWS secrets you might be setting in Actions.

Note: configure-aws-credentials needs to be version 2 or higher; update it if you’re on an earlier release.

      - name: Log into AWS
        uses: aws-actions/configure-aws-credentials@v2
        with:
          aws-region: ${{ matrix.region }}
          role-to-assume: arn:aws:iam::5555555555:role/github-action-role-name

Here’s an example deploy phase that uses a matrix to deploy a Lambda to two environments.

deploy:
  needs: build
  runs-on: ubuntu-latest
  strategy:
    matrix:
      include:
        - environment: east-one
          region: us-east-1
        - environment: west-two
          region: us-west-2

  environment:
    name: ${{ matrix.environment }}

  steps:
    - name: Download deployment.zip
      uses: actions/download-artifact@v2.1.1

    - name: Display structure of downloaded files
      run: ls -R

    - name: Log into AWS
      uses: aws-actions/configure-aws-credentials@v2
      with:
        aws-region: ${{ matrix.region }}
        role-to-assume: arn:aws:iam::5555555555:role/github-action-role-name

    - name: Deploy
      uses: kazimanzurrashid/aws-lambda-update-action@695db4dd92dbd6ee63f1f014bea6f868affa469a
      with:
        zip-file: deployment/deployment.zip
        lambda-name: my-lambda

GitHub: Configuring OIDC with AWS

Jun 23, 2023

Jamf and the Case of the Exploding Database

“Huh, that’s weird, why did mysqldump break? When did the DB get so big?” I asked aloud. My cat gave me an annoyed look for interrupting her nap, turned over, and fell right back to sleep.

“Wait, does this database actually have, uh, is that, 1.3 billion extension attribute rows!? 240 gigabytes? of EAs?! What the heck!”

+-----------------------------------+------------+--------------+--------------+------------+
| TABLE_NAME                        | table_rows | data_length  | index_length | Size in MB |
+-----------------------------------+------------+--------------+--------------+------------+
| extension_attribute_values        | 1367505492 | 124721840128 | 127488458752 |  240526.48 |
| event_logs                        |  182470223 |  28996288512 |  12575621120 |   39646.06 |
| log_actions                       |  185294463 |  24295505920 |   5252284416 |   28178.97 |
| policy_history                    |   38554225 |   3448963072 |   4084613120 |    7184.58 |
| logs                              |   39372868 |   2586886144 |   3100901376 |    5424.30 |
| hardware_reports                  |   18167461 |   2960146432 |    391643136 |    3196.52 |
| operating_systems                 |   27521885 |   2778710016 |    490717184 |    3117.97 |
| reports                           |   18804657 |   1035616256 |   1672773632 |    2582.92 |
| object_history                    |    9148848 |   1548763136 |    904937472 |    2340.03 |
| mobile_device_management_commands |     342277 |    467894272 |   1638137856 |    2008.47 |
+-----------------------------------+------------+--------------+--------------+------------+
report sql
SELECT table_name, table_rows, data_length, index_length,
    round(((data_length + index_length) / 1024 / 1024),2) "Size in MB",
    round((data_free / 1024 / 1024),2) "Free in MB"
FROM information_schema.TABLES
WHERE table_schema = "jamf_production" ORDER BY (data_length + index_length) DESC limit 10;

It turns out that every time your client “submits inventory” or you write an EA value to Jamf via an API, Jamf constructs a “report” by creating a parent row in the reports table and a number of rows in other tables: fonts, hardware_reports, hard_drive_partitions, hard_drives, extension_attribute_values, and a few more. This is typical database normalization.

Even if you only change a single EA value, Jamf duplicates every row from all of these tables. That means if you have, say, 50 extension attributes, every API write to just one of them creates 50 more rows: 49 with the old value and 1 with the new.

We run a web app to manage Munki and Jamf for our field staff, and it writes EAs to trigger scope changes for policies. Our app compares the existing EA value before trying to change it, but it still generated hundreds of extra writes - or in one very bad case, 55,000. 55,000 reports times 50 EAs is 2.75 million EA rows.
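Spelled out, that multiplication looks like:

```shell
# rows created in extension_attribute_values by one runaway client
eas_per_report=50     # one row per EA, per report
reports=55000         # reports generated by the worst case
ea_rows=$((reports * eas_per_report))
echo "$ea_rows"       # 2750000
```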

Our app was making these writes because Jamf was (correctly) stripping spaces off the value provided, but the app’s data source had a trailing space. That is, it compared "Unit Name " with "Unit Name" and, seeing they were different, would write the EA back as "Unit Name ". Jamf would remove the space, repeating the cycle forever. We had a similar issue where Jamf returned values as strings but the app compared them to an integer and got a type mismatch.

Jamf Support suggested we clear this out by flushing the reports table at increasingly short intervals, starting at the longest option and working down to one week. The problem is that Jamf’s report-flushing logic executes a DELETE FROM table WHERE report_id IN ([list of 5,000 IDs]). MySQL is particularly bad at this kind of query; it tends to bog down the entire table, causing pending INSERTs and UPDATEs to hang. Our Jamf instance is pretty busy - about 15,000 endpoints - so hung connections quickly exhausted the pool. After enough delay, Kubernetes notices Jamf has stopped handling web requests and force-restarts it.

If the table is indexed properly, a single-row DELETE .. WHERE id = x executes in milliseconds and doesn’t bog down other queries.
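The shape of the approach can be sketched in shell, assuming mysql client access to the Jamf database; the date column, cutoff, and LIMIT here are illustrative, not Jamf’s actual flush logic:

```shell
# find old report IDs, then delete their EA rows one at a time;
# each single-row DELETE finishes quickly and releases its locks,
# so other INSERTs and UPDATEs keep flowing
mysql -N jamf_production -e \
    "SELECT report_id FROM reports
     WHERE date_entered_epoch < UNIX_TIMESTAMP('2022-01-01') * 1000
     LIMIT 100000" |
while read -r id; do
    mysql jamf_production -e \
        "DELETE FROM extension_attribute_values WHERE report_id = ${id}"
done
```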

I reverse-engineered the methods Jamf uses to flush the database and wrote a few small utilities in Go to execute millions of row-level deletes. It took about two weeks to finish the deletions, but the database is back down to a reasonable size, and we didn’t have to break our production environment while hoping Jamf’s manual flush worked.

I hope to release those tools - along with a garbage collector for Jamf DBs - once they’re cleaned up; they’re pretty hacky.

May 19, 2023

Munki Client Scheduling

A question on the MacAdmins Slack asked how groups handle scheduling Munki. We built out a server-side solution that offers a lot of flexibility with a very simple API.

Our requirements:

  • Allow users to run Managed Software Center manually at any time.
  • Provide for a number of flexible scheduling policies, such as daily maintenance windows, or delay until a future date for faculty traveling to low-connectivity areas like sub-Saharan Africa, and so on.
  • Be able to change schedules without client code changes.

First, we have a step in the munki preflight: it checks to see if the current munki run is auto, and if it is, it runs our scheduling client:

if [[ "$1" == "auto" ]]; then
    # if the scheduler returns zero, that means run now;
    # anything non-zero means we want to stop running.
    if ! /usr/local/izzy/bin/izzy scheduler; then
        exit 1
    fi
fi

The izzy scheduler subcommand calls a JSON API, passing along the client’s current timezone offset. The server returns either an “ask me again after” time or the token now. If the answer isn’t now, the client stores that time locally so it doesn’t poll the server again until then.

izzy scheduler
let izzy = IzzySwift()
let nextActionSleepfile = "/var/tmp/next_izzy_action_at"
let fileManager = FileManager.default
let now = Date.init()
let zone = TimeZone.current

// if we have a sleep file, and it's got an mtime in the future, just bail now.
if fileManager.fileExists(atPath: nextActionSleepfile),
    let attribs = try? fileManager.attributesOfItem(atPath: nextActionSleepfile),
    let modifiedTime = attribs[FileAttributeKey.modificationDate] as? Date,
    modifiedTime > now {
        print("Scheduler: \(nextActionSleepfile) exists and has an mtime in the future; sleeping")
        throw ExitCode.init(1)
}

// Ask izzy what to do
let utcOffset = zone.secondsFromGMT()
if let response = izzy.getJson(path: "/api/v1/next_action_at",
                               params: [ "offset": String(utcOffset) ]) {
    // good response
    if response["status"] as! String == "success",
        let nextAction = response["next_action_at"] as? String {

        // now
        if nextAction == "now" {
            print("Scheduler: IzzyWeb says run now.")
            // delete touch file
            if fileManager.fileExists(atPath: nextActionSleepfile) {
                try! fileManager.removeItem(atPath: nextActionSleepfile)
            }
            throw ExitCode.success

        } else {
            // create a touchfile for the next action, so we don't re-hit the API over and over
            if let date = nextAction.toDate() {
                fileManager.createFile(atPath: nextActionSleepfile,
                                       contents: nil,
                                       attributes: [FileAttributeKey.modificationDate: date.date])
                print("Scheduler: updated \(nextActionSleepfile); deferring for now.")
                throw ExitCode.init(1)
            }
        }

    // failed
    } else {
        print("Scheduler: got an error: \(response)")
        throw ExitCode.failure
    }
}

The scheduling logic server is implemented in Rails, using a STI polymorphic ClientScheduler class with a single required method, next_action_at().

class ClientScheduler < ApplicationRecord
  # All concrete schedulers implement this method
  def next_action_at(opts)
    raise NotImplementedError, "Abstract base class"
  end
end

The simplest schedulers are the HourlyScheduler, which always thinks it’s a good time to run, and the NeverScheduler, which always returns 6 hours from now.

class HourlyScheduler < ClientScheduler
  def next_action_at(opts)
    :now
  end
end

class NeverScheduler < ClientScheduler
  def next_action_at(opts)
    Time.now + 6.hours
  end
end

The most complicated is the MaintenanceWindowScheduler. The server uses the client’s provided timezone to decide whether the system is in or out of its window, rather than relying on the server’s time.

MaintenanceWindowScheduler
class MaintenanceWindowScheduler < ClientScheduler
  def next_action_at(opts)
    if opts.nil? || opts[:offset].nil?
      raise "Need :offset specified"
    end

    offset = opts[:offset].to_i
    local_time_in_utc = ActiveSupport::TimeZone.new("UTC").now
    midnight_in_utc = ActiveSupport::TimeZone.new("UTC").parse("00:00").to_i

    client_offset_from_midnight =
        local_time_in_utc.to_i - midnight_in_utc.to_i + offset
    if (client_offset_from_midnight < 0)
      client_offset_from_midnight += 24.hours
    end

    local_window_start_time = self.window_starts_as_offset
    local_window_end_time = self.window_ends_as_offset

    # Non over-midnight window (eg, 4 PM to 8 PM)
    if (local_window_end_time > local_window_start_time)

      # In the window?
      # _________S###^####E_____
      if (client_offset_from_midnight >= local_window_start_time &&
          client_offset_from_midnight <= local_window_end_time)
        return :now
      end

      # Compute next start time
      if (client_offset_from_midnight >= local_window_end_time)
        # Next start is tomorrow - window passed today
        next_start_time = local_window_start_time + 24.hours
      else
        # Next start is tonight - window hasn't arrived yet
        next_start_time = local_window_start_time
      end

    # If the end time is earlier than the start time,
    # the window spans midnight eg 8 PM to 8 AM
    else
      # Treat it as two windows, one from midnight to the
      # end time, and another from the start time until midnight
      # ###t##E____________S###t##
      if (client_offset_from_midnight <= local_window_end_time ||
          client_offset_from_midnight >= local_window_start_time)
        return :now
      end

      # Next time is always later today
      next_start_time = local_window_start_time
    end

    # 6 hours or the next start time, whichever comes first
    time_until_next_start =
      [ next_start_time - client_offset_from_midnight, 6.hours ].min
    return local_time_in_utc + time_until_next_start
  end
end

If the client isn’t in its maintenance window, it returns an “ask again” time of either the start of the window or six hours in the future, whichever comes first. Remember: the client doesn’t re-poll the server before this time; six hours is a reasonable balance between reducing network traffic and staying responsive to server-side schedule changes.
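From the client’s side, the whole protocol is a single GET; something like this (the hostname and the timestamps shown are made up for illustration):

```shell
# ask the scheduler what to do, passing the client's UTC offset in seconds
curl -s "https://izzy.example.edu/api/v1/next_action_at?offset=-14400"
# returns either  {"status":"success","next_action_at":"now"}
# or a deferral:  {"status":"success","next_action_at":"2023-05-09T22:00:00Z"}
```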

We’re able to build a number of other complicated policies just by implementing a class with this single next_action_at method. Over the last five years, it’s been quite flexible and durable.

May 09, 2023