Background Jobs: Proofing Ruby Workers

Overview

To minimize long-running requests in the IDP, we’ve moved calls that talk to vendors into background jobs, implemented with GoodJob.

We currently use proofing jobs for PII verification.

[Figure: architecture diagram of async/Ruby workers. To update this diagram, edit the Async Architecture file in Figma and re-export it.]

The lifecycle of a job:

  1. The user submits a form to the IDP
    • For PII verification jobs, the payload will contain PII:
      • First name
      • Last name
      • Date of Birth
      • SSN
      • Driver’s license number
      • Address
  2. The IDP will enqueue a background job
    • Job parameters are persisted to the PSQL database
    • Sensitive parameters are symmetrically encrypted by a server-side IDP key (see notes on server-side encryption)
  3. The IDP will show a waiting page to the user
  4. The worker host polls the background jobs table. When it pulls a job, it:
    • Writes to the jobs table to mark the job as claimed
    • Makes HTTP requests to vendors via our outbound proxy
      • The requests to the vendors will include PII
  5. When the worker process is done, it will:
    • Update the jobs table to mark the job as done
    • Store the result (which may contain PII) in Redis, symmetrically encrypted and with a 60-second expiration
    • PII in the result payload may include data read from the driver’s license:
      • First name
      • Last name
      • Date of Birth
      • Driver’s license number
      • Address
  6. The user’s waiting page polls for the result of the background job; on each poll, the IDP checks Redis for the result of that particular job. Once the result is available, the user continues to the next step of the flow.
    • If the IDP has not seen a result for the job after 60 seconds, it treats the job as timed out and shows the user an error screen with an option to retry.
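
Steps 5 and 6 can be sketched in plain Ruby as follows. This is a minimal illustration, not the IDP’s actual code: ExpiringResultStore is a hypothetical in-memory stand-in for Redis, the key format and method names are invented for the example, encryption of the stored value is omitted, and measuring the 60-second timeout from the enqueue time is an assumption.

```ruby
require 'json'

# Hypothetical in-memory stand-in for Redis with per-key expiration.
# The real system stores the (encrypted) result in Redis.
class ExpiringResultStore
  def initialize(clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @clock = clock
    @entries = {}
  end

  # Store a value that disappears after ttl_seconds.
  def write(key, value, ttl_seconds:)
    @entries[key] = { value: value, expires_at: @clock.call + ttl_seconds }
  end

  # Return the value, or nil if it is missing or has expired.
  def read(key)
    entry = @entries[key]
    return nil if entry.nil?

    if @clock.call >= entry[:expires_at]
      @entries.delete(key)
      return nil
    end
    entry[:value]
  end
end

RESULT_TTL_SECONDS = 60   # expiration of the stored result (step 5)
JOB_TIMEOUT_SECONDS = 60  # how long the IDP waits before giving up (step 6)

# Worker side (step 5): store the finished result under the job's ID.
def store_result(store, job_id, result)
  store.write("proofing-result:#{job_id}", JSON.generate(result),
              ttl_seconds: RESULT_TTL_SECONDS)
end

# IDP side (step 6): called on each poll from the waiting page.
# Returns [:done, result], [:in_progress, nil], or [:timed_out, nil].
def check_result(store, job_id, enqueued_at:, now:)
  raw = store.read("proofing-result:#{job_id}")
  return [:done, JSON.parse(raw)] if raw
  return [:timed_out, nil] if now - enqueued_at >= JOB_TIMEOUT_SECONDS

  [:in_progress, nil]
end
```

The waiting page would call something like check_result on every poll and render the next step, the waiting page again, or the error screen depending on the returned status.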

Server-Side Encryption

The server stores job arguments in RDS. The Ruby code encrypts arguments that contain PII using the same encryption used for sessions: AES-256-GCM inside of an AWS KMS-encrypted message. The AES key is the application’s session_encryption_key, stored with the application secrets. The application secrets are sensitive config items stored in YAML files in S3; they are pulled down when the app launches and read into memory.
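
As a rough illustration of the AES-256-GCM layer only, the sketch below encrypts and decrypts a hash of job arguments with Ruby’s standard OpenSSL library. It is not the IDP’s encryptor: the method names, the JSON serialization, and the iv + tag + ciphertext layout are assumptions made for the example, and the outer AWS KMS layer is omitted entirely.

```ruby
require 'openssl'
require 'json'

GCM_IV_BYTES = 12   # default IV length for AES-GCM in OpenSSL
GCM_TAG_BYTES = 16  # default authentication tag length

# Encrypt a hash of job arguments with AES-256-GCM.
# key must be 32 bytes. Returns Base64 of iv + auth tag + ciphertext.
def encrypt_arguments(arguments, key)
  cipher = OpenSSL::Cipher.new('aes-256-gcm')
  cipher.encrypt
  cipher.key = key
  iv = cipher.random_iv # generates and sets a fresh IV for this message
  ciphertext = cipher.update(JSON.generate(arguments)) + cipher.final
  # The auth tag is only available after #final.
  [iv + cipher.auth_tag + ciphertext].pack('m0')
end

# Reverse of encrypt_arguments. Raises OpenSSL::Cipher::CipherError if the
# data was tampered with or the key is wrong.
def decrypt_arguments(encoded, key)
  raw = encoded.unpack1('m0')
  iv = raw.byteslice(0, GCM_IV_BYTES)
  tag = raw.byteslice(GCM_IV_BYTES, GCM_TAG_BYTES)
  ciphertext = raw.byteslice((GCM_IV_BYTES + GCM_TAG_BYTES)..)

  decipher = OpenSSL::Cipher.new('aes-256-gcm')
  decipher.decrypt
  decipher.key = key
  decipher.iv = iv
  decipher.auth_tag = tag
  JSON.parse(decipher.update(ciphertext) + decipher.final)
end
```

Because GCM is authenticated, a modified ciphertext fails to decrypt rather than yielding corrupted arguments. In this sketch the key would be a random 32-byte value standing in for session_encryption_key.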

Logging

Worker logs go to log/production.log, just like the IDP web hosts, and are ingested into CloudWatch.

GoodJob logs job durations by default.

Deploys

The code for the workers lives in the same repository as the IDP, but is deployed to separate worker instances.

Configuration

To enable Ruby workers in an environment:

  1. Update the environment’s application.yml
    • Set ruby_workers_idv_enabled: 'true' (this enables async for resolution and address jobs)
  2. Set terraform variables:
    • Set worker sizes to positive integers (example pull request):
      • asg_worker_min: 2
      • asg_worker_desired: 2
      • asg_worker_max: 8 (or another suitable maximum)
    • Enable worker alarms for alerting (example pull request)
      • idp_worker_alarms_enabled: 1
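
Before applying the terraform variables, the worker sizes can be sanity-checked with a few lines of Ruby. The helper below is hypothetical and not part of the IDP or its terraform tooling; it only checks that the three sizes are positive integers and that the desired size falls between the minimum and maximum, as an AWS Auto Scaling group requires.

```ruby
# Hypothetical helper: returns a list of problems with the worker ASG sizes.
# An empty list means the values look sane.
def worker_size_errors(vars)
  names = %w[asg_worker_min asg_worker_desired asg_worker_max]
  errors = names.reject { |name| vars[name].is_a?(Integer) && vars[name].positive? }
                .map { |name| "#{name} must be a positive integer" }
  return errors unless errors.empty?

  min, desired, max = vars.values_at(*names)
  unless desired.between?(min, max)
    errors << 'asg_worker_desired must be between asg_worker_min and asg_worker_max'
  end
  errors
end

puts worker_size_errors(
  'asg_worker_min' => 2, 'asg_worker_desired' => 2, 'asg_worker_max' => 8
).inspect
```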