---
title: Scheduling and running scans
status: draft
note: AI-generated first-pass transcript pending video production + SME review.
---
Scan cadence is a small decision that compounds. Get it right and you
catch posture drift before it matters. Get it wrong and you either
miss things or you're paying for noise.
## First scan vs ongoing
The **first scan** for a new client is bigger than ongoing scans —
it's a full inventory plus baseline. Expect it to take 30 minutes to
a few hours depending on connector inventory. Don't promise the
customer a 5-minute first run; you'll be wrong, and they'll lose
trust on the first interaction.
Ongoing scans are incremental. Default cadence is **every 6 hours**.
The platform compares against the last scan and emits
`finding_created` only on new posture issues, not redundant
re-detections.
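
The incremental comparison can be pictured as a set difference between two scans. This is a minimal sketch, not the platform's implementation: the `Finding` shape, its `resource_id` and `control_id` fields, and the event payload are all assumptions made for illustration. Only the `finding_created` event name comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    # Hypothetical identity for a posture issue: which resource, which control.
    resource_id: str
    control_id: str


def new_finding_events(previous: set[Finding], current: set[Finding]) -> list[dict]:
    """Return one finding_created event per finding absent from the last scan."""
    created = current - previous
    # Findings present in both scans are re-detections and emit nothing.
    # Sorting only makes the output order deterministic.
    return [
        {
            "event": "finding_created",
            "resource_id": f.resource_id,
            "control_id": f.control_id,
        }
        for f in sorted(created, key=lambda f: (f.resource_id, f.control_id))
    ]


last_scan = {Finding("mailbox-01", "mfa-enforced")}
this_scan = {
    Finding("mailbox-01", "mfa-enforced"),
    Finding("tenant", "legacy-auth-disabled"),
}
events = new_finding_events(last_scan, this_scan)
```

Here `events` holds a single entry for the `legacy-auth-disabled` finding. The `mfa-enforced` finding was already known, so it produces no event.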
## Configuring cadence
Cadence is set in `tenantConfig.aegis.scan_cadence`. You can override
it per connector: a shorter interval for high-volume sources (e.g.,
M365 audit logs every hour) or a longer one where data changes slowly
(e.g., PSA every 24 hours).
Don't set every connector to 1-hour cadence — you'll burn rate limits
and produce noise. The 6-hour default is calibrated for the typical
MSP load.
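
To make the override behavior concrete, here is a sketch of how a per-connector cadence might be resolved against the default. The dictionary layout under `scan_cadence` (`default_hours`, `connector_overrides`) and the connector keys are hypothetical; the real schema may differ, so treat this as an illustration of the fallback logic only.

```python
from datetime import timedelta

DEFAULT_HOURS = 6  # the platform default described above

# Hypothetical config shape; only the aegis.scan_cadence path comes from the docs.
tenant_config = {
    "aegis": {
        "scan_cadence": {
            "default_hours": 6,
            "connector_overrides": {"m365_audit": 1, "psa": 24},
        }
    }
}


def cadence_for(config: dict, connector: str) -> timedelta:
    """Return the connector's override if one exists, else the tenant default."""
    cadence = config.get("aegis", {}).get("scan_cadence", {})
    overrides = cadence.get("connector_overrides", {})
    hours = overrides.get(connector, cadence.get("default_hours", DEFAULT_HOURS))
    if hours <= 0:
        raise ValueError(f"cadence for {connector!r} must be positive, got {hours}")
    return timedelta(hours=hours)


m365 = cadence_for(tenant_config, "m365_audit")
psa = cadence_for(tenant_config, "psa")
other = cadence_for(tenant_config, "rmm")  # no override, falls back to the default
```

A connector with no override, such as the made-up `rmm` key here, inherits the 6-hour default, which is why overrides should stay the exception.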
## Stall handling + idempotency
Scans can stall: a connector times out, a worker dies, the queue
backs up. The platform's behavior:
1. Each scan has a unique `scan_id` from the moment it starts.
2. Resumable units of work are checkpointed on the way through.
3. If the scan stalls, you click "resume" and the platform picks up
from the last checkpoint with the same `scan_id`.
Don't cancel and restart from scratch on a stalled scan. Resuming
keeps the same `scan_id` and skips work already checkpointed; a
restart does neither. You'll duplicate work, you'll waste connector
rate budget, and you may end up with two sets of findings that look
like a real change.
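
The checkpoint-and-resume behavior can be sketched as follows. This is an illustrative model under stated assumptions, not the platform's code: the `Scan` class, the unit names, and the use of `TimeoutError` to represent a stall are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Scan:
    units: list[str]
    # The scan_id is assigned once, when the scan is created, and never changes.
    scan_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Checkpoint: units of work that finished successfully.
    completed: list[str] = field(default_factory=list)

    def run(self, collect) -> bool:
        """Process the remaining units; return True once the scan finishes."""
        for unit in self.units:
            if unit in self.completed:
                continue  # already checkpointed, so a resume skips it
            try:
                collect(unit)
            except TimeoutError:
                return False  # stalled; the checkpoint is preserved
            self.completed.append(unit)
        return True


calls = []
fail_once = {"psa"}  # simulate one connector timing out a single time


def collect(unit: str) -> None:
    if unit in fail_once:
        fail_once.discard(unit)
        raise TimeoutError(unit)
    calls.append(unit)


scan = Scan(units=["m365", "psa", "rmm"])
original_id = scan.scan_id
finished = scan.run(collect)  # stalls at "psa"
resumed = scan.run(collect)   # resume: skips "m365", retries "psa"
```

After the resume, `calls` contains each unit exactly once and `scan.scan_id` still equals `original_id`. A cancel-and-restart would instead create a new `Scan` with an empty checkpoint and collect `m365` a second time.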
## Backpressure
When connector providers rate-limit AEGIS, the platform queues. You
won't see scans fail; you'll see them slow down. The Pulse event
`connector_throttled` lets you know which provider is the bottleneck.
If a provider is consistently throttling, the right fix is usually one of:
- scan that connector less often (lengthen its cadence interval),
- request a higher quota from the provider,
- or check whether you're collecting more scope than necessary.
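
To decide which provider needs one of these fixes, it helps to count throttling events over a period. The sketch below assumes Pulse events arrive as dictionaries with an `event` name and a `provider` field; that payload shape is an assumption, and only the `connector_throttled` name comes from the text above.

```python
from collections import Counter


def throttling_by_provider(events: list[dict]) -> list[tuple[str, int]]:
    """Count connector_throttled events per provider, most throttled first."""
    counts = Counter(
        e["provider"] for e in events if e.get("event") == "connector_throttled"
    )
    return counts.most_common()


# Hypothetical sample of Pulse events from one day.
pulse_events = [
    {"event": "connector_throttled", "provider": "m365"},
    {"event": "finding_created", "provider": "m365"},
    {"event": "connector_throttled", "provider": "psa"},
    {"event": "connector_throttled", "provider": "m365"},
]
ranking = throttling_by_provider(pulse_events)
```

In this sample `ranking` puts `m365` first with two throttling events, so that connector is the first candidate for a longer cadence interval or a quota request.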
## What clients see
Clients see scan status in their AEGIS surface. They see when a scan
is running, when it last completed, and whether the score moved. They
don't see the internal queue state — they see the outcome.
Be honest in customer-facing comms. If a scan stalled and was
resumed, that's not a problem worth narrating. If a scan failed and
the data is incomplete, say so.
## Hands-on
The **aegis-scan-cadence** seed gives you a tenant with three clients
on different cadences. Run a manual scan on the high-volume client.
While it's running, deliberately revoke a connector. Watch:
1. The scan continues, marking the affected control families as
"data not collected".
2. A `connector_failure` Pulse fires.
3. The Raw Score moves to reflect partial coverage.
That's the system showing you the truth. Practice reading it.
## What's next
Module 4 covers the actual work: triaging findings, deciding what to
escalate, and using accepted-risk attestations honestly.
Module 3 of 5
Cadence for first scan vs ongoing scans, idempotency on retries, and what to do when a scan stalls.
Hands-on sandbox: aegis · seed: aegis-scan-cadence · 60 min