Auto-Remediation
Automatically fix security deviations while maintaining safety and control. Configure remediation modes, safety gates, and approval workflows to match your risk tolerance.
What is Auto-Remediation?
Auto-remediation is TrueConfig's capability to automatically fix security deviations detected during scans. Instead of just alerting you to problems, TrueConfig can take corrective action to restore your environment to its baseline state.
Think of it as "infrastructure as code" for identity security: your baseline defines the desired state, and TrueConfig continuously enforces it through automated drift correction.
How It Works
Drift Detected
During a scheduled scan, TrueConfig detects that a control has failed (e.g., 5 Global Admins when your baseline allows a maximum of 3).
Safety Gates Check
Before taking action, TrueConfig runs safety gate checks to ensure the change is safe (reversible, no dependencies broken, emergency access verified).
Approval Decision
Based on the control's remediation mode and safety gate results, TrueConfig either auto-remediates, requests manual approval, or provides advisory guidance.
Remediation Executed
TrueConfig calls the Microsoft Graph API to apply the fix, records audit events, and opens a rollback window for quick reversal if needed.
Verification Scan
The next scan verifies the fix was successful. If the control still fails, TrueConfig alerts your team for manual investigation.
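The approval decision in this flow can be sketched as a small function. This is a minimal illustration, not TrueConfig's actual implementation: the names Mode, Action, and decide_action are hypothetical, and the sketch assumes that a failed safety gate downgrades an automatic fix to an approval request.

```python
from enum import Enum


class Mode(Enum):
    ADVISORY = "advisory"
    MANUAL = "manual"
    AUTO = "auto"


class Action(Enum):
    PROVIDE_GUIDANCE = "provide_guidance"
    REQUEST_APPROVAL = "request_approval"
    AUTO_REMEDIATE = "auto_remediate"


def decide_action(mode: Mode, gates_passed: bool) -> Action:
    """Map a control's remediation mode and safety gate result to the next step."""
    if mode is Mode.ADVISORY:
        # Advisory controls never change anything, whatever the gates say.
        return Action.PROVIDE_GUIDANCE
    if mode is Mode.AUTO and gates_passed:
        return Action.AUTO_REMEDIATE
    # Manual mode, or auto mode with at least one failed gate (assumption).
    return Action.REQUEST_APPROVAL
```

For example, decide_action(Mode.AUTO, gates_passed=False) returns Action.REQUEST_APPROVAL rather than applying the fix.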
Remediation Modes
Each control in your baseline can operate in one of three remediation modes. You can configure modes globally or override per-control based on your risk tolerance.
Advisory Mode
Read-only monitoring with remediation guidance
How It Works
TrueConfig detects deviations and provides step-by-step remediation instructions, but never makes changes automatically. You manually implement fixes using the provided guidance.
When to Use
- First-time TrueConfig deployment (build trust)
- High-risk controls (Global Admin changes, CA policies)
- Controls affecting critical business workflows
- Learning mode: understanding what changes TrueConfig recommends
Example Advisory Guidance
Control PA-01: Excessive Privileged Accounts
Status: FAIL | Severity: High
Issue:
You have 5 permanent Global Administrators. Your baseline allows a maximum of 3.
Why It Matters:
Excessive permanent privileged accounts increase attack surface and insider threat risk. Each Global Admin account is a potential path for attackers to gain full tenant control.
Recommended Actions:
- Review the 5 Global Admin accounts listed in the evidence
- Identify which 2 accounts can be removed or downgraded
- Consider using PIM (Privileged Identity Management) for just-in-time access
- Keep only break-glass emergency accounts as permanent Global Admins
How to Fix (Manual):
- Navigate to Entra ID → Roles and administrators
- Select "Global Administrator" role
- Remove unnecessary assignments
- Document why remaining accounts require permanent access
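The check behind a finding like PA-01 can be sketched as a simple count against the baseline limit. The function name, field names, and account names below are hypothetical and only illustrate the comparison; they are not TrueConfig's evidence format.

```python
def check_privileged_accounts(permanent_global_admins: list, max_allowed: int = 3) -> dict:
    """Compare the number of permanent Global Admins with the baseline limit."""
    found = len(permanent_global_admins)
    excess = max(0, found - max_allowed)
    return {
        "control": "PA-01",
        "status": "FAIL" if excess else "PASS",
        "found": found,
        "allowed": max_allowed,
        "excess": excess,  # how many assignments to remove or downgrade
    }


# Hypothetical account names, matching the 5-versus-3 example above.
result = check_privileged_accounts(
    ["admin-a", "admin-b", "admin-c", "breakglass-1", "breakglass-2"]
)
```

With five accounts and a limit of three, the result reports a FAIL with an excess of 2, which matches the "identify which 2 accounts" step in the guidance.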
Manual Mode
One-click remediation after approval
How It Works
TrueConfig prepares the remediation, shows you exactly what will change, and waits for your approval. You review the planned changes, click "Approve & Execute," and TrueConfig applies the fix.
When to Use
- Medium-risk controls (app ownership, secret expiration)
- Controls with business impact (may affect workflows)
- Compliance-required change management (audit trail needed)
- Testing auto-remediation before full enablement
What an Approval Request Looks Like
APP-02: Secret Expiration Enforcement
Status: Pending Approval
Planned Change:
Update application secret expiration
Affected Resources:
- App: Legacy API
- Current expiration: December 31, 2027 (about 3 years from now)
- New expiration: December 21, 2025 (12 months from now)
Safety Checks:
- ✓ Change is reversible
- ✓ No dependencies will break
- ✓ Emergency access verified
Rollback available for 24 hours after execution
You review this request in the TrueConfig dashboard, verify the changes are acceptable, add approval notes if needed, and click "Approve & Execute."
Approval Controls
- Role-Based Approval: Only users with security_admin or owner roles can approve
- Audit Trail: All approvals logged with approver identity and timestamp
- Timeout: Requests expire after 7 days if not approved
- Rollback Window: 24-hour window to reverse the change
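The role and timeout rules above can be sketched as a single validation step. This is an illustrative sketch under stated assumptions: ApprovalRequest and can_approve are hypothetical names, and only the role names and the 7-day expiry come from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

APPROVER_ROLES = {"security_admin", "owner"}  # roles allowed to approve
REQUEST_TTL = timedelta(days=7)               # requests expire after 7 days


@dataclass
class ApprovalRequest:
    control_id: str
    created_at: datetime


def can_approve(request: ApprovalRequest, approver_roles: set, now: datetime):
    """Return (allowed, reason) for an approval attempt."""
    if not (approver_roles & APPROVER_ROLES):
        return False, "approver lacks security_admin or owner role"
    if now - request.created_at > REQUEST_TTL:
        return False, "request expired"
    return True, "ok"


request = ApprovalRequest("APP-02", datetime(2025, 1, 1, tzinfo=timezone.utc))
```

Passing the current time in as an argument, rather than reading the clock inside the function, keeps the expiry rule easy to test.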
Auto Mode
Automatic remediation with safety gates
How It Works
TrueConfig detects drift, validates that all safety gates pass, and automatically applies the fix without human intervention. All actions are logged, and each change has a rollback window.
When to Use
- Low-risk controls (app ownership assignment, audit log settings)
- Controls with low blast radius (≤2)
- Well-tested controls (after 30+ days in manual mode)
- High-frequency drift (e.g., developers creating apps without owners)
Safety Gates (Required)
Before auto-remediation executes, ALL safety gates must pass:
1. Reversibility Gate
Change must be reversible. Controls with blast_radius ≥ 4 require manual approval. Irreversible actions (delete user, revoke all permissions) are blocked.
2. Dependency Gate
All prerequisites must be satisfied. Controls check that dependent controls have passed and required licenses are available.
3. Emergency Access Gate
Break-glass accounts must be verified within the last 30 days. This prevents auto-remediation from locking you out of your tenant.
4. Observability Gate
Success signals must be observable. The next scan must be able to verify the fix was applied successfully.
5. Enablement Gate
Auto-remediation must be enabled globally AND for this specific control. Per-control overrides can disable auto mode for high-risk controls.
How Safety Gates Protect You
Before any automatic change is made, TrueConfig runs all five safety gates. If even one gate fails, the change requires manual approval instead. This ensures that risky changes never happen automatically.
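The five gates can be sketched as one function that returns the names of the gates that failed. The GateContext fields and the failed_gates function are hypothetical. The thresholds (blast radius below 4, break-glass verification within 30 days) are taken from the gate descriptions above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class GateContext:
    reversible: bool
    blast_radius: int
    dependencies_satisfied: bool
    emergency_access_verified_at: datetime
    fix_observable: bool
    enabled_globally: bool
    enabled_for_control: bool


def failed_gates(ctx: GateContext, now: datetime) -> list:
    """Return the names of failed gates; an empty list means auto mode may proceed."""
    checks = {
        "reversibility": ctx.reversible and ctx.blast_radius < 4,
        "dependency": ctx.dependencies_satisfied,
        "emergency_access": now - ctx.emergency_access_verified_at <= timedelta(days=30),
        "observability": ctx.fix_observable,
        "enablement": ctx.enabled_globally and ctx.enabled_for_control,
    }
    return [name for name, passed in checks.items() if not passed]
```

Returning the failed gate names, rather than a single boolean, lets an approval request show the reviewer why the change was not applied automatically.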
Configuring Remediation Settings
Remediation settings are configured at the tenant level and can be customized per-control.
Global Configuration
You can configure remediation settings through the TrueConfig dashboard in your tenant settings. This includes:
- Global kill switch: Instantly disable all auto-remediation if needed
- Emergency access verification: Track when you last tested your break-glass accounts
- Control-specific overrides: Enable auto mode for low-risk controls (like APP-01) while keeping high-risk controls (like PA-01) in manual mode
- Disable controls: Turn off controls that don't apply to your environment (with documented reason)
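How a global setting and per-control overrides combine can be sketched as follows. The settings layout is an assumption made for illustration: auto_remediation_enabled and control_overrides (with a reason field) are names used elsewhere on this page, while default_mode, mode, and disabled are hypothetical keys. The sketch also assumes that the kill switch downgrades auto mode to manual.

```python
def effective_mode(control_id: str, settings: dict) -> str:
    """Resolve the remediation mode that applies to one control."""
    override = settings.get("control_overrides", {}).get(control_id, {})
    if override.get("disabled"):
        return "disabled"
    mode = override.get("mode", settings.get("default_mode", "advisory"))
    # Assumption: with the kill switch off, auto mode falls back to manual.
    if mode == "auto" and not settings.get("auto_remediation_enabled", False):
        return "manual"
    return mode


# Hypothetical tenant settings.
settings = {
    "auto_remediation_enabled": True,
    "default_mode": "advisory",
    "control_overrides": {
        "APP-01": {"mode": "auto"},
        "PA-01": {"mode": "manual", "reason": "Global Admin changes need review"},
    },
}
```

With these settings, APP-01 resolves to auto, PA-01 to manual, and any control without an override to advisory.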
Enabling Auto-Remediation
Follow these steps to safely enable auto-remediation:
Verify Break-Glass Accounts
Test your emergency access accounts to ensure you can regain access if auto-remediation causes issues. Document the test in TrueConfig.
Grant Write Permissions
Add write permissions to TrueConfig's app registration (RoleManagement.ReadWrite.Directory, Policy.ReadWrite.ConditionalAccess, etc.).
Start with Manual Mode
Enable manual mode for 1-2 low-risk controls (APP-01, APP-02). Test the approval workflow and verify rollback works.
Enable Auto for Low-Risk Controls
After 30+ days of successful manual remediation, promote controls to auto mode. Monitor closely for the first week.
Gradual Expansion
Incrementally add controls to auto mode. Never enable more than 2-3 controls per month until you've verified stability.
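The promotion rule can be sketched as an eligibility check. The function and constant names are hypothetical. The sketch uses the 30-day threshold from the Auto Mode guidance and the upper bound of 3 promotions per month from the step above.

```python
from datetime import date

MIN_DAYS_IN_MANUAL = 30        # from "after 30+ days in manual mode"
MAX_PROMOTIONS_PER_MONTH = 3   # upper bound of "2-3 controls per month"


def eligible_for_auto(manual_since: date, today: date, promoted_this_month: int) -> bool:
    """True if a control has spent long enough in manual mode and the monthly cap allows it."""
    days_in_manual = (today - manual_since).days
    return (
        days_in_manual >= MIN_DAYS_IN_MANUAL
        and promoted_this_month < MAX_PROMOTIONS_PER_MONTH
    )
```

A control that entered manual mode on March 1 would become eligible on March 31, provided fewer than three controls were already promoted that month.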
Rollback & Recovery
All remediation actions include rollback capabilities to quickly undo changes if issues arise.
Rollback Window
After any remediation executes, you have 24 hours to undo the change if needed. TrueConfig saves the previous configuration so you can restore it with one click.
For example, if TrueConfig updates an app secret's expiration from 3 years to 12 months, you can restore the 3-year expiration anytime within 24 hours.
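The rollback window can be sketched as a time check in front of a snapshot restore. Remediation, RollbackWindowExpired, and rollback are hypothetical names, and the snapshot contents are invented for illustration. Only the 24-hour window and the idea of a saved previous configuration come from the text above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

ROLLBACK_WINDOW = timedelta(hours=24)


class RollbackWindowExpired(Exception):
    pass


@dataclass
class Remediation:
    executed_at: datetime
    before_snapshot: dict  # configuration saved before the change
    after_snapshot: dict   # configuration after the change


def rollback(remediation: Remediation, now: datetime) -> dict:
    """Return the configuration to restore, or raise if the window has closed."""
    if now - remediation.executed_at > ROLLBACK_WINDOW:
        raise RollbackWindowExpired("rollback window of 24 hours has passed")
    # Return a copy so the stored snapshot stays untouched.
    return dict(remediation.before_snapshot)


remediation = Remediation(
    executed_at=datetime(2025, 1, 1, 12, 0),
    before_snapshot={"secret_lifetime_months": 36},
    after_snapshot={"secret_lifetime_months": 12},
)
```

Calling rollback 23 hours after execution returns the 36-month configuration, while a call 25 hours after execution raises RollbackWindowExpired.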
How to Roll Back
From TrueConfig Dashboard
Navigate to Remediation History → Select the remediation → Click "Rollback Changes". TrueConfig restores the previous configuration and records the rollback in audit logs.
Manual Rollback
If TrueConfig is unavailable, use the before_snapshot recorded in the audit log to manually restore the previous configuration via the Entra ID portal or PowerShell.
Recovery Scenarios
Automatic Rollback
If the next scan detects that the control still fails after remediation, TrueConfig can automatically roll back the change and alert your team.
Manual Intervention Required
For high-impact remediations (CA policies, role assignments), automatic rollback is disabled. Security admins must manually review and decide whether to roll back.
Emergency Disable
If auto-remediation causes widespread issues, set auto_remediation_enabled=false in tenant settings. All pending remediations are cancelled, and no new ones will be created.
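The effect of the emergency disable can be sketched as follows. The emergency_disable function and the shape of the pending queue are hypothetical. Only the auto_remediation_enabled setting and the cancel-all-pending behavior come from the description above.

```python
def emergency_disable(settings: dict, remediations: list) -> list:
    """Turn off auto-remediation and cancel every pending remediation.

    Returns the ids of the remediations that were cancelled.
    """
    settings["auto_remediation_enabled"] = False
    cancelled = []
    for remediation in remediations:
        if remediation["status"] == "pending":
            remediation["status"] = "cancelled"
            cancelled.append(remediation["id"])
    return cancelled


# Hypothetical state before the kill switch is used.
settings = {"auto_remediation_enabled": True}
queue = [{"id": 1, "status": "pending"}, {"id": 2, "status": "executed"}]
cancelled_ids = emergency_disable(settings, queue)
```

Remediations that have already executed are left alone; undoing those is a rollback decision, not part of the kill switch.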
Audit Trail & Compliance
All remediation actions are recorded in immutable audit logs for compliance and forensic analysis.
What Audit Events Capture
Every remediation action creates a detailed audit record that includes:
- What changed: Exactly what was modified (e.g., "Removed 2 permanent Global Administrator assignments")
- Before/after state: The configuration before the change (5 Global Admins) and after (3 Global Admins)
- Who/what made the change: Whether it was automatic (system) or manually approved (with the approver's identity)
- When: Precise timestamp of when the change occurred
- Microsoft Graph correlation: Request IDs that can be cross-referenced with Entra ID audit logs
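An audit record with these fields might look like the sketch below. The AuditEvent class and its field names are hypothetical, and the timestamp and request ID are placeholders; this is not TrueConfig's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class AuditEvent:
    control_id: str
    summary: str               # what changed
    before_state: dict         # configuration before the change
    after_state: dict          # configuration after the change
    actor_type: str            # "system" or "user"
    actor_id: Optional[str]    # approver identity, None for system actions
    occurred_at: str           # ISO 8601 timestamp in UTC
    graph_request_ids: tuple   # for correlation with Entra ID audit logs


event = AuditEvent(
    control_id="PA-01",
    summary="Removed 2 permanent Global Administrator assignments",
    before_state={"global_admins": 5},
    after_state={"global_admins": 3},
    actor_type="system",
    actor_id=None,
    occurred_at="2025-01-01T00:00:00Z",                           # placeholder
    graph_request_ids=("00000000-0000-0000-0000-000000000000",),  # placeholder
)
record = json.dumps(asdict(event))
```

Serializing to JSON makes the record easy to export or to compare against the corresponding Entra ID audit log entry.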
Compliance Features
Immutable Logs
Audit events are append-only and cannot be modified or deleted (enforced by database row-level security (RLS) policies).
Graph Request IDs
Every remediation includes Microsoft Graph request IDs for correlation with Entra ID audit logs.
Before/After Snapshots
Every change records the previous state and new state for forensic analysis.
Actor Attribution
System-initiated and user-initiated actions are clearly distinguished, and the user's identity is captured for user-initiated actions.
Best Practices
Start Conservative
Begin with advisory mode for all controls. Progress to manual mode for 1-2 controls. Only enable auto mode after 30+ days of successful manual remediation.
Test Break-Glass Monthly
Verify emergency access accounts monthly. Document the test in TrueConfig to keep the emergency_access_verified gate current.
Monitor Remediation Metrics
Track remediation success rate, rollback frequency, and time-to-remediation. A high rollback rate for a control suggests that it should be downgraded to manual mode.
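These three metrics can be computed from remediation history with a few lines. The record fields (succeeded, rolled_back, hours_to_remediate) are hypothetical and the sample data is invented for illustration.

```python
from statistics import mean


def remediation_metrics(records: list) -> dict:
    """Compute success rate, rollback rate, and mean time-to-remediation."""
    total = len(records)
    if total == 0:
        return {"success_rate": None, "rollback_rate": None, "mean_hours_to_remediate": None}
    succeeded = [r for r in records if r["succeeded"]]
    rolled_back = [r for r in records if r["rolled_back"]]
    hours = [r["hours_to_remediate"] for r in succeeded]
    return {
        "success_rate": len(succeeded) / total,
        "rollback_rate": len(rolled_back) / total,
        "mean_hours_to_remediate": mean(hours) if hours else None,
    }


# Invented sample history: four remediations, one failed, one rolled back.
history = [
    {"succeeded": True, "rolled_back": False, "hours_to_remediate": 2},
    {"succeeded": True, "rolled_back": False, "hours_to_remediate": 4},
    {"succeeded": True, "rolled_back": True, "hours_to_remediate": 6},
    {"succeeded": False, "rolled_back": False, "hours_to_remediate": None},
]
metrics = remediation_metrics(history)
```

Computing the same figures per control, rather than per tenant, makes it easier to spot the specific controls that should be moved back to manual mode.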
Document Overrides
When disabling auto-remediation for a control, document why in the control_overrides.reason field. Review quarterly.
Never Auto-Remediate in Production First
If you have a test/dev tenant, enable auto-remediation there first. Verify it works as expected before enabling in production.