Legal AI Continuous Improvement Tracker

Purpose

The Continuous Improvement Tracker gives legal departments a systematic framework for capturing feedback, measuring AI system performance against a baseline, identifying improvement opportunities, and executing structured improvement cycles. It is anchored in the Kaizen PDCA (Plan–Do–Check–Act) methodology and extended with Risk Taxonomy 2026 monitoring, ROAI quadrant prioritisation, and Agentic Tier provisions for Level 4 tools.

When to Use

Blueprint Stage: Pillar 5 — Use Cases, Execution and Measurement (optimisation phase)
Frequency: Ongoing monitoring with weekly check-ins, monthly reviews, and quarterly strategic sessions
Audiences: AI Task Force (STR-07), Legal Operations, legal professionals using AI tools, and management
Context: Post-deployment optimisation, performance management, maturity band progression, and strategic planning

How to Use

Baseline Establishment: Capture initial performance metrics, risk class baselines, and user satisfaction scores before improvement cycles begin.
Feedback Collection: Implement multi-channel feedback (in-tool prompts, surveys, focus groups, performance data).
Risk Class Monitoring: Track all nine Risk Taxonomy 2026 classes as performance dimensions on the Risk Class Metrics Dashboard.
Performance Monitoring: Track KPIs and OKRs through dashboards and regular reviews.
Improvement Identification: Use data analysis and ROAI quadrant framing to identify and prioritise improvements.
Kaizen PDCA Cycles: Execute improvements through disciplined Plan–Do–Check–Act cycles with documentation.
Agentic Tier Monitoring: Verify five ongoing monitoring provisions monthly for all Level 4 tools in scope.

Risk Class Metrics Dashboard

Track the nine Risk Taxonomy 2026 risk classes as continuous improvement dimensions alongside standard performance KPIs. For each active AI system, rate current performance on each risk class as On Track, Watch, or Breach.


Class 6 Shadow AI escalation rule: Any quarterly increase in unauthorised AI use incidents triggers STR-07 notification and TAL-02 change management response. Do not wait for Breach threshold if trend is upward.
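The escalation rule above reduces to a quarter-over-quarter comparison. The following is a minimal sketch in Python, assuming Class 6 incident counts are available as a chronological list of quarterly totals; the function name and data shape are illustrative and not part of the framework.

```python
def shadow_ai_escalation_required(quarterly_incidents: list[int]) -> bool:
    """Return True when the latest quarter has more Class 6 incidents than the one before.

    Assumption: the list is ordered oldest to newest, one total per quarter.
    """
    if len(quarterly_incidents) < 2:
        # Assumption: a single quarter cannot establish a trend, so no escalation.
        return False
    previous, latest = quarterly_incidents[-2], quarterly_incidents[-1]
    return latest > previous


# Any increase escalates, regardless of where the Breach threshold sits.
print(shadow_ai_escalation_required([2, 3]))  # True
print(shadow_ai_escalation_required([3, 3]))  # False
```

A True result corresponds to the STR-07 notification and TAL-02 change management response described above.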

Aggregate Risk Score: Sum Watch items (1 point each) + Breach items (3 points each). Score ≥10 = STR-07 strategic review required.
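As a worked illustration of the scoring rule, here is a short Python sketch. It assumes ratings are held as a mapping from risk class number (1 to 9) to a rating string for one AI system; whether the score is computed per system or across systems is not specified here, so that scoping, along with all names below, is an assumption.

```python
from collections import Counter

# Points per rating, taken from the scoring rule above.
POINTS = {"On Track": 0, "Watch": 1, "Breach": 3}
REVIEW_THRESHOLD = 10  # a score at or above this requires an STR-07 strategic review


def aggregate_risk_score(ratings: dict[int, str]) -> int:
    """Sum 1 point per Watch item and 3 points per Breach item."""
    counts = Counter(ratings.values())
    unknown = set(counts) - set(POINTS)
    if unknown:
        raise ValueError(f"Unknown rating(s): {sorted(unknown)}")
    return sum(POINTS[rating] * n for rating, n in counts.items())


def strategic_review_required(ratings: dict[int, str]) -> bool:
    return aggregate_risk_score(ratings) >= REVIEW_THRESHOLD


# Hypothetical ratings: four Watch items and two Breach items.
example = {1: "Watch", 2: "On Track", 3: "Watch", 4: "On Track", 5: "Watch",
           6: "Breach", 7: "On Track", 8: "Watch", 9: "Breach"}
print(aggregate_risk_score(example))       # 4 x 1 + 2 x 3 = 10
print(strategic_review_required(example))  # True
```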

Agentic Tier Ongoing Monitoring Provisions

For every Level 4 (AI as Executor) tool tracked in this system, verify these five provisions monthly in addition to standard performance metrics.

| Provision | Monthly Verification | Pass/Watch/Fail |
|---|---|---|
| 1. Kill-Switch Operational | Test kill-switch activation monthly; confirm response time meets documented spec | |
| 2. Intervention Logging Current | Verify all agent decisions logged correctly; review log completeness for the month | |
| 3. Scope Limitation Enforced | Confirm agent has not accessed data or systems outside defined deployment scope | |
| 4. Escalation Protocol Active | Confirm escalation triggers functioned correctly for any escalation events in the month | |
| 5. Bias Monitoring Current | Review bias monitoring results for the month; confirm no new bias patterns emerging | |

Monthly Gate: Any Fail on provisions 1, 2, or 3 = immediate STR-07 notification and deployment suspension until remediated. Provision 4 or 5 Fail = remediation plan within 5 business days.
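The Monthly Gate can be expressed as a small decision function. The sketch below is one possible encoding in Python, assuming verification results are recorded as a mapping from provision number (1 to 5) to "Pass", "Watch", or "Fail". The names are hypothetical. No action is defined above for a Watch result, so this sketch returns no gate action for it.

```python
from enum import Enum


class GateAction(Enum):
    NONE = "no gate action"
    REMEDIATION_PLAN = "remediation plan within 5 business days"
    SUSPEND = "immediate STR-07 notification and deployment suspension until remediated"


# Provisions 1 to 3: kill-switch, intervention logging, scope limitation.
SUSPENSION_PROVISIONS = {1, 2, 3}
ALL_PROVISIONS = {1, 2, 3, 4, 5}


def monthly_gate(results: dict[int, str]) -> GateAction:
    """Apply the Monthly Gate to one Level 4 tool's verification results."""
    if set(results) != ALL_PROVISIONS:
        raise ValueError("Results must cover provisions 1 to 5 exactly")
    failed = {p for p, status in results.items() if status == "Fail"}
    if failed & SUSPENSION_PROVISIONS:
        return GateAction.SUSPEND  # the stricter outcome wins when both kinds fail
    if failed:
        return GateAction.REMEDIATION_PLAN  # only provision 4 and/or 5 failed
    return GateAction.NONE


# Hypothetical month: bias monitoring failed, everything else passed.
print(monthly_gate({1: "Pass", 2: "Pass", 3: "Pass", 4: "Pass", 5: "Fail"}).value)
```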

The Improvement Imperative

Legal AI implementations cannot remain static after deployment. Technology evolves, user needs change, business requirements shift, and risk profiles develop continuously. Without systematic feedback loops and structured improvement processes, initial ROAI gains erode, adoption stalls, and risk exposure increases. The critical risk is slow degradation after deployment as Class 6 (Shadow AI), Class 1 (Hallucination), and Class 9 (Operational Resilience) risks accumulate without a monitoring framework.

Kaizen Meets Legal AI

This framework integrates Kaizen PDCA methodology (Plan–Do–Check–Act) with modern AI performance management. It captures multi-channel feedback, tracks performance across standard KPIs and Risk Taxonomy 2026 metrics, prioritises improvements using ROAI quadrants, and documents every cycle as DPS Adoption and Sophistication lens evidence.

ROAI Quadrant Prioritisation

All improvement decisions should be framed against the ROAI 4-quadrant model:

Protect ROAI: Risk reduction improvements (Class 1–9 remediation) — highest priority.
Comply ROAI: Compliance and governance improvements (e.g. EU AI Act, GOV-02 alignment) — second priority.
Grow ROAI: Adoption and usage expansion improvements — third priority.
Transform ROAI: Strategic capability enhancements — funded only when Protect and Comply ROAI are maintained.

Feedback Collection Framework

User Experience Feedback

In-application prompts at key workflow completion points.
Quick ratings (1–5) for AI-generated outputs.
One-click issue reporting with context capture and routing.
Monthly user satisfaction surveys with NPS tracking.
Quarterly focus groups with AI champions and power users.
Annual strategic feedback sessions with leadership.

Performance Data Collection

Real-time metrics: response time, availability, accuracy.
Usage analytics: adoption patterns, feature utilisation, workflow completion.
Error rate monitoring with automatic categorisation.
Business impact: time savings, cost reduction, quality improvement.
Risk class incident monitoring feeding the Risk Class Metrics Dashboard.

Stakeholder Engagement

Leadership strategic reviews (quarterly).
AI champion network check-ins (monthly).
Client service impact assessments (quarterly).
STR-07 AI Task Force performance briefings (quarterly).

Feedback Routing and Triage

Immediate: Class 6 Shadow AI incidents, Class 2 confidentiality concerns, Class 9 SLA breaches → STR-07 notification.
Within 48 hours: Class 1 accuracy failures, Class 3 bias concerns → vendor notification + GOV-03 log.
Weekly: User experience issues, feature requests, workflow friction → improvement backlog.
Monthly: User satisfaction trends, performance metrics, OKR progress → performance review.
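The triage tiers above can be held as a lookup table. This is a minimal Python sketch under stated assumptions: the category keys, the `Route` type, and the function name are hypothetical, and only the risk classes named in the routing rules are mapped. A risk class route takes precedence over a category route when both apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    timing: str
    destination: str


IMMEDIATE = Route("immediate", "STR-07 notification")
WITHIN_48H = Route("within 48 hours", "vendor notification + GOV-03 log")
WEEKLY = Route("weekly", "improvement backlog")
MONTHLY = Route("monthly", "performance review")

# Risk classes named in the routing rules; other classes are not routed there.
RISK_CLASS_ROUTES = {6: IMMEDIATE, 2: IMMEDIATE, 9: IMMEDIATE, 1: WITHIN_48H, 3: WITHIN_48H}

# Hypothetical category keys for the non-incident feedback types.
CATEGORY_ROUTES = {
    "user_experience": WEEKLY, "feature_request": WEEKLY, "workflow_friction": WEEKLY,
    "satisfaction_trend": MONTHLY, "performance_metric": MONTHLY, "okr_progress": MONTHLY,
}


def route_feedback(risk_class: Optional[int] = None, category: Optional[str] = None) -> Route:
    """Return the triage route for one feedback item."""
    if risk_class in RISK_CLASS_ROUTES:
        return RISK_CLASS_ROUTES[risk_class]
    if category in CATEGORY_ROUTES:
        return CATEGORY_ROUTES[category]
    raise LookupError(f"No route defined for risk_class={risk_class}, category={category}")


print(route_feedback(risk_class=6))
print(route_feedback(category="feature_request"))
```

Raising on an unmapped item, rather than defaulting to the backlog, keeps unrouted risk classes visible until a rule is agreed for them.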

Kaizen PDCA Integration

Plan

Identify improvement opportunities from feedback, Risk Class Dashboard, or OKR gaps.
Classify each improvement against ROAI (Protect / Comply / Grow / Transform).
Define success metrics and DPS evidence requirements.
Secure STR-07 awareness for improvements affecting Class 6 or Agentic Tier provisions.

Do

Execute improvements in controlled scope (e.g. pilot practice group).
Document all changes with before/after metrics.
Apply Metric 0 pre-check before deploying any change to configuration or scope.

Check

Measure impact against defined success metrics.
Verify Risk Class Dashboard position has not worsened.
Confirm Agentic Tier provisions still pass for Level 4 tools.
Assess user adoption and satisfaction impact.

Act

Standardise successful improvements across the organisation.
Update AI BoM, GOV-03, and MAT-04 with improvement records.
Document lessons in the improvement knowledge base (DPS Sophistication evidence).
Feed results to MAT-02 for maturity band progression.

Improvement Prioritisation Matrix

Impact Score (1–5): Protect/Comply (5), Grow (3), Transform (2), UX-only (1).
Effort Score (1–5): 1 = minimal effort; 5 = major effort.
Priority = Impact ÷ Effort; Class 6 and Class 9 risk improvements are always Priority 1.
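The matrix can be applied mechanically to an improvement backlog. The Python sketch below assumes each improvement carries a ROAI quadrant, an effort score, and optionally the risk class it addresses. It interprets "always Priority 1" as sorting Class 6 and Class 9 items ahead of everything else, which is one reading of the rule. All names are illustrative.

```python
from dataclasses import dataclass
from typing import Optional

# Impact scores from the matrix above.
IMPACT = {"Protect": 5, "Comply": 5, "Grow": 3, "Transform": 2, "UX-only": 1}
ALWAYS_FIRST = {6, 9}  # Class 6 and Class 9 improvements are always Priority 1


@dataclass
class Improvement:
    name: str
    quadrant: str
    effort: int  # 1 = minimal effort, 5 = major effort
    risk_class: Optional[int] = None


def priority_ratio(item: Improvement) -> float:
    """Priority = Impact / Effort."""
    if not 1 <= item.effort <= 5:
        raise ValueError("Effort must be between 1 and 5")
    return IMPACT[item.quadrant] / item.effort


def rank(backlog: list[Improvement]) -> list[Improvement]:
    """Order the backlog: Class 6/9 items first, then by descending Impact / Effort."""
    return sorted(backlog, key=lambda i: (i.risk_class not in ALWAYS_FIRST, -priority_ratio(i)))


# Hypothetical backlog entries.
backlog = [
    Improvement("Expand to second practice group", "Grow", effort=1),               # 3 / 1 = 3.0
    Improvement("Block unsanctioned tools", "Protect", effort=5, risk_class=6),     # 5 / 5 = 1.0
    Improvement("Tighten citation checks", "Protect", effort=2, risk_class=1),      # 5 / 2 = 2.5
]
for item in rank(backlog):
    print(item.name, priority_ratio(item))
```

In this example the Class 6 item ranks first despite having the lowest ratio, followed by the Grow item (3.0) and the Class 1 item (2.5).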

OKR Tracking System

Map objectives to ROAI quadrants:

Protect: e.g. maintain hallucination rate below 5%; zero Class 2 incidents; reduce Class 6 incidents to zero.
Comply: e.g. full GOV-02 compliance; EU AI Act checks for Tier 1 vendors; 100% Agentic Tier pass rate.
Grow: e.g. increase adoption %, expand use cases, raise NPS.
Transform: e.g. progress from current MAT-02 band to target; deploy new agentic capabilities with full governance.

Cadence

Weekly: brief KR check-ins.
Monthly: OKR progress review with risk class status.
Quarterly: OKR retrospective and reset with STR-07 briefing.
Annual: OKR achievement review and maturity assessment (MAT-02).

Performance Monitoring Framework

Efficiency KPIs

Time savings per AI-assisted task vs. manual baseline.
Workflow completion rate for AI-enabled processes.

Defensibility Evidence

Monthly performance reports, quarterly improvement plans, Kaizen cycle records, Risk Class Metrics Dashboard snapshots, Agentic Tier monitoring logs, and STR-07 briefing records serve as DPS Adoption and Sophistication lens evidence, demonstrating systematic, evidence-based AI performance management. Retain performance reports for 3 years, and risk class incident records and STR-07 records for 5 years.

Operational Artefacts

· Continuous Improvement Tracker – Performance & Risk Dashboard Template (xlsx, v2026.1, gated)
· Kaizen PDCA Cycle Record – Legal AI Improvement Log (docx, v2026.1, gated)
· Untitled artefact (checklist, gated)

Framework Crosswalk

NIST AI Risk Management Framework

NIST

Supports Govern and Map functions by providing continuous monitoring, feedback loops, and risk-based performance tracking for deployed AI systems.

ISO/IEC 42001 AI Management System

ISO/IEC

Operationalises Plan–Do–Check–Act cycles for AI services, with explicit evidence trails, KPIs, and risk controls aligned to an AI management system.

EU AI Act Provider and User Obligations

European Union

Supports post-deployment monitoring, incident logging, risk management, and technical documentation obligations for high-risk and general-purpose AI use in legal functions.

Operational Details

Inputs

· MAT-02 Maturity Assessment (current maturity band and readiness score for calibration)
· MAT-04 Quarterly Progress Tracker (current performance baseline)
· GOV-03 Risk Register (current risk class incident counts per AI system)
· AI Bill of Materials (tools in scope for improvement tracking)
· MAT-01 Adoption Stage assessment (user adoption levels per tool)
· TAL-02 Change Management data (resistance patterns and training completion)
· Vendor SLA performance data from SUS-01
· User satisfaction and NPS data from feedback channels

Outputs

· Monthly performance report with Risk Class Metrics Dashboard
· Quarterly improvement plan with ROAI quadrant prioritisation
· Kaizen PDCA cycle records for each implemented improvement
· Risk Class Metrics Dashboard (9-class quarterly trends)
· Agentic Tier ongoing monitoring log (Level 4 tools)
· Class 6 Shadow AI incident trend and STR-07 escalation log
· OKR progress report with maturity band progression evidence
· DPS Adoption and Sophistication lens evidence package

Owner

Legal Operations Lead + AI Task Force (STR-07)

Telemetry & Observability

Telemetry-ready
