Purpose
The Continuous Improvement Tracker gives legal departments a systematic framework for capturing feedback, measuring AI system performance against a baseline, identifying improvement opportunities, and executing structured improvement cycles. It is anchored in Kaizen PDCA methodology and extended with Risk Taxonomy 2026 monitoring, ROAI quadrant prioritisation, and Agentic Tier provisions for Level 4 tools.
When to Use
- Blueprint Stage: Pillar 5 — Use Cases, Execution and Measurement (optimisation phase)
- Frequency: Ongoing monitoring with weekly check-ins, monthly reviews, and quarterly strategic sessions
- Audiences: AI Task Force (STR-07), Legal Operations, legal professionals using AI tools, and management
- Context: Post-deployment optimisation, performance management, maturity band progression, and strategic planning
How to Use
- Baseline Establishment: Capture initial performance metrics, risk class baselines, and user satisfaction scores before improvement cycles begin.
- Feedback Collection: Implement multi-channel feedback (in-tool prompts, surveys, focus groups, performance data).
- Risk Class Monitoring: Track all nine Risk Taxonomy 2026 classes as performance dimensions on the Risk Class Metrics Dashboard.
- Performance Monitoring: Track KPIs and OKRs through dashboards and regular reviews.
- Improvement Identification: Use data analysis and ROAI quadrant framing to identify and prioritise improvements.
- Kaizen PDCA Cycles: Execute improvements through disciplined Plan–Do–Check–Act cycles with documentation.
- Agentic Tier Monitoring: Verify the five ongoing monitoring provisions monthly for all Level 4 tools in scope.
Risk Class Metrics Dashboard
Track these nine metrics as continuous improvement dimensions alongside standard performance KPIs. For each active AI system, rate current performance on each risk class: On Track / Watch / Breach.
| Risk Class | Key Metric | Watch Threshold | Breach Threshold | Escalation |
|---|---|---|---|---|
| Class 1: Hallucination | Verified accuracy rate on legal tasks | <95% | <90% | Immediate vendor notification |
| Class 2: Privilege/Confidentiality | Data handling incidents | 1 incident | 2+ incidents | STR-07 + GOV-03 |
| Class 3: Bias/Fairness | Bias testing fail rate | 1 fail | 2+ consecutive fails | GOV-04 assessment |
| Class 4: Privacy/Data Protection | Data protection violations | 1 concern | 1 confirmed violation | Legal/DPO review |
| Class 5: Supply Chain/Vendor | SLA compliance rate | <98% | <95% | SUS-01 escalation |
| Class 6: Shadow AI | Unauthorised AI use incidents/quarter | 2+ | 5+ | Immediate STR-07 + TAL-02 |
| Class 7: Regulatory Compliance | Regulatory alignment gap | Any new gap | Unresolved gap >30 days | Legal review |
| Class 8: IP/Licensing | IP or licensing concerns | 1 concern | 1 confirmed issue | Legal review |
| Class 9: Operational Resilience | Uptime vs. SLA | <99.5% actual | <99% actual | SUS-01 + STR-07 if Critical |
Class 6 Shadow AI escalation rule: Any quarter-on-quarter increase in unauthorised AI use incidents triggers STR-07 notification and a TAL-02 change management response. Do not wait for the Breach threshold if the trend is upward.
Aggregate Risk Score: Score each Watch item as 1 point and each Breach item as 3 points, then sum across the nine risk classes. A score of 10 or more requires an STR-07 strategic review.
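The dashboard rating and the Aggregate Risk Score can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the framework: the names `Status`, `rate_status`, `aggregate_risk_score`, and `needs_strategic_review` are hypothetical, and `rate_status` covers only the percentage-based metrics (Classes 1, 5, and 9), where a lower value is worse. The count-based classes would need their own rules.

```python
from enum import Enum


class Status(Enum):
    ON_TRACK = "On Track"
    WATCH = "Watch"
    BREACH = "Breach"


# Points per status, taken from the Aggregate Risk Score rule above.
POINTS = {Status.ON_TRACK: 0, Status.WATCH: 1, Status.BREACH: 3}
REVIEW_THRESHOLD = 10  # a score of 10 or more requires an STR-07 strategic review


def rate_status(value_pct, watch_below, breach_below):
    """Rate a percentage metric where lower is worse (e.g. accuracy, SLA, uptime)."""
    if value_pct < breach_below:
        return Status.BREACH
    if value_pct < watch_below:
        return Status.WATCH
    return Status.ON_TRACK


def aggregate_risk_score(statuses):
    """Sum the points for a mapping of risk class name -> Status."""
    return sum(POINTS[status] for status in statuses.values())


def needs_strategic_review(statuses):
    return aggregate_risk_score(statuses) >= REVIEW_THRESHOLD


if __name__ == "__main__":
    # Hypothetical month: every class starts On Track, then a few are overridden.
    statuses = {f"Class {n}": Status.ON_TRACK for n in range(1, 10)}
    statuses["Class 1"] = rate_status(93.0, watch_below=95.0, breach_below=90.0)
    statuses["Class 5"] = rate_status(94.0, watch_below=98.0, breach_below=95.0)
    statuses["Class 9"] = rate_status(99.2, watch_below=99.5, breach_below=99.0)
    print(aggregate_risk_score(statuses), needs_strategic_review(statuses))
```

Because a Breach counts three times as much as a Watch, the review threshold can be reached by as few as four Breach items or by a mix of both, so the score is best read alongside the per-class ratings rather than in place of them.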
Agentic Tier Ongoing Monitoring Provisions
For every Level 4 (AI as Executor) tool tracked in this system, verify these five provisions monthly in addition to standard performance metrics.
| Provision | Monthly Verification | Pass/Watch/Fail |
|---|---|---|
| 1. Kill-Switch Operational | Test kill-switch activation monthly; confirm response time meets documented spec | |
| 2. Intervention Logging Current | Verify all agent decisions logged correctly; review log completeness for the month | |
| 3. Scope Limitation Enforced | Confirm agent has not accessed data or systems outside defined deployment scope | |
| 4. Escalation Protocol Active | Confirm escalation triggers functioned correctly for any escalation events in the month | |
| 5. Bias Monitoring Current | Review bias monitoring results for the month; confirm no new bias patterns emerging | |
Monthly Gate: Any Fail on provisions 1, 2, or 3 triggers immediate STR-07 notification and suspension of the deployment until remediated. A Fail on provision 4 or 5 requires a remediation plan within 5 business days.
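The Monthly Gate can be expressed as a small check, shown here as a sketch under assumptions. The function name `monthly_gate`, the action strings, and the choice to reject incomplete results are all hypothetical; only the two rules themselves come from the gate described above.

```python
SUSPEND_ACTION = "Notify STR-07 immediately and suspend deployment until remediated"
PLAN_ACTION = "Produce a remediation plan within 5 business days"

SUSPEND_ON_FAIL = {1, 2, 3}  # kill-switch, intervention logging, scope limitation
PLAN_ON_FAIL = {4, 5}        # escalation protocol, bias monitoring
VALID_RESULTS = {"Pass", "Watch", "Fail"}


def monthly_gate(results):
    """Return the required actions for one Level 4 tool.

    `results` maps provision number (1-5) to "Pass", "Watch", or "Fail".
    Rejecting incomplete input is an assumption of this sketch.
    """
    if set(results) != SUSPEND_ON_FAIL | PLAN_ON_FAIL:
        raise ValueError("results must cover provisions 1 to 5 exactly")
    if not set(results.values()) <= VALID_RESULTS:
        raise ValueError("each result must be Pass, Watch, or Fail")

    failed = {number for number, result in results.items() if result == "Fail"}
    actions = []
    if failed & SUSPEND_ON_FAIL:
        actions.append(SUSPEND_ACTION)
    if failed & PLAN_ON_FAIL:
        actions.append(PLAN_ACTION)
    return actions


if __name__ == "__main__":
    month = {1: "Pass", 2: "Fail", 3: "Pass", 4: "Pass", 5: "Watch"}
    for action in monthly_gate(month):
        print(action)
```

A Watch result produces no gate action in this sketch because the gate only defines consequences for Fail; teams may still want to record Watch results for trend review.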
The Improvement Imperative
Legal AI implementations cannot remain static after deployment. Technology evolves, user needs change, business requirements shift, and risk profiles develop continuously. Without systematic feedback loops and structured improvement processes, initial ROAI gains erode, adoption stalls, and risk exposure increases. The critical risk is slow degradation after deployment as Class 6 (Shadow AI), Class 1 (Hallucination), and Class 9 (Operational Resilience) risks accumulate without a monitoring framework.
Kaizen Meets Legal AI
This framework integrates Kaizen PDCA methodology (Plan–Do–Check–Act) with modern AI performance management. It captures multi-channel feedback, tracks performance across standard KPIs and Risk Taxonomy 2026 metrics, prioritises improvements using ROAI quadrants, and documents every cycle as DPS Adoption and Sophistication lens evidence.
ROAI Quadrant Prioritisation
All improvement decisions should be framed against the ROAI 4-quadrant model:
- Protect ROAI: Risk reduction improvements (Class 1–9 remediation) — highest priority.
- Comply ROAI: Compliance and governance improvements (e.g. EU AI Act, GOV-02 alignment) — second priority.
- Grow ROAI: Adoption and usage expansion improvements — third priority.
- Transform ROAI: Strategic capability enhancements — funded only when Protect and Comply ROAI are maintained.
Feedback Collection Framework
User Experience Feedback
- In-application prompts at key workflow completion points.
- Quick ratings (1–5) for AI-generated outputs.
- One-click issue reporting with context capture and routing.
- Monthly user satisfaction surveys with NPS tracking.
- Quarterly focus groups with AI champions and power users.
- Annual strategic feedback sessions with leadership.
Performance Data Collection
- Real-time metrics: response time, availability, accuracy.
- Usage analytics: adoption patterns, feature utilisation, workflow completion.
- Error rate monitoring with automatic categorisation.
- Business impact: time savings, cost reduction, quality improvement.
- Risk class incident monitoring feeding the Risk Class Metrics Dashboard.
Stakeholder Engagement
- Leadership strategic reviews (quarterly).
- AI champion network check-ins (monthly).
- Client service impact assessments (quarterly).
- STR-07 AI Task Force performance briefings (quarterly).
Feedback Routing and Triage
- Immediate: Class 6 Shadow AI incidents, Class 2 confidentiality concerns, Class 9 SLA breaches → STR-07 notification.
- Within 48 hours: Class 1 accuracy failures, Class 3 bias concerns → vendor notification + GOV-03 log.
- Weekly: User experience issues, feature requests, workflow friction → improvement backlog.
- Monthly: User satisfaction trends, performance metrics, OKR progress → performance review.
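The routing rules above amount to a lookup table, sketched below. The category keys (such as `class_6_shadow_ai_incident`) and the `Route` type are hypothetical labels chosen for illustration; the response windows and destinations are taken from the list above. Raising an error for unknown categories is an assumption of this sketch, intended to force manual triage rather than silent misrouting.

```python
from typing import NamedTuple


class Route(NamedTuple):
    window: str
    destination: str


IMMEDIATE = Route("Immediate", "STR-07 notification")
WITHIN_48H = Route("Within 48 hours", "Vendor notification + GOV-03 log")
WEEKLY = Route("Weekly", "Improvement backlog")
MONTHLY = Route("Monthly", "Performance review")

# Hypothetical category keys mapped to the routing tiers listed above.
ROUTES = {
    "class_6_shadow_ai_incident": IMMEDIATE,
    "class_2_confidentiality_concern": IMMEDIATE,
    "class_9_sla_breach": IMMEDIATE,
    "class_1_accuracy_failure": WITHIN_48H,
    "class_3_bias_concern": WITHIN_48H,
    "user_experience_issue": WEEKLY,
    "feature_request": WEEKLY,
    "workflow_friction": WEEKLY,
    "user_satisfaction_trend": MONTHLY,
    "performance_metric": MONTHLY,
    "okr_progress": MONTHLY,
}


def route(category):
    """Look up the response window and destination for a feedback category."""
    try:
        return ROUTES[category]
    except KeyError:
        raise ValueError(f"Unknown feedback category {category!r}; triage manually") from None


if __name__ == "__main__":
    for item in ("class_2_confidentiality_concern", "feature_request"):
        print(item, "->", route(item))
```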
Kaizen PDCA Integration
Plan
- Identify improvement opportunities from feedback, Risk Class Dashboard, or OKR gaps.
- Classify each improvement against ROAI (Protect / Comply / Grow / Transform).
- Define success metrics and DPS evidence requirements.
- Secure STR-07 awareness for improvements affecting Class 6 or Agentic Tier provisions.
Do
- Execute improvements in controlled scope (e.g. pilot practice group).
- Document all changes with before/after metrics.
- Apply Metric 0 pre-check before deploying any change to configuration or scope.
Check
- Measure impact against defined success metrics.
- Verify Risk Class Dashboard position has not worsened.
- Confirm Agentic Tier provisions still pass for Level 4 tools.
- Assess user adoption and satisfaction impact.
Act
- Standardise successful improvements across the organisation.
- Update AI BoM, GOV-03, and MAT-04 with improvement records.
- Document lessons in the improvement knowledge base (DPS Sophistication evidence).
- Feed results to MAT-02 for maturity band progression.
Improvement Prioritisation Matrix
- Impact Score (1–5): Protect/Comply (5), Grow (3), Transform (2), UX-only (1).
- Effort Score (1–5): 1 = minimal effort; 5 = major effort.
- Priority = Impact ÷ Effort, with higher results addressed first; Class 6 and Class 9 risk improvements are always Priority 1, regardless of the calculated result.
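As a worked illustration of the matrix, the sketch below scores and ranks a hypothetical backlog. The `Improvement` type, the function names, and the sample items are assumptions made for this example; the impact scores, the Impact ÷ Effort formula, and the Class 6 and Class 9 override come from the matrix above.

```python
from dataclasses import dataclass
from typing import List, Optional

# Impact scores from the prioritisation matrix.
IMPACT_BY_CATEGORY = {"Protect": 5, "Comply": 5, "Grow": 3, "Transform": 2, "UX-only": 1}
ALWAYS_PRIORITY_ONE = {6, 9}  # Class 6 (Shadow AI) and Class 9 (Operational Resilience)


@dataclass
class Improvement:
    name: str
    category: str                     # a key of IMPACT_BY_CATEGORY
    effort: int                       # 1 = minimal effort, 5 = major effort
    risk_class: Optional[int] = None  # Risk Taxonomy class number, if any


def priority_score(item: Improvement) -> float:
    """Priority = Impact / Effort; higher means sooner."""
    if not 1 <= item.effort <= 5:
        raise ValueError("effort must be between 1 and 5")
    return IMPACT_BY_CATEGORY[item.category] / item.effort


def rank(items: List[Improvement]) -> List[Improvement]:
    """Order the backlog: Class 6/9 items first, then by descending score."""
    def sort_key(item: Improvement):
        forced = item.risk_class in ALWAYS_PRIORITY_ONE
        return (0 if forced else 1, -priority_score(item))
    return sorted(items, key=sort_key)


if __name__ == "__main__":
    backlog = [
        Improvement("Prompt library cleanup", "Grow", effort=1),
        Improvement("Shadow AI awareness campaign", "Protect", effort=4, risk_class=6),
        Improvement("Policy alignment refresh", "Comply", effort=2),
        Improvement("New agentic pilot", "Transform", effort=4),
    ]
    for item in rank(backlog):
        print(f"{priority_score(item):.2f}  {item.name}")
```

In this sample the Class 6 item ranks first despite a score of 1.25, while the low-effort Grow item (3.0) outranks the Comply item (2.5). The ratio alone does not enforce the ROAI quadrant order, so reviewers may want to check the ranked list against that order before committing resources.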
OKR Tracking System
Map objectives to ROAI quadrants:
- Protect: e.g. maintain hallucination rate below 5%; zero Class 2 incidents; reduce Class 6 incidents to zero.
- Comply: e.g. full GOV-02 compliance; EU AI Act checks for Tier 1 vendors; 100% Agentic Tier pass rate.
- Grow: e.g. increase adoption %, expand use cases, raise NPS.
- Transform: e.g. progress from current MAT-02 band to target; deploy new agentic capabilities with full governance.
Cadence
- Weekly: brief KR check-ins.
- Monthly: OKR progress review with risk class status.
- Quarterly: OKR retrospective and reset with STR-07 briefing.
- Annual: OKR achievement review and maturity assessment (MAT-02).
Performance Monitoring Framework
Efficiency KPIs
- Time savings per AI-assisted task vs. manual baseline.
- Workflow completion rate for AI-enabled processes.