feat(guardrails): add before_final_action hook for workflow and chat final commits by nitinawari · Pull Request #534 · GenAI-Security-Project/finbot-ctf

nitinawari · 2026-06-26T20:41:39Z

Summary

Adds the third guardrail hook point for FinBot Labs — before_final_action — so defenders can inspect and score agent outcomes before they are committed. This completes the hook trio required for Deliverable A: before tool, after tool, and before agent final actions.
The hook fires when:

- Workflow agents call complete_task (including forced-failure paths: stall, iteration error, exhausted iterations)
- Chat assistants save the final assistant reply

Guardrails remain passive: a block verdict is logged and scorable via GuardrailPreventionDetector, but does not stop execution.
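
The passive behaviour described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `Verdict`, `PassiveGuardrail`, `commit_final_action`, and the `check` callback are hypothetical names. The only thing taken from the description is that a block verdict is recorded but the commit still happens.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Verdict:
    # Hypothetical verdict shape: "allow" or "block", plus a free-text reason.
    decision: str
    reason: str = ""


@dataclass
class PassiveGuardrail:
    # `check` stands in for whatever evaluates the final action (e.g. a webhook call).
    check: Callable[[dict], Verdict]
    log: list = field(default_factory=list)

    def before_final_action(self, payload: dict) -> Verdict:
        verdict = self.check(payload)
        # Passive mode: record the verdict for later scoring, never raise or abort.
        self.log.append(
            {"hook_kind": "before_final_action", "decision": verdict.decision}
        )
        return verdict


def commit_final_action(
    guardrail: PassiveGuardrail, payload: dict, commit: Callable[[dict], None]
) -> None:
    guardrail.before_final_action(payload)
    # The commit runs regardless of the verdict; enforcement is out of scope.
    commit(payload)
```

With a `check` that always returns a block verdict, `commit` is still called once and the log holds one `before_final_action` entry.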

What changed

Guardrail core

finbot/guardrails/schemas.py — Added HookKind.before_final_action; extended HookEnvelope with agent_name, task_status, task_summary
finbot/guardrails/service.py — invoke() accepts and emits final-action fields on webhook payloads and agent.guardrail.* events
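
For readers who have not opened schemas.py, the extended envelope might look something like the sketch below. It uses stdlib dataclasses purely for illustration; the real models may be built differently (for example with a validation library), and `to_payload` is a hypothetical helper. The hook kind names and the three final-action fields come from this description.

```python
from dataclasses import asdict, dataclass
from enum import Enum
from typing import Optional


class HookKind(str, Enum):
    before_tool = "before_tool"
    after_tool = "after_tool"
    before_final_action = "before_final_action"


@dataclass
class HookEnvelope:
    hook_kind: HookKind
    tool_name: Optional[str] = None
    # Final-action fields added in this PR; left unset for tool hooks.
    agent_name: Optional[str] = None
    task_status: Optional[str] = None
    task_summary: Optional[str] = None

    def to_payload(self) -> dict:
        # Hypothetical serializer: plain strings, unset fields omitted.
        data = asdict(self)
        data["hook_kind"] = self.hook_kind.value
        return {key: value for key, value in data.items() if value is not None}
```

Keeping the new fields optional means existing before_tool and after_tool payloads keep their current shape.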

Agent integration

finbot/agents/base.py
- _invoke_before_final_action_guardrail() helper for complete_task
- Tool loop routes complete_task → before_final_action (not before_tool)
- after_tool fires on complete_task before log_task_completion
- Forced completion paths (stall / error / max iterations) invoke the hook before complete_task
finbot/agents/chat.py
- _invoke_before_final_action_guardrail() for final chat replies (tool_name: chat_response)
- Hook runs before _save_message("assistant", ...)
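
The ordering rules listed above can be summarised as a small model. This is not the agent code: both function names and the step labels are hypothetical, and the position of the tool execution between the two hooks is an assumption. It only encodes the routing (complete_task goes to before_final_action rather than before_tool), the after_tool position, and the chat ordering.

```python
def hook_sequence_for_tool_call(tool_name: str) -> list[str]:
    """Return the ordered steps a tool loop would run for one tool call."""
    is_final = tool_name == "complete_task"
    # Final action: skip before_tool and run the final-action hook instead.
    pre_hook = "before_final_action" if is_final else "before_tool"
    steps = [pre_hook, f"execute:{tool_name}", "after_tool"]
    if is_final:
        # after_tool fires before the completion is logged.
        steps.append("log_task_completion")
    return steps


def chat_reply_sequence() -> list[str]:
    """Return the ordered steps for one chat reply."""
    # Tokens may already be streamed to the client; the hook gates only the save.
    return ["stream_tokens", "before_final_action", "save_assistant_message"]
```

For example, `hook_sequence_for_tool_call("lookup_invoice")` (a made-up tool name) starts with `before_tool`, while `hook_sequence_for_tool_call("complete_task")` starts with `before_final_action`.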

Labs configuration & UI

finbot/core/data/models.py — Default hooks include before_final_action: true
finbot/core/data/repositories.py — VALID_HOOK_KINDS includes before_final_action
finbot/apps/labs/templates/pages/guardrails.html — Checkbox for "Before Final Action"; "Test before_final_action" button
finbot/apps/labs/routes/guardrails.py — POST /api/v1/guardrails/test/before-final-action
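
As a rough illustration of why VALID_HOOK_KINDS has to be extended, a config validator along these lines would reject the new key until the set includes it. `validate_hooks` is a hypothetical helper, and the set type used for `VALID_HOOK_KINDS` here is an assumption about repositories.py.

```python
VALID_HOOK_KINDS = {"before_tool", "after_tool", "before_final_action"}


def validate_hooks(hooks: dict) -> dict:
    """Reject unknown hook kinds and normalise the enabled flags to bool."""
    unknown = set(hooks) - VALID_HOOK_KINDS
    if unknown:
        raise ValueError(f"unknown hook kinds: {sorted(unknown)}")
    return {kind: bool(enabled) for kind, enabled in hooks.items()}
```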

CTF detection

finbot/ctf/detectors/implementations/guardrail_prevention.py
- Supports required_hook_kind: before_final_action
- Optional required_task_status filter
- Final-action evidence: agent_name, task_status, task_summary
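
The detector options above could be read as the following matching logic. This is a sketch under assumptions: the event is modelled as a plain dict, the `verdict` key is a guess at how a block is recorded, and both function names are hypothetical. `required_hook_kind`, `required_task_status`, and the three evidence fields are taken from this description.

```python
from typing import Optional


def matches_prevention(
    event: dict,
    required_hook_kind: str,
    required_task_status: Optional[str] = None,
) -> bool:
    """Return True when a guardrail event counts as a scorable block."""
    if event.get("verdict") != "block":
        return False
    if event.get("hook_kind") != required_hook_kind:
        return False
    # The status filter is optional; when unset, any task_status matches.
    if required_task_status is not None:
        return event.get("task_status") == required_task_status
    return True


def final_action_evidence(event: dict) -> dict:
    """Collect the final-action fields attached as detection evidence."""
    return {
        key: event.get(key) for key in ("agent_name", "task_status", "task_summary")
    }
```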

Tests

tests/unit/labs/test_guardrail_final_action.py — Integration tests for base agent tool loop + chat stream_response ordering
tests/unit/labs/test_guardrail_service.py — Webhook payload test for before_final_action
tests/unit/labs/test_guardrail_detector.py — Detector tests for final-action block + required_task_status
tests/unit/labs/test_guardrail_config.py — Default hooks assertion updated

Test plan

uv run pytest tests/unit/labs/test_guardrail_final_action.py -v (7 tests)
uv run pytest tests/unit/labs/ -v
Labs → configure webhook → enable Before Final Action → Test before_final_action → webhook receives payload
Run a workflow until an agent calls complete_task → Guardrail Activity shows hook_kind: before_final_action
Send a chat message → Activity shows before_final_action with tool_name: chat_response
Existing Guardrail 101 / Carte Noire (before_tool) still work unchanged
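
For the manual webhook steps, a throwaway local receiver can be handy. The sketch below uses only the Python standard library. The request and response shapes are assumptions, not the documented FinBot contract: it guesses that the payload carries `hook_kind` and `task_summary` and that the reply is a JSON object with a `verdict` key. `decide`, `GuardrailHandler`, `serve`, and port 8099 are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decide(payload: dict) -> dict:
    """Toy policy: block final actions whose summary mentions 'override'."""
    if payload.get("hook_kind") != "before_final_action":
        return {"verdict": "allow"}
    summary = (payload.get("task_summary") or "").lower()
    if "override" in summary:
        return {"verdict": "block", "reason": "summary mentions an override"}
    return {"verdict": "allow"}


class GuardrailHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, apply the toy policy, and reply with the verdict.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(decide(payload)).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8099) -> None:
    # Blocks until interrupted; point the Labs webhook URL at this port.
    HTTPServer(("127.0.0.1", port), GuardrailHandler).serve_forever()
```

Call `serve()` from a separate terminal or script before pressing the test button in Labs.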

Notes

Passive only — enforcement (actually blocking on block verdict) is out of scope for this PR; discuss with mentor separately
Existing Labs configs — saved hooks_json without before_final_action will have the hook disabled until users re-save config in Labs UI
Chat streaming — tokens may reach the client before the final-action hook; hook gates DB commit, not first streamed token
No new Labs challenge YAML in this PR — paired blue-path challenge can follow in a separate PR
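
The note about existing Labs configs follows from a lookup that treats a missing key as disabled. A minimal sketch, assuming hooks_json is a JSON object mapping hook kinds to booleans (`is_hook_enabled` is a hypothetical helper, not a function from this PR):

```python
import json


def is_hook_enabled(hooks_json: str, hook_kind: str) -> bool:
    """Return whether a hook is enabled in a saved Labs config."""
    hooks = json.loads(hooks_json) if hooks_json else {}
    # A key absent from a previously saved config is treated as disabled.
    return bool(hooks.get(hook_kind, False))
```

A config saved before this PR has no before_final_action key, so the lookup returns False until the config is re-saved with the new checkbox ticked.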

GSoC mapping

Week 1-2 (phase 1)

Deliverable A: third hook point (before agent final actions) for workflow agents and chat

feat(guardrails): add before_final_action hook for workflow and chat …

56efe04

nitinawari wants to merge 1 commit into GenAI-Security-Project:main from nitinawari:feat/Guardrail-framework