After running in production since early March, we’re tagging the official v1.5.0 release. This consolidates two months of stability hardening, audit remediation, and verified bug fixes into one named version.
Patch Highlights
Platform stability hardening. The systemd unit now caps memory (MemoryHigh=1200M / MemoryMax=1500M) so a runaway process triggers a clean unit-level restart instead of the kernel OOM killer picking a random victim. Prompted by the 2026-05-09 Discord gateway outage, which stalled all three bots’ event loops simultaneously.
Bug-hunt sweep (BH-2026-05-11). Seven verified fixes — four high, three medium — across RollCall, DiceBoy, and the backup tooling. Each fix shipped with a regression test promoted from its original repro. Test suite went from 0 → 9 passing.
Red-team audit closed. Audit 2026-04-17 produced 0 critical, 3 high, 4 medium findings across Scribey’s quest claim flow, the restore script, and the resilience monitor. All remediated in commit 4b56f52.
Added
- pytest + pytest-asyncio infrastructure and a tests/regression/ suite (9 passing tests, all promoted from bug-hunt repros).
- pbp_shared_data/.env.example template documenting the required token environment variables without committing secrets.
Changed
- Backup mechanism. backup_data.py now uses SQLite’s Online Backup API for .db files instead of shutil.copy2. Captures a consistent WAL snapshot in a single transaction. Previously, hot backups could miss uncheckpointed writes.
- Systemd unit (pbp-bots.service). Added MemoryHigh=1200M / MemoryMax=1500M. Tune to ~70–80% of VM RAM; if your VM is under 2 GB, lower these before deploying.
- Watchdog ping logic in run_all_bots.py. The supervisor withholds the WATCHDOG=1 ping if any child’s heartbeat is older than 240s (2× the kill window). A hung child now fails the unit cleanly. Look for "heartbeat stale past kill window — withholding watchdog ping" in journalctl -u pbp-bots.
- Deploy archive (prepare_deploy.py). Exclusion list tightened to runtime-only: .bugs/, .pytest_cache/, tests/, pytest.ini, .gitignore, CLAUDE.md, and DEPLOYMENT.md no longer ship to the VM. Archive shrank from 97 files / 262 KB to 51 files / 204 KB.
- All requirements.txt files now pin third-party packages to specific versions.
- Type hints across Scribey use a ScribeyBot alias for stricter type checking.
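The watchdog gating described above can be sketched roughly as follows. This is a minimal illustration, not the code in run_all_bots.py: the function names, the heartbeat dictionary, and the 120s kill window (inferred from "240s is 2× the kill window") are assumptions. The notification itself uses the standard systemd convention of sending a datagram to the socket named in NOTIFY_SOCKET.

```python
import os
import socket
import time

KILL_WINDOW_S = 120                  # assumed: 240s is stated as 2x the kill window
STALE_LIMIT_S = 2 * KILL_WINDOW_S    # heartbeats older than this block the ping


def should_ping_watchdog(heartbeats, now=None):
    """Return True only if no child's last heartbeat is older than the limit.

    `heartbeats` maps a child name to the monotonic timestamp of its last
    heartbeat (a hypothetical structure chosen for this sketch).
    """
    now = time.monotonic() if now is None else now
    return all(now - last <= STALE_LIMIT_S for last in heartbeats.values())


def notify_watchdog():
    """Send WATCHDOG=1 to systemd; return False when not running under systemd."""
    addr = os.environ.get("NOTIFY_SOCKET")
    if not addr:
        return False
    if addr.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        addr = "\0" + addr[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(b"WATCHDOG=1", addr)
    return True


def supervise_once(heartbeats):
    """One supervisor tick: ping only when every child looks alive."""
    if should_ping_watchdog(heartbeats):
        return notify_watchdog()
    print("heartbeat stale past kill window — withholding watchdog ping")
    return False
```

Withholding the ping, rather than killing the child directly, lets systemd's own watchdog timeout fail and restart the whole unit.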
Fixed
BH-2026-05-11 batch — verified repros, regression tests included
High
- RollCall daily-streak counter reported 0 for users whose only check-in was yesterday. (BH-001, 9c4cc31)
- backup_data.py could lose uncheckpointed WAL writes during a hot backup. Now uses the Online Backup API. (BH-002, a6e79ba)
- DiceBoy wager resolver picked a stranded zero-balance row over a positive-balance row when both existed for the same user, returning 0 and refusing wagers. (BH-004, 4869adb)
- update_user_stats raised IntegrityError on a user’s first-ever roll when two stats updates raced. Now serialized with BEGIN IMMEDIATE. (BH-006, 5d64677)
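For the hot-backup fix (BH-002), Python's standard sqlite3 module exposes the Online Backup API as Connection.backup. The sketch below shows the general shape of such a backup; the function name and paths are hypothetical, and the actual backup_data.py may differ.

```python
import sqlite3


def backup_sqlite(src_path, dest_path):
    """Copy a live SQLite database using the Online Backup API.

    Unlike a raw file copy, this reads through SQLite itself, so committed
    pages that still sit in the WAL file are included in the copy.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)  # copies all pages in one pass by default
    finally:
        dest.close()
        src.close()
```

A file-level copy such as shutil.copy2 sees only the main .db file, which is why writes not yet checkpointed out of the WAL could be missed.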
Medium
- Group rolls dropped compound modifiers (1d20+5+3 summed only the first operand). Now sums all +/- terms. (BH-003, 3473247)
- DiceParser ignored compound keep/drop expressions like kh3dl1. Modifiers now apply sequentially. (BH-005, 65bb03d)
- get_dice_config crashed when valid_dice was empty or malformed. Now tolerates degenerate config rows. (BH-007, f946b6e)
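To illustrate the compound-modifier fix (BH-003), here is a minimal sketch of summing every +/- term after the dice part. The regular expression and the parse_roll name are assumptions made for this example, not DiceBoy's actual parser, and keep/drop syntax is deliberately left out.

```python
import re

# <count>d<sides> followed by any number of +N / -N terms (illustrative grammar).
_EXPR = re.compile(r"^(\d+)d(\d+)((?:[+-]\d+)*)$")


def parse_roll(expr):
    """Return (count, sides, total_modifier) for an expression like '1d20+5+3'."""
    match = _EXPR.match(expr.replace(" ", ""))
    if match is None:
        raise ValueError(f"unsupported roll expression: {expr!r}")
    count, sides, tail = int(match.group(1)), int(match.group(2)), match.group(3)
    # Sum every signed term, not just the first one.
    modifier = sum(int(term) for term in re.findall(r"[+-]\d+", tail))
    return count, sides, modifier
```

With this sketch, parse_roll("1d20+5+3") returns (1, 20, 8), whereas the bug described above would have yielded a modifier of 5.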
Red-team audit batch — 2026-04-17, remediated in 4b56f52
High
- Scribey’s quest claim audit trail recorded the intended award even when the wallet/inventory write rolled back, producing phantom “Reward Claimed” embeds. Award functions now return Optional[int] and finalize_claim records the amount actually delivered.
- Rare-drop announcements had no per-drop dedup. A high-claim Rare item could queue dozens of announcement embeds against Discord’s per-channel rate limit. Now tracks announced item IDs per drop.
- restore_data.py silently swallowed all migration exceptions, falsely reporting success on a half-migrated database. Now logs the exception and exits non-zero.
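The per-drop dedup for rare-drop announcements could look something like the sketch below. The class and method names are hypothetical; the only detail taken from the notes above is that announced item IDs are tracked per drop.

```python
class DropAnnouncer:
    """Tracks which item IDs have already been announced for each drop."""

    def __init__(self):
        self._announced = {}  # drop_id -> set of item_ids already announced

    def should_announce(self, drop_id, item_id):
        """Return True the first time an item is seen for a drop, else False."""
        seen = self._announced.setdefault(drop_id, set())
        if item_id in seen:
            return False
        seen.add(item_id)
        return True

    def close_drop(self, drop_id):
        """Forget a finished drop so the tracking dict does not grow forever."""
        self._announced.pop(drop_id, None)
```

A caller would check should_announce before queuing an embed, so repeated claims of the same Rare item produce one announcement per drop instead of dozens.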
Medium
- restore_data.py wrote via shutil.copy2, leaving a partial DB on crash. Now uses an atomic-write pattern.
- Scribey was unnecessarily requesting the privileged message_content intent. Removed.
- Backup integrity verification switched from MD5 to SHA256.
- _resilience_monitor only restarted loops that explicitly reported failure; silent hangs went undetected. Now catches silent hangs too.
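The atomic-write pattern and the SHA256 check mentioned above can be combined roughly as follows. This is a sketch under assumptions: the function names and the idea of passing an expected digest are illustrative, not the actual restore_data.py interface.

```python
import hashlib
import os
import shutil
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large databases need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def atomic_restore(backup_path, target_path, expected_sha256):
    """Verify the backup, then replace the target in a single rename."""
    if sha256_of(backup_path) != expected_sha256:
        raise ValueError("backup checksum mismatch")
    # The temp file must live on the same filesystem for the rename to be atomic.
    target_dir = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=target_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp, open(backup_path, "rb") as src:
            shutil.copyfileobj(src, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, target_path)  # readers see old or new, never partial
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process dies mid-copy, only the temporary file is left incomplete; the existing database stays untouched until the final os.replace.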
Security
- Removed a .env file containing live bot tokens from the working tree (f1b3830). Tokens were rotated; production values live only on the VM.
- Pinned third-party dependencies across all three bots and the runner (4c44f98).
- .env.example template plus .gitignore review prevent future token commits.
- Manual security audit (SECURITY_AUDIT_2026-03-05) and red-team audit (2026-04-17) both closed. Reports archived from the working tree after remediation, retained in git history.
Internal
- pytest scaffolding (pytest.ini, tests/conftest.py) landed as a chore: commit on main so the bug-hunt remediation branch stayed fix-only.
- _shared.py wrapper archived; consumers now import from pbp_shared_data directly.
- Old audit-report artifacts removed from working tree after remediation (197523b).
Operator Notes
- vm_deploy.sh runs systemctl daemon-reload as part of the deploy, so the new memory limits take effect automatically. If you deploy manually, run it yourself before systemctl restart pbp-bots.
- If MemoryMax triggers a restart under normal load, raise it. The default assumes a 2 GB+ VM; bots run around 200–450 MB combined under ordinary traffic.
On Deck
- Restore-side stale -wal/-shm cleanup (followup to BH-2026-05-11-002). Tracked in .bugs/followups.md.
