System Design
Three Subsystems
Slop is built from three interlocking parts, each with a distinct responsibility.
Next.js UI
src/app/
App Router front end; Server Components load data from SQLite directly. Client Components subscribe to SSE. Mutations via Server Actions that validate and delegate to the daemon.
Daemon Loop
src/server/daemon/
A singleton process (SQLite PID lock) that runs a configurable poll cycle (default 30s) of 13 named steps in fixed order. All external collaborators are injected through DaemonDeps.
Worker Pipeline
src/server/workers/
Per-issue coroutines. A Worker DB row tracks a 17-status state machine. Phases: implement → verify → review → conflict → CI fix → ship. All steps receive collaborators via Deps objects.
Daemon Loop
The daemon is a singleton Node.js process enforced by a SQLite PID lock. It re-evaluates the world on a fixed interval. The poll interval is read fresh from the config DB each cycle so changes take effect without a restart.
flowchart LR
A([Next.js\nServer Start]) --> B[bootDaemon]
B --> C[acquireDaemonLock]
C --> D[createDaemon]
D --> E[daemon.start]
E --> F{cycle}
F --> G[13 steps\nin order]
G --> H[sleep pollIntervalMs]
H --> F
F -- error in step --> I[log + continue]
I --> H
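
A minimal sketch of the loop in the flowchart, showing how the poll interval can be resolved at the start of every cycle instead of once at boot. ConfigStore, createLoop, and cycleOnce are hypothetical names used only for illustration, and the cycle body is kept synchronous for brevity, whereas the real steps are presumably asynchronous.

```typescript
// Hypothetical config store; the real daemon reads the SQLite config table.
interface ConfigStore {
  get(key: string): string | undefined;
}

const DEFAULT_POLL_INTERVAL_MS = 30000; // default stated in the text

// Resolve the interval at call time so edits apply on the next cycle.
function readPollIntervalMs(config: ConfigStore): number {
  const parsed = Number(config.get("pollIntervalMs"));
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_POLL_INTERVAL_MS;
}

function createLoop(config: ConfigStore, runCycle: () => void) {
  let stopped = false;
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Runs one cycle and returns the delay to wait before the next one.
  function cycleOnce(): number {
    const delayMs = readPollIntervalMs(config); // read fresh, never cached at boot
    try {
      runCycle();
    } catch (err) {
      console.error("cycle failed, continuing", err); // log + continue
    }
    return delayMs;
  }

  function loop(): void {
    if (stopped) return;
    const delayMs = cycleOnce();
    if (!stopped) timer = setTimeout(loop, delayMs);
  }

  function stop(): void {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  }

  return { cycleOnce, start: loop, stop };
}
```

Because the delay is computed inside cycleOnce, a config change made while the loop sleeps is picked up when the next cycle starts.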
Boot Sequence
bootDaemon(env, overrides?) is called once from src/instrumentation-node.ts when the Next.js server starts. The 11-step sequence below runs in order.
1. checkDatabaseIntegrity(): Runs PRAGMA integrity_check via a read-only better-sqlite3 connection. Throws DaemonBootError("DATABASE_INTEGRITY", ...) on any failure.
2. acquireDaemonLock(configRepo): Reads the daemon_lock config key. If a PID is present and the process is alive, throws DaemonBootError("DAEMON_LOCK", ...). Otherwise writes process.pid. Returns a releaseLock() function.
3. seedConfigDefaults(configRepo, env): For each of ~25 config keys, writes the default only when the key is absent. Covers parallelismCap=1, pollIntervalMs=30000, all timeout defaults, autoMode=false, autoMergeMode=true, etc.
4. migrateToMultiRepo(configRepo, repoRepo, env): One-shot migration. If no Repo rows exist, seeds the first from legacy config keys or env vars; back-fills repoId on Worker, ReadyIssue, ReadyOrder, IssueConfig, RepoSnapshot rows in a single transaction.
5. Read githubToken: Throws DaemonBootError("GITHUB_TOKEN", ...) if the token is missing from config and from the environment.
6. Build getClientForRepo lazy cache: Creates a Map<string, GitHubClient> resolved on first access via repoRepo.getRepo(repoId) then createGitHubClient({ owner, repo, token }).
7. Create EventBus: Instantiates the in-memory pub/sub bus and registers it globally via setEventBus so SSE routes can acquire it.
8. createDaemon(deps): Wires all subsystems together with fully resolved DaemonDeps. Returns the Daemon implementation exposed to Server Actions.
9. daemon.start(): Begins the poll loop. Runs boot-time recovery (resumeOrphanedImplementers, failInterruptedRuns, recoverStrandedReportingWorkers) then enters the first cycle.
10. Assert lock still held by process.pid: Invariant guard that verifies the config row still carries the current PID before returning to the caller.
11. Return { ...daemon, stop }: Returns the running daemon handle. stop() calls daemon.stop() then releaseLock().
Idempotent boot: bootDaemon is idempotent across server restarts because acquireDaemonLock guards duplicate starts. src/instrumentation-node.ts uses a globalThis.__slopDaemon cached promise to prevent concurrent boot calls during Next.js hot reload.
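
The lock step of the boot sequence can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the ConfigRepo shape is hypothetical, and the current PID and the liveness probe are passed in as parameters (the documented signature takes only configRepo) so the sketch runs without touching real processes.

```typescript
class DaemonBootError extends Error {
  code: string;
  constructor(code: string, message: string) {
    super(message);
    this.name = "DaemonBootError";
    this.code = code;
  }
}

// Hypothetical minimal shape of the config repository.
interface ConfigRepo {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
  delete(key: string): void;
}

function acquireDaemonLock(
  configRepo: ConfigRepo,
  selfPid: number, // assumption: injected instead of reading process.pid
  isAlive: (pid: number) => boolean, // assumption: injected liveness probe
): () => void {
  const raw = configRepo.get("daemon_lock");
  const holder = raw === undefined ? NaN : Number(raw);

  // A recorded PID whose process is still alive means another daemon owns the lock.
  if (Number.isInteger(holder) && isAlive(holder)) {
    throw new DaemonBootError("DAEMON_LOCK", `daemon already running as pid ${holder}`);
  }

  // Absent or stale lock: take ownership.
  configRepo.set("daemon_lock", String(selfPid));

  // Release only if this process still owns the lock.
  return function releaseLock(): void {
    if (configRepo.get("daemon_lock") === String(selfPid)) {
      configRepo.delete("daemon_lock");
    }
  };
}
```

In Node.js the probe could be built on process.kill(pid, 0), which tests for the existence of a process without sending a signal.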
The 13 Cycle Steps
Each cycle executes these steps in fixed order. Every step is wrapped in runStep(name, fn): a throw is logged and the loop continues.
1. refreshGuardStatus: Checks branch protection for all repos. Stops the cycle if guards are not satisfied; all subsequent steps are skipped.
2. refreshBaseCiState: Anti-flake debounce: base CI only transitions to "red" after RED_STREAK_THRESHOLD=2 consecutive red cycles. Publishes repo.health_changed on status transitions.
3. tick: Auto-pilot claiming step: reads parallelism caps, finds open slots per repo, claims issues from the ready queue, spawns runners. Skipped when base CI is red.
4. runLifecyclePoll: Reconciles all non-terminal workers against GitHub: checks merged PRs, closed issues, dirty branches, failing CI, review verdicts, and merge eligibility.
5. monitorWorkerResources: Samples process RSS for all active workers (60-sample ring). Tracks peakRssBytes. Fires triggerReport and optionally stops the worker on first breach of memoryThresholdBytes (default 2 GB).
6. spawnPendingResolvers: Recovers resolving_conflict workers not in the registry after a daemon restart. Reaps any orphaned agent process, then re-spawns the resolver.
7. spawnPendingCiFixers: Same recovery pattern for fixing_ci workers stranded by a daemon restart.
8. spawnPendingVerifiers: Same recovery pattern for verifying workers stranded by a daemon restart.
9. spawnPendingReviewers: Recovers in_review workers unconditionally. Also dispatches waiting_review workers when autoReviewMode is enabled globally or per-worker.
10. spawnPendingAddressers: Recovers in_address workers unconditionally. Also dispatches waiting_address workers when both autoReviewMode and autoAddressMode are enabled.
11. refreshRepoSnapshot: Fetches GitHub snapshot: open issues, PRs, releases, base branch head, CI aggregates. Skipped when the pollPaused config flag is set.
12. backupDatabase: Daily SQLite backup using the online db.backup(dest) API. No-op if today's backup already exists. Retains the 7 most recent backups.
13. reapOrphanedWorktrees: Walks WORKTREES_ROOT/<repoId>/<issueNumber> and removes directories with no active worker row (with a grace period to avoid removing newly-created worktrees).
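
The anti-flake debounce in step 2 can be pictured as a small pure state transition. The sketch below assumes a two-valued CI status and assumes that a single green observation resets the streak and recovers immediately; the source states neither detail, and BaseCiState and nextBaseCiState are hypothetical names.

```typescript
type CiStatus = "green" | "red";

const RED_STREAK_THRESHOLD = 2; // value stated in the text

interface BaseCiState {
  status: CiStatus; // debounced status that the rest of the daemon sees
  redStreak: number; // consecutive red observations so far
}

// Returns the next debounced state and whether the visible status changed,
// which is the point where repo.health_changed would be published.
function nextBaseCiState(
  prev: BaseCiState,
  observed: CiStatus,
): { state: BaseCiState; changed: boolean } {
  if (observed === "green") {
    // Assumption: one green cycle clears the streak and recovers immediately.
    return { state: { status: "green", redStreak: 0 }, changed: prev.status !== "green" };
  }
  const redStreak = prev.redStreak + 1;
  // Stay on the previous status until the streak reaches the threshold.
  const status: CiStatus = redStreak >= RED_STREAK_THRESHOLD ? "red" : prev.status;
  return { state: { status, redStreak }, changed: status !== prev.status };
}
```

With this shape, a single flaky red run leaves the visible status green, so the claiming step is not paused by one bad cycle.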
Resilient loop: a throw inside any step is logged and the loop continues, so no single step failure can crash the daemon.
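
A hedged sketch of how a runStep wrapper could provide this isolation. Only the runStep(name, fn) shape comes from the text; the optional log parameter, the boolean return value, and the runCycle helper are assumptions added for illustration.

```typescript
type StepFn = () => void | Promise<void>;

// Runs one step and converts a throw or rejection into a log line.
// Returns true on success and false on failure (return value is an assumption).
async function runStep(
  name: string,
  fn: StepFn,
  log: (msg: string) => void = console.error,
): Promise<boolean> {
  try {
    await fn();
    return true;
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    log(`[daemon] step "${name}" failed: ${reason}`);
    return false;
  }
}

// Hypothetical helper: runs the named steps in order and reports which failed.
async function runCycle(
  steps: Array<[string, StepFn]>,
  log?: (msg: string) => void,
): Promise<string[]> {
  const failed: string[] = [];
  for (const [name, fn] of steps) {
    if (!(await runStep(name, fn, log))) failed.push(name);
  }
  return failed;
}
```

A failure in one step leaves the later steps free to run in the same cycle, and the next cycle retries the failed step naturally.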
Worker Pipeline
Each claimed issue gets a Worker DB row. The daemon dispatches a runner which drives the agent through sequential phases.
flowchart TD
A([claimed]) --> B[implementing\nagent runs]
B --> C{verify\ngate?}
C -- yes --> D[verifying]
C -- no --> E{remote\nshipping?}
D --> E
E -- yes --> F[waiting_ci\nPR open]
E -- no --> M
F --> G{CI\nresult?}
G -- pass --> H{review\ngate?}
G -- fail --> I[fixing_ci]
G -- conflict --> J[resolving_conflict]
I --> F
J --> F
H -- yes --> K[waiting_review]
H -- no --> L{auto\nmerge?}
K --> K2[in_review]
K2 --> K3{verdict?}
K3 -- approved --> L
K3 -- changes --> K4[waiting_address]
K4 --> K5[in_address]
K5 --> F
L -- off or base red --> M[waiting_merge]
L -- on --> N[merging]
M --> N
N --> O([merged])
Worker State Machine
Workers move through 17 statuses. Three are terminal. All transitions are guarded compare-and-set writes.
| Status | Category |
|---|---|
| claimed | pre-start |
| implementing | running |
| verifying | running |
| waiting_ci | waiting |
| waiting_review | waiting |
| in_review | active |
| waiting_address | waiting |
| in_address | active |
| waiting_merge | waiting |
| resolving_conflict | active |
| fixing_ci | active |
| merging | running |
| reporting | running |
| paused | suspended |
| merged | terminal ok |
| failed | terminal err |
| cancelled | terminal err |
Guarded CAS transitions: All transitions use updateWorkerStatusFrom, an updateMany WHERE status IN (expected) write. It returns false if the current status matched none of the expected statuses, and the caller silently no-ops. This prevents two concurrent actors (the daemon lifecycle poll and a running worker agent) from double-advancing the state machine.
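
The compare-and-set guard can be illustrated with an in-memory stand-in for the Worker table. InMemoryWorkerRepo is hypothetical; the real function is described as a Prisma updateMany, which reports a matched-row count, so a real implementation would presumably return whether that count is greater than zero.

```typescript
type WorkerStatus =
  | "claimed" | "implementing" | "verifying" | "waiting_ci" | "waiting_review"
  | "in_review" | "waiting_address" | "in_address" | "waiting_merge"
  | "resolving_conflict" | "fixing_ci" | "merging" | "reporting" | "paused"
  | "merged" | "failed" | "cancelled";

interface WorkerRow {
  id: string;
  status: WorkerStatus;
}

// Hypothetical in-memory stand-in for the Worker table.
class InMemoryWorkerRepo {
  private readonly rows = new Map<string, WorkerRow>();

  insert(row: WorkerRow): void {
    this.rows.set(row.id, { ...row });
  }

  getStatus(id: string): WorkerStatus | undefined {
    return this.rows.get(id)?.status;
  }

  // Analogue of: UPDATE Worker SET status = to WHERE id = id AND status IN (expectedFrom)
  updateWorkerStatusFrom(id: string, expectedFrom: WorkerStatus[], to: WorkerStatus): boolean {
    const row = this.rows.get(id);
    if (row === undefined || !expectedFrom.includes(row.status)) {
      return false; // guard failed: another actor already moved the row
    }
    row.status = to;
    return true;
  }
}
```

If the lifecycle poll and a worker agent both try to move the same row from waiting_ci, only the first call returns true; the second sees a status outside its expected set and does nothing.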
Architecture Patterns
Four recurring patterns that shape every part of the codebase.
Pattern 01
Layered server/client
Types (src/types/) have no dependencies. Lib (src/lib/) cannot import from app. Repos are pure Prisma passthroughs. Server logic owns all orchestration. The app layer handles routing and rendering. Module boundaries are enforced by dependency-cruiser.
Pattern 02
Dependency injection
Every daemon step and worker phase receives collaborators via a typed Deps object, not direct module imports. Production wiring happens once in boot.ts. Tests inject fakes at any granularity without mocking modules.
Pattern 03
Poll-driven with CAS
Rather than a push-driven event queue, the daemon re-evaluates the world on a fixed interval. State transitions use compare-and-set writes at the DB level so concurrent actors cannot corrupt the state machine.
Pattern 04
Event-driven live updates
Mutations publish typed events onto an in-memory EventBus. The bus persists each event to SQLite (fire-and-forget). An SSE route streams all events to browsers. Client Components apply them to local state without a full page reload.
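
To make Pattern 02 concrete, here is a sketch of the daily backup step written against a Deps object. The BackupDeps shape is hypothetical, and the collaborators are synchronous here for brevity even though a real SQLite backup call would be asynchronous. Only the behaviour (one backup per day, keep the 7 most recent) comes from the cycle-step description.

```typescript
// Hypothetical collaborators for the backup step.
interface BackupDeps {
  today(): string; // ISO date, e.g. "2024-01-08"
  listBackups(): string[]; // ISO dates of existing backups
  createBackup(date: string): void;
  deleteBackup(date: string): void;
}

const RETAINED_BACKUPS = 7; // retention stated in the text

// Returns true if a backup was taken, false if today's already existed.
function backupDatabase(deps: BackupDeps): boolean {
  const today = deps.today();
  const existing = deps.listBackups();
  if (existing.includes(today)) return false; // no-op: already backed up today

  deps.createBackup(today);

  // ISO dates sort chronologically as plain strings; drop the oldest extras.
  const all = [...existing, today].sort();
  const stale = all.slice(0, Math.max(0, all.length - RETAINED_BACKUPS));
  for (const date of stale) deps.deleteBackup(date);
  return true;
}
```

A test can pass an object literal with an array-backed fake and a fixed today() value, with no module mocking and no real filesystem.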
GitHub Integration
Each watched repo gets a single typed GitHubClient, cached in a lazy Map inside the daemon boot context. One GITHUB_TOKEN authenticates all repos.
| Domain | Methods |
|---|---|
| Issues | getIssue · closeIssue · listOpenIssues · listRecentlyClosedIssues |
| Pull Requests | listOpenPullRequests · findOpenPullRequestForIssue · getPullRequest · createPullRequest · mergePullRequest · closePullRequest · enableAutoMerge · getLatestPrReviewVerdict |
| CI | getCheckRunsForCommit · getBranchCiStatus · getBranchProtection |
| Repo | getBaseBranchHead · getReleases |
| Error types | GitHubAuthError (401/403) · GitHubNotFoundError (404) · GitHubRateLimitError (429) |
latestCheckRunsByName(runs) deduplicates check runs by name, keeping the latest attempt per name (by started_at). This prevents re-runs of a CI job from appearing as two separate failures.
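
A minimal sketch of that deduplication, assuming each run carries a name and an ISO 8601 started_at string, as GitHub check runs do. Tie-breaking in favour of the later list entry is an assumption of this sketch.

```typescript
// Keeps one run per name: the one with the latest started_at.
function latestCheckRunsByName<T extends { name: string; started_at: string }>(
  runs: readonly T[],
): T[] {
  const latest = new Map<string, T>();
  for (const run of runs) {
    const seen = latest.get(run.name);
    // Assumption: on equal timestamps the later entry in the list wins.
    if (seen === undefined || Date.parse(run.started_at) >= Date.parse(seen.started_at)) {
      latest.set(run.name, run);
    }
  }
  return Array.from(latest.values());
}
```

A job that failed and was then re-run successfully is reported once, with the conclusion of the newer attempt.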
SSE Events
Mutations publish typed events through an in-memory bus. The bus persists each event to SQLite. An SSE route streams them to connected browsers in real time.
flowchart LR
A[Server Action\nor daemon step] -->|publish| B[EventBus\nin-memory]
B -->|appendEvent\nfire-and-forget| C[(SQLite\nEvent table)]
B -->|stream| D[GET /api/events\nSSE route]
D -->|EventSource| E[Client Component\nuseWorkerEvents]
E -->|applyEvent| F[Local React state\nno page reload]
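
The publish path in the diagram might look like the sketch below. The BusEvent shape, the subscribe method, and the constructor-injected appendEvent function are assumptions; the point illustrated is that persistence is started but never awaited, so a slow or failing write cannot block or break delivery to live subscribers.

```typescript
interface BusEvent {
  type: string;
  payload: unknown;
}

type Listener = (event: BusEvent) => void;

class EventBus {
  private readonly listeners = new Set<Listener>();
  private readonly appendEvent: (event: BusEvent) => Promise<void>;

  // appendEvent is the injected persistence function (hypothetical signature).
  constructor(appendEvent: (event: BusEvent) => Promise<void>) {
    this.appendEvent = appendEvent;
  }

  // Registers a listener and returns an unsubscribe function.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  publish(event: BusEvent): void {
    // Fire-and-forget: the write is not awaited, and a failure is only logged.
    void this.appendEvent(event).catch((err: unknown) => {
      console.error("appendEvent failed", err);
    });
    // Deliver to in-process subscribers such as the SSE route.
    this.listeners.forEach((listener) => listener(event));
  }
}
```

An SSE route would call subscribe when a browser connects and invoke the returned function when the connection closes.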
| Event | Effect on client |
|---|---|
| worker.claimed | Prepends a new worker row to the board |
| worker.state_changed | Updates the matching row's status badge |
| worker.completed / worker.failed | Stamps terminal status on the row |
| worker.github_state_changed | Updates issueState / prState display |
| repo.updated | Triggers a re-fetch of the repo snapshot |
| resource.sampled | Updates memory and CPU display for active workers |
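
On the client, the worker rows of this table could be handled by a pure reducer along these lines. The event payload shapes and the WorkerRowView type are assumptions made for the sketch; only the event names and their effects come from the table. The repo.updated and resource.sampled events are left out for brevity.

```typescript
// Hypothetical view model for one row on the board.
interface WorkerRowView {
  id: string;
  status: string;
  issueState?: string;
  prState?: string;
}

// Assumed payload shapes; only the event names come from the table.
type WorkerEvent =
  | { type: "worker.claimed"; worker: WorkerRowView }
  | { type: "worker.state_changed" | "worker.completed" | "worker.failed"; workerId: string; status: string }
  | { type: "worker.github_state_changed"; workerId: string; issueState?: string; prState?: string };

// Returns a new array; the input rows are never mutated.
function applyEvent(rows: readonly WorkerRowView[], event: WorkerEvent): WorkerRowView[] {
  switch (event.type) {
    case "worker.claimed":
      return [event.worker, ...rows]; // prepend the new row
    case "worker.state_changed":
    case "worker.completed":
    case "worker.failed": {
      const { workerId, status } = event;
      return rows.map((row) => (row.id === workerId ? { ...row, status } : row));
    }
    case "worker.github_state_changed": {
      const { workerId, issueState, prState } = event;
      return rows.map((row) =>
        row.id === workerId
          ? { ...row, issueState: issueState ?? row.issueState, prState: prState ?? row.prState }
          : row,
      );
    }
  }
}
```

Because the function is pure, a Client Component can pass it straight to a state setter as each SSE message arrives.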
Module Boundaries
Enforced by dependency-cruiser (make deps). Violations are error-severity and fail CI.
| Rule | From | To | Reason |
|---|---|---|---|
| no-circular | any | any (circular) | Circular dependencies forbidden everywhere |
| no-components-to-prisma-client | UI components | @prisma/client | Components use src/types/ not raw Prisma types |
| no-components-to-repos | UI components | *.repo.ts | Components go through Server Components or Actions, not repos directly |
| no-lib-to-app | src/lib/ | src/app/ | Lib is app-agnostic and must not depend on routing or UI |
| no-cross-feature-imports | src/app/<feature>/ | other src/app/<feature>/ | Feature modules are isolated; shared code goes to src/lib/ or src/components/ |
| no-types-to-app | src/types/ | src/app/ | Types have no app dependencies |
| no-types-to-lib | src/types/ | src/lib/ | Types have no lib dependencies; they only import from node_modules |
Key Invariants
Eight non-negotiable properties the codebase upholds. These are the guards that make autonomous operation safe.
1. No try/catch in repository functions: Repository functions (src/db/*.repo.ts) are pure Prisma passthroughs. Errors propagate to callers that have the context to decide whether to log, retry, swallow, or fail the worker. This prevents hidden error suppression in the data layer.
2. Guarded CAS transitions: Every lifecycle transition uses updateWorkerStatusFrom(id, expectedFrom[], to). If the row's current status is not in expectedFrom, the update is a no-op (returns false) and no event is published. Two concurrent actors cannot double-advance or back-advance the state machine.
3. Dependency injection at every boundary: No function in src/server/daemon/ or src/server/workers/ has bare module-level imports of external collaborators. Production wiring is assembled once in boot.ts. Tests inject fakes at any granularity.
4. Single-flight runner registry: RunnerRegistry.reserve(workerId, fn) is synchronous and returns false immediately if a runner is already registered. This is the real double-click guard: it does not depend on DB state or network latency, and it is not exposed to races in async code.
5. Poll interval read fresh each cycle: createDaemon reads pollIntervalMs from the config DB at the start of each cycle() call rather than at boot. A change made on the Config page takes effect on the next cycle without requiring a restart.
6. Worktree-scoped agent execution: assertScopedToWorktree(cwd) throws if an agent's working directory is not under WORKTREES_ROOT. This runtime invariant prevents agent commands from accidentally mutating the main repo checkout.
7. Report fold serialization: serializeReportFold<T>(workerId, task) maintains a per-worker promise chain. Concurrent phases (implement, verify, CI fix) each persist stats to the same WorkerReport row. The chain makes these read-modify-write operations sequential in-process, preventing clobbering.
8. Verify gate fail-safe: parseVerdict(summary) returns "findings" for any unparseable or absent verdict string. The verify gate defaults to blocking: only an explicit SLOP_VERDICT: pass in the agent output advances the worker. A crashed or silent agent never accidentally ships.
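
Two of these invariants, the single-flight registry and the verify gate fail-safe, are small enough to sketch. Both are illustrations under assumptions: the source gives only the function names, the reserve contract, and the SLOP_VERDICT: pass marker. The Set-based bookkeeping, the exact marker parsing, and the choice to block when marker lines conflict are hypothetical.

```typescript
class RunnerRegistry {
  private readonly active = new Set<string>();

  // Synchronous check-and-register: no await sits between the check and the add,
  // so two calls in the same tick cannot both succeed.
  reserve(workerId: string, fn: () => Promise<void>): boolean {
    if (this.active.has(workerId)) return false;
    this.active.add(workerId);

    const release = (): void => {
      this.active.delete(workerId);
    };
    // Start the runner on the next microtask and free the slot when it settles.
    Promise.resolve()
      .then(fn)
      .then(release, (err: unknown) => {
        release();
        console.error(`runner for ${workerId} failed`, err);
      });
    return true;
  }
}

type Verdict = "pass" | "findings";

// Fail-safe parse: anything other than an unambiguous explicit pass blocks.
function parseVerdict(summary: string | null | undefined): Verdict {
  if (!summary) return "findings"; // absent or empty output blocks
  const markers = summary.match(/^SLOP_VERDICT:.*$/gm) ?? [];
  const allPass =
    markers.length > 0 && markers.every((line) => /^SLOP_VERDICT:\s*pass\s*$/.test(line));
  return allPass ? "pass" : "findings";
}
```

Defaulting to "findings" means that new failure modes, such as truncated output or a changed marker format, stop the worker instead of shipping it.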
Placement Rules
Where each concern lives in the source tree. Following these rules keeps module boundaries enforceable.
| Concern | Location |
|---|---|
| Daemon poll loop | src/server/daemon/ |
| Per-issue worker | src/server/workers/ |
| GitHub REST wrapper | src/server/github/ |
| Events pub/sub | src/server/events/ |
| Prisma repo (one per concern) | src/db/<feature>.repo.ts |
| Cross-cutting type | src/types/<feature>.ts |
| Shared util (config, logging, time) | src/lib/ |
| Pages, routes, components | src/app/ |
| Server Actions | src/app/actions/ |
| Test | Next to the file it tests, <feature>.test.ts |
| Agent skills, rules, global config | harness/ |