MTG Commander Deck Builder — Project State

Overview

Self-hosted, AI-powered Magic: The Gathering Commander deck builder. Runs as a Docker Compose stack managed via Portainer, exposed externally via Cloudflare Tunnel + Traefik.

Stack: FastAPI (Python 3.12) · React/Vite (TypeScript) · PostgreSQL 16 · Redis 7 · Nginx · Docker Compose


Current Status

App is live at: http://commander.bussenet.ca (HTTP locally) / https://commander.bussenet.ca (via Cloudflare)

  • Login works
  • Admin panel works
  • Collection import endpoints exist
  • Deck generation failing (see Active Issues below)

Active Issues

1. Deck Generation JSON Parse Failure

Claude returns a response structured as commander + deck/decklist instead of the required deck_name + strategy_summary + cards. In addition, the response was being truncated because max_tokens was set too low.

Fixes applied:

  • max_tokens increased to 16000 in deck_service.py
  • _build_payload updated to accept decklist/deck as fallback keys
  • _parse_json updated with multi-stage parsing and fallback extraction
  • System prompt strengthened with explicit JSON structure requirements

Current state: max_tokens fix is confirmed in the running image (grep shows GENERATE_MAX_TOKENS=16000 at line 15). Not yet confirmed working end-to-end due to deploy pipeline issues.
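
A minimal sketch of the multi-stage parsing and fallback-key handling listed above. It is not the project's actual _parse_json or _build_payload: the function names, the order of the stages, and the "Untitled deck" default are assumptions made for illustration.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the example readable
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_lenient(text: str) -> dict[str, Any]:
    """Try progressively looser strategies to pull a JSON object out of model output."""
    candidates = [text]  # Stage 1: the whole response is JSON.
    fenced = FENCED_JSON.search(text)
    if fenced:
        candidates.append(fenced.group(1))  # Stage 2: JSON inside a code fence.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start : end + 1])  # Stage 3: first "{" to last "}".

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    raise ValueError("no JSON object found in response (possibly truncated)")


def normalize_deck(raw: dict[str, Any]) -> dict[str, Any]:
    """Map the alternative shape (deck/decklist) onto the required keys."""
    cards = raw.get("cards") or raw.get("decklist") or raw.get("deck")
    if not isinstance(cards, list):
        raise ValueError("no card list under cards/decklist/deck")
    return {
        "deck_name": raw.get("deck_name") or "Untitled deck",
        "strategy_summary": raw.get("strategy_summary", ""),
        "cards": cards,
    }
```

A truncated response still fails at every stage, which is why the max_tokens increase matters independently of the parser changes.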

2. Cloudflare 100s Timeout

Claude API calls take 30-60 seconds, and Cloudflare's free tier imposes a 100-second limit. With max_tokens=16000, responses may take longer and hit this limit.

Planned fix: Implement async deck generation — return job ID immediately, frontend polls for result.

3. Deployment Pipeline (Root Cause of Most Pain)

Building Docker images directly from git URLs (docker build http://gitea/...) uses Docker's internal git cache, which frequently serves stale code even with --no-cache --pull. This caused multiple "fix applied but not running" cycles.

Planned fix: Set up CI/CD webhook — Gitea push triggers server script that clones fresh, builds from local filesystem, restarts container. This is the top priority for the next session.
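
The planned flow (fresh clone, build from the local filesystem, restart) might be sketched as below. The function names, the temp-directory layout, and the shallow --depth 1 clone are assumptions; the Gitea remote, image names, and container names are the ones listed elsewhere in this document.

```python
import subprocess
import tempfile
from pathlib import Path

REPO_URL = "ssh://git@192.168.0.62:2222/Dan/Commander-Deck-App.git"  # Gitea remote from this doc
SERVICES = ("backend", "frontend", "nginx")


def deploy_commands(service: str, workdir: Path, branch: str = "master") -> list[list[str]]:
    """Return the commands for one service, in order: fresh clone, local build, restart."""
    if service not in SERVICES:
        raise ValueError(f"unknown service: {service}")
    checkout = workdir / "src"
    return [
        ["git", "clone", "--depth", "1", "--branch", branch, REPO_URL, str(checkout)],
        # The build context is a local directory, so Docker's git-URL fetch is never involved.
        ["docker", "build", "--no-cache", "--pull",
         "-t", f"commander-forge-{service}:latest", str(checkout / service)],
        # Mirrors the current manual process. Caveat: `docker restart` keeps using the
        # image the container was created from; recreating the container may be needed.
        ["docker", "restart", f"commander-forge-{service}-1"],
    ]


def deploy(service: str) -> None:
    """Run the plan in a throwaway directory and stop at the first failing command."""
    with tempfile.TemporaryDirectory() as tmp:
        for cmd in deploy_commands(service, Path(tmp)):
            subprocess.run(cmd, check=True)
```

Cloning into a new temporary directory on every run is what guarantees the build never sees a cached checkout.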


Deployment Architecture

Stack

  • Portainer stack ID: 54 (commander-forge)
  • All services use pre-built images — no build: directives in compose file
  • Images: commander-forge-nginx:latest, commander-forge-frontend:latest, commander-forge-backend:latest

Networking

  • Cloudflare Tunnel → http://localhost:80 → Traefik → nginx (traefik-public network)
  • traefik.docker.network=traefik-public label required on nginx
  • All other services on commander-forge_internal network
  • Cloudflared on host network — all ingress uses localhost not container IPs
  • Cloudflare region2 (198.41.200.x) unreachable — ISP routing issue, outside our control

Manual Build Commands (current process — to be replaced by CI/CD)

# Always build from GitHub to avoid Gitea git cache issues
sudo docker rmi commander-forge-backend -f
sudo docker build -t commander-forge-backend:latest --no-cache --pull "https://github.com/danbusse/Commander-Deck-App.git#master:backend"
sudo docker restart commander-forge-backend-1

sudo docker rmi commander-forge-frontend -f
sudo docker build -t commander-forge-frontend:latest --no-cache "https://github.com/danbusse/Commander-Deck-App.git#master:frontend"
sudo docker restart commander-forge-frontend-1

sudo docker rmi commander-forge-nginx -f
sudo docker build -t commander-forge-nginx:latest --no-cache --pull "https://github.com/danbusse/Commander-Deck-App.git#master:nginx"
sudo docker restart commander-forge-nginx-1

IMPORTANT: Always build from GitHub URL, not Gitea. Gitea has persistent git cache issues.


Infrastructure

| Service | URL | Notes |
| --- | --- | --- |
| Commander Forge | https://commander.bussenet.ca | Main app |
| Portainer | https://portainer.bussenet.ca | Stack management |
| Gitea | https://gitea.bussenet.ca | Primary git (SSH port 2222) |
| GitHub mirror | https://github.com/danbusse/Commander-Deck-App | Private, Claude's file access path |
| Vault | https://vault.bussenet.ca | Secrets store |
| Portainer MCP | https://mcp-portainer.bussenet.ca/sse | Custom image with entrypoint fix |

Portainer MCP

Custom mcp-portainer:latest image built from ghcr.io/serraniel/portainer-mcp-docker:http.

  • Fixed entrypoint passes -- before portainer-mcp command
  • Tools written to /tmp/tools.yaml
  • PORTAINER_SERVER set to 192.168.0.62:9443 (no protocol prefix — binary prepends https://)
  • Rebuild command: sudo docker build -t mcp-portainer:latest ~/portainer-mcp-build/

GitHub API Access

Claude reads/writes files via GitHub API using token in Vault at secret/github.claude-api-token. This is Claude's primary mechanism for updating project files between sessions.

Git Workflow

Two remotes configured:

  • origin → Gitea (ssh://git@192.168.0.62:2222/Dan/Commander-Deck-App.git)
  • github → GitHub (https://github.com/danbusse/Commander-Deck-App.git)

Always push to both: git push origin master && git push github master


Known Fixes Applied

| Issue | Fix | File |
| --- | --- | --- |
| passlib incompatible with bcrypt 4.x | Replaced with bcrypt==4.1.3 | requirements.txt, security.py |
| npm ci fails on Linux | Changed to npm install | frontend/Dockerfile |
| Portainer volume mount creates directory | Baked nginx config into image | nginx/Dockerfile |
| Traefik routing wrong network | Added traefik.docker.network=traefik-public label | docker-compose.yml |
| UserRole enum uppercase/lowercase mismatch | Renamed members to lowercase (pending/approved/admin) | user.py, admin_bootstrap.py, deps.py, admin.py |
| Missing DATABASE_URL/REDIS_URL | Passed explicitly in stack env vars | Portainer stack |
| JSON truncation in deck generation | Increased max_tokens to 16000 | deck_service.py |
| Claude returns wrong JSON structure | Added fallback key handling + multi-stage parser | claude_client.py |
| Archidekt JSON crash on missing set code | Added or "" before .lower() | archidekt.py |
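
The Archidekt fix in the last row relies on a common Python detail: dict.get(key, default) applies the default only when the key is absent, not when it is present with a null value. A small illustration follows; the set_code helper and the "set" key are made up for the example and are not taken from archidekt.py.

```python
from typing import Any


def set_code(card: dict[str, Any]) -> str:
    """Return the lowercase set code, or "" when it is missing or null."""
    # card.get("set", "") would still return None for {"set": None},
    # and None.lower() raises AttributeError. `or ""` covers both cases.
    return (card.get("set") or "").lower()
```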

Environment Variables (Portainer stack env)

| Variable | Value | Notes |
| --- | --- | --- |
| SECRET_KEY | changeme | ⚠️ Needs replacing |
| POSTGRES_PASSWORD | changeme | ⚠️ Needs replacing |
| POSTGRES_DB | mtgdb | |
| POSTGRES_USER | mtg | |
| DATABASE_URL | postgresql+asyncpg://mtg:changeme@db:5432/mtgdb | ⚠️ Update with new password |
| REDIS_URL | redis://cache:6379 | |
| ANTHROPIC_API_KEY | (in Vault at secret/anthropic) | |
| ADMIN_EMAIL | busse.daniel@gmail.com | |
| ADMIN_PASSWORD | Admin1234 | |

Test Suite

Located at backend/tests/. Run with:

cd /tmp/Commander-Deck-App/backend
pip install -r requirements.txt --break-system-packages
pytest tests/ -v

56 tests, all passing. Covers: claude_client parsing, constraints, archidekt/manabox importers, UserRole enum.


Next Session — Start Here

Priority 1 — Set up CI/CD webhook (DO THIS FIRST)

The manual build process is unreliable due to Docker git source caching. Set up a Gitea webhook that triggers a deploy script on the server on every push to master.

Basic approach:

  1. Create a deploy script on the server (/home/dan/deploy.sh) that:
    • git clone or git pull from Gitea into a temp directory
    • docker build from local filesystem (not git URL)
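    • docker restart the affected container (caveat: docker restart keeps running the image the container was created from, so a rebuilt :latest tag is only picked up once the container is recreated, e.g. by redeploying the stack in Portainer)
  2. Set up a simple webhook receiver (e.g. a small Python or bash HTTP server, or an off-the-shelf tool such as webhook) for Gitea's built-in webhooks to call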
  3. Configure Gitea to POST to the webhook on push to master
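
For steps 2 and 3, a small receiver could look like the sketch below. It assumes the Gitea webhook is configured with a secret and that Gitea sends the hex HMAC-SHA256 of the request body in the X-Gitea-Signature header. The port, the bind address, the WEBHOOK_SECRET environment variable, and the helper names are assumptions; /home/dan/deploy.sh is the script path proposed in step 1.

```python
import hashlib
import hmac
import os
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

DEPLOY_SCRIPT = "/home/dan/deploy.sh"


def signature_ok(secret: bytes, body: bytes, signature: str) -> bool:
    """Compare the received signature with our own HMAC-SHA256 of the body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature.encode())


def should_deploy(body: bytes, branch: str = "master") -> bool:
    """Deploy only for pushes to the given branch."""
    try:
        payload = json.loads(body)
    except ValueError:  # malformed JSON or invalid UTF-8
        return False
    return isinstance(payload, dict) and payload.get("ref") == f"refs/heads/{branch}"


class Hook(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        secret = os.environ.get("WEBHOOK_SECRET", "").encode()
        signature = self.headers.get("X-Gitea-Signature", "")
        if not secret or not signature_ok(secret, body, signature):
            self.send_response(403)
        elif should_deploy(body):
            subprocess.Popen([DEPLOY_SCRIPT])  # do not wait, so Gitea gets a fast reply
            self.send_response(202)
        else:
            self.send_response(204)
        self.end_headers()


def serve(host: str = "127.0.0.1", port: int = 9000) -> None:
    """Block forever handling webhook calls; the address must be reachable from Gitea."""
    HTTPServer((host, port), Hook).serve_forever()
```

Calling serve() starts the receiver. Rejecting unsigned requests matters here because the endpoint triggers a rebuild and restart on the host.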

Priority 2 — Confirm deck generation works

Once CI/CD is in place and we can deploy reliably, test deck generation with the max_tokens=16000 fix.

Priority 3 — Async deck generation

If deck generation still hits Cloudflare's 100s timeout, implement async pattern:

  • POST /generate returns job ID immediately
  • Background task runs Claude call
  • Frontend polls GET /decks/{id}/status until complete
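
The three steps above can be sketched without any web framework. The job table, function names, and status strings below are assumptions for illustration; in the real app, submit and status would sit behind the POST /generate and GET /decks/{id}/status routes, and the in-memory dictionary could be swapped for the Redis instance already in the stack.

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable

# In-memory job table: enough for a single backend process, lost on restart.
JOBS: dict[str, dict[str, Any]] = {}
_TASKS: set[asyncio.Task] = set()  # hold references so running tasks are not garbage-collected


async def _run(job_id: str, work: Callable[[], Awaitable[Any]]) -> None:
    """Run the slow call in the background and record its outcome."""
    try:
        JOBS[job_id] = {"status": "complete", "result": await work()}
    except Exception as exc:
        JOBS[job_id] = {"status": "failed", "error": str(exc)}


def submit(work: Callable[[], Awaitable[Any]]) -> str:
    """What POST /generate would do: register the job and return its ID at once.

    Must be called from inside a running event loop.
    """
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    task = asyncio.create_task(_run(job_id, work))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return job_id


def status(job_id: str) -> dict[str, Any]:
    """What GET /decks/{id}/status would return on each poll."""
    return JOBS.get(job_id, {"status": "unknown"})
```

Each poll is a short request, so no single HTTP call comes near the 100s limit regardless of how long the Claude call takes.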

Priority 4 — Harden credentials

  • SECRET_KEY → openssl rand -hex 32
  • POSTGRES_PASSWORD → strong password
  • Update DATABASE_URL to match
  • Update in Portainer stack env vars
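
A short sketch of the same steps in Python, in case it is more convenient than the shell. The helper names are assumptions. The percent-encoding step reflects the fact that characters such as @, : and / in a password would otherwise be read as URL delimiters in DATABASE_URL.

```python
import secrets
from urllib.parse import quote


def new_secret_key() -> str:
    """64 hex characters from 32 random bytes, comparable to `openssl rand -hex 32`."""
    return secrets.token_hex(32)


def database_url(user: str, password: str, host: str, port: int, db: str) -> str:
    """Build the asyncpg URL with the user and password percent-encoded."""
    return (
        f"postgresql+asyncpg://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )
```

For example, database_url("mtg", "p@ss/w:rd", "db", 5432, "mtgdb") yields postgresql+asyncpg://mtg:p%40ss%2Fw%3Ard@db:5432/mtgdb.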