# MTG Commander Deck Builder — Project State
## Overview
Self-hosted, AI-powered Magic: The Gathering Commander deck builder.
Runs as a Docker Compose stack managed via Portainer, exposed externally via Cloudflare Tunnel + Traefik.
**Stack:** FastAPI (Python 3.12) · React/Vite (TypeScript) · PostgreSQL 16 · Redis 7 · Nginx · Docker Compose

---
## Current Status
**App is live at:** http://commander.bussenet.ca (HTTP locally) / https://commander.bussenet.ca (via Cloudflare)
- ✅ Login works
- ✅ Admin panel works
- ✅ Collection import endpoints exist
- ❌ Deck generation failing (see Active Issues below)
---
## Active Issues
### 1. Deck Generation JSON Parse Failure
Claude returns a response structured as `commander` + `deck`/`decklist` instead of the required `deck_name` + `strategy_summary` + `cards`. In addition, the response was being truncated because `max_tokens` was set too low.
**Fixes applied:**
- `max_tokens` increased to 16000 in `deck_service.py`
- `_build_payload` updated to accept `decklist`/`deck` as fallback keys
- `_parse_json` updated with multi-stage parsing and fallback extraction
- System prompt strengthened with explicit JSON structure requirements

**Current state:** The `max_tokens` fix is confirmed in the running image (`grep` shows `GENERATE_MAX_TOKENS=16000` at line 15). Deck generation has not yet been confirmed working end-to-end because of the deploy pipeline issues.
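The multi-stage parsing and fallback-key handling described above might look roughly like the sketch below. It is not the code in `claude_client.py`: the names `parse_json_lenient` and `normalize_payload` are hypothetical, and it assumes that `decklist`/`deck` hold the card list directly.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the sketch readable


def parse_json_lenient(text: str) -> dict[str, Any]:
    """Try strict parsing, then a fenced block, then the outermost {...} span."""
    candidates = [text]
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    # A truncated response usually ends up here: no stage yields valid JSON.
    raise ValueError("no JSON object found in model response")


def normalize_payload(data: dict[str, Any]) -> dict[str, Any]:
    """Map the alternative keys onto the required structure."""
    # Assumption: `decklist` / `deck` carry the same list that `cards` would.
    cards = data.get("cards") or data.get("decklist") or data.get("deck") or []
    return {
        "deck_name": data.get("deck_name", ""),
        "strategy_summary": data.get("strategy_summary", ""),
        "cards": cards,
    }
```

A truncated response still fails every stage, so this kind of parser complements the `max_tokens` increase rather than replacing it.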
### 2. Cloudflare 100s Timeout
Claude API calls take 30-60 seconds. Cloudflare free tier imposes a 100s limit. With max_tokens=16000, responses may take longer and hit this limit.
**Planned fix:** Implement async deck generation — return job ID immediately, frontend polls for result.
### 3. Deployment Pipeline (Root Cause of Most Pain)
Building Docker images directly from git URLs (`docker build http://gitea/...`) uses Docker's internal git cache which frequently serves stale code even with `--no-cache --pull`. This caused multiple "fix applied but not running" cycles.
**Planned fix:** Set up a CI/CD webhook: a Gitea push triggers a server script that clones a fresh copy, builds from the local filesystem, and restarts the container. This is the **top priority for the next session**.

---
## Deployment Architecture
### Stack
- Portainer stack ID: **54** (commander-forge)
- All services use pre-built images — no `build:` directives in compose file
- Images: `commander-forge-nginx:latest`, `commander-forge-frontend:latest`, `commander-forge-backend:latest`
### Networking
- Cloudflare Tunnel → `http://localhost:80` → Traefik → nginx (`traefik-public` network)
- `traefik.docker.network=traefik-public` label required on nginx
- All other services on `commander-forge_internal` network
- Cloudflared on host network — all ingress uses `localhost` not container IPs
- Cloudflare region2 (`198.41.200.x`) unreachable — ISP routing issue, outside our control
### Manual Build Commands (current process — to be replaced by CI/CD)
```bash
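# Always build from GitHub to avoid Gitea git cache issues.
# Note: `docker restart` reuses the image the container was created from. If the new
# build is not picked up, recreate the container (e.g. redeploy the stack in Portainer).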
sudo docker rmi commander-forge-backend -f
sudo docker build -t commander-forge-backend:latest --no-cache --pull "https://github.com/danbusse/Commander-Deck-App.git#master:backend"
sudo docker restart commander-forge-backend-1
sudo docker rmi commander-forge-frontend -f
sudo docker build -t commander-forge-frontend:latest --no-cache --pull "https://github.com/danbusse/Commander-Deck-App.git#master:frontend"
sudo docker restart commander-forge-frontend-1
sudo docker rmi commander-forge-nginx -f
sudo docker build -t commander-forge-nginx:latest --no-cache --pull "https://github.com/danbusse/Commander-Deck-App.git#master:nginx"
sudo docker restart commander-forge-nginx-1
```
**IMPORTANT:** Always build from the GitHub URL, not Gitea. Builds from the Gitea URL have repeatedly served stale code.

---
## Infrastructure
| Service | URL | Notes |
|---------|-----|-------|
| Commander Forge | https://commander.bussenet.ca | Main app |
| Portainer | https://portainer.bussenet.ca | Stack management |
| Gitea | https://gitea.bussenet.ca | Primary git (SSH port 2222) |
| GitHub mirror | https://github.com/danbusse/Commander-Deck-App | Private, Claude's file access path |
| Vault | https://vault.bussenet.ca | Secrets store |
| Portainer MCP | https://mcp-portainer.bussenet.ca/sse | Custom image with entrypoint fix |
### Portainer MCP
Custom `mcp-portainer:latest` image built from `ghcr.io/serraniel/portainer-mcp-docker:http`.
- Fixed entrypoint passes `--` before portainer-mcp command
- Tools written to `/tmp/tools.yaml`
- PORTAINER_SERVER set to `192.168.0.62:9443` (no protocol prefix — binary prepends https://)
- Rebuild command: `sudo docker build -t mcp-portainer:latest ~/portainer-mcp-build/`
### GitHub API Access
Claude reads/writes files via GitHub API using token in Vault at `secret/github.claude-api-token`.
This is Claude's primary mechanism for updating project files between sessions.
### Git Workflow
Two remotes configured:
- `origin` → Gitea (`ssh://git@192.168.0.62:2222/Dan/Commander-Deck-App.git`)
- `github` → GitHub (`https://github.com/danbusse/Commander-Deck-App.git`)

Always push to both: `git push origin master && git push github master`

---
## Known Fixes Applied
| Issue | Fix | File |
|-------|-----|------|
| passlib incompatible with bcrypt 4.x | Replaced with `bcrypt==4.1.3` | `requirements.txt`, `security.py` |
| npm ci fails on Linux | Changed to `npm install` | `frontend/Dockerfile` |
| Portainer volume mount creates directory | Baked nginx config into image | `nginx/Dockerfile` |
| Traefik routing wrong network | Added `traefik.docker.network=traefik-public` label | `docker-compose.yml` |
| UserRole enum uppercase/lowercase mismatch | Renamed members to lowercase (`pending/approved/admin`) | `user.py`, `admin_bootstrap.py`, `deps.py`, `admin.py` |
| Missing DATABASE_URL/REDIS_URL | Passed explicitly in stack env vars | Portainer stack |
| JSON truncation in deck generation | Increased max_tokens to 16000 | `deck_service.py` |
| Claude returns wrong JSON structure | Added fallback key handling + multi-stage parser | `claude_client.py` |
| Archidekt JSON crash on missing set code | Added `or ""` before `.lower()` | `archidekt.py` |
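
The Archidekt fix in the last row guards against a key that is present but null, which a `.get()` default does not cover. A minimal illustration, assuming a hypothetical `set_code` field name (the real importer's field may differ):

```python
from typing import Any


def normalized_set_code(card: dict[str, Any]) -> str:
    """Return the lowercased set code, or "" when it is missing or null."""
    # card.get("set_code", "") would still return None for {"set_code": None},
    # and None.lower() raises AttributeError; `or ""` covers both cases.
    return (card.get("set_code") or "").lower()
```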
---
## Environment Variables (Portainer stack env)
| Variable | Value | Notes |
|----------|-------|-------|
| SECRET_KEY | changeme | ⚠️ Needs replacing |
| POSTGRES_PASSWORD | changeme | ⚠️ Needs replacing |
| POSTGRES_DB | mtgdb | |
| POSTGRES_USER | mtg | |
| DATABASE_URL | postgresql+asyncpg://mtg:changeme@db:5432/mtgdb | ⚠️ Update with new password |
| REDIS_URL | redis://cache:6379 | |
| ANTHROPIC_API_KEY | (in Vault at secret/anthropic) | |
| ADMIN_EMAIL | busse.daniel@gmail.com | |
| ADMIN_PASSWORD | Admin1234 | |
---
## Test Suite
Located at `backend/tests/`. Run with:
```bash
cd /tmp/Commander-Deck-App/backend
pip install -r requirements.txt --break-system-packages
pytest tests/ -v
```
All 56 tests pass. Coverage: `claude_client` parsing, constraints, Archidekt/ManaBox importers, and the `UserRole` enum.

---
## Next Session — Start Here
### Priority 1 — Set up CI/CD webhook (DO THIS FIRST)
The manual build process is unreliable due to Docker git source caching. Set up a Gitea webhook that triggers a deploy script on the server on every push to master.
Basic approach:
1. Create a deploy script on the server (`/home/dan/deploy.sh`) that:
   - runs `git clone` or `git pull` from Gitea into a temp directory
   - runs `docker build` from the local filesystem (not a git URL)
   - runs `docker restart` on the affected container
2. Set up a simple webhook receiver (e.g. a small Python/bash HTTP server, or a tool like `webhook` that receives Gitea's built-in webhook calls)
3. Configure Gitea to POST to the webhook on push to master
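
As a starting point, the three steps above could be combined into one small Python receiver. This is a sketch under assumptions, not a tested deployment: the shared secret, port, and bind address are placeholders, the service directories are taken from the manual build commands, and the final `docker restart` follows the plan as written even though a container keeps the image it was created from (swap in a recreate step if the new image is not picked up).

```python
import hashlib
import hmac
import json
import subprocess
import tempfile
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

SECRET = b"change-me"  # placeholder: must match the secret set on the Gitea webhook
REPO_URL = "ssh://git@192.168.0.62:2222/Dan/Commander-Deck-App.git"
SERVICES = ("backend", "frontend", "nginx")
_deploy_lock = threading.Lock()  # serialize deploys if pushes arrive close together


def signature_ok(body: bytes, header_sig: str, secret: bytes = SECRET) -> bool:
    """Check Gitea's X-Gitea-Signature header (hex HMAC-SHA256 of the raw body)."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), (header_sig or "").encode())


def deploy_commands(workdir: str) -> list[list[str]]:
    """Fresh clone, then build each image from the local checkout."""
    cmds = [["git", "clone", "--depth", "1", "--branch", "master", REPO_URL, workdir]]
    for svc in SERVICES:
        image = f"commander-forge-{svc}:latest"
        cmds.append(["docker", "build", "--no-cache", "--pull", "-t", image, f"{workdir}/{svc}"])
        # Follows the plan above; replace with a recreate step if needed.
        cmds.append(["docker", "restart", f"commander-forge-{svc}-1"])
    return cmds


def run_deploy() -> None:
    with _deploy_lock, tempfile.TemporaryDirectory() as tmp:
        for cmd in deploy_commands(f"{tmp}/repo"):
            subprocess.run(cmd, check=True)  # stop at the first failing step


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", "0"))
        body = self.rfile.read(length)
        if not signature_ok(body, self.headers.get("X-Gitea-Signature", "")):
            self.send_response(403)
            self.end_headers()
            return
        try:
            payload = json.loads(body)
        except ValueError:
            payload = None
        ref = payload.get("ref") if isinstance(payload, dict) else None
        if ref == "refs/heads/master":
            # Build in the background so Gitea gets a response immediately.
            threading.Thread(target=run_deploy, daemon=True).start()
        self.send_response(202)
        self.end_headers()


def serve(port: int = 9000) -> None:
    # Bind address and port are assumptions; Gitea must be able to reach them.
    HTTPServer(("127.0.0.1", port), HookHandler).serve_forever()
```

Calling `serve()` starts the receiver. Because each deploy clones into a new temporary directory and builds from that path, the build context never comes from a git URL.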
### Priority 2 — Confirm deck generation works
Once CI/CD is in place and we can deploy reliably, test deck generation with the max_tokens=16000 fix.
### Priority 3 — Async deck generation
If deck generation still hits Cloudflare's 100s timeout, implement async pattern:
- POST /generate returns job ID immediately
- Background task runs Claude call
- Frontend polls GET /decks/{id}/status until complete
### Priority 4 — Harden credentials
- `SECRET_KEY` → `openssl rand -hex 32`
- `POSTGRES_PASSWORD` → strong password
- Update `DATABASE_URL` to match
- Update in Portainer stack env vars
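
When rotating these values, note that a password containing characters such as `@` or `/` has to be percent-encoded inside `DATABASE_URL`. The sketch below shows one way to generate the key and assemble the URL. The helper names are hypothetical, and the host, port, and database defaults are taken from the table above.

```python
import secrets
from urllib.parse import quote


def new_secret_key() -> str:
    """Same shape as `openssl rand -hex 32`: 32 random bytes as 64 hex characters."""
    return secrets.token_hex(32)


def build_database_url(
    user: str, password: str, host: str = "db", port: int = 5432, db: str = "mtgdb"
) -> str:
    """Assemble the asyncpg URL, percent-encoding the credentials."""
    # safe="" makes quote() encode "/" and "@" too, which would otherwise break the URL.
    return (
        f"postgresql+asyncpg://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{db}"
    )
```

For example, `build_database_url("mtg", "p@ss/word")` yields `postgresql+asyncpg://mtg:p%40ss%2Fword@db:5432/mtgdb`.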