# DevLoop - Closed-Loop Development System

## Overview

DevLoop is Atomizer's autonomous development cycle system that coordinates AI agents and automated testing to create a closed-loop development workflow.

**Key Features:**
- Uses your existing CLI subscriptions - no API keys needed
- Playwright browser testing for UI verification
- Multiple test types: API, browser, CLI, filesystem
- Automatic analysis and fix iterations
- Persistent state in `.devloop/` directory

```
+-----------------------------------------------------------------------------+
|                 ATOMIZER DEVLOOP - CLOSED-LOOP DEVELOPMENT                  |
+-----------------------------------------------------------------------------+
|                                                                             |
|  +----------+     +----------+     +----------+     +----------+            |
|  |   PLAN   |---->|  BUILD   |---->|   TEST   |---->| ANALYZE  |            |
|  |  Gemini  |     |  Claude  |     |Playwright|     |  Gemini  |            |
|  | OpenCode |     |   CLI    |     |  + API   |     | OpenCode |            |
|  +----------+     +----------+     +----------+     +----------+            |
|       ^                                                   |                 |
|       |                                                   |                 |
|       +---------------------------------------------------+                 |
|                     FIX LOOP (max iterations)                               |
+-----------------------------------------------------------------------------+
```

## Quick Start

### CLI Commands

```bash
# Full development cycle
python tools/devloop_cli.py start "Create new bracket study"

# Step-by-step execution
python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py analyze

# Browser UI tests (Playwright)
python tools/devloop_cli.py browser                      # Quick smoke test
python tools/devloop_cli.py browser --level home         # Home page tests
python tools/devloop_cli.py browser --level full         # All UI tests
python tools/devloop_cli.py browser --study support_arm  # Study-specific

# Check status
python tools/devloop_cli.py status

# Quick test with support_arm study
python tools/devloop_cli.py quick
```

### Prerequisites

1. **Backend running**: `cd atomizer-dashboard/backend && python -m uvicorn api.main:app --reload --port 8000`
2. **Frontend running**: `cd atomizer-dashboard/frontend && npm run dev`
3. **Playwright browsers installed**: `cd atomizer-dashboard/frontend && npx playwright install chromium`

## Architecture

### Directory Structure

```
optimization_engine/devloop/
+-- __init__.py                # Module exports
+-- orchestrator.py            # DevLoopOrchestrator - full cycle coordination
+-- cli_bridge.py              # DevLoopCLIOrchestrator - CLI-based execution
|   +-- ClaudeCodeCLI          # Claude Code CLI wrapper
|   +-- OpenCodeCLI            # OpenCode (Gemini) CLI wrapper
+-- test_runner.py             # DashboardTestRunner - test execution
+-- browser_scenarios.py       # Pre-built Playwright scenarios
+-- planning.py                # GeminiPlanner - strategic planning
+-- analyzer.py                # ProblemAnalyzer - failure analysis
+-- claude_bridge.py           # ClaudeCodeBridge - Claude API integration

tools/
+-- devloop_cli.py             # CLI entry point

.devloop/                      # Persistent state directory
+-- current_plan.json          # Current planning state
+-- test_results.json          # Latest filesystem/API test results
+-- browser_test_results.json  # Latest browser test results
+-- analysis.json              # Latest analysis results
```

### Core Components

| Component | Location | Purpose |
|-----------|----------|---------|
| `DevLoopCLIOrchestrator` | `cli_bridge.py` | CLI-based cycle orchestration |
| `ClaudeCodeCLI` | `cli_bridge.py` | Execute Claude Code CLI commands |
| `OpenCodeCLI` | `cli_bridge.py` | Execute OpenCode (Gemini) CLI commands |
| `DashboardTestRunner` | `test_runner.py` | Run all test types |
| `get_browser_scenarios()` | `browser_scenarios.py` | Pre-built Playwright tests |
| `DevLoopOrchestrator` | `orchestrator.py` | API-based orchestration (WebSocket) |
| `GeminiPlanner` | `planning.py` | Gemini API planning |
| `ProblemAnalyzer` | `analyzer.py` | Failure analysis |

### CLI Tools Configuration

DevLoop uses your existing CLI subscriptions:

```python
# In cli_bridge.py
CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"
```

## CLI Commands Reference

### `start` - Full Development Cycle

Runs the complete PLAN -> BUILD -> TEST -> ANALYZE -> FIX loop.

```bash
python tools/devloop_cli.py start "Create support_arm study" --max-iterations 5
```

**Arguments:**
- `objective` (required): What to achieve
- `--max-iterations`: Maximum fix iterations (default: 5)

**Flow:**
1. Gemini creates implementation plan
2. Claude Code implements the plan
3. Tests verify implementation
4. If tests fail: Gemini analyzes, Claude fixes, loop
5. Exits on success or max iterations

### `plan` - Create Implementation Plan

Uses Gemini (via OpenCode) to create a strategic plan.

```bash
python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py plan "Add new extractor" --context context.json
```

**Output:** Saves plan to `.devloop/current_plan.json`

**Plan structure:**
```json
{
  "objective": "Fix dashboard validation",
  "approach": "Update validation logic in spec_validator.py",
  "tasks": [
    {
      "id": "task_001",
      "description": "Update bounds validation",
      "file": "optimization_engine/config/spec_validator.py",
      "priority": "high"
    }
  ],
  "test_scenarios": [
    {
      "id": "test_001",
      "name": "Validation passes for valid spec",
      "type": "api",
      "steps": [...]
    }
  ],
  "acceptance_criteria": ["All validation tests pass"]
}
```

### `implement` - Execute Plan with Claude Code

Implements the current plan using Claude Code CLI.

```bash
python tools/devloop_cli.py implement
python tools/devloop_cli.py implement --plan custom_plan.json
```

**Arguments:**
- `--plan`: Custom plan file (default: `.devloop/current_plan.json`)

**Output:** Reports files modified and success/failure.

### `test` - Run Tests

Run filesystem, API, or custom tests for a study.
```bash
python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py test --scenarios custom_tests.json
```

**Arguments:**
- `--study`: Study name (generates standard tests)
- `--scenarios`: Custom test scenarios JSON file

**Standard study tests:**
1. Study directory exists
2. `atomizer_spec.json` is valid JSON
3. `README.md` exists
4. `run_optimization.py` exists
5. `1_setup/model/` directory exists

**Output:** Saves results to `.devloop/test_results.json`

### `browser` - Run Playwright UI Tests

Run browser-based UI tests using Playwright.

```bash
python tools/devloop_cli.py browser                    # Quick smoke test
python tools/devloop_cli.py browser --level home       # Home page tests
python tools/devloop_cli.py browser --level full       # All UI tests
python tools/devloop_cli.py browser --level study --study support_arm
```

**Arguments:**
- `--level`: Test level (`quick`, `home`, `full`, `study`)
- `--study`: Study name for study-specific tests

**Test Levels:**

| Level | Tests | Description |
|-------|-------|-------------|
| `quick` | 1 | Smoke test - page loads |
| `home` | 2 | Home page stats + folder expansion |
| `full` | 5+ | All UI + study-specific |
| `study` | 3 | Canvas, dashboard for specific study |

**Output:** Saves results to `.devloop/browser_test_results.json`

### `analyze` - Analyze Test Results

Uses Gemini (via OpenCode) to analyze failures and create fix plans.

```bash
python tools/devloop_cli.py analyze
python tools/devloop_cli.py analyze --results custom_results.json
```

**Arguments:**
- `--results`: Custom results file (default: `.devloop/test_results.json`)

**Output:** Saves analysis to `.devloop/analysis.json`

### `status` - View Current State

Shows the current DevLoop state.
```bash
python tools/devloop_cli.py status
```

**Output:**
```
DevLoop Status
============================================================

Current Plan: Fix dashboard validation
  Tasks: 3

Last Test Results:
  Passed: 4/5

Last Analysis:
  Issues: 1

============================================================
CLI Tools:
  - Claude Code: C:\Users\antoi\.local\bin\claude.exe
  - OpenCode: C:\Users\antoi\AppData\Roaming\npm\opencode.cmd
```

### `quick` - Quick Test

Runs tests for the `support_arm` study as a quick verification.

```bash
python tools/devloop_cli.py quick
```

## Test Types

### Filesystem Tests

Check that files and directories exist, that JSON files are valid, and that file content matches.

```json
{
  "id": "test_fs_001",
  "name": "Study directory exists",
  "type": "filesystem",
  "steps": [
    {"action": "check_exists", "path": "studies/my_study"}
  ],
  "expected_outcome": {"exists": true}
}
```

**Actions:**
- `check_exists` - Verify path exists
- `check_json_valid` - Parse JSON file
- `check_file_contains` - Search for content

### API Tests

Test REST endpoints.

```json
{
  "id": "test_api_001",
  "name": "Get study spec",
  "type": "api",
  "steps": [
    {"action": "get", "endpoint": "/api/studies/my_study/spec"}
  ],
  "expected_outcome": {"status_code": 200}
}
```

**Actions:**
- `get` - HTTP GET
- `post` - HTTP POST with `data`
- `put` - HTTP PUT with `data`
- `delete` - HTTP DELETE

### Browser Tests (Playwright)

Test UI interactions.

```json
{
  "id": "test_browser_001",
  "name": "Canvas loads nodes",
  "type": "browser",
  "steps": [
    {"action": "navigate", "url": "/canvas/support_arm"},
    {"action": "wait_for", "selector": ".react-flow__node"},
    {"action": "click", "selector": "[data-testid='node-dv_001']"}
  ],
  "expected_outcome": {"status": "pass"},
  "timeout_ms": 20000
}
```

**Actions:**
- `navigate` - Go to URL
- `wait_for` - Wait for selector
- `click` - Click element
- `fill` - Fill input with value
- `screenshot` - Take screenshot

### CLI Tests

Execute shell commands.
```json
{
  "id": "test_cli_001",
  "name": "Run optimization test",
  "type": "cli",
  "steps": [
    {"command": "python run_optimization.py --test", "cwd": "studies/my_study"}
  ],
  "expected_outcome": {"returncode": 0}
}
```

## Browser Test Scenarios

Pre-built scenarios in `browser_scenarios.py`:

```python
from optimization_engine.devloop.browser_scenarios import get_browser_scenarios

# Get scenarios by level
scenarios = get_browser_scenarios(level="full", study_name="support_arm")

# Available functions:
#   get_browser_scenarios(level, study_name)  - Main entry point
#   get_study_browser_scenarios(study_name)   - Study-specific tests
#   get_ui_verification_scenarios()           - Home page tests
#   get_chat_verification_scenarios()         - Chat panel tests
```

## Standalone Playwright Tests

In addition to the DevLoop integration, you can run the Playwright tests on their own:

```bash
cd atomizer-dashboard/frontend

# Run all E2E tests
npm run test:e2e

# Run with Playwright UI
npm run test:e2e:ui

# Run specific test file
npx playwright test tests/e2e/home.spec.ts
```

**Test files:**
- `tests/e2e/home.spec.ts` - Home page tests (8 tests)

## API Integration

DevLoop also provides REST API endpoints when the dashboard backend is running:

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/devloop/status` | GET | Current loop status |
| `/api/devloop/start` | POST | Start development cycle |
| `/api/devloop/stop` | POST | Stop current cycle |
| `/api/devloop/step` | POST | Execute single phase |
| `/api/devloop/history` | GET | View past cycles |
| `/api/devloop/health` | GET | System health check |
| `/api/devloop/ws` | WebSocket | Real-time updates |

**Start a cycle via API:**
```bash
curl -X POST http://localhost:8000/api/devloop/start \
  -H "Content-Type: application/json" \
  -d '{"objective": "Create support_arm study", "max_iterations": 5}'
```

## State Files

DevLoop maintains state in `.devloop/`:

| File | Purpose | Updated By |
|------|---------|------------|
| `current_plan.json` | Current implementation plan | `plan` command |
| `test_results.json` | Filesystem/API test results | `test` command |
| `browser_test_results.json` | Browser test results | `browser` command |
| `analysis.json` | Failure analysis | `analyze` command |

## Example Workflows

### Create a New Study

```bash
# Full autonomous cycle
python tools/devloop_cli.py start "Create bracket_lightweight study with mass and displacement objectives"

# Or step by step
python tools/devloop_cli.py plan "Create bracket_lightweight study"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study bracket_lightweight
python tools/devloop_cli.py browser --study bracket_lightweight
```

### Debug a Dashboard Issue

```bash
# Plan the fix
python tools/devloop_cli.py plan "Fix canvas node selection not updating panel"

# Implement
python tools/devloop_cli.py implement

# Test UI
python tools/devloop_cli.py browser --level full

# If tests fail, analyze
python tools/devloop_cli.py analyze

# Fix and retest loop...
```

### Verify Study Before Running

```bash
# File structure tests
python tools/devloop_cli.py test --study my_study

# Browser tests (canvas loads, etc.)
python tools/devloop_cli.py browser --level study --study my_study
```

## Troubleshooting

### Browser Tests Fail

1. **Ensure frontend is running**: `npm run dev` in `atomizer-dashboard/frontend`
2. **Check port**: DevLoop expects the frontend on `localhost:3003`. Vite's stock default is 5173, so confirm the dev server is configured for, and listening on, 3003
3. **Install browsers**: `npx playwright install chromium`

### CLI Tools Not Found

Check that the paths in `cli_bridge.py` match where the CLIs are installed on your machine:

```python
CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"
```

### API Tests Fail

1. **Ensure backend is running**: Port 8000
2. **Check endpoint paths**: May need `/api/` prefix

### Tests Timeout

Increase `timeout_ms` in the test scenario:

```json
{
  "timeout_ms": 30000
}
```

### Unclosed Client Session Warning

This is a known aiohttp warning on Windows. Tests still pass correctly.
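### Scripted Preflight Check

Most of the failures in this section come down to a service not listening or a CLI path not existing, so the checks can be scripted. The following is a minimal sketch, not part of DevLoop: `port_is_open` and `preflight` are hypothetical helper names, and the ports and paths are simply the values quoted in this guide, so adjust them to your setup.

```python
"""Preflight check before a DevLoop run (illustrative sketch, not part of DevLoop)."""
import socket
from pathlib import Path

# Ports quoted in this guide; change them if your setup differs.
BACKEND_PORT = 8000
FRONTEND_PORT = 3003

# Machine-specific paths as shown for cli_bridge.py; edit for your install.
CLI_TOOLS = {
    "Claude Code": Path(r"C:\Users\antoi\.local\bin\claude.exe"),
    "OpenCode": Path(r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"),
}


def port_is_open(port: int, host: str = "localhost", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(ports: dict[str, int], tools: dict[str, Path]) -> list[str]:
    """Collect human-readable problems; an empty list means every check passed."""
    problems = []
    for name, port in ports.items():
        if not port_is_open(port):
            problems.append(f"{name} is not reachable on port {port}")
    for name, path in tools.items():
        if not path.exists():
            problems.append(f"{name} CLI not found at {path}")
    return problems


if __name__ == "__main__":
    issues = preflight({"Backend": BACKEND_PORT, "Frontend": FRONTEND_PORT}, CLI_TOOLS)
    for issue in issues:
        print(f"[FAIL] {issue}")
    if not issues:
        print("[OK] All preflight checks passed")
```

Running this before `start` or `browser` separates environment problems from genuine test failures. It only checks that a port accepts connections, not that the right application is behind it.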
## Integration with LAC

DevLoop records learnings to LAC (Learning Atomizer Core):

```python
from knowledge_base.lac import get_lac

lac = get_lac()

# Record after successful cycle
lac.record_insight(
    category="success_pattern",
    context="DevLoop created support_arm study",
    insight="TPE sampler works well for 4-variable bracket problems",
    confidence=0.9
)
```

## Future Enhancements

1. **Parallel test execution** - Run independent tests concurrently
2. **Visual diff** - Show code changes in dashboard
3. **Smart rollback** - Automatic rollback on regression
4. **Branch management** - Auto-create feature branches
5. **Cost tracking** - Monitor CLI usage
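
To illustrate the first item, here is a rough sketch of what parallel execution of independent scenarios might look like using Python's standard `concurrent.futures`. It is an assumption-laden example rather than a design: `run_scenario` and `run_parallel` are hypothetical names, the stand-in runner handles only `check_exists` steps, and the real `DashboardTestRunner` interface may differ.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_scenario(scenario: dict) -> dict:
    """Hypothetical stand-in runner: evaluates only `check_exists` steps."""
    passed = all(
        Path(step["path"]).exists()
        for step in scenario["steps"]
        if step["action"] == "check_exists"
    )
    return {"id": scenario["id"], "passed": passed}


def run_parallel(scenarios: list[dict], max_workers: int = 4) -> list[dict]:
    """Run independent scenarios concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_scenario, scenarios))


if __name__ == "__main__":
    scenarios = [
        {"id": "test_fs_001",
         "steps": [{"action": "check_exists", "path": "."}]},
        {"id": "test_fs_002",
         "steps": [{"action": "check_exists", "path": "studies/no_such_study"}]},
    ]
    for result in run_parallel(scenarios):
        print(result["id"], "PASS" if result["passed"] else "FAIL")
```

Threads suit this case because filesystem, API, and CLI checks are mostly I/O-bound. Scenarios that share state, such as browser tests driving the same page, would still need to run sequentially.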