# DevLoop - Closed-Loop Development System

## Overview

DevLoop is Atomizer's autonomous development cycle system that coordinates AI agents and automated testing to create a closed-loop development workflow.

**Key Features:**
- Uses your existing CLI subscriptions - no API keys needed
- Playwright browser testing for UI verification
- Multiple test types: API, browser, CLI, filesystem
- Automatic analysis and fix iterations
- Persistent state in `.devloop/` directory

```
+-----------------------------------------------------------------------------+
|                 ATOMIZER DEVLOOP - CLOSED-LOOP DEVELOPMENT                  |
+-----------------------------------------------------------------------------+
|                                                                             |
|   +------------+     +------------+     +------------+     +------------+   |
|   |    PLAN    |---->|   BUILD    |---->|    TEST    |---->|  ANALYZE   |   |
|   |   Gemini   |     |   Claude   |     | Playwright |     |   Gemini   |   |
|   |  OpenCode  |     |    CLI     |     |   + API    |     |  OpenCode  |   |
|   +------------+     +------------+     +------------+     +------------+   |
|          ^                                                        |         |
|          |                                                        |         |
|          +--------------------------------------------------------+         |
|                          FIX LOOP (max iterations)                          |
+-----------------------------------------------------------------------------+
```

## Quick Start

### CLI Commands

```bash
# Full development cycle
python tools/devloop_cli.py start "Create new bracket study"

# Step-by-step execution
python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py analyze

# Browser UI tests (Playwright)
python tools/devloop_cli.py browser                      # Quick smoke test
python tools/devloop_cli.py browser --level home         # Home page tests
python tools/devloop_cli.py browser --level full         # All UI tests
python tools/devloop_cli.py browser --study support_arm  # Study-specific

# Check status
python tools/devloop_cli.py status

# Quick test with support_arm study
python tools/devloop_cli.py quick
```

### Prerequisites

1. **Backend running**: `cd atomizer-dashboard/backend && python -m uvicorn api.main:app --reload --port 8000`
2. **Frontend running**: `cd atomizer-dashboard/frontend && npm run dev`
3. **Playwright browsers installed**: `cd atomizer-dashboard/frontend && npx playwright install chromium`

## Architecture

### Directory Structure

```
optimization_engine/devloop/
+-- __init__.py               # Module exports
+-- orchestrator.py           # DevLoopOrchestrator - full cycle coordination
+-- cli_bridge.py             # DevLoopCLIOrchestrator - CLI-based execution
|   +-- ClaudeCodeCLI         # Claude Code CLI wrapper
|   +-- OpenCodeCLI           # OpenCode (Gemini) CLI wrapper
+-- test_runner.py            # DashboardTestRunner - test execution
+-- browser_scenarios.py      # Pre-built Playwright scenarios
+-- planning.py               # GeminiPlanner - strategic planning
+-- analyzer.py               # ProblemAnalyzer - failure analysis
+-- claude_bridge.py          # ClaudeCodeBridge - Claude API integration

tools/
+-- devloop_cli.py            # CLI entry point

.devloop/                     # Persistent state directory
+-- current_plan.json         # Current planning state
+-- test_results.json         # Latest filesystem/API test results
+-- browser_test_results.json # Latest browser test results
+-- analysis.json             # Latest analysis results
```

### Core Components

| Component | Location | Purpose |
|-----------|----------|---------|
| `DevLoopCLIOrchestrator` | `cli_bridge.py` | CLI-based cycle orchestration |
| `ClaudeCodeCLI` | `cli_bridge.py` | Execute Claude Code CLI commands |
| `OpenCodeCLI` | `cli_bridge.py` | Execute OpenCode (Gemini) CLI commands |
| `DashboardTestRunner` | `test_runner.py` | Run all test types |
| `get_browser_scenarios()` | `browser_scenarios.py` | Pre-built Playwright tests |
| `DevLoopOrchestrator` | `orchestrator.py` | API-based orchestration (WebSocket) |
| `GeminiPlanner` | `planning.py` | Gemini API planning |
| `ProblemAnalyzer` | `analyzer.py` | Failure analysis |

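### CLI Tools Configuration

DevLoop invokes the Claude Code and OpenCode CLIs that you already have installed, using your existing subscriptions. The executable paths are set in `cli_bridge.py`. The values shown here are machine-specific, so adjust them to match your installation:

```python
# In cli_bridge.py
CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"
```
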
## CLI Commands Reference

### `start` - Full Development Cycle

Runs the complete PLAN -> BUILD -> TEST -> ANALYZE -> FIX loop.

```bash
python tools/devloop_cli.py start "Create support_arm study" --max-iterations 5
```

**Arguments:**
- `objective` (required): What to achieve
- `--max-iterations`: Maximum fix iterations (default: 5)

**Flow:**
1. Gemini creates an implementation plan
2. Claude Code implements the plan
3. Tests verify the implementation
4. If tests fail, Gemini analyzes the failures, Claude Code applies fixes, and the tests run again
5. The loop exits when all tests pass or the maximum number of iterations is reached

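The flow above can be sketched as a small control loop. This is only an illustration of the loop's shape, not DevLoop's actual orchestrator: the function name `run_fix_loop`, the callables it receives, the `passed` key on test results, and the returned dictionary are all assumptions made for this example.

```python
from typing import Callable


def run_fix_loop(
    plan: Callable[[str], dict],
    implement: Callable[[dict], None],
    test: Callable[[], list],
    analyze: Callable[[list], dict],
    objective: str,
    max_iterations: int = 5,
) -> dict:
    """Plan and build once, then test/analyze/fix until tests pass.

    All four callables are hypothetical stand-ins for the PLAN, BUILD,
    TEST, and ANALYZE phases.
    """
    implement(plan(objective))
    fix_iterations = 0
    while True:
        # Assumed result shape: a list of dicts with a boolean "passed" key.
        failures = [result for result in test() if not result.get("passed")]
        if not failures:
            return {"status": "success", "fix_iterations": fix_iterations}
        if fix_iterations >= max_iterations:
            return {
                "status": "max_iterations_reached",
                "fix_iterations": fix_iterations,
            }
        # Assumed contract: analyze() returns a fix plan implement() accepts.
        implement(analyze(failures))
        fix_iterations += 1
```

Passing the phases in as callables keeps the loop logic testable without invoking any CLI tool; each phase can be replaced with a stub.
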
### `plan` - Create Implementation Plan

Uses Gemini (via OpenCode) to create a strategic plan.

```bash
python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py plan "Add new extractor" --context context.json
```

**Output:** Saves plan to `.devloop/current_plan.json`

**Plan structure:**
```json
{
  "objective": "Fix dashboard validation",
  "approach": "Update validation logic in spec_validator.py",
  "tasks": [
    {
      "id": "task_001",
      "description": "Update bounds validation",
      "file": "optimization_engine/config/spec_validator.py",
      "priority": "high"
    }
  ],
  "test_scenarios": [
    {
      "id": "test_001",
      "name": "Validation passes for valid spec",
      "type": "api",
      "steps": [...]
    }
  ],
  "acceptance_criteria": ["All validation tests pass"]
}
```

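Before handing a plan to the `implement` step, it can be useful to check that it has the shape shown above. The sketch below is one possible check, not part of DevLoop: `validate_plan` and `REQUIRED_KEYS` are hypothetical names, and treating every top-level key from the example as required is an assumption.

```python
REQUIRED_KEYS = (
    "objective",
    "approach",
    "tasks",
    "test_scenarios",
    "acceptance_criteria",
)


def validate_plan(plan: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in plan]
    for index, task in enumerate(plan.get("tasks", [])):
        # Assumption: every task needs at least an id and a description.
        for field in ("id", "description"):
            if field not in task:
                problems.append(f"tasks[{index}] missing {field}")
    return problems


example_plan = {
    "objective": "Fix dashboard validation",
    "approach": "Update validation logic in spec_validator.py",
    "tasks": [{"id": "task_001", "description": "Update bounds validation"}],
    "test_scenarios": [],
    "acceptance_criteria": ["All validation tests pass"],
}
print(validate_plan(example_plan))
```

To check a saved plan, load `.devloop/current_plan.json` with `json.load` and pass the resulting dictionary to the function.
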
### `implement` - Execute Plan with Claude Code

Implements the current plan using Claude Code CLI.

```bash
python tools/devloop_cli.py implement
python tools/devloop_cli.py implement --plan custom_plan.json
```

**Arguments:**
- `--plan`: Custom plan file (default: `.devloop/current_plan.json`)

**Output:** Reports files modified and success/failure.

### `test` - Run Tests

Runs filesystem, API, or custom tests for a study.

```bash
python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py test --scenarios custom_tests.json
```

**Arguments:**
- `--study`: Study name (generates the standard tests listed below)
- `--scenarios`: Custom test scenarios JSON file

**Standard study tests:**
1. Study directory exists
2. `atomizer_spec.json` is valid JSON
3. `README.md` exists
4. `run_optimization.py` exists
5. `1_setup/model/` directory exists

**Output:** Saves results to `.devloop/test_results.json`

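The five standard checks can be expressed as scenarios in the filesystem test format described under Test Types. This is a sketch of how such a list might be generated, not the actual `DashboardTestRunner` code. The function name, the `studies/` root directory, the location of each file directly inside the study directory, and the ID scheme are assumptions.

```python
def standard_study_scenarios(study_name: str, studies_root: str = "studies") -> list:
    """Build the five standard checks as filesystem test scenarios."""
    base = f"{studies_root}/{study_name}"  # assumed study location
    checks = [
        ("Study directory exists", "check_exists", base),
        ("atomizer_spec.json is valid JSON", "check_json_valid",
         f"{base}/atomizer_spec.json"),
        ("README.md exists", "check_exists", f"{base}/README.md"),
        ("run_optimization.py exists", "check_exists",
         f"{base}/run_optimization.py"),
        ("1_setup/model/ directory exists", "check_exists",
         f"{base}/1_setup/model"),
    ]
    return [
        {
            "id": f"test_fs_{number:03d}",  # hypothetical ID scheme
            "name": name,
            "type": "filesystem",
            "steps": [{"action": action, "path": path}],
        }
        for number, (name, action, path) in enumerate(checks, start=1)
    ]


for scenario in standard_study_scenarios("support_arm"):
    print(scenario["id"], scenario["name"])
```
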
### `browser` - Run Playwright UI Tests

Run browser-based UI tests using Playwright.

```bash
python tools/devloop_cli.py browser                # Quick smoke test
python tools/devloop_cli.py browser --level home   # Home page tests
python tools/devloop_cli.py browser --level full   # All UI tests
python tools/devloop_cli.py browser --level study --study support_arm
```

**Arguments:**
- `--level`: Test level (`quick`, `home`, `full`, `study`)
- `--study`: Study name for study-specific tests

**Test Levels:**

| Level | Tests | Description |
|-------|-------|-------------|
| `quick` | 1 | Smoke test - page loads |
| `home` | 2 | Home page stats + folder expansion |
| `full` | 5+ | All UI + study-specific |
| `study` | 3 | Canvas, dashboard for specific study |

**Output:** Saves results to `.devloop/browser_test_results.json`

### `analyze` - Analyze Test Results

Uses Gemini (via OpenCode) to analyze failures and create fix plans.

```bash
python tools/devloop_cli.py analyze
python tools/devloop_cli.py analyze --results custom_results.json
```

**Arguments:**
- `--results`: Custom results file (default: `.devloop/test_results.json`)

**Output:** Saves analysis to `.devloop/analysis.json`

### `status` - View Current State

Shows the current DevLoop state.

```bash
python tools/devloop_cli.py status
```

**Output:**
```
DevLoop Status
============================================================

Current Plan: Fix dashboard validation
  Tasks: 3

Last Test Results:
  Passed: 4/5

Last Analysis:
  Issues: 1

============================================================
CLI Tools:
  - Claude Code: C:\Users\antoi\.local\bin\claude.exe
  - OpenCode: C:\Users\antoi\AppData\Roaming\npm\opencode.cmd
```

### `quick` - Quick Test

Runs tests for the `support_arm` study as a quick verification.

```bash
python tools/devloop_cli.py quick
```

## Test Types

### Filesystem Tests

Check that files and directories exist, that JSON files parse, and that files contain expected content.

```json
{
  "id": "test_fs_001",
  "name": "Study directory exists",
  "type": "filesystem",
  "steps": [
    {"action": "check_exists", "path": "studies/my_study"}
  ],
  "expected_outcome": {"exists": true}
}
```

**Actions:**
- `check_exists` - Verify path exists
- `check_json_valid` - Parse JSON file
- `check_file_contains` - Search for content

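The three filesystem actions map naturally onto a small dispatcher. The sketch below shows one way a single step could be evaluated. It is not the test runner's real implementation, and the `contains` key used by `check_file_contains` is a hypothetical name, since the step schema for that action is not documented here.

```python
import json
from pathlib import Path


def run_filesystem_step(step: dict, root: Path = Path(".")) -> bool:
    """Evaluate one filesystem step and return True if the check passes."""
    target = root / step["path"]
    action = step["action"]
    if action == "check_exists":
        return target.exists()
    if action == "check_json_valid":
        try:
            json.loads(target.read_text(encoding="utf-8"))
            return True
        except (OSError, ValueError):  # missing file or invalid JSON
            return False
    if action == "check_file_contains":
        try:
            # "contains" is an assumed key name for the expected text.
            return step["contains"] in target.read_text(encoding="utf-8")
        except OSError:
            return False
    raise ValueError(f"unknown filesystem action: {action}")
```

A call such as `run_filesystem_step({"action": "check_exists", "path": "studies/my_study"})` returns a boolean that can then be compared with the scenario's `expected_outcome`.
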
### API Tests

Test REST endpoints.

```json
{
  "id": "test_api_001",
  "name": "Get study spec",
  "type": "api",
  "steps": [
    {"action": "get", "endpoint": "/api/studies/my_study/spec"}
  ],
  "expected_outcome": {"status_code": 200}
}
```

**Actions:**
- `get` - HTTP GET
- `post` - HTTP POST with `data`
- `put` - HTTP PUT with `data`
- `delete` - HTTP DELETE

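To make the mapping from an API step to an HTTP request concrete, here is a minimal sketch using only the standard library. The function name `build_api_request` is hypothetical, the base URL assumes the backend on port 8000 as listed in the prerequisites, and sending `data` as a JSON body is an assumption about how the runner encodes it.

```python
import json
import urllib.request


def build_api_request(
    step: dict, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Translate an API test step into a urllib Request (not yet sent)."""
    method = step["action"].upper()
    if method not in {"GET", "POST", "PUT", "DELETE"}:
        raise ValueError(f"unknown API action: {step['action']}")
    body = None
    headers = {}
    if method in {"POST", "PUT"}:
        # Assumption: "data" is sent as a JSON-encoded request body.
        body = json.dumps(step.get("data", {})).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + step["endpoint"], data=body, headers=headers, method=method
    )


request = build_api_request(
    {"action": "get", "endpoint": "/api/studies/my_study/spec"}
)
print(request.get_method(), request.full_url)
```

With the backend running, the request can be sent with `urllib.request.urlopen(request, timeout=10)` and the response's `status` compared with `expected_outcome["status_code"]`. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx and 5xx responses, so a runner would need to catch it to read the status code.
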
### Browser Tests (Playwright)

Test UI interactions.

```json
{
  "id": "test_browser_001",
  "name": "Canvas loads nodes",
  "type": "browser",
  "steps": [
    {"action": "navigate", "url": "/canvas/support_arm"},
    {"action": "wait_for", "selector": ".react-flow__node"},
    {"action": "click", "selector": "[data-testid='node-dv_001']"}
  ],
  "expected_outcome": {"status": "pass"},
  "timeout_ms": 20000
}
```

**Actions:**
- `navigate` - Go to URL
- `wait_for` - Wait for selector
- `click` - Click element
- `fill` - Fill input with value
- `screenshot` - Take screenshot

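Each browser action corresponds closely to a method on a Playwright `Page` object. The sketch below dispatches steps to those methods. It is an illustration rather than DevLoop's runner: the function name is hypothetical, the `value` key for `fill` and the `path` key for `screenshot` are assumed names, and the base URL is taken from the port mentioned in Troubleshooting.

```python
def run_browser_steps(
    page,
    steps: list,
    base_url: str = "http://localhost:3003",
    timeout_ms: int = 20000,
) -> None:
    """Run scenario steps against a Playwright Page (sync API) or a stub.

    Any failing Playwright call raises, which a caller can treat as a
    failed scenario.
    """
    for step in steps:
        action = step["action"]
        if action == "navigate":
            page.goto(base_url + step["url"], timeout=timeout_ms)
        elif action == "wait_for":
            page.wait_for_selector(step["selector"], timeout=timeout_ms)
        elif action == "click":
            page.click(step["selector"], timeout=timeout_ms)
        elif action == "fill":
            # "value" is an assumed key name for the text to type.
            page.fill(step["selector"], step["value"], timeout=timeout_ms)
        elif action == "screenshot":
            # "path" is an assumed key name for the output file.
            page.screenshot(path=step["path"])
        else:
            raise ValueError(f"unknown browser action: {action}")
```

In real use, `page` would come from `playwright.sync_api.sync_playwright()`, for example via `browser.new_page()` on a launched Chromium instance. Because the function only relies on the method names, it can also be exercised with a stub object in unit tests.
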
### CLI Tests

Execute shell commands.

```json
{
  "id": "test_cli_001",
  "name": "Run optimization test",
  "type": "cli",
  "steps": [
    {"command": "python run_optimization.py --test", "cwd": "studies/my_study"}
  ],
  "expected_outcome": {"returncode": 0}
}
```

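A CLI step boils down to running a command in a working directory and comparing its exit code. The following sketch shows that idea with `subprocess.run`. The function name and the timeout parameter are hypothetical, and how DevLoop itself launches commands is not specified here.

```python
import shlex
import subprocess


def run_cli_step(
    step: dict, expected_returncode: int = 0, timeout_s: float = 60.0
) -> bool:
    """Run the step's command and compare its exit code with the expectation."""
    result = subprocess.run(
        # POSIX-style splitting; Windows paths containing backslashes
        # would need different handling.
        shlex.split(step["command"]),
        cwd=step.get("cwd"),
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
    return result.returncode == expected_returncode
```

Splitting the command into an argument list avoids `shell=True`, so the command string is never interpreted by a shell.
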
## Browser Test Scenarios

Pre-built scenarios in `browser_scenarios.py`:

```python
from optimization_engine.devloop.browser_scenarios import get_browser_scenarios

# Get scenarios by level
scenarios = get_browser_scenarios(level="full", study_name="support_arm")

# Available functions
get_browser_scenarios(level, study_name)  # Main entry point
get_study_browser_scenarios(study_name)   # Study-specific tests
get_ui_verification_scenarios()           # Home page tests
get_chat_verification_scenarios()         # Chat panel tests
```

## Standalone Playwright Tests

In addition to DevLoop integration, you can run standalone Playwright tests:

```bash
cd atomizer-dashboard/frontend

# Run all E2E tests
npm run test:e2e

# Run with Playwright UI
npm run test:e2e:ui

# Run specific test file
npx playwright test tests/e2e/home.spec.ts
```

**Test files:**
- `tests/e2e/home.spec.ts` - Home page tests (8 tests)

## API Integration

DevLoop also provides REST API endpoints when running the dashboard backend:

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/api/devloop/status` | GET | Current loop status |
| `/api/devloop/start` | POST | Start development cycle |
| `/api/devloop/stop` | POST | Stop current cycle |
| `/api/devloop/step` | POST | Execute single phase |
| `/api/devloop/history` | GET | View past cycles |
| `/api/devloop/health` | GET | System health check |
| `/api/devloop/ws` | WebSocket | Real-time updates |

**Start a cycle via API:**
```bash
curl -X POST http://localhost:8000/api/devloop/start \
  -H "Content-Type: application/json" \
  -d '{"objective": "Create support_arm study", "max_iterations": 5}'
```

## State Files

DevLoop maintains state in `.devloop/`:

| File | Purpose | Updated By |
|------|---------|------------|
| `current_plan.json` | Current implementation plan | `plan` command |
| `test_results.json` | Filesystem/API test results | `test` command |
| `browser_test_results.json` | Browser test results | `browser` command |
| `analysis.json` | Failure analysis | `analyze` command |

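Because the state is plain JSON on disk, it can be inspected from any script. The sketch below loads whichever state files are present. `load_state` and the short keys in `STATE_FILES` are hypothetical names chosen for this example, and the internal structure of each file is not assumed.

```python
import json
from pathlib import Path

# File names are from the table above; the short keys are made up here.
STATE_FILES = {
    "plan": "current_plan.json",
    "tests": "test_results.json",
    "browser": "browser_test_results.json",
    "analysis": "analysis.json",
}


def load_state(state_dir: Path = Path(".devloop")) -> dict:
    """Return the parsed content of each state file, or None if it is absent."""
    state = {}
    for key, filename in STATE_FILES.items():
        path = state_dir / filename
        if path.is_file():
            state[key] = json.loads(path.read_text(encoding="utf-8"))
        else:
            state[key] = None
    return state


present = [key for key, value in load_state().items() if value is not None]
print("State files found:", present)
```
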
## Example Workflows

### Create a New Study

```bash
# Full autonomous cycle
python tools/devloop_cli.py start "Create bracket_lightweight study with mass and displacement objectives"

# Or step by step
python tools/devloop_cli.py plan "Create bracket_lightweight study"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study bracket_lightweight
python tools/devloop_cli.py browser --study bracket_lightweight
```

### Debug a Dashboard Issue

```bash
# Plan the fix
python tools/devloop_cli.py plan "Fix canvas node selection not updating panel"

# Implement
python tools/devloop_cli.py implement

# Test UI
python tools/devloop_cli.py browser --level full

# If tests fail, analyze
python tools/devloop_cli.py analyze

# Fix and retest loop...
```

### Verify Study Before Running

```bash
# File structure tests
python tools/devloop_cli.py test --study my_study

# Browser tests (canvas loads, etc.)
python tools/devloop_cli.py browser --level study --study my_study
```

## Troubleshooting

### Browser Tests Fail

1. **Ensure frontend is running**: `npm run dev` in `atomizer-dashboard/frontend`
2. **Check port**: DevLoop expects the frontend at `localhost:3003`. Vite's own default port is 5173, so confirm that the dev server is actually configured for, or started on, port 3003
3. **Install browsers**: `npx playwright install chromium`

### CLI Tools Not Found

Check paths in `cli_bridge.py`:
```python
CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"
```

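To find out where a CLI tool actually lives on a given machine, the standard library's `shutil.which` can search `PATH`. The sketch below prefers a configured path when it exists and falls back to a `PATH` lookup otherwise. `resolve_cli` is a hypothetical helper, not something `cli_bridge.py` is known to provide, and the executable names `claude` and `opencode` are inferred from the file names in the paths above.

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_cli(name: str, configured_path: Optional[str] = None) -> Optional[str]:
    """Return a usable executable path, or None if the tool cannot be found."""
    # Prefer the explicitly configured path when it points at a real file.
    if configured_path and Path(configured_path).is_file():
        return configured_path
    # Otherwise search PATH for the executable name.
    return shutil.which(name)


for tool in ("claude", "opencode"):
    print(tool, "->", resolve_cli(tool) or "not found on PATH")
```

If a tool is reported as not found, either add its install directory to `PATH` or update the corresponding constant in `cli_bridge.py`.
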
### API Tests Fail

1. **Ensure backend is running**: It must be listening on port 8000
2. **Check endpoint paths**: Endpoints may need the `/api/` prefix

### Tests Timeout

Increase timeout in test scenario:
```json
{
  "timeout_ms": 30000
}
```

### Unclosed Client Session Warning

This is a known aiohttp warning on Windows. Tests still pass correctly.

## Integration with LAC

DevLoop records learnings to LAC (Learning Atomizer Core):

```python
from knowledge_base.lac import get_lac

lac = get_lac()

# Record after successful cycle
lac.record_insight(
    category="success_pattern",
    context="DevLoop created support_arm study",
    insight="TPE sampler works well for 4-variable bracket problems",
    confidence=0.9
)
```

## Future Enhancements

1. **Parallel test execution** - Run independent tests concurrently
2. **Visual diff** - Show code changes in dashboard
3. **Smart rollback** - Automatic rollback on regression
4. **Branch management** - Auto-create feature branches
5. **Cost tracking** - Monitor CLI usage