Atomizer/docs/guides/DEVLOOP.md
Anto01 3193831340 feat: Add DevLoop automation and HTML Reports
## DevLoop - Closed-Loop Development System
- Orchestrator for plan → build → test → analyze cycle
- Gemini planning via OpenCode CLI
- Claude implementation via CLI bridge
- Playwright browser testing integration
- Test runner with API, filesystem, and browser tests
- Persistent state in .devloop/ directory
- CLI tool: tools/devloop_cli.py

Usage:
  python tools/devloop_cli.py start 'Create new feature'
  python tools/devloop_cli.py plan 'Fix bug in X'
  python tools/devloop_cli.py test --study support_arm
  python tools/devloop_cli.py browser --level full

## HTML Reports (optimization_engine/reporting/)
- Interactive Plotly-based reports
- Convergence plot, Pareto front, parallel coordinates
- Parameter importance analysis
- Self-contained HTML (offline-capable)
- Tailwind CSS styling

## Playwright E2E Tests
- Home page tests
- Test results in test-results/

## LAC Knowledge Base Updates
- Session insights (failures, workarounds, patterns)
- Optimization memory for arm support study
2026-01-24 21:18:18 -05:00

DevLoop - Closed-Loop Development System

Overview

DevLoop is Atomizer's autonomous development-cycle system. It pairs AI agents (Gemini for planning and analysis, Claude Code for implementation) with automated tests, so that failures found in testing feed back into the next fix iteration.

Key Features:

  • Uses your existing CLI subscriptions - no API keys needed
  • Playwright browser testing for UI verification
  • Multiple test types: API, browser, CLI, filesystem
  • Automatic analysis and fix iterations
  • Persistent state in .devloop/ directory
+-----------------------------------------------------------------------------+
|                    ATOMIZER DEVLOOP - CLOSED-LOOP DEVELOPMENT               |
+-----------------------------------------------------------------------------+
|                                                                             |
|   +----------+     +----------+     +----------+     +----------+          |
|   |  PLAN    |---->|  BUILD   |---->|  TEST    |---->| ANALYZE  |          |
|   | Gemini   |     | Claude   |     | Playwright|    |  Gemini  |          |
|   | OpenCode |     | CLI      |     | + API    |     | OpenCode |          |
|   +----------+     +----------+     +----------+     +----------+          |
|        ^                                                   |                |
|        |                                                   |                |
|        +---------------------------------------------------+                |
|                          FIX LOOP (max iterations)                          |
+-----------------------------------------------------------------------------+

Quick Start

CLI Commands

# Full development cycle
python tools/devloop_cli.py start "Create new bracket study"

# Step-by-step execution
python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py analyze

# Browser UI tests (Playwright)
python tools/devloop_cli.py browser                      # Quick smoke test
python tools/devloop_cli.py browser --level home         # Home page tests
python tools/devloop_cli.py browser --level full         # All UI tests
python tools/devloop_cli.py browser --study support_arm  # Study-specific

# Check status
python tools/devloop_cli.py status

# Quick test with support_arm study
python tools/devloop_cli.py quick

Prerequisites

  1. Backend running: cd atomizer-dashboard/backend && python -m uvicorn api.main:app --reload --port 8000
  2. Frontend running: cd atomizer-dashboard/frontend && npm run dev
  3. Playwright browsers installed: cd atomizer-dashboard/frontend && npx playwright install chromium

Architecture

Directory Structure

optimization_engine/devloop/
+-- __init__.py              # Module exports
+-- orchestrator.py          # DevLoopOrchestrator - full cycle coordination
+-- cli_bridge.py            # DevLoopCLIOrchestrator - CLI-based execution
|   +-- ClaudeCodeCLI        # Claude Code CLI wrapper
|   +-- OpenCodeCLI          # OpenCode (Gemini) CLI wrapper
+-- test_runner.py           # DashboardTestRunner - test execution
+-- browser_scenarios.py     # Pre-built Playwright scenarios
+-- planning.py              # GeminiPlanner - strategic planning
+-- analyzer.py              # ProblemAnalyzer - failure analysis
+-- claude_bridge.py         # ClaudeCodeBridge - Claude API integration

tools/
+-- devloop_cli.py           # CLI entry point

.devloop/                    # Persistent state directory
+-- current_plan.json        # Current planning state
+-- test_results.json        # Latest filesystem/API test results
+-- browser_test_results.json  # Latest browser test results
+-- analysis.json            # Latest analysis results

Core Components

| Component | Location | Purpose |
|-----------|----------|---------|
| DevLoopCLIOrchestrator | cli_bridge.py | CLI-based cycle orchestration |
| ClaudeCodeCLI | cli_bridge.py | Execute Claude Code CLI commands |
| OpenCodeCLI | cli_bridge.py | Execute OpenCode (Gemini) CLI commands |
| DashboardTestRunner | test_runner.py | Run all test types |
| get_browser_scenarios() | browser_scenarios.py | Pre-built Playwright tests |
| DevLoopOrchestrator | orchestrator.py | API-based orchestration (WebSocket) |
| GeminiPlanner | planning.py | Gemini API planning |
| ProblemAnalyzer | analyzer.py | Failure analysis |

CLI Tools Configuration

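DevLoop calls the Claude Code and OpenCode CLIs you already have installed and logged in. Their executable paths are set as constants in cli_bridge.py; edit them to match your machine: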

# In cli_bridge.py
CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"

CLI Commands Reference

start - Full Development Cycle

Runs the complete PLAN -> BUILD -> TEST -> ANALYZE -> FIX loop.

python tools/devloop_cli.py start "Create support_arm study" --max-iterations 5

Arguments:

  • objective (required): What to achieve
  • --max-iterations: Maximum fix iterations (default: 5)

Flow:

  1. Gemini creates implementation plan
  2. Claude Code implements the plan
  3. Tests verify implementation
  4. If tests fail, Gemini analyzes the failures, Claude Code applies fixes, and the tests run again
  5. The loop exits when all tests pass or --max-iterations is reached
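
The flow above amounts to a bounded retry loop. The following is a minimal sketch of that control flow, not DevLoop's actual orchestrator: the function name run_cycle and the four callables are hypothetical stand-ins for the Gemini planning, Claude Code implementation, test runner, and Gemini analysis phases.

```python
from typing import Any, Callable, Dict, List

Plan = Dict[str, Any]
Results = List[Dict[str, Any]]  # assumed shape: [{"name": str, "passed": bool}, ...]


def run_cycle(
    objective: str,
    plan: Callable[[str], Plan],         # hypothetical: Gemini planning
    build: Callable[[Plan], None],       # hypothetical: Claude Code implementation
    test: Callable[[], Results],         # hypothetical: test runner
    analyze: Callable[[Results], Plan],  # hypothetical: Gemini failure analysis
    max_iterations: int = 5,
) -> Dict[str, Any]:
    """Run PLAN -> BUILD -> TEST once, then ANALYZE -> BUILD -> TEST until green."""
    build(plan(objective))
    results = test()
    iterations = 0
    while not all(r["passed"] for r in results) and iterations < max_iterations:
        iterations += 1
        fix_plan = analyze(results)  # derive a fix plan from the failures
        build(fix_plan)              # apply the fix
        results = test()             # re-verify
    return {
        "success": all(r["passed"] for r in results),
        "fix_iterations": iterations,
    }
```

Passing the phases in as callables keeps the loop itself testable without either CLI installed.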

plan - Create Implementation Plan

Uses Gemini (via OpenCode) to create a strategic plan.

python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py plan "Add new extractor" --context context.json

Output: Saves plan to .devloop/current_plan.json

Plan structure:

{
  "objective": "Fix dashboard validation",
  "approach": "Update validation logic in spec_validator.py",
  "tasks": [
    {
      "id": "task_001",
      "description": "Update bounds validation",
      "file": "optimization_engine/config/spec_validator.py",
      "priority": "high"
    }
  ],
  "test_scenarios": [
    {
      "id": "test_001",
      "name": "Validation passes for valid spec",
      "type": "api",
      "steps": [...]
    }
  ],
  "acceptance_criteria": ["All validation tests pass"]
}
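
A saved plan can be inspected from Python before running implement. This is a sketch only: the helper names load_plan and tasks_by_priority are hypothetical, and treating all five top-level keys as required is an assumption based on the example above, not a documented rule.

```python
import json
from pathlib import Path
from typing import Any, Dict, List

# Top-level keys taken from the plan example; "required" is an assumption.
REQUIRED_KEYS = ("objective", "approach", "tasks", "test_scenarios", "acceptance_criteria")


def load_plan(path: str = ".devloop/current_plan.json") -> Dict[str, Any]:
    """Read a plan file and fail early if a top-level key is missing."""
    plan = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in plan]
    if missing:
        raise ValueError(f"plan is missing keys: {', '.join(missing)}")
    return plan


def tasks_by_priority(plan: Dict[str, Any], priority: str) -> List[str]:
    """Return the ids of tasks with the given priority, in file order."""
    return [task["id"] for task in plan["tasks"] if task.get("priority") == priority]
```

For the example plan, tasks_by_priority(plan, "high") would select task_001.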

implement - Execute Plan with Claude Code

Implements the current plan using Claude Code CLI.

python tools/devloop_cli.py implement
python tools/devloop_cli.py implement --plan custom_plan.json

Arguments:

  • --plan: Custom plan file (default: .devloop/current_plan.json)

Output: Reports files modified and success/failure.

test - Run Tests

Run filesystem, API, or custom tests for a study.

python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py test --scenarios custom_tests.json

Arguments:

  • --study: Study name (generates standard tests)
  • --scenarios: Custom test scenarios JSON file

Standard study tests:

  1. Study directory exists
  2. atomizer_spec.json is valid JSON
  3. README.md exists
  4. run_optimization.py exists
  5. 1_setup/model/ directory exists

Output: Saves results to .devloop/test_results.json
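
The five standard checks can be written as filesystem scenarios in the JSON shape used under Test Types. This sketch is illustrative: the function name standard_study_scenarios, the studies/ root, and the assumption that each file sits directly under the study directory are not confirmed by DevLoop's source.

```python
from typing import Any, Dict, List


def standard_study_scenarios(study: str, root: str = "studies") -> List[Dict[str, Any]]:
    """Build the five standard study checks as filesystem scenarios."""
    base = f"{root}/{study}"  # assumed layout: studies/<study_name>/
    checks = [
        ("Study directory exists", "check_exists", base),
        ("atomizer_spec.json is valid JSON", "check_json_valid", f"{base}/atomizer_spec.json"),
        ("README.md exists", "check_exists", f"{base}/README.md"),
        ("run_optimization.py exists", "check_exists", f"{base}/run_optimization.py"),
        ("1_setup/model/ directory exists", "check_exists", f"{base}/1_setup/model"),
    ]
    return [
        {
            "id": f"test_fs_{index:03d}",
            "name": name,
            "type": "filesystem",
            "steps": [{"action": action, "path": path}],
        }
        for index, (name, action, path) in enumerate(checks, start=1)
    ]
```

The resulting list can be dumped with json.dump and passed to the test command through --scenarios.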

browser - Run Playwright UI Tests

Run browser-based UI tests using Playwright.

python tools/devloop_cli.py browser                      # Quick smoke test
python tools/devloop_cli.py browser --level home         # Home page tests  
python tools/devloop_cli.py browser --level full         # All UI tests
python tools/devloop_cli.py browser --level study --study support_arm

Arguments:

  • --level: Test level (quick, home, full, study)
  • --study: Study name for study-specific tests

Test Levels:

| Level | Tests | Description |
|-------|-------|-------------|
| quick | 1 | Smoke test - page loads |
| home | 2 | Home page stats + folder expansion |
| full | 5+ | All UI + study-specific |
| study | 3 | Canvas, dashboard for specific study |

Output: Saves results to .devloop/browser_test_results.json

analyze - Analyze Test Results

Uses Gemini (via OpenCode) to analyze failures and create fix plans.

python tools/devloop_cli.py analyze
python tools/devloop_cli.py analyze --results custom_results.json

Arguments:

  • --results: Custom results file (default: .devloop/test_results.json)

Output: Saves analysis to .devloop/analysis.json

status - View Current State

Shows the current DevLoop state.

python tools/devloop_cli.py status

Output:

DevLoop Status
============================================================

Current Plan: Fix dashboard validation
  Tasks: 3

Last Test Results:
  Passed: 4/5

Last Analysis:
  Issues: 1

============================================================
CLI Tools:
  - Claude Code: C:\Users\antoi\.local\bin\claude.exe
  - OpenCode:    C:\Users\antoi\AppData\Roaming\npm\opencode.cmd

quick - Quick Test

Runs tests for the support_arm study as a quick verification.

python tools/devloop_cli.py quick

Test Types

Filesystem Tests

Check that files and directories exist, that JSON files parse, and that files contain expected content.

{
  "id": "test_fs_001",
  "name": "Study directory exists",
  "type": "filesystem",
  "steps": [
    {"action": "check_exists", "path": "studies/my_study"}
  ],
  "expected_outcome": {"exists": true}
}

Actions:

  • check_exists - Verify path exists
  • check_json_valid - Parse JSON file
  • check_file_contains - Search for content
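
The three actions could be evaluated along the following lines. This is a sketch under assumptions, not the DashboardTestRunner implementation: run_fs_step is a hypothetical name, and the "contains" key holding the search text for check_file_contains is invented for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict


def run_fs_step(step: Dict[str, Any]) -> bool:
    """Execute one filesystem step and report whether it passed."""
    path = Path(step["path"])
    action = step["action"]
    if action == "check_exists":
        return path.exists()
    if action == "check_json_valid":
        try:
            json.loads(path.read_text(encoding="utf-8"))
            return True
        except (OSError, ValueError):  # missing file, bad encoding, or invalid JSON
            return False
    if action == "check_file_contains":
        # "contains" is a hypothetical key name for the text to search for.
        try:
            return step["contains"] in path.read_text(encoding="utf-8")
        except OSError:
            return False
    raise ValueError(f"unknown filesystem action: {action}")
```

Unknown actions raise instead of returning False, so a typo in a scenario file is reported as an error and not as a failed test.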

API Tests

Test REST endpoints.

{
  "id": "test_api_001",
  "name": "Get study spec",
  "type": "api",
  "steps": [
    {"action": "get", "endpoint": "/api/studies/my_study/spec"}
  ],
  "expected_outcome": {"status_code": 200}
}

Actions:

  • get - HTTP GET
  • post - HTTP POST with data
  • put - HTTP PUT with data
  • delete - HTTP DELETE
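
An API step maps directly onto an HTTP request. The sketch below uses only the Python standard library and builds the request without sending it. The names build_request and status_matches are hypothetical, and the "data" key for the POST/PUT payload is an assumption, since the example above shows only a GET step.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"  # backend port from the Prerequisites section


def build_request(step: Dict[str, Any], base_url: str = BASE_URL) -> urllib.request.Request:
    """Translate an API step into a urllib request without sending it."""
    method = step["action"].upper()
    if method not in {"GET", "POST", "PUT", "DELETE"}:
        raise ValueError(f"unknown API action: {step['action']}")
    body: Optional[bytes] = None
    headers: Dict[str, str] = {}
    if "data" in step:  # "data" is a hypothetical key for the JSON payload
        body = json.dumps(step["data"]).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + step["endpoint"], data=body, headers=headers, method=method
    )


def status_matches(status_code: int, expected: Dict[str, Any]) -> bool:
    """Compare a response status with the scenario's expected_outcome."""
    return status_code == expected.get("status_code", 200)
```

To execute the step, pass the request to urllib.request.urlopen and compare the response's status attribute. Note that urlopen raises urllib.error.HTTPError for 4xx and 5xx responses, so a scenario that expects such a status has to catch that exception and read its code attribute.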

Browser Tests (Playwright)

Test UI interactions.

{
  "id": "test_browser_001",
  "name": "Canvas loads nodes",
  "type": "browser",
  "steps": [
    {"action": "navigate", "url": "/canvas/support_arm"},
    {"action": "wait_for", "selector": ".react-flow__node"},
    {"action": "click", "selector": "[data-testid='node-dv_001']"}
  ],
  "expected_outcome": {"status": "pass"},
  "timeout_ms": 20000
}

Actions:

  • navigate - Go to URL
  • wait_for - Wait for selector
  • click - Click element
  • fill - Fill input with value
  • screenshot - Take screenshot
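
Each action corresponds to one method on a Playwright Page object. The dispatcher below is a sketch, assuming the Playwright for Python sync API (page.goto, page.wait_for_selector, page.click, page.fill, page.screenshot). The function name run_browser_step, the "value" key for fill, and the "path" key for screenshot are hypothetical; the source does not show how DevLoop encodes them.

```python
from typing import Any, Dict

BASE_URL = "http://localhost:3003"  # frontend address named under Troubleshooting


def run_browser_step(page: Any, step: Dict[str, Any], timeout_ms: int = 20000) -> None:
    """Apply one scenario step to a Playwright sync-API Page object."""
    action = step["action"]
    if action == "navigate":
        page.goto(BASE_URL + step["url"], timeout=timeout_ms)
    elif action == "wait_for":
        page.wait_for_selector(step["selector"], timeout=timeout_ms)
    elif action == "click":
        page.click(step["selector"], timeout=timeout_ms)
    elif action == "fill":
        # "value" is a hypothetical key for the text to type.
        page.fill(step["selector"], step["value"], timeout=timeout_ms)
    elif action == "screenshot":
        # "path" is a hypothetical key for the output file.
        page.screenshot(path=step["path"])
    else:
        raise ValueError(f"unknown browser action: {action}")
```

Because the page is passed in, the same function works with a real page obtained from sync_playwright() or with a recording stub in unit tests.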

CLI Tests

Execute shell commands.

{
  "id": "test_cli_001",
  "name": "Run optimization test",
  "type": "cli",
  "steps": [
    {"command": "python run_optimization.py --test", "cwd": "studies/my_study"}
  ],
  "expected_outcome": {"returncode": 0}
}
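
A CLI step can be evaluated with subprocess.run by comparing the exit code against expected_outcome. This is a minimal sketch; run_cli_step and its timeout_s parameter are hypothetical names, and defaulting the expected return code to 0 is an assumption.

```python
import shlex
import subprocess
from typing import Any, Dict


def run_cli_step(step: Dict[str, Any], expected: Dict[str, Any], timeout_s: float = 60.0) -> bool:
    """Run one CLI step and compare its exit code with expected_outcome."""
    completed = subprocess.run(
        shlex.split(step["command"]),  # split into argv so no shell is involved
        cwd=step.get("cwd"),           # None means the current working directory
        capture_output=True,
        text=True,
        timeout=timeout_s,             # raises subprocess.TimeoutExpired when exceeded
    )
    return completed.returncode == expected.get("returncode", 0)
```

shlex.split follows POSIX quoting rules, so a command string containing unquoted Windows backslash paths would need different handling.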

Browser Test Scenarios

Pre-built scenarios in browser_scenarios.py:

from optimization_engine.devloop.browser_scenarios import get_browser_scenarios

# Get scenarios by level
scenarios = get_browser_scenarios(level="full", study_name="support_arm")

# Available functions (signatures for reference, not runnable as written):
#   get_browser_scenarios(level, study_name)  - Main entry point
#   get_study_browser_scenarios(study_name)   - Study-specific tests
#   get_ui_verification_scenarios()           - Home page tests
#   get_chat_verification_scenarios()         - Chat panel tests

Standalone Playwright Tests

In addition to DevLoop integration, you can run standalone Playwright tests:

cd atomizer-dashboard/frontend

# Run all E2E tests
npm run test:e2e

# Run with Playwright UI
npm run test:e2e:ui

# Run specific test file
npx playwright test tests/e2e/home.spec.ts

Test files:

  • tests/e2e/home.spec.ts - Home page tests (8 tests)

API Integration

DevLoop also provides REST API endpoints when running the dashboard backend:

| Endpoint | Method | Description |
|----------|--------|-------------|
| /api/devloop/status | GET | Current loop status |
| /api/devloop/start | POST | Start development cycle |
| /api/devloop/stop | POST | Stop current cycle |
| /api/devloop/step | POST | Execute single phase |
| /api/devloop/history | GET | View past cycles |
| /api/devloop/health | GET | System health check |
| /api/devloop/ws | WebSocket | Real-time updates |

Start a cycle via API:

curl -X POST http://localhost:8000/api/devloop/start \
  -H "Content-Type: application/json" \
  -d '{"objective": "Create support_arm study", "max_iterations": 5}'

State Files

DevLoop maintains state in .devloop/:

| File | Purpose | Updated By |
|------|---------|------------|
| current_plan.json | Current implementation plan | plan command |
| test_results.json | Filesystem/API test results | test command |
| browser_test_results.json | Browser test results | browser command |
| analysis.json | Failure analysis | analyze command |
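
Since the state files are plain JSON, they can be read by other tools. The sketch below loads whichever files are present. The helper names read_state and plan_summary are hypothetical, and only the plan's "objective" and "tasks" keys are relied on, since the schemas of the result and analysis files are not documented here.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional

STATE_FILES = (
    "current_plan.json",
    "test_results.json",
    "browser_test_results.json",
    "analysis.json",
)


def read_state(state_dir: str = ".devloop") -> Dict[str, Optional[Any]]:
    """Load every state file; missing or unparsable files map to None."""
    state: Dict[str, Optional[Any]] = {}
    for name in STATE_FILES:
        try:
            state[name] = json.loads((Path(state_dir) / name).read_text(encoding="utf-8"))
        except (OSError, ValueError):
            state[name] = None
    return state


def plan_summary(state: Dict[str, Optional[Any]]) -> str:
    """One-line summary of the saved plan, using keys from the plan example."""
    plan = state.get("current_plan.json")
    if not plan:
        return "No plan saved"
    return f"Current Plan: {plan['objective']} ({len(plan.get('tasks', []))} tasks)"
```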

Example Workflows

Create a New Study

# Full autonomous cycle
python tools/devloop_cli.py start "Create bracket_lightweight study with mass and displacement objectives"

# Or step by step
python tools/devloop_cli.py plan "Create bracket_lightweight study"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study bracket_lightweight
python tools/devloop_cli.py browser --study bracket_lightweight

Debug a Dashboard Issue

# Plan the fix
python tools/devloop_cli.py plan "Fix canvas node selection not updating panel"

# Implement
python tools/devloop_cli.py implement

# Test UI
python tools/devloop_cli.py browser --level full

# If tests fail, analyze
python tools/devloop_cli.py analyze

# Fix and retest loop...

Verify Study Before Running

# File structure tests
python tools/devloop_cli.py test --study my_study

# Browser tests (canvas loads, etc.)
python tools/devloop_cli.py browser --level study --study my_study

Troubleshooting

Browser Tests Fail

  1. Ensure frontend is running: npm run dev in atomizer-dashboard/frontend
  2. Check port: DevLoop expects the frontend at localhost:3003. Vite's stock default is 5173, so confirm the dev server is configured for port 3003
  3. Install browsers: npx playwright install chromium

CLI Tools Not Found

Check paths in cli_bridge.py:

CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"
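
Because these paths are specific to one machine, a lookup that falls back to PATH may be more portable. This is a sketch of one possible approach, not what cli_bridge.py currently does; resolve_cli is a hypothetical helper.

```python
import shutil
from pathlib import Path
from typing import Optional


def resolve_cli(command: str, configured_path: Optional[str] = None) -> Optional[str]:
    """Return a usable executable path, preferring the configured one."""
    if configured_path and Path(configured_path).is_file():
        return configured_path
    # Fall back to a PATH lookup; on Windows this also tries PATHEXT suffixes
    # such as .exe and .cmd.
    return shutil.which(command)
```

For example, resolve_cli("claude", CLAUDE_PATH) would return the configured path when that file exists, and otherwise whatever claude executable is found on PATH, or None if there is none.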

API Tests Fail

  1. Ensure the backend is running on port 8000
  2. Check endpoint paths: the endpoint in a test step may need the /api/ prefix (for example /api/studies/my_study/spec)

Tests Timeout

Increase timeout in test scenario:

{
  "timeout_ms": 30000
}

Unclosed Client Session Warning

This is a known aiohttp warning on Windows. Tests still pass correctly.

Integration with LAC

DevLoop records learnings to LAC (Learning Atomizer Core):

from knowledge_base.lac import get_lac

lac = get_lac()

# Record after successful cycle
lac.record_insight(
    category="success_pattern",
    context="DevLoop created support_arm study",
    insight="TPE sampler works well for 4-variable bracket problems",
    confidence=0.9
)

Future Enhancements

  1. Parallel test execution - Run independent tests concurrently
  2. Visual diff - Show code changes in dashboard
  3. Smart rollback - Automatic rollback on regression
  4. Branch management - Auto-create feature branches
  5. Cost tracking - Monitor CLI usage