Anto01 3193831340 feat: Add DevLoop automation and HTML Reports

## DevLoop - Closed-Loop Development System
- Orchestrator for plan → build → test → analyze cycle
- Gemini planning via OpenCode CLI
- Claude implementation via CLI bridge
- Playwright browser testing integration
- Test runner with API, filesystem, and browser tests
- Persistent state in .devloop/ directory
- CLI tool: tools/devloop_cli.py

Usage:
  python tools/devloop_cli.py start 'Create new feature'
  python tools/devloop_cli.py plan 'Fix bug in X'
  python tools/devloop_cli.py test --study support_arm
  python tools/devloop_cli.py browser --level full

## HTML Reports (optimization_engine/reporting/)
- Interactive Plotly-based reports
- Convergence plot, Pareto front, parallel coordinates
- Parameter importance analysis
- Self-contained HTML (offline-capable)
- Tailwind CSS styling

## Playwright E2E Tests
- Home page tests
- Test results in test-results/

## LAC Knowledge Base Updates
- Session insights (failures, workarounds, patterns)
- Optimization memory for arm support study

2026-01-24 21:18:18 -05:00

DevLoop - Closed-Loop Development System

Overview

DevLoop is Atomizer's autonomous development system. It coordinates AI agents (Gemini for planning and analysis, Claude Code for implementation) with automated tests in a closed plan, build, test, analyze loop.

Key Features:

Uses your existing CLI subscriptions - no API keys needed
Playwright browser testing for UI verification
Multiple test types: API, browser, CLI, filesystem
Automatic analysis and fix iterations
Persistent state in .devloop/ directory

+-----------------------------------------------------------------------------+
|                    ATOMIZER DEVLOOP - CLOSED-LOOP DEVELOPMENT               |
+-----------------------------------------------------------------------------+
|                                                                             |
|   +----------+     +----------+     +----------+     +----------+          |
|   |  PLAN    |---->|  BUILD   |---->|  TEST    |---->| ANALYZE  |          |
|   | Gemini   |     | Claude   |     |Playwright|     |  Gemini  |          |
|   | OpenCode |     | CLI      |     | + API    |     | OpenCode |          |
|   +----------+     +----------+     +----------+     +----------+          |
|        ^                                                   |                |
|        |                                                   |                |
|        +---------------------------------------------------+                |
|                          FIX LOOP (max iterations)                          |
+-----------------------------------------------------------------------------+

Quick Start

CLI Commands

# Full development cycle
python tools/devloop_cli.py start "Create new bracket study"

# Step-by-step execution
python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py analyze

# Browser UI tests (Playwright)
python tools/devloop_cli.py browser                      # Quick smoke test
python tools/devloop_cli.py browser --level home         # Home page tests
python tools/devloop_cli.py browser --level full         # All UI tests
python tools/devloop_cli.py browser --study support_arm  # Study-specific

# Check status
python tools/devloop_cli.py status

# Quick test with support_arm study
python tools/devloop_cli.py quick

Prerequisites

Backend running: cd atomizer-dashboard/backend && python -m uvicorn api.main:app --reload --port 8000
Frontend running: cd atomizer-dashboard/frontend && npm run dev
Playwright browsers installed: cd atomizer-dashboard/frontend && npx playwright install chromium

Architecture

Directory Structure

optimization_engine/devloop/
+-- __init__.py              # Module exports
+-- orchestrator.py          # DevLoopOrchestrator - full cycle coordination
+-- cli_bridge.py            # DevLoopCLIOrchestrator - CLI-based execution
|   +-- ClaudeCodeCLI        # Claude Code CLI wrapper
|   +-- OpenCodeCLI          # OpenCode (Gemini) CLI wrapper
+-- test_runner.py           # DashboardTestRunner - test execution
+-- browser_scenarios.py     # Pre-built Playwright scenarios
+-- planning.py              # GeminiPlanner - strategic planning
+-- analyzer.py              # ProblemAnalyzer - failure analysis
+-- claude_bridge.py         # ClaudeCodeBridge - Claude API integration

tools/
+-- devloop_cli.py           # CLI entry point

.devloop/                    # Persistent state directory
+-- current_plan.json        # Current planning state
+-- test_results.json        # Latest filesystem/API test results
+-- browser_test_results.json  # Latest browser test results
+-- analysis.json            # Latest analysis results

Core Components

Component	Location	Purpose
`DevLoopCLIOrchestrator`	`cli_bridge.py`	CLI-based cycle orchestration
`ClaudeCodeCLI`	`cli_bridge.py`	Execute Claude Code CLI commands
`OpenCodeCLI`	`cli_bridge.py`	Execute OpenCode (Gemini) CLI commands
`DashboardTestRunner`	`test_runner.py`	Run all test types
`get_browser_scenarios()`	`browser_scenarios.py`	Pre-built Playwright tests
`DevLoopOrchestrator`	`orchestrator.py`	API-based orchestration (WebSocket)
`GeminiPlanner`	`planning.py`	Gemini API planning
`ProblemAnalyzer`	`analyzer.py`	Failure analysis

CLI Tools Configuration

DevLoop uses your existing CLI subscriptions. The paths to the two executables are set in cli_bridge.py and must match your machine:

# In cli_bridge.py
CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"

CLI Commands Reference

`start` - Full Development Cycle

Runs the complete PLAN -> BUILD -> TEST -> ANALYZE -> FIX loop.

python tools/devloop_cli.py start "Create support_arm study" --max-iterations 5

Arguments:

objective (required): What to achieve
--max-iterations: Maximum fix iterations (default: 5)

Flow:

Gemini creates implementation plan
Claude Code implements the plan
Tests verify implementation
If tests fail: Gemini analyzes the failures, Claude applies the fixes, and the tests run again
Exits on success or max iterations
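The flow above can be sketched as a small control loop. This is a minimal illustration, not the code in orchestrator.py or cli_bridge.py: `run_cycle`, its callable parameters, and the `"passed"` key on each test result are hypothetical names chosen for the example.

```python
from typing import Callable


def run_cycle(
    objective: str,
    plan: Callable[[str], dict],
    build: Callable[[dict], None],
    test: Callable[[], list],
    analyze: Callable[[list], dict],
    max_iterations: int = 5,
) -> dict:
    """Plan and build once, then test and fix until tests pass or the budget is spent."""
    build(plan(objective))
    for fixes_done in range(max_iterations + 1):
        results = test()
        # Assumption: each result is a dict with a boolean "passed" key.
        failures = [r for r in results if not r.get("passed")]
        if not failures:
            return {"status": "success", "fix_iterations": fixes_done}
        if fixes_done == max_iterations:
            break  # no fix budget left, so stop without building again
        # The analysis step turns failures into a fix plan, built like any other plan.
        build(analyze(failures))
    return {"status": "max_iterations_reached", "fix_iterations": max_iterations}
```

Counting fix iterations separately from the initial build means that `--max-iterations 5` allows up to five fix attempts, each followed by a test run.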

`plan` - Create Implementation Plan

Uses Gemini (via OpenCode) to create a strategic plan.

python tools/devloop_cli.py plan "Fix dashboard validation"
python tools/devloop_cli.py plan "Add new extractor" --context context.json

Output: Saves plan to .devloop/current_plan.json

Plan structure:

{
  "objective": "Fix dashboard validation",
  "approach": "Update validation logic in spec_validator.py",
  "tasks": [
    {
      "id": "task_001",
      "description": "Update bounds validation",
      "file": "optimization_engine/config/spec_validator.py",
      "priority": "high"
    }
  ],
  "test_scenarios": [
    {
      "id": "test_001",
      "name": "Validation passes for valid spec",
      "type": "api",
      "steps": [...]
    }
  ],
  "acceptance_criteria": ["All validation tests pass"]
}
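A plan file with this shape can be read back and inspected with the standard library. The sketch below assumes that all five top-level keys shown above are always present. `load_plan` and `high_priority_tasks` are hypothetical helpers, not part of DevLoop.

```python
import json
from pathlib import Path

# Assumption: every plan carries the five top-level keys from the example above.
REQUIRED_KEYS = ("objective", "approach", "tasks", "test_scenarios", "acceptance_criteria")


def load_plan(path: str = ".devloop/current_plan.json") -> dict:
    """Parse a plan file and fail early if a top-level key is missing."""
    plan = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in plan]
    if missing:
        raise ValueError(f"Plan is missing keys: {', '.join(missing)}")
    return plan


def high_priority_tasks(plan: dict) -> list:
    """Return the ids of tasks whose priority is 'high'."""
    return [task["id"] for task in plan["tasks"] if task.get("priority") == "high"]
```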

`implement` - Execute Plan with Claude Code

Implements the current plan using Claude Code CLI.

python tools/devloop_cli.py implement
python tools/devloop_cli.py implement --plan custom_plan.json

Arguments:

--plan: Custom plan file (default: .devloop/current_plan.json)

Output: Reports files modified and success/failure.

`test` - Run Tests

Run filesystem, API, or custom tests for a study.

python tools/devloop_cli.py test --study support_arm
python tools/devloop_cli.py test --scenarios custom_tests.json

Arguments:

--study: Study name (generates standard tests)
--scenarios: Custom test scenarios JSON file

Standard study tests:

Study directory exists
atomizer_spec.json is valid JSON
README.md exists
run_optimization.py exists
1_setup/model/ directory exists

Output: Saves results to .devloop/test_results.json
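As an illustration of how the five standard checks could be expressed in the scenario format, the sketch below generates one filesystem scenario per check. It assumes that studies live under a `studies/` directory and that the listed files sit at the study root. `standard_study_scenarios` is a hypothetical name, and the real generator in test_runner.py may differ.

```python
def standard_study_scenarios(study_name: str, studies_root: str = "studies") -> list:
    """Build one filesystem scenario per standard study check."""
    base = f"{studies_root}/{study_name}"
    # (scenario name, action, path) for each check in the list above.
    checks = [
        ("Study directory exists", "check_exists", base),
        ("atomizer_spec.json is valid JSON", "check_json_valid", f"{base}/atomizer_spec.json"),
        ("README.md exists", "check_exists", f"{base}/README.md"),
        ("run_optimization.py exists", "check_exists", f"{base}/run_optimization.py"),
        ("1_setup/model/ directory exists", "check_exists", f"{base}/1_setup/model"),
    ]
    return [
        {
            "id": f"test_fs_{index:03d}",
            "name": name,
            "type": "filesystem",
            "steps": [{"action": action, "path": path}],
        }
        for index, (name, action, path) in enumerate(checks, start=1)
    ]
```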

`browser` - Run Playwright UI Tests

Run browser-based UI tests using Playwright.

python tools/devloop_cli.py browser                      # Quick smoke test
python tools/devloop_cli.py browser --level home         # Home page tests
python tools/devloop_cli.py browser --level full         # All UI tests
python tools/devloop_cli.py browser --level study --study support_arm

Arguments:

--level: Test level (quick, home, full, study)
--study: Study name for study-specific tests

Test Levels:

Level	Tests	Description
`quick`	1	Smoke test - page loads
`home`	2	Home page stats + folder expansion
`full`	5+	All UI + study-specific
`study`	3	Canvas, dashboard for specific study

Output: Saves results to .devloop/browser_test_results.json

`analyze` - Analyze Test Results

Uses Gemini (via OpenCode) to analyze failures and create fix plans.

python tools/devloop_cli.py analyze
python tools/devloop_cli.py analyze --results custom_results.json

Arguments:

--results: Custom results file (default: .devloop/test_results.json)

Output: Saves analysis to .devloop/analysis.json

`status` - View Current State

Shows the current DevLoop state.

python tools/devloop_cli.py status

Output:

DevLoop Status
============================================================

Current Plan: Fix dashboard validation
  Tasks: 3

Last Test Results:
  Passed: 4/5

Last Analysis:
  Issues: 1

============================================================
CLI Tools:
  - Claude Code: C:\Users\antoi\.local\bin\claude.exe
  - OpenCode:    C:\Users\antoi\AppData\Roaming\npm\opencode.cmd

`quick` - Quick Test

Runs tests for the support_arm study as a quick verification.

python tools/devloop_cli.py quick

Test Types

Filesystem Tests

Check that files and directories exist, that JSON files parse, and that files contain expected content.

{
  "id": "test_fs_001",
  "name": "Study directory exists",
  "type": "filesystem",
  "steps": [
    {"action": "check_exists", "path": "studies/my_study"}
  ],
  "expected_outcome": {"exists": true}
}

Actions:

check_exists - Verify path exists
check_json_valid - Parse JSON file
check_file_contains - Search for content
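The three actions map naturally onto `pathlib` and `json`. The following is a minimal sketch of a step evaluator, not the DashboardTestRunner implementation. The `contains` key used by `check_file_contains` is an assumed field name, since the source does not show that step's parameters.

```python
import json
from pathlib import Path


def run_filesystem_step(step: dict, root: str = ".") -> bool:
    """Evaluate one filesystem step and report whether the check holds."""
    target = Path(root) / step["path"]
    action = step["action"]
    if action == "check_exists":
        return target.exists()
    if action == "check_json_valid":
        try:
            json.loads(target.read_text(encoding="utf-8"))
            return True
        except (OSError, ValueError):  # missing file, bad encoding, or invalid JSON
            return False
    if action == "check_file_contains":
        try:
            # Assumption: the text to search for is stored under a "contains" key.
            return step["contains"] in target.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError):
            return False
    raise ValueError(f"Unknown filesystem action: {action}")
```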

API Tests

Test REST endpoints.

{
  "id": "test_api_001",
  "name": "Get study spec",
  "type": "api",
  "steps": [
    {"action": "get", "endpoint": "/api/studies/my_study/spec"}
  ],
  "expected_outcome": {"status_code": 200}
}

Actions:

get - HTTP GET
post - HTTP POST with data
put - HTTP PUT with data
delete - HTTP DELETE
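One way to turn an API step into an HTTP call is with `urllib` from the standard library. The sketch below is illustrative only: it assumes the backend listens on `http://localhost:8000` and that POST and PUT payloads are stored under a `data` key, which the source does not confirm. `build_api_request` and `check_status` are hypothetical names.

```python
import json
import urllib.error
import urllib.request

ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}


def build_api_request(step: dict, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Translate a scenario step into a urllib request without sending it."""
    method = step["action"].upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"Unknown API action: {step['action']}")
    body = None
    headers = {}
    if method in {"POST", "PUT"}:
        # Assumption: the payload lives under a "data" key in the step.
        body = json.dumps(step.get("data", {})).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url + step["endpoint"], data=body, headers=headers, method=method
    )


def check_status(step: dict, expected_outcome: dict, base_url: str = "http://localhost:8000") -> bool:
    """Send the request and compare the response status with the expected one."""
    request = build_api_request(step, base_url)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        status = error.code  # 4xx and 5xx responses still carry a status code
    return status == expected_outcome["status_code"]
```

Keeping request construction separate from sending makes the mapping testable without a running backend.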

Browser Tests (Playwright)

Test UI interactions.

{
  "id": "test_browser_001",
  "name": "Canvas loads nodes",
  "type": "browser",
  "steps": [
    {"action": "navigate", "url": "/canvas/support_arm"},
    {"action": "wait_for", "selector": ".react-flow__node"},
    {"action": "click", "selector": "[data-testid='node-dv_001']"}
  ],
  "expected_outcome": {"status": "pass"},
  "timeout_ms": 20000
}

Actions:

navigate - Go to URL
wait_for - Wait for selector
click - Click element
fill - Fill input with value
screenshot - Take screenshot
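Each action corresponds closely to a method on a Playwright `Page`. The dispatcher below is a sketch under stated assumptions, not the code in test_runner.py: the frontend is assumed to be served at `http://localhost:3003`, and the `value` key (for `fill`) and `path` key (for `screenshot`) are assumed field names. The `page` argument is any object exposing Playwright's `goto`, `wait_for_selector`, `click`, `fill`, and `screenshot` methods.

```python
def run_browser_steps(page, steps, base_url="http://localhost:3003", timeout_ms=20000):
    """Apply scenario steps to a Playwright page; Playwright raises on failure."""
    for step in steps:
        action = step["action"]
        if action == "navigate":
            page.goto(base_url + step["url"], timeout=timeout_ms)
        elif action == "wait_for":
            page.wait_for_selector(step["selector"], timeout=timeout_ms)
        elif action == "click":
            page.click(step["selector"], timeout=timeout_ms)
        elif action == "fill":
            # Assumption: the text to type is stored under a "value" key.
            page.fill(step["selector"], step["value"], timeout=timeout_ms)
        elif action == "screenshot":
            # Assumption: the output file is stored under a "path" key.
            page.screenshot(path=step["path"])
        else:
            raise ValueError(f"Unknown browser action: {action}")


# Usage with the synchronous Playwright API (requires the playwright package):
#
#   from playwright.sync_api import sync_playwright
#   with sync_playwright() as p:
#       browser = p.chromium.launch()
#       run_browser_steps(browser.new_page(), scenario["steps"])
#       browser.close()
```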

CLI Tests

Execute shell commands.

{
  "id": "test_cli_001",
  "name": "Run optimization test",
  "type": "cli",
  "steps": [
    {"command": "python run_optimization.py --test", "cwd": "studies/my_study"}
  ],
  "expected_outcome": {"returncode": 0}
}
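A CLI step can be evaluated with `subprocess.run`. This is a minimal sketch rather than the runner's actual code. `run_cli_step` and its 60-second default timeout are assumptions made for the example.

```python
import subprocess


def run_cli_step(step: dict, expected_outcome: dict, timeout_s: float = 60.0) -> bool:
    """Run the step's command and compare its exit code with the expected one."""
    try:
        completed = subprocess.run(
            step["command"],
            shell=True,  # the scenario stores the command as a single string
            cwd=step.get("cwd"),  # None means the current working directory
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False  # treat a hung command as a failed step
    return completed.returncode == expected_outcome["returncode"]
```

Because the command runs through the shell, scenario files should only come from trusted sources.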

Browser Test Scenarios

Pre-built scenarios in browser_scenarios.py:

from optimization_engine.devloop.browser_scenarios import get_browser_scenarios

# Get scenarios by level
scenarios = get_browser_scenarios(level="full", study_name="support_arm")

# Available functions
get_browser_scenarios(level, study_name)  # Main entry point
get_study_browser_scenarios(study_name)   # Study-specific tests
get_ui_verification_scenarios()           # Home page tests
get_chat_verification_scenarios()         # Chat panel tests

Standalone Playwright Tests

In addition to DevLoop integration, you can run standalone Playwright tests:

cd atomizer-dashboard/frontend

# Run all E2E tests
npm run test:e2e

# Run with Playwright UI
npm run test:e2e:ui

# Run specific test file
npx playwright test tests/e2e/home.spec.ts

Test files:

tests/e2e/home.spec.ts - Home page tests (8 tests)

API Integration

DevLoop also exposes REST API endpoints while the dashboard backend is running:

Endpoint	Method	Description
`/api/devloop/status`	GET	Current loop status
`/api/devloop/start`	POST	Start development cycle
`/api/devloop/stop`	POST	Stop current cycle
`/api/devloop/step`	POST	Execute single phase
`/api/devloop/history`	GET	View past cycles
`/api/devloop/health`	GET	System health check
`/api/devloop/ws`	WebSocket	Real-time updates

Start a cycle via API:

curl -X POST http://localhost:8000/api/devloop/start \
  -H "Content-Type: application/json" \
  -d '{"objective": "Create support_arm study", "max_iterations": 5}'

State Files

DevLoop maintains state in .devloop/:

File	Purpose	Updated By
`current_plan.json`	Current implementation plan	`plan` command
`test_results.json`	Filesystem/API test results	`test` command
`browser_test_results.json`	Browser test results	`browser` command
`analysis.json`	Failure analysis	`analyze` command
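Since the state is plain JSON on disk, it can be inspected outside the CLI. The sketch below reads all four files and tolerates ones that have not been written yet. `read_state` is a hypothetical helper, not a DevLoop function.

```python
import json
from pathlib import Path

STATE_FILES = (
    "current_plan.json",
    "test_results.json",
    "browser_test_results.json",
    "analysis.json",
)


def read_state(state_dir: str = ".devloop") -> dict:
    """Return the parsed content of each state file, or None if it does not exist."""
    state = {}
    for name in STATE_FILES:
        try:
            state[name] = json.loads((Path(state_dir) / name).read_text(encoding="utf-8"))
        except FileNotFoundError:
            state[name] = None  # the command that writes this file has not run yet
    return state
```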

Example Workflows

Create a New Study

# Full autonomous cycle
python tools/devloop_cli.py start "Create bracket_lightweight study with mass and displacement objectives"

# Or step by step
python tools/devloop_cli.py plan "Create bracket_lightweight study"
python tools/devloop_cli.py implement
python tools/devloop_cli.py test --study bracket_lightweight
python tools/devloop_cli.py browser --study bracket_lightweight

Debug a Dashboard Issue

# Plan the fix
python tools/devloop_cli.py plan "Fix canvas node selection not updating panel"

# Implement
python tools/devloop_cli.py implement

# Test UI
python tools/devloop_cli.py browser --level full

# If tests fail, analyze
python tools/devloop_cli.py analyze

# Fix and retest loop...

Verify Study Before Running

# File structure tests
python tools/devloop_cli.py test --study my_study

# Browser tests (canvas loads, etc.)
python tools/devloop_cli.py browser --level study --study my_study

Troubleshooting

Browser Tests Fail

Ensure the frontend is running: npm run dev in atomizer-dashboard/frontend
Check the port: DevLoop uses localhost:3003. This is not Vite's stock default (5173), so confirm the dev server is actually serving on 3003.
Install browsers: npx playwright install chromium

CLI Tools Not Found

Check paths in cli_bridge.py:

CLAUDE_PATH = r"C:\Users\antoi\.local\bin\claude.exe"
OPENCODE_PATH = r"C:\Users\antoi\AppData\Roaming\npm\opencode.cmd"
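When a configured path does not exist on the current machine, a lookup on `PATH` is one possible fallback. The sketch below is an illustration, not how cli_bridge.py behaves: `resolve_cli` is a hypothetical helper, and the command names passed to it (for example `claude` or `opencode`) are assumptions about how the tools are installed.

```python
import os
import shutil
from typing import Optional


def resolve_cli(configured_path: str, command_name: str) -> Optional[str]:
    """Prefer the configured executable; otherwise search PATH for the command."""
    if os.path.isfile(configured_path):
        return configured_path
    # shutil.which returns None when the command is not on PATH.
    return shutil.which(command_name)
```

A `None` result means the tool could not be located either way, which is the case to report to the user.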

API Tests Fail

Ensure the backend is running on port 8000
Check endpoint paths: they may need the /api/ prefix

Tests Timeout

Increase timeout in test scenario:

{
  "timeout_ms": 30000
}

Unclosed Client Session Warning

This is a known aiohttp warning on Windows. Tests still pass correctly.

Integration with LAC

DevLoop records learnings to LAC (Learning Atomizer Core):

from knowledge_base.lac import get_lac

lac = get_lac()

# Record after successful cycle
lac.record_insight(
    category="success_pattern",
    context="DevLoop created support_arm study",
    insight="TPE sampler works well for 4-variable bracket problems",
    confidence=0.9
)

Future Enhancements

Parallel test execution - Run independent tests concurrently
Visual diff - Show code changes in dashboard
Smart rollback - Automatic rollback on regression
Branch management - Auto-create feature branches
Cost tracking - Monitor CLI usage
