# V2 Migration Master Plan: Audit Report

**Auditor:** Auditor Agent 🔍
**Date:** 2026-02-22
**Document Reviewed:** `ATOMIZER-V2-MIGRATION-MASTERPLAN.md`
**Verdict:** 🟡 MAJOR issues found. The plan is strong but has significant gaps that will cause problems during execution.

---

## 1. Completeness: 🔴 CRITICAL GAPS

### 1.1 Missing V1 Modules (Not Accounted For)

The migration plan lists modules to port but **misses at least 8 significant V1 subpackages**:

| V1 Module | Files | Purpose | Impact if Missed |
|-----------|-------|---------|-----------------|
| `optimization_engine/context/` | 7 files | Session state, compaction, feedback loop, playbook, reflector | 🔴 Core runtime functionality; sessions won't persist state |
| `optimization_engine/study/` | 8 files | Study creator, wizard, continuation, reset, benchmarking, state, history | 🔴 Can't create or manage studies without this |
| `optimization_engine/utils/` | 12 files | Logger, dashboard_db, trial_manager, NX file discovery, study archiver, realtime tracking | 🔴 Infrastructure that everything depends on |
| `optimization_engine/plugins/` | 4 files | hook_manager, hooks, validators (DIFFERENT from `hooks/`) | 🟡 Plugin system won't work |
| `optimization_engine/intake/` | 3 files | Config intake, context intake, processor | 🟡 Study intake pipeline broken |
| `optimization_engine/validation/` | 3 files | checker.py, gate.py (DIFFERENT from `validators/`) | 🟡 Validation gates lost |
| `optimization_engine/model_discovery/` | 2 files | NX model introspection | 🟡 Model discovery capability lost |
| `optimization_engine/devloop/` | 7 files | Analyzer, orchestrator, planning, test_runner, browser scenarios | 🟢 DevLoop was planned for `tools/devloop_cli.py` but the full subpackage has 7 files |
| `optimization_engine/processors/` | 2 files | adaptive_characterization.py | 🟡 V1 already has a `processors/` concept |
| `optimization_engine/future/` | 11 files | Research agents, LLM workflow analyzer, step classifier | 🟢 May be intentionally excluded, but not listed in "DO NOT MIGRATE" |
| `optimization_engine/custom_functions/` | 2 files | NX material generator | 🟢 Utility, should be documented |
| `optimization_engine/templates/` | 3 files | run_optimization_template, run_nn_optimization_template | 🟡 Template system for studies |
| `optimization_engine/surrogates/` | 1 file | `__init__.py` (separate from `gnn/`) | 🟢 Minor |

### 1.2 Missing V1 Core Files

| V1 File | Role | Plan Status |
|---------|------|-------------|
| `optimization_engine/core/base_runner.py` | Base class for runners | ❌ Not mentioned (plan only lists runner.py) |
| `optimization_engine/core/gradient_optimizer.py` | Gradient-based optimization | ❌ Not mentioned |
| `optimization_engine/core/runner_with_neural.py` | Neural-accelerated runner | ❌ Not mentioned |
| `optimization_engine/core/strategy_portfolio.py` | Strategy portfolio management | ❌ Not mentioned |
| `optimization_engine/core/strategy_selector.py` | Strategy selection (different from method_selector) | ❌ Not mentioned |
| `optimization_engine/schemas/` | Schema files | ✅ Mentioned but directory contents not inventoried |

### 1.3 Missing V1 Root-Level Files

| File | Status |
|------|--------|
| `atomizer.py` (25KB monolith) | Listed in "DO NOT MIGRATE" ✅ but its functionality needs a replacement |
| `launch_dashboard.py` | ❌ Not mentioned. How does V2 launch the dashboard? |
| `requirements.txt` | Replaced by pyproject.toml ✅ |
| `install.bat` | ❌ Not mentioned (Windows install script) |

### 1.4 V1 Tools Directory

The plan only mentions `tools/devloop_cli.py`. V1 `tools/` has **25+ scripts**, including:

- `analyze_study.py`, `find_best_iteration.py`, `archive_study.py`
- `create_pareto_graphs.py`, `generate_psd_figures.py`
- Zernike-specific tools (HTML generator, WFE PSD, optical report)
- Study migration tools

**Recommendation:** Create an inventory of `tools/` and decide per file: migrate, archive, or replace.

---

## 2. Risk Assessment: 🟡 MAJOR

### 2.1 Identified Risks (Plan Section 11)

The plan's risk table is reasonable but **underestimates these risks:**

| Risk | Plan's Mitigation | My Assessment |
|------|-------------------|---------------|
| Import breakage | Find-replace `optimization_engine.` → `atomizer.` | 🟡 **Insufficient.** Many V1 modules use relative imports and cross-module imports, and `optimization_engine.` references are nested (e.g., `from optimization_engine.core.runner import Runner`, where `runner.py` itself imports from `optimization_engine.extractors`). A mechanical find-replace will not touch relative imports, will miss runtime-only imports, and will not reveal circular dependencies. This needs a test suite, not just `sed`. |
| NX integration breaks | Test on dalidou before archiving V1 | ✅ Adequate |
| `.gitignore` too aggressive | Test essential files | 🟡 See Data Safety section below |

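To make the import-breakage risk measurable before any find-replace runs, the absolute imports of the V1 package can be listed with Python's `ast` module. The following is a minimal sketch, not part of the plan: the function names `find_v1_imports` and `scan_tree` are hypothetical, and the scan only sees static `import` statements. Relative imports and string-based dynamic imports (for example through `importlib.import_module`) are not reported and would still need tests.

```python
import ast
from pathlib import Path

OLD_ROOT = "optimization_engine"  # V1 package name, as used in the plan


def find_v1_imports(source: str) -> list[tuple[int, str]]:
    """Return (line, module) pairs for every absolute import of the V1 package.

    ast.walk visits nested scopes too, so imports placed inside functions
    (runtime-only imports) are included.
    """
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # not an import, or a relative import
        for name in names:
            if name == OLD_ROOT or name.startswith(OLD_ROOT + "."):
                hits.append((node.lineno, name))
    return sorted(hits)


def scan_tree(root: Path) -> dict[str, list[tuple[int, str]]]:
    """Scan every .py file under root; files that cannot be parsed are skipped."""
    report = {}
    for path in sorted(root.rglob("*.py"), key=lambda p: p.as_posix()):
        try:
            hits = find_v1_imports(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue
        if hits:
            report[str(path)] = hits
    return report
```

Running `scan_tree(Path("optimization_engine"))` before and after the rename gives a concrete list of files that still reference V1 paths, which can back the per-phase checks suggested later in this report.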
### 2.2 Unidentified Risks

| Risk | Severity | Mitigation Needed |
|------|----------|-------------------|
| **V1 `utils/` dependency web:** logger, trial_manager, dashboard_db are imported EVERYWHERE in V1. Where do they go in V2? | 🔴 HIGH | Create `atomizer/utils/` or distribute into appropriate modules. Map ALL import dependencies before porting. |
| **`context/` module loss:** session state, compaction, feedback loops. If not ported, studies can't resume and context is lost between runs | 🔴 HIGH | Add to migration table, decide V2 location |
| **`study/` module loss:** study creation wizard, continuation, reset. Without this, studies can't be created from V2 | 🔴 HIGH | Add to migration table as P0 |
| **Optuna DB path changes:** V1 studies store Optuna databases at specific paths. V2 restructure may break study continuation | 🟡 MED | Test study continuation with path remapping |
| **NX journal path references:** NX journals may hardcode V1 paths | 🟡 MED | Audit all journal files for hardcoded paths |
| **Knowledge base `.jsonl` files:** are these tracked in git or gitignored? They're small (212KB) but grow over time | 🟡 MED | Clarify: track in git or gitignore with backup strategy |
| **Python version compatibility:** pyproject.toml says `>=3.10` but V1 may use patterns from 3.8/3.9 | 🟢 LOW | Test on target Python version |

---

## 3. Feasibility: 🟡 8-Day Timeline is Aggressive

### 3.1 Phase-by-Phase Assessment

| Phase | Planned | Realistic | Issue |
|-------|---------|-----------|-------|
| Phase 0: Bootstrap + AOM | 1 day | 1.5 days | AOM link conversion for 48 docs is tedious even with a script. Needs manual QA. |
| Phase 1: Core Engine | 2 days | 3-4 days | **Plan lists 13 steps but misses ~25 additional files** from `core/`, `context/`, `study/`, `utils/`. Refactoring runner→engine while maintaining all runner variants (base_runner, runner_with_neural) is non-trivial. |
| Phase 2: Supporting | 2 days | 2 days | Reasonable if scope is truly "direct port" |
| Phase 3: Integration | 2 days | 3 days | Import fixes across 100+ files. This is where the missing modules will surface. |
| Phase 4: Syncthing | 1 day | 1 day | Reasonable |
| Phase 5: GitHub + CI | 1 day | 0.5 days | Straightforward |
| Phase 6: Archive V1 | 1 day | 0.5 days | Straightforward |
| **Total** | **8 days** | **11-13 days** | |

### 3.2 Key Bottleneck

**Phase 1 is underscoped.** The migration table shows 13 clean steps, but V1's `optimization_engine/` has **~150 Python files across 20 subpackages**. The plan only explicitly accounts for ~60 of these. The remaining ~90 files will surface during Phase 3 integration testing, causing scope creep and rework.

**Recommendation:** Before starting, create a complete file-level inventory mapping every V1 `.py` file to its V2 destination (or explicit "skip" decision). This takes ~2 hours but saves days of surprises.

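The inventory recommended above could be seeded with a short script that walks the V1 package and writes one CSV row per `.py` file, leaving the destination and decision columns empty for manual triage. This is a sketch under assumptions: the column names (`v1_path`, `subpackage`, `v2_destination`, `decision`) and the function names are illustrative and do not come from the plan.

```python
import csv
from pathlib import Path

FIELDS = ["v1_path", "subpackage", "v2_destination", "decision"]  # hypothetical columns


def build_inventory(v1_root: Path) -> list[dict[str, str]]:
    """List every .py file under v1_root with its top-level subpackage."""
    rows = []
    for path in sorted(v1_root.rglob("*.py"), key=lambda p: p.as_posix()):
        rel = path.relative_to(v1_root)
        subpackage = rel.parts[0] if len(rel.parts) > 1 else "(root)"
        rows.append(
            {
                "v1_path": rel.as_posix(),
                "subpackage": subpackage,
                "v2_destination": "",  # filled in by hand: target path in V2
                "decision": "",        # filled in by hand: migrate / archive / skip
            }
        )
    return rows


def write_inventory(rows: list[dict[str, str]], out_csv: Path) -> None:
    """Write the inventory as CSV so it can be opened as a spreadsheet."""
    with out_csv.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```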

---

## 4. Architecture Alignment: ✅ STRONG

### 4.1 AOM Component Map Match

The V2 structure maps well to the AOM's four pillars:

| AOM Component | V2 Location | Match |
|--------------|-------------|-------|
| Pillar 1 (Philosophy) | `docs/AOM/01-Philosophy/` | ✅ |
| Pillar 2 (Operations) | `docs/AOM/02-Operations/` | ✅ |
| Pillar 3 (Developer) | `docs/AOM/03-Developer/` | ✅ |
| Pillar 4 (Knowledge) | `docs/AOM/04-Knowledge/` | ✅ |
| Contracts | `atomizer/contracts/` | ✅ Matches AOM 03-Developer/08-Data-Contracts |
| Processors | `atomizer/processors/` | ✅ Matches AOM 03-Developer/09-Processor-Development |
| Orchestrator | `atomizer/orchestrator/` | ✅ Matches AOM 01-Philosophy/08-Tool-Agnostic |
| Extractors | `atomizer/extractors/` | ✅ Matches AOM 02-Operations/04-Extractor-Library |
| Protocols | `docs/protocols/` | ✅ Matches AOM 02-Operations/02-Protocol-Reference |

### 4.2 Minor Misalignments

| Issue | Severity |
|-------|----------|
| AOM has `Audit/` folder (2 docs); plan places it under `docs/AOM/Audit/` ✅ | None |
| AOM Phase 4/5 docs (CLAUDE-v2, Living-Document-Protocol) need explicit V2 homes; plan addresses this in Section 4.4 ✅ | None |
| MCP servers are in V2 repo as `mcp_servers/` but AOM 03-Developer/10 suggests they could be separate repos | 🟢 Minor, decide later |

---

## 5. Data Safety: 🟡 NEEDS ATTENTION

### 5.1 .gitignore Assessment

**Good coverage for:**

- NX/solver binary files (`.sim`, `.prt`, `.fem`, `.bdf`, `.op2`, `.f06`, `.frd`)
- Python artifacts
- IDE files
- Study data directory

**Missing patterns:**

| Pattern | Risk | Recommendation |
|---------|------|---------------|
| `*.backup` / `*.bak` | Backup files could leak | Add `*.bak` and `*.backup` |
| `*.csv` | Large result CSVs from studies | Add or use `studies/` containment |
| `*.png` / `*.jpg` in study dirs | Iteration screenshots, contour plots | Covered by `studies/` gitignore ✅ |
| `*.sqlite` / `*.sqlite3` | Optuna databases | Add explicitly (`.db` covers some but not all) |
| `research_sessions/` | Knowledge base research data | Clarify if tracked |
| `*.jsonl` | Session insights grow unbounded | Clarify: should `knowledge/session_insights/*.jsonl` be tracked? |
| `*.whl` | Wheel files | Add |
| `*.tar.gz` / `*.zip` | Archives in `tools/` | Not currently present; add as a preventive measure |

### 5.2 Large File Risk

The plan correctly excludes `projects/` (99GB), `atomizer_field_training_data/` (68MB), and `tools/` (462MB). The size of V1 `tools/` is unexplained: 462MB is far more than a directory of scripts should occupy.

**Action item:** Investigate what in V1 `tools/` accounts for the 462MB. The plan lists it as "Large tool archives", and these could contaminate V2 if `tools/` is ported carelessly.

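One quick way to answer the size question is to total file sizes per top-level entry of `tools/` and sort the result. The sketch below is an illustration only; the function name `size_by_top_level` and the `"(files in root)"` bucket label are assumptions introduced here.

```python
from collections import defaultdict
from pathlib import Path


def size_by_top_level(root: Path) -> list[tuple[str, int]]:
    """Sum file sizes (bytes) per first-level directory under root, largest first."""
    totals: dict[str, int] = defaultdict(int)
    for path in root.rglob("*"):
        if path.is_file():
            rel = path.relative_to(root)
            # Files directly in root are grouped under one synthetic bucket.
            key = rel.parts[0] if len(rel.parts) > 1 else "(files in root)"
            totals[key] += path.stat().st_size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling `size_by_top_level(Path("tools"))` on the V1 checkout should show at a glance whether the bulk sits in a few archive directories, which would make the per-file migrate/archive/replace decision from Section 1.4 easier.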
### 5.3 Success Criterion #9

> "No file larger than 1MB in git history (excluding initial dashboard assets)"

This is good but needs enforcement. **Recommendation:** Add a pre-commit hook or CI check that rejects files >1MB.

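A size check of this kind can be a small script that receives candidate file paths as arguments and returns a non-zero exit code when any of them exceeds the limit. The sketch below assumes the hook runner passes staged file paths on the command line and interprets 1MB as 1,000,000 bytes; both are assumptions, and the function names are hypothetical. The exemption for initial dashboard assets is not modeled.

```python
import sys
from pathlib import Path

MAX_BYTES = 1_000_000  # "1MB" from success criterion #9; decimal MB is an assumption


def oversized(paths: list[str], limit: int = MAX_BYTES) -> list[tuple[str, int]]:
    """Return (path, size) for each existing file larger than limit bytes."""
    offenders = []
    for name in paths:
        path = Path(name)
        if path.is_file():  # deleted or missing paths are ignored
            size = path.stat().st_size
            if size > limit:
                offenders.append((name, size))
    return offenders


def main(argv: list[str]) -> int:
    """Print offenders to stderr; return 1 if any file is too large, else 0."""
    offenders = oversized(argv)
    for name, size in offenders:
        print(f"{name}: {size} bytes exceeds limit of {MAX_BYTES}", file=sys.stderr)
    return 1 if offenders else 0


# In the hook entry point, call: sys.exit(main(sys.argv[1:]))
```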

---

## 6. Backward Compatibility: 🟡 RISKS EXIST

### 6.1 AtomizerSpec v2→v3 Migration

The plan mentions `atomizer/spec/migrator.py` for v2.0→v3.0 migration. This is critical.

**Key question:** What happens when a V1 `atomizer_spec.json` is loaded?

- V1 specs have no `toolchain` section → must default to `NX/NX mesher/Nastran`
- V1 specs use `optimization_engine.*` import paths in custom hooks → must still work
- V1 specs may reference absolute paths on dalidou → need path translation

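The first bullet above, defaulting a missing `toolchain` section, might look roughly like the following. This is only a sketch: the key names inside `toolchain` (`cad`, `mesher`, `solver`) and the function name `migrate_spec` are hypothetical, since the actual v3.0 schema is not reproduced in this audit.

```python
import copy

# Hypothetical key names; only the default values (NX / NX mesher / Nastran)
# come from the audit text. The real v3.0 schema may be structured differently.
DEFAULT_TOOLCHAIN = {"cad": "NX", "mesher": "NX mesher", "solver": "Nastran"}


def migrate_spec(spec: dict) -> dict:
    """Return a copy of the spec with a default toolchain added when absent.

    The input dict is never mutated, so a failed migration cannot corrupt
    the in-memory V1 spec.
    """
    migrated = copy.deepcopy(spec)
    migrated.setdefault("toolchain", dict(DEFAULT_TOOLCHAIN))
    return migrated
```

Leaving an existing `toolchain` section untouched keeps the function idempotent, so running the migrator twice on the same file would be harmless.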
### 6.2 V1 Study Continuation

Can a V2 installation continue an in-progress V1 study?

- Optuna DB: needs same database path or migration
- Study state: `optimization_engine/study/state.py` tracks progress and needs porting
- Iteration results: stored in `studies/*/`, so they are path-dependent

**The plan doesn't address mid-study migration.** This may be acceptable if all V1 studies are completed before migration, but this should be an explicit decision.

### 6.3 Import Path Compatibility

The plan says "find-replace `optimization_engine.` → `atomizer.`" but:

- V1 custom hooks may import from `optimization_engine.*`
- User-created study scripts import V1 paths
- NX journals may import from V1 paths

**Recommendation:** Consider a compatibility shim:

```python
# optimization_engine/__init__.py (temporary)
# Note: this re-exports top-level names only; submodule imports such as
# `optimization_engine.core.runner` would still need their own aliases.
import warnings

warnings.warn("optimization_engine is deprecated, use atomizer", DeprecationWarning, stacklevel=2)
from atomizer import *  # noqa: F401,F403
```

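If old submodule paths such as `optimization_engine.core.runner` also need to keep working, one possible approach is to register every module of the new package under the old name in `sys.modules`. The sketch below is an assumption-laden illustration, not the plan's design: `alias_package` is a hypothetical helper, it only maps paths that are identical apart from the package name (so modules that move during the restructure need explicit handling), and it imports every submodule eagerly, which can be slow or fail when optional dependencies such as PyTorch are absent. Modules that fail to import are skipped.

```python
import importlib
import pkgutil
import sys
import warnings


def alias_package(old: str, new: str) -> list[str]:
    """Make `import old.x.y` resolve to the already-imported `new.x.y`.

    Returns the list of alias names registered in sys.modules.
    """
    warnings.warn(f"{old} is deprecated, use {new}", DeprecationWarning, stacklevel=2)
    package = importlib.import_module(new)
    sys.modules[old] = package
    aliased = [old]
    for info in pkgutil.walk_packages(package.__path__, prefix=new + "."):
        try:
            module = importlib.import_module(info.name)
        except ImportError:
            continue  # leave unimportable modules without an alias
        alias = old + info.name[len(new):]
        sys.modules[alias] = module
        aliased.append(alias)
    return aliased


# Intended use (assumption): call alias_package("optimization_engine", "atomizer")
# from the temporary optimization_engine/__init__.py during the transition period.
```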

---

## 7. Gaps: What Hasn't Been Considered

### 7.1 🔴 No Rollback Plan

If V2 migration fails at Phase 3, what's the recovery? V1 is still there (not archived until Phase 6), but there's no documented rollback procedure.

### 7.2 🟡 No Migration Verification Checklist

The "Success Criteria" (Section 13) are end-state checks. There's no per-phase verification that catches issues early. Each phase needs explicit "done when" criteria with test commands.

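A per-phase "done when" check can start as simply as verifying that the modules a phase is supposed to deliver actually import. The following is a minimal sketch; `check_imports` is a hypothetical helper, and which modules belong to which phase is something the plan would have to define.

```python
import importlib


def check_imports(module_names: list[str]) -> dict[str, str]:
    """Try to import each module; return {name: error} for those that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # any import-time error counts as a failure
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

As a usage example, a Phase 1 gate might call `check_imports(["atomizer.contracts", "atomizer.extractors", "atomizer.spec"])` and require an empty result before Phase 2 starts. The phase-to-module mapping in that call is illustrative only; the module names are taken from the V2 locations cited in this audit.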
### 7.3 🟡 Environment/Dependencies

- V1 uses `requirements.txt` + conda (`atomizer` env). V2 uses `pyproject.toml`.
- How are V1 dependencies captured? Is there a `pip freeze` of the working V1 environment?
- PyTorch + torch-geometric (for GNN) are notoriously version-sensitive. Pin versions.

### 7.4 🟡 Windows Path Handling

V1 was developed on Windows (NX is Windows-only). V2 development is on Linux. Cross-platform path handling (`pathlib.Path` vs string paths) needs systematic review, not just "update Windows paths in NX processor (if needed)."

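To illustrate why string handling is fragile here, `pathlib.PureWindowsPath` can parse a Windows path on any operating system, which makes it possible to translate stored V1 paths on a Linux host. This is a hedged sketch: `translate_windows_path` is a hypothetical helper, and the root directories in the usage note are made-up examples rather than actual dalidou paths.

```python
from pathlib import PurePosixPath, PureWindowsPath


def translate_windows_path(raw: str, old_root: str, new_root: str) -> str:
    """Map a Windows absolute path under old_root onto the POSIX new_root.

    PureWindowsPath accepts both backslashes and forward slashes and compares
    case-insensitively, matching Windows semantics even when run on Linux.
    Raises ValueError if raw is not located under old_root.
    """
    relative = PureWindowsPath(raw).relative_to(PureWindowsPath(old_root))
    return str(PurePosixPath(new_root).joinpath(*relative.parts))
```

For example, `translate_windows_path(r"C:\Atomizer\studies\bracket\spec.json", r"C:\Atomizer", "/srv/atomizer")` returns `"/srv/atomizer/studies/bracket/spec.json"`. The same helper could serve the path-translation need noted for V1 specs in Section 6.1.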
### 7.5 🟢 Documentation for `config/` Migration

V1 has `config/nx_config.json.template` and `config/optimization_config_template.json`. These aren't mentioned in the migration plan. They should map to either V2's `atomizer/spec/` or `.env.example`.

### 7.6 🟢 `optimization_engine/schemas/` Contents

The plan says "Port schemas" but doesn't inventory what's in this directory. Should be checked.

### 7.7 🟢 Feature Registry

V1 has `optimization_engine/feature_registry.json`. Not mentioned in migration plan.

---

## Summary Scorecard

| Criteria | Grade | Notes |
|----------|-------|-------|
| **Completeness** | 🟡 C+ | ~60 of ~150 V1 files explicitly mapped (see 3.2). 8+ subpackages missing. |
| **Risk Assessment** | 🟡 B- | Good risks identified, but `utils/`, `context/`, `study/` omissions are high-risk |
| **Feasibility** | 🟡 B- | 8 days → realistically 11-13 days |
| **Architecture Alignment** | ✅ A | Excellent match to AOM Component Map |
| **Data Safety** | 🟡 B | Solid .gitignore but missing some patterns; needs pre-commit hook |
| **Backward Compatibility** | 🟡 B- | Spec migration planned but mid-study migration and import shims not addressed |
| **Overall** | 🟡 B- | Strong vision, solid architecture, but execution plan has dangerous gaps in file inventory |

---

## Recommendations (Priority Ordered)

1. **🔴 IMMEDIATE: Create complete file inventory.** Map every V1 `.py` file to a V2 destination or an explicit skip. ~2 hours, saves days. (`find optimization_engine -name "*.py" | sort` → spreadsheet with a V2 destination column)

2. **🔴 Add missing modules to migration table:**
   - `context/` → `atomizer/context/` or merge into `optimization/`
   - `study/` → `atomizer/study/` (this is P0, not optional)
   - `utils/` → `atomizer/utils/` (infrastructure everything depends on)
   - `plugins/` → merge with `hooks/` or separate
   - `validation/` → merge with `spec/validator.py`
   - `intake/` → `atomizer/intake/` or merge into `interview/`

3. **🟡 Extend timeline to 12 days** or explicitly reduce scope (e.g., "Phase 1 ports only the minimum for NX workflow; remaining modules in Phase 2")

4. **🟡 Add per-phase verification commands** (not just end-state criteria)

5. **🟡 Add rollback procedure** to Section 11

6. **🟡 Pin dependency versions** in pyproject.toml (especially PyTorch, torch-geometric)

7. **🟡 Add pre-commit hook** for file size enforcement (>1MB rejection)

8. **🟢 Consider import compatibility shim** for transition period

9. **🟢 Investigate V1 `tools/` size** (462MB; what's in there?)

10. **🟢 Decide on `.jsonl` tracking:** knowledge base files should probably be tracked, session data should not

---

*This is a strong plan with the right vision and principles. The architecture alignment is excellent. The gaps are execution-level and fixable before work begins. Fixing them now prevents the "oh wait, where does this module go?" problem that derails migrations mid-stream.*

*Auditor 🔍, 2026-02-22*