V2 Migration Master Plan — Audit Report
Auditor: Auditor Agent 🔍
Date: 2026-02-22
Document Reviewed: ATOMIZER-V2-MIGRATION-MASTERPLAN.md
Verdict: 🟡 MAJOR issues found — plan is strong but has significant gaps that will cause problems during execution
1. Completeness — 🔴 CRITICAL GAPS
1.1 Missing V1 Modules (Not Accounted For)
The migration plan lists modules to port but misses at least 8 significant V1 subpackages:
| V1 Module | Files | Purpose | Impact if Missed |
|---|---|---|---|
| optimization_engine/context/ | 7 files | Session state, compaction, feedback loop, playbook, reflector | 🔴 Core runtime functionality; sessions won't persist state |
| optimization_engine/study/ | 8 files | Study creator, wizard, continuation, reset, benchmarking, state, history | 🔴 Can't create or manage studies without this |
| optimization_engine/utils/ | 12 files | Logger, dashboard_db, trial_manager, NX file discovery, study archiver, realtime tracking | 🔴 Infrastructure that everything depends on |
| optimization_engine/plugins/ | 4 files | hook_manager, hooks, validators (DIFFERENT from hooks/) | 🟡 Plugin system won't work |
| optimization_engine/intake/ | 3 files | Config intake, context intake, processor | 🟡 Study intake pipeline broken |
| optimization_engine/validation/ | 3 files | checker.py, gate.py (DIFFERENT from validators/) | 🟡 Validation gates lost |
| optimization_engine/model_discovery/ | 2 files | NX model introspection | 🟡 Model discovery capability lost |
| optimization_engine/devloop/ | 7 files | Analyzer, orchestrator, planning, test_runner, browser scenarios | 🟢 DevLoop was planned for tools/devloop_cli.py but the full subpackage has 7 files |
| optimization_engine/processors/ | 2 files | adaptive_characterization.py | 🟡 V1 already has a processors/ concept |
| optimization_engine/future/ | 11 files | Research agents, LLM workflow analyzer, step classifier | 🟢 May be intentionally excluded, but not listed in "DO NOT MIGRATE" |
| optimization_engine/custom_functions/ | 2 files | NX material generator | 🟢 Utility, should be documented |
| optimization_engine/templates/ | 3 files | run_optimization_template, run_nn_optimization_template | 🟡 Template system for studies |
| optimization_engine/surrogates/ | 1 file | __init__.py (separate from gnn/) | 🟢 Minor |
1.2 Missing V1 Core Files
| V1 File | Role | Plan Status |
|---|---|---|
| optimization_engine/core/base_runner.py | Base class for runners | ❌ Not mentioned (plan only lists runner.py) |
| optimization_engine/core/gradient_optimizer.py | Gradient-based optimization | ❌ Not mentioned |
| optimization_engine/core/runner_with_neural.py | Neural-accelerated runner | ❌ Not mentioned |
| optimization_engine/core/strategy_portfolio.py | Strategy portfolio management | ❌ Not mentioned |
| optimization_engine/core/strategy_selector.py | Strategy selection (different from method_selector) | ❌ Not mentioned |
| optimization_engine/schemas/ | Schema files | ✅ Mentioned but directory contents not inventoried |
1.3 Missing V1 Root-Level Files
| File | Status |
|---|---|
| atomizer.py (25KB monolith) | Listed in "DO NOT MIGRATE" ✅ but its functionality needs a replacement |
| launch_dashboard.py | ❌ Not mentioned. How does V2 launch the dashboard? |
| requirements.txt | Replaced by pyproject.toml ✅ |
| install.bat | ❌ Not mentioned (Windows install script) |
1.4 V1 Tools Directory
The plan only mentions tools/devloop_cli.py. V1 tools/ has 25+ scripts including:
- analyze_study.py, find_best_iteration.py, archive_study.py
- create_pareto_graphs.py, generate_psd_figures.py
- Zernike-specific tools (HTML generator, WFE PSD, optical report)
- Study migration tools
Recommendation: Create an inventory of tools/ and decide per-file: migrate, archive, or replace.
2. Risk Assessment — 🟡 MAJOR
2.1 Identified Risks (Plan Section 11)
The plan's risk table is reasonable but underestimates these risks:
| Risk | Plan's Mitigation | My Assessment |
|---|---|---|
| Import breakage | Find-replace optimization_engine. → atomizer. | 🟡 Insufficient. Many V1 modules use relative and cross-module imports, and optimization_engine. references are nested (e.g., from optimization_engine.core.runner import Runner, where runner.py itself imports from optimization_engine.extractors). A mechanical find-replace will miss circular dependencies and runtime-only imports. This needs a test suite, not just sed. |
| NX integration breaks | Test on dalidou before archiving V1 | ✅ Adequate |
| .gitignore too aggressive | Test essential files | 🟡 See Data Safety section below |
2.2 Unidentified Risks
| Risk | Severity | Mitigation Needed |
|---|---|---|
| V1 utils/ dependency web: logger, trial_manager, dashboard_db are imported EVERYWHERE in V1. Where do they go in V2? | 🔴 HIGH | Create atomizer/utils/ or distribute into appropriate modules. Map ALL import dependencies before porting. |
| context/ module loss: session state, compaction, feedback loops. If not ported, studies can't resume and context is lost between runs | 🔴 HIGH | Add to migration table, decide V2 location |
| study/ module loss: study creation wizard, continuation, reset. Without this, can't create studies from V2 | 🔴 HIGH | Add to migration table as P0 |
| Optuna DB path changes: V1 studies store Optuna databases at specific paths. V2 restructure may break study continuation | 🟡 MED | Test study continuation with path remapping |
| NX journal path references: NX journals may hardcode V1 paths | 🟡 MED | Audit all journal files for hardcoded paths |
| Knowledge base .jsonl files: are these tracked in git or gitignored? They're small (212KB) but grow over time | 🟡 MED | Clarify: track in git or gitignore with backup strategy |
| Python version compatibility: pyproject.toml says >=3.10 but V1 may use patterns from 3.8/3.9 | 🟢 LOW | Test on target Python version |
3. Feasibility — 🟡 8-Day Timeline is Aggressive
3.1 Phase-by-Phase Assessment
| Phase | Planned | Realistic | Issue |
|---|---|---|---|
| Phase 0: Bootstrap + AOM | 1 day | 1.5 days | AOM link conversion for 48 docs is tedious even with a script. Needs manual QA. |
| Phase 1: Core Engine | 2 days | 3-4 days | Plan lists 13 steps but misses ~25 additional files from core/, context/, study/, utils/. Refactoring runner→engine while maintaining all runner variants (base_runner, runner_with_neural) is non-trivial. |
| Phase 2: Supporting | 2 days | 2 days | Reasonable if scope is truly "direct port" |
| Phase 3: Integration | 2 days | 3 days | Import fixes across 100+ files. This is where the missing modules will surface. |
| Phase 4: Syncthing | 1 day | 1 day | Reasonable |
| Phase 5: GitHub + CI | 1 day | 0.5 days | Straightforward |
| Phase 6: Archive V1 | 1 day | 0.5 days | Straightforward |
| Total | 8 days | 11-13 days | |
3.2 Key Bottleneck
Phase 1 is underscoped. The migration table shows 13 clean steps, but V1's optimization_engine/ has ~150 Python files across 20 subpackages. The plan only explicitly accounts for ~60 of these. The remaining ~90 files will surface during Phase 3 integration testing, causing scope creep and rework.
Recommendation: Before starting, create a complete file-level inventory mapping every V1 .py file to its V2 destination (or explicit "skip" decision). This takes ~2 hours but saves days of surprises.
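The inventory recommended above could be started with a few lines of Python. This is a minimal sketch, not part of the plan: the function name `build_inventory` and the CSV column names are assumptions chosen for illustration. It only lists files and leaves the V2 destination and the migrate/archive/skip decision blank for a human to fill in.

```python
import csv
from pathlib import Path


def build_inventory(v1_root: Path, out_csv: Path) -> int:
    """Write one CSV row per V1 .py file; return the number of files listed."""
    files = sorted(p.relative_to(v1_root).as_posix() for p in v1_root.rglob("*.py"))
    with out_csv.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        # Hypothetical columns: the last two are filled in by hand
        # (decision: migrate / archive / skip).
        writer.writerow(["v1_path", "subpackage", "v2_destination", "decision"])
        for rel in files:
            parts = rel.split("/")
            subpackage = parts[0] if len(parts) > 1 else "(root)"
            writer.writerow([rel, subpackage, "", ""])
    return len(files)
```

Calling `build_inventory(Path("optimization_engine"), Path("v1_inventory.csv"))` from the V1 repository root would produce the spreadsheet; any row whose decision column is still empty at the start of Phase 1 is an unmapped file.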
4. Architecture Alignment — ✅ STRONG
4.1 AOM Component Map Match
The V2 structure maps well to the AOM's four pillars:
| AOM Component | V2 Location | Match |
|---|---|---|
| Pillar 1 (Philosophy) | docs/AOM/01-Philosophy/ | ✅ |
| Pillar 2 (Operations) | docs/AOM/02-Operations/ | ✅ |
| Pillar 3 (Developer) | docs/AOM/03-Developer/ | ✅ |
| Pillar 4 (Knowledge) | docs/AOM/04-Knowledge/ | ✅ |
| Contracts | atomizer/contracts/ | ✅ Matches AOM 03-Developer/08-Data-Contracts |
| Processors | atomizer/processors/ | ✅ Matches AOM 03-Developer/09-Processor-Development |
| Orchestrator | atomizer/orchestrator/ | ✅ Matches AOM 01-Philosophy/08-Tool-Agnostic |
| Extractors | atomizer/extractors/ | ✅ Matches AOM 02-Operations/04-Extractor-Library |
| Protocols | docs/protocols/ | ✅ Matches AOM 02-Operations/02-Protocol-Reference |
4.2 Minor Misalignments
| Issue | Severity |
|---|---|
| AOM has Audit/ folder (2 docs); plan places it under docs/AOM/Audit/ ✅ | None |
| AOM Phase 4/5 docs (CLAUDE-v2, Living-Document-Protocol) need explicit V2 homes; plan addresses this in Section 4.4 ✅ | None |
| MCP servers are in V2 repo as mcp_servers/ but AOM 03-Developer/10 suggests they could be separate repos | 🟢 Minor; decide later |
5. Data Safety — 🟡 NEEDS ATTENTION
5.1 .gitignore Assessment
Good coverage for:
- NX/solver binary files (.sim, .prt, .fem, .bdf, .op2, .f06, .frd)
- Python artifacts
- IDE files
- Study data directory
Missing patterns:
| Pattern | Risk | Recommendation |
|---|---|---|
| *.backup / *.bak | Backup files could leak | Add *.bak and *.backup |
| *.csv | Large result CSVs from studies | Add or use studies/ containment |
| *.png / *.jpg in study dirs | Iteration screenshots, contour plots | Covered by studies/ gitignore ✅ |
| *.sqlite / *.sqlite3 | Optuna databases | Add explicitly (.db covers some but not all) |
| research_sessions/ | Knowledge base research data | Clarify if tracked |
| *.jsonl | Session insights grow unbounded | Clarify: should knowledge/session_insights/*.jsonl be tracked? |
| *.whl | Wheel files | Add |
| *.tar.gz / *.zip | Archives in tools/ | Not currently present but preventive |
5.2 Large File Risk
The plan correctly excludes projects/ (99GB), atomizer_field_training_data/ (68MB), and tools/ (462MB — wait, why is V1 tools/ 462MB?).
Action item: Investigate what's in V1 tools/ that's 462MB. The plan lists it as "Large tool archives" — these could contaminate V2 if tools/ is ported carelessly.
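One way to answer the size question is a per-child breakdown of the directory. The sketch below is illustrative only; `size_by_child` is a hypothetical helper name, and it simply sums file sizes under each direct child of whatever directory it is given.

```python
from pathlib import Path


def size_by_child(root: Path) -> list[tuple[str, int]]:
    """Return (child name, total bytes) for each direct child of root, largest first."""
    totals = []
    for child in root.iterdir():
        if child.is_file():
            size = child.stat().st_size
        else:
            # Sum every regular file below this subdirectory.
            size = sum(f.stat().st_size for f in child.rglob("*") if f.is_file())
        totals.append((child.name, size))
    return sorted(totals, key=lambda item: item[1], reverse=True)
```

Running it against the V1 tools/ directory and printing the first few entries should show whether the 462MB comes from a handful of archives or is spread across many scripts.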
5.3 Success Criterion #9
"No file larger than 1MB in git history (excluding initial dashboard assets)"
This is good but needs enforcement. Recommendation: Add a pre-commit hook or CI check that rejects files >1MB.
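A size check of this kind can be small. The following is a sketch under stated assumptions: "1MB" is read as 10**6 bytes, the names `oversized` and `main` are hypothetical, and the exemption for initial dashboard assets is not modelled. It inspects only the files it is handed, so it guards new commits rather than existing git history.

```python
from pathlib import Path

MAX_BYTES = 1_000_000  # assumption: "1MB" interpreted as 10**6 bytes


def oversized(paths: list[str], limit: int = MAX_BYTES) -> list[tuple[str, int]]:
    """Return (path, size) for every existing file larger than limit bytes."""
    found = []
    for name in paths:
        p = Path(name)
        if p.is_file() and p.stat().st_size > limit:
            found.append((name, p.stat().st_size))
    return found


def main(argv: list[str]) -> int:
    """Return a hook exit status: 0 when all files pass, 1 otherwise."""
    offenders = oversized(argv)
    for name, size in offenders:
        print(f"{name}: {size} bytes exceeds {MAX_BYTES}")
    return 1 if offenders else 0
```

The pre-commit framework passes staged file names as command-line arguments, so a hook script could end with `raise SystemExit(main(sys.argv[1:]))` after importing `sys`. The pre-commit-hooks project also ships a `check-added-large-files` hook with a `--maxkb` option, which may make a custom script unnecessary.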
6. Backward Compatibility — 🟡 RISKS EXIST
6.1 AtomizerSpec v2→v3 Migration
The plan mentions atomizer/spec/migrator.py for v2.0→v3.0 migration. This is critical.
Key question: What happens when a V1 atomizer_spec.json is loaded?
- V1 specs have no toolchain section → must default to NX/NX mesher/Nastran
- V1 specs use optimization_engine.* import paths in custom hooks → must still work
- V1 specs may reference absolute paths on dalidou → need path translation
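The three cases above can be made concrete with a small loader-side sketch. Everything about the schema here is an assumption: the key names inside `DEFAULT_TOOLCHAIN` and the function name `migrate_spec` are hypothetical, and the real atomizer/spec/migrator.py may be organised differently. The sketch fills in the default toolchain and only flags legacy import paths and absolute paths for review; it does not rewrite them.

```python
import copy
from pathlib import PurePosixPath, PureWindowsPath

# Assumption: key names are illustrative; the v3.0 schema may differ.
DEFAULT_TOOLCHAIN = {"cad": "NX", "mesher": "NX mesher", "solver": "Nastran"}
OLD_PREFIX = "optimization_engine."


def migrate_spec(spec: dict) -> tuple[dict, list[str]]:
    """Return a migrated copy of spec plus notes on values needing review."""
    out = copy.deepcopy(spec)
    notes = []
    # V1 specs have no toolchain section, so fall back to the NX/Nastran default.
    out.setdefault("toolchain", dict(DEFAULT_TOOLCHAIN))

    def walk(node, trail):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{trail}.{key}" if trail else str(key))
        elif isinstance(node, list):
            for i, value in enumerate(node):
                walk(value, f"{trail}[{i}]")
        elif isinstance(node, str):
            if node.startswith(OLD_PREFIX):
                notes.append(f"{trail}: V1 import path {node!r}")
            elif PureWindowsPath(node).is_absolute() or PurePosixPath(node).is_absolute():
                notes.append(f"{trail}: absolute path {node!r}")

    walk(out, "")
    return out, notes
```

Returning notes instead of silently rewriting values keeps the decision about each hook and path with a person, which seems appropriate while the V2 layout is still being settled.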
6.2 V1 Study Continuation
Can a V2 installation continue an in-progress V1 study?
- Optuna DB: needs same database path or migration
- Study state: optimization_engine/study/state.py tracks progress and needs porting
- Iteration results: stored in studies/*/ and path-dependent
The plan doesn't address mid-study migration. This may be acceptable if all V1 studies are completed before migration, but this should be an explicit decision.
6.3 Import Path Compatibility
The plan says "find-replace optimization_engine. → atomizer." but:
- V1 custom hooks may import from optimization_engine.*
- User-created study scripts import V1 paths
- NX journals may import from V1 paths
Recommendation: Consider a compatibility shim:
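# optimization_engine/__init__.py (temporary shim)
import warnings
# stacklevel=2 attributes the warning to the importing module, not this shim.
warnings.warn("optimization_engine is deprecated, use atomizer", DeprecationWarning, stacklevel=2)
# Re-exports top-level names only; submodule imports need separate aliasing.
from atomizer import *  # noqa: F401,F403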
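A star import alone will not make `from optimization_engine.core.runner import Runner` resolve, because Python looks up the dotted submodule name separately. One possible extension, sketched here under assumptions, registers each new module under its legacy name in `sys.modules`. The function name `alias_package` is hypothetical, and the approach imports every submodule eagerly, which may be slow or may fail for modules that need NX; those are skipped and returned.

```python
import importlib
import pkgutil
import sys


def alias_package(old: str, new: str) -> list[str]:
    """Expose package `new` and its submodules under the legacy name `old`.

    Returns the names of submodules that failed to import and were skipped.
    """
    package = importlib.import_module(new)
    sys.modules[old] = package
    skipped = []
    for info in pkgutil.walk_packages(package.__path__, prefix=new + "."):
        try:
            module = importlib.import_module(info.name)
        except ImportError:
            skipped.append(info.name)
            continue
        # Same module object under both names, so classes and state are shared.
        sys.modules[old + info.name[len(new):]] = module
    return skipped
```

With `alias_package("optimization_engine", "atomizer")` run once at startup, old and new import paths return the same module objects, so `isinstance` checks and module-level state stay consistent during the transition.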
7. Gaps — What Hasn't Been Considered
7.1 🔴 No Rollback Plan
If V2 migration fails at Phase 3, what's the recovery? V1 is still there (not archived until Phase 6), but there's no documented rollback procedure.
7.2 🟡 No Migration Verification Checklist
The "Success Criteria" (Section 13) are end-state checks. There's no per-phase verification that catches issues early. Each phase needs explicit "done when" criteria with test commands.
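As one example of a per-phase "done when" check, an import smoke test can run at the end of every phase. This is a sketch, not a prescribed test: `import_failures` is a hypothetical name, and importing a module proves only that it loads, not that it behaves correctly.

```python
import importlib
import pkgutil


def import_failures(package_name: str) -> dict[str, str]:
    """Import every module under package_name; map failing names to the error."""
    failures = {}
    package = importlib.import_module(package_name)
    walker = pkgutil.walk_packages(
        package.__path__,
        prefix=package_name + ".",
        onerror=lambda name: None,  # keep walking; failures are recorded below
    )
    for info in walker:
        try:
            importlib.import_module(info.name)
        except Exception as exc:  # any error raised at import time is a finding
            failures[info.name] = f"{type(exc).__name__}: {exc}"
    return failures
```

A phase could then be declared done only when `import_failures("atomizer")` returns an empty dict on the target machine, which would surface missing utils/, context/ or study/ dependencies at the phase where they are introduced.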
7.3 🟡 Environment/Dependencies
- V1 uses requirements.txt + conda (atomizer env). V2 uses pyproject.toml.
- How are V1 dependencies captured? Is there a pip freeze of the working V1 environment?
- PyTorch + torch-geometric (for GNN) are notoriously version-sensitive. Pin versions.
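If no freeze of the working V1 environment exists, one can be captured from inside that environment with the standard library. The sketch below is an approximation of `pip freeze`, not a replacement: `frozen_requirements` is a hypothetical name, and packages that conda installed without Python metadata will not appear.

```python
from importlib.metadata import distributions


def frozen_requirements() -> list[str]:
    """Return 'name==version' for every installed distribution, sorted by name."""
    pins = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs with missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)
```

Writing the result to a file before migration starts gives a reference list from which the PyTorch and torch-geometric pins can be copied.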
7.4 🟡 Windows Path Handling
V1 was developed on Windows (NX is Windows-only). V2 development is on Linux. Cross-platform path handling (pathlib.Path vs string paths) needs systematic review, not just "update Windows paths in NX processor (if needed)."
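For stored Windows paths that must be read on Linux, `pathlib.PureWindowsPath` parses them correctly on any platform. The sketch below assumes a per-drive mapping supplied by the caller; the function name `translate_windows_path`, the mapping, and the example paths in the usage note are all hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def translate_windows_path(raw: str, drive_roots: dict[str, str]) -> PurePosixPath:
    """Map a Windows absolute path onto a POSIX root chosen per drive letter."""
    win = PureWindowsPath(raw)
    if not win.is_absolute():
        # Relative paths only need their separators normalised.
        return PurePosixPath(win.as_posix())
    root = drive_roots[win.drive.upper()]
    # parts[0] is the anchor (for example "C:\\"); the rest are plain components.
    return PurePosixPath(root, *win.parts[1:])
```

For instance, `translate_windows_path(r"C:\Atomizer\studies\bracket\model.prt", {"C:": "/mnt/atomizer"})` yields `/mnt/atomizer/Atomizer/studies/bracket/model.prt`. An unmapped drive raises `KeyError`, which is deliberate: a silent guess at the mount point would hide exactly the class of bug this section warns about.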
7.5 🟢 Documentation for config/ Migration
V1 has config/nx_config.json.template and config/optimization_config_template.json. These aren't mentioned in the migration plan. They should either map to V2's atomizer/spec/ or .env.example.
7.6 🟢 optimization_engine/schemas/ Contents
The plan says "Port schemas" but doesn't inventory what's in this directory. Should be checked.
7.7 🟢 Feature Registry
V1 has optimization_engine/feature_registry.json. Not mentioned in migration plan.
Summary Scorecard
| Criteria | Grade | Notes |
|---|---|---|
| Completeness | 🟡 C+ | Only ~60 of ~150 V1 files (~40%) explicitly mapped. 8+ subpackages missing. |
| Risk Assessment | 🟡 B- | Good risks identified, but utils/, context/, study/ omissions are high-risk |
| Feasibility | 🟡 B- | 8 days → realistically 11-13 days |
| Architecture Alignment | ✅ A | Excellent match to AOM Component Map |
| Data Safety | 🟡 B | Solid .gitignore but missing some patterns; needs pre-commit hook |
| Backward Compatibility | 🟡 B- | Spec migration planned but mid-study and import shims not addressed |
| Overall | 🟡 B- | Strong vision, solid architecture, but execution plan has dangerous gaps in file inventory |
Recommendations (Priority Ordered)
- 🔴 IMMEDIATE: Create complete file inventory. Map every V1 .py file to V2 destination or explicit skip. ~2 hours, saves days. (find optimization_engine -name "*.py" | sort → spreadsheet with V2 destination column)
- 🔴 Add missing modules to migration table:
  - context/ → atomizer/context/ or merge into optimization/
  - study/ → atomizer/study/ (this is P0, not optional)
  - utils/ → atomizer/utils/ (infrastructure everything depends on)
  - plugins/ → merge with hooks/ or separate
  - validation/ → merge with spec/validator.py
  - intake/ → atomizer/intake/ or merge into interview/
- 🟡 Extend timeline to 12 days or explicitly reduce scope (e.g., "Phase 1 ports only the minimum for NX workflow; remaining modules in Phase 2")
- 🟡 Add per-phase verification commands (not just end-state criteria)
- 🟡 Add rollback procedure to Section 11
- 🟡 Pin dependency versions in pyproject.toml (especially PyTorch, torch-geometric)
- 🟡 Add pre-commit hook for file size enforcement (>1MB rejection)
- 🟢 Consider import compatibility shim for transition period
- 🟢 Investigate V1 tools/ size (462MB; what's in there?)
- 🟢 Decide on .jsonl tracking: knowledge base files should probably be tracked, session data should not
This is a strong plan with the right vision and principles. The architecture alignment is excellent. The gaps are execution-level — they're fixable before work begins. Fixing them now prevents the "oh wait, where does this module go?" problem that derails migrations mid-stream.
— Auditor 🔍, 2026-02-22