# V2 Migration Master Plan: Audit Report

**Auditor:** Auditor Agent 🔍

**Date:** 2026-02-22

**Document Reviewed:** `ATOMIZER-V2-MIGRATION-MASTERPLAN.md`

**Verdict:** 🟡 MAJOR issues found. The plan is strong but has significant gaps that will cause problems during execution.

---

## 1. Completeness: 🔴 CRITICAL GAPS

### 1.1 Missing V1 Modules (Not Accounted For)

The migration plan lists modules to port but **misses at least 8 significant V1 subpackages**:

| V1 Module | Files | Purpose | Impact if Missed |
|-----------|-------|---------|-----------------|
| `optimization_engine/context/` | 7 files | Session state, compaction, feedback loop, playbook, reflector | 🔴 Core runtime functionality: sessions won't persist state |
| `optimization_engine/study/` | 8 files | Study creator, wizard, continuation, reset, benchmarking, state, history | 🔴 Can't create or manage studies without this |
| `optimization_engine/utils/` | 12 files | Logger, dashboard_db, trial_manager, NX file discovery, study archiver, realtime tracking | 🔴 Infrastructure that everything depends on |
| `optimization_engine/plugins/` | 4 files | hook_manager, hooks, validators (DIFFERENT from `hooks/`) | 🟡 Plugin system won't work |
| `optimization_engine/intake/` | 3 files | Config intake, context intake, processor | 🟡 Study intake pipeline broken |
| `optimization_engine/validation/` | 3 files | checker.py, gate.py (DIFFERENT from `validators/`) | 🟡 Validation gates lost |
| `optimization_engine/model_discovery/` | 2 files | NX model introspection | 🟡 Model discovery capability lost |
| `optimization_engine/devloop/` | 7 files | Analyzer, orchestrator, planning, test_runner, browser scenarios | 🟢 DevLoop was planned for `tools/devloop_cli.py` but the full subpackage has 7 files |
| `optimization_engine/processors/` | 2 files | adaptive_characterization.py | 🟡 V1 already has a `processors/` concept |
| `optimization_engine/future/` | 11 files | Research agents, LLM workflow analyzer, step classifier | 🟢 May be intentionally excluded, but not listed in "DO NOT MIGRATE" |
| `optimization_engine/custom_functions/` | 2 files | NX material generator | 🟢 Utility, should be documented |
| `optimization_engine/templates/` | 3 files | run_optimization_template, run_nn_optimization_template | 🟡 Template system for studies |
| `optimization_engine/surrogates/` | 1 file | `__init__.py` (separate from `gnn/`) | 🟢 Minor |

### 1.2 Missing V1 Core Files

| V1 File | Role | Plan Status |
|---------|------|-------------|
| `optimization_engine/core/base_runner.py` | Base class for runners | ❌ Not mentioned (plan only lists runner.py) |
| `optimization_engine/core/gradient_optimizer.py` | Gradient-based optimization | ❌ Not mentioned |
| `optimization_engine/core/runner_with_neural.py` | Neural-accelerated runner | ❌ Not mentioned |
| `optimization_engine/core/strategy_portfolio.py` | Strategy portfolio management | ❌ Not mentioned |
| `optimization_engine/core/strategy_selector.py` | Strategy selection (different from method_selector) | ❌ Not mentioned |
| `optimization_engine/schemas/` | Schema files | ✅ Mentioned but directory contents not inventoried |

### 1.3 Missing V1 Root-Level Files

| File | Status |
|------|--------|
| `atomizer.py` (25KB monolith) | Listed in "DO NOT MIGRATE" ✅ but its functionality needs a replacement |
| `launch_dashboard.py` | ❌ Not mentioned. How does V2 launch the dashboard? |
| `requirements.txt` | Replaced by pyproject.toml ✅ |
| `install.bat` | ❌ Not mentioned (Windows install script) |

### 1.4 V1 Tools Directory

The plan only mentions `tools/devloop_cli.py`. V1 `tools/` has **25+ scripts** including:

- `analyze_study.py`, `find_best_iteration.py`, `archive_study.py`
- `create_pareto_graphs.py`, `generate_psd_figures.py`
- Zernike-specific tools (HTML generator, WFE PSD, optical report)
- Study migration tools

**Recommendation:** Create an inventory of `tools/` and decide per-file: migrate, archive, or replace.

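As a starting point for that inventory, here is a minimal sketch that walks a directory and writes one CSV row per file, leaving the decision columns blank for a human to fill in. The function names, the column names (`decision`, `v2_destination`), and the output file name are illustrative assumptions, not part of the plan or of V1.

```python
import csv
from pathlib import Path

# Column names are hypothetical; adjust to whatever the team's spreadsheet uses.
FIELDS = ["v1_path", "size_kb", "decision", "v2_destination"]


def build_inventory(root: Path, pattern: str = "*.py") -> list[dict[str, str]]:
    """Return one row per file under root matching pattern, sorted by path."""
    rows = []
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        rows.append({
            "v1_path": path.relative_to(root).as_posix(),
            "size_kb": f"{path.stat().st_size / 1024:.1f}",
            "decision": "",        # filled in by hand: migrate / archive / replace
            "v2_destination": "",  # filled in by hand
        })
    return rows


def write_inventory(rows: list[dict[str, str]], out_file: Path) -> None:
    """Write the inventory rows to a CSV file with a header line."""
    with out_file.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

For `tools/`, calling `write_inventory(build_inventory(Path("tools"), pattern="*"), Path("tools_inventory.csv"))` would also list non-Python files, which matters given the directory's reported size. The same helper with the default pattern could cover `optimization_engine/`.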

---

## 2. Risk Assessment: 🟡 MAJOR

### 2.1 Identified Risks (Plan Section 11)

The plan's risk table is reasonable but **underestimates these risks:**

| Risk | Plan's Mitigation | My Assessment |
|------|-------------------|---------------|
| Import breakage | Find-replace `optimization_engine.` → `atomizer.` | 🟡 **Insufficient.** Many V1 modules use relative imports, cross-module imports, and `optimization_engine.` is nested (e.g., `from optimization_engine.core.runner import Runner` where `runner.py` imports from `optimization_engine.extractors`). A mechanical find-replace will miss circular dependencies and runtime-only imports. Need a test suite, not just sed. |
| NX integration breaks | Test on dalidou before archiving V1 | ✅ Adequate |
| `.gitignore` too aggressive | Test essential files | 🟡 See Data Safety section below |

### 2.2 Unidentified Risks

| Risk | Severity | Mitigation Needed |
|------|----------|-------------------|
| **V1 `utils/` dependency web**: logger, trial_manager, dashboard_db are imported EVERYWHERE in V1. Where do they go in V2? | 🔴 HIGH | Create `atomizer/utils/` or distribute into appropriate modules. Map ALL import dependencies before porting. |
| **`context/` module loss**: session state, compaction, feedback loops. If not ported, studies can't resume and context is lost between runs | 🔴 HIGH | Add to migration table, decide V2 location |
| **`study/` module loss**: study creation wizard, continuation, reset. Without this, can't create studies from V2 | 🔴 HIGH | Add to migration table as P0 |
| **Optuna DB path changes**: V1 studies store Optuna databases at specific paths. V2 restructure may break study continuation | 🟡 MED | Test study continuation with path remapping |
| **NX journal path references**: NX journals may hardcode V1 paths | 🟡 MED | Audit all journal files for hardcoded paths |
| **Knowledge base `.jsonl` files**: are these tracked in git or gitignored? They're small (212KB) but grow over time | 🟡 MED | Clarify: track in git or gitignore with backup strategy |
| **Python version compatibility**: pyproject.toml says `>=3.10` but V1 may use patterns from 3.8/3.9 | 🟢 LOW | Test on target Python version |

---

## 3. Feasibility: 🟡 8-Day Timeline is Aggressive

### 3.1 Phase-by-Phase Assessment

| Phase | Planned | Realistic | Issue |
|-------|---------|-----------|-------|
| Phase 0: Bootstrap + AOM | 1 day | 1.5 days | AOM link conversion for 48 docs is tedious even with a script. Needs manual QA. |
| Phase 1: Core Engine | 2 days | 3-4 days | **Plan lists 13 steps but misses ~25 additional files** from `core/`, `context/`, `study/`, `utils/`. Refactoring runner→engine while maintaining all runner variants (base_runner, runner_with_neural) is non-trivial. |
| Phase 2: Supporting | 2 days | 2 days | Reasonable if scope is truly "direct port" |
| Phase 3: Integration | 2 days | 3 days | Import fixes across 100+ files. This is where the missing modules will surface. |
| Phase 4: Syncthing | 1 day | 1 day | Reasonable |
| Phase 5: GitHub + CI | 1 day | 0.5 days | Straightforward |
| Phase 6: Archive V1 | 1 day | 0.5 days | Straightforward |
| **Total** | **8 days** | **11-13 days** | |

### 3.2 Key Bottleneck

**Phase 1 is underscoped.** The migration table shows 13 clean steps, but V1's `optimization_engine/` has **~150 Python files across 20 subpackages**. The plan only explicitly accounts for ~60 of these. The remaining ~90 files will surface during Phase 3 integration testing, causing scope creep and rework.

**Recommendation:** Before starting, create a complete file-level inventory mapping every V1 `.py` file to its V2 destination (or explicit "skip" decision). This takes ~2 hours but saves days of surprises.

---

## 4. Architecture Alignment: ✅ STRONG

### 4.1 AOM Component Map Match

The V2 structure maps well to the AOM's four pillars:

| AOM Component | V2 Location | Match |
|--------------|-------------|-------|
| Pillar 1 (Philosophy) | `docs/AOM/01-Philosophy/` | ✅ |
| Pillar 2 (Operations) | `docs/AOM/02-Operations/` | ✅ |
| Pillar 3 (Developer) | `docs/AOM/03-Developer/` | ✅ |
| Pillar 4 (Knowledge) | `docs/AOM/04-Knowledge/` | ✅ |
| Contracts | `atomizer/contracts/` | ✅ Matches AOM 03-Developer/08-Data-Contracts |
| Processors | `atomizer/processors/` | ✅ Matches AOM 03-Developer/09-Processor-Development |
| Orchestrator | `atomizer/orchestrator/` | ✅ Matches AOM 01-Philosophy/08-Tool-Agnostic |
| Extractors | `atomizer/extractors/` | ✅ Matches AOM 02-Operations/04-Extractor-Library |
| Protocols | `docs/protocols/` | ✅ Matches AOM 02-Operations/02-Protocol-Reference |

### 4.2 Minor Misalignments

| Issue | Severity |
|-------|----------|
| AOM has `Audit/` folder (2 docs); plan places it under `docs/AOM/Audit/` ✅ | None |
| AOM Phase 4/5 docs (CLAUDE-v2, Living-Document-Protocol) need explicit V2 homes; plan addresses this in Section 4.4 ✅ | None |
| MCP servers are in V2 repo as `mcp_servers/` but AOM 03-Developer/10 suggests they could be separate repos | 🟢 Minor, decide later |

---

## 5. Data Safety: 🟡 NEEDS ATTENTION

### 5.1 .gitignore Assessment

**Good coverage for:**

- NX/solver binary files (`.sim`, `.prt`, `.fem`, `.bdf`, `.op2`, `.f06`, `.frd`)
- Python artifacts
- IDE files
- Study data directory

**Missing patterns:**

| Pattern | Risk | Recommendation |
|---------|------|---------------|
| `*.backup` / `*.bak` | Backup files could leak | Add `*.bak` and `*.backup` |
| `*.csv` | Large result CSVs from studies | Add or use `studies/` containment |
| `*.png` / `*.jpg` in study dirs | Iteration screenshots, contour plots | Covered by `studies/` gitignore ✅ |
| `*.sqlite` / `*.sqlite3` | Optuna databases | Add explicitly (`.db` covers some but not all) |
| `research_sessions/` | Knowledge base research data | Clarify if tracked |
| `*.jsonl` | Session insights grow unbounded | Clarify: should `knowledge/session_insights/*.jsonl` be tracked? |
| `*.whl` | Wheel files | Add |
| `*.tar.gz` / `*.zip` | Archives in tools/ | Not currently present but preventive |

### 5.2 Large File Risk

The plan correctly excludes `projects/` (99GB), `atomizer_field_training_data/` (68MB), and `tools/` (462MB; wait, why is V1 `tools/` 462MB?).

**Action item:** Investigate what's in V1 `tools/` that's 462MB. The plan lists it as "Large tool archives"; these could contaminate V2 if `tools/` is ported carelessly.

### 5.3 Success Criterion #9

> "No file larger than 1MB in git history (excluding initial dashboard assets)"

This is good but needs enforcement. **Recommendation:** Add a pre-commit hook or CI check that rejects files >1MB.

---

## 6. Backward Compatibility: 🟡 RISKS EXIST

### 6.1 AtomizerSpec v2→v3 Migration

The plan mentions `atomizer/spec/migrator.py` for v2.0→v3.0 migration. This is critical.

**Key question:** What happens when a V1 `atomizer_spec.json` is loaded?

- V1 specs have no `toolchain` section → must default to `NX/NX mesher/Nastran`
- V1 specs use `optimization_engine.*` import paths in custom hooks → must still work
- V1 specs may reference absolute paths on dalidou → need path translation

### 6.2 V1 Study Continuation

Can a V2 installation continue an in-progress V1 study?

- Optuna DB: needs same database path or migration
- Study state: `optimization_engine/study/state.py` tracks progress and needs porting
- Iteration results: stored in `studies/*/`, so they are path-dependent

**The plan doesn't address mid-study migration.** This may be acceptable if all V1 studies are completed before migration, but this should be an explicit decision.

### 6.3 Import Path Compatibility

The plan says "find-replace `optimization_engine.` → `atomizer.`" but:

- V1 custom hooks may import from `optimization_engine.*`
- User-created study scripts import V1 paths
- NX journals may import from V1 paths

**Recommendation:** Consider a compatibility shim. Note that a star import only re-exports top-level names, so deep imports such as `optimization_engine.core.runner` would need additional handling:

```python
# optimization_engine/__init__.py (temporary)
import warnings

# stacklevel=2 attributes the warning to the module doing the import.
warnings.warn("optimization_engine is deprecated, use atomizer", DeprecationWarning, stacklevel=2)
# Re-exports top-level names only; submodule imports are not covered.
from atomizer import *  # noqa: F401,F403
```

---

## 7. Gaps: What Hasn't Been Considered

### 7.1 🔴 No Rollback Plan

If V2 migration fails at Phase 3, what's the recovery? V1 is still there (not archived until Phase 6), but there's no documented rollback procedure.

### 7.2 🟡 No Migration Verification Checklist

The "Success Criteria" (Section 13) are end-state checks. There's no per-phase verification that catches issues early. Each phase needs explicit "done when" criteria with test commands.

### 7.3 🟡 Environment/Dependencies

- V1 uses `requirements.txt` + conda (`atomizer` env). V2 uses `pyproject.toml`.
- How are V1 dependencies captured? Is there a `pip freeze` of the working V1 environment?
- PyTorch + torch-geometric (for GNN) are notoriously version-sensitive. Pin versions.

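One way to capture exact versions from the working V1 environment before writing pins into `pyproject.toml` is sketched below. It uses only the standard library's `importlib.metadata`. The helper name `pinned_requirements` and the package names in the usage note are assumptions for illustration; the actual distribution names should be checked against the V1 conda environment.

```python
from importlib import metadata


def pinned_requirements(packages: list[str]) -> tuple[list[str], list[str]]:
    """Return (pins, missing) for the given distribution names.

    pins    -- 'name==version' strings for packages installed in this environment
    missing -- names that are not installed, so they can be reviewed by hand
    """
    pins: list[str] = []
    missing: list[str] = []
    for name in packages:
        try:
            pins.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            missing.append(name)
    return pins, missing
```

Run inside the V1 environment, a call such as `pinned_requirements(["torch", "torch-geometric", "optuna"])` would give pin strings that can be copied into the dependency list, plus a list of names that need a second look. A full `pip freeze` remains the more complete record; this sketch only targets the version-sensitive packages.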
### 7.4 🟡 Windows Path Handling

V1 was developed on Windows (NX is Windows-only). V2 development is on Linux. Cross-platform path handling (`pathlib.Path` vs string paths) needs systematic review, not just "update Windows paths in NX processor (if needed)."

### 7.5 🟢 Documentation for `config/` Migration

V1 has `config/nx_config.json.template` and `config/optimization_config_template.json`. These aren't mentioned in the migration plan. They should map either to V2's `atomizer/spec/` or to `.env.example`.

### 7.6 🟢 `optimization_engine/schemas/` Contents

The plan says "Port schemas" but doesn't inventory what's in this directory. Should be checked.

### 7.7 🟢 Feature Registry

V1 has `optimization_engine/feature_registry.json`. Not mentioned in migration plan.

---

## Summary Scorecard

| Criteria | Grade | Notes |
|----------|-------|-------|
| **Completeness** | 🟡 C+ | Only ~60 of ~150 V1 files (about 40%) explicitly mapped. 8+ subpackages missing. |
| **Risk Assessment** | 🟡 B- | Good risks identified, but `utils/`, `context/`, `study/` omissions are high-risk |
| **Feasibility** | 🟡 B- | 8 days → realistically 11-13 days |
| **Architecture Alignment** | ✅ A | Excellent match to AOM Component Map |
| **Data Safety** | 🟡 B | Solid .gitignore but missing some patterns; needs pre-commit hook |
| **Backward Compatibility** | 🟡 B- | Spec migration planned but mid-study and import shims not addressed |
| **Overall** | 🟡 B- | Strong vision, solid architecture, but execution plan has dangerous gaps in file inventory |

---

## Recommendations (Priority Ordered)

1. **🔴 IMMEDIATE: Create complete file inventory.** Map every V1 `.py` file to a V2 destination or an explicit skip. ~2 hours, saves days. (`find optimization_engine -name "*.py" | sort` → spreadsheet with V2 destination column)
2. **🔴 Add missing modules to migration table:**
   - `context/` → `atomizer/context/` or merge into `optimization/`
   - `study/` → `atomizer/study/` (this is P0, not optional)
   - `utils/` → `atomizer/utils/` (infrastructure everything depends on)
   - `plugins/` → merge with `hooks/` or separate
   - `validation/` → merge with `spec/validator.py`
   - `intake/` → `atomizer/intake/` or merge into `interview/`
3. **🟡 Extend timeline to 12 days** or explicitly reduce scope (e.g., "Phase 1 ports only the minimum for NX workflow; remaining modules in Phase 2")
4. **🟡 Add per-phase verification commands** (not just end-state criteria)
5. **🟡 Add rollback procedure** to Section 11
6. **🟡 Pin dependency versions** in pyproject.toml (especially PyTorch, torch-geometric)
7. **🟡 Add pre-commit hook** for file size enforcement (>1MB rejection)
8. **🟢 Consider import compatibility shim** for transition period
9. **🟢 Investigate V1 `tools/` size** (462MB; what's in there?)
10. **🟢 Decide on `.jsonl` tracking:** knowledge base files should probably be tracked, session data should not

---

*This is a strong plan with the right vision and principles. The architecture alignment is excellent. The gaps are execution-level and fixable before work begins. Fixing them now prevents the "oh wait, where does this module go?" problem that derails migrations mid-stream.*

*Auditor 🔍, 2026-02-22*