Antoine 1f58bb8016 chore(hq): daily sync 2026-02-23

2026-02-23 10:00:17 +00:00

V2 Migration Master Plan — Audit Report

Auditor: Auditor Agent 🔍
Date: 2026-02-22
Document Reviewed: ATOMIZER-V2-MIGRATION-MASTERPLAN.md
Verdict: 🟡 MAJOR issues found. The plan is strong but has significant gaps that will cause problems during execution.

1. Completeness — 🔴 CRITICAL GAPS

1.1 Missing V1 Modules (Not Accounted For)

The migration plan lists modules to port but omits the 13 V1 subpackages below, at least 8 of which are significant:

| V1 Module | Files | Purpose | Impact if Missed |
|---|---|---|---|
| `optimization_engine/context/` | 7 files | Session state, compaction, feedback loop, playbook, reflector | 🔴 Core runtime functionality: sessions won't persist state |
| `optimization_engine/study/` | 8 files | Study creator, wizard, continuation, reset, benchmarking, state, history | 🔴 Can't create or manage studies without this |
| `optimization_engine/utils/` | 12 files | Logger, dashboard_db, trial_manager, NX file discovery, study archiver, realtime tracking | 🔴 Infrastructure that everything depends on |
| `optimization_engine/plugins/` | 4 files | hook_manager, hooks, validators (DIFFERENT from `hooks/`) | 🟡 Plugin system won't work |
| `optimization_engine/intake/` | 3 files | Config intake, context intake, processor | 🟡 Study intake pipeline broken |
| `optimization_engine/validation/` | 3 files | checker.py, gate.py (DIFFERENT from `validators/`) | 🟡 Validation gates lost |
| `optimization_engine/model_discovery/` | 2 files | NX model introspection | 🟡 Model discovery capability lost |
| `optimization_engine/devloop/` | 7 files | Analyzer, orchestrator, planning, test_runner, browser scenarios | 🟢 DevLoop was planned for `tools/devloop_cli.py` but the full subpackage has 7 files |
| `optimization_engine/processors/` | 2 files | adaptive_characterization.py | 🟡 V1 already has a `processors/` concept |
| `optimization_engine/future/` | 11 files | Research agents, LLM workflow analyzer, step classifier | 🟢 May be intentionally excluded, but not listed in "DO NOT MIGRATE" |
| `optimization_engine/custom_functions/` | 2 files | NX material generator | 🟢 Utility, should be documented |
| `optimization_engine/templates/` | 3 files | run_optimization_template, run_nn_optimization_template | 🟡 Template system for studies |
| `optimization_engine/surrogates/` | 1 file | `__init__.py` (separate from `gnn/`) | 🟢 Minor |

1.2 Missing V1 Core Files

| V1 File | Role | Plan Status |
|---|---|---|
| `optimization_engine/core/base_runner.py` | Base class for runners | ❌ Not mentioned (plan only lists runner.py) |
| `optimization_engine/core/gradient_optimizer.py` | Gradient-based optimization | ❌ Not mentioned |
| `optimization_engine/core/runner_with_neural.py` | Neural-accelerated runner | ❌ Not mentioned |
| `optimization_engine/core/strategy_portfolio.py` | Strategy portfolio management | ❌ Not mentioned |
| `optimization_engine/core/strategy_selector.py` | Strategy selection (different from method_selector) | ❌ Not mentioned |
| `optimization_engine/schemas/` | Schema files | ✅ Mentioned but directory contents not inventoried |

1.3 Missing V1 Root-Level Files

| File | Status |
|---|---|
| `atomizer.py` (25KB monolith) | Listed in "DO NOT MIGRATE" ✅ but its functionality needs a replacement |
| `launch_dashboard.py` | ❌ Not mentioned. How does V2 launch the dashboard? |
| `requirements.txt` | Replaced by pyproject.toml ✅ |
| `install.bat` | ❌ Not mentioned (Windows install script) |

1.4 V1 Tools Directory

The plan only mentions tools/devloop_cli.py. V1 tools/ has 25+ scripts including:

- analyze_study.py, find_best_iteration.py, archive_study.py
- create_pareto_graphs.py, generate_psd_figures.py
- Zernike-specific tools (HTML generator, WFE PSD, optical report)
- Study migration tools

Recommendation: Create an inventory of tools/ and decide per-file: migrate, archive, or replace.

2. Risk Assessment — 🟡 MAJOR

2.1 Identified Risks (Plan Section 11)

The plan's risk table is reasonable but underestimates these risks:

| Risk | Plan's Mitigation | My Assessment |
|---|---|---|
| Import breakage | Find-replace `optimization_engine.` → `atomizer.` | 🟡 Insufficient. Many V1 modules use relative and cross-module imports, and `optimization_engine.` references are nested (e.g., `from optimization_engine.core.runner import Runner`, where `runner.py` itself imports from `optimization_engine.extractors`). A mechanical find-replace cannot verify that circular dependencies and runtime-only imports still resolve. This needs a test suite, not just sed. |
| NX integration breaks | Test on dalidou before archiving V1 | ✅ Adequate |
| `.gitignore` too aggressive | Test essential files | 🟡 See Data Safety section below |
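
As a starting point for that test suite, the imports a rename has to cover can be listed from the syntax tree rather than by text search. The following is a minimal sketch using only the standard library `ast` module; the function name `find_v1_imports` is hypothetical, and the sample source is invented for illustration. It reports each `optimization_engine` or relative import and whether it is a direct module-level statement, so imports deferred into functions or guarded by `try`/`if` stand out.

```python
import ast

V1_PACKAGE = "optimization_engine"  # package name taken from the report


def find_v1_imports(source: str) -> list[tuple[int, str, bool]]:
    """Return (line, imported name, is_module_level) for V1 and relative imports."""
    tree = ast.parse(source)
    # Only statements directly in the module body count as module-level here;
    # imports nested in functions, try blocks, or if blocks are reported as False.
    module_level = {id(node) for node in tree.body}
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.level > 0 marks a relative import such as "from . import x"
            names = ["." * node.level + (node.module or "")]
        else:
            continue
        for name in names:
            is_v1 = name == V1_PACKAGE or name.startswith(V1_PACKAGE + ".")
            if is_v1 or name.startswith("."):
                hits.append((node.lineno, name, id(node) in module_level))
    return sorted(hits)
```

Running this over every V1 file would give a list of import sites to cover with tests. It does not detect circular dependencies by itself; those still need the modules to be imported for real.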

2.2 Unidentified Risks

| Risk | Severity | Mitigation Needed |
|---|---|---|
| V1 `utils/` dependency web: logger, trial_manager, dashboard_db are imported EVERYWHERE in V1. Where do they go in V2? | 🔴 HIGH | Create `atomizer/utils/` or distribute into appropriate modules. Map ALL import dependencies before porting. |
| `context/` module loss: session state, compaction, feedback loops. If not ported, studies can't resume and context is lost between runs | 🔴 HIGH | Add to migration table, decide V2 location |
| `study/` module loss: study creation wizard, continuation, reset. Without this, can't create studies from V2 | 🔴 HIGH | Add to migration table as P0 |
| Optuna DB path changes: V1 studies store Optuna databases at specific paths. V2 restructure may break study continuation | 🟡 MED | Test study continuation with path remapping |
| NX journal path references: NX journals may hardcode V1 paths | 🟡 MED | Audit all journal files for hardcoded paths |
| Knowledge base `.jsonl` files: are these tracked in git or gitignored? They're small (212KB) but grow over time | 🟡 MED | Clarify: track in git or gitignore with backup strategy |
| Python version compatibility: pyproject.toml says `>=3.10` but V1 may use patterns from 3.8/3.9 | 🟢 LOW | Test on target Python version |
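
The journal audit could begin with a simple text scan. This is a rough sketch, not a complete detector: the regular expression, the list of POSIX roots, and the function name `find_hardcoded_paths` are all assumptions made for illustration, and the sample paths in any usage are invented.

```python
import re

# Assumed heuristics: Windows drive paths (C:\... or C:/...) and a few
# common absolute POSIX roots. Adjust the roots to the real machines.
ABSOLUTE_PATH = re.compile(
    r"\b[A-Za-z]:[\\/][^\s\"']+"
    r"|(?<![\w./])/(?:home|mnt|opt|srv)/[^\s\"']+"
)


def find_hardcoded_paths(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched path) for each absolute-looking path."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ABSOLUTE_PATH.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Relative paths such as `results/run1.op2` are ignored on purpose, since they survive a directory restructure as long as the working directory is preserved.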

3. Feasibility — 🟡 8-Day Timeline is Aggressive

3.1 Phase-by-Phase Assessment

| Phase | Planned | Realistic | Issue |
|---|---|---|---|
| Phase 0: Bootstrap + AOM | 1 day | 1.5 days | AOM link conversion for 48 docs is tedious even with a script. Needs manual QA. |
| Phase 1: Core Engine | 2 days | 3-4 days | Plan lists 13 steps but misses ~25 additional files from `core/`, `context/`, `study/`, `utils/`. Refactoring runner→engine while maintaining all runner variants (base_runner, runner_with_neural) is non-trivial. |
| Phase 2: Supporting | 2 days | 2 days | Reasonable if scope is truly "direct port" |
| Phase 3: Integration | 2 days | 3 days | Import fixes across 100+ files. This is where the missing modules will surface. |
| Phase 4: Syncthing | 1 day | 1 day | Reasonable |
| Phase 5: GitHub + CI | 1 day | 0.5 days | Straightforward |
| Phase 6: Archive V1 | 1 day | 0.5 days | Straightforward |
| Total | 8 days | 11-13 days | The planned per-phase figures above sum to 10 days, not 8; the plan should reconcile its total. |

3.2 Key Bottleneck

Phase 1 is underscoped. The migration table shows 13 clean steps, but V1's optimization_engine/ has ~150 Python files across 20 subpackages. The plan only explicitly accounts for ~60 of these. The remaining ~90 files will surface during Phase 3 integration testing, causing scope creep and rework.

Recommendation: Before starting, create a complete file-level inventory mapping every V1 .py file to its V2 destination (or explicit "skip" decision). This takes ~2 hours but saves days of surprises.
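
A minimal sketch of such an inventory, assuming a CSV file is an acceptable stand-in for the spreadsheet. The column names and the function names `write_inventory` and `unmapped` are hypothetical choices, not something the plan defines.

```python
import csv
from pathlib import Path


def write_inventory(v1_root: Path, out_csv: Path) -> int:
    """Write one CSV row per .py file under v1_root; return the number of files."""
    files = sorted(p.relative_to(v1_root).as_posix() for p in v1_root.rglob("*.py"))
    with out_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["v1_path", "v2_destination", "decision"])
        for rel in files:
            # Destination and decision (migrate / archive / skip) are filled in by hand.
            writer.writerow([rel, "", ""])
    return len(files)


def unmapped(inventory_csv: Path) -> list[str]:
    """Return V1 paths that still have no decision recorded."""
    with inventory_csv.open(newline="", encoding="utf-8") as handle:
        return [row["v1_path"] for row in csv.DictReader(handle) if not row["decision"].strip()]
```

Once the file is filled in, `unmapped(...)` returning an empty list could serve as the entry condition for Phase 1.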

4. Architecture Alignment — ✅ STRONG

4.1 AOM Component Map Match

The V2 structure maps well to the AOM's four pillars:

| AOM Component | V2 Location | Match |
|---|---|---|
| Pillar 1 (Philosophy) | `docs/AOM/01-Philosophy/` | ✅ |
| Pillar 2 (Operations) | `docs/AOM/02-Operations/` | ✅ |
| Pillar 3 (Developer) | `docs/AOM/03-Developer/` | ✅ |
| Pillar 4 (Knowledge) | `docs/AOM/04-Knowledge/` | ✅ |
| Contracts | `atomizer/contracts/` | ✅ Matches AOM 03-Developer/08-Data-Contracts |
| Processors | `atomizer/processors/` | ✅ Matches AOM 03-Developer/09-Processor-Development |
| Orchestrator | `atomizer/orchestrator/` | ✅ Matches AOM 01-Philosophy/08-Tool-Agnostic |
| Extractors | `atomizer/extractors/` | ✅ Matches AOM 02-Operations/04-Extractor-Library |
| Protocols | `docs/protocols/` | ✅ Matches AOM 02-Operations/02-Protocol-Reference |

4.2 Minor Misalignments

| Issue | Severity |
|---|---|
| AOM has `Audit/` folder (2 docs); plan places it under `docs/AOM/Audit/` ✅ | None |
| AOM Phase 4/5 docs (CLAUDE-v2, Living-Document-Protocol) need explicit V2 homes; plan addresses this in Section 4.4 ✅ | None |
| MCP servers are in V2 repo as `mcp_servers/` but AOM 03-Developer/10 suggests they could be separate repos | 🟢 Minor, decide later |

5. Data Safety — 🟡 NEEDS ATTENTION

5.1 .gitignore Assessment

Good coverage for:

- NX/solver binary files (.sim, .prt, .fem, .bdf, .op2, .f06, .frd)
- Python artifacts
- IDE files
- Study data directory

Missing patterns:

| Pattern | Risk | Recommendation |
|---|---|---|
| `*.backup` / `*.bak` | Backup files could leak | Add `*.bak` and `*.backup` |
| `*.csv` | Large result CSVs from studies | Add or use `studies/` containment |
| `*.png` / `*.jpg` in study dirs | Iteration screenshots, contour plots | Covered by `studies/` gitignore ✅ |
| `*.sqlite` / `*.sqlite3` | Optuna databases | Add explicitly (`.db` covers some but not all) |
| `research_sessions/` | Knowledge base research data | Clarify if tracked |
| `*.jsonl` | Session insights grow unbounded | Clarify: should `knowledge/session_insights/*.jsonl` be tracked? |
| `*.whl` | Wheel files | Add |
| `*.tar.gz` / `*.zip` | Archives in tools/ | Not currently present but preventive |

5.2 Large File Risk

The plan correctly excludes projects/ (99GB), atomizer_field_training_data/ (68MB), and tools/ (462MB). The tools/ figure is surprisingly large for a directory of scripts and needs an explanation.

Action item: Investigate what's in V1 tools/ that's 462MB. The plan lists it as "Large tool archives" — these could contaminate V2 if tools/ is ported carelessly.

5.3 Success Criterion #9

"No file larger than 1MB in git history (excluding initial dashboard assets)"

This is good but needs enforcement. Recommendation: Add a pre-commit hook or CI check that rejects files >1MB.
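
A size gate of this kind can be a few lines of Python. The sketch below assumes a hook runner that passes the candidate file names as arguments (the pre-commit framework does this); the names `oversized` and `main` are hypothetical, and reading "1MB" as 1 MiB is an assumption. It checks only the files it is given, so it guards new commits and does not audit existing history.

```python
import sys
from pathlib import Path

MAX_BYTES = 1024 * 1024  # assumed reading of the plan's "1MB" limit (1 MiB)


def oversized(paths: list[str], limit: int = MAX_BYTES) -> list[tuple[str, int]]:
    """Return (path, size in bytes) for every existing file larger than limit."""
    result = []
    for name in paths:
        path = Path(name)
        if path.is_file():
            size = path.stat().st_size
            if size > limit:
                result.append((name, size))
    return result


def main(argv: list[str]) -> int:
    """Print offenders to stderr and return a non-zero exit code if any exist."""
    offenders = oversized(argv)
    for name, size in offenders:
        print(f"{name}: {size} bytes exceeds the {MAX_BYTES} byte limit", file=sys.stderr)
    return 1 if offenders else 0
```

As a script entry point this would end with `sys.exit(main(sys.argv[1:]))`. The `check-added-large-files` hook from the pre-commit-hooks collection covers the same ground with its `--maxkb` argument, if an off-the-shelf hook is preferred.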

6. Backward Compatibility — 🟡 RISKS EXIST

6.1 AtomizerSpec v2→v3 Migration

The plan mentions atomizer/spec/migrator.py for v2.0→v3.0 migration. This is critical.

Key question: What happens when a V1 atomizer_spec.json is loaded?

- V1 specs have no toolchain section → must default to NX/NX mesher/Nastran
- V1 specs use optimization_engine.* import paths in custom hooks → must still work
- V1 specs may reference absolute paths on dalidou → need path translation
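
The first of these points, defaulting the toolchain, might look like the sketch below. The key names `cad`, `mesher`, and `solver` and the function name `ensure_toolchain` are assumptions; the report only states that the default is NX / NX mesher / Nastran, and the real v3.0 schema may name these fields differently.

```python
import copy

# Assumed key names; only the default values come from the audit text.
DEFAULT_TOOLCHAIN = {"cad": "NX", "mesher": "NX", "solver": "Nastran"}


def ensure_toolchain(spec: dict) -> dict:
    """Return a copy of spec with a toolchain section, leaving the input untouched."""
    migrated = copy.deepcopy(spec)
    # setdefault keeps an existing toolchain section as it is.
    migrated.setdefault("toolchain", dict(DEFAULT_TOOLCHAIN))
    return migrated
```

Returning a copy keeps the loaded V1 spec available for comparison or for writing a backup before the migrated version is saved.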

6.2 V1 Study Continuation

Can a V2 installation continue an in-progress V1 study?

- Optuna DB: needs same database path or migration
- Study state: optimization_engine/study/state.py tracks progress and needs porting
- Iteration results: stored in studies/*/ and therefore path-dependent

The plan doesn't address mid-study migration. This may be acceptable if all V1 studies are completed before migration, but this should be an explicit decision.

6.3 Import Path Compatibility

The plan says "find-replace optimization_engine. → atomizer." but:

- V1 custom hooks may import from optimization_engine.*
- User-created study scripts import V1 paths
- NX journals may import from V1 paths

Recommendation: Consider a compatibility shim:

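```python
# optimization_engine/__init__.py (temporary compatibility shim)
import warnings

# stacklevel=2 attributes the warning to the importing module, not this file.
warnings.warn("optimization_engine is deprecated, use atomizer", DeprecationWarning, stacklevel=2)
from atomizer import *  # re-exports top-level names only, not submodules
```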
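
A star import only re-exports top-level names, so a nested import such as `from optimization_engine.core.runner import Runner` would still fail with that shim alone. One way to cover submodules is to register aliases in `sys.modules`. The following is a sketch under two assumptions: the V2 package keeps the same relative module layout as V1 (renamed modules would need an explicit mapping), and eagerly importing every V2 submodule is acceptable. The function name `alias_package` is hypothetical.

```python
import importlib
import pkgutil
import sys


def alias_package(old: str, new: str) -> list[str]:
    """Register every module of package `new` in sys.modules under the `old` prefix."""
    package = importlib.import_module(new)
    sys.modules[old] = package
    aliases = [old]
    # walk_packages yields subpackages and modules recursively, fully qualified.
    for info in pkgutil.walk_packages(package.__path__, prefix=new + "."):
        module = importlib.import_module(info.name)
        alias = old + info.name[len(new):]
        sys.modules[alias] = module
        aliases.append(alias)
    return aliases
```

One option is to call `alias_package("optimization_engine", "atomizer")` from the temporary `optimization_engine/__init__.py`; this has not been tested against the real packages. Because every submodule is imported up front, a missing optional dependency in any V2 module would raise at shim import time.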

7. Gaps — What Hasn't Been Considered

7.1 🔴 No Rollback Plan

If V2 migration fails at Phase 3, what's the recovery? V1 is still there (not archived until Phase 6), but there's no documented rollback procedure.

7.2 🟡 No Migration Verification Checklist

The "Success Criteria" (Section 13) are end-state checks. There's no per-phase verification that catches issues early. Each phase needs explicit "done when" criteria with test commands.
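
One cheap "done when" check is that the modules a phase was supposed to deliver actually import. The sketch below is illustrative only: the `atomizer.*` package names come from the architecture table in this report, but their assignment to phases in `PHASE_GATES` is an assumption, as is the function name `check_imports`.

```python
import importlib

# Hypothetical gates: which modules must import cleanly at the end of each phase.
PHASE_GATES = {
    "Phase 1: Core Engine": ["atomizer.contracts", "atomizer.extractors"],
    "Phase 2: Supporting": ["atomizer.processors", "atomizer.orchestrator"],
}


def check_imports(module_names: list[str]) -> dict[str, str]:
    """Try to import each module; return {name: error} for the ones that fail."""
    failures = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except Exception as exc:  # any import-time error counts as a failed gate
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures
```

A phase would pass its gate when `check_imports(PHASE_GATES[phase])` returns an empty dict. Import checks catch missing modules and broken paths early, but they do not replace functional tests such as running a study end to end.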

7.3 🟡 Environment/Dependencies

- V1 uses requirements.txt + conda (atomizer env). V2 uses pyproject.toml.
- How are V1 dependencies captured? Is there a pip freeze of the working V1 environment?
- PyTorch + torch-geometric (for GNN) are notoriously version-sensitive. Pin versions.

7.4 🟡 Windows Path Handling

V1 was developed on Windows (NX is Windows-only). V2 development is on Linux. Cross-platform path handling (pathlib.Path vs string paths) needs systematic review, not just "update Windows paths in NX processor (if needed)."
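
For paths that were stored as Windows strings (in specs, configs, or study state), `pathlib`'s pure path classes can parse them on Linux without touching the file system. A small sketch, assuming stored paths sit under a known Windows root; the function name `to_posix_relative` and the example root are hypothetical.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_posix_relative(stored: str, windows_root: str) -> PurePosixPath:
    """Re-express a stored Windows path, relative to windows_root, as a POSIX path.

    Raises ValueError if `stored` is not located under `windows_root`.
    """
    relative = PureWindowsPath(stored).relative_to(PureWindowsPath(windows_root))
    return PurePosixPath(*relative.parts)
```

For example, `to_posix_relative(r"C:\Atomizer\studies\bracket\model.prt", r"C:\Atomizer")` gives `studies/bracket/model.prt`, which can then be joined onto whatever root V2 uses on Linux. String operations such as `split("\\")` tend to break on mixed separators, which `PureWindowsPath` handles.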

7.5 🟢 Documentation for `config/` Migration

V1 has config/nx_config.json.template and config/optimization_config_template.json. These aren't mentioned in the migration plan. They should either map to V2's atomizer/spec/ or .env.example.

7.6 🟢 `optimization_engine/schemas/` Contents

The plan says "Port schemas" but doesn't inventory what's in this directory. Should be checked.

7.7 🟢 Feature Registry

V1 has optimization_engine/feature_registry.json. Not mentioned in migration plan.

Summary Scorecard

| Criteria | Grade | Notes |
|---|---|---|
| Completeness | 🟡 C+ | ~60 of ~150 V1 Python files (~40%) explicitly mapped. 8+ subpackages missing. |
| Risk Assessment | 🟡 B- | Good risks identified, but `utils/`, `context/`, `study/` omissions are high-risk |
| Feasibility | 🟡 B- | 8 days → realistically 11-13 days |
| Architecture Alignment | ✅ A | Excellent match to AOM Component Map |
| Data Safety | 🟡 B | Solid .gitignore but missing some patterns; needs pre-commit hook |
| Backward Compatibility | 🟡 B- | Spec migration planned but mid-study and import shims not addressed |
| Overall | 🟡 B- | Strong vision, solid architecture, but execution plan has dangerous gaps in file inventory |

Recommendations (Priority Ordered)

1. 🔴 IMMEDIATE: Create a complete file inventory. Map every V1 .py file to a V2 destination or an explicit skip. ~2 hours, saves days. (`find optimization_engine -name "*.py" | sort` → spreadsheet with a V2 destination column)
2. 🔴 Add missing modules to the migration table:
   - context/ → atomizer/context/ or merge into optimization/
   - study/ → atomizer/study/ (this is P0, not optional)
   - utils/ → atomizer/utils/ (infrastructure everything depends on)
   - plugins/ → merge with hooks/ or separate
   - validation/ → merge with spec/validator.py
   - intake/ → atomizer/intake/ or merge into interview/
3. 🟡 Extend timeline to 12 days or explicitly reduce scope (e.g., "Phase 1 ports only the minimum for NX workflow; remaining modules in Phase 2")
4. 🟡 Add per-phase verification commands (not just end-state criteria)
5. 🟡 Add rollback procedure to Section 11
6. 🟡 Pin dependency versions in pyproject.toml (especially PyTorch, torch-geometric)
7. 🟡 Add pre-commit hook for file size enforcement (>1MB rejection)
8. 🟢 Consider import compatibility shim for transition period
9. 🟢 Investigate V1 tools/ size (462MB: what's in there?)
10. 🟢 Decide on .jsonl tracking: knowledge base files should probably be tracked, session data should not

This is a strong plan with the right vision and principles. The architecture alignment is excellent. The gaps are execution-level — they're fixable before work begins. Fixing them now prevents the "oh wait, where does this module go?" problem that derails migrations mid-stream.

— Auditor 🔍, 2026-02-22