feat: Pre-migration checkpoint - updated docs and utilities

Updates before optimization_engine migration:

- Updated migration plan to v2.1 with complete file inventory
- Added OP_07 disk optimization protocol
- Added SYS_16 self-aware turbo protocol
- Added study archiver and cleanup utilities
- Added ensemble surrogate module
- Updated NX solver and session manager
- Updated zernike HTML generator
- Added context engineering plan
- LAC session insights updates

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 10:22:45 -05:00
parent faa7779a43
commit 82f36689b7
21 changed files with 6304 additions and 890 deletions
--- a/.claude/skills/modules/study-disk-optimization.md
+++ b/.claude/skills/modules/study-disk-optimization.md
@@ -0,0 +1,464 @@
+# Study Disk Optimization Module
+
+## Atomizer Disk Space Management System
+
+**Version:** 1.0
+**Created:** 2025-12-29
+**Status:** PRODUCTION READY
+**Impact:** Reduced M1_Mirror from 194 GB → 114 GB (80 GB freed, 41% reduction)
+
+---
+
+## Executive Summary
+
+FEA optimization studies consume massive disk space due to per-trial file copying. This module provides:
+
+1. **Local Cleanup** - Remove regenerable files from completed studies (roughly 50% per cleaned study; 41% across M1_Mirror)
+2. **Remote Archival** - Archive to dalidou server (14 TB available)
+3. **On-Demand Restore** - Pull archived studies when needed
+
+### Key Insight
+
+Each trial folder contains ~150 MB, but only **~70 MB is essential** (OP2 results + metadata). The rest are copies of master files that can be regenerated.
+
+---
+
+## Part 1: File Classification
+
+### Essential Files (KEEP)
+
+| Extension | Purpose | Typical Size |
+|-----------|---------|--------------|
+| `.op2` | Nastran binary results | 68 MB |
+| `.json` | Parameters, results, metadata | <1 MB |
+| `.npz` | Pre-computed Zernike coefficients | <1 MB |
+| `.html` | Generated reports | <1 MB |
+| `.png` | Visualization images | <1 MB |
+| `.csv` | Exported data tables | <1 MB |
+
+### Deletable Files (REGENERABLE)
+
+| Extension | Purpose | Why Deletable |
+|-----------|---------|---------------|
+| `.prt` | NX part files | Copy of master in `1_setup/` |
+| `.fem` | FEM mesh files | Copy of master |
+| `.sim` | Simulation files | Copy of master |
+| `.afm` | Assembly FEM | Regenerable |
+| `.dat` | Solver input deck | Regenerable from params |
+| `.f04` | Nastran output log | Diagnostic only |
+| `.f06` | Nastran printed output | Diagnostic only |
+| `.log` | Generic logs | Diagnostic only |
+| `.diag` | Diagnostic files | Diagnostic only |
+| `.txt` | Temp text files | Intermediate data |
+| `.exp` | Expression files | Regenerable |
+| `.bak` | Backup files | Not needed |
+
+### Protected Folders (NEVER TOUCH)
+
+| Folder | Reason |
+|--------|--------|
+| `1_setup/` | Master model files (source of truth) |
+| `3_results/` | Final database, reports, best designs |
+| `best_design_archive/` | Archived optimal configurations |
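
The three tables above amount to a lookup by extension plus a folder guard. A minimal sketch of that rule follows, assuming paths are given relative to the study root; the names `classify`, `ESSENTIAL`, `DELETABLE`, and `PROTECTED_DIRS` are hypothetical and are not taken from `study_archiver.py`.

```python
from pathlib import Path

# Extensions copied from the tables above.
ESSENTIAL = {".op2", ".json", ".npz", ".html", ".png", ".csv"}
DELETABLE = {".prt", ".fem", ".sim", ".afm", ".dat", ".f04", ".f06",
             ".log", ".diag", ".txt", ".exp", ".bak"}
PROTECTED_DIRS = {"1_setup", "3_results", "best_design_archive"}


def classify(path: Path) -> str:
    """Return 'protected', 'keep', 'delete', or 'unknown' for one file.

    `path` is assumed to be relative to the study root, so only folders
    inside the study are compared against PROTECTED_DIRS.
    """
    if any(part in PROTECTED_DIRS for part in path.parts[:-1]):
        return "protected"
    ext = path.suffix.lower()
    if ext in ESSENTIAL:
        return "keep"
    if ext in DELETABLE:
        return "delete"
    return "unknown"  # unlisted extensions are left alone in this sketch
```

Checking the folder before the extension means a `.prt` under `1_setup/` is reported as protected rather than deletable.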
+
+---
+
+## Part 2: Disk Usage Analysis
+
+### M1_Mirror Project Baseline (Dec 2025)
+
+```
+Total: 194 GB across 28 studies, 2000+ trials
+
+By File Type:
+  .op2    94 GB (48.5%) - Nastran results [ESSENTIAL]
+  .prt    41 GB (21.4%) - NX parts [DELETABLE]
+  .fem    22 GB (11.5%) - FEM mesh [DELETABLE]
+  .dat    22 GB (11.3%) - Solver input [DELETABLE]
+  .sim     9 GB (4.5%)  - Simulation [DELETABLE]
+  .afm     5 GB (2.5%)  - Assembly FEM [DELETABLE]
+  Other   <1 GB (<1%)   - Logs, configs [MIXED]
+
+By Folder:
+  2_iterations/    168 GB (87%) - Per-trial data
+  3_results/        22 GB (11%) - Final results
+  1_setup/           4 GB (2%)  - Master models
+```
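
A breakdown like the one above can be approximated with a directory walk. This is only a sketch of one way to produce it, not the logic of `analyze_study()`; the function names `usage_by_extension` and `format_report` are hypothetical.

```python
from collections import Counter
from pathlib import Path


def usage_by_extension(root: Path) -> Counter:
    """Sum file sizes in bytes under `root`, keyed by lower-case extension."""
    totals = Counter()
    for path in root.rglob("*"):
        if path.is_file():
            totals[path.suffix.lower() or "(none)"] += path.stat().st_size
    return totals


def format_report(totals: Counter) -> str:
    """Render one line per extension, largest first, with its share of the total."""
    grand_total = sum(totals.values()) or 1  # avoid dividing by zero on empty trees
    lines = []
    for ext, size in totals.most_common():
        share = 100 * size / grand_total
        lines.append(f"{ext:<8}{size / 1e9:8.2f} GB ({share:.1f}%)")
    return "\n".join(lines)
```

Calling `print(format_report(usage_by_extension(Path("studies/M1_Mirror"))))` would list extensions in descending order of size.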
+
+### Per-Trial Breakdown (Typical V11+ Structure)
+
+```
+iter1/
+  assy_m1_assyfem1_sim1-solution_1.op2    68.15 MB  [KEEP]
+  M1_Blank.prt                            29.94 MB  [DELETE]
+  assy_m1_assyfem1_sim1-solution_1.dat    15.86 MB  [DELETE]
+  M1_Blank_fem1.fem                       14.07 MB  [DELETE]
+  ASSY_M1_assyfem1_sim1.sim                7.47 MB  [DELETE]
+  M1_Blank_fem1_i.prt                      5.20 MB  [DELETE]
+  ASSY_M1_assyfem1.afm                     4.13 MB  [DELETE]
+  M1_Vertical_Support_Skeleton_fem1.fem    3.76 MB  [DELETE]
+  ... (logs, temps)                       <1.00 MB  [DELETE]
+  _temp_part_properties.json               0.00 MB  [KEEP]
+  -------------------------------------------------------
+  TOTAL:                                 149.67 MB
+  Essential only:                         68.15 MB
+  Savings:                                54.5%
+```
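
The savings figure follows directly from the two totals in the listing. A short worked check, using only the numbers shown above:

```python
total_mb = 149.67      # TOTAL from the listing above
essential_mb = 68.15   # the .op2 file; the kept JSON rounds to 0.00 MB

deletable_mb = total_mb - essential_mb
savings_pct = 100 * deletable_mb / total_mb

print(f"Deletable: {deletable_mb:.2f} MB")
print(f"Savings:   {savings_pct:.1f}%")
```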
+
+---
+
+## Part 3: Implementation
+
+### Core Utility
+
+**Location:** `optimization_engine/utils/study_archiver.py`
+
+```python
+from optimization_engine.utils.study_archiver import (
+    analyze_study,        # Get disk usage analysis
+    cleanup_study,        # Remove deletable files
+    archive_to_remote,    # Archive to dalidou
+    restore_from_remote,  # Restore from dalidou
+    list_remote_archives, # List server archives
+)
+```
+
+### Command Line Interface
+
+**Batch Script:** `tools/archive_study.bat`
+
+```bash
+# Analyze disk usage
+archive_study.bat analyze studies\M1_Mirror
+archive_study.bat analyze studies\M1_Mirror\m1_mirror_V12
+
+# Cleanup completed study (dry run by default)
+archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12
+archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute
+
+# Archive to remote server
+archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute
+archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute --tailscale
+
+# List remote archives
+archive_study.bat list
+archive_study.bat list --tailscale
+
+# Restore from remote
+archive_study.bat restore m1_mirror_V12
+archive_study.bat restore m1_mirror_V12 --tailscale
+```
+
+### Python API
+
+```python
+from pathlib import Path
+from optimization_engine.utils.study_archiver import (
+    analyze_study,
+    cleanup_study,
+    archive_to_remote,
+)
+
+# Analyze
+study_path = Path("studies/M1_Mirror/m1_mirror_V12")
+analysis = analyze_study(study_path)
+print(f"Total: {analysis['total_size_bytes']/1e9:.2f} GB")
+print(f"Essential: {analysis['essential_size']/1e9:.2f} GB")
+print(f"Deletable: {analysis['deletable_size']/1e9:.2f} GB")
+
+# Cleanup (dry_run=False to execute)
+deleted, freed = cleanup_study(study_path, dry_run=False)
+print(f"Freed {freed/1e9:.2f} GB")
+
+# Archive to server
+success = archive_to_remote(study_path, use_tailscale=False, dry_run=False)
+```
+
+---
+
+## Part 4: Remote Server Configuration
+
+### dalidou Server Specs
+
+| Property | Value |
+|----------|-------|
+| Hostname | dalidou |
+| Local IP | 192.168.86.50 |
+| Tailscale IP | 100.80.199.40 |
+| SSH User | papa |
+| Archive Path | /srv/storage/atomizer-archive/ |
+| Available Storage | 3.6 TB (SSD) + 12.7 TB (HDD) |
+
+### First-Time Setup
+
+```bash
+# 1. SSH into server and create archive directory
+ssh papa@192.168.86.50
+mkdir -p /srv/storage/atomizer-archive
+
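+# 2. Set up passwordless SSH (on Windows)
+# Note: ssh-copy-id is not part of the native Windows OpenSSH client;
+# run these from Git Bash or WSL, or append the public key to
+# ~/.ssh/authorized_keys on the server manually.
+ssh-keygen -t ed25519  # If you don't have a key
+ssh-copy-id papa@192.168.86.50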
+
+# 3. Test connection
+ssh papa@192.168.86.50 "echo 'Connection OK'"
+```
+
+### Archive Structure on Server
+
+```
+/srv/storage/atomizer-archive/
+├── m1_mirror_V11_20251229.tar.gz    # Compressed study archive
+├── m1_mirror_V12_20251229.tar.gz
+├── m1_mirror_flat_back_V3_20251229.tar.gz
+└── manifest.json                     # Index of all archives
+```
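
The archive names above follow a `<study>_<YYYYMMDD>.tar.gz` pattern. The sketch below shows how such an archive could be built with the standard library; it is an illustration under that naming assumption, not the code in `study_archiver.py`, and it leaves out `manifest.json` because its schema is not described here. `archive_name` and `make_archive` are hypothetical names.

```python
import tarfile
from datetime import date
from pathlib import Path


def archive_name(study_name: str, day: date) -> str:
    """Build '<study>_<YYYYMMDD>.tar.gz', matching the names listed above."""
    return f"{study_name}_{day:%Y%m%d}.tar.gz"


def make_archive(study_dir: Path, out_dir: Path, day: date) -> Path:
    """Compress `study_dir` into `out_dir` and return the archive path.

    `out_dir` should sit outside `study_dir` so the archive is not
    added to itself.
    """
    target = out_dir / archive_name(study_dir.name, day)
    with tarfile.open(target, "w:gz") as tar:
        # arcname keeps paths inside the archive relative to the study folder
        tar.add(study_dir, arcname=study_dir.name)
    return target
```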
+
+---
+
+## Part 5: Recommended Workflows
+
+### During Active Optimization
+
+**Keep all files** - You may need to:
+- Re-run specific failed trials
+- Debug mesh issues
+- Analyze intermediate results
+
+### After Study Completion
+
+1. **Generate final report** (STUDY_REPORT.md)
+2. **Archive best design** to `3_results/best_design_archive/`
+3. **Run cleanup:**
+   ```bash
+   archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute
+   ```
+4. **Verify results still accessible:**
+   - Database queries work
+   - Best design files intact
+   - OP2 files for Zernike extraction present
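
The last check in step 4 can be scripted. A minimal sketch, assuming trials live in sub-folders of `2_iterations/` as in the per-trial listing earlier; `trials_missing_op2` is a hypothetical helper, not part of the archiver module.

```python
from pathlib import Path
from typing import List


def trials_missing_op2(study_dir: Path) -> List[str]:
    """Return names of trial folders under 2_iterations/ that hold no .op2 file."""
    iterations = study_dir / "2_iterations"
    if not iterations.is_dir():
        return []
    missing = []
    for trial in sorted(iterations.iterdir()):
        # Only the trial folder itself is inspected, not nested folders.
        if trial.is_dir() and not any(trial.glob("*.op2")):
            missing.append(trial.name)
    return missing
```

An empty list after cleanup suggests every trial still has results available for Zernike extraction.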
+
+### For Long-Term Storage
+
+1. **After cleanup**, archive to server:
+   ```bash
+   archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute
+   ```
+2. **Optionally delete local** study folder
+3. **Keep only** `3_results/best_design_archive/` locally if needed
+
+### When Revisiting Old Study
+
+1. **Check if archived:**
+   ```bash
+   archive_study.bat list
+   ```
+2. **Restore:**
+   ```bash
+   archive_study.bat restore m1_mirror_V12
+   ```
+3. **If re-running trials needed**, master files in `1_setup/` allow full regeneration
+
+---
+
+## Part 6: Disk Space Targets
+
+### Per-Project Guidelines
+
+| Stage | Expected Size | Notes |
+|-------|---------------|-------|
+| Active (full) | 100% | All files present |
+| Completed (cleaned) | ~50% | Deletables removed |
+| Archived (minimal) | ~3% | Best design only locally |
+
+### M1_Mirror Specific
+
+| Stage | Size | Notes |
+|-------|------|-------|
+| Full | 194 GB | 28 studies, 2000+ trials |
+| After cleanup | 114 GB | OP2 + metadata only |
+| Minimal local | 5-10 GB | Best designs + database |
+| Server archive | ~50 GB | Compressed |
+
+---
+
+## Part 7: Safety Features
+
+### Built-in Protections
+
+1. **Dry run by default** - Must explicitly add `--execute`
+2. **Master files untouched** - `1_setup/` is never modified
+3. **Results preserved** - `3_results/` is never touched
+4. **Essential files preserved** - OP2, JSON, NPZ always kept
+5. **Archive verification** - rsync checks integrity
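
Protections 1 to 4 can be pictured as a single function that only deletes when asked to. The following is a sketch of that pattern under the extension lists given in Part 1, not the actual `cleanup_study()` implementation; `cleanup_sketch` and its constants are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple

DELETABLE = {".prt", ".fem", ".sim", ".afm", ".dat", ".f04", ".f06",
             ".log", ".diag", ".txt", ".exp", ".bak"}
PROTECTED_DIRS = {"1_setup", "3_results", "best_design_archive"}


def cleanup_sketch(study_dir: Path, dry_run: bool = True) -> Tuple[List[Path], int]:
    """List (and, if dry_run is False, delete) regenerable files.

    Returns the affected paths and the number of bytes they occupy.
    """
    targets: List[Path] = []
    freed = 0
    for path in study_dir.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(study_dir)
        if any(part in PROTECTED_DIRS for part in rel.parts[:-1]):
            continue  # never touch protected folders
        if path.suffix.lower() in DELETABLE:
            targets.append(path)
            freed += path.stat().st_size
    if not dry_run:
        # Deletion happens only after the full list has been collected.
        for path in targets:
            path.unlink()
    return targets, freed
```

Because `dry_run` defaults to `True`, a bare call reports what would be removed and leaves the study untouched.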
+
+### What Cleanup Removes and How to Recover It
+
+| File Type | Recovery Method |
+|-----------|-----------------|
+| `.prt` | Copy from `1_setup/` + update params |
+| `.fem` | Regenerate from `.prt` in NX |
+| `.sim` | Recreate simulation setup |
+| `.dat` | Regenerate from params.json + model |
+| `.f04`, `.f06` | Re-run solver (if needed) |
+
+**Note:** With `1_setup/` master files and `params.json`, ANY trial can be fully reconstructed. The only irreplaceable data is the OP2 results (which we keep).
+
+---
+
+## Part 8: Troubleshooting
+
+### SSH Connection Failed
+
+```bash
+# Test connectivity
+ping 192.168.86.50
+
+# Test SSH
+ssh papa@192.168.86.50 "echo connected"
+
+# If on different network, use Tailscale
+ssh papa@100.80.199.40 "echo connected"
+```
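
The same fallback order (LAN first, then Tailscale) can be checked from Python before starting a long transfer. This sketch only tests that a TCP connection to the SSH port opens; it does not authenticate. `first_reachable` is a hypothetical helper, and port 22 is an assumption since the document does not state the SSH port.

```python
import socket
from typing import Iterable, Optional


def first_reachable(hosts: Iterable[str], port: int = 22,
                    timeout: float = 3.0) -> Optional[str]:
    """Return the first host that accepts a TCP connection on `port`, else None."""
    for host in hosts:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return host
        except OSError:  # covers refused connections, timeouts, unreachable hosts
            continue
    return None
```

For example, `first_reachable(["192.168.86.50", "100.80.199.40"])` would return the LAN address when it answers and fall back to the Tailscale address otherwise.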
+
+### Archive Upload Slow
+
+Large studies (50+ GB) take time. Options:
+- Run overnight
+- Use wired LAN connection
+- Pre-cleanup to reduce size
+
+### Out of Disk Space During Archive
+
+The archive is created locally before upload, so you need roughly 1.5x the study size in free space:
+- 20 GB study = ~30 GB temp space required
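
The 1.5x rule can be checked before archiving starts. A minimal sketch using `shutil.disk_usage`; the helper names and the idea of checking the working directory's drive are assumptions, not behaviour confirmed for `archive_to_remote()`.

```python
import shutil
from pathlib import Path

TEMP_SPACE_FACTOR = 1.5  # from the guidance above


def required_free_bytes(study_bytes: int, factor: float = TEMP_SPACE_FACTOR) -> int:
    """Temporary space needed to build the archive locally."""
    return int(study_bytes * factor)


def has_room_for_archive(study_bytes: int, work_dir: Path) -> bool:
    """True if the drive holding `work_dir` has enough free space."""
    free = shutil.disk_usage(work_dir).free
    return free >= required_free_bytes(study_bytes)
```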
+
+### Cleanup Removed Wrong Files
+
+If accidentally executed without dry run:
+- OP2 files preserved (can still extract results)
+- Master files in `1_setup/` intact
+- Regenerate other files by re-running trial
+
+---
+
+## Part 9: Integration with Atomizer
+
+### Protocol Reference
+
+**Related Protocol:** `docs/protocols/operations/OP_07_DISK_OPTIMIZATION.md`
+
+### Claude Commands
+
+When user says:
+- "analyze disk usage" → Run `analyze_study()`
+- "clean up study" → Run `cleanup_study()` with confirmation
+- "archive to server" → Run `archive_to_remote()`
+- "restore study" → Run `restore_from_remote()`
+
+### Automatic Suggestions
+
+After optimization completion, suggest:
+```
+Optimization complete! The study is using X GB.
+Would you like me to clean up regenerable files to save Y GB?
+(This keeps all results but removes intermediate model copies)
+```
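
Filling in X and Y is a matter of formatting two byte counts. A sketch, assuming the values come from an earlier analysis step; `cleanup_suggestion` is a hypothetical name.

```python
def cleanup_suggestion(total_bytes: int, deletable_bytes: int) -> str:
    """Fill the suggestion template above with sizes in GB."""
    return (
        f"Optimization complete! The study is using {total_bytes / 1e9:.2f} GB.\n"
        f"Would you like me to clean up regenerable files to save "
        f"{deletable_bytes / 1e9:.2f} GB?\n"
        "(This keeps all results but removes intermediate model copies)"
    )
```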
+
+---
+
+## Part 10: File Inventory
+
+### Files Created
+
+| File | Purpose |
+|------|---------|
+| `optimization_engine/utils/study_archiver.py` | Core utility module |
+| `tools/archive_study.bat` | Windows batch script |
+| `docs/protocols/operations/OP_07_DISK_OPTIMIZATION.md` | Full protocol |
+| `.claude/skills/modules/study-disk-optimization.md` | This document |
+
+### Dependencies
+
+- Python 3.8+
+- rsync (for remote operations; usually pre-installed on Linux and macOS, but must be installed separately on Windows, e.g. via WSL or MSYS2)
+- SSH client (for remote operations)
+- Tailscale (optional, for remote access outside LAN)
+
+---
+
+## Appendix A: Cleanup Results Log (Dec 2025)
+
+### Initial Cleanup Run
+
+| Study | Before | After | Freed | Files Deleted |
+|-------|--------|-------|-------|---------------|
+| m1_mirror_cost_reduction_V11 | 32.24 GB | 15.94 GB | 16.30 GB | 3,403 |
+| m1_mirror_cost_reduction_flat_back_V3 | 52.50 GB | 26.87 GB | 25.63 GB | 5,084 |
+| m1_mirror_cost_reduction_flat_back_V6 | 33.71 GB | 16.64 GB | 17.08 GB | 3,391 |
+| m1_mirror_cost_reduction_V12 | 22.68 GB | 10.60 GB | 12.08 GB | 2,508 |
+| m1_mirror_cost_reduction_flat_back_V1 | 8.76 GB | 4.54 GB | 4.22 GB | 813 |
+| m1_mirror_cost_reduction_flat_back_V5 | 8.01 GB | 4.09 GB | 3.92 GB | 765 |
+| m1_mirror_cost_reduction | 3.58 GB | 3.08 GB | 0.50 GB | 267 |
+| **TOTAL** | **161.48 GB** | **81.76 GB** | **79.73 GB** | **16,231** |
+
+### Project-Wide Summary
+
+```
+Before cleanup: 193.75 GB
+After cleanup:  114.03 GB
+Total freed:     79.72 GB (41% reduction)
+```
+
+---
+
+## Appendix B: Quick Reference Card
+
+### Commands
+
+```bash
+# Analyze
+archive_study.bat analyze <path>
+
+# Cleanup (always dry-run first!)
+archive_study.bat cleanup <study>           # Dry run
+archive_study.bat cleanup <study> --execute # Execute
+
+# Archive
+archive_study.bat archive <study> --execute
+archive_study.bat archive <study> --execute --tailscale
+
+# Remote
+archive_study.bat list
+archive_study.bat restore <name>
+```
+
+### Python
+
+```python
+from pathlib import Path
+
+from optimization_engine.utils.study_archiver import analyze_study, cleanup_study
+
+# Quick analysis
+analysis = analyze_study(Path("studies/M1_Mirror"))
+print(f"Deletable: {analysis['deletable_size']/1e9:.2f} GB")
+
+# Cleanup (dry_run=False executes the deletion)
+cleanup_study(Path("studies/M1_Mirror/m1_mirror_V12"), dry_run=False)
+```
+
+### Server Access
+
+```bash
+# Local
+ssh papa@192.168.86.50
+
+# Remote (Tailscale)
+ssh papa@100.80.199.40
+
+# Archive location
+/srv/storage/atomizer-archive/
+```
+
+---
+
+*This module enables efficient disk space management for large-scale FEA optimization studies.*