Study Disk Optimization Module

Atomizer Disk Space Management System

Version: 1.0
Created: 2025-12-29
Status: PRODUCTION READY
Impact: Reduced M1_Mirror from 194 GB → 114 GB (80 GB freed, 41% reduction)


Executive Summary

FEA optimization studies consume massive disk space due to per-trial file copying. This module provides:

  1. Local Cleanup - Remove regenerable files from completed studies (50%+ savings)
  2. Remote Archival - Archive to dalidou server (14TB available)
  3. On-Demand Restore - Pull archived studies when needed

Key Insight

Each trial folder contains ~150 MB, but only ~70 MB is essential (OP2 results + metadata). The rest are copies of master files that can be regenerated.


Part 1: File Classification

Essential Files (KEEP)

| Extension | Purpose | Typical Size |
|-----------|---------|--------------|
| .op2 | Nastran binary results | 68 MB |
| .json | Parameters, results, metadata | <1 MB |
| .npz | Pre-computed Zernike coefficients | <1 MB |
| .html | Generated reports | <1 MB |
| .png | Visualization images | <1 MB |
| .csv | Exported data tables | <1 MB |

Deletable Files (REGENERABLE)

| Extension | Purpose | Why Deletable |
|-----------|---------|---------------|
| .prt | NX part files | Copy of master in 1_setup/ |
| .fem | FEM mesh files | Copy of master |
| .sim | Simulation files | Copy of master |
| .afm | Assembly FEM | Regenerable |
| .dat | Solver input deck | Regenerable from params |
| .f04 | Nastran output log | Diagnostic only |
| .f06 | Nastran printed output | Diagnostic only |
| .log | Generic logs | Diagnostic only |
| .diag | Diagnostic files | Diagnostic only |
| .txt | Temp text files | Intermediate data |
| .exp | Expression files | Regenerable |
| .bak | Backup files | Not needed |

Protected Folders (NEVER TOUCH)

| Folder | Reason |
|--------|--------|
| 1_setup/ | Master model files (source of truth) |
| 3_results/ | Final database, reports, best designs |
| best_design_archive/ | Archived optimal configurations |
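The classification rules above can be sketched as a small helper. The extension sets and the `classify` function here are illustrative only; the authoritative logic lives in `optimization_engine/utils/study_archiver.py`:

```python
from pathlib import Path

# Illustrative extension sets mirroring the tables above -- the real
# lists live in optimization_engine/utils/study_archiver.py.
ESSENTIAL_EXTS = {".op2", ".json", ".npz", ".html", ".png", ".csv"}
DELETABLE_EXTS = {".prt", ".fem", ".sim", ".afm", ".dat", ".f04", ".f06",
                  ".log", ".diag", ".txt", ".exp", ".bak"}
PROTECTED_FOLDERS = {"1_setup", "3_results", "best_design_archive"}

def classify(path: Path) -> str:
    """Return 'protected', 'keep', or 'delete' for a file in a study tree."""
    # Anything under a protected folder is never touched.
    if any(part in PROTECTED_FOLDERS for part in path.parts):
        return "protected"
    ext = path.suffix.lower()
    if ext in ESSENTIAL_EXTS:
        return "keep"
    if ext in DELETABLE_EXTS:
        return "delete"
    return "keep"  # Unknown types are kept by default (fail safe)
```

Keeping unknown extensions by default errs on the side of safety: a new file type is never deleted until it is explicitly added to the deletable list.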

Part 2: Disk Usage Analysis

M1_Mirror Project Baseline (Dec 2025)

Total: 194 GB across 28 studies, 2000+ trials

By File Type:
  .op2    94 GB (48.5%) - Nastran results [ESSENTIAL]
  .prt    41 GB (21.4%) - NX parts [DELETABLE]
  .fem    22 GB (11.5%) - FEM mesh [DELETABLE]
  .dat    22 GB (11.3%) - Solver input [DELETABLE]
  .sim     9 GB (4.5%)  - Simulation [DELETABLE]
  .afm     5 GB (2.5%)  - Assembly FEM [DELETABLE]
  Other   <1 GB (<1%)   - Logs, configs [MIXED]

By Folder:
  2_iterations/    168 GB (87%) - Per-trial data
  3_results/        22 GB (11%) - Final results
  1_setup/           4 GB (2%)  - Master models

Per-Trial Breakdown (Typical V11+ Structure)

iter1/
  assy_m1_assyfem1_sim1-solution_1.op2    68.15 MB  [KEEP]
  M1_Blank.prt                            29.94 MB  [DELETE]
  assy_m1_assyfem1_sim1-solution_1.dat    15.86 MB  [DELETE]
  M1_Blank_fem1.fem                       14.07 MB  [DELETE]
  ASSY_M1_assyfem1_sim1.sim                7.47 MB  [DELETE]
  M1_Blank_fem1_i.prt                      5.20 MB  [DELETE]
  ASSY_M1_assyfem1.afm                     4.13 MB  [DELETE]
  M1_Vertical_Support_Skeleton_fem1.fem    3.76 MB  [DELETE]
  ... (logs, temps)                       <1.00 MB  [DELETE]
  _temp_part_properties.json               0.00 MB  [KEEP]
  -------------------------------------------------------
  TOTAL:                                 149.67 MB
  Essential only:                         68.15 MB
  Savings:                                54.5%
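The 54.5% figure follows directly from the listing: savings = (total − essential) / total. A quick check:

```python
# Per-trial sizes from the breakdown above
total_mb = 149.67
essential_mb = 68.15  # the .op2 file (metadata files are ~0 MB)

savings = (total_mb - essential_mb) / total_mb
print(f"{savings:.1%}")  # → 54.5%
```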

Part 3: Implementation

Core Utility

Location: optimization_engine/utils/study_archiver.py

from optimization_engine.utils.study_archiver import (
    analyze_study,        # Get disk usage analysis
    cleanup_study,        # Remove deletable files
    archive_to_remote,    # Archive to dalidou
    restore_from_remote,  # Restore from dalidou
    list_remote_archives, # List server archives
)

Command Line Interface

Batch Script: tools/archive_study.bat

# Analyze disk usage
archive_study.bat analyze studies\M1_Mirror
archive_study.bat analyze studies\M1_Mirror\m1_mirror_V12

# Cleanup completed study (dry run by default)
archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12
archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute

# Archive to remote server
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute --tailscale

# List remote archives
archive_study.bat list
archive_study.bat list --tailscale

# Restore from remote
archive_study.bat restore m1_mirror_V12
archive_study.bat restore m1_mirror_V12 --tailscale

Python API

from pathlib import Path
from optimization_engine.utils.study_archiver import (
    analyze_study,
    cleanup_study,
    archive_to_remote,
)

# Analyze
study_path = Path("studies/M1_Mirror/m1_mirror_V12")
analysis = analyze_study(study_path)
print(f"Total: {analysis['total_size_bytes']/1e9:.2f} GB")
print(f"Essential: {analysis['essential_size']/1e9:.2f} GB")
print(f"Deletable: {analysis['deletable_size']/1e9:.2f} GB")

# Cleanup (dry_run=False to execute)
deleted, freed = cleanup_study(study_path, dry_run=False)
print(f"Freed {freed/1e9:.2f} GB")

# Archive to server
success = archive_to_remote(study_path, use_tailscale=False, dry_run=False)

Part 4: Remote Server Configuration

dalidou Server Specs

| Property | Value |
|----------|-------|
| Hostname | dalidou |
| Local IP | 192.168.86.50 |
| Tailscale IP | 100.80.199.40 |
| SSH User | papa |
| Archive Path | /srv/storage/atomizer-archive/ |
| Available Storage | 3.6 TB (SSD) + 12.7 TB (HDD) |

First-Time Setup

# 1. SSH into server and create archive directory
ssh papa@192.168.86.50
mkdir -p /srv/storage/atomizer-archive

# 2. Set up passwordless SSH (from the Windows workstation)
ssh-keygen -t ed25519  # If you don't already have a key
ssh-copy-id papa@192.168.86.50
# If ssh-copy-id is unavailable on Windows, append the contents of
# ~/.ssh/id_ed25519.pub to ~/.ssh/authorized_keys on the server instead

# 3. Test connection
ssh papa@192.168.86.50 "echo 'Connection OK'"

Archive Structure on Server

/srv/storage/atomizer-archive/
├── m1_mirror_V11_20251229.tar.gz    # Compressed study archive
├── m1_mirror_V12_20251229.tar.gz
├── m1_mirror_flat_back_V3_20251229.tar.gz
└── manifest.json                     # Index of all archives
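A manifest entry might look like the following. The field names here are hypothetical; check manifest.json on the server for the actual schema:

```python
import json

# Hypothetical manifest entry -- the real manifest.json on dalidou
# may use different field names.
entry = {
    "archive": "m1_mirror_V12_20251229.tar.gz",
    "study": "m1_mirror_V12",
    "archived_on": "2025-12-29",
    "size_bytes": 11_400_000_000,
    "source_path": "studies/M1_Mirror/m1_mirror_V12",
}
print(json.dumps(entry, indent=2))
```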

Part 5: Study Lifecycle

During Active Optimization

Keep all files - You may need to:

  • Re-run specific failed trials
  • Debug mesh issues
  • Analyze intermediate results

After Study Completion

  1. Generate final report (STUDY_REPORT.md)
  2. Archive best design to 3_results/best_design_archive/
  3. Run cleanup:
    archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute
    
  4. Verify results still accessible:
    • Database queries work
    • Best design files intact
    • OP2 files for Zernike extraction present
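The verification in step 4 can be partly automated. The `verify_trials` helper below is a hypothetical sketch (not part of study_archiver.py) that flags any trial folder missing its OP2 results:

```python
from pathlib import Path
from typing import List

def verify_trials(study_path: Path) -> List[str]:
    """Return names of trial folders under 2_iterations/ missing .op2 results."""
    missing = []
    iterations = study_path / "2_iterations"
    if not iterations.is_dir():
        return missing
    for trial in sorted(p for p in iterations.iterdir() if p.is_dir()):
        # Every trial should retain at least one OP2 results file after cleanup.
        if not any(trial.glob("*.op2")):
            missing.append(trial.name)
    return missing
```

Run it after cleanup; an empty list means every trial still has its OP2 results.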

For Long-Term Storage

  1. After cleanup, archive to server:
    archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute
    
  2. Optionally delete local study folder
  3. Keep only 3_results/best_design_archive/ locally if needed

When Revisiting Old Study

  1. Check if archived:
    archive_study.bat list
    
  2. Restore:
    archive_study.bat restore m1_mirror_V12
    
  3. If re-running trials needed, master files in 1_setup/ allow full regeneration

Part 6: Disk Space Targets

Per-Project Guidelines

| Stage | Expected Size | Notes |
|-------|---------------|-------|
| Active (full) | 100% | All files present |
| Completed (cleaned) | ~50% | Deletables removed |
| Archived (minimal) | ~3% | Best design only locally |

M1_Mirror Specific

| Stage | Size | Notes |
|-------|------|-------|
| Full | 194 GB | 28 studies, 2000+ trials |
| After cleanup | 114 GB | OP2 + metadata only |
| Minimal local | 5-10 GB | Best designs + database |
| Server archive | ~50 GB | Compressed |

Part 7: Safety Features

Built-in Protections

  1. Dry run by default - Must explicitly add --execute
  2. Master files untouched - 1_setup/ is never modified
  3. Results preserved - 3_results/ is never touched
  4. Essential files preserved - OP2, JSON, NPZ always kept
  5. Archive verification - rsync checks integrity

What Cleanup Removes (and How to Recover It)

| File Type | Recovery Method |
|-----------|-----------------|
| .prt | Copy from 1_setup/ + update params |
| .fem | Regenerate from .prt in NX |
| .sim | Recreate simulation setup |
| .dat | Regenerate from params.json + model |
| .f04/.f06 | Re-run solver (if needed) |

Note: With 1_setup/ master files and params.json, ANY trial can be fully reconstructed. The only irreplaceable data is the OP2 results (which we keep).


Part 8: Troubleshooting

SSH Connection Failed

# Test connectivity
ping 192.168.86.50

# Test SSH
ssh papa@192.168.86.50 "echo connected"

# If on different network, use Tailscale
ssh papa@100.80.199.40 "echo connected"

Archive Upload Slow

Large studies (50+ GB) take time. Options:

  • Run overnight
  • Use wired LAN connection
  • Pre-cleanup to reduce size

Out of Disk Space During Archive

The archive is staged locally before upload, so you need roughly 1.5× the study size in free space:

  • 20 GB study = ~30 GB temp space required
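A pre-flight check along these lines avoids failing mid-archive. The 1.5× factor comes from the guideline above; `has_temp_space` is an illustrative helper, not part of the shipped tooling:

```python
import shutil
from pathlib import Path

SAFETY_FACTOR = 1.5  # archive is staged locally before upload

def required_temp_bytes(study_size_bytes: int) -> int:
    """Free space needed to stage the archive locally."""
    return int(study_size_bytes * SAFETY_FACTOR)

def has_temp_space(study_size_bytes: int, staging_dir: Path = Path(".")) -> bool:
    """Check whether staging_dir has room to build the archive."""
    free = shutil.disk_usage(staging_dir).free
    return free >= required_temp_bytes(study_size_bytes)

# 20 GB study -> ~30 GB of temp space required
print(required_temp_bytes(20 * 10**9) / 1e9)  # → 30.0
```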

Cleanup Removed Wrong Files

If accidentally executed without dry run:

  • OP2 files preserved (can still extract results)
  • Master files in 1_setup/ intact
  • Regenerate other files by re-running trial

Part 9: Integration with Atomizer

Protocol Reference

Related Protocol: docs/protocols/operations/OP_07_DISK_OPTIMIZATION.md

Claude Commands

When user says:

  • "analyze disk usage" → Run analyze_study()
  • "clean up study" → Run cleanup_study() with confirmation
  • "archive to server" → Run archive_to_remote()
  • "restore study" → Run restore_from_remote()

Automatic Suggestions

After optimization completion, suggest:

Optimization complete! The study is using X GB.
Would you like me to clean up regenerable files to save Y GB?
(This keeps all results but removes intermediate model copies)
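The placeholders X and Y can be filled in from analyze_study() output. This formatting helper is hypothetical, shown only to make the wiring concrete:

```python
def cleanup_suggestion(total_bytes: int, deletable_bytes: int) -> str:
    """Render the post-optimization cleanup prompt shown above."""
    return (
        f"Optimization complete! The study is using {total_bytes/1e9:.1f} GB.\n"
        f"Would you like me to clean up regenerable files to save "
        f"{deletable_bytes/1e9:.1f} GB?\n"
        "(This keeps all results but removes intermediate model copies)"
    )

# Example with the V11 numbers from Appendix A
print(cleanup_suggestion(32_240_000_000, 16_300_000_000))
```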

Part 10: File Inventory

Files Created

| File | Purpose |
|------|---------|
| optimization_engine/utils/study_archiver.py | Core utility module |
| tools/archive_study.bat | Windows batch script |
| docs/protocols/operations/OP_07_DISK_OPTIMIZATION.md | Full protocol |
| .claude/skills/modules/study-disk-optimization.md | This document |

Dependencies

  • Python 3.8+
  • rsync (for remote operations, usually pre-installed)
  • SSH client (for remote operations)
  • Tailscale (optional, for remote access outside LAN)
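Before first remote use it is worth confirming the external tools are actually on PATH. This checker is illustrative:

```python
import shutil

def missing_tools(tools=("rsync", "ssh")) -> list:
    """Return the subset of required command-line tools not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]

missing = missing_tools()
if missing:
    print("Missing required tools:", ", ".join(missing))
else:
    print("All remote-operation dependencies found.")
```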

Appendix A: Cleanup Results Log (Dec 2025)

Initial Cleanup Run

| Study | Before | After | Freed | Files Deleted |
|-------|--------|-------|-------|---------------|
| m1_mirror_cost_reduction_V11 | 32.24 GB | 15.94 GB | 16.30 GB | 3,403 |
| m1_mirror_cost_reduction_flat_back_V3 | 52.50 GB | 26.87 GB | 25.63 GB | 5,084 |
| m1_mirror_cost_reduction_flat_back_V6 | 33.71 GB | 16.64 GB | 17.08 GB | 3,391 |
| m1_mirror_cost_reduction_V12 | 22.68 GB | 10.60 GB | 12.08 GB | 2,508 |
| m1_mirror_cost_reduction_flat_back_V1 | 8.76 GB | 4.54 GB | 4.22 GB | 813 |
| m1_mirror_cost_reduction_flat_back_V5 | 8.01 GB | 4.09 GB | 3.92 GB | 765 |
| m1_mirror_cost_reduction | 3.58 GB | 3.08 GB | 0.50 GB | 267 |
| TOTAL | 161.48 GB | 81.76 GB | 79.73 GB | 16,231 |

Project-Wide Summary

Before cleanup: 193.75 GB
After cleanup:  114.03 GB
Total freed:     79.72 GB (41% reduction)

Appendix B: Quick Reference Card

Commands

# Analyze
archive_study.bat analyze <path>

# Cleanup (always dry-run first!)
archive_study.bat cleanup <study>           # Dry run
archive_study.bat cleanup <study> --execute # Execute

# Archive
archive_study.bat archive <study> --execute
archive_study.bat archive <study> --execute --tailscale

# Remote
archive_study.bat list
archive_study.bat restore <name>

Python

from optimization_engine.utils.study_archiver import *

# Quick analysis
analysis = analyze_study(Path("studies/M1_Mirror"))
print(f"Deletable: {analysis['deletable_size']/1e9:.2f} GB")

# Cleanup
cleanup_study(Path("studies/M1_Mirror/m1_mirror_V12"), dry_run=False)

Server Access

# Local
ssh papa@192.168.86.50

# Remote (Tailscale)
ssh papa@100.80.199.40

# Archive location
/srv/storage/atomizer-archive/

This module enables efficient disk space management for large-scale FEA optimization studies.