# OP_07: Disk Space Optimization

**Version:** 1.0
**Last Updated:** 2025-12-29

## Overview

This protocol manages disk space for Atomizer studies through:

1. **Local cleanup** - Remove regenerable files from completed studies
2. **Remote archival** - Archive to the dalidou server (14 TB available)
3. **On-demand restore** - Pull archived studies back when needed

## Disk Usage Analysis

### Typical Study Breakdown

| File Type | Size/Trial | Purpose | Keep? |
|-----------|------------|---------|-------|
| `.op2` | 68 MB | Nastran results | **YES** - Needed for analysis |
| `.prt` | 30 MB | NX parts | NO - Copy of master |
| `.dat` | 16 MB | Solver input | NO - Regenerable |
| `.fem` | 14 MB | FEM mesh | NO - Copy of master |
| `.sim` | 7 MB | Simulation | NO - Copy of master |
| `.afm` | 4 MB | Assembly FEM | NO - Regenerable |
| `.json` | <1 MB | Params/results | **YES** - Metadata |
| Logs | <1 MB | F04/F06/log | NO - Diagnostic only |

**Per-trial overhead:** ~150 MB total, of which only ~70 MB is essential

### M1_Mirror Example

```
Current:        194 GB (28 studies, 2000+ trials)
After cleanup:   95 GB (51% reduction)
After archive:    5 GB (keep best_design_archive only)
```

## Commands

### 1. Analyze Disk Usage

```bash
# Single study
archive_study.bat analyze studies\M1_Mirror\m1_mirror_V12

# All studies in a project
archive_study.bat analyze studies\M1_Mirror
```

Output shows:

- Total size
- Essential vs deletable breakdown
- Trial count per study
- Per-extension analysis
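The keep/delete split from the table can be reproduced with a short standalone sketch. This is a hypothetical re-implementation for illustration only, not the tool's actual `analyze_study` code; the extension sets are taken from the table and cleanup rules above:

```python
from pathlib import Path

# Keep/delete classification taken from the table above (assumption: the
# real tool may classify additional extensions).
ESSENTIAL_EXTS = {".op2", ".json", ".npz"}
DELETABLE_EXTS = {".prt", ".dat", ".fem", ".sim", ".afm",
                  ".f04", ".f06", ".log", ".diag", ".txt", ".exp", ".bak"}

def analyze_study_dir(study_dir: Path) -> dict:
    """Walk a study folder and bucket file sizes as essential/deletable/other."""
    totals = {"essential": 0, "deletable": 0, "other": 0, "by_ext": {}}
    for f in study_dir.rglob("*"):
        if not f.is_file():
            continue
        size = f.stat().st_size
        ext = f.suffix.lower()
        totals["by_ext"][ext] = totals["by_ext"].get(ext, 0) + size
        if ext in ESSENTIAL_EXTS:
            totals["essential"] += size
        elif ext in DELETABLE_EXTS:
            totals["deletable"] += size
        else:
            totals["other"] += size
    return totals
```

Summing per extension first makes the "per-extension analysis" output essentially free: it is the same dictionary the keep/delete totals are derived from.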
### 2. Cleanup Completed Study

```bash
# Dry run (default) - see what would be deleted
archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12

# Actually delete
archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute
```

**What gets deleted:**

- `.prt`, `.fem`, `.sim`, `.afm` in trial folders
- `.dat`, `.f04`, `.f06`, `.log`, `.diag` solver files
- Temp files (`.txt`, `.exp`, `.bak`)

**What is preserved:**

- `1_setup/` folder (master model)
- `3_results/` folder (database, reports)
- All `.op2` files (Nastran results)
- All `.json` files (params, metadata)
- All `.npz` files (Zernike coefficients)
- `best_design_archive/` folder

### 3. Archive to Remote Server

```bash
# Dry run
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12

# Actually archive
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute

# Use Tailscale (when not on the local network)
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute --tailscale
```

**Process:**

1. Creates a compressed `.tar.gz` archive
2. Uploads it to `papa@192.168.86.50:/srv/storage/atomizer-archive/`
3. Deletes the local archive after a successful upload

### 4. List Remote Archives

```bash
archive_study.bat list

# Via Tailscale
archive_study.bat list --tailscale
```
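The cleanup rules from command 2 amount to a recursive walk that skips protected folders and removes files with regenerable extensions. A minimal sketch, assuming the folder names listed above; this is not the tool's actual implementation, and like the CLI it defaults to a dry run:

```python
from pathlib import Path

# Deletable extensions and protected folders, per the cleanup rules above
DELETABLE_EXTS = {".prt", ".fem", ".sim", ".afm",
                  ".dat", ".f04", ".f06", ".log", ".diag",
                  ".txt", ".exp", ".bak"}
PROTECTED_DIRS = {"1_setup", "3_results", "best_design_archive"}

def cleanup_study_dir(study_dir: Path, execute: bool = False) -> list[Path]:
    """Return (and, with execute=True, delete) regenerable files.

    Dry run by default, mirroring the CLI's --execute convention.
    """
    victims = []
    for f in study_dir.rglob("*"):
        if not f.is_file():
            continue
        # Never touch anything inside a protected folder
        if any(p in PROTECTED_DIRS for p in f.relative_to(study_dir).parts):
            continue
        if f.suffix.lower() in DELETABLE_EXTS:
            victims.append(f)
            if execute:
                f.unlink()
    return victims
```

Checking every path component against `PROTECTED_DIRS` means a `best_design_archive/` folder is preserved wherever it sits in the tree, not only directly under `3_results/`.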
### 5. Restore from Remote

```bash
# Restore to the studies/ folder
archive_study.bat restore m1_mirror_V12

# Via Tailscale
archive_study.bat restore m1_mirror_V12 --tailscale
```

## Remote Server Setup

**Server:** dalidou (Lenovo W520)

- Local IP: `192.168.86.50`
- Tailscale IP: `100.80.199.40`
- SSH user: `papa`
- Archive path: `/srv/storage/atomizer-archive/`

### First-Time Setup

SSH into dalidou and create the archive directory:

```bash
ssh papa@192.168.86.50
mkdir -p /srv/storage/atomizer-archive
```

Ensure SSH key authentication is set up for passwordless transfers:

```bash
# ssh-copy-id is not included with the Windows OpenSSH client;
# from PowerShell, append your public key manually
# (adjust the key filename to match yours):
type $env:USERPROFILE\.ssh\id_ed25519.pub | ssh papa@192.168.86.50 "cat >> ~/.ssh/authorized_keys"
```

## Recommended Workflow

### During Active Optimization

Keep all files - you may need to re-run specific trials.

### After Study Completion

1. **Generate final report** (`STUDY_REPORT.md`)
2. **Archive best design** to `3_results/best_design_archive/`
3. **Cleanup:**
   ```bash
   archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute
   ```

### For Long-Term Storage

1. **After cleanup**, archive to the server:
   ```bash
   archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute
   ```
2. **Optionally delete local copies** (keep only `3_results/best_design_archive/`)

### When Revisiting an Old Study

1. **Restore:**
   ```bash
   archive_study.bat restore m1_mirror_V12
   ```
2. If you need to re-run trials, the `1_setup/` master files allow regenerating everything

## Safety Features

- **Dry run by default** - Must add `--execute` to actually delete/transfer
- **Master files preserved** - `1_setup/` is never touched
- **Results preserved** - `3_results/` is never touched
- **Essential files preserved** - OP2, JSON, NPZ always kept

## Disk Space Targets

| Stage | M1_Mirror Target |
|-------|------------------|
| Active development | 200 GB (full) |
| Completed studies | 95 GB (after cleanup) |
| Archived (minimal local) | 5 GB (best only) |
| Server archive | 50 GB compressed |

## Troubleshooting

### SSH Connection Failed

```bash
# Test connectivity
ping 192.168.86.50

# Test SSH
ssh papa@192.168.86.50 "echo connected"

# If on a different network, use Tailscale
ssh papa@100.80.199.40 "echo connected"
```

### Archive Upload Slow

Large studies (50+ GB) take time. The tool uses `rsync` with progress display. For very large archives, consider running the transfer overnight or using a direct LAN connection.

### Out of Disk Space During Archive

The archive is created locally first, so ensure you have ~1.5x the study size free:

- 20 GB study = ~30 GB temp space needed

## Python API

```python
from pathlib import Path

from optimization_engine.utils.study_archiver import (
    analyze_study,
    cleanup_study,
    archive_to_remote,
    restore_from_remote,
    list_remote_archives,
)

# Analyze
analysis = analyze_study(Path("studies/M1_Mirror/m1_mirror_V12"))
print(f"Deletable: {analysis['deletable_size']/1e9:.2f} GB")

# Cleanup (dry_run=False to actually delete)
cleanup_study(Path("studies/M1_Mirror/m1_mirror_V12"), dry_run=False)

# Archive
archive_to_remote(Path("studies/M1_Mirror/m1_mirror_V12"), dry_run=False)

# List remote
archives = list_remote_archives()
for a in archives:
    print(f"{a['name']}: {a['size']}")
```
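The ~1.5x free-space rule from the troubleshooting section can be checked before starting an archive, since the `.tar.gz` is staged locally first. `check_archive_space` is a hypothetical helper sketched with the standard library's `shutil.disk_usage`, not part of the `study_archiver` API:

```python
import shutil
from pathlib import Path

def check_archive_space(study_dir: Path, factor: float = 1.5) -> bool:
    """Return True if the filesystem holding study_dir has at least
    `factor` x the study's total size free (the archive is created
    locally before upload, hence the headroom)."""
    study_size = sum(f.stat().st_size
                     for f in study_dir.rglob("*") if f.is_file())
    free = shutil.disk_usage(study_dir).free
    return free >= factor * study_size
```

Running this before `archive_to_remote` turns the "out of disk space during archive" failure into an early, explicit error.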