# OP_07: Disk Space Optimization

**Version:** 1.0
**Last Updated:** 2025-12-29

## Overview

This protocol manages disk space for Atomizer studies through:

1. **Local cleanup** - Remove regenerable files from completed studies
2. **Remote archival** - Archive to the dalidou server (14 TB available)
3. **On-demand restore** - Pull archived studies back when needed

## Disk Usage Analysis

### Typical Study Breakdown

| File Type | Size/Trial | Purpose | Keep? |
|-----------|------------|---------|-------|
| `.op2` | 68 MB | Nastran results | **YES** - Needed for analysis |
| `.prt` | 30 MB | NX parts | NO - Copy of master |
| `.dat` | 16 MB | Solver input | NO - Regenerable |
| `.fem` | 14 MB | FEM mesh | NO - Copy of master |
| `.sim` | 7 MB | Simulation | NO - Copy of master |
| `.afm` | 4 MB | Assembly FEM | NO - Regenerable |
| `.json` | <1 MB | Params/results | **YES** - Metadata |
| Logs | <1 MB | F04/F06/log | NO - Diagnostic only |

**Per-trial overhead:** ~150 MB total, of which only ~70 MB is essential

### M1_Mirror Example

```
Current:        194 GB (28 studies, 2000+ trials)
After cleanup:   95 GB (51% reduction)
After archive:    5 GB (keep best_design_archive only)
```

## Commands

### 1. Analyze Disk Usage

```bash
# Single study
archive_study.bat analyze studies\M1_Mirror\m1_mirror_V12

# All studies in a project
archive_study.bat analyze studies\M1_Mirror
```

Output shows:

- Total size
- Essential vs deletable breakdown
- Trial count per study
- Per-extension analysis
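The keep/delete split from the table can be reproduced with a short standalone sketch. This is a hypothetical re-implementation for illustration only, not the tool's actual `analyze_study` code; the extension sets are taken from the table and cleanup rules above:

```python
from pathlib import Path

# Keep/delete classification taken from the table above (assumption: the
# real tool may classify additional extensions).
ESSENTIAL_EXTS = {".op2", ".json", ".npz"}
DELETABLE_EXTS = {".prt", ".dat", ".fem", ".sim", ".afm",
                  ".f04", ".f06", ".log", ".diag", ".txt", ".exp", ".bak"}

def analyze_study_dir(study_dir: Path) -> dict:
    """Walk a study folder and bucket file sizes as essential/deletable/other."""
    totals = {"essential": 0, "deletable": 0, "other": 0, "by_ext": {}}
    for f in study_dir.rglob("*"):
        if not f.is_file():
            continue
        size = f.stat().st_size
        ext = f.suffix.lower()
        totals["by_ext"][ext] = totals["by_ext"].get(ext, 0) + size
        if ext in ESSENTIAL_EXTS:
            totals["essential"] += size
        elif ext in DELETABLE_EXTS:
            totals["deletable"] += size
        else:
            totals["other"] += size
    return totals
```

Summing per extension first makes the "per-extension analysis" output essentially free: it is the same dictionary the keep/delete totals are derived from.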
### 2. Cleanup Completed Study

```bash
# Dry run (default) - see what would be deleted
archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12

# Actually delete
archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute
```

**What gets deleted:**

- `.prt`, `.fem`, `.sim`, `.afm` in trial folders
- `.dat`, `.f04`, `.f06`, `.log`, `.diag` solver files
- Temp files (`.txt`, `.exp`, `.bak`)

**What is preserved:**

- `1_setup/` folder (master model)
- `3_results/` folder (database, reports)
- All `.op2` files (Nastran results)
- All `.json` files (params, metadata)
- All `.npz` files (Zernike coefficients)
- `best_design_archive/` folder

### 3. Archive to Remote Server

```bash
# Dry run
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12

# Actually archive
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute

# Use Tailscale (when not on the local network)
archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute --tailscale
```

**Process:**

1. Creates a compressed `.tar.gz` archive
2. Uploads it to `papa@192.168.86.50:/srv/storage/atomizer-archive/`
3. Deletes the local archive after a successful upload

### 4. List Remote Archives

```bash
archive_study.bat list

# Via Tailscale
archive_study.bat list --tailscale
```
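The cleanup rules from command 2 amount to a recursive walk that skips protected folders and removes files with regenerable extensions. A minimal sketch, assuming the folder names listed above; this is not the tool's actual implementation, and like the CLI it defaults to a dry run:

```python
from pathlib import Path

# Deletable extensions and protected folders, per the cleanup rules above
DELETABLE_EXTS = {".prt", ".fem", ".sim", ".afm",
                  ".dat", ".f04", ".f06", ".log", ".diag",
                  ".txt", ".exp", ".bak"}
PROTECTED_DIRS = {"1_setup", "3_results", "best_design_archive"}

def cleanup_study_dir(study_dir: Path, execute: bool = False) -> list[Path]:
    """Return (and, with execute=True, delete) regenerable files.

    Dry run by default, mirroring the CLI's --execute convention.
    """
    victims = []
    for f in study_dir.rglob("*"):
        if not f.is_file():
            continue
        # Never touch anything inside a protected folder
        if any(p in PROTECTED_DIRS for p in f.relative_to(study_dir).parts):
            continue
        if f.suffix.lower() in DELETABLE_EXTS:
            victims.append(f)
            if execute:
                f.unlink()
    return victims
```

Checking every path component against `PROTECTED_DIRS` means a `best_design_archive/` folder is preserved wherever it sits in the tree, not only directly under `3_results/`.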
### 5. Restore from Remote

```bash
# Restore to the studies/ folder
archive_study.bat restore m1_mirror_V12

# Via Tailscale
archive_study.bat restore m1_mirror_V12 --tailscale
```

## Remote Server Setup

**Server:** dalidou (Lenovo W520)

- Local IP: `192.168.86.50`
- Tailscale IP: `100.80.199.40`
- SSH user: `papa`
- Archive path: `/srv/storage/atomizer-archive/`

### First-Time Setup

SSH into dalidou and create the archive directory:

```bash
ssh papa@192.168.86.50
mkdir -p /srv/storage/atomizer-archive
```

Ensure SSH key authentication is set up for passwordless transfers:

```bash
# ssh-copy-id is not included with the Windows OpenSSH client;
# from PowerShell, append your public key manually
# (adjust the key filename to match yours):
type $env:USERPROFILE\.ssh\id_ed25519.pub | ssh papa@192.168.86.50 "cat >> ~/.ssh/authorized_keys"
```

## Recommended Workflow

### During Active Optimization

Keep all files - you may need to re-run specific trials.

### After Study Completion

1. **Generate final report** (`STUDY_REPORT.md`)
2. **Archive best design** to `3_results/best_design_archive/`
3. **Cleanup:**
   ```bash
   archive_study.bat cleanup studies\M1_Mirror\m1_mirror_V12 --execute
   ```

### For Long-Term Storage

1. **After cleanup**, archive to the server:
   ```bash
   archive_study.bat archive studies\M1_Mirror\m1_mirror_V12 --execute
   ```
2. **Optionally delete local copies** (keep only `3_results/best_design_archive/`)

### When Revisiting an Old Study

1. **Restore:**
   ```bash
   archive_study.bat restore m1_mirror_V12
   ```
2. If you need to re-run trials, the `1_setup/` master files allow regenerating everything

## Safety Features

- **Dry run by default** - Must add `--execute` to actually delete/transfer
- **Master files preserved** - `1_setup/` is never touched
- **Results preserved** - `3_results/` is never touched
- **Essential files preserved** - OP2, JSON, NPZ always kept

## Disk Space Targets

| Stage | M1_Mirror Target |
|-------|------------------|
| Active development | 200 GB (full) |
| Completed studies | 95 GB (after cleanup) |
| Archived (minimal local) | 5 GB (best only) |
| Server archive | 50 GB compressed |

## Troubleshooting

### SSH Connection Failed

```bash
# Test connectivity
ping 192.168.86.50

# Test SSH
ssh papa@192.168.86.50 "echo connected"

# If on a different network, use Tailscale
ssh papa@100.80.199.40 "echo connected"
```

### Archive Upload Slow

Large studies (50+ GB) take time. The tool uses `rsync` with progress display. For very large archives, consider running the transfer overnight or using a direct LAN connection.

### Out of Disk Space During Archive

The archive is created locally first, so ensure you have ~1.5x the study size free:

- 20 GB study = ~30 GB temp space needed

## Python API

```python
from pathlib import Path

from optimization_engine.utils.study_archiver import (
    analyze_study,
    cleanup_study,
    archive_to_remote,
    restore_from_remote,
    list_remote_archives,
)

# Analyze
analysis = analyze_study(Path("studies/M1_Mirror/m1_mirror_V12"))
print(f"Deletable: {analysis['deletable_size']/1e9:.2f} GB")

# Cleanup (dry_run=False to actually delete)
cleanup_study(Path("studies/M1_Mirror/m1_mirror_V12"), dry_run=False)

# Archive
archive_to_remote(Path("studies/M1_Mirror/m1_mirror_V12"), dry_run=False)

# List remote
archives = list_remote_archives()
for a in archives:
    print(f"{a['name']}: {a['size']}")
```
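The ~1.5x free-space rule from the troubleshooting section can be checked before starting an archive, since the `.tar.gz` is staged locally first. `check_archive_space` is a hypothetical helper sketched with the standard library's `shutil.disk_usage`, not part of the `study_archiver` API:

```python
import shutil
from pathlib import Path

def check_archive_space(study_dir: Path, factor: float = 1.5) -> bool:
    """Return True if the filesystem holding study_dir has at least
    `factor` x the study's total size free (the archive is created
    locally before upload, hence the headroom)."""
    study_size = sum(f.stat().st_size
                     for f in study_dir.rglob("*") if f.is_file())
    free = shutil.disk_usage(study_dir).free
    return free >= factor * study_size
```

Running this before `archive_to_remote` turns the "out of disk space during archive" failure into an early, explicit error.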