Atomizer/.claude/skills/troubleshoot.md
Anto01 e3bdb08a22 feat: Major update with validators, skills, dashboard, and docs reorganization
- Add validation framework (config, model, results, study validators)
- Add Claude Code skills (create-study, run-optimization, generate-report,
  troubleshoot, analyze-model)
- Add Atomizer Dashboard (React frontend + FastAPI backend)
- Reorganize docs into structured directories (00-09)
- Add neural surrogate modules and training infrastructure
- Add multi-objective optimization support

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-25 19:23:58 -05:00


Troubleshoot Skill

Last Updated: November 25, 2025
Version: 1.0 - Debug Common Issues and Error Recovery

You are helping the user diagnose and fix problems with Atomizer optimization studies.

Purpose

Diagnose and resolve common issues:

  1. Configuration validation failures
  2. Model file problems
  3. NX solver errors
  4. Optuna/database issues
  5. Result extraction failures
  6. Constraint violation patterns

Triggers

  • "troubleshoot"
  • "debug"
  • "error"
  • "failed"
  • "not working"
  • "what's wrong"
  • "fix"

Prerequisites

  • Study must exist in studies/{study_name}/
  • User should describe the error or symptom

Diagnostic Process

Step 1: Run Full Validation

from optimization_engine.validators import validate_study

result = validate_study("{study_name}")
print(result)

This provides a complete health check covering:

  • Configuration validity
  • Model file presence
  • Results integrity (if any)

Step 2: Identify Error Category

Classify the issue into one of these categories:

| Category | Symptoms | First Check |
|---|---|---|
| Config | "Invalid config", validation errors | validate_config_file() |
| Model | "File not found", NX errors | validate_study_model() |
| Solver | "Simulation failed", timeout | NX logs, OP2 files |
| Database | "Study not found", lock errors | study.db file, Optuna |
| Extraction | "Cannot extract", NaN values | OP2 file validity |
| Constraints | All trials infeasible | Constraint thresholds |

Common Issues & Solutions

Issue 1: Configuration Validation Fails

Symptoms:

[ERROR] [DESIGN_VAR_BOUNDS] beam_thickness: min (5) >= max (3)

Diagnosis:

from optimization_engine.validators import validate_config_file

result = validate_config_file("studies/{study_name}/1_setup/optimization_config.json")
for error in result.errors:
    print(error)

Solutions:

| Error Code | Cause | Fix |
|---|---|---|
| DESIGN_VAR_BOUNDS | Bounds inverted | Swap min/max values |
| MISSING_OBJECTIVES | No objectives defined | Add objectives array |
| INVALID_DIRECTION | Wrong goal value | Use "minimize" or "maximize" |
| PROTOCOL_MISMATCH | Wrong protocol for objectives | Match protocol to # objectives |
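
The DESIGN_VAR_BOUNDS case is easy to check by hand before running the validator. The following is a minimal sketch, not the validator's actual implementation: the key names `design_variables`, `name`, `min`, and `max`, and the variable `beam_width`, are assumptions about the config layout made for illustration.

```python
def find_inverted_bounds(config):
    """Return one message per design variable whose min is not below its max."""
    problems = []
    # "design_variables", "name", "min" and "max" are assumed key names.
    for var in config.get("design_variables", []):
        lo, hi = var["min"], var["max"]
        if lo >= hi:
            problems.append(f"{var['name']}: min ({lo}) >= max ({hi})")
    return problems


example_config = {
    "design_variables": [
        {"name": "beam_thickness", "min": 5, "max": 3},  # inverted bounds
        {"name": "beam_width", "min": 10, "max": 40},    # valid bounds
    ]
}

for message in find_inverted_bounds(example_config):
    print(f"[ERROR] [DESIGN_VAR_BOUNDS] {message}")
```

Only the first variable is reported here, since its minimum exceeds its maximum.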

Issue 2: Model Files Missing

Symptoms:

[ERROR] No part file (.prt) found in model directory

Diagnosis:

from optimization_engine.validators import validate_study_model

result = validate_study_model("{study_name}")
print(f"Part: {result.prt_file}")
print(f"Sim: {result.sim_file}")
print(f"FEM: {result.fem_file}")

Solutions:

  1. Ensure files are in studies/{study_name}/1_setup/model/
  2. Check file naming convention (e.g., Beam.prt, Beam_sim1.sim)
  3. FEM file auto-generates on first solve (not required initially)
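
When the validator is not at hand, the same check can be approximated with the standard library. This is a rough sketch under the assumption that the model directory holds the files directly (no subfolders); the helper names `find_model_files` and `missing_required` are hypothetical and not part of Atomizer.

```python
from pathlib import Path


def find_model_files(model_dir):
    """Map each expected extension to the matching file names in model_dir."""
    model_dir = Path(model_dir)
    found = {}
    for ext in (".prt", ".sim", ".fem"):
        found[ext] = sorted(p.name for p in model_dir.glob(f"*{ext}"))
    return found


def missing_required(found):
    """List required extensions with no match.

    .fem is not treated as required because it is generated on first solve.
    """
    return [ext for ext in (".prt", ".sim") if not found[ext]]
```

Calling `missing_required(find_model_files("studies/{study_name}/1_setup/model"))` with the real study name returns an empty list when both the part and simulation files are present.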

Issue 3: NX Solver Fails

Symptoms:

[NX SOLVER] Error: Simulation timeout after 600s
[NX SOLVER] Error: Unable to open simulation file

Diagnosis:

  1. Check NX is installed and configured:

    from config import NX_VERSION, NX_INSTALL_PATH
    print(f"NX Version: {NX_VERSION}")
    print(f"Path: {NX_INSTALL_PATH}")
    
  2. Check for running NX processes:

    tasklist | findstr "ugraf"
    
  3. Read NX journal output:

    studies/{study_name}/1_setup/model/_temp_solve_journal.py
    

Solutions:

| Error | Cause | Fix |
|---|---|---|
| Timeout | Complex mesh or bad parameters | Increase timeout or simplify design |
| License error | NX license unavailable | Wait or check license server |
| File locked | Another NX process has file open | Close NX and retry |
| Expression not found | NX expression name mismatch | Verify expression names in NX |

Issue 4: Database Errors

Symptoms:

[ERROR] Study 'my_study' not found in storage
[ERROR] database is locked

Diagnosis:

import optuna
storage = "sqlite:///studies/{study_name}/2_results/study.db"
studies = optuna.study.get_all_study_summaries(storage)
print([s.study_name for s in studies])

Solutions:

| Error | Cause | Fix |
|---|---|---|
| Study not found | Wrong study name | Check exact name in database |
| Database locked | Multiple processes | Kill other optimization processes |
| Corrupted DB | Interrupted write | Delete and restart (backup first) |
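
Before deleting a database that is suspected to be corrupted, it may be worth asking SQLite itself. The sketch below uses only the standard `sqlite3` module and the `PRAGMA integrity_check` statement; the function name `sqlite_integrity_ok` is hypothetical. The file is opened read-only so that checking a missing path does not create an empty database.

```python
import sqlite3
from pathlib import Path


def sqlite_integrity_ok(db_path):
    """Return True if SQLite's integrity check reports no problems."""
    # Read-only URI: a missing file raises instead of being created.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True, timeout=5)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError:
        # Covers files that are not SQLite databases, missing files,
        # and lock errors that persist past the timeout.
        return False
    return rows == [("ok",)]
```

A `False` result does not say which of the three causes applies, so check first that the file exists and that no other optimization process is running.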

Issue 5: Result Extraction Fails

Symptoms:

[ERROR] Cannot extract displacement from OP2
[ERROR] NaN values in objectives

Diagnosis:

  1. Check OP2 file exists:

    dir studies\{study_name}\1_setup\model\*.op2
    
  2. Validate OP2 contents:

    from pyNastran.op2.op2 import OP2
    op2 = OP2()
    op2.read_op2("path/to/file.op2")
    # Summarize which result tables and subcases the file contains
    print(op2.get_op2_stats(short=True))
    
  3. Check extraction config matches OP2:

    {
      "extraction": {
        "params": {
          "subcase": 1,
          "result_type": "displacement"
        }
      }
    }
    

Solutions:

| Error | Cause | Fix |
|---|---|---|
| No OP2 file | Solve didn't run | Check NX solver output |
| Wrong subcase | Subcase ID mismatch | Match subcase to solution |
| Missing result | Result not requested | Enable output in NX |
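
For the "NaN values in objectives" symptom, a quick filter over the extracted values shows which objective is affected. This is a minimal sketch; the objective names `max_displacement` and `mass` are made up for the example, and the dictionary shape is an assumption rather than Atomizer's actual result format.

```python
import math


def non_finite_objectives(objectives):
    """Return the names of objectives whose value is NaN, infinite, or missing."""
    bad = []
    for name, value in objectives.items():
        if value is None or not math.isfinite(value):
            bad.append(name)
    return bad


print(non_finite_objectives({"max_displacement": float("nan"), "mass": 12.5}))
```

If only one objective is flagged, the extraction settings for that result (subcase, result type) are the first place to look.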

Issue 6: All Trials Infeasible

Symptoms:

Feasibility rate: 0%
All trials violating constraints

Diagnosis:

from optimization_engine.validators import validate_results

result = validate_results("studies/{study_name}/2_results/study.db")
print(f"Feasibility: {result.info.feasibility_rate}%")

Check constraint violations in the Optuna dashboard, or list them per trial:

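import optuna

# Same storage URL and study name as in "Check Optuna Database"
storage = "sqlite:///studies/{study_name}/2_results/study.db"
study = optuna.load_study(study_name="{study_name}", storage=storage)
for trial in study.trials:
    # Only report trials explicitly marked infeasible, not ones missing the attribute
    if trial.user_attrs.get("feasible") is False:
        print(f"Trial {trial.number}: {trial.user_attrs.get('violated_constraints')}")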

Solutions:

| Issue | Cause | Fix |
|---|---|---|
| All trials fail constraint | Threshold too tight | Relax constraint threshold |
| Single constraint always fails | Wrong extraction | Check constraint extraction |
| Bounds cause violations | Design space infeasible | Expand design variable bounds |
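
To tell "one constraint always fails" apart from "everything fails", it can help to tally violations by constraint name. The sketch below works on plain dictionaries shaped like `trial.user_attrs`. It assumes that `violated_constraints` holds a list of constraint names, which the source does not confirm, and the names `max_stress` and `max_displacement` are hypothetical.

```python
from collections import Counter


def summarize_violations(user_attrs_per_trial):
    """Return (feasibility rate in percent, Counter of violated constraint names)."""
    total = len(user_attrs_per_trial)
    feasible = sum(1 for attrs in user_attrs_per_trial if attrs.get("feasible") is True)
    counts = Counter()
    for attrs in user_attrs_per_trial:
        # Assumed shape: a list of constraint names, possibly empty or absent.
        counts.update(attrs.get("violated_constraints") or [])
    rate = 100.0 * feasible / total if total else 0.0
    return rate, counts


trials = [
    {"feasible": False, "violated_constraints": ["max_stress"]},
    {"feasible": False, "violated_constraints": ["max_stress", "max_displacement"]},
    {"feasible": True, "violated_constraints": []},
]
rate, counts = summarize_violations(trials)
print(f"Feasibility: {rate:.0f}%")
print(counts.most_common())
```

A constraint that appears in nearly every infeasible trial points to a threshold or extraction problem with that constraint specifically, rather than to the design space as a whole.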

Quick Diagnostic Commands

Validate Everything

python -m optimization_engine.validators.study_validator {study_name}

Check Results

python -m optimization_engine.validators.results_validator {study_name}

List All Studies

python -m optimization_engine.validators.study_validator

Check Optuna Database

import optuna
storage = "sqlite:///studies/{study_name}/2_results/study.db"
study = optuna.load_study(study_name="{study_name}", storage=storage)
print(f"Trials: {len(study.trials)}")
print(f"Completed: {len([t for t in study.trials if t.state == optuna.trial.TrialState.COMPLETE])}")
print(f"Failed: {len([t for t in study.trials if t.state == optuna.trial.TrialState.FAIL])}")

Recovery Actions

Reset Study Database

import optuna
storage = "sqlite:///studies/{study_name}/2_results/study.db"
optuna.delete_study(study_name="{study_name}", storage=storage)

Or use the reset script:

python studies/{study_name}/reset_study.py

Resume Interrupted Study

python studies/{study_name}/run_optimization.py --trials 30 --resume

Clean Worker Directories

REM Remove temp files from worker dirs
del /S /Q studies\{study_name}\1_setup\worker_*

Backup and Restore Database

REM Backup
copy studies\{study_name}\2_results\study.db studies\{study_name}\2_results\study_backup.db

REM Restore
copy studies\{study_name}\2_results\study_backup.db studies\{study_name}\2_results\study.db
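
A plain file copy is fine when no optimization is running. If the database might still be in use, Python's `sqlite3` module offers an online backup API (`Connection.backup`) that copies a consistent snapshot. The following is a sketch of that alternative, not something the Atomizer scripts are known to do; the function name `backup_sqlite` is hypothetical.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src_path, dst_path):
    """Copy a SQLite database using the online backup API."""
    # Open the source read-only so a wrong path fails instead of creating a file.
    src = sqlite3.connect(Path(src_path).resolve().as_uri() + "?mode=ro", uri=True)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

Restoring works the same way with the two paths swapped, once all optimization processes have been stopped.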

Error Message Reference

| Error Message | Category | Quick Fix |
|---|---|---|
| "min >= max" | Config | Swap bounds |
| "No part file found" | Model | Add .prt file |
| "Simulation timeout" | Solver | Increase timeout or check NX |
| "database is locked" | Database | Kill other processes |
| "Cannot extract" | Extraction | Check OP2 and extraction config |
| "All trials infeasible" | Constraints | Relax thresholds |

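The reference table can also be applied mechanically to a log line. The sketch below is one possible encoding, not part of Atomizer: each entry lists lowercase fragments that must all appear in the message. Fragment matching is an assumption about how the messages are worded, chosen because the real messages include extra text (for example "min (5) >= max (3)") that an exact match on the table entry would miss.

```python
# (required fragments, category, quick fix), following the reference table.
ERROR_REFERENCE = [
    (("min", ">= max"), "Config", "Swap bounds"),
    (("no part file",), "Model", "Add .prt file"),
    (("simulation timeout",), "Solver", "Increase timeout or check NX"),
    (("database is locked",), "Database", "Kill other processes"),
    (("cannot extract",), "Extraction", "Check OP2 and extraction config"),
    (("all trials",), "Constraints", "Relax thresholds"),
]


def classify_error(message):
    """Return (category, quick_fix) for the first matching entry, else None."""
    text = message.lower()
    for fragments, category, fix in ERROR_REFERENCE:
        if all(fragment in text for fragment in fragments):
            return category, fix
    return None


print(classify_error("[ERROR] [DESIGN_VAR_BOUNDS] beam_thickness: min (5) >= max (3)"))
print(classify_error("[ERROR] No part file (.prt) found in model directory"))
```

Messages that match no entry return `None`, which is a cue to fall back to the full validation in Step 1.
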
Integration with Other Skills

  • After fixing config -> /run-optimization to restart
  • After debugging results -> /generate-report to analyze
  • If model needs changes -> Check NX_FILE_MODIFICATIONS_REQUIRED.md

Example Conversation

USER: My optimization is failing with "Cannot extract displacement"

CLAUDE: Let me diagnose the displacement extraction issue.

First, I'll validate the study:

[Runs validate_study()]

The validation shows model files are present. Let me check if the OP2 file exists
and contains displacement data...

[Checks OP2 file]

I found the issue: Your extraction config specifies subcase=1, but the displacement
results are in subcase=2 (the static analysis solution).

Fix: Update optimization_config.json so that subcase is 2 (it was 1):
```json
{
  "extraction": {
    "params": {
      "subcase": 2
    }
  }
}
```

After this change, run again with: python run_optimization.py --trials 30 --resume


Notes

  • Always run full validation before investigating specific issues
  • Most problems are configuration or file path issues
  • NX solver errors often require checking NX directly
  • Keep backups before modifying study.db
  • Use --resume to continue after fixing issues