# OP_06: Troubleshoot

## Overview

This protocol provides systematic troubleshooting for common optimization issues: NX errors, extraction failures, database problems, and performance issues.

---

## When to Use

| Trigger | Action |
|---------|--------|
| "error", "failed" | Follow this protocol |
| "not working", "crashed" | Follow this protocol |
| "help", "stuck" | Follow this protocol |
| Unexpected behavior | Follow this protocol |

---

## Quick Diagnostic

```bash
# 1. Check environment
conda activate atomizer
python --version  # Should be 3.9+

# 2. Check study structure
ls studies/my_study/
# Should have: 1_setup/, run_optimization.py

# 3. Check model files
ls studies/my_study/1_setup/model/
# Should have: .prt, _i.prt, .fem, .sim files

# 4. Test single trial (run from the study folder)
python run_optimization.py --test
```

---

## Error Categories

### 1. Environment Errors

#### "ModuleNotFoundError: No module named 'optuna'"

**Cause**: Wrong Python environment

**Solution**:
```bash
conda activate atomizer
# Verify
conda list | grep optuna
```

#### "Python version mismatch"

**Cause**: Wrong Python version

**Solution**:
```bash
python --version  # Need 3.9+
conda activate atomizer
```

---

### 2. NX Model Setup Errors

#### "All optimization trials produce identical results"

**Cause**: Missing idealized part (`*_i.prt`) or broken file chain

**Symptoms**:
- Journal shows "FE model updated" but results don't change
- DAT files have the same node coordinates despite different expression values
- OP2 file timestamps update but values are identical

**Root Cause**: NX simulation files have a parent-child hierarchy:
```
.sim → .fem → _i.prt → .prt (geometry)
```

If the `_i.prt` (idealized part) is missing or not properly linked, `UpdateFemodel()` runs but the mesh doesn't regenerate because:
- The FEM mesh is tied to the idealized geometry, not the master geometry
- Without the idealized part updating, the FEM has nothing new to mesh against

**Solution**:

1. **Check file chain in NX**:
   - Open `.sim` file
   - Go to **Part Navigator** or **Assembly Navigator**
   - List ALL referenced parts

2. **Copy ALL linked files** to study folder:
   ```
   # Typical file set needed:
   Model.prt           # Geometry
   Model_fem1_i.prt    # Idealized part ← OFTEN MISSING!
   Model_fem1.fem      # FEM file
   Model_sim1.sim      # Simulation file
   ```

3. **Verify links are intact**:
   - Open model in NX after copying
   - Check that updates propagate: Geometry → Idealized → FEM → Sim

4. **CRITICAL CODE FIX** (already implemented in `solve_simulation.py`):

   The idealized part MUST be explicitly loaded before `UpdateFemodel()`:
   ```python
   import os

   # theSession, working_dir and feModel come from the surrounding journal.
   # Load idealized part BEFORE updating FEM
   for filename in os.listdir(working_dir):
       if '_i.prt' in filename.lower():
           path = os.path.join(working_dir, filename)
           idealized_part, status = theSession.Parts.Open(path)
           break

   # Now UpdateFemodel() will work correctly
   feModel.UpdateFemodel()
   ```
   Without loading the `_i.prt`, NX cannot propagate geometry changes to the mesh.

**Prevention**: Always use introspection to list all parts referenced by a simulation.

---

### 3. NX/Solver Errors

#### "NX session timeout after 600s"

**Cause**: Model too complex or NX stuck

**Solution**:
1. Increase the timeout (in seconds) in config:
   ```json
   "simulation": {
       "timeout": 1200
   }
   ```
2. Simplify mesh if possible
3. Check NX license availability

#### "Expression 'xxx' not found in model"

**Cause**: Expression name mismatch

**Solution**:
1. Open model in NX
2. Go to Tools → Expressions
3. Verify exact expression name (case-sensitive)
4. Update config to match

#### "NX license error"

**Cause**: License server unavailable

**Solution**:
1. Check license server status
2. Wait and retry
3. Contact IT if persistent

#### "NX solve failed - check log"

**Cause**: Nastran solver error

**Solution**:
1. Find log file: `1_setup/model/*.log` or `*.f06`
2. Search for "FATAL" or "ERROR"
3. Common causes:
   - Singular stiffness matrix (constraints issue)
   - Bad mesh (distorted elements)
   - Missing material properties

---

### 4. Extraction Errors

#### "OP2 file not found"

**Cause**: Solve didn't produce output

**Solution**:
1. Check if solve completed
2. Look for `.op2` file in model directory
3. Check NX log for solve errors

#### "No displacement data for subcase X"

**Cause**: Wrong subcase number

**Solution**:
1. Check available subcases in OP2:
   ```python
   from pyNastran.op2.op2 import OP2

   op2 = OP2()
   op2.read_op2('model.op2')
   # Keys are the subcase IDs that contain displacement results
   print(op2.displacements.keys())
   ```
2. Update subcase in extractor call

#### "Element type 'xxx' not supported"

**Cause**: Extractor doesn't support element type

**Solution**:
1. Check available types in extractor
2. Common types: `cquad4`, `ctria3`, `ctetra`, `chexa`
3. May need different extractor

---

### 5. Database Errors

#### "Database is locked"

**Cause**: Another process using database

**Solution**:
1. Check for running processes:
   ```bash
   ps aux | grep run_optimization
   ```
2. Kill stale process if needed
3. Wait for other optimization to finish

#### "Study 'xxx' not found"

**Cause**: Wrong study name or path

**Solution**:
1. Check exact study name in database:
   ```python
   import optuna

   summaries = optuna.get_all_study_summaries(storage='sqlite:///study.db')
   print([s.study_name for s in summaries])
   ```
2. Use correct name when loading

#### "IntegrityError: UNIQUE constraint failed"

**Cause**: Duplicate trial number

**Solution**:
1. Don't run multiple optimizations on same study simultaneously
2. Use `--resume` flag for continuation

---

### 6. Constraint/Feasibility Errors

#### "All trials pruned"

**Cause**: No feasible region

**Solution**:
1. Check constraint values:
   ```python
   # In objective function, print constraint values
   print(f"Stress: {stress}, limit: 250")
   ```
2. Relax constraints
3. Widen design variable bounds

#### "No improvement after N trials"

**Cause**: Stuck in local minimum or converged

**Solution**:
1. Check if truly converged (good result)
2. Try different starting region
3. Use different sampler
4. Increase exploration (raise `n_startup_trials` so more trials are sampled randomly before the sampler starts exploiting)

---

### 7. Performance Issues

#### "Trials running very slowly"

**Cause**: Complex model or inefficient extraction

**Solution**:
1. Profile time per component:
   ```python
   import time

   start = time.time()
   # ... operation ...
   print(f"Took: {time.time() - start:.1f}s")
   ```
2. Simplify mesh if NX is slow
3. Check extraction isn't re-parsing OP2 multiple times

#### "Memory error"

**Cause**: Large OP2 file or many trials

**Solution**:
1. Clear Python memory between trials
2. Don't store all results in memory
3. Use database for persistence

---

## Diagnostic Commands

### Quick Health Check

```bash
# Environment
conda activate atomizer
python -c "import optuna; print('Optuna OK')"
python -c "import pyNastran; print('pyNastran OK')"

# Study structure
ls -la studies/my_study/

# Config validity
python -c "
import json
with open('studies/my_study/1_setup/optimization_config.json') as f:
    config = json.load(f)
print('Config OK')
print(f'Objectives: {len(config.get(\"objectives\", []))}')
"

# Database status (load_study takes keyword arguments in Optuna 3+)
python -c "
import optuna
study = optuna.load_study(study_name='my_study', storage='sqlite:///studies/my_study/2_results/study.db')
print(f'Trials: {len(study.trials)}')
"
```

### NX Log Analysis

```bash
# Find latest log
ls -lt studies/my_study/1_setup/model/*.log | head -1

# Search for errors
grep -i "error\|fatal\|fail" studies/my_study/1_setup/model/*.log
```

### Trial Failure Analysis

```python
import optuna

study = optuna.load_study(...)

# Failed trials
failed = [t for t in study.trials
          if t.state == optuna.trial.TrialState.FAIL]
print(f"Failed: {len(failed)}")
for t in failed[:5]:
    print(f"Trial {t.number}: {t.user_attrs}")

# Pruned trials
pruned = [t for t in study.trials
          if t.state == optuna.trial.TrialState.PRUNED]
print(f"Pruned: {len(pruned)}")
```

---

## Recovery Actions

### Reset Study (Start Fresh)

```bash
# Backup first
cp -r studies/my_study/2_results studies/my_study/2_results_backup

# Delete results
rm -rf studies/my_study/2_results/*

# Run fresh
python run_optimization.py
```

### Resume Interrupted Study

```bash
python run_optimization.py --resume
```

### Restore from Backup

```bash
cp -r studies/my_study/2_results_backup/* studies/my_study/2_results/
```

---

## Getting Help

### Information to Provide

When asking for help, include:
1. Error message (full traceback)
2. Config file contents
3. Study structure (`ls -la`)
4. What you tried
5. NX log excerpt (if NX error)

### Log Locations

| Log | Location |
|-----|----------|
| Optimization | Console output or redirect to file |
| NX Solve | `1_setup/model/*.log`, `*.f06` |
| Database | `2_results/study.db` (query with optuna) |
| Intelligence | `2_results/intelligent_optimizer/*.json` |

---

## Cross-References

- **Related**: All operation protocols
- **System**: [SYS_10_IMSO](../system/SYS_10_IMSO.md), [SYS_12_EXTRACTOR_LIBRARY](../system/SYS_12_EXTRACTOR_LIBRARY.md)

---

## Version History

| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-12-05 | Initial release |
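
---

## Appendix: File-Chain Check Sketch

The file-chain check described under "All optimization trials produce identical results" can be partly automated before a run. The following is a minimal sketch, not part of the Atomizer codebase: the function name `check_file_chain` and the `REQUIRED_SUFFIXES` table are hypothetical, and the suffixes are assumed from the "Typical file set needed" list in that section. It only checks that the files exist in the model folder. It cannot confirm that the links inside NX are intact, so the manual verification step still applies.

```python
from pathlib import Path

# Assumed from the "Typical file set needed" list; role labels are for output only.
REQUIRED_SUFFIXES = {
    "_i.prt": "idealized part",
    ".fem": "FEM file",
    ".sim": "simulation file",
}


def check_file_chain(model_dir):
    """Return a list of problems found in model_dir (empty list if none)."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"model directory not found: {model_dir}"]

    # Compare lower-cased names so Model_FEM1_I.PRT is also recognized.
    names = [p.name.lower() for p in root.iterdir() if p.is_file()]

    problems = []
    for suffix, role in REQUIRED_SUFFIXES.items():
        if not any(name.endswith(suffix) for name in names):
            problems.append(f"missing {role} (*{suffix})")

    # The geometry part is a .prt file that is NOT an idealized part.
    has_geometry = any(
        name.endswith(".prt") and not name.endswith("_i.prt") for name in names
    )
    if not has_geometry:
        problems.append("missing geometry part (*.prt)")

    return problems
```

Calling `check_file_chain("studies/my_study/1_setup/model")` returns an empty list when all four file types are present. Any returned entry names a file type that should be copied from the original NX working folder.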