# NX Session Management

**Status**: Implemented
**Version**: 1.0
**Date**: 2025-11-20

## Problem

When running multiple optimizations concurrently or when a user has NX open for manual work, conflicts can occur:

1. **Multiple Optimizations**: Two optimization studies trying to modify the same model simultaneously
2. **User's Interactive NX**: Batch optimization interfering with user's manual work
3. **File Corruption**: Concurrent writes to .prt/.sim files causing corruption
4. **License Conflicts**: Multiple NX instances competing for licenses
5. **Journal Failures**: Journals trying to run on wrong NX session

## Solution: NX Session Manager

The `NXSessionManager` class prevents these conflicts through session detection, file locking, and process queuing.

### Key Features

1. **Session Detection**
   - Detects all running NX processes (interactive + batch)
   - Identifies interactive vs batch sessions
   - Warns if user has NX open

2. **File Locking**
   - Exclusive locks on model files (.prt)
   - Prevents two optimizations from modifying same model
   - Queues trials if model is locked

3. **Process Queuing**
   - Limits concurrent NX batch sessions (default: 1)
   - Waits if max sessions reached
   - Automatic timeout and error handling

4. **Stale Lock Cleanup**
   - Detects crashed processes
   - Removes orphaned lock files
   - Prevents permanent deadlocks

## Architecture

### Session Manager Components

```python
from pathlib import Path

from optimization_engine.nx_session_manager import NXSessionManager

# Initialize
session_mgr = NXSessionManager(
    lock_dir=Path.home() / ".atomizer" / "locks",
    max_concurrent_sessions=1,  # Max parallel NX instances
    wait_timeout=300,           # Max wait time (5 min)
    verbose=True
)
```

### Two-Level Locking

**Level 1: Model File Lock** (most important)

```python
# Ensures exclusive access to a specific model
with session_mgr.acquire_model_lock(prt_file, study_name):
    # Update CAD model
    updater.update_expressions(params)

    # Run simulation
    result = solver.run_simulation(sim_file)
```

**Level 2: NX Session Lock** (optional)

```python
# Limits total concurrent NX batch instances
with session_mgr.acquire_nx_session(study_name):
    # Run NX batch operation
    pass
```

## Usage Examples

### Example 1: Single Optimization (Recommended)

```python
from optimization_engine.nx_solver import NXSolver
from optimization_engine.nx_updater import NXParameterUpdater
from optimization_engine.nx_session_manager import NXSessionManager

# Initialize components
session_mgr = NXSessionManager(verbose=True)
updater = NXParameterUpdater("model.prt")
solver = NXSolver()

# Check for interactive NX sessions
if session_mgr.is_nx_interactive_session_running():
    print("WARNING: NX is open! Close it before running optimization.")
    # You can choose to abort or continue

# Run trials with session management
# (trials, params, prt_file and sim_file come from the surrounding optimization loop)
for trial in trials:
    with session_mgr.acquire_model_lock(prt_file, "my_study"):
        # Exclusive access to model - safe to modify
        updater.update_expressions(params)
        result = solver.run_simulation(sim_file)
```

### Example 2: Multiple Concurrent Optimizations

```python
# Study A (in one terminal)
session_mgr_A = NXSessionManager()

with session_mgr_A.acquire_model_lock(model_A_prt, "study_A"):
    # Works on model A
    updater_A.update_expressions(params_A)
    solver_A.run_simulation(sim_A)

# Study B (in another terminal, simultaneously)
session_mgr_B = NXSessionManager()

with session_mgr_B.acquire_model_lock(model_B_prt, "study_B"):
    # Works on model B (different model - no conflict)
    updater_B.update_expressions(params_B)
    solver_B.run_simulation(sim_B)

# If they try to use SAME model:
with session_mgr_A.acquire_model_lock(model_SAME, "study_A"):
    pass  # Acquires lock

with session_mgr_B.acquire_model_lock(model_SAME, "study_B"):
    # Waits here until study_A releases lock
    # Then proceeds safely
    pass
```

### Example 3: Protection Against User's Interactive NX

```python
session_mgr = NXSessionManager(verbose=True)

# Detect if user has NX open
nx_sessions = session_mgr.get_running_nx_sessions()

for session in nx_sessions:
    print(f"Detected: {session.name} (PID {session.pid})")

if session_mgr.is_nx_interactive_session_running():
    print("Interactive NX session detected!")
    print("Recommend closing NX before running optimization.")

    # Option 1: Abort
    raise RuntimeError("Close NX and try again")

    # Option 2 (instead of raising): Continue with warning
    # print("Continuing anyway... (may cause conflicts)")
```

## Configuration

### Lock Directory

Default: `~/.atomizer/locks/`

Custom:

```python
session_mgr = NXSessionManager(
    lock_dir=Path("/custom/lock/dir")
)
```

### Concurrent Session Limit

Default: 1 (safest)

Allow multiple:

```python
session_mgr = NXSessionManager(
    max_concurrent_sessions=2  # Allow 2 parallel NX batches
)
```

**Warning**: Multiple concurrent NX sessions require multiple licenses!

### Wait Timeout

Default: 300 seconds (5 minutes)

Custom:

```python
session_mgr = NXSessionManager(
    wait_timeout=600  # Wait up to 10 minutes
)
```

## Integration with NXSolver

The `NXSolver` class accepts session management options:

```python
from optimization_engine.nx_solver import NXSolver

solver = NXSolver(
    enable_session_management=True,  # Default
    study_name="my_study"
)

# Intended behavior: session management happens automatically
result = solver.run_simulation(sim_file)
```

**Note**: Full automatic integration is planned but not yet implemented. Currently, manual wrapping with `acquire_model_lock` is recommended.

## Status Monitoring

### Get Current Status

```python
report = session_mgr.get_status_report()
print(report)
```

Output:

```
======================================================================
  NX SESSION MANAGER STATUS
======================================================================

  Running NX Processes: 2
    PID 12345: ugraf.exe
      Working dir: C:/Users/username/project
    PID 12346: run_journal.exe

  WARNING: Interactive NX session detected!
  Batch operations may conflict with user's work.

  Active Optimization Sessions: 1/1
    my_study (PID 12347)

  Active Lock Files: 1
======================================================================
```

### Cleanup Stale Locks

```python
# Run at startup
session_mgr.cleanup_stale_locks()
```

Removes lock files left behind by crashed processes.

## Error Handling

### Lock Timeout

```python
try:
    with session_mgr.acquire_model_lock(prt_file, study_name):
        # ... modify model ...
        pass
except TimeoutError as e:
    print(f"Could not acquire model lock: {e}")
    print("Another optimization may be using this model.")
    # Handle error (skip trial, abort, etc.)
```

### NX Session Timeout

```python
try:
    with session_mgr.acquire_nx_session(study_name):
        # ... run NX batch ...
        pass
except TimeoutError as e:
    print(f"Could not acquire NX session: {e}")
    print(f"Max concurrent sessions ({session_mgr.max_concurrent}) reached.")
    # Handle error
```

## Platform Support

- ✅ **Windows**: Full support (uses `msvcrt` for file locking)
- ✅ **Linux/Mac**: Full support (uses `fcntl` for file locking)
- ✅ **Cross-Platform**: Lock files work across different OS instances

## Limitations

1. **Same Machine Only**: Session manager only prevents conflicts on the same machine
   - Networked optimizations need a distributed lock manager

2. **File System Required**: Requires a writable lock directory
   - May not work on read-only filesystems

3. **Process Detection**: Relies on `psutil` for process detection
   - May miss processes in some edge cases

4. **Not Real-Time**: Lock checking has small latency
   - Not suitable for microsecond-level synchronization

## Best Practices

### 1. Always Use Model Locks

```python
# GOOD: Protected
with session_mgr.acquire_model_lock(prt_file, study_name):
    updater.update_expressions(params)

# BAD: Unprotected (race condition!)
updater.update_expressions(params)
```

### 2. Check for Interactive NX

```python
# Before starting optimization
if session_mgr.is_nx_interactive_session_running():
    print("WARNING: Close NX before running optimization!")
    # Decide: abort or continue with warning
```

### 3. Cleanup on Startup

```python
# At optimization start
session_mgr = NXSessionManager()
session_mgr.cleanup_stale_locks()  # Remove crashed process locks
```

### 4. Use Unique Study Names

```python
# GOOD: Unique names
solver_A = NXSolver(study_name="beam_optimization_trial_42")
solver_B = NXSolver(study_name="plate_optimization_trial_15")

# BAD: Same name (confusing logs)
solver_A = NXSolver(study_name="default_study")
solver_B = NXSolver(study_name="default_study")
```

### 5. Handle Timeouts Gracefully

```python
import optuna

try:
    with session_mgr.acquire_model_lock(prt_file, study_name):
        result = solver.run_simulation(sim_file)
except TimeoutError:
    # Don't crash entire optimization!
    print("Lock timeout - skipping this trial")
    raise optuna.TrialPruned()  # Optuna will continue
```

## Troubleshooting

### "Lock timeout" errors

**Cause**: Another process holds the lock longer than the timeout

**Solutions**:

1. Check if another optimization is running
2. Increase timeout: `wait_timeout=600`
3. Check for stale locks: `cleanup_stale_locks()`

### "Interactive NX session detected" warnings

**Cause**: User has NX open in GUI mode

**Solutions**:

1. Close interactive NX before optimization
2. Use different model files
3. Continue with warning (risky!)

### Stale lock files

**Cause**: Optimization crashed without releasing locks

**Solution**:

```python
session_mgr.cleanup_stale_locks()
```

### Multiple optimizations on different models still conflict

**Cause**: NX session limit reached

**Solution**:

```python
session_mgr = NXSessionManager(
    max_concurrent_sessions=2  # Allow 2 parallel NX instances
)
```

**Warning**: Requires 2 NX licenses!
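
### Sketch: retrying after a lock timeout

The "Lock timeout" remedies listed earlier in Troubleshooting (clear stale locks, then try again) can be combined into a small helper. The following is a minimal sketch, not part of `NXSessionManager`: `run_with_lock_retry` and its parameters are hypothetical names. It assumes only what this document shows, namely that acquiring a lock is a context manager that raises `TimeoutError`, and that stale-lock cleanup is a no-argument call.

```python
from contextlib import AbstractContextManager
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_lock_retry(
    acquire: Callable[[], AbstractContextManager],
    work: Callable[[], T],
    cleanup: Callable[[], object],
    max_attempts: int = 2,
) -> T:
    """Run work() under a lock; on an acquisition timeout, clean up and retry.

    Hypothetical helper: `acquire` is expected to wrap something like
    session_mgr.acquire_model_lock(...), and `cleanup` something like
    session_mgr.cleanup_stale_locks.
    """
    last_error: Optional[TimeoutError] = None
    for attempt in range(1, max_attempts + 1):
        acquired = False
        try:
            with acquire():
                acquired = True
                return work()
        except TimeoutError as exc:
            if acquired:
                # The timeout came from work() itself, not from the lock,
                # so retrying the lock would not help: propagate it.
                raise
            last_error = exc
            if attempt < max_attempts:
                cleanup()  # Remove orphaned locks before the next attempt
    raise TimeoutError(
        f"lock not acquired after {max_attempts} attempts"
    ) from last_error
```

Under those assumptions, a call might look like `run_with_lock_retry(lambda: session_mgr.acquire_model_lock(prt_file, study_name), lambda: solver.run_simulation(sim_file), session_mgr.cleanup_stale_locks)`. Taking the lock and the cleanup as callables keeps the helper independent of the session manager's internals, which are not described here.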
## Future Enhancements

- [ ] Distributed lock manager (for cluster computing)
- [ ] Automatic NX session affinity (assign trials to specific NX instances)
- [ ] License pool management
- [ ] Network file lock support (for shared drives)
- [ ] Real-time session monitoring dashboard
- [ ] Automatic crash recovery

## Version History

### Version 1.0 (2025-11-20)

- Initial implementation
- Model file locking
- NX session detection
- Concurrent session limiting
- Stale lock cleanup
- Status reporting

---

**Implementation Status**: ✅ Core functionality complete
**Testing Status**: ⚠️ Needs production testing
**Documentation Status**: ✅ Complete