The concorekill.bat PID-file mechanism creates race conditions and stale-PID conflicts on Windows #391

@GaneshPatil7517

Problem Description

On Windows, concore.py creates a concorekill.bat file at module import time that hardcodes the current process PID:

if hasattr(sys, 'getwindowsversion'):
    with open("concorekill.bat","w") as fpid:
        fpid.write("taskkill /F /PID "+str(os.getpid())+"\n")

All nodes in a study share the same working directory. If multiple Python nodes are launched (which makestudy does), each one overwrites concorekill.bat with its own PID. Only the last launched node can be killed via this file; the PIDs of all previously launched nodes are lost.

Additionally, if a study crashes and is restarted, the stale concorekill.bat from the previous run may contain a PID that has been reassigned to an unrelated process, leading to taskkill terminating the wrong process.

Technical Analysis

The file is written at import time (concore.py lines 29–31) with no locking, no per-node naming, and no PID-validity check. There is also no corresponding cleanup on exit, so the file persists after the node terminates.

In a typical study with 3 Python nodes (e.g., controller + PM + observer), concorekill.bat is overwritten 3 times in rapid succession. Only the third PID survives.
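
The overwrite can be reproduced without launching real nodes. The following is a minimal sketch, not concore code: the PIDs 1001 to 1003 and the helper simulate_launches are made up for illustration, and a temporary directory stands in for the shared study directory.

```python
import os
import tempfile


def simulate_launches(pids, mode):
    """Write one taskkill line per simulated node and return the surviving lines.

    `mode` is the file mode passed to open(): "w" mirrors the current
    behaviour, "a" mirrors the append variant.
    """
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "concorekill.bat")
        for pid in pids:
            # Each simulated node opens the same file, as concore.py does on import.
            with open(path, mode) as fpid:
                fpid.write("taskkill /F /PID " + str(pid) + "\n")
        with open(path) as fpid:
            return fpid.read().splitlines()


# Hypothetical PIDs standing in for controller, PM, and observer.
hypothetical_pids = [1001, 1002, 1003]
overwritten = simulate_launches(hypothetical_pids, "w")
appended = simulate_launches(hypothetical_pids, "a")
print(overwritten)
print(appended)
```

In write mode only the line for the last PID remains. Append mode keeps all three lines, but on its own it does nothing about entries left over from an earlier run.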

Proposed Improvement

  1. Use per-node PID files (concorekill_{PID}.bat), or append each PID as a separate line to one shared file (append mode instead of write mode).
  2. Add atexit cleanup to remove the PID entry on graceful shutdown.
  3. Before taskkill, validate the PID is still a Python process (e.g., tasklist /FI "PID eq ..." check).
  4. Consider replacing this entirely with a process-group approach, using os.killpg on POSIX and Job Objects on Windows, which can terminate all study processes in one call.
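
Items 1 to 3 could be sketched roughly as follows. This is an illustration under stated assumptions, not a patch against concore.py: the function names (write_kill_file, remove_kill_file, image_name_from_tasklist, pid_is_python) are hypothetical, and the check assumes tasklist's CSV output puts the image name in the first column and the PID in the second.

```python
import atexit
import csv
import os
import subprocess


def write_kill_file(directory=".", pid=None):
    """Create concorekill_<pid>.bat and schedule its removal at interpreter exit."""
    pid = os.getpid() if pid is None else pid
    path = os.path.join(directory, "concorekill_{}.bat".format(pid))
    with open(path, "w") as fpid:
        fpid.write("taskkill /F /PID {}\n".format(pid))
    # atexit handlers run on normal shutdown only, not after a crash or a forced kill.
    atexit.register(remove_kill_file, path)
    return path


def remove_kill_file(path):
    """Delete the kill file; ignore the case where it is already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass


def image_name_from_tasklist(csv_output, pid):
    """Return the image name that tasklist reported for `pid`, or None."""
    for row in csv.reader(csv_output.splitlines()):
        # Assumed row layout: image name, PID, session name, session number, memory.
        if len(row) >= 2 and row[1] == str(pid):
            return row[0]
    return None


def pid_is_python(pid):
    """Windows only: ask tasklist whether `pid` currently belongs to a Python process."""
    result = subprocess.run(
        ["tasklist", "/FI", "PID eq {}".format(pid), "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=False,
    )
    name = image_name_from_tasklist(result.stdout, pid)
    return name is not None and name.lower().startswith("python")
```

A node would call write_kill_file() inside the existing hasattr(sys, 'getwindowsversion') guard. Because the atexit handler does not run after a crash, stale files can still be left behind, which is why the validity check remains useful. The image-name check is itself only a heuristic: a reused PID may belong to an unrelated Python process.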

Impact

  • Reliability: Only 1 of N Python nodes is killable via the advertised mechanism
  • Cross-language: C++/MATLAB/Verilog nodes are not covered by this mechanism at all
  • Julia impl: Should not replicate this pattern; needs a proper process lifecycle approach
