Thursday, May 22, 2014

Cross-platform way to use subprocess.Popen without inherited file descriptors

As mentioned in my previous post on how subprocess and file descriptors work in Python, there are significant differences in the way subprocess.Popen works in Linux and Windows. After the previous post, I tried to find a better way to handle this issue in a cross-platform way.

So, I ended up with the following module, "ingenuously" named file_safe_subprocess.py :

import subprocess
import threading
import platform
import __builtin__

_system = platform.system()

if _system=='Windows':
    import msvcrt
    import win32api
    import win32con
    import win32file
else:
    import fcntl
    import termios
    import resource

_lock = threading.RLock()
_std_Popen = subprocess.Popen
_std_open = open

def _set_file_descriptor_as_non_inheritable(fd):
    if _system=='Windows':
        win32api.SetHandleInformation(msvcrt.get_osfhandle(fd),
                                      win32con.HANDLE_FLAG_INHERIT,
                                      0)
    else:
        # set close-on-exec on the descriptor that was passed in
        fcntl.ioctl(fd, termios.FIOCLEX, 0)

def _max_opened_files():
    if _system=='Windows':
        return win32file._getmaxstdio()
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]

if _system=='Windows':
    def _safe_open(*args, **kwargs):
        with _lock:
            f = _std_open(*args, **kwargs)
            _set_file_descriptor_as_non_inheritable(f.fileno())
        return f

    __builtin__.open = _safe_open

    def _safe_Popen(*args, **kwargs):
        with _lock:
            p = _std_Popen(*args, **kwargs)
            for f in (p.stdin, p.stdout, p.stderr):
                if f is not None and f.fileno()>2:
                    try:
                        _set_file_descriptor_as_non_inheritable(f.fileno())
                    except Exception:
                        # best effort: ignore pipes that cannot be re-flagged
                        pass
            return p
else:
    def _safe_Popen(*args, **kwargs):
        with _lock:
            kwargs['close_fds'] = True
            return _std_Popen(*args, **kwargs)

subprocess.Popen = _safe_Popen

def set_fds_as_non_inheritable(fds=None):
    if fds is None:
        fds = range(3, _max_opened_files())
    for fd in fds:
        try:
            _set_file_descriptor_as_non_inheritable(fd)
        except Exception:
            # most descriptor numbers in the range are not open; skip them
            pass

def wrap_process(func, *args, **kwargs):
    with _lock:
        set_fds_as_non_inheritable()
        return func(*args, **kwargs)

def wrap_fd_open(func, *args, **kwargs):
    with _lock:
        return func(*args, **kwargs)

if _system=='Windows':
    set_fds_as_non_inheritable()

What does it do?

By importing this module at an early stage in a script, you essentially do the following:

  1. [Windows-only] Replace the built-in open() function so that, right after a file is opened, its file descriptor is marked as non-inheritable.
  2. Override subprocess.Popen with a factory function that a) on Windows, marks any newly created pipes as non-inheritable, and b) on other platforms, forces close_fds=True so that the subprocess.Popen constructor closes all file descriptors other than 0, 1 and 2 in the child.
  3. [Windows-only] Run the set_fds_as_non_inheritable() function once, so that any file descriptors that are already open are marked as non-inheritable.

Note that I use a lock, since my application is multi-threaded and I do not want to leak file descriptors opened from some other thread.
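
To make the non-Windows half of point 2 concrete, here is a minimal sketch (POSIX only, not part of the module above) that checks whether a child process can see one of the parent's descriptors, with and without close_fds=True. The helper name child_sees_fd and the choice of descriptor number 100 are my own assumptions for illustration. Newer Python versions create descriptors as non-inheritable by default, so the sketch duplicates the pipe with F_DUPFD to obtain an inheritable one.

```python
import fcntl
import os
import subprocess
import sys

def child_sees_fd(fd, close_fds):
    """Return True if a freshly spawned Python child finds `fd` open."""
    # hypothetical probe script: exit 1 if the descriptor is closed in the child
    probe = (
        "import os, sys\n"
        "try:\n"
        "    os.fstat(%d)\n"
        "except OSError:\n"
        "    sys.exit(1)\n"
    ) % fd
    return subprocess.call([sys.executable, "-c", probe], close_fds=close_fds) == 0

r, w = os.pipe()
# F_DUPFD returns the lowest free descriptor >= 100 and does not set the
# close-on-exec flag, so the duplicate is inheritable on any Python version.
high_fd = fcntl.fcntl(r, fcntl.F_DUPFD, 100)

leaked_without_close_fds = child_sees_fd(high_fd, close_fds=False)
leaked_with_close_fds = child_sees_fd(high_fd, close_fds=True)

for fd in (r, w, high_fd):
    os.close(fd)

print("leaked without close_fds:", leaked_without_close_fds)
print("leaked with close_fds:", leaked_with_close_fds)
```

The first call shows the leak the module is meant to prevent; the second shows what forcing close_fds=True buys you on non-Windows systems.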

Alternative approach

Since I only use Popen for spawning sub-processes and mainly open logging files using the open() function, this module will do just fine. Of course, there are many other cases in which a file descriptor is created. For example,
  • opening up a socket
  • creating some other pipe through non-Popen calls
  • etc, etc
If you use some other function and you want the same (more or less) protection against file descriptor leaking, you can use the wrap_process() and wrap_fd_open() functions:
  • When you want to call any callable (function or class constructor) that internally creates a new process, use wrap_process(). It tries to mark ALL possible file descriptors as non-inheritable and then calls the callable you specify, with the arguments you specify.
  • When you want to call any callable (function or class constructor) that internally opens a new file descriptor (file, socket, pipe, whatever), use wrap_fd_open(). It synchronizes against wrap_process() calls, so that no file descriptors are created while wrap_process() is running.
Use caution: both functions share the same lock, so a wrap_process() call that does not return immediately blocks every wrap_fd_open() call in the other threads, which may create a deadlock in multi-threaded applications.

Alternatively, you could write dedicated wrapper functions for your preferred method of process invocation and file descriptor creation, which would be faster and more efficient than re-marking every possible descriptor on each call.
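
As an illustration of such a dedicated wrapper, here is a rough sketch for sockets on POSIX systems. The names safe_socket and set_cloexec are hypothetical and do not exist in the module above, and the _lock here only stands in for the module's own lock. The idea is to flag just the one new descriptor while holding the lock, instead of walking the whole descriptor range.

```python
import fcntl
import socket
import threading

# stand-in for the module-level lock shared with the Popen wrapper
_lock = threading.RLock()

def set_cloexec(fd):
    """Set the close-on-exec flag on fd, keeping its other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)

def safe_socket(*args, **kwargs):
    """Create a socket and mark it non-inheritable while holding the lock."""
    with _lock:
        sock = socket.socket(*args, **kwargs)
        set_cloexec(sock.fileno())
        return sock

sock = safe_socket(socket.AF_INET, socket.SOCK_STREAM)
is_cloexec = bool(fcntl.fcntl(sock.fileno(), fcntl.F_GETFD) & fcntl.FD_CLOEXEC)
sock.close()
print("socket is close-on-exec:", is_cloexec)
```

On newer Python versions sockets are already created non-inheritable, so the extra flagging is redundant there; it matters on the Python 2 interpreters this post targets.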
