gh-133485: Use _interpreters.call() in InterpreterPoolExecutor #133957

Open
wants to merge 5 commits into
base: main
from

Conversation

ericsnowcurrently
Member

@ericsnowcurrently ericsnowcurrently commented May 13, 2025

Most importantly, this resolves the issues with functions and types defined in __main__.
It also expands the number of supported objects.

(This is based on gh-133484, thus only the last commit.)

@ericsnowcurrently ericsnowcurrently added needs backport to 3.14 bugs and security fixes and removed awaiting core review labels May 13, 2025
@ericsnowcurrently ericsnowcurrently force-pushed the interp-pool-executor-use-interp-call branch from 5340a57 to 62d7c2c Compare May 13, 2025 01:13
@ericsnowcurrently ericsnowcurrently marked this pull request as draft May 13, 2025 01:14
@ericsnowcurrently ericsnowcurrently force-pushed the interp-pool-executor-use-interp-call branch 9 times, most recently from b3c2477 to 7697c11 Compare May 27, 2025 16:31
Comment on lines +23 to +26
# InterpreterPoolInitializerTest.test_initializer fails
# if we don't have a LOAD_GLOBAL. (It could be any global.)
# We will address this separately.
INITIALIZER_STATUS
Member Author

@markshannon, any ideas on why this is happening? It smells like a ceval bug, but it certainly could be something I've done wrong.

Contributor

@neonene neonene Jun 1, 2025

There seem to be related changes in inspect.getclosurevars() since 83ba8c2:

before:
ClosureVars(nonlocals={},
            globals={'INITIALIZER_STATUS': 'uninitialized'},
            builtins={}, unbound=set())
after:
ClosureVars(nonlocals={},
            globals={},
            builtins={}, unbound=set())
  • init() on main (without L26):
  3           RESUME                   0

  5           LOAD_FAST_BORROW         0 (x)
              STORE_GLOBAL             0 (INITIALIZER_STATUS)
              LOAD_CONST               0 (None)
              RETURN_VALUE
  • 3.3.5 (2014):
  5           0 LOAD_FAST                0 (x)
              3 STORE_GLOBAL             0 (INITIALIZER_STATUS)
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE
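The difference described above can be checked locally with a small sketch. It assumes nothing beyond the standard `dis` and `inspect` modules; the function names `init_store_only`, `init_with_load`, and the helper `global_opnames` are made up for illustration. The `getclosurevars()` result for the store-only variant depends on the Python version, so it is printed rather than asserted.

```python
import dis
import inspect

INITIALIZER_STATUS = 'uninitialized'


def init_store_only(x):
    # Only writes the global: compiles to STORE_GLOBAL, no LOAD_GLOBAL.
    global INITIALIZER_STATUS
    INITIALIZER_STATUS = x


def init_with_load(x):
    # Same as above, plus a bare read that compiles to LOAD_GLOBAL.
    global INITIALIZER_STATUS
    INITIALIZER_STATUS = x
    INITIALIZER_STATUS


def global_opnames(func):
    """Return the set of global-related opnames found in func's bytecode."""
    return {
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname in ('LOAD_GLOBAL', 'STORE_GLOBAL')
    }


for func in (init_store_only, init_with_load):
    # The reported globals for the store-only variant vary by version.
    print(func.__name__,
          sorted(global_opnames(func)),
          inspect.getclosurevars(func).globals)
```

If the newer `getclosurevars()` only considers names that are actually loaded, a function that merely stores to a global would no longer report it, which would match the before/after output quoted above.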

@ericsnowcurrently ericsnowcurrently force-pushed the interp-pool-executor-use-interp-call branch from 5ea2bb2 to 14a8eb9 Compare May 29, 2025 21:02
@ericsnowcurrently ericsnowcurrently force-pushed the interp-pool-executor-use-interp-call branch from 14a8eb9 to ccc135c Compare May 30, 2025 15:27
@ericsnowcurrently ericsnowcurrently marked this pull request as ready for review May 30, 2025 21:32
@ericsnowcurrently ericsnowcurrently added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 30, 2025
@bedevere-bot

🤖 New build scheduled with the buildbot fleet by @ericsnowcurrently for commit ccc135c 🤖

Results will be shown at:

https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F133957%2Fmerge

If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label May 30, 2025
@neonene
Contributor

neonene commented May 31, 2025

The wasm32-wasi Non-Debug buildbot seems to be unused on merged commits and out of order on PRs. For example: https://buildbot.python.org/#/builders/1373/builds/507

@neonene
Contributor

neonene commented Jun 1, 2025

Is the following usage invalid?

INITIALIZER_STATUS = 'uninitialized'

def init(x):
    global INITIALIZER_STATUS
    INITIALIZER_STATUS = x
    INITIALIZER_STATUS  # for now

def get_init_status():
    return INITIALIZER_STATUS

if __name__ == "__main__":
    from concurrent.futures import InterpreterPoolExecutor
    exe = InterpreterPoolExecutor(initializer=init, initargs=('initialized',))
    fut = exe.submit(get_init_status)
    assert fut.result() == 'initialized'  # fails
    exe.shutdown(wait=True)
    assert INITIALIZER_STATUS == 'uninitialized'

@neonene
Contributor

neonene commented Jun 3, 2025

res.loaded = runpy_run_path(filename, run_modname);

I guess the failure case in my previous comment can be resolved if runpy_run_path() is called only once for __main__, by keeping ctx->main.cached->loaded (or something similar) across xi-sessions instead of replacing it. (My experiment just put a static local PyObject* variable here, ignoring the leak.)

@ericsnowcurrently
Member Author

That example should definitely work. You're probably right about the fix.

@ericsnowcurrently
Member Author

I've committed a solution that fixes the failure. The quirky thing, I realized, is that the loaded functions use the cached NS for __globals__, rather than the actual module's __dict__. Thus the functions copied into __main__ for unpickling won't actually operate against __main__. In the earlier failing example, that means the global variable INITIALIZER_STATUS doesn't show up in __main__. I need to ponder if that's a problem. (It might be surprising to users.)
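The quirk can be illustrated in pure Python, without subinterpreters. This is only a sketch of the general mechanism, not the code in this PR: `cached_ns` stands in for the cached namespace and `main_mod` for the subinterpreter's real `__main__` module; both names are hypothetical.

```python
import types

SOURCE = """
INITIALIZER_STATUS = 'uninitialized'

def init(x):
    global INITIALIZER_STATUS
    INITIALIZER_STATUS = x
"""

# Stand-in for the cached namespace the script was executed in.
cached_ns = {'__name__': '__main__'}
exec(SOURCE, cached_ns)

# Stand-in for the real __main__ module, which never ran the script.
main_mod = types.ModuleType('__main__')

init = cached_ns['init']
init('initialized')

# The function's globals are the cached dict, not the module's __dict__.
print(init.__globals__ is cached_ns)             # True
print(cached_ns['INITIALIZER_STATUS'])           # initialized
print(hasattr(main_mod, 'INITIALIZER_STATUS'))   # False
```

The write performed by `init()` lands in the dict the function was defined in, so a module object that merely shares the name `__main__` never sees it.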

@neonene

This comment was marked as resolved.

@ericsnowcurrently
Member Author

I need to ponder if that's a problem. (It might be surprising to users.)

I think we're okay on this. The unpickled function isn't actually bound anywhere in the subinterpreter. Instead, it is pulled from the cached namespace, called, and then discarded (decref'ed). At worst, the function could treat its own __globals__ as though it were the subinterpreter's __main__ module, and cache something there, which would be pretty mysterious. We can deal with that later if it comes up.

@neonene
Contributor

neonene commented Jun 5, 2025

A concern is that the shared items can conflict with the __main__ module's attributes:

InterpreterPoolExecutor(..., shared={'init': None})

If putting that aside, could _handle_unpickle_missing_attr() use something like PyDict_Update(_PyModule_GetDict(mod.module), mod.loaded) as an alternative to using the interpreter dictionary?

task = (fn, args, kwargs)
data = pickle.dumps(task)
Contributor

It seems like pickle and contextlib are now unused in this file. Importing pickle might be unnecessary when running only stateless functions?

@ericsnowcurrently
Member Author

A concern is that the shared items can conflict with the __main__ module's attributes:

InterpreterPoolExecutor(..., shared={'init': None})

Good point. The shared arg made more sense when InterpreterPoolExecutor supported scripts (text), in addition to callables. Since scripts aren't supported any more, I'll drop the shared parameter.

If putting that aside, could _handle_unpickle_missing_attr() use something like PyDict_Update(_PyModule_GetDict(mod.module), mod.loaded) as an alternative to using the interpreter dictionary?

That is essentially the same as exec'ing the original __main__ file in the subinterpreter's __main__ module. Either way, I'd prefer to avoid modifying __main__ when calling a function across interpreters (e.g. via submit()).

One alternative is to add a new module, e.g. __main_main__, in which we'd exec the original script. We'd do that instead of stashing it away on the interpreter's internal dict. I'll look into that.
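A rough pure-Python sketch of that idea follows, under stated assumptions: the helper `load_script_as_side_module` and the inline `SCRIPT` are hypothetical, and the eventual implementation would live in C and may differ. It only shows that exec'ing the script in a separately named module leaves `__main__` alone while the functions stay picklable by reference.

```python
import pickle
import sys
import types

SCRIPT = '''
INITIALIZER_STATUS = 'uninitialized'

def init(x):
    global INITIALIZER_STATUS
    INITIALIZER_STATUS = x

def get_init_status():
    return INITIALIZER_STATUS

if __name__ == "__main__":
    raise RuntimeError("script body must not run in the side module")
'''


def load_script_as_side_module(source, modname='__main_main__'):
    """Exec source in a fresh module and register it in sys.modules."""
    mod = types.ModuleType(modname)
    # __name__ is modname here, so the script's __main__ guard is skipped.
    exec(compile(source, '<original script>', 'exec'), mod.__dict__)
    sys.modules[modname] = mod
    return mod


side = load_script_as_side_module(SCRIPT)
side.init('initialized')
print(side.get_init_status())

# Functions pickle by reference to the side module, not to __main__.
restored = pickle.loads(pickle.dumps(side.get_init_status))
print(restored is side.get_init_status)
```

In this sketch the functions' `__globals__` is the side module's own `__dict__`, so state set by an initializer is visible to later calls without touching `__main__`.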

@ericsnowcurrently
Member Author

@neonene, thanks for all the super helpful comments and feedback. You've had a real impact on the outcome here!

4 participants