JIT: remove jit dlopen global #802
|
Hey, @LeeLee26! I’m not very familiar with JIT/linker internals, so this might be a naive question, but I wanted to clarify the exact scope of the symbol-visibility improvement here.

MRE
I ran a small probe that checks whether a few Codon/GC runtime symbols are visible via the process-default symbol lookup (dlsym_default in the script below):

    from __future__ import annotations

    import ctypes
    import os
    import sys
    import traceback

    # Codon/GC runtime symbols to probe.
    SYMBOLS = [
        "GC_malloc",
        "GC_init",
        "GC_get_version",
        "seq_alloc",
    ]


    def section(title: str) -> None:
        print()
        print(f"== {title} ==")


    def dlsym_default(symbol: str) -> int:
        # Linux-specific: a NULL handle means RTLD_DEFAULT on glibc, i.e. the
        # process-default lookup scope. Returns 0 when the symbol is not found.
        libdl = ctypes.CDLL("libdl.so.2")
        libdl.dlsym.argtypes = [ctypes.c_void_p, ctypes.c_char_p]
        libdl.dlsym.restype = ctypes.c_void_p
        return int(libdl.dlsym(ctypes.c_void_p(0), symbol.encode()) or 0)


    def dump_visibility(stage: str) -> None:
        # Print Python's dlopen flags and the visibility of each probed symbol.
        section(stage)
        print(f"dlopenflags: {sys.getdlopenflags()}")
        print(f"RTLD_GLOBAL bit: {bool(sys.getdlopenflags() & os.RTLD_GLOBAL)}")
        for symbol in SYMBOLS:
            addr = dlsym_default(symbol)
            print(f"{symbol}: {'VISIBLE' if addr else 'hidden'}" + (f" @ 0x{addr:x}" if addr else ""))


    def main() -> int:
        dump_visibility("before import")

        section("import codon")
        try:
            import codon  # noqa: F401
            from codon.codon_jit import codon_library
            print("import codon: OK")
            print(f"codon_library(): {codon_library()!r}")
        except Exception:
            print("import codon: FAILED")
            traceback.print_exc()
            return 1
        dump_visibility("after import")

        section("first jit call")
        try:
            import codon

            @codon.jit
            def add_one(x):
                return x + 1

            print(f"jit result: {add_one(41)!r}")
        except Exception:
            print("first jit call: FAILED")
            traceback.print_exc()
            return 2
        dump_visibility("after first jit")
        return 0


    if __name__ == "__main__":
        raise SystemExit(main())

Result
On the current codon implementation,
When it comes to this PR branch, Python dlopen flags no longer include
But after importing Codon, those runtime symbols (GC_*, seq_*) still become visible through the process-default native symbol lookup scope:

One small clarification question about the wording in the PR description. I confirmed that this PR removes the Python-level
While checking that, I also noticed that after
Given the description:
I wanted to make sure I’m reading the scope correctly. Is that phrase mainly referring to the previous
I’m asking just to avoid misunderstanding the intended guarantee of this PR. |
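The probe above prints Python's dlopen flags as a raw integer. As a small aid for reading that output, here is a minimal sketch that decodes the integer into flag names. The helper `describe_dlopen_flags` and the tuple `_FLAG_NAMES` are hypothetical names introduced for illustration; they are not part of Codon or of the probe script, and the sketch assumes a Unix platform where `os` exposes the `RTLD_*` constants.

```python
from __future__ import annotations

import os
import sys

# Flags worth reporting. RTLD_LOCAL is 0 on glibc, so it cannot be tested as
# a bit; it is inferred below from the absence of RTLD_GLOBAL.
_FLAG_NAMES = (
    "RTLD_LAZY",
    "RTLD_NOW",
    "RTLD_GLOBAL",
    "RTLD_NODELETE",
    "RTLD_NOLOAD",
    "RTLD_DEEPBIND",
)


def describe_dlopen_flags(flags: int) -> list[str]:
    """Return the names of the RTLD_* flags set in `flags` (hypothetical helper)."""
    names = []
    for name in _FLAG_NAMES:
        bit = getattr(os, name, 0)  # some constants are missing on some platforms
        if bit and flags & bit == bit:
            names.append(name)
    if "RTLD_GLOBAL" not in names:
        names.append("RTLD_LOCAL")
    return names


if __name__ == "__main__":
    print(describe_dlopen_flags(sys.getdlopenflags()))
```

Calling it in each `dump_visibility` stage would show by name whether `RTLD_GLOBAL` is part of the interpreter's flags, which is the Python-level setting the question above is about.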
|
Hi @BI71317, thank you so much for taking the time to confirm the details and ask this great question. It really helps clarify the intent of this PR!

The reason Codon runtime symbols (like GC_*) are still accessible from the main process is that during jit_init, we call jit->getEngine()->addDynamicLibrary(rt) to load the Codon runtime. In the LLVM ORC JIT, this follows the execution chain:
On UNIX systems,
Key difference from the original Python-side
Regarding your question about whether all Codon/GC runtime symbols should be hidden:
Ideally, if we can restrict the Codon runtime to expose only a limited, controlled set of symbols, we would not need to rely on loading via
To confirm once more: the phrase "pollutes the global symbol table" in the PR description specifically refers to the problematic Python
Thanks again for your careful review and valuable feedback! |
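To make the difference in scope concrete, here is a rough Python-side sketch. It contrasts changing the interpreter's process-wide dlopen flags with loading one specific library globally. This is only an illustration under stated assumptions: `process_wide_global_flags` and `load_single_library_globally` are hypothetical helpers, `"m"` (libm) merely stands in for a runtime library, and none of this is Codon's or LLVM's actual code.

```python
import contextlib
import ctypes
import ctypes.util
import os
import sys


@contextlib.contextmanager
def process_wide_global_flags():
    """Hypothetical: every extension module imported inside the block is
    dlopen'ed with RTLD_GLOBAL, whatever library it happens to be."""
    saved = sys.getdlopenflags()
    sys.setdlopenflags(saved | os.RTLD_GLOBAL)
    try:
        yield
    finally:
        sys.setdlopenflags(saved)  # restore the interpreter-wide setting


def load_single_library_globally(name: str) -> ctypes.CDLL:
    """Hypothetical: load one named library with RTLD_GLOBAL and leave the
    interpreter's own dlopen flags untouched."""
    # find_library may return None; CDLL(None) then refers to the process itself.
    path = ctypes.util.find_library(name)
    return ctypes.CDLL(path, mode=os.RTLD_NOW | os.RTLD_GLOBAL)


if __name__ == "__main__":
    before = sys.getdlopenflags()
    load_single_library_globally("m")
    print("interpreter flags unchanged:", sys.getdlopenflags() == before)
```

In the first helper the setting applies to any import that happens while it is active; in the second, only the explicitly named library is affected, although its symbols still end up in the process-default lookup scope.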
|
Hey @BI71317, I have opened a new PR at #812 to further mitigate symbol pollution during JIT compilation caused by the global loading of
|
RTLD_GLOBAL when importing codon_jit, so LLVM ORC could resolve runtime symbols from the process-wide global symbol table. This is unsafe because it changes symbol visibility for the whole Python process and pollutes the global symbol table.
RTLD_GLOBAL requirement and switched to a layered lookup model: LLVM ORC still uses GetForCurrentProcess() for common system/Python symbols, while Codon runtime symbols are resolved by explicitly registering libcodonrt as an additional dynamic library. The runtime path is discovered from the Codon install layout and canonicalized with LLVM filesystem utilities before loading.
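The layered lookup model described above can be pictured with a small Python model: a symbol is tried against an ordered list of layers, first the current process and then an explicitly registered runtime library. This is a conceptual sketch only. The real implementation lives in C++ on top of LLVM ORC and is not shown here; `library_layer`, `resolve`, and the `fake_runtime` table with its made-up address are all hypothetical stand-ins.

```python
import ctypes
from typing import Callable, Optional

# A layer maps a symbol name to an address, or None when it does not know it.
Lookup = Callable[[str], Optional[int]]


def library_layer(lib: ctypes.CDLL) -> Lookup:
    """Wrap a loaded library handle as a lookup layer."""
    def lookup(symbol: str) -> Optional[int]:
        try:
            fn = getattr(lib, symbol)  # raises AttributeError if not found
        except AttributeError:
            return None
        return ctypes.cast(fn, ctypes.c_void_p).value
    return lookup


def resolve(symbol: str, layers: list[tuple[str, Lookup]]) -> Optional[tuple[str, int]]:
    """Return (layer name, address) from the first layer that knows the symbol."""
    for name, lookup in layers:
        addr = lookup(symbol)
        if addr:
            return name, addr
    return None


# Layer 1: symbols already visible in the current process (POSIX dlopen(NULL)).
process = library_layer(ctypes.CDLL(None))
# Layer 2: stand-in for an explicitly registered runtime library; the address
# is invented for illustration.
fake_runtime: Lookup = {"seq_alloc": 0x1000}.get

layers = [("process", process), ("runtime", fake_runtime)]

if __name__ == "__main__":
    for sym in ("getpid", "seq_alloc", "no_such_symbol_for_this_sketch"):
        print(sym, "->", resolve(sym, layers))
```

The ordering is the design choice being illustrated: common system symbols are served by the process layer, and only names it does not know fall through to the runtime layer.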