Draft
Conversation
9e0c9da to
9839ce6
Compare
- Move reduce.jl include to after show modules for better bootstrap ordering - Simplify nested @nexprs macro in multidimensional.jl - Simplify error message for missing reduced_index implementation - Add support for general UnitRange in reduced_index for OffsetArrays - Remove redundant test for empty array dimensional reductions
- Add 16x unrolled kernel for small types (<=32 bits) in mapreduce_kernel_commutative - Add RFMap type for better type inference in reduction kernels - Add type-aware commutativity checks for multiplication operations - Optimize pairwise blocksize for compute-heavy functions (sqrt, sin, cos, exp, log) - Add MIN_BLOCK constant for minimum reduction block size - Introduce _MRAllocSink stub for future sink API (minimal bootstrap support)
This is a major architectural refactor that replaces the _MapReduceAllocator and _MapReduceInPlace system with a unified sink API. Key changes: - Replace _MapReduceAllocator with _MRSink abstract type and _MRAllocSink - Add _MRInPlaceSink for in-place reductions with update support - Implement sink API methods: mapreduce_allocate, mapreduce_set!, mapreduce_accum! - Add streaming kernel _mapreducedim_stream_firstdim for "keep dim 1" case - Refactor mapreducedim functions to use sink API throughout - Update mapreduce! and reduce! to support update parameter - Improve pairwise reduction with better chunking and type dispatch - Add comprehensive handling of empty arrays and edge cases Benefits: - Unified API for both allocating and in-place reductions - Better performance for streaming reductions - More maintainable and extensible architecture - Proper support for update semantics in reduce!
- Remove unnecessary reduced_index(i::UnitRange) method (OffsetArrays handles its own types) - Remove unused layout_trait function and non-existent type references - Remove unused RFMap struct (was added but never constructed) - Remove unused _to_sink helper functions
adienes
pushed a commit
that referenced
this pull request
Sep 15, 2025
Fixes https://buildkite.com/julialang/julia-master/builds/46446#0195f712-1844-4e81-8b16-27b953fedcd3/899-1778 ``` | Error in testset Profile: | Test Failed at /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/stdlib/v1.13/Profile/test/runtests.jl:231 | Expression: occursin("@Compiler" * slash, str) | Evaluated: occursin("@Compiler/", "Overhead ╎ [+additional indent] Count File:Line Function\n=========================================================\n ╎9 @juliasrc/task.c:1249 start_task\n ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ 9 [unknown stackframe]\n ╎ 9 @Distributed/src/process_messages.jl:287 (::Distributed.var\"#handle_msg##2#handle_msg##3\"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()\n ╎ ╎ 9 @Distributed/src/process_messages.jl:70 run_work_thunk(thunk::Distributed.var\"#handle_msg##4#handle_msg##5\"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)\n ╎ ╎ 9 @Distributed/src/process_messages.jl:287 (::Distributed.var\"#handle_msg##4#handle_msg##5\"{Distributed.CallMsg{:call_fetch}})()\n ╎ ╎ 9 @juliasrc/builtins.c:841 jl_f__apply_iterate\n ╎ ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ 9 @Base/Base_compiler.jl:223 kwcall(::@NamedTuple{seed::UInt128}, ::typeof(invokelatest), ::Function, ::String, ::Vararg{String})\n ╎ ╎ ╎ 9 @juliasrc/builtins.c:841 jl_f__apply_iterate\n ╎ ╎ ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ ╎ 9 @juliasrc/builtins.c:853 jl_f_invokelatest\n ╎ ╎ ╎ ╎ 9 @juliasrc/julia.h:2353 jl_apply\n ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ ╎ ╎ 9 [unknown stackframe]\n ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:7 kwcall(::@NamedTuple{seed::UInt128}, ::typeof(runtests), name::String, path::String)\n ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:7 runtests\n ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:13 runtests(name::String, path::String, isolate::Bool; seed::UInt128)\n ╎ ╎ ╎ ╎ ╎ 9 @Base/env.jl:265 withenv(f::var\"#4#5\"{UInt128, String, String, Bool, Bool}, keyvals::Pair{String, Bool})\n ╎ ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:27 (::var\"#4#5\"{UInt128, String, String, Bool, Bool})()\n ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/timing.jl:621 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:29 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ 9 @Test/src/Test.jl:1835 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ 9 /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:37 macro expansion\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/Base.jl:303 include(mod::Module, _path::String)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/loading.jl:2925 _include(mapexpr::Function, mod::Module, _path::String)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3693 ijl_apply_generic\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/gf.c:3493 _jl_invoke\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/loading.jl:2865 include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @Base/boot.jl:489 eval(m::Module, e::Any)\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:1095 ijl_toplevel_eval_in\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:1050 ijl_toplevel_eval\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:978 jl_toplevel_eval_flex\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/toplevel.c:1038 jl_toplevel_eval_flex\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:897 jl_interpret_toplevel_thunk\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:557 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:557 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:557 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:692 eval_body\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:193 eval_stmt_value\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:242 eval_value\n ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ ╎ 9 @juliasrc/interpreter.c:124 do_call\n ╎ ╎ ╎ ╎ ... ```
adienes
pushed a commit
that referenced
this pull request
Sep 15, 2025
Use an atomic fetch and add to fix a data race in `Module()` identified by tsan: ``` ./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end' ================== WARNING: ThreadSanitizer: data race (pid=5575) Write of size 4 at 0xffff9bf9bd28 by thread T9: #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4) #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4) #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968) #3 <null> <null> (0xffff76a21164) #4 <null> <null> (0xffff76a1f074) #5 <null> <null> (0xffff76a1f0c4) #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04) #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04) #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4) #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4) Previous write of size 4 at 0xffff9bf9bd28 by thread T10: #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4) #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4) #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968) #3 <null> <null> (0xffff76a21164) #4 <null> <null> (0xffff76a1f074) #5 <null> <null> (0xffff76a1f0c4) #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04) #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04) #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4) #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4) Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28) ```
adienes
pushed a commit
that referenced
this pull request
Sep 15, 2025
Simplify `workqueue_for`. While not strictly necessary, the acquire load
in `getindex(once::OncePerThread{T,F}, tid::Integer)` makes
ThreadSanitizer happy. With the existing implementation, we get false
positives whenever a thread other than the one that originally allocated
the array reads it:
```
==================
WARNING: ThreadSanitizer: data race (pid=6819)
Atomic read of size 8 at 0xffff86bec058 by main thread:
#0 getproperty Base_compiler.jl:57 (sys.so+0x113b478)
#1 julia_pushNOT._1925 task.jl:868 (sys.so+0x113b478)
#2 julia_enq_work_1896 task.jl:969 (sys.so+0x5cd218)
#3 schedule task.jl:983 (sys.so+0x892294)
#4 macro expansion threadingconstructs.jl:522 (sys.so+0x892294)
#5 julia_start_profile_listener_60681 Base.jl:355 (sys.so+0x892294)
#6 julia___init___60641 Base.jl:392 (sys.so+0x1178dc)
#7 jfptr___init___60642 <null> (sys.so+0x118134)
#8 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
#9 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
#10 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0xbba74)
#11 jl_module_run_initializer /home/user/c/julia/src/toplevel.c:68:13 (libjulia-internal.so.1.13+0xbba74)
JuliaLang#12 _finish_jl_init_ /home/user/c/julia/src/init.c:632:13 (libjulia-internal.so.1.13+0x9c0fc)
JuliaLang#13 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
JuliaLang#14 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
JuliaLang#15 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
JuliaLang#16 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)
Previous write of size 8 at 0xffff86bec058 by thread T2:
#0 IntrusiveLinkedListSynchronized task.jl:863 (sys.so+0x78d220)
#1 macro expansion task.jl:932 (sys.so+0x78d220)
#2 macro expansion lock.jl:376 (sys.so+0x78d220)
#3 julia_workqueue_for_1933 task.jl:924 (sys.so+0x78d220)
#4 julia_wait_2048 task.jl:1204 (sys.so+0x6255ac)
#5 julia_task_done_hook_49205 task.jl:839 (sys.so+0x128fdc0)
#6 jfptr_task_done_hook_49206 <null> (sys.so+0x902218)
#7 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
#8 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
#9 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9c79c)
#10 jl_finish_task /home/user/c/julia/src/task.c:345:13 (libjulia-internal.so.1.13+0x9c79c)
#11 jl_threadfun /home/user/c/julia/src/scheduler.c:122:5 (libjulia-internal.so.1.13+0xe7db8)
Thread T2 (tid=6824, running) created by main thread at:
#0 pthread_create <null> (julia+0x85f88)
#1 uv_thread_create_ex /workspace/srcdir/libuv/src/unix/thread.c:172 (libjulia-internal.so.1.13+0x1a8d70)
#2 _finish_jl_init_ /home/user/c/julia/src/init.c:618:5 (libjulia-internal.so.1.13+0x9c010)
#3 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
#4 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
#5 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
#6 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)
SUMMARY: ThreadSanitizer: data race Base_compiler.jl:57 in getproperty
==================
```
adienes
pushed a commit
that referenced
this pull request
Oct 10, 2025
Use an atomic fetch and add to fix a data race in `Module()` identified by tsan: ``` ./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end' ================== WARNING: ThreadSanitizer: data race (pid=5575) Write of size 4 at 0xffff9bf9bd28 by thread T9: #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4) #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4) #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968) #3 <null> <null> (0xffff76a21164) #4 <null> <null> (0xffff76a1f074) #5 <null> <null> (0xffff76a1f0c4) #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04) #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04) #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4) #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4) Previous write of size 4 at 0xffff9bf9bd28 by thread T10: #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4) #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4) #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968) #3 <null> <null> (0xffff76a21164) #4 <null> <null> (0xffff76a1f074) #5 <null> <null> (0xffff76a1f0c4) #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04) #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04) #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4) #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4) Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28) ``` (cherry picked from commit 9039555)
adienes
pushed a commit
that referenced
this pull request
Feb 9, 2026
This still needs some cleanup, but after a lot of discussion and some experimentation it dawned on me that if we always initialize after late gc lowering runs then there's no need for tricks to hide this from LLVM. There are some caveats (that exist today so this is not a regression). Codegen currently always emits stores of GC tracked fields as release atomics, but doesn't do that for any of it's null initialization. For reasons of LLVM never touching atomics, that stops all optimizations of the null pointer stores. This causes things like ```llvm %"new::RefValue" = call noalias nonnull align 8 dereferenceable(16) ptr @ijl_gc_small_alloc(ptr %ptls_load3, i32 360, i32 16, i64 140078842429264) #5, !dbg !25 %"new::RefValue.tag_addr" = getelementptr inbounds i8, ptr %"new::RefValue", i64 -8, !dbg !25 store atomic i64 140078842429264, ptr %"new::RefValue.tag_addr" unordered, align 8, !dbg !25, !tbaa !30 store ptr null, ptr %"new::RefValue", align 8, !dbg !25, !tbaa !33, !alias.scope !36, !noalias !39 store atomic ptr %"x::Array", ptr %"new::RefValue" release, align 8, !dbg !25, !tbaa !33, !alias.scope !36, !noalias !39 ret ptr %"new::RefValue", !dbg !25 ``` for something like ` @code_llvm raw=true dump_module=true Ref([])` AI PART Move GC pointer field zeroing from codegen to late-gc-lowering to prevent optimization passes from sinking the null stores past safepoints. This ensures the GC never sees uninitialized pointer values. The implementation uses LLVM operand bundles on the gc_alloc_obj call: - `julia.gc_alloc_ptr_offsets(i64 off1, i64 off2, ...)`: specifies byte offsets of GC pointer fields that need null initialization - `julia.gc_alloc_zeroinit(i64 offset, i64 size)`: specifies a contiguous region to zero (for GenericMemory with boxed elements) Late-gc-lowering processes these bundles after lowering the allocation to gc_alloc_bytes and emits the appropriate null stores or memset calls. Since this happens after optimization passes, the zeroing cannot be incorrectly sunk past safepoints. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
adienes
pushed a commit
that referenced
this pull request
Feb 25, 2026
# Overview This PR overhauls the way linking works in Julia, both in the JIT and AOT. The point is to enable us to generate LLVM IR that depends only on the source IR, eliminating both nondeterminism and statefulness. This serves two purposes. First, if the IR is predictable, we can cache compile objects using the bitcode hash as a key, like how the ThinLTO cache works. JuliaLang#58592 was an early experiment along these lines. Second, we can reuse work that was done in a previous session, like pkgimages, but for the JIT. We accomplish this by generating names that are unique only within the current LLVM module, removing most uses of the `globalUniqueGeneratedNames` counter. The replacement for `jl_codegen_params_t`, `jl_codegen_output_t`, represents a Julia "translation unit", and tracks the information we'll need to link the compiled module into the running session. When linking, we manipulate the JITLink [LinkGraph](https://llvm.org/docs/JITLink.html#linkgraph) (after compilation) instead of renaming functions in the LLVM IR (before). ## Example ``` julia> @noinline foo(x) = x + 2.0 baz(x) = foo(foo(x)) code_llvm(baz, (Int64,); dump_module=true, optimize=false) ``` Nightly: ```llvm [...] @"+Core.Float64#774" = private unnamed_addr constant ptr @"+Core.Float64#774.jit" @"+Core.Float64#774.jit" = private alias ptr, inttoptr (i64 4797624416 to ptr) ; Function Signature: baz(Int64) ; @ REPL[1]:2 within `baz` define double @julia_baz_772(i64 signext %"x::Int64") #0 { top: %pgcstack = call ptr @julia.get_pgcstack() %0 = call double @j_foo_775(i64 signext %"x::Int64") %1 = call double @j_foo_776(double %0) ret double %1 } ; Function Attrs: noinline optnone define nonnull ptr @jfptr_baz_773(ptr %"function::Core.Function", ptr noalias nocapture noundef readonly %"args::Any[]", i32 %"nargs::UInt32") #1 { top: %pgcstack = call ptr @julia.get_pgcstack() %0 = getelementptr inbounds i8, ptr %"args::Any[]", i32 0 %1 = load ptr, ptr %0, align 8 %.unbox = load i64, ptr %1, align 8 %2 = call double @julia_baz_772(i64 signext %.unbox) %"+Core.Float64#774" = load ptr, ptr @"+Core.Float64#774", align 8 %Float64 = ptrtoint ptr %"+Core.Float64#774" to i64 %3 = inttoptr i64 %Float64 to ptr %current_task = getelementptr inbounds i8, ptr %pgcstack, i32 -152 %"box::Float64" = call noalias nonnull align 8 dereferenceable(8) ptr @julia.gc_alloc_obj(ptr %current_task, i64 8, ptr %3) #5 store double %2, ptr %"box::Float64", align 8 ret ptr %"box::Float64" } [...] ``` Diff after this PR. Notice how each symbol gets the lowest possible integer suffix that will make it unique to the module, and how the two specializations for `foo` get different names: ```diff @@ -4,18 +4,18 @@ target triple = "arm64-apple-darwin24.6.0" -@"+Core.Float64#774" = external global ptr +@"+Core.Float64#_0" = external global ptr ; Function Signature: baz(Int64) ; @ REPL[1]:2 within `baz` -define double @julia_baz_772(i64 signext %"x::Int64") #0 { +define double @julia_baz_0(i64 signext %"x::Int64") #0 { top: %pgcstack = call ptr @julia.get_pgcstack() - %0 = call double @j_foo_775(i64 signext %"x::Int64") - %1 = call double @j_foo_776(double %0) + %0 = call double @j_foo_0(i64 signext %"x::Int64") + %1 = call double @j_foo_1(double %0) ret double %1 } ; Function Attrs: noinline optnone -define nonnull ptr @jfptr_baz_773(ptr %"function::Core.Function", ptr noalias nocapture noundef readonly %"args::Any[]", i32 %"nargs::UInt32") #1 { +define nonnull ptr @jfptr_baz_0(ptr %"function::Core.Function", ptr noalias nocapture noundef readonly %"args::Any[]", i32 %"nargs::UInt32") #1 { top: %pgcstack = call ptr @julia.get_pgcstack() @@ -23,7 +23,7 @@ %1 = load ptr, ptr %0, align 8 %.unbox = load i64, ptr %1, align 8 - %2 = call double @julia_baz_772(i64 signext %.unbox) - %"+Core.Float64#774" = load ptr, ptr @"+Core.Float64#774", align 8 - %Float64 = ptrtoint ptr %"+Core.Float64#774" to i64 + %2 = call double @julia_baz_0(i64 signext %.unbox) + %"+Core.Float64#_0" = load ptr, ptr @"+Core.Float64#_0", align 8 + %Float64 = ptrtoint ptr %"+Core.Float64#_0" to i64 %3 = inttoptr i64 %Float64 to ptr %current_task = getelementptr inbounds i8, ptr %pgcstack, i32 -152 @@ -39,8 +39,8 @@ ; Function Signature: foo(Int64) -declare double @j_foo_775(i64 signext) #3 +declare double @j_foo_0(i64 signext) #3 ; Function Signature: foo(Float64) -declare double @j_foo_776(double) #4 +declare double @j_foo_1(double) #4 attributes #0 = { "frame-pointer"="all" "julia.fsig"="baz(Int64)" "probe-stack"="inline-asm" } ``` ## List of changes - Many sources of statefulness and nondeterminism in the emitted LLVM IR have been eliminated, namely: - Function symbols defined for CodeInstances - Global symbols referring to data on the Julia heap - Undefined function symbols referring to invoked external CodeInstances - `jl_codeinst_params_t` has become `jl_codegen_output_t`. It now represents one Julia "translation unit". More than one CodeInstance can be emitted to the same `jl_codegen_output_t`, if desired, though in the JIT every CI gets its own right now. One motivation behind this is to allow us to emit code on multiple threads and avoid the bitcode serialize/deserialize step we currently do, if that proves worthwhile. When we are done emitting to a `jl_codegen_output_t`, we call `.finish()`, which discards the intermediate state and returns only the LLVM module and the info needed for linking (`jl_linker_info_t`). - The new `JLMaterializationUnit` wraps emitting Julia LLVM modules and the associated `jl_linker_info_t`. It informs ORC that we can materialize symbols for the CIs defined by that output, and picks globally unique names for them. When it is materialized, it resolves all the call targets and generates trampolines for CodeInstances that are invoked but have the wrong calling convention, or are not yet compiled. - We now postpone linking decisions to after codegen whenever possible. For example, `emit_invoke` no longer tries to find a compiled version of the CodeInstance, and it no longer generates trampolines to adapt calling conventions. `jl_analyze_workqueue`'s job has been absorbed into `JuliaOJIT::linkOutput`. - Some `image_codegen` differences have been removed: - Codegen no longer cares if a compiled CodeInstance came from an image. During ahead-of-time linking, we generate thunk functions that load the address from the fvars table. - In `jl_emit_native_impl`, emit every CodeInstance into one `jl_codegen_output_t`. We now defer the creation of the `llvm::Linker` for llvmcalls, which has construction cost that grows with the size of the destination module, until the very end. - RTDyld is removed completely, since we cannot control linking like we can with JITLink. Since JuliaLang#60105, platforms that previous used the optimized memory manager now use the new one. ### General refactoring - Adapt the `jl_callingconv_t` enum from `staticdata.c` into `jl_invoke_api_t` and use it in more places. There is one enumerator for each special `jl_callptr_t` function that can go in a CodeInstance's `invoke` field, as well as one that indicates an invoke wrapper should be there. There is a convenience function for reading an invoke pointer and getting the API type, and vice versa. - Avoid using magic string values, and try to directly pass pointers to LLVM `Function *` or ORC string pool entries when possible. ## Future work - `DLSymOptimizer` should be mostly removed, in favour of emitting raw ccalls and redirecting them to the appropriate target during linking. - We should support ahead-of-time linking multiple `jl_codegen_output_t`s together, in order to parallelize LLVM IR emission when compiling a system image. - We still pass strings to `emit_call_specfun_other`, even though the prototype for the function is now created by `jl_codegen_output_t::get_call_target`. We should hold on to the calling convention info so it doesn't have to be recomputed.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.