This issue tracks the progress and roadmap of what needs to be done to support codegen for targets like AMDGPUs. Personally, I am working on AMDGPU codegen as it would be used for HSA; specifically, I am aiming for the `amdgcn-amd-amdhsa-amdgiz` LLVM target. Note that I’m still learning, so this issue will likely change as guided by experience.
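Here are the pieces needed to make this work at an MVP level (i.e. without providing access to most GPU-specific features):
- Make `librustc_codegen_llvm` aware of LLVM address spaces (#51576). E.g. allocas are in address space 5 for the target triple I mentioned above.
- The `amdgpu-kernel` ABI (PR #52032, "Add the `amdgpu-kernel` ABI").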
The address space changes are fairly general. However, to avoid sweeping changes to how Rust is codegen-ed for LLVM, any target must support a flat address space, meaning an address space that is a superset of all others.
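To make the "flat is a superset" requirement concrete, here is a minimal sketch in plain Rust. It is not rustc or LLVM code: the `AddrSpace` enum, `is_superset_of`, and `cast_to_flat` are hypothetical names for illustration, and the numbering (flat = 0, global = 1, local = 3) is my assumption about the `amdgiz` environment; only "allocas are in address space 5" comes from this issue.

```rust
/// Address spaces used in this sketch. Only `Private = 5` (allocas) is
/// stated in the issue; the other numbers are assumptions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum AddrSpace {
    Flat = 0,
    Global = 1,
    Local = 3,
    Private = 5,
}

impl AddrSpace {
    /// The flat space can address everything; any other space only
    /// covers itself.
    fn is_superset_of(self, other: AddrSpace) -> bool {
        self == AddrSpace::Flat || self == other
    }
}

/// Hypothetical helper: returns the (from, to) address space numbers of
/// the cast that codegen would have to emit before handing a pointer to
/// code that assumes the flat space, or `None` if no cast is needed.
fn cast_to_flat(from: AddrSpace) -> Option<(u32, u32)> {
    if from == AddrSpace::Flat {
        None
    } else {
        Some((from as u32, AddrSpace::Flat as u32))
    }
}
```

Under these assumptions, an alloca result (`AddrSpace::Private`) needs a 5-to-0 cast before generic Rust code can use it, which is why a target without a flat space would require much deeper changes.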
`amdgpu-kernel` requires its return type to be `void`. There are two ways I see to do this:
- compile-time checks (somewhere in `rustc`), i.e. disallow any return type except `!` and `()`.
- rewriting returns to use an `sret`-like style: promote the return to be an indirect first argument of the function.
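The two options above can be sketched in ordinary Rust. This is only an illustration of the idea, not what rustc does internally: `RetTy`, `check_kernel_return`, `kernel_body`, and `kernel_rewritten` are hypothetical names, and plain functions stand in for `extern "amdgpu-kernel"` functions.

```rust
use std::mem::MaybeUninit;

/// Hypothetical, simplified view of a kernel's declared return type.
#[derive(Debug, PartialEq, Eq)]
enum RetTy {
    Unit,
    Never,
    Other(&'static str),
}

/// Option 1: a compile-time style check that rejects anything other
/// than `()` and `!`.
fn check_kernel_return(ret: &RetTy) -> Result<(), String> {
    match ret {
        RetTy::Unit | RetTy::Never => Ok(()),
        RetTy::Other(name) => Err(format!(
            "kernel functions must return `()` or `!`, found `{}`",
            name
        )),
    }
}

/// What the user wrote: a function that returns a value.
fn kernel_body(x: u32) -> u32 {
    x * 2
}

/// Option 2: the sret-like rewrite. The return value becomes an
/// indirect first argument and the function itself returns `()`.
fn kernel_rewritten(ret: &mut MaybeUninit<u32>, x: u32) {
    ret.write(kernel_body(x));
}
```

The rewrite keeps user code compiling unchanged, but the caller now has to know about the hidden out-pointer, which is the "magic" that an explicit error would avoid.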
As I recall, Rust inserts wrapper functions for functions declared with `extern "abi"`, which call the real Rust-ABI function. My current impl went with the magical rewriting, but I think forcing the user to acknowledge this with an error is better long term.
Privately, I've made it to errors stemming from # 4 on general Rust code (i.e. `std`/`core` code). See this repo/crate.
Regarding virtual function calls: in principle, they can be supported completely GPU-side when using HSA. `amdgpu-kernel`s have access to two different `hsa_queue_t`s (one for the host and one for the device), set up by the GPU’s hardware command processor. When a virtual call is encountered, the trick is to have the GPU write to its own `hsa_queue_t` and then wait on the completion signal. Foreign functions can be supported in the same way, by writing to the host `hsa_queue_t` instead.
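The "write a packet, then wait on the completion signal" pattern can be modelled on the host with standard-library primitives. This is a toy sketch only and uses none of the real HSA API: an `mpsc` channel stands in for an `hsa_queue_t`, an `AtomicI64` stands in for the completion signal, a thread stands in for the packet processor, and `Packet`, `spawn_packet_processor`, `call_via_queue`, and `add_one` are all hypothetical names.

```rust
use std::sync::atomic::{AtomicI64, AtomicU32, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

/// Toy stand-in for a dispatch packet: which function to run, its
/// argument, where to put the result, and a completion signal.
struct Packet {
    callee: fn(u32) -> u32,
    arg: u32,
    result: Arc<AtomicU32>,
    completion: Arc<AtomicI64>,
}

/// Starts a thread that plays the role of the packet processor and
/// returns the sending half of its queue.
fn spawn_packet_processor() -> mpsc::Sender<Packet> {
    let (tx, rx) = mpsc::channel::<Packet>();
    thread::spawn(move || {
        for packet in rx {
            let value = (packet.callee)(packet.arg);
            packet.result.store(value, Ordering::Relaxed);
            // Decrementing the signal publishes the result to the waiter.
            packet.completion.fetch_sub(1, Ordering::Release);
        }
    });
    tx
}

/// Performs an indirect call by enqueueing a packet and waiting until
/// the completion signal drops to zero.
fn call_via_queue(queue: &mpsc::Sender<Packet>, callee: fn(u32) -> u32, arg: u32) -> u32 {
    let result = Arc::new(AtomicU32::new(0));
    let completion = Arc::new(AtomicI64::new(1));
    queue
        .send(Packet {
            callee,
            arg,
            result: Arc::clone(&result),
            completion: Arc::clone(&completion),
        })
        .expect("packet processor has shut down");
    while completion.load(Ordering::Acquire) > 0 {
        thread::yield_now();
    }
    result.load(Ordering::Relaxed)
}

/// Example callee for the sketch.
fn add_one(x: u32) -> u32 {
    x + 1
}
```

Calling `call_via_queue(&queue, add_one, 41)` on a queue from `spawn_packet_processor()` blocks until the processor has run the callee. In this model, a foreign function call would differ only in which queue the packet is written to.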
Post-MVP
TBD(TODO) Discuss?
Informational links