Conversation
BackendBench/backends/directory.py
    cpp_sources=cpp_source,
    cuda_sources=cuda_source,
    functions=[folder_name],
    verbose=True,
Check the `no_implicit_headers` mode, otherwise this function will take ~90 s per call.
I set `no_implicit_headers` to True and added the headers to the CUDA files. As a result, the running time for TestDirectoryBackendCUDA decreased from approximately 50 seconds to around 10 seconds.
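A rough sketch of what that `load_inline` call looks like with the headers moved into the CUDA source. This is illustrative, not the actual BackendBench call site: the `add` kernel, source strings, and extension name are assumptions; `no_implicit_headers` is taken from the thread above and requires a recent PyTorch.

```python
# Hedged sketch of the load_inline pattern discussed above; the real
# BackendBench sources are read from per-op directories, not inlined here.
cuda_source = r"""
#include <torch/extension.h>
#include <cuda_runtime.h>

__global__ void add_kernel(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = a[i] + b[i];
}

torch::Tensor add(torch::Tensor a, torch::Tensor b) {
    auto out = torch::empty_like(a);
    int n = a.numel();
    add_kernel<<<(n + 255) / 256, 256>>>(
        a.data_ptr<float>(), b.data_ptr<float>(), out.data_ptr<float>(), n);
    return out;
}
"""
# With implicit headers disabled, the headers must appear explicitly,
# including on the .cpp side, which otherwise only needs the declaration.
cpp_source = (
    "#include <torch/extension.h>\n"
    "torch::Tensor add(torch::Tensor a, torch::Tensor b);"
)

def build_extension():
    # Imported lazily so the sketch can be read without torch installed.
    from torch.utils.cpp_extension import load_inline
    return load_inline(
        name="add_op",
        cpp_sources=cpp_source,
        cuda_sources=cuda_source,
        functions=["add"],
        no_implicit_headers=True,  # skip implicit includes; we added them above
        verbose=True,
    )
```

Skipping the implicit header injection is what avoids recompiling the large default header set on every call, hence the ~50 s → ~10 s drop reported above.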
Update:
You should be able to set CUDA_HOME on the CI runners though, it's a T4 machine with a GPU
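A CI step along these lines would do it. This is a hedged sketch only: the package name, version, and install paths are assumptions, not what the runner actually executes.

```shell
# Hypothetical CI setup step; exact toolkit package/version is an assumption.
sudo apt-get update
sudo apt-get install -y cuda-toolkit      # provides nvcc
export CUDA_HOME=/usr/local/cuda          # load_inline uses this to find nvcc
export PATH="$CUDA_HOME/bin:$PATH"
```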
Yeah I found that I can install the CUDA toolkit directly on the CI runner. Now the tests are running successfully:
Is this ready for a review?
Yes!
    cuda_source = ""

    # Read both files if they exist
    if os.path.exists(cu_file):
I'm wondering if we can simplify this a bit and only make the LLM spit out the .cu file. The .cpp file should typically be quite simple for us to provide; see this as an example: https://github.com/gpu-mode/reference-kernels/blob/main/problems/pmpp/vectoradd_py/solutions/correct/submission_cuda_inline.py#L48
Solved! Added a new parameter `load_cpp_source`, which defaults to false. It controls whether to load the cpp source from the .cpp file (`load_cpp_source=true`) or to generate the cpp source content from the CUDA source (`load_cpp_source=false`).
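One way to generate the cpp source from the CUDA source is to extract the C++ wrapper signatures and emit them as forward declarations, as in the linked reference-kernels example. The sketch below is an assumption about the approach, not the actual BackendBench code: `make_cpp_source` and its regex are illustrative only.

```python
import re

def make_cpp_source(cuda_source: str) -> str:
    """Hypothetical helper: derive .cpp forward declarations from the .cu file,
    so the LLM only has to emit CUDA. Not the real BackendBench implementation."""
    # Match simple C++ wrapper signatures like:
    #   torch::Tensor add(torch::Tensor a, torch::Tensor b)
    sigs = re.findall(r"torch::Tensor\s+\w+\([^)]*\)", cuda_source)
    return "\n".join(s + ";" for s in sigs)

cuda_source = """
torch::Tensor add(torch::Tensor a, torch::Tensor b) {
    return a + b;  // placeholder body
}
"""
print(make_cpp_source(cuda_source))
# → torch::Tensor add(torch::Tensor a, torch::Tensor b);
```

The declarations are all `load_inline` needs on the cpp side, since it generates the pybind11 bindings itself from `functions=[...]`.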
msaroufim left a comment:
mostly looking good, some minor questions
This pull request introduces the following changes: