Conversation
```diff
 template<typename T>
-recv_request recv(message_buffer<T>& msg, rank_type src, tag_type tag)
+recv_request recv(message_buffer<T>& msg, rank_type src, tag_type tag, void* stream = nullptr)
```
Is this a good API?
This means that for NCCL the default stream is used if nothing is specified (a stream is always required for NCCL). For other backends the stream is ignored.
```diff
 template<typename T, typename CallBack>
-recv_request recv(message_buffer<T>&& msg, rank_type src, tag_type tag, CallBack&& callback)
+recv_request recv(message_buffer<T>&& msg, rank_type src, tag_type tag, CallBack&& callback, void* stream = nullptr)
```
These signatures can lead to ambiguous calls: leaving out the callback but supplying a stream can match this overload as well, with the stream taking the place of `CallBack`. Is this OK?
I would add some SFINAE constraints such as `std::enable_if_t<std::is_invocable_v<CallBack>>`, but I am not sure if this is a good idea.
Yeah, it's a bit unfortunate. There's OOMPH_CHECK_CALLBACK* that's used essentially for that in the body of the functions, but that's not SFINAE. Also unsure what's best here.
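For illustration, a minimal self-contained sketch of the SFINAE constraint under discussion. The `message` type and the `int` return markers are stand-ins, not oomph's actual API, and the constraint checks invocability with the message rather than oomph's exact callback signature:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct message { int payload = 0; };  // stand-in for message_buffer<T>

// Overload without callback: accepts an optional stream pointer.
inline int recv(message&, int /*src*/, int /*tag*/, void* /*stream*/ = nullptr)
{
    return 1;  // marker: plain overload chosen
}

// Overload with callback: SFINAE-disabled unless CallBack is actually
// invocable with the message, so a void* argument can never match here.
template<typename CallBack,
    typename = std::enable_if_t<std::is_invocable_v<CallBack, message&>>>
inline int recv(message& msg, int /*src*/, int /*tag*/, CallBack&& cb,
    void* /*stream*/ = nullptr)
{
    std::forward<CallBack>(cb)(msg);
    return 2;  // marker: callback overload chosen
}
```

With this constraint, `recv(msg, src, tag, stream)` cannot bind the stream pointer to `CallBack`, because `std::is_invocable_v<void*&, message&>` is false, while a real callable still selects the callback overload.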
Force-pushed from e81de62 to 8a854e4.
test/test_send_recv.cpp
```cpp
// TODO: The sreq.wait was previously called immediately. With NCCL
// groups can't call wait so early (communication hasn't started yet).
```
Note the semantic change here: if one attempts to call `env.comm.send(...).wait()` within the NCCL group it will hang. `wait` will block forever since the group never starts. Should that just throw an exception instead (we can easily query whether the group has already been ended)?
I would say it should throw an exception.
Sounds good, I'll (try to) add that.
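A sketch of the throw-instead-of-hang behavior, using a mock communicator and request (these names are illustrative, not oomph's actual implementation); the point is only that `wait()` checks whether a group is still open before blocking:

```cpp
#include <cassert>
#include <stdexcept>

struct communicator
{
    bool group_open = false;
    void start_group() { group_open = true; }
    void end_group() { group_open = false; }
};

struct request
{
    communicator* comm;
    void wait() const
    {
        // Inside an open group the operation has not been submitted yet,
        // so blocking would deadlock; fail loudly instead.
        if (comm->group_open)
            throw std::runtime_error("wait() called inside an open group");
        // ... otherwise actually wait for completion ...
    }
};
```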
This now seems to work in ICON Fortran. While I still have some open TODOs, I'd be grateful for feedback on this already. The general implementation is pretty much what I want it to be, though I still have some profiling to do with NCCL to check if I'm missing some additional low-hanging fruit. Besides any comments you may have on the implementation itself (in particular, I'd be grateful for comments on anything where I've misunderstood oomph's requirements for backends), I guess we may need to discuss some sort of CI for the NCCL backend... I can't request reviews, so pinging @boeschf @biddisco @philip-paul-mueller.
philip-paul-mueller
left a comment
I have some comments/suggestions, but I am not sure what they are worth; probably not much.
```cpp
static cuda_event_pool pool{128};
return pool;
```
```diff
-static cuda_event_pool pool{128};
-return pool;
+static cuda_event_pool* pool = new cuda_event_pool(128);
+return *pool;
```
See: https://isocpp.org/wiki/faq/ctors#construct-on-first-use-v2
I have to say I disagree with that motivation, or at least the solution. IMO if the events outlive the pool, then the events should be returned earlier, not the pool leaked. But I can be convinced otherwise...
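For reference, the two variants side by side as compilable code, with a stand-in pool type. The leaked-pointer form from the isocpp FAQ avoids static destruction-order problems because the pool is intentionally never destroyed:

```cpp
#include <cassert>

struct cuda_event_pool  // stand-in for the real pool type
{
    int capacity;
    explicit cuda_event_pool(int n) : capacity(n) {}
};

// Variant A: function-local static. Destroyed during static destruction,
// which may run before other static objects that still hold events.
cuda_event_pool& pool_static()
{
    static cuda_event_pool pool{128};
    return pool;
}

// Variant B: construct-on-first-use with an intentional leak. The pool
// outlives every other object, at the cost of never running its destructor.
cuda_event_pool& pool_leaked()
{
    static cuda_event_pool* pool = new cuda_event_pool(128);
    return *pool;
}
```

Both are thread-safe to initialize (magic statics); the disagreement above is only about whether leaking the pool or returning events earlier is the right fix.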
```cpp
ncclResult_t result;
do {
    OOMPH_CHECK_NCCL_RESULT(ncclCommGetAsyncError(m_comm, &result));
} while (result == ncclInProgress);
```
This is more of a question for myself, but this can technically go on indefinitely.
So would it be a good idea to include a timeout?
I think NCCL internally has enough timeouts that this should not be a problem, but not completely sure... If there's a timeout, the question is what value it should be and how it's configured.
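If a timeout were added, the loop could look roughly like this. The NCCL calls are mocked out here so the sketch is self-contained, and the timeout value and how it is configured are exactly the open questions:

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>

enum mock_result { in_progress, completed };  // stand-in for ncclResult_t

// Mock of ncclCommGetAsyncError: reports in_progress a few times, then done.
mock_result poll_async_error()
{
    static int calls = 0;
    return (++calls < 3) ? in_progress : completed;
}

// Polling loop with a deadline instead of spinning indefinitely.
mock_result wait_for_completion(std::chrono::milliseconds timeout)
{
    auto const deadline = std::chrono::steady_clock::now() + timeout;
    mock_result result;
    do {
        result = poll_async_error();
        if (result != in_progress) break;
        if (std::chrono::steady_clock::now() > deadline)
            throw std::runtime_error("timed out waiting for NCCL operation");
    } while (true);
    return result;
}
```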
Mostly just copy MPI implementation to a new directory, not functional.
```yaml
# ConstructorInitializerAllOnOneLineOrOnePerLine: false
BreakConstructorInitializers: BeforeComma
ConstructorInitializerIndentWidth: 0
BreakInheritanceList: BeforeComma
```
src/nccl/request_state.hpp
```cpp
: base{ctxt, comm, scheduled, rank, tag, std::move(cb)}
, m_req{std::move(m)}
{
    // std::cerr << "creating nccl shared_request_state\n";
```
To do: remove the leftover debug output:
```diff
-// std::cerr << "creating nccl shared_request_state\n";
```
src/nccl/request_state.hpp
```cpp
: base{ctxt, comm, scheduled, rank, tag, std::move(cb)}
, m_req{std::move(m)}
{
    // std::cerr << "creating nccl request_state\n";
```
```diff
-// std::cerr << "creating nccl request_state\n";
```
src/nccl/request_queue.hpp
```cpp
{
    if (e->m_req.is_ready())
    {
        // std::cerr << "found ready request in shared queue\n";
```
```diff
-// std::cerr << "found ready request in shared queue\n";
```
NCCL can work with host memory on unified memory systems.
This adds an NCCL backend, with some strong constraints compared to the MPI, libfabric, and UCX backends:
If one sticks to these requirements, one should be able to use any backend. If one needs any of the above features, NCCL can't be used.
Adds a few extra features to communicators:

- `start_group`/`end_group`: These map to `ncclGroupStart`/`ncclGroupEnd` for NCCL, and are no-ops for other backends.
- `is_stream_aware`: The NCCL backend is the only one that returns `true` for this. If a backend `is_stream_aware`, it will take into account the optional `stream` argument that can be passed to `send`/`recv`.
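A hypothetical usage sketch of these features. The communicator below is a mock so the example is self-contained, but the semantics (group calls that are no-ops except for NCCL, the stream honored only when `is_stream_aware`) follow the description above:

```cpp
#include <cassert>

struct mock_communicator
{
    explicit mock_communicator(bool aware) : stream_aware(aware) {}

    bool  stream_aware;          // true only for the NCCL backend
    int   group_depth  = 0;
    void* last_stream  = nullptr;

    // start_group/end_group: ncclGroupStart/ncclGroupEnd for NCCL,
    // no-ops elsewhere (modeled here as a depth counter either way).
    void start_group() { ++group_depth; }
    void end_group()   { --group_depth; }

    bool is_stream_aware() const { return stream_aware; }

    // A stream-aware backend honors the optional stream; others ignore it.
    void send(void* stream = nullptr)
    {
        last_stream = is_stream_aware() ? stream : nullptr;
    }
};
```

Code written against this interface works unchanged on every backend as long as it tolerates the stream being ignored and treats the group calls as an opaque bracket around the communication calls.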