Conversation
|
Thanks for working on this! I wonder if QUIC listener SSL, QUIC connection SSL, and QUIC stream SSL should be represented as separate (sub)classes. Although they all have the same type I think the blocking methods such as Regarding the thread assisted mode ( @ioquatix, what do you think? |
|
|
||
| GetSSL(self, ssl); | ||
| return SSL_is_init_finished(ssl) ? Qtrue : Qfalse; | ||
| } |
There was a problem hiding this comment.
Is this related to QUIC support? Is this different from seeing whether #accept/#connect/#accept_connection successfully returned or not?
There was a problem hiding this comment.
Yes. I'm using it on the server side to detect if the handshake has finished here. Without this, open_streams can raise an exception.
There was a problem hiding this comment.
Thanks for the pointer. I'm still trying to understand this, but I think this means SSL_accept_connection() returns once it receives the Initial packet and doesn't wait for the Handshake packet. The man page says:
https://docs.openssl.org/3.6/man3/SSL_new_listener/
The new SSL object returned from SSL_accept_connection() may or may not have completed its handshake at the point it is returned. Optionally, you may use the function SSL_is_init_finished(3) to determine this. You may call the functions SSL_accept(3), SSL_do_handshake(3) or SSL_handle_events(3) to progress the state of the SSL object towards handshake completion. Other "I/O" functions may also implicitly progress the state of the handshake such as SSL_poll(3), SSL_read(3) and SSL_write(3).
#accept_nonblock appears to work for this use-case, and I think this is clearer.
diff --git a/lib/kantan/h3/poll_session.rb b/lib/kantan/h3/poll_session.rb
index 609c8d1f46ad..c6fe30d1695e 100644
--- a/lib/kantan/h3/poll_session.rb
+++ b/lib/kantan/h3/poll_session.rb
@@ -13,13 +13,12 @@ module Kantan
wfds = []
# Wait for handshake
- until @conn.init_finished?
+ until @conn.accept_nonblock(exception: false) == @conn
rfds.clear
wfds.clear
rfds << @io if @conn.net_read_desired?
wfds << @io if @conn.net_write_desired?
IO.select(rfds, wfds, nil, @conn.event_timeout)
- @conn.handle_events
end
open_streamsI wish I could simply use the blocking @conn.accept here, but currently it doesn't take SSL_get_event_timeout() into account when waiting for the underlying socket with rb_io_wait(). I think that can be a later improvement.
There was a problem hiding this comment.
Ah great. I was able to apply your patch and it works well. It seems like I still need init_finished? on the client side though. I'm not sure how to get rid of it.
This commit exposes various QUIC methods so that you can build QUIC clients and servers
I'm not sure. We could do that, and I'm happy to do that work. It's up to you.
I think we shouldn't support it, and I'll remove it. I've updated the patch to make QUIC connections always non-blocking, and I think that's the way we should go. I've removed the accessor to |
Ah, I remember. I had considered adding a subclass as you say, but I've always run in to issues when trying to work with a subclass on |
Clients should use IO.select and non-blocking sockets instead
|
@rhenium do you mind taking another look? I've:
I think the patch is much smaller now, so hopefully it is easier to review. I don't mind adding a subclass for QUIC if you want it (but as I said I've had troubles with them in the past). I'm not sure if we want blocking methods like |
We need this method to determine if a stream is closed or not. Calling read_nonblock on a closed stream will raise an exception (even with `exception: false`) so we need this API to know not to call `read_nonblock`
I'm interested to hear about the problems you ran into. In ruby/openssl, I think separating them might be clearer for documentation purpose, e.g., methods like That said, I'm fine with having a single
I think they would be handy, but not essential. We can add them later. Are the deadlock possibly because the GVL was not released during the blocking calls? We should be able to implement them safely using non-blocking OpenSSL calls and Ruby's |
| if (!stream_ssl) { | ||
| if (no_exception_p(opts)) | ||
| return sym_wait_readable; | ||
| ossl_raise(eSSLErrorWaitReadable, "accept_stream would block"); | ||
| } |
There was a problem hiding this comment.
We should query the SSL_get_error() value since the connection may have failed or been closed and it's not retryable. In that case OpenSSL::SSL::SSLError should be raised instead.
SSL_accept_connection() and SSL_new_stream() also need it according to the man page.
There was a problem hiding this comment.
I'm checking these. I'm not sure about SSL_accept_connection (though I saw [this section in the man pages)(https://github.com/openssl/openssl/blob/b7a42c6a3c5fe58a613e6d34ee5c23a5c072c5e2/doc/man3/SSL_new_listener.pod?plain=1#L175-L176).
If you read the server demo code, the non-blocking server never checks SSL_get_error. The blocking server doesn't either. I tried calling it and got a "wrong type" error from this line.
I added error checking for SSL_new_stream and SSL_accept_stream.
There was a problem hiding this comment.
I had to partially revert the error checking. If I add error checking to SSL_accept_stream, it causes my test server to start stalling.
In the Kantan repo you can do this to start the server:
$ ruby -I lib::~/git/openssl/lib benchmarks/h3_server.rb
Then this to start the client:
$ ruby -I lib:~/git/openssl/lib benchmarks/h3_client.rb
Adding the error checking will cause it to stall.
| GetSSLCTX(v_ctx, ctx); | ||
| ossl_sslctx_setup(v_ctx); | ||
|
|
||
| listener = SSL_new_listener(ctx, 0); |
There was a problem hiding this comment.
listener is leaked in the error cases, for example when v_io is invalid or the wrapper object allocation fails.
#accept_stream and #new_stream can also leak the SSL if the wrapper object allocation fails.
| stream_ssl = SSL_new_stream(ssl, flags); | ||
| if (!stream_ssl) { | ||
| if (flags & SSL_STREAM_FLAG_NO_BLOCK) | ||
| return Qnil; | ||
| ossl_raise(eSSLError, "SSL_new_stream"); | ||
| } |
There was a problem hiding this comment.
After re-reading the man page, I started to think SSL_new_stream() also should have a non-blocking variant. I thought SSL_new_stream() was a local-only operation, but apparently it needs to write to the underlying socket unless SSL_STREAM_FLAG_ADVANCE is specified.
| /* | ||
| * call-seq: | ||
| * ssl.accept_stream => SSLSocket | ||
| * | ||
| * Accepts an incoming QUIC stream from the peer. Blocks until a stream is | ||
| * available. Returns a new SSLSocket representing the stream. | ||
| * | ||
| * Raises OpenSSL::SSL::SSLError on failure. | ||
| */ | ||
| static VALUE | ||
| ossl_ssl_accept_stream(VALUE self) | ||
| { | ||
| SSL *ssl, *stream_ssl; | ||
|
|
||
| GetSSL(self, ssl); | ||
| stream_ssl = SSL_accept_stream(ssl, 0); | ||
| if (!stream_ssl) | ||
| ossl_raise(eSSLError, "SSL_accept_stream"); | ||
|
|
||
| return ossl_ssl_wrap_stream(self, stream_ssl); | ||
| } |
There was a problem hiding this comment.
This currently doesn't actually block, presumably because we disable SSL_set_blocking_mode() and the underlying socket is also set to O_NONBLOCK. I think we'll need to do something similar to ossl_start_ssl() here.
|
I've playing with this, and I found The blocking I think both methods should prevent When running this code, one CPU starts pegging at 100% in 513th require "openssl"
require "socket"
pkey = OpenSSL::PKey::EC.generate("prime256v1")
cert = OpenSSL::X509::Certificate.new.tap do |cert|
cert.version = 2
cert.serial = 1
cert.not_before = Time.now - 60
cert.not_after = Time.now + 86400
cert.public_key = pkey
cert.subject = cert.issuer = OpenSSL::X509::Name.new([["CN", "localhost"]])
cert.sign(pkey, "sha256")
end
server_sock = UDPSocket.new.tap { it.bind("localhost", 5433) }
client_sock = UDPSocket.new.tap { it.connect("localhost", 5433) }
server_thread = Thread.start do
ctx = OpenSSL::SSL::SSLContext.quic(:server)
ctx.add_certificate(cert, pkey)
ctx.alpn_select_cb = -> ary { ary.last }
listener = OpenSSL::SSL::SSLSocket.new_listener(server_sock, context: ctx)
begin
connection = listener.accept_connection_nonblock
rescue IO::WaitReadable
p [:s, :accept_connection_wait_readable]
server_sock.wait_readable
retry
end
p [:s, :accept_connection]
begin
connection.accept_nonblock
rescue IO::WaitReadable
p [:s, :accept_wait_readable]
server_sock.wait_readable(connection.event_timeout)
retry
end
p [:s, :accept]
stream = begin
connection.accept_stream_nonblock
rescue IO::WaitReadable
p [:s, :accept_stream_wait_readable]
sleep 0.1
server_sock.wait_readable(connection.event_timeout)
retry
end
p [:s, :accept_stream, stream.stream_id]
loop do
connection.handle_events
sleep 0.1
end
end
client_thread = Thread.start do
ctx = OpenSSL::SSL::SSLContext.quic(:client)
ctx.alpn_protocols = ["a"]
ctx.groups = "X25519" # XXX
connection = OpenSSL::SSL::SSLSocket.new(client_sock, ctx)
begin
connection.connect_nonblock
rescue IO::WaitReadable
p [:c, :connect_wait_readable]
client_sock.wait_readable
retry
end
stream = connection.new_stream
# OpenSSL's initial_max_stream_data_bidi_remote appears to be 512KB
# write will fail after 512 times
1_000.times do |i|
stream.write("x"*1024)
p [:c, :write, stream.stream_id, i]
end
end
[server_thread, client_thread].each(&:join) |
|
@rhenium I will try your example and see if I can fix it.
That makes sense. I'll try to implement it. This really makes me think we should have a specific class for QUIC connections.
Honestly this was a long time ago and I don't remember the details. I think it was something to do with figuring out logic inside the GC free function but not knowing the type. I really have no strong feeling against adding a special QUIC class (in fact I'm starting to think we really should add one).
Yes. IO was holding the GVL in one thread, and GC tried to enter in another (actually I was testing with Ractors, but I think the same would happen with Thread). |
We want to have blocking operations, but we can't hold the GVL, so we'll emulate blocking operations by calling the non-blocking API, then call io_waitreadable

This commit exposes various QUIC methods so that you can build QUIC clients and servers.
I'm really sorry about the very large patch. I can split it in to smaller patches if that helps to get this upstream. I think the patch is pretty straightforward though. It just adds different QUIC related methods that OpenSSL provides. I've been using this patch to develop a QUIC/HTTP3 client and server in Ruby.
My development branch is here. Here is an example script that uses these APIs.