Make sure MYSQLI_OPT_CONNECT_TIMEOUT applies to the whole handshake.#9532
Make sure MYSQLI_OPT_CONNECT_TIMEOUT applies to the whole handshake.#9532vnsavage wants to merge 1 commit intophp:masterfrom
Conversation
Otherwise a broken MySQL server can cause denial of service of the whole application with connection pile-ups.
|
I've decided to review this PR now, but I cannot understand what this change actually does. Can you share some real life example of when that is useful? From what I can see |
|
Thanks for taking the time to look at this! The distinction is subtle but important. MYSQLI_OPT_CONNECT_TIMEOUT currently only applies to the TCP/socket connection establishment (i.e. the connect() syscall). Once the socket is connected, the MySQL handshake — where the server sends its initial greeting and the client responds with auth — runs under mysqlnd.net_read_timeout (or the default socket timeout), not the connect timeout. The problem: if you have a server that accepts the TCP connection but then hangs or is very slow during the handshake (e.g. a broken or overloaded MySQL instance), PHP will sit waiting far longer than the connect timeout you configured. This is the denial-of-service vector — in a high-traffic app, worker processes pile up waiting on these stalled handshakes, exhausting your connection pool and eventually your entire PHP-FPM worker pool. The test reproduces this by creating a TCP listener that accepts connections but never sends the MySQL greeting, then verifies that mysqli_real_connect times out within the connect timeout window rather than hanging. |
So then why not use read_timeout instead?
Wouldn't this change introduce a potential DoS for the users who rely on the current behaviour? By swapping from I'm also not sure that it is more semantically correct. From a certain perspective, perhaps yes, but from the networking side, no. When mysqlnd waits for a handshake response, the network connection must have already been established. Now, mysqlnd is reading data from the server, i.e. the handshake. Granted, it's not documented anywhere what each of these two settings applies to exactly, but from the source code, it's obvious that it's implemented correctly right now. This is also why I was having a very hard time understanding your patch, because it lacks code comments and changes the behaviour in an unintuitive way. It looks like there are several people interested in merging this PR, but I am not keen on doing that. Given that, IMHO, it works correctly right now, and that changing this would cause problems for users who are used to the current behaviour, I don't think it's a good idea to merge it. However, before declining it, I would like to hear more about the potential pros and cons. There is a linked PR from HyperDb putting this PR as a blocker. Do you know more about this? Why can it not be done with |
The issue is that read_timeout and connect_timeout serve different purposes from the user's perspective. A user who sets MYSQLI_OPT_CONNECT_TIMEOUT=2 and MYSQLI_OPT_READ_TIMEOUT=30 is saying "I want connections to be fast, but once connected I'm okay with queries taking up to 30 seconds." If we tell them to lower read_timeout to fix the handshake hang, we break their ability to run long queries. The two timeouts need to remain independently configurable.
I'd argue the opposite — the current behaviour is the DoS. Users setting MYSQLI_OPT_CONNECT_TIMEOUT reasonably expect it to bound the entire connection attempt. The fact that it silently stops applying mid-handshake is a bug, not a feature anyone is intentionally relying on. I'd find it surprising if anyone has deliberately tuned connect_timeout low and read_timeout high specifically because they want the handshake to be unbounded — that would be a very unusual and fragile configuration.
That's true at the TCP layer, but I'd argue the MySQL handshake is still logically part of "connecting" from the application's point of view. The connection isn't usable until the handshake completes. This mirrors how most other database drivers and client libraries define "connect timeout" — it covers everything up to having an authenticated, usable connection, not just the TCP SYN/ACK. Regarding the HyperDB PR: the reason MYSQLI_OPT_READ_TIMEOUT isn't a workable solution there is exactly the one above — HyperDB (and WordPress sites using it) needs to allow long-running queries while still having a short connection timeout. The two are fundamentally in tension if read_timeout is the only lever available. We introduced this patch exactly because of problems we've run into while using HyperDB ( we are also the maintainers of HyperDB ) in our infrastructure, because MySQL broken in specific ways that the connect timeout doesn't catch can break enormous amount of sites. Also I should note - the original PR here was created almost 4 years ago. What we use now in production has further evolved to support more flexible use cases, where high traffic load patterns actually need to detect broken mysql servers much quicker and we have introduced milliseconds support for the connect timeout. You don't seem very interested in merging the patch, but if for some reason you get interested - I'm happy to update this PR with the latest changes. |
|
Ohh, that's right, because you cannot change MYSQLI_OPT_READ_TIMEOUT whenever you want, it must happen before the connection is attempted. I will do some more investigate on this, but as of now I am against this change. However, maybe there is some alternative. For the moment, I read the documentation by the original mysqlnd creator and compared PDO's behaviour. Ulf says on his blog:
So this seems like the core design decision that should not be changed. In the end he says:
IMHO, we should not change how it works now. After all, prior to PHP 8.2, this would be considered impossible. The fact that it's impossible to control the timeout of the handshake without setting read_timeout seems like a gap, but I wonder why this hasn't come up with C library? |
Yep, exactly.
Because the description in that blog post you mentioned is not entirely accurate. MySQL C client library and |
Otherwise a broken MySQL server can cause denial of service of the whole application with connection pile-ups.
I have also added a PHP test which can be used to reproduce the problem and test the fix.