Skip to content

Fix mid-transaction connection switch issue#86

Draft
viralpraxis wants to merge 1 commit intopalkan:masterfrom
viralpraxis:yk/fix-mid-transaction-connection-switch-issue
Draft

Fix mid-transaction connection switch issue#86
viralpraxis wants to merge 1 commit intopalkan:masterfrom
viralpraxis:yk/fix-mid-transaction-connection-switch-issue

Conversation

@viralpraxis
Copy link
Collaborator

@viralpraxis viralpraxis commented Jul 11, 2025

Resolves #83

Apparently, there's another issue on the Rails side so Isolator's failure is just a symptom. Still, I think it makes sense to check connection_id explicitly and skip threshold change in case of connection switching, since it's an implicit invariant.

Checklist

  • I've added tests for this change
  • I've added a Changelog entry

@viralpraxis viralpraxis force-pushed the yk/fix-mid-transaction-connection-switch-issue branch from 5198bc3 to 34bb06b Compare July 22, 2025 11:12
Resolves palkan#83

Apparently, there's [another issue on the Rails side](rails/rails#55306)
so Isolator's failure is just a symptom. Still, I think it makes sense
to check `connection_id` explicitly and skip threshold change in case of connection switching,
since it's an implicit invariant.
@viralpraxis viralpraxis force-pushed the yk/fix-mid-transaction-connection-switch-issue branch from 34bb06b to 85599c4 Compare November 16, 2025 09:21
@viralpraxis viralpraxis marked this pull request as draft November 16, 2025 09:27
@fatkodima
Copy link

@viralpraxis do you have plans to finish this? I think we experience the same issue.

@viralpraxis
Copy link
Collaborator Author

@viralpraxis do you have plans to finish this? I think we experience the same issue.

Yeah, I’m coming back from my vacation in a week and I’m planning to revisit this one. As far as I remember current patch does not solve the issue so there is some extra investigation to do

@viralpraxis
Copy link
Collaborator Author

So I looked into this again and here’s what I think.

The most common scenario where this issue occurs is probably related to connection reaping. In practice, though, this is not isolator-specific (i.e., it can happen even without an active isolator), and according to byroot’s answer [0], it should be fixed by pinning connections (or by disabling the reaper; e.g. GitLab does this [1]). Because of that, this scenario seems out of scope for our responsibility; it should be fixed on database-cleaner_active-record's side.

There is another (synthetic) scenario where the RDBMS goes down for some time and, iff the isolator is enabled, instead of a couple of failing tests the entire suite (starting from the point of time when RDBMS becomes unavailable) fails due to a corrupted isolator's state (see Isolator.state[:transactions] and Isolator.state[:thresholds]). In theory, I believe, this could be fixed. I have a semi-working PoC based on resetting per-connection state on a TracePoint's :raise event for specific connection-related ActiveRecord exceptions (I don't see any alternatives based on current AR's public API). However, since this is mostly a DX issue, I don’t think it’s worth the effort.

What can be done is:

  1. Add a short note to the troubleshooting section [2]. This makes sense, since the underlying issue is usually observed through Isolator noissy false positives.
  2. It might make sense to add a debug message when there is a mismatch between connections while executing the start and clean hooks [3]. The current implementation assumes they are identical -- and having a debug message here would have saved me a couple of hours of debugging some time ago.

[0] rails/rails#55306 (comment)
[1] https://gitlab.com/gitlab-org/gitlab/-/merge_requests/129687
[2] https://github.com/palkan/isolator?tab=readme-ov-file#troubleshooting
[3] https://github.com/palkan/isolator/blob/master/lib/isolator/database_cleaner_support.rb#L7-L27

@fatkodima any thoughts? You have much deeper expertise in Rails internals than I do. Is your use case also related to the connection reaper?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Tricky heisenbug related to database cleaner

2 participants