Skip to content

Fix ISOTPSoftSocket.select() dropping ObjectPipe, causing sr1() to hang (fix #4838)#4929

Open
BenGardiner wants to merge 2 commits intosecdev:masterfrom
BenGardiner:master
Open

Fix ISOTPSoftSocket.select() dropping ObjectPipe, causing sr1() to hang (fix #4838)#4929
BenGardiner wants to merge 2 commits intosecdev:masterfrom
BenGardiner:master

Conversation

@BenGardiner
Copy link
Contributor

@BenGardiner BenGardiner commented Feb 22, 2026

( I worked this out with Copilot / Claude Opus 4.6 here: BenGardiner#1 to try to fix the ISO TP soft socket problem that has been plaguing me )

The select() method was filtering out ObjectPipe instances (like the sniffer's close_pipe) from its return value. This prevented the sniffer's stop mechanism from working correctly in threaded mode - when sniffer.stop() sent to close_pipe, the select() method would unblock but not return the close_pipe, so the sniffer loop couldn't detect the stop signal and had to rely on continue_sniff timing, causing hangs under load.

The fix includes close_pipe (ObjectPipe) instances in the select return value, so the sniffer loop properly detects the stop signal via the 'if s is close_pipe: break' check.

Added unit tests:

  • sr1 timeout with threaded=True (no response scenario)
  • sr1 timeout with threaded=True and background CAN traffic
  • Add deterministic unit test that fails without fix; keep integration tests

The new "ISOTPSoftSocket select returns control ObjectPipe" test directly verifies that ISOTPSoftSocket.select() passes through ready ObjectPipe instances (e.g. the sniffer's close_pipe). This test deterministically FAILS without the fix and PASSES with it.

The integration tests (sr1 timeout with threaded=True) are kept for end-to-end coverage but the race window is too narrow on Linux with TestSocket to reliably trigger the bug.

Verification:

ver driver bg traffic sr1() threaded=? ISOTP DNE result RDID SF req + MF resp result
2.6.0-rc2 candle YES False timeouts ✅ data ✅
2.6.0-rc2 candle YES True hang data
2.6.0-rc2 slcan YES False timeouts ✅ timeouts ❌
2.6.0-rc2 slcan YES True hang timeouts [^1]
2.6.0 candle YES False timeouts ✅ data ✅
2.6.0 candle YES True hang data
2.6.0 slcan YES False timeouts ✅ timeouts [^1] ❌
2.6.0 slcan YES True hang hang
this PR candle YES False timeouts ✅ data ✅
this PR candle YES True timeouts ✅ data ✅
this PR slcan YES False timeouts ✅ timeouts ❌
this PR slcan YES True timeouts ✅ timeouts ❌

This fixes #4838

The timeouts on slcan for a SF request -> MF response are a pre-existing problem.

NB: copilot/claude also suggested this optimization BenGardiner@8992e1b which seems correct to me but is not directly related to fixing the sr1() hang I've observed. It is not included in this PR.

cc: @polybassa

@codecov
Copy link

codecov bot commented Feb 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.11%. Comparing base (2c787cd) to head (cc5b306).
⚠️ Report is 19 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4929      +/-   ##
==========================================
+ Coverage   80.10%   80.11%   +0.01%     
==========================================
  Files         370      370              
  Lines       91733    91920     +187     
==========================================
+ Hits        73482    73642     +160     
- Misses      18251    18278      +27     
Files with missing lines Coverage Δ
scapy/contrib/isotp/isotp_soft_socket.py 85.81% <100.00%> (+0.62%) ⬆️

... and 17 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@BenGardiner
Copy link
Contributor Author

I think the ubuntu-latest 3.9 failure is unrelated to this pull request

…ang in threaded mode

The select() method was filtering out ObjectPipe instances (like the
sniffer's close_pipe) from its return value. This prevented the sniffer's
stop mechanism from working correctly in threaded mode - when sniffer.stop()
sent to close_pipe, the select() method would unblock but not return the
close_pipe, so the sniffer loop couldn't detect the stop signal and had to
rely on continue_sniff timing, causing hangs under load.

The fix includes close_pipe (ObjectPipe) instances in the select return
value, so the sniffer loop properly detects the stop signal via the
'if s is close_pipe: break' check.

Added two new tests:
- sr1 timeout with threaded=True (no response scenario)
- sr1 timeout with threaded=True and background CAN traffic

The new "ISOTPSoftSocket select returns control ObjectPipe" test directly
verifies that ISOTPSoftSocket.select() passes through ready ObjectPipe
instances (e.g. the sniffer's close_pipe). This test deterministically
FAILS without the fix and PASSES with it.

The integration tests (sr1 timeout with threaded=True) are kept for
end-to-end coverage but the race window is too narrow on Linux with
TestSocket to reliably trigger the bug.

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: Ben Gardiner <ben.l.gardiner@gmail.com>
@BenGardiner
Copy link
Contributor Author

I think the macos failure is unrelated as well

@polybassa
Copy link
Contributor

Thanks a lot for your PR.

Both failed pipelines are unrelated

@polybassa
Copy link
Contributor

@BenGardiner Is this PR ready to merge from your side?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Isotp soft socket sr1 hang/no return and timeout= was specified

3 participants