CASSANALYTICS-145: Add exception stack walking to port mapping retry …#191
jmckenzie-dev wants to merge 1 commit into apache:trunk
…logic on analytics integration tests Patch by Josh McKenzie; reviewed by TBD for CASSANALYTICS-145
Going to keep an eye on https://app.circleci.com/pipelines/gh/jmckenzie-dev/cassandra-analytics/21/details?useNewPipelines=true and see how it looks. Might kick it a couple times for extra runs just to make sure we don't see the transient port mapping issues.
    String message = cause.getMessage();
    if (message != null && (message.contains("Address already in use") ||
                            message.contains("is in use by another") ||
                            message.contains("Failed to bind port")))
    {
        isBindFailure = true;
        break;
    }
Minor:
If "Address already in use" and "is in use by another" originate from BindException (OS/JVM), a cause instanceof BindException check inside the loop would be more robust than string matching, which can vary across OS, JVM vendor, or locale. If they come from application-level messages, the string matching is fine as-is.
Suggested change:

    - String message = cause.getMessage();
    - if (message != null && (message.contains("Address already in use") ||
    -                         message.contains("is in use by another") ||
    -                         message.contains("Failed to bind port")))
    - {
    -     isBindFailure = true;
    -     break;
    - }
    + String message = cause.getMessage();
    + if (cause instanceof java.net.BindException || (cause.getMessage() != null && cause.getMessage().contains("Failed to bind port")))
    + {
    +     isBindFailure = true;
    +     break;
    + }
This comes from our friend Server.java#start in Cassandra:
    if (!bindFuture.awaitUninterruptibly().isSuccess())
        throw new IllegalStateException(String.format("Failed to bind port %d on %s.", socket.getPort(), socket.getAddress().getHostAddress()),
                                        bindFuture.cause());
Certainly if that changes we could have a bad time. But the reality is that a) we're throwing an IllegalStateException in C*, and b) it goes through layers of indirection (C* inside the in-jvm dtest API inside classloaders) that make it pretty hard to reliably determine what exception type will show up at the test level, since various tiers can catch and rethrow as different types.
Hence throwing my hands in the air and just string-matching it for now. On the plus side, that string has been unchanged since at least 2018, so it seems stable.
Famous last words.
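
For reference, the two approaches discussed above (type check and message matching) can be combined in one cause-chain walk. The following is only a minimal sketch, not the code in this patch: the class name `BindFailureDetector`, the method `isBindFailure`, and the cycle guard are assumptions made for illustration. The three marker strings are the ones quoted in the diff.

```java
import java.net.BindException;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical helper, not part of the patch: walks the cause chain of a
// throwable and reports whether any link looks like a port bind failure.
public final class BindFailureDetector
{
    // Message fragments quoted in the diff under review.
    private static final String[] BIND_FAILURE_MARKERS = {
        "Address already in use",
        "is in use by another",
        "Failed to bind port"
    };

    private BindFailureDetector() {}

    public static boolean isBindFailure(Throwable t)
    {
        // Identity-based set guards against a cyclic cause chain.
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Throwable cause = t; cause != null && seen.add(cause); cause = cause.getCause())
        {
            // Type check: covers the case where the JVM's BindException survives rethrows.
            if (cause instanceof BindException)
                return true;

            // Message check: covers wrappers such as the IllegalStateException
            // thrown with "Failed to bind port ..." in the snippet quoted above.
            String message = cause.getMessage();
            if (message == null)
                continue;
            for (String marker : BIND_FAILURE_MARKERS)
                if (message.contains(marker))
                    return true;
        }
        return false;
    }
}
```

A retry loop could then call `BindFailureDetector.isBindFailure(e)` on the caught exception and retry only when it returns true. Keeping the message check alongside the type check means a tier that rethrows as a different exception type is still covered, at the cost of depending on the message text staying stable.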
arjunashok left a comment
Minor suggestion. Otherwise, looks good, +1.
You might want to re-run the sole failing integ test, which is unrelated - likely flaky.
JoiningMultiDCFailureTest > testJoiningNodeInMultiDCTest(TestConsistencyLevel) > 4 => readCL=QUORUM, writeCL=QUORUM FAILED
java.lang.AssertionError: