Skip to content

TS-45056 Class hot reloading detected and coverage dump fixed #881

Open
stahlbauer wants to merge 12 commits intomasterfrom
ts/45056_class_hot_reload_coverage_issue
Open

TS-45056 Class hot reloading detected and coverage dump fixed #881
stahlbauer wants to merge 12 commits intomasterfrom
ts/45056_class_hot_reload_coverage_issue

Conversation

@stahlbauer
Copy link
Copy Markdown
Contributor

Addresses issue TS-45056

  • Changes are tested adequately
  • CHANGELOG.md updated

Please respect the vote of the Teamscale bot or flag irrelevant findings as tolerated or false positives. If you feel that the Teamscale config needs adjustment, please state so in a comment and discuss this with your reviewer.

Copy link
Copy Markdown
Contributor

@DreierF DreierF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The approach looks good on the surface, but has some serious problems in some cases. I think we either need to more specifically detect the reload for individual jar files or at least ensure that if no duplicates is set, that the default warn does not crash the report generation, but just warn about the issue.

Comment on lines +53 to +57
/**
* Tracks class IDs (CRC64) by class name across dumps to detect application server reloads. When a known class
* name appears with a different ID, we know the class was reloaded with different bytecode.
*/
private final Map<String, Long> previousClassIds = new HashMap<>();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the agent itself is the wrong level of abstraction at where to track this

Comment on lines +224 to +226
Long previousId = previousClassIds.put(data.getName(), data.getId());
if (previousId != null && previousId != data.getId()) {
reloadDetected = true;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't necessarily detect a reload. I could also just be two conflicting class files, i.e., a third-party dependency that is on the classpath in two different versions.


Path classDumpDir = options.getClassDumpDirectory();
if (classDumpDir != null) {
FileSystemUtils.deleteRecursively(classDumpDir.toFile());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Classes are only dumped to that directory by jacoco once. Some all class files from other jar files are also gone afterwards and hence will not produce any coverage anymore, but warnings, that the class files are missing.

@stahlbauer
Copy link
Copy Markdown
Contributor Author

@DreierF Thanks for the review. I guess these are the pitfalls of Agentic Coding. Now a variant where primarily the duplication-detection is kept alive by sharing the visitor instance.

README.md Outdated
│ │
└──────────────────────────────────────────────────────────────────────────┘
│ Periodically (default: every 10 min) or on HTTP request
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the default is 8h 🤔

// (and thus a fresh CoverageBuilder) per dump, the changed class ID does not cause a crash.
SystemUnderTest().bar()
dumpCoverage(SystemTestUtils.AGENT_PORT)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also check that the log file does contain the expected warning? (we have some other tests that already check the log file)

class ReloadDetectionSystemTest {

@Test
fun dumpAfterClassReloadDoesNotCrash() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test also passes on master without any changes to the agent, so it does not really reproduce the issue.

Path tempDir = createTemporaryDumpDirectory();
tempDir.toFile().deleteOnExit();
builder.append(",classdumpdir=").append(tempDir.toAbsolutePath());
Path classDumpDirectory = createTemporaryDumpDirectory();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a variable renaming

// Created fresh per dump so its CoverageBuilder does not carry class IDs from previous dumps.
// Without this, application server reloads (for example, JBoss) cause an IllegalStateException
// because the reloaded class has a different CRC64 than what CoverageBuilder stored earlier.
JaCoCoXmlReportGenerator generator = new JaCoCoXmlReportGenerator(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Core of the bug fix

* Base class for generating reports based on the binary JaCoCo exec dump files.
*
* It takes care of ignoring non-identical classes with the same fully qualified name and classes without coverage.
* JaCoCo's execution data only contains per-class boolean arrays indicating which probes fired. It does not contain
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just better JavaDocs

│ │
└──────────────────────────────────────────────────────────────────────────┘
│ Periodically (default: every 480 min) or on HTTP request
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, the 8h are here. The cause was an inconsistency in CLAUDE.md


**Coverage modes:**
- **Interval-based** (`Agent` class) - Periodically dumps coverage (default every 10 min)
- **Interval-based** (`Agent` class) - Periodically dumps coverage (default every 480 min)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was inconsistent to the implementation

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants