Skip to content

Make Backoff class thread-safe#13993

Closed
michaeljmarshall wants to merge 7 commits intoapache:masterfrom
michaeljmarshall:make-backoff-threadsafe
Closed

Make Backoff class thread-safe#13993
michaeljmarshall wants to merge 7 commits intoapache:masterfrom
michaeljmarshall:make-backoff-threadsafe

Conversation

@michaeljmarshall
Copy link
Member

Motivation

In preparing #13951 (still a draft at this time), I noticed that the Backoff class is not thread safe, but it is accessed as if it were because we pass it to multiple threads and call .next() to get the next backoff delay.

Modifications

  • Move assertions for initial and max values to constructor and remove them from the RetryUtil class.
  • Remove the @Data annotation on the Backoff class. It allowed for unsafe access to internal state.
  • Add Javadocs to improve documentation of the class.
  • Remove unsafe initialization for the next variable. Here is my source showing that this initialization is unsafe: https://shipilev.net/blog/2014/safe-public-construction/.

Verifying this change

There is already test coverage for this class. The primary changes just add synchronized keywords to methods that get or update mutable state.

Does this pull request potentially affect one of the following parts:

It's possible that removing the @Data annotation on the class will break client applications since Backoff is a public class in the java client. I'll need guidance on how we should proceed here.

Documentation

  • doc

This PR contains Javadocs for relevant documentation. There is no need to document this change elsewhere.

@michaeljmarshall michaeljmarshall added doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. area/client labels Jan 27, 2022
@michaeljmarshall michaeljmarshall added this to the 2.10.0 milestone Jan 27, 2022
@michaeljmarshall michaeljmarshall self-assigned this Jan 27, 2022
mattisonchao
mattisonchao previously approved these changes Jan 27, 2022
this.mandatoryStopMade = false;
}

public synchronized boolean isMandatoryStopMade() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method looks like just getting the current value, should we synchronize it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We either need to synchronize it or make the variable volatile. Since it is only used in one place (and is possibly used incorrectly), I lean towards making it synchronized, for now.


@VisibleForTesting
long getFirstBackoffTimeInMillis() {
synchronized long getFirstBackoffTimeInMillis() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same above. 👆

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above, except this is only used for a test.

import lombok.Getter;

// All variables are in TimeUnit millis by default
@Data
Copy link
Member

@lhotari lhotari Jan 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess @Data should be replaced with @ToString @EqualsAndHashCode if the intention is to remove setters. @Value would be the correct solution if Backoff would be fully immutable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. Do you think that I also need to make the mutable fields volatile then, since toString() and .equals won't be synchronized?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point. Do you think that I also need to make the mutable fields volatile then, since toString() and .equals won't be synchronized?

I think that it would be an overkill to use volatile in this case. In Java with 64 bit JVMs you cannot hit a "word tearing" issue where the observed field value would be corrupted. The worst that could happen in Java (on a 64 bit JVM) is a data race.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added in the annotations.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, it looks like spot bugs didn't like the change for the lack of synchronization introduced by the annotations:

Error: Medium: Inconsistent synchronization of org.apache.pulsar.client.impl.Backoff.next; locked 75% of time [org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff, org.apache.pulsar.client.impl.Backoff] Unsynchronized access at Backoff.java:[line 35]Unsynchronized access at Backoff.java:[line 35]Unsynchronized access at Backoff.java:[line 35]Synchronized access at Backoff.java:[line 96]Synchronized access at Backoff.java:[line 99]Synchronized access at Backoff.java:[line 97]Synchronized access at Backoff.java:[line 101]Synchronized access at Backoff.java:[line 101]Synchronized access at Backoff.java:[line 142]Synchronized access at Backoff.java:[line 133]Synchronized access at Backoff.java:[line 134]Synchronized access at Backoff.java:[line 134]Synchronized access at Backoff.java:[line 34] IS2_INCONSISTENT_SYNC

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any use for equality checking of Backoff objects outside of reference equality, and calling toString on the class is going to log a lot of information. I am leaning towards just removing these annotations. If we really want toString, I can write one in the class.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lhotari - are you able to take a look at this again?

@github-actions
Copy link

The pr had no activity for 30 days, mark with Stale label.

@github-actions
Copy link

The pr had no activity for 30 days, mark with Stale label.

@codelipenghui codelipenghui modified the milestones: 2.11.0, 2.12.0 Jul 26, 2022
@tisonkun
Copy link
Member

tisonkun commented Mar 28, 2023

Closing as stale...

We later duplicate BackOff in another module - #10744

Maybe we should move this class to pulsar-common and do the refactor if it's still relevant.

@tisonkun tisonkun closed this Mar 28, 2023
@tisonkun
Copy link
Member

tisonkun commented Mar 28, 2023

pass it to multiple threads

Another approach is we pass always the builder and initialize the instance per thread (call graph). It seems more reasonable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/client doc Your PR contains doc changes, no matter whether the changes are in markdown or code files. lifecycle/stale Stale

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants