Create the new RequestBlocklist interface and implementation#7880
Create the new RequestBlocklist interface and implementation#7880catalinradoiu wants to merge 6 commits intodevelopfrom
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Debug
printlnstatement left in production code- Removed the accidental
println(blockedRequests)fromloadToMemoryso blocklist data is no longer dumped to stdout.
- Removed the accidental
- ✅ Fixed: Non-atomic map update creates brief blocklist gap
- Replaced in-place
ConcurrentHashMapmutation with a volatile map reference swap to publish the new blocklist atomically.
- Replaced in-place
Or push these changes by commenting:
@cursor push 692b4fa21d
Preview (692b4fa21d)
diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
@@ -32,7 +32,6 @@
import kotlinx.coroutines.launch
import logcat.logcat
import okhttp3.HttpUrl.Companion.toHttpUrlOrNull
-import java.util.concurrent.ConcurrentHashMap
import javax.inject.Inject
@SingleInstanceIn(AppScope::class)
@@ -45,7 +44,7 @@
@AppCoroutineScope private val appCoroutineScope: CoroutineScope,
) : RequestBlocklist, PrivacyConfigCallbackPlugin {
- private val blockedRequests = ConcurrentHashMap<String, List<BlocklistRuleEntity>>()
+ @Volatile private var blockedRequests: Map<String, List<BlocklistRuleEntity>> = emptyMap()
init {
if (isMainProcess) {
@@ -77,7 +76,7 @@
private fun loadToMemory() {
appCoroutineScope.launch(dispatchers.io()) {
- val newBlockedRequests = ConcurrentHashMap<String, List<BlocklistRuleEntity>>()
+ val newBlockedRequests = mutableMapOf<String, List<BlocklistRuleEntity>>()
requestBlocklistFeature.self().getSettings()?.let { settingsJson ->
runCatching {
@@ -101,9 +100,7 @@
}
}
- blockedRequests.clear()
- blockedRequests.putAll(newBlockedRequests)
- println(blockedRequests)
+ blockedRequests = newBlockedRequests.toMap()
}
}
...in/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
Outdated
Show resolved
Hide resolved
| } | ||
|
|
||
| blockedRequests.clear() | ||
| blockedRequests.putAll(newBlockedRequests) |
There was a problem hiding this comment.
Non-atomic map update creates brief blocklist gap
Medium Severity
The blockedRequests.clear() followed by blockedRequests.putAll(newBlockedRequests) is not atomic. Between these two calls, any concurrent containedInBlocklist invocation will see an empty map and return false, allowing requests that are in the blocklist to pass through. Using a volatile reference swap (e.g., AtomicReference<Map<...>>) instead of mutating a ConcurrentHashMap in place would avoid this window.
Additional Locations (1)
There was a problem hiding this comment.
The blocklist is processed at app launch or afer the privacy config is downloaded, this shouldn’t be a real issue.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Regex recompiled per call on request interception path
- Each rule now precompiles its wildcard regex once during JSON parsing and request interception reuses that cached regex instead of rebuilding and recompiling per match.
Or push these changes by commenting:
@cursor push 7b2bdf21dc
Preview (7b2bdf21dc)
diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
@@ -71,7 +71,7 @@
val normalizedUrl = httpUrl.toString()
return rules.any { rule ->
- ruleMatches(normalizedUrl, rule.rule) && domainMatches(documentUrl, rule.domains)
+ rule.ruleRegex.containsMatchIn(normalizedUrl) && domainMatches(documentUrl, rule.domains)
}
}
@@ -106,19 +106,6 @@
}
}
- private fun ruleMatches(
- normalizedUrl: String,
- rule: String,
- ): Boolean = buildString {
- for (char in rule) {
- if (char == '*') {
- append("[^/]*")
- } else {
- append(Regex.escape(char.toString()))
- }
- }
- }.toRegex().containsMatchIn(normalizedUrl)
-
private fun domainMatches(
documentUrl: String?,
domains: List<String>,
diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
@@ -28,6 +28,7 @@
val rule: String,
val domains: List<String>,
val reason: String?,
+ val ruleRegex: Regex,
) {
companion object {
private const val PROPERTY_RULE = "rule"
@@ -42,7 +43,22 @@
val rule = map[PROPERTY_RULE] as? String ?: return null
val domains = map[PROPERTY_DOMAINS] as? List<String> ?: return null
val reason = map[PROPERTY_REASON] as? String
- return BlocklistRuleEntity(rule = rule, domains = domains, reason = reason)
+ return BlocklistRuleEntity(
+ rule = rule,
+ domains = domains,
+ reason = reason,
+ ruleRegex = compileRuleRegex(rule),
+ )
}
+
+ private fun compileRuleRegex(rule: String): Regex = buildString {
+ for (char in rule) {
+ if (char == '*') {
+ append("[^/]*")
+ } else {
+ append(Regex.escape(char.toString()))
+ }
+ }
+ }.toRegex()
}
}
...in/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
Outdated
Show resolved
Hide resolved
|
|
||
| val rules = blockedRequests[requestDomain] ?: return false | ||
|
|
||
| val normalizedUrl = httpUrl.toString() |
There was a problem hiding this comment.
Why do we need this toString again? If the URL is malformed, requestUrl.toHttpUrlOrNull() should've returned null, which already short-circuits
There was a problem hiding this comment.
The requestUrl.toHttpUrlOrNull() is used to normalise the request URL. The requirement is that the request URL should be normalised first, so protocol and domain parts are always considered to be lowercase. That function is used to handle that. However, the regex matchins requires a string, that’s why I am calling that again.
| if (!requestBlocklistFeature.self().isEnabled()) return false | ||
|
|
||
| val httpUrl = requestUrl.toHttpUrlOrNull() ?: return false | ||
| val requestDomain = httpUrl.topPrivateDomain() ?: return false |
There was a problem hiding this comment.
Just thinking out loud, but if in the future we want to better orchestrate all types of request interception, we might want to avoid each implementation performing the same URL manipulations, as a way to improve performance
There was a problem hiding this comment.
I agree with that. But we need to take into account that we will need to add other dependencies to the API module. As in this case we’re using strings for the urls, but we’re converting them here because we cannot pass them as HttpUrl/ Uri becase the API module doesn’t have Android/ OkHttp dependencies.
|
|
||
| requestBlocklistFeature.self().getSettings()?.let { settingsJson -> | ||
| runCatching { | ||
| val moshi = Moshi.Builder().build() |
There was a problem hiding this comment.
nit: Can we inject, or at least avoid creation every time we call loadToMemory? The latter also applies to the next line
| val topPrivateDomain = "https://$domain".toHttpUrlOrNull()?.topPrivateDomain() | ||
| if (topPrivateDomain != null && topPrivateDomain == domain) { |
There was a problem hiding this comment.
Entries must be for eTLD+1 domains, other entries will be ignored. I need to make sure that the entry is in that format, because we don’t support subdomains in the entries. example.com is a valid entry, while some.subdomain.example.com should be skipped.
| ): Boolean = buildString { | ||
| for (char in rule) { | ||
| if (char == '*') { | ||
| append("[^/]*") |
There was a problem hiding this comment.
can we do the manipulation at the rule parsing level instead?
There was a problem hiding this comment.
Updated. We are now manipulating this when we create the object from the JSON entry.
| val normalizedUrl = httpUrl.toString() | ||
|
|
||
| return rules.any { rule -> | ||
| ruleMatches(normalizedUrl, rule.rule) && domainMatches(documentUrl, rule.domains) |
There was a problem hiding this comment.
Similar to what I mentioned in https://github.com/duckduckgo/Android/pull/7880/changes#r2895185205, we could create a new implementation for SameOrSubdomain that accepts HttpUrl and Domain, so that we avoid converting back and forth
There was a problem hiding this comment.
The HttpUrl is used for the request url, to extract the top domain so that we can find the match in rules map. The domain match is used for the documentUrl, which is passed as String. The solution would be to pass it as an Uri, but that would require a change in the API module, as we cannot import it there.
| * @return true if the request matches a blocklist rule | ||
| */ | ||
| UNKNOWN, | ||
| fun containedInBlocklist(documentUrl: String, requestUrl: String): Boolean |
There was a problem hiding this comment.
I know this was accepted on the proposal, but looking at the implementation details, and some notes on performance I added below, I'm wondering whether the function should use HttpUrl instead. That way we could get pretty much the same performance benefits as with Uri, while still keeping this as a Kotlin module
There was a problem hiding this comment.
HttpUrl is from the okhttp module, which the API module is not using. We cannot use it unless we add the dependency there. It’s the same issue that was with Uri, which required Android dependencies in a pure Kotlin module.
|
|
||
| package com.duckduckgo.privacy.config.impl.features.requestblocklist | ||
|
|
||
| data class RequestBlocklistSettings( |
There was a problem hiding this comment.
nit: I'd move this to the RequestBlocklistFeature, for clarity
| ) | ||
|
|
||
| data class BlockedRequestEntry( | ||
| val rules: List<Map<String, @JvmSuppressWildcards Any>>?, |
There was a problem hiding this comment.
Shouldn't this be a String (or a Regex)?
There was a problem hiding this comment.
Rules have the following format, so we cannot use String or Regex, because the parsing would fail.
"rules": [
{
"rule": “testing/*.jpg",
"domains": ["example.com"],
"reason": ["EXAMPLE"]
}
]
This List<Map<String, @JvmSuppressWildcards Any>>? is used here because this is a Moshi/JSON deserialization model. It represents the raw JSON structure before it's parsed into typed objects.
Each rule in the JSON is an object with varying keys (rule, domains, reason), so Moshi deserializes each one as a Map<String, Any>. The typed conversion happens later in BlocklistRuleEntity.fromJson(map), which validates the keys and casts values to their expected types.
| private val KNOWN_PROPERTIES = setOf(PROPERTY_RULE, PROPERTY_DOMAINS, PROPERTY_REASON) | ||
|
|
||
| @Suppress("UNCHECKED_CAST") | ||
| fun fromJson(map: Map<String, Any>): BlocklistRuleEntity? { |
There was a problem hiding this comment.
What do we need this manual parsing for?
Also, I think failing on unknown properties is too restrictive and will break older versions if we add a new field. If we really need this, I think we should at least fire a pixel to alert on the situation, but I think we can just ignore unknown fields
There was a problem hiding this comment.
The requirement was to skip rules with unknown properties or missing required ones. I discussed with Dave as well and this is how he suggested to handle this. I can add a pixel there so that we know about this situation. But the reason for this was to avoid situations where the rule is accepted but behaves differently on older versions (I guess in scenarios where we would add a new property which should be considered as well when checking if the request is in the blocklist).
… a dependency to RealRequestBlocklist
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Unchecked generic cast on deserialized list elements
fromJsonnow validatesdomainsasList<*>and returns null unless every element is aString, preventing deferredClassCastExceptionat match time.
Or push these changes by commenting:
@cursor push b65576d00f
Preview (b65576d00f)
diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
@@ -32,12 +32,11 @@
private val KNOWN_PROPERTIES = setOf(PROPERTY_RULE, PROPERTY_DOMAINS, PROPERTY_REASON)
- @Suppress("UNCHECKED_CAST")
fun fromJson(map: Map<String, Any>): BlocklistRuleEntity? {
if (map.keys.any { it !in KNOWN_PROPERTIES }) return null
val ruleString = map[PROPERTY_RULE] as? String ?: return null
- val domains = map[PROPERTY_DOMAINS] as? List<String> ?: return null
+ val domains = (map[PROPERTY_DOMAINS] as? List<*>)?.map { it as? String ?: return null } ?: return null
val reason = map[PROPERTY_REASON] as? String
val rule = buildString {| if (map.keys.any { it !in KNOWN_PROPERTIES }) return null | ||
|
|
||
| val ruleString = map[PROPERTY_RULE] as? String ?: return null | ||
| val domains = map[PROPERTY_DOMAINS] as? List<String> ?: return null |
There was a problem hiding this comment.
Unchecked generic cast on deserialized list elements
Low Severity
The cast as? List<String> on the domains value only checks is List at runtime due to JVM type erasure — it does not verify that elements are actually String. If the config JSON contains non-string values in the domains array, the cast silently succeeds but a ClassCastException will be thrown later when domainMatches iterates the list, bypassing the runCatching in loadToMemory since the error occurs at match time in containedInBlocklist.
Please tell me if this was useful or not with a 👍 or 👎.
There was a problem hiding this comment.
This is not a real concern, the same would happen if we would use a data model for parsing instead of using the fromJson method.




Task/Issue URL: https://app.asana.com/1/137249556945/project/1211724162604201/task/1213530900303284?focus=true
Description
This PR introduces a new Request Blocklist interface and implementation for the privacy configuration system. The implementation provides the ability to determine if specific network requests are in the blocklist based on configurable rules that match request URLs and document domains.
The feature includes:
RequestBlocklistAPI interface with acontainedInBlocklistmethod to check if requests should be blockedRealRequestBlocklistimplementation that loads blocklist rules from privacy configuration settings and evaluates them against incoming requestsSteps to test this PR
Unit tests pass. This PR only introduces the new interface with its implementation and new data models. The integration in the request interceptor will be added in a follow-up PR.
UI changes
No UI changes.
Note
Medium Risk
Introduces new parsing and matching logic for remotely-supplied blocklist rules (URL/domain matching), which could cause unintended blocking or performance overhead when enabled, though it is gated by a feature toggle and well-tested.
Overview
Adds a new
RequestBlocklistcapability to the privacy-config module, exposingcontainedInBlocklist(documentUrl, requestUrl)for checking requests against blocklist rules.Implements
RealRequestBlocklist, which listens for privacy-config downloads, loadsrequestBlocklistfeature settings into an in-memory map keyed by top private domain, validates/filters rule entries, and matches requests via a simple*-wildcard-to-regex conversion plus document-domain constraints (including<all>). Adds extensive unit tests covering enablement, parsing/validation, wildcard semantics, domain matching, and reload behavior.Written by Cursor Bugbot for commit 9a9368e. This will update automatically on new commits. Configure here.