Skip to content

Create the new RequestBlocklist interface and implementation#7880

Open
catalinradoiu wants to merge 6 commits intodevelopfrom
feature/cradoiu/create-request-blocklist-api-and-impl
Open

Create the new RequestBlocklist interface and implementation#7880
catalinradoiu wants to merge 6 commits intodevelopfrom
feature/cradoiu/create-request-blocklist-api-and-impl

Conversation

@catalinradoiu
Copy link
Contributor

@catalinradoiu catalinradoiu commented Mar 5, 2026

Task/Issue URL: https://app.asana.com/1/137249556945/project/1211724162604201/task/1213530900303284?focus=true

Description

This PR introduces a new Request Blocklist interface and implementation for the privacy configuration system. The implementation provides the ability to determine if specific network requests are in the blocklist based on configurable rules that match request URLs and document domains.

The feature includes:

  • A new RequestBlocklist API interface with a containedInBlocklist method to check if requests should be blocked
  • RealRequestBlocklist implementation that loads blocklist rules from privacy configuration settings and evaluates them against incoming requests

Steps to test this PR

Unit tests pass. This PR only introduces the new interface with its implementation and new data models. The integration in the request interceptor will be added in a follow-up PR.

UI changes

No UI changes.


Note

Medium Risk
Introduces new parsing and matching logic for remotely-supplied blocklist rules (URL/domain matching), which could cause unintended blocking or performance overhead when enabled, though it is gated by a feature toggle and well-tested.

Overview
Adds a new RequestBlocklist capability to the privacy-config module, exposing containedInBlocklist(documentUrl, requestUrl) for checking requests against blocklist rules.

Implements RealRequestBlocklist, which listens for privacy-config downloads, loads requestBlocklist feature settings into an in-memory map keyed by top private domain, validates/filters rule entries, and matches requests via a simple *-wildcard-to-regex conversion plus document-domain constraints (including <all>). Adds extensive unit tests covering enablement, parsing/validation, wildcard semantics, domain matching, and reload behavior.

Written by Cursor Bugbot for commit 9a9368e. This will update automatically on new commits. Configure here.

Copy link
Contributor Author

catalinradoiu commented Mar 5, 2026

@catalinradoiu catalinradoiu changed the title Create the data models for the request blocklist Create the new RequestBlocklist interface and implementation Mar 5, 2026
@catalinradoiu catalinradoiu marked this pull request as ready for review March 6, 2026 09:15
Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Autofix Details

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Debug println statement left in production code
    • Removed the accidental println(blockedRequests) from loadToMemory so blocklist data is no longer dumped to stdout.
  • ✅ Fixed: Non-atomic map update creates brief blocklist gap
    • Replaced in-place ConcurrentHashMap mutation with a volatile map reference swap to publish the new blocklist atomically.

Create PR

Or push these changes by commenting:

@cursor push 692b4fa21d
Preview (692b4fa21d)
diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
@@ -32,7 +32,6 @@
 import kotlinx.coroutines.launch
 import logcat.logcat
 import okhttp3.HttpUrl.Companion.toHttpUrlOrNull
-import java.util.concurrent.ConcurrentHashMap
 import javax.inject.Inject
 
 @SingleInstanceIn(AppScope::class)
@@ -45,7 +44,7 @@
     @AppCoroutineScope private val appCoroutineScope: CoroutineScope,
 ) : RequestBlocklist, PrivacyConfigCallbackPlugin {
 
-    private val blockedRequests = ConcurrentHashMap<String, List<BlocklistRuleEntity>>()
+    @Volatile private var blockedRequests: Map<String, List<BlocklistRuleEntity>> = emptyMap()
 
     init {
         if (isMainProcess) {
@@ -77,7 +76,7 @@
 
     private fun loadToMemory() {
         appCoroutineScope.launch(dispatchers.io()) {
-            val newBlockedRequests = ConcurrentHashMap<String, List<BlocklistRuleEntity>>()
+            val newBlockedRequests = mutableMapOf<String, List<BlocklistRuleEntity>>()
 
             requestBlocklistFeature.self().getSettings()?.let { settingsJson ->
                 runCatching {
@@ -101,9 +100,7 @@
                 }
             }
 
-            blockedRequests.clear()
-            blockedRequests.putAll(newBlockedRequests)
-            println(blockedRequests)
+            blockedRequests = newBlockedRequests.toMap()
         }
     }

}

blockedRequests.clear()
blockedRequests.putAll(newBlockedRequests)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-atomic map update creates brief blocklist gap

Medium Severity

The blockedRequests.clear() followed by blockedRequests.putAll(newBlockedRequests) is not atomic. Between these two calls, any concurrent containedInBlocklist invocation will see an empty map and return false, allowing requests that are in the blocklist to pass through. Using a volatile reference swap (e.g., AtomicReference<Map<...>>) instead of mutating a ConcurrentHashMap in place would avoid this window.

Additional Locations (1)

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The blocklist is processed at app launch or afer the privacy config is downloaded, this shouldn’t be a real issue.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Autofix Details

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Regex recompiled per call on request interception path
    • Each rule now precompiles its wildcard regex once during JSON parsing and request interception reuses that cached regex instead of rebuilding and recompiling per match.

Create PR

Or push these changes by commenting:

@cursor push 7b2bdf21dc
Preview (7b2bdf21dc)
diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RealRequestBlocklist.kt
@@ -71,7 +71,7 @@
         val normalizedUrl = httpUrl.toString()
 
         return rules.any { rule ->
-            ruleMatches(normalizedUrl, rule.rule) && domainMatches(documentUrl, rule.domains)
+            rule.ruleRegex.containsMatchIn(normalizedUrl) && domainMatches(documentUrl, rule.domains)
         }
     }
 
@@ -106,19 +106,6 @@
         }
     }
 
-    private fun ruleMatches(
-        normalizedUrl: String,
-        rule: String,
-    ): Boolean = buildString {
-        for (char in rule) {
-            if (char == '*') {
-                append("[^/]*")
-            } else {
-                append(Regex.escape(char.toString()))
-            }
-        }
-    }.toRegex().containsMatchIn(normalizedUrl)
-
     private fun domainMatches(
         documentUrl: String?,
         domains: List<String>,

diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
@@ -28,6 +28,7 @@
     val rule: String,
     val domains: List<String>,
     val reason: String?,
+    val ruleRegex: Regex,
 ) {
     companion object {
         private const val PROPERTY_RULE = "rule"
@@ -42,7 +43,22 @@
             val rule = map[PROPERTY_RULE] as? String ?: return null
             val domains = map[PROPERTY_DOMAINS] as? List<String> ?: return null
             val reason = map[PROPERTY_REASON] as? String
-            return BlocklistRuleEntity(rule = rule, domains = domains, reason = reason)
+            return BlocklistRuleEntity(
+                rule = rule,
+                domains = domains,
+                reason = reason,
+                ruleRegex = compileRuleRegex(rule),
+            )
         }
+
+        private fun compileRuleRegex(rule: String): Regex = buildString {
+            for (char in rule) {
+                if (char == '*') {
+                    append("[^/]*")
+                } else {
+                    append(Regex.escape(char.toString()))
+                }
+            }
+        }.toRegex()
     }
 }


val rules = blockedRequests[requestDomain] ?: return false

val normalizedUrl = httpUrl.toString()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this toString again? If the URL is malformed, requestUrl.toHttpUrlOrNull() should've returned null, which already short-circuits

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requestUrl.toHttpUrlOrNull() is used to normalise the request URL. The requirement is that the request URL should be normalised first, so protocol and domain parts are always considered to be lowercase. That function is used to handle that. However, the regex matchins requires a string, that’s why I am calling that again.

if (!requestBlocklistFeature.self().isEnabled()) return false

val httpUrl = requestUrl.toHttpUrlOrNull() ?: return false
val requestDomain = httpUrl.topPrivateDomain() ?: return false
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking out loud, but if in the future we want to better orchestrate all types of request interception, we might want to avoid each implementation performing the same URL manipulations, as a way to improve performance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with that. But we need to take into account that we will need to add other dependencies to the API module. As in this case we’re using strings for the urls, but we’re converting them here because we cannot pass them as HttpUrl/ Uri becase the API module doesn’t have Android/ OkHttp dependencies.


requestBlocklistFeature.self().getSettings()?.let { settingsJson ->
runCatching {
val moshi = Moshi.Builder().build()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we inject, or at least avoid creation every time we call loadToMemory? The latter also applies to the next line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Comment on lines +91 to +92
val topPrivateDomain = "https://$domain".toHttpUrlOrNull()?.topPrivateDomain()
if (topPrivateDomain != null && topPrivateDomain == domain) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entries must be for eTLD+1 domains, other entries will be ignored. I need to make sure that the entry is in that format, because we don’t support subdomains in the entries. example.com is a valid entry, while some.subdomain.example.com should be skipped.

): Boolean = buildString {
for (char in rule) {
if (char == '*') {
append("[^/]*")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we do the manipulation at the rule parsing level instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. We are now manipulating this when we create the object from the JSON entry.

val normalizedUrl = httpUrl.toString()

return rules.any { rule ->
ruleMatches(normalizedUrl, rule.rule) && domainMatches(documentUrl, rule.domains)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to what I mentioned in https://github.com/duckduckgo/Android/pull/7880/changes#r2895185205, we could create a new implementation for SameOrSubdomain that accepts HttpUrl and Domain, so that we avoid converting back and forth

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HttpUrl is used for the request url, to extract the top domain so that we can find the match in rules map. The domain match is used for the documentUrl, which is passed as String. The solution would be to pass it as an Uri, but that would require a change in the API module, as we cannot import it there.

* @return true if the request matches a blocklist rule
*/
UNKNOWN,
fun containedInBlocklist(documentUrl: String, requestUrl: String): Boolean
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was accepted on the proposal, but looking at the implementation details, and some notes on performance I added below, I'm wondering whether the function should use HttpUrl instead. That way we could get pretty much the same performance benefits as with Uri, while still keeping this as a Kotlin module

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HttpUrl is from the okhttp module, which the API module is not using. We cannot use it unless we add the dependency there. It’s the same issue that was with Uri, which required Android dependencies in a pure Kotlin module.


package com.duckduckgo.privacy.config.impl.features.requestblocklist

data class RequestBlocklistSettings(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd move this to the RequestBlocklistFeature, for clarity

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

)

data class BlockedRequestEntry(
val rules: List<Map<String, @JvmSuppressWildcards Any>>?,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a String (or a Regex)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rules have the following format, so we cannot use String or Regex, because the parsing would fail.

"rules": [
                    {
                        "rule": “testing/*.jpg",
                        "domains": ["example.com"],
                        "reason": ["EXAMPLE"]
                    }
                ]

This List<Map<String, @JvmSuppressWildcards Any>>? is used here because this is a Moshi/JSON deserialization model. It represents the raw JSON structure before it's parsed into typed objects.

Each rule in the JSON is an object with varying keys (rule, domains, reason), so Moshi deserializes each one as a Map<String, Any>. The typed conversion happens later in BlocklistRuleEntity.fromJson(map), which validates the keys and casts values to their expected types.

private val KNOWN_PROPERTIES = setOf(PROPERTY_RULE, PROPERTY_DOMAINS, PROPERTY_REASON)

@Suppress("UNCHECKED_CAST")
fun fromJson(map: Map<String, Any>): BlocklistRuleEntity? {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we need this manual parsing for?

Also, I think failing on unknown properties is too restrictive and will break older versions if we add a new field. If we really need this, I think we should at least fire a pixel to alert on the situation, but I think we can just ignore unknown fields

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requirement was to skip rules with unknown properties or missing required ones. I discussed with Dave as well and this is how he suggested to handle this. I can add a pixel there so that we know about this situation. But the reason for this was to avoid situations where the rule is accepted but behaves differently on older versions (I guess in scenarios where we would add a new property which should be considered as well when checking if the request is in the blocklist).

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Unchecked generic cast on deserialized list elements
    • fromJson now validates domains as List<*> and returns null unless every element is a String, preventing deferred ClassCastException at match time.

Create PR

Or push these changes by commenting:

@cursor push b65576d00f
Preview (b65576d00f)
diff --git a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
--- a/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
+++ b/privacy-config/privacy-config-impl/src/main/java/com/duckduckgo/privacy/config/impl/features/requestblocklist/RequestBlocklistModels.kt
@@ -32,12 +32,11 @@
 
         private val KNOWN_PROPERTIES = setOf(PROPERTY_RULE, PROPERTY_DOMAINS, PROPERTY_REASON)
 
-        @Suppress("UNCHECKED_CAST")
         fun fromJson(map: Map<String, Any>): BlocklistRuleEntity? {
             if (map.keys.any { it !in KNOWN_PROPERTIES }) return null
 
             val ruleString = map[PROPERTY_RULE] as? String ?: return null
-            val domains = map[PROPERTY_DOMAINS] as? List<String> ?: return null
+            val domains = (map[PROPERTY_DOMAINS] as? List<*>)?.map { it as? String ?: return null } ?: return null
             val reason = map[PROPERTY_REASON] as? String
 
             val rule = buildString {

if (map.keys.any { it !in KNOWN_PROPERTIES }) return null

val ruleString = map[PROPERTY_RULE] as? String ?: return null
val domains = map[PROPERTY_DOMAINS] as? List<String> ?: return null
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unchecked generic cast on deserialized list elements

Low Severity

The cast as? List<String> on the domains value only checks is List at runtime due to JVM type erasure — it does not verify that elements are actually String. If the config JSON contains non-string values in the domains array, the cast silently succeeds but a ClassCastException will be thrown later when domainMatches iterates the list, bypassing the runCatching in loadToMemory since the error occurs at match time in containedInBlocklist.


Please tell me if this was useful or not with a 👍 or 👎.

Fix in Cursor Fix in Web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a real concern, the same would happen if we would use a data model for parsing instead of using the fromJson method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants