[2/2] Add CGROUP_SOCK_ADDR Logging Infrastructure#491

Open
yaakov-stein wants to merge 13 commits into facebook:main from yaakov-stein:add_cgroup_sock_addr_logging


@yaakov-stein yaakov-stein commented Mar 29, 2026

Stack:

Summary

This adds the remaining infrastructure to go from the hard-coded packet-specific logging to a flavor-specific logging approach that makes it easy to enable logging for CGROUP_SOCK_ADDR:

BEFORE                                  AFTER

program.c (hardcoded log emission)      program.c (callback dispatch)
──────────────────────────────────      ────────────────────────────────  
if rule->log:                           if rule->log:
  setup R1-R5 for packet headers          ops->gen_inline_log(program, rule)
  EMIT_FIXUP_ELFSTUB(PKT_LOG)

chain.c:                                Packet flavors (XDP, TC, NF, cgroup_skb)
  cgroup_sock_addr? → -ENOTSUP          ────────────────────────────────  
                                        bf_packet_gen_inline_log():
                                          R1=ctx, R2=rule_id, R3=log_opts,
                                          R4=verdict, R5=packed_protos
                                          → ELFSTUB_PKT_LOG

                                        cgroup_sock_addr.c
                                        ──────────────────────────────── 
                                        gen_inline_log():
                                          R1=ctx, R2=rule_id, R3=verdict,
                                          R4=packed_protos
                                          → ELFSTUB_SOCK_ADDR_LOG

We add a new sock_addr_log ELF stub, a gen_inline_log callback (see #486), CLI formatting for socket log entries, and per-hook log option validation (e.g. BF_LOG_OPT_LINK is rejected on socket hooks, BF_LOG_OPT_PID is rejected on packet hooks). It also introduces two new log options, BF_LOG_OPT_PID and BF_LOG_OPT_COMM, which use the bpf_get_current_pid_tgid() and bpf_get_current_comm() helper calls in the stub.

One of the challenges here is that the BPF verifier restricts which bpf_sock_addr context fields a program can access based on its attach type. For example, msg_src_ip4 is only valid for SENDMSG4 and user_ip4 is only valid for IPv4 hooks. The verifier checks all code paths statically regardless of runtime reachability, so a shared ELF stub compiled once and linked into all hook types cannot read these fields directly. To solve this, the per-hook inline codegen in the prologue copies the permitted fields into a new _bf_runtime_sock_addr struct on the BPF stack, and the stub copies from there into the log entry. The copying is done once per program invocation, guarded by the existing BF_CHAIN_LOG flag, so programs without logging rules pay zero cost.

Adding _bf_runtime_sock_addr to bf_runtime increases the stack frame for all flavors by 40 bytes (with alignment), but the struct remains well within the BPF 512-byte stack limit.
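The 40-byte figure can be sanity-checked with a small sketch. The field names (saddr, daddr, dport) come from this PR; the field types and the `sketch_` names are assumptions made for illustration, not the actual definition in cgen/runtime.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: field names follow the PR description, the
 * field types are assumed for this sketch. */
struct sketch_runtime_sock_addr
{
    uint8_t saddr[16]; /* large enough for an IPv6 address */
    uint8_t daddr[16];
    uint16_t dport;
};

/* Round size up to the next multiple of align (a power of two). */
static inline size_t sketch_align_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Bytes the struct adds to a frame laid out in 8-byte slots. */
static inline size_t sketch_stack_cost(void)
{
    return sketch_align_up(sizeof(struct sketch_runtime_sock_addr), 8);
}
```

Under these assumptions the raw struct is 34 bytes and rounds up to 40 once aligned to 8 bytes, which matches the number quoted above and leaves it far below the 512-byte limit.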

Testing

  • Manual testing, example sendmsg4 log:
Screenshot 2026-03-29 at 3 10 01 PM

Note

  • As mentioned above, the verifier doesn't allow certain accesses (e.g. connect4 reading msg_src_ip6), so we need to copy those fields onto the stack frame, similar to how we do with packets. This raises a new question: we really should have different bf_runtime layouts for packet flavors and CGROUP_SOCK_ADDR, given how different they are and the different data they use (this could be done the same way as in bf_log, sharing common fields and then using a union). However, I don't think now is the right time to make this invasive change when a less invasive approach works, albeit in a less clean/proper way. IMO we can add this as a follow-up.

@meta-cla meta-cla bot added the cla signed label Mar 29, 2026

github-actions bot commented Mar 29, 2026

Claude review of PR #491 (3ca48eb)

Suggestions

  • Uninitialized local variables in prologue (src/libbpfilter/cgen/cgroup_sock_addr.c:175): saddr_off and daddr_off are declared without initialization, which may trigger -Wmaybe-uninitialized despite being structurally safe
  • ABI break in bf_log layout undocumented (src/libbpfilter/include/bpfilter/runtime.h:178): struct bf_log layout changed significantly (fields moved into tagged union, bf_pkthdr renamed); worth noting the intentional break in the commit description
  • Prologue pre-reads all fields unconditionally (src/libbpfilter/cgen/cgroup_sock_addr.c:180): context field copies (dport, saddr, daddr) execute whenever any rule has logging, even for rules requesting only pid,comm
  • Struct naming diverges from convention (src/libbpfilter/include/bpfilter/runtime.h:110): _bf_log_pkt, _bf_log_sock_addr, and _bf_runtime_sock_addr use a leading underscore for struct names, but project convention reserves this for static functions/variables
  • E2E logging tests only cover connect4 (tests/e2e/hooks/cgroup_sock_addr_connect4.sh): no logging tests for connect6, sendmsg4, or sendmsg6; the has_saddr code path is untested

Nits

  • Duplicate register setup code (src/libbpfilter/cgen/packet.c:367): bf_packet_gen_inline_log and _bf_cgroup_sock_addr_gen_inline_log share identical R1-R5 setup, differing only in the ELF stub ID
  • Kernel UAPI include in public header (src/libbpfilter/include/bpfilter/runtime.h:15): first <linux/...> include in any include/bpfilter/ header, added solely for sizeof(struct in6_addr); the literal 16 is already used in the sibling struct
  • Error message missing offending bits (src/libbpfilter/chain.c:144): "invalid log options for %s" doesn't indicate which options are invalid; including the bitmask would improve debuggability
  • Hardcoded %.16s for comm buffer (src/bfcli/print.c:664): format width matches BF_COMM_LEN but uses a literal; consider the symbolic constant via %.*s

Workflow run

@yaakov-stein yaakov-stein changed the title from "[2/2][CGROUP_SOCK_ADDR Logging] Enable Logging" to "[2/2][CGROUP_SOCK_ADDR Logging] Add CGROUP_SOCK_ADDR Logging Infrastructure" Mar 29, 2026
}

if (log_opts & (1 << BF_LOG_OPT_INTERNET)) {
__builtin_memcpy(log->payload.sock_addr.saddr, ctx->sock_addr.saddr,

Claude: must-fix: For IPv4 hooks the prologue writes only 4 bytes into ctx->sock_addr.saddr and ctx->sock_addr.daddr, but here sizeof(ctx->sock_addr.saddr) is 16. The __builtin_memcpy unconditionally copies all 16 bytes, so for IPv4 the remaining 12 bytes per address are uninitialized BPF stack data copied into the ring buffer entry.

For CONNECT4/CONNECT6 hooks, has_saddr is false so ctx->sock_addr.saddr is never written at all — the entire 16 bytes are uninitialized. This is either a BPF verifier rejection (if the verifier tracks the uninitialized read) or uninitialized stack data submitted to userspace via bpf_ringbuf_submit.

Consider either: (a) zero-initializing the sock_addr region in the prologue before writing the partial address, or (b) passing the address size to the stub so it copies only the correct number of bytes.
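Option (a) can be illustrated in plain C, outside of BPF. This is a minimal sketch under stated assumptions: the `sketch_` names are hypothetical, the real prologue is emitted as BPF instructions rather than written as a C function, and the scratch struct only mimics the saddr/daddr part of the runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ADDR_LEN 16 /* sizeof(struct in6_addr) */

/* Stand-in for the runtime scratch area the prologue fills. */
struct sketch_sock_addr
{
    uint8_t saddr[SKETCH_ADDR_LEN];
    uint8_t daddr[SKETCH_ADDR_LEN];
};

/* Option (a): clear the whole region first, then write only the bytes
 * the hook may read. src or dst may be NULL when the hook has no such
 * field (e.g. no source address on connect hooks). */
static void sketch_prologue_fill(struct sketch_sock_addr *rt,
                                 const uint8_t *src, const uint8_t *dst,
                                 size_t addr_len)
{
    assert(rt && addr_len <= SKETCH_ADDR_LEN);

    memset(rt, 0, sizeof(*rt));
    if (src)
        memcpy(rt->saddr, src, addr_len);
    if (dst)
        memcpy(rt->daddr, dst, addr_len);
}

/* The shared stub keeps its fixed 16-byte copies; the unwritten tail
 * is now zero instead of stale stack data. */
static void sketch_stub_copy(uint8_t out_saddr[SKETCH_ADDR_LEN],
                             uint8_t out_daddr[SKETCH_ADDR_LEN],
                             const struct sketch_sock_addr *rt)
{
    memcpy(out_saddr, rt->saddr, SKETCH_ADDR_LEN);
    memcpy(out_daddr, rt->daddr, SKETCH_ADDR_LEN);
}
```

The trade-off is the one raised in this thread: the zeroing costs a few extra stores per invocation, while the stub itself stays unchanged.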

Contributor Author

Currently, the packet logging does the same thing for headers that are not requested. This is not a verifier rejection in either case, but we do submit uninitialized stack data in both cases. As long as this doesn't become a verifier issue, I think it can be left as is, given that both options add some number of BPF instructions. Happy to make a change though if others think it makes sense.

@yaakov-stein yaakov-stein changed the title from "[2/2][CGROUP_SOCK_ADDR Logging] Add CGROUP_SOCK_ADDR Logging Infrastructure" to "[2/2] Add CGROUP_SOCK_ADDR Logging Infrastructure" Mar 29, 2026
Rename the packet logging ELF stub from `log` to `pkt_log` to
establish naming symmetry with the upcoming `sock_log` stub for
socket-based hooks.

Add a bf_log_type discriminator and restructure bf_log as a tagged
union with packet and socket variants. Packet-specific fields
(pkt_size, headers, l2hdr/l3hdr/l4hdr) move into the pkt variant.
The sock variant (pid, comm) is defined but not yet populated.

The struct is sized by the larger packet variant, so the total size is
unchanged for ring buffer reservations.

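A minimal sketch of the tagged-union shape described in this commit message. The variant contents (pkt_size, l2hdr/l3hdr/l4hdr, pid, comm) and the `payload` member name appear in this PR; the `sketch_` names, the buffer sizes, and the remaining fields are assumptions and do not reproduce the real struct bf_log.

```c
#include <assert.h>
#include <stdint.h>

/* Discriminator telling the reader which union member is valid. */
enum sketch_log_type
{
    SKETCH_LOG_TYPE_PACKET,
    SKETCH_LOG_TYPE_SOCK,
};

/* Packet variant; the header buffer sizes are placeholders. */
struct sketch_log_pkt
{
    uint64_t pkt_size;
    uint8_t l2hdr[14];
    uint8_t l3hdr[40];
    uint8_t l4hdr[20];
};

/* Socket variant: process metadata instead of packet headers. */
struct sketch_log_sock
{
    uint32_t pid;
    char comm[16];
};

struct sketch_log
{
    uint32_t rule_id;
    uint8_t log_type; /* one of enum sketch_log_type */
    union
    {
        struct sketch_log_pkt pkt;
        struct sketch_log_sock sock;
    } payload;
};

/* The packet variant is the larger one, so it alone sets the union size. */
static_assert(sizeof(((struct sketch_log *)0)->payload) ==
                  sizeof(struct sketch_log_pkt),
              "union must be sized by the packet variant");

/* Return 0 and store the pid only when the entry is a socket entry. */
static int sketch_log_get_pid(const struct sketch_log *log, uint32_t *pid)
{
    assert(log && pid);

    if (log->log_type != SKETCH_LOG_TYPE_SOCK)
        return -1;

    *pid = log->payload.sock.pid;
    return 0;
}
```

Because the smaller variant lives inside the space already needed by the packet variant, a consumer that reserves sizeof(struct sketch_log) bytes per entry needs no change when the socket variant is added.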
Move packet.c/h from cgen/matcher/ to cgen/ and rename
bf_matcher_generate_packet() to bf_packet_gen_inline_matcher(). This
file provides shared codegen utilities for all packet-based flavors,
not a matcher implementation, and will host additional shared packet
codegen like bf_packet_gen_inline_log() in subsequent commits.

Add struct _bf_runtime_sock_addr to bf_runtime with saddr, daddr, and
dport fields. The cgroup_sock_addr inline codegen will pre-read context
fields into these before calling the sock_addr_log ELF stub, avoiding
per-hook-type verifier restrictions on bpf_sock_addr field access.

Add a BPF ELF stub for cgroup_sock_addr logging. Captures process
metadata and socket address fields pre-read into bf_runtime by the
prologue.
@yaakov-stein yaakov-stein force-pushed the add_cgroup_sock_addr_logging branch from 7606403 to fd68815 Compare March 29, 2026 22:05
Add BF_LOG_OPT_PID and BF_LOG_OPT_COMM to the bf_log_opt enum for
socket-based hooks that log process metadata instead of packet headers.

Reject invalid log options during chain creation: packet hooks only
support link/internet/transport, socket hooks only support
internet/transport/pid/comm.
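The per-hook check described in this commit message could look roughly like the sketch below. The allowed sets are the ones listed above; the `sketch_` names, the enum ordering, and the -EINVAL return code are assumptions, not the code in chain.c. Returning the offending bits also addresses the review nit about the error message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical option numbering; only the option names come from the PR. */
enum sketch_log_opt
{
    SKETCH_LOG_OPT_LINK,
    SKETCH_LOG_OPT_INTERNET,
    SKETCH_LOG_OPT_TRANSPORT,
    SKETCH_LOG_OPT_PID,
    SKETCH_LOG_OPT_COMM,
};

#define SKETCH_BIT(n) (UINT32_C(1) << (n))

/* Packet hooks: link/internet/transport. */
#define SKETCH_PKT_OPTS                                                    \
    (SKETCH_BIT(SKETCH_LOG_OPT_LINK) | SKETCH_BIT(SKETCH_LOG_OPT_INTERNET) | \
     SKETCH_BIT(SKETCH_LOG_OPT_TRANSPORT))

/* Socket hooks: internet/transport/pid/comm. */
#define SKETCH_SOCK_OPTS                                                        \
    (SKETCH_BIT(SKETCH_LOG_OPT_INTERNET) | SKETCH_BIT(SKETCH_LOG_OPT_TRANSPORT) | \
     SKETCH_BIT(SKETCH_LOG_OPT_PID) | SKETCH_BIT(SKETCH_LOG_OPT_COMM))

/* Return 0 if every requested option is allowed for the hook kind,
 * -EINVAL otherwise. The rejected bits are stored in *invalid so the
 * caller can report exactly which options were refused. */
static int sketch_validate_log_opts(bool is_sock_hook, uint32_t opts,
                                    uint32_t *invalid)
{
    uint32_t allowed = is_sock_hook ? SKETCH_SOCK_OPTS : SKETCH_PKT_OPTS;

    assert(invalid);

    *invalid = opts & ~allowed;
    return *invalid ? -EINVAL : 0;
}
```

A caller could format *invalid as a hexadecimal mask (or as option names) in the "invalid log options" error.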
@yaakov-stein yaakov-stein force-pushed the add_cgroup_sock_addr_logging branch 3 times, most recently from 80e428a to 0827510 Compare March 29, 2026 22:31
@yaakov-stein yaakov-stein marked this pull request as ready for review March 29, 2026 22:42
#define static_assert _Static_assert
#endif

#include <linux/in6.h>

Claude: nit: This is the first <linux/...> include in any public header under include/bpfilter/. It is only needed for sizeof(struct in6_addr) in _bf_log_sock_addr.saddr and .daddr. The sibling struct _bf_runtime_sock_addr in cgen/runtime.h already uses the literal 16 for the same purpose. Using 16 here (with a comment referencing sizeof(struct in6_addr)) would avoid introducing a kernel UAPI header dependency into the public API surface.
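One way to follow this suggestion without losing the size guarantee is to keep the literal in the header and verify it in an implementation file. A minimal sketch; the `SKETCH_`/`sketch_` names are hypothetical, and <netinet/in.h> is used here only as a portable source of struct in6_addr for the check.

```c
#include <assert.h>
#include <netinet/in.h> /* implementation file only, for the size check */
#include <stdint.h>

/* What the public header would carry: no kernel UAPI include needed. */
#define SKETCH_IN6_ADDR_LEN 16 /* sizeof(struct in6_addr) */

struct sketch_log_sock_addr
{
    uint8_t saddr[SKETCH_IN6_ADDR_LEN];
    uint8_t daddr[SKETCH_IN6_ADDR_LEN];
};

/* Compile-time guard placed in a .c file so the literal cannot drift. */
static_assert(sizeof(struct in6_addr) == SKETCH_IN6_ADDR_LEN,
              "SKETCH_IN6_ADDR_LEN must match sizeof(struct in6_addr)");
```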

@yaakov-stein yaakov-stein force-pushed the add_cgroup_sock_addr_logging branch from 0827510 to 3ca48eb Compare March 30, 2026 01:29
/* Copy context fields into the runtime for the logging stub.
* The verifier restricts context access by attach type, so the shared
* stub reads from here instead of the context directly. */
if (program->runtime.chain->flags & BF_FLAG(BF_CHAIN_LOG)) {

Claude: suggestion: The context field copies (dport, saddr, daddr) are guarded by BF_CHAIN_LOG, which is set if any rule in the chain has logging enabled. This means every connection through the chain pays the cost of the BPF loads and stores, even if the matched rule does not have logging or only requests pid,comm.

For chains where logging is rare (e.g., one log rule among many), consider computing the union of all rule->log options at chain creation time and only emitting copies for actually-needed fields. Not a correctness issue, but the number of instructions per invocation could be reduced.
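The union-of-options idea can be sketched as follows. Everything here is hypothetical: the `sketch_` names, the bit values, and in particular the mapping from options to context fields (internet to saddr/daddr, transport to dport) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical option bits, mirroring the option names used in the PR. */
#define SKETCH_OPT_INTERNET (UINT32_C(1) << 0)
#define SKETCH_OPT_TRANSPORT (UINT32_C(1) << 1)
#define SKETCH_OPT_PID (UINT32_C(1) << 2)
#define SKETCH_OPT_COMM (UINT32_C(1) << 3)

struct sketch_rule
{
    uint32_t log; /* bitmask of requested log options, 0 if none */
};

/* OR together the log options of every rule in the chain. */
static uint32_t sketch_chain_log_union(const struct sketch_rule *rules,
                                       size_t n_rules)
{
    uint32_t all = 0;

    assert(rules || n_rules == 0);

    for (size_t i = 0; i < n_rules; ++i)
        all |= rules[i].log;

    return all;
}

/* Assumed mapping: only the internet option needs saddr/daddr copies. */
static bool sketch_needs_addr_copy(uint32_t log_union)
{
    return (log_union & SKETCH_OPT_INTERNET) != 0;
}

/* Assumed mapping: only the transport option needs the dport copy. */
static bool sketch_needs_port_copy(uint32_t log_union)
{
    return (log_union & SKETCH_OPT_TRANSPORT) != 0;
}
```

The codegen would compute the union once at chain creation and consult these predicates when deciding which prologue copies to emit, so a chain whose rules only request pid,comm would emit none of them.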

struct _bf_log_pkt
{
/** Total size of the packet, including the payload. */
__u64 pkt_size;

Claude: suggestion: struct _bf_log_pkt, struct _bf_log_sock_addr (here in runtime.h), and struct _bf_runtime_sock_addr (in cgen/runtime.h) use a leading-underscore prefix for struct names. The project style guide reserves leading underscores for static functions and variables (e.g., _bf_ctx_free()), and all existing struct types in the codebase use the plain bf_ prefix (e.g., struct bf_chain, struct bf_runtime, struct bf_log).

Consider struct bf_log_pkt, struct bf_log_sock_addr, and struct bf_runtime_sock_addr for consistency.

}

if (req & (1 << BF_LOG_OPT_COMM)) {
(void)fprintf(stdout, " Process : %s%.16s%s\n",

Claude: nit: The format specifier %.16s hardcodes the comm buffer size. BF_COMM_LEN is defined to 16 in this PR and could be used here via %.*s to stay in sync if the buffer size ever changes:

(void)fprintf(stdout, "  Process   : %s%.*s%s\n",
              bf_logger_get_color(BF_COLOR_LIGHT_GREEN, BF_STYLE_BOLD),
              BF_COMM_LEN, log->payload.sock_addr.comm,
              bf_logger_get_color(BF_COLOR_RESET, BF_STYLE_RESET));
