Segfault in RE2 compiler when compiling a crafted regex #596

@hgarrereyn

Hi, there is a potential bug in RE2 pattern construction reachable with a crafted input.

This bug was reproduced on the current HEAD of re2 (commit 972a15c).

Description

What crashes

  • Constructing re2::RE2 from a crafted pattern causes a segfault during compilation. The stack trace shows a null read in absl::container_internal::raw_hash_set (find_small, called from find; iterator construction, HashSetIteratorGenerationInfoEnabled). The call chain is re2::Compiler::CachedRuneByteSuffix, called from AddRuneRangeUTF8, PostVisit, Regexp::Walker::WalkInternal, and Compiler::Compile, reached via RE2::Init and RE2::RE2(const std::string&).

The testcase below uses the pattern "((([ä-])))", where ä is the non-ASCII character U+00E4 (bytes 0xC3 0xA4 in UTF-8). This seems to trigger an edge case during compilation.
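For reference, the byte-level form of this pattern can be checked with a short standalone program. The sketch below is not RE2 code: EncodeUtf8 and BuildPattern are hypothetical helper names introduced only for illustration. It encodes U+00E4 as UTF-8 and rebuilds the 11-byte pattern, in which ä occupies the two bytes 195 and 164 (0xC3 0xA4).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of RE2): encode one Unicode code point as UTF-8.
// Surrogates and values above U+10FFFF are not validated in this sketch.
std::string EncodeUtf8(std::uint32_t cp) {
  std::string out;
  if (cp < 0x80) {
    out += static_cast<char>(cp);
  } else if (cp < 0x800) {
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp < 0x10000) {
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  }
  return out;
}

// Hypothetical helper: rebuild the crafted pattern "((([ä-])))" byte by byte.
std::string BuildPattern() {
  std::string pattern = "((([" + EncodeUtf8(0xE4) + "-])))";
  // 3 + 1 + 2 (ä) + 1 + 1 + 3 bytes.
  assert(pattern.size() == 11);
  return pattern;
}
```

The resulting string should match the byte array used in testcase.cpp, which is why the testcase spells the pattern out as raw bytes rather than relying on the source file's encoding.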

RE2 is explicitly documented as supporting Unicode and is designed to fail safely on malformed inputs:

RE2 was designed and implemented with an explicit goal of being able to handle regular expressions from untrusted users without risk.

Safe behavior here would likely be to reject this type of pattern during compilation (re.ok() == false).
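From the caller's side, that fail-safe contract might look like the sketch below. It is templated on the regex type so that it builds without RE2 installed; with RE2 available, Regex would be re2::RE2, whose ok() and error() accessors it relies on. TryCompile, CompileResult, and FakeRegex are hypothetical names introduced for illustration, and FakeRegex is only a stand-in, not a model of RE2's parser.

```cpp
#include <cassert>
#include <string>

// Outcome of a compile attempt: either success, or the reported error text.
struct CompileResult {
  bool ok;
  std::string error;
};

// Hypothetical helper. `Regex` is any type constructible from std::string
// that exposes ok() and error(), as re2::RE2 does.
template <typename Regex>
CompileResult TryCompile(const std::string& pattern) {
  Regex re(pattern);
  if (!re.ok()) {
    return {false, re.error()};
  }
  return {true, ""};
}

// Stand-in for illustration only, so the sketch builds without RE2:
// it rejects any pattern that ends inside an unclosed '['.
class FakeRegex {
 public:
  explicit FakeRegex(const std::string& pattern) {
    bool open = false;
    for (char c : pattern) {
      if (c == '[') open = true;
      else if (c == ']') open = false;
    }
    if (open) error_ = "missing ]";
  }
  bool ok() const { return error_.empty(); }
  const std::string& error() const { return error_; }

 private:
  std::string error_;
};

// Usage: the caller branches on the result instead of assuming success.
inline void Demo() {
  assert(TryCompile<FakeRegex>("[a-z]+").ok);
  assert(!TryCompile<FakeRegex>("[a-z").ok);
}
```

The point of the sketch is that a rejected pattern surfaces as an ordinary return value the caller can handle, whereas the crash reported here terminates the process before ok() can be consulted.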

POC

The following testcase demonstrates the bug:

testcase.cpp

#include <string>
#include "/fuzz/install/include/re2/re2.h"

int main(){
  static const unsigned char buf[] = {
    40, 40, 40, 91, 195, 164, 45, 93, 41, 41, 41
  };
  std::string pat(reinterpret_cast<const char*>(buf), sizeof(buf));
  re2::RE2 re(pat);
  return re.ok() ? 0 : 1;  // exit status 0 if the pattern compiled
}

stdout

(empty)

stderr

AddressSanitizer:DEADLYSIGNAL
=================================================================
==1==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fe087157268 bp 0x7ffdc5874ef0 sp 0x7ffdc5874e20 T0)
==1==The signal is caused by a READ memory access.
==1==Hint: address points to the zero page.
    #0 0x7fe087157268 in absl::lts_20250814::container_internal::raw_hash_set<absl::lts_20250814::container_internal::FlatHashMapPolicy<unsigned long, int>, absl::lts_20250814::hash_internal::Hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int>>>::iterator absl::lts_20250814::container_internal::raw_hash_set<absl::lts_20250814::container_internal::FlatHashMapPolicy<unsigned long, int>, absl::lts_20250814::hash_internal::Hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int>>>::find_small<unsigned long>(unsigned long const&) (/fuzz/install/lib/libre2.so.11+0x3f268) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #1 0x7fe087155e68 in absl::lts_20250814::container_internal::raw_hash_set<absl::lts_20250814::container_internal::FlatHashMapPolicy<unsigned long, int>, absl::lts_20250814::hash_internal::Hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int>>>::iterator absl::lts_20250814::container_internal::raw_hash_set<absl::lts_20250814::container_internal::FlatHashMapPolicy<unsigned long, int>, absl::lts_20250814::hash_internal::Hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int>>>::find<unsigned long>(unsigned long const&) (/fuzz/install/lib/libre2.so.11+0x3de68) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #2 0x7fe08714d91c in re2::Compiler::CachedRuneByteSuffix(unsigned char, unsigned char, bool, int) (/fuzz/install/lib/libre2.so.11+0x3591c) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #3 0x7fe08714f9aa in re2::Compiler::AddRuneRangeUTF8(int, int, bool) (/fuzz/install/lib/libre2.so.11+0x379aa) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #4 0x7fe087151e5e in re2::Compiler::PostVisit(re2::Regexp*, re2::Frag, re2::Frag, re2::Frag*, int) (/fuzz/install/lib/libre2.so.11+0x39e5e) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #5 0x7fe087159441 in re2::Regexp::Walker<re2::Frag>::WalkInternal(re2::Regexp*, re2::Frag, bool) (/fuzz/install/lib/libre2.so.11+0x41441) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #6 0x7fe08715311c in re2::Compiler::Compile(re2::Regexp*, bool, long) (/fuzz/install/lib/libre2.so.11+0x3b11c) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #7 0x7fe0871dbff6 in re2::RE2::Init(std::basic_string_view<char, std::char_traits<char>>, re2::RE2::Options const&) (/fuzz/install/lib/libre2.so.11+0xc3ff6) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #8 0x7fe0871dce1e in re2::RE2::RE2(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) (/fuzz/install/lib/libre2.so.11+0xc4e1e) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389)
    #9 0x564f038c15e2 in main /fuzz/testcase.cpp:9:12
    #10 0x7fe086bc8d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
    #11 0x7fe086bc8e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #12 0x564f037e6354 in _start (/fuzz/test+0x2c354) (BuildId: 239984f5837b3922e4fc55b4a2d27a5a92b97202)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/fuzz/install/lib/libre2.so.11+0x3f268) (BuildId: fef1cf9efc48c20d81394670bd40f961d8ade389) in absl::lts_20250814::container_internal::raw_hash_set<absl::lts_20250814::container_internal::FlatHashMapPolicy<unsigned long, int>, absl::lts_20250814::hash_internal::Hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int>>>::iterator absl::lts_20250814::container_internal::raw_hash_set<absl::lts_20250814::container_internal::FlatHashMapPolicy<unsigned long, int>, absl::lts_20250814::hash_internal::Hash<unsigned long>, std::equal_to<unsigned long>, std::allocator<std::pair<unsigned long const, int>>>::find_small<unsigned long>(unsigned long const&)
==1==ABORTING

Steps to Reproduce

The crash was triaged with the following Dockerfile:

Dockerfile

# Ubuntu 22.04 with some packages pre-installed
FROM hgarrereyn/stitch_repro_base@sha256:3ae94cdb7bf2660f4941dc523fe48cd2555049f6fb7d17577f5efd32a40fdd2c

RUN git clone https://github.com/google/re2 /fuzz/src && \
    cd /fuzz/src && \
    git checkout 972a15c && \
    git submodule update --init --remote --recursive

ENV LD_LIBRARY_PATH=/fuzz/install/lib
ENV ASAN_OPTIONS=hard_rss_limit_mb=1024:detect_leaks=0

RUN echo '#!/bin/bash\nexec clang-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper && \
    chmod +x /usr/local/bin/clang_wrapper && \
    echo '#!/bin/bash\nexec clang++-17 -fsanitize=address -O0 "$@"' > /usr/local/bin/clang_wrapper++ && \
    chmod +x /usr/local/bin/clang_wrapper++

# Install build dependencies: make, pkg-config, CMake, Ninja (Abseil is built from source below)
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
      make pkg-config ca-certificates cmake ninja-build && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /tmp
RUN git clone https://github.com/abseil/abseil-cpp.git && \
    cd abseil-cpp && \
    git checkout d38452e && \
    cd .. && \
    cmake -S abseil-cpp -B absl-build \
        -DCMAKE_CXX_STANDARD=17 \
        -DCMAKE_CXX_STANDARD_REQUIRED=ON \
        -DCMAKE_BUILD_TYPE=Release \
        -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
        -DCMAKE_INSTALL_PREFIX=/fuzz/install \
        -DBUILD_SHARED_LIBS=ON && \
    cmake --build absl-build -j$(nproc) && \
    cmake --install absl-build && \
    rm -rf /tmp/abseil-cpp /tmp/absl-build

# Configure and build RE2
WORKDIR /fuzz/src
RUN cmake -S . -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DCMAKE_CXX_STANDARD=17 \
    -DABSL_PROPAGATE_CXX_STD=ON \
    -DABSL_ENABLE_INSTALL=ON \
    -DCMAKE_CXX_STANDARD_REQUIRED=ON \
    -DCMAKE_C_COMPILER=clang_wrapper \
    -DCMAKE_CXX_COMPILER=clang_wrapper++ \
    -DCMAKE_INSTALL_PREFIX=/fuzz/install \
    -DCMAKE_PREFIX_PATH=/fuzz/install \
    -DBUILD_SHARED_LIBS=ON \
    -DCMAKE_SKIP_BUILD_RPATH=FALSE \
    -DCMAKE_BUILD_RPATH=/fuzz/install/lib \
    -DCMAKE_INSTALL_RPATH=/fuzz/install/lib \
    -DCMAKE_INSTALL_RPATH_USE_LINK_PATH=TRUE \
    -DRE2_BUILD_TESTING=OFF && \
    cmake --build build -j$(nproc) && \
    cmake --install build

Build Command

clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -L/fuzz/install/lib -lre2 -pthread && /fuzz/test

Reproduce

  1. Copy Dockerfile and testcase.cpp into a local folder.
  2. Build the repro image:
docker build . -t repro --platform=linux/amd64
  3. Compile and run the testcase in the image:
docker run \
    -it --rm \
    --platform linux/amd64 \
    --mount type=bind,source="$(pwd)/testcase.cpp",target=/fuzz/testcase.cpp \
    repro \
    bash -c "clang++-17 -fsanitize=address -g -O0 -o /fuzz/test /fuzz/testcase.cpp -I/fuzz/install/include -L/fuzz/install/lib -lre2 -pthread && /fuzz/test"


Additional Info

This testcase was discovered by STITCH, an autonomous fuzzing system. All reports are reviewed manually (by a human) before submission.
