
Panel_EPD.cpp: two PSRAM heap corruption bugs (task_update() odd byte-width overrun + display() not rotating the update rectangle) #181

@mikaryyn

While working on an app for the M5Paper S3, I ran into unexpected crashes. Debugging traced them to Panel_EPD.cpp, which has two separate bugs, described here.

Root Cause 1: task_update() odd byte-width overrun

Repro Instructions:

  • Clone the repository with git clone https://codeberg.org/mikaryyn/panel-epd-bug-repro.git. The repo contains a test project using ESP-IDF 5.3.3 and M5GFX 0.2.19.
  • Check out the repro case: git checkout bug1
  • Run the repro case on M5PaperS3: idf.py build flash monitor and observe a crash in the logs.

Proposed Fix:

  • Check out the fix to Panel_EPD.cpp: git checkout fix1
  • Run the fixed application: idf.py build flash monitor. Now it should run without crashing.

Details: Panel_EPD::task_update() can write past the end of a row when the update width (in bytes) is odd.

  • In task_update(), the update width is converted from pixels to bytes: w_bytes = new_data.w >> 1 (because 1 byte = 2 pixels).
  • The hot loop processes 2 bytes at a time (s[0] and s[1]) and writes 4 uint16_t values (d[0..3]), then advances s += 2, d += 4.
  • If w_bytes is odd (example: a 2-pixel-wide update gives w_bytes = 1), the loop still processes a full 2-byte pair on its final iteration, so it consumes one source byte too many and writes two extra uint16_t values: a 4-byte out-of-bounds write past the intended end of the row.
  • If that write lands at the tail of the _step_framebuf allocation, it can clobber heap poison/tail metadata. The next PSRAM heap check (heap_caps_check_integrity(...)) then fails, often tripping in tlsf_check(), even though the heap checker is only reporting corruption that happened earlier.
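
The overrun described above can be reproduced on a host machine with a small model of the loop shape. This is only a sketch: overrun_slots, guard_odd_tail, the canary value, and the nibble-to-uint16_t conversion are hypothetical stand-ins, not the actual Panel_EPD code. It counts how many uint16_t slots past the row's intended end get overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the loop shape described in this issue:
// 2 source bytes in, 4 uint16_t values out per iteration.
// Returns how many uint16_t slots past the intended end of the row were written.
inline std::size_t overrun_slots(std::size_t w_bytes, bool guard_odd_tail) {
    const std::uint16_t kCanary = 0xBEEF;
    const std::size_t row_slots = w_bytes * 2;             // 2 output slots per source byte
    std::vector<std::uint8_t> src(w_bytes + 1, 0x5A);      // +1 byte so the model's reads stay in bounds
    std::vector<std::uint16_t> dst(row_slots + 4, kCanary); // 4 canary slots after the row

    const std::uint8_t* s = src.data();
    std::uint16_t* d = dst.data();
    std::size_t remaining = w_bytes;

    // Unguarded variant: a lone trailing byte is still processed as a full pair.
    while (remaining >= 2 || (remaining == 1 && !guard_odd_tail)) {
        assert(d + 4 <= dst.data() + dst.size());          // keep the model itself memory-safe
        d[0] = s[0] >> 4;  d[1] = s[0] & 0x0F;             // placeholder per-pixel values
        d[2] = s[1] >> 4;  d[3] = s[1] & 0x0F;
        s += 2;
        d += 4;
        remaining = (remaining >= 2) ? remaining - 2 : 0;
    }

    // Guarded variant: handle the odd trailing byte on its own (one possible fix shape).
    if (remaining == 1) {
        d[0] = s[0] >> 4;
        d[1] = s[0] & 0x0F;
    }

    std::size_t clobbered = 0;
    for (std::size_t i = row_slots; i < dst.size(); ++i) {
        if (dst[i] != kCanary) ++clobbered;
    }
    return clobbered;
}
```

With w_bytes = 1 and no guard, two canary slots (4 bytes) are overwritten; with the guard, or with an even w_bytes, none are. Whether the fix1 branch uses a separate tail step or another approach is not shown here.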

Root Cause 2: display() not rotating the update rectangle

Repro Instructions:

  • Use the same git repo as before.
  • Check out the repro case: git checkout bug2
  • Run the repro case on M5PaperS3: idf.py build flash monitor and observe a crash in the logs.

Proposed Fix:

  • Check out the fix to Panel_EPD.cpp: git checkout fix2
  • Run the fixed application: idf.py build flash monitor. Now it should run without crashing.

Details:
Panel_EPD::display(x, y, w, h) can enqueue an update rectangle in unrotated logical coordinates. That rectangle can be out of range for the panel’s physical backing buffer, causing task_update() to walk off the end of _step_framebuf.

  • LGFXBase::display(x,y,w,h) clips and forwards the rectangle, but does not apply rotation. Rotation must be handled inside the panel implementation.
  • Panel_EPD::display() updates the internal dirty region (_range_mod) using the unrotated (x,y,w,h), and then builds the queued update (update_data_t) from _range_mod.
  • On M5Paper S3 the physical EPD backing buffer in PSRAM is 960×540, while the logical size exposed to the app is typically 540×960 under rotation.
  • As a result, calls like display.display(0, 0, display.width(), display.height()) (i.e. display(0,0,540,960)) can enqueue an update with h = 960 even though _step_framebuf only has 540 rows. When task_update() iterates h rows, it writes past the end of _step_framebuf and corrupts PSRAM heap metadata; the next heap_caps_check_integrity(...) is just where the corruption gets detected.
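
To make the coordinate mismatch concrete, here is a minimal sketch of mapping a logical rectangle to physical buffer coordinates before enqueueing it. Rect, to_physical, and fits are hypothetical names, and the rotation convention (quarter turns, with values 1 and 3 swapping width and height) is an assumption that may not match M5GFX's internal convention exactly.

```cpp
#include <cassert>

// Hypothetical rectangle type; not the M5GFX update_data_t.
struct Rect { int x, y, w, h; };

// Map a logical rectangle to physical buffer coordinates.
// Assumed convention: rotation counts quarter turns (0..3); odd values
// swap width and height relative to the physical phys_w x phys_h buffer.
inline Rect to_physical(Rect r, int rotation, int phys_w, int phys_h) {
    assert(phys_w > 0 && phys_h > 0);
    switch (rotation & 3) {
    case 1:  return { phys_w - r.y - r.h, r.x, r.h, r.w };
    case 2:  return { phys_w - r.x - r.w, phys_h - r.y - r.h, r.w, r.h };
    case 3:  return { r.y, phys_h - r.x - r.w, r.h, r.w };
    default: return r;
    }
}

// True if the rectangle lies entirely inside a phys_w x phys_h buffer.
inline bool fits(Rect r, int phys_w, int phys_h) {
    return r.x >= 0 && r.y >= 0 && r.w >= 0 && r.h >= 0 &&
           r.x + r.w <= phys_w && r.y + r.h <= phys_h;
}
```

Under these assumptions, the full-screen logical rectangle (0, 0, 540, 960) does not fit a 960×540 buffer as-is, but maps to (0, 0, 960, 540) for either quarter-turn rotation, which does fit. A bounds check like fits() before enqueueing would also catch any rectangle that slips through unrotated.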
