"Skipping data after last boundary" warning when CRLF is split across chunks #246

@yahiro-code

Description

This is a follow-up to #192 / #193.

The fix in #193 correctly handles a CRLF after the closing boundary when both bytes arrive in a single chunk. However, when the \r and the \n arrive in separate chunks (which happens in production behind network proxies and load balancers), the warning is still emitted.

If you'd like, I can open a PR to fix this.

Cause

In MultipartState.END, the current check requires both \r and \n to be in the same chunk:

https://github.com/Kludex/python-multipart/blob/master/python_multipart/multipart.py#L1415-L1423

elif state == MultipartState.END:
    # Don't do anything if chunk ends with CRLF.
    if c == CR and i + 1 < length and data[i + 1] == LF:
        i += 2
        continue
    # Skip data after the last boundary.
    self.logger.warning("Skipping data after last boundary")
    i = length
    break

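When \r is the last byte of a chunk, i + 1 < length is False, so the check falls through to the warning. The parser keeps no record of that trailing \r, so the lone \n that opens the next chunk fails the c == CR test as well and the warning is logged a second time.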
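One possible direction for a fix is to remember a trailing \r across write() calls. The sketch below is a standalone model of the END-state handling only, not python-multipart's actual code: the class TrailingCRLFScanner, its pending_cr flag, and its finalize() method are hypothetical names introduced for illustration.

```python
import logging

CR = 0x0D
LF = 0x0A

logger = logging.getLogger("epilogue_sketch")  # hypothetical logger name


class TrailingCRLFScanner:
    """Hypothetical model of the bytes that follow the closing boundary."""

    def __init__(self) -> None:
        # True when the previous chunk ended with a bare CR.
        self.pending_cr = False
        self.warnings = 0

    def _warn(self) -> None:
        self.warnings += 1
        logger.warning("Skipping data after last boundary")

    def write(self, data: bytes) -> None:
        i = 0
        length = len(data)
        while i < length:
            c = data[i]
            if self.pending_cr:
                self.pending_cr = False
                if c == LF:
                    # This LF completes the CRLF started in the previous chunk.
                    i += 1
                    continue
                # CR followed by something else: fall through to the warning.
            elif c == CR:
                if i + 1 == length:
                    # CR is the last byte: defer the decision to the next chunk.
                    self.pending_cr = True
                    i += 1
                    continue
                if data[i + 1] == LF:
                    i += 2
                    continue
            # Anything else is stray data; skip the rest of this chunk.
            self._warn()
            return

    def finalize(self) -> None:
        # A CR that never received its LF is stray data after all.
        if self.pending_cr:
            self.pending_cr = False
            self._warn()


scanner = TrailingCRLFScanner()
scanner.write(b"\r")  # chunk ends with CR
scanner.write(b"\n")  # next chunk is just LF
scanner.finalize()
print(scanner.warnings)
```

In the real parser the flag would have to live on the parser instance (or be encoded as an extra state) so that it survives between write() calls; where exactly to put it is a design choice for the maintainers.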

Who appends the CRLF

aiohttp's MultipartWriter.write() appends \r\n after the closing boundary:

https://github.com/aio-libs/aiohttp/blob/master/aiohttp/multipart.py (around line 1000)

if close_boundary:
    await writer.write(b"--" + self._boundary + b"--\r\n")

This is valid per RFC 2046 Section 5.1.1:

close-delimiter := "--" boundary "--" transport-padding
                   [CRLF epilogue]

NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the
boundary value with the beginning of each candidate line. An exact
match of the entire candidate line is not required; it is sufficient
that the boundary appear in its entirety following the CRLF.

...these areas are generally not used because of the lack of proper
typing of these parts and the lack of clear semantics for handling
these areas at gateways, particularly X.400 gateways.

Reproduction

import asyncio
import io
import logging
import sys

from aiohttp import FormData, MultipartWriter
from python_multipart.multipart import MultipartParser

# -- Build body using aiohttp's FormData --

class BytesWriter:
    def __init__(self):
        self.buffer = bytearray()
    async def write(self, data: bytes) -> None:
        self.buffer.extend(data)

async def build_aiohttp_body():
    form = FormData()
    form.add_field("file", io.BytesIO(b"hello"), filename="test.txt", content_type="text/plain")
    mpwriter = form()
    assert isinstance(mpwriter, MultipartWriter)
    writer = BytesWriter()
    await mpwriter.write(writer)
    return mpwriter._boundary, bytes(writer.buffer)

boundary, body = asyncio.run(build_aiohttp_body())

# Confirm aiohttp appends \r\n after closing boundary
final_boundary = b"--" + boundary + b"--"
idx = body.rfind(final_boundary)
after = body[idx + len(final_boundary):]
print(f"Data after final boundary: {after!r}")  # b'\r\n'

# -- Logging setup --

handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.WARNING)
handler.setFormatter(logging.Formatter("%(name)s - %(levelname)s - %(message)s"))
logger = logging.getLogger("python_multipart.multipart")
logger.addHandler(handler)
logger.setLevel(logging.WARNING)

# Case 1: single chunk - no warning
print("\n=== Case 1: single chunk ===")
p1 = MultipartParser(boundary, {})
p1.write(body)
p1.finalize()
print("(no warning)")

# Case 2: CR/LF split across chunks (split at -1) - BUG
print("\n=== Case 2: CR/LF split at -1 ===")
p2 = MultipartParser(boundary, {})
p2.write(body[:-1])  # chunk ends with \r
p2.write(body[-1:])  # next chunk is just \n
p2.finalize()
print("(done)")

# Case 3: split at -2 (both \r\n in second chunk) - OK
print("\n=== Case 3: split at -2 ===")
p3 = MultipartParser(boundary, {})
p3.write(body[:-2])
p3.write(body[-2:])  # \r\n together
p3.finalize()
print("(done)")

Output:

Data after final boundary: b'\r\n'

=== Case 1: single chunk ===
(no warning)

=== Case 2: CR/LF split at -1 ===
python_multipart.multipart - WARNING - Skipping data after last boundary
python_multipart.multipart - WARNING - Skipping data after last boundary
(done)

=== Case 3: split at -2 ===
(done)

Only Case 2 triggers the warning, and it does so twice: once when \r is the last byte of a chunk, and again when the lone \n arrives in the next chunk.
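The cases above probe only two split points. A regression test could sweep every two-chunk split instead. The sketch below is a minimal illustration under stated assumptions: two_chunk_splits, offsets_that_warn, and naive_count are hypothetical helpers, and naive_count is only a toy model of the per-chunk check quoted under Cause, applied to the bytes after the closing boundary. It is not the library's parser.

```python
from typing import Callable, Iterator, List, Tuple

CR = 0x0D
LF = 0x0A


def two_chunk_splits(body: bytes) -> Iterator[Tuple[int, bytes, bytes]]:
    """Yield (offset, head, tail) for every cut into two non-empty chunks."""
    for offset in range(1, len(body)):
        yield offset, body[:offset], body[offset:]


def offsets_that_warn(
    body: bytes, count_warnings: Callable[[List[bytes]], int]
) -> List[int]:
    """Return the split offsets at which count_warnings reports a warning."""
    return [
        offset
        for offset, head, tail in two_chunk_splits(body)
        if count_warnings([head, tail]) > 0
    ]


def naive_count(chunks: List[bytes]) -> int:
    """Toy model: CR and LF must sit in the same chunk to be accepted."""
    warnings = 0
    for data in chunks:
        i, length = 0, len(data)
        while i < length:
            if data[i] == CR and i + 1 < length and data[i + 1] == LF:
                i += 2
                continue
            # Stray byte: count one warning and skip the rest of the chunk.
            warnings += 1
            break
    return warnings


# Every split that separates a CR from its LF is reported.
print(offsets_that_warn(b"\r\n\r\n", naive_count))  # [1, 3]
```

To run the same sweep against the real parser, count_warnings would need to create a fresh MultipartParser per split, write each chunk, and count the records captured by a logging handler.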

Environment

  • python-multipart 0.0.20 (also reproducible on 0.0.22)
