
Renderer content disposition encoding #7

Open
romsahel wants to merge 2 commits into parser-rfc-2822-memory-optimization from renderer-content-disposition-encoding

@romsahel romsahel commented Feb 4, 2026

Problem Statement

RFC 2047 defines "encoded-words" (the =?UTF-8?Q?encoded-text?= format) for encoding non-ASCII characters in email headers. However, RFC 2047 explicitly states:

An 'encoded-word' MUST NOT be used in parameter of a MIME Content-Type or Content-Disposition field, or in any structured field body except within a 'comment' or 'phrase'.

This means:

  • Allowed: Subject: =?UTF-8?Q?caf=C3=A9?=
  • Allowed: From: =?UTF-8?Q?Jos=C3=A9?= <jose@example.com>
  • Forbidden: Content-Disposition: attachment; filename="=?UTF-8?Q?caf=C3=A9.pdf?="
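
For reference, the encoded-words in the allowed examples above can be produced with a few lines of Elixir. This is only an illustrative sketch: the module name `QWordSketch` is hypothetical and is not part of the library, it always assumes UTF-8, it uses the restricted character set RFC 2047 permits inside a phrase, and it ignores the 75-character limit on an encoded-word.

```elixir
defmodule QWordSketch do
  # Hypothetical helper, not the library's API: builds one RFC 2047
  # "Q"-encoded word from a UTF-8 string.
  def encode(text) when is_binary(text) do
    encoded = for <<byte <- text>>, into: "", do: encode_byte(byte)
    "=?UTF-8?Q?" <> encoded <> "?="
  end

  # A space is written as an underscore in Q encoding.
  defp encode_byte(?\s), do: "_"

  # Letters and digits are emitted unchanged.
  defp encode_byte(b) when b in ?a..?z or b in ?A..?Z or b in ?0..?9, do: <<b>>

  # The only punctuation that is safe inside a phrase.
  defp encode_byte(b) when b in [?!, ?*, ?+, ?-, ?/], do: <<b>>

  # Every other byte becomes "=" followed by two uppercase hex digits.
  defp encode_byte(b), do: "=" <> Base.encode16(<<b>>)
end
```

With this sketch, `QWordSketch.encode("café")` yields the Subject value shown in the first bullet.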

The parameters of these headers must instead be encoded as specified in RFC 2231:

  • Format: parameter*=charset'language'encoded-value
  • Uses percent-encoding (like URLs)
  • No quotes around the encoded value
  • The parameter name gets an asterisk (*) suffix when encoding is used

Examples:

filename*=UTF-8''caf%C3%A9.pdf
filename*=UTF-8''%E4%B8%AD%E6%96%87.txt
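
A minimal sketch of how such a value could be produced, assuming UTF-8 input and an empty language tag. The module and function names (`Rfc2231Sketch`, `encode_value/1`, `extended_param/2`) are hypothetical and do not come from this PR; the only library call is `URI.encode/2` with a custom predicate that keeps the bytes RFC 2231 allows unescaped (its `attribute-char` set).

```elixir
defmodule Rfc2231Sketch do
  # Punctuation that RFC 2231 allows unescaped: ASCII minus space,
  # control characters, "*", "'", "%", and the RFC 2045 tspecials.
  @safe_punctuation ~c"!#$&+-.^_`{|}~"

  # Percent-encodes every byte that is not an attribute-char.
  def encode_value(value) when is_binary(value) do
    URI.encode(value, &attribute_char?/1)
  end

  # Builds the whole parameter: asterisk suffix, charset, empty language, no quotes.
  def extended_param(key, value) do
    "#{key}*=UTF-8''#{encode_value(value)}"
  end

  defp attribute_char?(byte) when byte in ?a..?z or byte in ?A..?Z or byte in ?0..?9, do: true
  defp attribute_char?(byte), do: byte in @safe_punctuation
end
```

Calling `Rfc2231Sketch.extended_param("filename", "café.pdf")` reproduces the first example above.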

Current implementation

In lib/mail/renderers/rfc_2822.ex, the render_subtypes/1 function uses RFC 2047 encoding for all header parameters.

The proposed fix

Introduce a new render_subtypes clause that handles those headers (@rfc2231_headers) and applies the proper encoding when necessary (i.e., if the value contains non-ASCII characters).
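
The header-based selection can be pictured roughly as follows. This is a sketch under assumptions: the contents of `@rfc2231_headers` are taken from the commit message (Content-Type and Content-Disposition), while the module name, the `mode_for/1` function, the lowercase comparison, and the `:rfc_2047` atom are hypothetical and may not match the actual change.

```elixir
defmodule HeaderModeSketch do
  # Headers whose parameters need RFC 2231 encoding (assumed contents).
  @rfc2231_headers ["content-type", "content-disposition"]

  # Returns the mode atom that would select a render_subtypes clause.
  # Header names are compared case-insensitively.
  def mode_for(header_name) when is_binary(header_name) do
    if String.downcase(header_name) in @rfc2231_headers do
      :rfc_2231
    else
      :rfc_2047
    end
  end
end
```

Keeping the decision in one place means every other header continues through the RFC 2047 path unchanged.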

…ype parameters

RFC 2047 encoded-words are forbidden in MIME parameter values. This change
implements RFC 2231 encoding (charset'language'percent-encoded) for
Content-Type and Content-Disposition parameters while preserving RFC 2047
for other headers.
Fix parser to correctly handle simple RFC 2231 extended parameters
(e.g., filename*=UTF-8''value) by using `trim: true` in String.split.

The parser previously supported only Parameter Value Continuations, which have the format key*index="value":

        Content-Type: message/external-body; access-type=URL;
         URL*0="ftp://";
         URL*1="cs.utk.edu/pub/moore/bulk-mailer/bulk-mailer.tar"

The non-continuation version simply omits the index: key*=value (e.g., filename*=UTF-8''value).
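
To illustrate why `trim: true` matters, here is a rough sketch of classifying a parameter key and decoding an extended value. It is not the parser's actual code: `ParamKeySketch`, `classify/1`, and `decode_extended/1` are hypothetical names, and the sketch ignores keys such as `URL*0*` that combine continuation and encoding. Without `trim: true`, `String.split("filename*", "*")` returns `["filename", ""]`, which looks like a continuation with an empty index.

```elixir
defmodule ParamKeySketch do
  # Classifies a raw parameter key as plain, extended (key*), or a
  # continuation (key*index).
  def classify(raw_key) when is_binary(raw_key) do
    case String.split(raw_key, "*", trim: true) do
      [name, index] ->
        case Integer.parse(index) do
          {n, ""} -> {:continuation, name, n}
          _ -> {:plain, raw_key}
        end

      [name] ->
        if String.ends_with?(raw_key, "*"), do: {:extended, name}, else: {:plain, name}

      _ ->
        {:plain, raw_key}
    end
  end

  # Splits charset'language'value and percent-decodes the value.
  # URI.decode/1 raises ArgumentError on malformed percent sequences.
  def decode_extended(value) when is_binary(value) do
    case String.split(value, "'", parts: 3) do
      [charset, _language, encoded] -> {:ok, charset, URI.decode(encoded)}
      _ -> :error
    end
  end
end
```

For example, `ParamKeySketch.classify("URL*0")` is treated as a continuation, while `ParamKeySketch.classify("filename*")` is treated as a single extended parameter.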

Comment on lines +167 to +177
+defp render_subtypes([{key, value} | subtypes], :rfc_2231) do
   key = String.replace(key, "_", "-")
-  value = encode_header_value(value, :quoted_printable)
-
-  value =
-    if value =~ ~r/[\s()<>@,;:\\<\/\[\]?=]/ do
-      "\"#{value}\""
-    else
-      value
-    end
+  if contains_non_ascii?(value) do
+    value = encode_header_value(value, :rfc_2231)
+    ["#{key}*=UTF-8''#{value}" | render_subtypes(subtypes, :rfc_2231)]
+  else
+    value = maybe_wrap_in_quotes(value)
+    ["#{key}=#{value}" | render_subtypes(subtypes, :rfc_2231)]
+  end
+end

The new handler is here.
The logic of the other handler is untouched.
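
The handler calls contains_non_ascii?/1 and maybe_wrap_in_quotes/1, whose bodies are not shown in this excerpt. The following is a guess at what they might look like, not the PR's actual code: the module name `ParamHelpersSketch` is hypothetical, and the quoting regex is copied from the inline check visible in the diff.

```elixir
defmodule ParamHelpersSketch do
  # True if any byte falls outside the 7-bit ASCII range.
  def contains_non_ascii?(value) when is_binary(value) do
    value
    |> :binary.bin_to_list()
    |> Enum.any?(&(&1 > 127))
  end

  # Wraps the value in double quotes when it contains whitespace or a
  # special character. Embedded double quotes are not escaped here.
  def maybe_wrap_in_quotes(value) when is_binary(value) do
    if value =~ ~r/[\s()<>@,;:\\<\/\[\]?=]/ do
      ~s("#{value}")
    else
      value
    end
  end
end
```

Under these assumptions, an ASCII filename containing a space is quoted, while a plain token such as report.pdf is emitted as-is.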

@romsahel romsahel marked this pull request as ready for review February 4, 2026 15:29