human readable portion of addresses are not encoded properly when non-ASCII #227

@tisdall

The normal case works fine:

Mail.build()
|> Mail.put_from({"Tim", "test@example.com"})
|> Mail.render()
|> IO.inspect()
|> Mail.Parsers.RFC2822.parse()
|> IO.inspect()

outputs:

"From: \"Tim\" <test@example.com>\r\n\r\n"
%Mail.Message{
  headers: %{"from" => {"Tim", "test@example.com"}},
  body: "",
  parts: [],
  multipart: false
}

But any non-ASCII character in the name causes issues. Changing "Tim" above to "王" outputs:

"From: =?UTF-8?Q?\"=E7=8E=8B\"?= <test@example.com>\r\n\r\n"
%Mail.Message{
  headers: %{"from" => {"?=", "test@example.com"}},
  body: "",
  parts: [],
  multipart: false
}

Note that the human-readable name comes out as "?=" after the rendered header is run back through this parser.

The pertinent parts of RFC 2047 §5(3):

In this case the set of characters that may be used in a "Q"-encoded
'encoded-word' is restricted to: <upper and lower case ASCII
letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_"
(underscore, ASCII 95.)>.

These are the ONLY locations where an 'encoded-word' may appear. In
particular:

  • An 'encoded-word' MUST NOT appear within a 'quoted-string'.

Essentially, a " character must not appear inside an encoded-word, and an encoded-word must not be placed inside a quoted-string.
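To make the rule concrete, here is a minimal sketch of a check for a single UTF-8 "Q" encoded-word in a phrase context. The module and function names are hypothetical and are not part of the Mail library; the character class is taken directly from the RFC 2047 §5(3) excerpt above, and the sketch assumes the charset and encoding are spelled exactly as "UTF-8" and "Q".

```elixir
defmodule EncodedWordCheck do
  # Hypothetical helper for illustration only; not part of the Mail library.

  # Returns true when `word` is a single UTF-8 "Q" encoded-word whose payload
  # uses only the characters RFC 2047 §5(3) permits in a phrase:
  # ASCII letters, digits, "!", "*", "+", "-", "/", "=", and "_".
  def valid_in_phrase?(word) when is_binary(word) do
    Regex.match?(~r{\A=\?UTF-8\?Q\?[A-Za-z0-9!*+\-/=_]+\?=\z}, word)
  end
end
```

With this check, the payload currently rendered for "王" (which contains literal " characters) is rejected, while =?UTF-8?Q?=E7=8E=8B?= is accepted.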

In this case, the proper encoding would be From: =?UTF-8?Q?=E7=8E=8B?= <test@example.com>, which the parser correctly decodes to headers: %{"from" => {"王", "test@example.com"}}
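As a rough illustration of how the display name could be encoded without the surrounding quotes, here is a sketch of a phrase-safe Q encoder. PhraseQEncoder is a hypothetical module, not the Mail library's actual encoder. It assumes UTF-8 input, hex-escapes every byte outside the RFC 2047 §5(3) safe set, and ignores the 75-character limit on encoded-words.

```elixir
defmodule PhraseQEncoder do
  # Hypothetical helper for illustration only; not part of the Mail library.

  # Wraps the Q-encoded bytes of `name` in a single UTF-8 encoded-word.
  # No quotes are added, since an encoded-word must not sit in a quoted-string.
  def encode(name) when is_binary(name) do
    "=?UTF-8?Q?" <> encode_bytes(name, "") <> "?="
  end

  defp encode_bytes(<<>>, acc), do: acc

  # In Q encoding a space may be written as "_".
  defp encode_bytes(<<?\s, rest::binary>>, acc), do: encode_bytes(rest, acc <> "_")

  # Letters, digits, "!", "*", "+", "-", and "/" pass through unchanged.
  # "=" and "_" have special meaning in Q encoding, so they fall through
  # to the hex-escaping clause below.
  defp encode_bytes(<<c, rest::binary>>, acc)
       when c in ?a..?z or c in ?A..?Z or c in ?0..?9 or c in [?!, ?*, ?+, ?-, ?/] do
    encode_bytes(rest, acc <> <<c>>)
  end

  # Every other byte becomes "=" followed by two uppercase hex digits.
  defp encode_bytes(<<c, rest::binary>>, acc) do
    hex = c |> Integer.to_string(16) |> String.pad_leading(2, "0")
    encode_bytes(rest, acc <> "=" <> hex)
  end
end
```

For example, PhraseQEncoder.encode("王") yields =?UTF-8?Q?=E7=8E=8B?=, matching the encoding given above. A real fix would also need to fold long names into multiple encoded-words.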
