Skip conversion if charset is the same#1
Conversation
|
Hi @pupaxxo -- wouldn't that be better a few lines down after the calls to getMbCharset to normalize them? For instance the target could be "utf-8" and the source "UTF8" according to content-type, so calling that a few lines down may catch more occurrences of what you're trying to prevent... although would involve looking up charsets, etc... |
|
Yes, you are right. It surely better to normalize them before checking the two charset. I'll update the branch |
|
I'm using the web ui, can you squash the two commits? Thanks. |
|
Aah, sorry it wasn't that simple either. This now fails a test. It's because getMbCharset can return false if the charset is not supported by mb_* (not listed in mb_list_encodings). That would make the conditional more complicated if you wanted to check against iconv charsets as well. To be honest, not sure it's worth it either -- in the end the mb_* call and iconv call probably just return the string in the same way if the source/target are the same, no? The other option would be to: write a more complicated conditional, or: incorporate the checks into the existing conditions, and write an additional condition for the iconv call, or: go back to the first commit, but use getNormalizedCharset to compare the two. That would result in multiple calls to getNormalizedCharset though because it's called in getMbCharset/getIconvCharset, and so could likely negate any benefit from not calling mb_convert*/iconv. |
No description provided.