Implement Unicode support by utilizing PosixString and friends #88
hasufell wants to merge 1 commit into haskell:master from
Codec/Archive/Tar/Types.hs
Outdated
fromTarPathToWindowsPath :: MonadThrow m => TarPath -> m WindowsPath
fromTarPathToWindowsPath tarPath = do
  let posix = fromTarPathToPosixPath tarPath
  toWindowsPath posix

-- | We assume UTF-8 on posix and UTF-16 on windows.
toWindowsPath :: MonadThrow m => PosixPath -> m WindowsPath
toWindowsPath posix = do
  str <- PFP.decodeUtf posix
  win <- WFP.encodeUtf str
  -- Normalise every Windows path separator to the default one.
  pure $ WS.map (\c -> if WFP.isPathSeparator c then WFP.pathSeparator else c) win

-- | We assume UTF-8 on posix and UTF-16 on windows.
toPosixPath :: MonadThrow m => WindowsPath -> m PosixPath
toPosixPath win = do
  str <- WFP.decodeUtf win
  posix <- PFP.encodeUtf str
  pure $ PS.map (\c -> if PFP.isPathSeparator c then PFP.pathSeparator else c) posix

-- | We assume UTF-8 on posix and UTF-16 on windows.
toPosixPath' :: MonadThrow m => OsPath -> m PosixPath
#if defined(mingw32_HOST_OS)
toPosixPath' (OsString ws) = toPosixPath ws
#else
toPosixPath' (OsString ps) = pure ps
#endif

-- | We assume UTF-8 on posix and UTF-16 on windows.
fromPosixPath :: MonadThrow m => PosixPath -> m OsPath
#if defined(mingw32_HOST_OS)
-- OsPath is a type synonym; the data constructor is OsString.
fromPosixPath ps = OsString <$> toWindowsPath ps
#else
fromPosixPath ps = pure $ OsString ps
#endif
These are the main conversion functions. Posix filepaths are left untouched. When converting a posix filepath (e.g. one coming from the actual tar archive) to a windows filepath, we assume the posix side is UTF-8 and the windows side is UTF-16.
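To illustrate the posix-to-windows direction described above, here is a minimal standalone sketch. It is not the PR's code: `posixToWindows` is a hypothetical name, the monad is fixed to `Maybe`, and it assumes a `filepath` version that ships `System.OsPath.Posix` and `System.OsPath.Windows` (1.4.100 or later). It also uses `pack`/`unpack` instead of the `WS.map` seen in the diff, purely to keep the import list small.

```haskell
module Main (main) where

import qualified System.OsPath.Posix as PFP
import qualified System.OsPath.Windows as WFP

-- | Hypothetical helper (not part of the PR): reinterpret a posix path as a
-- windows path, assuming the posix bytes are UTF-8. Yields Nothing when the
-- bytes cannot be decoded.
posixToWindows :: PFP.PosixPath -> Maybe WFP.WindowsPath
posixToWindows posix = do
  str <- PFP.decodeUtf posix   -- UTF-8 bytes -> String
  win <- WFP.encodeUtf str     -- String -> UTF-16 code units
  pure (WFP.pack (map normalise (WFP.unpack win)))
  where
    -- Both '/' and '\\' count as separators on windows; use the default one.
    normalise c
      | WFP.isPathSeparator c = WFP.pathSeparator
      | otherwise = c

main :: IO ()
main = do
  posix <- PFP.encodeUtf "dir/sub/file.txt"
  -- Decode the result again only to make it printable.
  print (posixToWindows posix >>= WFP.decodeUtf)
```

Because both modules are available on every platform, the sketch can be tried on a posix machine as well; only the `OsPath` wrappers in the diff depend on the host OS.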
The tar spec obviously demands PosixPath, for which we don't assume an encoding. So all filepaths within the tar archive are posix.
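A small sketch of why the posix side carries no encoding assumption: a `PosixPath` is just a sequence of bytes, so it can hold a name that is not valid UTF-8, and only the decoding step can fail. The name `rawName` is made up for this example, and the monad is fixed to `Maybe` for brevity.

```haskell
module Main (main) where

import qualified System.OsPath.Posix as PFP

-- | A one-byte path holding 0xFF, a byte that never occurs in valid UTF-8.
-- unsafeFromChar truncates the Char to a single byte.
rawName :: PFP.PosixPath
rawName = PFP.pack [PFP.unsafeFromChar '\xFF']

main :: IO ()
main = do
  -- The path itself is a perfectly good value of one byte.
  print (length (PFP.unpack rawName))
  -- Decoding is where the UTF-8 assumption is checked.
  print (PFP.decodeUtf rawName :: Maybe FilePath)
```

This is the reason the conversions to windows paths live in `MonadThrow`, while the posix-to-posix case can stay pure.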
|
It seems this is safe, because hackage-server uses |
4ed6bfc to 35ca6b0
|
Thanks, that's great! My current intention is to release the current
There is also an option for a middle ground: break this PR into two. One to change low-level interfaces to use
@hasufell what do you think? Are you interested in splitting the PR into two phases? That's obviously a massive amount of additional work, which we can avoid by delaying entire |
|
I'm fine with delaying |
35ca6b0 to 62794b9
b66f5cc to d94a988
62794b9 to f366d67
|
@Bodigrim I rebased. It's possible I made a bit of a mess or there are redundant functions.
f3675c2 to 28aa81c
|
I tried hard to avoid |
|
This is still blocked from doing a proper hackage release due to Win32: haskell/win32#226 (comment)
28aa81c to 1f4feb6
|
@hasufell Windows failures seem genuine.
e1e02d8 to bd8cae7
1299d00 to f6ae02c
acaf54d to 26633f4
Fixes #78
TODO: