Skip to content

Fix includes normalize win32#2552

Open
digit-google wants to merge 4 commits intoninja-build:masterfrom
digit-google:fix-includes_normalize-win32
Open

Fix includes normalize win32#2552
digit-google wants to merge 4 commits intoninja-build:masterfrom
digit-google:fix-includes_normalize-win32

Conversation

@digit-google
Copy link
Copy Markdown
Contributor

Remove the use of fixed-size buffers in src/includes_normalize-win32.cc in order to support long file paths.

This also requires using GetFullPathNameW() as, surprisingly, GetFullPathNameA() would fail with long paths even when long paths are enabled on the host system running the tests.

To achieve this, introduce ConvertWin32UnicodeToUTF8() and ConvertUTF8ToWin32Unicode() functions to perform conversions properly in src/util.cc.

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Jan 15, 2025

Have you created a bug report for Microsoft? I think we should try to avoid working around bugs in Windows as much as possible, so that the underlying issue can be fixed and everyone benefits.

@digit-google
Copy link
Copy Markdown
Contributor Author

So I actually tried to file a bug, which requires registering a Microsoft account (don't have one, don't want one), and from what I can gather on various Internet comments, essentially goes into a /dev/null bucket most of the time. Even if they fixed this, existing Vista / Windows 10 / whatever installs would never get the fix, so I think we'll have to just work-around this here.

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Feb 7, 2025

I think someone who wants to keep this moving probably has an account and could do it for us. Until then let's merge the #else version so that we are doing the correct thing and would immediately benefit after a Windows update.

@huber-mvtec
Copy link
Copy Markdown

@jhasse After reading the Maximum File Path Limitation documentation again from my understanding this section states that only the *W functions don't have any MAX_PATH limitations, which is in line with @digit-google observation that GetFullPathNameA() fails for long paths even when the system option is set in the registry and during build.

Therefore I don't understand how working with GetFullPathNameA() can fix the problem.

Thanks to the both of you for looking into this.

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Feb 7, 2025

The fact that only the W versions don't have the MAX_PATH limitation is a bug on Microsoft's side which I don't want to work around.

@huber-mvtec
Copy link
Copy Markdown

@jhasse While I agree with you about this being a bug on Microsoft's side (I would say the whole existence of a MAX_PATH in 2025 is a bug), i strongly doubt that this is something they will likely fix. The implications of such a change could be quite big from Microsoft's perspective, because potentially people will depend on the current (documented) behavior.

Just to clarify:

  • You will not merge the PR as is but only the #else part with the *A version, and ninja wont support long path names under Windows for the time being, right? If this is the case, as a compromise would you accept a cmake option to switch the behavior?
  • Would you merge the *W part if there is a bug report at Microsoft?

Thank you.

@digit-google
Copy link
Copy Markdown
Contributor Author

It's even crazier than that:

  • The W functions are the only documented APIs that support long paths, when the feature is enabled and the application has the right manifest attribute. See for yourself: https://learn.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation?tabs=registry

  • The fact that most A functions support them too when the mode is enabled is an implementation detail, that just happens to work most of the time. There is no guarantee that this won't break in the future (though Microsoft is very good at maintaining backwards compatibility). Maybe GetFullPathNameA() does not work because it returns an output path (while most other functions take an input path). Who knows?

  • Ninja relying on A functions for long paths support could be considered a bug in Ninja itself. The correct fix is to only use W functions for all file path-related operations.

I would not be surprised if a bug filed for this "issue" would be closed as "working as intended". We can try, but I don't have high hopes it will work.

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Feb 10, 2025

You will not merge the PR as is but only the #else part with the *A version, and ninja wont support long path names under Windows for the time being, right?

Well, not for the normalization of long paths at least. There is a subset of features where we support long paths already.

If this is the case, as a compromise would you accept a cmake option to switch the behavior?

No, if someone builds from source anyway, they can checkout one of the PRs that use the *W function.

Would you merge the *W part if there is a bug report at Microsoft?

That depends on Microsoft's answer.

There is no guarantee that this won't break in the future (though Microsoft is very good at maintaining backwards compatibility).

The best thing to make them not break it, is to rely on it ;)

Ninja relying on A functions for long paths support could be considered a bug in Ninja itself.

We're using A functions for UTF-8, not for long path support.

@sztomi
Copy link
Copy Markdown

sztomi commented Mar 31, 2025

As a data point, I managed to build a large Qt codebase that previously failed to compile because of moc-generated long relative include paths; it works with the ninja from this branch.

@sztomi
Copy link
Copy Markdown

sztomi commented Mar 31, 2025

@jhasse To also reflect on the discussion above: from its inception, long path support was documented to only be supported with the *W APIs. The fact that it's not ergonomic/surprising that the *A versions don't support it does not make it a bug. It's unlikely that Microsoft will act on (or respond to) this. Converting paths to wide chars is the correct thing to do on Windows and not a workaround.

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Mar 31, 2025

The fact that Microsoft documented their bug does not make it not a bug.

@sztomi
Copy link
Copy Markdown

sztomi commented Apr 1, 2025

@jhasse On the other hand (and I say this with no malice), your judgement does not make it a bug. It is not documented as a bug, there is simply a list of functions on MSDN that support long paths if the setting is enabled in the registry and the application manifest, and they happen to be the *W family of functions only.

Whether or not it is a bug, boils down to what the intention of Microsoft was while introducing long paths. It is a valid choice to only add a new feature to the *W APIs (even if we, not being aware of the entire implementation constraints feel like we would have chosen differently).

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Apr 1, 2025

Well, I disagree.

@sztomi
Copy link
Copy Markdown

sztomi commented Apr 3, 2025

I filed a bug a report: https://aka.ms/AAvdklt (unfortunately there is no way to view it in a browser, it will try to open the awful Feedback Hub app on desktop). I'll update this thread if I get any kind of feedback.

image

@sztomi
Copy link
Copy Markdown

sztomi commented May 6, 2025

No reaction after a month from Microsoft.

@pkasting
Copy link
Copy Markdown

This behavior is intentional on Microsoft's part, for backward compat reasons (the same reason why the A/W split was introduced alongside macros mapping to one or the other to begin with). Their assumption is that the only apps using the A versions are older apps, for whom the APIs must not change behavior, even for arguable "bugfix" reasons. So this will never be implemented in the A versions.

From Chromium's perspective, we no longer care, since we no longer build with ninja. However, as a neutral third party, this is something where ninja should change its behavior (to use the W versions, not because it wants UCS-2 or because it wants this bug fix per se, but because that is what "apps which support more modern conventions such as Unicode and long paths" are assumed to use).

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Sep 22, 2025

Their assumption is that the only apps using the A versions are older apps, for whom the APIs must not change behavior, even for arguable "bugfix" reasons. So this will never be implemented in the A versions.

Even the A versions absolutely change behavior, that's what manifests are for.

UCS-2 was a mistake and that's why the A API is the future (with the introduction of UTF-8). It's true that there was a time where the programming community thought that wide charsets are more modern but that has definitely shifted in recent years.

@pkasting
Copy link
Copy Markdown

In theory, yes, Microsoft could have made these changes to the A APIs under the manifest gating. But in practice that's not how these specific APIs work. The A ones do not change at all, and the W ones change under manifest gates.

I consider this overly cautious, and I understand where you're coming from in principle. It is certainly how some of their APIs work, after all. But it is what it is. I could be missing something about how AppCompat and future planned registry tweaks affect this that makes Microsoft's behavior sane; either way, i think the future is not the A APIs (which are often not so much UTF-8 as ASCII or 1252, but it varies by function), but something entirely different. Until then, the change here seems like the best course to me.

@digit-google digit-google force-pushed the fix-includes_normalize-win32 branch from 0e706ed to 864c812 Compare September 23, 2025 06:28
@pkasting
Copy link
Copy Markdown

(See also https://learn.microsoft.com/en-us/windows/win32/intl/conventions-for-function-prototypes, especially the highlighted note: "New Windows applications should use Unicode...They should be written with generic functions, and should define UNICODE to compile the functions into Unicode functions." The "A" versions are for ANSI, i.e. the old Windows code pages, and may have arbitrary functional gaps; they most definitely do not guarantee UTF-8 input/output will always work.)

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Sep 23, 2025

Of course Microsoft wants to shift the blame on the developers instead of fixing their shit. As I have said here and in other issues/PRs: It's a bug on Microsoft's side which I don't want to work-around.

@pkasting
Copy link
Copy Markdown

Shrug; you can believe what you want. They have stated reasons, they don't consider it buggy but a principled compromise, they believe changing this is worse than not changing it. Regardless of the truth of their position, the only practical impact refusing to accept the PR here has at this point is that ninja is unusable for certain users and projects, punishing users for something they didn't cause and can't control.

As I said, I don't personally care. Chromium has ditched ninja and now Valve has too, so it doesn't affect me.

Cheers,
PK

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Sep 25, 2025

Yes. The discussion drifted a bit away from this PR on to the general issue, let's focus again.

Would it be possible to add a test for this? Also if you would create a PR with the #else version I could merge that right away as mentioned above.

(I also think that putting <> includes over "" includes is completely wrong, did clang-format do this?)

@digit-google
Copy link
Copy Markdown
Contributor Author

digit-google commented Sep 25, 2025

Would it be possible to add a test for this?

Actually, the tests were changed to check the new behavior (but I failed to update the comments in them, so this wasn't obvious, I'll fix it).

Also if you would create a PR with the #else version I could merge that right away as mentioned above.

I assume you mean just removing the use of fixed-size buffers, right? Yes I can, but the function will still fail due to the GetFullPathNameA() failure. Please confirm if this is what you really want (I see little benefit in this).

(I also think that putting <> includes over "" includes is completely wrong, did clang-format do this?)

Indeed, this is what clang-format does with the current .clang-format settings. I try to use git clang-format as much as possible for my CLs so I don't have to deal with style issues myself. If you feel strongly about that style, I suggest we discuss this in a separate PR where we update .clang-format first.

@digit-google digit-google force-pushed the fix-includes_normalize-win32 branch from 864c812 to 842fca0 Compare September 26, 2025 09:41
@digit-google
Copy link
Copy Markdown
Contributor Author

I have refactored the PR into four commits for clarity:

  • The first two remove some bounded buffer usage and removes the using namespacec std, but only fix one partial issue related to long paths.

  • The last two introduce UTF-8 <> Win32 Unicode conversion functions, and fix the problem properly using GetFullPathNameW().

Both adjust the unit-tests accordingly.

@panther7
Copy link
Copy Markdown

@digit-google hi, i tried your ninja (1.14.0.git) and my build always end with full rebuild "warning: premature end of file; recovering"

@digit-google
Copy link
Copy Markdown
Contributor Author

@digit-google hi, i tried your ninja (1.14.0.git) and my build always end with full rebuild "warning: premature end of file; recovering"

That means your .ninja_deps is corrupted, and the truncation that is applied by Ninja in this case doesn't fix it.
I have seen similar issue crop up rarely in my project's build, and diagnosed this to an invalid checksum value (but don't have a good repro case or explanation for this).

Maybe what you are seeing is incidental, what happens if you remove the .ninja_deps file and rebuild in your case?

@panther7
Copy link
Copy Markdown

panther7 commented Sep 30, 2025

@digit-google It's clear build, .ninja_deps is fresh file.

builds commands are:

# fresh repo/dir
# ...
ninja -C out/x64 electron
# ... full build
# ... done
ninja -C out/x64 electron:electron_dist_zip
# ninja: warning: premature end of file; recovering
# ... full build again

@digit-google
Copy link
Copy Markdown
Contributor Author

That doesn't seem related to this PR at all, why are you asking this here?

Also what do you mean exactly by "your ninja" here?

If you are talking about the tip-of-tree of github.com/ninja-build/ninja, then please file a different Github issue with reproduction steps to carry the conversation there.

If you are talking about the Fuchsia Ninja program instead, report that through the Fuchsia public tracker instead. Please do not pollute upstream issues with unrelated content.

@panther7
Copy link
Copy Markdown

panther7 commented Oct 1, 2025

I only say, that ninja v1.31.1 works properly, but "1.14.0.git" from fix-includes_normalize-win32 has bug with full rebuild.

@digit-google
Copy link
Copy Markdown
Contributor Author

Thanks for the clarification. In that case what about 1.14.0.git tip-of-tree (i.e. without the fix-includes_normalize-win32 commits)?

@panther7
Copy link
Copy Markdown

panther7 commented Oct 2, 2025

I tried 1.14.0 from master, and I think, that works properly.
Build 1.14.0 from fix-includes_normalize-win32 repeating build like this.
@digit-google

@bebuch
Copy link
Copy Markdown

bebuch commented Nov 3, 2025

I can confirm that a Ninja compiled from this branch was able to build our CMake code base on Windows 11, which consists of over 6,000 steps, including various third-party tools such as Qt, protobuf/gRPC, ActiveX, etc.

@jhasse Are more changes required? We would be very happy to see this in an offical release.

@bebuch
Copy link
Copy Markdown

bebuch commented Nov 13, 2025

@jhasse ping

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Nov 29, 2025

Do I understand it correctly that @panther7 found a regression?

@digit-google
Copy link
Copy Markdown
Contributor Author

Sorry for the long answer.

Do I understand it correctly that @panther7 found a regression?

This is unclear, @panther7 , can you clarify exactly what you are seeing? As I wrote previously, it looks like you are trigerring another Ninja bug that may be unrelated to this PR. Exact repro steps would thus be very welcomed here.

@lygstate
Copy link
Copy Markdown

lygstate commented Feb 7, 2026

I am also facing this issue, don't know if its another Ninja bug

[0/245] Running ninja for QtWebEngineCore in D:/work/xtal/xtal-wasm/build-msvc/qtwebengine/src/core/Release/AMD64ninja: warning: premature end of file; recovering
ninja: Entering directory `D:/work/xtal/xtal-wasm/build-msvc/qtwebengine/src/core/Release/AMD64'
[215/18340] CXX obj/base/allocator/partition_allocator/src/partition_alloc/allocator_base/process_handle_win.obj
C:\Program Files (x86)\Windows Kits\10\\include\10.0.26100.0\\um\handleapi.h(27): warning C4005: 'INVALID_HANDLE_VALUE': macro redefinition
../../../../../../qt-everywhere-src-6.10.2/qtwebengine/src/3rdparty/chromium/base/allocator/partition_allocator/src\partition_alloc/partition_alloc_base/win/windows_types.h(69): note: see previous definition of 'INVALID_HANDLE_VALUE'
[295/18340] CC obj/third_party/nasm/nasm/outmacho.obj

@lygstate
Copy link
Copy Markdown

lygstate commented Feb 7, 2026

Sorry for the long answer.

Do I understand it correctly that @panther7 found a regression?

This is unclear, @panther7 , can you clarify exactly what you are seeing? As I wrote previously, it looks like you are trigerring another Ninja bug that may be unrelated to this PR. Exact repro steps would thus be very welcomed here.

      unsigned checksum = *reinterpret_cast<unsigned*>(buf + size - 4);
      int expected_id = ~checksum;
      int id = static_cast<int>(nodes_.size());
      if (id != expected_id || node->id() >= 0) {
        read_failed = true;
        break;
      }

anything wrong with this?
id == expected_id but node->id() >=0

I found ../../../../../../qt-everywhere-src-6.10.2/qtwebengine/src/3rdparty/chromium/third_party/protobuf/src/google/protobuf/arenaz_sampler.h appeared twice
in .ninja_deps
so what's the cause?

@lygstate
Copy link
Copy Markdown

lygstate commented Feb 7, 2026

Sorry for the long answer.

Do I understand it correctly that @panther7 found a regression?

This is unclear, @panther7 , can you clarify exactly what you are seeing? As I wrote previously, it looks like you are trigerring another Ninja bug that may be unrelated to this PR. Exact repro steps would thus be very welcomed here.

I am verfied the issue is triggered this MR, I redo this mr at #2723, and now it's fine

Do not use fixed-size buffers in order to better support
long file paths. This does *not* get rid of the failure
described at [1], because GetFullPathNameA() will fail
with long paths, even when the feature is enabled in
the application's manifest.

Insteead, using GetFullPathNameW() is required to completely
fix this issue.

[1] ninja-build#2442 (comment)
These will be used by functions that need to use Unicode
Win32 APIS, like GetFullPathNameW() which supports long paths,
instead of GetFullPathNameA() which does not, even  when
long paths are enabled on the system!
Use GetFullPathNameW() instead of GetFullPathNameA() to ensure that
normalization works for all long paths on Windows. Adjust unit
tests accordingly.

Fixes ninja-build#2442
@digit-google digit-google force-pushed the fix-includes_normalize-win32 branch from 842fca0 to ed72c11 Compare February 20, 2026 21:12
@itavero
Copy link
Copy Markdown

itavero commented Mar 26, 2026

Our team is again facing issues on Windows 11 due to this bug. It's a bit unclear to me who's in the lead now and what needs to happen to get either this PR or #2723 to be merged.

Is there anything we (as users. potential first-time contributors) can do to help?

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Mar 26, 2026

Yes, complain to Microsoft.

@itavero
Copy link
Copy Markdown

itavero commented Mar 26, 2026

Yes, complain to Microsoft.

I thought people tried that already?
I also fail to see how that would help to get one of these PRs merged.

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Mar 26, 2026

In my opinion it wasn't done enough. Merging this PR takes the pressure off, so I'm very reluctant to help the billion dollar company in my free time.

The more I think about it ... fixing this also improves Windows as a development platform in general. Merging this might actually be unethical to me :O

@lygstate
Copy link
Copy Markdown

Yes, complain to Microsoft.

The complain didn't have any good things, can we just deal with it first? as many people suffer from it and cost so many time to deal with it, can we just just keep complain Microsoft and still get it working

@lygstate
Copy link
Copy Markdown

Our team is again facing issues on Windows 11 due to this bug. It's a bit unclear to me who's in the lead now and what needs to happen to get either this PR or #2723 to be merged.

Is there anything we (as users. potential first-time contributors) can do to help?

#2723 Solved this issue:
#2552 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants