Skip to content

Speed up methods in State#2695

Open
elliotgoodrich wants to merge 1 commit intoninja-build:masterfrom
elliotgoodrich:speed-up-add-methods-in-state
Open

Speed up methods in State#2695
elliotgoodrich wants to merge 1 commit intoninja-build:masterfrom
elliotgoodrich:speed-up-add-methods-in-state

Conversation

@elliotgoodrich
Copy link
Copy Markdown
Contributor

Remove the double-lookup inside paths_ in State::GetNode in the case where we are first calling this for a particular path.

Add overloads of GetNode that can take ownership of a std::string, which is the common case when calling from ManifestParser and we have a std::string returned by the Evaluate method going to waste.

To avoid writing an overload of AddIn, AddOut, and AddValidation taking a std::string&& by changing the signature to instead take a Node * and require caller to get this through State::GetNode.

CanonicalizePath(&path, &slash_bits);
std::string default_err;
if (!state_->AddDefault(path, &default_err))
if (!state_->AddDefault(std::move(path), &default_err))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: AddDefault() takes a StringPiece as its first argument, so this std::move() doesn't do anything. No need to change its signature given that this is so rarely called in practice.

const bool isNew = result.second;
if (isNew) {
assert(node == nullptr);
node = new Node(std::move(path), slash_bits);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Node() constructor takes a const std::string& as its first argument, so this will still perform a copy :-/ But it should be easy to fix by changing the constructor definition in graph.h with something that looks like:

Mode(std::string path, ...) : path_(std::move(path)), .... {}

This would also benefit the Node(path.AsString(), ...) call on line 105 above.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, an additionnal Nost(std::string&&) constructor works too.

src/state.cc Outdated
if (isNew) {
assert(node == nullptr);
node = new Node(path.AsString(), slash_bits);
// Update the key to refer to the long-lived string in the Node
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh this is smart but also a bit subtle. Can I ask you to add a comment explaining here that this cannot break the hash map's internals because the key type's equality and hash operations only care about the content of the StringPiece-pointed data, not the address it lives at. I think that would help for long-term maintenance.

Copy link
Copy Markdown
Contributor

@digit-google digit-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

@elliotgoodrich elliotgoodrich force-pushed the speed-up-add-methods-in-state branch from cf8fb15 to 91f5242 Compare November 27, 2025 21:47
Remove the double-lookup inside `paths_` in `State::GetNode` in the case
where we are first calling this for a particular `path`.

Add overloads of `GetNode` that can take ownership of a `std::string`,
which is the common case when calling from `ManifestParser` and we
have a `std::string` returned by the `Evaluate` method going to waste.

To avoid writing an overload of `AddIn`, `AddOut`, and `AddValidation`
taking a `std::string&&` by changing the signature to instead take a
`Node *` and require caller to get this through `State::GetNode`.
@elliotgoodrich elliotgoodrich force-pushed the speed-up-add-methods-in-state branch from 91f5242 to 6c55f95 Compare November 27, 2025 21:49
@digit-google
Copy link
Copy Markdown
Contributor

@jhasse please take a look

@moritzx22
Copy link
Copy Markdown
Contributor

To avoid writing an overload of AddIn, AddOut, and AddValidation taking a std::string&& by changing the signature to instead take a Node * and require caller to get this through State::GetNode.

Changing the signatures of AddIn, AddOut, and AddValidation does not preserve the original semantics of these functions. Moving this responsibility to the caller would alter behavior and require widespread changes throughout the codebase, including updates to the unit tests. The scope of this PR should remain a pure speed optimization, without introducing semantic changes.

A simpler and semantically correct alternative is to keep the existing behavior and update the signatures to accept rvalue strings:

void AddIn(Edge* edge, std::string&& path, uint64_t slash_bits);
bool AddOut(Edge* edge, std::string&& path, uint64_t slash_bits, std::string* err);
void AddValidation(Edge* edge, std::string&& path, uint64_t slash_bits);

This preserves the current beaviour of the functions, avoids unnecessary copies, and does not require overloads for lvalue semantics (which are not needed here). The impact on existing call sites is minimal, and the unit tests can remain unchanged. I also do not see any performance penalty compared to the currently proposed solution.

Conflicts with PR #2713

As discussed in #2713 (comment), PR 2713 changes the semantics of GetNode from a two‑step creation process to a single‑step process. After that change, the following comment is no longer guaranteed to be true:

ninja/src/state.h

Lines 110 to 113 in cc60300

/// Add input / output / validation nodes to a given edge. This also
/// ensures that the generated_by_dep_loader() flag for all these nodes
/// is set to false, to indicate that they come from the input manifest.
void AddIn(Edge* edge, StringPiece path, uint64_t slash_bits);

Because GetNode would no longer guarantee that a node originates from the manifest, shifting responsibility to the caller would break the assumptions that AddIn, AddOut, and AddValidation currently rely on.

#include <queue>
#include <set>
#include <string>
#include <utility>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s not clear what requires this include.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, any use of std::move() requires it. In practice, most other C++ standard library headers will include it anyway due to their template-based implementation, one could consider this an implementation detail. This looks fine to me.

@elliotgoodrich
Copy link
Copy Markdown
Contributor Author

Changing the signatures of AddIn, AddOut, and AddValidation does not preserve the original semantics of these functions. Moving this responsibility to the caller would alter behavior and require widespread changes throughout the codebase, including updates to the unit tests. The scope of this PR should remain a pure speed optimization, without introducing semantic changes.

All necessary changes are in this PR and there are 10 call sites that need to be updated - including all those in unit tests. I don't think that is particularly widespread.

Conflicts with PR #2713

My PR has been open and approved for 2 months now. If another change is landed before this one then I will update it no problem, but before that point it doesn't make sense for me to guess what PRs are going to land and to make pre-emptive changes to avoid conflicts.

@elliotgoodrich
Copy link
Copy Markdown
Contributor Author

@jhasse @digit-google is this okay to land?

@jhasse
Copy link
Copy Markdown
Collaborator

jhasse commented Feb 7, 2026

I don't have time to look at this.

@elliotgoodrich
Copy link
Copy Markdown
Contributor Author

Okay thank you - is this change then blocked until further notice?

paths_[node->path()] = node;
const std::pair<Paths::iterator, bool> result = paths_.try_emplace(path, nullptr);
Node*& node = result.first->second;
const bool isNew = result.second;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: prefer snake_case for variable names in the Ninja source code, e.g. is_new

Copy link
Copy Markdown
Contributor

@digit-google digit-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good to me, @moritzx22 you may need to rebase your PR on top of this one once this lands, but this shouldn't be too hard. @jhasse can you take a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants