Skip to content

Add an arena for allocating strings.#2532

Closed
sesse wants to merge 1 commit intoninja-build:masterfrom
sesse:arena
Closed

Add an arena for allocating strings.#2532
sesse wants to merge 1 commit intoninja-build:masterfrom
sesse:arena

Conversation

@sesse
Copy link
Copy Markdown

@sesse sesse commented Nov 25, 2024

This is the patch we took out during the review of PR #2519, now rebased and re-measured. It's much less memory-bad, but still wins ~12%.

Benchmark 1: ./ninja.old -C ~/chromium/src/out/Default content_shell
  Time (mean ± σ):      4.608 s ±  0.014 s    [User: 3.073 s, System: 1.531 s]
  Range (min … max):    4.590 s …  4.634 s    10 runs
 
Benchmark 2: ./ninja -C ~/chromium/src/out/Default content_shell
  Time (mean ± σ):      4.084 s ±  0.009 s    [User: 2.558 s, System: 1.523 s]
  Range (min … max):    4.068 s …  4.097 s    10 runs
 
Summary
  ./ninja -C ~/chromium/src/out/Default content_shell ran
    1.13 ± 0.00 times faster than ./ninja.old -C ~/chromium/src/out/Default content_shell

What do you think? Should we still pursue this?

This reduces the amount of malloc traffic significantly,
speeding up parsing. For a no-op build of Chromium (Linux, Zen 2),
this reduces time spent from 4.61 to 4.08 seconds. However, note
that it also increases RSS from 914 to 937 MB; I haven't looked
deeply into why, but it's reasonable to assume that this is related
to the fact that we no longer merge small strings together (since
they are now immutable).

We still use some time in actually copying the string into the arena,
but it seems this is cheaper than just persisting the file contents
wholesale and pointing into that.
@sesse
Copy link
Copy Markdown
Author

sesse commented Dec 17, 2024

I moved so that the EvalStrings used in Rule are now owned by a smaller arena on the BindingEnv instead of being allocated on the heap and leaked.

@elliotgoodrich
Copy link
Copy Markdown
Contributor

I spent a bit of time removing allocations in trimja and the main changes related to EvalString were to:

  1. Change EvalString so that its puts all segments in a single std::string so its full capacity is retained when cleared (trimja/evalstring.h)
  2. Structure the ManifestParser to reuse the same EvalString throughout parsing (m_storage)

Then the only copies you need to make of EvalString is in Rule::AddBinding, which would be faster as the copy constructor of trimja's EvalString only needs 1 allocation and copy of contiguous data. Otherwise, during parsing the capacity is reused in the single EvalString and eventually we'll stop any allocations.

Maybe changing the entire ManifestParser is too much work, but it looks possible to add an EvalString member variable and reuse this?

@elliotgoodrich
Copy link
Copy Markdown
Contributor

I've submitted 1. here #2696

The second part would need a ReusableStack<EvalString> that doesn't actually destruct elements when clearing and will reuse them later on. This will replace the vector<EvalString> inside ManifestParser to remove the remaining allocations in this area.

@sesse sesse closed this by deleting the head repository Jan 14, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants