Skip to content

Add flat/cumulative function summary to handle recursion-heavy profiles#430

Open
huyhoang171106 wants to merge 1 commit intojoerick:mainfrom
huyhoang171106:feat/add-flat-cumulative-function-summary-to-
Open

Add flat/cumulative function summary to handle recursion-heavy profiles#430
huyhoang171106 wants to merge 1 commit intojoerick:mainfrom
huyhoang171106:feat/add-flat-cumulative-function-summary-to-

Conversation

@huyhoang171106
Copy link
Copy Markdown

Summary

The issue is not a crash or incorrect sampling; it’s a usability gap in how results are presented for recursive call trees. Pyinstrument currently emphasizes hierarchical (tree) output, which makes recursive stacks visually deep and makes it hard to estimate total time spent in a function across many recursive invocations, or in callees beneath those recursive frames. The missing capability is an aggregated “top functions” view (flat/self time and cumulative/inclusive time by function identity), similar to pprof’s top table.

Files changed

  • pyinstrument/renderers/console.py (modified)

Testing

  • Not run in this environment.

Closes #293

…profiles

Signed-off-by: Nguyen Huy Hoang <181364121+huyhoang171106@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 26, 2026 16:29
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new console rendering mode intended to make recursion-heavy profiles easier to interpret by providing an aggregated “top functions” table (self vs cumulative/total time) instead of the deep call-tree view.

Changes:

  • Add a show_top option to ConsoleRenderer and enforce mutual exclusion with flat.
  • Route rendering through a new render_top_summary() function when show_top=True.
  • Add time formatting helper _format_top_time() for seconds vs percent-of-total output.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 33 to +67
@@ -41,6 +42,7 @@ def __init__(
:param unicode: Use unicode, like box-drawing characters in the output.
:param color: Enable color support, using ANSI color sequences.
:param flat: Display a flat profile instead of a call graph.
:param show_top: Display an aggregated top-functions table.
:param time: How to display the duration of each frame - ``'seconds'`` or ``'percent_of_total'``
:param flat_time: Show ``'self'`` time or ``'total'`` time (including children) in flat profile.
:param short_mode: Display a short version of the output.
@@ -52,13 +54,17 @@ def __init__(
self.unicode = unicode
self.color = color
self.flat = flat
self.show_top = show_top
self.time = time
self.flat_time = flat_time
self.short_mode = short_mode

if self.flat and self.timeline:
raise Renderer.MisconfigurationError("Cannot use timeline and flat options together.")

if self.flat and self.show_top:
raise Renderer.MisconfigurationError("Cannot use flat and show_top options together.")

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new show_top option was added to ConsoleRenderer, but it isn’t currently exposed through Profiler.print() / Profiler.output_text() (and the CLI option plumbing appears absent). This will also break test_profiler_convenience_methods_have_all_options_available (it asserts Profiler convenience methods mirror renderer signatures). Add show_top (with matching default/annotation) to the Profiler convenience methods and any option parsing that constructs a ConsoleRenderer.

Copilot uses AI. Check for mistakes.
Comment on lines +190 to +205
sorted_rows = sorted(
aggregated.items(), key=lambda item: (item[1]["total_time"], item[1]["self_time"]), reverse=True
)

if not sorted_rows:
return f"{indent}No samples were recorded.\n"

header = f"{indent}{'Self':>10} {'Total':>10} Function"
lines = [header]

for (function, location), times in sorted_rows:
self_time = self._format_top_time(times["self_time"], frame.time, precision)
total_time = self._format_top_time(times["total_time"], frame.time, precision)
label = function if not location else f"{function} ({location})"
lines.append(f"{indent}{self_time:>10} {total_time:>10} {label}")

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

render_top_summary() prints every aggregated row unconditionally. For large profiles this can produce extremely large outputs, and it also ignores the existing show_all behavior used by render_frame_flat() (0.1% threshold). Consider adding a default filter/limit (e.g., drop entries below a % threshold or show top N) and let show_all=True disable that filter.

Copilot uses AI. Check for mistakes.
Comment on lines +82 to 86
if self.show_top:
result += self.render_top_summary(self.root_frame, precision=precision, indent=indent)
elif self.flat:
result += self.render_frame_flat(self.root_frame, precision=precision)
else:
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new show_top rendering path isn’t covered by tests (existing tests cover empty profiles, timeline, show_all, and flat modes). Add a renderer test that exercises ConsoleRenderer(show_top=True) and asserts the output format/ordering (at least that it doesn’t crash and includes the expected header).

Copilot uses AI. Check for mistakes.
while frames:
current = frames.pop()
function = current.function or "<unknown>"
location = current.code_position_short or ""
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frame.code_position_short is a method, but here it’s being used as an attribute (current.code_position_short). This will make location a bound-method object (always truthy), producing unusable keys/labels (e.g., printing the method repr) and breaking aggregation. Call current.code_position_short() instead.

Suggested change
location = current.code_position_short or ""
location = current.code_position_short() or ""

Copilot uses AI. Check for mistakes.
Comment on lines +184 to +187

aggregated[key]["self_time"] += current.total_self_time
aggregated[key]["total_time"] += current.time

Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

total_time is being aggregated by summing current.time for every frame occurrence. For recursive call stacks, inclusive times overlap across nested invocations, so this will significantly over-count (and can exceed the root total), making the new “top functions” view misleading for the recursion-heavy use case this PR targets. Consider defining “total” in a way that avoids double-counting recursion (e.g., subtract time from recursive children with the same key, or compute cumulative time on a call graph with cycle handling, similar to pprof).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve the experience with recursive functions

2 participants