osquerybeat: add optional query profiling for scheduled and live runs#49514

Open
marc-gr wants to merge 19 commits into elastic:main from marc-gr:feat/osquery-profiles

@marc-gr marc-gr commented Mar 17, 2026

Proposed commit message

Add optional query profiling for osquerybeat. Scheduled queries can set profile: true in config; live (ad-hoc) actions can send data.profile: true. When enabled, metrics (utilization, duration, memory, CPU time, etc.) are collected from osquery_schedule (scheduled queries) or process snapshots (live queries) and published to the osquery_manager.query_profile datastream. The integration must supply an input stream with that dataset for events to be published. The diagnostic hook scheduled_query_profiles exposes recent scheduled-query profiles as JSON.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

Disruptive User Impact

None. Profiling is opt-in; no new stream required unless profile events are desired.

How to test this PR locally

  • go test ./x-pack/osquerybeat/... for unit tests.
  • Run osquerybeat with a policy that enables profile: true on one or more scheduled queries and (optionally) an input stream with dataset osquery_manager.query_profile; trigger a live query with data.profile: true to exercise live profiling.

Related issues

Use cases

  • Operators enable profiling on selected scheduled queries to monitor cost (CPU, memory, duration) over time.
  • Live queries can request a one-off profile via action data.profile: true to inspect resource usage for a given SQL statement.

marc-gr added 8 commits March 17, 2026 10:49
- Add DefaultQueryProfileDataset and QueryProfileDatastream() for
  osquery_manager.query_profile events
- Add optional Profile *bool to config.Query (schedule/packs)
- Add optional Profile *bool to StreamConfig for deprecated streams path
- Add Profile bool to Action; parse data.profile (bool) in FromMap
- Reject invalid profile type with ErrActionRequest
- Tests: valid profile flag, invalid profile type
- queryProfiler: profileScheduledQuery (osquery_schedule), state for deltas
- Handle execution counter reset (e.g. osqueryd restart) by treating as single run
- collectRuntimeSnapshot: processes JOIN osquery_info for live profiling
- buildLiveQueryProfile: utilization, duration, memory, fds, exit code
- scheduledProfilesDiagnosticsWithResolver for diagnostic hook JSON
- Helpers: toInt64, toString, utilizationFromMillis, millisToSeconds
- Initialize queryProfiles map in NewConfigPlugin
- queryProfiles map and LookupQueryProfile(name) for scheduled queries
- set(): populate queryProfiles from Schedule, Packs, and Streams (Profile on Query/StreamConfig)
- TestSetScheduledQueryProfileFlag: Schedule with profile enabled
- queryProfileClient for dataset osquery_manager.query_profile
- Configure: attach client when input has query_profile dataset; document requirement
- PublishQueryProfile: type osquery_profile, optional query/action/response ids
- Close queryProfileClient and actionResponsesClient
- processorsForInputConfig: document empty Processors / Fleet behavior
Beater:
- queryProfiler instance; registerDiagnosticHooks (scheduled_query_profiles)
- scheduledQueryProfilesDiagTimeout constant
- setDiagnosticsQueryExecutor/Resolver for diagnostic hook
- handleQueryResult: profile scheduled queries when LookupQueryProfile; PublishQueryProfile
- Fix duplicate runOsquery error log

Action handler:
- namespace() refactor; executeQuery takes index
- executeQuery: when Profile, collectRuntimeSnapshot before/after, buildLiveQueryProfile, PublishQueryProfile
- publisher interface: PublishQueryProfile
- Debug log when profile requested but pre-snapshot missing

Tests: mock PublishQueryProfile; TestOsquerybeatRegistersScheduledProfilesDiagnostics
- toInt64, toString, utilizationFromMillis, millisToSeconds
- buildLiveQueryProfile (success and error exit)
- scheduledProfileFromScheduleRow, diagnosticsErrorJSON
- profileScheduledQuery: first run, execution reset (osquery restart)
- scheduledProfilesDiagnosticsWithResolver nil executor
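
The delta handling for scheduled queries described in these commits can be sketched roughly as follows. This is only an illustration under assumptions: the type and function names are hypothetical, the fields stand for cumulative counters read from an osquery_schedule row, and the reading of "treat as single run" (report one run and use the current totals) is a guess at the intended behavior, not something the conversation confirms.

```go
package main

import "fmt"

// scheduleCounters holds cumulative values as they might be read from one
// osquery_schedule row. The field names are illustrative.
type scheduleCounters struct {
	Executions int64 // total runs since osqueryd started
	WallTimeMs int64 // total wall time across those runs, in milliseconds
}

// delta is the per-interval view derived from two cumulative readings.
type delta struct {
	Runs       int64
	WallTimeMs int64
}

// diffCounters returns what changed between prev and cur. When there is no
// previous reading, or the execution counter went backwards (for example
// after an osqueryd restart), the current reading is treated as a single
// run. That fallback is an assumption made for this sketch.
func diffCounters(prev *scheduleCounters, cur scheduleCounters) delta {
	if prev == nil || cur.Executions < prev.Executions {
		return delta{Runs: 1, WallTimeMs: cur.WallTimeMs}
	}
	return delta{
		Runs:       cur.Executions - prev.Executions,
		WallTimeMs: cur.WallTimeMs - prev.WallTimeMs,
	}
}

func main() {
	prev := scheduleCounters{Executions: 10, WallTimeMs: 500}
	// Normal case: counters only grew.
	fmt.Println(diffCounters(&prev, scheduleCounters{Executions: 12, WallTimeMs: 620}))
	// Counter reset: the new total is lower than the stored one.
	fmt.Println(diffCounters(&prev, scheduleCounters{Executions: 1, WallTimeMs: 40}))
}
```

Without the reset check, a restart would produce negative deltas, which is presumably why the commit calls the case out explicitly.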
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Mar 17, 2026
@github-actions

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

mergify bot commented Mar 17, 2026

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feat/osquery-profiles upstream/feat/osquery-profiles
git merge upstream/main
git push upstream feat/osquery-profiles

mergify bot commented Mar 17, 2026

This pull request does not have a backport label.
If this is a bug or security fix, could you label this PR @marc-gr? 🙏.
For such, you'll need to label your PR with:

  • The upcoming major version of the Elastic Stack
  • The upcoming minor version of the Elastic Stack (if you're not pushing a breaking change)

To fix up this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-8./d is the label to automatically backport to the 8./d branch. /d is the digit
  • backport-active-all is the label that automatically backports to all active branches.
  • backport-active-8 is the label that automatically backports to all active minor branches for the 8 major.
  • backport-active-9 is the label that automatically backports to all active minor branches for the 9 major.

@marc-gr marc-gr added the Team:Security-Windows Platform Windows Platform Team in Security Solution label Mar 17, 2026
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Mar 17, 2026
@marc-gr marc-gr added needs_team Indicates that the issue/PR needs a Team:* label backport-skip Skip notification from the automated backport with mergify labels Mar 17, 2026
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Mar 17, 2026
Resolve conflicts:
- osquerybeat.go: keep upstream osqueryInstallConfig and executablePath, add qp (queryProfiler)
- osquerybeat_status_test.go: keep both TestOsquerybeatRegistersScheduledProfilesDiagnostics and TestOsquerybeatStatusReporting_RuntimeResolutionFailure
@marc-gr marc-gr marked this pull request as ready for review March 17, 2026 10:17
@marc-gr marc-gr requested a review from a team as a code owner March 17, 2026 10:17
@elasticmachine

Pinging @elastic/sec-windows-platform (Team:Security-Windows Platform)

marc-gr commented Mar 17, 2026

/ai

coderabbitai bot commented Mar 17, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Adds per-query profiling to osquerybeat: a changelog fragment; new Profile flags on Action, StreamConfig, and internal Query; ConfigPlugin support with LookupQueryProfile; a new internal query profiler that captures runtime snapshots, builds scheduled and live profile payloads, and exposes the scheduled_query_profiles diagnostics hook; a PublishQueryProfile API and dedicated query_profile publisher client plus datastream helpers; wiring in action_handler and osquerybeat to collect and publish live and scheduled profiles; and unit tests covering profiler, diagnostics, config, action parsing, and publishing.

Avoid calling collectRuntimeSnapshot for the after snapshot when
beforeReady is false; we would skip publishing the profile anyway.
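
A minimal sketch of the control flow this commit describes, assuming a collector that reports whether a snapshot was available. The snapshot and liveProfile types, their fields, and profileRun are hypothetical and stand in for collectRuntimeSnapshot and buildLiveQueryProfile, whose real signatures are not shown here.

```go
package main

import "fmt"

// snapshot is a hypothetical stand-in for the runtime snapshot taken around
// a live query.
type snapshot struct {
	CPUMillis int64
	RSSBytes  int64
}

// liveProfile is a hypothetical, reduced profile payload.
type liveProfile struct {
	CPUMillis int64
	RSSBytes  int64
}

// profileRun executes run between two snapshots. If the "before" snapshot
// was not available, the "after" snapshot is never collected, because no
// profile could be built from it anyway.
func profileRun(collect func() (snapshot, bool), run func() error) (*liveProfile, error) {
	before, beforeReady := collect()
	err := run()
	if !beforeReady {
		return nil, err
	}
	after, ok := collect()
	if !ok {
		return nil, err
	}
	return &liveProfile{
		CPUMillis: after.CPUMillis - before.CPUMillis,
		RSSBytes:  after.RSSBytes,
	}, err
}

func main() {
	calls := 0
	unavailable := func() (snapshot, bool) {
		calls++
		return snapshot{}, false
	}
	p, err := profileRun(unavailable, func() error { return nil })
	// calls counts how many snapshots were attempted.
	fmt.Println(p == nil, err, calls)
}
```

The query itself still runs and its error is still returned in every branch; only the profiling work is skipped.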
@github-actions

I reviewed this PR (/ai) and did not find any actionable issues that meet a blocking severity threshold.

I also ran:

go test ./x-pack/osquerybeat/...

All tests passed in this environment.

Given the current diff and discussion context, I don’t have changes to request.


Warning

⚠️ Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • osquery.io

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "osquery.io"

See Network Configuration for more information.

marc-gr added 2 commits March 17, 2026 12:06
- Add Profile bool to QueryInfo; remove separate queryProfiles map.
  LookupQueryProfile checks newQueryInfoMap then queryInfoMap so the
  flag is up to date immediately after Set().
- Use plain bool for Profile in config.Query and StreamConfig (default false).
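
The lookup order in this commit could look roughly like the sketch below. The map names newQueryInfoMap and queryInfoMap and the Profile field come from the commit message; the struct layout, the lock, and the lower-case lookupQueryProfile method are assumptions made to keep the example self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// queryInfo carries per-query settings; only Profile matters for this sketch.
type queryInfo struct {
	Query   string
	Profile bool
}

// configPlugin is a reduced, hypothetical version of the config plugin. It
// keeps the active map and a pending map filled in by Set().
type configPlugin struct {
	mu              sync.RWMutex
	queryInfoMap    map[string]queryInfo // currently active configuration
	newQueryInfoMap map[string]queryInfo // configuration from the latest Set()
}

// lookupQueryProfile checks the pending map first so that a flag changed by
// Set() is visible immediately, then falls back to the active map.
func (p *configPlugin) lookupQueryProfile(name string) bool {
	p.mu.RLock()
	defer p.mu.RUnlock()
	if qi, ok := p.newQueryInfoMap[name]; ok {
		return qi.Profile
	}
	if qi, ok := p.queryInfoMap[name]; ok {
		return qi.Profile
	}
	return false
}

func main() {
	p := &configPlugin{
		queryInfoMap:    map[string]queryInfo{"uptime": {Query: "select * from uptime", Profile: false}},
		newQueryInfoMap: map[string]queryInfo{"uptime": {Query: "select * from uptime", Profile: true}},
	}
	fmt.Println(p.lookupQueryProfile("uptime"))
	fmt.Println(p.lookupQueryProfile("unknown"))
}
```

Keeping the flag on the per-query info, instead of in a separate map, means there is one fewer structure to keep in sync when the configuration changes.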
mergify bot commented Mar 17, 2026

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b feat/osquery-profiles upstream/feat/osquery-profiles
git merge upstream/main
git push upstream feat/osquery-profiles

marc-gr added 4 commits March 17, 2026 15:13
Remove live rows/fds and scheduled last_executed/profile_source fields.
Persist latest live query profiles per query, include them in diagnostics,
and add elastic_options settings to control storage and retention.
@marc-gr marc-gr changed the title osquerybeat: add optional query profiling for scheduled and live runs osquerybeat: store live query profiles Mar 17, 2026
@marc-gr marc-gr changed the title osquerybeat: store live query profiles osquerybeat: add optional query profiling for scheduled and live runs Mar 18, 2026
marc-gr added 3 commits March 18, 2026 09:59
Add a test for live profile file eviction and ensure status tests
initialize beat paths before constructing the beater.
Initialize beat paths in status reporting tests that construct a beater
so New() can resolve path.data without panicking.
Labels

  • backport-skip (Skip notification from the automated backport with mergify)
  • enhancement
  • Osquerybeat
  • Team:Security-Windows Platform (Windows Platform Team in Security Solution)

3 participants