osquerybeat: add optional query profiling for scheduled and live runs #49514
marc-gr wants to merge 19 commits into elastic:main
- Add DefaultQueryProfileDataset and QueryProfileDatastream() for osquery_manager.query_profile events
- Add optional Profile *bool to config.Query (schedule/packs)
- Add optional Profile *bool to StreamConfig for deprecated streams path
- Add Profile bool to Action; parse data.profile (bool) in FromMap
- Reject invalid profile type with ErrActionRequest
- Tests: valid profile flag, invalid profile type
- queryProfiler: profileScheduledQuery (osquery_schedule), state for deltas
- Handle execution counter reset (e.g. osqueryd restart) by treating as single run
- collectRuntimeSnapshot: processes JOIN osquery_info for live profiling
- buildLiveQueryProfile: utilization, duration, memory, fds, exit code
- scheduledProfilesDiagnosticsWithResolver for diagnostic hook JSON
- Helpers: toInt64, toString, utilizationFromMillis, millisToSeconds
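The delta and counter-reset handling described above might look roughly like the following. This is a sketch under assumptions: the `counters` type, its fields, and `profileDelta` are hypothetical, and returning the current totals with a run count of 1 is only one reading of "treating as single run".

```go
package main

import "fmt"

// counters holds cumulative values for one scheduled query, in the spirit of
// a row from osquery_schedule. The field selection here is illustrative.
type counters struct {
	Executions int64
	WallTimeMs int64
}

// profileDelta returns how many runs happened since prev and the counter
// growth over those runs.
//   - First observation (havePrev == false): report the cumulative totals.
//   - Counter went backwards (e.g. osqueryd restarted): assume the current
//     totals describe a single run, as the commit message suggests.
func profileDelta(prev, cur counters, havePrev bool) (int64, counters) {
	if !havePrev {
		return cur.Executions, cur
	}
	if cur.Executions < prev.Executions {
		return 1, cur
	}
	return cur.Executions - prev.Executions, counters{
		Executions: cur.Executions - prev.Executions,
		WallTimeMs: cur.WallTimeMs - prev.WallTimeMs,
	}
}

func main() {
	prev := counters{Executions: 5, WallTimeMs: 500}

	runs, d := profileDelta(prev, counters{Executions: 7, WallTimeMs: 900}, true)
	fmt.Println("normal:", runs, d.WallTimeMs)

	runs, d = profileDelta(prev, counters{Executions: 1, WallTimeMs: 40}, true)
	fmt.Println("after reset:", runs, d.WallTimeMs)
}
```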
- Initialize queryProfiles map in NewConfigPlugin
- queryProfiles map and LookupQueryProfile(name) for scheduled queries
- set(): populate queryProfiles from Schedule, Packs, and Streams (Profile on Query/StreamConfig)
- TestSetScheduledQueryProfileFlag: Schedule with profile enabled
- queryProfileClient for dataset osquery_manager.query_profile
- Configure: attach client when input has query_profile dataset; document requirement
- PublishQueryProfile: type osquery_profile, optional query/action/response ids
- Close queryProfileClient and actionResponsesClient
- processorsForInputConfig: document empty Processors / Fleet behavior
Beater:
- queryProfiler instance; registerDiagnosticHooks (scheduled_query_profiles)
- scheduledQueryProfilesDiagTimeout constant
- setDiagnosticsQueryExecutor/Resolver for diagnostic hook
- handleQueryResult: profile scheduled queries when LookupQueryProfile; PublishQueryProfile
- Fix duplicate runOsquery error log

Action handler:
- namespace() refactor; executeQuery takes index
- executeQuery: when Profile, collectRuntimeSnapshot before/after, buildLiveQueryProfile, PublishQueryProfile
- publisher interface: PublishQueryProfile
- Debug log when profile requested but pre-snapshot missing

Tests:
- mock PublishQueryProfile; TestOsquerybeatRegistersScheduledProfilesDiagnostics
- toInt64, toString, utilizationFromMillis, millisToSeconds
- buildLiveQueryProfile (success and error exit)
- scheduledProfileFromScheduleRow, diagnosticsErrorJSON
- profileScheduledQuery: first run, execution reset (osquery restart)
- scheduledProfilesDiagnosticsWithResolver nil executor
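For readers unfamiliar with why such helpers are needed: query result values can arrive as strings or as numeric types, so conversion has to be tolerant. The sketch below reuses the helper names from the commit message, but the signatures and formulas are assumptions, not the PR's code. In particular, defining utilization as CPU time over wall time in percent is a guess.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// toInt64 converts a loosely typed result value to int64 (assumed signature).
// The second return value reports whether the conversion succeeded.
func toInt64(v interface{}) (int64, bool) {
	switch t := v.(type) {
	case int64:
		return t, true
	case int:
		return int64(t), true
	case float64:
		return int64(t), true // truncates any fractional part
	case string:
		n, err := strconv.ParseInt(strings.TrimSpace(t), 10, 64)
		return n, err == nil
	default:
		return 0, false
	}
}

// millisToSeconds converts a millisecond count to fractional seconds.
func millisToSeconds(ms int64) float64 {
	return float64(ms) / 1000
}

// utilizationFromMillis returns CPU time as a percentage of wall time
// (assumed definition). A non-positive wall time yields 0 to avoid
// dividing by zero.
func utilizationFromMillis(cpuMs, wallMs int64) float64 {
	if wallMs <= 0 {
		return 0
	}
	return float64(cpuMs) / float64(wallMs) * 100
}

func main() {
	n, ok := toInt64(" 1500 ")
	fmt.Println(n, ok, millisToSeconds(n), utilizationFromMillis(250, 1000))
}
```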
This pull request is now in conflicts. Could you fix it? 🙏
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
Resolve conflicts:
- osquerybeat.go: keep upstream osqueryInstallConfig and executablePath, add qp (queryProfiler)
- osquerybeat_status_test.go: keep both TestOsquerybeatRegistersScheduledProfilesDiagnostics and TestOsquerybeatStatusReporting_RuntimeResolutionFailure
Pinging @elastic/sec-windows-platform (Team:Security-Windows Platform)
/ai
Walkthrough
Adds per-query profiling to osquerybeat: a changelog fragment; new Profile flags on Action, StreamConfig, and internal Query; ConfigPlugin support with LookupQueryProfile; a new internal query profiler that captures runtime snapshots, builds scheduled and live profile payloads, and exposes scheduled_profiles diagnostics hooks; a PublishQueryProfile API and dedicated query_profile publisher client plus datastream helpers; wiring in action_handler and osquerybeat to collect and publish live and scheduled profiles; and unit tests covering profiler, diagnostics, config, action parsing, and publishing.
Avoid calling collectRuntimeSnapshot for the after snapshot when beforeReady is false; we would skip publishing the profile anyway.
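The control flow this commit describes can be sketched as follows. The names `snapshot`, `runProfiled`, and the callback signatures are hypothetical and only illustrate the "skip the second snapshot when the first one failed" idea, not the actual executeQuery code.

```go
package main

import "fmt"

// snapshot is a placeholder for whatever collectRuntimeSnapshot returns.
type snapshot struct {
	cpuMs    int64
	rssBytes int64
}

// runProfiled runs a query between two snapshots. If the "before" snapshot
// could not be taken, the "after" snapshot is skipped entirely, because no
// profile would be published without a baseline anyway.
func runProfiled(
	collect func() (snapshot, bool),
	run func() error,
	publish func(before, after snapshot),
) error {
	before, beforeReady := collect()
	err := run()
	if !beforeReady {
		return err // no baseline: avoid the extra snapshot query
	}
	if after, ok := collect(); ok {
		publish(before, after)
	}
	return err
}

func main() {
	calls := 0
	collect := func() (snapshot, bool) {
		calls++
		return snapshot{}, false // simulate a failed pre-snapshot
	}
	err := runProfiled(collect, func() error { return nil }, func(b, a snapshot) {
		fmt.Println("published")
	})
	fmt.Println("snapshot calls:", calls, "err:", err)
}
```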
I reviewed this PR ( I also ran:
All tests passed in this environment. Given the current diff and discussion context, I don’t have changes to request.
- Add Profile bool to QueryInfo; remove separate queryProfiles map. LookupQueryProfile checks newQueryInfoMap then queryInfoMap so the flag is up to date immediately after Set().
- Use plain bool for Profile in config.Query and StreamConfig (default false).
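The lookup order in this commit (pending map first, then the active one) can be illustrated with the sketch below. The type and field names mirror the commit message, but the struct layout, locking, and method body are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// queryInfo carries per-query metadata; only Profile matters for this sketch.
type queryInfo struct {
	Query   string
	Profile bool
}

// configPlugin is a simplified stand-in for the real config plugin.
type configPlugin struct {
	mu              sync.RWMutex
	newQueryInfoMap map[string]queryInfo // populated by Set(), not yet active
	queryInfoMap    map[string]queryInfo // currently active configuration
}

// lookupQueryProfile prefers the freshly set configuration so that a profile
// flag takes effect immediately after Set(), then falls back to the active one.
func (p *configPlugin) lookupQueryProfile(name string) bool {
	p.mu.RLock()
	defer p.mu.RUnlock()
	if qi, ok := p.newQueryInfoMap[name]; ok {
		return qi.Profile
	}
	if qi, ok := p.queryInfoMap[name]; ok {
		return qi.Profile
	}
	return false
}

func main() {
	p := &configPlugin{
		newQueryInfoMap: map[string]queryInfo{"uptime": {Query: "select * from uptime", Profile: true}},
		queryInfoMap:    map[string]queryInfo{"uptime": {Query: "select * from uptime"}},
	}
	fmt.Println(p.lookupQueryProfile("uptime"), p.lookupQueryProfile("missing"))
}
```

Storing the flag on the per-query record avoids keeping a second map in sync with the query maps.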
This pull request is now in conflicts. Could you fix it? 🙏
Remove live rows/fds and scheduled last_executed/profile_source fields.
Persist latest live query profiles per query, include them in diagnostics, and add elastic_options settings to control storage and retention.
Add a test for live profile file eviction and ensure status tests initialize beat paths before constructing the beater.
Initialize beat paths in status reporting tests that construct a beater so New() can resolve path.data without panicking.
Proposed commit message
Add optional query profiling for osquerybeat. Scheduled queries can set `profile: true` in config; live (ad-hoc) actions can send `data.profile: true`. When enabled, metrics (utilization, duration, memory, cpu time, etc.) are collected from `osquery_schedule` or process snapshots and published to the `osquery_manager.query_profile` datastream. The integration must supply an input stream with that dataset for events to be published. The diagnostic hook `scheduled_query_profiles` exposes recent scheduled-query profiles as JSON.
Checklist
- `stresstest.sh` script to run them under stress conditions and race detector to verify their stability.
- `./changelog/fragments` using the changelog tool.
Disruptive User Impact
None. Profiling is opt-in; no new stream required unless profile events are desired.
Author's Checklist
How to test this PR locally
- Run `go test ./x-pack/osquerybeat/...` for unit tests.
- Set `profile: true` on one or more scheduled queries and (optionally) add an input stream with dataset `osquery_manager.query_profile`; trigger a live query with `data.profile: true` to exercise live profiling.
Related issues
Use cases
- `data.profile: true` to inspect resource usage for a given SQL statement.