[TAM]: Add telemetry stats APIs and maximal polling interval for stream telemetry #2214
Pterosaur wants to merge 1 commit into opencomputeproject:master
|
It will not build; you are breaking backward compatibility by shifting struct members. |
Hi @kcudnik, long time no talk. Glad to have your eyes on this. I'm aware that this PR breaks backward compatibility. The root cause is that the original design didn't provide APIs for querying SAI statistics of TAM objects. I attempted to add the new APIs at the end of [...] Do you have any suggestions? |
|
For APIs, you can add them at the end of the API struct, but for the extended struct you would need to break compatibility and add an exception.
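The point about appending at the end of an API struct can be illustrated with a minimal C sketch. All type and function names below (`hyp_fn`, `hyp_api_v1_t`, `hyp_api_v2_t`, `hyp_api_bad_t`) are hypothetical and not taken from the SAI headers; the sketch only assumes that callers reach each function pointer by its offset in the table.

```c
#include <stddef.h>

/* Hypothetical function pointer type; not part of SAI. */
typedef int (*hyp_fn)(void);

/* Table layout an existing caller was compiled against. */
typedef struct {
    hyp_fn create;
    hyp_fn remove;
} hyp_api_v1_t;

/* New member appended at the end: existing members keep their offsets. */
typedef struct {
    hyp_fn create;
    hyp_fn remove;
    hyp_fn get_stats;
} hyp_api_v2_t;

/* New member inserted at the front: every existing member shifts. */
typedef struct {
    hyp_fn get_stats;
    hyp_fn create;
    hyp_fn remove;
} hyp_api_bad_t;

/* Returns 1 when appending left the old members at their old offsets. */
int append_preserves_offsets(void)
{
    return offsetof(hyp_api_v1_t, create) == offsetof(hyp_api_v2_t, create)
        && offsetof(hyp_api_v1_t, remove) == offsetof(hyp_api_v2_t, remove);
}

/* Returns 1 when inserting at the front moved an old member. */
int insert_shifts_offsets(void)
{
    return offsetof(hyp_api_v1_t, create) != offsetof(hyp_api_bad_t, create);
}
```

A caller built against the old table still finds `create` and `remove` where it expects them in the appended layout, which is why appending is the tolerated way to grow a function table. The reverse case (a new caller reading `get_stats` from an old table) is not covered by this sketch.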
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Ze Gan <ganze718@gmail.com>
|
@tjchadaga This PR adds a new field We could work around this by adding
meta/structs.pl
--- a/meta/structs.pl
+++ b/meta/structs.pl
@@ -164,18 +164,27 @@ sub BuildCommitHistory
# of union may not increase by adding members, and actual union size
# check is performed by sai sanity check
+ # Structs that are allowed to append new members at the end (ABI extension).
+ # api_t structs are always allowed. Add other structs here as needed.
+ my @extensible_structs = (
+ "sai_switch_health_data_t",
+ "sai_port_oper_status_notification_t",
+ "sai_stat_st_capability_t",
+ );
+
+ my %extensible = map { $_ => 1 } @extensible_structs;
+
if ($currCount != $histCount and not $structTypeName =~ /^sai_\w+_api_t$/
- and $structTypeName ne "sai_switch_health_data_t"
- and $structTypeName ne "sai_port_oper_status_notification_t")
+ and not $extensible{$structTypeName})
{
LogError "FATAL: struct $structTypeName member count differs, was $histCount but is $currCount on commit $commit" if $type eq "struct";
}
if ($histCount > $currCount)
{
- if ($structTypeName eq "sai_port_oper_status_notification_t")
+ if ($extensible{$structTypeName})
{
- # we allow this to change back backward compatibility
+ # we allow extensible structs to change for backward compatibility
}
else
{
meta/test.pm
--- a/meta/test.pm
+++ b/meta/test.pm
@@ -646,7 +646,14 @@ sub CreateStructUnionSizeCheckTest
$STRUCTS{$name} = $name;
next if $name =~ /^sai_\w+_api_t$/; # skip api structs
- next if $name eq "sai_switch_health_data_t";
+
+ # Skip extensible structs that are allowed to grow (see also structs.pl)
+ my %extensible_structs = map { $_ => 1 } (
+ "sai_switch_health_data_t",
+ "sai_port_oper_status_notification_t",
+ "sai_stat_st_capability_t",
+ );
+ next if $extensible_structs{$name};
my $upname = uc($name);
However, I don't think |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
@tjchadaga I've also created a draft PR #2260 that includes the same functional changes plus the |
|
Just to inform: breaking compatibility can cause errors, reads of uninitialized memory, and potentially a crash if the notification struct differs between what syncd expects and what the library supports. This is especially true for an array of structs: the alignment will fail, fields will shift, and garbage data will be read.
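The array-of-structs hazard described here can be sketched in C. The two struct layouts and the helper names below are hypothetical and are not the real SAI notification types; the sketch assumes a producer that fills an array using a newer, larger layout and a consumer that walks the same buffer using the older stride.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout the consumer was compiled against. */
typedef struct {
    uint64_t port_id;
    uint32_t state;
} hyp_notif_old_t;

/* Hypothetical layout the producer uses after one member was appended. */
typedef struct {
    uint64_t port_id;
    uint32_t state;
    uint64_t extra;
} hyp_notif_new_t;

/* Read element `index` of a raw buffer using the OLD element size as stride. */
static hyp_notif_old_t read_with_old_layout(const void *buf, size_t index)
{
    hyp_notif_old_t out;
    const unsigned char *base = (const unsigned char *)buf;
    memcpy(&out, base + index * sizeof(hyp_notif_old_t), sizeof(out));
    return out;
}

/* Returns the port_id an old consumer sees for element 1 of a new-layout array. */
uint64_t hyp_second_port_id_seen_by_old_consumer(void)
{
    hyp_notif_new_t produced[2];
    memset(produced, 0, sizeof(produced));
    produced[0].port_id = 1;
    produced[0].state = 1;
    produced[0].extra = 0xAAAA;  /* data the old consumer knows nothing about */
    produced[1].port_id = 2;
    produced[1].state = 1;
    produced[1].extra = 0xBBBB;
    return read_with_old_layout(produced, 1).port_id;
}
```

Element 0 is still read correctly because the old members sit at the same offsets, but element 1 is read from the wrong address: the consumer steps by the old element size and lands on the `extra` field of element 0, so it reports that value as a port id instead of 2.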
Description
Enhance stream telemetry (Phase 2) with the following changes:
1. TAM telemetry stats APIs (inc/saitam.h)
Add stats APIs for TAM telemetry objects for debugging purposes:
- sai_get_tam_telemetry_stats_fn: get telemetry stats (deprecated, for backward compatibility)
- sai_get_tam_telemetry_stats_ext_fn: get telemetry stats extended
- sai_clear_tam_telemetry_stats_fn: clear telemetry stats
New stat counters (sai_tam_telemetry_stat_t):
- SAI_TAM_TELEMETRY_STAT_INGESTED_RECORDS: total records ingested
- SAI_TAM_TELEMETRY_STAT_PENDING_READ_RECORDS: records pending processing (gauge)
- SAI_TAM_TELEMETRY_STAT_CONSUMED_RECORDS: total records consumed
- SAI_TAM_TELEMETRY_STAT_DROPPED_RECORDS: total records dropped
2. Maximal polling interval (inc/saitypes.h)
Add maximal_polling_interval field to sai_stat_st_capability_t. The collector handles single-cycle counter rollovers, but vendors must ensure data does not roll over twice between two collection intervals. This field allows vendors to specify the upper bound.
Note: Adding a new member to sai_stat_st_capability_t is an ABI-breaking change that causes metadata CI to fail. See PR comments for discussion and proposed workaround.
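The rollover constraint behind the maximal polling interval can be sketched as follows. This is an illustration under stated assumptions, not the collector's actual code: the function names, the counter width parameter, and the millisecond unit are all hypothetical, since the description does not state how the collector computes deltas or which unit the field uses.

```c
#include <stdint.h>

/* Bit mask covering a counter that is `width_bits` wide (1..64). */
static uint64_t hyp_counter_mask(unsigned width_bits)
{
    return (width_bits >= 64) ? UINT64_MAX
                              : ((UINT64_C(1) << width_bits) - 1);
}

/*
 * Delta between two samples of a wrapping counter. Modular subtraction
 * gives the right answer across a single wrap, but it cannot tell a
 * second wrap apart from no wrap at all.
 */
uint64_t hyp_counter_delta(uint64_t prev, uint64_t curr, unsigned width_bits)
{
    return (curr - prev) & hyp_counter_mask(width_bits);
}

/*
 * Largest polling interval (hypothetical unit: milliseconds) for which a
 * counter growing by at most `max_rate_per_ms` cannot advance a full
 * counter range between two polls.
 */
uint64_t hyp_max_polling_interval_ms(unsigned width_bits,
                                     uint64_t max_rate_per_ms)
{
    if (max_rate_per_ms == 0) {
        return UINT64_MAX;  /* counter never moves, so any interval is safe */
    }
    return hyp_counter_mask(width_bits) / max_rate_per_ms;
}
```

For example, a 32-bit counter sampled at `0xFFFFFFF0` and then at `0x10` yields a delta of `0x20`, which is the single-cycle rollover case. Polling less often than the computed bound would make such deltas ambiguous, which is the situation an advertised upper bound is meant to prevent.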