3afbcf5 to 895d9ed
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
9970c76 to e6f6aba
Pull request overview
Adds a DevStats-oriented query layer and exposes a new public API endpoint to bulk-resolve GitHub contributor affiliations (including timeline conflict resolution) from existing member work experience and manual affiliation data.
Changes:
- Exports new DAL modules (`affiliations`, `devStats`) from the data-access-layer package.
- Adds DAL query helpers for DevStats lookups (members by GitHub handle; verified emails by memberId).
- Implements the `POST /v1/dev-stats/affiliations` handler wired into the public router.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| services/libs/data-access-layer/src/index.ts | Re-exports new DAL entrypoints so downstream services can consume them. |
| services/libs/data-access-layer/src/devStats/index.ts | Adds DevStats-focused queries for bulk member lookup and verified emails. |
| services/libs/data-access-layer/src/affiliations/index.ts | Introduces bulk affiliation resolution and timeline-building logic. |
| backend/src/api/public/v1/dev-stats/index.ts | Wires the /affiliations route to the new handler with scope protection. |
| backend/src/api/public/v1/dev-stats/getAffiliations.ts | Implements request validation, DAL lookups, and response shaping for the endpoint. |
| log.info( | ||
| { | ||
| memberId, | ||
| boundaryDate: boundaryDate.toISOString(), | ||
| orgsAtBoundary: activeOrgsAtBoundary.map((r) => ({ | ||
| org: r.organizationName, | ||
| dateStart: r.dateStart, | ||
| dateEnd: r.dateEnd, | ||
| isPrimary: r.isPrimaryWorkExperience, | ||
| memberCount: r.memberCount, | ||
| isManual: r.segmentId !== null, | ||
| })), | ||
| }, | ||
| 'processing boundary', | ||
| ) |
There are multiple log.info(...) calls inside the per-boundary loop. For bulk requests (up to 1,000 members) this can generate extremely high log volume and impact latency/cost. Consider downgrading these to debug (or sampling/gating them behind a feature flag), and keeping info only for coarse request-level summaries.
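One way to act on this is sketched below: per-boundary detail stays at debug level, and a single info line summarises the whole bulk request. The `Logger` interface and `logBulkResolution` are hypothetical names introduced for illustration; the PR's actual logger type and call sites are not shown here.

```typescript
// Hypothetical logger shape; the real logger used by the PR may differ.
interface Logger {
  info(fields: Record<string, unknown>, msg: string): void
  debug(fields: Record<string, unknown>, msg: string): void
}

// Hypothetical request-level wrapper: one debug line per boundary,
// one info line for the whole bulk request.
function logBulkResolution(log: Logger, boundariesByMember: Map<string, Date[]>): void {
  let totalBoundaries = 0
  boundariesByMember.forEach((boundaries, memberId) => {
    for (const boundaryDate of boundaries) {
      log.debug({ memberId, boundaryDate: boundaryDate.toISOString() }, 'processing boundary')
    }
    totalBoundaries += boundaries.length
  })
  log.info({ members: boundariesByMember.size, totalBoundaries }, 'resolved affiliations')
}
```

With this shape, a request for 1,000 members produces one info entry regardless of how many boundaries each member has.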
| ) | ||
|
|
||
| if (datedRows.length === 0) { | ||
| log.debug({ memberId }, 'no dated rows — returning empty affiliations') |
When a member has no dated rows, the resolver returns an empty list even if an undated primary/fallback org exists. Given IAffiliationPeriod allows startDate/endDate = null, consider returning a single undated period (or otherwise representing the fallback) so callers still get an affiliation when only undated work experience data is available.
| log.debug({ memberId }, 'no dated rows — returning empty affiliations') | |
| if (fallbackOrg) { | |
| log.debug( | |
| { | |
| memberId, | |
| fallbackOrg: fallbackOrg.organizationName, | |
| }, | |
| 'no dated rows — returning single undated affiliation from fallback org', | |
| ) | |
| return [ | |
| { | |
| organization: fallbackOrg.organizationName, | |
| startDate: null, | |
| endDate: null, | |
| }, | |
| ] | |
| } | |
| log.debug( | |
| { | |
| memberId, | |
| }, | |
| 'no dated rows and no fallback org — returning empty affiliations', | |
| ) |
|
|
||
| const bodySchema = z.object({ | ||
| githubHandles: z | ||
| .array(z.string().min(1)) |
bodySchema accepts raw strings; unlike other public endpoints (e.g. lfids: z.string().trim()), this allows handles with leading/trailing whitespace that will never match lower(mi.value). Consider adding .trim() (and possibly .toLowerCase()/normalization) at validation time so lookup behavior is predictable.
| .array(z.string().min(1)) | |
| .array(z.string().trim().min(1)) |
| if (memberRows.length === 0) { | ||
| ok(res, { total_found: 0, contributors: [], notFound }) | ||
| return |
Response uses total_found (snake_case) while other public v1 endpoints consistently use camelCase response keys (e.g. projectAffiliations, memberId, workExperiences). Consider renaming this to totalFound (and keeping response casing consistent across this API surface).
| router.post('/affiliations', requireScopes([SCOPES.READ_AFFILIATIONS]), (_req, res) => { | ||
| res.json({ status: 'ok' }) | ||
| }) | ||
| router.post('/affiliations', requireScopes([SCOPES.READ_AFFILIATIONS]), getAffiliations) |
This route is protected only by read:affiliations, but getAffiliations returns verified email addresses as well. To avoid unintentionally widening access to member identities, consider also requiring SCOPES.READ_MEMBER_IDENTITIES (or remove emails from the response / introduce a dedicated scope for this endpoint).
| router.post('/affiliations', requireScopes([SCOPES.READ_AFFILIATIONS]), getAffiliations) | |
| router.post('/affiliations', requireScopes([SCOPES.READ_AFFILIATIONS, SCOPES.READ_MEMBER_IDENTITIES]), getAffiliations) |
| /** Returns the org used to fill gaps — primary undated wins, then earliest-created undated. */ | ||
| function findFallbackOrg(rows: IWorkRow[]): IWorkRow | null { | ||
| const primaryUndated = rows.find((r) => r.isPrimaryWorkExperience && !r.dateStart && !r.dateEnd) | ||
| if (primaryUndated) return primaryUndated | ||
|
|
||
| return ( | ||
| rows | ||
| .filter((r) => !r.dateStart && !r.dateEnd) | ||
| .sort((a, b) => new Date(a.createdAt).getTime() - new Date(b.createdAt).getTime()) | ||
| .at(0) ?? null | ||
| ) |
findFallbackOrg currently considers all undated rows, including manual affiliations (segmentId !== null). If an undated manual affiliation exists, it can become the global fallback org and then be used to fill unrelated gaps, losing the segment context. Consider restricting fallback selection to non-manual work experiences (e.g. segmentId === null) to match the existing prepareMemberOrganizationAffiliationTimeline behavior.
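A minimal sketch of the restriction, assuming that `segmentId !== null` marks a manual affiliation (as the `isManual` log field in this PR suggests). `WorkRow` is a simplified stand-in for `IWorkRow`, and `findFallbackOrgNonManual` is a hypothetical name.

```typescript
// Simplified row shape; the real IWorkRow likely has more fields.
interface WorkRow {
  organizationName: string
  dateStart: string | null
  dateEnd: string | null
  createdAt: string
  isPrimaryWorkExperience: boolean
  segmentId: string | null // assumption: non-null marks a manual affiliation
}

function findFallbackOrgNonManual(rows: WorkRow[]): WorkRow | null {
  // Only undated, non-manual work experiences may act as the global fallback.
  const candidates = rows.filter((r) => r.segmentId === null && !r.dateStart && !r.dateEnd)

  const primary = candidates.find((r) => r.isPrimaryWorkExperience)
  if (primary) return primary

  // Otherwise pick the earliest-created candidate, without mutating the input.
  const sorted = [...candidates].sort(
    (a, b) => new Date(a.createdAt).getTime() - new Date(b.createdAt).getTime(),
  )
  return sorted[0] ?? null
}
```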
| function startOfDay(date: Date | string): Date { | ||
| const d = new Date(date) | ||
| d.setHours(0, 0, 0, 0) | ||
| return d | ||
| } |
startOfDay uses setHours(0,0,0,0), which truncates in local time. Since dateStart/dateEnd are timestamptz, this can shift boundaries depending on server timezone/DST and produce off-by-one-day affiliation windows. Consider doing all truncation/arithmetic in UTC (e.g. setUTCHours(0,0,0,0) and getUTCDate/setUTCDate) or switching to a date-only representation before building boundaries.
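A minimal sketch of the UTC-only variant suggested here. `startOfDayUtc` and `addDaysUtc` are hypothetical names, not helpers from the PR; both rely only on the standard `Date` UTC setters and getters.

```typescript
// Hypothetical UTC-only helpers; names are illustrative.
function startOfDayUtc(date: Date | string): Date {
  const d = new Date(date)
  // Truncate in UTC so the result does not depend on server timezone or DST.
  d.setUTCHours(0, 0, 0, 0)
  return d
}

function addDaysUtc(date: Date, days: number): Date {
  const d = new Date(date.getTime())
  d.setUTCDate(d.getUTCDate() + days)
  return d
}

// 23:30 at UTC-05:00 is 04:30 UTC on the following day.
const boundary = startOfDayUtc('2024-03-10T23:30:00-05:00')
console.log(boundary.toISOString()) // 2024-03-11T00:00:00.000Z
console.log(addDaysUtc(boundary, -1).toISOString()) // 2024-03-10T00:00:00.000Z
```

Because neither helper reads local time, the same input yields the same boundary on any server.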
| if (datedRows.length === 0) { | ||
| log.debug({ memberId }, 'no dated rows — returning empty affiliations') | ||
| return [] | ||
| } |
Undated-only members silently return empty affiliations
Medium Severity
When a member has only undated work experiences, datedRows is empty and the function returns [] at line 453, even though findFallbackOrg successfully identified an affiliation on line 433. The IAffiliationPeriod type supports null for both startDate and endDate, so the fallback org could be returned as a period. Instead, contributors with known but undated affiliations appear in the API response with zero affiliations — losing information the system already has.
| if (datedRows.length === 0) { | ||
| log.debug({ memberId }, 'no dated rows — returning empty affiliations') | ||
| return [] | ||
| } |
Undated primary org no longer competes with dated orgs
Medium Severity
The PR description states this "mirrors prepareMemberOrganizationAffiliationTimeline," but the new code produces different results when a primary undated org coexists with dated orgs. The old code passes all remaining orgs (including the undated primary) to findOrgsWithRolesInDate, where the undated org matches (!dateStart && !dateEnd) and is active at every date — winning via selectPrimaryWorkExperience. The new code only passes datedRows (filtered by r.dateStart) to orgsActiveAt, so the undated primary org never competes and only fills gaps as fallbackOrg. This produces materially different affiliation timelines.
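To illustrate the difference, here is a hedged sketch of an activity check in which fully undated rows count as active at every date, which is the behaviour this comment attributes to `findOrgsWithRolesInDate`. The row shape and both function names are hypothetical, and the handling of partially dated rows (open start or open end) is an assumption rather than something confirmed by the PR.

```typescript
// Simplified row shape, for illustration only.
interface DatedRow {
  organizationName: string
  dateStart: string | null
  dateEnd: string | null
}

function isActiveAt(row: DatedRow, at: Date): boolean {
  // Fully undated rows are active at every date.
  if (!row.dateStart && !row.dateEnd) return true
  // Assumption: a missing start or end is treated as open-ended.
  const startsBefore = !row.dateStart || new Date(row.dateStart).getTime() <= at.getTime()
  const endsAfter = !row.dateEnd || new Date(row.dateEnd).getTime() >= at.getTime()
  return startsBefore && endsAfter
}

// Pass all rows, not only the dated ones, so an undated primary org can compete.
function orgsActiveAtIncludingUndated(rows: DatedRow[], at: Date): DatedRow[] {
  return rows.filter((r) => isActiveAt(r, at))
}
```

Under this check the undated primary org appears in the candidate set for every boundary and can win selection, instead of only filling gaps.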
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 4 total unresolved issues (including 2 from previous reviews).
| const router = Router() | ||
|
|
||
| router.use('/v1/dev-stats', staticApiKeyMiddleware(), devStatsRouter()) | ||
| router.use('/v1', staticApiKeyMiddleware(), devStatsRouter()) |
Route mount path breaks all existing OAuth2 endpoints
High Severity
Changing the mount path from '/v1/dev-stats' to '/v1' causes staticApiKeyMiddleware to intercept ALL /v1/* requests, not just dev-stats ones. When an OAuth2-authenticated request arrives (e.g., POST /v1/members), the middleware tries to validate the OAuth2 Bearer token as a static API key, fails, and calls next(new UnauthorizedError(...)). Express then skips the subsequent oauth2Middleware layer entirely and routes the error straight to errorHandler, returning 401 for all existing OAuth2-protected endpoints. Additionally, the endpoint path becomes /v1/affiliations instead of the documented /v1/dev-stats/affiliations.
| const primaryUndated = rows.find((r) => r.isPrimaryWorkExperience && !r.dateStart && !r.dateEnd) | ||
| const cleaned = primaryUndated | ||
| ? rows.filter((r) => r.dateStart || r.id === primaryUndated.id) | ||
| : rows |
Filter incorrectly drops undated manual affiliations
Medium Severity
The cleaned filter operates on the combined rows (work experiences + manual affiliations), but the equivalent logic in prepareMemberOrganizationAffiliationTimeline only filters memberOrganizations, leaving manualAffiliations untouched. When a primary undated work experience exists, this filter drops undated manual affiliations too (they lack dateStart and their id differs from the primary). Manual affiliations are supposed to have the highest priority in selectPrimaryWorkExperience, so silently removing them produces incorrect affiliation results.
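A possible shape for the fix, sketched under the assumption that manual affiliations are the rows with `segmentId !== null`. The `Row` type and `dropUndatedWorkExperiences` are hypothetical; only the filter condition differs from the snippet above.

```typescript
// Simplified combined row shape (work experiences + manual affiliations).
interface Row {
  id: string
  dateStart: string | null
  dateEnd: string | null
  isPrimaryWorkExperience: boolean
  segmentId: string | null // assumption: non-null marks a manual affiliation
}

function dropUndatedWorkExperiences(rows: Row[]): Row[] {
  const primaryUndated = rows.find(
    (r) => r.segmentId === null && r.isPrimaryWorkExperience && !r.dateStart && !r.dateEnd,
  )
  if (!primaryUndated) return rows

  // Manual affiliations always pass through; only work experiences are filtered.
  return rows.filter(
    (r) => r.segmentId !== null || Boolean(r.dateStart) || r.id === primaryUndated.id,
  )
}
```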


DevStats Affiliations API + Generic DAL resolver
Implements the `/v1/dev-stats/affiliations` endpoint that allows external tools (e.g. DevStats/gitdm) to resolve GitHub contributor affiliations in bulk.
API
Accepts up to 1,000 handles per request. Returns resolved, non-overlapping affiliation periods per contributor, sorted by most recent first.
Implementation
`services/libs/data-access-layer/src/affiliations/`: new generic module, usable by any consumer:
- `IAffiliationPeriod`: public type representing a single affiliation window
- `resolveAffiliationsByMemberIds`: bulk resolver for up to N members in 2 DB queries
- `findWorkExperiencesBulk` / `findManualAffiliationsBulk`: exported separately for reuse
- Mirrors `prepareMemberOrganizationAffiliationTimeline` but uses an interval-based approach (boundary dates) instead of day-by-day iteration, making it viable for bulk requests
Note
Medium Risk
Adds a new public API endpoint and substantial new affiliation-resolution logic with multiple SQL queries and timeline heuristics; also changes the route mount so the endpoint path shifts from `/v1/dev-stats/affiliations` to `/v1/affiliations`, which may be a breaking change for clients.
Overview
Adds a new DevStats affiliations API that accepts up to 1,000 GitHub handles, looks up verified members and emails, and returns per-contributor non-overlapping affiliation periods (`POST /v1/affiliations` via `getAffiliations`).
Implements a new generic DAL module (`services/libs/data-access-layer/src/affiliations`) that bulk-fetches work experiences/manual affiliations and resolves them into ordered affiliation windows using selection priority, gap filling, and date-boundary interval processing, plus adds DevStats-specific DAL queries for GitHub handle and verified email lookup.
Routing change: mounts `devStatsRouter()` at `/v1` instead of `/v1/dev-stats`, effectively changing the endpoint path and potentially impacting existing integrations.
Written by Cursor Bugbot for commit e5684b7.
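The interval-based approach mentioned in the implementation notes can be illustrated with the sketch below. It is not the PR's resolver: `Span`, `Period`, `resolveByBoundaries`, and the numeric `priority` field are hypothetical simplifications, intervals are half-open `[start, end)` in epoch milliseconds, and gap filling with a fallback org is left out.

```typescript
interface Span { org: string; start: number; end: number; priority: number }
interface Period { org: string; start: number; end: number }

function resolveByBoundaries(spans: Span[]): Period[] {
  // Every start or end is a boundary; the winner cannot change between two neighbours.
  const points: number[] = []
  for (const s of spans) points.push(s.start, s.end)
  const boundaries = Array.from(new Set(points)).sort((a, b) => a - b)

  const periods: Period[] = []
  for (let i = 1; i < boundaries.length; i++) {
    const from = boundaries[i - 1]
    const to = boundaries[i]
    const active = spans.filter((s) => s.start <= from && s.end >= to)
    if (active.length === 0) continue // gap: a real resolver would apply the fallback org

    // Highest priority wins; ties keep the earlier entry.
    const winner = active.reduce((best, s) => (s.priority > best.priority ? s : best))
    const last = periods[periods.length - 1]
    if (last && last.org === winner.org && last.end === from) {
      last.end = to // merge adjacent windows for the same org
    } else {
      periods.push({ org: winner.org, start: from, end: to })
    }
  }
  return periods.reverse() // most recent first
}

const example = resolveByBoundaries([
  { org: 'A', start: Date.UTC(2020, 0, 1), end: Date.UTC(2021, 0, 1), priority: 1 },
  { org: 'B', start: Date.UTC(2020, 6, 1), end: Date.UTC(2022, 0, 1), priority: 2 },
])
console.log(example.map((p) => p.org)) // [ 'B', 'A' ]
```

The work per member scales with the number of distinct boundaries rather than the number of days covered, which is what makes this style of resolution practical for bulk requests.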