Support Multiple Crawls Per PDF #79

Open
albert-du wants to merge 2 commits into main from multiple-crawls-per-pdf

Conversation

albert-du (Collaborator) commented Mar 8, 2026

Implements #38

Adds a table to PDFPreview that lists every crawl in which a PDF was scraped. If there are more than 5 crawls, the list is collapsed by default. The front end remains backward compatible with the existing server API.

image

/preview (before and after)
image

image

A new config argument adds a hard cap on the number of crawls returned by the API. The default is 500; the screenshot below demonstrates it with the cap set to 1. If a PDF has more crawls than the cap, the API sets a flag in the response so the UI knows to add the message.
image
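
For reviewers, here is a minimal sketch of the cap-and-flag behavior described above. It is not the code in this PR: the names `cap_crawls`, `max_crawls`, and `crawls_truncated` are hypothetical, since the description does not name the config argument or the response flag. Only the default of 500 comes from the description.

```python
from typing import Any

# Default taken from the PR description; the constant name is an assumption.
DEFAULT_MAX_CRAWLS = 500


def cap_crawls(
    crawls: list[dict[str, Any]],
    max_crawls: int = DEFAULT_MAX_CRAWLS,
) -> dict[str, Any]:
    """Return at most `max_crawls` crawls plus a flag saying whether any were dropped.

    The keys "crawls" and "crawls_truncated" are hypothetical field names.
    """
    if max_crawls < 0:
        raise ValueError("max_crawls must be non-negative")
    # The flag is set only when the PDF has more crawls than the cap allows.
    truncated = len(crawls) > max_crawls
    return {
        "crawls": crawls[:max_crawls],
        "crawls_truncated": truncated,
    }
```

With the cap set to 1, as in the screenshot, a PDF with two crawls would come back with one crawl and the flag set, which is what lets the UI decide whether to show the message.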

Adds every crawl instance to a new response field on the backend's search and pages APIs. The original response fields are retained and describe the most recent crawl, for backward compatibility.
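
A rough sketch of what such a backward-compatible response could look like, under stated assumptions: the field names `url`, `crawl_date`, `crawl_id`, and `all_crawls` are hypothetical and are not taken from this PR, and crawl dates are assumed to be ISO 8601 strings so that string ordering matches chronological ordering.

```python
from typing import Any


def build_pdf_response(crawls: list[dict[str, Any]]) -> dict[str, Any]:
    """Build a response that keeps legacy top-level fields and adds a full crawl list.

    All field names here are illustrative assumptions, not the real API schema.
    """
    if not crawls:
        raise ValueError("a PDF must have at least one crawl")
    # Newest first; ISO 8601 date strings sort chronologically as plain strings.
    ordered = sorted(crawls, key=lambda c: c["crawl_date"], reverse=True)
    latest = ordered[0]
    return {
        # Legacy fields: still populated, now from the most recent crawl,
        # so existing clients keep working unchanged.
        "url": latest["url"],
        "crawl_date": latest["crawl_date"],
        # New field: every crawl instance for this PDF.
        "all_crawls": ordered,
    }
```

A client that only reads the legacy fields sees the same shape as before, while an updated client can read the new list.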

@albert-du albert-du linked an issue Mar 8, 2026 that may be closed by this pull request
4 tasks
@albert-du albert-du marked this pull request as draft March 10, 2026 20:57
@albert-du albert-du marked this pull request as ready for review March 14, 2026 21:43
Development

Successfully merging this pull request may close these issues.

Adjust Govscape To Handle Multiple Crawls Per PDF