Create process / infra diagram for scanning #10

@grossir

Once raw scans are uploaded to S3 (from either phone scanning or Hein data), we intend to perform the following steps:

Blackletter service section:

  • get raw reporters from S3
  • separate opinions out of the reporter file
  • censor copyrighted material using YOLO
  • save ready-to-extract opinions to S3
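
As a rough illustration of how these four steps could chain together, here is a minimal sketch. Everything in it is an assumption made for illustration: a plain dict stands in for the S3 bucket, pages are strings rather than scanned images, the `raw/` and `ready/` prefixes are hypothetical, and `censor_page` is only a placeholder for the YOLO-based detection step, not an actual model call.

```python
from dataclasses import dataclass

# Hypothetical key prefixes; the real bucket layout is not decided here.
RAW_PREFIX = "raw/"
READY_PREFIX = "ready/"


@dataclass
class Opinion:
    reporter_key: str
    index: int
    pages: list[str]


def split_opinions(reporter_key: str, pages: list[str]) -> list[Opinion]:
    """Placeholder splitter: a page starting with 'OPINION' opens a new opinion."""
    groups: list[list[str]] = []
    current: list[str] = []
    for page in pages:
        if page.startswith("OPINION") and current:
            groups.append(current)
            current = []
        current.append(page)
    if current:
        groups.append(current)
    return [Opinion(reporter_key, i, g) for i, g in enumerate(groups)]


def censor_page(page: str) -> str:
    """Stand-in for the YOLO step: drop lines flagged as copyrighted material.

    A real detector would return regions of a page image to mask; here any
    line starting with 'HEADNOTE' is treated as copyrighted.
    """
    kept = [line for line in page.split("\n") if not line.startswith("HEADNOTE")]
    return "\n".join(kept)


def run_blackletter(store: dict[str, list[str]]) -> list[str]:
    """Read raw reporters, split, censor, and save ready-to-extract opinions."""
    written = []
    # Snapshot the items because the loop adds new keys to the same store.
    for key, pages in list(store.items()):
        if not key.startswith(RAW_PREFIX):
            continue
        name = key[len(RAW_PREFIX):]
        for opinion in split_opinions(key, pages):
            out_key = f"{READY_PREFIX}{name}/{opinion.index:04d}"
            store[out_key] = [censor_page(p) for p in opinion.pages]
            written.append(out_key)
    return written
```

Keeping each step as its own function would make it easier to measure the YOLO step in isolation when answering the hardware question.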

CourtListener section:

  • call LLM services to extract XML
  • save extracted content and metadata into the DB
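
The CourtListener side could look roughly like the sketch below. It is an assumption-laden illustration: `fake_extract_xml` stands in for the LLM service call, the XML shape (`<opinion>`, `<case_name>`, `<body>`) and the `opinions` table are made up, and an in-memory SQLite database stands in for the real DB.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Callable
from xml.sax.saxutils import escape


def fake_extract_xml(opinion_text: str) -> str:
    """Stand-in for the LLM service; returns a made-up XML shape."""
    first_line = opinion_text.partition("\n")[0]
    return (
        f"<opinion><case_name>{escape(first_line)}</case_name>"
        f"<body>{escape(opinion_text)}</body></opinion>"
    )


def ingest(
    conn: sqlite3.Connection,
    ready: dict[str, str],
    extract_xml: Callable[[str], str],
) -> int:
    """Extract XML for each ready opinion and store content plus metadata."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS opinions "
        "(s3_key TEXT PRIMARY KEY, case_name TEXT, xml TEXT)"
    )
    count = 0
    for key, text in ready.items():
        xml = extract_xml(text)
        root = ET.fromstring(xml)  # raises ParseError on malformed output
        case_name = root.findtext("case_name", default="")
        # Keyed on the S3 key so re-running the same opinion overwrites its row.
        conn.execute(
            "INSERT OR REPLACE INTO opinions VALUES (?, ?, ?)",
            (key, case_name, xml),
        )
        count += 1
    conn.commit()
    return count
```

Passing the extractor in as a callable keeps the storage logic testable without a live LLM service.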

So we have some open questions that fall within Chaco's expertise:

  • How will Blackletter be deployed?
  • How will it run? How will it know what to process next?
  • What hardware does it need to run?
    • Is YOLO usage heavy on CPU / RAM?

Am I missing anything?
