Adds PromiseProgressUpdaterJob that updates the promise progress based on the attached evidence #48
Changes from all commits
f7c16f5
7a1ddfe
1f3cf60
1f585af
d673cff
3156ef4
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| class PromiseProgressUpdaterJob < ApplicationJob | ||
| queue_as :default | ||
|
|
||
| def perform(promise) | ||
| promise.update_progress!(inline: true) | ||
| end | ||
| end |
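The job above only delegates to `promise.update_progress!(inline: true)`, and the diff does not show that method. A minimal sketch of one plausible contract, using plain-Ruby stand-ins: `FakeJobQueue`, `FakePromise`, the placeholder score, and the enqueue-by-default behaviour are all assumptions for illustration, not code from this PR.

```ruby
# Hypothetical stand-in for a job backend; not the PR's real queue.
class FakeJobQueue
  attr_reader :enqueued

  def initialize
    @enqueued = []
  end

  def push(record)
    @enqueued << record
  end
end

# Hypothetical stand-in for the Promise model.
class FakePromise
  attr_reader :progress_score

  def initialize(queue)
    @queue = queue
    @progress_score = nil
  end

  # Assumed contract: inline: true does the work immediately,
  # inline: false defers it to a background job.
  def update_progress!(inline: false)
    if inline
      @progress_score = 3 # placeholder for the LLM-backed update
    else
      @queue.push(self)
    end
  end
end

# Mirrors the job's perform: it always runs inline, so the job
# never re-enqueues itself.
def perform(promise)
  promise.update_progress!(inline: true)
end
```

Under that assumption, passing `inline: true` from the job is what stops the update from scheduling another job instead of doing the work.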
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,128 @@ | ||
| class PromiseProgressUpdater < Chat | ||
| include Structify::Model | ||
|
|
||
| def prompt(promise) | ||
| <<~PROMPT | ||
| You are a specialized government accountability analyst. Your task is to assess the progress made on specific government commitments based on available evidence of government actions. | ||
|
|
||
| **Your Mission:** | ||
| Analyze the provided government commitment and associated evidence to determine how much progress has been made toward fulfilling that commitment. You will assign a progress score and provide a factual summary based solely on the evidence provided. | ||
|
|
||
| **Input Data Structure:** | ||
| You will receive: | ||
| 1. **Promise Information:** | ||
| - `promise_id`: The internal ID for the promise | ||
| - `title`: Title of the promise | ||
| - `description`: Summary of the promise | ||
| - `text`: Original text of the promise | ||
|
|
||
| 2. **Evidence Items:** A list of government actions/evidence related to this commitment, each containing: | ||
| - `title_or_summary`: Brief description of the government action | ||
| - `evidence_source_type`: Type of evidence (e.g., "Bill Event (LEGISinfo)", "Canada Gazette Part II", "OIC", "News") | ||
| - `evidence_date`: When this action occurred (YYYY-MM-DD format) | ||
| - `description_or_details`: Detailed description of the action | ||
| - `source_url`: Official government source URL | ||
| - `bill_one_sentence_description_llm`: (For bills only) AI-generated description of the bill's purpose | ||
|
|
||
| First, carefully read and analyze the following promise information: | ||
| <promise> | ||
| #{promise.format_for_llm} | ||
| </promise> | ||
|
|
||
| Now, review the list of evidence items: | ||
|
|
||
| <evidence_items> | ||
| #{promise.evidences.map(&:format_for_llm).join("\n")} | ||
| </evidence_items> | ||
|
|
||
| **Progress Scoring Scale (1-5):** | ||
|
|
||
| **Score 1 - No Progress:** | ||
| - No meaningful government action found | ||
| - No relevant legislation introduced | ||
| - No funding allocated or programs launched | ||
|
|
||
| **Score 2 - Initial Steps:** | ||
| - Early-stage actions like consultations launched | ||
| - Preliminary announcements or studies initiated | ||
| - Minor policy discussions or planning activities | ||
| - No significant legislative action or substantial funding | ||
|
|
||
| **Score 3 - Meaningful Action:** | ||
| - Legislation introduced and progressing through Parliament | ||
| - Significant budget allocation announced or programs launched | ||
| - Substantial policy development or regulatory work initiated | ||
| - Clear government commitment with concrete steps taken | ||
|
|
||
| **Score 4 - Major Progress:** | ||
| - Key legislation passed major parliamentary stages (e.g., passed one House) | ||
| - Substantial regulatory changes enacted or published | ||
| - Significant funding disbursed and programs operational | ||
| - Major implementation milestones achieved | ||
|
|
||
| **Score 5 - Complete/Fully Implemented:** | ||
| - All necessary legislation received Royal Assent and in force | ||
| - Key regulations published and operational | ||
| - All announced funding allocated and programs fully operational | ||
| - Commitment objectives substantially achieved | ||
|
|
||
| **Analysis Guidelines:** | ||
| 1. **Evidence-Based Assessment:** Base your score only on the provided evidence | ||
| 2. **Legislative Tracking:** Consider all stages of bill progress (introduction, readings, committee, Royal Assent) | ||
| 3. **Implementation Focus:** Distinguish between announcements and actual implementation | ||
| 4. **Proportional Scoring:** Consider the scope and complexity of the commitment | ||
| 5. **Temporal Relevance:** Focus on actions within the current parliamentary session | ||
|
|
||
| **Output Format:** | ||
| Provide your assessment as a JSON object with this exact structure: | ||
|
|
||
| ```json | ||
| { | ||
| "progress_score": 3, | ||
| "progress_summary": "A concise, factual summary (max 150 words) describing the key actions taken and current status based on the evidence provided. Focus on concrete actions, legislative milestones, funding allocations, and implementation status." | ||
| } | ||
| ``` | ||
|
|
||
| **Key Requirements:** | ||
| - **Objectivity:** Base assessments only on provided evidence, avoid speculation | ||
| - **Clarity:** Use clear, factual language in the progress summary | ||
| - **Completeness:** Consider all evidence items when determining the score | ||
| - **Accuracy:** Ensure the score aligns with the evidence and scoring criteria | ||
| - **Conciseness:** Keep the summary focused and under 150 words | ||
|
|
||
| **Example Scoring Logic:** | ||
| - If a bill was introduced but hasn't progressed → Score 2-3 | ||
| - If a bill passed one House of Parliament → Score 3-4 | ||
| - If a bill received Royal Assent → Score 4-5 | ||
|
Member
I'm not sure this is true: a bill could contribute to the progress without implementing all of the criteria required for the promise to be fulfilled. For example, it could just fund a particular program or give the government the ability to do something, without the government actually following through.
Collaborator
Author
I used the prompt directly from the OutcomeTracker repo. I had assumed this was used for the initial scores and wanted to keep it the same.
Member
Cool, works for me |
||
| - If funding was announced but not yet disbursed → Score 2-3 | ||
| - If programs are operational with funding flowing → Score 4-5 | ||
| - If no relevant evidence found → Score 1 | ||
| PROMPT | ||
| end | ||
|
|
||
|
|
||
| schema_definition do | ||
| version 1 | ||
| name "PromiseProgressUpdater" | ||
| description "Updates the progress of a promise" | ||
| field :promise_summary, :object, properties: { | ||
| "progress_score" => { type: "integer" }, | ||
| "progress_summary" => { type: "string", description: "Summary of the progress the government has made towards a promise" } | ||
| } | ||
| end | ||
|
|
||
|
|
||
| def update_promise_progress! | ||
| raise ArgumentError, "Promise is not provided" unless record.is_a?(Promise) | ||
|
|
||
| p = prompt( | ||
| self.record | ||
| ) | ||
|
|
||
| self.extract! p | ||
|
|
||
| self.record.progress_score = promise_summary["progress_score"] | ||
| self.record.progress_summary = promise_summary["progress_summary"] | ||
| self.record.save! | ||
| end | ||
| end | ||
I don't know if this section is really helpful, but I am okay with shipping it as is and iterating.
We don't have great evals atm anyways.
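Short of real evals, the constraints the prompt already states (an integer score from 1 to 5 and a summary of at most 150 words) could be checked mechanically before saving. A rough sketch, assuming the model's reply is available as a raw JSON string; `progress_output_errors` is a hypothetical helper, not something in this PR or in Structify.

```ruby
require "json"

# Hypothetical helper: returns a list of problems found in a raw JSON reply,
# based only on the constraints written in the prompt.
def progress_output_errors(raw_json)
  data = JSON.parse(raw_json)
  return ["reply is not a JSON object"] unless data.is_a?(Hash)

  errors = []

  # The prompt defines a 1-5 scale.
  score = data["progress_score"]
  unless score.is_a?(Integer) && (1..5).cover?(score)
    errors << "progress_score must be an integer from 1 to 5"
  end

  # The prompt asks for a summary of at most 150 words.
  summary = data["progress_summary"]
  if !summary.is_a?(String) || summary.strip.empty?
    errors << "progress_summary must be a non-empty string"
  elsif summary.split.length > 150
    errors << "progress_summary exceeds 150 words"
  end

  errors
rescue JSON::ParserError
  ["reply is not valid JSON"]
end
```

An empty array means the reply respects the stated format. It says nothing about whether the score is justified by the evidence, which is the concern raised in the review thread above.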