…d on the attached evidence
| **Example Scoring Logic:** | ||
| - If a bill was introduced but hasn't progressed → Score 2-3 | ||
| - If a bill passed one House of Parliament → Score 3-4#{' '} | ||
| - If a bill received Royal Assent → Score 4-5 |
I'm not sure this is true: a bill could contribute to progress without implementing all of the criteria required for the promise to be fulfilled.
For example, it could just fund a particular program or give the government the ability to do something, without the government actually following through.
I used the prompt directly from the OutcomeTracker repo. I had assumed this was used for the initial scores and wanted to keep it the same.
xrendan
left a comment
Can you change the promise format_for_llm function?
Otherwise this looks good as a first iteration. We're going to need to iterate and ensure it's working well, but I don't need it to be perfect to start.
| You will receive: | ||
| 1. **Promise Information:** | ||
| - `promise_id`: The internal ID for the promise | ||
| - `title`:#{' '} |
Because I forgot to fill it in and I think our linter changed it to an empty space. I'll update this section along with format_for_llm update.
| - `promise_id`: The internal ID for the promise | ||
| - `title`:#{' '} | ||
| - `description`: | ||
| - `text`:#{' '} |
| **Input Data Structure:** | ||
| You will receive: | ||
| 1. **Promise Information:** |
I don't know if this section is really helpful, but I am okay with shipping it as is and iterating.
We don't have great evals atm anyways.
app/models/promise.rb
Outdated
| title: concise_title, | ||
| description: description, | ||
| text: text | ||
| text: text, |
This might impact promise generation/linking and just make scoring worse overall.
The general structure of the promises needs to be revisited and pared down. It's a direct lift and shift from the old LLM-generated app and it leaves a lot to be desired.
For example:
{description: "The government commits to reducing the annual increase in its operational expenditures to less than two percent, emphasizing fiscal discipline.",
concise_title: "Cap Federal Spending Growth at Two Percent",
responsible_department_lead: nil,
intended_impact_and_objectives:
["Canadians may experience reduced pressure on taxes and a more stable national debt, fostering long-term economic security.",
"Fiscal prudence aims to protect Canada's economic standing and ensure preparedness for future financial challenges.",
"Government departments and services may face tighter budgets, potentially leading to more efficient resource allocation or prioritization of core services.",
"This measure signals a commitment to intergenerational equity by managing public finances responsibly for future generations."],
background_and_context:
"This commitment stems from the Liberal Party's emphasis on a responsible fiscal plan in the context of ongoing economic challenges and post-pandemic spending. The platform acknowledges the need to balance necessary investments with fiscal discipline to ensure long-term prosperity. By capping day-to-day government spending growth, the government aims to establish a key 'fiscal anchor,' manage the national debt, and uphold intergenerational equity. This measure is presented as a means to provide predictability and confidence in Canada's economic future, demonstrating financial prudence while still enabling strategic investments for Canadians."}
It adds a bunch of stuff that makes evaluating what the promise actually is harder for the LLM.
I'd rather keep this as is
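To make the pared-down shape concrete, here is a minimal sketch of a `format_for_llm` that exposes only the three keys shown in the diff above. The `Struct` is a hypothetical stand-in for the real model in app/models/promise.rb, and the `text` value is a placeholder, so treat this as an illustration rather than the actual implementation.

```ruby
require "json"

# Hypothetical stand-in for the model in app/models/promise.rb.
# The real class is an application model; a Struct keeps this sketch runnable.
Promise = Struct.new(
  :concise_title,
  :description,
  :text,
  :intended_impact_and_objectives,
  :background_and_context,
  keyword_init: true
) do
  # Return only the fields that state what the promise is.
  # The generated context fields are deliberately left out.
  def format_for_llm
    {
      title: concise_title,
      description: description,
      text: text
    }
  end
end

promise = Promise.new(
  concise_title: "Cap Federal Spending Growth at Two Percent",
  description: "The government commits to reducing the annual increase in " \
               "its operational expenditures to less than two percent, " \
               "emphasizing fiscal discipline.",
  text: "(placeholder for the original promise text)",
  intended_impact_and_objectives: ["(omitted from the LLM payload)"],
  background_and_context: "(omitted from the LLM payload)"
)

puts JSON.pretty_generate(promise.format_for_llm)
```

With this shape, the generated context fields stay on the record but never reach the scoring prompt.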
Thanks for the changes @xrendan. There was a reason for the extra details in format_for_llm, but I can't remember it now, so we're good to go without them.
Will fix #28