Merged
7 changes: 5 additions & 2 deletions Gemfile
Original file line number Diff line number Diff line change
@@ -9,7 +9,7 @@ gem "puma", ">= 5.0"
# Use GoodJob for Active Job queue adapter
gem "good_job", "~> 4.10", ">= 4.10.2"
# Build JSON APIs with ease [https://github.com/rails/jbuilder]
# gem "jbuilder"
gem "jbuilder"

# Use Active Model has_secure_password [https://guides.rubyonrails.org/active_model_basics.html#securepassword]
# gem "bcrypt", "~> 3.1.7"
@@ -46,7 +46,8 @@ gem "importmap-rails", "~> 2.1"

gem "propshaft", "~> 1.1"

gem "ruby_llm", "~> 1.3"
gem "ruby_llm", "~> 1.3", github: "xrendan/ruby_llm", branch: "structured-output"
# gem "ruby_llm", "~> 1.3", path: "../ruby_llm"

gem "dotenv", groups: [ :development, :test ]

@@ -55,3 +56,5 @@ gem "feedjira", "~> 3.2"
gem "http", "~> 5.3"

gem "iconv", "~> 1.1"

gem "structify", "~> 0.3.4"
36 changes: 26 additions & 10 deletions Gemfile.lock
@@ -1,3 +1,18 @@
GIT
remote: https://github.com/xrendan/ruby_llm.git
revision: c849a1ef77e533190173dc7d6aa274e0f491b29f
branch: structured-output
specs:
ruby_llm (1.3.1)
base64
event_stream_parser (~> 1)
faraday (>= 1.10.0)
faraday-multipart (>= 1)
faraday-net_http (>= 1)
faraday-retry (>= 1)
marcel (~> 1.0)
zeitwerk (~> 2)

GEM
remote: https://rubygems.org/
specs:
@@ -78,6 +93,8 @@ GEM
addressable (2.8.7)
public_suffix (>= 2.0.2, < 7.0)
ast (2.4.3)
attr_json (2.5.0)
activerecord (>= 6.0.0, < 8.1)
avo (3.21.1)
actionview (>= 6.1)
active_link_to
@@ -185,6 +202,9 @@ GEM
pp (>= 0.6.0)
rdoc (>= 4.0.0)
reline (>= 0.4.2)
jbuilder (2.13.0)
actionview (>= 5.0.0)
activesupport (>= 5.0.0)
json (2.12.2)
language_server-protocol (3.17.0.5)
lint_roller (1.1.0)
@@ -339,18 +359,12 @@ GEM
rubocop-performance (>= 1.24)
rubocop-rails (>= 2.30)
ruby-progressbar (1.13.0)
ruby_llm (1.3.1)
base64
event_stream_parser (~> 1)
faraday (>= 1.10.0)
faraday-multipart (>= 1)
faraday-net_http (>= 1)
faraday-retry (>= 1)
marcel (~> 1.0)
zeitwerk (~> 2)
sax-machine (1.3.2)
securerandom (0.4.1)
stringio (3.1.7)
structify (0.3.4)
activesupport (>= 7.0, < 9.0)
attr_json (~> 2.1)
thor (1.3.2)
timeout (0.4.3)
turbo-rails (2.0.16)
@@ -401,12 +415,14 @@ DEPENDENCIES
http (~> 5.3)
iconv (~> 1.1)
importmap-rails (~> 2.1)
jbuilder
pg (~> 1.1)
propshaft (~> 1.1)
puma (>= 5.0)
rails (~> 8.0.2)
rubocop-rails-omakase
ruby_llm (~> 1.3)
ruby_llm (~> 1.3)!
structify (~> 0.3.4)
tzinfo-data

BUNDLED WITH
37 changes: 34 additions & 3 deletions README.md
@@ -1,14 +1,45 @@
# Government Outcomes Tracker API


Data Model
## Data Model

`Government`:
Think of this as the core tenant. This is currently unused, but is useful for future development where we want to track
Think of this as the core tenant. All records are currently linked to the federal government,
but the model is useful for future development where we want to track
provincial governments in addition to the federal government.

`Department`:
Departments are the units within a government. They are associated with a government.

`Minister`:
Ministers are the individuals who are responsible for a department. They are associated with a department.
Each department may have multiple ministers. They have a start and end date. There may also be multiple ministers
associated with a department at the same time if there is both a minister and a secretary of state.
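
A minimal sketch of how overlapping terms could be resolved for a given date. It uses plain Ruby structs rather than the real `Minister` model, and the field names (`role`, `started_on`, `ended_on`) and the sample people are assumptions made for illustration only:

```ruby
require "date"

# Hypothetical stand-in for a Minister record; attribute names are assumptions.
MinisterTerm = Struct.new(:name, :role, :department, :started_on, :ended_on, keyword_init: true) do
  # A term without an end date is treated as still open.
  def active_on?(date)
    started_on <= date && (ended_on.nil? || date <= ended_on)
  end
end

# Everyone responsible for a department on a given date.
def ministers_on(terms, department, date)
  terms.select { |t| t.department == department && t.active_on?(date) }
end

# Placeholder data: two concurrent terms and one that has ended.
TERMS = [
  MinisterTerm.new(name: "A. Example", role: "Minister", department: "Finance",
                   started_on: Date.new(2024, 1, 1), ended_on: nil),
  MinisterTerm.new(name: "B. Example", role: "Secretary of State", department: "Finance",
                   started_on: Date.new(2024, 1, 1), ended_on: nil),
  MinisterTerm.new(name: "C. Example", role: "Minister", department: "Finance",
                   started_on: Date.new(2020, 1, 1), ended_on: Date.new(2023, 12, 31))
].freeze

puts ministers_on(TERMS, "Finance", Date.new(2024, 6, 1)).map(&:name).inspect
```

The lookup returns a list rather than a single record because a minister and a secretary of state can hold the same department at once.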

`Promise`:
Promises are the commitments that the government has made. They are associated with a government.
These were originally extracted using an LLM, but are currently static. We need to rebuild the ability
to extract them from source documents. They are associated with a department.

`Feed`:
Feeds are sources of information that we scrape to understand what the government is doing.
They are associated with a government. When we scrape them we generate `Entry` records
with the raw data. Currently we support RSS feeds, but we would like to also support
newsletters (using Action Mailbox), and scraping unstructured webpages.

`Entry`:
Entries contain the raw data scraped from feeds. They are associated with a feed and a government.
Some Entries (like those scraped from the Canada Gazette RSS feeds) are indexes of other Entries.
These are flagged as index_entries and are not used for matching. Instead, they are used to find other Entries,
which are linked back to the index using the parent_entry_id column.
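
As a rough illustration of the index/child relationship, the sketch below uses a plain Ruby struct. Only `parent_entry_id` comes from the description above; the `is_index` flag name and the sample rows are assumptions:

```ruby
# Hypothetical stand-in for Entry; `is_index` is an assumed name for the index flag.
EntryRow = Struct.new(:id, :title, :is_index, :parent_entry_id, keyword_init: true)

ENTRIES = [
  EntryRow.new(id: 1, title: "Gazette issue index", is_index: true,  parent_entry_id: nil),
  EntryRow.new(id: 2, title: "Regulation A",        is_index: false, parent_entry_id: 1),
  EntryRow.new(id: 3, title: "Regulation B",        is_index: false, parent_entry_id: 1),
  EntryRow.new(id: 4, title: "News release",        is_index: false, parent_entry_id: nil)
].freeze

# Index entries are skipped when matching against promises.
def matchable(entries)
  entries.reject(&:is_index)
end

# Entries discovered through an index point back at it via parent_entry_id.
def children_of(entries, index_entry)
  entries.select { |e| e.parent_entry_id == index_entry.id }
end

puts matchable(ENTRIES).map(&:id).inspect
puts children_of(ENTRIES, ENTRIES.first).map(&:title).inspect
```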

`Activity`:
Activities are the actions that the government is taking. They are associated with an Entry.
These are extracted using an LLM from the Entry's raw data. Each entry might have multiple activities.

`Evidence`:
Evidence links an Activity to a Promise. They are linked using an LLM.
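
The chain from Activity to Promise can be pictured as a join through Evidence. The following is a sketch in plain Ruby, not the Active Record models themselves; the struct names and sample data are illustrative assumptions:

```ruby
# Hypothetical stand-ins for the Promise, Activity and Evidence records.
PromiseRow  = Struct.new(:id, :text, keyword_init: true)
ActivityRow = Struct.new(:id, :entry_id, :title, keyword_init: true)
EvidenceRow = Struct.new(:activity_id, :promise_id, :impact, keyword_init: true)

PROMISES = [
  PromiseRow.new(id: 10, text: "Placeholder promise one"),
  PromiseRow.new(id: 11, text: "Placeholder promise two")
].freeze

ACTIVITIES = [ActivityRow.new(id: 100, entry_id: 2, title: "Placeholder activity")].freeze

# One Evidence row per (activity, promise) link.
EVIDENCES = [
  EvidenceRow.new(activity_id: 100, promise_id: 10, impact: "positive")
].freeze

# Promises reachable from an activity by walking its evidence rows.
def promises_for(activity, evidences, promises)
  ids = evidences.select { |ev| ev.activity_id == activity.id }.map(&:promise_id)
  promises.select { |p| ids.include?(p.id) }
end

puts promises_for(ACTIVITIES.first, EVIDENCES, PROMISES).map(&:id).inspect
```

Keeping the link in its own record leaves room for per-link details such as the impact and who reviewed it.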

## Using AI


### 🛠 Setup
19 changes: 19 additions & 0 deletions app/avo/resources/activity.rb
@@ -0,0 +1,19 @@
class Avo::Resources::Activity < Avo::BaseResource
# self.includes = []
# self.attachments = []
# self.search = {
# query: -> { query.ransack(id_eq: params[:q], m: "or").result(distinct: false) }
# }

def fields
field :id, as: :id
field :entry, as: :belongs_to
field :government, as: :belongs_to
field :title, as: :text
field :details, as: :text
field :source_url, as: :text
field :info, as: :code
field :publication_date, as: :date
field :in_force_date, as: :date
end
end
33 changes: 12 additions & 21 deletions app/avo/resources/evidence.rb
@@ -7,26 +7,17 @@ class Avo::Resources::Evidence < Avo::BaseResource

def fields
field :id, as: :id
field :raw_gazette_notice_id, as: :text
field :rias_summary, as: :textarea
field :description_or_details, as: :textarea
field :evidence_date, as: :date_time
field :evidence_id, as: :text
field :evidence_source_type, as: :text
field :hybrid_linking_avg_confidence, as: :number
field :hybrid_linking_method, as: :text
field :hybrid_linking_timestamp, as: :date_time
field :ingested_at, as: :date_time
field :parliament_session_id, as: :text
field :promise_linking_processed_at, as: :date_time
field :promise_linking_status, as: :text
field :promise_links_found2, as: :number
field :source_document_raw_id, as: :text
field :source_url, as: :text
field :title_or_summary, as: :textarea
field :key_concepts, as: :textarea
field :linked_departments, as: :textarea
field :promise_ids, as: :textarea
field :llm_analysis_raw, as: :code
field :activity, as: :belongs_to
field :promise, as: :belongs_to
field :impact, as: :text
field :impact_magnitude, as: :number
field :impact_reason, as: :text
field :linked_at, as: :date_time
field :linked_by, as: :belongs_to
field :link_type, as: :text
field :link_reason, as: :text
field :review, as: :boolean
field :reviewed_by, as: :belongs_to
field :reviewed_at, as: :date_time
end
end
21 changes: 21 additions & 0 deletions app/controllers/activities_controller.rb
@@ -0,0 +1,21 @@
class ActivitiesController < ApplicationController
before_action :set_activity, only: %i[ show ]

# GET /activities
def index
@activities = Activity.all

render json: @activities
end

# GET /activities/1
def show
render json: @activity
end

private
# Use callbacks to share common setup or constraints between actions.
def set_activity
@activity = Activity.find(params.expect(:id))
end
end
4 changes: 4 additions & 0 deletions app/controllers/avo/activities_controller.rb
@@ -0,0 +1,4 @@
# This controller has been generated to enable Rails' resource routes.
# More information on https://docs.avohq.io/3.0/controllers.html
class Avo::ActivitiesController < Avo::ResourcesController
end
12 changes: 11 additions & 1 deletion app/controllers/evidences_controller.rb
@@ -1,11 +1,21 @@
class EvidencesController < ApplicationController
before_action :set_evidence, only: %i[ show ]

# GET /evidences
def index
@evidences = Evidence.all

render json: @evidences
end

# GET /evidences/1
def show
@evidence = Evidence.find(params[:id])
render json: @evidence
end

private
# Use callbacks to share common setup or constraints between actions.
def set_evidence
@evidence = Evidence.find(params.expect(:id))
end
end
11 changes: 11 additions & 0 deletions app/jobs/feed_refresher_job.rb
@@ -0,0 +1,11 @@
class FeedRefresherJob < ApplicationJob
queue_as :default

def perform(feed = nil)
if feed.nil?
Feed.find_each { |f| FeedRefresherJob.perform_later(f) }
else
feed.refresh!
end
end
end
7 changes: 7 additions & 0 deletions app/models/activity.rb
@@ -0,0 +1,7 @@
class Activity < ApplicationRecord
belongs_to :entry
belongs_to :government

has_many :evidences
has_many :promises, through: :evidences
end
97 changes: 97 additions & 0 deletions app/models/activity_extractor.rb
@@ -0,0 +1,97 @@
class ActivityExtractor < Chat
include Structify::Model

def prompt(promises, entry)
<<~PROMPT
You are tasked with extracting activities from political artifacts that could impact the progress of promises made by the government. This analysis will help track the government's actions and their potential effects on fulfilling their commitments.

First, review the list of government promises:
<government_promises>
#{promises.map(&:format_for_llm).join("\n")}
</government_promises>

Now, carefully read and analyze the following political artifact:
#{entry.format_for_llm}

Your task is to extract activities mentioned in the political artifact that could potentially impact the progress of the government promises listed above. Follow these steps:

1. Identify any actions, initiatives, policies, or decisions mentioned in the artifact.
2. For each identified activity, determine if it relates to any of the government promises.
3. Assess whether the activity could have a positive, negative, or neutral impact on the progress of the related promise(s).

When analyzing the impact:
- Consider direct and indirect effects
- Think about short-term and long-term consequences
- Take into account the scale and scope of the activity

Present your findings in the following format:

<extracted_activities>
<activity>
<description>[Describe the activity]</description>
<related_promise>[List the related government promise(s)]</related_promise>
<potential_impact>[Explain the potential impact on the promise(s), whether positive, negative, or neutral, and why]</potential_impact>
</activity>
</extracted_activities>

If no relevant activities are found in the political artifact, state this clearly in your response.

Try to minimize the number of activities listed; combine activities that are similar enough to avoid redundancy.

Remember to focus only on activities that could potentially impact the government's progress on their promises. Do not include unrelated information or speculate beyond what is reasonably implied by the artifact.
PROMPT
end

schema_definition do
version 1
name "ActivityExtraction"
description "Extracts activities from an entry"
field :reason_for_no_activities, :string, description: "Reason why no activities were extracted, if none were found"
field :activities, :array,
description: "List of government activities that impact one or more of the promises",
items: {
type: "object", properties: {
"title" => { type: "string" },
"summary" => { type: "string", description: "Summary of what the government has done or proposed to do" },
"impacted_promises" => { type: "array", items: { type: "object", properties: {
"promise_id" => { type: "string", description: "The ID of the promise that the activity impacts" },
"potential_impact" => { type: "string", enum: [ "positive", "negative", "neutral" ] },
"potential_impact_magnitude" => { type: "integer", description: "The magnitude of the potential impact on the promise(s). 1 indicates a minor impact, 2 indicates a moderate impact, and 3 indicates a significant impact." },
"potential_impact_reason" => { type: "string", description: "Explain the potential impact on the promise(s), whether positive, negative, or neutral, and why" }
} } }
}
}
end

def extract_activities!
raise ArgumentError, "Record is not provided" unless record.is_a?(Entry)
p = prompt(
Promise.all,
self.record
)

self.extract! p


activities.each do |activity|
rec = Activity.create!(
government_id: self.record.government_id,
entry: self.record,
title: activity["title"],
summary: activity["summary"],
published_at: self.record.published_at
)

activity["impacted_promises"].each do |impacted_promise|
rec.evidences.create!(
promise_id: Promise.find_by!(promise_id: impacted_promise["promise_id"]).id,
linked_at: Time.current,
link_type: "automated",
impact: impacted_promise["potential_impact"],
impact_magnitude: impacted_promise["potential_impact_magnitude"],
impact_reason: impacted_promise["potential_impact_reason"]
)
end
end
end
end
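
The schema above states the 1 to 3 magnitude scale only in a field description, so a response might carry a value outside that range. A minimal sketch of a guard that could run on each `impacted_promises` item before records are created, assuming nothing upstream already enforces these constraints; `impact_problems` is a hypothetical helper, not part of this PR or of Structify:

```ruby
ALLOWED_IMPACTS = %w[positive negative neutral].freeze
MAGNITUDE_RANGE = (1..3).freeze

# Hypothetical helper: returns a list of problems with one impacted_promises
# item; an empty list means the item matches the schema's stated constraints.
def impact_problems(impacted_promise)
  problems = []
  impact = impacted_promise["potential_impact"]
  magnitude = impacted_promise["potential_impact_magnitude"]

  problems << "missing promise_id" if impacted_promise["promise_id"].to_s.empty?
  problems << "unknown impact #{impact.inspect}" unless ALLOWED_IMPACTS.include?(impact)
  unless magnitude.is_a?(Integer) && MAGNITUDE_RANGE.cover?(magnitude)
    problems << "magnitude must be an integer from 1 to 3"
  end
  problems
end

sample = {
  "promise_id" => "P-001",
  "potential_impact" => "positive",
  "potential_impact_magnitude" => 2
}
puts impact_problems(sample).inspect
```

Returning a list of problems instead of raising lets the caller decide whether to skip a single link or reject the whole extraction.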