    current_state = 'QOUT' OR
    current_state = 'BLOCKED'
    `
    return get(conn, getTimeOff, query)
Here is the query performance using EXPLAIN:
root@:26257/chatroach>
EXPLAIN WITH x AS (
SELECT
responses.pageid, responses.userid, states.current_state
FROM responses
INNER JOIN surveys_metadata ON
surveys_metadata.surveyid = responses.surveyid
INNER JOIN states ON
states.pageid = responses.pageid AND
states.userid = responses.userid
WHERE off_date < NOW()
)
SELECT userid, pageid
FROM x
WHERE
current_state = 'QOUT' OR
current_state = 'BLOCKED';
tree | field | description
--------------------------------------+--------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| distributed | false
| vectorized | false
root | |
├── render | |
│ └── filter | |
│ │ | filter | (current_state = 'QOUT') OR (current_state = 'BLOCKED')
│ └── scan buffer node | |
│ | label | buffer 1 (x)
└── subquery | |
│ | id | @S1
│ | original sql | SELECT responses.pageid, responses.userid, states.current_state FROM responses INNER JOIN surveys_metadata ON surveys_metadata.surveyid = responses.surveyid INNER JOIN states ON (states.pageid = responses.pageid) AND (states.userid = responses.userid) WHERE off_date < now()
│ | exec mode | all rows
└── buffer node | |
│ | label | buffer 1 (x)
└── render | |
└── hash-join | |
│ | type | inner
│ | equality | (pageid, userid) = (pageid, userid)
│ | left cols are key |
├── scan | |
│ | table | states@states_current_state_updated_idx
│ | spans | FULL SCAN
└── merge-join | |
│ | type | inner
│ | equality | (surveyid) = (surveyid)
│ | right cols are key |
│ | mergeJoinOrder | +"(surveyid=surveyid)"
├── scan | |
│ | table | responses@responses_surveyid_userid_timestamp_question_ref_idx
│ | spans | FULL SCAN
└── scan | |
| table | surveys_metadata@primary
| spans | FULL SCAN
| filter | off_date < now()
(34 rows)
Time: 25.911ms
root@:26257/chatroach>
Just pushed a more performant alternative; here is the result from EXPLAIN:
root@:26257/chatroach>
EXPLAIN WITH y AS (
WITH x AS (
SELECT
current_state, pageid, userid
FROM states
WHERE
current_state = 'QOUT' OR
current_state = 'BLOCKED'
)
SELECT
responses.pageid, responses.surveyid, responses.userid,
surveys_metadata.off_date, current_state
FROM x
INNER JOIN responses ON
responses.pageid = x.pageid AND
responses.userid = x.userid
INNER JOIN surveys_metadata ON
surveys_metadata.surveyid = responses.surveyid
WHERE surveys_metadata.off_date < NOW()
)
SELECT userid, pageid
FROM y;
tree | field | description
--------------------------------------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| distributed | false
| vectorized | false
root | |
├── render | |
│ └── scan buffer node | |
│ | label | buffer 2 (y)
└── subquery | |
│ | id | @S1
│ | original sql | WITH x AS (SELECT current_state, pageid, userid FROM states WHERE (current_state = 'QOUT') OR (current_state = 'BLOCKED')) SELECT responses.pageid, responses.surveyid, responses.userid, surveys_metadata.off_date, current_state FROM x INNER JOIN responses ON (responses.pageid = x.pageid) AND (responses.userid = x.userid) INNER JOIN surveys_metadata ON surveys_metadata.surveyid = responses.surveyid WHERE surveys_metadata.off_date < now()
│ | exec mode | all rows
└── buffer node | |
│ | label | buffer 2 (y)
└── render | |
└── hash-join | |
│ | type | inner
│ | equality | (pageid, userid) = (pageid, userid)
│ | left cols are key |
├── render | |
│ └── scan | |
│ | table | states@states_current_state_updated_idx
│ | spans | /"BLOCKED"-/"BLOCKED"/PrefixEnd /"QOUT"-/"QOUT"/PrefixEnd
└── merge-join | |
│ | type | inner
│ | equality | (surveyid) = (surveyid)
│ | right cols are key |
│ | mergeJoinOrder | +"(surveyid=surveyid)"
├── scan | |
│ | table | responses@responses_surveyid_userid_timestamp_question_ref_idx
│ | spans | FULL SCAN
└── scan | |
| table | surveys_metadata@primary
| spans | FULL SCAN
| filter | off_date < now()
(33 rows)
Time: 20.187ms
root@:26257/chatroach>
Couple thoughts:
- I think more states should be included (i.e. ERROR state, no?) - shouldn't it just be every state except RESPONDING and OFF?
- This either assumes everyone only answers one survey or just creates off events for every survey the user ever answered? I like the latter. But there is an issue: if someone is technically in a particular form but never responded, they will be able to respond and complete the form later, even if the survey is off (no good!). Since the off time is survey-specific (not form-specific), we don't need the actual surveyid. We can use the shortcodes in the state to get all the surveys that the person ever took and do the same thing (or just get the LAST shortcode, already available as current_form, and use that).
> I think more states should be included (i.e. ERROR state, no?) - shouldn't it just be every state except RESPONDING and OFF?
Sure.
The implementation is based on our convo, where we only mentioned QOUT and BLOCKED (https://curiouslearning.slack.com/archives/D02CDGM9DQB/p1640828961004900?thread_ts=1640792174.000200&cid=D02CDGM9DQB), but I am happy to include all the others except RESPONDING and OFF.
I mean, it requires some decision-making, but that's a trivial change to make later, so I'm not overly concerned if we don't get it right on the first try. My instinct is to make more offs rather than fewer at first, though!
(NOT moving to OFF state when the survey is off seems to me like an exception and we can solve that problem as we see it become a problem)
Added ERROR and WAIT_EXTERNAL_EVENT, see e2efd91
One thing that was a bit hard to get my head around is that, in order to get
@@ -0,0 +1,5 @@
CREATE TABLE chatroach.surveys_metadata (
    surveyid UUID NOT NULL REFERENCES chatroach.surveys(id) ON DELETE CASCADE,
OK - this is where our naming convention problems come in. It shouldn't be surveyid (which is really a "form"), it should be shortcode.
@calufa - any movement on this? This is a problem, because the off time shouldn't be unique to each "survey", which is really a "version of a survey", correct?
In my mind, off time applies to a shortcode and pageid. Pageid gets you to survey user. And shortcode + survey user should be unique per metadata.
@calufa - just following up on this! Any thoughts?
TODO: