Skip to content

Add "off" time using Dean#71

Open
calufa wants to merge 5 commits intofeature/add-unit-test-suitefrom
feature/add-off-time-dean
Open

Add "off" time using Dean#71
calufa wants to merge 5 commits intofeature/add-unit-test-suitefrom
feature/add-off-time-dean

Conversation

@calufa
Copy link
Copy Markdown
Contributor

@calufa calufa commented Jan 17, 2022

TODO:

  • add "off" time to surveys

@netlify
Copy link
Copy Markdown

netlify bot commented Jan 17, 2022

✔️ Deploy Preview for vlab-research canceled.

🔨 Explore the source changes: 8dcfb2d

🔍 Inspect the deploy log: https://app.netlify.com/sites/vlab-research/deploys/61e4c752115705000796a912

@calufa calufa changed the base branch from main to feature/add-unit-test-suite January 17, 2022 01:33
@calufa calufa force-pushed the feature/add-off-time-dean branch 3 times, most recently from 5cf9aec to ceee61e Compare January 17, 2022 01:39
@calufa calufa force-pushed the feature/add-off-time-dean branch from ceee61e to a7ec67f Compare January 17, 2022 01:50
current_state = 'QOUT' OR
current_state = 'BLOCKED'
`
return get(conn, getTimeOff, query)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the query performance using EXPLAIN:

root@:26257/chatroach> 

EXPLAIN WITH x AS (
    SELECT
            responses.pageid, responses.userid, states.current_state
    FROM responses
    INNER JOIN surveys_metadata ON
            surveys_metadata.surveyid = responses.surveyid
    INNER JOIN states ON
            states.pageid = responses.pageid AND
            states.userid = responses.userid
    WHERE off_date < NOW()
)
SELECT userid, pageid
FROM x
WHERE
    current_state = 'QOUT' OR
    current_state = 'BLOCKED';

                 tree                 |       field        |                                                                                                                                    description
--------------------------------------+--------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                      | distributed        | false
                                      | vectorized         | false
  root                                |                    |
   ├── render                         |                    |
   │    └── filter                    |                    |
   │         │                        | filter             | (current_state = 'QOUT') OR (current_state = 'BLOCKED')
   │         └── scan buffer node     |                    |
   │                                  | label              | buffer 1 (x)
   └── subquery                       |                    |
        │                             | id                 | @S1
        │                             | original sql       | SELECT responses.pageid, responses.userid, states.current_state FROM responses INNER JOIN surveys_metadata ON surveys_metadata.surveyid = responses.surveyid INNER JOIN states ON (states.pageid = responses.pageid) AND (states.userid = responses.userid) WHERE off_date < now()
        │                             | exec mode          | all rows
        └── buffer node               |                    |
             │                        | label              | buffer 1 (x)
             └── render               |                    |
                  └── hash-join       |                    |
                       │              | type               | inner
                       │              | equality           | (pageid, userid) = (pageid, userid)
                       │              | left cols are key  |
                       ├── scan       |                    |
                       │              | table              | states@states_current_state_updated_idx
                       │              | spans              | FULL SCAN
                       └── merge-join |                    |
                            │         | type               | inner
                            │         | equality           | (surveyid) = (surveyid)
                            │         | right cols are key |
                            │         | mergeJoinOrder     | +"(surveyid=surveyid)"
                            ├── scan  |                    |
                            │         | table              | responses@responses_surveyid_userid_timestamp_question_ref_idx
                            │         | spans              | FULL SCAN
                            └── scan  |                    |
                                      | table              | surveys_metadata@primary
                                      | spans              | FULL SCAN
                                      | filter             | off_date < now()
(34 rows)

Time: 25.911ms

root@:26257/chatroach> 

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed an more performant alternative, here is the result from EXPLAIN:

root@:26257/chatroach> 
EXPLAIN WITH y AS (
        WITH x AS (
                SELECT
                        current_state, pageid, userid
                FROM states
                WHERE
                        current_state = 'QOUT' OR
                        current_state = 'BLOCKED'
        )
        SELECT
                responses.pageid, responses.surveyid, responses.userid,
                surveys_metadata.off_date, current_state
        FROM x
        INNER JOIN responses ON
                responses.pageid = x.pageid AND
                responses.userid = x.userid
        INNER JOIN surveys_metadata ON
                surveys_metadata.surveyid = responses.surveyid
        WHERE surveys_metadata.off_date < NOW()
)
SELECT userid, pageid
FROM y;
                 tree                 |       field        |                                                                                                                                                                                                                       description
--------------------------------------+--------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                                      | distributed        | false
                                      | vectorized         | false
  root                                |                    |
   ├── render                         |                    |
   │    └── scan buffer node          |                    |
   │                                  | label              | buffer 2 (y)
   └── subquery                       |                    |
        │                             | id                 | @S1
        │                             | original sql       | WITH x AS (SELECT current_state, pageid, userid FROM states WHERE (current_state = 'QOUT') OR (current_state = 'BLOCKED')) SELECT responses.pageid, responses.surveyid, responses.userid, surveys_metadata.off_date, current_state FROM x INNER JOIN responses ON (responses.pageid = x.pageid) AND (responses.userid = x.userid) INNER JOIN surveys_metadata ON surveys_metadata.surveyid = responses.surveyid WHERE surveys_metadata.off_date < now()
        │                             | exec mode          | all rows
        └── buffer node               |                    |
             │                        | label              | buffer 2 (y)
             └── render               |                    |
                  └── hash-join       |                    |
                       │              | type               | inner
                       │              | equality           | (pageid, userid) = (pageid, userid)
                       │              | left cols are key  |
                       ├── render     |                    |
                       │    └── scan  |                    |
                       │              | table              | states@states_current_state_updated_idx
                       │              | spans              | /"BLOCKED"-/"BLOCKED"/PrefixEnd /"QOUT"-/"QOUT"/PrefixEnd
                       └── merge-join |                    |
                            │         | type               | inner
                            │         | equality           | (surveyid) = (surveyid)
                            │         | right cols are key |
                            │         | mergeJoinOrder     | +"(surveyid=surveyid)"
                            ├── scan  |                    |
                            │         | table              | responses@responses_surveyid_userid_timestamp_question_ref_idx
                            │         | spans              | FULL SCAN
                            └── scan  |                    |
                                      | table              | surveys_metadata@primary
                                      | spans              | FULL SCAN
                                      | filter             | off_date < now()
(33 rows)

Time: 20.187ms

root@:26257/chatroach> 

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple thoughts:

  1. I think more states should be included (i.e. ERROR state, no?) - shouldn't it just be every state except RESPONDING and OFF?

  2. This either assumes everyone only answers one survey or just creates off events for every survey the user ever answered? I like the latter. But there is an issue where if someone is technically in a particular form, but never responded. They will then be able to respond and complete the form later, even if the survey is off (no good!) -- but the off time, being something that is survey-specific (not form-specific), we don't need the actual surveyid. We can use the shortcodes in the state to get all the surveys that the person ever took and do the same thing (or just get the LAST shortcode, already available as current_form, and use that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think more states should be included (i.e. ERROR state, no?) - shouldn't it just be every state except RESPONDING and OFF?

Sure.

The implementation is based on our convo where we only mention QOUT and BLOCKED, https://curiouslearning.slack.com/archives/D02CDGM9DQB/p1640828961004900?thread_ts=1640792174.000200&cid=D02CDGM9DQB, but I am happy to include all the others except RESPONDING and OFF.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, it requires some decision-making, but that's a trivial change to make later so I'm not overly concerned if we don't get it right on the first try. My instinct is to make more offs rather than less offs at first though!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(NOT moving to OFF state when the survey is off seems to me like an exception and we can solve that problem as we see it become a problem)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added ERROR and WAIT_EXTERNAL_EVENT, see e2efd91

@calufa
Copy link
Copy Markdown
Contributor Author

calufa commented Jan 17, 2022

One thing that was a bit hard to get my head around is that, in order to get surveyid, which is required to get the off-time from the surveys_metadata table, you need to resolve pageid, and you can only find both surveyid and pageid in the responses table.

root@:26257/chatroach> \d states
      column_name      |  data_type  | is_nullable | column_default |                                                                                                                                                                                                                                                                                                       generation_expression                                                                                                                                                                                                                                                                                                        |                                                                                                                                                                                                                                                                                   indices                                                                                                                                                                                                                                                                                    | is_hidden
-----------------------+-------------+-------------+----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------
  userid               | VARCHAR     |    false    | NULL           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | {primary,states_current_state_updated_idx,states_auto_index_fk_pageid_ref_facebook_pages,states_current_state_fb_error_code_idx,states_state_json_idx,states_current_state_current_form_updated_idx,states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx,states_error_tag_current_state_current_form_updated_idx,states_stuck_on_question_current_state_current_form_updated_idx,states_current_state_timeout_date_idx,states_current_state_error_tag_updated_next_retry_idx,states_current_state_fb_error_code_updated_next_retry_idx} |   false
  pageid               | VARCHAR     |    false    | NULL           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | {states_current_state_error_tag_updated_next_retry_idx,states_current_state_updated_idx,states_auto_index_fk_pageid_ref_facebook_pages,primary,states_state_json_idx,states_current_state_current_form_updated_idx,states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx,states_error_tag_current_state_current_form_updated_idx,states_stuck_on_question_current_state_current_form_updated_idx,states_current_state_timeout_date_idx,states_current_state_fb_error_code_idx,states_current_state_fb_error_code_updated_next_retry_idx} |   false
  updated              | TIMESTAMPTZ |    false    | NULL           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | {states_current_state_updated_idx,states_current_state_current_form_updated_idx,states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx,states_error_tag_current_state_current_form_updated_idx,states_stuck_on_question_current_state_current_form_updated_idx,states_current_state_fb_error_code_updated_next_retry_idx,states_current_state_error_tag_updated_next_retry_idx}                                                                                                                                                           |   false
  current_state        | VARCHAR     |    false    | NULL           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | {states_current_state_fb_error_code_updated_next_retry_idx,states_current_state_fb_error_code_idx,states_current_state_current_form_updated_idx,states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx,states_error_tag_current_state_current_form_updated_idx,states_stuck_on_question_current_state_current_form_updated_idx,states_current_state_timeout_date_idx,states_current_state_error_tag_updated_next_retry_idx,states_current_state_updated_idx}                                                                              |   false
  state_json           | JSONB       |    false    | NULL           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | {states_current_state_fb_error_code_idx,states_state_json_idx,states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx,states_current_state_timeout_date_idx}                                                                                                                                                                                                                                                                                                                                                                               |   false
  fb_error_code        | VARCHAR     |    true     | NULL           | (state_json->'error')->>'code'                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | {states_current_state_fb_error_code_updated_next_retry_idx,states_current_state_fb_error_code_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |   false
  current_form         | VARCHAR     |    true     | NULL           | (state_json->'forms')->>-1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | {states_current_state_current_form_updated_idx,states_error_tag_current_state_current_form_updated_idx,states_stuck_on_question_current_state_current_form_updated_idx}                                                                                                                                                                                                                                                                                                                                                                                                      |   false
  previous_is_followup | BOOL        |    true     | NULL           | ((state_json->'previousOutput')->>'followUp') IS NOT NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | {states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |   false
  previous_with_token  | BOOL        |    true     | NULL           | ((state_json->'previousOutput')->>'token') IS NOT NULL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | {states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |   false
  form_start_time      | TIMESTAMPTZ |    true     | NULL           | ceiling(((state_json->'md')->>'startTime')::INT8 / 1000)::INT8::TIMESTAMPTZ                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | {states_previous_with_token_previous_is_followup_form_start_time_current_state_updated_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |   false
  error_tag            | VARCHAR     |    true     | NULL           | (state_json->'error')->>'tag'                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | {states_error_tag_current_state_current_form_updated_idx,states_current_state_error_tag_updated_next_retry_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                              |   false
  stuck_on_question    | VARCHAR     |    true     | NULL           | CASE WHEN ((((state_json->'qa')->-1)->>0) = (((state_json->'qa')->-2)->>0)) AND ((((state_json->'qa')->-2)->>0) = (((state_json->'qa')->-3)->>0)) THEN ((state_json->'qa')->-1)->>0 ELSE NULL END                                                                                                                                                                                                                                                                                                                                                                                                                                  | {states_stuck_on_question_current_state_current_form_updated_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |   false
  timeout_date         | TIMESTAMPTZ |    true     | NULL           | CASE WHEN (((state_json->'wait')->>'type') = 'timeout') AND ((((state_json->'wait')->'value')->>'type') = 'absolute') THEN (((state_json->'wait')->'value')->>'timeout')::TIMESTAMPTZ WHEN (((state_json->'wait')->>'type') = 'timeout') AND ((((state_json->'wait')->'value')->>'type') = 'relative') THEN (ceiling((state_json->>'waitStart')::INT8 / 1000)::INT8::TIMESTAMPTZ + (((state_json->'wait')->'value')->>'timeout')::INTERVAL) WHEN ((state_json->'wait')->>'type') = 'timeout' THEN (ceiling((state_json->>'waitStart')::INT8 / 1000)::INT8::TIMESTAMPTZ + ((state_json->'wait')->>'value')::INTERVAL) ELSE NULL END | {states_current_state_timeout_date_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |   false
  next_retry           | TIMESTAMP   |    true     | NULL           | (floor(((power(2, (CASE WHEN json_array_length(state_json->'retries') <= 16 THEN json_array_length(state_json->'retries') ELSE 16 END)) * 60000) + ((state_json->'retries')->>-1)::INT8)::INT8) / 1000)::INT8::TIMESTAMP                                                                                                                                                                                                                                                                                                                                                                                                           | {states_current_state_fb_error_code_updated_next_retry_idx,states_current_state_error_tag_updated_next_retry_idx}                                                                                                                                                                                                                                                                                                                                                                                                                                                            |   false
(14 rows)

Time: 124.326ms
root@:26257/chatroach> \d responses
      column_name     |  data_type  | is_nullable | column_default | generation_expression  |                                                                                                                         indices                                                                                                                          | is_hidden
----------------------+-------------+-------------+----------------+------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------
  parent_surveyid     | UUID        |    true     | NULL           |                        | {responses_auto_index_fk_parent_surveyid_ref_surveys,responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                               |   false
  parent_shortcode    | VARCHAR     |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                                                   |   false
  surveyid            | UUID        |    false    | NULL           |                        | {responses_auto_index_fk_surveyid_ref_surveys,responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                      |   false
  shortcode           | VARCHAR     |    false    | NULL           |                        | {responses_shortcode_question_ref_response_clusterid_timestamp_idx,responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                 |   false
  flowid              | INT8        |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                                                   |   false
  userid              | VARCHAR     |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx,primary,responses_auto_index_fk_surveyid_ref_surveys,responses_metadata_idx,responses_shortcode_question_ref_response_clusterid_timestamp_idx,responses_auto_index_fk_parent_surveyid_ref_surveys} |   false
  question_ref        | VARCHAR     |    false    | NULL           |                        | {primary,responses_auto_index_fk_parent_surveyid_ref_surveys,responses_auto_index_fk_surveyid_ref_surveys,responses_metadata_idx,responses_surveyid_userid_timestamp_question_ref_idx,responses_shortcode_question_ref_response_clusterid_timestamp_idx} |   false
  question_idx        | INT8        |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                                                   |   false
  question_text       | VARCHAR     |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                                                   |   false
  response            | VARCHAR     |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx,responses_shortcode_question_ref_response_clusterid_timestamp_idx}                                                                                                                                 |   false
  seed                | INT8        |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                                                   |   false
  timestamp           | TIMESTAMPTZ |    false    | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx,primary,responses_auto_index_fk_parent_surveyid_ref_surveys,responses_auto_index_fk_surveyid_ref_surveys,responses_shortcode_question_ref_response_clusterid_timestamp_idx,responses_metadata_idx} |   false
  metadata            | JSONB       |    true     | NULL           |                        | {responses_metadata_idx,responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                            |   false
  pageid              | VARCHAR     |    true     | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                                                   |   false
  clusterid           | VARCHAR     |    true     | NULL           | metadata->>'clusterid' | {responses_shortcode_question_ref_response_clusterid_timestamp_idx,responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                 |   false
  translated_response | VARCHAR     |    true     | NULL           |                        | {responses_surveyid_userid_timestamp_question_ref_idx}                                                                                                                                                                                                   |   false
(16 rows)

Time: 91.694ms
root@:26257/chatroach> \d surveys_metadata;
  column_name |  data_type  | is_nullable | column_default | generation_expression |  indices  | is_hidden
--------------+-------------+-------------+----------------+-----------------------+-----------+------------
  surveyid    | UUID        |    false    | NULL           |                       | {primary} |   false
  off_date    | TIMESTAMPTZ |    false    | NULL           |                       | {}        |   false
(2 rows)

Time: 76.302ms

@calufa calufa requested a review from nandanrao January 17, 2022 03:29
@@ -0,0 +1,5 @@
CREATE TABLE chatroach.surveys_metadata (
surveyid UUID NOT NULL REFERENCES chatroach.surveys(id) ON DELETE CASCADE,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - this is where our naming convention problems come in. It shouldn't be surveyid (which is really a "form"), it should be shortcode.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@calufa - any movement on this? This is a problem, because the off time shouldn't be unique to each "survey", which is really a "version of a survey", correct?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my mind, off time applies to a shortcode and pageid. Pageid gets you to survey user. And shortcode + survey user should be unique per metadata.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@calufa - just following up on this! Any thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants