Description
When performing a pipe mode join on inputs sourced via robot scan, error values caused by problems sourcing an input can be silently swallowed, leaving the user with a debugging challenge.
Details
Repro is with super commit ed23b7a.
We'll start with this working join example with the inputs specified via constants.
$ super -version
Version: v0.2.0-13-ged23b7a48
$ echo '{n:1, word:"one"} {n:2, word:"two"}' > a.sup &&
echo '{i:1, upperword:"ONE"} {i:2, upperword:"TWO"}' > b.sup &&
super -c "
const leftsource = 'a.sup'
const rightsource = 'b.sup'
from f'{leftsource}'
| join (
from f'{rightsource}'
) on left.n=right.i
"
{left:{n:1,word:"one"},right:{i:1,upperword:"ONE"}}
{left:{n:2,word:"two"},right:{i:2,upperword:"TWO"}}
But let's say there was a typo when specifying one of the inputs via f-string reference, e.g., dropping the e from rightsource. Now there's no output at all.
$ super -c "
const leftsource = 'a.sup'
const rightsource = 'b.sup'
from f'{leftsource}'
| join (
from f'{rightsourc}'
) on left.n=right.i
"
[no output]
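If we're hip to what's going on and start running subsets of the query in isolation, we can see that an error value was generated: the misspelled rightsourc is not a defined constant, so the f-string apparently evaluates to error("missing"), which from then rejects as a non-string input. But it looks like that error value was (understandably, I guess?) treated as a non-match in the join predicate, hence working as designed by showing no output.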
$ super -c "
const rightsource = 'b.sup'
from f'{rightsourc}'"
error({message:"from encountered non-string input",on:error("missing")})
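As a stopgap, the "run each input in isolation" check above could be scripted. The following is only a rough sketch: fail_on_error_values is a hypothetical helper, not part of super, and it assumes SUP text output with one value per line (as in the outputs shown here), so it only notices top-level error values and not errors nested inside records.

```shell
# Hypothetical helper (not part of super): reads SUP text on stdin, echoes it
# unchanged, and returns non-zero if any line is a top-level error value.
# Assumption: one value per line, as in the outputs shown in this issue.
fail_on_error_values() {
  awk '
    { print }                      # pass every line through untouched
    /^error\(/ { found = 1 }       # remember that we saw a top-level error
    END { exit found ? 1 : 0 }     # non-zero exit status if any error was seen
  '
}

# Usage sketch (requires super; mirrors the isolated subquery above):
#   super -c "
#   const rightsource = 'b.sup'
#   from f'{rightsourc}'" | fail_on_error_values \
#     || echo "join input produced error values" >&2
```

Running each join input through a check like this before the join at least turns the silent empty result into a visible failure, though it obviously doesn't address the underlying behavior.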
However, in this era where SuperDB does type checking, users may expect the tooling to catch these kinds of mistakes, much like it does when a typo references a field name that doesn't exist in the input.
I'm honestly not sure what I'd request here as a user. I just know silence isn't great. Maybe error values should always match in join predicates so they'll be visible further along in the query?
Thinking this through, if this is how it has to remain, I guess a defensive user could start putting guards like this in all their join predicates:
$ super -c "
const leftsource = 'a.sup'
const rightsource = 'b.sup'
from f'{leftsource}'
| join (
from f'{rightsourc}'
) on left.n=right.i or has_error(left) or has_error(right)
"
{left:{n:1,word:"one"},right:error({message:"from encountered non-string input",on:error("missing")})}
{left:{n:2,word:"two"},right:error({message:"from encountered non-string input",on:error("missing")})}