Upsert with None values fails on "Invalid literal value: None" #2426

@mdwint

Apache Iceberg version

main (development)

Please describe the bug 🐞

Upserting a table fails when the input dataframe contains None in a join column.

I've reproduced this error by editing test_upsert_with_nulls from #1861, adding this at the end:

# upsert table with null value
data_with_null = pa.Table.from_pylist(
    [
        {"foo": None, "bar": 1, "baz": False},
    ],
    schema=schema,
)
upd = table.upsert(data_with_null, join_cols=["foo"])

Because the join column foo contains None, create_match_filter passes None into an In expression, and converting it to a literal raises TypeError: Invalid literal value: None.

tests/table/test_upsert.py:720: in test_upsert_with_nulls
    upd = table.upsert(data_with_null, join_cols=["foo"])
pyiceberg/table/__init__.py:1343: in upsert
    return tx.upsert(
pyiceberg/table/__init__.py:798: in upsert
    matched_predicate = upsert_util.create_match_filter(df, join_cols)
pyiceberg/table/upsert_util.py:37: in create_match_filter
    return In(join_cols[0], unique_keys[0].to_pylist())
pyiceberg/expressions/__init__.py:682: in __new__
    literals_set: Set[Literal[L]] = _to_literal_set(literals)
pyiceberg/expressions/__init__.py:52: in _to_literal_set
    return {_to_literal(v) for v in values}
pyiceberg/expressions/__init__.py:52: in <setcomp>
    return {_to_literal(v) for v in values}
pyiceberg/expressions/__init__.py:59: in _to_literal
    return literal(value)
pyiceberg/expressions/literals.py:159: in literal
    raise TypeError(f"Invalid literal value: {repr(value)}")
E   TypeError: Invalid literal value: None
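
For discussion, here is a minimal sketch of one way the single-column branch shown in the traceback could avoid handing None to In: split the key values into non-null values and a "saw a null" flag, then combine In with IsNull. The helper names split_nulls and null_safe_match_filter are hypothetical and not part of pyiceberg; In, IsNull, Or and AlwaysFalse are the existing classes in pyiceberg.expressions. Whether a null key in the input should match null rows in the table at all is a semantics question this sketch does not settle.

```python
from typing import Any, Iterable, List, Tuple


def split_nulls(values: Iterable[Any]) -> Tuple[List[Any], bool]:
    """Separate None from the other key values (hypothetical helper)."""
    non_null: List[Any] = []
    has_null = False
    for value in values:
        if value is None:
            has_null = True
        else:
            non_null.append(value)
    return non_null, has_null


def null_safe_match_filter(column: str, values: Iterable[Any]):
    """Build a filter for one join column without passing None to In.

    Hypothetical sketch, not the actual pyiceberg implementation.
    """
    # Imported lazily so split_nulls can be used without pyiceberg installed.
    from pyiceberg.expressions import AlwaysFalse, In, IsNull, Or

    non_null, has_null = split_nulls(values)
    value_filter = In(column, non_null) if non_null else AlwaysFalse()
    if not has_null:
        return value_filter
    if not non_null:
        # Only null keys were supplied.
        return IsNull(column)
    # Match either one of the concrete values or a null in the join column.
    return Or(value_filter, IsNull(column))
```

With the data from the reproduction above, null_safe_match_filter("foo", [None]) would return IsNull("foo") instead of raising. The multi-column case would need the same treatment per column and is not covered here.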

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
