Cannot append fields of type "dense_vector" to an existing index #659

@walkingmug

Description:
When appending a pandas DataFrame that has a "dense_vector" column to an existing Elasticsearch index whose mapping already uses the same field type, ed.pandas_to_eland raises a TypeError.

Reproduction:

  1. Install requirements:
    pip install elasticsearch eland pandas numpy
  2. Imports:
from elasticsearch import Elasticsearch
import eland as ed
import pandas as pd
import numpy as np
  3. Connect to Elasticsearch:
client = Elasticsearch(HOST, timeout=120)
  4. Create vector dataframes:
vector1 = np.random.rand(512)
vector2 = np.random.rand(512)
df_1 = pd.DataFrame({
    'vector_column': [vector1, vector2]
})

vector3 = np.random.rand(512)
vector4 = np.random.rand(512)
df_2 = pd.DataFrame({
    'vector_column': [vector3, vector4]
})
  5. ✅ Upload first dataframe:
# upload df_1 to elasticsearch
ed.pandas_to_eland(
  pd_df=df_1,
  es_client=client,
  es_dest_index='test-upload',
  es_if_exists="append",
  es_refresh=True,
  es_type_overrides={
      "vector_column": {
          "type": "dense_vector",
          "dims": 512,
          "index": True,
          "similarity": "cosine"
      },
  },
  chunksize=100
)
  6. ❌ Append second dataframe to the same index:
# upload df_2 to elasticsearch
ed.pandas_to_eland(
  pd_df=df_2,
  es_client=client,
  es_dest_index='test-upload',
  es_if_exists="append",
  es_refresh=True,
  es_type_overrides={
      "vector_column": {
          "type": "dense_vector",
          "dims": 512,
          "index": True,
          "similarity": "cosine"
      },
  },
  chunksize=100
)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-16-b0e5aa8d561e> in <cell line: 2>()
      1 # upload df_2 to elasticsearch
----> 2 ed.pandas_to_eland(
      3   pd_df=df_2,
      4   es_client=client,
      5   es_dest_index='test-upload',

1 frames
/usr/local/lib/python3.10/dist-packages/eland/field_mappings.py in verify_mapping_compatibility(ed_mapping, es_mapping, es_type_overrides)
    919         key_type = es_type_overrides.get(key, key_def["type"])
    920         es_key_type = es_props[key]["type"]
--> 921         if key_type != es_key_type and es_key_type not in ES_COMPATIBLE_TYPES.get(
    922             key_type, ()
    923         ):

TypeError: unhashable type: 'dict'
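
Likely cause, judging only from the three lines shown in the traceback: when an override is given as a dict (as it has to be for "dense_vector"), `es_type_overrides.get(key, ...)` returns the whole dict, which is then used as a key in `ES_COMPATIBLE_TYPES.get(...)`. The sketch below reproduces that without a cluster. The contents of `ES_COMPATIBLE_TYPES`, the `es_props` mapping and both function names are stand-ins I made up for illustration, and `check_normalized` is only one possible way to handle dict overrides, not eland's actual fix.

```python
# Stand-in for eland's ES_COMPATIBLE_TYPES (real contents are an assumption).
ES_COMPATIBLE_TYPES = {"double": ("float", "half_float")}

# Same override as in the reproduction steps.
es_type_overrides = {
    "vector_column": {
        "type": "dense_vector",
        "dims": 512,
        "index": True,
        "similarity": "cosine",
    },
}

# Hypothetical mapping of the existing index after the first upload.
es_props = {"vector_column": {"type": "dense_vector", "dims": 512}}


def check_as_in_traceback(key, default_type):
    """Mirror of lines 919-923 in the traceback; True means 'incompatible'."""
    key_type = es_type_overrides.get(key, default_type)  # a dict here
    es_key_type = es_props[key]["type"]
    # dict != str is True, so the right-hand side runs and hashes the dict.
    return key_type != es_key_type and es_key_type not in ES_COMPATIBLE_TYPES.get(
        key_type, ()
    )


def check_normalized(key, default_type):
    """Same check, but reduce a dict override to its "type" string first."""
    key_type = es_type_overrides.get(key, default_type)
    if isinstance(key_type, dict):
        key_type = key_type["type"]
    es_key_type = es_props[key]["type"]
    return key_type != es_key_type and es_key_type not in ES_COMPATIBLE_TYPES.get(
        key_type, ()
    )


if __name__ == "__main__":
    try:
        check_as_in_traceback("vector_column", "object")
    except TypeError as exc:
        print("original check failed:", exc)
    print("normalized check reports incompatible:",
          check_normalized("vector_column", "object"))
```

If this reading is right, the first upload works only because the index does not exist yet, so the compatibility check is never reached.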

Labels: bug, topic:dataframe
