OCEL2 not correctly imported #19
Description
Hello,
I would like to report the following bug when importing an OCEL2 log.
To make sure that it wasn't a bug in the OCEL2 log that I was working with, I used the procure2pay dataset from ocelot.pm.
The following section in ocpa\objects\log\importer\ocel2\sqlite\versions\import_ocel2_sqlite.py is problematic:
    # Read each dynamically generated event_[ocel_type_map] table, create a DataFrame and store them in a list
    for event_type in event_types['ocel_type_map']:
        table_name = f"event_{event_type}"
        event_type_df = pd.read_sql(f'SELECT * FROM {table_name}', connection)

        # Rename the columns to avoid conflicts while merging and add the 'event_' prefix
        event_type_df.rename(columns=lambda col: f'event_{col}' if col != 'ocel_id' else 'event_id', inplace=True)
        event_type_df.rename(columns={f'event_{event_type}_ocel_time': f'event_timestamp_{event_type}'}, inplace=True)

        # Merge the event_df with the event_type_df
        event_df = event_df.merge(event_type_df, on='event_id', how='left')
This section requires event attributes to have unique names across event types. Otherwise, if you have 3 event types with the same columns, the first conflicting merge renames the attribute columns to event_attribute_x and event_attribute_y, and after this pandas will not allow a later merge to create columns that already carry the _x and _y suffixes.
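The suffixing behaviour can be reproduced outside ocpa with a few toy DataFrames. The following is a minimal sketch, not the importer's code: the table contents and the `event_cost` attribute are made up for illustration. It only shows how two per-type tables sharing an attribute name end up as `_x`/`_y` columns. What happens on further merges (an error or a warning) appears to depend on the pandas version.

```python
import pandas as pd

# Hypothetical stand-ins for the generic event table and two per-type tables.
# Both per-type tables carry an attribute with the same (already prefixed) name.
event_df = pd.DataFrame({"event_id": ["e1", "e2"]})
type_a = pd.DataFrame({"event_id": ["e1"], "event_cost": [10]})
type_b = pd.DataFrame({"event_id": ["e2"], "event_cost": [20]})

# First merge: no clash yet, 'event_cost' is simply added.
merged = event_df.merge(type_a, on="event_id", how="left")

# Second merge: 'event_cost' exists on both sides, so pandas applies
# its default suffixes '_x' (left) and '_y' (right).
merged = merged.merge(type_b, on="event_id", how="left")

merged_columns = list(merged.columns)
print(merged_columns)
```

Once the `_x`/`_y` columns exist, any later merge that would generate the same suffixed names again collides with them, which matches the failure described above.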
Suggested fix.
I thought that an easy fix would be the following:

    event_type_df.rename(
        columns=lambda col: f"event_{event_type}_{col}" if col != "ocel_id" else "event_id",
        inplace=True
    )
    event_type_df.rename(columns={f'event_{event_type}_ocel_time': f'event_timestamp_{event_type}'}, inplace=True)

However, there is some other issue in the import (which this fix may have caused), namely later in ocpa\objects\log\converter\versions\df_to_ocel.py:
#logging.debug(_sample_dict(3, objects))
This line fails. After I commented it out, the import works, but when I check the variants of the log I only get the array [1.0], and generating an OCPN doesn't work, so something still seems to be wrong. That's as far as I got in debugging; I hope it helps.
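For anyone who wants to try the renaming from the suggested fix in isolation, here is a small sketch on in-memory DataFrames. The helper `prefix_event_columns`, the event type names and the `cost` attribute are hypothetical and only serve the example. It assumes each per-type table has `ocel_id` and `ocel_time` columns, as in the snippet above.

```python
import pandas as pd


def prefix_event_columns(event_type_df, event_type):
    """Hypothetical helper: namespace every column by its event type."""
    renamed = event_type_df.rename(
        columns=lambda col: "event_id" if col == "ocel_id" else f"event_{event_type}_{col}"
    )
    # Same second rename as in the suggested fix: give the timestamp its own name.
    return renamed.rename(
        columns={f"event_{event_type}_ocel_time": f"event_timestamp_{event_type}"}
    )


# Made-up tables for three event types that share the same attribute names.
event_df = pd.DataFrame({"event_id": ["e1", "e2", "e3"]})
tables = {
    "create": pd.DataFrame({"ocel_id": ["e1"], "ocel_time": ["2023-01-01"], "cost": [10]}),
    "approve": pd.DataFrame({"ocel_id": ["e2"], "ocel_time": ["2023-01-02"], "cost": [20]}),
    "pay": pd.DataFrame({"ocel_id": ["e3"], "ocel_time": ["2023-01-03"], "cost": [30]}),
}

for event_type, event_type_df in tables.items():
    event_df = event_df.merge(
        prefix_event_columns(event_type_df, event_type), on="event_id", how="left"
    )

columns = list(event_df.columns)
print(columns)
```

With the event type in every column name, the three merges never produce `_x`/`_y` suffixes. The resulting names differ from what the original code produced, so any downstream code that relies on the old names would probably need to be adjusted as well.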
Cheers
Kena