OCEL2 not correctly imported #19

@Kena-Njonge

Hello,

I would like to report a bug that occurs when importing an OCEL2 log.
To rule out a problem in the OCEL2 file I was working with, I reproduced it with the procure2pay dataset from ocelot.pm.

The following section of ocpa\objects\log\importer\ocel2\sqlite\versions\import_ocel2_sqlite.py is problematic:

    # Read each dynamically generated event_[ocel_type_map] table, create a DataFrame and store them in a list
    for event_type in event_types['ocel_type_map']:
        table_name = f"event_{event_type}"
        event_type_df = pd.read_sql(f'SELECT * FROM {table_name}', connection)

        # Rename the columns to avoid conflicts while merging and add the 'event_' prefix
        event_type_df.rename(columns=lambda col: f'event_{col}' if col != 'ocel_id' else 'event_id', inplace=True)
        event_type_df.rename(columns={f'event_{event_type}_ocel_time': f'event_timestamp_{event_type}'}, inplace=True)

        # Merge the event_df with the event_type_df
        event_df = event_df.merge(event_type_df, on='event_id', how='left')

This code requires event attribute names to be unique across event types. If, for example, three event types share the same column names, the first conflicting merge renames the clashing attribute columns to event_attribute_x and event_attribute_y. A later merge then fails, because pandas refuses to create columns with the _x and _y suffixes a second time.
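A minimal sketch of the collision, independent of ocpa. The tables, the `event_amount` column, and the type names are made up for illustration. Four tables share one attribute name and are merged in a loop like the importer's. With these toy tables the clash appears on the fourth merge; the exact point in a real log depends on which columns `event_df` already holds. Depending on the pandas version, that merge either raises an error or silently produces duplicate column names.

```python
import pandas as pd

# Hypothetical per-type tables that all share the attribute column "event_amount".
event_df = pd.DataFrame({"event_id": ["e1", "e2", "e3", "e4"]})
type_tables = {
    "create": pd.DataFrame({"event_id": ["e1"], "event_amount": [1.0]}),
    "approve": pd.DataFrame({"event_id": ["e2"], "event_amount": [2.0]}),
    "pay": pd.DataFrame({"event_id": ["e3"], "event_amount": [3.0]}),
    "close": pd.DataFrame({"event_id": ["e4"], "event_amount": [4.0]}),
}

merged = event_df
columns_after_each_merge = []
merge_error = None

for event_type, table in type_tables.items():
    try:
        merged = merged.merge(table, on="event_id", how="left")
    except Exception as exc:
        # Recent pandas versions reject suffixes that would duplicate a column.
        merge_error = exc
        break
    columns_after_each_merge.append(list(merged.columns))

for step, columns in enumerate(columns_after_each_merge, start=1):
    print(f"after merge {step}: {columns}")
print("merge error:", merge_error)
```

The second merge is where the `_x` and `_y` suffixes first appear; once they exist, a later clash on the same attribute name cannot be resolved with the default suffixes.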

Suggested fix

I thought an easy fix would be to include the event type in the column prefix:
    event_type_df.rename(
        columns=lambda col: f"event_{event_type}_{col}" if col != "ocel_id" else "event_id",
        inplace=True
    )
    event_type_df.rename(columns={f'event_{event_type}_ocel_time': f'event_timestamp_{event_type}'}, inplace=True)
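The effect of the type-specific prefix can be checked on toy data. This is only a sketch: `prefix_columns` is a hypothetical helper that applies the same two renames, and the tables and the `amount` column are invented. It shows that the merged columns stay unique, not that the rest of the import handles the new column names correctly.

```python
import pandas as pd


def prefix_columns(table: pd.DataFrame, event_type: str) -> pd.DataFrame:
    """Hypothetical helper applying the two renames from the suggested fix."""
    renamed = table.rename(
        columns=lambda col: f"event_{event_type}_{col}" if col != "ocel_id" else "event_id"
    )
    return renamed.rename(
        columns={f"event_{event_type}_ocel_time": f"event_timestamp_{event_type}"}
    )


# Made-up per-type tables with identical column names.
type_tables = {
    "create": pd.DataFrame({"ocel_id": ["e1"], "ocel_time": ["2024-01-01"], "amount": [1.0]}),
    "approve": pd.DataFrame({"ocel_id": ["e2"], "ocel_time": ["2024-01-02"], "amount": [2.0]}),
    "pay": pd.DataFrame({"ocel_id": ["e3"], "ocel_time": ["2024-01-03"], "amount": [3.0]}),
    "close": pd.DataFrame({"ocel_id": ["e4"], "ocel_time": ["2024-01-04"], "amount": [4.0]}),
}

merged = pd.DataFrame({"event_id": ["e1", "e2", "e3", "e4"]})
for event_type, table in type_tables.items():
    merged = merged.merge(prefix_columns(table, event_type), on="event_id", how="left")

print(list(merged.columns))
```

Every attribute column now carries its event type, so no merge needs the `_x` and `_y` suffixes. The trade-off is that downstream code expecting the plain `event_` prefix sees different column names.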

However, there is another issue in the import, which this fix may have caused. It shows up later, in ocpa\objects\log\converter\versions\df_to_ocel.py:

#logging.debug(_sample_dict(3, objects))

This line fails. After I commented it out, the import works, but when I check the variants of the log I only get the array [1.0], and generating an OCPN does not work, so something still seems to be wrong. That is as far as I got in debugging; I hope it helps.

Cheers
Kena
