📣 Introduction 📣

Sentiment Analysis and Emotion Classification in Comics using LLMs: This ongoing project addresses sentiment analysis and emotion classification in comics using large langauge models (LLMs). We reformulate emotion classification in comics as a text generation task where the LLM is prompted to generate the emotion label for utterance(s). We implement emotion classification as zero-shot classification (ZSC), in-context learning (ICL) and fine-tuning (FT).

📂 Repository Structure

This repository is organized as follows:

bert: this directory contains the implementation of BERT finetuning.
data: this directory contains the datasets used for the experiments.
emory_nlp: this directory contains the implementation for the Emory NLP dataset.
finetuning: this directory contains the implementation of LLM finetuning for comics.
incontext_learning: this directory contains the implementation of in-context learning with LLMs.
meld: this directory contains the implementation for the MELD dataset.
zeroshot: this directory contains the implementation of zero-shot classification for comics using LLMs.

.
├── bert
│   ├── notebooks
├── data
├── emory_nlp
│   ├── model_args
│   ├── notebooks
│   ├── saved_models
│   ├── scripts
├── finetuning
│   ├── saved_models
│   ├── model_args
│   ├── notebooks
│   ├── scripts
│   └── utils
├── incontext_learning
│   ├── notebooks
│   ├── results
│   ├── scripts
│   └── utils
├── meld
│   ├── model_args
│   ├── notebooks
│   ├── saved_models
│   ├── scripts
└── zeroshot
    ├── notebooks
    ├── results
    ├── scripts
    └── utils

⛓️ Models

We experiment with the following models:

BERT -- Google BERT
LLaMA -- LLaMA-3-8B-Instruct, LLaMA-3-70B-Instruct, LLaMA-3.1-8B-Instruct, LLaMA-3.1-70B-Instruct, LLaMA-3.2-1B-Instruct, LLaMA-3.2-3B-Instruct -- Meta AI
Gemma -- Gemma-2-2B-it, Gemma-2-9B-it -- Google
Qwen -- Qwen-2-7B-Instruct, Qwen-2.5-0.5B-Instruct, Qwen-2.5-1.5B-Instruct, Qwen-2.5-7B-Instruct -- Qwen
Mistral-7B-Instruct -- Mistral AI
Phi-3-mini-instruct -- Microsoft
Falcon-7b-instruct -- TII Falcon

🧮 Data

We experiment with three datasets:

Comics: Comics dataset consists of 35 annotated Comics titles. We use the Eckman emotions model which consists of six bases emotions: Anger (AN), Disgust (DI), Fear (FE), Sadness (SA), Surprise (SU), Joy (JO) and Neutral. The 35 titles consist of 7,129 annotated utterances. Of these, the train set comprises of 5803 utterances and the test set of 1326 utterances.
EmoryNLP: EmoryNLP dataset consists of 97 episodes, 897 scenes, and 12,606 utterances based on the popular TV show called Friends, where each utterance is annotated with one of the seven emotions borrowed from the six primary emotions in the Willcox (1982)’s feeling wheel, sad, mad, scared, powerful, peaceful, joyful, and a default emotion of neutral.
MELD: Multimodal EmotionLines Dataset (MELD) consists of more than 1400 dialogues and 13000 utterances from Friends TV series.

📚 Context Configurations

We finetune LLMs for the Comics dataset on three context levels:

Utterance level classification: Every raw utterance in the comics titles is classified into one or more of the emotion classes, with no additiona context given.
Page level classification: Every raw utterance in the comics titles is classified into one or more of the emotion classes, with additional context on the page level provided as input to the LLM.
Title level classification: Every raw utterance in the comics titles is classified into one or more of the emotion classes, with additional context on the page complete book level provided as input to the LLM.

🎛️ Modalities

We use LLMs for three classification tasks:

Zero-Shot Classification (ZSC): Zero-shot classification is a Deep Learning technique where the pre-trained model is used off the shelf (i.e. witout any further training) for inference on completely unseen data samples.
In-Context Learning (ICL): In-Context Learning is a Deep Learning technique where a model is guided for accurate inference with the help of a few solved demonstrations added in the model's input prompt.
Fine-Tuning (FT): Fine-tuning involves further training of a pre-trained model on a downstream dataset. This helps general-purpose model training to be complemented with task specific supervised training.

⌨️ Prompts

For all three modalities, we experiment with different prompting techniques.

Zero-Shot Classification (ZSC): The prompt used for LLaMA and Qwen models is given below:

[{'role': 'system',
  'content': '### Task description: You are an expert sentiment analysis assistant that takes an utterance from a comic book and must classify the utterance into appropriate emotion class(s): anger, surprise, fear, disgust, sadness, joy, neutral. You must absolutely not generate any text or explanation other than the following JSON format {"utterance_emotion": <predicted emotion classes for the utterance (str)>}\n\n'},
{'role': 'user',
  'content': '# Utterance:\n {utterance} \n\n# Result:\n'}]

Fine-Tuning (FT): For fine-tuning, we used the template default for the respective model. In general, the prompt is in the {"instruction", "input", "output"} format given below:

{'instruction': '### Emotion Analysis Expert Role\n\nYou are an advanced emotion analysis expert specializing in comic book dialogue interpretation. Your task is to analyze utterances and identify their emotional content.\n\nINPUT:\n- You will receive a single utterance from a comic book\n- The utterance may express one or multiple emotions\n\nTASK:\n1. Carefully analyze the emotional context and tone of the utterance\n2. Identify applicable emotions from the following classes:\n   "Anger", "Disgust", "Fear", "Sadness", "Surprise", "Joy", "Neutral"\n\nOUTPUT REQUIREMENTS:\n- Format: JSON object with a single key "list_emotion_classes"\n- Value: Array of one or more emotion classes as strings\n- Example: {"list_emotion_classes": ["Anger", "Fear"]}\n\nIMPORTANT NOTES:\n- Do not include any explanations in the output, only the JSON object\n\n', 'input': '### Here is the utterance from a comic book: {utterance}', 'output': '{"list_emotion_classes": ["Fear", "Surprise"]}'}

In-Context Learning (ICL): For ICL, an instance of the prompt with 3 examples (k = 3) is given below:

### Task description: You are an expert sentiment analysis assistant that takes an utterance from a comic book and must classify the utterance into appropriate emotion class(s): anger, surprise, fear, disgust, sadness, joy, neutral. You are given one utterance to classify and 3 example utterances to help you. You must absolutely not generate any text or explanation other than the following JSON format: {"utterance_emotion": "<predicted emotion classes for the utterance (str)>}"

### Examples:

## Example 1
Utterance 1={uterrance_1}

# Result:
{"utterance_emotions": "{result_1}"}

## Example 2
Utterance 2={uterrance_2}

# Result:
{"utterance_emotions": "{result_1}"}

## Example 3
Utterance 3={uterrance_3}

# Result:
{"utterance_emotions": "{result_1}"}

# Utterance:
{utterance}

# Result:

📦 Requirements

We use the following versions of the packages:

torch==2.4.0
gradio==4.43.0
pydantic==2.9.0
LLaMA-Factory==0.9.0
transformers==4.44.2
bitsandbytes==0.43.1

For fine-tuning, you need to install LLaMA-Factory. Run the following command to install LLaMA-Factory and all the necessary dependencies and updates:

bash setup.sh

💻 Platform and Compute

For fine-tuning LLMs, we use LLaMA-Factory.
For model checkpoints, we use Unsloth.
We also use Hugging Face.

All experiments have been performed on the High Performance Cluster at La Rochelle Université.

Name		Name	Last commit message	Last commit date
Latest commit History 213 Commits
bert/notebooks		bert/notebooks
comics_FT		comics_FT
comics_ICL		comics_ICL
comics_ZSL		comics_ZSL
data		data
ddialogue		ddialogue
emory_nlp_FT		emory_nlp_FT
emory_nlp_ZSL		emory_nlp_ZSL
ft_native		ft_native
meld_FT		meld_FT
meld_ZSL		meld_ZSL
output_dirs_dd_ft		output_dirs_dd_ft
outputs_dir_tmp		outputs_dir_tmp
unilabel_comics_model		unilabel_comics_model
.gitignore		.gitignore
README.md		README.md
clip_encoder.ipynb		clip_encoder.ipynb
comics35_pg_image_icl.ipynb		comics35_pg_image_icl.ipynb
comics35_pg_image_vec_icl.ipynb		comics35_pg_image_vec_icl.ipynb
comics_narratives.ipynb		comics_narratives.ipynb
comics_unilabel_ft.ipynb		comics_unilabel_ft.ipynb
comics_unilabel_ft_pg.ipynb		comics_unilabel_ft_pg.ipynb
confusion_matrices.ipynb		confusion_matrices.ipynb
cpt_softprompt.ipynb		cpt_softprompt.ipynb
dataset_statistics.ipynb		dataset_statistics.ipynb
ft_native_emory.ipynb		ft_native_emory.ipynb
ft_w.ipynb		ft_w.ipynb
ft_w_DD.ipynb		ft_w_DD.ipynb
ft_w_context.ipynb		ft_w_context.ipynb
ft_w_context_pg.ipynb		ft_w_context_pg.ipynb
ft_w_context_pg_sum.ipynb		ft_w_context_pg_sum.ipynb
ft_w_spanish.ipynb		ft_w_spanish.ipynb
ft_w_summary.ipynb		ft_w_summary.ipynb
ft_w_vlm.ipynb		ft_w_vlm.ipynb
llava_generation.ipynb		llava_generation.ipynb
pt_softprompt.ipynb		pt_softprompt.ipynb
qwen_zs_comics.ipynb		qwen_zs_comics.ipynb
scene_description_llamavision.ipynb		scene_description_llamavision.ipynb
setup.sh		setup.sh
speakerid_ft.ipynb		speakerid_ft.ipynb
zs_pg_summary.ipynb		zs_pg_summary.ipynb
zs_w_spanish.ipynb		zs_w_spanish.ipynb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

📣 Introduction 📣

📂 Repository Structure

⛓️ Models

🧮 Data

📚 Context Configurations

🎛️ Modalities

⌨️ Prompts

📦 Requirements

💻 Platform and Compute

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

📣 Introduction 📣

📂 Repository Structure

⛓️ Models

🧮 Data

📚 Context Configurations

🎛️ Modalities

⌨️ Prompts

📦 Requirements

💻 Platform and Compute

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages