Implementation for the paper (submitted to IEEE Transactions on Affective Computing).
VISTANet: VIsual Spoken Textual Additive Net for Interpretable Multimodal Emotion Recognition
Puneet Kumar, Sarthak Malik, Balasubramanian Raman, and Xiaobai Li
The code files are currently being prepared for upload. They will be made publicly available soon.
Access to the 'IIT Roorkee Multimodal Emotion Recognition (IIT-R MMEmoRec) dataset' can be obtained through Access Form - IIT-R MMEmoRec Dataset.pdf. The dataset was prepared by Puneet Kumar and Sarthak Malik at the Machine Intelligence Lab, IIT Roorkee, under the supervision of Prof. Balasubramanian Raman. It contains speech utterances, corresponding images, and emotion labels (happy, sad, hate, anger).