Our code builds on the original VL-T5/VL-BART codebase.
# Create python environment (optional)
conda create -n MQR python=3.7
source activate MQR
# Install python dependencies
pip install -r requirements.txt
# Download language evaluation tools
https://github.com/bckim92/language-evaluation
# Download T5/BART backbone checkpoint
python download_backbones.py
# Code structure
./VL-T5/
src/
    modeling_t5.py, modeling_bart.py                    <= VL-T5/VL-BART model classes
    pretrain.py, pretrain_data.py, pretrain_model.py    <= pretraining
    vqa.py, vqa_data.py, vqa_model.py ...               <= fine-tuning on downstream tasks (e.g., VQA, GQA, NLVR2)
    multitask.py, multitask_data.py, multitask_model.py <= multitask learning on 7 downstream tasks
    param.py                                            <= (argparse) configuration
    tokenization.py                                     <= custom tokenizer
    utils.py, dist_utils.py                             <= utility functions
snap/                                                   <= stored weight checkpoints
scripts/                                                <= bash scripts for pretraining and fine-tuning

The image files (anno_images) can be found in link.
The textual files (McQR_data) can be found in link.
Image feature extraction code can be found in ./feature_extraction. All the extracted image features can also be downloaded via link.
The original dataset file with image annotations can be found in link.
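As a quick sanity check after downloading, the textual files can be inspected with standard JSON tooling. The snippet below is only a sketch with made-up field names ("image_id", "query", "rewrite" are illustrative assumptions, not the actual McQR schema):

```python
# Hedged sketch: write and re-read a tiny JSON file shaped like a query
# rewriting dataset. Field names here are illustrative assumptions only.
import json
from pathlib import Path

sample = [
    {"image_id": "anno_0001",
     "query": "what color is it",
     "rewrite": "what color is the dog in the image"},
]
path = Path("mcqr_sample.json")
path.write_text(json.dumps(sample, indent=2))

records = json.loads(path.read_text())
print(len(records), records[0]["query"])  # → 1 what color is it
```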
We host model checkpoints and features via Google Drive. We recommend using gdrive to download them.
- Download snap/ from Google Drive:

gdrive download 1_SBj4sZ0gUqfBon1gFBiNRAmfHv5w_ph --recursive

First, replace generation_utils.py in the Hugging Face transformers package installed on your machine:
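If you are unsure where the installed transformers package lives, its parent site-packages directory can be located with the standard library (path discovery only; the destination name assumes a standard pip/conda install):

```python
# Locate the site-packages directory that contains the installed
# `transformers` package, to use as the destination for generation_utils.py.
import sysconfig

site_packages = sysconfig.get_paths()["purelib"]
print(f"{site_packages}/transformers")
```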
mv generation_utils.py [your path]/transformers/

Then start fine-tuning.
# Finetuning with 4 gpus
cd VL-T5/
bash scripts/QueryRewrite_VLT5.sh 4
bash scripts/QueryRewrite_VLBart.sh 4

Please cite our paper if you use the dataset and model in your work: