You need a virtual environment to install the libraries required to run our models. A virtual environment can be created with either 'conda' or 'venv'. If your system has an older version of Python, use conda; otherwise, use venv.
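Not sure which version you have? You can check the interpreter that venv would use with:

```shell
# prints the system's default Python 3 version
python3 --version
```

If this reports anything older than the version pinned below, go with the conda option instead.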
# Option 1: create a virtual environment using conda
conda create --name myenv python=3.11.5
# replace myenv with your desired env name

# Option 2: create a virtual environment using venv
python3 -m venv myenv
# replace myenv with your desired env name

After creating your virtual environment, activate it to install libraries inside that env.
Note: If you used Option 1 to create the virtual env, activate it using -
conda activate myenv
Note: If you used Option 2 to create the virtual env, navigate to the dir where the virtual environment was created and activate it using -
source myenv/bin/activate
A requirements.txt file is included with this repo, listing all the required libraries. Install them using -

pip3 install -r requirements.txt

Note: If any of the libraries fail to install, recheck your Python version as shown in step 1.
To replicate our results, you only need the UT Austin Egocentric Data, consisting of three video datasets. Each video dataset contains 80,000 samples of 64x64-resolution images. Download the datasets from the links provided below -
Train Dataset (Video 1)
Train Dataset (Video 2)
Train Dataset (Video 3)
These instructions will walk you through creating a tarball of the images and uploading them to a server. This is an alternative to uploading zipped folders of images because .zip support is not guaranteed on servers.
Navigate to the directory containing your image folder (here, my-images).
cd /path/to/your/images
If you're working on a Mac, prevent tar from including AppleDouble files:
export COPYFILE_DISABLE=1
Create a .tar.gz file (tarball)
tar -czvf my-images.tar.gz my-images/
where my-images is the folder containing the images and my-images.tar.gz is the filename of the tarball
Upload the tarball to the server
scp my-images.tar.gz username@your.server.com:/path/on/server/
On the server, extract the tarball using:
tar -xzvf my-images.tar.gz
# then remove the tarball once you're done:
rm my-images.tar.gz
ShapeBasedVision_HumanAdult
├── datamodules: directory containing Python code to set up the datasets and dataloaders to train the model
│   ├── image_pairs.py - creates the temporal_window-based dataloader to train the model
│   └── transforms.py - custom data augmentation class
├── models: directory containing Python code for the model architecture
│   └── vit_contrastive.py - contains the ViT-CoT architecture and training functions
├── notebooks: Jupyter notebook files used for creating the RDMs and bar charts
│   ├── RDMs.ipynb
│   ├── graphs.ipynb
│   └── tsne.ipynb
├── media: directory containing images and videos for the readme
├── requirements.txt: text file containing all the required libraries for this project
├── train_ViTCoT.py: Python script initializing the model, dataloader, and trainer to train the model
└── train_ViTCoT.sh: bash script initializing arguments and starting model training
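The temporal_window-based pairing used by image_pairs.py can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation; the class and parameter names here are assumptions:

```python
import random

class TemporalPairSampler:
    """Samples (anchor, positive) index pairs from an ordered frame sequence.

    Frames within `temporal_window` steps of each other are treated as
    positives, mirroring a time-based contrastive training setup.
    """

    def __init__(self, num_frames, temporal_window=3, seed=0):
        self.num_frames = num_frames
        self.temporal_window = temporal_window
        self.rng = random.Random(seed)

    def sample_pair(self):
        # Pick an anchor frame uniformly at random.
        anchor = self.rng.randrange(self.num_frames)
        # Pick a positive within the temporal window, clipped to a valid range.
        low = max(0, anchor - self.temporal_window)
        high = min(self.num_frames - 1, anchor + self.temporal_window)
        positive = self.rng.randrange(low, high + 1)
        return anchor, positive

sampler = TemporalPairSampler(num_frames=80_000, temporal_window=3)
anchor_idx, positive_idx = sampler.sample_pair()
```

Enlarging temporal_window makes more temporally distant frames count as positives, which is one of the arguments exposed in train_ViTCoT.sh.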
Enter the project directory and make sure that your virtual environment is active.
Type the following command to get familiar with the type of arguments required when training a model -
# make sure you are in the dir where this file is located
python3 train_ViTCoT.py --help

Note: If you don't see a list of arguments on your terminal and instead encounter an error, it means that 'Environment Setup' was not successful. Raise a GitHub issue and we will help you resolve it at our earliest availability.
Open and modify train_ViTCoT.sh to configure your experiment. You can change the number of GPUs, the temporal window size, the architecture size, the experiment name, shuffling, and many other arguments in this file before training the model.
Example -
Note: Once training starts, carefully check the terminal for display statements with '[INFO]' tags. These tags indicate important training information, such as the arguments you passed. They are highlighted below in red -
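For reference, the kind of invocation train_ViTCoT.sh wraps might look like the following. The flag names here are illustrative assumptions only; run python3 train_ViTCoT.py --help for the authoritative list:

```shell
#!/bin/bash
# Illustrative sketch -- flag names are assumptions; check --help for the real ones.
python3 train_ViTCoT.py \
    --exp_name "vitcot_baseline" \
    --gpus 2 \
    --temporal_window 3 \
    --architecture "vit_small" \
    --shuffle True
```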
After training the models using the steps shown above, we freeze the models and evaluate their feature space. For testing these models, download this dataset -
Test Dataset
We provide three different scripts inside the 'notebooks' directory to evaluate the models:
- RDMs.ipynb - This Jupyter notebook plots the raw features from the encoder in the form of a matrix.
- graphs.ipynb - This Jupyter notebook plots the shape/color sensitivity scores and standard errors from the RDMs generated using the above notebook.
- tsne.ipynb - This Jupyter notebook plots a 2D representation of the encoder's feature space.
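Conceptually, an RDM compares frozen encoder features pairwise across stimuli. A minimal sketch follows, using 1 - Pearson correlation as the dissimilarity; the exact metric is an assumption here, and the notebooks define the one actually used:

```python
import numpy as np

def compute_rdm(features):
    """Build a representational dissimilarity matrix (RDM).

    features: (n_stimuli, n_dims) array of frozen encoder outputs.
    Returns an (n_stimuli, n_stimuli) matrix of 1 - Pearson correlation,
    so identical representations get 0 and uncorrelated ones get 1.
    """
    # np.corrcoef treats each row as a variable, giving pairwise correlations.
    corr = np.corrcoef(features)
    return 1.0 - corr

# Example with random stand-in features for 8 stimuli.
features = np.random.default_rng(0).normal(size=(8, 64))
rdm = compute_rdm(features)
```

The resulting matrix is symmetric with a zero diagonal, which is what the notebook visualizes as a heatmap.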
If you are using VS Code’s Remote SSH to open your project folder on the server, you do not need to set up port forwarding or use a browser for Jupyter.
Instead:

1. Open your project on the server with VS Code Remote SSH.

2. Activate your desired conda environment in the VS Code terminal:

conda activate myenv

3. Install ipykernel in that environment (if not already installed):

conda install ipykernel

4. Open your .ipynb notebook file in VS Code and select your conda environment as the kernel.

Tip: If your environment doesn’t appear, try quitting and restarting VS Code after installing ipykernel.
Assuming that you want to run the Jupyter notebooks on a Linux server, open a terminal and connect to the server with port forwarding. You can use any available port above 1024 (here, we use 16008 as an example):
ssh -L 16008:127.0.0.1:16008 username@<server_name>.luddy.indiana.edu

After successful login, navigate to your project directory and start Jupyter Notebook on the same port you chose above:

jupyter notebook --port=16008 --no-browser

Note: The port number should match the port number in step 1.
Copy the URL from the terminal window as shown in the screenshot below and open it in your browser (e.g., http://localhost:16008/?token=...):
Follow the steps given in the notebooks to create RDMs, t-SNEs, and graphs as shown in the paper.
If you use our code, please consider citing our paper -
In Progress!!!


