Video transformations for deep learning

- Works with both images and feature maps.
- Video input is simply a numpy array (no need for PIL).
- Supports both channels-first and channels-last layouts.

Prerequisites

- numpy >= 1.12
- cv2

Options

- Read
- Compose
- RandomCrop
- RandomCornerCrop
- CenterCrop
- FiveCrop
- TenCrop
- Resize
- RandomHorizontalFlip
- RandomVerticalFlip
- Montage
- ToTensor
- ToBatchTensor
- Normalize
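
To make the idea of array-based video transforms concrete, here is a minimal sketch of how a composed pipeline over plain numpy arrays might look. It is not this package's actual implementation or API: the classes below (including the simplified `Compose`, `CenterCrop`, and a deterministic `HorizontalFlip`), the `channels_first` parameter, and the assumed layouts `(T, H, W, C)` for channels last and `(C, T, H, W)` for channels first are all illustrative assumptions.

```python
import numpy as np


def spatial_axes(channels_first):
    """Return the (height, width) axes for the assumed video layouts.

    Assumption: channels last is (T, H, W, C), channels first is (C, T, H, W).
    """
    return (2, 3) if channels_first else (1, 2)


class Compose:
    """Chain transforms; each one takes and returns a video array."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, video):
        for transform in self.transforms:
            video = transform(video)
        return video


class CenterCrop:
    """Crop the central (height, width) window from every frame."""

    def __init__(self, size, channels_first=False):
        self.size = size
        self.channels_first = channels_first

    def __call__(self, video):
        h_axis, w_axis = spatial_axes(self.channels_first)
        crop_h, crop_w = self.size
        top = (video.shape[h_axis] - crop_h) // 2
        left = (video.shape[w_axis] - crop_w) // 2
        # Build an index that slices only the two spatial axes.
        index = [slice(None)] * video.ndim
        index[h_axis] = slice(top, top + crop_h)
        index[w_axis] = slice(left, left + crop_w)
        return video[tuple(index)]


class HorizontalFlip:
    """Mirror every frame along the width axis (always, not randomly)."""

    def __init__(self, channels_first=False):
        self.channels_first = channels_first

    def __call__(self, video):
        _, w_axis = spatial_axes(self.channels_first)
        return np.flip(video, axis=w_axis)


# Hypothetical clip: 2 frames of 4x6 pixels with 3 channels (channels last).
video = np.arange(2 * 4 * 6 * 3, dtype=np.float32).reshape(2, 4, 6, 3)
pipeline = Compose([CenterCrop((2, 4)), HorizontalFlip()])
out = pipeline(video)
print(out.shape)  # (2, 2, 4, 3)

# The same clip in the assumed channels-first layout (C, T, H, W).
video_cf = video.transpose(3, 0, 1, 2)
pipeline_cf = Compose([
    CenterCrop((2, 4), channels_first=True),
    HorizontalFlip(channels_first=True),
])
out_cf = pipeline_cf(video_cf)
print(out_cf.shape)  # (3, 2, 2, 4)
```

Because every transform only indexes or flips along named axes, the same code works for raw frames and for feature maps with an arbitrary number of channels, and no PIL conversion is needed.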