GitHub - kd1510/neural_image_caption: Implementing a ConvNet+LSTM caption net

A Neural Image Caption Generator

Introduction
Architecture

Introduction

This is a neural image caption generator based on the paper Show and Tell: A Neural Image Caption Generator by Vinyals et al.
The model is trained on the Flickr8k dataset.

Architecture

The PyTorch implementation can be found in encoder_decoder.py.

Encoder

The encoder is an EfficientNet with weights pretrained on ImageNet.
The final layer of the EfficientNet is removed, and all prior layers are frozen for the duration of the training process.
The image embedding is passed through a linear layer that reduces the feature vector to the dimensionality of the joint embedding space.
This linear layer is trained jointly with the decoder in order to learn the joint embedding space.
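
The encoder described above could be sketched roughly as follows. This is a minimal illustration, not the code in encoder_decoder.py: the names `ImageEncoder` and `efficientnet_b0_backbone` are hypothetical, and the choice of the B0 variant (with its 1280-dimensional pooled features) is an assumption, since the EfficientNet variant is not stated here. Keeping the frozen backbone in eval mode is likewise a design choice of this sketch.

```python
import torch
import torch.nn as nn


class ImageEncoder(nn.Module):
    """Frozen CNN backbone followed by a trainable linear projection (hypothetical name)."""

    def __init__(self, backbone: nn.Module, feature_dim: int, embed_dim: int):
        super().__init__()
        self.backbone = backbone
        # Freeze every backbone parameter so that only the projection is trained.
        for param in self.backbone.parameters():
            param.requires_grad = False
        # Trainable layer mapping image features into the joint embedding space.
        self.project = nn.Linear(feature_dim, embed_dim)

    def train(self, mode: bool = True):
        # Sketch-level choice: keep the frozen backbone in eval mode so that
        # batch-norm statistics and dropout stay fixed during training.
        super().train(mode)
        self.backbone.eval()
        return self

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            features = self.backbone(images)  # (batch, feature_dim)
        return self.project(features)         # (batch, embed_dim)


def efficientnet_b0_backbone() -> nn.Module:
    """ImageNet-pretrained EfficientNet-B0 without its classifier (assumed variant)."""
    from torchvision import models  # lazy import; weights are downloaded on first use

    net = models.efficientnet_b0(weights="IMAGENET1K_V1")
    net.classifier = nn.Identity()  # the output is now the 1280-d pooled feature vector
    return net
```

Under these assumptions, `ImageEncoder(efficientnet_b0_backbone(), feature_dim=1280, embed_dim=256)` would map a batch of images to 256-dimensional embeddings; the value 256 is only an example.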

Decoder

The decoder is an LSTM which generates a caption for the image.
At the start of the decoding process, the feature vector from the encoder is passed through the LSTM so that the hidden state can view the embedded representation of the image.
A linear layer then maps the hidden state outputs to the vocabulary space, yielding a probability distribution over the next word in the caption.
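
A minimal sketch of such a decoder is shown below, for illustration only. The class name `CaptionDecoder`, the word embedding layer, and the use of ground-truth caption tokens as inputs during training (teacher forcing) are assumptions of this sketch and may differ from encoder_decoder.py.

```python
import torch
import torch.nn as nn


class CaptionDecoder(nn.Module):
    """LSTM decoder primed with an image embedding (hypothetical name)."""

    def __init__(self, vocab_size: int, embed_dim: int, hidden_dim: int):
        super().__init__()
        # Assumed: words are embedded into the same space as the image embedding.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Maps each hidden state to one score per vocabulary word.
        self.to_vocab = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_embedding: torch.Tensor, captions: torch.Tensor) -> torch.Tensor:
        # image_embedding: (batch, embed_dim); captions: (batch, seq_len) token ids.
        # Step 1: pass the image embedding through the LSTM as a single time step,
        # so the resulting hidden state has "seen" the image.
        _, state = self.lstm(image_embedding.unsqueeze(1))
        # Step 2: run the caption word embeddings through the LSTM from that state.
        outputs, _ = self.lstm(self.embed(captions), state)
        # Step 3: project hidden states to vocabulary logits.
        return self.to_vocab(outputs)  # (batch, seq_len, vocab_size)
```

Applying `torch.softmax(logits, dim=-1)` to the returned logits gives the probability distribution over the next word at each position; during training, the logits would typically be passed directly to a cross-entropy loss.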

Folders and files

evaluation
resources
.gitignore
README.md
encoder_decoder.py
hooks.py
prep_data.py
train.py
utils.py
