GitHub - mdaniluk/transformer: Transformer from scratch

Simple Transformer implementation from scratch in PyTorch

The goal of this implementation is to show the simplicity of transformer models and self-attention. The model consists of a stack of simple transformer blocks. Unlike the original transformer, it does not have an encoder-decoder structure.

SelfAttention is a simple implementation of a multi-head attention module.
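As a rough illustration of what a multi-head self-attention module does, here is a minimal sketch. It is not the code from this repository: the class name `SelfAttentionSketch`, the fused `to_qkv` projection, and the default of 4 heads are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SelfAttentionSketch(nn.Module):
    """Illustrative multi-head self-attention (hypothetical, not the repo's class)."""

    def __init__(self, emb_dim: int, heads: int = 4):
        super().__init__()
        assert emb_dim % heads == 0, "emb_dim must be divisible by heads"
        self.heads = heads
        self.head_dim = emb_dim // heads
        # A single projection yields queries, keys and values for all heads.
        self.to_qkv = nn.Linear(emb_dim, 3 * emb_dim, bias=False)
        # Mixes the concatenated head outputs back into emb_dim features.
        self.unify = nn.Linear(emb_dim, emb_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, e = x.shape
        q, k, v = self.to_qkv(x).chunk(3, dim=-1)

        def split(z: torch.Tensor) -> torch.Tensor:
            # (b, t, e) -> (b, heads, t, head_dim)
            return z.view(b, t, self.heads, self.head_dim).transpose(1, 2)

        q, k, v = split(q), split(k), split(v)
        # Scaled dot-product attention, computed independently per head.
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        weights = F.softmax(scores, dim=-1)
        out = weights @ v  # (b, heads, t, head_dim)
        # Concatenate the heads and project.
        out = out.transpose(1, 2).contiguous().view(b, t, e)
        return self.unify(out)
```

The module maps a tensor of shape (batch, tokens, emb_dim) to a tensor of the same shape, so it can be stacked freely.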

TransformerBlock is a simple block that consists of an attention layer, layer normalization, and a feed-forward network, with residual connections between them.
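The block structure described above could look roughly like the following sketch. It makes several assumptions: PyTorch's built-in `nn.MultiheadAttention` stands in for the repository's own attention module, normalization is applied after each residual addition, and the feed-forward layer is 4 times wider than the embedding. The name `TransformerBlockSketch` is hypothetical.

```python
import torch
import torch.nn as nn


class TransformerBlockSketch(nn.Module):
    """Illustrative block: attention -> add & norm -> feed-forward -> add & norm."""

    def __init__(self, emb_dim: int, heads: int = 4, ff_mult: int = 4):
        super().__init__()
        # Stand-in for the repo's SelfAttention module (assumption).
        self.attention = nn.MultiheadAttention(emb_dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(emb_dim)
        self.norm2 = nn.LayerNorm(emb_dim)
        self.ff = nn.Sequential(
            nn.Linear(emb_dim, ff_mult * emb_dim),
            nn.ReLU(),
            nn.Linear(ff_mult * emb_dim, emb_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Self-attention: queries, keys and values all come from x.
        attended, _ = self.attention(x, x, x, need_weights=False)
        x = self.norm1(x + attended)       # first residual connection
        return self.norm2(x + self.ff(x))  # second residual connection
```

Because input and output shapes match, several such blocks can be chained one after another.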

CTransformer is designed for classifying sequences. It consists of several transformer blocks, averages the output tokens of the last layer, and applies a linear projection to that average to produce the class scores.
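A classifier of this shape might be sketched as follows. This is an illustration under assumptions, not the repository's CTransformer: `nn.TransformerEncoderLayer` stands in for the repo's own block, learned position embeddings are assumed, and the class name, argument names, and default sizes are hypothetical.

```python
import torch
import torch.nn as nn


class CTransformerSketch(nn.Module):
    """Illustrative sequence classifier: blocks -> mean over tokens -> linear."""

    def __init__(self, vocab_size: int, num_classes: int, emb_dim: int = 32,
                 heads: int = 4, depth: int = 2, max_len: int = 512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, emb_dim)
        # Learned position embeddings (assumption).
        self.pos_emb = nn.Embedding(max_len, emb_dim)
        # Built-in encoder layers stand in for the repo's TransformerBlock.
        self.blocks = nn.ModuleList([
            nn.TransformerEncoderLayer(emb_dim, heads,
                                       dim_feedforward=4 * emb_dim,
                                       batch_first=True)
            for _ in range(depth)
        ])
        self.to_logits = nn.Linear(emb_dim, num_classes)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        b, t = tokens.shape
        positions = torch.arange(t, device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(positions)
        for block in self.blocks:
            x = block(x)
        pooled = x.mean(dim=1)         # average the output tokens of the last block
        return self.to_logits(pooled)  # linear projection to class scores
```

For binary sentiment classification one would use `num_classes=2`. The sketch ignores padding, so padded positions would contribute to the average unless they are masked out.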

You can easily train it to classify sequences from IMDB dataset:

python classify.py

