We want to be able to easily include arbitrary features at each decoding timestep.