- [x] Finish token generation mechanism
- [x] Create encoder
- [x] Copying tokens
- [x] Attention
- [x] Create code to read data
- [x] Bahdanau attention (sketch below)
- [x] Generate vocabularies for source and target
  - Depending on the model, we need different vocabularies
  - Look at OpenNMT
  - Look at tensor2tensor
- [x] Blacklist some elements in the Python grammar (`ctx` fields)
- [x] Add optimizers to the registry (registry sketch below)
- [x] Improve the registry to avoid config
- [x] Set up a model to connect everything together (`enc2dec`)
- [x] Figure out how to specify vocabulary, grammar, etc. to models
- [x] Figure out what to do when a singular type has only one derivation: nothing special is needed in the case of no unary closure
- [x] Masking of actions (sketch below)
  - [x] At training time: no masking applies
- [x] Deal with types of constants: `Num -> object` should be `Num -> int`, `Num -> float`
  - [x] `Name -> identifier -> str` to `Name -> str`
- [x] Variational dropout: complicated to do in PyTorch (sketch below)
- [x] Ensure that all loss components have positive sign
- [x] Adjust the grammar based on empirical observations throughout the data
- [ ] Implement unary closures (sketch below)

Others

- [ ] Save training progress more permanently
- [ ] Batching + constructing training instances in separate processes
- [ ] Introduce abstract base classes
  - [x] preproc
  - [ ] model
  - [ ] encoder
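For the Bahdanau attention item, here is a minimal sketch of additive attention (Bahdanau et al., 2015). Module and dimension names are illustrative and not this repo's actual encoder/decoder interface; it assumes batch-first encoder states and an optional boolean source mask.

```python
import torch
import torch.nn as nn

class BahdanauAttention(nn.Module):
    """Additive attention; a generic sketch, not the project's module."""

    def __init__(self, enc_dim, dec_dim, attn_dim):
        super().__init__()
        self.enc_proj = nn.Linear(enc_dim, attn_dim, bias=False)
        self.dec_proj = nn.Linear(dec_dim, attn_dim, bias=False)
        self.v = nn.Linear(attn_dim, 1, bias=False)

    def forward(self, dec_state, enc_states, enc_mask=None):
        # dec_state: (batch, dec_dim); enc_states: (batch, src_len, enc_dim)
        # enc_mask: (batch, src_len) bool, True for real (non-pad) tokens
        scores = self.v(torch.tanh(
            self.enc_proj(enc_states) + self.dec_proj(dec_state).unsqueeze(1)
        )).squeeze(-1)                                   # (batch, src_len)
        if enc_mask is not None:
            scores = scores.masked_fill(~enc_mask, float('-inf'))
        weights = torch.softmax(scores, dim=-1)          # attention distribution
        context = torch.bmm(weights.unsqueeze(1), enc_states).squeeze(1)
        return context, weights                          # (batch, enc_dim), (batch, src_len)
```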
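The two registry items suggest a decorator-based registry pattern. A generic sketch of that pattern (hypothetical names, not this repo's actual `registry` module):

```python
_REGISTRY = {}

def register(kind, name):
    """Register a class under (kind, name) so configs can refer to it by string."""
    def decorator(cls):
        _REGISTRY.setdefault(kind, {})[name] = cls
        return cls
    return decorator

def lookup(kind, name):
    return _REGISTRY[kind][name]

# Usage: register an optimizer wrapper, then resolve it by name.
@register('optimizer', 'adam')
class AdamFactory:
    pass

assert lookup('optimizer', 'adam') is AdamFactory
```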
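For the action-masking item: at training time the gold action is always grammatical, so no masking applies; at inference time, invalid actions can be masked out before normalization. A sketch, where `valid_actions` is a hypothetical input listing the action indices the grammar permits in the current decoder state:

```python
import torch

def masked_action_log_probs(logits, valid_actions):
    # logits: (num_actions,) raw decoder scores for one step.
    # valid_actions: indices allowed by the grammar here (hypothetical;
    # the real model would derive this from its grammar state).
    mask = torch.full_like(logits, float('-inf'))
    mask[valid_actions] = 0.0
    # Invalid actions receive -inf log-probability (zero probability).
    return torch.log_softmax(logits + mask, dim=-1)
```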
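The "complicated to do in PyTorch" note on variational dropout likely reflects that cuDNN RNNs only expose ordinary per-step dropout between layers. One common workaround (a sketch of the technique from Gal & Ghahramani, 2016, not necessarily what this repo does) is to sample one mask per sequence and reuse it at every time step, applied to the batch-first tensor outside the RNN:

```python
import torch
import torch.nn as nn

class VariationalDropout(nn.Module):
    """Dropout with a single mask shared across all time steps."""

    def __init__(self, p=0.5):
        super().__init__()
        self.p = p

    def forward(self, x):
        # x: (batch, seq_len, dim)
        if not self.training or self.p == 0:
            return x
        # One mask per sequence, broadcast over the time dimension.
        mask = x.new_empty(x.size(0), 1, x.size(2)).bernoulli_(1 - self.p)
        return x * mask / (1 - self.p)
```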
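For the open unary-closure item: the usual idea is to precompute chains of types that admit exactly one derivation, so the decoder emits the whole chain in a single step instead of one trivial step per rule. A sketch over a hypothetical grammar encoding (type name to a list of `(head, child_types)` rules); the project's actual grammar structures will differ:

```python
def unary_closure(grammar):
    """Precompute unary rule chains per type. Generic sketch only."""
    def follow(type_name, seen):
        rules = grammar[type_name]
        # The chain continues only while the type has exactly one rule
        # with exactly one child (a "singular" type).
        if len(rules) == 1 and len(rules[0][1]) == 1:
            head, (child,) = rules[0]
            if child in grammar and child not in seen:
                return [head] + follow(child, seen | {child})
            return [head]
        return []

    return {t: follow(t, {t}) for t in grammar}

# Example: `stmt` has a single derivation into `expr`, so the decoder
# can emit ExprStmt and move straight to choosing an expr rule.
grammar = {
    'stmt': [('ExprStmt', ['expr'])],
    'expr': [('Name', []), ('Call', ['expr'])],
}
print(unary_closure(grammar))  # {'stmt': ['ExprStmt'], 'expr': []}
```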