Model design and training dataset information

Hello team, 

Thank you so much for all your hard work in releasing this! I had a couple of quick questions about the model design and dataset:

1. Did you use a specific base model to initialize the weights before training, or was this model trained from scratch?
2. How many tokens was this model trained on?

Thanks again!