Using a trained model to predict binding values for new protein and SMILES combinations #14
Hi all,
I have now run the whole ProSmith pipeline twice: training the model on a dataset of amino acid sequences, SMILES strings of small substrates, and known binding values. This produces a trained model (.pkl file) that can be used for gradient boosting, which then gives a final prediction for each amino acid sequence and SMILES combination. The model can be validated by comparing these predictions with the known binding values in the training dataset. This all worked, and I would like to keep using the trained model.
Now comes my question: is it possible to predict binding values for new amino acid sequence and SMILES combinations using the trained model described above, so that I don't have to retrain the model every time I want to predict enzyme-substrate interactions? This is similar to what the web-based tool does (https://esp.cs.hhu.de/ESP_single_input), but I want to make predictions with my own trained model.
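To make the request concrete, here is a rough sketch of the workflow I have in mind: save the model once, load the .pkl file later, and predict for new pairs without retraining. Everything named here is an assumption for illustration only: `StandInModel` is a hypothetical placeholder for the real trained model, `toy_featurize` is a hypothetical placeholder for however ProSmith turns a sequence and a SMILES string into model inputs (I don't know that step), and `model.pkl` is just an example file name.

```python
import pickle


class StandInModel:
    """Hypothetical placeholder for the trained model (NOT ProSmith's class)."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, rows):
        # Simple linear score per feature row, only so the sketch runs.
        return [
            sum(w * x for w, x in zip(self.weights, row)) + self.bias
            for row in rows
        ]


def save_model(model, path):
    """Write the trained model to disk once, after training."""
    with open(path, "wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Load a previously trained model instead of retraining it."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def toy_featurize(sequence, smiles):
    # Hypothetical featurization: the real pipeline would compute its own
    # representation of the protein and the substrate here.
    return [float(len(sequence)), float(len(smiles))]


def predict_pairs(model, pairs, featurize):
    """Predict one value per (amino acid sequence, SMILES) pair."""
    features = [featurize(sequence, smiles) for sequence, smiles in pairs]
    return model.predict(features)
```

With something like this, a call such as `predict_pairs(load_model("model.pkl"), [("MKT", "CCO")], toy_featurize)` would return one prediction per pair. The part I am missing is what the real featurization and model-loading steps look like in ProSmith.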
If someone already has a script for this, I would be really eager to use it. Thank you in advance.
Greetings,
Max