Avoid skipping candidate RBS positions in `rbs_score` #102
Open
althonos wants to merge 1 commit into hyattpd:GoogleImport from
Hi, one final PR 😃
In the test sequence I used for #100, I noticed the following bug: the RBS spacer for one of the predicted genes changed when the contig was reverse-complemented:

Details
Forward:
Reverse-complemented:

Indeed, the gene with the `GGA/GAG/AGG` RBS motif has a spacer detected as `3-4bp` on the forward strand and `5-10bp` on the reverse strand. The contig in question starts with a sequence that has a match both in the `3-4bp` range (`AGG`) and in the `5-10bp` range (`GGA`); since the `5-10bp` spacer has a higher score, it should be the one selected. This affects the gene score, so it could cause some predictions to change.

The problem comes from the loops in `rbs_score`, which skip some positions before index `0`. However, when there may be a partial match (as is the case here, with a `GGA` motif right on the contig edge), these positions should not be skipped; the decision to ignore a position should be made by the `shine_dalgarno_exact` and `shine_dalgarno_mm` functions directly.

After applying the patch, the predictions are consistent regardless of the orientation of the contig: the RBS spacers, and hence the gene scores, match:
Details
Forward:
Reverse-complemented:
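To illustrate the edge case described above, here is a minimal, self-contained sketch. It is not the code from this patch: `window_match` and `best_match` are hypothetical helpers, scoring a window by its longest run of matching bases is a simplification, and the scanned offsets (6 to 20 bases upstream of the start codon) are an assumption made for the example. It only shows why skipping window positions before index 0 can hide a partial motif hit on the contig edge, and how moving the bounds handling into the matching function avoids that.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: longest run of consecutive bases where the window
 * seq[pos .. pos+mlen-1] agrees with the motif. Positions outside
 * [0, slen) count as mismatches, so a window hanging off the contig edge
 * can still report a partial match instead of being discarded. */
static int window_match(const char *seq, int slen, int pos, const char *motif)
{
    int mlen = (int)strlen(motif);
    int best = 0, run = 0;
    for (int i = 0; i < mlen; i++) {
        int k = pos + i;
        if (k >= 0 && k < slen && seq[k] == motif[i]) {
            run++;
            if (run > best) best = run;
        } else {
            run = 0;
        }
    }
    return best;
}

/* Hypothetical scan: try every window starting between `hi` and `lo`
 * bases upstream of `start`. With skip_negative != 0, windows starting
 * before index 0 are skipped in the loop (the behaviour described as
 * buggy above); with skip_negative == 0, the decision is left to
 * window_match. */
static int best_match(const char *seq, int start, int lo, int hi,
                      const char *motif, int skip_negative)
{
    assert(lo <= hi);
    int slen = (int)strlen(seq);
    int best = 0;
    for (int j = start - hi; j <= start - lo; j++) {
        if (skip_negative && j < 0) continue;
        int m = window_match(seq, slen, j, motif);
        if (m > best) best = m;
    }
    return best;
}
```

With the toy contig `"GGATTTTTTATGAAA"` (start codon at index 9) and the motif `"AGGAGG"`, `best_match(seq, 9, 6, 20, "AGGAGG", 1)` returns 1, while `best_match(seq, 9, 6, 20, "AGGAGG", 0)` returns 3: the `GGA` at the very start of the contig is only found by the window beginning at index -1.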