The definition of a complete chloroplast comprises the following requirements:
1. The subgraph needs to have between MINNODES (3) and MAXNODES (100) nodes (line 200 in be99085):

   ```perl
   next if (@{$wcc} < $MINNODES || @{$wcc} > $MAXNODES);
   ```

2. It needs to be a cyclic subgraph with a total sequence length between MINSEQLEN (25 kbp) and MAXSEQLEN (1 Mbp) (line 213 in be99085):

   ```perl
   next unless ($c->is_cyclic && $seqlen >= $MINSEQLEN && $seqlen <= $MAXSEQLEN);
   ```

3. The subgraph needs to have at least one BLAST hit against the reference database (lines 233 to 239 in be99085):

   ```perl
   my $output = qx(tblastx -db $blastdbfile -query $filename -evalue 1e-10 -outfmt 6 -num_alignments 1 -num_threads 4);
   if (length($output) > 0)
   {
       $L->debug("Found hits for cyclic graph: ".$c);
       push(@cyclic_contigs_with_blast_hits, $c);
   }
   ```

4. Only one subgraph with BLAST hits is allowed (line 246 in be99085):

   ```perl
   if (@cyclic_contigs_with_blast_hits == 1)
   ```

5. The node with the highest connectivity is assigned as the IR (inverted repeat) (line 270 in be99085):

   ```perl
   my $inverted_repeat = "$degree[0]{v}";
   ```

6. After removing the IR nodes, only two other nodes are allowed (line 281 in be99085):

   ```perl
   if (keys %nodes == 2)
   ```

7. LSC (large single copy) and SSC (small single copy) are simply assigned by sequence length, the longer node becoming the LSC (lines 286 to 289 in be99085):

   ```perl
   if (length($seq[$lsc]) < length($seq[$ssc]))
   {
       ($lsc, $ssc) = ($ssc, $lsc);
   }
   ```
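
To make the discussion easier, here is a minimal standalone sketch of how the structural checks (the node count, the cyclic/length check, the IR assignment, the two-remaining-nodes rule and the LSC/SSC assignment) chain together. It is not the code from the repository: the `assign_regions` function, the plain-hash layout of `$component` and its `len`/`degree`/`is_cyclic` fields are all assumptions made for illustration, and it treats a single node as the IR. The two BLAST checks are left out because they depend on an external `tblastx` run.

```perl
use strict;
use warnings;

# Thresholds as quoted in the requirements above.
my ($MINNODES,  $MAXNODES)  = (3, 100);
my ($MINSEQLEN, $MAXSEQLEN) = (25_000, 1_000_000);

# $component is a hypothetical plain-hash description of one subgraph:
#   { is_cyclic => 0|1, nodes => { name => { len => N, degree => D }, ... } }
# Returns a hashref with the ir/lsc/ssc node names, or nothing if a check fails.
sub assign_regions {
    my ($component) = @_;
    my $nodes = $component->{nodes};
    my @names = sort keys %{$nodes};

    # Node count must be within bounds.
    return if @names < $MINNODES || @names > $MAXNODES;

    # Subgraph must be cyclic and its total sequence length within bounds.
    my $seqlen = 0;
    $seqlen += $nodes->{$_}{len} for @names;
    return unless $component->{is_cyclic}
        && $seqlen >= $MINSEQLEN
        && $seqlen <= $MAXSEQLEN;

    # The node with the highest degree is taken as the IR
    # (ties are broken by node name, which is an assumption of this sketch).
    my ($ir, @rest) = sort {
        $nodes->{$b}{degree} <=> $nodes->{$a}{degree} || $a cmp $b
    } @names;

    # Exactly two nodes may remain once the IR is removed.
    return unless @rest == 2;

    # The longer remaining node is the LSC, the shorter one the SSC.
    my ($lsc, $ssc) = @rest;
    ($lsc, $ssc) = ($ssc, $lsc) if $nodes->{$lsc}{len} < $nodes->{$ssc}{len};

    return { ir => $ir, lsc => $lsc, ssc => $ssc };
}

# Example with made-up lengths and degrees.
my $regions = assign_regions({
    is_cyclic => 1,
    nodes     => {
        a => { len => 85_000, degree => 2 },
        b => { len => 26_000, degree => 4 },
        c => { len => 18_000, degree => 2 },
    },
});
print join(' ', map { "$_=$regions->{$_}" } qw(ir lsc ssc)), "\n";
# prints "ir=b lsc=a ssc=c"
```

Written this way, the two-remaining-nodes rule is a single early return, so any graph with even one extra small node is rejected outright, no matter how well it passed the earlier checks.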
I think we can improve our detection by relaxing or dropping some of those requirements, e.g. requirement 6.
Any ideas are welcome!