Here is a guide on how to implement fast The result is +/- 10% faster than GMP, without assembly: status-im/nim-stint#126 (comment)

So we trade one shr and a minus sign for a div, a shr and a mod?
The

@dlesnoff @narimiran what do you think?

I am not convinced. Since the exponent is an int in your code, it would allocate at most a two-limb BigInt in my code (and for most uses, a single limb). Is the cost of memory allocation so high that it is better to perform a few more computations?
It does not seem so, on your computer at least. The number of allocations might matter in more constrained environments (embedded), but I do not think we target embedded. In any case, there are algorithmic changes to make before micro-optimizations. I would be interested to see either (or both):

modulo call if exponent >= 0 exponent

Unfortunately, I didn't notice any difference in performance, but this should reduce the number of allocations.