Prime generator in Python and C

Question

I have written the Rabin-Miller test in Python and ported it to C. But somehow, the C version is ~20 times slower on my system, and I am not sure how. Anyway, here is my code.

Python

import random
import time

def rabin_miller(n):
    if n in [2, 3, 5]:
        return True
    if n < 2 or n % 2 == 0:
        return False

    s = 0
    d = n - 1
    while d % 2 == 0:
        s += 1
        d >>= 1

    for k in range(2 ** 4):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        c = False
        for r in range(s - 1):
            x = pow(x, 2, n)
            if x == 1:
                return False
            if x == n - 1:
                break
        else:
            return False
    return True

for i in range(10000):
    rabin_miller(i)

C

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

long mod_pow(long b, long e, long m){
    if(m == 1)
        return 0;
    long c = 1;
    for(long i = 0; i < e; i++)
        c = (c * b) % m;
    return c;
}

int rabin_miller(long n){
    if(n == 2 || n == 3)
        return 1;
    if(n < 2 || n % 2 == 0)
        return 0;

    long s = 0, d = n - 1, k = 10, a, x, stop;
    while(d % 2 == 0)
        s += 1, d >>= 1;

    srand(time(NULL));
    for(long i = 0; i < k; i++){
        a = (long) (((float) rand() / RAND_MAX) * (n - 3)) + 2;
        x = mod_pow(a, d, n);
        if(x == 1 || x == n - 1)
            continue;
        stop = 1;
        for(long r = 0; r < s - 1; r++){
            x = mod_pow(x, 2, n);
            if(x == 1)
                return 0;
            if(x == n - 1){
                stop = 0;
                break;
            }
        }
        if(stop){
            return 0;
        }
    }
    return 1;
}

int main(int argc, char *argv[]){
    for(long i = 1; i < 10000; i++){
        rabin_miller(i);
    }
    return 0;
}

So you are looking to improve the C code, I think ...? If so, the Python code is unnecessary and could be removed — Billal BEGUERADJ
– Billal BEGUERADJ, Commented Oct 2, 2017 at 14:07
You could at least improve the while (d % 2 == 0) With I think while (~t & 1) — Timmetje
– Timmetje, Commented Oct 2, 2017 at 14:47
I also assume you know this, but while Raben Miller is an efficient prime test, it isn't a good generator, as a sieve would be much faster for either. — Oscar Smith
– Oscar Smith, Commented Oct 2, 2017 at 22:09

Oscar Smith · Accepted Answer · 2017-10-02 14:35:06Z

13

I believe the problem is that long mod_pow is very inefficient, as it takes e steps to complete, as compared to Python's approach which likely only takes log(e) steps. To fix this, you should use an algorithm similar to the following

function modular_pow(base, exponent, modulus)
if modulus = 1 then return 0
Assert :: (modulus - 1) * (modulus - 1) does not overflow base
result := 1
base := base mod modulus
while exponent > 0
    if (exponent mod 2 == 1):
       result := (result * base) mod modulus
    exponent := exponent >> 1
    base := (base * base) mod modulus
return result

answered Oct 2, 2017 at 14:35

Oscar Smith

3,71718 silver badges31 bronze badges

\$\begingroup\$ I see the value of if modulus = 1 then return 0, yet not base := base mod modulus aside from reduced chance of overflow in base * base. Any other reason for it? \$\endgroup\$

chux
– chux

2017-10-02 22:33:42 +00:00
Commented Oct 2, 2017 at 22:33
1

\$\begingroup\$ That's it. No reason to chance overflow if you don't need to. This way, it works for any base that fits the type, if the modulus is small enough. \$\endgroup\$

Oscar Smith
– Oscar Smith

2017-10-02 22:41:36 +00:00
Commented Oct 2, 2017 at 22:41
\$\begingroup\$ It also doesn't take a ton of time, as it is one mod statement compared to up to 64 for the loop. (and some conditional branches) \$\endgroup\$

Oscar Smith
– Oscar Smith

2017-10-02 22:43:33 +00:00
Commented Oct 2, 2017 at 22:43

Add a comment |

Edward · Accepted Answer · 2017-10-02 16:32:57Z

I see a number of things that may help you improve your C program in addition to the review from Oscar Smith which correctly identifies the reason for the major performance difference in the two versions.

Use better variable names

Single letter variable names are not descriptive and not helpful to the reader of the code.

Only call `srand` once

The call to srand should only be done once. Calling it multiple times does not improve randomness.

Use `bool` where appropriate

Since C99, the C language supports bool, and the rabin_miller() function should return a bool rather than an int. You will need to #include <stdbool.h> to do this.

Use the appropriate data type

In the original code k and i are just used as part of a loop count. An int or unsigned would be a more appropriate type than long and may confer a performance advantage. The same goes for r and s.

Prefer `for` to `while`

The code which calculates s and d could be written a bit more succinctly and clearly, in my view, like this:

for(s = 0; d % 2 == 0; ++s) {
    d >>= 1;
}

Simplify `for` loop termination condition

The point to the for (r ... loop is to simply iterate s-1 times. Rather than comparing r to s-1 each iteration, it makes more sense to simply start the loop from 1 rather than 0 and compare with s directly:

    for(unsigned r = 1; r < s; r++) {

Even simpler, count down instead. Some processors have a special instruction which does both a decrement and test for zero so here again, there may be a performance advantage on some architectures:

    for(unsigned r = s; r; --r) {

Declare variables in as small a scope as practical

The variables a and x are only used inside the big loop and so could easily be declared only there. Keeping the scope of variables as small as practical makes the code easier to read and understand and may give the compiler some better hints as to register allocation.

Would there be any benefit to using !(n & 0x1) over n % 2 == 0? You missed your usual point about symbolic constants over raw numbers (main). — pacmaninbw
– pacmaninbw ♦, Commented Oct 2, 2017 at 16:51
For the first question, no, there's usually no benefit because compilers are generally smart enough to generate the same optimal code for either. For the second, other than k in the original, I think that having hard coded numbers (e.g. 2, 3) in this intrinsically mathematical function is fine. — Edward
– Edward, Commented Oct 2, 2017 at 17:02
Pedantic corner !(n & 0x1) is not the same as n % 2 == 0 in the archaic one's complement. when n<0. — chux
– chux, Commented Oct 2, 2017 at 21:23
I'll concede the point if you can name any currently used compiler which uses one's complement. :) — Edward
– Edward, Commented Oct 2, 2017 at 21:29

Stack Exchange Network

Prime generator in Python and C

Python

C

2 Answers 2

Use better variable names

Only call `srand` once

Use `bool` where appropriate

Use the appropriate data type

Prefer `for` to `while`

Simplify `for` loop termination condition

Declare variables in as small a scope as practical

You must log in to answer this question.

Hot Network Questions

Prime generator in Python and C

Python

C

2 Answers 2

Use better variable names

Only call srand once

Use bool where appropriate

Use the appropriate data type

Prefer for to while

Simplify for loop termination condition

Declare variables in as small a scope as practical

You must log in to answer this question.

Related

Hot Network Questions

Only call `srand` once

Use `bool` where appropriate

Prefer `for` to `while`

Simplify `for` loop termination condition