C0 | Documentation
About Downloads Tutorial References Courses
C0 Tutorial C0 and other languages Other references

Computing primes, in Python and C0

We explore some of the differences and similarity between Python and C0 by example, using a few simple functions computing prime numbers. We start from Python code and rewrite it into C0. The original Python source, sieve2.py is due to David Kosbie.

We begin in the middle of the Python code: what is a prime number?

def isPrime(n):
    if (n < 2): return False
    # deal with evens first (to cut time in half)
    if (n == 2): return True
    if (n % 2 == 0): return False
    # then only check odds up to the square root
    for factor in range(3, 1+int(round(n**0.5)), 2):
        if (n % factor == 0):
            return False
    return True

The first difference we have to account for is that functions are typed, which means we have to specify that isPrime takes an integer argument and returns a boolean (either true or false)

bool isPrime(int n)
{
  ...
}

Rather than relying on indendation, the body of the function is enclosed in curly braces {...}. You will see many of these when writing code in C0. ... is of course just a placeholder for code we have still to write.

Prime numbers are integers greater or equal to 2, so just as in the Python code we first check whether it is a valid candidate and return false if it isn't. The constant false is all in lowercase, and this matters; False would be flagged as an undeclared identifier.

bool isPrime(int n)
{
  if (n < 2) return false;
  ...
}

In C0, there is no colon (:) after the condition, and the return statement is followed by semicolon (;). All simple statements in C0 are followed by semicolons, but not compound statements like conditionals. Therefore we know that the semicolon "belongs to" the return statement, rather than the conditional (if).

The next two lines in the Python code are an optimization, which we can faithfully reproduce. The constant true is also in lowercase. The modulus and comparison operators are like in Python, except that C0 enforces that the two sides are expressions of the same type.

bool isPrime(int n)
{
  if (n < 2) return false;
  if (n == 2) return true;
  if (n % 2 == 0) return false;
  ...
}

Let's look again at the remainder of the Python function.

for factor in range(3, 1+int(round(n**0.5)), 2):
    if (n % factor == 0):
        return False
return True

We translate the iteration using the for statement into a so-called for loop. A for-loop is much more restrictive than Pythons iteration: we say how to initialize a variable, how to test whether to exit the loop, and how to step from one iteration to the next. This is written as for(init; test; step). In this program, init is int factor = 3, declaring the new integer variable factor and giving it the initial value of 3. The exit test avoid the square root (since C0 has neither a built-in square root nor floating point numbers) and we just check if the potential factor is greater or equal to the square root by squaring it and comparing it to n/factor (we don't write factor * factor <= n because the multiplication could overflow). Finally, the range expression indicates a step size of 2, which we model in C0 by incrementing the factor by 2 on each iteration.

for (int factor = 3; factor <= n/factor; factor += 2) {
  if (n % factor == 0)
    return false;
}

Indentation here has no semantic meaning, but it is excellent practice for indentation to reflect the program structure, just as you would in Python. If this loop uncovers no true factor, we return true: the number is indeed prime.

/* file sieve2.c0 */
bool isPrime(int n)
{
  if (n < 2) return false;
  if (n == 2) return true;
  if (n % 2 == 0) return false;
  for (int factor = 3; factor <= n/factor; factor += 2) {
    if (n % factor == 0)
      return false;
  }
  return true;
}

For reference, here is our initial Python program.

def isPrime(n):
    if (n < 2): return False
    # deal with evens first (to cut time in half)
    if (n == 2): return True
    if (n % 2 == 0): return False
    # then only check odds up to the square root
    for factor in range(3, 1+int(round(n**0.5)), 2):
        if (n % factor == 0):
            return False
    return True

Types

The two functions, in C0 and Python, are quite similar. One difference is that in C0 we have specified that the argument to the function is supposed to be an integer, while there is no corresponding declaration in Python. We could add the declaration assert(isinstance(x,int)), but this will check the type condition only when the function is executed (dynamically). In C0, all types are checked before the program is ever executed (statically), catching errors earlier and more reliably. Let's do some testing in Python.

% python -i sieve2.py
>>> isPrime(10)
False
>>> isPrime(17)
True
>>> 

This is as expected, fortunately. Now let's try some other numbers as arguments to the function.

>>> isPrime(1.5)
False
>>> isPrime(2.5)
True
>>> isPrime("abc")
isPrime("abc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in isPrime
TypeError: not all arguments converted during string formatting

All of these are quite meaningless. 1.5 is rejected because it is less than 2, but 2.5 is accepted as a "prime number" because it passes all the tests. The string "abc" finally yields a type error because the modulus operation is not defined on strings. We have uncovered here a hidden assumption, namely that the argument to isPrime is an integer.

In C0, using the coin interpreter, the behavior is quite different.

% coin sieve2.c0
Coin 0.2.10 'Penny'(r76M, Sun May 27 17:34:40 EDT 2012)
Type `#help' for help or `#quit' to exit.
--> isPrime(10);
false (bool)
--> isPrime(17);
true (bool)
--> isPrime(2.5);
<stdio>:1.11-1.12:error:expected identifier, found '5'
--> isPrime("abc");
<stdio>:1.1-1.15:error:type mismatch
expected: int
   found: string
--> 

C0 has only integers, so 2.5 is rejects as illegal. It does have strings, but before ever running the function is realizes that we cannot pass a string to a function that requires an integer.

Testing

We have seen above that we can test the code in coin in the same way as we can test it with a Python interpreter. We might put some testing code into a file, using Python's assert statement.

print "Testing isPrime for correctness...",
assert(isPrime(10) == False)
assert(isPrime(17) == True)
assert(isPrime(-1) == False)
print "Passed!"

But we cannot embed the testing commands directly into the file as we can in Python. A general convention is to create a second file sieve2-test.c0 which contains a function main of no arguments that runs some testing functions and returns an integer. In most testing functions we just return 0. The line #use <conio> (for console input/output) allows us to call a printing function.

/* file sieve2-test.c0 */
#use <conio>

int main () {
  print("Testing isPrime for correctness...\n");
  assert(isPrime(10) == false);
  assert(isPrime(17) == true);
  assert(isPrime(-1) == false);
  print("Passed!\n");

  return 0;
}

We can see that print is just a library function, and therefore takes its argument in parentheses. Also, we add a newline character \n to the end of each print statement. In C0, output is buffered which means that a line may not be actually printed on the console until an end-of-line character is encountered. So if we want to see that it has started testing, we need to add that character at the end of the string. Now we can test it.

% coin sieve2.c0 sieve2-test.c0
Coin 0.2.10 'Penny'(r76M, Sun May 27 17:34:40 EDT 2012)
Type `#help' for help or `#quit' to exit.
--> main();
Testing isPrime for correctness...
Passed!
--> 

If we encounter an error (here created intentionally), the assert statement in C0 will abort the program and issue an appropriate error message.

% coin sieve2.c0 sieve2-error.c0
Coin 0.2.10 'Penny'(r76M, Sun May 27 17:34:40 EDT 2012)
Type `#help' for help or `#quit' to exit.
--> main();
Testing isPrime for correctness...
sieve2-error.c0:8.3-8.31: assert failed
Last position: sieve2-error.c0:4.1-12.2
               main from <stdio>:1.1-1.7
--> 

The error message is given with a region, which is line.column-line.column, here from line 8, column 3 to line 8, column 31. We can now go to line 8 of the file sieve2-error.c0 and examine it

assert(isPrime(-1) == true);

and we see the incorrect test.

Python lists and C0 arrays

The next function we look at in Python implements the Sieve of Eratosthenes. For an explanation of this algorithm, see http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes. Briefly, we start with a list of numbers from 2 to n. We declare the first number p in the list to be prime. Then we "cross out" all the multiples of p in the rest of the list, and find the first number that hasn't been crossed out and repeat until we reach the end of the list.

def sieve(n):
    isPrime = [ True ] * (n+1) # assume all are prime to start
    primes = [ ]
    for prime in range(2, n+1):
        if (isPrime[prime] == True):
            # we found a prime, so add it to our result
            primes += [prime]
            # and mark all its multiples as not prime
            for multiple in range(2*prime, n+1, prime):
                isPrime[multiple] = False
    return primes

There are multiple different ways one might represent a Pyton list in C0. For example, it could be represented as an explicitly linked list, as a fixed-size array, or as a dictionary data structure such as a hash table or a binary search tree. In this example we take a very simple approach where we use an array of fixed size. One way to think of this array isPrime is that isPrime[i] is true if the integer i is a prime number. Therefore, the array contains booleans and it has C0 type bool[].

bool[] sieve(int n)
{
...
}

At this point we turn to *contracts* because we would like to be explicit about the assumptions that the function sieve makes and what it guarantees. First, we require n to be at least 2 (the smallest prime), and we guarantee that the returned array has n+1 elements. Here we say n+1 because C0 array indices count starting from 0 (which is also true for Python lists and an almost universal convention in computer science). These two properties are stated with a @requires annotation for the precondition, and an @ensures annotation for the postcondition. In the conditions we write \length(A) to refer to the length of an array (the number of elements in it) and \result for the return value of the function (here, an array).

bool[] sieve(int n)
//@requires n >= 2;
//@ensures \length(\result) == n+1;
{
...
}

Next, we have to allocate and initialize the array with true for each element. In Python, this may be done with the statement

isPrime = [ True ] * (n+1)

Here, we use alloc_array which we give a type (here bool) and the number of elements to allocate, and it returns the array. We then initialize it with a for loop, already introduced in the isPrime function above. This is not important for the Python code, but we explicitly note that 0 and 1 are not prime numbers.

bool[] isPrime = alloc_array(bool, n+1);
isPrime[0] = false; isPrime[1] = false;
for (int i = 2; i < n+1; i++)
   isPrime[i] = true;

Of course, not all these numbers are prime! It would be more accurate to say that they are valid candidates for being prime numbers. Let's look back at the Python code. Starting at 2, it iterates over the candidates until it finds a number that has not yet been crossed out. That number must actually be prime. It then marks all of its multiples as not prime and continues in the main loop.

  primes = [ ]
  for prime in range(2, n+1):
      if (isPrime[prime] == True):
          # we found a prime, so add it to our result
          primes += [prime]
          # and mark all its multiples as not prime
          for multiple in range(2*prime, n+1, prime):
              isPrime[multiple] = False

This code builds up a list of all the primes numbers it finds. We will not do this. Instead, when the outer loop completes, anything in the isPrime array that has not been crossed out (that is, is still true) is a prime.

for (int prime = 2; prime < n+1; prime++)
  if (isPrime[prime] == true)  /* not yet crossed out, so really a prime */
    /* cross out all its multiples, up to n */
    for (int multiple = 2*prime; multiple < n+1; multiple += prime)
       isPrime[multiple] = false;  /* cross out */

After the outer loop completes we are done: we have computed all the prime numbers up to n. However, there is still an issue: in C0, we use two's complement arithmetic with 32-bit numbers. This means multiplication such as 2*prime or addition such as multiple += prime could overflow. To avoid such issues we require n to be less that the maximal integer divided by 2. The maximal integer (returned by int_max()) is 2147483647, although the precisely value is not important for the correctness of this program.

Here is the complete code:

bool[] sieve(int n)
//@requires 2 <= n && n < int_max()/2;
//@ensures \length(\result) == n+1;
{
  bool[] isPrime = alloc_array(bool, n+1);
  isPrime[0] = false; isPrime[1] = false;
  for (int i = 2; i < n+1; i++)
     isPrime[i] = true;

  for (int prime = 2; prime < n+1; prime++)
    if (isPrime[prime] == true)  /* not yet crossed out, so really a prime */
      /* cross out all its multiples, up to n */
      for (int multiple = 2*prime; multiple < n+1; multiple += prime)
         isPrime[multiple] = false;  /* cross out */

  return isPrime;
}

Next we want to implement the prime counting function pi; see http://en.wikipedia.org/wiki/Prime-counting_function. This is immediate in Python, since the Python function returns a list of all the prime numbers.

def piUsingSieve(n):
    return len(sieve(n))

In C0, we have to go through the array once more and count up the number of elements that are true.

int piUsingSieve(int n)
//@requires 2 <= n && n < int_max()/2;
{
  bool[] isPrime = sieve(n);

  int count = 0;
  for (int i = 2; i < n+1; i++)
    if (isPrime[i] == true) count++;
  return count;
}

We also update our testing code

int main () {
  ...
  print("Testing piUsingSieve for correctness...\n");
  assert(piUsingSieve(10) == 4);
  assert(piUsingSieve(100) == 25);
  assert(piUsingSieve(1000) == 168);
  print("Passed!\n");

  return 0;
}

and run it:

% coin -d sieve2.c0 sieve2-test.c0
Coin 0.2.10 'Penny'(r76M, Sun May 27 17:34:40 EDT 2012)
Type `#help' for help or `#quit' to exit.
--> main();
Testing isPrime for correctness...
Passed!
Testing piUsingSieve for correctness...
Passed!
--> 

Here we passed the -d flag to coin to make sure that the contracts were being checked.

Loop invariants

In addition to the testing with concrete values above, we can also express properties of the function using additional pre- and post-conditions as well as loop invariants. This will help in testing the code, but also in explaining why the code is correct. This is not usually done in Python, but we show it here nonetheless because it shows one way in which C0 goes beyond Python. Before reading this section, you should probably read about Contracts, including the secton on loop invariants.

Using isPrime, we can write a generic testing function for the sieve function. This function is intended for testing and contracts only. It tests if the elements isPrimeArray[0..n] are all correctly labeled as prime (true) or not (false). In order to ensure that all array accesses are in bounds, we require n+1 <= \length(isPrimeArray).

bool allPrimes (bool[] isPrimeArray, int n)
//@requires n+1 <= \length(isPrimeArray);
{
  for (int i = 0; i < n+1; i++)
    if (isPrimeArray[i] != isPrime(i)) {
      return false;
    }
  return true;
}

Now we can use this in the postcondition of the function.

/* file sieve2-inv.c0 */
bool[] sieve(int n)
//@requires 2 <= n && n < int_max()/2;
//@ensures \length(\result) == n+1;
//@ensures allPrimes(\result, n);
{
  ...
}

At this point we get automatic testing of the sieve function, when the cc0 compiler or the coin interpreter is called with the -d flag. But we want to check something about the internals of the function as well, so we add loop invariants. First, the initialization. The invariant is just strong enough to guarantee that any array access remains in bound.

bool[] isPrime = alloc_array(bool, n+1);
isPrime[0] = false; isPrime[1] = false;
for (int i = 2; i < n+1; i++)
  //@loop_invariant 2 <= i && i <= n+1;
  isPrime[i] = true;

//@assert allPrimes(isPrime, 2);

We also assert explicitly that, after the initialization is complete, number 0, 1, and 2 have correctly been assigned an isPrime value. That's because 0 and 1 are not primes, and 2 is a prime. In the following loop, we note that we have already correctly computed the isPrime value for 0, 1, 2, ..., prime, where prime is the loop variable.

//@assert allPrimes(isPrime, 2);
for (int prime = 2; prime < n; prime++)
  //@loop_invariant 2 <= prime && prime <= n;
  //@loop_invariant allPrimes(isPrime, prime);
  if (isPrime[prime] == true)  /* not yet crossed out, so really a prime */
    /* cross out all its multiples, up to n */
    for (int multiple = 2*prime; multiple < n+1; multiple += prime)
       isPrime[multiple] = false;  /* cross out */

There is a small subtlety here: the loop in the original program ran through the whole array. So the last value for the prime variable was n+1. The function allPrimes would not be defined for this value. But we note that because of our loop invariant we can actually stop at n-1, because, by the loop invariant, isPrime[n] will already have to be correct at that point. If we had crossed it out before it is not a prime, but if we hadn't crossed it out is must indeed be prime.

The loop invariant is strong enough to guarantee the postcondition, because we finish the loop with prime == n. However, the loop invariant is not strong enough to show that if it is true after k iterations, it is still true after k+1 iterations. This is because the invariant does not capture that all multiples of smaller primes have been crossed out. Adding this requires another definition, namely what it means for a number to be divisible by any integer up to some k. We leave that as an exercise.

Here is the sieve code with contracts and loop invariants.

bool[] sieve(int n)
//@requires 2 <= n && n < int_max()/2;
//@ensures \length(\result) == n+1;
//@ensures allPrimes(\result, n);
{
  bool[] isPrime = alloc_array(bool, n+1);
  isPrime[0] = false; isPrime[1] = false;
  for (int i = 2; i < n+1; i++)
    //@loop_invariant 2 <= i && i <= n+1;
    isPrime[i] = true;

  //@assert allPrimes(isPrime, 2);

  for (int prime = 2; prime < n; prime++) /* changed from n+1 to n here! */
    //@loop_invariant 2 <= prime && prime <= n;
    //@loop_invariant allPrimes(isPrime, prime);
    if (isPrime[prime] == true)  /* not yet crossed out, so really a prime */
      /* cross out all its multiples, up to n */
      for (int multiple = 2*prime; multiple < n+1; multiple += prime)
        //@loop_invariant 2*prime <= multiple;
        isPrime[multiple] = false;  /* cross out */

  return isPrime;
}

This code is of course very slow when dynamic checking of contracts is enabled. When we do timings (see next section), we need to be careful not to specify -d.

Timing

In the Python source, there is some code using the time library to check the running time of the sieve code by computing how many primes there are up to a million.

import time

n = 1000 * 1000  # one million

print "***************"
print "Timing piUsingIsPrime(" + str(n) + "):"
startTime = time.time()
result1 = piUsingIsPrime(n)
endTime = time.time()
time1 = endTime - startTime
print "   result = " + str(result1)
print "   time = " + str(time1) + " sec"

print "***************"
print "Timing piUsingSieve(" + str(n) + "):"
startTime = time.time()
result2 = piUsingSieve(n)
endTime = time.time()
time2 = endTime - startTime
print "   result = " + str(result2)
print "   time = " + str(time2) + " sec"

In C0, we do not have a comparable library, so we use Unix tools to accomplish the timing. First, we write a straightforward function that counts the primes from 1 to n by using the explicit primality test for every element. Then will just call either this function or the previous one in our main function.

/* file sieve2-time.c0 */
#use <conio>
#use "sieve2.c0"

int piUsingIsPrime(int n)
//@requires n >= 0;
{
  int primeCount = 0;
  for (int i = 2; i <= n; i++)
    if (isPrime(i) == true)
      primeCount++;
  return primeCount;
}

int main() {
  int n = 1000 * 1000 * 10; /* ten million */
  print("Timing computing pi(") ; printint(n); print(")\n");
  // piUsingIsPrime(n);
  piUsingSieve(n);
  return 0;
}

Now we call the compiler, cc0, to generate efficient code and execute it using the time Unix utility. cc0 produces an executable, by default called a.out, which we pass as an argument to the time command.

% cc0 sieve2-time.c0
% time ./a.out
Timing computing pi(10000000)
0
1.004u 0.004s 0:01.00 100.0%	0+0k 0+0io 0pf+0w

After editing the file to call piUsingIsPrime instead, recompiling, and then rerunning the result, we get:

% cc0 sieve2-time.c0
% time ./a.out
Timing computing pi(10000000)
0
5.926u 0.003s 0:05.93 99.8%	0+0k 0+0io 0pf+0w

We note that the prime sieve is significantly faster. Increasing the upper bound suggests that the time is roughly linear. The theory predicts this: the complexity is O(n log(log(n))) which is nearly linear (see the notes on complexity of the prime sieve).

Timing computing pi(10000000)
1.004u 0.004s 0:01.00 100.0%	0+0k 0+0io 0pf+0w
Timing computing pi(20000000)
2.023u 0.008s 0:02.03 99.5%	0+0k 0+0io 0pf+0w
Timing computing pi(40000000)
4.113u 0.016s 0:04.13 99.7%	0+0k 0+0io 0pf+0w

Using the isPrime primality test increases at a much faster rate.