C0 | Documentation
About Downloads Tutorial References Courses
C0 Tutorial C0 and other languages Other references

Statements

Statements (sometimes called commands) are executed for their effect. Such effects can be on the values of variables, on the contents of the store, or externally visible effects such as input or output. Statements form the body of functions, but they cannot be directly embedded into expressions. However, we can cause a statement to be executed inside an expression by calling a function whose body (by definition) is again a statement.

We broadly distinguish seven classes of statements: variable declarations, assignments, expressions as statements, blocks, conditionals, loops, and returns.

Variable Declarations

Variables in C0 must be declared with their type. This serves as unambiguous documentation as to the values the variable may hold and enables the compiler to issue an error message should a variable be used incorrectly. There are two forms of variable declarations

t x;      /* uninitialized variable */
t x = e;  /* initialized variable */

where t is a type, x is a variable, and e is an expression. Variables also have a definitive scope, and they can only be used within that scope. The scope extends from the statement immediately following the declaration until the end of the current block of statements. Here is a simple example, computing the factorial function.

/* file fact.c0 */
int factorial(int n)
//@requires n >= 0;
{
  int result = 1;      /* line 1 */
  int i = 1;           /* line 2 */
  while (i <= n) {     /* line 3 */
    result = result*i; /* line 4 */
    i = i+1;           /* line 5 */
  }
  return result;       /* line 6 */
}

The body of the function is a block of four statements: two declarations (result, i), a while-loop, and a return statement. The scope of variable result is lines 2-6; the scope if i is lines 3-6. The following function would be an error:

/* file fact-e1.c0 */
int factorial(int n)
//@requires n >= 0;
{
  int result = 1;      /* line 1 */
  while (i <= n) {     /* line 2; reference to undeclared variable */
    int i = 1;         /* line 3 */
    result = result*i; /* line 4 */
    i = i+1;           /* line 5 */
  }
  return result;       /* line 6 */
}

This is because the reference to i in line 2 is out of scope since it comes before the declaration of i. The error message just notes an undeclared variable:

% coin fact-e1.c0
fact-e1.c0:5.10-5.11:error:undeclared variable `i'
Unable to load files, exiting...

Because it is a frequent source of error in an imperative language, we disallow that a variable is redeclared in its scope, even it would be with the same type as before. For example:

/* file fact-e2.c0 */

int factorial(int n)
//@requires n >= 0;
{
  int result = 1;
  int i = 1;
  while (i <= n) {
    int i = 1;
    result = result*i;
    i = i+1;
  }
  return result;
}

The error message indicates that the variable cannot be "doubly declared":

% coin fact-e2.c0
fact-e2.c0:7.5-7.14:error:variable `i' doubly declared
Unable to load files, exiting...

This is allowed, however, as long as the scopes of the two variables do not overlap. For the following example, we clarify that in a for-loop (see loop), the scope of the variable declared in the initializer extends through the body of the loop.

Assignments

The effect of an assignment is to change the value of a variable, or the state of memory. The simplest form of assignment, already liberally used in examples, is

x = e;

where x is a variable and e is an expression. The effect is to evaluate e to a value v and then assign v to x. For an assignment to be correct, the variable x and the right-hand side e must have the same type. This guarantees that the value of a variable always of the correct type, which is crucial when reasoning about the program.

A variable declaration with an initializer e

t x = e;

can be decomposed into a simple declaration and an assignment:

t x;
x = e;

Note that the expression e cannot itself mention x, because when e is evaluated x would not yet have a value. A corresponding error message will be issued before the program is executed.

--> int x;
--> x = x+1;
Error: uninitialized value used
Last position: <stdio>:1.1-1.8

More generally, the left-hand side of an assignment can denote a memory location, in which case the effect of the assignment is to write a value to that location. We call the kind of expression that denotes a location a l-value (writing lv for short). We enumerate by type:

type of lv l-value assignment effect
x x = e assign x the value of e
t[] lv[e1] lv[e1] = e2 evaluate e1 to i and set lv[i] to value of e2
t* *lv *lv = e write value of e to location denoted by lv
struct s* lv->f lv->f = e write value of e to field f of struct at location denoted by lv
struct s lv.f lv.f = e write value of e to field f of struct lv

Compound Assignments

C0, like C, offers some short-hands for certain kinds of assignments. These shorthands exist primarily for historical reasons. Informally, we might say "increment x by 5". The straightforward code for this is x = x + 5;, which mentions the variable x twice. The shorthand combines the assignment with the operator, and we can write x += 5; which expands into the previous assignment. For any appropriate binary operator in the language, we have a corresponding compound operator (+=, -=, *=, \=, %=, &=, ^=, |=, <<=, >>=).

Compound operators can be applied to more complex values, which is marginally more useful. For example, to mask the array element at j*width+i to the lowest 24 bits, we can write A[j*width+i] &= 0x00FFFFFF; which otherwise would seem more complicated (A[j*width+i] = A[j*width+i] & 0x00FFFFFF;). However, we can also compute j*width+i as a temporary value and use it on both sides, or we can write a function mask24, or we can rely on the compiler to avoid the recomputation if the index expression has no function calls itself. This last point is a subtlety: if we write lv += e, then lv will only be evaluated once, while in the expanded assignment lv = lv + e, lv would be evaluated twice.

A very common case is increment by 1 or decrement by 1, so there are the additional statements lv++ and lv-- for l-values lv, which expand into lv += 1 and lv -= 1 respectively.

Stylistically we feel that the few characters saved are rarely worth using compound assignment, except for increment (x++) and decrement (x--) which yield easily recognizable patterns of iteration.

Expressions as Statements

Instead of assignments x = e;, we can also calculate the value of an expression and throw away its value (if it has one) by simply giving the expression itself as a statement, e;. The only reason to do this would be to obtain the effect of the expression e which would therefore be a function call. Generally, it is good style if functions called in this manner do not return a value, but have the pseudo-return type void. The following print_array function is executed only for its effect (namely, to display the content of the given array). In its body, it calls print and printint, which are functions in the conio library, each of which also has return type void.

/* file print_array.c0 */
#use <conio>

void print_array(int[] A, int n, string array_name)
//@requires 0 <= n && n <= \length(A);
{
  for (int i = 0; i < n; i++) {
    print(array_name);
    print("["); printint(i); print("] = ");
    printint(A[i]);
    print("\n");
  }
  return;
}

Blocks

A block is just a sequence of statements. Because statements are part of other statements like conditionals or loops, the sequence of statements must be surrounded by curly braces { ... }. So a block has the form { s1 s2 ... sn } where s1, s2, ..., sn are statements. As a special case, the empty block (which contains no statements and therefore has no effect) is written as { }. The body of a function definition is always a block of statements; defining a function g has the form:

t g (t1 x1, ..., tn xn) {
 s1
 ...
 sk
}

where { s1 ... sk } is a block.

Blocks also limit the scope of variable declarations: is si is a declaration, its scope is si+1 ... sk.

Conditionals

Conditional statements have one of the following two forms:

if (e) s1 else s2
if (e) s1             /* same as if (e) s1 else { } */

The second form is equivalent to the second with an empty else clause. The condition e must be of boolean type (evaluate to either true or false). The following illustrates some common uses of conditionals.

int abs1(int x) {
  if (x < 0) return -x;
  return x;       /* indented here: no else clause */
}

int abs2(int x) {
  if (x < 0)
    return -x;    /* single-statement: no { ... } */
  else
    return x;
}

int abs3(int x) {
  if (x < 0) {
    return -x;    /* some prefer braces even for single statements */
  } else {
    return x;
  }
}

All of these functions are somewhat problematic with a finite range and two's complement representation because the result is not guaranteed to be positive! We have to rule out the minimal integer (which is -231, for 32-bit numbers), so a better version of the absolute value would be the following:

int abs(int x)
//@requires x-1 < x;     /* x may not be minimal integer */
//@ensures \result >= 0;
{
  if (x < 0) {
    assert(-x > 0); /* abort on minimal integer */
    return -x;
  } else {
    return x;
  }
}

We check if something is the not minimal integer by requiring that subtracting 1 will yield a smaller number. This particular version of the function is defense is that it also uses the assert statement to abort if the abs function would have returned a negative number. Like any assert, this should never fail, and it will not if the client satifies the stated requirement.

Because the else-clause is optional, there is some ambiguity in statements such as

if (e1) if (e2) s1 else s2

Do we read this as if (e1) { if (e2) s1 else s2 } or if (e1) { if (e2) s1 } else s2? The rule is that the else-clause is paired with the most recent if that does not yet have an else-clause. Relying on this can make code quite obscure and error-prone, so we suggest using indentation, blocks, and perhaps even redundant else-clauses with empty blocks to make the structure of the code as clear as possible. Using the <tab> key in Emacs will find the proper indentative level for most statements and should help you read the code.

Loops

There are two forms of loops: for-loops and while-loops.

For Loops

For loops have the form

for (init ; exit ; step)
  body

where init and step are simple statements, exit is a boolean expression, and body is an arbitrary statement. It executes as follows:

  • Execute init.
  • Evaluate exit. If it returns true proceed with 3; if it returns false exit the loop.
  • Execute body.
  • Execute step.
  • Go to 2.

init and step are simple statements, which means declarations, assignments, or expressions (executed for their effect).

Here is a some simple example of a for-loop, summing the elements of an array (in modular arithmetic!):

/* file for.c0 */
int sum1(int[] A, int n)
//@requires 0 <= n && n <= \length(A);
{ int i;
  int sum = 0;
  for (i = 0; i < n; i = i+1)
    sum = sum + A[i];
  return sum;
}

This is an appropriate use of a for loop, since we can read off the traversal pattern by looking at a single line for (i = 0; i < n; i = i+1) to see what we go from 0 to n, incrementing the index by 1 each time. Note that A is never accessed at n, because the last time through the body of the loop, i = n-1. This is appropriate because an array of length n has valid indices 0, ..., n-1. The precondition on the function is also typical for manipulating arrays in C0 (and C).

There are a couple of idioms programmers like to use to make this even more conspicuous. The first is to coalesce the declaration and initialization of i, and express that it is a local variable in the loop only and not used outside. We do this by changing the initializer of the loop to a declaration, int i = 0. The second change is to shorten i = i+1 to i++, where the latter is merely a shorthand for the former. A third change some programmers prefer is to always write the body of a loop as a block, to avoid any possible mistakes regarding which statements are part of the loop and which are not.

int sum2(int[] A, int n)
//@requires 0 <= n && n <= \length(A);
{
  int sum = 0;
  for (int i = 0; i < n; i++) {
    sum = sum + A[i];
  }
  return sum;
}

While Loops

While-loops are simpler and more flexible than for-loops. They are usually appropriate if the structure of the iteration is more complex so that it would be difficult to pack the conditions all on the same line as in a for-loop. The general form is

while (e) body

It is executed as follows:

  • Evaluate e. If it returns true go to 2; otherwise exit the loop.
  • Execute body.
  • Go to 1.

Here is an example of Euclid's algorithm for the greatest common divisor on strictly positive numbers, written in the form of a loop. It uses the recurrence

  • gcd(x,0) = x
  • gcd(x,y) = gcd(y, x mod y)
int gcd(int x, int y)
//@requires x > 0 && y > 0;
{
  while (y != 0) {
    int r = x % y;
    x = y;
    y = r;
  }
  return x;
}

Selecting Control Structure

The main techniques for systematically repeated computation are recursion and loops. Choosing between them is often not easy and sometimes a matter of taste. Generally, the pattern of traversal of a data structure follows its definition, which suggests that we traverse "linear" data structures like arrays in loops and tree-like data structures with recursion. Also, when computation appears highly imperative, making a lot of changes to memory, loops are often preferable while for pure computation, recursion tends to be more elegant.

Once we have decided that a loop is appropriate, we still have to chose between two styles of loops: for-loops and while-loops. The rule of thumb is to use for-loops for simple iterations through arrays as long as initializer, exit condition and step all fit onto one line, and while-loops for more complicated iterations.

Returns

Because the body of a function is a block (and therefore a statement), it does not return a value. Therefore, we need an explicit return statement. It has two forms

return e;  /* return value of e */
return;    /* return void */

In the case of return e;, we evaluate e and returns its value as the value of the current function's invocation. This requires that the return type of the function is the same as the type of e; otherwise the compiler will give an error message. We have already seen this form of return statement in the examples.

The form return; is used for functions whose return "type" is the special keyword void. This means the functions is to be only executed for effect, but does not actually return a value. A trailing return; at the end of a function body may be omitted, but it is helpful to be explicit to avoid the impression is may have been forgotten.

The fib_array function below computes an array of the first n Fibonacci numbers and returns it; the print_array function (shown above) just has an effect of display the array.

/* file fib.c0 */
int[] fib_array(int n)
//@requires n >= 0;
{
  int[] A = alloc_array(int, n);
  if (n > 0) A[0] = 0;
  if (n > 1) A[1] = 1;
  for (int i = 2; i < n; i++)
    //@loop_invariant i >= 2;
    A[i] = A[i-1] + A[i-2];
  return A;
}

For example:

% coin -d fib.c0 print_array.c0
Coin 0.2.8 'Penny'(r2229M, Mon Aug 22 12:16:27 EDT 2011)
Type `#help' for help or `#quit' to exit.
--> print_array(fib_array(20), 20, "fib");
fib[0] = 0
fib[1] = 1
fib[2] = 1
fib[3] = 2
fib[4] = 3
fib[5] = 5
fib[6] = 8
fib[7] = 13
fib[8] = 21
fib[9] = 34
fib[10] = 55
fib[11] = 89
fib[12] = 144
fib[13] = 233
fib[14] = 377
fib[15] = 610
fib[16] = 987
fib[17] = 1597
fib[18] = 2584
fib[19] = 4181
--> 

Semicolons

Sometimes, error messages given by the compiler on omitted or extraneous semicolons can be difficult to understand. It might help to understand the "logic" behind the use of semicolons (;). In C0, ; is not a separator for statements, but a terminator for certain statements.

We call statement which contain other statements within them complex statements. In C0, the complex statements are conditionals (if), loops (for and while), and blocks ({ ... }). The rule is that complex statements are not followed by a semicolon; all others are. This usually means that one-line statements are followed by semicolons and others are not. For example, the conditional

if (x < 0) return -1;

is a complex statement (and therefore has no semicolon), but the then-branch return -1; is a simple statement and therefore is terminated by a semicolon.

Also, at the top level in a file we have structure definitions, function definitions, and type definitions, as well as structure declarations, function declarations, and use declarations. Only function definitions (whose bodies are complex statements) and use declarations (which go on a single line) are not followed by a semicolon. A slight inconsistency here is present by struct definitions,

struct s {
  t1 f1;
  ...
  tn fn;
};

where the trailing semicolon is an artifact of the way C interprets such definitions.