Introduction

Hello World

Printing variables

User input

Testing printf

Mathematical operations

Functions

Arrays

Loops

Projects:

Estimating sin(x)

Calculating cube roots

Sieve of Eratosthenes

The Sieve of Eratosthenes

The Sieve of Eratosthenes is an ancient prime number sieve. The process is simple and elegant and work very well when teaching (or implenting) arrays.
The process starts with an array of numbers starting at 2 since 1 is not considered prime.

2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Starting at index zero, is the value at the index greater than zero?
In this case it is, it's 2, so note 2 as prime and step from the index to the top of the array. in steps of 2 setting all subsequent values to zero. This removes multiples of 2 from the array.

2 3 0 5 0 7 0 9 0 11 0 13 0 15 0 17 0 19 0

The pointer now moves to the next index, one, and checks if the value is greater than zero.
The next value is 3 so 3 is noted as prime and now step from the index to the top of the array in steps of 3 setting all subsequent values to zero. This removes multiples of 3 from the array.
Note that even multiples of 3 (6, 12, 18, etc) have already been removed.

2 3 0 5 0 7 0 0 0 11 0 13 0 0 0 17 0 19 0

The next index is two. The value there would have been 4 but it is now zero so the index is incremented to three. The value there is 5 so 5 is noted as prime and all the multiples of 5 are removed from the array.
This has already been done, so I won't reproduce the array again.
In fact, if you look at the array now it contains only primes.

Development

Initialising the array

This programme went through a few development stages so it is worth using it as a teaching tool (and to remind myself how I did it).
The first task is to create an empty array one hundred locations long. Each location will hold an eight byte value so the array needs to be 800 addresses wide.
That is done with this code:

    
        .data
        output_msg:
            .asciz "Index: %ld contains value: %ld\n"
        array:
	        .skip 800	// 100 x 8 byte addresses
    

The next task is to fill the 100 locations with the numbers 2 to 102. This is done with this code:

    
        // main code
        ldr x0, =array
        mov x19, #0		// array pointer
        mov x20, #2		// value to store
    loop1:
        str x20, [x0]   // store the next integer in the array
        add x0, x0, #8	    // load x0 with address of next array location
        cmp x19, #100
        beq check
        add x19, x19, #1
        add x20, x20, #1
        b loop1
    

To start with, x0 is loaded with the base address of the array. X19 is loaded with zero as an array pointer. It's not being used as a pointer in this programme, but rather as a counter. X20 is loaded with 2 since 1 is not considered prime.
The loop then stores the next integer, from x20, into the array, increments the array pointer then checks x19 to find if that was the last insertion. If it was then the loop ends. If not, x19 is incremented, and x20 is incremented to the next integer to be stored.
The question now is, did it work?

To check the contents of the array, the following code will be used. This is added beneath the existing code and can be deleted once it has been used.

    
    check:
	    mov x19, #0			// array pointer.  Safe from printf
    loop2:
	    ldr x0, =output_msg		// x0 holds output string
	    mov x1, x19				// x1 holds index
	    ldr x2, =array			// x2 holds base addr of array
	    lsl x3, x1, #3			// x2 holds addrss of index in x1
	    add x2, x2, x3
	    ldr x2, [x2]			// x2 holds value at addr

	    bl printf
	    cmp x19, #100
	    beq end
	    add x19, x19, #1
	    b loop2
    

This code is a little different than the loading code. The loading code started with the base address of the array in x0 and added eight to this value every iteration of the loop. The value in x19, which looks like an array pointer, was really used as a loop counter. The value in x0 was stable throughout the process.
This is not the case in the check loop. In this loop the call to printf clobbers all the registers from x0 to x7. There is also to problem that printf requires x0, x1, and x2 to print the output message. The result is that these registers need to be reloaded every iteration. X19 is still used, but now it really is an array pointer.

Starting the seive

The basic process for the sieve is:

  1. Read value at array pointer
  2. If value is greater than zero:
    1. Print the value
    2. Step from pointer to top of array in steps of value
    3. Set each subsequent value to zero
  3. Increment the array pointer
  4. Repeat the process from the top

The code for that algorithm is:

    
        mov x19, #0	// Init the loop pointer

check_candidate:
	ldr x0, =array
	lsl x1, x19, #3
	add x0, x0, x1
	ldr x1, [x0]	// X1 now hold next value in array
	cmp x1, #0	// Has it been cancelled already
	bgt found_prime
	add x19, x19, #1	// Incr loop pointer
	b check_candidate
    

The first four lines are becoming familiar.

Once the value is in x1 it is compared to zero. if it is greater than zero then it is a prime and can be reported. If not, the pointer is incremented and the process starts again.
The loop has to reload x0 and x1 because if the value is a prime, the call to printf to print it will disrupt the address and value in x0 and x1.

I will skip over the printing of the prime since there are already examples of printing an integer on other pages in the site.

The final task is to step through the array and cancel all the multiples of the current value.

    
        do_cancel:
	ldr x0, =array		// Base addr of array
	lsl x1, x19, #3		// Get offset to array pointer position
	add x0, x0, x1		// Update x0 addr to addr of array pointer
	mov x1, x19		// x1 becomes location pointer.  Points at array pointer
	ldr x2, [x0]		// Get step value
	lsl x3, x2, #3		// Convert step value to mem offset step
cancel_loop:
	add x0, x0, x3		// Step addr to multiple of step value
	str xzr, [x0]		// Store zero in array
	add x1, x1, x2		// loc ptr = loc ptr + step
	cmp x1, #100
	blt cancel_loop
	add x19, x19, #1	// Inc the array pointer
	cmp x19, #100		// Has the array pointer reached the top of the array
	blt check_candidate
    

There is a lot in this section of the code.
The first three lines take the base address of the array and add the offset for the current value of the pointer which is still in x19. After the add x0, x0, x1 command x0 contains the address of the current pointer value.
The mov x1, x19 instruction copies the array pointer from x19 to x1. This will become a location pointer that will be stepped up the array.
The ldr x2, [x0] instruction gets that value for the step. It's the current prime.
The lsl x3, x2, #3 instruction converts the step value as an integer to an offset for the stepping function.

All the pieces are in place:

The cancel_loop function starts by incrementing the current array pointer memory address, in x0, by the memory offset in x3. The value there is replaced with zero. Since the str instruction cannot take an immediate value, that is str #0, [x0] is not a valid instruction, the code uses the xzr register which always holds zero and cannot be written to. This means the code does not have to have a register dedicated to holding a value of zero.
The location pointer (an integer) is then incremented by the step value (also an integer). This is compared with an immediate value of #100 to see if the cancelling process is completed. If the location pointer is less than 100 the loop repeats.
The array pointer is then incremented and checked against 100 to see if it has reached the top of the array. If not then it gets the next candidate. If it has reached the top of the array then the sieve is complete and the code terminates.

Source Code

    
        // eratosthenes1.s

.global main
.extern printf

.data
loading_msg:
	.asciz "Creating array and loading initial values.\n"
begin_msg:
	.asciz "Beginning seive.\n"
prime_msg:
	.asciz "%d is prime.\n"
//test_msg:
//	.asciz "Location %d has value %d.\n"

array:
	.skip 800	// 100 x 8 byte slots

.text
main:
	// prolog
	stp x29, x30, [sp, -16]!
	mov x29, sp

	// main code
	// Load the array
	ldr x0, =loading_msg
	bl printf

	// Load the array with seqential values from 2 to 101
	// This routine should be optimised so it does not
	// reload the array base addr each time

	mov x19, #0			// Array pointer
	mov x20, #2			// First value for array
load_loop:
	ldr x0, =array		// Base addr of array
	lsl x1, x19, #3		// Calc offset
	add x0, x0, x1		// Add offset to base addr
	str x20, [x0]
	add x19, x19, #1	// Incr pointer
	add x20, x20, #1	// Incr value to store
	cmp x19, #100
	blt load_loop

	mov x19, #0		// Array pointer
check_loop:
	// This loop is diagnostic.  It loops through the array
	// and prints all the contents.  Can be deleted
//	ldr x0, =test_msg
//	mov x1, x19
//	ldr x2, =array
//	lsl x3, x1, #3
//	add x2, x2, x3
//	ldr x2, [x2]
//
//	bl printf
//	add x19, x19, #1
//
//	cmp x19, #100
//	blt check_loop

	// Now start the sieve
	ldr x0, =begin_msg
	bl printf

	// Get the next  value in the array
	// Check if it's a zero.  If not then print and cancel.
	mov x19, #0	// Init the loop pointer

check_candidate:
	ldr x0, =array
	lsl x1, x19, #3
	add x0, x0, x1
	ldr x1, [x0]	// X1 now hold next value in array
	cmp x1, #0	// Has it been cancelled already
	bgt found_prime
	add x19, x19, #1	// Incr loop pointer
	b check_candidate

found_prime:
	// First, print the value
	ldr x0, =prime_msg
	bl printf		// Print the newly found prime

do_cancel:
	ldr x0, =array		// Base addr of array
	lsl x1, x19, #3		// Get offset to array pointer position
	add x0, x0, x1		// Update x0 addr to addr of array pointer
	mov x1, x19		// x1 becomes location pointer.  Points at array pointer
	ldr x2, [x0]		// Get step value
	lsl x3, x2, #3		// Convert step value to mem offset step
cancel_loop:
	add x0, x0, x3		// Step addr to miltiple of step value
	str xzr, [x0]		// Store zero in array
	add x1, x1, x2		// loc ptr = loc ptr + step
	cmp x1, #100
	blt cancel_loop
	add x19, x19, #1	// Inc the array pointer
	cmp x19, #100		// Has the array pointer reached the top of the array
	blt check_candidate


end:
	// cleanup
	mov x0, #0
	ldp x29, x30, [sp], 16
	RET