Introduction

Hello World

Printing variables

User input

Testing printf

Mathematical operations

Functions

Arrays

Loops

Projects:

Estimating sin(x)

Calculating cube roots

Sieve of Eratosthenes

Estimating sin(θ)

There is a simple series for estimating sin(x).
The equation to use is:

sin(x) = x - x 3 3! + x 5 5!

For ease of implementation, this equation will be implemented as:

sin(x) = x - ( x 3 ÷ 6.0 ) + ( x 5 ÷ 120.0 )

The value for x, theta, also needs to be in radians rather than degrees.
To accomplish this a value of π / 180 = 0.01745329252 will be used as a conversion from degrees to radians.

The final code, shown below, is 100 lines long so let's break it into smaller chunks for analysis.

Code Breakdown

The data sections

The .data section contains all the variables:

    
        .data
        output_fmt:
	        .asciz "sin(%lf) = %lf.\n"	// Allow float for input
        input_msg:
	        .asciz "Input angle in degrees: "
        input_fmt:
	        .asciz "%lf"		// Must be read as 64 bit value and stored in d register
        value:
	        .double 0
        to_radians:
	        .double 0.01745329252   // pi / 180
        three_fact:
	        .double 6.0
        five_fact:
	        .double 120.0
    

The points to note are:

  1. The output_fmt strings have %lf or long-floats as placeholders. These would have worked even if %f had been used.
  2. The input_fmt string has a %lf placeholder. This MUST be a long-float to ensure that it is read as a 64 bit, double precision value.
  3. The value variable is declared as type .double. This is a 64 bit float.

The values can be seen in the mathematical description at the top of the page.

The user input section

The next section prompts the user for an input angle, in degrees, takes that angle and stores it in the variable, value. This has been declared as type .double so it will be stored as a 64 bit IEEE754 float.

    
            // Input
	        ldr x0, =input_msg	    // Load input msg
	        bl printf	            // Display input msg

	        ldr x0, =input_fmt	    // Load input format (64 bit long float)
	        ldr x1, =value	        // Load address for double precision number
	        bl scanf	            // Read long float into 64 bit location
    

Setting up the registers

The next section set up all the registers for the calculation. The code will use d0 to d7. Each register has the following purpose.

    
        setup:
	        //Move all the values into position
	        ldr x0, =value
	        ldr d0, [x0]			// Value, in degrees, in d0
	        ldr x0, =to_radians		// 0.017453 for converting degrees to radians
	        ldr d4, [x0]
	        ldr x0, =three_fact		// 3! = 6
	        ldr d5, [x0]
	        ldr x0, =five_fact		// 5! = 120
	        ldr d6, [x0]
    

Now that all the values are in place the remainder of the code performs the calculations. There are three terms to be calculated and each is done in turn and stored in registers d1 - d3.

Calculating the first term

The first term is very easy. It's just the angle converted from degrees to radians.

    
            term1:
	        fmul d1, d0, d4		    // Convert deg to rad
    

Calculating the second term

The second term requires more steps:

    
        term2:
	        fmul d2, d0, d4		    // Convert deg to rad

        setup_pow_3:
	        mov x10, #2			    // Loop counter. one mult = sqrd, two = cubed
	        fmov d7, d2			    // Copy d2 to d7.  D7 will be the multiplier

        do_pow_3:
	        fmul d2, d2, d7         // d2 = d2 * d7. Equiv to x = x^2
	        sub x10, x10, #1        // Decrement loop counter
	        cmp x10, #0             // Counter = 0?
	        beq div_6               // Yes, do division ...
	        b do_pow_3              // No, repeat multiplication

        div_6:
	        fdiv d2, d2, d5		    // Divide by 6
    

The first step is to convert the angle in degrees, currently in d0, to radians and store in d2.
The second step is to set up the third power calculation. Initially, the value in d2 is copied to d7. Then x10 is loaded with #2 as a counter for the number of multiplications that are required. The reason two is used is because after the first muliplication, resulting in x2, the counter will be reduced by one to one. A comparison with zero will fail and the multiplication will occur again resulting in x3. The counter will be reduced by one again to zero and a comparison with zero will pass and the process will move on.
The final step is the divide by six calculation and the final result for the second term is now stored in d2.

Calculating the third term

This, the final term, is the same process as above except that the power is five, requiring x10 to be loaded with #4, and the final division is by 120.

        
        term3:
	        fmul d3, d0, d4		    // Convert deg to rad

        setup_pow_5:
	        mov x10, #4
	        fmov d7, d3

        do_pow_5:
	        fmul d3, d3, d7
	        sub x10, x10, #1
	        cmp x10, #0
	        beq div_120
	        b do_pow_5

        div_120:
	        fdiv d3, d3, d6		    // /120
        
    

Summing the terms

The final step is to sum the terms (actually it's term 1 - term 2 + term 3).

        
        sum:
	        fsub d1, d1, d2
	        fadd d1, d1, d3

        print:
	        ldr x0, =output_fmt
	        bl printf
        
    

Summary

And that is the project. The results are not accurate beyond the fourth, or possibly third, decimal place but the purpose was not to write an accurate calculation algorithm, but rather, to demonstrate the algorithm in ARM64 assembly.



Source code

    
        // calc_sin.s
        .data
        output_fmt:
	        .asciz "sin(%f) = %f.\n"	// Allow float for input
        input_msg:
	        .asciz "Input angle in degrees: "
        input_fmt:
	        .asciz "%lf"		// Must be read as 64 bit value and stored in d register
        value:
	        .double 0
        to_radians:
	        .double 0.01745329252   // pi / 180
        three_fact:
	        .double 6.0
        five_fact:
	        .double 120.0

        .global main
        .extern scanf
        .extern printf

        .text
        main:
	        // Prolog
	        stp x29, x30, [sp, -16]!
	        mov x29, sp

	        // Input
	        ldr x0, =input_msg	    // Load input msg
	        bl printf	            // Display input msg

	        ldr x0, =input_fmt	    // Load input format (64 bit long float)
	        ldr x1, =value	        // Load address for double precision number
	        bl scanf	            // Read long float into 64 bit location

        setup:
	        //Move all the values into position
	        ldr x0, =value
	        ldr d0, [x0]			// Value, in degrees, in d0
	        ldr x0, =to_radians		// 0.017453 for converting degrees to radians
	        ldr d4, [x0]
	        ldr x0, =three_fact		// 3! = 6
	        ldr d5, [x0]
	        ldr x0, =five_fact		// 5! = 120
	        ldr d6, [x0]

        term1:
	        fmul d1, d0, d4		    // Convert deg to rad

        term2:
	        fmul d2, d0, d4		    // Convert deg to rad

        setup_pow_3:
	        mov x10, #2			    // Loop counter. one mult = sqrd, two = cubed
	        fmov d7, d2			    // Copy d2 to d7.  D7 will be the multiplier

        do_pow_3:
	        fmul d2, d2, d7         // d2 = d2 * d7. Equiv to x = x^2
	        sub x10, x10, #1        // Decrement loop counter
	        cmp x10, #0             // Counter = 0?
	        beq div_6               // Yes, do division ...
	        b do_pow_3              // No, repeat multiplication

        div_6:
	        fdiv d2, d2, d5		    // Divide by 6

        term3:
	        fmul d3, d0, d4		    // Convert deg to rad

        setup_pow_5:
	        mov x10, #4
	        fmov d7, d3

        do_pow_5:
	        fmul d3, d3, d7
	        sub x10, x10, #1
	        cmp x10, #0
	        beq div_120
	        b do_pow_5

        div_120:
	        fdiv d3, d3, d6		    // /120

        sum:
	        fsub d1, d1, d2
	        fadd d1, d1, d3

        print:
	        ldr x0, =output_fmt
	        bl printf


	        // cleanup
	        mov x0, #0
	        ldp x29, x30, [sp], 16
	        RET